R. If O is an arbitrary potential, its restriction to the set of all singletons will be called the self-potential part of <ï>. Clearly, a Gibbsian specification for a potential O with a non-trivial self-potential part ( o))» °}\ Oj-p\ ti)\fk(ti)-fk-n(xtiz\{k-n})\ i= l (s0) < oo because Q"(0,0) S C for all n, and therefore L(Q) ^ C. Since ßS0(0,0) = ß"(0,0)M So )" for all n ^ 1, this will also show that L(QSo) = L(Q)/(p(s0) ^ 1. Since L(P) ^ 1 for each stochastic matrix P, we shall arrive at the desired conclusion L(QSo) — 1. (By the way, the argument above shows that s 0 will be characterized by the equation L(Q) = (p(s0) = inf (p(s).) seW —oo j(t) -> J as t -> oo. As an immediate consequence of these qualitative features of q>j we obtain the following lemma. (12.27) Lemma. For given d ±t \ and J > 0 we define h{J,d) = max \_dcpj(t) — t] = min [t — dq>j{t)~\. The fixed point equation (12.22) has (i) a unique solution t^ = t^(d, J, h) when \h\> h(J,d) or h = h(J,d) == 0, (») two distinct solutions £_ < t + (depending on d, J, h) when \h\ = h(J,d) > 0; and (iii) three distinct solutions t_ < t # < t+ (depending on d, J, h) when \h\ < h(J, d). We will now compute the critical external field h(J, d). We put J(\) = oo and, for d > 1, (12.28) (wzk) for a-almost all w and all ke S. Applying Corollary (7.4) to the probability kernels from C to C which are induced by the mappings w ->• wzk, we see that focpis constant a-almost surely. Thus / is constant (p(a)-almost surely, and a second application of Corollary (7.4) shows that >(a) e ex ^©(Q, #"). The last statement of Theorem (13.36) therefore implies that ßc* >(a) e ex^@(yJ'°). The proof of the first assertion of the corollary is completed by noting that for each r > 0 there is some ar e ex^ z (C) with ccr(w e C: \w\ = r) = 1, namely the Haar measure on the closure of the set {rzk: k e S). (See Theorem 6.20 of Walters (1982), for example.) 
2) To prove the converse we assume that J has no root in G, and we let fi e y&(yJ'°) be such that ß{al) < oo. We will show that fi = fic. We start with the observation that the infinite sums X J(j - i)(7; i ^ = {U}>'^> I 0 otherwise. Here J: S\{0} -> [0, oo [ is an arbitrary even function, and the dot denotes the usual inner product. Each such O is called a ferromagnetic Heisenberg potential. Clearly, any such O is preserved by all spin rotations (cf. Example (5.2) (3)).
2.2
Quasilocality
Having introduced the notion of a Gibbs specification we aim at showing that Gibbs specifications are not as particular as they might seem at first sight.
We shall do this in two stages: First we shall introduce the concept of a quasilocal specification, and we shall argue that quasilocality is a natural condition. Then, in the next section, we shall show that quasilocal, positive pre-modifications are necessarily Gibbsian. The motivation for introducing quasilocal specifications arises from the physical idea of a strict separation of microscopic and macroscopic quantities: A microscopic part of a system does not possess any information about the macroscopic state of the system. Let us describe the microscopic and the macroscopic quantities. A real function $f$ on $\Omega$ will be considered as describing a macroscopic observable if $f$ is measurable with respect to the tail $\sigma$-field
(2.19) $\mathcal{T} = \bigcap_{\Lambda\in\mathcal{S}} \mathcal{T}_\Lambda = \bigcap_{\Lambda\in\mathcal{S}} \mathcal{F}_{S\setminus\Lambda}.$
$\mathcal{T}$ is also called the $\sigma$-algebra at infinity. The reason for this interpretation is obvious: The $\mathcal{T}$-measurability of a function $f$ just means that the value of $f$ is not affected by the behaviour of any finite set of spins. (At first sight it might seem that $\mathcal{T} = \{\Omega, \emptyset\}$. Quite on the contrary, $\mathcal{T}$ is very rich. A typical example of a tail event is $\{\lim_{n\to\infty} |\Lambda_n|^{-1} \sum_{i\in\Lambda_n} \sigma_i \text{ exists and belongs to } B\}$, where $B \in \mathcal{E}$ and $(\Lambda_n)$ is a cofinal sequence in $\mathcal{S}$; of course, this is assuming that $E$ is equipped with a measurable additive structure.) The condition of properness in the definition of a specification implies that every specification $\gamma$ preserves the macroscopic observables, in that $\gamma_\Lambda f = f$ whenever $\Lambda \in \mathcal{S}$ and $f$ is bounded and tail measurable. In particular, $\gamma$ maps the set of all bounded tail measurable functions into itself. Next we ask for a precise meaning of the term "microscopic quantity". It is natural to say that a function $f$ on $\Omega$ is a microscopic quantity if $f$ is arbitrarily close to functions which only depend on finitely many coordinates. There is, however, no canonical interpretation of the word "close". The simplest meaning is "close in the uniform norm". This leads to the concept of a quasilocal function. (2.20) Definition. (a) A real function $f$ on $\Omega$ is called a cylinder function or a local function if $f$ is $\mathcal{F}_\Lambda$-measurable for some $\Lambda \in \mathcal{S}$. For each $\Lambda \in \mathcal{S}$ we write $\mathcal{L}_\Lambda$ for the linear space of all bounded $\mathcal{F}_\Lambda$-measurable functions, and we let $\mathcal{L} = \bigcup_{\Lambda\in\mathcal{S}} \mathcal{L}_\Lambda$ denote the set of all bounded local functions. (b) A function $f: \Omega \to \mathbb{R}$ will be said to be quasilocal if there is a sequence $(f_n)_{n\ge 1}$ of local functions $f_n$ such that $\lim_{n\to\infty} \|f - f_n\| = 0$. Here $\|\cdot\|$ is the supremum norm. We write $\bar{\mathcal{L}}$ for the space of all bounded quasilocal functions. Clearly, $\bar{\mathcal{L}}$ is the uniform closure of $\mathcal{L}$.
Gibbsian specifications
(2.21) Remarks. (1) A measurable function $f$ is quasilocal if and only if
(2.22) $\lim_{\Lambda\uparrow S}\ \sup_{\zeta,\xi\in\Omega:\ \zeta_\Lambda=\xi_\Lambda} |f(\zeta) - f(\xi)| = 0.$
(The notation $\lim_{\Lambda\uparrow S}$ means that the limit is taken along the directed set $\mathcal{S}$.) Clearly, each quasilocal $f$ satisfies (2.22). Conversely, (2.22) implies that $f$ is the uniform limit of the local functions $\omega \to f(\omega_\Lambda \eta_{S\setminus\Lambda})$ when $\eta \in \Omega$ is arbitrarily fixed and $\Lambda$ runs through a cofinal sequence in $\mathcal{S}$. In particular, we conclude from (2.22) that a non-constant tail-measurable function can never be quasilocal. (2) Suppose $E$ is a metric space and $d$ is any of the usual metrics on $\Omega$ that induce the product topology. Then each uniformly continuous function $f: \Omega \to \mathbb{R}$ is quasilocal. To see this, let $\varepsilon > 0$ be given and $\delta > 0$ be such that $|f(\zeta) - f(\eta)| < \varepsilon$ whenever $d(\zeta,\eta) < \delta$. By the definition of the product topology, there exists some $\Lambda \in \mathcal{S}$ such that $\zeta_\Lambda = \eta_\Lambda$ implies $d(\zeta,\eta) < \delta$. This yields (2.22). In particular, if $E$ is a compact metric space then every continuous function on $\Omega$ belongs to $\bar{\mathcal{L}}$. (3) If $E$ is finite and $\Omega$ is endowed with the product topology of the discrete topology on $E$ then $f: \Omega \to \mathbb{R}$ is quasilocal if and only if $f$ is continuous. Because of the preceding remark we only need to verify that each quasilocal function is continuous. But this follows easily from (2.22) and the fact that
$d(\zeta,\eta) = \sum_{i\in S} 2^{-n(i)}\, 1_{\{\zeta_i \neq \eta_i\}}$
is a metric for the product topology on $\Omega$; here $n: S \to \mathbb{N}$ is any bijection. ○
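Definition (2.20) and criterion (2.22) can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the text: the state space is $E = \{0,1\}$, the sites are $1, 2, \dots$, the function is the hypothetical example $f(\omega) = \sum_i 2^{-i}\omega_i$, and the infinite lattice is cut off at a finite horizon `N`. The local truncations $f_n$ depend only on the first $n$ sites, and their uniform distance to $f$ shrinks like $2^{-n}$, which is what quasilocality asks for.

```python
from itertools import product

N = 10  # finite horizon standing in for the infinite lattice (an assumption of this sketch)

def f(omega):
    """f(omega) = sum over sites i = 1..N of 2^{-i} * omega_i."""
    return sum(w / 2 ** (i + 1) for i, w in enumerate(omega))

def f_local(omega, n):
    """Local truncation f_n: depends only on the first n coordinates."""
    return sum(w / 2 ** (i + 1) for i, w in enumerate(omega[:n]))

def sup_distance(n):
    """Supremum over all configurations on sites 1..N of |f - f_n|."""
    return max(abs(f(om) - f_local(om, n)) for om in product((0, 1), repeat=N))

# Uniform distances for n = 1, ..., 5; each equals 2^{-n} - 2^{-N} on this horizon.
distances = [sup_distance(n) for n in range(1, 6)]
```

On the true infinite lattice the same computation gives $\|f - f_n\| = 2^{-n}$; the extra term $2^{-N}$ is only an effect of the cut-off.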
We started from the question in which we asked for a condition on a specification $\gamma$ expressing the idea that a microscopic part of the system only has a microscopic horizon. Now we are ready to introduce such a condition. Formally, this condition will remind the reader of the notion of a Feller kernel. (2.23) Definition. Let us say that a specification $\gamma$ is quasilocal if, for each $\Lambda \in \mathcal{S}$, $f \in \bar{\mathcal{L}}$ implies $\gamma_\Lambda f \in \bar{\mathcal{L}}$. Clearly, to verify that a given specification $\gamma$ is quasilocal we only need to check that $\gamma_\Lambda f \in \bar{\mathcal{L}}$ when $\Lambda \in \mathcal{S}$ and $f \in \mathcal{L}$. Also, it is easily seen that each independent specification is quasilocal. The next proposition deals with the quasilocality of $\lambda$-specifications. (2.24) Proposition. Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ be an a priori measure. (a) Suppose $\gamma = \rho\lambda$ is a $\lambda$-specification. If each $\rho_\Lambda$ is a local function, or if $\lambda$ is finite and each $\rho_\Lambda$ is quasilocal then $\gamma$ is quasilocal. (b) Suppose $\Phi$ is a $\lambda$-admissible potential such that all Hamiltonians $H_\Lambda^\Phi$ are quasilocal. Then $\gamma^\Phi$ is quasilocal.
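(c) Conversely, suppose $E$ is finite, $\lambda$ is counting measure, and $\gamma$ is a quasilocal specification. Let $\rho$ be the unique $\lambda$-modification with $\gamma = \rho\lambda$. Then $\rho_\Lambda \in \bar{\mathcal{L}}$ for all $\Lambda \in \mathcal{S}$. Proof. (a) Let $\Lambda \in \mathcal{S}$ and $f \in \mathcal{L}$. If $\rho_\Lambda$ is local then so is $f\rho_\Lambda$ and therefore $\gamma_\Lambda f = \lambda_\Lambda(f\rho_\Lambda) \in \mathcal{L}$. So let us suppose $\rho_\Lambda$ is quasilocal and $\lambda$ is finite. For given $\varepsilon > 0$ we choose a local function $g$ with $\|\rho_\Lambda - g\| < \varepsilon$. Then $\lambda_\Lambda(fg)$ is local and
$\|\gamma_\Lambda f - \lambda_\Lambda(fg)\| \le \|f\|\, \lambda(E)^{|\Lambda|}\, \varepsilon.$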
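As $\varepsilon$ was arbitrary, $\gamma_\Lambda f \in \bar{\mathcal{L}}$. (b) Let $\Lambda \in \mathcal{S}$ and $\varepsilon > 0$. By hypothesis there is a local function $u$ with $\|H_\Lambda^\Phi - u\| \le \varepsilon$. We put $v = e^{-u}$. Then $\lambda_\Lambda v \le e^{\varepsilon} Z_\Lambda^\Phi < \infty$. We consider the local function $g = v/\lambda_\Lambda v$. We have
$\lambda_\Lambda|\rho_\Lambda^\Phi - g| = \lambda_\Lambda(|h_\Lambda^\Phi/\lambda_\Lambda h_\Lambda^\Phi - v/\lambda_\Lambda v|) \le \lambda_\Lambda(|h_\Lambda^\Phi - v|)/\lambda_\Lambda h_\Lambda^\Phi + \lambda_\Lambda v\, |1/\lambda_\Lambda h_\Lambda^\Phi - 1/\lambda_\Lambda v| \le 2\lambda_\Lambda(|h_\Lambda^\Phi - v|)/Z_\Lambda^\Phi = 2\gamma_\Lambda^\Phi(|1 - \exp(H_\Lambda^\Phi - u)|) \le 2(e^{\varepsilon} - 1).$
Now if $f \in \mathcal{L}$ then $\lambda_\Lambda(fg)$ is local and
$\|\gamma_\Lambda^\Phi f - \lambda_\Lambda(fg)\| \le \|f\|\, \|\lambda_\Lambda(|\rho_\Lambda^\Phi - g|)\| \le 2(e^{\varepsilon} - 1)\|f\|.$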
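As $\varepsilon$ was arbitrary, $\gamma_\Lambda^\Phi f \in \bar{\mathcal{L}}$. (c) For each $\Lambda \in \mathcal{S}$ we may write
$\rho_\Lambda = \sum_{\zeta\in E^\Lambda} 1_{\{\sigma_\Lambda=\zeta\}}\, \gamma_\Lambda 1_{\{\sigma_\Lambda=\zeta\}},$
and we have $1_{\{\sigma_\Lambda=\zeta\}} \in \mathcal{L}$ and $\gamma_\Lambda 1_{\{\sigma_\Lambda=\zeta\}} \in \bar{\mathcal{L}}$. Thus $\rho_\Lambda \in \bar{\mathcal{L}}$. ○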
In the case of a finite state space $E$ there is also a converse to assertion (b); see Corollary (2.31) below. We stress some particular cases of Proposition (2.24). (2.25) Example. Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ and $\Phi$ be a uniformly convergent $\lambda$-admissible potential. Then (2.24) (b) shows that $\gamma^\Phi$ is quasilocal. Under additional assumptions we have the following results (which are stronger because of (2.24) (a)). (i) If $\Phi$ even has finite range then each $\rho_\Lambda^\Phi$ is local. (ii) If $\Phi$, besides being uniformly convergent, has bounded Hamiltonians $H_\Lambda^\Phi$ then $\rho_\Lambda^\Phi \in \bar{\mathcal{L}}$ for all $\Lambda \in \mathcal{S}$. (This case is only possible if $\lambda$ is finite. It occurs, for instance, when $\Phi$ is absolutely summable.) ○
We conclude this section with the observation that examples of specifications $\gamma$ with $|\mathcal{G}(\gamma)| > 1$ can easily be constructed when $\gamma$ is not assumed to be quasilocal. In this case, however, the non-uniqueness phenomenon cannot be interpreted as a kind of phase transition. It corresponds rather to a transition which is induced by a change of the experimental conditions. (2.26) Remark. Let $(X,\mathcal{X})$ be a measurable space and $(\mu^x)_{x\in X}$ be a family of random fields. Suppose (i) there is a tail measurable function $\xi: \Omega \to X$ with $\mu^x(\xi = x) = 1$ for all $x \in X$; and (ii) there is a family $(\gamma^x)_{x\in X}$ of specifications such that the functions $(\omega,x) \to \gamma_\Lambda^x(A|\omega)$ are measurable and $\mu^x \in \mathcal{G}(\gamma^x)$ for all $x \in X$. Then there is a specification $\gamma$ satisfying $\{\mu^x: x \in X\} \subset \mathcal{G}(\gamma)$. Proof: We put
$\gamma_\Lambda(A|\omega) = \gamma_\Lambda^{\xi(\omega)}(A|\omega) \quad (\Lambda \in \mathcal{S},\ A \in \mathcal{F},\ \omega \in \Omega).$
It is easily checked that $\gamma$ is a specification and satisfies $\mu^x\gamma_\Lambda = \mu^x\gamma_\Lambda^x = \mu^x$ for all $\Lambda \in \mathcal{S}$ and $x \in X$. ○ The conditions of the preceding remark are met whenever $X$ is countable and $(\mu^x)_{x\in X}$ is a family of random fields which are pairwise mutually singular on $\mathcal{T}$ and specified by suitable specifications. (The latter is true whenever $(E,\mathcal{E})$ satisfies a weak regularity condition; see Sokal (1981).) Let us now present an example with an uncountable $X$. (2.27) Example. Let $E = \{0,1\}$, $S = \mathbb{N}$, $X = [0,1]$, and $\mathcal{X}$ be the Borel $\sigma$-field on $X$. For $x \in X$ we let $\mu^x$ denote the Bernoulli measure on $\Omega$ with probability $x$ for "success", i.e., $\mu^x = (\lambda^x)^S$ with $\lambda^x = x\delta_1 + (1-x)\delta_0$. The supports of the $\mu^x$'s are separated by the tail measurable function
$\xi = \liminf_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \sigma_i;$
this follows from the strong law of large numbers. From Remark (1.25) we know that $\mu^x \in \mathcal{G}(\gamma^x)$, where $\gamma^x = \lambda^x$ is the independent specification with single spin distribution $\lambda^x$. Clearly, the functions $(\omega,x) \to \gamma_\Lambda^x(A|\omega)$ are measurable. Thus Remark (2.26) provides us with a (non-quasilocal) specification $\gamma$ such that $\mathcal{G}(\gamma)$ contains all $\mu^x$'s. In fact, we have $\mathcal{G}(\gamma) = \{\int w(dx)\,\mu^x: w \in \mathcal{P}(X,\mathcal{X})\}$. For "$\supset$" trivially holds, and if $\mu \in \mathcal{G}(\gamma)$ then for all $\Lambda \in \mathcal{S}$ and $A \in \mathcal{F}_\Lambda$ we have
$\mu(A) = \mu\gamma_\Lambda(A) = \int \mu(d\omega)\, \gamma_\Lambda^{\xi(\omega)}(A|\omega) = \int \mu(d\omega)\, \mu^{\xi(\omega)}(A) = \int \xi(\mu)(dx)\, \mu^x(A).$
Applying de Finetti's theorem we thus can conclude that $\mathcal{G}(\gamma)$ consists of all exchangeable random fields, i.e. of all random fields which are preserved by all permutations of finitely many spins. De Finetti's theorem will be proved later in Example (7.31). ○
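The structure of the mixtures $\int w(dx)\,\mu^x$ in Example (2.27) can be explored on small cylinder sets. The sketch below is only an illustration under assumptions that are not in the text: the mixing measure `w` has two hypothetical atoms at 0.2 and 0.7, and only patterns on three sites are examined. It checks that permuting sites does not change the probability of a pattern (exchangeability) and that two spins are nevertheless correlated, so the mixture is not a product measure.

```python
from itertools import permutations, product

# Hypothetical mixing measure w on X = [0, 1] with two atoms (an assumption of this sketch).
w = {0.2: 0.5, 0.7: 0.5}

def bernoulli_prob(x, config):
    """Probability of a finite 0/1 pattern under the Bernoulli measure with success probability x."""
    p = 1.0
    for s in config:
        p *= x if s == 1 else 1.0 - x
    return p

def mixture_prob(config):
    """Probability of the pattern under the mixture of Bernoulli measures with weights w."""
    return sum(weight * bernoulli_prob(x, config) for x, weight in w.items())

# Exchangeability: permuting the sites of a pattern leaves its probability unchanged.
exchangeable = all(
    abs(mixture_prob(c) - mixture_prob(tuple(c[i] for i in perm))) < 1e-12
    for c in product((0, 1), repeat=3)
    for perm in permutations(range(3))
)

# Two spins are positively correlated; the covariance equals the variance of x under w.
p1 = mixture_prob((1,))
p11 = mixture_prob((1, 1))
covariance = p11 - p1 * p1
```

The positive covariance reflects the fact that the tail function $\xi$ is not almost surely constant under a non-trivial mixture.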
2.3
Gibbs representation of pre-modifications
In Section 2.1, Gibbs specifications were introduced on physical grounds. Here we will provide a mathematical justification for considering Gibbsian specifications. We shall consider any $\lambda$-specification $\gamma = \rho\lambda$, and assume that the associated $\lambda$-modification $\rho$ is a quasilocal positive pre-modification. (We have seen before that these assumptions are natural.) Under these conditions, the Gibbs representation theorem (2.30) below will state that $\gamma$ is Gibbsian for some potential $\Phi$. In addition, $\Phi$ can be chosen to satisfy a normalization condition. The significance of this normalization will be discussed in Section 2.4. Examples can be found in (2.38). (2.28) Definition. Let $\alpha \in \mathcal{P}(E,\mathcal{E})$ and define probability kernels $\alpha_\Lambda$ from $\mathcal{T}_\Lambda$ to $\mathcal{F}$ ($\Lambda \subset S$) as in Remark (1.25). We say a potential $\Phi$ is $\alpha$-normalized if
$\alpha_B\Phi_A = \int \alpha_B(d\zeta\,|\,\cdot)\, \Phi_A(\zeta) = 0$
whenever $\emptyset \neq B \subset A \in \mathcal{S}$. In particular, we say $\Phi$ is a gas potential with vacuum state $a \in E$ if $\Phi$ is normalized by the Dirac measure $\delta_a$ at $a$. (2.29) Comments. (1) $\Phi$ is $\alpha$-normalized if and only if $\alpha_B\Phi_A$ exists when $\emptyset \neq B \subset A \in \mathcal{S}$, and $\alpha_{\{i\}}\Phi_A = 0$ whenever $i \in A$. In particular, $\Phi$ is a gas potential with vacuum state $a$ if and only if $\Phi_A(\omega) = 0$ whenever $\omega_i = a$ for some $i \in A$. (2) The terminology "gas potential" and "vacuum state" comes from the interpretation of a random field as a lattice gas: For $x \neq a$, the event $\{\sigma_i = x\}$ means that the site $i$ is occupied by a particle of type $x$, whereas $\{\sigma_i = a\}$ is thought of as the event that $i$ is empty. ○ We are now ready to state the Gibbsian representation theorem. We say a family $(\rho_\Lambda)_{\Lambda\in\mathcal{S}}$ of functions on $\Omega$ is quasilocal if each $\rho_\Lambda$ is quasilocal. (2.30) Theorem. Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ be an a priori measure, and suppose $\rho$ is a positive quasilocal pre-modification such that $\lambda_\Lambda\rho_\Lambda = 1$ for all $\Lambda \in \mathcal{S}$. Then for each $a \in E$ there is a unique $\lambda$-admissible gas potential $\Phi^a$ with vacuum state $a$
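such that $\rho = \rho^{\Phi^a}$. If, in addition, $\log\rho_\Lambda$ is bounded for all $\Lambda \in \mathcal{S}$ then for each $\alpha \in \mathcal{P}(E,\mathcal{E})$ there exists a unique uniformly convergent $\alpha$-normalized $\lambda$-admissible potential $\Phi^\alpha$ with $\rho = \rho^{\Phi^\alpha}$. Proof. 1) Let $\alpha \in \mathcal{P}(E,\mathcal{E})$. For each $A \in \mathcal{S} \cup \{\emptyset\}$ we define
$p_A f = \sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C} f$
whenever $f: \Omega \to \mathbb{R}$ is a measurable function for which all functions $\alpha_{S\setminus C} f$ exist. The operator $p_A$ has the following properties. (i) $p_A f$ is $\mathcal{F}_A$-measurable. This is because, for each $C$, $\alpha_{S\setminus C} f$ is measurable with respect to $\mathcal{F}_C$. (ii) For all $\Lambda \in \mathcal{S} \cup \{\emptyset\}$,
$\alpha_{S\setminus\Lambda} f = \sum_{A\subset\Lambda} p_A f.$
This comes from the inclusion-exclusion principle. (iii) For all $\emptyset \neq B \subset A$, $\alpha_B(p_A f) = 0$. It is sufficient to check this when $B = \{i\}$. In that case we have
$\alpha_{\{i\}}(p_A f) = \sum_{C\subset A\setminus\{i\}} (-1)^{|A\setminus C|}\, \alpha_{\{i\}}(\alpha_{S\setminus C} f - \alpha_{S\setminus(C\cup\{i\})} f) = 0.$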
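For if $i \notin C$ then $\alpha_{\{i\}}\alpha_{S\setminus C} = \alpha_{S\setminus C} = \alpha_{\{i\}}\alpha_{S\setminus(C\cup\{i\})}$, by Remark (1.25). 2) For $A \in \mathcal{S}$ we put $u_A = \log\rho_A$ and $\Phi_A = -p_A u_A$. $\Phi_A$ exists when $\alpha = \delta_a$ for some $a \in E$ or $u_A$ is bounded. It follows from properties (i) and (iii) of $p_A$ that $\Phi_A$ is $\mathcal{F}_A$-measurable and $\alpha$-normalized. 3) Next we claim that $\Phi_A = -p_A u_\Lambda$ whenever $\emptyset \neq A \subset \Lambda \in \mathcal{S}$. As $\rho$ is a pre-modification, $u_\Lambda(\zeta) - u_\Lambda(\omega) = u_A(\zeta) - u_A(\omega)$ for all $\zeta, \omega \in \Omega$ with $\zeta_{S\setminus A} = \omega_{S\setminus A}$. Integrating over $\zeta$ with respect to $\alpha_A(\cdot|\omega)$ we obtain $\alpha_A u_\Lambda - u_\Lambda = \alpha_A u_A - u_A$. Thus for each $C \subset A$ we have
$\alpha_{S\setminus C}(u_\Lambda - u_A) = \alpha_{S\setminus C}\alpha_A(u_\Lambda - u_A) = \alpha_S(u_\Lambda - u_A).$
Since
$\sum_{C\subset A} (-1)^{|A\setminus C|} = 0,$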
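we conclude that $p_A u_\Lambda = p_A u_A$. 4) Finally we look at the partial sums $H_{\Lambda,\Delta}^\Phi$ which were defined in (2.13). For $\Lambda \subset \Delta$ we have
$H_{\Lambda,\Delta}^\Phi = \sum_{A\subset\Delta,\ A\cap\Lambda\neq\emptyset} \Phi_A = -\sum_{\emptyset\neq A\subset\Delta} p_A u_\Delta + \sum_{\emptyset\neq A\subset\Delta\setminus\Lambda} p_A u_\Delta = -\alpha_{S\setminus\Delta}u_\Delta + \alpha_{S\setminus(\Delta\setminus\Lambda)}u_\Delta = \alpha_{S\setminus\Delta}(\alpha_\Lambda u_\Delta - u_\Delta) = \alpha_{S\setminus\Delta}(\alpha_\Lambda u_\Lambda - u_\Lambda) = \alpha_{S\setminus\Delta}v_\Lambda,$
where $v_\Lambda = \alpha_\Lambda u_\Lambda - u_\Lambda$. The third equation follows from property (ii) of $p_A$, and the next to last equation was proved in Step 3) above. Suppose now that $\alpha = \delta_a$ for some $a \in E$, and let $a \in \Omega$ denote the constant configuration taking the value $a$. Then for each $\omega \in \Omega$ we have
$\alpha_{S\setminus\Delta}v_\Lambda(\omega) = \log\big[\rho_\Lambda(a_\Lambda\omega_{\Delta\setminus\Lambda}a_{S\setminus\Delta})/\rho_\Lambda(\omega_\Delta a_{S\setminus\Delta})\big],$
and the quasilocality of $\rho_\Lambda$ implies
$\lim_\Delta H_{\Lambda,\Delta}^\Phi(\omega) = \lim_\Delta \alpha_{S\setminus\Delta}v_\Lambda(\omega) = \log\big[\rho_\Lambda(a_\Lambda\omega_{S\setminus\Lambda})/\rho_\Lambda(\omega)\big] = v_\Lambda(\omega).$
If $\alpha$ is arbitrary but $u_\Lambda$ is bounded then the quasilocality of $\rho_\Lambda$ entails that of $u_\Lambda$ and $v_\Lambda$. Therefore
$\lim_\Delta \|H_{\Lambda,\Delta}^\Phi - v_\Lambda\| \le \lim_\Delta \int \alpha^{S\setminus\Delta}(d\zeta)\, \sup_\omega |v_\Lambda(\omega_\Delta\zeta_{S\setminus\Delta}) - v_\Lambda(\omega)| = 0.$
Thus in both cases $H_\Lambda^\Phi$ exists and satisfies
$H_\Lambda^\Phi = v_\Lambda = \alpha_\Lambda u_\Lambda - \log\rho_\Lambda.$
Equivalently, $h_\Lambda^\Phi = \rho_\Lambda\exp(-\alpha_\Lambda u_\Lambda)$. Integration yields $Z_\Lambda^\Phi = \lambda_\Lambda h_\Lambda^\Phi = \exp(-\alpha_\Lambda u_\Lambda)$ because $\alpha_\Lambda u_\Lambda$ is $\mathcal{T}_\Lambda$-measurable and $\lambda_\Lambda\rho_\Lambda = 1$. Hence $\rho_\Lambda^\Phi = \rho_\Lambda$. 5) The uniqueness of $\Phi$ will follow from Theorems (2.34) and (2.35) (a) below. ○ The preceding proof shows that Theorem (2.30) remains true when the hypothesis of quasilocality is replaced by the weaker condition that
$\lim_\Delta\ \sup_{\zeta\in\Omega} |\rho_\Lambda(\omega_\Delta\zeta_{S\setminus\Delta}) - \rho_\Lambda(\omega)| = 0$
for all $\omega \in \Omega$ and $\Lambda \in \mathcal{S}$. Let us look at two particular cases. (2.31) Corollary. Suppose $E$ is finite, $\lambda$ is counting measure, and $\gamma$ is a positive quasilocal specification. Then for each $\alpha \in \mathcal{P}(E,\mathcal{E})$ there is a unique $\alpha$-normalized $\lambda$-admissible potential $\Phi$ with $\gamma = \gamma^\Phi$. $\Phi$ is uniformly convergent, and $H_\Lambda^\Phi \in \bar{\mathcal{L}}$ for all $\Lambda \in \mathcal{S}$.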
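Proof. Let $\rho$ be the unique $\lambda$-modification with $\gamma = \rho\lambda$. By assumption, $\rho_\Lambda > 0$ for all $\Lambda \in \mathcal{S}$. From the paragraph below (1.31) we know that $\rho$ is a pre-modification, and Proposition (2.24) (c) asserts that $\rho$ is quasilocal. By Remark (2.21) (3), each $\rho_\Lambda$ is continuous with respect to the natural product topology on $\Omega$. Hence so is $\log\rho_\Lambda$. As $\Omega$ is compact, each of the functions $\log\rho_\Lambda$ is bounded. Thus the result follows from Theorem (2.30). ○ Next we ask for $\lambda$-modifications which are Gibbsian for a nearest-neighbour potential in the sense of (2.17). (2.32) Corollary. Let $\rho$ and $\alpha$ satisfy the conditions of Theorem (2.30) and let $\Phi$ be the associated $\alpha$-normalized potential. Suppose $(S,B)$ is a graph such that, for each $i \in S$, $u_{\{i\}} = \log\rho_{\{i\}}$ is measurable with respect to $\mathcal{F}_{B(i)}$, where
$B(i) = \{i\} \cup \{j \in S: \{i,j\} \in B\}.$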
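Then $\Phi$ is a nearest-neighbour potential. Proof. Suppose $A \in \mathcal{S}$ is not a complete subgraph of $(S,B)$. Then there are distinct sites $i,j \in A$ such that $j \notin B(i)$. By definition,
$\Phi_A = -\sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C}u_A,$
where $u_A = \log\rho_A$. We split the last sum into contributions corresponding to the intersection of $C$ and $A\setminus\{i,j\}$. Each $D \subset A\setminus\{i,j\}$ corresponds to the four subsets $D_1 = D$, $D_2 = D\cup\{i\}$, $D_3 = D\cup\{j\}$, $D_4 = D\cup\{i,j\}$ of $A$. Up to the sign $(-1)^{|A\setminus D|}$, the contribution of $D$ to the sum in question is thus equal to
$\alpha_{S\setminus D_1}u_A - \alpha_{S\setminus D_2}u_A - \alpha_{S\setminus D_3}u_A + \alpha_{S\setminus D_4}u_A = \alpha_{S\setminus D_2}(\alpha_{\{i\}}u_A - u_A) - \alpha_{S\setminus D_4}(\alpha_{\{i\}}u_A - u_A).$
As $\rho$ is a pre-modification,
$\alpha_{\{i\}}u_A - u_A = \alpha_{\{i\}}u_{\{i\}} - u_{\{i\}}.$
Combining this equation with the hypothesis that $u_{\{i\}}$ is $\mathcal{F}_{B(i)}$-measurable, we see that the contribution of $D$ is equal to
$\alpha_{B(i)\setminus D_1}u_{\{i\}} - \alpha_{B(i)\setminus D_2}u_{\{i\}} - \alpha_{B(i)\setminus D_3}u_{\{i\}} + \alpha_{B(i)\setminus D_4}u_{\{i\}}.$
This expression vanishes because $B(i)\setminus D_1 = B(i)\setminus D_3$ and $B(i)\setminus D_2 = B(i)\setminus D_4$. Thus $\Phi_A = 0$, and the proof is complete. ○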
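The construction in the proof of Theorem (2.30) can be traced on a toy system. The following sketch rests on assumptions that are not in the text: the lattice is the finite set `S = (0, 1, 2)`, the state space is `E = (0, 1)`, the density `rho` is a hypothetical positive probability density on $E^S$ (so only the case $\Lambda = S$ is looked at), and the two choices of `alpha` are illustrative. It builds $\Phi_A = -p_A\log\rho$, then checks the $\alpha$-normalization $\alpha_{\{i\}}\Phi_A = 0$ for $i \in A$ and the identity $e^{-H}/Z = \rho$.

```python
from itertools import combinations, product
from math import exp, log

S = (0, 1, 2)   # finite lattice standing in for the infinite one (assumption of this sketch)
E = (0, 1)
configs = list(product(E, repeat=len(S)))

# Hypothetical positive density rho on E^S with respect to counting measure.
raw = [1.0 + 0.5 * i + 0.25 * (i % 3) for i in range(len(configs))]
rho = {om: r / sum(raw) for om, r in zip(configs, raw)}
u = {om: log(rho[om]) for om in configs}

def subsets(A):
    A = tuple(A)
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def integrate_out(f, sites, alpha):
    """alpha_sites f: average the coordinates in `sites` under the product of alpha."""
    sites = tuple(sites)
    out = {}
    for om in configs:
        total = 0.0
        for zeta in product(E, repeat=len(sites)):
            new = list(om)
            weight = 1.0
            for i, z in zip(sites, zeta):
                new[i] = z
                weight *= alpha[z]
            total += weight * f[tuple(new)]
        out[om] = total
    return out

def p(A, f, alpha):
    """p_A f = sum over C subset of A of (-1)^{|A minus C|} alpha_{S minus C} f."""
    out = {om: 0.0 for om in configs}
    for C in subsets(A):
        g = integrate_out(f, set(S) - C, alpha)
        for om in configs:
            out[om] += (-1) ** len(A - C) * g[om]
    return out

def check(alpha):
    nonempty = [A for A in subsets(S) if A]
    Phi = {A: {om: -v for om, v in p(A, u, alpha).items()} for A in nonempty}
    # alpha-normalization: integrating out any single site of A kills Phi_A.
    normalised = all(abs(v) < 1e-12
                     for A in nonempty for i in A
                     for v in integrate_out(Phi[A], {i}, alpha).values())
    # Gibbs representation: exp(-H) / Z reproduces rho.
    H = {om: sum(Phi[A][om] for A in nonempty) for om in configs}
    Z = sum(exp(-H[om]) for om in configs)
    error = max(abs(exp(-H[om]) / Z - rho[om]) for om in configs)
    return normalised, error

result_mixed = check({0: 0.3, 1: 0.7})    # a generic alpha-normalized potential
result_vacuum = check({0: 1.0, 1: 0.0})   # gas potential with vacuum state a = 0
```

With the Dirac choice of `alpha` the normalization check says exactly that $\Phi_A(\omega) = 0$ as soon as $\omega_i = a$ for some $i \in A$, in line with Comment (2.29) (1).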
2.4
Equivalence of potentials
A given Gibbsian specification $\gamma$ can be defined in terms of many different potentials. For example, let $\Phi$ be a $\lambda$-admissible potential and $c = (c_A)_{A\in\mathcal{S}}$ a family of real numbers $c_A$ such that
$\sum_{A\ni i} |c_A| < \infty$
for all $i \in S$. Define $\Psi$ by $\Psi_A = \Phi_A + c_A$ ($A \in \mathcal{S}$). Then
$H_\Lambda^\Psi - H_\Lambda^\Phi = \sum_{A\cap\Lambda\neq\emptyset} c_A$
for all $\Lambda \in \mathcal{S}$, and the expression on the right is a finite constant. This shows that $\Psi$ is a $\lambda$-admissible potential, and $\rho^\Psi = \rho^\Phi$. To summarize this observation, we may state that a Gibbsian specification remains unaffected by the addition of constants to the potential. But more than this is true. Indeed, to conclude that $\rho^\Psi = \rho^\Phi$ we do not need that $H_\Lambda^\Psi - H_\Lambda^\Phi$ is constant. It is sufficient that $H_\Lambda^{\Psi-\Phi} = H_\Lambda^\Psi - H_\Lambda^\Phi$ be $\mathcal{T}_\Lambda$-measurable. This is because every $\mathcal{T}_\Lambda$-measurable factor of $h_\Lambda^\Phi$ reappears in the partition function $Z_\Lambda^\Phi$ and thus cancels in $\rho_\Lambda^\Phi$. This fact suggests the following definition. (2.33) Definition. Let $\Phi$ and $\Psi$ be two potentials. We say $\Phi$ and $\Psi$ are equivalent and write $\Phi \sim \Psi$ if, for all $\Lambda \in \mathcal{S}$, $H_\Lambda^{\Phi-\Psi}$ is $\mathcal{T}_\Lambda$-measurable. Clearly, "$\sim$" is an equivalence relation, and if $\Phi \sim \Psi$ then $\Phi$ is $\lambda$-admissible if and only if so is $\Psi$. The significance of "$\sim$" is explained by the next theorem. (2.34) Theorem. Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ and suppose $\Phi$ and $\Psi$ are $\lambda$-admissible potentials. Consider the following statements:
(i) $\Phi \sim \Psi$
(ii) $\rho^\Phi = \rho^\Psi$
(iii) $\gamma^\Phi = \gamma^\Psi$
(iv) $\mathcal{G}(\Phi) \cap \mathcal{G}(\Psi) \neq \emptyset$.
Then the implications (i) $\Leftrightarrow$ (ii) $\Rightarrow$ (iii) hold. Suppose further there is a second countable topology on $E$ such that $\mathcal{E}$ is the Borel $\sigma$-algebra, $\lambda$ is everywhere dense, and the functions
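(ii) $\Rightarrow$ (i). For each $\Lambda \in \mathcal{S}$ we have $\rho_\Lambda^\Phi = \rho_\Lambda^\Psi$ and therefore
$H_\Lambda^{\Phi-\Psi} = \log(h_\Lambda^\Psi/h_\Lambda^\Phi) = \log(Z_\Lambda^\Psi/Z_\Lambda^\Phi).$
$Z_\Lambda^\Phi$ and $Z_\Lambda^\Psi$ are $\mathcal{T}_\Lambda$-measurable. Thus $\Phi \sim \Psi$. (ii) $\Rightarrow$ (iii). Obvious. (iii) $\Rightarrow$ (ii). Let $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$ be given. The equation $\gamma_\Lambda^\Phi(\cdot|\omega) = \gamma_\Lambda^\Psi(\cdot|\omega)$ is equivalent to $\lambda^\Lambda(A_\Lambda) = 0$, where
$A_\Lambda = \{\zeta \in E^\Lambda: \rho_\Lambda^\Phi(\zeta\omega_{S\setminus\Lambda})/\rho_\Lambda^\Psi(\zeta\omega_{S\setminus\Lambda}) \neq 1\}.$
Our topological assumptions imply that $\lambda^\Lambda$ is everywhere dense and $A_\Lambda$ is open. Hence $A_\Lambda = \emptyset$ and thus $\rho_\Lambda^\Phi(\omega) = \rho_\Lambda^\Psi(\omega)$. (iii) $\Rightarrow$ (iv). Under the hypothesis $\mathcal{G}(\Phi) \cup \mathcal{G}(\Psi) \neq \emptyset$, this implication is trivial. (iv) $\Rightarrow$ (i). By assumption, there exists some $\mu \in \mathcal{G}$(
for $\lambda^\Lambda \times \sigma_{S\setminus\Lambda}(\mu)$-almost all $(\zeta,\omega) \in E^\Lambda \times E^{S\setminus\Lambda}$. The expression on the right is independent of $\zeta$. Hence for $\lambda^\Lambda \times \lambda^\Lambda \times$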
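(2.35) Theorem. (a) Suppose $\Phi - \Psi$ is normalized by some $\alpha \in \mathcal{P}(E,$
$\alpha_{S\setminus\Lambda}H_\Lambda^\Phi = \sum_{A\cap\Lambda\neq\emptyset} \alpha_{S\setminus\Lambda}\Phi_A = \sum_{\emptyset\neq A\subset\Lambda} \Phi_A.$
Indeed, the interchange of integration and summation is trivial when $\alpha$ is a Dirac measure, and in the alternate case it is justified by the uniform convergence of $\Phi$. The second equation in (2.36) can be verified as follows. If $A \subset \Lambda$ then $\alpha_{S\setminus\Lambda}\Phi_A = \Phi_A$ because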
*A=
I
(-D
W
\A4
Moreover, since <1) ~ 0 and 3> is a-normalized we have «SNAA*
= «*(«£) =
I
s a
(
ADAÏ0
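Hence $\Phi = 0 = \Psi$. (b) Let $\Psi$ be given by
$\Psi_A = -\sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C}\log\rho_A^\Phi.$
Theorem (2.30) then states that $\Psi$ is $\alpha$-normalized and $\rho^\Psi = \rho^\Phi$, and Theorem (2.34) yields $\Psi \sim \Phi$. Let us derive a simpler expression for $\Psi$. As the partition functions $Z_A^\Phi$ are $\mathcal{T}_A$-measurable, we have
$\Psi_A = \sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C}H_A^\Phi = \sum_{B\cap A\neq\emptyset}\ \sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C}\Phi_B.$
If there is some $i \in A\setminus B$ then the inner sum equals
$\sum_{C\subset A\setminus\{i\}} (-1)^{|A\setminus C|}\, (\alpha_{S\setminus C}\Phi_B - \alpha_{S\setminus(C\cup\{i\})}\Phi_B) = 0$
because $\Phi_B$ is $\mathcal{F}_B$-measurable. Hence $\Psi$ is given by
(2.37) $\Psi_A = \sum_{B\supset A}\ \sum_{C\subset A} (-1)^{|A\setminus C|}\, \alpha_{S\setminus C}\Phi_B.$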
The reader is invited to check directly that $\Psi$, as defined by (2.37), has the required properties. ○ We emphasize the following consequence of formula (2.37) for the unique $\alpha$-normalized potential $\Psi$ with $\Psi \sim \Phi$: $\Psi$ is supported on the subsets of the sets in the support of $\Phi$, i.e., $\Psi_A$ is identically zero unless there exists some $B \supset A$ for which $\Phi_B$ is not constant. In particular, if $\Phi$ is a pair potential or a nearest-neighbour potential then so is $\Psi$. We conclude this chapter with an example concerning the equivalence of magnetic models and lattice gas models. (2.38) Example. Let $E = \{-1,+1\}$ and $\lambda$ be equidistribution. We think of $E$ as the state space of a magnetic spin having only two possible orientations, up or down. For each $A \in \mathcal{S}$, the product
$\sigma_A = \prod_{i\in A} \sigma_i$
indicates the parity of down spins in $A$. Therefore it is reasonable to look at potentials $\Phi$ of the form
$\Phi_A = -J(A)\,\sigma_A \quad (A \in \mathcal{S}),$
where $J$ is a real function on $\mathcal{S}$. (The minus sign is due to convention.) Such potentials are called spin potentials. It can be immediately checked that each spin potential is normalized by $\lambda$. By the way, there is also a purely mathematical reason for looking at spin potentials: The functions $\sigma_A$ are precisely the characters of $\Omega = E^S$, considered as a compact group with coordinatewise multiplication. Thus for each spin potential $\Phi$ the $J(A)$'s are the Fourier coefficients of the associated Hamiltonians $H_\Lambda^\Phi$. Now let us switch to a lattice gas interpretation of $E$. A site $i$ is thought of as being occupied by a particle if and only if $\sigma_i = 1$. This suggests considering potentials $\Psi$ of the form
$\Psi_A = K(A)$ if $\sigma_i = 1$ for all $i \in A$, and $\Psi_A = 0$ otherwise;
here $K$ is any real function on $\mathcal{S}$. Clearly, such a $\Psi$ is a gas potential with vacuum state $-1$. For which choice of $\Psi$ is the lattice gas model equivalent to a spin model with interaction $\Phi$? Suppose $\Phi \in \mathcal{B}$, i.e.,
$\sum_{A\ni i} |J(A)| < \infty$ for all $i \in S$.
Theorem (2.35) and equation (2.37) assert that there is a unique gas potential
$\Psi \sim \Phi$ which is given by
(2.39) $K(A) = -\sum_{B\supset A}\ \sum_{C\subset A} (-1)^{|A\setminus C|}\, J(B)\, (-1)^{|B\setminus C|} = -2^{|A|} \sum_{B\supset A} (-1)^{|B\setminus A|}\, J(B).$
$\Psi$ is uniformly convergent but does not necessarily belong to $\mathcal{B}$. For we have the estimate
$\|\Psi\|_i = \sum_{A\ni i} |K(A)| \le \sum_{A\ni i} 2^{|A|} \sum_{B\supset A} |J(B)| = 2\sum_{B\ni i} |J(B)|\, 3^{|B|-1}$
with equality when $(-1)^{|B|}J(B)$ has constant sign. Conversely, let us start from a gas potential $\Psi \in \mathcal{B}$. Then (2.37) provides us with a unique equivalent spin potential $\Phi$ which is given by
(2.40) $-J(A) = \sum_{B\supset A}\ \sum_{C\subset A} (-1)^{|A\setminus C|}\, 2^{-|B\setminus C|}\, K(B) = \sum_{B\supset A} K(B)\, 2^{-|B|} \sum_{C\subset A} (-1)^{|A\setminus C|}\, 2^{|C|} = \sum_{B\supset A} K(B)\, 2^{-|B|}.$
$\Phi$ satisfies
$\|\Phi\|_i \le \sum_{A\ni i}\ \sum_{B\supset A} |K(B)|\, 2^{-|B|} = \tfrac{1}{2}\|\Psi\|_i$
for each $i \in S$, hence $\Phi \in \mathcal{B}$. For a better understanding of formula (2.37) it might be helpful to check directly that the potentials $\Phi$ and $\Psi$ which are related by equations (2.39) and (2.40) are indeed equivalent. This is most easily done by expressing the function $\prod_{i\in A} (\sigma_i + 1)/2$ in terms of the $\sigma_B$'s, or vice versa. ○
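The direct check suggested at the end of Example (2.38) can be carried out mechanically on a small system. The sketch below makes assumptions that are not in the text: the lattice is the finite set `S = (0, 1, 2)` and the couplings `J` are hypothetical numbers. On a finite lattice with $\Lambda = S$, equivalence of the two potentials simply means that the two Hamiltonians differ by a constant. The code computes $K$ from $J$ by (2.39), recovers $J$ from $K$ by (2.40), and measures how far the difference of the Hamiltonians is from being constant.

```python
from itertools import combinations, product

S = (0, 1, 2)  # finite lattice (an assumption of this sketch)

def nonempty_subsets(A):
    A = tuple(A)
    return [frozenset(c) for r in range(1, len(A) + 1) for c in combinations(A, r)]

sets = nonempty_subsets(S)
# Hypothetical spin couplings J(A); any real values would do.
J = {A: 0.3 * len(A) - 0.1 * min(A) for A in sets}

# (2.39): K(A) = -2^{|A|} * sum over B containing A of (-1)^{|B minus A|} * J(B)
K = {A: -(2 ** len(A)) * sum((-1) ** len(B - A) * J[B] for B in sets if A <= B)
     for A in sets}

# (2.40): -J(A) = sum over B containing A of K(B) * 2^{-|B|}
J_back = {A: -sum(K[B] / 2 ** len(B) for B in sets if A <= B) for A in sets}

def H_spin(sigma):
    """Sum over A of Phi_A = -J(A) * product of sigma_i over i in A."""
    total = 0.0
    for A in sets:
        prod_A = 1
        for i in A:
            prod_A *= sigma[i]
        total -= J[A] * prod_A
    return total

def H_gas(sigma):
    """Sum over A of Psi_A = K(A) if every site of A is occupied (sigma_i = +1), else 0."""
    return sum(K[A] for A in sets if all(sigma[i] == 1 for i in A))

configs = list(product((-1, 1), repeat=len(S)))
differences = [H_spin(s) - H_gas(s) for s in configs]
spread = max(differences) - min(differences)          # zero iff the difference is constant
roundtrip_error = max(abs(J[A] - J_back[A]) for A in sets)
```

The constant by which the two Hamiltonians differ is the contribution of the empty set in the expansion of $\sigma_B$ in terms of the occupation variables $(\sigma_i+1)/2$.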
Chapter 3 Finite state Markov chains as Gibbs measures
This chapter should be viewed as an intermezzo. Its contents will not be needed for an understanding of later chapters, although it will occasionally reappear in some remarks and examples. There are, however, two reasons for inserting this chapter at this early stage. First, a reader who is familiar with Markov chains will ask if the usual description of Markov chains by one-sided conditional probabilities is related or even equivalent to a description by means of Gibbsian conditional probabilities. In the case of a finite state space E, this question has a positive answer which will be given in Section 3.1: There is a one-to-one correspondence between the set of all positive transition matrices P and a suitable class of nearest-neighbour potentials
3.1
Markov specifications on the integers
Throughout this chapter we choose the integers for the parameter set, i.e., we put $S = \mathbb{Z}$. We also fix a finite non-empty state space $E$. Of course, we let $\mathcal{E}$ be the power set of $E$, and the a priori measure $\lambda$ on $E$ is counting measure. (3.1) Definition. Let $\gamma$ be a specification with parameter set $\mathbb{Z}$ and state space $E$. We say $\gamma$ is a positive homogeneous Markov specification if there is a function $g(\cdot,\cdot,\cdot) > 0$ on $E^3$ such that
$\gamma_{\{i\}}(\sigma_i = y\,|\,\omega) = g(\omega_{i-1}, y, \omega_{i+1})$
for all $i \in \mathbb{Z}$, $y \in E$, and $\omega \in \Omega$. (3.2) Comment. Formally, the preceding definition only refers to the singleton part of $\gamma$. But because of the consistency of $\gamma$ each condition on its singleton part has consequences for all $\gamma_\Lambda$'s. In fact, Theorem (1.33) implies that each positive homogeneous Markov specification is uniquely determined by the corresponding function $g$. Therefore $g$ will be called the determining function of $\gamma$. ○ We are going to show that each positive homogeneous Markov specification $\gamma$ admits a unique Gibbs measure $\mu$. In fact, $\mu$ is a Markov chain with a positive transition matrix which can be computed explicitly in terms of $\gamma$. To state the result we introduce some notation. Let $P = (P(x,y))_{x,y\in E}$ be a stochastic matrix with non-zero entries. Then we let $\mu_P \in \mathcal{P}(\Omega,\mathcal{F})$ denote (the distribution of) the unique stationary Markov chain with transition matrix $P$. $\mu_P$ is uniquely determined by the condition
(3.3) $\mu_P(\sigma_i = x_0, \sigma_{i+1} = x_1, \dots, \sigma_{i+n} = x_n) = \alpha_P(x_0)P(x_0,x_1)\cdots P(x_{n-1},x_n).$
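Here $i \in \mathbb{Z}$, $n \in \mathbb{Z}_+$, $x_0, \dots, x_n \in E$, and $\alpha_P \in\ ]0,1[^E$ is the unique probability (row) vector satisfying the equation $\alpha_P P = \alpha_P$. (A proof of existence and uniqueness of $\alpha_P$ will be given in Appendix 3.A.) For each $\Lambda \subset \mathbb{Z}$ we let
(3.4) $\partial\Lambda = \{i \in \mathbb{Z}\setminus\Lambda: |i - j| = 1 \text{ for some } j \in \Lambda\}$
denote the boundary of $\Lambda$. (3.5) Theorem. The relation $\mu_P \in \mathcal{G}(\gamma)$ establishes a one-to-one correspondence $\gamma \leftrightarrow P$ between the set of all positive homogeneous Markov specifications and the set of all stochastic matrices on $E$ with non-vanishing entries. For given $P$ the corresponding $\gamma$ is determined by the equation
(3.6) $\gamma_\Lambda(\sigma_\Lambda = \zeta\,|\,\omega) = \mu_P(\sigma_\Lambda = \zeta\,|\,\sigma_{\partial\Lambda} = \omega_{\partial\Lambda}) \quad (\Lambda \in \mathcal{S},\ \omega \in \Omega,\ \zeta \in E^\Lambda).$
Conversely, $P$ can be expressed in terms of the determining function $g$ of $\gamma$ as
(3.7) $P(x,y) = Q(x,y)\,r(y)/q\,r(x) \quad (x,y \in E).$
Here $Q(x,y) = g(a,x,y)/g(a,a,y)$ for some arbitrarily fixed $a \in E$, $q$ is the largest positive eigenvalue of $Q = (Q(x,y))_{x,y\in E}$, and $r \in\ ]0,\infty[^E$ a corresponding right eigenvector.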
Before entering into the proof we give some comments and draw a conclusion. (3.8) Comments. (1) The expression on the right of (3.6) can be expressed explicitly in terms of $P$ as follows. Each $\Lambda \in \mathcal{S}$ is of the form
$\Lambda = \bigcup_{k=1}^{n} \{i_k + 1, \dots, i_k + n_k\}$
for some $n \ge 1$, where $i_k \in \mathbb{Z}$ and $n_k \in \mathbb{N}$ are such that the sets $\{i_k, \dots, i_k + n_k\}$ are pairwise disjoint. For such a $\Lambda$ (3.3) gives
$\mu_P(\sigma_\Lambda = \zeta\,|\,\sigma_{\partial\Lambda} = \omega_{\partial\Lambda}) = \prod_{k=1}^{n} \frac{P(\omega_{i_k},\zeta_{i_k+1})\,P(\zeta_{i_k+1},\zeta_{i_k+2})\cdots P(\zeta_{i_k+n_k},\omega_{i_k+n_k+1})}{P^{n_k+1}(\omega_{i_k},\omega_{i_k+n_k+1})}.$
Here $P^m$ is the $m$'th matrix power of $P$. (2) Equation (3.7) establishes a well-defined mapping from the set of all positive functions $g$ on $E^3$ to the set of all positive stochastic matrices on $E$. Indeed, let $a \in E$ be fixed and $g: E^3 \to\ ]0,\infty[$ be given. Then
$Q = (g(a,x,y)/g(a,a,y))_{x,y\in E}$
is a positive matrix. Therefore a well-known theorem of Perron and Frobenius (cf. Appendix 3.A) guarantees that $Q$ has a unique eigenvalue $q > 0$ with the following properties: (i) all other eigenvalues have modulus strictly less than $q$, (ii) there exists a strictly positive right eigenvector $r$ corresponding to $q$, and (iii) each right eigenvector corresponding to $q$ is a multiple of $r$. Consequently, the matrix
$P = (Q(x,y)\,r(y)/q\,r(x))_{x,y\in E}$
is uniquely determined by $Q$ and therefore by $g$. Clearly, $P$ is stochastic. ○
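The passage from $P$ to the determining function $g$ and back via (3.7) can be followed numerically. The sketch below uses assumptions that are not in the text: the state space is $\{0,1,2\}$, the matrix `P` is a hypothetical positive stochastic matrix, and the Perron-Frobenius eigenvalue is approximated by plain power iteration rather than by any method from Appendix 3.A. The determining function is taken to be $g(x,y,z) = P(x,y)P(y,z)/P^2(x,z)$, the single-site conditional probability of a Markov chain given its two neighbours.

```python
# Hypothetical positive stochastic matrix on E = {0, 1, 2} (not from the text).
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]
n = len(P)
a = 0  # arbitrarily fixed reference state

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

P2 = matmul(P, P)

def g(x, y, z):
    """Conditional probability of state y at a site, given neighbours x (left) and z (right)."""
    return P[x][y] * P[y][z] / P2[x][z]

# Q(x, y) = g(a, x, y) / g(a, a, y), as in (3.7).
Q = [[g(a, x, y) / g(a, a, y) for y in range(n)] for x in range(n)]

def perron(M, steps=500):
    """Power iteration: largest eigenvalue and a positive right eigenvector of a positive matrix."""
    r = [1.0 / n] * n
    q = 1.0
    for _ in range(steps):
        Mr = [sum(M[i][j] * r[j] for j in range(n)) for i in range(n)]
        q = sum(Mr)              # eigenvalue estimate, because r sums to 1
        r = [v / q for v in Mr]  # renormalise so that r sums to 1 again
    return q, r

q, r = perron(Q)
P_recovered = [[Q[x][y] * r[y] / (q * r[x]) for y in range(n)] for x in range(n)]
max_error = max(abs(P[x][y] - P_recovered[x][y]) for x in range(n) for y in range(n))
```

For this choice of $g$ one finds $q = 1/P(a,a)$ and $r$ proportional to $P(a,\cdot)$, so the recovered matrix agrees with the original `P` up to rounding.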
As a by-product of Theorem (3.5) we obtain the following characterization of the positive homogeneous Markov specifications. A nearest-neighbour potential $\Phi$ is called homogeneous if there are two functions $\varphi_1: E \to \mathbb{R}$ and $\varphi_2: E \times E \to \mathbb{R}$ such that
$\Phi_A = \varphi_1(\sigma_i)$ if $A = \{i\}$, and $\Phi_A = \varphi_2(\sigma_i,\sigma_{i+1})$ if $A = \{i, i+1\}$.
(3.9) Corollary. A specification $\gamma$ is a positive homogeneous Markov specification if and only if $\gamma$ is Gibbsian for some homogeneous nearest-neighbour potential $\Phi$. Proof. Suppose $\gamma$ is a positive homogeneous Markov specification. By Theorem (3.5), there exists a positive stochastic matrix $P$ such that $\gamma$ can be written
in the form (3.6). In view of Comment (3.8) (1) this means that $\gamma$ is Gibbsian for the homogeneous nearest-neighbour potential
$\Phi_A = -\log P(\sigma_i,\sigma_{i+1})$ if $A = \{i, i+1\}$, and $\Phi_A = 0$ otherwise.
The converse is obvious. ○ Combining Corollary (3.9) with Theorems (3.5) and (2.35) (b) we obtain the following result: Given any $\alpha \in \mathcal{P}(E,\mathcal{E})$, the relation
$\mu_P \in \mathcal{G}(\Phi)$
establishes a bijection $\Phi \leftrightarrow P$ between the set of all $\alpha$-normalized homogeneous nearest-neighbour potentials and the set of all positive stochastic matrices. In the particular case when $E = \{-1,1\}$ and $\alpha$ is equidistribution, this bijection will be studied in detail in Section 3.2. We now turn to the proof of Theorem (3.5). Proof of Theorem (3.5). For the sake of brevity we let $\Pi$ denote the set of all stochastic matrices on $E$ with non-vanishing entries and $\Gamma$ the set of all positive homogeneous Markov specifications. 1) If $P \in \Pi$ and $\gamma$ is defined by (3.6) then $\gamma \in \Gamma$ and $\mu_P \in \mathcal{G}(\gamma)$. To show this we fix some $\Lambda \in \mathcal{S}$. It is easily checked that
$\mu_P(\sigma_\Lambda = \zeta\,|\,\sigma_{\Delta\setminus\Lambda} = \omega_{\Delta\setminus\Lambda}) = \mu_P(\sigma_\Lambda = \zeta\,|\,\sigma_{\partial\Lambda} = \omega_{\partial\Lambda})$
for all $\zeta \in E^\Lambda$, $\omega \in \Omega$, and all $\Delta \in \mathcal{S}$ with $\Delta \supset \Lambda \cup \partial\Lambda$. Thus we have
$\mu_P(\{\sigma_\Lambda = \zeta\} \cap A) = \int_A d\mu_P\, \gamma_\Lambda(\sigma_\Lambda = \zeta\,|\,\cdot)$
for all $\zeta \in E^\Lambda$ and all cylinder events $A$ in $\mathcal{T}_\Lambda$ and therefore for all $A \in \mathcal{T}_\Lambda$. This shows that $\mu_P = \mu_P\gamma_\Lambda$. Hence $\mu_P \in \mathcal{G}(\gamma)$. To show that $\gamma$ is a specification we only need to prove that
(3.10) $\gamma_\Delta\gamma_\Lambda(\sigma_\Lambda = \zeta\,|\,\cdot) = \gamma_\Delta(\sigma_\Lambda = \zeta\,|\,\cdot)$
whenever $\emptyset \neq \Lambda \subset \Delta \in \mathcal{S}$ and $\zeta \in E^\Lambda$. ((3.10) implies the consistency equation $\gamma_\Delta\gamma_\Lambda = \gamma_\Delta$ because $\gamma_\Lambda$ and $\gamma_\Delta$ are proper.) As $\mu_P \in \mathcal{G}(\gamma)$, the argument preceding definition (1.23) shows that (3.10) holds almost surely with respect to $\mu_P$. But both sides of (3.10) are $\mathcal{F}_{\partial\Delta}$-measurable and $\mu_P$ charges each atom of $\mathcal{F}_{\partial\Delta}$. Hence (3.10) holds everywhere. Finally, formula (3.6) immediately shows that $\gamma$ has the determining function
(3.11) $g(x,y,z) = P(x,y)P(y,z)/P^2(x,z).$
Hence $\gamma \in \Gamma$, and we have shown that equation (3.6) defines a mapping $b: \Pi \to \Gamma$ such that $\mu_P \in \mathcal{G}(b(P))$ for all $P \in \Pi$. 2) Next we show: If $P \in \Pi$ and $\gamma = b(P)$ then $P$ can be recovered from $\gamma$ by means of formula (3.7). In particular, $b$ is injective. Let $a \in E$ be fixed. By
definition, the determining function $g$ of $\gamma$ is given by (3.11). Thus for all $x, y \in E$ we have from the definition of $Q$
\[
Q(x,y) = \frac{g(a,x,y)}{g(a,a,y)} = \frac{P(a,x)P(x,y)}{P(a,a)P(a,y)}.
\]
Putting $q = 1/P(a,a)$ and $r(x) = P(a,x)$ this equation can be rewritten as $Q(x,y)r(y) = q\,r(x)P(x,y)$. Summing over $y$ we see that $q$ is an eigenvalue of $Q$ and $r$ a corresponding right eigenvector. Moreover, if $q' \in \mathbb{C}\setminus\{q\}$ is any other eigenvalue of $Q$ with eigenvector $r' \in \mathbb{C}^E$ then $q'/q$ is an eigenvalue of $P$ with eigenvector $(r'(x)/r(x))_{x\in E}$. As $P$ is stochastic we may conclude that $|q'/q| < 1$ (see Appendix 3.A). Hence $q$ is the Perron-Frobenius eigenvalue of $Q$. This shows that $P$ is obtained from $Q$ by (3.7).

3) Let $g: E^3 \to\ ]0,\infty[$ be the determining function of some $\gamma \in \Gamma$, and let $P \in \Pi$ be defined in terms of $g$ and some $a \in E$ by means of (3.7). Then $g$ satisfies (3.11) if and only if the equation
\[
\frac{g(x,y,z)\,g(a,x,a)}{g(x,a,z)\,g(a,a,a)} = \frac{g(a,x,y)\,g(a,y,z)}{g(a,a,y)\,g(a,a,z)} \tag{3.12}
\]
holds for all $x, y, z \in E$. Indeed, (3.11) is equivalent to the statement
\[
\frac{g(x,y,z)}{g(x,a,z)} = \frac{P(x,y)P(y,z)}{P(x,a)P(a,z)}
\]
for all $x, y, z \in E$. According to the definition of $P$, the expression on the right equals
\[
\frac{Q(x,y)Q(y,z)}{Q(x,a)Q(a,z)} = \frac{g(a,x,y)\,g(a,y,z)}{g(a,a,y)\,g(a,a,z)} \Big/ \frac{g(a,x,a)\,g(a,a,z)}{g(a,a,a)\,g(a,a,z)}.
\]
Thus (3.11) is equivalent to (3.12).

4) The injection $\delta: \Pi \to \Gamma$ is surjective. For let $\gamma \in \Gamma$ have determining function $g$, and let $P \in \Pi$ be defined in terms of $g$ and some $a \in E$ via (3.7). We will prove that $\gamma = \delta(P)$. Let $\tilde\gamma = \delta(P)$. Then Theorem (1.33) asserts that $\gamma = \tilde\gamma$ as soon as $\gamma_{\{i\}} = \tilde\gamma_{\{i\}}$ for all $i \in S$. But the latter statement is equivalent to equation (3.11) and therefore, by stage 3) above, to (3.12). We are thus left with proving (3.12). To this end we put $\Lambda = \{1,2\}$. We also fix any $z \in E$ and write
\[
[xy] = \gamma_\Lambda(\sigma_1 = x, \sigma_2 = y \mid \omega) \qquad (x, y \in E),
\]
where $\omega \in \Omega$ is such that $\omega_0 = a$ and $\omega_3 = z$. The consistency equations $\gamma_\Lambda = \gamma_\Lambda\gamma_{\{1\}} = \gamma_\Lambda\gamma_{\{2\}}$ then take the form
\[
[xy] = g(a,x,y) \sum_{u\in E} [uy] = g(x,y,z) \sum_{v\in E} [xv] \qquad (x, y \in E).
\]
Hence
\[
g(a,x,y)/g(a,a,y) = [xy]/[ay]
\]
and
\[
g(x,y,z)/g(x,a,z) = [xy]/[xa]
\]
for all $x, y \in E$. Consequently, the expressions on the left and on the right of (3.12) both coincide with the ratio $[xy]/[aa]$. This proves (3.12). Hence
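$\gamma = \delta(P)$.

5) It remains to show that $\mathcal{G}(\gamma) \subset \{\mu_P\}$ when $P \in \Pi$ and $\gamma = \delta(P)$. This is an immediate consequence of a uniqueness theorem in Chapter 8, but there is also a direct proof using the ergodic theorem for Markov chains. (One of the simple proofs of the ergodic theorem is included in Appendix 3.A.) Let $i \in \mathbb{Z}$, $n \ge 1$, $\Lambda = \{i, i+1, \ldots, i+n\}$, $k \ge 1$, and $\Delta = \Delta(k) = \{i-k+1, \ldots, i+n+k-1\}$. For each $\zeta \in E^\Lambda$ and $\omega \in \Omega$ we have
\[
\gamma_\Delta(\sigma_\Lambda = \zeta \mid \omega) = \mu_P(\sigma_\Lambda = \zeta \mid \sigma_{\partial\Delta} = \omega_{\partial\Delta})
= \frac{P^k(\omega_{i-k},\zeta_i)\,P(\zeta_i,\zeta_{i+1}) \cdots P(\zeta_{i+n-1},\zeta_{i+n})\,P^k(\zeta_{i+n},\omega_{i+n+k})}{P^{n+2k}(\omega_{i-k},\omega_{i+n+k})}.
\]
The ergodic theorem for Markov chains states that $\lim_{k\to\infty} P^k(x,\cdot) = \alpha_P$ independently of $x \in E$. Hence
\[
\lim_{k\to\infty} \gamma_{\Delta(k)}(\sigma_\Lambda = \zeta \mid \omega) = \alpha_P(\zeta_i)\,P(\zeta_i,\zeta_{i+1}) \cdots P(\zeta_{i+n-1},\zeta_{i+n}) = \mu_P(\sigma_\Lambda = \zeta).
\]
Therefore, if $\mu \in \mathcal{G}(\gamma)$ then the dominated convergence theorem gives
\[
\mu(\sigma_\Lambda = \zeta) = \lim_{k\to\infty} \int \mu(d\omega)\, \gamma_{\Delta(k)}(\sigma_\Lambda = \zeta \mid \omega) = \mu_P(\sigma_\Lambda = \zeta).
\]
Thus $\mu = \mu_P$. The proof is thus complete. □

To conclude this section, we note that Theorem (3.5) admits a straightforward modification which covers the case when $S = \mathbb{N}$ and $\mu_P$ is replaced by the Markov chain with transition matrix $P$ which starts at time 0 at a fixed initial point $a \in E$. The details are left to the reader.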
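Stage 2) of the proof can be checked numerically. The following is a minimal sketch under our own assumptions: the states of $E$ are labelled $0, 1, 2$, the matrix `P` is an arbitrary illustrative choice, and the helper names (`determining_function`, `pf_pair`, `recover_P`, `max_defect_3_12`) are hypothetical. It computes $g$ from (3.11), forms $Q(x,y) = g(a,x,y)/g(a,a,y)$, obtains the PF-eigenvalue $q$ and a positive eigenvector $r$ by power iteration, and recovers $P$ from the relation $Q(x,y)r(y) = q\,r(x)P(x,y)$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def determining_function(P):
    """g(x,y,z) = P(x,y) P(y,z) / P^2(x,z), cf. (3.11)."""
    n = len(P)
    P2 = mat_mul(P, P)
    return [[[P[x][y] * P[y][z] / P2[x][z] for z in range(n)]
             for y in range(n)] for x in range(n)]

def pf_pair(Q, iters=2000):
    """Power iteration for the PF-eigenvalue q and a positive right eigenvector r."""
    n = len(Q)
    r = [1.0 / n] * n
    q = 1.0
    for _ in range(iters):
        s = [sum(Q[x][y] * r[y] for y in range(n)) for x in range(n)]
        q = sum(s)              # r sums to 1, so sum(Qr) approximates q
        r = [v / q for v in s]
    return q, r

def recover_P(g, a=0):
    """Recover P from g via Q(x,y) r(y) = q r(x) P(x,y)."""
    n = len(g)
    Q = [[g[a][x][y] / g[a][a][y] for y in range(n)] for x in range(n)]
    q, r = pf_pair(Q)
    P = [[Q[x][y] * r[y] / (q * r[x]) for y in range(n)] for x in range(n)]
    return P, q

def max_defect_3_12(g, a=0):
    """Largest absolute difference between the two sides of (3.12)."""
    n = len(g)
    worst = 0.0
    for x in range(n):
        for y in range(n):
            for z in range(n):
                lhs = g[x][y][z] * g[a][x][a] / (g[x][a][z] * g[a][a][a])
                rhs = g[a][x][y] * g[a][y][z] / (g[a][a][y] * g[a][a][z])
                worst = max(worst, abs(lhs - rhs))
    return worst

# Illustrative positive stochastic matrix (an assumption, not from the text).
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.25, 0.25, 0.5]]
g = determining_function(P)
P_rec, q = recover_P(g, a=0)
```

In this sketch `q` should agree with $1/P(a,a)$ and `P_rec` with `P` up to rounding, in line with the injectivity of $\delta$.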
3.2
The one-dimensional Ising model
The Ising chain is a simplified model of a ferromagnetic or antiferromagnetic substance. It was suggested by W. Lenz and first investigated by E. Ising in his thesis in 1924. The model is characterized by the following assumptions:
(i) The substance consists of spins with two possible orientations $+1$ and $-1$ ("up" and "down").
(ii) The spins form an infinite linear chain, i.e., they are located at the sites of $\mathbb{Z}$.
(iii) The interaction energy of two spins $\sigma_i$ and $\sigma_j$ with $|i-j| = 1$ is $-J\sigma_i\sigma_j$, where $J \in \mathbb{R}$ is a coupling constant. ($1/|J|$ is proportional to the absolute temperature. In the case $J > 0$ any two adjacent spins have minimal energy if and only if they are aligned in that they have the same sign. This means that the interaction is ferromagnetic. On the other hand, if $J < 0$ then any two adjacent spins prefer to point in opposite directions. Thus in this case we have a model of an antiferromagnet.)
(iv) There is no interaction between non-adjacent spins.
(v) There is a constant $h \in \mathbb{R}$ describing the action of an external field (directed upwards when $h > 0$).

We read these assumptions as an invitation to investigate the set $\mathcal{G}(\Phi^{J,h})$ of Gibbs measures for a particular potential $\Phi^{J,h}$. We have to choose $E = \{-1,1\}$, $S = \mathbb{Z}$, and
\[
\Phi^{J,h}_A = \begin{cases} -J\sigma_i\sigma_{i+1} & \text{if } A = \{i,i+1\},\\ -h\sigma_i & \text{if } A = \{i\},\\ 0 & \text{otherwise.} \end{cases} \tag{3.13}
\]
$\Phi^{J,h}$ is called the Ising potential with interaction $J$ and external field $h$. (Thus the Ising potentials are just the homogeneous nearest-neighbour spin potentials.) We look for an explicit description of $\mathcal{G}(\Phi^{J,h})$. Clearly, the Gibbsian specification $\gamma^{J,h}$ for $\Phi^{J,h}$ is a positive homogeneous Markov specification with determining function
\[
g(x,y,z) = e^{y(h+Jx+Jz)}/2\cosh(h+Jx+Jz), \tag{3.14}
\]
$x, y, z \in \{-1,1\}$. Theorem (3.5) thus shows that
\[
\mathcal{G}(\Phi^{J,h}) = \{\mu_{J,h}\}, \tag{3.15}
\]
where $\mu_{J,h}$ is the distribution of the stationary Markov chain with transition matrix $P_{J,h}$ which is defined in terms of (3.14) via (3.7). Let us compute $P_{J,h}$. We put $a = 1$ and write $+$ and $-$ instead of $+1$ and $-1$. Then
\[
Q = \begin{pmatrix} g(+,-,-)/g(+,+,-) & g(+,-,+)/g(+,+,+) \\ g(+,+,-)/g(+,+,-) & g(+,+,+)/g(+,+,+) \end{pmatrix}
= \begin{pmatrix} e^{-2h} & e^{-2h-4J} \\ 1 & 1 \end{pmatrix}.
\]
The largest solution of the characteristic equation
\[
(e^{-2h} - q)(1 - q) - e^{-2h-4J} = 0
\]
of $Q$ is
\[
q_{J,h} = \frac{1 + e^{-2h}}{2} + \sqrt{\Big(\frac{1 - e^{-2h}}{2}\Big)^2 + e^{-2h-4J}}
= e^{-h}\Big(\cosh h + \sqrt{e^{-4J} + \sinh^2 h}\Big). \tag{3.16}
\]
Hence $P_{J,h}(-,-) = e^{-2h}q_{J,h}^{-1}$, $P_{J,h}(+,+) = q_{J,h}^{-1}$ and therefore
\[
P_{J,h} = \begin{pmatrix} e^{-2h}q_{J,h}^{-1} & 1 - e^{-2h}q_{J,h}^{-1} \\ 1 - q_{J,h}^{-1} & q_{J,h}^{-1} \end{pmatrix}. \tag{3.17}
\]
A short computation gives
\[
\alpha_{P_{J,h}}(1) = \frac{1}{2}\Big(1 + \frac{\sinh h}{\sqrt{e^{-4J} + \sinh^2 h}}\Big). \tag{3.18}
\]
Using (3.3), (3.17), and (3.18) we may compute the $\mu_{J,h}$-probabilities of cylinder events. In particular, we have
\[
\mu_{J,h}(\sigma_i) = (e^{-4J} + \sinh^2 h)^{-1/2} \sinh h \tag{3.19}
\]
for all $i \in \mathbb{Z}$. In the case $J = 0$, of no interaction, we see that $q_{0,h} = 1 + e^{-2h}$ and therefore both rows of $P_{0,h}$ coincide; hence $\mu_{0,h}$ is a product measure.

In the rest of this section we shall investigate the "low temperature limit". $\Phi^{J,h}$ will be endowed with a factor $\beta$, the inverse absolute temperature, and we shall ask for the behaviour of $\mathcal{G}(\beta\Phi^{J,h}) = \mathcal{G}(\Phi^{\beta J,\beta h}) = \{\mu_{\beta J,\beta h}\}$ as $\beta$ tends to infinity. This behaviour will turn out to be closely related to the set of ground states for $\Phi^{J,h}$. A configuration $\omega \in \Omega$ is called a ground state of $\Phi^{J,h}$ if for each $i \in \mathbb{Z}$ the pair $(\omega_i,\omega_{i+1})$ is a minimal point of the real function $\psi: (x,y) \to -Jxy - h(x+y)/2$ on $\{-1,1\}^2$. (Note that the potential
\[
\Psi_A = \begin{cases} \psi(\sigma_i,\sigma_{i+1}) & \text{if } A = \{i,i+1\},\\ 0 & \text{otherwise} \end{cases}
\]
is equivalent to
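$\lim_{\beta\to\infty} P_{\beta,\beta h} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ and $\lim_{\beta\to\infty} \alpha_{P_{\beta,\beta h}}(1) = 1$. Hence $\mu_{\beta,\beta h}$ converges weakly towards $\delta_+$, the Dirac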
measure at the constant configuration $\omega^+ \in \Omega$ defined by $\omega^+_i = 1$, $i \in \mathbb{Z}$. Notice that $\omega^+$ is the unique ground state of $\Phi^{1,h}$. Similarly, if $h < 0$ then $\omega^- = -\omega^+$ is the unique ground state of $\Phi^{1,h}$, and the Dirac measure $\delta_-$ at $\omega^-$ is the weak limit of $\mu_{\beta,\beta h}$ as $\beta \to \infty$. In the case $h = 0$, $\Phi^{1,0}$ has precisely two ground states, namely $\omega^+$ and $\omega^-$. On the other hand, we have $\alpha_{P_{\beta,0}} = (\tfrac12,\tfrac12)$ and
\[
P_{\beta,0} \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
as $\beta \to \infty$. Hence
\[
\lim_{\beta\to\infty} \mu_{\beta,0} = \tfrac12\delta_+ + \tfrac12\delta_-.
\]
Consequently, the limiting measure is the equidistribution on the set $\{\omega^+,\omega^-\}$ of ground states. In particular, the non-uniqueness of the ground state implies an asymptotic loss of tail triviality: Each of the Gibbs measures $\mu_{\beta,0}$ satisfies a 0-1 law for all events in $\mathcal{T}$, but the limiting measure $(\delta_+ + \delta_-)/2$ does not. (In order to check the former claim we can anticipate an argument of Chapter 7. Let us write $\mu = \mu_{\beta,h}$ and $\gamma = \gamma^{\beta,h}$, and suppose $A \in \mathcal{T}$ is such that $\mu(A) > 0$. Putting $\nu = \mu(A)^{-1} 1_A\mu$ we have for each $\Lambda \in \mathcal{S}$
\[
\nu\gamma_\Lambda = \mu(A)^{-1} 1_A(\mu\gamma_\Lambda) = \nu
\]
because $\gamma_\Lambda$ is proper. Hence $\nu \in \mathcal{G}(\gamma) = \{\mu\}$ and therefore $\mu(A) = 1$. On the other hand, the limiting measure $(\delta_+ + \delta_-)/2$ is not trivial on $\mathcal{T}$ because it assigns probability 1/2 to the tail event $\{\sigma_i = 1$ for infinitely many $i \in \mathbb{Z}\}$.)

We will see later that the loss of tail triviality is far more dramatic for the Ising model in two or more dimensions. In higher dimensions the set $\mathcal{G}(\Phi^{\beta,0})$ is so strongly attracted by the two ground states $\omega^+$ and $\omega^-$ that for sufficiently large (but finite!) $\beta$ there exist two distinct Gibbs measures for $\Phi^{\beta,0}$ which are close to $\delta_+$ resp. $\delta_-$. Thus for these $\beta$ a phase transition occurs, and Theorem (7.7) will show that this is equivalent to the existence of (many) Gibbs measures for $\Phi^{\beta,0}$ with a non-trivial tail $\sigma$-field.

Case 2. Ising antiferromagnet in the low temperature limit. Here we put $J = -1$, and we distinguish the cases $h > 2$, $h = 2$, $|h| < 2$, $h = -2$ and $h < -2$. In the case $h > 2$ we have
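$\lim_{\beta\to\infty} \alpha_{P_{\beta J,\beta h}}(1) = 1$, and therefore $\lim_{\beta\to\infty} \mu_{\beta J,\beta h} = \delta_+$. $\omega^+$ is the unique ground state of $\Phi^{J,h}$. Similarly, if $h < -2$ then $\omega^-$ is the unique ground state of $\Phi^{J,h}$ and $\lim_{\beta\to\infty} \mu_{\beta J,\beta h} = \delta_-$.

Next we suppose that $|h| < 2$. Then $\Phi^{J,h}$ has precisely two ground states, namely the configurations $\omega^{+-}$ and $\omega^{-+}$ with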
\[
\omega^{+-}_i = -\omega^{-+}_i = \begin{cases} 1 & \text{if } i \text{ is even},\\ -1 & \text{if } i \text{ is odd.} \end{cases}
\]
We let $\delta_{+-}$ and $\delta_{-+}$ denote the Dirac measures at $\omega^{+-}$ and $\omega^{-+}$. Inspection of (3.16) and (3.17) shows that $\lim_{\beta\to\infty} \alpha_{P_{\beta J,\beta h}}(1) = 1/2$ and
\[
\lim_{\beta\to\infty} P_{\beta J,\beta h} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Hence
\[
\lim_{\beta\to\infty} \mu_{\beta J,\beta h} = \tfrac12\delta_{+-} + \tfrac12\delta_{-+}.
\]
Thus, in this case we also observe an asymptotic loss of tail triviality. This phenomenon is again a pale reflection of a phase transition in two or more dimensions.

Finally, we consider the case $h = 2$. (The remaining case $h = -2$ is similar.) It is then obvious from equation (3.16) that $\lim_{\beta\to\infty} q^{-1}_{\beta J,2\beta} = (\sqrt5 - 1)/2$, the golden ratio. Thus
\[
\lim_{\beta\to\infty} P_{\beta J,2\beta} = F = \begin{pmatrix} 0 & 1 \\ (3-\sqrt5)/2 & (\sqrt5-1)/2 \end{pmatrix}.
\]
Since $\lim_{\beta\to\infty} \alpha_{P_{\beta J,2\beta}}(1) = (1+\sqrt5)/2\sqrt5 = \alpha_F(1)$, we conclude that $\lim_{\beta\to\infty} \mu_{\beta J,2\beta} = \mu_F$, the stationary Markov chain with transition matrix $F$. The letter $F$ reminds us of Fibonacci. Indeed, $F$ is the unique stochastic matrix satisfying $F(-,-) = 0$ and $F(+,-)F(-,+) = F(+,+)F(+,+)$. Consequently, for all $i \in \mathbb{Z}$, $n \ge 1$, and $x_0, \ldots, x_{n+1} \in \{-1,1\}$ we have
\[
\mu_F(\sigma_{i+1} = x_1, \ldots, \sigma_{i+n} = x_n \mid \sigma_i = x_0, \sigma_{i+n+1} = x_{n+1})
= \begin{cases} 1/a_{n+(x_0+x_{n+1})/2} & \text{if } (x_j,x_{j+1}) \ne (-1,-1) \text{ for all } 0 \le j \le n,\\ 0 & \text{otherwise,} \end{cases}
\]
where $(a_0,a_1,a_2,\ldots) = (1,1,2,3,5,8,13,\ldots)$ is Fibonacci's sequence. In particular, $\mu_F$ can be thought of as the equidistribution on the infinite set $\{\omega \in \Omega: (\omega_i,\omega_{i+1}) \ne (-1,-1)$ for all $i \in \mathbb{Z}\}$. This set is precisely the set of all ground states of $\Phi^{J,2}$. $\mu_F$ is trivial on $\mathcal{T}$. (This is a well-known consequence of the ergodic theory of Markov chains and will also follow from Example (10.24)(2) and Theorem (10.35)(iii) below.) Thus in spite of the existence of uncountably many distinct ground states there is no asymptotic loss of tail triviality. This comes from the fact that the set of ground states for $\Phi^{J,2}$ does not exhibit a long range order: If $\omega$ is a ground state and $i \in \mathbb{Z}$ then $\omega_i$ remains unknown even if the restriction of $\omega$ to the set $\{j \in \mathbb{Z}: |j - i| > 1\}$ is known. A ground state
degeneracy of this kind cannot be expected to give rise to a phase transition, even in higher dimensions. The preceding discussion can be summarized as follows: In the low temperature limit $\beta \to \infty$, the unique Gibbs measure $\mu_{\beta J,\beta h}$ for the Ising potential $\Phi^{\beta J,\beta h}$ converges weakly to the equidistribution on the set of ground states for $\Phi^{J,h}$.
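The limits discussed in this section can be explored numerically. The sketch below is only an illustration under our own assumptions: it evaluates the closed forms (3.16) to (3.18) at the finite value $\beta = 40$ in place of the limit $\beta \to \infty$, orders rows and columns as $(-1,+1)$, and uses hypothetical helper names. It also counts by brute force the admissible fillings $(x_1,\ldots,x_n)$ between fixed end spins, which should reproduce the Fibonacci numbers $a_{n+(x_0+x_{n+1})/2}$.

```python
import math
from itertools import product

def q_value(J, h):
    """PF-eigenvalue q_{J,h}, closed form (3.16)."""
    return math.exp(-h) * (math.cosh(h) + math.sqrt(math.exp(-4 * J) + math.sinh(h) ** 2))

def transition_matrix(J, h):
    """P_{J,h} as in (3.17); rows and columns are ordered (-1, +1)."""
    q = q_value(J, h)
    p_mm = math.exp(-2 * h) / q
    p_pp = 1.0 / q
    return [[p_mm, 1.0 - p_mm], [1.0 - p_pp, p_pp]]

def prob_plus(J, h):
    """Stationary probability of spin +1, closed form (3.18)."""
    s = math.sqrt(math.exp(-4 * J) + math.sinh(h) ** 2)
    return 0.5 * (1.0 + math.sinh(h) / s)

def count_admissible(n, x0, x_end):
    """Number of (x_1..x_n) in {-1,1}^n such that no adjacent pair of the
    full path x0, x_1, ..., x_n, x_end equals (-1,-1)."""
    total = 0
    for mid in product((-1, 1), repeat=n):
        path = (x0,) + mid + (x_end,)
        if all((path[j], path[j + 1]) != (-1, -1) for j in range(n + 1)):
            total += 1
    return total

# Fibonacci's sequence (a_0, a_1, a_2, ...) = (1, 1, 2, 3, 5, ...)
FIB = [1, 1]
while len(FIB) < 20:
    FIB.append(FIB[-1] + FIB[-2])

beta = 40.0                                       # large but finite (assumption)
ferro_h_pos = prob_plus(beta * 1.0, beta * 0.5)   # J = 1, h = 0.5 > 0
ferro_h_zero = prob_plus(beta * 1.0, 0.0)         # J = 1, h = 0
anti_small_h = prob_plus(-beta, beta * 1.0)       # J = -1, |h| = 1 < 2
F_limit = transition_matrix(-beta, 2.0 * beta)    # J = -1, h = 2
```

For the ferromagnet the stationary probability of $+1$ is numerically $1$ when $h > 0$ and $1/2$ when $h = 0$; for the antiferromagnet with $h = 2$ the entry `F_limit[1][1]` is close to $(\sqrt5 - 1)/2$.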
3.A
Appendix: Positive matrices
Let $E$ be a finite set with cardinality $|E| \ge 2$. A matrix $Q = (Q(x,y))_{x,y\in E}$ is called positive (in the strict sense) if $Q(x,y) > 0$ for all $x, y \in E$. The following theorem of Perron and Frobenius was used in Section 3.1.

(3.A1) Theorem. Each positive matrix $Q$ has a distinguished eigenvalue $q > 0$ with the following properties.
(i) $|z| < q$ for all eigenvalues $z \ne q$ of $Q$.
(ii) There is a right eigenvector $r$ corresponding to $q$ such that $r(x) > 0$ for all $x \in E$.
(iii) $q$ is simple, i.e., if $r, r'$ are right eigenvectors corresponding to $q$ then $r' = cr$ for some $c \in \mathbb{C}$.
$q$ will be called the PF-eigenvalue of $Q$.

Proof. We introduce the set $R$ of all column vectors $r \in [0,\infty[^E \setminus \{0\}$ and define a function $f: R \to\ ]0,\infty[$ by
\[
f(r) = \min_{x\in E:\, r(x)>0} Qr(x)/r(x).
\]
Here $Qr(x) = \sum_{y\in E} Q(x,y)r(y)$. $f$ is continuous and satisfies the equation $f(tr) = f(r)$ for all $t > 0$. Hence $f(R)$ coincides with the set
\[
f\Big(\Big\{r \in R: \sum_{x\in E} r(x) = 1\Big\}\Big),
\]
and the latter set is compact. We put $q = \max f(R)$.

$q$ is an eigenvalue of $Q$. For let $r \in R$ be such that $q = f(r)$, and suppose $Qr \ne qr$. Then $Qr(x) \ge qr(x)$ for all $x \in E$, with strict inequality for at least one $x$. Therefore the vector $Q(Qr - qr)$ has strictly positive coordinates, and this implies $f(Qr) > q$ which is in contradiction with the definition of $q$. Consequently, $q$ is an eigenvalue of $Q$, and $r$ is a right eigenvector for $q$ satisfying $r(x) = q^{-1}Qr(x) > 0$ for all $x \in E$.
Next we consider an arbitrary eigenvalue $z$ of $Q$. Let $u \in \mathbb{C}^E$ be a right eigenvector for $z$ and $a = \min_{x\in E} Q(x,x) > 0$. Let $I$ denote the identity matrix. The equation $(Q - aI)u = (z - a)u$ shows that for all $x \in E$
\[
\sum_{y\in E} (Q(x,y) - aI(x,y))\,|u(y)| \ge |z - a|\,|u(x)|
\]
and therefore
\[
\sum_{y\in E} Q(x,y)\,|u(y)| \ge (|z - a| + a)\,|u(x)|.
\]
Consequently, $|z - a| + a \le \max f(R) = q$. Hence either $z = q$ or $|z| < q$.

Finally, we let $r \in R$ be as above and $v$ be an arbitrary right eigenvector corresponding to $q$. We may assume that $v$ is real. (Otherwise we may look at its real and imaginary part separately.) Let
\[
c = \min_{x\in E} v(x)/r(x).
\]
Then $v(x) \ge cr(x)$ for all $x \in E$. This implies $v = cr$ because otherwise
\[
v(x) - cr(x) = q^{-1} \sum_{y\in E} Q(x,y)(v(y) - cr(y)) > 0
\]
for all $x \in E$ which is in contradiction with the choice of $c$. □
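The variational characterization $q = \max f(R)$ used in the proof lends itself to a small numerical experiment. The following sketch makes several assumptions of its own: the matrix `Q` and the trial vectors are arbitrary illustrative choices, the helper names are hypothetical, and the maximizer of $f$ is approximated by power iteration, which is not the argument of the proof but a convenient way to reach the PF-eigenvector.

```python
def mat_vec(Q, r):
    """(Qr)(x) = sum over y of Q(x,y) r(y)."""
    return [sum(Q[x][y] * r[y] for y in range(len(r))) for x in range(len(Q))]

def f(Q, r):
    """f(r) = min over all x with r(x) > 0 of (Qr)(x) / r(x)."""
    Qr = mat_vec(Q, r)
    return min(Qr[x] / r[x] for x in range(len(r)) if r[x] > 0)

def pf_eigenpair(Q, iters=500):
    """Approximate the PF-eigenvalue and a normalized positive eigenvector."""
    n = len(Q)
    r = [1.0 / n] * n
    for _ in range(iters):
        s = mat_vec(Q, r)
        total = sum(s)
        r = [v / total for v in s]   # keep r in the simplex {sum r(x) = 1}
    return f(Q, r), r

# Illustrative positive matrix; its eigenvalues are (5 +- sqrt(5)) / 2.
Q = [[2.0, 1.0],
     [1.0, 3.0]]
q, r = pf_eigenpair(Q)

# f never exceeds q on R; equality is attained at the PF-eigenvector.
trial_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
lower_bounds = [f(Q, v) for v in trial_vectors]
```

Each entry of `lower_bounds` is a value of $f$ and hence a lower bound for `q`, while the eigenvector `r` has strictly positive coordinates, as asserted in (ii).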
Next we consider positive stochastic matrices.

(3.A2) Remark. (a) The PF-eigenvalue of a positive stochastic matrix $P$ is 1.
(b) For each positive stochastic matrix $P$ there is a unique probability (row) vector $\alpha$ on $E$ satisfying $\alpha P = \alpha$ (and therefore $\alpha(x) > 0$ for all $x \in E$).

Proof. Let $q$ be the PF-eigenvalue of $P$. Applying Theorem (3.A1) to the transposed matrix $P^T$, we see that $P$ admits a left eigenvector $\ell$ for $q$ such that $\ell(x) > 0$ for all $x \in E$. Thus
\[
\sum_{x\in E} \ell(x)P(x,y) = q\,\ell(y) \qquad (y \in E).
\]
Summing over $y$ we obtain that $q = 1$. This proves assertion (a). To prove statement (b) it is sufficient to note that $\ell$ is unique up to a factor. □

A second tool needed in Section 3.1 was the ergodic theorem for positive stochastic matrices.

(3.A3) Theorem. Let $P$ be a positive stochastic matrix and $\alpha$ the unique probability vector with $\alpha P = \alpha$. Then
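\[
\lim_{n\to\infty} P^n(x,y) = \alpha(y)
\]
for all $x, y \in E$.

Proof. For $s \ge 0$ we put $\psi(s) = s\log s - s + 1$. $\psi$ is strictly convex and attains its minimum 0 at $s = 1$. For an arbitrary probability vector $\pi$ on $E$ we define the relative entropy
\[
\mathcal{H}(\pi|\alpha) = \sum_{x\in E} \alpha(x)\,\psi(\pi(x)/\alpha(x))
\]
of $\pi$ with respect to $\alpha$. We claim that $\mathcal{H}(\pi P|\alpha) < \mathcal{H}(\pi|\alpha)$ for all $\pi \ne \alpha$. (For $\pi = \alpha$ equality holds.) Indeed, for $\pi \ne \alpha$ we have
\[
\mathcal{H}(\pi P|\alpha) = \sum_{x\in E} \alpha(x)\,\psi\Big(\sum_{y\in E} \frac{\alpha(y)P(y,x)}{\alpha(x)}\,\frac{\pi(y)}{\alpha(y)}\Big)
< \sum_{x\in E} \alpha(x) \sum_{y\in E} \frac{\alpha(y)P(y,x)}{\alpha(x)}\,\psi\Big(\frac{\pi(y)}{\alpha(y)}\Big)
= \mathcal{H}(\pi|\alpha)
\]
because $\psi$ is strictly convex, $\alpha(\cdot)P(\cdot,x)/\alpha(x)$ is a probability vector, and $\pi(\cdot)/\alpha(\cdot)$ is not constant.

Now let $x \in E$ and $\varepsilon > 0$ be given. Let $\delta > 0$ be a lower bound of the positive continuous function $\pi \to \mathcal{H}(\pi|\alpha) - \mathcal{H}(\pi P|\alpha)$ on the compact set $K_\varepsilon$ of all probability vectors $\pi$ with
\[
\sum_{y\in E} |\pi(y) - \alpha(y)| \ge \varepsilon.
\]
Suppose $P^n(x,\cdot) \in K_\varepsilon$ for infinitely many $n \ge 1$. Then
\[
\mathcal{H}(P^{n+1}(x,\cdot)|\alpha) \le \mathcal{H}(P^n(x,\cdot)|\alpha) - \delta
\]
for infinitely many $n$. This is impossible because $(\mathcal{H}(P^n(x,\cdot)|\alpha))_{n\ge1}$ is decreasing and nonnegative. Hence
\[
\sum_{y\in E} |P^n(x,y) - \alpha(y)| < \varepsilon
\]
for sufficiently large $n$. This proves the theorem. □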
A further proof of Theorem (3.A3) will be given in Example (7.15), and a far more general ergodic theorem will be proved in (10.34).
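The entropy argument in the proof of Theorem (3.A3) can be observed on a small example. The sketch below rests on our own assumptions: the two-state matrix `P` is an illustrative choice, the stationary vector is approximated by iterating $\pi \to \pi P$, and the function names are hypothetical. It records $\mathcal{H}(P^n(x,\cdot)|\alpha)$ for the first few $n$, starting from a point mass.

```python
import math

def psi(s):
    """psi(s) = s log s - s + 1, extended continuously by psi(0) = 1."""
    return 1.0 if s == 0.0 else s * math.log(s) - s + 1.0

def relative_entropy(pi, alpha):
    """H(pi | alpha) = sum over x of alpha(x) psi(pi(x) / alpha(x))."""
    return sum(a * psi(p / a) for p, a in zip(pi, alpha))

def step(pi, P):
    """One step of the chain on row vectors: pi -> pi P."""
    n = len(P)
    return [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]

def stationary(P, iters=5000):
    """Approximate the stationary vector alpha by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = step(pi, P)
    return pi

# Illustrative positive stochastic matrix with alpha = (2/3, 1/3).
P = [[0.9, 0.1],
     [0.2, 0.8]]
alpha = stationary(P)

pi = [1.0, 0.0]            # point mass at state 0, i.e. P^0(0, .)
entropies = []
for _ in range(15):
    entropies.append(relative_entropy(pi, alpha))
    pi = step(pi, P)       # after n passes, pi equals P^n(0, .)
```

The list `entropies` is strictly decreasing, and after the loop `pi` is already close to `alpha` in the sense of the sum $\sum_y |P^n(x,y) - \alpha(y)|$.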
Chapter 4 The existence problem
The first question which we will ask concerning a given specification $\gamma$ will always be: Does $\gamma$ admit any Gibbs measure $\mu \in \mathcal{G}(\gamma)$? In many cases we shall pose an even stronger question: Does $\mathcal{G}(\gamma)$ admit two (or more) distinct measures which are distinguished by certain additional characteristic properties? In order to give a positive answer to these questions we need to construct one, or even two (or more), elements of $\mathcal{G}(\gamma)$. There is a natural device for doing so: Introduce a topology on $\mathcal{P}(\Omega,\mathcal{F})$, pick one (resp. two or more) boundary conditions $\omega \in \Omega$, and show that
(I) the net $(\gamma_\Lambda(\cdot|\omega))_{\Lambda\in\mathcal{S}}$ has a cluster point (with respect to the topology chosen),
(II) each cluster point of $(\gamma_\Lambda(\cdot|\omega))_{\Lambda\in\mathcal{S}}$ belongs to $\mathcal{G}(\gamma)$, and
(III) each cluster point of $(\gamma_\Lambda(\cdot|\omega))_{\Lambda\in\mathcal{S}}$ exhibits the desired additional properties.

Whilst problem (III) is the most interesting in concrete models, the existence of a Gibbs measure merely requires us to solve problems (I) and (II). The latter is the objective of this chapter. Clearly, any solution to problems (I) and (II) depends crucially on the choice of the topology on $\mathcal{P}(\Omega,\mathcal{F})$. In fact, (I) and (II) impose conflicting demands on this topology. Problem (I) is trivial when $\mathcal{P}(\Omega,\mathcal{F})$ is compact, whereas (II) is trivial when the evaluation map $\mu \to \mu(f)$ is continuous for each bounded measurable function $f$ on $\Omega$. In other words, problem (I) requires a coarse topology, whilst (II) requires a sufficiently fine topology. It will turn out that a practicable compromise is provided by the so-called topology of local convergence. This topology will be introduced in Section 4.1. If the state space $E$ is finite, the topology of local convergence coincides with the weak topology. So, in this case it is metrizable. In general, however, the topology of local convergence does not do us the favour of being metrizable. This will necessitate our working with nets instead of sequences. (A standard reference to the theory of nets is the monograph of Kelley (1975).) For the chosen topology of local convergence, none of the statements (I) and (II) holds in general. In fact, there are simple and innocent looking examples of specifications $\gamma$ for which
space shall be used for the first time. (It will also be needed later on, mainly in Section 7.3.) For the convenience of those readers who are not familiar with this notion, some basic properties of standard Borel spaces are stated and proved in Appendix 4.A. On the first reading, however, it might be preferable to skip the technicalities of Section 4.2. In fact, there is no loss in doing so for anyone who is only interested in the case of a finite state space $E$. For, it is well-known that in this case the weak topology on $\mathcal{P}(\Omega,\mathcal{F})$, and thus the topology of local convergence, is compact, and this implies that assertion (I) holds automatically without any condition on $\gamma$ and $\omega$. In Section 4.3 we shall turn to a discussion of problem (II). We shall see that the topology of local convergence is tailor-made for (II) to hold for every quasilocal specification $\gamma$. In fact we shall obtain a more general result than (II): the boundary condition $\omega$ may be replaced by a net $(\nu_\alpha)$ of random boundary conditions, and we can deal with a convergent net $(\gamma^\alpha)$ of specifications instead of a single specification $\gamma$. Among other things, this generalization implies some important continuity properties of the correspondence that maps a potential to its set of Gibbs measures, as will be shown in Section 4.4. In this section we shall also combine the results of the preceding sections to state a general answer to the existence problem. (See Theorem (4.22).)
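For a finite state space the device described above can be tried out directly on the Markov specifications of Section 3.1. The sketch below is an illustration under our own assumptions: it takes the closed form for $\gamma_{\Delta(k)}(\sigma_\Lambda = \zeta \mid \omega)$ from stage 5) of the proof of Theorem (3.5), labels the states $0$ and $1$, uses an arbitrary matrix `P`, and introduces hypothetical helper names. It compares the conditional probabilities for different boundary values with the corresponding cylinder probability of the stationary chain.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, k):
    """k-th power of P by repeated multiplication (k >= 0)."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result

def path_weight(P, zeta):
    """P(zeta_0, zeta_1) ... P(zeta_{n-1}, zeta_n)."""
    w = 1.0
    for j in range(len(zeta) - 1):
        w *= P[zeta[j]][zeta[j + 1]]
    return w

def window_prob_given_boundary(P, zeta, left, right, k):
    """Conditional probability of seeing zeta on a window of n+1 sites,
    given the values `left` and `right` at distance k on either side."""
    n = len(zeta) - 1
    Pk = mat_pow(P, k)
    Ptot = mat_pow(P, n + 2 * k)
    return Pk[left][zeta[0]] * path_weight(P, zeta) * Pk[zeta[-1]][right] / Ptot[left][right]

def stationary_window_prob(P, zeta, iters=5000):
    """Cylinder probability alpha(zeta_0) P(zeta_0, zeta_1) ... of the stationary chain."""
    n = len(P)
    alpha = [1.0 / n] * n
    for _ in range(iters):
        alpha = [sum(alpha[x] * P[x][y] for x in range(n)) for y in range(n)]
    return alpha[zeta[0]] * path_weight(P, zeta)

P = [[0.9, 0.1],
     [0.2, 0.8]]
zeta = (0, 1, 1)
target = stationary_window_prob(P, zeta)
near = {b: window_prob_given_boundary(P, zeta, b[0], b[1], 1)
        for b in [(0, 0), (1, 1), (0, 1)]}
far = {b: window_prob_given_boundary(P, zeta, b[0], b[1], 60)
       for b in [(0, 0), (1, 1), (0, 1)]}
```

With the boundary close to the window (`near`) the values depend visibly on the boundary condition; with the boundary far away (`far`) they agree with `target` up to a very small error, whatever the boundary values are.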
4.1
Local convergence of random fields
Let $(E,\mathcal{E})$ be an arbitrary measurable space, $S$ a countably infinite set, $(\Omega,\mathcal{F}) = (E^S,\mathcal{E}^S)$, and $\mathcal{P}(\Omega,\mathcal{F})$ the set of all probability measures on $(\Omega,\mathcal{F})$. We will choose a topology on $\mathcal{P}(\Omega,\mathcal{F})$. If $\mu\gamma_\Lambda = \mu$ for a given specification $\gamma$ then we have some information on the behaviour of $\mu$ on $\mathcal{F}_\Lambda$, but nothing can be said about the $\mu$-probability of tail events. Therefore we need a topology which only depends on the local behaviour of the probability measures. If $E$ is finite (and thought of as a discrete metric space) then the weak topology on $\mathcal{P}(\Omega,\mathcal{F})$ is such a topology. For, it is well-known that for finite $E$ a sequence $(\mu_n)_{n\ge1}$ in $\mathcal{P}(\Omega,\mathcal{F})$ converges weakly to some $\mu \in \mathcal{P}(\Omega,\mathcal{F})$ if and only if $\lim_{n\to\infty} \mu_n(A) = \mu(A)$ for each cylinder event $A$; cf. also Remark (4.3)(3) below.

There are two possibilities to deal with the case of an infinite $E$. We could either impose topological conditions on $E$ and use the weak topology on $\mathcal{P}(\Omega,\mathcal{F})$ (but if applied to Gibbs measures this would force us to require suitable continuity properties of specifications and potentials), or simply might stick to the topology of convergence of cylinder probabilities. We choose the latter possibility. We thus consider the algebra
\[
\mathcal{F}^0 = \bigcup_{\Lambda\in\mathcal{S}} \mathcal{F}_\Lambda \tag{4.1}
\]
of all cylinder events in $\Omega$.
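(4.2) Definition. The topology of local convergence (or $\mathcal{L}$-topology for short) on $\mathcal{P}(\Omega,\mathcal{F})$ is defined to be the coarsest topology on $\mathcal{P}(\Omega,\mathcal{F})$ for which the evaluation maps $\nu \to \nu(A)$, $A \in \mathcal{F}^0$, are continuous. Equivalently, the $\mathcal{L}$-topology is defined by the requirement that for each $\mu \in \mathcal{P}(\Omega,\mathcal{F})$ the sets
\[
\Big\{\nu \in \mathcal{P}(\Omega,\mathcal{F}): \max_{1\le k\le n} |\nu(A_k) - \mu(A_k)| < \varepsilon\Big\}
\]
with $A_1, \ldots, A_n \in \mathcal{F}^0$, $\varepsilon > 0$, and $n \ge 1$ form a base of neighbourhoods of $\mu$. Thus a net $(\mu_\alpha)_{\alpha\in D}$ in $\mathcal{P}(\Omega,\mathcal{F})$ converges locally to $\mu$ if and only if $\lim_D \mu_\alpha(A) = \mu(A)$ for all $A \in \mathcal{F}^0$. (In this case we will write $\mu_\alpha \to_{\mathcal{L}} \mu$.) From now on we will always assume that $\mathcal{P}(\Omega,\mathcal{F})$ is endowed with the $\mathcal{L}$-topology.

(4.3) Remarks. (1) For each $\mu \in \mathcal{P}(\Omega,\mathcal{F})$ we let $\mu^0 = \mu|\mathcal{F}^0$ denote its restriction to $\mathcal{F}^0$. According to Carathéodory's extension theorem, the projection $p: \mu \to \mu^0$ is a bijection from $\mathcal{P}(\Omega,\mathcal{F})$ onto the set $\mathcal{P}(\Omega,\mathcal{F}^0)$ of all $\sigma$-additive normalized nonnegative set functions on $\mathcal{F}^0$. $\mathcal{P}(\Omega,\mathcal{F}^0)$, considered as a subset of the compact Hausdorff space $[0,1]^{\mathcal{F}^0}$, is equipped with the topology of setwise convergence. The $\mathcal{L}$-topology on $\mathcal{P}(\Omega,\mathcal{F})$ is precisely the initial topology of $p$ and turns $p$ into a homeomorphism. In particular, $\mathcal{P}(\Omega,\mathcal{F})$ is a Hausdorff space.

(2) The $\mathcal{L}$-topology is the coarsest topology such that for each $f \in \mathcal{L}$ the evaluation map $\nu \to \nu(f)$ is continuous. Thus a net $(\mu_\alpha)_{\alpha\in D}$ converges to some $\mu$ in the $\mathcal{L}$-topology if and only if $\lim_D \mu_\alpha(f) = \mu(f)$ for all $f \in \mathcal{L}$. (This justifies the name $\mathcal{L}$-topology.) For let $f \in \mathcal{L}$, $\mu \in \mathcal{P}(\Omega,\mathcal{F})$, and $\varepsilon > 0$. Then there are some $\Lambda \in \mathcal{S}$ and a step function
\[
g = \sum_{k=1}^n a_k 1_{A_k}
\]
in $\mathcal{L}_\Lambda$ such that $A_1, \ldots, A_n \in \mathcal{F}_\Lambda$ are pairwise disjoint and $\|f - g\| < \varepsilon/4$. We put
\[
\delta = \varepsilon \Big/ 2\sum_{k=1}^n |a_k|.
\]
Then for each $\nu \in \mathcal{P}(\Omega,\mathcal{F})$ with $\max_{1\le k\le n} |\nu(A_k) - \mu(A_k)| < \delta$ we have
\[
|\nu(f) - \mu(f)| \le 2\|f - g\| + \sum_{k=1}^n |a_k|\,|\nu(A_k) - \mu(A_k)| < \varepsilon.
\]
Therefore the mapping $\nu \to \nu(f)$ is continuous.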
(3) Suppose $E$ is a metric space and $\Omega$ is equipped with the product topology. Then the $\mathcal{L}$-topology is finer than the weak topology on $\mathcal{P}(\Omega,\mathcal{F})$. This follows from Remarks (2.21)(2) and (4.3)(2) together with the so-called portmanteau theorem which states, in particular, that the weak topology is already induced by the integrals of all uniformly continuous bounded functions. Moreover, if $E$ is countable with the discrete topology, each local function is continuous, so that the weak topology coincides with the $\mathcal{L}$-topology. However, the $\mathcal{L}$-topology is only metrizable if $\mathcal{F}^0$ is countable, which happens only if $E$ is finite. In this case, a standard choice for a metric inducing the $\mathcal{L}$-topology is
\[
d(\mu,\nu) = \sum_{n\ge1} 2^{-n} \sum_{\zeta\in E^{\Lambda(n)}} |\mu(\sigma_{\Lambda(n)} = \zeta) - \nu(\sigma_{\Lambda(n)} = \zeta)|,
\]
where $(\Lambda(n))_{n\ge1}$ is any cofinal sequence in $\mathcal{S}$. ∘

4.2
Existence of cluster points
This section is devoted to problem (I) of the introduction to this chapter. More generally, we consider an arbitrary net $(\mu_\alpha)_{\alpha\in D}$ of random fields. We ask for conditions which ensure that $(\mu_\alpha)_{\alpha\in D}$ has a cluster point relative to the $\mathcal{L}$-topology. This question is trivial when $E$ is finite. For in this case the $\mathcal{L}$-topology coincides with the weak topology which, as is well-known, turns $\mathcal{P}(\Omega,\mathcal{F})$ into a compact space; see also (4.11)(2) below. Consequently, a reader who is only interested in the case of a finite state space $E$ can (and is advised to) skip this section and proceed directly to Section 4.3.

To begin with, we observe that $(\mu_\alpha)_{\alpha\in D}$ can only have a cluster point when
\[
\lim_{m\to\infty} \liminf_{\alpha\in D} \mu_\alpha(A_m) = 0 \tag{4.4}
\]
for all sequences $(A_m)_{m\ge1}$ in $\mathcal{F}^0$ with $A_m \downarrow \emptyset$ as $m \uparrow \infty$. For let $\mu$ be a cluster point and $\varepsilon > 0$. As $\mu$ is $\sigma$-additive, there is some $m \ge 1$ with $\mu(A_m) < \varepsilon$. $\mu_\alpha$ is frequently in the open neighbourhood $\{\nu \in \mathcal{P}(\Omega,\mathcal{F}): \nu(A_m) < \varepsilon\}$ of $\mu$. Thus
\[
\liminf_{\alpha\in D} \mu_\alpha(A_m) < \varepsilon.
\]
As $(A_m)_{m\ge1}$ is decreasing, (4.4) follows. Conversely, standard methods show that $(\mu_\alpha)_{\alpha\in D}$ has a cluster point whenever $(\mu_\alpha)_{\alpha\in D}$ is equicontinuous in the sense that
\[
\lim_{m\to\infty} \limsup_{\alpha\in D} \mu_\alpha(A_m) = 0 \tag{4.5}
\]
for all sequences $(A_m)_{m\ge1}$ in $\mathcal{F}^0$ with $A_m \downarrow \emptyset$; cf. Proposition (4.9) below. Unfortunately, the latter condition is difficult to verify in general. This is
because the set of coordinates on which $A_m$ depends is allowed to become arbitrarily large. We thus resort to the following concept.

(4.6) Definition. A net $(\mu_\alpha)_{\alpha\in D}$ in $\mathcal{P}(\Omega,\mathcal{F})$ is said to be locally equicontinuous if for each $\Lambda \in \mathcal{S}$ and each sequence $(A_m)_{m\ge1}$ in $\mathcal{F}_\Lambda$ with $A_m \downarrow \emptyset$
\[
\lim_{m\to\infty} \limsup_{\alpha\in D} \mu_\alpha(A_m) = 0.
\]
Under suitable topological conditions the local equicontinuity of a net will be sufficient for the existence of cluster points.

(4.7) Definition. A measurable space $(E,\mathcal{E})$ is called a standard Borel space if there exists a metric $d$ on $E$ which turns $E$ into a complete separable metric space and is such that $\mathcal{E}$ is the Borel $\sigma$-algebra with respect to $d$.

(4.8) Comment. Some results concerning standard Borel spaces will be collected in Appendix 4.A. In particular, we will show there that if $E$ is a Borel subset of a complete separable metric space $X$ and $\mathcal{E}$ the set of all Borel subsets of $E$ then $(E,\mathcal{E})$ is standard Borel. (In general, the corresponding metric on $E$ will not be topologically equivalent to the given metric on $X$.) ∘

(4.9) Proposition. Let $(E,\mathcal{E})$ be a standard Borel space. Then every locally equicontinuous net in $\mathcal{P}(\Omega,\mathcal{F})$ has at least one cluster point in $\mathcal{P}(\Omega,\mathcal{F})$.

Proof. Let $(\mu_\alpha)_{\alpha\in D}$ be a locally equicontinuous net in $\mathcal{P}(\Omega,\mathcal{F})$. Its restriction $(\mu^0_\alpha)_{\alpha\in D}$ to $\mathcal{F}^0$ is a net in the compact Hausdorff space $[0,1]^{\mathcal{F}^0}$. Therefore there is a subnet $(\mu_{\alpha_\beta})_{\beta\in D'}$ such that $(\mu^0_{\alpha_\beta})_{\beta\in D'}$ converges (setwise) to some $\mu^0 \in [0,1]^{\mathcal{F}^0}$. Clearly, $\mu^0$ is finitely additive. In fact, $\mu^0$ is $\sigma$-additive on each $\mathcal{F}_\Lambda$. For if $A_m \in \mathcal{F}_\Lambda$ ($m \ge 1$) are such that $A_m \downarrow \emptyset$ then
\[
\lim_{m\to\infty} \mu^0(A_m) = \lim_{m\to\infty} \lim_{\beta\in D'} \mu_{\alpha_\beta}(A_m) \le \lim_{m\to\infty} \limsup_{\alpha\in D} \mu_\alpha(A_m) = 0.
\]
We thus conclude that the mappings $A \to \mu^0(\sigma_\Lambda \in A)$ ($A \in \mathcal{E}^\Lambda$, $\Lambda \in \mathcal{S}$) constitute a consistent system of marginal distributions. As $(E,\mathcal{E})$ is standard Borel, a well-known theorem of Kolmogorov implies the existence of a unique $\mu \in \mathcal{P}(\Omega,\mathcal{F})$ with $\mu|\mathcal{F}^0 = \mu^0$; see Theorem 62.3 of Bauer (1981), for example. This $\mu$ is the local limit of the subnet $(\mu_{\alpha_\beta})_{\beta\in D'}$. □

In the following, we will provide some sufficient conditions for the local equicontinuity of a net $(\mu_\alpha)_{\alpha\in D}$. An obvious sufficient condition is the local uniform domination of $(\mu_\alpha)_{\alpha\in D}$ which means that for each $\Lambda \in \mathcal{S}$ there exists a finite measure $\nu_\Lambda$ on $\mathcal{F}_\Lambda$ with the following property: Given any $\varepsilon > 0$, there is some $\delta > 0$ such that $\limsup_{\alpha\in D} \mu_\alpha(A) < \varepsilon$ for all $A \in \mathcal{F}_\Lambda$ with $\nu_\Lambda(A) < \delta$. This
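domination property stands behind the more specific conditions which will be considered below. As a warm-up, we state a simple corollary.

(4.10) Corollary. Suppose $(E,\mathcal{E})$ is a standard Borel space, and for each $\Lambda \in \mathcal{S}$ we are given a finite measure $\nu_\Lambda$ on $\mathcal{F}_\Lambda$. Then the set
\[
\mathcal{K} = \{\mu \in \mathcal{P}(\Omega,\mathcal{F}): \mu(A) \le \nu_\Lambda(A) \text{ for all } A \in \mathcal{F}_\Lambda \text{ and } \Lambda \in \mathcal{S}\}
\]
is compact.

Proof. $\mathcal{K}$ is closed and each net in $\mathcal{K}$ is locally equicontinuous. □

(4.11) Examples. (1) Let $(E,\mathcal{E})$ be a standard Borel space and $\lambda \in \mathcal{M}(E,\mathcal{E})$ a finite a priori measure. Suppose $\gamma = \rho\lambda$ is a $\lambda$-specification with bounded $\rho_\Lambda$'s. (For example, $\gamma$ might be Gibbsian for an absolutely summable potential.) The set $\mathcal{G}(\gamma)$ is then contained in a set $\mathcal{K}$ as considered in the corollary. Hence $\mathcal{G}(\gamma)$ is relatively compact.
(2) Suppose $E$ is finite. Then the set $\mathcal{P}(\Omega,\mathcal{F})$ of all random fields is compact (and metrizable, cf. (4.3)(3)). This follows from (4.10) by putting $\nu_\Lambda(\sigma_\Lambda \in A) = |A|$ for all $A \subset E^\Lambda$ and $\Lambda \in \mathcal{S}$. ∘

We will now prove a theorem on the local equicontinuity of nets of Gibbs measures or Gibbs distributions with random boundary conditions. On the one hand, this theorem will provide an answer to problem (I) of the introduction to this chapter. In fact, this problem will correspond to the particular case where $\gamma^\alpha = \gamma$, $\nu_\alpha = \delta_\omega$, and $(\Lambda_\alpha)_{\alpha\in D} = (\Lambda)_{\Lambda\in\mathcal{S}}$. On the other hand, choosing $\nu_\alpha = \mu_\alpha \in \mathcal{G}(\gamma^\alpha)$ we shall obtain a statement on the existence of cluster points of nets of Gibbs measures for possibly varying specifications. (This statement extends the above Example (4.11)(1).) To state the result we introduce some notation. Given a net $(\Lambda_\alpha)_{\alpha\in D}$ in $\mathcal{S}$, we shall write $\Lambda_\alpha \to S$ if $(\Lambda_\alpha)_{\alpha\in D}$ is cofinal in $\mathcal{S}$, i.e., if for each $\Lambda \in \mathcal{S}$ there exists an $\alpha_0 \in D$ with $\Lambda_\alpha \supset \Lambda$ for all $\alpha \ge \alpha_0$.

(4.12) Theorem. Let $(E,\mathcal{E})$ be a standard Borel space and $\lambda \in \mathcal{M}(E,\mathcal{E})$. Let $(\mu_\alpha)_{\alpha\in D}$ be a net of random fields of the form $\mu_\alpha = \nu_\alpha\gamma^\alpha_{\Lambda_\alpha}$, where $(\nu_\alpha)_{\alpha\in D}$ is a net in $\mathcal{P}(\Omega,\mathcal{F})$, $(\gamma^\alpha)_{\alpha\in D}$ a net of $\lambda$-specifications $\gamma^\alpha = \rho^\alpha\lambda$, and $(\Lambda_\alpha)_{\alpha\in D}$ a net in $\mathcal{S}$ with $\Lambda_\alpha \to S$.
Suppose that for each $\Lambda \in \mathcal{S}$ and $\varepsilon > 0$ there are sets $B \in \mathcal{F}$ and $B_\Lambda \in \mathcal{E}^\Lambda$ such that $B \subset \{\sigma_\Lambda \in B_\Lambda\}$, $\lambda^\Lambda(B_\Lambda) < \infty$,
\[
\limsup_{\alpha\in D} \sup_{\omega\in B} \rho^\alpha_\Lambda(\omega) < \infty, \quad\text{and}\quad \limsup_{\alpha\in D} \mu_\alpha(\Omega\setminus B) \le \varepsilon.
\]
Then the net $(\mu_\alpha)_{\alpha\in D}$ is locally equicontinuous and therefore has a cluster point.

Proof. Let $\Lambda \in \mathcal{S}$ and $(A_m)_{m\ge1}$ be a sequence of sets $A_m = \{\sigma_\Lambda \in A_{m,\Lambda}\} \in \mathcal{F}_\Lambda$ with $A_m \downarrow \emptyset$. Let $\varepsilon > 0$ be given and $B \in \mathcal{F}$ and $B_\Lambda \in \mathcal{E}^\Lambda$ be chosen as in the hypothesis.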
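Then
\[
\limsup_{\alpha\in D} \mu_\alpha(A_m) \le \limsup_{\alpha\in D} \mu_\alpha(A_m \cap B) + \varepsilon
\]
for all $m \ge 1$. Since $\Lambda_\alpha \to S$ there is some $\alpha_0$ with $\Lambda_\alpha \supset \Lambda$ for all $\alpha \ge \alpha_0$. For $\alpha \ge \alpha_0$ we have
\[
\mu_\alpha = \nu_\alpha\gamma^\alpha_{\Lambda_\alpha} = \nu_\alpha\gamma^\alpha_{\Lambda_\alpha}\gamma^\alpha_\Lambda = \mu_\alpha\gamma^\alpha_\Lambda = \rho^\alpha_\Lambda(\mu_\alpha\lambda_\Lambda)
\]
and therefore
\[
\mu_\alpha(A_m \cap B) = \mu_\alpha\lambda_\Lambda(\rho^\alpha_\Lambda 1_{A_m\cap B})
\le \|\rho^\alpha_\Lambda 1_B\|\, \mu_\alpha\lambda_\Lambda(\{\sigma_\Lambda \in A_{m,\Lambda} \cap B_\Lambda\})
= \|\rho^\alpha_\Lambda 1_B\|\, \lambda^\Lambda(A_{m,\Lambda} \cap B_\Lambda)
\]
for all $m \ge 1$. Thus
\[
\limsup_{\alpha\in D} \mu_\alpha(A_m \cap B) \le c\,\lambda^\Lambda(A_{m,\Lambda} \cap B_\Lambda)
\]
for all $m \ge 1$ and some constant $c < \infty$. Since $\lambda^\Lambda(B_\Lambda) < \infty$ and $A_{m,\Lambda} \downarrow \emptyset$, $\lambda^\Lambda(A_{m,\Lambda} \cap B_\Lambda) \to 0$ as $m \to \infty$. The theorem is now evident. ∘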
The hypotheses of the preceding theorem are trivially satisfied when k is finite and the pA's are bounded uniformly in a. These conditions are natural whenever E is bounded in some sense. In general, however, we can only require that the pA's be bounded on suitable subsets of Q. This case is treated in the corollary below. To fix the ideas the reader might think of the specific case E = R,k = Lebesgue measure, K( = [ — tj~\, £ ^ 1. (4.13) Corollary. Let {E,S) be standard Borel, k e Ji(E,S\ (0")« eD a net of k-admissible potentials, (AJ a s D a net in y with Ax -^ S, (vj a e l ) a net in ^*(Q, J r ), and p.x = vayA" for all a e D. Suppose there is a sequence {K()(il in S such that the following conditions are satisfied: (0 0 < k{K() < oo for all £ ^ 1; (ii) lim lim sup /ia(<7; £ K() = 0 for all i e S; and /-•oo
ae D
(Hi) each A e y is contained in some A e y swc/i t/iat /or all £ ^L 1 lim sup
sup
|ffA'(a>)| < oo.
TTzen t/ie net (/OasD is locally equicontinuous and therefore has a cluster point. Proof We verify the hypotheses of Theorem (4.12). Let A e y and e > 0 be given. Assumption (iii) implies the existence of some A e y with A c A such that for all { ^ 1 c(0 = limsup "ED
sup 0)6K'X
\H%(co)\ ES\"
is finite, and hypothesis (ii) guarantees that there is some £ ^ 1 with X Hm sup na(ot i Kf) < 8. ieA
a-»oo
We put B = Kf x £ S \ A and BA = K A . Then /lA(flA) = /l(K^)|A| < oo, B c lim sup /za(Q\B) ^ lim sup £ M«^ <£ K/) < e, cteD
aeDi'eA
and lim sup sup p£"(co) ^ lim sup sup /zf(co)//lA(/z*° l K e B j | c o ) ^ e 2 c ( ^ ( K , r | A | < oo. The net (na)aeD thus satisfies the hypotheses of Theorem (4.12). D (4.14) Comments. (1) The assumptions (i) to (iii) of the preceding corollary are trivially satisfied when A is finite and lim sup ||
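for all $i \in S$ and $\Phi_A$ is bounded on $K_\ell^S$ for all $\ell \ge 1$ and $A \in \mathcal{S}$. ○

We conclude this section with a proposition which is of general interest but will not be needed later on. Its proof should be skipped on a first reading.

(4.15) Proposition. Suppose $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ is a cluster point of a locally equicontinuous sequence $(\mu_n)_{n \ge 1}$ in $\mathcal{P}(\Omega, \mathcal{F})$. Then there is a subsequence $(\mu_{n_k})_{k \ge 1}$ which converges to $\mu$.

Proof. We put $\nu = \sum_{n \ge 1} 2^{-n} \mu_n$, and for each $n \ge 1$ and $\Lambda \in \mathcal{S}$ we let $h_{n,\Lambda}$ denote a Radon-Nikodym density of $\mu_n|\mathcal{F}_\Lambda$ with respect to $\nu|\mathcal{F}_\Lambda$. We write $\mathcal{A}_\Lambda$ for the $\sigma$-algebra which is generated by the functions $h_{n,\Lambda}$, $n \ge 1$; thus $\mathcal{A}_\Lambda \subset \mathcal{F}_\Lambda$. We let $\mathcal{A}_\Lambda^0$ denote the system of all finite intersections of sets of the form $\{h_{n,\Lambda} \le c\}$ with $n \ge 1$ and rational $c$. $\mathcal{A}_\Lambda^0$ generates $\mathcal{A}_\Lambda$, and $\mathcal{A}^0 = \bigcup_{\Lambda \in \mathcal{S}} \mathcal{A}_\Lambda^0$ is countable. We write $\mathcal{A}^0 = \{A_1, A_2, \ldots\}$. Since $\mu$ is a cluster point of $(\mu_n)_{n \ge 1}$, there exists a strictly increasing sequence $(n_k)_{k \ge 1}$ such that
$$\max_{1 \le \ell \le k} |\mu_{n_k}(A_\ell) - \mu(A_\ell)| < 1/k$$
for all $k$. Clearly, $\lim_{k \to \infty} \mu_{n_k}(A) = \mu(A)$ for all $A \in \mathcal{A}^0$. We show that $\mu = \lim_k \mu_{n_k}$.

For fixed $\Lambda \in \mathcal{S}$ we define
$$\mathcal{A}_\Lambda^\infty = \{A \in \mathcal{A}_\Lambda : \mu(A) = \lim_{k \to \infty} \mu_{n_k}(A)\}.$$
We will show that $\mathcal{A}_\Lambda^\infty = \mathcal{A}_\Lambda$. As $\mathcal{A}_\Lambda^\infty$ contains the generator $\mathcal{A}_\Lambda^0$ of $\mathcal{A}_\Lambda$ and $\mathcal{A}_\Lambda^0$ is stable under finite intersections, it is sufficient to prove that $\mathcal{A}_\Lambda^\infty$ is a Dynkin system. By definition, this means that (i) $\Omega \in \mathcal{A}_\Lambda^\infty$, (ii) if $A, B \in \mathcal{A}_\Lambda^\infty$ with $A \subset B$ then $B \setminus A \in \mathcal{A}_\Lambda^\infty$, and (iii) if $(B_\ell)_{\ell \ge 1}$ is a sequence of pairwise disjoint sets in $\mathcal{A}_\Lambda^\infty$ then $B = \bigcup_{\ell \ge 1} B_\ell \in \mathcal{A}_\Lambda^\infty$. (i) and (ii) are trivial. To prove (iii) we put $C_m = \bigcup_{\ell \le m} B_\ell$, $m \ge 1$. Clearly, $C_m \in \mathcal{A}_\Lambda^\infty$ for all $m \ge 1$. Let $\varepsilon > 0$ be given. As $(\mu_n)_{n \ge 1}$ is equicontinuous on $\mathcal{F}_\Lambda$ and $B \setminus C_m \downarrow \emptyset$, there exists $m \ge 1$ such that $\limsup_{n \to \infty} \mu_n(B \setminus C_m) < \varepsilon$ and $\mu(B \setminus C_m) < \varepsilon$. This implies
$$\limsup_{k \to \infty} \mu_{n_k}(B) \le \lim_{k \to \infty} \mu_{n_k}(C_m) + \varepsilon = \mu(C_m) + \varepsilon \le \mu(B) + \varepsilon$$
and
$$\liminf_{k \to \infty} \mu_{n_k}(B) \ge \lim_{k \to \infty} \mu_{n_k}(C_m) = \mu(C_m) > \mu(B) - \varepsilon.$$
As $\varepsilon$ was arbitrary, $B \in \mathcal{A}_\Lambda^\infty$. Hence $\mathcal{A}_\Lambda^\infty$ is a Dynkin system, and this implies that $\mathcal{A}_\Lambda^\infty = \mathcal{A}_\Lambda$; see Theorem 2.3 of Bauer (1981), for example.

By standard extension arguments we conclude from the identity $\mathcal{A}_\Lambda^\infty = \mathcal{A}_\Lambda$ that $\lim_k \mu_{n_k}(f) = \mu(f)$ for all bounded $\mathcal{A}_\Lambda$-measurable functions $f$. But if $A \in \mathcal{F}_\Lambda$ and $f_A$ is a version of the conditional probability $\nu(A \mid \mathcal{A}_\Lambda)$ then
$$\mu_n(A) = \nu(h_{n,\Lambda} 1_A) = \nu(h_{n,\Lambda} f_A) = \mu_n(f_A)$$
for all $n \ge 1$. Therefore $\lim_k \mu_{n_k}(A) = \mu(A)$ for all $A \in \mathcal{F}_\Lambda$. Since $\Lambda$ was arbitrary, the proposition follows. □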
4.3 Continuity results
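Here we turn to problem (II) of the introduction to this chapter. We start with an example of a specification $\gamma$ which violates (II). $\gamma$ has a finite state space. Thus for each $\omega \in \Omega$ the net $(\gamma_\Lambda(\cdot \mid \omega))_{\Lambda \in \mathcal{S}}$ has cluster points. In fact, this net even converges to a limit. But this limit cannot belong to $\mathcal{G}(\gamma)$ because $\mathcal{G}(\gamma)$ is empty. Intuitively, $\gamma$ describes a lattice gas which consists of a single particle with a completely random position.

(4.16) Example. Let $E = \{0, 1\}$ and $S$ be arbitrary. For each $a \in S$ we look at the configuration $\omega^a$ which is defined by
$$\omega^a_i = \begin{cases} 1 & \text{if } i = a, \\ 0 & \text{otherwise,} \end{cases}$$
and we let $0$ denote the constant configuration taking the value 0. For each $\Lambda \in \mathcal{S}$ we define a proper kernel $\gamma_\Lambda$ from $\mathcal{T}_\Lambda$ to $\mathcal{F}$ by
$$\gamma_\Lambda(A \mid \omega) = \begin{cases} |\Lambda|^{-1} \sum_{a \in \Lambda} 1_A(\omega^a) & \text{if } \omega_{S \setminus \Lambda} = 0_{S \setminus \Lambda}, \\ 1_A(0_\Lambda \omega_{S \setminus \Lambda}) & \text{otherwise;} \end{cases}$$
here $A \in \mathcal{F}$, $\omega \in \Omega$. The system $\gamma = (\gamma_\Lambda)_{\Lambda \in \mathcal{S}}$ is a specification. Indeed, let $\emptyset \ne \Delta \subset \Lambda \in \mathcal{S}$, $f \in \mathcal{L}$, and $\omega \in \Omega$. If $\omega_{S \setminus \Lambda} = 0_{S \setminus \Lambda}$ then
$$\gamma_\Lambda \gamma_\Delta f(\omega) = |\Lambda|^{-1} \sum_{a \in \Lambda} \gamma_\Delta f(\omega^a) = |\Lambda|^{-1} \sum_{a \in \Delta} |\Delta|^{-1} \sum_{i \in \Delta} f(\omega^i) + |\Lambda|^{-1} \sum_{a \in \Lambda \setminus \Delta} f(\omega^a) = \gamma_\Lambda f(\omega).$$
Similarly, if $\omega_{S \setminus \Lambda} \ne 0_{S \setminus \Lambda}$ then
$$\gamma_\Lambda \gamma_\Delta f(\omega) = \gamma_\Delta f(0_\Lambda \omega_{S \setminus \Lambda}) = f(0_\Lambda \omega_{S \setminus \Lambda}) = \gamma_\Lambda f(\omega).$$
Hence $\gamma_\Lambda \gamma_\Delta = \gamma_\Lambda$. It is easily seen that $\nu_\Lambda \gamma_\Lambda \xrightarrow{\mathcal{L}} \delta_0$ for each net $(\nu_\Lambda)_{\Lambda \in \mathcal{S}}$ in $\mathcal{P}(\Omega, \mathcal{F})$. But $\delta_0$ does not belong to $\mathcal{G}(\gamma)$. In fact, $\mathcal{G}(\gamma)$ is empty. For suppose there exists some $\mu \in \mathcal{G}(\gamma)$. Then
$$\mu\Big(\sum_{i \in S} \sigma_i > 1\Big) = \mu\Big(\bigcup_{i \ne j} \{\sigma_i = \sigma_j = 1\}\Big) \le \sum_{i \ne j} \mu\gamma_{\{i,j\}}(\sigma_i = \sigma_j = 1) = 0$$
and
$$\mu\Big(\sum_{i \in S} \sigma_i = 0\Big) = \mu\gamma_\Lambda(\{0\}) = \int \mu(d\omega)\, \gamma_\Lambda(\sigma_\Lambda = 0_\Lambda \mid \omega) = 0,$$
where $\Lambda \in \mathcal{S}$ is arbitrary. Finally,
$$\mu\Big(\sum_{i \in S} \sigma_i = 1\Big) = \sum_{a \in S} \mu(\{\omega^a\}) = 0$$
because
$$\mu(\{\omega^a\}) = \mu\gamma_\Lambda(\{\omega^a\}) = |\Lambda|^{-1} \mu(\sigma_{S \setminus \Lambda} = 0_{S \setminus \Lambda}) \le |\Lambda|^{-1}$$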
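whenever $a \in \Lambda \in \mathcal{S}$. Consequently, $\mu(\Omega) = 0$. As this is impossible, $\mathcal{G}(\gamma) = \emptyset$. ○

In the example above, the non-existence comes from the fact that $\gamma_\Lambda(\cdot \mid \omega)$ depends on the behaviour of $\omega$ at infinity, which cannot be controlled by the topology of local convergence. Thus, in order to exclude this source of non-existence we need to rule out $\gamma$ having a tail dependence. As we know from Section 2.2, this can be done by requiring that $\gamma$ is quasilocal in the sense of Definition (2.23). The theorem below will show that the condition of quasilocality is tailor-made to ensure property (II) of the introduction to this chapter.

We need to introduce a concept of convergence for specifications. To avoid technicalities we choose the simplest. Let $(\gamma^\alpha)_{\alpha \in D}$ be a net of specifications. We say $(\gamma^\alpha)_{\alpha \in D}$ converges uniformly in the $\mathcal{L}$-topology to a specification $\gamma$, and write $\gamma^\alpha \xrightarrow{\mathcal{L}} \gamma$, if
$$\lim_{\alpha \in D} \|\gamma_\Lambda^\alpha f - \gamma_\Lambda f\| = 0$$
for all $\Lambda \in \mathcal{S}$ and $f \in \mathcal{L}$; here $\|\cdot\|$ is the sup-norm. Clearly, this condition implies that $\gamma_\Lambda^\alpha(\cdot \mid \omega) \xrightarrow{\mathcal{L}} \gamma_\Lambda(\cdot \mid \omega)$ for all $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$.

(4.17) Theorem. Suppose $\gamma$ is a quasilocal specification and $(\gamma^\alpha)_{\alpha \in D}$ a net of specifications with $\gamma^\alpha \xrightarrow{\mathcal{L}} \gamma$. Let $(\Lambda_\alpha)_{\alpha \in D}$ be a net in $\mathcal{S}$ with $\Lambda_\alpha \to S$ and $(\nu_\alpha)_{\alpha \in D}$ a net in $\mathcal{P}(\Omega, \mathcal{F})$ such that $\nu_\alpha \gamma_{\Lambda_\alpha}^\alpha \xrightarrow{\mathcal{L}} \mu$ for some $\mu \in \mathcal{P}(\Omega, \mathcal{F})$. Then $\mu \in \mathcal{G}(\gamma)$.

Proof. Let $\Lambda \in \mathcal{S}$ and $f \in \mathcal{L}$ be given. By hypothesis, $\gamma_\Lambda f \in \bar{\mathcal{L}}$. Thus
$$\mu(\gamma_\Lambda f) = \lim_{\alpha \in D} \nu_\alpha \gamma_{\Lambda_\alpha}^\alpha(\gamma_\Lambda f)$$
and therefore
$$|\mu\gamma_\Lambda(f) - \mu(f)| = \lim_{\alpha \in D} |\nu_\alpha \gamma_{\Lambda_\alpha}^\alpha(\gamma_\Lambda f) - \nu_\alpha \gamma_{\Lambda_\alpha}^\alpha(f)| = \lim_{\alpha \in D} |\nu_\alpha \gamma_{\Lambda_\alpha}^\alpha(\gamma_\Lambda f) - \nu_\alpha \gamma_{\Lambda_\alpha}^\alpha(\gamma_\Lambda^\alpha f)| \le \lim_{\alpha \in D} \|\gamma_\Lambda f - \gamma_\Lambda^\alpha f\| = 0.$$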
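The second equality follows from the fact that $\gamma_{\Lambda_\alpha}^\alpha = \gamma_{\Lambda_\alpha}^\alpha \gamma_\Lambda^\alpha$ as soon as $\Lambda_\alpha \supset \Lambda$. Since $f$ and $\Lambda$ are arbitrary, it follows that $\mu \in \mathcal{G}(\gamma)$. □

(4.18) Comment. The above theorem provides an answer to problem (II) of the introduction to this chapter: If $\gamma$ is quasilocal and $\mu$ is any cluster point of the net $(\gamma_\Lambda(\cdot \mid \omega))_{\Lambda \in \mathcal{S}}$ for some boundary condition $\omega \in \Omega$ then $\mu \in \mathcal{G}(\gamma)$. This follows from (4.17) by considering a subnet $(\gamma_{\Lambda_\alpha}(\cdot \mid \omega))_{\alpha \in D}$ which converges to $\mu$, and setting $\gamma^\alpha = \gamma$ and $\nu_\alpha = \delta_\omega$, the Dirac measure at $\omega$. More generally, suppose we are given a net $(\mathcal{S}_\alpha)_{\alpha \in D}$ of non-empty finite subsets of $\mathcal{S}$ such that $\Lambda_\alpha := \bigcap \{\Lambda : \Lambda \in \mathcal{S}_\alpha\} \ne \emptyset$ and $\Lambda_\alpha \to S$. Let $(\nu_\alpha)_{\alpha \in D}$ be any net in $\mathcal{P}(\Omega, \mathcal{F})$ and
$$\mu_\alpha = |\mathcal{S}_\alpha|^{-1} \sum_{\Lambda \in \mathcal{S}_\alpha} \nu_\alpha \gamma_\Lambda \qquad (\alpha \in D).$$
If $\gamma$ is quasilocal then each cluster point of the net $(\mu_\alpha)_{\alpha \in D}$ belongs to $\mathcal{G}(\gamma)$. This follows from (4.17) because $\mu_\alpha = \mu_\alpha \gamma_{\Lambda_\alpha}$ for all $\alpha \in D$. (As we shall see in (5.18), nets $(\mu_\alpha)_{\alpha \in D}$ of the above form can be used to construct Gibbs measures with symmetries.) ○

In order to pave the way for some further applications of Theorem (4.17) we will now provide a sufficient condition for the uniform $\mathcal{L}$-convergence of Gibbsian specifications.

(4.19) Proposition. Let $\lambda \in \mathcal{M}(E, \mathcal{E})$ and $(\Phi^\alpha)_{\alpha \in D}$ be a net of $\lambda$-admissible potentials. Suppose there is a potential $\Phi$ such that
$$\lim_{\alpha \in D} \|H_\Lambda^{\Phi^\alpha - \Phi}\| = 0$$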
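for all $\Lambda \in \mathcal{S}$. Then $\gamma^{\Phi^\alpha} \xrightarrow{\mathcal{L}} \gamma^\Phi$.

Proof. Fix $\Lambda \in \mathcal{S}$ and write $h_\Lambda^\alpha = \exp[-H_\Lambda^{\Phi^\alpha}]$, $Z_\Lambda^\alpha = \lambda_\Lambda(h_\Lambda^\alpha)$ and $\rho_\Lambda^\alpha = h_\Lambda^\alpha / Z_\Lambda^\alpha$; let $h_\Lambda$, $Z_\Lambda$, $\rho_\Lambda$ and $\gamma_\Lambda$ denote the corresponding quantities for $\Phi$. Then
$$\lambda_\Lambda(|\rho_\Lambda^\alpha - \rho_\Lambda|) \le \lambda_\Lambda\big(|h_\Lambda^\alpha - h_\Lambda| / Z_\Lambda + h_\Lambda^\alpha\, |1/Z_\Lambda^\alpha - 1/Z_\Lambda|\big) = \gamma_\Lambda(|h_\Lambda^\alpha / h_\Lambda - 1|) + |Z_\Lambda^\alpha - Z_\Lambda| / Z_\Lambda$$
$$\le 2\gamma_\Lambda(|h_\Lambda^\alpha / h_\Lambda - 1|) \le 2\,\|h_\Lambda^\alpha / h_\Lambda - 1\| \le 2\big(\exp \|H_\Lambda^{\Phi^\alpha - \Phi}\| - 1\big).$$
Therefore, if $f \in \mathcal{L}$ then
$$\|\gamma_\Lambda^{\Phi^\alpha} f - \gamma_\Lambda^{\Phi} f\| = \|\lambda_\Lambda(f(\rho_\Lambda^\alpha - \rho_\Lambda))\| \le \|f\|\, \|\lambda_\Lambda(|\rho_\Lambda^\alpha - \rho_\Lambda|)\| \le 2\,\|f\|\big(\exp \|H_\Lambda^{\Phi^\alpha - \Phi}\| - 1\big).$$
By hypothesis, the final expression tends to zero as $\alpha$ runs through $D$. □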
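The key estimate behind Proposition (4.19) is that the total-variation-type distance between two Gibbs densities is at most $2(\exp\|H^\alpha - H\| - 1)$. The following minimal sketch checks this numerically in a toy setting that is our own assumption, not part of the text: a single finite "volume" whose state space is a finite list carrying counting measure, with the boundary condition held fixed so that each Hamiltonian is just a list of numbers. The function names are hypothetical.

```python
import math
import random

def gibbs_density(H):
    """Normalised Boltzmann weights exp(-H)/Z with respect to counting measure."""
    weights = [math.exp(-h) for h in H]
    Z = sum(weights)
    return [w / Z for w in weights]

def l1_distance(H_alpha, H):
    """Integral of |rho^alpha - rho| against counting measure."""
    rho_alpha, rho = gibbs_density(H_alpha), gibbs_density(H)
    return sum(abs(a - b) for a, b in zip(rho_alpha, rho))

def bound(H_alpha, H):
    """Right-hand side 2(exp||H^alpha - H|| - 1), with the sup-norm over states."""
    sup_norm = max(abs(a - b) for a, b in zip(H_alpha, H))
    return 2.0 * (math.exp(sup_norm) - 1.0)

# Random spot check; the seed and ranges are arbitrary illustrative choices.
rng = random.Random(0)
for _ in range(1000):
    n = rng.randint(2, 8)
    H = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    H_alpha = [h + rng.uniform(-1.0, 1.0) for h in H]
    assert l1_distance(H_alpha, H) <= bound(H_alpha, H) + 1e-12
```

As the perturbation `H_alpha - H` shrinks in sup-norm, `bound` tends to zero, which is the mechanism by which the proposition yields uniform convergence of the specifications.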
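Proposition (4.19) implies the following particular result: For any finite a priori measure $\lambda$, the mapping $\Phi \to \gamma^\Phi$ from the space $\mathcal{B}$ of all absolutely summable potentials into the set of all Gibbsian specifications is continuous. There are two standard constructions of nets $(\Phi^\alpha)_{\alpha \in D}$ which converge to a given potential $\Phi$ in the sense of Proposition (4.19). These constructions are characterized by two particular types of boundary conditions outside a given region $\Lambda \in \mathcal{S}$, namely "free boundary conditions" and "periodic boundary conditions"; see the examples below.

(4.20) Examples. Suppose $\lambda \in \mathcal{M}(E, \mathcal{E})$ is finite and $\Phi$ is a potential. For each $\Lambda \in \mathcal{S}$ we define a potential $\Phi^\Lambda$ by
$$\Phi_A^\Lambda = \begin{cases} \Phi_A & \text{if } A \subset \Lambda, \\ 0 & \text{otherwise.} \end{cases}$$
Suppose $\Phi^\Lambda$ is $\lambda$-admissible. (This certainly holds when $\Phi \in \mathcal{B}$ and $\lambda$ is finite.) We then can look at the associated Gibbs specification $\gamma^{\Phi^\Lambda}$. The restriction of $\gamma_\Lambda^{\Phi^\Lambda}(\cdot \mid \omega)$ to $\mathcal{F}_\Lambda$ does not depend on the choice of $\omega \in \Omega$ and is called the Gibbs distribution for $\Phi$ in $\Lambda$ with free boundary condition. Since
$$\limsup_{\Lambda \in \mathcal{S}} \|H_\Delta^{\Phi^\Lambda - \Phi}\| \le \lim_{\Lambda \in \mathcal{S}} \sum_{A \cap \Delta \ne \emptyset,\ A \setminus \Lambda \ne \emptyset} \|\Phi_A\| = 0$$
for all $\Delta \in \mathcal{S}$, we can conclude from Proposition (4.19) and Theorem (4.17) that each cluster point of the net $(\gamma_\Lambda^{\Phi^\Lambda}(\cdot \mid \omega))_{\Lambda \in \mathcal{S}}$ belongs to $\mathcal{G}(\Phi)$. Incidentally, the free boundary condition coincides with a configurational boundary condition whenever $\Phi$ is a gas potential with a vacuum state $a \in E$. For in this case we have, letting $a$ denote the constant configuration with value $a$, $\Phi_A^\Lambda = \Phi_A(\sigma_\Lambda a_{S \setminus \Lambda})$ for all $A \in \mathcal{S}$, and thereby $\gamma_\Lambda^{\Phi^\Lambda}(\cdot \mid$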
$$\Lambda = S \cap \prod_{k=1}^{d} [m_k,\, m_k + p_k - 1]$$
with $m = (m_1, \ldots, m_d) \in S$ and $p = (p_1, \ldots, p_d) \in \mathbb{N}^d$.
We fix such a $\Lambda$ and think of it as a torus; that is, we identify $\Lambda$ with the factor group $S / p \cdot S$. Accordingly, we write $i \equiv j$ if $i, j \in S$ are such that $i_k \equiv j_k \pmod{p_k}$ for all $1 \le k \le d$. Also, if $A, B \in \mathcal{S}$ then the notation $A \equiv B$ means that $B = A + i$ for some $i \in S$ with $i \equiv 0$. For each $A \in \mathcal{S}$ we put $A^\# = \{i \in \Lambda : i \equiv j \text{ for some } j \in A\}$. (Clearly, if $A \equiv B$ then $A^\# = B^\#$. The converse, however, is false.) In the case $A \subset \Lambda$ we also consider the set $\mathcal{S}(A) = \{B \in \mathcal{S} : B^\# = A\}$, and we choose an arbitrary set $\mathcal{R}(A)$ of representatives of each $(\equiv)$-equivalence class in $\mathcal{S}(A)$. Finally, we let $\bar\sigma_\Lambda : \Omega \to \Omega$ denote the periodic continuation of the projection $\sigma_\Lambda$, i.e., for each $\omega \in \Omega$ we put $\bar\sigma_\Lambda(\omega) = (\omega_{j(i)})_{i \in S}$, where $j(i)$ is the unique element of $\Lambda$ with $j(i) \equiv i$. The shift-invariance of
— •< B s «M)
I
0
otherwise.
<5A is called the A-periodic modification of 3>. Assuming <5A is /l-admissible, we can look at the specification y*4. The restriction of y£*{-\co) to 3FK is independent of <x> G Q and is called the Gibbs distribution in A for $ with periodic boundary condition. Now let co G Q be given. We claim that each cluster point of the net (yTi'l^&esr belongs to ^(3>). This follows from Proposition (4.19) and Theorem (4.17) provided we can show that Hi/*4"*!! -> 0 for all A G ¥ as A runs through 5^D. So let A G y be given and A G 5^D so large that A D A . For each A a A with A D A ^ 0 we can assume without loss that B D A ^ 0 for all B G £(4). Then
I
I
AczA BeM(A) 4 Ar\Ayt0 B<=A
#B ° 5A =
£
and thus
||flf-*||g
I
I
||0„|| + I ||OJ
.4<=A BeM(A) AnAïQ B\A#0
A\A^0 .4nA#0
Û2 X ||0>J. .4\A#0 .4nA#0
Under our hypotheses on $\Phi$, the last expression tends to zero as $\Lambda$ runs through these boxes. The claim is thus proved. ○

We conclude this section by mentioning a simple consequence of Theorem (4.17). Occasionally, Gibbs measures are defined by the condition $\mu\gamma_\Lambda = \mu$ on $\mathcal{F}_\Lambda$ (rather than $\mathcal{F}$) for all $\Lambda \in \mathcal{S}$. Clearly, this condition implies $\mu\gamma_\Lambda \xrightarrow{\mathcal{L}} \mu$. Hence, if $\gamma$ is quasilocal it follows from Theorem (4.17) that $\mu \in \mathcal{G}(\gamma)$. We thus have the following result.

(4.21) Remark. Suppose $\gamma$ is a quasilocal specification. Then $\mu \in \mathcal{G}(\gamma)$ if and only if $\mu(A) = \int \gamma_\Lambda(A \mid \cdot)\, d\mu$ for all $A \in \mathcal{F}_\Lambda$ and all $\Lambda \in \mathcal{S}$.
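Returning to Example (4.16) of this section, the consistency relation $\gamma_\Lambda \gamma_\Delta = \gamma_\Lambda$ for $\Delta \subset \Lambda$ can be checked by brute force on a small site set. The sketch below is an illustration under our own assumptions: the site set is replaced by a finite stand-in `S = range(N)`, the names `particle`, `gamma` and `consistent` are hypothetical, and the test function `f` is arbitrary. On a finite site set only the consistency identity is exhibited; the non-existence of Gibbs measures in the example relies on $S$ being infinite.

```python
from fractions import Fraction
from itertools import product

N = 5            # hypothetical finite stand-in for the site set S
S = range(N)

def particle(a):
    """Configuration omega^a: a single particle (value 1) at site a."""
    return tuple(1 if i == a else 0 for i in S)

def gamma(Lam, f, omega):
    """(gamma_Lam f)(omega) for the kernel of Example (4.16), in exact arithmetic."""
    Lam = sorted(Lam)
    if all(omega[i] == 0 for i in S if i not in Lam):
        # Empty exterior: one particle placed uniformly at random in Lam.
        return sum(Fraction(f(particle(a))) for a in Lam) / len(Lam)
    # Non-empty exterior: Lam is emptied, the exterior is left unchanged.
    emptied = tuple(0 if i in Lam else omega[i] for i in S)
    return Fraction(f(emptied))

def f(omega):
    """An arbitrary test function on configurations."""
    return sum((i + 2) ** 2 * s for i, s in enumerate(omega)) + 7 * omega[0] * omega[-1]

def consistent(Lam, Delta):
    """Check gamma_Lam(gamma_Delta f) == gamma_Lam f at every configuration."""
    inner = lambda w: gamma(Delta, f, w)
    return all(gamma(Lam, inner, w) == gamma(Lam, f, w)
               for w in product((0, 1), repeat=N))
```

With the empty boundary condition, `gamma(Lam, lambda w: w[0], (0,) * N)` equals `1 / len(Lam)` whenever `0` lies in `Lam`, so the probability of finding the particle at any fixed site shrinks as the window grows; this is the local convergence to $\delta_0$ noted in the example.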
4.4 Existence and topological properties of Gibbs measures
We now combine the results of the preceding sections to state a general though somewhat vague existence theorem.

(4.22) Theorem. Let $(E, \mathcal{E})$ be a standard Borel space and $\gamma$ a quasilocal specification. Suppose there is a locally equicontinuous net $(\mu_\alpha)_{\alpha \in D}$ of random fields of the form $\mu_\alpha = \nu_\alpha \gamma_{\Lambda_\alpha}^\alpha$, $\alpha \in D$, where $(\nu_\alpha)_{\alpha \in D}$ is a net in $\mathcal{P}(\Omega, \mathcal{F})$, $(\gamma^\alpha)_{\alpha \in D}$ a net of specifications with $\gamma^\alpha \xrightarrow{\mathcal{L}} \gamma$, and $(\Lambda_\alpha)_{\alpha \in D}$ a net in $\mathcal{S}$ with $\Lambda_\alpha \to S$. Then $\mathcal{G}(\gamma)$ contains a cluster point of $(\mu_\alpha)_{\alpha \in D}$, and is therefore non-empty.

Proof. According to Proposition (4.9), $(\mu_\alpha)_{\alpha \in D}$ has a cluster point $\mu$. $\mu$ is the limit of a subnet of $(\mu_\alpha)_{\alpha \in D}$. Applying Theorem (4.17) to this subnet we see that $\mu \in \mathcal{G}(\gamma)$. □

Suppose we want to apply the preceding theorem to a given specification $\gamma$. Then, of course, our main task is to find a locally equicontinuous net $(\mu_\alpha)_{\alpha \in D}$ of the required form. Unfortunately, there is no general rule which allows us to make a clever choice of $(\mu_\alpha)_{\alpha \in D}$. Rather, this choice depends on the particular properties of the specification $\gamma$ at hand. However, in general we are not interested in the mere existence of a Gibbs measure but in the existence of a Gibbs measure with certain additional properties. We then need to choose a net $(\mu_\alpha)_{\alpha \in D}$ such that all its cluster points have these properties. In concrete models, there is often a natural candidate for such a net, and we are only left with the problem of verifying its local equicontinuity. As a tool we can then (and will) use Theorem (4.12) or Corollary (4.13). As a matter of fact, we shall proceed in this way in the proofs of Theorem (6.21), Proposition (18.12), and Proposition (20.7).

There are no existence problems when $\gamma$ is a quasilocal $\lambda$-specification for some finite $\lambda$ such that all $\rho_\Lambda$'s are bounded. For in this case each net $(\mu_\alpha)_{\alpha \in D}$ of the form $\mu_\alpha = \nu_\alpha \gamma_{\Lambda_\alpha}$ with $\nu_\alpha \in \mathcal{P}(\Omega, \mathcal{F})$ and $\Lambda_\alpha \to S$ satisfies the hypotheses of Theorem (4.12). In particular, this holds when $\gamma$ is Gibbsian for some absolutely summable potential. The next (and final) theorem deals with this important particular case. As in (2.11) we let $\mathcal{B}$ denote the Fréchet space of all absolutely summable potentials, and for a given finite $\lambda \in \mathcal{M}(E, \mathcal{E})$ we look at the correspondence $\mathcal{G}$ from $\mathcal{B}$ to $\mathcal{P}(\Omega, \mathcal{F})$ which maps each
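(c) The graph $\{(\Phi, \mu) : \Phi \in \mathcal{B},\ \mu \in \mathcal{G}(\Phi)\}$ of the correspondence $\mathcal{G}$ is closed.
(d) $\mathcal{G}$ is upper semicontinuous, i.e., for each closed subset $F$ of $\mathcal{P}(\Omega, \mathcal{F})$ the set $\mathcal{G}^{-1}(F) = \{\Phi \in \mathcal{B} : \mathcal{G}(\Phi) \cap F \ne \emptyset\}$ is closed.

Proof. (a) Let $\Phi \in \mathcal{B}$ and $\omega \in \Omega$. Comment (4.14)(1) and Corollary (4.13) show that the net $(\gamma_\Lambda^\Phi(\cdot \mid \omega))_{\Lambda \in \mathcal{S}}$ is locally equicontinuous. Therefore Example (2.25) and Theorem (4.22) imply that $\mathcal{G}(\Phi)$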
= / i ä ^ /i.
Theorem (4.17) thus shows that $\mu \in \mathcal{G}(\Phi)$. (We note that statement (c) does not rely on the hypothesis that $(E, \mathcal{E})$ is standard Borel.) (d) Suppose $\Phi$ is the limit of a sequence $(\Phi^n)_{n \ge 1}$ in $\mathcal{G}^{-1}(F)$. (As $\mathcal{B}$ is metrizable, we only need to look at sequences.) We will show that $\Phi \in \mathcal{G}^{-1}(F)$. By definition, there is a sequence $(\mu_n)_{n \ge 1}$ in $F$ with $\mu_n \in \mathcal{G}(\Phi^n)$. $(\mu_n)_{n \ge 1}$ has a cluster point $\mu$. This follows from Comment (4.14)(1) and Corollary (4.13) because $\lim \|$
(actually of a subsequence, by Proposition (4.15)). In particular, $\lim_k (\Phi^{n_k}, \mu_{n_k}) = (\Phi, \mu)$. Since the graph of $\mathcal{G}$ is closed, $\mu \in \mathcal{G}(\Phi)$. As $F$ is closed, $\mu \in F$. Hence $\mathcal{G}(\Phi) \cap F \ne \emptyset$ and thereby
4.A Appendix. Standard Borel spaces
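This appendix is devoted to a proof and some consequences of a fundamental result on standard Borel spaces, the isomorphism theorem (4.A6). This theorem has not been used in Chapter 4, but one of its consequences will be needed later on in Chapters 7, 14, and 15. Of course, the isomorphism theorem also contributes to a proper understanding of the notion of a standard Borel space.

(4.A1) Definition. Let $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ be two measurable spaces. We say $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ are isomorphic and write $(X, \mathcal{X}) \simeq (Y, \mathcal{Y})$ if there is a bijection $f : X \to Y$ such that $f$ and $f^{-1}$ are measurable. We write $(X, \mathcal{X}) \preceq (Y, \mathcal{Y})$ if there is some $B \in \mathcal{Y}$ such that $(X, \mathcal{X}) \simeq (B, \mathcal{Y}|B)$, where $\mathcal{Y}|B = \{A \in \mathcal{Y} : A \subset B\}$.

(4.A2) Remarks. (a) The relation "$\preceq$" between measurable spaces is transitive.
(b) If $S \ne \emptyset$ is a countable set and $(X, \mathcal{X})$, $(Y, \mathcal{Y})$ are measurable spaces with $(X, \mathcal{X}) \preceq (Y, \mathcal{Y})$ then $(X^S, \mathcal{X}^S) \preceq (Y^S, \mathcal{Y}^S)$.
(c) If $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ are measurable spaces then $(X, \mathcal{X}) \simeq (Y, \mathcal{Y})$ if and only if $(X, \mathcal{X}) \preceq (Y, \mathcal{Y})$ and $(Y, \mathcal{Y}) \preceq (X, \mathcal{X})$.

Proof. Only the "if" part of (c) needs to be proved. Suppose there are sets $B \in \mathcal{Y}$, $A \in \mathcal{X}$ and bimeasurable bijections $f : X \to B$ and $g : Y \to A$. We define $A_n \in \mathcal{X}$, $B_n \in \mathcal{Y}$ ($n \ge 0$) recursively by $A_0 = X$, $B_0 = Y$, $A_{n+1} = g(B_n)$, $B_{n+1} = f(A_n)$, and we put $A_\infty = \bigcap_{n \ge 0} A_n$, $B_\infty = \bigcap_{n \ge 0} B_n$. Let
$$F(x) = \begin{cases} f(x) & \text{if } x \in A_\infty \cup \bigcup_{n \ge 0} (A_{2n} \setminus A_{2n+1}), \\ g^{-1}(x) & \text{if } x \in \bigcup_{n \ge 0} (A_{2n+1} \setminus A_{2n+2}) \end{cases}$$
and
$$G(y) = \begin{cases} f^{-1}(y) & \text{if } y \in B_\infty \cup \bigcup_{n \ge 0} (B_{2n+1} \setminus B_{2n+2}), \\ g(y) & \text{if } y \in \bigcup_{n \ge 0} (B_{2n} \setminus B_{2n+1}). \end{cases}$$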
It is easily checked that $G \circ F = \mathrm{id}_X$ and $F \circ G = \mathrm{id}_Y$. Hence $F$ is an isomorphism of $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$. □

We will now look at standard Borel spaces; cf. Definition (4.7). First, we note that a measurable space $(E, \mathcal{E})$ is a standard Borel space if (and only if) $(E, \mathcal{E}) \simeq (X, \mathcal{B}(X))$ for some complete separable metric space $X$ with its Borel $\sigma$-algebra $\mathcal{B}(X)$. This is because in this case it is possible to define a metric on $E$ in such a way that the isomorphism between $E$ and $X$ becomes an isometry. Next we observe that the configuration space of a spin system is standard Borel whenever the single spin space is standard Borel:

(4.A3) Remark. If $S \ne \emptyset$ is countable and $(E, \mathcal{E})$ is a standard Borel space then $(\Omega, \mathcal{F}) = (E^S, \mathcal{E}^S)$ is a standard Borel space.

Proof. The case of finite $S$ is easy. So we assume without loss that $S = \mathbb{N}$. Let $d$ be a complete metric on $E$ which generates $\mathcal{E}$, and $D$ a countable dense subset of $E$. Define
$$\bar d(\omega, \omega') = \sum_{n \ge 1} 2^{-n} \big(1 \wedge d(\omega_n, \omega'_n)\big) \qquad (\omega, \omega' \in \Omega).$$
It is easily seen that dis a complete metric on Q which induces the product topology on Q. For each fixed a e D, {to e Ds: to„ = a eventually} is a countable dense subset of Q. Each J-open set is a countable union of sets of the N
form P| {d(an,ton) < e} {to„ e F,iV ^ l,e > 0) in
$$K = \{0, 1\}^{\mathbb{N}}$$
with the $\sigma$-algebra $\mathcal{B}(K)$ which is generated by the cylinder events. It is well-known, and easy to check, that $\mathcal{B}(K)$ is the Borel $\sigma$-algebra with respect to the metric
$$d_K(\omega, \omega') = \sum_{n \ge 1} 2^{-n} |\omega_n - \omega'_n| \qquad (\omega, \omega' \in K)$$
on $K$. $(K, d_K)$ is a compact, and therefore complete separable, metric space. The space $(K, \mathcal{B}(K))$ is idempotent in the following sense:

(4.A5) $(K^S, \mathcal{B}(K)^S) \simeq (K, \mathcal{B}(K))$

for each countable set $S \ne \emptyset$. This is because each bijection $\varphi : S \times \mathbb{N} \to \mathbb{N}$ defines an isomorphism $f$ from $(K, \mathcal{B}(K))$ to $(K^S, \mathcal{B}(K)^S)$ by $f((\omega_n)_{n \ge 1}) = ((\omega_{\varphi(i,n)})_{n \ge 1})_{i \in S}$, $\omega_n \in \{0, 1\}$. The main properties of standard Borel spaces are collected in the next theorem. We let $\mathfrak{P}(M)$ denote the set of all subsets of a set $M$.
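(4.A6) Theorem. Let $(E, \mathcal{E})$ be a measurable space, and suppose that $(E, \mathcal{E}) \preceq (X, \mathcal{B}(X))$ for some complete separable metric space $X$ and its Borel $\sigma$-algebra $\mathcal{B}(X)$. Then
$$(E, \mathcal{E}) \simeq \begin{cases} (K, \mathcal{B}(K)) & \text{if } E \text{ is uncountable,} \\ (\mathbb{N}, \mathfrak{P}(\mathbb{N})) & \text{if } E \text{ is countably infinite,} \\ (\{1, \ldots, |E|\}, \mathfrak{P}(\{1, \ldots, |E|\})) & \text{if } E \text{ is finite.} \end{cases}$$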
In particular, (E, S) is standard Borel, and for each standard Borel space (E, S) there exists a metric on E such that E is compact and ë is the corresponding Borel a-algebra. The proof of Theorem (4. A6) will be preceded by two propositions. (4.A7) Proposition. Let (X,d) be a complete separable metric space. Then (X,08(X)))Z(K,08(K)). Proof 1) (X,08(X)) £ (H,38(H)), where H = [0,1[ N and 08(H) is the Borel (T-algebra for the metric dH(co,a>') = £ 2~"\co„ - m'n\
(co,co'e H)
«ei
on H. Let {z„:n^
1} be a dense subset of X and define / : X -> H by
V1 + d(x,z„)Jnèl It is easily seen that / is injective and continuous and thereby measurable. To prove that f(X) e 08(H) and f~u.f(X) -> X is measurable it is sufficient to show that the system {A cz X: f(A) e 08(H)) contains all closed sets. To this end we first observe that for each x e X and e > 0 there is some 0 < ô(x, e) < e such that for all y e X with dH(f(y), f(x)) < ô(x, s) we have d(y, x) < e. For a given closed set A cz X we consider the open sets Ge= (J
{dH(-,f(x))<ô(x,e)},
xe A
e > 0. Clearly, f(A) cz GE for all e > 0. We show that f(A) = f] G1/m. (This mal
will imply f(A) e 0S(H)) Let to e f] G1/m. Then for each m ^ 1 there is some mal
xm e A with dH(co,f(xJ) < ô(xm, l/m) < l/m. We define a sequence (mk)kil recursively by m, = l,mk^ 2k, l/mk+1 < ô(xmk, l/mk) - dH((o,f(xmJ), k^\. Then for all k ^ 1 we have dH(f(xmkJ, f(xmJ) < ô(xmk, l/m J and therefore d(xmhi,xmh) < \/mk ^2 k. Hence (xnJk â l is a Cauchy sequence and
converges to some x e A. As / is continuous, f(x) = lim / ( x m J = co. Hence k x co e f(A). ^ 2) ([0,1[,#([0,1[))£(K,#(K)). To see this we define a function g: [0,1 [ ^ K by g(t) = (t„) n â l with tn = [2"t]mod2, 0 g t < 1. Thus (t„) nàl is the unique dyadic expansion of t with tn = 0 infinitely often. It is well-known that ôf is injective and measurable. #([0,1 [) is the complement of the countable set {co e K: co„ = 1 eventually}, and co -» £ «„2"" is a continuous inverse of g. "âl 3) From 2) and Remark (4.A2)(b) we obtain (H, 08(H)) S (KN,<%(K)N). Combined with 1) and relation (4.A5) this implies (X, &(X)) S (K, @(K)). a (4.A8) Proposition. Let (E, S) be a measurable space with uncountable E, and suppose that (E, S) £5 (X, &(X)) for some complete separable metric space (X, d) and its Borel o-algebra âS(X). Then (K,@(K)) £ {E,S). Proof. 1) We consider the set si of all subsets A of X with the property that there is a continuous bijection from a complete separable metric space onto A. We will show that si => 0&(X). The proof rests on the following four observations. i) si contains all open subsets of X. For if A <= X is open then the metric d'(x,y) = d(x,y) + |d(x,.XV4)-1 — d(y,X\A)~l\ (x,y e A) turns A into a complete separable metric space. ii) si contains all closed subsets of X. This is because the restriction of d to a closed set A a X is complete, and choosing one element of each nonempty set of the form {x e A: d(x,y) < 1/n} (n ^ 1 and y e D, a countable dense subset of X) we obtain a countable dense subset of A. iii) si is stable under countable disjoint unions. iv) se is stable under countable intersections. For let /„: Y„ -» An (n ^ 1) be continuous bijections from complete separable metric spaces Y„ onto subsets AnoîX. Then Y = |(y„)„ à l e f i r»: L(yn) = MVx)
for all n ^ l |
is a closed subset of the complete separable metric space Y\ Yn (cf. Remark (4.A3)) and therefore, by ii), a complete separable metric space. Moreover, / : Y -» p | An defined by /((yJng i) = fiiy^ is a continuous bijection. Now we look at the set si* = {^1 e ^/: X\/l e si}. According to i) and ii), si* contains all open sets. Moreover, if (An)n^1 is a sequence in si* then [J A„ e J / * . This is because iii) and iv) imply that [j An = (J (y4„n"ff X \ y 4 J e j ^ and X \ y A„ = f] X\Anesi. n è l \
fc
=l
/
"al
(T-algebra. This proves that si* =3 ^(X).
nâl
Hence si* is a
2) Suppose (Y,d) is an uncountable complete separable metric space. We construct a continuous injection g: K -> Y. We let °ll denote the countable set of all open balls in Y with rational radii and centers in a countable dense subset of Y. % is a countable base of Y. The set Y0 = (J {U: U e W, U countable} is countable. Hence Yj = Y\ Y0 is uncountable, and each neighbourhood of any point in Yx is uncountable. For each n 2: 1 and each (co l5 ..., co„) e {0,1}" we choose a point y(co1,...,con) e Yt and a neighbourhood [ 7 ( Q J 1 , . . . , Q J B ) 6 $f of y(coj,...,
1) CZ [7 ( « ! , . . . , (U„),
and d(f/(cu u .. .,co„,0), [ / ( c o j , . . . , ^ , 1)) > 0. This can be done by recursion on n because if y (co !,..., coj and U{oiu...,(a^ are already chosen then {7(0»!,..., co„) D Y1 is uncountable and therefore contains at least two distinct points (which then are called y(œ1,...,œn,0) and y(co1,...,con,l)). Now let co — (co1,co2,...) e K. For each n ^ 1 we have d{y(co1,...,con+1), y(coy,...,con)) :g 2~". Hence {y(co1,... ,co n ))„ äl is a Cauchy sequence and converges to some éf(co) e Y. This gives us a mapping g: co -> g(co) from X to Y. g is injective and continuous. For.let co # co' and n be the smallest integer with con # a»;,. Then the closures of U{co1,...,con) and U(œ'1,...,œ'n) are disjoint and contain g{co) and éf(co') respectively. Hence g(co) # fl(co'). On the other hand, d(g(co),g(co')) ^ 2"«"-1» ^ 2dK(co,o/) because [/((u 1 ,...,co n _ 1 ) contains g(co) and öf(co'). 3) The results 1) and 2) above imply the proposition as follows. By hypothesis, there is a set A e 3&{X) with (E,ê) ~ (A,@{X)\A). According to 1), there is a complete separable metric space Y and a continuous bijection / : Y -y A. As E is uncountable, so is A and therefore Y. Thus 2) applies, and / o g : K - > , 4 i s a continuous injection. As X is compact, / o g is a homeomorphism from X onto a compact subset of A. Hence (K,0(K)) S (A,@(X)\A) ~ (£,«f), and this implies the proposition, D Proof of Theorem (4.A6). We distinguish two cases. Case 1. £ is uncountable. Propositions (4.A7) and (4.A8) give (£,
$\{f(x)\}$ is closed and thus $\{x\} = f^{-1}(\{f(x)\}) \in \mathcal{E}$. This implies $\mathcal{E} = \mathfrak{P}(E)$. Thus each bijection from $E$ onto $\{n \in \mathbb{N} : n \le |E|\}$ establishes an isomorphism of $(E, \mathcal{E})$ and $(\{n \in \mathbb{N} : n \le |E|\}, \mathfrak{P}(\{n \in \mathbb{N} : n \le |E|\}))$. Clearly, if $|E| < \infty$ then $\{1, \ldots, |E|\}$ with the discrete metric is a compact space, and its power set is the corresponding Borel $\sigma$-algebra. On the other hand, the set $\mathbb{N}$ can also be turned into a compact metric space. For example, we may take the metric determined by the conditions $d(m, n) = |m^{-1} - n^{-1}|$ for $m, n \ge 2$ and $d(1, n) = n^{-1}$ for $n \ge 2$. $\mathfrak{P}(\mathbb{N})$ is the Borel $\sigma$-algebra for this metric. □

In Chapter 7 we shall need a property of standard Borel spaces which follows readily from Theorem (4.A6).

(4.A9) Definition. Let $(\Omega, \mathcal{F})$ be a measurable space. A countable subset $\mathcal{C}$ of $\mathcal{F}$ will be called a countable core of $\mathcal{F}$ if $\mathcal{C}$ satisfies the conditions below:
(i) $\mathcal{C}$ generates $\mathcal{F}$ and is stable under finite intersections.
(ii) If $(\mu_n)_{n \ge 1}$ is a sequence in $\mathcal{P}(\Omega, \mathcal{F})$ such that $\lim_{n \to \infty} \mu_n(A)$ exists for all $A \in \mathcal{C}$ then there is some (necessarily unique) $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ with $\mu(A) = \lim_{n \to \infty} \mu_n(A)$ for all $A \in \mathcal{C}$.

(4.A10) Comments. (1) Suppose $\mathcal{F}$ has a countable core $\mathcal{C} = \{A_1, A_2, \ldots\}$. Then
$$d(\mu, \nu) = \sum_{n \ge 1} 2^{-n} |\mu(A_n) - \nu(A_n)|$$
is a metric on $\mathcal{P}(\Omega, \mathcal{F})$. In this metric, $\mathcal{P}(\Omega, \mathcal{F})$ is compact. For let $(\mu_n)_{n \ge 1}$ be a sequence in $\mathcal{P}(\Omega, \mathcal{F})$. Then the diagonal method gives a subsequence $(\mu_{n_k})_{k \ge 1}$ which converges setwise on $\mathcal{C}$. Thus condition (ii) provides us with some $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ such that $d(\mu_{n_k}, \mu) \to 0$.
(2) Suppose $(\Omega, \mathcal{F})$ and $(\Omega', \mathcal{F}')$ are measurable spaces, and $\varphi : \Omega \to \Omega'$ is a surjection such that $\mathcal{F} = \varphi^{-1}\mathcal{F}' = \{\varphi^{-1}A' : A' \in \mathcal{F}'\}$. If $\mathcal{C}'$ is a countable core of $\mathcal{F}'$ then $\varphi^{-1}\mathcal{C}'$ is a countable core of $\mathcal{F}$. In particular, if $(\Omega, \mathcal{F}) \simeq (\Omega', \mathcal{F}')$ then $\mathcal{F}$ has a countable core if and only if $\mathcal{F}'$ has. Moreover, $\mathcal{F}$ has a countable core if and only if its canonical image on the space of all atoms of $\mathcal{F}$ has a countable core. ○

The last sentence implies that in order to know all $\sigma$-algebras with a countable core we only need to look at $\sigma$-algebras containing singletons. For such $\sigma$-algebras the existence of a countable core is equivalent to the standard Borel property.

(4.A11) Theorem. Let $(\Omega, \mathcal{F})$ be a measurable space. If $(\Omega, \mathcal{F})$ is standard Borel then $\mathcal{F}$ has a countable core. Conversely, if $\mathcal{F}$ has a countable core and contains all singletons then $(\Omega, \mathcal{F})$ is standard Borel. (The converse will not be needed later on.)
Proof. 1) Because of Comment (4.A10)(2) we only need to look at the three particular standard Borel spaces occuring in Theorem (4.A6). Case 1. Q finite, & = ^ß(Q). Then
n-*ao
finitely additive. In fact, fi0 is cr-additive. For let (Ak)kèl be a sequence in # with Ak i 0. As each Ak is compact, y4fc = 0 for sufficiently large k. Hence lim fi0{Ak) = 0- Carathéodory's extension theorem therefore implies that fc-*ao
fi0 can be extended to some fi e ^(O, •&r). 2) Suppose J^ has a countable core <€ and contains all singletons. If <€ is finite then so are SF and Q. Therefore we assume <€ is infinite. We write ^ = {Cl5 C 2 ,...}. According to Theorem (4.A6) it is sufficient to prove that (Q, J*0 S (X, #(K)). We define a mapping / : Q -> X by /(CB) = ( l C k M ) t à x. / is injective. For if /(co) = /(co') then the Dirac measures at co and co' coincide on 9% and therefore on 3F. As {co} e J^, this implies co = co'. To show / is bimeasurable we put C 1 = C, C° = Q\C when C e9S. For all n ^ 1 and s l 5 . . . , s„ e {0,1} we have /
>
= ( 4 Ï I ^ : Ï I
= SI
x, = U ) = c j ' n . . . n q " .
Hence / is measurable. On the other hand, for each A e 9% U {£2} we have (4.A12) f(A) =
Ç]\xeK:Ar\f]Cï"^0\. nil
I
fc=l
J
For, /(y4) is clearly contained in the expression on the right. Conversely, let n
x e K be such that for each n ^ 1 there is some co„ e A C\ f] Ckk. Let /i„ be k=l
the Dirac measure at co„. Then lim n„{A) = 1 and lim n„(Ckk) = 1 for n-*co
n-»ao
all k ^ 1. In particular, lim fi„ exists on 9%. Hence there is some /i e
^(Q,^)
with fi(A) = 1, /z(Q") = Tfor allfc^ 1 and therefore ^ M H f] Q " ) = 1. Thus there exists some co e A with co 6 C£k for all k ^ 1, i.e. f(co) = x. Consequently, x € f(A). The right side of (4.A12) is a countable intersection of cylinder events in K and therefore measurable. Hence {A a Q: f(A) e &(K)} is a cr-algebra containing (€. This proves that f~1:f(Çl) -> Q is measurable, D (4.A13) Corollary. Suppose that (Q, #") = (£, S'Y for some standard Borel space
(E, S) and a countably infinite set S. Then J2" admits a countable core %> which consists of cylinder events. Proof. {E, S) can be chosen to be one of the particular spaces which appear in Theorem (4.A6). In the case of a finite set E we let # be the algebra of all cylinder events (cf. Case 3 of the proof of (4.A11)). If (E,&) = (K,@(K)) then (Q,.F) = ({0,l},$({0,l})) N x S , and we can again take the algebra of all cylinder events in ^({0, l}) N x S . In the case £ = Nwe identify S with N and define <£ = {0} U {{
A*(^{i
jv} = CI) = A*(o-{i
jv-ij = 0 -
Z A*(o"{i
AT} = W
X>1
when £ e N Î 1 - - ^ - 1 } . For each AT, ji determines a unique probability measure onfj! j,). Kolmogorov's extension theorem therefore implies that \i can be extended uniquely to a probability measure on 3F. D
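One construction from this appendix that is easy to make concrete is the embedding of $[0,1[$ into $K = \{0,1\}^{\mathbb{N}}$ used in step 2) of the proof of Proposition (4.A7): $g(t) = (t_n)_{n \ge 1}$ with $t_n = [2^n t] \bmod 2$, whose inverse on the image is $\omega \to \sum_n \omega_n 2^{-n}$. The sketch below is only an illustration; it truncates the infinite sequence to finitely many digits (our assumption), and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def g(t, n_digits):
    """First n_digits coordinates of g(t) = (floor(2^n t) mod 2)_{n >= 1}, for 0 <= t < 1."""
    return [floor(2 ** n * t) % 2 for n in range(1, n_digits + 1)]

def g_inverse(digits):
    """Partial sum of omega_n 2^{-n} over the given digits, in exact arithmetic."""
    return sum(Fraction(d, 2 ** n) for n, d in enumerate(digits, start=1))

# A dyadic rational is recovered exactly; its expansion ends in zeros.
t = Fraction(5, 8)
assert g(t, 6) == [1, 0, 1, 0, 0, 0]
assert g_inverse(g(t, 6)) == t
```

For a non-dyadic input such as `Fraction(1, 3)` the truncated expansion recovers `t` only up to an error below `2 ** -n_digits`, which reflects the continuity of the inverse map with respect to $d_K$.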
Chapter 5 Specifications with symmetries
This chapter is devoted to a preliminary study of specifications which are preserved by a group of transformations on the configuration space. As might be expected, the investigation of symmetries plays a prominent rôle in the theory of Gibbs measures. Of course, this is suggested by the physical applications. Consider, for example, a piece of iron or nickel. Its atoms are located at the sites of a crystal lattice with certain axes of symmetry. Accordingly, the interaction of the atomic spins remains invariant under a transformation group including some spatial translations and reflections as well as the reversal of all spin orientations. Plainly, a reasonable mathematical model of such a physical system should share these invariance properties. This is one reason for studying specifications with symmetries. A more important motive is illustrated by the phenomenon of spontaneous magnetization: Below Curie temperature, the spin system takes one of several possible equilibrium states each of which is characterized by a well-defined direction of magnetization. In particular, these equilibrium states fail to be preserved by the spin reversal transformation. In other words, the equilibrium states break the spin reversal symmetry of the interaction. This phenomenon of symmetry breaking is a wide-spread and important special type of phase transition and thereby provides one of the main motivations for studying symmetries. Needless to say, there is also a practical reason: The presence of symmetries often permits a great simplification of the mathematical analysis of a model. The phenomenon of symmetry breaking will be discussed extensively in later chapters. Here we address ourselves to the converse problem, the existence of Gibbs measures which inherit the symmetries of a specification. 
There are two possibilities for obtaining such Gibbs measures: either by means of a symmetrization procedure, or as a limit of averaged Gibbs distributions with suitable boundary conditions. These possibilities will be described in Section 5.2. In Section 5.1 we shall introduce the class of transformations of Q with which we are dealing, and discuss the relationship between the invariance of specifications, modifications, and potentials.
5.1 Transformations of specifications
As before, we are given a countably infinite product (Q., #") = (£,
of S and/or transformations of E. Specifically, we let T denote the set of all transformations x: Q -» Q of the form (5.1)
T:
(co e Q).
Here $\tau_*$ is any bijection of $S$, and the $\tau_i$ are invertible measurable transformations of $E$ with measurable inverses. Thus each $\tau \in T$ is a composition of a spatial transformation $\tau_*$ (which transports the spin at a site $j$ to the site $\tau_* j$) and the spin transformations $\tau_i$, $i \in S$, which act separately at distinct sites of $S$. To specify these components of $\tau$ we shall occasionally write $\tau = (\tau_*; \tau_i, i \in S)$. Clearly, each $\tau \in T$ is measurable. More precisely, if $\Lambda \subset S$ and $f$ is $\mathcal{F}_\Lambda$-measurable then $f \circ \tau$ is $\mathcal{F}_{\tau_*^{-1}\Lambda}$-measurable. It is also evident that $T$ is a group. For example, the inverse of $\tau = (\tau_*; \tau_i, i \in S)$ is $\tau^{-1} = (\tau_*^{-1}; \tau_{\tau_* i}^{-1}, i \in S)$. We mention three particular transformations in $T$ which will be frequently used later on.

(5.2) Examples. (1) Let $S = \mathbb{Z}^d$ for some $d \ge 1$. Then for each $j \in S$ the transformation
$\theta_j\colon \omega \to (\omega_{i-j})_{i \in S}$ $(\omega \in \Omega)$
of $\Omega$ is called the shift or translation by $j$. $\Theta = (\theta_j)_{j \in S}$ is an abelian subgroup of $T$.
(2) Suppose $E$ is a symmetric Borel subset of $\mathbb{R}$. Then the transformation
$\tau\colon \omega \to (-\omega_i)_{i \in S}$ $(\omega \in \Omega)$
in $T$ is called the spin flip or spin reversal.
(3) Let $E$ be a rotationally invariant Borel subset of some $\mathbb{R}^N$ (e.g., the unit sphere), and $M \in SO(N)$ a rotation. Then the mapping
$\omega \to (M\omega_i)_{i \in S}$ $(\omega \in \Omega)$
in $T$ is called the spin rotation by $M$. ○
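The shift and the spin flip of Examples (5.2) can be tried out on a finite stand-in for the lattice. The following is only a minimal sketch: it assumes that $S = \mathbb{Z}$ is replaced by the periodic ring $\{0, \dots, n-1\}$ and that a configuration is stored as a tuple of $\pm 1$ values; the function names `shift` and `spin_flip` are illustrative and do not come from the text.

```python
def shift(omega, j):
    """Translation by j on a periodic ring: (theta_j omega)_i = omega_{i-j}."""
    n = len(omega)
    return tuple(omega[(i - j) % n] for i in range(n))


def spin_flip(omega):
    """Spin reversal: every coordinate changes its sign."""
    return tuple(-s for s in omega)


omega = (1, -1, -1, 1, 1)

# The shifts form an abelian group: theta_3 after theta_2 equals theta_5.
composed = shift(shift(omega, 2), 3)
direct = shift(omega, 5)

# The spin flip is an involution and commutes with every shift.
flipped_twice = spin_flip(spin_flip(omega))
commutes = shift(spin_flip(omega), 1) == spin_flip(shift(omega, 1))
```

The index `i - j` in `shift` mirrors the convention of (5.1): the spin found at site $i$ after the transformation is the one that sat at $\tau_*^{-1} i = i - j$ before.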
Suppose next we are given an a priori measure $\lambda \in \mathcal{M}(E, \mathcal{E})$. We shall then say that a transformation $\tau = (\tau_*; \tau_i, i \in S) \in T$ is $\lambda$-preserving if all $\tau_i$ are $\lambda$-preserving, in that $\tau_i(\lambda) = \lambda$ for all $i \in S$. (Recall that $\tau_i(\lambda) = \lambda \circ \tau_i^{-1}$ stands for the $\tau_i$-image of $\lambda$.) It is the object of this chapter to introduce and to study the action of the transformations in $T$ on potentials, specifications, and Gibbs measures. Let $\tau = (\tau_*; \tau_i, i \in S) \in T$ be given. For each family $\varphi = (\varphi_\Lambda)_{\Lambda \in \mathcal{S}}$ of functions on $\Omega$ we define

(5.3) $\tau(\varphi) = (\varphi_{\tau_*^{-1}\Lambda} \circ \tau^{-1})_{\Lambda \in \mathcal{S}}$.

Note that this definition applies to potentials and $\lambda$-modifications. Clearly, if $\Phi$ is a potential then so is $\tau(\Phi)$. If $\rho$ is a $\lambda$-modification then Proposition (5.6) below will show that $\tau(\rho)$ is a $\lambda$-modification, too. Next we define the $\tau$-image
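$\tau(\gamma) = (\tau(\gamma)_\Lambda)_{\Lambda \in \mathcal{S}}$ of a family $\gamma = (\gamma_\Lambda)_{\Lambda \in \mathcal{S}}$ of measure kernels on $\mathcal{F}$ by

(5.4) $\tau(\gamma)_\Lambda(A \mid \omega) = \gamma_{\tau_*^{-1}\Lambda}(\tau^{-1}A \mid \tau^{-1}\omega)$

or, equivalently, $\tau(\gamma)_{\tau_*\Lambda}(\tau A \mid \tau\omega) = \gamma_\Lambda(A \mid \omega)$; here $\Lambda \in \mathcal{S}$, $A \in \mathcal{F}$, and $\omega \in \Omega$. It can be immediately checked that the $\tau$-image of a specification is a specification again. We also note that equation (5.4) is equivalent to the property that

(5.5) $(\tau(\gamma)_{\tau_*\Lambda} f) \circ \tau = \gamma_\Lambda(f \circ \tau)$

for all $\Lambda \in \mathcal{S}$ and all bounded or non-negative measurable functions $f$ on $\Omega$. The next proposition shows that the action of $\tau$ on potentials, $\lambda$-modifications, and specifications is defined consistently.

(5.6) Proposition. Let $\lambda \in \mathcal{M}(E, \mathcal{E})$ be an a priori measure and $\tau \in T$ a $\lambda$-preserving transformation. Then the following assertions hold.
(a) $\tau(\lambda_\bullet) = \lambda_\bullet$.
(b) For each $\lambda$-modification $\rho$, $\tau(\rho)\lambda_\bullet = \tau(\rho\lambda_\bullet)$.
(c) If $\Phi$ is a $\lambda$-admissible potential then $\tau(\Phi)$ is $\lambda$-admissible, and $\rho^{\tau(\Phi)} = \tau(\rho^\Phi)$.

Proof. (a) This follows readily from the assumption that $\tau$ is $\lambda$-preserving.
(b) Let $\Lambda \in \mathcal{S}$ and $f$ be a bounded measurable function on $\Omega$. Then (5.5) gives
$[\tau(\rho\lambda_\bullet)_{\tau_*\Lambda} f] \circ \tau = \lambda_\Lambda(\rho_\Lambda\, f \circ \tau) = \lambda_\Lambda((\tau(\rho)_{\tau_*\Lambda} f) \circ \tau) = [\lambda_{\tau_*\Lambda}(\tau(\rho)_{\tau_*\Lambda} f)] \circ \tau = [(\tau(\rho)\lambda_\bullet)_{\tau_*\Lambda} f] \circ \tau$.
The third equation comes from (5.5), applied to $\gamma = \lambda_\bullet = \tau(\lambda_\bullet)$. As $\tau$ is invertible and $\Lambda$ and $f$ are arbitrary, (b) follows.
(c) For each $\Lambda \in \mathcal{S}$ we have
$H^{\tau(\Phi)}_{\tau_*\Lambda} \circ \tau = \sum_{A \in \mathcal{S}:\, A \cap \Lambda \ne \emptyset} \tau(\Phi)_{\tau_* A} \circ \tau = H^\Phi_\Lambda$
and therefore $h^{\tau(\Phi)}_{\tau_*\Lambda} \circ \tau = h^\Phi_\Lambda$. Applying (5.5) to $\gamma = \lambda_\bullet = \tau(\lambda_\bullet)$ we obtain
$Z^{\tau(\Phi)}_{\tau_*\Lambda} \circ \tau = (\lambda_{\tau_*\Lambda} h^{\tau(\Phi)}_{\tau_*\Lambda}) \circ \tau = \lambda_\Lambda(h^{\tau(\Phi)}_{\tau_*\Lambda} \circ \tau) = Z^\Phi_\Lambda$.
This shows that $\tau(\Phi)$ is $\lambda$-admissible, and looking at equation (2.8) we find
$\rho^{\tau(\Phi)}_{\tau_*\Lambda} \circ \tau = \rho^\Phi_\Lambda = \tau(\rho^\Phi)_{\tau_*\Lambda} \circ \tau$.
This implies assertion (c). □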
We now turn to the main subject of this chapter, namely potentials and specifications which remain invariant under the action of a transformation in $T$.

(5.7) Definition. Let $\tau = (\tau_*; \tau_i, i \in S) \in T$ and $I \subset T$. (a) A function $\varphi$ on $\Omega$ is called $\tau$-invariant if $\varphi \circ \tau = \varphi$. More generally, a family […]
$(\Lambda \in \mathcal{S},\ j \in S,\ A \in \mathcal{F},\ \omega \in \Omega)$;
here $\Lambda + j = \{i + j : i \in \Lambda\}$. Similarly, a potential $\Phi$ is called shift-invariant or translation-invariant or homogeneous if
$\Phi_{A+j} \circ \theta_j = \Phi_A$ $(A \in \mathcal{S},\ j \in S)$.
The set of all shift-invariant absolutely summable potentials will be denoted by $\mathcal{B}_\Theta$. Plainly, $\mathcal{B}_\Theta$ is a $\|\cdot\|_0$-closed subspace of the space $\mathcal{B}$ of all absolutely summable potentials (cf. (2.11)), and the norm
$\|\Phi\|_0 = \sum_{A \ni 0} \|\Phi_A\|$ $(\Phi \in \mathcal{B}_\Theta)$
turns $\mathcal{B}_\Theta$ into a Banach space. The shift-invariant bounded range potentials are dense in $\mathcal{B}_\Theta$. ○

The next corollary to Proposition (5.6) clarifies the relation between the $\tau$-invariance of a potential and the $\tau$-invariance of the associated Gibbsian specification.

(5.9) Corollary. Let $\lambda \in \mathcal{M}(E, \mathcal{E})$ and a $\lambda$-preserving $\tau \in T$ be given.
(a) If $\rho$ is a $\tau$-invariant $\lambda$-modification then the corresponding $\lambda$-specification $\rho\lambda_\bullet$ is $\tau$-invariant. Conversely, suppose $E$ is countable, $\mathcal{E} = \mathcal{P}(E)$, $\lambda$ is equivalent to counting measure, and $\gamma$ is a $\tau$-invariant specification. Then the unique $\lambda$-modification $\rho$ with $\gamma = \rho\lambda_\bullet$ is $\tau$-invariant.
(b) Let $\Phi$ be a $\lambda$-admissible potential. If $\Phi$ is $\tau$-invariant then so is $\rho^\Phi$. Conversely, suppose $\Phi$ is normalized by some $a \in \mathcal{P}(E, \mathcal{E})$ which is preserved by $\tau$, and either $a$ is a Dirac measure or $\Phi$ is uniformly convergent. Then the $\tau$-invariance of $\rho^\Phi$ implies the $\tau$-invariance of $\Phi$.

Proof. (a) The first statement follows immediately from Proposition (5.6)(b). The converse is a consequence of Remarks (1.28)(5) and (3) and the fact that
$\gamma_{\tau_*\Lambda}(\sigma_{\tau_*\Lambda} = (\tau\omega)_{\tau_*\Lambda} \mid \tau\omega) = \gamma_{\tau_*\Lambda}(\tau\{\sigma_\Lambda = \omega_\Lambda\} \mid \tau\omega) = \gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda \mid \omega)$
for all $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$.
(b) The first assertion is obvious from Proposition (5.6)(c). For the converse we note that $\tau(\Phi)$ is $a$-normalized because […]
for all $\emptyset \ne B \subset A \in \mathcal{S}$. Therefore, if $\rho^\Phi$ is $\tau$-invariant then Proposition (5.6)(c) and Theorem (2.34) show that $\tau(\Phi) \sim \Phi$, whence Theorem (2.35)(a) implies $\tau(\Phi) = \Phi$. □

In the case $S = \mathbb{Z}^d$, $d \ge 1$, the shift-transformations $\theta_j$ preserve each $a \in \mathcal{P}(E, \mathcal{E})$. Therefore a normalized potential $\Phi$ is shift-invariant if and only if $\rho^\Phi$ is shift-invariant, and for countable $E$ and positive $\lambda$ the latter occurs if and only if $\gamma^\Phi$ is shift-invariant.
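The passage from an invariant potential to an invariant Gibbsian object can be checked by brute force in a finite toy setting. The sketch below is an illustration under stated assumptions, not the construction of the text: the lattice is replaced by a periodic ring of `n` sites, the potential is a nearest-neighbour Ising interaction with an optional external field `h`, and all function names are hypothetical. With `h = 0` the energy is invariant under shifts and the spin flip, and so is the resulting Gibbs distribution; a non-zero field destroys the spin-flip invariance but not the shift invariance.

```python
import itertools
import math


def energy(omega, coupling=1.0, h=0.0):
    """Nearest-neighbour Ising energy on a periodic ring, with external field h."""
    n = len(omega)
    pair = sum(omega[i] * omega[(i + 1) % n] for i in range(n))
    return -coupling * pair - h * sum(omega)


def gibbs(n, beta, coupling=1.0, h=0.0):
    """Gibbs distribution on {-1, 1}^n with weights exp(-beta * energy)."""
    configs = list(itertools.product((-1, 1), repeat=n))
    weights = {w: math.exp(-beta * energy(w, coupling, h)) for w in configs}
    z = sum(weights.values())
    return {w: p / z for w, p in weights.items()}


def shift(omega, j):
    n = len(omega)
    return tuple(omega[(i - j) % n] for i in range(n))


def spin_flip(omega):
    return tuple(-s for s in omega)


def is_invariant(mu, tau, tol=1e-12):
    """Check tau(mu) = mu for a bijection tau of the finite configuration space."""
    return all(abs(mu[tau(w)] - p) <= tol for w, p in mu.items())


mu_symmetric = gibbs(4, beta=0.7)
mu_with_field = gibbs(4, beta=0.7, h=0.5)
```

For instance, `is_invariant(mu_symmetric, spin_flip)` holds, while `is_invariant(mu_with_field, spin_flip)` fails because the field term changes sign under the flip.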
5.2
Gibbs measures with symmetries
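We will now turn to the problem of existence of Gibbs measures with prescribed symmetries. We start with a simple observation.

(5.10) Remark. Let $\gamma$ be a specification and $\tau \in T$. If $\mu \in \mathcal{G}(\gamma)$ then $\tau(\mu) \in \mathcal{G}(\tau(\gamma))$. In particular, $\mathcal{G}(\gamma)$ is invariant with respect to all symmetries of $\gamma$.

Proof. For all $\Lambda \in \mathcal{S}$ we have
$\tau(\mu)\tau(\gamma)_\Lambda = \int \mu(d\omega)\, \tau(\gamma)_\Lambda(\,\cdot \mid \tau\omega) = \int \mu(d\omega)\, \gamma_{\tau_*^{-1}\Lambda}(\tau^{-1}\,\cdot \mid \omega) = \tau(\mu\gamma_{\tau_*^{-1}\Lambda}) = \tau(\mu)$. □

(5.11) Corollary. If $\mathcal{G}(\gamma) = \{\mu\}$ then $\mu$ is preserved by all symmetries of $\gamma$.

What about the case of non-uniqueness of $\mathcal{G}(\gamma)$? Does then $\mathcal{G}(\gamma)$ contain a measure which is invariant with respect to all symmetries of $\gamma$? This question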
will be answered using two different approaches. These will be described in the two theorems (5.15) and (5.19) below. Later on, we shall only use the second of these. Thus Theorem (5.15) and its examples can be skipped on a quick first reading. For each subset $I$ of $T$ we let

(5.12) $\mathcal{P}_I(\Omega, \mathcal{F}) = \{\mu \in \mathcal{P}(\Omega, \mathcal{F}) : \tau(\mu) = \mu \text{ for all } \tau \in I\}$

denote the set of all $I$-invariant random fields. Clearly, if
$[I] = \{\mathrm{id}\} \cup \bigcup_{n \ge 1} \{\tau_n \circ \cdots \circ \tau_1 : \tau_i \in I \text{ or } \tau_i^{-1} \in I\ (1 \le i \le n)\}$
is the group generated by $I$ then $\mathcal{P}_I(\Omega, \mathcal{F}) = \mathcal{P}_{[I]}(\Omega, \mathcal{F})$. We also notice that $\mathcal{P}_I(\Omega, \mathcal{F})$ is closed in the $\mathcal{L}$-topology. This is because $f \circ \tau \in \mathcal{L}$ for all $f \in \mathcal{L}$ and $\tau \in T$, and $\mu$ is $I$-invariant if and only if $\mu(f \circ \tau) = \mu(f)$ for all $f \in \mathcal{L}$ and $\tau \in I$. We are interested in the set

(5.13) $\mathcal{G}_I(\gamma) = \mathcal{G}(\gamma) \cap \mathcal{P}_I(\Omega, \mathcal{F})$

of all $I$-invariant Gibbs measures for an $I$-invariant specification $\gamma$. Our first approach to the existence of such objects consists of a symmetrization procedure which enlarges the symmetry group of a Gibbs measure. Let $I_0, I_1 \subset T$ and $\tau \in T$. We write

(5.14) $I_1 \circ I_0 = \{\tau_1 \circ \tau_0 : \tau_1 \in I_1,\ \tau_0 \in I_0\}$.

In particular, we put $\tau \circ I_0 = \{\tau\} \circ I_0$ and $I_0 \circ \tau = I_0 \circ \{\tau\}$. If $I_0$ and $I_1$ are subgroups of $T$ then $I_0$ is a normal subgroup of $[I_0 \cup I_1]$ if and only if $\tau_1 \circ I_0 = I_0 \circ \tau_1$ for all $\tau_1 \in I_1$; and the latter implies $[I_0 \cup I_1] = I_0 \circ I_1 = I_1 \circ I_0$.
(5.15) Theorem. Let $I_0$ and $I_1$ be two subgroups of $T$ such that $\tau_1 \circ I_0 = I_0 \circ \tau_1$ for all $\tau_1 \in I_1$, and let $\gamma$ be an $I_1 \circ I_0$-invariant specification. Suppose that either
(i) there is a topology on $I_1$ which turns $I_1$ into a compact topological group and is such that the evaluation map $e\colon (\tau, \omega) \to \tau\omega$ from $I_1 \times \Omega$ onto $\Omega$ is measurable (provided $I_1$ is equipped with the Baire $\sigma$-algebra); or
(ii) $\mathcal{G}_{I_0}(\gamma)$ is compact (in the $\mathcal{L}$-topology), and for all $\tau_1, \tau_2 \in I_1$ there exists some $\tau_0 \in I_0$ (possibly the identity) such that $\tau_1 \circ \tau_2 = \tau_2 \circ \tau_1 \circ \tau_0$.
Then $\mathcal{G}_{I_0}(\gamma) \ne \emptyset$ implies $\mathcal{G}_{I_1 \circ I_0}(\gamma) \ne \emptyset$.

Proof. 1) We first assume that hypothesis (i) holds. We take any $\nu \in \mathcal{G}_{I_0}(\gamma)$. If $I_1$ is a finite group then it is natural to consider the measure
$\mu = |I_1|^{-1} \sum_{\tau \in I_1} \tau(\nu)$.
Similarly, in the general case we let $m$ denote the normalized Haar measure on (the Baire $\sigma$-algebra on) $I_1$; an existence proof for $m$ can be found in Cohn (1980), e.g. We then define $\mu = e(m \times \nu) = \int m(d\tau)\, \tau(\nu)$. We show that $\mu \in \mathcal{G}_{I_1 \circ I_0}(\gamma)$. First of all, $\mu$ is specified by $\gamma$ because Remark (5.10) implies
$\mu\gamma_\Lambda = \int m(d\tau)\, \tau(\nu)\gamma_\Lambda = \int m(d\tau)\, \tau(\nu) = \mu$
for all $\Lambda \in \mathcal{S}$. Next we consider a transformation $\tau_1 \circ \tau_0 \in I_1 \circ I_0$. For each $f \in \mathcal{L}$ we have
$\mu(f \circ \tau_1 \circ \tau_0) = \int m(d\tau)\, \nu(f \circ \tau_1 \circ \tau_0 \circ \tau) = \int m(d\tau)\, \nu(f \circ \tau_1 \circ \tau) = \int m(d\tau)\, \nu(f \circ \tau) = \mu(f)$,
proving the $I_1 \circ I_0$-invariance of $\mu$. The second equality follows from the hypothesis that for each $\tau \in I_1$ there exists some $\tilde\tau_0 \in I_0$ such that $\tau_0 \circ \tau = \tau \circ \tilde\tau_0$. For this gives
$\nu(f \circ \tau_1 \circ \tau_0 \circ \tau) = \nu(f \circ \tau_1 \circ \tau \circ \tilde\tau_0) = \nu(f \circ \tau_1 \circ \tau)$
because $\nu$ is $I_0$-invariant. The third equality comes from the fact that $m$ is preserved by the translation $\tau \to \tau_1 \circ \tau$ of $I_1$.
2) Turning to the case of hypothesis (ii) we only need to show that
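$\mathcal{G}_{I_0 \cup F}(\gamma) \ne \emptyset$ for each finite subset $F$ of $I_1$. For in this case we may use the compactness of the closed subsets $\mathcal{G}_{I_0 \cup F}(\gamma)$ of $\mathcal{G}_{I_0}(\gamma)$ to conclude that
$\emptyset \ne \bigcap_{F \subset I_1,\ |F| < \infty} \mathcal{G}_{I_0 \cup F}(\gamma) = \mathcal{G}_{I_0 \cup I_1}(\gamma) = \mathcal{G}_{I_1 \circ I_0}(\gamma)$.
Let $\nu \in \mathcal{G}_{I_0}(\gamma)$ and $F = \{\tau_1, \dots, \tau_n\} \subset I_1$ be given. For each $i = (i_1, \dots, i_n) \in \mathbb{Z}^n$ we write
$\tau^i = \tau_1^{i_1} \circ \cdots \circ \tau_n^{i_n}$.
As $\tau^i \in I_1$, Remark (5.10) ensures that $\tau^i(\nu) \in \mathcal{G}(\gamma)$. We even know that $\tau^i(\nu) \in \mathcal{G}_{I_0}(\gamma)$ because
$\tau_0 \circ \tau^i(\nu) = \tau^i \circ \tilde\tau_0(\nu) = \tau^i(\nu)$
for all $\tau_0 \in I_0$ and suitable $\tilde\tau_0 \in I_0$. We further note that $\tau^i \circ \tau^j(\nu) = \tau^{i+j}(\nu)$ for all $i, j \in \mathbb{Z}^n$. Indeed, it is sufficient to check this when only one coordinate of $i$, say $i_k$, is different from zero. In this case we have, setting $j' = (j_1, \dots, j_k, 0, \dots, 0)$ and $\nu' = \tau^{j - j'}(\nu)$,
$\tau^i \circ \tau^j(\nu) - \tau^{i+j}(\nu) = \tau_k^{i_k} \circ \tau^{j'}(\nu') - \tau^{j'} \circ \tau_k^{i_k}(\nu') = 0$.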
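For, our hypothesis on $I_1$ implies that $I_1$ acts commutatively on the $I_0$-invariant measure $\nu'$. We now use the argument of the Markov-Kakutani fixed point theorem. For each $N \ge 1$ we consider the cube $\Lambda_N = \mathbb{Z}^n \cap [-N, N]^n$ and define
$\mu_N = |\Lambda_N|^{-1} \sum_{i \in \Lambda_N} \tau^i(\nu)$.
Since $\mathcal{G}_{I_0}(\gamma)$ is convex, $\mu_N \in \mathcal{G}_{I_0}(\gamma)$. As $\mathcal{G}_{I_0}(\gamma)$ is compact, the sequence $(\mu_N)_{N \ge 1}$ has a cluster point $\mu$ in $\mathcal{G}_{I_0}(\gamma)$. We show that $\mu$ is $F$-invariant and therefore $\mathcal{G}_{I_0 \cup F}(\gamma) = \mathcal{G}_{I_0}(\gamma) \cap \mathcal{G}_F(\gamma) \ne \emptyset$. Let $j \in \mathbb{Z}^n$, $f \in \mathcal{L}$, and $\varepsilon > 0$. Because $f \circ \tau^j \in \mathcal{L}$, there are infinitely many $N \ge 1$ such that
$|\mu_N(f) - \mu(f)| + |\mu_N(f \circ \tau^j) - \mu(f \circ \tau^j)| < \varepsilon$.
Thus $|\mu(f) - \mu(f \circ \tau^j)| < \varepsilon$ provided we can show that
$\lim_{N \to \infty} |\mu_N(f) - \mu_N(f \circ \tau^j)| = 0$.
But
$\mu_N(f \circ \tau^j) = |\Lambda_N|^{-1} \sum_{i \in \Lambda_N} \nu(f \circ \tau^j \circ \tau^i) = |\Lambda_N|^{-1} \sum_{i \in \Lambda_N + j} \nu(f \circ \tau^i)$
and therefore
$|\mu_N(f) - \mu_N(f \circ \tau^j)| \le |\Lambda_N|^{-1} \Bigl| \sum_{i \in \Lambda_N} \nu(f \circ \tau^i) - \sum_{i \in \Lambda_N + j} \nu(f \circ \tau^i) \Bigr| \le \|f\|\, |\Lambda_N \,\Delta\, (\Lambda_N + j)| / |\Lambda_N| \to 0$ as $N \to \infty$.
The proof is thus complete. □

(5.16) Corollary. Let $I$ be a subgroup of $T$ and $\gamma$ an $I$-invariant specification with $\mathcal{G}(\gamma) \ne \emptyset$. If $I$ is a compact group and $e$ is measurable, or if $\mathcal{G}(\gamma)$ is compact and $I$ is abelian then $\mathcal{G}_I(\gamma) \ne \emptyset$.

Proof. Take $I_0 = \{\mathrm{id}\}$, $I_1 = I$ in Theorem (5.15). □

For more complicated subgroups $I$ of $T$ the existence of $I$-invariant Gibbs measures may be proved by an iterated application of Theorem (5.15). For instance, this gives $\mathcal{G}_I(\gamma) \ne \emptyset$ when $\gamma$ is $I$-invariant and $I$ is a direct product of finitely many compact or abelian normal subgroups of $I$.

(5.17) Examples. Let $S = \mathbb{Z}^d$ for some $d \ge 1$ and $\gamma$ an $I$-invariant specification such that $\mathcal{G}(\gamma)$ is non-empty and compact. Consider the following choices of $I$.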
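(1) $I = \Theta = \{\theta_j : j \in S\}$, the shift-group. As $\Theta$ is abelian, Corollary (5.16) gives $\mathcal{G}_\Theta(\gamma) \ne \emptyset$.
(2) $I = R \circ \Theta$, where $\Theta$ is the shift-group and $R$ the finite group which is generated by the mirror reflections in the planes $\{x \in \mathbb{R}^d : x_k = 0\}$ $(1 \le k \le d)$ and $\{x \in \mathbb{R}^d : x_k = \varepsilon x_\ell\}$ $(1 \le k < \ell \le d,\ \varepsilon = \pm 1)$. Note that $R$ contains all lattice rotations. Each $\tau \in R$ is of the form $\tau\omega = ($ […]
for all $f \in \mathcal{L}$ and
$|\{\tau_{\alpha *}\Lambda : \Lambda \in \mathcal{R}_\alpha\} \,\Delta\, \mathcal{R}_\alpha| / |\mathcal{R}_\alpha| \to 0$
then each cluster point $\mu$ of the net
$\mu_\alpha = |\mathcal{R}_\alpha|^{-1} \sum_{\Lambda \in \mathcal{R}_\alpha} \nu_\alpha \gamma^\alpha_\Lambda$ $(\alpha \in D)$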
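is $\tau$-invariant.

Proof. For each $f \in \mathcal{L}$ we can write
$|\mu(f \circ \tau) - \mu(f)| \le \limsup_{\alpha \in D} |\mu_\alpha(f \circ \tau) - \mu_\alpha(f)|$
$= \limsup_D |\mu_\alpha(f \circ \tau_\alpha) - \mu_\alpha(f)|$
$= \limsup_D \Bigl| |\mathcal{R}_\alpha|^{-1} \sum_{\Lambda \in \mathcal{R}_\alpha} \nu_\alpha[\gamma^\alpha_\Lambda(f \circ \tau_\alpha) - \gamma^\alpha_\Lambda f] \Bigr|$
$= \limsup_D \Bigl| |\mathcal{R}_\alpha|^{-1} \sum_{\Lambda \in \mathcal{R}_\alpha} \nu_\alpha[\gamma^\alpha_{\tau_{\alpha *}\Lambda} f - \gamma^\alpha_\Lambda f] \Bigr|$
$\le \|f\| \limsup_D |\mathcal{R}_\alpha|^{-1} |\{\tau_{\alpha *}\Lambda : \Lambda \in \mathcal{R}_\alpha\} \,\Delta\, \mathcal{R}_\alpha| = 0$.
The third equality follows from equation (5.5) and the $\tau_\alpha$-invariance of $\gamma^\alpha$ and $\nu_\alpha$. Since $f$ was arbitrary, we conclude that $\tau(\mu) = \mu$. ○

Here is a counterpart to the general existence theorem (4.22).

(5.19) Theorem. Let $(E, \mathcal{E})$ be a standard Borel space, $I \subset T$ a set of transformations, and $\gamma$ an $I$-invariant quasilocal specification. Suppose there exists
(i) a net $(\gamma^\alpha)_{\alpha \in D}$ of $I$-invariant specifications with $\gamma^\alpha \to \gamma$;
(ii) a net $(\mathcal{R}_\alpha)_{\alpha \in D}$ of non-empty finite subsets $\mathcal{R}_\alpha$ of $\mathcal{S}$ such that $\Lambda_\alpha = \bigcap_{\Lambda \in \mathcal{R}_\alpha} \Lambda \to S$ and
$|\{\tau_*\Lambda : \Lambda \in \mathcal{R}_\alpha\} \,\Delta\, \mathcal{R}_\alpha| / |\mathcal{R}_\alpha| \to 0$ for all $\tau \in I$; and
(iii) a net $(\nu_\alpha)_{\alpha \in D}$ in $\mathcal{P}_I(\Omega, \mathcal{F})$ such that the net
$\mu_\alpha = |\mathcal{R}_\alpha|^{-1} \sum_{\Lambda \in \mathcal{R}_\alpha} \nu_\alpha \gamma^\alpha_\Lambda$ $(\alpha \in D)$
is locally equicontinuous.
Then $\mathcal{G}_I(\gamma)$ contains a cluster point of $(\mu_\alpha)_{\alpha \in D}$ and is thereby non-empty.

Proof. Proposition (4.9) ensures that $(\mu_\alpha)_{\alpha \in D}$ has at least one cluster point $\mu$. As $\mu_\alpha = \mu_\alpha \gamma^\alpha_{\Lambda_\alpha}$, Theorem (4.17) shows that $\mu \in \mathcal{G}(\gamma)$. By Proposition (5.18), $\mu \in \mathcal{G}_I(\gamma)$. ○

The general comments on Theorem (4.22) also apply to the preceding existence theorem. Some typical applications will be given in the examples below and, on a more concrete level, in Sections 6.2 and 6.3.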
(5.20) Examples. Let $S$ be the integer lattice of any dimension $d \ge 1$. As in (5.17), we consider the shift-group $\Theta$ and the group $R$ which is generated by all mirror reflections of $S$. Also, we let $T^\circ$ denote the group of all "pure spin transformations", i.e. of all $\tau \in T$ whose spatial part $\tau_*$ is the identity. Likewise, $T_\lambda^\circ$ stands for the group of all pure spin transformations which preserve a given a priori measure $\lambda$. As an illustration of Proposition (5.18) and Theorem (5.19), we shall use three different types of boundary conditions to construct Gibbs measures which are invariant under some subgroup $I$ of $T^\circ \circ R \circ \Theta$, the group of all $\tau \in T$ whose spatial part $\tau_*$ is a composition of a translation and a reflection or rotation.

(1) Configurational boundary conditions. Let $I$ be any subgroup of $T^\circ \circ R \circ \Theta$, $\gamma$ an $I$-invariant quasilocal specification, and $\omega \in \Omega$ any configuration such that $\tau\omega = \omega$ for all $\tau \in I$. Consider the cubes $\Lambda_N = S \cap [-N, N]^d$, and suppose the sequence
$|\Lambda_N|^{-1} \sum_{i \in \Lambda_N} \gamma_{\Lambda_N + i}(\,\cdot \mid \omega)$ $(N \in \mathbb{N})$
of averaged Gibbs distributions has a cluster point $\mu$. Then $\mu \in \mathcal{G}_I(\gamma)$. Indeed, the $I$-invariance of $\mu$ follows from Proposition (5.18) by putting $D = \mathbb{N}$, $\gamma^N = \gamma$, $\nu_N = \delta_\omega$, $\tau_N = \tau \in I$, and $\mathcal{R}_N = \{\Lambda_N + i : i \in \Lambda_N\}$. For, if $\tau \in I$ then $\tau_*$ is a composition of a reflection or rotation $r$ and a translation by some $j \in S$. Hence $\{\tau_*\Lambda : \Lambda \in \mathcal{R}_N\} = \{\Lambda_N + i : i \in \Lambda_N + j\}$ and therefore
$|\{\tau_*\Lambda : \Lambda \in \mathcal{R}_N\} \,\Delta\, \mathcal{R}_N| / |\mathcal{R}_N| = |(\Lambda_N + j) \,\Delta\, \Lambda_N| / |\Lambda_N| \to 0$.
On the other hand, $\mu$ is also a cluster point of the modified sequence
$|\Lambda_{k(N)}|^{-1} \sum_{i \in \Lambda_{k(N)}} \gamma_{\Lambda_N + i}(\,\cdot \mid \omega)$,
where $k(N)$ is such that $k(N)/N \to 1$ and $N - k(N) \to \infty$. (For example, we can take $k(N) = N - \sqrt{N}$.) Comment (4.18) thus tells us that $\mu \in \mathcal{G}(\gamma)$. The proof is thus complete. Needless to say, the above result still holds when the $I$-invariant configuration $\omega$ is replaced by a random boundary condition $\nu \in \mathcal{P}_I(\Omega, \mathcal{F})$.

(2) Free boundary condition. Let $\lambda \in \mathcal{M}(E, \mathcal{E})$ be finite, and suppose that $I \subset T_\lambda^\circ \circ R$. ($I$ thus consists of compositions of reflections and $\lambda$-preserving pure spin transformations.) Consider an $I$-invariant potential $\Phi$ […] Let $\Lambda_N$ be as above and let […]
from Proposition (5.18) and Example (4.20) (1) that each cluster point of the sequence above belongs to (SI{^>). By Comment (4.14)(1), such cluster points certainly exist when (E, S) is standard Borel. (3) Periodic boundary condition. Let X e Jt{E, S) be finite and O a 1admissible potential. Suppose that either $ has finite range or $ e 0&. We fix some integer m S: 1 and let T}m) denote the group of all A-preserving pure spin transformations x = (id; xh ie S) e T° which are m-periodic in the sense that T; = tj whenever i—j = mk for some k e S. We assume that $ is invariant with respect to a transformation group / which satisfies 0 c / c TA(m) o R o @ , For each N ^ 1 we choose a translate AN of the cube S D [1, miV]d such that A,y | S, and we let 5>N = 5>A" denote the corresponding periodic modification of $ which has been introduced in Example (4.20)(2). We let {vN)Nèi be an arbitrary sequence in ^(Q, #") and consider the sequence
to = w £
(N ^ i). s
By definition, ^jvl^ÂA i the Gibbs distribution in AN for $ with periodic boundary condition. We let %($>) denote the set of all cluster points of (A*jv)jväi- By t n e s a m e reasons as in the preceding example, the set %(Q>) does not depend on the choice of (vN)N^ 1. So we can assume that vN e ^j(Q, #") for all N. We will show that %(&) c <^7(
Be 01(A)
K The second equality can be justified as follows. On the one hand, we have (x*B)* = xNJfB* for all BeSf, whence y(tNtlA) = {x^B: B e Sf(A)\. Since x^ preserves the relation = (for A = AN), this implies that {x^B: B e &(A)} is a complete set of ( = ^representatives of ^(xN^A\ On the other hand, ^Af, ° TN = T ° <^v For l e t i e S be arbitrary and j e AN be such that j = x"1*'. Then i = x^j = xNj(t j and thus
= Tj o a. = T; o fft-,(. off Aj =
ffioio
ffA„.
The $\tau_N$-invariance of $\Phi^N$ is thus proved. Now let $D = \mathbb{N}$ and, for each $N \in D$, $\tau_N$ be as above, $\gamma^N = \gamma^{\Phi^N}$, and $\mathcal{R}_N = \{\Lambda_N\}$. Since $f \circ \tau_N = f \circ \tau$ eventually for all $f \in \mathcal{L}$ and $\tau_{N*}\Lambda_N = \Lambda_N$ for all $N$, these quantities meet the hypotheses of Proposition (5.18). Thus each $\mu \in \mathcal{G}_0(\Phi)$ is $\tau$-invariant, and the proof of the inclusion $\mathcal{G}_0(\Phi) \subset \mathcal{G}_I(\Phi)$ is complete. ○

We conclude this chapter with a basic definition.

(5.21) Definition. Let $\gamma$ be a specification. A symmetry $\tau$ of $\gamma$ is said to be broken if there exists some $\mu \in \mathcal{G}(\gamma)$ such that $\tau(\mu) \ne \mu$.

It is an immediate consequence of Remark (5.10) that $|\mathcal{G}(\gamma)| > 1$ whenever $\gamma$ has a broken symmetry. This simple fact provides us with a key which opens the door to a large class of models exhibiting a phase transition. In fact, most examples of phase transitions which will be considered later on are examples of symmetry breakings. As was mentioned at the beginning of this chapter, in physical applications it is important to know if a phase transition is accompanied by a breaking of one or several symmetries, and if some other symmetries are still preserved. In the next chapter we shall encounter three basic models showing distinct kinds of symmetry breaking.
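The averaging arguments of this chapter (the proof of Theorem (5.15) and Example (5.20)(1)) rest on the fact that a fixed translate of a large cube differs from the cube only in a thin boundary layer, so that $|(\Lambda_N + j) \,\Delta\, \Lambda_N| / |\Lambda_N| \to 0$. A minimal numerical sketch of this ratio follows; the helper names `cube` and `boundary_ratio` are illustrative and not part of the text.

```python
from itertools import product


def cube(n_half, d):
    """The cube Z^d intersected with [-N, N]^d, as a set of integer tuples."""
    return set(product(range(-n_half, n_half + 1), repeat=d))


def boundary_ratio(n_half, j):
    """|(Lambda_N + j) symmetric-difference Lambda_N| / |Lambda_N| for a shift vector j."""
    box = cube(n_half, len(j))
    shifted = {tuple(x + s for x, s in zip(site, j)) for site in box}
    return len(box ^ shifted) / len(box)


# The ratio shrinks as the cube grows, here for a unit shift in dimension two.
ratios = [boundary_ratio(n, (1, 0)) for n in (2, 5, 10, 20)]
```

For a unit shift along one axis in dimension two the symmetric difference consists of two columns of $2N + 1$ sites each, so the ratio equals $2/(2N + 1)$.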
Chapter 6 Three examples of symmetry breaking
Although one of the main objectives of this book is to discuss the phenomenon of non-uniqueness of Gibbs measures, we have not shown so far that this phenomenon is indeed possible. In this chapter we will do this. We shall present three basic examples of nearest-neighbour potentials which exhibit a phase transition. In each of these examples, the phase transition is characterized by some sort of symmetry breaking. The first example is the simplest model that shows a symmetry breaking. It is a spatially inhomogeneous version of the Ising chain of Section 3.2. The corresponding potential is invariant under the spin-flip transformation, but some Gibbs measure for this potential is not. The second example is the famous Ising model on the two-dimensional square lattice. This is the simplest shift-invariant model which shows a phase transition. Again, the phase transition is due to a breaking of a spin-flip symmetry. (The shift-invariance is preserved.) The third example is the two-dimensional discrete Gaussian model. Its spins take integer values and interact via a quadratic nearest-neighbour potential which is shift-invariant and shows several further symmetries. Shlosman (1983) discovered the existence of Gibbs measures with a staircase structure which breaks the shift-invariance of the model. (The analogous Gaussian model will be considered in Example (13.43).) The inhomogeneous Ising chain lends itself to a direct computation which will be carried out in Section 6.1. The two other models, which are two-dimensional, will be treated in Sections 6.2 and 6.3 by means of a famous estimation technique which can be traced back to Peierls (1936) and combines physical and geometric ideas. The Peierls argument has turned out to be a general device for proving phase transitions. A key to this device is provided by the physical notion of a ground state. A preliminary version of this notion already appeared in Section 3.2, and a formal definition will be given at (6.18).
Roughly speaking, a ground state relative to a potential $\Phi$ is a configuration at which the $\Phi$-energy takes a minimum. (This minimum may be either absolute or relative. Shlosman's result will show that relative minima are to be taken into account as well.) The Peierls device applies to potentials $\Phi$ with several distinct ground states. Let us give an outline of the main steps of this device. To fix the ideas we will stick to the case where $S$ is the square lattice and $\Phi$ is a nearest-neighbour potential.
(i) Pick any ground state $\omega$ of $\Phi$.
(ii) Look at any configuration $\zeta$ which is a local perturbation of $\omega$ in the sense that $\{i \in S : \zeta_i \ne \omega_i\}$ is finite. Draw closed polygons along the outer boundaries of the connected components of $\{i \in S : \zeta_i \ne \omega_i\}$. Call each such polygon a contour.
(iii) Check if $\omega$ is stable in the sense that each local perturbation $\zeta$ of $\omega$ requires an additional amount of energy which is at least proportional to the total length of the contours.
(iv) If $\omega$ is stable and $\beta$ is large, a configuration $\zeta$ with a long contour requires a large amount of energy relative to $\beta\Phi$ […] $\mathcal{G}(\beta\Phi)$ contains a measure $\mu_\beta^\omega$ which is a "random perturbation" of $\omega$, in the sense that $\mu_\beta^\omega \to \delta_\omega$ as $\beta \to \infty$.
If this program can be carried out successfully for at least two distinct $\omega$'s then clearly $|\mathcal{G}(\beta\Phi)| > 1$ when $\beta$ is large enough, and this will complete the proof of a phase transition. (If these $\omega$'s are related to each other by means of symmetries of $\Phi$ it will be sufficient to study just one of these.) Besides the Ising model and the discrete Gaussian model, there is a large number of further models which can be analyzed by means of a suitable version of the Peierls argument above. Many of these will be discussed in Chapters 18 and 19 as examples of a general theory of phase transitions that relies on Peierls' ideas. A different general theory along Peierls' lines has been developed by Pirogov and Sinai (1976); cf. the Bibliographical Notes on Chapter 19.
6.1
Inhomogeneous Ising chains
We put $S = \mathbb{N}$, the set of all natural numbers, and let $E = \{-1, 1\}$. The a priori measure $\lambda$ on $E$ is counting measure. We choose a sequence $(J_n)_{n \ge 1}$ of real numbers such that $J_n > 0$ for all $n \ge 1$ and

(6.1) $\sum_{n \ge 1} e^{-2J_n} < \infty$.

For instance, we might take $J_n = c \log(1 + n)$ with $c > 1/2$. We define a nearest-neighbour potential $\Phi$ by

(6.2) $\Phi_A = -J_n \sigma_n \sigma_{n+1}$ if $A = \{n, n+1\}$, and $\Phi_A = 0$ otherwise.

Under the action of $\Phi$ adjacent spins tend to be aligned. That is, $\Phi$ is ferromagnetic. Clearly, the spin-flip transformation

(6.3) $(\tau\omega)_i = -\omega_i$ $(\omega \in \Omega,\ i \in S)$

is a symmetry of $\Phi$. For $\mu, \nu \in \mathcal{P}(\Omega, \mathcal{F})$ we write $[\mu, \nu] = \{s\nu + (1 - s)\mu : 0 \le s \le 1\}$.
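(6.4) Theorem. Consider the model above. Then
$\mathcal{G}(\Phi) = [\mu_-, \mu_+]$,
where $\mu_- = \tau(\mu_+)$, $\mu_+(\sigma_i) > 0$ for all $i \in S$, and thereby $\mu_- \ne \mu_+$.

The proof rests on the following lemma.

(6.5) Lemma. Let $\Lambda = \{1, \dots, N\}$, $n \in \Lambda$, $x \in E$, and $\omega \in \Omega$. Then
$\gamma^\Phi_\Lambda(\sigma_n = x \mid \omega) = \bigl(1 + x\,\omega_{N+1} \prod_{i=n}^N \tanh J_i\bigr)/2$.

Proof. Since $2\gamma^\Phi_\Lambda(\sigma_n = x \mid \omega) = 1 + x\,\gamma^\Phi_\Lambda(\sigma_n \mid \omega)$, we need to show that $\gamma^\Phi_\Lambda(\sigma_n \mid \omega) = \omega_{N+1} \prod_{i=n}^N \tanh J_i$. In order to prove this we introduce a bijection $\eta \leftrightarrow \tilde\eta$ from $E^\Lambda$ onto itself by setting
$\eta_i = \omega_{N+1} \prod_{j=i}^N \tilde\eta_j$ $(i \in \Lambda)$,
$\tilde\eta_i = \eta_i \eta_{i+1}$ $(i < N)$, $\tilde\eta_N = \eta_N \omega_{N+1}$.
Then we can write
$Z^\Phi_\Lambda(\omega)\, \gamma^\Phi_\Lambda(\sigma_n \mid \omega) = \sum_{\eta \in E^\Lambda} \eta_n \exp[-H^\Phi_\Lambda(\eta\, \omega_{S \setminus \Lambda})]$
$= \omega_{N+1} \sum_{\tilde\eta \in E^\Lambda} \Bigl(\prod_{i=n}^N \tilde\eta_i\Bigr) \exp\Bigl[\sum_{i=1}^N J_i \tilde\eta_i\Bigr]$
$= \omega_{N+1}\, 2^N \Bigl(\prod_{i=1}^{n-1} \cosh J_i\Bigr) \Bigl(\prod_{i=n}^N \sinh J_i\Bigr)$
$= \omega_{N+1}\, 2^N \Bigl(\prod_{i \in \Lambda} \cosh J_i\Bigr) \Bigl(\prod_{i=n}^N \tanh J_i\Bigr)$.
A similar but simpler computation shows that
$Z^\Phi_\Lambda(\omega) = 2^N \prod_{i \in \Lambda} \cosh J_i$.
Combining these equalities we obtain the desired result. □

Proof of Theorem (6.4). 1) For each $\omega$ in the set $A_+ = \{\sigma_n = 1$ for all sufficiently large $n\}$, the limit
$\mu_+ = \lim_{N \to \infty} \gamma^\Phi_{\{1, \dots, N\}}(\,\cdot \mid \omega)$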
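exists and does not depend on the choice of $\omega$. Indeed, let $A$ be any cylinder event. We choose some $n$ with $A \in \mathcal{F}_{\{1, \dots, n-1\}}$, and for $x \in E$ we let $\omega^{n,x} \in \Omega$ be any configuration such that $\sigma_n(\omega^{n,x}) = x$. For $N > n$ we have from Lemma (6.5)

(6.6) $\gamma^\Phi_{\{1, \dots, N\}}(A \mid \omega) = \sum_{x \in E} \gamma^\Phi_{\{1, \dots, n-1\}}(A \mid \omega^{n,x})\, \gamma^\Phi_{\{1, \dots, N\}}(\sigma_n = x \mid \omega) = \sum_{x \in E} \gamma^\Phi_{\{1, \dots, n-1\}}(A \mid \omega^{n,x}) \bigl(1 + x\,\omega_{N+1} \prod_{i=n}^N \tanh J_i\bigr)/2$.

In the limit $N \to \infty$, the last expression converges to a limit which does not depend on $\omega \in A_+$. Thus $\mu_+$ exists. According to Theorem (4.17), $\mu_+ \in \mathcal{G}(\Phi)$. A similar argument shows that there exists some $\mu_- \in \mathcal{G}(\Phi)$ with
$\mu_- = \lim_{N \to \infty} \gamma^\Phi_{\{1, \dots, N\}}(\,\cdot \mid \omega)$
for all $\omega \in A_- = \tau A_+$. In particular, $[\mu_-, \mu_+] \subset \mathcal{G}(\Phi)$.
2) Next we show that $\tau(\mu_+) = \mu_- \ne \mu_+$. Let $\omega \in A_+$. Then $\tau\omega \in A_-$ and therefore
$\tau(\mu_+) = \lim_{N \to \infty} \tau(\gamma^\Phi_{\{1, \dots, N\}}(\,\cdot \mid \omega)) = \lim_{N \to \infty} \gamma^\Phi_{\{1, \dots, N\}}(\,\cdot \mid \tau\omega) = \mu_-$.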
The second equality comes from the fact that $\tau$ is a symmetry of $\Phi$; cf. Corollary (5.9). Using Lemma (6.5) and hypothesis (6.1) we further obtain that
$\mu_+(\sigma_n) = 2 \lim_{N \to \infty} \gamma^\Phi_{\{1, \dots, N\}}(\sigma_n = 1 \mid \omega) - 1 = \prod_{i \ge n} \tanh J_i = \prod_{i \ge n} \bigl(1 - 2/(1 + e^{2J_i})\bigr) > 0$
for all $n$. In particular, this implies that $\tau(\mu_+) \ne \mu_+$.
3) Finally we prove that $\mathcal{G}(\Phi) \subset [\mu_-, \mu_+]$. Let $\mu \in \mathcal{G}(\Phi)$ be given. For each $n \in S$ we conclude from Lemma (6.5) that
$\mu(\sigma_n \ne \sigma_{n+1}) = \sum_{z \in E} \gamma^\Phi_{\{1, \dots, n\}}(\sigma_n = -z \mid \omega^{n+1,z})\, \mu(\sigma_{n+1} = z) = (1 - \tanh J_n)/2 = 1/(1 + e^{2J_n})$.
Hypothesis (6.1) ensures therefore that
$\sum_{n \ge 1} \mu(\sigma_n \ne \sigma_{n+1}) < \infty$.
Hence by the Borel-Cantelli lemma
$\mu(A_- \cup A_+) = \mu(\sigma_n \ne \sigma_{n+1}$ for at most finitely many $n) = 1$.
Consequently, for each cylinder event $A$ we obtain, using the dominated convergence theorem,
$\mu(A) = \lim_{N \to \infty} \int \mu(d\omega)\, \gamma^\Phi_{\{1, \dots, N\}}(A \mid \omega) = \mu(A_-)\mu_-(A) + \mu(A_+)\mu_+(A)$.
Hence $\mu = \mu(A_-)\mu_- + \mu(A_+)\mu_+$ and therefore $\mu \in [\mu_-, \mu_+]$. □
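The closed formula of Lemma (6.5) can be compared with a direct enumeration over all spin configurations in $\Lambda = \{1, \dots, N\}$. The following sketch assumes a short, arbitrarily chosen list of couplings; the names `conditional_prob` and `lemma_formula` are illustrative. Only the boundary spin $\omega_{N+1}$ enters, because the potential (6.2) couples nearest neighbours only.

```python
import itertools
import math


def conditional_prob(couplings, n, x, omega_next):
    """Brute-force value of gamma_Lambda(sigma_n = x | omega) for Lambda = {1, ..., N}.

    couplings[i - 1] plays the role of J_i, and omega_next is the boundary spin omega_{N+1}.
    """
    size = len(couplings)
    numerator = 0.0
    partition = 0.0
    for eta in itertools.product((-1, 1), repeat=size):
        extended = eta + (omega_next,)
        # Energy of the bonds {i, i+1}, i = 1..N, which are the ones meeting Lambda.
        h = -sum(couplings[i] * extended[i] * extended[i + 1] for i in range(size))
        weight = math.exp(-h)
        partition += weight
        if eta[n - 1] == x:
            numerator += weight
    return numerator / partition


def lemma_formula(couplings, n, x, omega_next):
    """Right-hand side of Lemma (6.5): (1 + x * omega_{N+1} * prod_{i=n}^{N} tanh J_i) / 2."""
    tanh_product = math.prod(math.tanh(j) for j in couplings[n - 1:])
    return (1 + x * omega_next * tanh_product) / 2


couplings = [0.3, 0.8, 1.1, 0.5]  # arbitrary positive test values, not taken from the text
largest_gap = max(
    abs(conditional_prob(couplings, n, x, w) - lemma_formula(couplings, n, x, w))
    for n in range(1, len(couplings) + 1)
    for x in (-1, 1)
    for w in (-1, 1)
)
```

The enumeration costs $2^N$ steps and is meant as a sanity check for small $N$ only; the lemma itself gives the value in $O(N)$ operations.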
(6.7) Comments. (1) The constant configurations $\omega^+$ and $\omega^-$ minimize $\Phi_A$ for all $A \in \mathcal{S}$ and are thus ground states of $\Phi$. Let $\delta_+$ and […]
$\lim_{\beta \to \infty} \mathcal{G}(\beta\Phi) = [\delta_-, \delta_+]$.
(2) The condition (6.1) is not only sufficient but also necessary for a phase transition. For if (6.1) does not hold then
$\prod_{i=n}^\infty \tanh J_i = 0$
for all $n \ge 1$. Hence equation (6.6) shows that, for all $\omega \in \Omega$,
$\lim_{N \to \infty} \gamma^\Phi_{\{1, \dots, N\}}(\,\cdot \mid \omega)$
exists and is independent of $\omega$. Applying the dominated convergence theorem as in Step 3) of the preceding proof we can conclude that $|\mathcal{G}(\Phi)| = 1$.
(3) Step 3) of the proof above has also shown that $\mu_+(A_+) = \mu_-(A_-) = 1$. First of all, this means that the two extreme points $\mu_+$ and $\mu_-$ are supported on disjoint tail events. This is true in general, as we shall see in Chapter 7. Secondly, the event $A_+ \cup A_-$ can be characterized as follows. A configuration $\omega$ belongs to $A_+ \cup A_-$ if and only if the graph $G(\omega)$ with vertex set $S$ and edge set
$\{\{i, i+1\} \subset S : \omega_i = \omega_{i+1}\} = \{A = \{i, j\} \subset S : |i - j| = 1,\ \Phi_A(\omega) = \min \Phi_A\}$
contains a (necessarily unique) infinite connected component. The relation $\mu(A_+ \cup A_-) = 1$ which has been proved to hold for all $\mu \in \mathcal{G}(\Phi)$ can thus be restated by saying that the random graph $G(\cdot)$ on the probability space $(\Omega, \mathcal{F}, \mu)$ exhibits the phenomenon of percolation. The phase transition for $\Phi$ can be viewed to be a consequence of this fact, together with the further fact that the percolation event splits into the two disjoint tail events $A_+$ and $A_-$ which are related to each other by the symmetry $\tau$. This geometrical point of view is very similar to the ideas of Peierls which will be described in the next section. In fact, the proof of Lemma (6.14) will show that the random graph $G(\cdot)$ is closely related to Peierls' notion of a contour. The same random graph will reappear in Chapters 18 and 19. These chapters contain a theory of phase transitions which rests on percolation. ○
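The dichotomy behind condition (6.1) can be made visible numerically. The proof of Theorem (6.4) gives $\mu_+(\sigma_1) = \prod_{i \ge 1} \tanh J_i$, and this product is positive exactly when $\sum_n e^{-2J_n} < \infty$. The sketch below evaluates the logarithm of the partial products for the example $J_n = c \log(1 + n)$; the cut-off `n_max` and the function name are assumptions made for illustration, and a finite partial product can of course only suggest the limiting behaviour.

```python
import math


def log_partial_magnetization(c, n_max):
    """log of prod_{n=1}^{n_max} tanh(J_n) for J_n = c * log(1 + n)."""
    return sum(math.log(math.tanh(c * math.log(1 + n))) for n in range(1, n_max + 1))


# c > 1/2: condition (6.1) holds and the partial products stay bounded away from zero.
summable_case = math.exp(log_partial_magnetization(1.0, 10_000))

# c <= 1/2: condition (6.1) fails and the logarithm drifts to minus infinity.
divergent_case_log = log_partial_magnetization(0.25, 10_000)
```

Working with logarithms avoids floating-point underflow in the divergent case, where the product itself would quickly round to zero.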
6.2
The Ising ferromagnet in two dimensions
Again we choose the set $E = \{-1, 1\}$ for the state space (and counting measure for the a priori measure), but now the parameter set is $S = \mathbb{Z}^2$, the square lattice. We consider the ferromagnetic Ising potential $\Phi$ with coupling constant 1 and vanishing external field. $\Phi$ is given by

(6.8) $\Phi_A = -\sigma_i \sigma_j$ if $A = \{i, j\}$, $|i - j| = 1$, and $\Phi_A = 0$ otherwise.

[…] again be denoted by $\delta_+$ resp. $\delta_-$. As $\Phi_A$ takes its minimum at $\omega^+$ and $\omega^-$ for all $A \in \mathcal{S}$, $\omega^+$ and $\omega^-$ are ground states of $\Phi$. Thus $\Phi$ exhibits a ground state degeneracy. To see whether this ground state degeneracy implies a phase transition we need to verify that the ground states $\omega^+$ and $\omega^-$ are stable in the sense that the set […]

In physical terms, $\mu_+^\beta(\sigma_0)$ is the magnetization of the Ising spin system when $\mu_+^\beta$ is its state. The last sentence of the theorem above can thus be rephrased as follows: At sufficiently low temperatures, the two-dimensional Ising ferromagnet admits an equilibrium state of positive magnetization, although there is no action of an external field. This phenomenon is called spontaneous magnetization. Before turning to the proof we take this opportunity to mention a number of further results on the Ising ferromagnet. Using correlation inequalities, one can show that for each $\beta > 0$ the local limit $\mu_+^\beta = \lim_{\Lambda \uparrow S} \gamma_\Lambda^{\beta\Phi}(\,\cdot \mid \omega^+) \in \mathcal{G}_\Theta(\beta\Phi)$ exists. Its magnetization is maximal, in that […] $(\beta \ge \beta_c)$.
These remarkable formulas have a long history which extends over the three decades from 1941 to 1973; see the Bibliographical Notes on this section. A further remarkable result was obtained independently by Aizenman (1980a, b) and Higuchi (1981) on the basis of the pioneering work of Russo (1979): $\mu_+^\beta$ and its spin-flip image $\mu_-^\beta = \tau(\mu_+^\beta)$ are the only extreme Gibbs measures for $\beta$ […]
$\Lambda = S \cap ([M_1, N_1] \times [M_2, N_2])$ $(M_1 < N_1,\ M_2 < N_2)$ in $S$, and we consider the set

(6.11) $B = \{\{i, j\} \subset S : |i - j| = 1,\ \{i, j\} \cap \Lambda \ne \emptyset\}$

of all nearest-neighbour bonds which emanate from sites in $\Lambda$. Each bond $b = \{i, j\} \in B$ should be visualized as a line segment between $i$ and $j$. This line segment crosses a unique "dual" line segment between two nearest-neighbour sites $u, v$ in the dual rectangle
$\Lambda^* = \{M_1 - \tfrac12, M_1 + \tfrac12, \dots, N_1 + \tfrac12\} \times \{M_2 - \tfrac12, \dots, N_2 + \tfrac12\}$.
The associated set $b^* = \{u, v\}$ is called the dual bond of $b$. Hence $b^* = \{u \in \Lambda^* : |u - (i + j)/2| = 1/2\}$. We write

(6.12) $B^* = \{b^* : b \in B\} = \{\{u, v\} \subset \Lambda^* : |u - v| = 1\}$

for the set of all dual bonds. A subset $c$ of $B^*$ is called a circuit if $c = \{\{u^{(k-1)}, u^{(k)}\} : 1 \le k \le \ell\}$ for some finite sequence $(u^{(0)}, \dots, u^{(\ell)})$ with $u^{(\ell)} = u^{(0)}$, $|\{u^{(1)}, \dots, u^{(\ell)}\}| = \ell$, and $\{u^{(k-1)}, u^{(k)}\} \in B^*$ for $1 \le k \le \ell$. $|c| = \ell$ is called the length of $c$. A circuit $c$ is said to surround a site $a \in \Lambda$ if for each "path" $(i^{(0)}, \dots, i^{(n)})$ in $S$ with $i^{(0)} = a$, $i^{(n)} \notin \Lambda$, and $\{i^{(m-1)}, i^{(m)}\} \in B$ for all $1 \le m \le n$ there is some $m$ with $\{i^{(m-1)}, i^{(m)}\}^* \in c$. We let $C_a$ denote the set of all circuits in $B^*$ which surround $a$.

(6.13) Lemma. For each $a \in \Lambda$ and $\ell \ge 1$ we have
$|\{c \in C_a : |c| = \ell\}| \le \ell\, 3^{\ell - 1}$.

Proof. Each $c \in C_a$ of length $\ell$ contains at least one of the $\ell$ dual bonds $\{a + (k - 1, 0),\ a + (k, 0)\}^*$ $(k = 1, \dots, \ell)$ which cross the horizontal half-axis from $a$ to the right. If the remaining $\ell - 1$ dual bonds in $c$ are successively added then at each step there are at most three possible choices for attaching the next dual bond to the preceding one. □

It was the key idea of Peierls to look at circuits which occur in a configuration as follows. For each $\omega \in \Omega$ we let
$B^*(\omega) = \{b^* : b = \{i, j\} \in B,\ \omega_i \ne \omega_j\}$
denote the set of all dual bonds in $B^*$ which cross a bond between spins of opposite signs. A circuit $c$ with $c \subset B^*(\omega)$ is called a contour for $\omega$.
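The set $B^*(\omega)$ is easy to compute for a small configuration, and doing so exhibits the elementary energy identity behind the contour method: every bond of $B$ contributes $-1$ to the energy if its spins agree and $+1$ if they differ, so that $-\sum_{\{i,j\} \in B} \omega_i \omega_j = 2|B^*(\omega)| - |B|$. The sketch below is a hedged illustration; it represents each dual bond $b^*$ by the bond $b$ it crosses (the map $b \to b^*$ is a bijection), assumes plus spins outside the rectangle, and uses helper names that are not part of the text.

```python
def bonds(m1, n1, m2, n2):
    """The rectangle Lambda and all nearest-neighbour bonds with an endpoint in it."""
    box = {(x, y) for x in range(m1, n1 + 1) for y in range(m2, n2 + 1)}
    bond_set = set()
    for (x, y) in box:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            bond_set.add(frozenset({(x, y), (x + dx, y + dy)}))
    return box, bond_set


def broken_bonds(omega, bond_set):
    """Bonds between spins of opposite sign; these correspond to B*(omega)."""
    return {b for b in bond_set if len({omega(site) for site in b}) == 2}


def energy(omega, bond_set):
    """Sum of -omega_i * omega_j over all bonds {i, j} in B."""
    total = 0
    for b in bond_set:
        i, j = tuple(b)
        total -= omega(i) * omega(j)
    return total


# A single minus spin at the centre of a 3 x 3 box, plus spins everywhere else.
box, bond_set = bonds(0, 2, 0, 2)
spins = {(1, 1): -1}


def omega(site):
    return spins.get(site, 1)


broken = broken_bonds(omega, bond_set)
```

Here the four broken bonds around the centre site correspond to the four dual bonds of the shortest possible contour.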
Figure 6.1 A contour surrounding $a$, as constructed in Lemma (6.14).

There are two reasons for looking at contours. First, if $\omega$ is constant on $S \setminus \Lambda$ but takes a different value at some $a \in \Lambda$ then $a$ is surrounded by a contour for $\omega$. Secondly, there is an intimate connection between the Ising Hamiltonian $H_\Lambda^\Phi(\omega)$ and the set $B^*(\omega)$. This relation implies that the occurrence of a contour $c$ requires an energy proportional to the length of $c$. Consequently, if $\beta$ is large then long contours will be very improbable with respect to $\gamma_\Lambda^{\beta\Phi}(\,\cdot \mid \omega)$. These two facts will be established in the next two lemmas. They will imply the theorem as follows. If we fix the $\omega^+$ boundary condition outside $\Lambda$ then with overwhelming probability the minus spins in $\Lambda$ form small islands in an ocean of plus spins. Therefore, taking the limit $\Lambda \uparrow S$ we will obtain some Gibbs measure $\mu_+^\beta$ which is arbitrarily close to $\delta_+$ when $\beta$ is large enough. Similarly, the $\omega^-$ boundary condition will lead to some $\mu_-^\beta \in \mathcal{G}(\beta\Phi)$ close to $\delta_-$. As $\delta_+$ and $\delta_-$ are distinct, so are $\mu_+^\beta$ and $\mu_-^\beta$ when $\beta$ is large. Hence $|\mathcal{G}(\beta\Phi)| > 1$ when $\beta$ is large. Now we turn to the details. Our first lemma ensures the existence of contours.

(6.14) Lemma. Suppose $\omega \in \Omega$ is such that $\omega_i = 1$ for all $i \in S \setminus \Lambda$ and $\omega_a = -1$ for some $a \in \Lambda$. Then there exists a contour for $\omega$ which surrounds $a$.

Proof. Thinking of $S$ as the vertex set of a simple graph with bond set $\{\{i, j\} \subset S : |i - j| = 1\}$, we let $D$ denote the largest connected subset of $\{i \in S : \omega_i = -1\}$ containing $a$, and we also let $D'$ be the unique infinite connected component of $S \setminus D$. We define
eB*:ieD,jeD'}.
$c$ is the "outer boundary" of $D$. (See Figure 6.1.) We have $c \subset B^*(\omega)$ because $D$ is a maximal connected set of spins of equal sign and $D'$ is disjoint from $D$. Moreover, for each path in $S$ from $a$ to a site in $S \setminus \Lambda$ the "bond of the last exit from $D$" defines an element of $c$. Thus we have $c \in C_a$ provided $c$ is a circuit. Since $D$ is connected, this will follow once we have shown that
\[ n_c(u) \equiv |\{b^* \in c : b^* \ni u\}| = 2 \]
for all $u \in \bigcup_{b^* \in c} b^*$. To see this we look at the four sites in the set $N_u = \{u + (\pm\tfrac12, \pm\tfrac12)\} \subset S$. Suppose first that $n_c(u) = 1$. Using the fact that $j \in D'$ whenever $i \in D'$ and $\{i,j\}^* \in B^* \setminus c$, we conclude that $N_u \subset D'$. But $N_u \cap D \ne \emptyset$. Hence $n_c(u) \ne 1$. To exclude the case $n_c(u) = 3$ it is sufficient to note that $1_D(i)$ changes its value an even number of times when $i$ runs clockwise through $N_u$. Finally, let $n_c(u) = 4$. Then the two sites $u \pm (\tfrac12, \tfrac12)$ belong to one of the two sets $D$ and $D'$, and the remaining two sites $u \pm (\tfrac12, -\tfrac12)$ belong to the other one. This is impossible because $D$ and $D'$ are connected. (A rigorous proof of this "obvious" impossibility can be based on the Jordan curve theorem; see Theorem V.10.2 of Newman (1951), for example.) □ The following contour estimate is the heart of the Peierls argument. (6.15) Lemma. Suppose $c \subset B^*$ is a circuit. Then
\[ \gamma_\Lambda^{\beta\Phi}(c \subset B^*(\cdot) \mid \omega) \le e^{-2\beta|c|} \]
for all $\beta > 0$ and all $\omega \in \Omega$.
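Proof. For each $\zeta \in \Omega$ we have
\[ -H_\Lambda^\Phi(\zeta) = \sum_{\{i,j\} \in B} \zeta_i \zeta_j = |B| - \sum_{\{i,j\} \in B} (1 - \zeta_i\zeta_j) = |B| - 2\,|\{\{i,j\} \in B : \zeta_i \ne \zeta_j\}| = |B^*| - 2\,|B^*(\zeta)|. \]
We put $A_1 = \{\zeta \in \Omega: \zeta_{S\setminus\Lambda} = \omega_{S\setminus\Lambda},\ c \subset B^*(\zeta)\}$ and $A_2 = \{\zeta \in \Omega: \zeta_{S\setminus\Lambda} = \omega_{S\setminus\Lambda},\ c \cap B^*(\zeta) = \emptyset\}$. We define a mapping $\tau_c: \Omega \to \Omega$ by
\[ (\tau_c\zeta)_i = \begin{cases} -\zeta_i & \text{if } i \text{ is surrounded by } c, \\ \zeta_i & \text{otherwise.} \end{cases} \]
Thus $\tau_c$ flips all spins in the "interior" of $c$. For all $\{i,j\} \in B$ we have
\[ (\tau_c\zeta)_i (\tau_c\zeta)_j = \begin{cases} -\zeta_i\zeta_j & \text{if } \{i,j\}^* \in c, \\ \zeta_i\zeta_j & \text{otherwise.} \end{cases} \]
(For if $\{i,j\}^* \in c$ then precisely one of the sites $i, j$ is surrounded by $c$, as can be shown by means of the Jordan curve theorem.) Thus $B^*(\tau_c\zeta) \,\triangle\, B^*(\zeta) = c$. In particular, $\tau_c$ is a bijection from $A_2$ to $A_1$, and for $\zeta \in A_2$ we have $H_\Lambda^\Phi(\zeta) - H_\Lambda^\Phi(\tau_c\zeta) = 2|B^*(\zeta)| - 2|B^*(\tau_c\zeta)| = -2|c|$. Thus
\[ \gamma_\Lambda^{\beta\Phi}(c \subset B^*(\cdot) \mid \omega) \le \sum_{\zeta \in A_1} \exp(-\beta H_\Lambda^\Phi(\zeta)) \Big/ \sum_{\zeta \in A_2} \exp(-\beta H_\Lambda^\Phi(\zeta)) = \sum_{\zeta \in A_2} \exp(-\beta H_\Lambda^\Phi(\tau_c\zeta)) \Big/ \sum_{\zeta \in A_2} \exp(-\beta H_\Lambda^\Phi(\zeta)) = \exp(-2\beta|c|), \]
as was to be shown. □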
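The energy bookkeeping in this proof can be checked by hand on a small box. The following sketch is only an illustration under stated assumptions: the box $\{0,\ldots,4\}^2$, the plus boundary condition, the square region whose boundary plays the role of the circuit $c$, and all function names are choices made here, not part of the text. It verifies the identity $-H = |B| - 2|B^*(\zeta)|$, the relation $B^*(\tau_c\zeta) \,\triangle\, B^*(\zeta) = c$, and the energy difference $-2|c|$.

```python
import itertools

def box_and_bonds(n):
    """Return the box {0,...,n-1}^2 and all nearest-neighbour bonds meeting it."""
    box = set(itertools.product(range(n), repeat=2))
    bonds = set()
    for (x, y) in box:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            bonds.add(frozenset({(x, y), (x + dx, y + dy)}))
    return box, bonds

def spin(zeta, i):
    """Spin at site i; sites not listed in zeta carry the plus boundary condition."""
    return zeta.get(i, 1)

def energy(zeta, bonds):
    """Ising Hamiltonian: minus the sum of zeta_i * zeta_j over the bonds."""
    return -sum(spin(zeta, i) * spin(zeta, j) for i, j in map(tuple, bonds))

def broken_bonds(zeta, bonds):
    """Bonds joining opposite spins; their duals form B*(zeta)."""
    return {b for b in bonds
            if spin(zeta, tuple(b)[0]) != spin(zeta, tuple(b)[1])}

def flip_inside(zeta, region, box):
    """The map tau_c for the circuit bounding `region`: flip every spin in region."""
    return {i: (-spin(zeta, i) if i in region else spin(zeta, i)) for i in box}

box, bonds = box_and_bonds(5)
region = {(x, y) for x in (1, 2, 3) for y in (1, 2, 3)}
# Bonds with exactly one endpoint in the region: their duals form the circuit c.
circuit = {b for b in bonds if len(b & region) == 1}

zeta = {i: 1 for i in box}
zeta[(2, 2)] = -1            # one minus spin well inside, so c misses B*(zeta)
flipped = flip_inside(zeta, region, box)
```

In this example the configuration `zeta` lies in $A_2$ and `flipped` lies in $A_1$, so the pair illustrates the bijection used in the proof.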
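Proof of Theorem (6.9). For $\beta > 0$ we define
\[ r(\beta) = 1 \wedge \sum_{\ell \ge 1} \ell\, (3e^{-2\beta})^{\ell}. \]
Then $r(\beta) \to 0$ as $\beta \to \infty$. The preceding three lemmas yield
\[ \gamma_\Lambda^{\beta\Phi}(\sigma_a = -1 \mid \omega^+) \le \sum_{c \in C_a} \gamma_\Lambda^{\beta\Phi}(c \subset B^*(\cdot) \mid \omega^+) \]
and thus $\gamma_\Lambda^{\beta\Phi}(\sigma_a = -1 \mid \omega^+) \le r(\beta)$ for all $a \in S$, $\beta > 0$ and each rectangle $\Lambda$ in $S$. For each $N \ge 1$ we put $\Lambda_N = S \cap [-N, N]^2$ and
\[ \nu_{N,+}^\beta = |\Lambda_N|^{-1} \sum_{i \in \Lambda_N} \gamma_{\Lambda_N + i}^{\beta\Phi}(\cdot \mid \omega^+). \]
As $\mathcal{P}(\Omega, \mathcal{F})$ is compact, the sequence $(\nu_{N,+}^\beta)_{N \ge 1}$ has a cluster point $\mu_+^\beta$. From Example (5.20)(1) we know that $\mu_+^\beta \in \mathcal{G}_\Theta(\beta\Phi)$. The preceding estimate implies that $\mu_+^\beta(\sigma_a = -1) \le r(\beta)$ for all $a \in S$. Consequently, if $\Lambda \in \mathcal{S}$ and $f \in \mathcal{L}_\Lambda$ then
\[ |\mu_+^\beta(f) - \delta_+(f)| \le \mu_+^\beta(|f - f(\omega^+)|) \le 2\|f\| \sum_{a \in \Lambda} \mu_+^\beta(\sigma_a = -1) \le 2\|f\|\,|\Lambda|\, r(\beta). \]
Hence $\lim_{\beta\to\infty} \mu_+^\beta = \delta_+$ and therefore
\[ \lim_{\beta\to\infty} d(\mathcal{G}_\Theta(\beta\Phi), \delta_+) = 0. \]
Finally, we put $\mu_-^\beta = \tau(\mu_+^\beta)$. Since $\tau$ preserves $\Phi$ and commutes with each translation, $\mu_-^\beta \in \mathcal{G}_\Theta(\beta\Phi)$; cf. Remark (5.10). As $\tau$ maps local functions to local functions,
\[ \lim_{\beta\to\infty} \mu_-^\beta = \tau\bigl(\lim_{\beta\to\infty} \mu_+^\beta\bigr) = \tau(\delta_+) = \delta_- \]
and thus $\lim_{\beta\to\infty} d(\mathcal{G}_\Theta(\beta\Phi), \delta_-) = 0$. □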
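To get a feeling for how large $\beta$ has to be, one can evaluate the Peierls bound numerically. The sketch below assumes that $r(\beta)$ is read as $1 \wedge \sum_{\ell \ge 1} \ell\,(3e^{-2\beta})^\ell$, which is what Lemmas (6.13) and (6.15) suggest; the function names, the bisection bounds and the target value $1/2$ are illustrative choices made here. It uses the elementary identity $\sum_{\ell\ge1} \ell x^\ell = x/(1-x)^2$ for $0 \le x < 1$.

```python
import math

def peierls_r(beta, terms=None):
    """min(1, sum_{l>=1} l * x**l) with x = 3*exp(-2*beta).

    With terms=None the closed form x/(1-x)^2 is used; otherwise the series
    is truncated after `terms` summands (useful as a cross-check).
    """
    x = 3.0 * math.exp(-2.0 * beta)
    if x >= 1.0:
        return 1.0                      # the series diverges, the minimum is 1
    if terms is None:
        total = x / (1.0 - x) ** 2
    else:
        total = sum(l * x ** l for l in range(1, terms + 1))
    return min(1.0, total)

def beta_where_r_drops_below(target, lo=0.0, hi=10.0, steps=60):
    """Bisection for the beta at which peierls_r crosses `target` (0 < target < 1).

    peierls_r is non-increasing in beta, so bisection is legitimate.
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if peierls_r(mid) < target:
            hi = mid
        else:
            lo = mid
    return hi

beta_half = beta_where_r_drops_below(0.5)
```

Beyond `beta_half` the bound $\mu_+^\beta(\sigma_a = -1) \le r(\beta) < 1/2$ forces $\mu_+^\beta(\sigma_a) > 0$, so the plus and minus phases are certainly distinct there. This is only a sufficient threshold coming from the crude contour count, not the critical inverse temperature.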
We conclude this section with a few remarks on the two-dimensional Ising antiferromagnet without external field. The corresponding potential $\Psi$ differs from the ferromagnetic Ising potential $\Phi$ by its sign: $\Psi = -\Phi$. In the notation (5.3) we have $\Psi = r(\Phi)$, where $r: \omega \to ((-1)^{m+n}\omega_{(m,n)})_{(m,n) \in S}$ is the spin flip on the odd sublattice of $S$. Let $\mu_+^\beta \in \mathcal{G}_\Theta(\beta\Phi)$ be as above. By Proposition (5.6) and Remark (5.10), the measure $\nu_+^\beta = r(\mu_+^\beta)$ belongs to $\mathcal{G}(\beta\Psi)$. If $\beta$ is so large that $\mu_+^\beta(\sigma_0) > 0$ then $\nu_+^\beta(\sigma_0) > 0 > \nu_+^\beta(\sigma_{(0,1)})$. Hence $\nu_+^\beta$ is not invariant under the translation $\theta_{(0,1)}$; indeed, $\theta_{(0,1)}(\nu_+^\beta) = r(\mu_-^\beta)$.
Whilst all Gibbs measures for the Ising antiferromagnet on $\mathbb{Z}^2$ are still periodic, the example below exhibits a complete breaking of shift-invariance: There exist Gibbs measures which differ from all their translates.
6.3
Shlosman's random staircases
As in the preceding section, we let $S = \mathbb{Z}^2$ be the square lattice. But now we choose a different state space: We put $E = \mathbb{Z}$, and we let $\lambda$ be counting measure. We consider the potential
(6.16)
\[ \Phi_A = \begin{cases} (\sigma_i - \sigma_j)^2 & \text{if } A = \{i,j\},\ |i - j| = 1, \\ 0 & \text{otherwise.} \end{cases} \]
(We note that $\Phi$ is equivalent to the Ising potential (6.8) when $E = \{-1, 1\}$. On the other hand, if $E = \mathbb{R}$ and $\lambda$ is Lebesgue measure then the potential
exp(-jSfl*(0) ^ Z I I exp(-/?(Ç ; - C.--(i.o))2)
$\le \bigl(\sum_{z \in \mathbb{Z}} \exp(-\beta z^2)\bigr)^{|\Lambda|} < \infty$ whenever $\beta > 0$, $\omega \in \Omega$, and $\Lambda \in \mathcal{S}$. Next we observe that $\Phi$ has several symmetries. (6.17) Remark. Each of the following transformations of $\Omega$ preserves $\Phi$.
(i) The lattice translations $\theta_i$, $i \in S$; cf. Example (5.2)(1).
(ii) The lattice reflections $r_1$ and $r_2$. These act on $\Omega$ as $(r_1\omega)_i = \omega_{(-i_1, i_2)}$, $(r_2\omega)_i = \omega_{(i_1, -i_2)}$. Here $i = (i_1, i_2) \in S$ and $\omega \in \Omega$.
(iii) The lattice rotation $r_0$ which is defined by $(r_0\omega)_i = \omega_{(i_2, -i_1)}$; $i = (i_1, i_2) \in S$, $\omega \in \Omega$.
(iv) The spin reflection $\tau: \omega \to -\omega$; cf. Example (5.2)(2).
(v) The spin translation $t$ which acts as $(t\omega)_i = \omega_i - 1$, $i \in S$, $\omega \in \Omega$. ○
Knowing the results on the two-dimensional Ising model we might expect a breaking of the symmetries $\tau$ and $t$. However, far more than this is true. We will show that at sufficiently low temperatures each of the $\Phi$-symmetries listed above is broken. Let us consider the ground states of $\Phi$ first. So far we have not provided a general definition of a ground state. In some particular cases we used the term ground state for each configuration $\omega$ which minimizes the potential. In the case under discussion, the class of all configurations which minimize each $\Phi_A$ consists of all constant configurations. However, there are also some non-constant configurations which will be seen to play an important role in the low temperature limit. These non-constant configurations are ground states of $\Phi$ in the following, more general, sense. (6.18) Definition. Let $\Psi$ be a potential. A configuration $\omega \in \Omega$ is called a ground state of $\Psi$ if
\[ H_\Lambda^\Psi(\zeta) \ge H_\Lambda^\Psi(\omega) \]
whenever $\Lambda \in \mathcal{S}$ and $\zeta \in \Omega$ are such that $\zeta_{S\setminus\Lambda} = \omega_{S\setminus\Lambda}$. Thus $\omega$ is a ground state if the energy of each finite perturbation of $\omega$ is at least as large as that of $\omega$. Clearly, if $\omega$ minimizes each of the functions $\tilde\Psi_A$ for some potential $\tilde\Psi \sim \Psi$ then $\omega$ is a ground state of $\Psi$. Let us return to the potential $\Phi$ at (6.16). For each $z \in \mathbb{Z}$ we define a configuration $\omega^z$ by
(6.19) $\omega_i^z = z\, i_1$ if $i = (i_1, i_2) \in S$.
(For $z \ne 0$, $\omega^z$ resembles an infinite staircase.) Clearly, $\tau\omega^z = \omega^{-z}$. The configurations $\omega^z$, $z \ge 0$, belong to distinct levels of the specific energy. Indeed, if $\Lambda_N = S \cap [-N, N]^2$ then
\[ \lim_{N\to\infty} |\Lambda_N|^{-1} H_{\Lambda_N}^\Phi(\omega^z) = z^2. \]
Nevertheless, all $\omega^z$'s are ground states of $\Phi$. (6.20) Remark. The configurations $\omega^z$, $z \in \mathbb{Z}$, (and also their images under the symmetries $r_0$ and $t^n$ $(n \in \mathbb{Z})$ of $\Phi$) are ground states of $\Phi$. Proof. We fix some $z \in \mathbb{Z}$ and a rectangle $\Lambda$ of the form (6.10). Let $\zeta \in \Omega$ be such that $\zeta_{S\setminus\Lambda} = \omega^z_{S\setminus\Lambda}$ but $\zeta \ne \omega^z$. Then
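\[ H_\Lambda^\Phi(\zeta) - H_\Lambda^\Phi(\omega^z) = \sum_{k=M_1}^{N_1} \sum_{\ell=M_2-1}^{N_2} (\zeta_{(k,\ell+1)} - \zeta_{(k,\ell)})^2 + \sum_{\ell=M_2}^{N_2} \sum_{k=M_1-1}^{N_1} \bigl[(\zeta_{(k+1,\ell)} - \zeta_{(k,\ell)})^2 - z^2\bigr]. \]
The first term is strictly positive. To see that the second term is nonnegative we use the inequality
\[ s^2 - z^2 \ge 2z(s - z) \qquad (s \in \mathbb{R}) \]
and observe that for each $\ell$
\[ \sum_{k=M_1-1}^{N_1} (\zeta_{(k+1,\ell)} - \zeta_{(k,\ell)} - z) = \sum_{k=M_1-1}^{N_1} \bigl[(\zeta_{(k+1,\ell)} - \omega^z_{(k+1,\ell)}) - (\zeta_{(k,\ell)} - \omega^z_{(k,\ell)})\bigr] = (\zeta_{(N_1+1,\ell)} - \omega^z_{(N_1+1,\ell)}) - (\zeta_{(M_1-1,\ell)} - \omega^z_{(M_1-1,\ell)}) = 0. \]
□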
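The ground state property can also be probed numerically. The sketch below is an illustration only: it assumes the rectangle is $\{M_1,\ldots,N_1\} \times \{M_2,\ldots,N_2\}$, sums the potential (6.16) over all bonds meeting that rectangle, and compares the staircase $\omega^z$ with randomly perturbed configurations that agree with it outside the rectangle. The function names, the perturbation range and the number of trials are choices made here.

```python
import random

def staircase(z):
    """The configuration omega^z of (6.19): the value at (i_1, i_2) is z * i_1."""
    return lambda i: z * i[0]

def energy(config, m1, n1, m2, n2):
    """Sum of (sigma_i - sigma_j)^2 over all bonds meeting the rectangle
    {m1..n1} x {m2..n2}; `config` maps a site (i_1, i_2) to an integer."""
    total = 0
    for k in range(m1, n1 + 1):              # vertical bonds
        for l in range(m2 - 1, n2 + 1):
            total += (config((k, l + 1)) - config((k, l))) ** 2
    for l in range(m2, n2 + 1):              # horizontal bonds
        for k in range(m1 - 1, n1 + 1):
            total += (config((k + 1, l)) - config((k, l))) ** 2
    return total

def perturb(base, changes):
    """Configuration equal to `base` plus the finitely many integer `changes`."""
    return lambda i: base(i) + changes.get(i, 0)

def min_energy_gap(z, m1, n1, m2, n2, trials=200, seed=0):
    """Smallest value of H(zeta) - H(omega^z) over random nonzero perturbations."""
    rng = random.Random(seed)
    base = staircase(z)
    h0 = energy(base, m1, n1, m2, n2)
    sites = [(k, l) for k in range(m1, n1 + 1) for l in range(m2, n2 + 1)]
    gaps = []
    for _ in range(trials):
        changes = {i: rng.randint(-2, 2) for i in sites}
        if not any(changes.values()):        # make sure zeta differs from omega^z
            changes[sites[0]] = 1
        gaps.append(energy(perturb(base, changes), m1, n1, m2, n2) - h0)
    return min(gaps)
```

A strictly positive return value of `min_energy_gap` is what the proof predicts; random sampling of course only illustrates the statement and does not replace the telescoping argument.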
There are further ground states of $\Phi$, for example the configuration $\omega$ with $\omega_i = 1$ for $i_1 \ge 0$ and $\omega_i = 0$ otherwise. The ground states $t^n\omega^z$ and $r_0 \circ t^n\omega^z$ $(z, n \in \mathbb{Z})$, however, are stable: In the low temperature limit $\beta \to \infty$, the set $\mathcal{G}(\beta\Phi)$ is attracted by each of the Dirac measures at these staircase-shaped ground states. In other words, if $\beta$ is large then $\mathcal{G}(\beta\Phi)$ contains a collection of random fields which are random perturbations of these staircase configurations. It is these Gibbs measures to which the title of this section refers. They were discovered by Shlosman (1983). (6.21) Theorem. Let $E = \mathbb{Z}$, $S = \mathbb{Z}^2$, and suppose $\Phi$ is given by (6.16). If $\beta$ is sufficiently large then there exists a family $(\mu_z^\beta)_{z \in \mathbb{Z}}$ of Gibbs measures in $\mathcal{G}(\beta\Phi)$ with the following properties.
(i) For all $i \in S$ and $z \in \mathbb{Z}$, $\mu_z^\beta(\sigma_i) = \omega_i^z$ and $\mu_z^\beta(\sigma_i = \omega_i^z) > 1/2$. In the low temperature limit $\beta \to \infty$, $\mu_z^\beta$ tends to the Dirac measure $\delta_z$ at $\omega^z$.
(ii) For each $z \in \mathbb{Z}$, $\mu_z^\beta$ is invariant under the $\Phi$-symmetries $\theta_{(0,1)}$, $t^{-z} \circ \theta_{(1,0)}$, $r_1 \circ \tau$, and $r_2$ which preserve $\omega^z$. Moreover, $\tau(\mu_z^\beta) = r_1(\mu_z^\beta) = \mu_{-z}^\beta$ and $r_0(\mu_0^\beta) = \mu_0^\beta$.
(iii) The subset $\{t^n(\mu_z^\beta),\ t^n \circ r_0(\mu_z^\beta): n, z \in \mathbb{Z}\}$ of $\mathcal{G}(\beta\Phi)$ is linearly independent.
We emphasize the following consequence of the theorem above: If $0 \le w \le z \ne 0$ and $0 < s < 1$ then the Gibbs measure $s\mu_w^\beta + (1 - s)\, r_0(\mu_z^\beta)$ for $\beta\Phi$ is different from all its images under the $\Phi$-symmetries $\theta_i$ $(0 \ne i \in S)$, $t^n$ $(0 \ne n \in \mathbb{Z})$, $\tau$, $r_0$, $r_1$, and $r_2$. The proof rests on a modification of the Peierls contour argument which has been used in the preceding section. We fix again a rectangle $\Lambda$ of the form (6.10), and we consider the sets $B$ and $B^*$ of bonds, resp. dual bonds, in $\Lambda$. These have been defined in (6.11) and (6.12). Let $\zeta \in \Omega$, $z \in \mathbb{Z}$, $k \ge 1$. Let us agree to call a circuit $c$ in $B^*$ a $(z,k)$-contour for $\zeta$ if $\zeta_i - \omega_i^z \ge k$ and $\zeta_j - \omega_j^z < k$ whenever $i, j \in S$ are such that $\{i,j\}^* \in c$ and $i$ is surrounded by $c$ whilst $j$ is not surrounded by $c$. (6.22) Lemma. Let $z \in \mathbb{Z}$, $k \ge 1$, and suppose $\zeta \in \Omega$ and $a \in \Lambda$ are such that $\zeta_a - \omega_a^z \ge k$ and $\zeta_{S\setminus\Lambda} = \omega^z_{S\setminus\Lambda}$. Then there exists some $(z,k)$-contour $c$ for $\zeta$ which surrounds $a$. Proof. Same as Lemma (6.14). □
Next we need an estimate for the probability of contours. For each circuit $c$ in $B^*$ we consider the mapping $t_c: \Omega \to \Omega$ defined by
(6.23)
\[ (t_c\zeta)_i = \begin{cases} \zeta_i - 1 & \text{if } i \text{ is surrounded by } c, \\ \zeta_i & \text{otherwise;} \end{cases} \qquad i \in S,\ \zeta \in \Omega. \]
Thus $t_c$ equals the $\Phi$-symmetry $t$ in the interior of $c$ and leaves all spins in the exterior of $c$ unchanged.
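(6.24) Lemma. Let $z \in \mathbb{Z}$, $k \ge 1$, $\zeta \in \Omega$, and $c$ be a $(z,k)$-contour for $\zeta$. Then
\[ H_\Lambda^\Phi(\zeta) - H_\Lambda^\Phi(t_c\zeta) \ge |c|. \]
Proof. For each $b \in B$ with $b^* \notin c$ we have $\Phi_b(\zeta) = \Phi_b(t_c\zeta)$. Thus
\[ H_\Lambda^\Phi(\zeta) - H_\Lambda^\Phi(t_c\zeta) = \sum_{b \in B:\ b^* \in c} \delta(b), \]
where $\delta(b) = \Phi_b(\zeta) - \Phi_b(t_c\zeta)$. For each $b \in B$ with $b^* \in c$ we write $b = \{i_b, j_b\}$, where $i_b$ is surrounded by $c$ and $j_b$ is not. Then
\[ \delta(b) = (\zeta_{i_b} - \zeta_{j_b})^2 - (\zeta_{i_b} - \zeta_{j_b} - 1)^2 = 2(\zeta_{i_b} - \zeta_{j_b}) - 1. \]
We split $c$ into the three disjoint subsets $c^{\rightarrow} = \{b^* \in c: j_b = i_b + (1,0)\}$, $c^{\leftarrow} = \{b^* \in c: j_b = i_b - (1,0)\}$, and $c^{\updownarrow} = \{b^* \in c: j_b = i_b \pm (0,1)\}$. Each horizontal line in $\mathbb{R}^2$ through sites of $S$ meets the polygon in $\mathbb{R}^2$ through the sites in $c$ an even number of times. This shows that $|c^{\rightarrow}| = |c^{\leftarrow}|$. If $b^* \in c^{\updownarrow}$ then
\[ \zeta_{i_b} - \zeta_{j_b} = (\zeta_{i_b} - \omega^z_{i_b}) - (\zeta_{j_b} - \omega^z_{j_b}) \ge k - (k-1) = 1 \]
and thus $\delta(b) \ge 1$. If $b^* \in c^{\leftarrow}$ then
\[ \zeta_{i_b} - \zeta_{j_b} = (\zeta_{i_b} - \omega^z_{i_b}) - (\zeta_{j_b} - \omega^z_{j_b}) + z \ge 1 + z \]
and therefore $\delta(b) \ge 1 + 2z$. Finally, for $b^* \in c^{\rightarrow}$ we have
\[ \zeta_{i_b} - \zeta_{j_b} = (\zeta_{i_b} - \omega^z_{i_b}) - (\zeta_{j_b} - \omega^z_{j_b}) - z \ge 1 - z \]
and thereby $\delta(b) \ge 1 - 2z$. Hence
\[ \sum_{b \in B:\ b^* \in c} \delta(b) \ge |c^{\updownarrow}| + |c^{\leftarrow}|(1 + 2z) + |c^{\rightarrow}|(1 - 2z) = |c^{\updownarrow}| + |c^{\leftarrow}| + |c^{\rightarrow}| = |c|. \]
This proves the lemma. □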
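A concrete instance of the lemma is easy to compute. In the sketch below (an illustration under assumptions made here, not part of the text) the configuration is $\zeta = \omega^z + k$ on a small rectangular region and $\zeta = \omega^z$ elsewhere, so that the boundary of the region is a $(z,k)$-contour; the map $t_c$ lowers the region by one. The box $\{0,\ldots,4\}^2$, the region and the function names are hypothetical choices.

```python
def energy(config, m1, n1, m2, n2):
    """Sum of (sigma_i - sigma_j)^2 over all bonds meeting {m1..n1} x {m2..n2}."""
    total = 0
    for k in range(m1, n1 + 1):              # vertical bonds
        for l in range(m2 - 1, n2 + 1):
            total += (config((k, l + 1)) - config((k, l))) ** 2
    for l in range(m2, n2 + 1):              # horizontal bonds
        for k in range(m1 - 1, n1 + 1):
            total += (config((k + 1, l)) - config((k, l))) ** 2
    return total

def raised_staircase(z, k, region):
    """zeta = omega^z + k on `region`, omega^z elsewhere."""
    return lambda i: z * i[0] + (k if i in region else 0)

def lowered_inside(config, region):
    """The map t_c for the circuit bounding `region`: subtract 1 on region."""
    return lambda i: config(i) - 1 if i in region else config(i)

def contour_length(region):
    """Number of bonds with exactly one endpoint in `region`, i.e. |c|."""
    count = 0
    for (x, y) in region:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (x + dx, y + dy) not in region:
                count += 1
    return count

def energy_drop(z, k, region, box=(0, 4, 0, 4)):
    """H(zeta) - H(t_c zeta) for the raised staircase on `region`."""
    zeta = raised_staircase(z, k, region)
    return energy(zeta, *box) - energy(lowered_inside(zeta, region), *box)

region = {(x, y) for x in (1, 2) for y in (1, 2, 3)}
```

For this special family the terms $\pm 2z$ from the left and right sides of the contour cancel exactly, and the drop equals $(2k-1)|c|$, which attains the bound $|c|$ when $k = 1$.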
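For $\beta > 0$ we set
\[ r(\beta) = 1 \wedge 2 \sum_{\ell \ge 1} \ell\, 3^{\ell} e^{-\beta\ell}. \]
Note that $r(\beta) \to 0$ as $\beta \to \infty$. (6.25) Lemma. For all $\beta > 0$, $z \in \mathbb{Z}$, $k \ge 1$, and $a \in \Lambda$ we have
\[ \gamma_\Lambda^{\beta\Phi}(|\sigma_a - \omega_a^z| \ge k \mid \omega^z) \le r(\beta)^k. \]
Proof. In the case $r(\beta) = 1$ there is nothing to show. Thus we assume $r(\beta) < 1$. From the $\tau$-invariance of $\Phi$ we obtain
\[ \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \le -k \mid \omega^z) = \gamma_\Lambda^{\beta\Phi}(-\sigma_a - \omega_a^{-z} \ge k \mid \omega^z) = \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^{-z} \ge k \mid \omega^{-z}). \]
Therefore we only need to prove that
\[ \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k \mid \omega^z) \le r(\beta)^k/2 \]
for all $z \in \mathbb{Z}$, $k \ge 1$, and $a \in \Lambda$. Lemma (6.22) asserts that the probability on the left is dominated by
\[ \sum_{c \in C_a} \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k,\ c \text{ is a } (z,k)\text{-contour} \mid \omega^z). \]
(Recall that $C_a$ stands for the set of all circuits surrounding $a$.) For fixed $c \in C_a$, $t_c$ is an injection from
\[ A_1 = \{\zeta \in \Omega: \zeta_a - \omega_a^z \ge k,\ c \text{ is a } (z,k)\text{-contour for } \zeta,\ \zeta_{S\setminus\Lambda} = \omega^z_{S\setminus\Lambda}\} \]
into
\[ A_2 = \{\zeta \in \Omega: \zeta_a - \omega_a^z \ge k - 1,\ \zeta_{S\setminus\Lambda} = \omega^z_{S\setminus\Lambda}\}. \]
Thus Lemma (6.24) gives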
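\[ Z_\Lambda^{\beta\Phi}(\omega^z)\, \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k,\ c \text{ is a } (z,k)\text{-contour} \mid \omega^z) = \sum_{\zeta \in A_1} \exp(-\beta H_\Lambda^\Phi(\zeta)) \le \sum_{\zeta \in A_1} \exp(-\beta H_\Lambda^\Phi(t_c\zeta) - \beta|c|) \le e^{-\beta|c|} \sum_{\zeta \in A_2} \exp(-\beta H_\Lambda^\Phi(\zeta)). \]
Using Lemma (6.13) we arrive at the estimate
\[ \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k \mid \omega^z) \le \sum_{\ell \ge 1} \ell\, 3^{\ell} e^{-\beta\ell}\, \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k - 1 \mid \omega^z) = \tfrac12\, r(\beta)\, \gamma_\Lambda^{\beta\Phi}(\sigma_a - \omega_a^z \ge k - 1 \mid \omega^z). \]
Iterating this inequality we obtain the result. □ Proof of Theorem (6.21). Let $\beta$ be so large that $r(\beta) < 1/2$. We consider the squares $\Lambda_N = S \cap [-N, N]^2$, $N \ge 1$. For each $z \in \mathbb{Z}$ we define
\[ \nu_{N,z}^\beta = |\Lambda_N|^{-1} \sum_{i \in \Lambda_N} \gamma_{\Lambda_N + i}^{\beta\Phi}(\cdot \mid \omega^z). \]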
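Lemma (6.25) shows that
\[ \nu_{N,z}^\beta(|\sigma_a - \omega_a^z| \ge k) \le r(\beta)^k \]
for all $a \in S$ and $k \ge 1$. In particular,
\[ \lim_{k\to\infty} \sup_{N \ge 1} \nu_{N,z}^\beta(|\sigma_a| \ge k) = 0 \]
for all $a \in S$. Thus Corollary (4.13) implies that the sequence $(\nu_{N,z}^\beta)_{N \ge 1}$ has a cluster point $\mu_z^\beta$. Comment (4.18) shows that $\mu_z^\beta \in \mathcal{G}(\beta\Phi)$. Assertion (ii) follows immediately from Example (5.20)(1). To prove (i) we observe that
\[ \mu_z^\beta(|\sigma_a - \omega_a^z| \ge k) \le r(\beta)^k \]
for all $a \in S$ and $k \ge 1$. Thus $\mu_z^\beta(|\sigma_a|) < \infty$ and $\lim_{\beta\to\infty} \mu_z^\beta(f) = \delta_z(f)$ for all $f \in \mathcal{L}$. Since $\mu_z^\beta$ is preserved by $r_1 \circ \tau$, $\mu_z^\beta(\sigma_0) = 0$. The invariance of $\mu_z^\beta$ under $\theta_{(0,1)}$ and $t^{-z} \circ \theta_{(1,0)}$ thus gives
\[ \mu_z^\beta(\sigma_a) = \mu_z^\beta(\sigma_0 + z a_1) = z a_1 = \omega_a^z \]
for all $a = (a_1, a_2) \in S$.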
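Finally we turn to the proof of (iii). We shall prove the slightly stronger statement that it is impossible to find real numbers $a_{n,z}$, $b_{n,z}$ $(n, z \in \mathbb{Z})$ such that
\[ 0 < \sum_{n,z} (|a_{n,z}| + |b_{n,z}|) < \infty, \qquad \sum_{n,z} \bigl[a_{n,z}\, t^n(\mu_z^\beta) + b_{n,z}\, t^n \circ r_0(\mu_z^\beta)\bigr] = 0, \]
and $b_{n,0} = 0$ for all $n \in \mathbb{Z}$. (The last condition takes account of the fact that $r_0(\mu_0^\beta) = \mu_0^\beta$.) Suppose such numbers existed. Then
\[ \sum_{n,z} (a_{n,z} + b_{n,z}) = 0 \]
because $\mu_z^\beta(\Omega) = 1$. Without loss we may assume that
\[ \sum_{n,z} (a_{n,z}^+ + b_{n,z}^+) = \sum_{n,z} (a_{n,z}^- + b_{n,z}^-) = 1; \]
here $a^+ = a \vee 0$ and $a^- = (-a)^+$. We choose a number $\delta > 0$ with $2\delta < 1 - 2r(\beta)$ and an integer $N > 1$ such that
\[ \sum_{|n| \vee |z| \ge N} (|a_{n,z}| + |b_{n,z}|) < \delta. \]
We also define the sets
\[ I_\pm = \{2Nz - n: |n| \vee |z| < N,\ \pm a_{n,z} > 0\} \cup \{4N^2 z - n: |n| \vee |z| < N,\ \pm b_{n,z} > 0\}. \]
First of all, we claim that $I_+ \cap I_- = \emptyset$. For suppose the contrary is true. Then there exist integers $-N < m, w, n, z < N$ such that either $2N(z - w) = n - m$ and $(m,w) \ne (n,z)$; or $4N^2(z - w) = n - m$ and $(m,w) \ne (n,z)$; or $4N^2 z = 2Nw - m + n$ and $z \ne 0$. Looking at absolute values we see that each of these three equations is impossible. We consider the sites $i = (2N, 4N^2)$ and $j = (4N^2, -2N)$, and we let $-N < n, z < N$ be given. If $a_{n,z} > 0$ then
\[ t^n(\mu_z^\beta)(\sigma_i \in I_+) \ge t^n(\mu_z^\beta)(\sigma_i = 2Nz - n) = \mu_z^\beta(\sigma_i = \omega_i^z) \ge 1 - r(\beta). \]
Similarly, if $b_{n,z} > 0$ then
\[ t^n \circ r_0(\mu_z^\beta)(\sigma_i \in I_+) \ge t^n \circ r_0(\mu_z^\beta)(\sigma_i = 4N^2 z - n) = \mu_z^\beta(\sigma_j = \omega_j^z) \ge 1 - r(\beta). \]
On the other hand, if $a_{n,z} < 0$ then
\[ t^n(\mu_z^\beta)(\sigma_i \in I_+) \le t^n(\mu_z^\beta)(\sigma_i \ne 2Nz - n) = \mu_z^\beta(\sigma_i \ne \omega_i^z) \le r(\beta), \]
and for $b_{n,z} < 0$ we have
\[ t^n \circ r_0(\mu_z^\beta)(\sigma_i \in I_+) \le \mu_z^\beta(\sigma_j \ne \omega_j^z) \le r(\beta). \]
Thus we obtain the contradiction
\[ 0 = \sum_{n,z} \bigl[a_{n,z}\, t^n(\mu_z^\beta)(\sigma_i \in I_+) + b_{n,z}\, t^n \circ r_0(\mu_z^\beta)(\sigma_i \in I_+)\bigr] \ge \sum_{|n| \vee |z| < N} \bigl[(a_{n,z}^+ + b_{n,z}^+)(1 - r(\beta)) - (a_{n,z}^- + b_{n,z}^-)\, r(\beta)\bigr] - \delta \ge (1 - \delta)(1 - r(\beta)) - r(\beta) - \delta \ge 1 - 2r(\beta) - 2\delta > 0. \]
This completes the proof of Theorem (6.21). □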
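The arithmetic behind the claim $I_+ \cap I_- = \emptyset$ can be confirmed by brute force for small $N$. The sketch below assumes the labels are exactly the numbers $2Nz - n$ (attached to the coefficients $a_{n,z}$) and $4N^2 z - n$ (attached to the $b_{n,z}$ with $z \ne 0$, since $b_{n,0} = 0$), for $|n|, |z| < N$; the function names are illustrative. If all these numbers are pairwise distinct, no number can carry both a positive and a negative coefficient.

```python
def index_values(N):
    """The labels 2Nz - n and 4N^2 z - n for |n|, |z| < N, keyed by (n, z).

    The second family omits z = 0 because the coefficients b_{n,0} vanish.
    """
    window = range(-N + 1, N)
    a_vals = {(n, z): 2 * N * z - n for n in window for z in window}
    b_vals = {(n, z): 4 * N * N * z - n
              for n in window for z in window if z != 0}
    return a_vals, b_vals

def labels_are_distinct(N):
    """True if no integer arises from two different labels."""
    a_vals, b_vals = index_values(N)
    everything = list(a_vals.values()) + list(b_vals.values())
    return len(set(everything)) == len(everything)
```

Distinctness rests on the three size comparisons used in the proof: $|n - m| < 2N$, $|n - m| < 4N^2$, and $|2Nw - m + n| < 4N^2$.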
To conclude this section we mention that Lemma (6.24) and therefore Theorem (6.21) can easily be extended to the class of potentials which arise if we replace the function x -> x2 in (6.16) by a strictly convex, even function on Z. Of course, the results of this section can also be extended to higher dimensions (by looking at closed hypersurfaces instead of circuits). Finally, we note that some further aspects of this model will be discussed below in (9.17) and (9.25).
Chapter 7 Extreme Gibbs measures
In Chapter 6 we have seen that $\mathcal{G}(\gamma)$ is not necessarily a singleton. This fact leads us to examine the question of the structure of $\mathcal{G}(\gamma)$ in general. By the very definition of $\mathcal{G}(\gamma)$, this problem is a special case of a more general question: What is the structure of the family of all probability measures that are preserved by a family of probability kernels? This question is by now a classical topic of ergodic theory, especially the ergodic theory of Markov processes. Therefore we may (and will) take advantage of some results and ideas that were developed in that context. The basic observation is that $\mathcal{G}(\gamma)$ is a convex set. Therefore our interest concentrates on the extreme elements of $\mathcal{G}(\gamma)$, and the question above will be specified as follows. I) What are the special properties of the extreme elements of $\mathcal{G}(\gamma)$? II) Is $\mathcal{G}(\gamma)$ uniquely determined by its extreme elements? As for question I, we shall see that the extreme elements of $\mathcal{G}(\gamma)$ are characterized by a zero-one law for all $\gamma$-invariant events. The latter are nothing but the tail events. Therefore the extreme elements of $\mathcal{G}(\gamma)$ may be interpreted as the possible thermodynamic phases of a physical system that is modelled by $\gamma$. Each such phase can be approximately rediscovered in large but finite volumes with suitable external boundary conditions. All of this will be the subject of Section 7.1. Section 7.2 is devoted to four applications. In particular, we shall deal with exchangeable distributions and with the product of two specifications. In Section 7.3 we shall turn to question II. Under fairly general conditions we shall show that $\mathcal{G}(\gamma)$ is a simplex: Each element $\mu$ of $\mathcal{G}(\gamma)$ can be uniquely decomposed into extreme elements of $\mathcal{G}(\gamma)$, in that $\mu$ is the barycenter of a unique probability measure $w_\mu$ on the extreme boundary of $\mathcal{G}(\gamma)$. The construction of $w_\mu$ is very similar to the construction of an entrance boundary of a Markov process.
In the final Section 7.4 we shall apply this result to pairs $\gamma$, $\tilde\gamma$ of specifications that are microscopic perturbations of each other. We shall show that then $\mathcal{G}(\gamma)$ and $\mathcal{G}(\tilde\gamma)$ are isomorphic and macroscopically identical in some sense. From a logical point of view, Sections 7.2 and 7.4 will not be needed for the understanding of later chapters. On a first reading one should thus concentrate on Sections 7.1 and 7.3. A few results of this chapter will depend on the backward martingale convergence theorem. A reader who is not familiar with this theorem is referred to Bauer (1981), Corollary 60.9, or Breiman (1968), Theorem 5.24, or any other (non-elementary) introduction to probability theory.
7.1
Tail triviality and approximation
We consider the general setting of Chapter 1. We are thus given a countably infinite product space $(\Omega, \mathcal{F}) = (E, \mathcal{E})^S$, and we look at the set $\mathcal{G}(\gamma)$ of all Gibbs measures for a given specification $\gamma$. We start from the observation that $\mathcal{G}(\gamma)$ is convex, i.e., if $\mu, \nu \in \mathcal{G}(\gamma)$ and $0 < s < 1$ then $s\mu + (1 - s)\nu \in \mathcal{G}(\gamma)$. The most interesting elements of a convex set are its extreme points. (7.1) Definition. An element $\mu$ of a convex subset $\mathcal{C}$ of any real vector space is said to be extreme in $\mathcal{C}$ if $\mu \ne s\nu + (1 - s)\nu'$ for all $0 < s < 1$ and $\nu, \nu' \in \mathcal{C}$ with $\nu \ne \nu'$. The set of all extreme elements of $\mathcal{C}$ will be denoted by $\operatorname{ex} \mathcal{C}$ and is called the extreme boundary of $\mathcal{C}$. First of all we note that the extreme boundary of $\mathcal{G}(\gamma)$ remains invariant under symmetries of $\gamma$: This fact is a complement to Remark (5.10). (7.2) Remark. Let $\gamma$ be a specification and $\tau \in T$ be a transformation of $\Omega$. If $\mu \in \operatorname{ex} \mathcal{G}(\gamma)$ then $\tau(\mu) \in \operatorname{ex} \mathcal{G}(\tau(\gamma))$. In particular, each symmetry of $\gamma$ maps $\operatorname{ex} \mathcal{G}(\gamma)$ onto itself. Proof. Suppose $\tau(\mu) = s\nu + (1 - s)\nu'$, where $0 < s < 1$, $\nu, \nu' \in \mathcal{G}(\tau(\gamma))$, and $\nu \ne \nu'$. Applying $\tau^{-1}$ to this equation and using Remark (5.10) to conclude that $\tau^{-1}(\nu), \tau^{-1}(\nu') \in \mathcal{G}(\gamma)$, we obtain a contradiction to the extremality of $\mu$. □
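The main objective of this section is to obtain a characterization of the extreme elements of $\mathcal{G}(\gamma)$ by a zero-one law for the tail events. This characterization rests on a general proposition which is well-known in ergodic theory. (7.3) Proposition. Let $(\Omega, \mathcal{F})$ be a measurable space, $\pi$ a probability kernel from $\mathcal{F}$ to $\mathcal{F}$, and $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ with $\mu\pi = \mu$. Then the system
\[ \mathcal{I}_\pi(\mu) = \{A \in \mathcal{F}: \pi(A \mid \cdot) = 1_A\ \mu\text{-a.s.}\} \]
of all $\mu$-almost surely $\pi$-invariant sets is a $\sigma$-algebra, and for all measurable functions $f: \Omega \to [0, \infty[$ we have $(f\mu)\pi = f\mu$ if and only if $f$ is $\mathcal{I}_\pi(\mu)$-measurable. Proof. 1) $\mathcal{I}_\pi(\mu)$ obviously contains $\Omega$ and is stable under the formation of complements and countable disjoint unions. In order to show that $\mathcal{I}_\pi(\mu)$ is a $\sigma$-algebra we therefore only need to check that $\mathcal{I}_\pi(\mu)$ is stable under intersections. We take any two sets $A, B \in \mathcal{I}_\pi(\mu)$. Then
\[ \pi(A \cap B \mid \cdot) \le \pi(A \mid \cdot) \wedge \pi(B \mid \cdot) = 1_A \wedge 1_B = 1_{A \cap B} \quad \mu\text{-a.s.} \]
and
\[ \mu(1_{A \cap B} - \pi(A \cap B \mid \cdot)) = \mu(A \cap B) - \mu\pi(A \cap B) = 0. \]
Thus $A \cap B \in \mathcal{I}_\pi(\mu)$. 2) Suppose $(f\mu)\pi = f\mu$, and let $c > 0$. We will show that $\{f \ge c\} \in \mathcal{I}_\pi(\mu)$. Writing $g = 1_{\{f \ge c\}}$, we have
\[ \int_{\{f < c\}} d\mu\, f\, \pi g = \mu(f\, \pi g) - \mu(fg\, \pi g) = (f\mu)\pi(g) - \mu(fg\, \pi g) = (f\mu)(g) - \mu(fg\, \pi g) = \mu(fg(1 - \pi g)). \]
Since $fg \ge cg$ and $\pi g \le 1$, the last expression is at least
\[ c\mu(g(1 - \pi g)) = c\mu\pi(g) - c\mu(g\, \pi g) = c \int_{\{f < c\}} d\mu\, \pi g. \]
Thus
\[ \int_{\{f < c\}} d\mu\, (f - c)\, \pi g \ge 0 \]
and therefore $1_{\{f < c\}}\, \pi g = 0$ $\mu$-a.s., that is, $\pi g \le g$ $\mu$-a.s.. As $\mu(g - \pi g) = 0$, $\pi g = g$ $\mu$-almost surely. Hence $\{f \ge c\} \in \mathcal{I}_\pi(\mu)$. This proves that $f$ is $\mathcal{I}_\pi(\mu)$-measurable. 3) Conversely, suppose $f$ is $\mathcal{I}_\pi(\mu)$-measurable. We will show that $(f\mu)\pi = f\mu$. $f$ is the limit of an increasing sequence of $\mathcal{I}_\pi(\mu)$-measurable step functions. Therefore it is sufficient to prove that $(1_A\mu)\pi = 1_A\mu$ for all $A \in \mathcal{I}_\pi(\mu)$. We choose any $B \in \mathcal{F}$. Then
\[ (1_A\mu)\pi(B) = (1_A\mu)\pi(A \cap B) + (1_A\mu)\pi(B \setminus A) \le \mu\pi(A \cap B) + \mu(1_A\, \pi(\Omega \setminus A \mid \cdot)) = \mu(A \cap B) + \mu(1_A 1_{\Omega \setminus A}) = (1_A\mu)(B). \]
Similarly, $(1_A\mu)\pi(\Omega \setminus B) \le (1_A\mu)(\Omega \setminus B)$. This implies $(1_A\mu)\pi(B) = (1_A\mu)(B)$ because
\[ (1_A\mu)\pi(B) + (1_A\mu)\pi(\Omega \setminus B) = \mu(A) = (1_A\mu)(B) + (1_A\mu)(\Omega \setminus B). \]
The proof is thus complete. □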
We will say a probability measure $\mu$ is trivial on a $\sigma$-algebra $\mathcal{A}$, or $\mathcal{A}$ is trivial relative to $\mu$, if $\mu$ satisfies the zero-one law $\mu(A) = 0$ or $1$ for all $A \in \mathcal{A}$. (7.4) Corollary. Let $(\Omega, \mathcal{F})$ be a measurable space, $\Pi$ a non-empty set of probability kernels from $\mathcal{F}$ to $\mathcal{F}$, and
\[ \mathcal{P}_\Pi = \{\mu \in \mathcal{P}(\Omega, \mathcal{F}): \mu\pi = \mu \text{ for all } \pi \in \Pi\} \]
the convex set of all $\Pi$-invariant probability measures on $(\Omega, \mathcal{F})$. Let $\mu \in \mathcal{P}_\Pi$ be given and $\mathcal{I}_\Pi(\mu) = \bigcap_{\pi \in \Pi} \mathcal{I}_\pi(\mu)$ be the $\sigma$-algebra of all $\mu$-almost surely $\Pi$-invariant sets. Then $\mu$ is extreme in $\mathcal{P}_\Pi$ if and only if $\mu$ is trivial on $\mathcal{I}_\Pi(\mu)$.
Proof. 1) Suppose there exists a set $A \in \mathcal{I}_\Pi(\mu)$ such that $0 < \mu(A) < 1$. Then we may look at the conditional probabilities
\[ \nu = \mu(\cdot \mid A) = f\mu, \qquad \nu' = \mu(\cdot \mid \Omega \setminus A) = f'\mu, \]
where $f = 1_A/\mu(A)$ and $f' = 1_{\Omega \setminus A}/\mu(\Omega \setminus A)$. Clearly, $\nu \ne \nu'$ and $\mu = \mu(A)\nu + (1 - \mu(A))\nu'$. But Proposition (7.3) asserts that $\nu, \nu' \in \mathcal{P}_\Pi$ because $f$ and $f'$ are $\mathcal{I}_\pi(\mu)$-measurable for all $\pi \in \Pi$. Thus $\mu$ is not extreme. 2) Conversely, suppose $\mu$ is trivial on $\mathcal{I}_\Pi(\mu)$ and has a representation $\mu = s\nu + (1 - s)\nu'$ with $0 < s < 1$ and $\nu, \nu' \in \mathcal{P}_\Pi$. Then $\nu$ is absolutely continuous with respect to $\mu$. Thus $\nu = f\mu$ for some measurable function $f \ge 0$. From Proposition (7.3) we conclude that $f$ is $\mathcal{I}_\Pi(\mu)$-measurable, and the $\mu$-triviality of $\mathcal{I}_\Pi(\mu)$ implies $f = \mu(f) = 1$ $\mu$-a.s.. Hence $\nu = \mu$ and also $\nu' = \mu$. This shows that $\mu$ is extreme. □ The reader will wonder if Corollary (7.4) remains true when $\mathcal{I}_\Pi(\mu)$ is replaced by the $\sigma$-algebra
\[ \mathcal{I}_\Pi = \{A \in \mathcal{F}: \pi(A \mid \cdot) = 1_A \text{ for all } \pi \in \Pi\} \]
of all strictly $\Pi$-invariant sets. (The proof that $\mathcal{I}_\Pi$ is a $\sigma$-algebra is similar to the argument below (1.19).) However, simple examples show that this is not necessarily the case. We include such an example as an aside. (7.5) Example. Let $\Omega = \{1, 2, 3\}$ and $\pi$ be the stochastic matrix
\[ \begin{pmatrix} 1 & 0 & 0 \\ 1/2 & 0 & 1/2 \\ 0 & 0 & 1 \end{pmatrix}. \]
Then $\mathcal{P}_\pi = \{(s, 0, 1 - s): 0 \le s \le 1\}$ is the set of all $\pi$-invariant probability vectors, and $\operatorname{ex} \mathcal{P}_\pi = \{(1,0,0), (0,0,1)\}$. On the other hand, a function $f$ on $\Omega$ satisfies $\pi f = f$ if and only if $2f(2) = f(1) + f(3)$. Thus the $\sigma$-algebra of all strictly $\pi$-invariant events is $\mathcal{I}_\pi = \{\Omega, \emptyset\}$. Consequently, each $\mu \in \mathcal{P}_\pi$ is trivial on $\mathcal{I}_\pi$. Thus $\mathcal{I}_\pi$-triviality does not imply extremality. But if $\mu \in \mathcal{P}_\pi$ then $\mathcal{I}_\pi(\mu)$ is the power set of $\Omega$. Thus $\mathcal{I}_\pi(\mu)$-triviality clearly implies extremality. ○
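Since Example (7.5) lives on three points, its claims can be verified by direct enumeration. The following sketch uses exact rational arithmetic; the state labels 0, 1, 2 stand for the points 1, 2, 3 of $\Omega$, and the function names are illustrative choices made here.

```python
from fractions import Fraction as Fr
from itertools import combinations

STATES = (0, 1, 2)                      # stands for Omega = {1, 2, 3}
PI = [[Fr(1), Fr(0), Fr(0)],
      [Fr(1, 2), Fr(0), Fr(1, 2)],
      [Fr(0), Fr(0), Fr(1)]]

def push(mu):
    """The measure mu*pi (row vector times matrix)."""
    return [sum(mu[i] * PI[i][j] for i in STATES) for j in STATES]

def kernel_of(event):
    """The function omega -> pi(event | omega), as a list over the states."""
    return [sum(PI[i][j] for j in event) for i in STATES]

def events():
    """All subsets of the state space."""
    return [frozenset(c) for r in range(4) for c in combinations(STATES, r)]

def strictly_invariant():
    """Events A with pi(A | .) = 1_A everywhere."""
    return [A for A in events()
            if kernel_of(A) == [Fr(int(i in A)) for i in STATES]]

def almost_surely_invariant(mu):
    """Events A with pi(A | .) = 1_A outside a mu-null set."""
    return [A for A in events()
            if all(mu[i] == 0 or kernel_of(A)[i] == int(i in A)
                   for i in STATES)]

mixture = [Fr(1, 3), Fr(0), Fr(2, 3)]   # a non-extreme invariant vector
```

The vector `mixture` is invariant and trivial on the strictly invariant events, yet it is a proper convex combination of $(1,0,0)$ and $(0,0,1)$; the almost surely invariant events, by contrast, separate the two extreme points.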
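Fortunately, in all cases in which we are interested $\mathcal{I}_\Pi(\mu)$ may indeed be replaced by $\mathcal{I}_\Pi$ because $\mathcal{I}_\Pi(\mu)$ turns out to be the $\mu$-completion of $\mathcal{I}_\Pi$. For example, this holds when $\Pi$ is a specification. Recall definition (2.19) of the tail $\sigma$-field $\mathcal{T}$. (7.6) Remark. Suppose $\gamma$ is a specification. Then $\mathcal{I}_\gamma = \mathcal{T}$. If $\mu \in \mathcal{G}(\gamma) = \mathcal{P}_\gamma$ then $\mathcal{I}_\gamma(\mu)$ is the $\mu$-completion of $\mathcal{T}$. Proof. 1) If $A \in \mathcal{T}$ then $\gamma_\Lambda(A \mid \cdot) = 1_A$ for all $\Lambda \in \mathcal{S}$ because all $\gamma_\Lambda$'s are proper. Conversely, if $A \in \mathcal{I}_\gamma$ then $A = \{\gamma_\Lambda(A \mid \cdot) = 1\} \in \mathcal{T}_\Lambda$ for all $\Lambda \in \mathcal{S}$ and therefore $A \in \mathcal{T}$. 2) Let $\mu \in \mathcal{G}(\gamma)$ and $A \in \mathcal{I}_\gamma(\mu)$. Then
\[ B = \bigcap_{\Lambda \in \mathcal{S}}\ \bigcup_{\Lambda \subset \Delta \in \mathcal{S}} \{\gamma_\Delta(A \mid \cdot) = 1\} \]
belongs to $\mathcal{T}$, and $\mu(A \,\triangle\, B) = 0$ because
\[ 1_B = \limsup_{\Delta} 1_{\{\gamma_\Delta(A \mid \cdot) = 1\}} = \limsup_{\Delta} 1_A = 1_A \quad \mu\text{-a.s..} \]
□
The preceding remark implies in particular that $\mathcal{I}_\gamma(\mu)$ is $\mu$-trivial if and only if $\mathcal{T}$ is $\mu$-trivial. Consequently, Corollary (7.4) implies assertion (a) of the following theorem. (7.7) Theorem. Let $\gamma$ be a specification. Then the following conclusions hold.
(a) A Gibbs measure $\mu \in \mathcal{G}(\gamma)$ is extreme in $\mathcal{G}(\gamma)$ if and only if $\mu$ is trivial on the tail $\sigma$-field $\mathcal{T}$.
(b) If $\mu \in \mathcal{G}(\gamma)$ and $\nu \in \mathcal{P}(\Omega, \mathcal{F})$ is absolutely continuous with respect to $\mu$ then $\nu \in \mathcal{G}(\gamma)$ if and only if $\nu = f\mu$ for some $\mathcal{T}$-measurable function $f \ge 0$.
(c) Each $\mu \in \mathcal{G}(\gamma)$ is uniquely determined (within $\mathcal{G}(\gamma)$) by its restriction to the tail $\sigma$-field $\mathcal{T}$.
(d) Distinct extreme elements $\mu, \nu$ of $\mathcal{G}(\gamma)$ are mutually singular on $\mathcal{T}$, in that there is some $A \in \mathcal{T}$ with $\mu(A) = 1$, $\nu(A) = 0$.
Proof. (a) This has already been proved. (b) Since $\mathcal{I}_\gamma(\mu)$ is the $\mu$-completion of $\mathcal{T}$, each $\mathcal{I}_\gamma(\mu)$-measurable function is $\mu$-almost surely equal to a $\mathcal{T}$-measurable function. Thus $\nu = f\mu$ for a $\mathcal{T}$-measurable function if and only if $\nu = f\mu$ for an $\mathcal{I}_\gamma(\mu)$-measurable function. Therefore the assertion follows from Proposition (7.3) and the Radon-Nikodym theorem. (c) Let $\mu, \nu \in \mathcal{G}(\gamma)$ be such that $\mu = \nu$ on $\mathcal{T}$. Then $\bar\mu = (\mu + \nu)/2 \in \mathcal{G}(\gamma)$, and assertion (b) implies that $\mu = f\bar\mu$ and $\nu = g\bar\mu$ for $\mathcal{T}$-measurable functions $f, g \ge 0$. But $\mu = \nu = \bar\mu$ on $\mathcal{T}$. Thus $f = g = 1$ $\bar\mu$-a.s. and therefore $\mu = \nu$. (d) This is an immediate consequence of (a) and (c). □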
At this stage we should comment on the physical meaning of extreme Gibbs measures. (7.8) Comment. Suppose we are observing a well-defined state of a real physical system in equilibrium. We will find that the microscopic quantities are subject to rapid fluctuations, whereas the macroscopic quantities remain constant (on a "human" time scale and within the given bounds of accuracy). Suppose we want to describe the observed state by a mathematical object. Because of the microscopic fluctuations we are led to a probabilistic point of view: We shall try to describe the state by a probability measure $\mu$. Of course, $\mu$ should be consistent with the observed empirical distributions of the microscopic variables. According to the basic principles of Statistical Mechanics, this can be achieved by assuming that $\mu$ is a Gibbs measure for a suitably chosen Gibbs specification $\gamma$; cf. the Introduction. There is, however, a further requirement on $\mu$: $\mu$ should be such that the macroscopic quantities are non-random. In Section 2.2 we argued that the macroscopic quantities are just the tail measurable functions. Consequently, the tail measurable functions should be constant $\mu$-almost surely, and this means that $\mu$ should be trivial on $\mathcal{T}$. Theorem (7.7)(a) thus tells us that the system's state will be described by a suitable extreme element of $\mathcal{G}(\gamma)$. (According to assertion (7.7)(c), there exists but one extreme element of $\mathcal{G}(\gamma)$ which exhibits the right macroscopic behaviour.) From this we conclude that only extreme Gibbs measures are suitable to describe an equilibrium state of a real system. In more catchy terms we may say that a physical system will always pick an extreme Gibbs measure for its equilibrium state. For this reason, an extreme Gibbs measure is often called a phase. This somewhat vague term should not be confused with the physical concept of a pure phase.
In fact, the stable coexistence of distinct pure phases in separated regions of space (such as in Figure 7.1a below) will also be represented by an extreme Gibbs measure. This can be seen quite nicely in the three-dimensional Ising model; cf. Dobrushin (1973a) and the survey below Theorem (6.9). It is a tempting misunderstanding to believe that the coexistence of two pure phases (which are considered to be described by two extreme Gibbs measures $\mu_1$ and $\mu_2$) was described by a mixture like $(\mu_1 + \mu_2)/2$. Such a mixture rather corresponds to an uncertainty about the true phase of the system, as will be explained in more detail at (7.27) below. The following pictures which are taken from Aizenman (1980b) may serve as illustration.
Figure 7.1a An extreme $\mu$, coexistence
Figure 7.1b $(\mu_{\mathrm{water}} + \mu_{\mathrm{ice}})/2$
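Let us return to mathematics. The next proposition gives a fairly standard characterization of probability measures with a trivial tail $\sigma$-field. Because of this characterization, such measures are sometimes said to have short-range correlations. (7.9) Proposition. For each $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ the following statements are equivalent. (i) $\mu$ is trivial on $\mathcal{T}$. (ii) For all cylinder events $A$ (or, equivalently, for all $A \in \mathcal{F}$),
\[ \lim_{\Lambda \in \mathcal{S}}\ \sup_{B \in \mathcal{T}_\Lambda} |\mu(A \cap B) - \mu(A)\mu(B)| = 0. \]
Proof. 1) Suppose $\mu$ is trivial on $\mathcal{T}$. Let $A \in \mathcal{F}$ be given. The backward martingale convergence theorem asserts that for each cofinal increasing sequence $(\Lambda_n)_{n \ge 1}$ in $\mathcal{S}$
\[ \mu(A \mid \mathcal{T}_{\Lambda_n}) \to \mu(A \mid \mathcal{T}) \]
in the $L^1(\mu)$-sense. As $\mu$ is trivial on $\mathcal{T}$, $\mu(A \mid \mathcal{T}) = \mu(A)$ $\mu$-a.s.. Thus for each $\varepsilon > 0$ there is some $\Lambda \in \mathcal{S}$ such that $\mu(|\mu(A \mid \mathcal{T}_\Lambda) - \mu(A)|) < \varepsilon$. For all $\Lambda \subset \Delta \in \mathcal{S}$ we have
\[ \sup_{B \in \mathcal{T}_\Delta} |\mu(A \cap B) - \mu(A)\mu(B)| = \sup_{B \in \mathcal{T}_\Delta} \Bigl| \int_B d\mu\, (\mu(A \mid \mathcal{T}_\Lambda) - \mu(A)) \Bigr| \le \mu(|\mu(A \mid \mathcal{T}_\Lambda) - \mu(A)|) < \varepsilon. \]
This proves (ii). 2) Suppose assertion (ii) holds for all cylinder events $A$. Then $\mu(A \cap B) = \mu(A)\mu(B)$ whenever $B \in \mathcal{T}$ and $A$ is a cylinder event. For fixed $B \in \mathcal{T}$, the system $\mathcal{D}$ of all $A \in \mathcal{F}$ with $\mu(A \cap B) = \mu(A)\mu(B)$ is a Dynkin system. That is, $\mathcal{D}$ satisfies the conditions (i) $\Omega \in \mathcal{D}$; (ii) $A_2 \setminus A_1 \in \mathcal{D}$ when $A_1, A_2 \in \mathcal{D}$ and $A_1 \subset A_2$; and (iii) $\bigcup_{n \ge 1} A_n \in \mathcal{D}$ when $A_n \in \mathcal{D}$ $(n \ge 1)$ are pairwise disjoint. Since $\mathcal{D}$ contains the cylinder events, $\mathcal{D} = \mathcal{F}$; see Theorem 2.3 of Bauer (1981), for example. Hence $B \in \mathcal{D}$, and this implies that $\mu(B) = \mu(B)^2$ and thereby $\mu(B) = 0$ or $1$. Thus $\mu$ is trivial on $\mathcal{T}$. □ It is natural to ask if the extreme elements $\mu$ of $\mathcal{G}(\gamma)$ even satisfy the stronger regularity property
(7.10)
\[ \lim_{\Lambda \in \mathcal{S}}\ \sup_{B \in \mathcal{T}_\Lambda:\ \mu(B) > 0} |\mu(A \mid B) - \mu(A)| = 0 \]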
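for all cylinder events $A$. For example, if $\gamma$ is the non-quasilocal specification of Example (2.27) then (7.10) holds for all $\mu \in \operatorname{ex} \mathcal{G}(\gamma)$ because these are the Bernoulli measures. However, if $\gamma$ is Gibbsian for a uniformly convergent potential then (7.10) can only be satisfied for some $\mu \in \mathcal{G}(\gamma)$ when $|\mathcal{G}(\gamma)| = 1$. This follows from the next proposition. (7.11) Proposition. Let $\lambda \in \mathcal{M}(E, \mathcal{E})$ and $\gamma = \rho\lambda$ be a quasilocal $\lambda$-specification. (a) If $\rho$ is positive and (7.10) holds for some $\mu \in \mathcal{G}(\gamma)$ then $\mathcal{G}(\gamma) = \{\mu\}$. (b) Conversely, suppose $(E, \mathcal{E})$ is standard Borel and, for simplicity, all $\rho_\Lambda$'s are bounded and $\lambda$ is finite. If $\mathcal{G}(\gamma) = \{\mu\}$ then $\gamma_\Lambda(\cdot \mid \omega) \xrightarrow{\mathcal{L}} \mu$ uniformly in $\omega \in \Omega$, and $\mu$ satisfies (7.10). Proof. (a) Suppose $\mu \in \mathcal{G}(\gamma)$ satisfies (7.10) and $\nu \in \mathcal{G}(\gamma)$ is arbitrary. We show that $\mu = \nu$. Let $A$ be a cylinder event and $\varepsilon > 0$. Then there is some $\Lambda \in \mathcal{S}$ such that
\[ \mu(A) - \varepsilon \le \mu(A \mid B) = \mu(B)^{-1} \int_B d\mu\, \gamma_\Lambda(A \mid \cdot) \le \mu(A) + \varepsilon \]
for all $B \in \mathcal{T}_\Lambda$ with $\mu(B) > 0$. This means that
\[ \mu(A) - \varepsilon \le \gamma_\Lambda(A \mid \cdot) \le \mu(A) + \varepsilon \quad \mu\text{-a.s..} \]
By hypothesis, $\gamma_\Lambda(A \mid \cdot) \in \bar{\mathcal{L}}$. Thus there are some $\Delta \in \mathcal{S}$ and $f \in \mathcal{L}_\Delta$ with $\|\gamma_\Lambda(A \mid \cdot) - f\| \le \varepsilon$. Consequently, $|f - \mu(A)| \le 2\varepsilon$ $\mu$-a.s.. But $\mu$ and $\nu$ are equivalent on $\mathcal{F}_\Delta$ because of Remark (1.28)(2). This implies $|f - \mu(A)| \le 2\varepsilon$ $\nu$-a.s. and therefore $|\gamma_\Lambda(A \mid \cdot) - \mu(A)| \le 3\varepsilon$ $\nu$-a.s. and
\[ |\nu(A) - \mu(A)| = |\nu\gamma_\Lambda(A) - \mu(A)| \le 3\varepsilon. \]
We thus conclude that $\mu = \nu$. (b) We need to show that
\[ \lim_{\Lambda \in \mathcal{S}}\ \sup_{\omega \in \Omega} |\gamma_\Lambda(A \mid \omega) - \mu(A)| = 0 \]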
for all cylinder events $A$. This will also imply that $\mu$ satisfies (7.10) because

$\mu(A|B) = \mu(B)^{-1}\int_B d\mu\,\gamma_\Lambda(A|\cdot)$

for all $B\in\mathcal T_\Lambda$ with $\mu(B)>0$. Suppose the converse. Then there exist a cylinder set $A$, an $\varepsilon>0$, a cofinal sequence $(\Lambda_n)_{n\ge1}$ in $\mathcal S$, and a sequence $(\omega^n)_{n\ge1}$ in $\Omega$ such that $|\gamma_{\Lambda_n}(A|\omega^n)-\mu(A)|\ge\varepsilon$ for all $n\ge1$. In view of Theorem (4.12), the sequence $(\gamma_{\Lambda_n}(\cdot|\omega^n))_{n\ge1}$ has a cluster point $\nu$. Theorem (4.17) asserts that $\nu\in\mathcal G(\gamma)$. But $\nu\ne\mu$ because $|\nu(A)-\mu(A)|\ge\varepsilon$. This contradicts our hypothesis $\mathcal G(\gamma)=\{\mu\}$. □

As a by-product of the preceding discussion of property (7.10) we have obtained the following result: For some specifications $\gamma$ the uniqueness relation $\mathcal G(\gamma)=\{\mu\}$ implies that $\mu$ is the local limit of the $\gamma_\Lambda(\cdot|\omega)$'s as $\Lambda$ runs through $\mathcal S$. We ask if a similar approximation theorem holds when $|\mathcal G(\gamma)|>1$. In this case only the extreme elements of $\mathcal G(\gamma)$ can be shown to be limits of suitable $\gamma_{\Lambda_n}(\cdot|\omega)$'s, and it is impossible in general to specify the approximating boundary conditions $\omega$.

(7.12) Theorem. Let $\gamma$ be a specification, $\mu\in\mathrm{ex}\,\mathcal G(\gamma)$, and $(\Lambda_n)_{n\ge1}$ an increasing cofinal sequence in $\mathcal S$. The following conclusions hold as $n\to\infty$.
(a) $\gamma_{\Lambda_n}f\to\mu(f)$ $\mu$-a.s. for all bounded measurable functions $f$ on $\Omega$.
(b) If $E$ is a compact metric space with Borel $\sigma$-algebra $\mathcal E$ then $\gamma_{\Lambda_n}(\cdot|\omega)\to\mu$ weakly for $\mu$-almost all $\omega\in\Omega$.
(c) If $\gamma$ is a $\lambda$-specification for some $\lambda\in\mathcal M(E,\mathcal E)$ then for $\mu$-almost all $\omega\in\Omega$ we have

$\sup_{f\in\mathcal L_\Delta:\ \|f\|\le1}\ |\gamma_{\Lambda_n}(f|\omega)-\mu(f)| \to 0$

for all $\Delta\in\mathcal S$, and therefore $\gamma_{\Lambda_n}(\cdot|\omega)\to\mu$ in the topology of local convergence.

Proof. (a) By the very definition of a Gibbs measure,

$\gamma_{\Lambda_n}f = \mu(f|\mathcal T_{\Lambda_n})$   $\mu$-a.s.

The backward martingale convergence theorem thus implies

$\lim_{n\to\infty}\gamma_{\Lambda_n}f = \mu(f|\mathcal T)$   $\mu$-a.s.
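Since $\mathcal T$ is $\mu$-trivial, $\mu(f|\mathcal T)=\mu(f)$ $\mu$-a.s.

(b) As $E$ is a compact metric space, so is $\Omega$ in the product topology, and $\mathcal F$ is the Borel $\sigma$-algebra. The space of all bounded continuous functions on $\Omega$ contains a countable subset $C_0$ which is dense with respect to sup-norm. Thus, for each $\omega$, $\gamma_{\Lambda_n}(\cdot|\omega)$ converges weakly to $\mu$ if $\gamma_{\Lambda_n}(f|\omega)\to\mu(f)$ for all $f\in C_0$. But (a) asserts that the latter holds for $\mu$-almost all $\omega$.

(c) Because of Remark (1.28)(3) we may assume that $\lambda\in\mathcal P(E,\mathcal E)$. Let $\rho$ be such that $\gamma=\rho\lambda$, and let $\Delta\in\mathcal S$ be given. For each $n\ge1$ with $\Delta\subset\Lambda_n$ we consider the function $\rho_\Delta^n=\lambda_{\Lambda_n\setminus\Delta}\,\rho_{\Lambda_n}$. We have, writing $\nu=\mu\lambda_\Delta$,

$\rho_\Delta^n = \nu(\rho_\Delta\,|\,\mathcal T_{\Lambda_n\setminus\Delta})$   $\nu$-a.s.

This is because $\rho_\Delta^n$ is $\mathcal T_{\Lambda_n\setminus\Delta}$-measurable, and

$\nu(\rho_\Delta f) = \mu(f) = \mu\lambda_{\Lambda_n}(\rho_{\Lambda_n}f) = \nu\lambda_{\Lambda_n\setminus\Delta}(\rho_{\Lambda_n}f) = \nu(\rho_\Delta^n f)$

for each $\mathcal T_{\Lambda_n\setminus\Delta}$-measurable function $f$. The backward martingale convergence theorem thus implies

$\rho_\Delta^n \to \nu\big(\rho_\Delta\,\big|\,\bigcap_{n\ge1}\mathcal T_{\Lambda_n\setminus\Delta}\big)$   $\nu$-a.s.

We claim that

$\nu\big(\rho_\Delta\,\big|\,\bigcap_{n\ge1}\mathcal T_{\Lambda_n\setminus\Delta}\big) = \rho_\Delta^\infty \equiv \int\mu(d\eta)\,\rho_\Delta(\sigma_\Delta\eta_{S\setminus\Delta})$   $\nu$-a.s.

For suppose $f$ is measurable with respect to $\bigcap_{n\ge1}\mathcal T_{\Lambda_n\setminus\Delta}$. Then for each $\zeta\in E^\Delta$ the function $\omega\to f(\zeta\omega_{S\setminus\Delta})$ is $\mathcal T$-measurable and therefore $\mu$-a.s. constant. Thus Fubini's theorem gives

$\nu(\rho_\Delta f) = \int\mu(d\omega)\int\lambda_\Delta(d\zeta)\,\rho_\Delta(\zeta\omega_{S\setminus\Delta})\,f(\zeta\omega_{S\setminus\Delta})$
$\quad = \int\lambda_\Delta(d\zeta)\int\mu(d\eta)\,\rho_\Delta(\zeta\eta_{S\setminus\Delta})\int\mu(d\omega)\,f(\zeta\omega_{S\setminus\Delta})$
$\quad = \int\lambda_\Delta(d\zeta)\int\mu(d\eta)\,\rho_\Delta^\infty(\zeta\eta_{S\setminus\Delta})\,f(\zeta\eta_{S\setminus\Delta})$
$\quad = \nu(\rho_\Delta^\infty f).$

So far we have shown that $\rho_\Delta^n\to\rho_\Delta^\infty$ $\nu$-a.s. Now we recall that $\nu=\mu\lambda_\Delta$, and we use the identities

$\lambda_\Delta\rho_\Delta^n = \lambda_{\Lambda_n}\rho_{\Lambda_n} = 1,\qquad \lambda_\Delta\rho_\Delta^\infty = \mu\lambda_\Delta(\rho_\Delta) = 1$

to obtain that

$\lambda_\Delta(|\rho_\Delta^n-\rho_\Delta^\infty|)\to 0$   $\mu$-a.s.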
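Indeed, Fatou's lemma implies

$2-\int d\mu\,\limsup_{n\to\infty}\lambda_\Delta(|\rho_\Delta^n-\rho_\Delta^\infty|)$
$\quad = \int d\mu\,\liminf_{n\to\infty}\lambda_\Delta\big(\rho_\Delta^n+\rho_\Delta^\infty-|\rho_\Delta^n-\rho_\Delta^\infty|\big)$
$\quad \ge \int d\mu\,\lambda_\Delta\big(\liminf_{n\to\infty}(\rho_\Delta^n+\rho_\Delta^\infty-|\rho_\Delta^n-\rho_\Delta^\infty|)\big)$
$\quad = \mu\lambda_\Delta(2\rho_\Delta^\infty) = 2.$

Assertion (c) now follows immediately because

$\sup_{f\in\mathcal L_\Delta:\ \|f\|\le1}|\gamma_{\Lambda_n}(f|\omega)-\mu(f)|$
$\quad = \sup_{f\in\mathcal L_\Delta:\ \|f\|\le1}|\lambda_\Delta\lambda_{\Lambda_n\setminus\Delta}(\rho_{\Lambda_n}f|\omega)-\mu\lambda_\Delta(\rho_\Delta f)|$
$\quad = \sup_{f\in\mathcal L_\Delta:\ \|f\|\le1}|\lambda_\Delta(\rho_\Delta^n f-\rho_\Delta^\infty f\,|\,\omega)|$
$\quad \le \lambda_\Delta(|\rho_\Delta^n-\rho_\Delta^\infty|\,|\,\omega)$

for all $\omega\in\Omega$. (By the way, choosing $f=\mathrm{sign}(\rho_\Delta^n(\sigma_\Delta\omega_{S\setminus\Delta})-\rho_\Delta^\infty)$ we see that the last inequality is an identity, and all four terms are nothing other than the total variation distance of the restrictions $\gamma_{\Lambda_n}(\cdot|\omega)|\mathcal F_\Delta$ and $\mu|\mathcal F_\Delta$ to $\mathcal F_\Delta$.) □

Let us note that Theorem (7.12)(a) provides us with a more explicit version of Theorem (7.7)(d). For distinct measures $\mu,\nu\in\mathrm{ex}\,\mathcal G(\gamma)$ we choose any bounded measurable function $f$ with $\mu(f)\ne\nu(f)$, and we consider the event $A=\{\gamma_{\Lambda_n}f\to\mu(f)\text{ as }n\to\infty\}$ for some cofinal increasing sequence $(\Lambda_n)$ in $\mathcal S$. Then $A\in\mathcal T$, $\mu(A)=1$, and $\nu(A)=0$. We note further that Theorem (7.12)(a) has a converse: If $\mu\in\mathcal G(\gamma)$ is such that $\gamma_{\Lambda_n}f\to\mu(f)$ $\mu$-almost surely for all $f\in\mathcal L$ then $\mu$ is extreme. For, in this case we have $\mu(f|\mathcal T)=\mu(f)$ $\mu$-almost surely for all $f\in\mathcal L$, and this means that $\mu$ is trivial on $\mathcal T$.

Recalling the contents of this section the reader will notice that the product structure of $(\Omega,\mathcal F)$ was only used at a few places, namely Proposition (7.11) and Theorem (7.12)(c). Its use in Proposition (7.9) and Theorem (7.12)(b) was unessential. So we may state the following remark.

(7.13) Remark. Theorem (7.7) and Theorem (7.12)(a) remain valid without change in the more general setting below. $(\Omega,\mathcal F)$ may be an arbitrary measurable space, $\mathcal S$ any countable index set which is directed upwards by a partial ordering "$\subset$", $\mathcal T$ the intersection of any decreasing family $(\mathcal T_\Lambda)_{\Lambda\in\mathcal S}$ of sub-$\sigma$-algebras of $\mathcal F$, $\gamma$ any family of proper probability kernels $\gamma_\Lambda$ from $\mathcal T_\Lambda$ to $\mathcal F$, and $\mathcal G(\gamma)$ the set of all $\gamma$-invariant probabilities on $(\Omega,\mathcal F)$. This framework also includes some other cases of probabilistic interest. One such case will be considered in Example (7.16). ◦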
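The uniform convergence in Proposition (7.11)(b) can be made concrete in a toy case. The following is only a minimal sketch under stated assumptions: it takes the one-dimensional nearest-neighbour Ising chain without external field (a standard model with a unique Gibbs measure), a coupling `J` chosen for illustration, and the hypothetical helper names `prob_plus_at_origin` and `worst_case_gap`. It computes $\gamma_\Lambda(\sigma_0=+1|\omega)$ for $\Lambda=\{-n,\dots,n\}$ by a transfer matrix and shows that the dependence on the boundary condition dies out as $n$ grows.

```python
import numpy as np

def prob_plus_at_origin(n, J, left, right):
    """gamma_Lambda(sigma_0 = +1 | omega) for Lambda = {-n, ..., n},
    with boundary spins omega_{-n-1} = left and omega_{n+1} = right."""
    spins = [1, -1]
    # Transfer matrix T(s, t) = exp(J * s * t) of the nearest-neighbour chain.
    T = np.array([[np.exp(J * s * t) for t in spins] for s in spins])
    Tn = np.linalg.matrix_power(T, n + 1)   # n + 1 bonds on each side of the origin
    a, b = spins.index(left), spins.index(right)
    weights = Tn[a, :] * Tn[:, b]           # unnormalised weight of each value of sigma_0
    return weights[0] / weights.sum()

def worst_case_gap(n, J):
    """Sup over boundary conditions of |gamma_Lambda(sigma_0 = +1 | omega) - 1/2|."""
    return max(abs(prob_plus_at_origin(n, J, l, r) - 0.5)
               for l in (1, -1) for r in (1, -1))

for n in (1, 5, 20, 80):
    print(n, worst_case_gap(n, J=1.0))
```

Only the two spins adjacent to $\Lambda$ enter the kernel here, so the supremum over $\omega$ reduces to a maximum over four boundary pairs; the printed gaps decrease geometrically in $n$.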
7.2 Some applications
Before we continue with the general theory we will apply the preceding results to four examples. The first three of these are classical topics of probability theory.

(7.14) Example. Kolmogorov's zero-one law. Let $(E,\mathcal E)$ be a measurable space, $S$ a countably infinite set, $(\Omega,\mathcal F)=(E,\mathcal E)^S$, and $\lambda\in\mathcal P(E,\mathcal E)$. Kolmogorov's zero-one law asserts that the product measure $\lambda^S$ is trivial on $\mathcal T$. This is a special case of Theorem (7.7)(a) because $\lambda^S$ is the unique element of $\mathcal G(\lambda)$ and thus extreme in $\mathcal G(\lambda)$; cf. Remark (1.25). More generally, if $\mu=\prod_{i\in S}\alpha_i$ is the product of not necessarily identical probability measures $\alpha_i$ on $(E,\mathcal E)$ then $\mu$ is trivial on $\mathcal T$. For $\mu$ is the unique element of $\mathcal G(\gamma)$, where $\gamma$ is defined by

$\gamma_\Lambda f(\omega) = \int\Big(\prod_{i\in\Lambda}\alpha_i\Big)(d\zeta)\,f(\zeta\omega_{S\setminus\Lambda}).$

Here $\omega\in\Omega$, $\Lambda\in\mathcal S$, and $f:\Omega\to\mathbb R$ is bounded and measurable. ◦
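The kernel $\gamma_\Lambda$ of Example (7.14) is easy to write down when everything is finite. The sketch below is an illustration only and makes simplifying assumptions that are not in the text: the index set is truncated to four sites (the example itself needs $S$ countably infinite), $E=\{0,1\}$, and the Bernoulli parameters in `p` are made-up values. It implements $\gamma_\Lambda f(\omega)$ by summing over the spins in $\Lambda$ and checks the invariance $\mu\gamma_\Lambda=\mu$ for the product measure $\mu$.

```python
from itertools import product

E = (0, 1)
p = [0.2, 0.5, 0.7, 0.9]            # alpha_i({1}) for sites i = 0..3 (illustrative values)
S = range(len(p))

def alpha(i, x):
    """Point mass alpha_i({x}) of the single-site law at site i."""
    return p[i] if x == 1 else 1.0 - p[i]

def gamma(Lam, f, omega):
    """gamma_Lambda f(omega): integrate the spins in Lam against prod alpha_i,
    keeping the configuration omega fixed outside Lam."""
    Lam = sorted(Lam)
    total = 0.0
    for zeta in product(E, repeat=len(Lam)):
        config = list(omega)
        weight = 1.0
        for i, x in zip(Lam, zeta):
            config[i] = x
            weight *= alpha(i, x)
        total += weight * f(tuple(config))
    return total

def mu(f):
    """Integral of f under the product measure prod_{i in S} alpha_i."""
    return gamma(S, f, (0,) * len(p))

f = lambda w: float(w[0] == w[1]) + 2.0 * w[2] * w[3]
Lam = {1, 2}
lhs = mu(lambda w: gamma(Lam, f, w))   # (mu gamma_Lambda)(f)
rhs = mu(f)                            # mu(f)
print(lhs, rhs)
```

The two printed numbers agree up to rounding, which is the defining equation $\mu\gamma_\Lambda=\mu$ of a Gibbs measure in this toy setting.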
(7.15) Example. Stationary Markov chains. Let $E$ be a finite set, $\mathcal E$ its power set, $P$ a positive stochastic matrix on $E$, and $\mu_P$ the distribution of the stationary Markov chain with transition matrix $P$ and parameter set $\mathbb Z$. $\mu_P$ was defined at (3.3). From Theorem (3.5) we know, and we shall see again in (8.41)(3), that $\mu_P$ is the unique element of $\mathcal G(\gamma)$, where $\gamma$ is defined in terms of $P$ by (3.6) and Comment (3.8)(a). Thus $\mu_P$ is trivial on $\mathcal T$. In particular, $\mu_P$ is trivial on the right-sided tail $\bigcap_{n\in\mathbb Z}\mathcal F_{\mathbb Z\cap[n,\infty[}$. This is a particular case of a well-known zero-one law of Blackwell and Freedman (1964). Moreover, each of the propositions (7.9) and (7.11)(b) implies the ergodic theorem (3.A3) for $P$. For let $x,y\in E$ and $\alpha$ be the unique probability vector on $E$ with $\alpha P=\alpha$. Then

$|P^n(x,y)-\alpha(y)| = \alpha(x)^{-1}\,|\mu_P(\sigma_0=y,\ \sigma_{-n}=x)-\mu_P(\sigma_0=y)\,\mu_P(\sigma_{-n}=x)| \to 0$   as $n\to\infty$

because $\{\sigma_{-n}=x\}\in\mathcal T_{\mathbb Z\cap]-n,n[}$. For the case of a general state space $E$ we refer to Theorem (10.34). ◦

(7.16) Example. Exchangeability and the zero-one law of Hewitt and Savage. This example is a digression from the theory of Gibbs measures but fits into the more general setting of Remark (7.13). Let $(E,\mathcal E)$ be a measurable space and $(\Omega,\mathcal F)=(E,\mathcal E)^{\mathbb N}$. For each $n\ge1$ we let $I_n\subset T$ denote the group of all transformations $\tau$ of $\Omega$ of the form $\tau:\omega\to(\omega_{\tau_* i})_{i\ge1}$, where $\tau_*:\mathbb N\to\mathbb N$ is a bijection with $\tau_* i=i$ for all $i>n$. Clearly, $|I_n|=n!$, and $I_n$ increases with $n$. The union $I=\bigcup_{n\ge1}I_n$ is the group of all permutations of finitely many coordinates. We consider the set $\mathcal P_I=\mathcal P_I(\Omega,\mathcal F)$ of all $I$-invariant probability measures on $(\Omega,\mathcal F)$. Each $\mu\in\mathcal P_I$ is called an exchangeable or symmetric distribution. $\mathcal P_I$ admits a description in the spirit of Gibbs measures. For let $\mathcal J_n=\{A\in\mathcal F:\ \tau^{-1}A=A\text{ for all }\tau\in I_n\}$ denote the $\sigma$-algebra of all $I_n$-invariant events. The intersection $\mathcal J=\bigcap_{n\ge1}\mathcal J_n$ is called the $\sigma$-algebra of all symmetric events. We define a family $(\gamma_n)_{n\ge1}$ of proper probability kernels from $\mathcal J_n$ to $\mathcal F$ by

$\gamma_n(A|\omega) = \frac{1}{n!}\sum_{\tau\in I_n}1_A(\tau\omega)$   $(A\in\mathcal F,\ \omega\in\Omega,\ n\ge1).$

It is then easily checked that

$\mathcal P_I = \{\mu\in\mathcal P(\Omega,\mathcal F):\ \mu\gamma_n=\mu\text{ for all }n\ge1\}.$

We are thus in the framework of Remark (7.13). In particular, $\mathrm{ex}\,\mathcal P_I$ consists of all $\mu\in\mathcal P_I$ which are trivial on $\mathcal J$. A well-known theorem on exchangeable distributions states that

(7.17)   $\mathrm{ex}\,\mathcal P_I = \{\lambda^{\mathbb N}:\ \lambda\in\mathcal P(E,\mathcal E)\}.$
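Let us give a proof that fits into the ideas of Section 7.1. The key observation is that

$\lim_{n\to\infty}\Big\|\gamma_n\Big(\prod_{i=1}^k f_i\circ\sigma_i\Big)-\prod_{i=1}^k\gamma_n(f_i\circ\sigma_i)\Big\|$
$\quad = \lim_{n\to\infty}\Big\|\frac{(n-k)!}{n!}\sum_{\varphi\in M_{\mathrm i}(k,n)}\prod_{i=1}^k f_i\circ\sigma_{\varphi(i)}-\frac{1}{n^k}\sum_{\varphi\in M(k,n)}\prod_{i=1}^k f_i\circ\sigma_{\varphi(i)}\Big\| = 0$

for all $k\ge1$ and bounded measurable functions $f_1,\dots,f_k$ on $E$. Here $M(k,n)$ is the set of all mappings from $\{1,\dots,k\}$ to $\{1,\dots,n\}$ and $M_{\mathrm i}(k,n)$ the set of all injective mappings in $M(k,n)$, and the last identity follows from the fact that $|M_{\mathrm i}(k,n)|/|M(k,n)|=(n!/(n-k)!)/n^k\to1$ as $n\to\infty$. Now if $\mu\in\mathrm{ex}\,\mathcal P_I$ then Theorem (7.12)(a) in the extended version of Remark (7.13) implies

$\mu\Big(\prod_{i=1}^k f_i\circ\sigma_i\Big) = \lim_{n\to\infty}\gamma_n\Big(\prod_{i=1}^k f_i\circ\sigma_i\Big) = \prod_{i=1}^k\lim_{n\to\infty}\gamma_n(f_i\circ\sigma_i) = \prod_{i=1}^k\mu(f_i\circ\sigma_i)$   $\mu$-a.s.

for all $k$ and $f_1,\dots,f_k$, and therefore $\mu=\sigma_1(\mu)^{\mathbb N}$. On the other hand, for each $\mu\in\mathcal P_I$ we have

$\mu\Big(\prod_{i=1}^k f_i\circ\sigma_i\,\Big|\,\mathcal J\Big) = \lim_{n\to\infty}\gamma_n\Big(\prod_{i=1}^k f_i\circ\sigma_i\Big) = \prod_{i=1}^k\lim_{n\to\infty}\gamma_n(f_i\circ\sigma_i) = \prod_{i=1}^k\lim_{n\to\infty}\frac1n\sum_{j=1}^n f_i\circ\sigma_j$   $\mu$-a.s.,

and the last expression is $\mathcal T$-measurable. This implies

$\mu\Big(\prod_{i=1}^k f_i\circ\sigma_i\,\Big|\,\mathcal J\Big) = \mu\Big(\prod_{i=1}^k f_i\circ\sigma_i\,\Big|\,\mathcal T\Big)$   $\mu$-a.s.

because $\mathcal T\subset\mathcal J$. Consequently, the system

$\mathcal D = \{A\in\mathcal F:\ \mu(A|\mathcal J)=\mu(A|\mathcal T)\ \ \mu\text{-a.s.}\}$

contains all cylinder sets of the form $A=A_1\times\dots\times A_k\times E^{\mathbb N\cap]k,\infty[}$ with $k\ge1$ and $A_1,\dots,A_k\in\mathcal E$. As $\mathcal D$ is a Dynkin system, $\mathcal D=\mathcal F$ and therefore $\mathcal J=\mathcal T$ $\mu$-a.s. Now let $\mu=\lambda^{\mathbb N}$ for some $\lambda\in\mathcal P(E,\mathcal E)$. Then $\mu\in\mathcal P_I$. By Kolmogorov's zero-one law, $\mu$ is trivial on $\mathcal T$. In view of the above, $\mu$ is trivial on $\mathcal J$. This is the zero-one law of Hewitt and Savage (1955). We conclude that $\mu\in\mathrm{ex}\,\mathcal P_I$, thus completing the proof of (7.17). We shall return to exchangeable distributions at (7.31). ◦

(7.18) Example. Product specifications. Let $(E,\mathcal E)$ be an arbitrary state space, $S$ a countably infinite parameter set, $S_1\cup S_2=S$ a partition of $S$ into two non-empty disjoint subsets, $(\Omega^k,\mathcal F^k)=(E,$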
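$(\mu^1\times\mu^2)\gamma_\Lambda = \mu^1\gamma^1_{\Lambda\cap S_1}\times\mu^2\gamma^2_{\Lambda\cap S_2} = \mu^1\times\mu^2$

for all $\Lambda\in\mathcal S$, and if $\nu^k\in\mathcal G(\gamma^k)\setminus\{\mu^k\}$ then $(\mu^1\times\mu^2+\nu^1\times\nu^2)/2$ belongs to $\mathcal G(\gamma)$ but is not a product measure. The precise relation between $\mathcal G(\gamma)$ and the $\mathcal G(\gamma^k)$'s is

(7.19)   $\mathrm{ex}\,\mathcal G(\gamma) = \{\mu^1\times\mu^2:\ \mu^k\in\mathrm{ex}\,\mathcal G(\gamma^k),\ k=1,2\}.$

In particular, $|\mathrm{ex}\,\mathcal G(\gamma)|=|\mathrm{ex}\,\mathcal G(\gamma^1)|\,|\mathrm{ex}\,\mathcal G(\gamma^2)|$. The latter fact may be used to construct specifications with a prescribed number of phases. For instance, an iterated product of the Gibbsian specifications for the potentials in Section 6.1 has a power of two as its number of phases.

We turn to the proof of (7.19). Let $\mu^k\in\mathrm{ex}\,\mathcal G(\gamma^k)$ be given and $\mu=\mu^1\times\mu^2$. To show that $\mu$ is extreme in $\mathcal G(\gamma)$ we check that $\mu$ is trivial on $\mathcal T=\bigcap_{\Lambda\in\mathcal S}\mathcal F^1_{S_1\setminus\Lambda}\times\mathcal F^2_{S_2\setminus\Lambda}$. We take an arbitrary $A\in\mathcal T$, and for $\omega^1\in\Omega^1$ we let $A(\omega^1)=\{\omega^2\in\Omega^2:\ \omega^1\omega^2\in A\}$ denote the $\omega^1$-section of $A$. It follows from the definition of product $\sigma$-algebras that $A(\omega^1)\in\mathcal T^2=\bigcap_{\Lambda\in\mathcal S}\mathcal F^2_{S_2\setminus\Lambda}$ and the function $\omega^1\to\mu^2(A(\omega^1))$ is measurable with respect to $\mathcal T^1=\bigcap_{\Lambda\in\mathcal S}\mathcal F^1_{S_1\setminus\Lambda}$. As $\mu^k$ is trivial on $\mathcal T^k$, $\mu^2(A(\omega^1))=0$ for $\mu^1$-a.a. $\omega^1$, or $\mu^2(A(\omega^1))=1$ for $\mu^1$-a.a. $\omega^1$. Hence $\mu(A)=\int\mu^1(d\omega^1)\,\mu^2(A(\omega^1))=0$ or $1$.

Conversely, let $\mu\in\mathrm{ex}\,\mathcal G(\gamma)$ be given. Then Theorem (7.12)(a) asserts that for all $A^k\in\mathcal F^k$ $(k=1,2)$ and $\mu$-a.a. $\omega=\omega^1\omega^2\in\Omega$

$\mu(A^1\times A^2) = \lim_{n\to\infty}\gamma_{\Lambda_n}(A^1\times A^2|\omega)$
$\quad = \lim_{n\to\infty}\gamma^1_{\Lambda_n\cap S_1}(A^1|\omega^1)\,\gamma^2_{\Lambda_n\cap S_2}(A^2|\omega^2)$
$\quad = \lim_{n\to\infty}\gamma_{\Lambda_n}(A^1\times\Omega^2|\omega)\,\gamma_{\Lambda_n}(\Omega^1\times A^2|\omega)$
$\quad = \mu(A^1\times\Omega^2)\,\mu(\Omega^1\times A^2);$

here $(\Lambda_n)$ is any increasing cofinal sequence in $\mathcal S$. Hence $\mu=\mu^1\times\mu^2$ with $\mu^k\in\mathcal P(\Omega^k,\mathcal F^k)$, $k=1,2$. We have $\mu^1\in\mathcal G(\gamma^1)$, for if $\Lambda_1\subset S_1$ is finite and $A^1\in\mathcal F^1$ then

$\mu^1\gamma^1_{\Lambda_1}(A^1) = \mu\gamma_{\Lambda_1}(A^1\times\Omega^2) = \mu(A^1\times\Omega^2) = \mu^1(A^1).$

Moreover, $\mu^1$ is extreme in $\mathcal G(\gamma^1)$. For suppose $\mu^1=s\nu+(1-s)\nu'$ with distinct $\nu,\nu'\in\mathcal G(\gamma^1)$ and $0 < s < 1$. Then $\mu=s\,\nu\times\mu^2+(1-s)\,\nu'\times\mu^2$ and $\nu\times\mu^2,\ \nu'\times\mu^2\in\mathcal G(\gamma)$ in contradiction to the extremality of $\mu$. Similarly, $\mu^2\in\mathrm{ex}\,\mathcal G(\gamma^2)$. The proof of (7.19) is thus complete.

As a complement to equation (7.19) we note that $\mathcal T=\mathcal T^1\times\mathcal T^2$ $\mu$-almost surely for all $\mu\in\mathcal G(\gamma)$. Indeed, for any $A^k\in\mathcal F^k$ it follows from the backward martingale convergence theorem and the product structure of $\gamma$ that $\mu(A^1\times A^2|\mathcal T)$ is $\mu$-almost surely equal to a $\mathcal T^1\times\mathcal T^2$-measurable function. Since $\mathcal T^1\times\mathcal T^2\subset\mathcal T$, we can conclude that $\mu(A|\mathcal T)=\mu(A|\mathcal T^1\times\mathcal T^2)$ $\mu$-almost surely for all $A\in\mathcal F$, and the result follows. ◦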
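The remark in Example (7.18) that a mixture such as $(\mu^1\times\mu^2+\nu^1\times\nu^2)/2$ is not a product measure can be checked numerically in a drastically simplified setting. The sketch below assumes, purely for illustration, that each factor space has only two points and uses made-up probability vectors; none of these numbers come from the text.

```python
import numpy as np

mu1, nu1 = np.array([0.9, 0.1]), np.array([0.2, 0.8])   # two laws on the first factor (hypothetical)
mu2, nu2 = np.array([0.7, 0.3]), np.array([0.4, 0.6])   # two laws on the second factor (hypothetical)

# Joint law of the mixture, stored as a matrix indexed by (omega^1, omega^2).
mix = 0.5 * (np.outer(mu1, mu2) + np.outer(nu1, nu2))
marg1, marg2 = mix.sum(axis=1), mix.sum(axis=0)

# A joint law on a product of finite sets is a product measure iff it equals
# the outer product of its marginals (equivalently, the matrix has rank 1).
is_product = np.allclose(mix, np.outer(marg1, marg2))
print(is_product, np.linalg.matrix_rank(mix))
```

The mixture has rank two, so it cannot factorise; by contrast each of the two product terms alone has rank one.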
7.3 Extreme decomposition
In this section we shall deal with non-extreme Gibbs measures. We will show that each Gibbs measure admits a unique representation as a mixture of extreme Gibbs measures, at least when the state space $(E,\mathcal E)$ is standard Borel. Extreme decompositions are the subject of a well-known branch of convex analysis, the Choquet theory. Under suitable topological assumptions on $E$, $\gamma$, and $\mathcal G(\gamma)$ this theory also applies to the convex set $\mathcal G(\gamma)$ of all Gibbs measures for a specification $\gamma$. However, we prefer to follow a measure-theoretic approach due to Dynkin. This approach works without any topological assumption on
for all $\mu\in\mathcal P$. We are going to represent probability measures $\mu\in\mathcal P$ in the form

(7.20)   $\mu = \int\nu\,w(d\nu),$

where $w\in\mathcal P(\mathcal P_0,e(\mathcal P_0))$ is a probability measure on a specified subset $\mathcal P_0$ of $\mathcal P$. Equation (7.20) means that

$\mu(f) = \int\nu(f)\,w(d\nu)$

or, equivalently, $e_f(\mu)=w(e_f)$ for all bounded measurable functions $f$ on $\Omega$. Standard extension arguments show that this already holds whenever $\mu(A)=\int\nu(A)\,w(d\nu)$ for all $A$ in a generator of $\mathcal F$ that is stable under finite intersections. Proposition (7.22) below will show that representations of the form (7.20) can be established by means of probability kernels with some particular properties. Let us introduce these kernels.

(7.21) Definition. Let $(\Omega,\mathcal F)$ be a measurable space, $\mathcal A$ a sub-$\sigma$-algebra of $\mathcal F$, and $\mathcal P$ a non-empty subset of $\mathcal P(\Omega,\mathcal F)$. A probability kernel $\pi$ from $(\Omega,\mathcal A)$ to $(\Omega,\mathcal F)$ is called a $(\mathcal P,\mathcal A)$-kernel if it enjoys the following properties:
(i) For all $\mu\in\mathcal P$ and $A\in\mathcal F$, $\mu(A|\mathcal A)=\pi(A|\cdot)$ $\mu$-a.s.
(ii) $\Omega_{\mathcal P}=\{\omega\in\Omega:\ \pi(\cdot|\omega)\in\mathcal P\}\in\mathcal A$.
(iii) $\mu(\Omega_{\mathcal P})=1$ for all $\mu\in\mathcal P$.

It will be convenient to write $\pi^\omega=\pi(\cdot|\omega)$, $\omega\in\Omega$. In fact, we can and will assume that $\Omega_{\mathcal P}=\Omega$, which strengthens (ii) and (iii). Otherwise, we take any $\nu\in\mathcal P$ and define $\pi^\omega=\nu$ for $\omega\notin\Omega_{\mathcal P}$. The usefulness of $(\mathcal P,\mathcal A)$-kernels is demonstrated by the next proposition.

(7.22) Proposition. Let $(\Omega,\mathcal F)$ be a measurable space, and suppose $\mathcal F$ is countably generated. Consider a non-empty subset $\mathcal P$ of $\mathcal P(\Omega,\mathcal F)$, a $\sigma$-algebra $\mathcal A\subset\mathcal F$, and the subset

$\mathcal P_{\mathcal A} = \{\mu\in\mathcal P:\ \mu(A)=0\text{ or }1\text{ for all }A\in\mathcal A\}$

of all measures in $\mathcal P$ that are trivial on $\mathcal A$. Suppose there exists a $(\mathcal P,\mathcal A)$-kernel $\pi$. Then $\mathcal P_{\mathcal A}\ne\emptyset$, and for each $\mu\in\mathcal P$ there is a unique $w\in\mathcal P(\mathcal P_{\mathcal A},e(\mathcal P_{\mathcal A}))$ such that $\mu=\int\nu\,w(d\nu)$. $w$ is given by

$w(M) = \mu(\pi^\cdot\in M),\qquad M\in e(\mathcal P_{\mathcal A}).$

Proof. 1) We express $\mathcal P_{\mathcal A}$ in terms of $\pi$. Let $\mathcal C$ be a countable generator of $\mathcal F$ that is stable under finite intersections. For $A\in\mathcal C$ and $\mu\in\mathcal P$ we let

$v_A(\mu) = \int(\pi^\cdot(A)-\mu(A))^2\,d\mu = \mu(\pi^\cdot(A)^2)-\mu(A)^2$

denote the variance of $\pi^\cdot(A)$ with respect to $\mu$. (Note that $\mu(\pi^\cdot(A))=\mu(\mu(A|\mathcal A))=\mu(A)$.) The mapping $v_A:\mathcal P\to[0,1]$ is $e(\mathcal P)$-measurable because
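$v_A=e_{\pi^\cdot(A)^2}-e_A^2$. We will show that

(7.23)   $\mathcal P_{\mathcal A} = \{\mu\in\mathcal P:\ \mu(\pi^\cdot=\mu)=1\} = \{\mu\in\mathcal P:\ v_A(\mu)=0\text{ for all }A\in\mathcal C\}.$

In particular, this will imply that $\mathcal P_{\mathcal A}\in e(\mathcal P)$. Let $\mu\in\mathcal P$ be given. Then $\mu(A|\mathcal A)=\pi^\cdot(A)$ $\mu$-a.s. for all $A\in\mathcal F$. Thus $\mu\in\mathcal P_{\mathcal A}$ if and only if $\pi^\cdot(A)=\mu(A)$ $\mu$-a.s. for all $A\in\mathcal F$. As $\{A\in\mathcal F:\ \pi^\cdot(A)=\mu(A)\ \mu\text{-a.s.}\}$ is a Dynkin system, the latter holds if and only if

(7.24)   $\pi^\cdot(A)=\mu(A)$   $\mu$-a.s. for all $A\in\mathcal C$.

(7.24) is satisfied if and only if $v_A(\mu)=0$ for all $A\in\mathcal C$. On the other hand, (7.24) is equivalent to the statement $\mu(\pi^\cdot=\mu)=1$ because $\{\pi^\cdot=\mu\}=\{\pi^\cdot(A)=\mu(A)\text{ for all }A\in$
$=0\}$
because of (7.23). Thus $\{\pi^\cdot\in\mathcal P_{\mathcal A}\}\in\mathcal A$, and we only need to prove that $\mu(v_A(\pi^\cdot)=0)=1$ for all $A\in\mathcal C$ and $\mu\in\mathcal P$. Since $v_A\ge0$, this will follow once we have shown that $\mu(v_A(\pi^\cdot))=0$. But

$\mu(v_A(\pi^\cdot)) = \int\mu(d\omega)\,[\pi^\omega(\pi^\cdot(A)^2)-\pi^\omega(A)^2]$
$\quad = \int d\mu\,[\mu(\pi^\cdot(A)^2|\mathcal A)-\pi^\cdot(A)^2]$
$\quad = \int d\mu\,[\pi^\cdot(A)^2-\pi^\cdot(A)^2] = 0.$

3) Let $\mu\in\mathcal P$ be given and $w$ be defined as in the last sentence of the proposition. We first need to show that $w$ is well-defined, i.e., that $\{\pi^\cdot\in M\}\in\mathcal A$ for all $M\in e(\mathcal P_{\mathcal A})$. It is sufficient to check this in the special case when $M=\{\nu\in\mathcal P_{\mathcal A}:\ \nu(A)\le c\}$ with $A\in\mathcal F$ and $0\le c\le1$. But then we have

$\{\pi^\cdot\in M\} = \{\pi^\cdot\in\mathcal P_{\mathcal A}\}\cap\{\pi^\cdot(A)\le c\}\in\mathcal A.$

Thus $w$ is well-defined. Step 2) above ensures that $w$ is a probability measure. By definition, $\int\varphi\,dw=\int\varphi(\pi^\cdot)\,d\mu$ whenever $\varphi=1_M$ for some $M\in e(\mathcal P_{\mathcal A})$. By standard arguments this identity extends to all bounded $e(\mathcal P_{\mathcal A})$-measurable functions $\varphi$ on $\mathcal P_{\mathcal A}$. In particular, choosing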
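4) To show the uniqueness of $w$ we take an arbitrary $\tilde w\in\mathcal P(\mathcal P_{\mathcal A},e(\mathcal P_{\mathcal A}))$ with $\mu=\int\nu\,\tilde w(d\nu)$. Let $M\in e(\mathcal P_{\mathcal A})$ be given. From (7.23) we conclude that $\nu(\pi^\cdot\in M)=1_M(\nu)$ for all $\nu\in\mathcal P_{\mathcal A}$. Thus

$\tilde w(M) = \int 1_M(\nu)\,\tilde w(d\nu) = \int\nu(\pi^\cdot\in M)\,\tilde w(d\nu) = \mu(\pi^\cdot\in M) = w(M).$

This completes the proof of the proposition. □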
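Proposition (7.22) becomes almost tautological on a finite space, which makes it a convenient sanity check. The sketch below is a toy model under assumptions of our own: $\Omega=\{0,\dots,5\}$, the $\sigma$-algebra $\mathcal A$ is generated by a three-block partition, the kernel $\pi(\cdot|\omega)$ is a fixed law `q[b]` on the block containing $\omega$, and $\mathcal P$ is the set of mixtures of these laws. The extreme elements are then the rows of `q`, and the weight $w(M)=\mu(\pi^\cdot\in M)$ of each of them is the $\mu$-mass of its block.

```python
import numpy as np

blocks = [[0, 1], [2, 3, 4], [5]]        # atoms of the sigma-algebra A (illustrative)
q = np.zeros((3, 6))                     # q[b] = pi(. | omega) for every omega in block b
q[0, [0, 1]] = [0.25, 0.75]
q[1, [2, 3, 4]] = [0.5, 0.3, 0.2]
q[2, 5] = 1.0

true_w = np.array([0.1, 0.6, 0.3])       # hypothetical mixing weights
mu = true_w @ q                          # mu = sum_b w_b q[b], an element of P

# w(M) = mu(pi^. in M): the weight of the extreme element q[b] is mu(block b).
w = np.array([mu[blk].sum() for blk in blocks])
reconstructed = w @ q                    # mu = integral of nu w(d nu)
print(w, np.allclose(reconstructed, mu))
```

Each row of `q` is trivial on $\mathcal A$ because it is carried by a single atom, which mirrors the identification of $\mathcal P_{\mathcal A}$ in (7.23).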
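We intend to apply the preceding proposition to the sets $\mathcal P=\mathcal G(\gamma)$, $\gamma$ being a specification. In this case Theorem (7.7)(a) asserts that $\mathrm{ex}\,\mathcal G(\gamma)=\mathcal P_{\mathcal T}$, where $\mathcal T$ is the tail $\sigma$-field. Thus Proposition (7.22) will give us the extreme decomposition of Gibbs measures provided we are able to construct a $(\mathcal G(\gamma),\mathcal T)$-kernel. This can be done when $(E,\mathcal E)$ is a standard Borel space. The crucial property of standard Borel spaces is the existence of countable cores, cf. (4.A9) and (4.A11). From now on we assume again that $(\Omega,\mathcal F)=(E,$
Clearly, $\Omega_0\in\mathcal T$. By the definition of a core, for each $\omega\in\Omega_0$ there is a unique $\pi^\omega=\pi(\cdot|\omega)\in\mathcal P(\Omega,\mathcal F)$ such that $\pi(A|\omega)=\lim_{n\to\infty}\gamma_{\Lambda_n}(A|\omega)$ for all $A\in$
$\omega\in\Omega\setminus\Omega_0$ we put $\pi^\omega=\pi(\cdot|\omega)=\nu_0$ for some fixed $\nu_0\in\mathcal P(\Omega,\mathcal F)$. The function $\pi:\mathcal F\times\Omega\to[0,1]$ thus defined is a probability kernel from $\mathcal T$ to $\mathcal F$. This is because $\{A\in\mathcal F:\ \pi(A|\cdot)\text{ is }\mathcal T\text{-measurable}\}$ is a Dynkin system containing $\mathcal C$.

2) Next we let $\mu\in\mathcal G(\gamma)$ be given. The backward martingale convergence theorem implies that

$\mu(A|\mathcal T) = \lim_{n\to\infty}\mu(A|\mathcal T_{\Lambda_n}) = \lim_{n\to\infty}\gamma_{\Lambda_n}(A|\cdot)$   $\mu$-a.s.

for all $A\in\mathcal C$. Hence $\mu(\Omega_0)=1$ and $\mu(A|\mathcal T)=\pi(A|\cdot)$ $\mu$-a.s. for all $A\in\mathcal C$ and thus for all $A\in\mathcal F$.

3) Finally we consider the set $\Omega_1=\{\pi^\cdot\in\mathcal G(\gamma)\}$. Since

$\Omega_1 = \bigcap_{\Lambda\in\mathcal S}\ \bigcap_{A\in\mathcal C}\{\pi^\cdot\gamma_\Lambda(A)=\pi^\cdot(A)\},$

$\Omega_1\in\mathcal T$. We need to prove that $\mu(\Omega_1)=1$ for all $\mu\in\mathcal G(\gamma)$. This will follow once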
we have shown that $\pi^\cdot\gamma_\Lambda(A)=\pi^\cdot(A)$ $\mu$-almost surely for all $\Lambda\in\mathcal S$, $A\in\mathcal C$, and $\mu\in\mathcal G(\gamma)$. But the last statement follows from Step 2) above because

$\pi^\cdot\gamma_\Lambda(A) = \mu(\gamma_\Lambda(A|\cdot)\,|\,\mathcal T) = \mu(\mu(A|\mathcal T_\Lambda)\,|\,\mathcal T) = \mu(A|\mathcal T) = \pi^\cdot(A)$   $\mu$-a.s.

The proof is thus complete. □

In Chapter 14 we shall construct $(\mathcal P,\mathcal A)$-kernels for pairs $(\mathcal P,\mathcal A)$ different from $(\mathcal G(\gamma),\mathcal T)$. Here we shall apply the preceding two propositions to show that the set $\mathcal G(\gamma)$ of Gibbs measures is a simplex, in that each $\mu\in\mathcal G(\gamma)$ can be uniquely decomposed into extreme components. More precisely, $\mathcal G(\gamma)$ is isomorphic to the simplex of all probability measures on $\mathrm{ex}\,\mathcal G(\gamma)$.

(7.26) Theorem. Let $(E,\mathcal E)$ be a standard Borel space, and suppose $\gamma$ is a specification with $\mathcal G(\gamma)\ne\emptyset$. Then $\mathrm{ex}\,\mathcal G(\gamma)\ne\emptyset$, and for each $\mu\in\mathcal G(\gamma)$ there exists a unique weight $w_\mu\in\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$ such that

$\mu = \int_{\mathrm{ex}\,\mathcal G(\gamma)}\nu\,w_\mu(d\nu).$

The mapping $\mu\to w_\mu$ is an affine bijection from $\mathcal G(\gamma)$ onto $\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$. In fact, this mapping is induced by a $(\mathcal G(\gamma),\mathcal T)$-kernel $\pi$, in that $w_\mu$ is the image of $\mu$ under the mapping $\omega\to\pi^\omega$.

Proof. We put $\mathcal P=\mathcal G(\gamma)$ and $\mathcal A=\mathcal T$. Then Theorem (7.7)(a) asserts that $\mathcal P_{\mathcal A}=\mathrm{ex}\,\mathcal G(\gamma)$. Proposition (7.25) ensures the existence of a $(\mathcal P,\mathcal A)$-kernel $\pi$. By Proposition (7.22), the mapping $\mu\to w_\mu=\mu(\omega:\ \pi^\omega\in\cdot\,)$ is an injection from $\mathcal G(\gamma)$ into $\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$ such that $w_\mu$ is the unique probability measure on $\mathrm{ex}\,\mathcal G(\gamma)$ that represents $\mu$. It remains to show that $\mu\to w_\mu$ is surjective. Let $w\in\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$ be given. Then $\mu=\int\nu\,w(d\nu)$ is a well-defined probability measure on $(\Omega,\mathcal F)$. $\mu\in\mathcal G(\gamma)$ because $\mu\gamma_\Lambda=\int\nu\gamma_\Lambda\,w(d\nu)=\mu$ for all $\Lambda\in\mathcal S$. By the uniqueness of the representing measure, $w_\mu=w$. The proof is thus complete. □

(7.27) Comment. Suppose we are given a Gibbsian specification $\gamma$ that serves as a model of a physical system. The state space $(E,\mathcal E)$ will then clearly be standard Borel. In (7.8) we have argued that a physical system will always choose an extreme element of $\mathcal G(\gamma)$ for its equilibrium state. However, an experimenter will not always be able to determine the chosen extreme element. It may be impossible to observe all relevant macroscopic quantities sufficiently accurately, and the observed data may suggest a macroscopic picture that varies in the course of a long experiment (which possibly includes repetitions). In this situation the experimenter will use his partial knowledge of the system's
macroscopic behaviour to guess the system's state. This guess will be represented by a (subjective or empirical) weight $w$ on the set $\mathrm{ex}\,\mathcal G(\gamma)$ of all possible phases. $w$ will be chosen in such a way that the tail behaviour of the mixture

$\mu = \int_{\mathrm{ex}\,\mathcal G(\gamma)}\nu\,w(d\nu)$

fits the macroscopic data. The experimenter will then agree to say that $\mu$ describes the state of the system. Now Theorem (7.26) asserts that each $\mu\in\mathcal G(\gamma)$ is obtained in this way from a unique weight $w$. Consequently, each $\mu\in\mathcal G(\gamma)$ may be viewed as an experimenter's subjective description of the system's state. This description is characterized by a certain amount of information on the true macroscopic behaviour of the system. If no information is available, the associated weight $w$ will be some sort of equidistribution on $\mathrm{ex}\,\mathcal G(\gamma)$; cf. Figure 1b above. In the opposite case of complete information, $\mu$ will be extreme and $w=\delta_\mu$. ◦

Theorem (7.26) will be complemented by three corollaries. The first of these will state that the bijection $\mu\to w_\mu$ commutes with the action of each symmetry $\tau$ of $\gamma$. Let $\tau\in T$ be a symmetry of $\gamma$. According to Remark (7.2), $\tau$ may also be considered as a transformation of $\mathrm{ex}\,\mathcal G(\gamma)$ that maps each $\mu\in\mathrm{ex}\,\mathcal G(\gamma)$ to $\tau(\mu)=\mu\circ\tau^{-1}\in\mathrm{ex}\,\mathcal G(\gamma)$. This transformation is easily seen to be measurable with respect to $e(\mathrm{ex}\,\mathcal G(\gamma))$. Thus for each $w\in\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$ we may define its $\tau$-image

$\tau(w) = w(\nu:\ \tau(\nu)\in\cdot\,)\in\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma))).$

(7.28) Corollary. Let $(E,\mathcal E)$ be standard Borel and $\gamma$ a specification with $\mathcal G(\gamma)\ne\emptyset$. If $\tau\in T$ is a symmetry of $\gamma$ then $w_{\tau(\mu)}=\tau(w_\mu)$ for all $\mu\in\mathcal G(\gamma)$. In particular, $\mu\in\mathcal G(\gamma)$ is $\tau$-invariant if and only if $w_\mu$ is $\tau$-invariant.

Proof. For each $\mu\in\mathcal G(\gamma)$ we have

$\tau(\mu) = \int\tau(\nu)\,w_\mu(d\nu) = \int\nu\,\tau(w_\mu)(d\nu).$

Thus $\tau(w_\mu)$ is equal to the unique $w_{\tau(\mu)}$ that represents $\tau(\mu)$. □

It is a further immediate consequence of Theorem (7.26) that $\mathcal G(\gamma)$ has the same linear dimension as $\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$. In particular, we have the following corollary.

(7.29) Corollary. Let $(E,\mathcal E)$ be a standard Borel space, $\gamma$ a specification, and $N\ge1$ an integer. Then $|\mathrm{ex}\,\mathcal G(\gamma)|\ge N$ if and only if $\mathcal G(\gamma)$ is at least $(N-1)$-dimensional, in that $\mathcal G(\gamma)$ contains $N$ linearly independent measures.

Proof. 1) Suppose $\mathrm{ex}\,\mathcal G(\gamma)$ contains $N$ distinct elements $\mu_1,\dots,\mu_N$. By Theorem (7.7)(d), $\mu_1,\dots,\mu_N$ are pairwise mutually singular. The latter implies the
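existence of sets $A_1,\dots,A_N\in\mathcal F$ such that $\mu_n(A_n)=1$ and $\mu_\ell(A_n)=0$ if $\ell\ne n$. This fact immediately shows that $\mu_1,\dots,\mu_N$ are linearly independent.

2) Conversely, suppose $|\mathrm{ex}\,\mathcal G(\gamma)|=m < N$. Then $\mathcal P(\mathrm{ex}\,\mathcal G(\gamma),e(\mathrm{ex}\,\mathcal G(\gamma)))$ is isomorphic to the simplex spanned by the basis vectors of $\mathbb R^m$. The bijection $\mu\to w_\mu$ in Theorem (7.26) is affine. Thus if $\mathcal G(\gamma)$ contains $N$ linearly independent elements then so does $\mathbb R^m$. The latter is impossible. □

The next and final corollary will show in which sense the set $\mathcal G(\gamma)$ of all Gibbs measures may be considered as the infinite volume limit of the finite volume Gibbs distributions $\gamma_\Lambda(\cdot|\omega)$, $\Lambda\in\mathcal S$, $\omega\in\Omega$. We consider the $\mathcal L$-topology on $\mathcal P(\Omega,\mathcal F)$. For a quasilocal specification $\gamma$ we let $\mathcal G_{\lim}(\gamma)$ denote the set of all $\mu\in\mathcal P(\Omega,\mathcal F)$ that enjoy the following property: There exist nets (or, if you prefer, sequences) $(\Lambda_\alpha)_{\alpha\in D}$ in $\mathcal S$ and $(\omega^\alpha)_{\alpha\in D}$ in $\Omega$ such that $\Lambda_\alpha\to S$ and $\gamma_{\Lambda_\alpha}(\cdot|\omega^\alpha)\to\mu$. Theorem (4.17) ensures that $\mathcal G_{\lim}(\gamma)\subset\mathcal G(\gamma)$. If $\gamma$ is Gibbsian, each $\mu\in\mathcal G_{\lim}(\gamma)$ is called a limiting Gibbs measure.

(7.30) Corollary. Let $(E,\mathcal E)$ be standard Borel, $\lambda\in\mathcal M(E,\mathcal E)$, and suppose $\gamma$ is a quasilocal $\lambda$-specification. Then $\mathcal G(\gamma)$ is the closed convex hull of $\mathcal G_{\lim}(\gamma)$.

Proof. 1) $\mathcal G(\gamma)$ is convex and, by Theorem (4.17), closed. Thus $\mathcal G(\gamma)$ surely contains the closed convex hull of its subset $\mathcal G_{\lim}(\gamma)$. On the other hand, Theorem (7.12)(c) asserts that $\mathrm{ex}\,\mathcal G(\gamma)\subset\mathcal G_{\lim}(\gamma)$. Hence we only need to show that $\mathcal G(\gamma)$ is equal to the closed convex hull of ex
consists of all probability measures of the form $\sum_{j=1}^m a_j\nu_j$ with $\nu_j\in\mathrm{ex}\,\mathcal G(\gamma)$, $a_j\ge0$, $\sum_j a_j=1$, $m\ge1$. We will show that $\mathrm{co}\,\mathrm{ex}\,\mathcal G(\gamma)$ is dense in $\mathcal G(\gamma)$. Let $\mu\in\mathcal G(\gamma)$ be given and

$U = \{\nu\in\mathcal G(\gamma):\ \max_{1\le k\le n}|\nu(A_k)-\mu(A_k)| < \varepsilon\}$

a neighbourhood of $\mu$. Here $A_1,\dots,A_n$ are any cylinder events and $\varepsilon>0$. We need to show that $U\cap\mathrm{co}\,\mathrm{ex}\,\mathcal G(\gamma)\ne\emptyset$. (In fact, we shall prove this when $A_1,\dots,A_n$ are arbitrary sets in $\mathcal F$.) We choose an integer $r>1/\varepsilon$ and split the range $[0,1]$ of the evaluation maps $e_{A_1},\dots,e_{A_n}$ into $r$ disjoint intervals $I(1),\dots,I(r)$ of length less than $\varepsilon$. For each $\ell=(\ell_1,\dots,\ell_n)\in\{1,\dots,r\}^n$ we consider the set

$M_\ell = \bigcap_{k=1}^n\{\nu\in\mathrm{ex}\,\mathcal G(\gamma):\ \nu(A_k)\in I(\ell_k)\}.$

The sets $M_\ell$ constitute a measurable partition of $\mathrm{ex}\,\mathcal G(\gamma)$. If $\nu,\nu'$ belong to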
the same $M_\ell$ then $|\nu(A_k)-\nu'(A_k)| < \varepsilon$ for all $k$. For each $\ell$ with $M_\ell\ne\emptyset$ we pick some $\nu_\ell\in M_\ell$, and we define

$\nu = \sum_{\ell:\ M_\ell\ne\emptyset}w_\mu(M_\ell)\,\nu_\ell.$

Then $\nu\in\mathrm{co}\,\mathrm{ex}\,\mathcal G(\gamma)$. But also $\nu\in U$ because

$\max_k|\nu(A_k)-\mu(A_k)| = \max_k\Big|\sum_\ell\int_{M_\ell}(e_{A_k}-e_{A_k}(\nu_\ell))\,dw_\mu\Big| < \sum_\ell w_\mu(M_\ell)\,\varepsilon = \varepsilon.$

This completes the proof of the corollary. □

We conclude this section with the observation that Proposition (7.25) and thereby Theorem (7.26) carry over without change to the more general setting that was described in Remark (7.13), provided $(\Omega,\mathcal F)$ is standard Borel. As was shown in (7.16), this framework also includes the exchangeable distributions. We thus arrive at the following example.

(7.31) Example. Let $(E,\mathcal E)$ be a standard Borel space, $(\Omega,\mathcal F)=(E,\mathcal E)^{\mathbb N}$ and $\mathcal P_I$ be the set of all exchangeable distributions on $(\Omega,\mathcal F)$, cf. Example (7.16). We already know that $\mathcal P_I$ consists of all probability measures that are preserved by a family of probability kernels satisfying the conditions of Remark (7.13). The argument for Proposition (7.25) therefore provides us with a $(\mathcal P_I,\mathcal J)$-kernel, where $\mathcal J$ is the $\sigma$-algebra of all symmetric events. Thus Proposition (7.22) implies that each $\mu\in\mathcal P_I$ is uniquely represented by a probability measure $w_\mu$ on $\mathrm{ex}\,\mathcal P_I$. By (7.17), the marginal projection $\nu\to\sigma_1(\nu)$ is a bijection from $\mathrm{ex}\,\mathcal P_I$ to $\mathcal P(E,\mathcal E)$. It is easily seen to be measurable. Therefore, letting $m_\mu$ denote the image of $w_\mu$ under this marginal projection, we end up with the following result: For each $\mu\in\mathcal P_I$ there is a unique $m_\mu\in\mathcal P(\mathcal P(E,\mathcal E),e(\mathcal P(E,\mathcal E)))$ such that

$\mu = \int\lambda^{\mathbb N}\,m_\mu(d\lambda).$

This is the well-known theorem of de Finetti (1931) in the version of Dynkin (1953). ◦
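The representation $\mu=\int\lambda^{\mathbb N}m_\mu(d\lambda)$ of Example (7.31) can be explored on finite-dimensional marginals. The sketch below is an illustration under assumptions of our own: $E=\{0,1\}$ and a mixing measure `mixing` that puts mass 1/2 on each of two Bernoulli parameters (made-up values). It evaluates the law of the first $n$ coordinates, checks that it is invariant under all permutations, and shows that the mixture is not itself a product measure.

```python
from itertools import product, permutations
from math import isclose

mixing = {0.2: 0.5, 0.9: 0.5}       # m: weight of each Bernoulli parameter p (hypothetical)

def prob(seq):
    """mu(sigma_1..sigma_n = seq), obtained by integrating the product law against m."""
    k, n = sum(seq), len(seq)
    return sum(w * p**k * (1 - p)**(n - k) for p, w in mixing.items())

n = 4
total = sum(prob(s) for s in product((0, 1), repeat=n))
exchangeable = all(isclose(prob(s), prob(tuple(s[i] for i in perm)))
                   for s in product((0, 1), repeat=n)
                   for perm in permutations(range(n)))

# mu is not a product measure unless m is a point mass:
p1 = prob((1,))
p11 = prob((1, 1))
print(total, exchangeable, p11, p1 * p1)
```

The gap between `p11` and `p1 * p1` is the variance of the parameter under `mixing`; it vanishes exactly when the mixing measure is a point mass, that is, when $\mu$ is extreme.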
7.4 Macroscopic equivalence of Gibbs simplices
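In this section we shall consider specifications $\gamma$ and $\tilde\gamma$ which, in some sense, differ from each other only microscopically. We will show that in this case the simplices $\mathcal G(\gamma)$ and $\mathcal G(\tilde\gamma)$ are macroscopically identical, in that there exists an isomorphism between $\mathcal G(\gamma)$ and $\mathcal G(\tilde\gamma)$ that preserves the probability of tail events.

Let $\gamma$, $\tilde\gamma$ be two specifications with the same parameter set $S$ and state space $(E,\mathcal E)$. We will say $\gamma$ and $\tilde\gamma$ are equivalent, and write $\gamma\approx\tilde\gamma$, if there exists a number $1\le c < \infty$ such that

$c^{-1}\gamma_\Lambda(A|\cdot) \le \tilde\gamma_\Lambda(A|\cdot) \le c\,\gamma_\Lambda(A|\cdot)$

for all $A\in\mathcal F$ and all $\Lambda$ in a cofinal subset $\mathcal S_0$ of $\mathcal S$. Clearly, we have $\gamma\approx\tilde\gamma$ whenever for each $\Lambda\in\mathcal S$ there is a positive measurable function $f_\Lambda$ on $\Omega$ such that $\tilde\gamma_\Lambda=f_\Lambda\gamma_\Lambda$ and $\sup_{\Lambda\in\mathcal S}\|\log f_\Lambda\| < \infty$. In particular, we have the following example.

(7.32) Example. Let $\lambda\in\mathcal M(E,\mathcal E)$ be an a priori measure. Suppose $\gamma=\rho\lambda$ and $\tilde\gamma=\tilde\rho\lambda$ are two $\lambda$-specifications which are such that $c^{-1}\rho_\Lambda\le\tilde\rho_\Lambda\le c\,\rho_\Lambda$ for all $\Lambda\in\mathcal S$ and some $1\le c < \infty$. Then $\gamma\approx\tilde\gamma$. In particular, let $\Phi$ and $\tilde\Phi$ be two $\lambda$-admissible potentials. Then $\gamma^\Phi\approx\gamma^{\tilde\Phi}$ provided that

$\sup_{\Lambda\in\mathcal S}\|H_\Lambda^\Phi-H_\Lambda^{\tilde\Phi}\| < \infty.$

More generally, $\gamma^\Phi\approx\gamma^{\tilde\Phi}$ whenever $\tilde\Phi$ is equivalent to a potential $\Psi$ with $\sup_{\Lambda\in\mathcal S}\|H_\Lambda^\Phi-H_\Lambda^\Psi\| < \infty$. This condition certainly holds when

$\sum_{A\in\mathcal S}\|\Psi_A-\Phi_A\| < \infty.$

(The last condition simply means that $\Psi$ is a local perturbation of $\Phi$. If $S=\mathbb Z^d$ and $\Phi$ is shift-invariant then there is no shift-invariant $\Psi\ne\Phi$ that satisfies this condition.) ◦

(7.33) Theorem. Suppose $(E,\mathcal E)$ is a standard Borel space. Let $\gamma$ and $\tilde\gamma$ be two equivalent specifications. Then $\mathcal G(\gamma)\ne\emptyset$ if and only if $\mathcal G(\tilde\gamma)\ne\emptyset$, and in this case there is an affine bijection $\mu\leftrightarrow\tilde\mu$ between $\mathcal G(\gamma)$ and $\mathcal G(\tilde\gamma)$ having the properties below:
(i) $\tilde\mu=\mu$ on $\mathcal T$.
(ii) $\tilde\mu=\mu\tilde\pi$, where $\tilde\pi$ is any $(\mathcal G(\tilde\gamma),\mathcal T)$-kernel.
(iii) $\widetilde{\tau(\mu)}=\tau(\tilde\mu)$ whenever $\tau\in T$ is a common symmetry of $\gamma$ and $\tilde\gamma$.
In particular, $|\mathrm{ex}\,\mathcal G(\gamma)|=|\mathrm{ex}\,\mathcal G(\tilde\gamma)|$.

Proof. 1) Suppose there exists some $\mu\in\mathcal G(\gamma)$. Consider the net $(\mu\tilde\gamma_\Lambda)_{\Lambda\in\mathcal S_0}$. For each $\Lambda\in\mathcal S_0$ and $A\in\mathcal F$ we have $\mu\tilde\gamma_\Lambda(A)\le c\,\mu\gamma_\Lambda(A)=c\,\mu(A)$. Thus the net $(\mu\tilde\gamma_\Lambda)_{\Lambda\in\mathcal S_0}$ is equicontinuous, in that

$\lim_{m\to\infty}\ \limsup_{\Lambda\in\mathcal S_0}\ \mu\tilde\gamma_\Lambda(A_m) = 0$

for every sequence $(A_m)_{m\ge1}$ in $\mathcal F$ with $A_m\downarrow\emptyset$. A standard argument similar to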
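that in the proof of Proposition (4.9) thus shows that there exists a subnet $(\Lambda_\alpha)_{\alpha\in D}$ and a measure $\tilde\mu\in\mathcal P(\Omega,\mathcal F)$ such that

$\mu\tilde\gamma_{\Lambda_\alpha}(A)\to\tilde\mu(A)$

for all $A\in\mathcal F$, and therefore

$\mu\tilde\gamma_{\Lambda_\alpha}(f)\to\tilde\mu(f)$

for all bounded measurable functions $f$ on $\Omega$; cf. the argument in Remark (4.3)(2). From this we conclude that $\tilde\mu\in\mathcal G(\tilde\gamma)$. Indeed, let $\Lambda\in\mathcal S$ and $A\in\mathcal F$. Then the consistency of $\tilde\gamma$ yields

$\tilde\mu\tilde\gamma_\Lambda(A) = \lim_D\mu\tilde\gamma_{\Lambda_\alpha}\tilde\gamma_\Lambda(A) = \lim_D\mu\tilde\gamma_{\Lambda_\alpha}(A) = \tilde\mu(A).$

In particular, we have $\mathcal G(\tilde\gamma)\ne\emptyset$, and Proposition (7.25) ensures the existence of a $(\mathcal G(\tilde\gamma),\mathcal T)$-kernel $\tilde\pi$.

2) Let $\mu\in\mathrm{ex}\,\mathcal G(\gamma)$ be given and let $\tilde\mu\in\mathcal G(\tilde\gamma)$ be constructed as above. Then

$\tilde\mu(A) = \lim_D\mu\tilde\gamma_{\Lambda_\alpha}(A) \le c\,\mu(A)$

for all $A\in\mathcal F$. In particular, if $A\in\mathcal T$ then either $\tilde\mu(A)\le c\,\mu(A)=0$ or $\tilde\mu(\Omega\setminus A)\le c\,\mu(\Omega\setminus A)=0$. Therefore $\tilde\mu=\mu$ on $\mathcal T$ and, by Theorem (7.7)(a), $\tilde\mu\in\mathrm{ex}\,\mathcal G(\tilde\gamma)$. As $\tilde\mu$ is represented by $\delta_{\tilde\mu}$ and $\{\tilde\pi^\cdot=\tilde\mu\}\in\mathcal T$, $\mu(\tilde\pi^\cdot=\tilde\mu)=\tilde\mu(\tilde\pi^\cdot=\tilde\mu)=w_{\tilde\mu}(\{\tilde\mu\})=1$. Consequently, $\tilde\mu=\mu\tilde\pi$.

3) Let $\mu\in\mathcal G(\gamma)$ be arbitrary, and define $\tilde\mu=\mu\tilde\pi$. Then $\tilde\mu\in\mathcal G(\tilde\gamma)$. This is because Step 2) implies that

$\mu(\tilde\pi^\cdot\in\mathcal G(\tilde\gamma)) = \int\nu(\tilde\pi^\cdot\in\mathcal G(\tilde\gamma))\,w_\mu(d\nu) = 1$

and therefore

$\tilde\mu\tilde\gamma_\Lambda = \int\mu(d\omega)\,\tilde\pi^\omega\tilde\gamma_\Lambda = \int\mu(d\omega)\,\tilde\pi^\omega = \tilde\mu.$

This proves (iii).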
4) It remains to show that the injection $\mu\to\mu\tilde\pi$ is surjective. This is done by interchanging the rôles of $\gamma$ and $\tilde\gamma$. Let $\pi$ be a $(\mathcal G(\gamma),\mathcal T)$-kernel and $\tilde\mu\in\mathcal G(\tilde\gamma)$. Then 3) shows that $\tilde\mu\pi\in\mathcal G(\gamma)$ and $\tilde\mu\pi=\tilde\mu$ on $\mathcal T$. Thus $(\tilde\mu\pi)\tilde\pi=\tilde\mu\tilde\pi=\tilde\mu$. This completes the proof of the theorem. □

The following example shows how the theorem above can be applied to one-dimensional systems.

(7.34) Example. Let $S=\mathbb Z$, $E=\{-1,1\}$, $\lambda$ be counting measure, and $(J_n)_{n\in\mathbb Z}$ a two-sided sequence of positive real numbers such that $\sum_{n\in\mathbb Z}e^{-2J_n} < \infty$. We consider the nearest-neighbour potential

$\Phi_A = -J_n\sigma_n\sigma_{n+1}$ if $A=\{n,n+1\}$, $n\in\mathbb Z$; $\quad\Phi_A=0$ otherwise.

We will show that $|\mathrm{ex}\,\mathcal G($
$\sum_{A\in\mathcal S:\ \min A\le n < \max A}\|\Phi_A\| < \infty$   for some $n\in\mathbb Z$. ◦
Chapter 8 Uniqueness
From Chapter 6 we know of some examples of specifications that admit several distinct Gibbs measures. We are thus led to the question of whether there is a class of specifications each of which admits only one Gibbs measure. In other words, we ask for conditions that imply the absence of phase transitions. We shall investigate two such conditions.

The first condition (which is due to Dobrushin) can roughly be stated as follows: The total interaction of a given spin with all other spins should be so small that some crucial quantity is less than one. Obviously, such a condition is tailored for some kind of contraction argument. This contraction argument, together with a discussion of the circumstances under which Dobrushin's condition holds, will be the subject of Section 8.1. In fact, Dobrushin's contraction technique will yield more than uniqueness of the Gibbs measure. It will also allow us to estimate the distance, in some sense, of the unique Gibbs measure from other probability measures. In Section 8.2 we shall use these estimates to obtain further information on the unique Gibbs measure $\mu$: We shall construct consistent regular versions of the conditional probabilities $\mu(\cdot|\mathcal T_V)$ for arbitrary, not necessarily finite regions $V$, and we shall also derive some upper bounds on the mixing coefficients of $\mu$ which describe the decay of correlations between distant regions.

A second uniqueness condition will be studied in Section 8.3. This condition only applies to one-dimensional systems. It states that the interaction between very distant spins should decrease so rapidly that the total interaction energy of the spins on any two complementary half-lines is finite. As a matter of fact, we shall see that any two Gibbs measures for such an interaction are necessarily absolutely continuous with respect to each other. By the zero-one law for extreme Gibbs measures, this property already implies the uniqueness of the Gibbs measure.
8.1
Dobrushin's condition of weak dependence
In this and the subsequent section we shall exploit a technique that was invented by Dobrushin (1968a). We shall analyse a specification $\gamma$ by looking at an $S \times S$-matrix $C(\gamma) = (C_{ij}(\gamma))_{i,j\in S}$ that describes the interdependence between the spins at distinct sites. More specifically, $C_{ij}(\gamma)$ will be an estimate of how much the conditional distribution $\gamma_i(\cdot\,|\,\omega)$ of $\sigma_i$ depends on the value $\omega_j$ of the spin at $j$. To define $C(\gamma)$ we first need to introduce a distance between probability measures on the state space $E$. We shall simply take the uniform distance, which is nothing other than one half of the total variation distance. This choice will be convenient for our purposes. Let $(E,\mathcal{E})$ be a measurable space and $\alpha_1,\alpha_2 \in \mathcal{P}(E,\mathcal{E})$. We define a distance $\|\alpha_1-\alpha_2\|$ of $\alpha_1$ and $\alpha_2$ by three identical expressions, namely
$$\|\alpha_1-\alpha_2\| := \max_{A\in\mathcal{E}}|\alpha_1(A)-\alpha_2(A)| = \max_f |\alpha_1(f)-\alpha_2(f)|/\delta(f) = \alpha(|g_1-g_2|)/2. \tag{8.1}$$
Here $\max_f$ extends over all bounded (non-constant) measurable functions $f$ on $E$,
$$\delta(f) := \sup_{x,y\in E}|f(x)-f(y)| = \sup_{x\in E}f(x) - \inf_{x\in E}f(x) \tag{8.2}$$
is the oscillation of such an $f$, $\alpha\in\mathcal{M}(E,\mathcal{E})$ is an arbitrary measure with $\alpha_1\ll\alpha$ and $\alpha_2\ll\alpha$ (such as $\alpha=\alpha_1+\alpha_2$, e.g.), and $g_1$ and $g_2$ are the $\alpha$-densities of $\alpha_1$ resp. $\alpha_2$. It may be worth mentioning a special case. If $E$ is countable (and $\mathcal{E}$ is the set of all subsets of $E$) then
$$\|\alpha_1-\alpha_2\| = \tfrac12\sum_{x\in E}|\alpha_1(\{x\})-\alpha_2(\{x\})|. \tag{8.3}$$
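For a finite state space the three expressions in (8.1) can be compared directly. The following is only a minimal numerical sketch: the two distributions `a1`, `a2` and all function names are hypothetical illustrations, and the maximum over events is found by brute force over all subsets.

```python
from itertools import combinations

def uniform_distance_sets(a1, a2):
    """First expression in (8.1): max over events A of |a1(A) - a2(A)|."""
    points = list(a1)
    best = 0.0
    for r in range(len(points) + 1):
        for event in combinations(points, r):
            best = max(best, abs(sum(a1[x] - a2[x] for x in event)))
    return best

def uniform_distance_density(a1, a2):
    """Third expression in (8.1) with counting measure, i.e. formula (8.3)."""
    return 0.5 * sum(abs(a1[x] - a2[x]) for x in a1)

def ratio_for_function(a1, a2, f):
    """|a1(f) - a2(f)| / delta(f) for a non-constant function f (a dict)."""
    osc = max(f.values()) - min(f.values())
    return abs(sum((a1[x] - a2[x]) * f[x] for x in a1)) / osc

# Hypothetical example distributions on a three-point state space.
a1 = {"a": 0.5, "b": 0.3, "c": 0.2}
a2 = {"a": 0.2, "b": 0.3, "c": 0.5}

# The indicator of {g1 > g2} attains the maximum in the second expression.
indicator = {x: 1.0 if a1[x] > a2[x] else 0.0 for x in a1}

d_sets = uniform_distance_sets(a1, a2)
d_dens = uniform_distance_density(a1, a2)
d_func = ratio_for_function(a1, a2, indicator)
```

All three quantities agree up to rounding for this example, in line with the equality of the three expressions in (8.1).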
(Just let $\alpha$ be counting measure in (8.1).) Let us check that the expressions on the right of (8.1) are equal. For each bounded measurable $f$ we have, putting $m = \big(\sup_{x\in E}f(x)+\inf_{x\in E}f(x)\big)/2$,
$$|\alpha_1(f)-\alpha_2(f)| = |\alpha((g_1-g_2)(f-m))| \le \alpha(|g_1-g_2|)\,\|f-m\| = \alpha(|g_1-g_2|)\,\delta(f)/2,$$
with equality when $f-m = \operatorname{sign}(g_1-g_2)\,\|f-m\|$ on $\{g_1\ne g_2\}$. In particular, equality holds when $f = 1_{\{g_1>g_2\}}$. The equality of the three expressions on the right of (8.1) now follows immediately. After these preliminaries we turn to the definition of Dobrushin's interaction matrix $C(\gamma)$. Let $\gamma$ be a specification with state space $E$ and an arbitrary parameter set $S$. In fact we shall only look at the kernels $\gamma_i = \gamma_{\{i\}}$ ($i\in S$) that correspond to the singletons in $\mathcal{S}$. To some extent this procedure is suggested by Theorem (1.33). Let $i\in S$ be fixed. As $\gamma_i$ is proper, $\gamma_i$ is uniquely determined by its projection $\gamma_i^0(\cdot\,|\,\omega) = \sigma_i(\gamma_i(\cdot\,|\,\omega))$, $\omega\in\Omega$. More explicitly, $\gamma_i^0$ is defined by
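$$\gamma_i^0(A\,|\,\omega) = \gamma_i(\{\sigma_i\in A\}\,|\,\omega) \qquad (A\in\mathcal{E},\ \omega\in\Omega). \tag{8.4}$$
Thus $\gamma_i^0$ is a probability kernel from $(\Omega,\mathcal{T}_{\{i\}})$ to $(E,\mathcal{E})$. We are interested in the $\omega_j$-dependence of $\gamma_i^0(\cdot\,|\,\omega)$ for each $j\in S$. This $\omega_j$-dependence may be estimated by the quantity
$$C_{ij}(\gamma) = \sup_{\zeta,\eta\in\Omega:\ \zeta_{S\setminus\{j\}}=\eta_{S\setminus\{j\}}}\|\gamma_i^0(\cdot\,|\,\zeta)-\gamma_i^0(\cdot\,|\,\eta)\|. \tag{8.5}$$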
The matrix $C(\gamma) = (C_{ij}(\gamma))_{i,j\in S}$ is called Dobrushin's interdependence matrix for $\gamma$. It is the basic object of this section. We intend to introduce a condition on a specification $\gamma$ which implies that $\mathcal{G}(\gamma)$ contains but one element. For intuitive reasons, $\mathcal{G}(\gamma)$ should be a singleton whenever $\gamma$ is close to an independent specification. In other words, the interdependencies described by $\gamma$ should be weak, in that $\gamma_i^0(\cdot\,|\,\omega)$ depends on $\omega$ only weakly. We therefore look for a quantity which estimates the dependence of $\gamma_i^0(\cdot\,|\,\omega)$ on all of $\omega$. Having defined $C(\gamma)$ it is natural to think of the sum $\sum_{j\in S}C_{ij}(\gamma)$ as such a quantity. There is, however, an important fact which must be remembered: This sum tells us nothing about the dependence of $\gamma_i^0(\cdot\,|\,\omega)$ on the behaviour of $\omega$ at infinity. For example, if $\gamma$ is the specification of Example (2.27) then $C_{ij}(\gamma) = 0$ for all $i,j\in S$, but $\gamma_i^0(\cdot\,|\,\omega)$ does depend on $\omega$ and $\mathcal{G}(\gamma)$ is not a singleton. Therefore, if $\sum_{j\in S}C_{ij}(\gamma)$ is taken as an estimate of the $\omega$-dependence of $\gamma_i^0(\cdot\,|\,\omega)$, a tail dependence of $\gamma_i^0$ must be excluded. This can be done by assuming that $\gamma$ is quasilocal. We are thus led to introduce the following condition of weak dependence. (8.6) Definition. Let $\gamma$ be a specification. $\gamma$ will be said to satisfy Dobrushin's condition if $\gamma$ is quasilocal and
$$c(\gamma) := \sup_{i\in S}\sum_{j\in S}C_{ij}(\gamma) < 1.$$
We are now ready to state Dobrushin's uniqueness theorem. (8.7) Theorem. Suppose $\gamma$ satisfies Dobrushin's condition. Then $|\mathcal{G}(\gamma)|\le1$. If $(E,\mathcal{E})$ is a standard Borel space then $|\mathcal{G}(\gamma)| = 1$. The above theorem is an immediate consequence of a more general comparison theorem which will be stated and proved at (8.20) below. Before going on to that subject we will provide a sufficient condition for Dobrushin's condition to hold, and we will also look at some examples. In analogy with (8.2), we let
$$\delta(f) = \sup_{\zeta,\eta\in\Omega}|f(\zeta)-f(\eta)|$$
denote the oscillation of a real function $f$ on $\Omega$.
(8.8) Proposition. Let $\lambda\in\mathcal{M}(E,\mathcal{E})$ and $\Phi$ be a $\lambda$-admissible potential such that
$$\sup_{i\in S}\sum_{A\ni i}(|A|-1)\,\delta(\Phi_A) < 2.$$
Then $\gamma^\Phi$ satisfies Dobrushin's condition. (Note that, up to $\lambda$-admissibility, there are no restrictions on the self-potential part of $\Phi$.) Proof. First of all, each $H_\Lambda^\Phi$ is quasilocal. For if $\Delta\in\mathcal{S}$ then
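$$\sup_{\zeta,\eta\in\Omega:\ \zeta_\Delta=\eta_\Delta}|H_\Lambda^\Phi(\zeta)-H_\Lambda^\Phi(\eta)| \le \sum_{A\cap\Lambda\ne\emptyset,\ A\not\subset\Delta}\delta(\Phi_A),$$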
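and the expression on the right tends to zero as $\Delta$ runs through $\mathcal{S}$. Thus Proposition (2.24)(b) guarantees that $\gamma^\Phi$ is quasilocal. To check that $c(\gamma^\Phi)<1$ we need to estimate $C_{ij}(\gamma^\Phi)$ for $i,j\in S$, $i\ne j$. Let $\zeta,\eta\in\Omega$ with $\zeta_{S\setminus\{j\}} = \eta_{S\setminus\{j\}}$ be given. We put
$$u_0(x) = -H^\Phi_{\{i\}}(x\zeta_{S\setminus\{i\}}),\qquad u_1(x) = -H^\Phi_{\{i\}}(x\eta_{S\setminus\{i\}}) \qquad (x\in E)$$
and $v = u_1-u_0$. $v$ is bounded. More specifically, we have
$$\delta(v) \le \sup_{x,y\in E}\sum_{A\ni i}\big|\Phi_A(x\zeta_{S\setminus\{i\}})-\Phi_A(x\eta_{S\setminus\{i\}})-\Phi_A(y\zeta_{S\setminus\{i\}})+\Phi_A(y\eta_{S\setminus\{i\}})\big| \le 2\sum_{A\supset\{i,j\}}\delta(\Phi_A).$$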
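Next, for each $0\le t\le1$ we write $u_t = tu_1+(1-t)u_0 = u_0+tv$, $h_t = \exp u_t/\lambda(\exp u_t)$, $\lambda_t = h_t\lambda$. Then $\lambda_0 = \sigma_i(\gamma_i^\Phi(\cdot\,|\,\zeta))$ and $\lambda_1 = \sigma_i(\gamma_i^\Phi(\cdot\,|\,\eta))$. We thus need to estimate $\|\lambda_0-\lambda_1\|$. By (8.1) and Fubini's theorem,
$$2\|\lambda_0-\lambda_1\| = \lambda(|h_1-h_0|) \le \int_0^1 dt\,\lambda\Big(\Big|\frac{d}{dt}h_t\Big|\Big).$$
On the other hand, the dominated convergence theorem implies that $\frac{d}{dt}\lambda(\exp u_t) = \lambda(v\exp u_t)$. Thus $\frac{d}{dt}h_t = (v-\lambda_t(v))h_t$ and therefore
$$2\|\lambda_0-\lambda_1\| \le \int_0^1 dt\,\lambda_t(|v-\lambda_t(v)|) \le \int_0^1 dt\,\lambda_t\big((v-\lambda_t(v))^2\big)^{1/2} \le \int_0^1 dt\,\lambda_t\big((v-m)^2\big)^{1/2}$$
for each $m\in\mathbb{R}$. Choosing $m = \big(\sup_{x\in E}v(x)+\inf_{x\in E}v(x)\big)/2$ we obtain $2\|\lambda_0-\lambda_1\| \le \|v-m\| = \delta(v)/2$. Inserting our previous estimate of $\delta(v)$ and taking the sup over $\zeta,\eta$ we find that
$$C_{ij}(\gamma^\Phi) \le \tfrac12\sum_{A\supset\{i,j\}}\delta(\Phi_A).$$
Consequently,
$$\sum_{j\in S}C_{ij}(\gamma^\Phi) \le \tfrac12\sum_{j\ne i}\ \sum_{A\supset\{i,j\}}\delta(\Phi_A) = \tfrac12\sum_{A\ni i}(|A|-1)\,\delta(\Phi_A).$$
Our hypothesis on $\Phi$ thus implies that $c(\gamma^\Phi)<1$.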
•
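The key estimate of the preceding proof, $\|\lambda_0-\lambda_1\|\le\delta(v)/4$ for two Gibbs-type densities proportional to $\exp u_0$ and $\exp(u_0+v)$, can be probed numerically on a finite state space with counting measure. The sketch below is an illustration only: the function names, the random ranges for $u_0$ and $v$, and the number of states are arbitrary assumptions.

```python
import math
import random

def gibbs_density(u):
    """Density h = exp(u) / sum(exp(u)) with respect to counting measure."""
    z = sum(math.exp(val) for val in u)
    return [math.exp(val) / z for val in u]

def uniform_distance(p, q):
    """Uniform distance (8.3): half the l1-distance of the weights."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def oscillation(v):
    """delta(v) = sup v - inf v."""
    return max(v) - min(v)

def check_bound(trials=1000, n_states=5, seed=0):
    """Largest observed ratio ||lambda_0 - lambda_1|| / (delta(v)/4)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u0 = [rng.uniform(-3, 3) for _ in range(n_states)]
        v = [rng.uniform(-2, 2) for _ in range(n_states)]
        u1 = [a + b for a, b in zip(u0, v)]
        dist = uniform_distance(gibbs_density(u0), gibbs_density(u1))
        worst = max(worst, dist / (oscillation(v) / 4))
    return worst

worst_ratio = check_bound()
```

The returned ratio should never exceed 1, since the bound holds for every choice of $u_0$ and $v$. How close it gets to 1 depends on the sampled functions.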
It is interesting to note that the constant 2 in Proposition (8.8) is sharp. Indeed, we shall see at (16.27) that for every $\varepsilon>0$ there is a pair potential $\Phi$ with state space $\{-1,1\}$ such that
$$\sup_{i\in S}\sum_{A\ni i}(|A|-1)\,\delta(\Phi_A) < 2+\varepsilon$$
and $|\mathcal{G}(\Phi)|>1$ and thus $c(\gamma^\Phi)\ge1$. In some special cases, however, the estimates in the preceding proof can be improved. We consider two examples. (8.9) Examples. (1) Lattice gas models. Let $E=\{0,1\}$ and $\lambda$ be counting measure. Suppose
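$\Phi$ is a lattice gas potential, in that there is a function $K:\mathcal{S}\to\mathbb{R}$ with
$$\Phi_A(\omega) = \begin{cases}K(A) & \text{if } \omega_k = 1 \text{ for all } k\in A,\\ 0 & \text{otherwise.}\end{cases}$$
Clearly, $\delta(\Phi_A) = |K(A)|$. Then $\gamma^\Phi$ satisfies Dobrushin's condition whenever
$$\sup_{i\in S}\sum_{A\ni i}(|A|-1)\,|K(A)| < 4.$$
Thus, compared with the condition in Proposition (8.8) we have an additional factor of 2. This factor can be picked up by improving the estimate of $\delta(v)$ in the preceding proof. Let us assume that $\zeta_j = 1$, $\eta_j = 0$. Then $v(0) = 0$ and
$$v(1) = \sum_{A\supset\{i,j\}:\ \zeta_k=1 \text{ for all } k\in A\setminus\{i\}}K(A).$$
Thus
$$\delta(v) \le \sum_{A\supset\{i,j\}}|K(A)|.$$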
Inserting this estimate into the preceding proof we obtain the result. (2) Spin models. Let $E=\{-1,1\}$ and suppose $\Phi$ is a spin potential in the sense of Example (2.38). Thus there is a function $J:\mathcal{S}\to\mathbb{R}$ with $\Phi_A = -J(A)\sigma_A$ for all $A\in\mathcal{S}$. Here $\sigma_A = \prod_{i\in A}\sigma_i$. Plainly, $\delta(\Phi_A) = 2|J(A)|$. We will show that $\gamma^\Phi$ satisfies Dobrushin's condition whenever
$$\sup_{i\in S}\sum_{A\ni i}(|A|-1)\tanh|J(A)| < 1. \tag{8.10}$$
This condition improves Proposition (8.8) in that $|J(A)|$ is replaced by the smaller quantity $\tanh|J(A)|$. This can be achieved by an ad hoc estimate of $C_{ij}(\gamma^\Phi)$. In the notation of the proof of Proposition (8.8) we have
$$2\|\lambda_0-\lambda_1\| = \sum_{x=\pm1}|h_1(x)-h_0(x)| = 2|h_1(1)-h_0(1)| = |\tanh u_1(1)-\tanh u_0(1)|$$
because $u_t(-1) = -u_t(1)$ and thus $2h_t(1) = \exp u_t(1)/\cosh u_t(1) = \tanh u_t(1)+1$. Next we use the inequality
$$|\tanh b-\tanh a| \le 2\tanh(|b-a|/2).$$
(For its proof it is sufficient to note that $\tanh$ is increasing, odd, and concave on $[0,\infty[$. Indeed, let $a<b$. If $a>0$ then $\tanh b-\tanh a \le \tanh(b-a)-\tanh 0$. Thus we only need to consider the case $a\le0$. For similar reasons we can assume that $b\ge0$. But then the concavity yields $\tanh b+\tanh(-a) \le 2\tanh((b-a)/2)$, and the claimed inequality follows.) We obtain
$$\|\lambda_0-\lambda_1\| \le \tanh|v(1)/2| \le \tanh\sum_{A\supset\{i,j\}}|J(A)| \le \sum_{A\supset\{i,j\}}\tanh|J(A)|$$
and thus
$$\sum_{j\in S}C_{ij}(\gamma^\Phi) \le \sum_{A\ni i}(|A|-1)\tanh|J(A)|.$$
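Thus (8.10) implies $c(\gamma^\Phi)<1$. It is worth looking at an even more specific case. Let $S = \mathbb{Z}^d$ and suppose $\Phi$ is a shift-invariant pair potential, in that $J(A) = J(j-i)$ if $A = \{i,j\}$ with $i\ne j$ and $J(A) = 0$ otherwise, for some even function $J\ge0$ on $S$ with $J(0) = 0$. Then (8.10) takes the form
$$\sum_{i\in S}\tanh J(i) < 1. \tag{8.11}$$
This in turn holds when
$$\sum_{i\in S}J(i) \le 1. \tag{8.12}$$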
We note that (8.12) is fairly close to (8.11) if $J$ takes many positive but small values. Two further comments are in order. (i) In general, the number 1 that appears in condition (8.12) cannot be replaced by a larger constant. This will follow from Theorem (16.27) below. On the other hand, it is obvious that (8.11) (and all the more (8.12)) may be far from being optimal in specific cases. This holds in particular when $d=1$ and $J(i) = J$ if $|i| = 1$ and $J(i) = 0$ otherwise. Then (8.11) reads $\tanh J<1/2$, but we know from Section 3.2 that the Gibbs measure is unique for every $J$. (ii) Condition (8.12) coincides with the condition for the absence of spontaneous magnetization that is obtained in the so-called mean field approximation. In this approximation all spins except one, say $\sigma_0$, are replaced by an effective spin $s$, and the mean value of $\sigma_0$ then is
$$m(s) = \sum_{x\in E}x\exp\Big(xs\sum_{i\ne0}J(i)\Big)\Big/\sum_{x\in E}\exp\Big(xs\sum_{i\ne0}J(i)\Big) = \tanh\Big(s\sum_{i\ne0}J(i)\Big).$$
By consistency the effective spin $s$ should be chosen in such a way that $m(s) = s$. In other words, in the mean field approximation the absence of spontaneous magnetization just means that the equation $m(s) = s$ does not admit a non-zero solution. Of course, $s = 0$ is always a solution. As $\tanh$ is concave on $[0,\infty[$, 0 is the only solution if and only if
$$\sum_{i\in S}J(i) = \frac{d}{ds}m(s)\Big|_{s=0} \le 1,$$
in agreement with (8.12).
o
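The statements of Example (8.9)(2) lend themselves to a quick numerical illustration. The sketch below checks the inequality $|\tanh b-\tanh a|\le2\tanh(|b-a|/2)$ on a grid, evaluates the left-hand side of (8.10) for a nearest-neighbour pair coupling $J$ on $\mathbb{Z}^d$ (where it equals $2d\tanh|J|$), and iterates the mean field equation $m(s) = s$. The function names, the grid, the starting value and the iteration count are assumptions made for illustration.

```python
import math

def tanh_inequality_gap(a, b):
    """2*tanh(|b-a|/2) - |tanh b - tanh a|; non-negative by the inequality."""
    return 2 * math.tanh(abs(b - a) / 2) - abs(math.tanh(b) - math.tanh(a))

def dobrushin_sum_nn(J, d):
    """Left side of (8.10) for a nearest-neighbour pair coupling J on Z^d.

    Each site lies in 2d pairs A with |A| - 1 = 1, so the sum is 2d*tanh|J|.
    """
    return 2 * d * math.tanh(abs(J))

def mean_field_solution(total_J, s0=0.9, iterations=10000):
    """Iterate s -> tanh(s * total_J) starting from s0 > 0."""
    s = s0
    for _ in range(iterations):
        s = math.tanh(s * total_J)
    return s

# Smallest gap on a grid of points a, b in [-5, 5].
min_gap = min(
    tanh_inequality_gap(a / 10, b / 10)
    for a in range(-50, 51)
    for b in range(-50, 51)
)

# d = 1: condition (8.10) reads tanh J < 1/2.
holds_small_J = dobrushin_sum_nn(0.5, 1) < 1
fails_large_J = dobrushin_sum_nn(0.6, 1) >= 1

# Mean field: only the zero solution below the threshold, a non-zero one above.
s_subcritical = mean_field_solution(0.8)
s_supercritical = mean_field_solution(1.5)
```

In one dimension the criterion fails already for moderate $J$ although the Gibbs measure is unique for every $J$, which illustrates comment (i) above: the condition is sufficient but far from necessary there.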
Proposition (8.8) and the examples above can be summarized by saying that Dobrushin's condition holds whenever the interaction is small enough. However, the validity of Dobrushin's condition is not limited to this case, as we will show now in three further examples. (8.13) Examples. (1) Large external fields. Let $E=\{-1,1\}$ and $\Phi$ be any potential with self-potential part $\Phi_{\{i\}} = -h\sigma_i$ ($i\in S$) for some external field $h\in\mathbb{R}$, such that
$$\sup_{i\in S}\exp\Big[\tfrac12\sum_{A\ni i:\,|A|>1}\delta(\Phi_A)\Big]\sum_{A\ni i}(|A|-1)\,\delta(\Phi_A) < e^{|h|}.$$
The Gibbs specification $\gamma^\Phi$ then satisfies Dobrushin's condition. To check this we can assume that $h>0$. (Otherwise we interchange the rôles of 1 and $-1$.) We then proceed just as in the proof of Proposition (8.8). In the notation of this proof we have, putting $m = v(1)$,
$$2\|\lambda_0-\lambda_1\| \le \int_0^1 dt\,\lambda_t\big((v-m)^2\big)^{1/2} \le \sup_{0\le t\le1}\lambda_t\big((v-v(1))^2\big)^{1/2} = \sup_t h_t(-1)^{1/2}\,|v(-1)-v(1)| = \sup_t\big(1+\exp[u_t(1)-u_t(-1)]\big)^{-1/2}\,\delta(v).$$
Moreover,
$$u_t(1)-u_t(-1) \ge 2h-\sum_{A\ni i:\,|A|>1}\delta(\Phi_A).$$
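Combining these inequalities with our earlier estimate of $\delta(v)$ we obtain the result. (2) Low temperatures. Suppose $\Phi$ is any potential such that
$$a_\Phi := \sup_{i\in S}\sum_{A\ni i}(|A|-1)\,\delta(\Phi_A)$$
is finite, and consider the Gibbsian specification $\gamma^{\beta\Phi}$ for $\Phi$ at inverse temperature $\beta$. By Proposition (8.8), $\gamma^{\beta\Phi}$ satisfies Dobrushin's condition whenever $\beta$ is small enough, i.e., when the temperature is high. We will now show that Dobrushin's condition can also hold at low temperatures, provided $\Phi$ admits but one ground state. Let $E$ be finite and $\lambda$ counting measure. In addition to the hypothesis $a_\Phi<\infty$, we assume that, for some (necessarily unique) $\bar\omega\in\Omega$,
$$b_\Phi := \inf_{i\in S}\ \inf_{\zeta\in\Omega:\ \zeta_i\ne\bar\omega_i}\big[H^\Phi_{\{i\}}(\zeta)-H^\Phi_{\{i\}}(\bar\omega_i\zeta_{S\setminus\{i\}})\big] > 0.$$
In particular, there exists at most one ground state for $\Phi$, namely $\bar\omega$. (The assumption $b_\Phi>0$ certainly holds when $E$ and $\Phi$ are as in the preceding example, with an external field $h$ which satisfies the condition
$$|h| > \sup_{i\in S}\tfrac12\sum_{A\ni i:\,|A|>1}\delta(\Phi_A).)$$
Under these hypotheses, $\gamma^{\beta\Phi}$ satisfies Dobrushin's condition when $\beta$ is large enough. Indeed, in the notation of the proof of Proposition (8.8), with $\beta\Phi$ in place of $\Phi$, we have
$$2\|\lambda_0-\lambda_1\| \le \sup_{0\le t\le1}\lambda_t\big((v-v(\bar\omega_i))^2\big)^{1/2} \le \delta(v)\Big(\sum_{x\in E\setminus\{\bar\omega_i\}}\exp[u_t(x)-u_t(\bar\omega_i)]\Big)^{1/2} \le \delta(v)\,(|E|-1)^{1/2}\exp[-\beta b_\Phi/2]$$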
and thus $c(\gamma^{\beta\Phi}) \le \beta a_\Phi(|E|-1)^{1/2}\exp[-\beta b_\Phi/2]$. Hence $c(\gamma^{\beta\Phi})<1$ when $\beta$ is large enough. (3) Potts antiferromagnet for large $N$. Let $E=\{1,\dots,N\}$ and $\lambda$ be counting measure. Also, let $(S,B)$ be a simple graph as defined in (2.17). Consider the potential
$$\Phi_A = \begin{cases}J\,\delta(\sigma_i,\sigma_j) & \text{if } A=\{i,j\}\in B,\\ 0 & \text{otherwise.}\end{cases}$$
Here $J>0$ is any constant, and
$$\delta(x,y) = \begin{cases}1 & \text{if } x=y,\\ 0 & \text{otherwise}\end{cases}$$
is Kronecker's function. (In the case $N=2$, the above $\Phi$ is equivalent to the antiferromagnetic Ising potential.) For each $i\in S$ we let $\partial i = \{j\in S:\{i,j\}\in B\}$ be the set of all neighbours of $i$. Suppose that $N>3|\partial i|$ for all $i\in S$. Then $c(\gamma^\Phi)<1$, independently of the choice of $J$. Indeed, let $i\in S$ and $j\in\partial i$ be given. Using the notation of the proof of Proposition (8.8) we can write
$$\|\lambda_0-\lambda_1\| = \lambda(|h_0-h_1|)/2 \le \lambda(|e^{u_1}-e^{u_0}|)/\lambda(e^{u_1})$$
$$= Z_i(\eta)^{-1}\sum_{x\in E}\Big|\exp\Big[-J\sum_{k\in\partial i}\delta(x,\zeta_k)\Big]-\exp\Big[-J\sum_{k\in\partial i}\delta(x,\eta_k)\Big]\Big|$$
$$\le Z_i(\eta)^{-1}\sum_{x\in E}\big|\exp[-J\delta(x,\zeta_j)]-\exp[-J\delta(x,\eta_j)]\big| \le 2(N-|\partial i|)^{-1}.$$
In the last step we have used that $Z_i(\eta) \ge |\{x\in E: x\ne\eta_k \text{ for all } k\in\partial i\}| \ge N-|\partial i|$. We thus obtain that $C_{ij}(\gamma^\Phi) \le 2(N-|\partial i|)^{-1}$ whenever $j\in\partial i$. Since $C_{ij}(\gamma^\Phi) = 0$ when $j\notin\partial i$, we conclude that $c(\gamma^\Phi) \le \sup_{i\in S}2|\partial i|(N-|\partial i|)^{-1} < 1$. As a final remark, we note that the preceding estimate of $c(\gamma^\Phi)$ can be improved by a factor of 1/2 if $J=\infty$. This shows that $c(\gamma^\Phi)<1$ whenever $N>2|\partial i|$ for all $i\in S$ and $J$ is sufficiently large. ∘

After these examples we turn to the proof of Theorem (8.7). Without additional effort we may consider a more general problem, the comparison of Gibbs measures for two (possibly distinct) specifications. It is worthwhile doing so, as will turn out in the next section. We need to develop a concept of comparison of two probability measures $\mu,\tilde\mu$ on $(\Omega,\mathcal{F})$. In contrast to the macroscopic equivalence which was considered in Section 7.4 we will now look at the microscopic (that is, local) behaviour of $\mu$ and $\tilde\mu$. To state our criterion of local comparison we first need to introduce the oscillations at single sites. Let $f\in\bar{\mathcal{L}}$ and $j\in S$ be given. The oscillation of $f$ at $j$ is defined by
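$$\delta_j(f) = \sup_{\zeta,\eta\in\Omega:\ \zeta_{S\setminus\{j\}}=\eta_{S\setminus\{j\}}}|f(\zeta)-f(\eta)|. \tag{8.14}$$
Clearly, $\delta_j(f) = \sup_{\omega\in\Omega}\delta(f_{j,\omega})$, where $f_{j,\omega}(x) = f(x\omega_{S\setminus\{j\}})$ for $x\in E$. We also have
$$\delta(f) \le \sum_{j\in S}\delta_j(f) \tag{8.15}$$
for all $f\in\bar{\mathcal{L}}$. This follows from (2.22) because
$$|f(\zeta)-f(\eta)| \le |f(\zeta)-f(\zeta_\Lambda\eta_{S\setminus\Lambda})| + \sum_{j\in\Lambda}\delta_j(f)$$
for all $\zeta,\eta\in\Omega$ and $\Lambda\in\mathcal{S}$. We are now ready to introduce the concept of local comparison. Let $\mu,\tilde\mu\in\mathcal{P}(\Omega,\mathcal{F})$ be given. We shall say a vector $a = (a_i)_{i\in S}\in[0,\infty[^S$ is an estimate for $\mu$ and $\tilde\mu$ if
$$|\mu(f)-\tilde\mu(f)| \le \sum_{j\in S}a_j\,\delta_j(f) \tag{8.16}$$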
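for all $f\in\bar{\mathcal{L}}$. Note that in this case $a_j$ is an upper bound of $\|\sigma_j(\mu)-\sigma_j(\tilde\mu)\|$. More generally, if $\Lambda\in\mathcal{S}$ then $\sum_{j\in\Lambda}a_j$ is an upper bound of
$$\|\mu-\tilde\mu\|_\Lambda := \sup_f|\mu(f)-\tilde\mu(f)|/\delta(f),$$
where the supremum extends over all non-constant bounded $\mathcal{F}_\Lambda$-measurable functions $f$. Let us state some further simple facts. (8.17) Remarks. (1) The constant vector $a\equiv1$ is always an estimate. This follows from (8.15) because the left-hand side of (8.16) is at most $\delta(f)$. (2) Suppose $a$ is such that (8.16) holds for all $f\in\mathcal{L}$ only. Then $a$ is an estimate. For let $f\in\bar{\mathcal{L}}$. Define $f_\Lambda\in\mathcal{L}$ by $f_\Lambda = f(\sigma_\Lambda\omega_{S\setminus\Lambda})$ for some fixed $\omega\in\Omega$. Then $f_\Lambda\to f$ uniformly as $\Lambda$ runs through $\mathcal{S}$, and $\delta_j(f_\Lambda)\le\delta_j(f)$ for all $j\in S$. Hence
$$|\mu(f)-\tilde\mu(f)| = \lim_{\Lambda\in\mathcal{S}}|\mu(f_\Lambda)-\tilde\mu(f_\Lambda)| \le \limsup_{\Lambda\in\mathcal{S}}\sum_{j\in S}a_j\,\delta_j(f_\Lambda) \le \sum_{j\in S}a_j\,\delta_j(f).$$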
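(3) A coordinatewise limit of a sequence of estimates for $\mu$ and $\tilde\mu$ is also an estimate for $\mu$ and $\tilde\mu$. This follows from the preceding remark. For if $f\in\mathcal{L}$ then the sum in (8.16) contains only finitely many non-zero terms. ∘

Now we fix two specifications $\gamma$ and $\tilde\gamma$, and we let $\mu\in\mathcal{G}(\gamma)$ and $\tilde\mu\in\mathcal{G}(\tilde\gamma)$ be given. We look for an estimate for $\mu$ and $\tilde\mu$ in terms of $\gamma$ and $\tilde\gamma$. To this end we assume that $\gamma$ is quasilocal, and for each $i\in S$ we let $b_i:\Omega\to[0,\infty[$ be a measurable function such that $\|\gamma_i^0(\cdot\,|\,\omega)-\tilde\gamma_i^0(\cdot\,|\,\omega)\| \le b_i(\omega)$ for all $\omega\in\Omega$. (8.18) Lemma. Suppose $a$ is an estimate for $\mu$ and $\tilde\mu$, and define $\bar a = (\bar a_i)_{i\in S}$ by
$$\bar a_i = \sum_{j\in S}C_{ij}(\gamma)\,a_j + \tilde\mu(b_i).$$
Then $\bar a$ is also an estimate for $\mu$ and $\tilde\mu$. Proof. For each finite $\Delta\subset S$ we define a vector $a^\Delta$ by
$$a_i^\Delta = \begin{cases}a_i\wedge\bar a_i & \text{if } i\in\Delta,\\ a_i & \text{otherwise.}\end{cases}$$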
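We show by induction on $|\Delta|$ that each $a^\Delta$ is an estimate. The lemma then follows from Remark (8.17)(3). By hypothesis, $a^\emptyset$ is an estimate. So we assume that $a^\Delta$ is an estimate, and we take any $i\in S\setminus\Delta$ and put $\Lambda = \Delta\cup\{i\}$. We will show that $a^\Lambda$ is an estimate. Let $f\in\mathcal{L}$ be given. Then
$$|\mu(f)-\tilde\mu(f)| \le |\mu(\gamma_if)-\tilde\mu(\gamma_if)| + |\tilde\mu(\gamma_if-\tilde\gamma_if)|$$
because $\mu\in\mathcal{G}(\gamma)$ and $\tilde\mu\in\mathcal{G}(\tilde\gamma)$. We estimate each term on the right separately. The second term is at most $\tilde\mu(b_i)\,\delta_i(f)$. This is because for each $\omega$
$$|\gamma_if(\omega)-\tilde\gamma_if(\omega)| = |\gamma_i^0(f_{i,\omega}\,|\,\omega)-\tilde\gamma_i^0(f_{i,\omega}\,|\,\omega)| \le \|\gamma_i^0(\cdot\,|\,\omega)-\tilde\gamma_i^0(\cdot\,|\,\omega)\|\,\delta(f_{i,\omega}) \le b_i(\omega)\,\delta_i(f);$$
the notation $f_{i,\omega}$ was introduced after (8.14). To estimate the first term we use the hypotheses that $\gamma_if\in\bar{\mathcal{L}}$ and $a^\Delta$ is an estimate. These hypotheses imply
$$|\mu(\gamma_if)-\tilde\mu(\gamma_if)| \le \sum_{j\in S}a_j^\Delta\,\delta_j(\gamma_if).$$
We are thus led to look at $\delta_j(\gamma_if)$. Clearly, $\delta_i(\gamma_if) = 0$. So we fix any $j\ne i$. For $\zeta,\eta\in\Omega$ with $\zeta_{S\setminus\{j\}} = \eta_{S\setminus\{j\}}$ we have
$$|\gamma_if(\zeta)-\gamma_if(\eta)| \le |\gamma_i^0(f_{i,\zeta}-f_{i,\eta}\,|\,\zeta)| + |\gamma_i^0(f_{i,\eta}\,|\,\zeta)-\gamma_i^0(f_{i,\eta}\,|\,\eta)|$$
$$\le \|f_{i,\zeta}-f_{i,\eta}\| + \|\gamma_i^0(\cdot\,|\,\zeta)-\gamma_i^0(\cdot\,|\,\eta)\|\,\delta(f_{i,\eta}) \le \delta_j(f) + C_{ij}(\gamma)\,\delta_i(f).$$
Consequently, $\delta_j(\gamma_if) \le \delta_j(f) + C_{ij}(\gamma)\,\delta_i(f)$. Combining all preceding inequalities we obtain
$$|\mu(f)-\tilde\mu(f)| \le \sum_{j\ne i}a_j^\Delta\big[\delta_j(f)+C_{ij}(\gamma)\,\delta_i(f)\big] + \tilde\mu(b_i)\,\delta_i(f) \le \sum_{j\ne i}a_j^\Delta\,\delta_j(f) + \bar a_i\,\delta_i(f)$$
because $a^\Delta\le a$ and $C_{ii}(\gamma) = 0$. Since $a^\Delta$ is an estimate, we may replace $\bar a_i$ by $\bar a_i\wedge a_i$. This shows that $a^\Lambda$ is an estimate, and the proof is complete. □

The preceding lemma readily implies the comparison theorem below. For a given specification $\gamma$ and any $n\ge0$ we let $C^n(\gamma) = (C^n_{ij}(\gamma))_{i,j\in S}$ denote the $n$'th power of the interaction matrix $C(\gamma)$. In particular, $C^0(\gamma)$ is the identity matrix. We put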
$$D(\gamma) = (D_{ij}(\gamma))_{i,j\in S} = \sum_{n\ge0}C^n(\gamma). \tag{8.19}$$
(8.20) Theorem. Let $\gamma$ and $\tilde\gamma$ be two specifications. Suppose $\gamma$ satisfies Dobrushin's condition. For each $i\in S$ we let $b_i$ be a measurable function on $\Omega$ such that
$$\|\gamma_i^0(\cdot\,|\,\omega)-\tilde\gamma_i^0(\cdot\,|\,\omega)\| \le b_i(\omega)$$
for all $\omega\in\Omega$. If $\mu\in\mathcal{G}(\gamma)$ and $\tilde\mu\in\mathcal{G}(\tilde\gamma)$ then
$$|\mu(f)-\tilde\mu(f)| \le \sum_{i,j\in S}\delta_i(f)\,D_{ij}(\gamma)\,\tilde\mu(b_j)$$
for all $f\in\bar{\mathcal{L}}$.

Proof. We write $C = C(\gamma)$, $D = D(\gamma)$, and $b = (\tilde\mu(b_i))_{i\in S}$. Replacing $b_i$ by $1\wedge b_i$ if necessary we may assume that $b_i\le1$ for all $i\in S$. We need to show that the vector $Db$ is an estimate for $\mu$ and $\tilde\mu$. According to Remark (8.17)(1) the constant vector $a\equiv1$ is an estimate. An iterated application of Lemma (8.18) shows that for each $n\ge1$ the vector
$$a^{(n)} = C^na + \sum_{k=0}^{n-1}C^kb$$
is an estimate. By Remark (8.17)(3) we therefore only need to prove that $a^{(n)}$ tends to $Db$ coordinatewise as $n\to\infty$. Dobrushin's condition implies that
$$\sum_{j\in S}C^n_{ij} \le c(\gamma)^n$$
for all $n\ge0$ and $i\in S$. Thus the row sums of $D$ are at most $1/(1-c(\gamma))$. In particular, $D$ has finite entries and $Db$ exists. Also, $C^na = \big(\sum_{j\in S}C^n_{ij}\big)_{i\in S}\to0$ coordinatewise as $n\to\infty$. Thus $a^{(n)}\to Db$, and the proof is complete. □

Theorem (8.20) was stated for specifications $\gamma$ and $\tilde\gamma$ and Gibbs measures $\mu\in\mathcal{G}(\gamma)$, $\tilde\mu\in\mathcal{G}(\tilde\gamma)$. But in fact we have only worked with the "single-spin parts" $(\gamma_i)_{i\in S}$ and $(\tilde\gamma_i)_{i\in S}$ of $\gamma$ resp. $\tilde\gamma$, and we have only used the fact that $\mu\gamma_i = \mu$ and $\tilde\mu\tilde\gamma_i = \tilde\mu$ for all $i\in S$. We have thus proved a more general result than stated. In many cases, however, this generalization is only formal. This follows from Theorem (1.33). We still need to give the proof of Theorem (8.7). Its first assertion follows immediately from Theorem (8.20): Putting $\tilde\gamma = \gamma$ and $b_i = 0$ for all $i\in S$ we see that $\mu(f) = \tilde\mu(f)$ for all $f\in\bar{\mathcal{L}}$ and thus $\mu = \tilde\mu$ whenever $\mu,\tilde\mu\in\mathcal{G}(\gamma)$. The statement of existence will also follow from Theorem (8.20). It will be restated and proved in (8.23) below.
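For a finite toy system the matrix $D(\gamma)$ of (8.19) and the bound of Theorem (8.20) can be computed explicitly. The sketch below uses a hypothetical $4\times4$ interdependence matrix with nearest-neighbour entries 0.3. The matrix, the oscillation vector and the perturbation vector `b` are all made-up inputs for illustration, not values taken from the text.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dobrushin_constant(c):
    """c(gamma) = largest row sum of the interdependence matrix."""
    return max(sum(row) for row in c)

def neumann_series(c, tol=1e-14, max_terms=10_000):
    """D = sum_{n >= 0} C^n, truncated once the powers are below tol."""
    n = len(c)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(max_terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        power = mat_mul(power, c)
        if max(max(row) for row in power) < tol:
            break
    return total

def comparison_bound(osc_f, d, b):
    """Right-hand side of (8.20): sum_{i,j} delta_i(f) D_ij b_j."""
    n = len(d)
    return sum(osc_f[i] * d[i][j] * b[j] for i in range(n) for j in range(n))

# Hypothetical chain of four sites with C_ij = 0.3 for neighbours.
n_sites = 4
C = [[0.3 if abs(i - j) == 1 else 0.0 for j in range(n_sites)]
     for i in range(n_sites)]
c = dobrushin_constant(C)
D = neumann_series(C)
row_sums = [sum(row) for row in D]

# f depends on site 0 only; the two specifications differ at site 3 only.
bound = comparison_bound([1.0, 0.0, 0.0, 0.0], D, [0.0, 0.0, 0.0, 0.1])
```

As in the proof, the row sums of `D` stay below $1/(1-c)$, and the bound for a function at site 0 is governed by the single entry $D_{03}$, which is small because sites 0 and 3 are three steps apart.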
8.2
Further consequences of Dobrushin's condition
In this section we shall exploit the comparison theorem (8.20). We suppose $\gamma$ satisfies Dobrushin's condition. First we shall show that $\mathcal{G}(\gamma)$ contains a (necessarily unique) element $\mu$. Then we shall derive some properties of this $\mu$. We shall construct consistent regular versions $\gamma_V$ of the conditional probabilities $\mu(\cdot\,|\,\mathcal{T}_V)$ for each, not necessarily finite, subset $V$ of $S$, and obtain explicit estimates on their local variational distances. In particular, we shall sharpen the conclusions of Proposition (7.11)(b) by estimating the speed of convergence of $\gamma_\Lambda(\cdot\,|\,\omega)$ to $\mu$ as $\Lambda$ runs through $\mathcal{S}$, as well as the speed of convergence in the uniform mixing property (7.10) that describes the decay of correlations of $\mu$. We start with a simple lemma that deals with the conditioning of a specification relative to a fixed configuration on some part of $S$. It will be convenient to add to each specification $\gamma$ a Dirac kernel $\gamma_\emptyset$. $\gamma_\emptyset$ is a probability kernel from $\mathcal{T}_\emptyset = \mathcal{F}$ to $\mathcal{F}$ and is defined by
$$\gamma_\emptyset(A\,|\,\omega) = 1_A(\omega) \qquad (A\in\mathcal{F},\ \omega\in\Omega). \tag{8.21}$$
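(8.22) Lemma. Let $\gamma$ be a specification, and let $V\subset S$ and $\omega\in\Omega$ be given. We define probability kernels $\tilde\gamma_\Lambda$ by $\tilde\gamma_\Lambda(A\,|\,\zeta) = \gamma_{\Lambda\cap V}(A\,|\,\omega_\Lambda\zeta_{S\setminus\Lambda})$; here $\Lambda\in\mathcal{S}$, $A\in\mathcal{F}$, $\zeta\in\Omega$. Then $\gamma^{(V,\omega)} := (\tilde\gamma_\Lambda)_{\Lambda\in\mathcal{S}}$ is a specification. Moreover, the following conclusions hold. (i) If $V = S$ then $\gamma^{(V,\omega)} = \gamma$. (ii) If $V$ is finite then $\gamma_V(\cdot\,|\,\omega)\in\mathcal{G}(\gamma^{(V,\omega)})$.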
10 = 1 7AnF(d'7|toACS\A)yAnF(- |coA/jSXA).
As
n\\v = <JiK\v for y AnF ( - |coACSXA)-almost all n, the expression on the right equals 7AnK)W(- I^ACSXA) = TWO \Ü)ACS\A) = M' 10(V œ)
Thus y ' is a specification. Assertion (i) is obvious. To prove (ii) we assume that V e ^. Then for each A e ^ w e have
Uniqueness 7v(- \co)yA = J yv(dC\a>)yAnv('
|Û> A CS\A)
d
= hv( CHyAnv(-\0 = yv(-\œ). Thusy K (-|co)e^(y (K - m) )Next, if A e Sf and / e i f then 7 A / = yAf\vf(wA<7s\A) e ^ provided that y is quasilocal. This proves (iii). For the proof of (iv) we let i, j e S be two sites. If i e V then % = y; and therefore Cij(yiV,
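$\lim_{\Delta\in\mathcal{S}_V}\gamma_\Delta(\cdot\,|\,\omega)$ in the $\mathcal{L}$-topology. In fact, if $\Lambda\in\mathcal{S}$ and $\Delta\in\mathcal{S}_V$ then
$$\sup_{A\in\mathcal{F}_\Lambda,\ \omega\in\Omega}|\gamma_\Delta(A\,|\,\omega)-\gamma_V(A\,|\,\omega)| \le \sum_{i\in\Lambda,\ j\in V\setminus\Delta}D_{ij}(\gamma).$$
The expression on the right tends to zero as $\Delta$ runs through $\mathcal{S}_V$. In particular,
$$\sup_{A\in\mathcal{F}_\Lambda,\ \omega\in\Omega}|\gamma_\Delta(A\,|\,\omega)-\mu(A)| \le \sum_{i\in\Lambda,\ j\in S\setminus\Delta}D_{ij}(\gamma).$$
(iii) If $V\subset W\subset S$ then $\gamma_W\gamma_V = \gamma_W$. In particular, $\mu\gamma_V = \mu$; that is, $\gamma_V$ is a regular version of $\mu(\cdot\,|\,\mathcal{T}_V)$. (iv) $(\gamma_V)_{V\subset S}$ is quasilocal, in that $\gamma_Vf\in\bar{\mathcal{L}}$ for all $f\in\bar{\mathcal{L}}$ and $V\subset S$.

Proof. 1) Let $V\subset S$ be infinite, $\omega\in\Omega$, and $\Delta,\Delta'\in\mathcal{S}_V$ with $\Delta\subset\Delta'$ be given.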
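We consider the specifications $\tilde\gamma = \gamma^{(\Delta',\omega)}$ and $\gamma = \gamma^{(\Delta,\omega)}$. We know from (8.22) that $\gamma$ satisfies Dobrushin's condition. Also, $\gamma_\Delta(\cdot\,|\,\omega)\in\mathcal{G}(\gamma)$ and $\gamma_{\Delta'}(\cdot\,|\,\omega)\in\mathcal{G}(\tilde\gamma)$. Finally, if $j\in\Delta$ then $\tilde\gamma_j = \gamma_j$, and if $j\in S\setminus\Delta'$ then $\tilde\gamma_j^0(\cdot\,|\,\zeta) = \gamma_j^0(\cdot\,|\,\zeta) = \delta_{\omega_j}$ for all $\zeta\in\Omega$. Hence
$$\|\tilde\gamma_j^0(\cdot\,|\,\zeta)-\gamma_j^0(\cdot\,|\,\zeta)\| \le b_j(\zeta) := 1_{\Delta'\setminus\Delta}(j)$$
for all $\zeta\in\Omega$ and $j\in S$. Consequently, we have all ingredients that are necessary to apply Theorem (8.20). We obtain
$$|\gamma_\Delta(A\,|\,\omega)-\gamma_{\Delta'}(A\,|\,\omega)| \le \sum_{i\in S,\ j\in\Delta'\setminus\Delta}\delta_i(1_A)\,D_{ij}(\gamma)$$
for each cylinder event $A$. Applying (8.22)(iv) we conclude that for each $\Lambda\in\mathcal{S}$
$$\|\gamma_\Delta-\gamma_{\Delta'}\|_\Lambda := \sup_{A\in\mathcal{F}_\Lambda,\ \omega\in\Omega}|\gamma_\Delta(A\,|\,\omega)-\gamma_{\Delta'}(A\,|\,\omega)| \le \sum_{i\in\Lambda,\ j\in V\setminus\Delta}D_{ij}(\gamma).$$
As $\sum_{j\in S}D_{ij}(\gamma)<\infty$ for all $i$, the last expression tends to zero as $\Delta$ runs through $\mathcal{S}_V$. Thus the preceding estimate shows that the net $(\gamma_\Delta)_{\Delta\in\mathcal{S}_V}$ is a Cauchy net with respect to $\|\cdot\|_\Lambda$ for each $\Lambda\in\mathcal{S}$. Consequently, the limit
$$\gamma_V(A\,|\,\omega) = \lim_{\Delta\in\mathcal{S}_V}\gamma_\Delta(A\,|\,\omega)$$
exists for all cylinder events $A$ and $\omega\in\Omega$, and
$$\|\gamma_\Delta-\gamma_V\|_\Lambda \le \sum_{i\in\Lambda,\ j\in V\setminus\Delta}D_{ij}(\gamma)$$
for all $\Lambda\in\mathcal{S}$ and $\Delta\in\mathcal{S}_V$. We also conclude that $\gamma_V(\cdot\,|\,\omega)$ is a probability measure on each $\mathcal{F}_\Lambda$. As $(E,\mathcal{E})$ is standard Borel, $\gamma_V(\cdot\,|\,\omega)$ can be uniquely extended to a probability measure on $\mathcal{F}$; cf. the proof of Proposition (4.9). The resulting $\gamma_V$ is a probability kernel from $\mathcal{F}$ to $\mathcal{F}$ (hopefully from $\mathcal{T}_V$ to $\mathcal{F}$), and assertion (ii) holds. 2) We still do not know if $\gamma_V(A\,|\,\cdot)$ is $\mathcal{T}_V$-measurable for each $A\in\mathcal{F}$. Nevertheless, we proceed showing that $\gamma_V$ is proper. Let $A\in\mathcal{T}_V$ be a cylinder event. Then
$$\gamma_V(A\,|\,\cdot) = \lim_{\Delta\in\mathcal{S}_V}\gamma_\Delta(A\,|\,\cdot) = 1_A$$
because each $\gamma_\Delta$ is proper and $A\in\mathcal{T}_\Delta$ for all $\Delta\in\mathcal{S}_V$. By a monotone class argument the identity $\gamma_V(A\,|\,\cdot) = 1_A$ extends to all $A\in\mathcal{T}_V$. 3) Next we prove (iv). Let $V\subset S$ be infinite and $f\in\mathcal{L}$, say $f\in\mathcal{L}_\Lambda$ with $\Lambda\in\mathcal{S}$. By hypothesis, $\gamma_\Delta f\in\bar{\mathcal{L}}$ for all $\Delta\in\mathcal{S}_V$. From (8.1) and Step 1) we conclude that
$$\lim_{\Delta\in\mathcal{S}_V}\|\gamma_\Delta f-\gamma_Vf\| \le \delta(f)\lim_{\Delta\in\mathcal{S}_V}\|\gamma_\Delta-\gamma_V\|_\Lambda = 0.$$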
Thus $\gamma_Vf\in\bar{\mathcal{L}}$. 4) To prove (iii) we let $V\subset W$ and $f\in\mathcal{L}$. As $\gamma_Vf\in\bar{\mathcal{L}}$ and $\gamma_W(\cdot\,|\,\omega) = \lim_{\Delta\in\mathcal{S}_W}\gamma_\Delta(\cdot\,|\,\omega)$ for all $\omega$,
$$\gamma_W\gamma_Vf = \lim_{\Delta\in\mathcal{S}_W}\gamma_\Delta\gamma_Vf.$$
On the other hand, from Step 3) we know that $\lim_{\Delta\in\mathcal{S}_W}\|\gamma_{\Delta\cap V}f-\gamma_Vf\| = 0$. Thus
$$\lim_{\Delta\in\mathcal{S}_W}\gamma_\Delta\gamma_Vf = \lim_{\Delta\in\mathcal{S}_W}\gamma_\Delta\gamma_{\Delta\cap V}f = \lim_{\Delta\in\mathcal{S}_W}\gamma_\Delta f = \gamma_Wf.$$
As $f$ was arbitrary, $\gamma_W\gamma_V = \gamma_W$. The last part of (iii) follows by (1.20). 5) Turning to the proof of (i) we let $V\subset S$ and $\omega\in\Omega$ be given. As a consequence of 2) and 4) above, the argument proving Lemma (8.22)(ii) also applies when $V$ is infinite. Thus $\gamma_V(\cdot\,|\,\omega)\in\mathcal{G}(\gamma^{(V,\omega)})$. But $|\mathcal{G}(\gamma^{(V,\omega)})|\le1$ because of Lemma (8.22)(iv) and the first part of Theorem (8.7). 6) It remains to show that $\gamma_V(A\,|\,\cdot)$ is $\mathcal{T}_V$-measurable for all $A\in\mathcal{F}$ and $V\subset S$. (This does not follow from (ii) because $\mathcal{T}_V$ is strictly smaller than $\bigcap_{\Delta\in\mathcal{S}_V}\mathcal{T}_\Delta$ if $V$ is infinite.) By construction, $\gamma_V(A\,|\,\cdot)$ is $\mathcal{F}$-measurable. Therefore we only need to prove that $\gamma_V(\cdot\,|\,\zeta) = \gamma_V(\cdot\,|\,\eta)$ whenever $\zeta,\eta\in\Omega$ are such that $\zeta_{S\setminus V} = \eta_{S\setminus V}$. But for such $\zeta,\eta$ we have $\gamma^{(V,\zeta)} = \gamma^{(V,\eta)}$. Thus the result follows from assertion (i). □

The preceding theorem has several aspects. One of these is illustrated by the following example. (8.24) Example. Let $(E,\mathcal{E})$ be a standard Borel space and $(S,B)$ a simple graph as defined in (2.17). Suppose $\gamma$ is a Markovian specification. Here Markovian means that, for all $\Lambda\in\mathcal{S}$ and $A\in\mathcal{F}_\Lambda$, $\gamma_\Lambda(A\,|\,\cdot)$ is measurable with respect to $\mathcal{F}_{\partial\Lambda}$, where $\partial\Lambda := \{i\in S\setminus\Lambda:\{i,j\}\in B \text{ for some } j\in\Lambda\}$ is the boundary of $\Lambda$. If $(S,B)$ is locally finite (in that $\partial\Lambda\in\mathcal{S}$ for all $\Lambda\in\mathcal{S}$) then $\gamma$ is quasilocal. We assume that $\gamma$ even satisfies Dobrushin's condition. Then the extension $(\gamma_V)_{V\subset S}$ of $\gamma$ is still Markovian. That is, for each (not necessarily finite) $V\subset S$ and any $A\in\mathcal{F}_V$, the function $\gamma_V(A\,|\,\cdot)$ is $\mathcal{F}_{\partial V}$-measurable. To see this we may assume that $A$ is a cylinder event. Let $\zeta\in E^V$ be a fixed reference configuration. Then for all $\omega\in\Omega$ we have
$$\gamma_V(A\,|\,\omega) = \gamma_V(A\,|\,\zeta\omega_{S\setminus V}) = \lim_{\Delta\in\mathcal{S}_V}\gamma_\Delta(A\,|\,\zeta\omega_{S\setminus V})$$
because $\gamma_V(A\,|\,\cdot)$ is $\mathcal{T}_V$-measurable. As $\partial\Delta\setminus V\subset\partial V$ for all $\Delta\in\mathcal{S}_V$, the Markov property of the $\gamma_\Delta$'s implies that the last expression is an $\mathcal{F}_{\partial V}$-measurable function of $\omega$. Thus we have proved that $\gamma_V$ is also Markovian. Consider now the unique element $\mu$ of $\mathcal{G}(\gamma)$. If $V\subset S$ and $A\in\mathcal{F}_V$ then by (8.23)(iii) $\mu(A\,|\,\mathcal{F}_{\partial V}) = \mu(\gamma_V(A\,|\,\cdot)\,|\,\mathcal{F}_{\partial V}) = \gamma_V(A\,|\,\cdot)$ $\mu$-a.s. and therefore
$$\mu(A\,|\,\mathcal{T}_V) = \mu(A\,|\,\mathcal{F}_{\partial V}) \quad \mu\text{-a.s.}$$
This property of $\mu$ is called the global Markov property, as opposed to the local Markov property which states that $\mu(A\,|\,\mathcal{T}_\Lambda) = \mu(A\,|\,\mathcal{F}_{\partial\Lambda})$ $\mu$-a.s. whenever $A\in\mathcal{F}_\Lambda$ and $\Lambda\subset S$ is finite. We have thus shown that under Dobrushin's condition the local Markov property of $\mu$ (which holds by assumption) extends to the global Markov property. (Incidentally, a random field with the local Markov property is called a Markov field.) ∘

A second aspect of Theorem (8.23) is the final estimate in assertion (ii). It shows that $\mu = \lim_{\Delta\in\mathcal{S}}\gamma_\Delta(\cdot\,|\,\omega)$ uniformly in $\omega$. (Under different hypotheses, the same conclusion was obtained in Proposition (7.11)(b).) In addition, it provides an estimate of the speed of this convergence by means of the quantities
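$$D(\Lambda,\Delta,\gamma) := \sum_{i\in\Lambda,\ j\in S\setminus\Delta}D_{ij}(\gamma). \tag{8.25}$$
We are thus led to ask how fast $D(\Lambda,\Delta,\gamma)$ decays to zero when $\Lambda$ is fixed and $\Delta$ runs through $\mathcal{S}$. We will show how the rate of this decay can be estimated in terms of the decay properties of the interdependence matrix $C(\gamma)$. (8.26) Remark. Let $\gamma$ be a specification. Suppose there is a semimetric $s$ on $S$ such that
$$c_s(\gamma) := \sup_{i\in S}\sum_{j\in S}C_{ij}(\gamma)\,e^{s(i,j)} < 1.$$
Then for all $\Lambda,\Delta\in\mathcal{S}$ we have
$$D(\Lambda,\Delta,\gamma) \le |\Lambda|\,(1-c_s(\gamma))^{-1}\exp[-s(\Lambda,S\setminus\Delta)],$$
where $s(\Lambda,S\setminus\Delta) = \min_{i\in\Lambda,\ j\in S\setminus\Delta}s(i,j)$ is the $s$-distance of $\Lambda$ and $S\setminus\Delta$. The condition $c_s(\gamma)<1$ is satisfied whenever $\gamma$ is Gibbsian for some potential $\Phi$ such that
$$\sup_{i\in S}\sum_{A\ni i}e^{s(A)}(|A|-1)\,\delta(\Phi_A) < 2.$$
Here $s(A) = \max_{i,j\in A}s(i,j)$ is the $s$-diameter of $A\in\mathcal{S}$.

Proof. We start noting that
$$\sup_{i\in S}\sum_{j\in S}e^{s(i,j)}D_{ij}(\gamma) \le 1/(1-c_s(\gamma)). \tag{8.27}$$
Indeed, using the definition (8.19) of $D(\gamma)$ and the triangle inequality for $s$ we see that for each $i_0\in S$
$$\sum_{j\in S}e^{s(i_0,j)}D_{i_0j}(\gamma) \le 1+\sum_{n\ge1}\ \sum_{i_1\in S}\cdots\sum_{i_n\in S}\ \prod_{k=1}^{n}e^{s(i_{k-1},i_k)}C_{i_{k-1}i_k}(\gamma) \le \sum_{n\ge0}c_s(\gamma)^n = 1/(1-c_s(\gamma)).$$
Thus for each $\Lambda,\Delta\in\mathcal{S}$ we have
$$e^{s(\Lambda,S\setminus\Delta)}D(\Lambda,\Delta,\gamma) \le \sum_{i\in\Lambda,\ j\in S\setminus\Delta}e^{s(i,j)}D_{ij}(\gamma) \le |\Lambda|/(1-c_s(\gamma)).$$
The second statement is obtained by a slight modification of Proposition (8.8). There we have shown that
$$C_{ij}(\gamma^\Phi) \le \tfrac12\sum_{A\supset\{i,j\}}\delta(\Phi_A)$$
for all $i\ne j$. Thus
$$\sum_{j\in S}C_{ij}(\gamma^\Phi)\,e^{s(i,j)} \le \tfrac12\sum_{A\ni i}e^{s(A)}(|A|-1)\,\delta(\Phi_A)$$
for all $i\in S$.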
D
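The weighted estimate (8.27) and the resulting decay bound of Remark (8.26) can be observed on a finite toy example. The sketch below assumes a hypothetical chain of six sites with nearest-neighbour entries 0.2 and the metric $s(i,j) = 0.5\,|i-j|$. These numbers and all names are illustrative choices only, and the series (8.19) is simply truncated after a fixed number of terms.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_series(c, terms=200):
    """Truncated series D = sum_{n=0}^{terms} C^n."""
    n = len(c)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in power]
    for _ in range(terms):
        power = mat_mul(power, c)
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
    return total

def weighted_row_sup(m, s):
    """sup_i sum_j m_ij * exp(s(i, j))."""
    n = len(m)
    return max(sum(m[i][j] * math.exp(s(i, j)) for j in range(n))
               for i in range(n))

def s(i, j):
    """Hypothetical metric on the chain: s(i, j) = 0.5 * |i - j|."""
    return 0.5 * abs(i - j)

n_sites = 6
C = [[0.2 if abs(i - j) == 1 else 0.0 for j in range(n_sites)]
     for i in range(n_sites)]
D = neumann_series(C)

c_s = weighted_row_sup(C, s)            # c_s(gamma), about 0.66 here
weighted_D = weighted_row_sup(D, s)     # left side of (8.27)

# D(Lambda, Delta, gamma) for Lambda = {0}, Delta = {0, 1, 2}.
inside, outside = [0], [3, 4, 5]
d_far = sum(D[i][j] for i in inside for j in outside)
s_dist = min(s(i, j) for i in inside for j in outside)
decay_bound = len(inside) * math.exp(-s_dist) / (1 - c_s)
```

Here `weighted_D` should not exceed `1 / (1 - c_s)` and `d_far` should not exceed `decay_bound`, matching (8.27) and the first assertion of Remark (8.26). The argument of the proof applies verbatim to any finite non-negative matrix.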
(8.28) Examples. Let $S = \mathbb{Z}^d$ for some $d\ge1$, and let $\gamma$ be a shift-invariant specification which satisfies Dobrushin's condition. (1) Exponential decay. Suppose, in addition, that
$$\sum_{j\in S}e^{t|j|}C_{0j}(\gamma) < \infty \tag{8.29}$$
for some $t>0$. For example, this holds when $\gamma$ has finite range in that $C_{0j}(\gamma) = 0$ for all but finitely many $j$. Condition (8.29) also holds when $\gamma = \gamma^\Phi$ for some shift-invariant potential $\Phi$ which satisfies
$$\sum_{A\ni0}e^{t\,\mathrm{diam}\,A}(|A|-1)\,\delta(\Phi_A) < \infty;$$
here $\mathrm{diam}\,A$ is the Euclidean diameter of $A$. Under (8.29), $D(\Lambda,\Delta,\gamma)\to0$ exponentially as $\Delta$ runs through $\mathcal{S}$. More precisely, there are constants $0<c,C<\infty$ such that
$$D(\Lambda,\Delta,\gamma) \le C\,|\Lambda|\exp[-c\,d(\Lambda,S\setminus\Delta)]$$
for all $\Lambda,\Delta\in\mathcal{S}$; $d(\Lambda,S\setminus\Delta)$ is the Euclidean distance of $\Lambda$ and $S\setminus\Delta$. Indeed, combining (8.29) and Dobrushin's condition we may find a constant $0<c<t$ such that
$$C^{-1} := 1-\sum_{j\in S}e^{c|j|}C_{0j}(\gamma) > 0.$$
Putting $s(i,j) = c|i-j|$ in Remark (8.26) we obtain the result. (2) Power decay. Suppose, instead of (8.29), that there is a power $p>0$ such that
$$\sum_{j\in S}|j|^pC_{0j}(\gamma) < \infty. \tag{8.30}$$
This holds, for instance, if $\gamma = \gamma^\Phi$ for some shift-invariant potential $\Phi$ with
$$\sum_{A\ni0}(\mathrm{diam}\,A)^p(|A|-1)\,\delta(\Phi_A) < \infty.$$
By the dominated convergence theorem, (8.30) and Dobrushin's condition imply that
$$\sum_{j\in S}e^{c|j|\wedge p\log(1+|j|)}C_{0j}(\gamma) < 1$$
for some $c>0$. Thus $c_s(\gamma)<1$ for the metric $s(i,j) = c|i-j|\wedge p\log(1+|i-j|)$. As $s(i,j)-p\log|i-j|$ is bounded from below, we conclude from Remark (8.26) that there is a constant $0<C<\infty$ such that
$$D(\Lambda,\Delta,\gamma) \le C\,|\Lambda|\,d(\Lambda,S\setminus\Delta)^{-p}$$
for all $\Lambda,\Delta\in\mathcal{S}$ with $\Lambda\subset\Delta$. Thus $D(\Lambda,\cdot,\gamma)$ has a power decay with the same power $p$. ∘

As a third consequence of Theorem (8.23) we obtain detailed information on the decay of correlations of the unique $\mu\in\mathcal{G}(\gamma)$. We know from Proposition (7.11)(b) that, under some hypotheses, $\mu$ exhibits the uniform mixing property (7.10). Roughly speaking, condition (7.10) states that an event in a finite region $\Lambda$ of $S$ is almost independent of all events that occur sufficiently far apart from $\Lambda$. The degree of dependence may be described by the so-called uniform mixing coefficients
$$\varphi(\Lambda,\Delta,\mu) = \sup_{A\in\mathcal{F}_\Lambda,\ B\in\mathcal{T}_\Delta,\ \mu(B)>0}|\mu(A\,|\,B)-\mu(A)| \tag{8.31}$$
of $\mu$; here $\Lambda,\Delta\in\mathcal{S}$. Now we are able to improve (7.10) by finding upper bounds on these coefficients: Under Dobrushin's condition, $\varphi(\Lambda,\Delta,\mu)\le D(\Lambda,\Delta,\gamma)$ for all $\Lambda,\Delta\in\mathcal{S}$. In particular, the preceding Remark (8.26) and Examples (8.28) provide us with estimates on the decay of $\varphi(\Lambda,\Delta,\mu)$ as $\Delta$ tends to infinity. Such estimates are especially useful for proving central limit theorems; see the Bibliographical Notes. (8.32) Corollary. Suppose $\gamma$ satisfies Dobrushin's condition, and let $\mathcal{G}(\gamma) = \{\mu\}$. Then $\varphi(\Lambda,\Delta,\mu)\le D(\Lambda,\Delta,\gamma)$ for all $\Lambda,\Delta\in\mathcal{S}$.

Proof. Let $\Lambda,\Delta\in\mathcal{S}$. If $A\in\mathcal{F}_\Lambda$ and $B\in\mathcal{T}_\Delta$ with $\mu(B)>0$ then
$$\mu(A\,|\,B)-\mu(A) = \mu(B)^{-1}\int_B\mu(d\omega)\,[\gamma_\Delta(A\,|\,\omega)-\mu(A)].$$
Thus $|\mu(A\,|\,B)-\mu(A)| \le \sup_{\omega\in\Omega}|\gamma_\Delta(A\,|\,\omega)-\mu(A)|$, and the result follows from Theorem (8.23)(ii). (In Theorem (8.23) $(E,\mathcal{E})$ was assumed to be standard Borel. But this hypothesis was only used to prove the existence of $\mu$ and the $\gamma_V$'s. Here we may dispense with the standard Borel property because $\mu$ exists by assumption.) □

We conclude this section looking at the covariances $\mu(fg)-\mu(f)\mu(g)$ of bounded measurable functions $f$ and $g$. We first note that these covariances may be estimated in terms of the uniform mixing coefficients provided $f$ is local and $g$ only depends on all spins outside some finite region, or vice versa. Indeed, if $\Lambda,\Delta\in\mathcal{S}$ and $f$ is $\mathcal{F}_\Lambda$-measurable and $g$ is $\mathcal{T}_\Delta$-measurable then
$$|\mu(fg)-\mu(f)\mu(g)| \le \varphi(\Lambda,\Delta,\mu)\,\delta(f)\,\delta(g)/2. \tag{8.33}$$
To see this we may assume that $0\le f\le1$ and $0\le g\le1$. This is because (8.33) is invariant under affine transformations of $f$ and $g$. We can also assume that $\mu(g)\le1/2$; otherwise we replace $g$ by $1-g$. Then Fubini's theorem gives
$$|\mu(fg)-\mu(f)\mu(g)| \le \int_0^1dt\int_0^1ds\,|\mu(f\ge t,\,g\ge s)-\mu(f\ge t)\,\mu(g\ge s)| \le \varphi(\Lambda,\Delta,\mu)\int_0^1ds\,\mu(g\ge s) = \varphi(\Lambda,\Delta,\mu)\,\mu(g) \le \varphi(\Lambda,\Delta,\mu)\,\delta(f)\,\delta(g)/2.$$
Combining (8.33), Corollary (8.32), and Remark (8.26) we obtain covariance estimates for certain pairs of functions $f,g$. The next proposition deals with pairs of quasilocal functions. (8.34) Proposition. Suppose $\gamma$ satisfies Dobrushin's condition, and let $\mathcal{G}(\gamma) = \{\mu\}$. Then for all $f,g\in\bar{\mathcal{L}}$ we have
\Kfg) - Kf)p(9)\ ^ Z
WWyWg).
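Proof. Rescaling $g$ if necessary we may assume that $g > 0$ and $\mu(g) = 1$. We define a specification $\tilde\gamma$ by $\tilde\gamma_\Lambda = h_\Lambda \gamma_\Lambda$, where $\Lambda \in \mathcal{S}$ and $h_\Lambda = g/\gamma_\Lambda g$. $\tilde\gamma$ is easily seen to be a specification. (In fact we may dispense with checking the consistency of $\tilde\gamma$ because we shall only need its single-spin part $(\tilde\gamma_i^0)_{i \in S}$.) We put $\tilde\mu = g\mu$. Then $\tilde\mu \in \mathcal{G}(\tilde\gamma)$. For if $\Lambda \in \mathcal{S}$ and $f$ is a bounded measurable function then
$\tilde\mu\tilde\gamma_\Lambda(f) = \mu(g\,\tilde\gamma_\Lambda(f)) = \mu\gamma_\Lambda(g\,\gamma_\Lambda(fg)/\gamma_\Lambda g) = \mu\gamma_\Lambda(fg) = \mu(fg) = \tilde\mu(f).$
Thus Theorem (8.20) applies, giving the estimate
$|\mu(fg) - \mu(f)\mu(g)| = |\tilde\mu(f) - \mu(f)| \le \sum_{i,j \in S} \delta_i(f)\, D_{ij}(\gamma)\, \tilde\mu(b_j),$
where $b_j$ is any measurable upper bound of the function $\omega \to \|\gamma_j^0(\cdot\,|\omega) - \tilde\gamma_j^0(\cdot\,|\omega)\|$. We claim that $b_j = \delta_j(g)/4\gamma_j g$ is such an upper bound. This will complete the proof because $\tilde\mu(b_j) = \delta_j(g)\,\mu(g/4\gamma_j g) = \delta_j(g)\,\mu\gamma_j(g/4\gamma_j g) = \delta_j(g)/4$. We fix any $j \in S$ and $\omega \in \Omega$, and we write $\alpha = \gamma_j^0(\cdot\,|\omega)$ and $\tilde\gamma_j^0(\cdot\,|\omega) = u\alpha$, where $u(x) = g(x\,\omega_{S \setminus \{j\}})/\gamma_j g(\omega)$, $x \in E$. We also put $m = (\sup u + \inf u)/2$. Then we obtain, using (8.1),
$\|u\alpha - \alpha\| = \alpha(|u - 1|)/2 \le \alpha((u - \alpha(u))^2)^{1/2}/2 \le \alpha((u - m)^2)^{1/2}/2 \le \|u - m\|/2 \le \delta(u)/4 \le b_j(\omega).$
The proposition is thus proved. □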
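The last chain of inequalities in the proof can be checked numerically on a finite state space. The following Python sketch is only an illustration under stated assumptions: $E$ is taken to be finite, the probability vector `alpha` stands in for $\gamma_j^0(\cdot\,|\omega)$, and `u` is a positive density normalised so that $\alpha(u) = 1$. All function names are hypothetical and do not come from the text. The sketch verifies that the total variation distance $\alpha(|u - 1|)/2$ never exceeds $\delta(u)/4$.

```python
import random


def tv_distance_to_tilted(alpha, u):
    """Total variation distance ||u*alpha - alpha|| = alpha(|u - 1|) / 2."""
    return sum(a * abs(ux - 1.0) for a, ux in zip(alpha, u)) / 2.0


def random_case(rng, size):
    """Random probability vector alpha and positive density u with alpha(u) = 1."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    alpha = [w / total for w in weights]
    g = [5.0 * rng.random() + 1e-3 for _ in range(size)]
    mean_g = sum(a * gx for a, gx in zip(alpha, g))
    u = [gx / mean_g for gx in g]  # normalisation: alpha(u) = 1
    return alpha, u


def check_bound(trials=2000, seed=0):
    """Return the largest observed ratio of ||u*alpha - alpha|| to delta(u)/4."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha, u = random_case(rng, rng.randint(2, 6))
        lhs = tv_distance_to_tilted(alpha, u)
        rhs = (max(u) - min(u)) / 4.0  # delta(u) / 4
        assert lhs <= rhs + 1e-12
        if rhs > 0.0:
            worst = max(worst, lhs / rhs)
    return worst


if __name__ == "__main__":
    print("largest observed ratio:", check_bound())
```

The two-point case `alpha = [0.5, 0.5]`, `u = [1.5, 0.5]` attains equality, which suggests that the constant 1/4 cannot be improved by this argument.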
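As a first application of the preceding proposition we state a counterpart of Remark (8.26). (8.35) Remark. Let $S = \mathbb{Z}^d$ for some $d \ge 1$ and $s$ a shift-invariant semimetric on $S$. Suppose $\gamma$ is a quasilocal specification such that $c_s(\gamma) \triangleq \sup_{i \in S} \sum_{j \in S} C_{ij}(\gamma)\, e^{s(i,j)} < 1$, and let $\mathcal{G}(\gamma) = \{\mu\}$. Then for all $f, g \in \overline{\mathcal{L}}$ we have
$\sum_{i \in S} |\mu(f\, g \circ \theta_i) - \mu(f)\mu(g \circ \theta_i)|\, e^{s(0,i)} \le \frac{\delta_s(f)\,\delta_s(g)}{4(1 - c_s(\gamma))}.$ Here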
Proof. The properties of s imply that s(0,/c) = s(0, -k)^
s(0,i) + s(i,j) + s(0,./ + fc)
for all i, j , ke S. Proposition (8.34) thus shows that \ß(fg°0k)-ß(f)ß(g°0k)\es(O'k)
Z SeS
Summing over k, j , and i in that order and using (8.27) we obtain the result, a Needless to say, the considerations of Example (8.28) apply to the preceding remark as well. The details are left to the reader. As a second application of Proposition (8.34), we establish a smoothness property of the unique Gibbs measure for a potential ® which satisfies Dobrushin's condition. For simplicity we confine ourselves to a homogeneous setting. Suppose that S = Zd. We consider the Banach space M& of all shiftinvariant potentials ® which are such that (8.36)
$|||\Phi||| \triangleq \sum_{A \ni 0} |A|\, \|\Phi_A\|$
is finite. We put @ = {® e $B: \\\®\\\ < 1}. Clearly, each ®e 3 satisfies the hypothesis of Proposition (8.8). (8.37) Corollary. Suppose that S = Zd and I e J((E, ê) is finite. For each ® e 3> let ßgf be the unique Gibbs measure for ® and X. Also, let g e ^ be such that Z ôi(g) < oo. Then the function ® -> /%(#) on 3) is continuously differentiable. ieS
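More precisely, for each $\Phi \in \mathcal{D}$ and each $\Psi$ in the Banach space defined by (8.36) we have
$\frac{d}{dt}\, \mu^{\Phi + t\Psi}(g) \Big|_{t=0} = -\sum_{i \in S} \left[ \mu^{\Phi}(g\, f_\Psi \circ \theta_i) - \mu^{\Phi}(g)\, \mu^{\Phi}(f_\Psi \circ \theta_i) \right].$
Here $f_\Psi = \sum_{A \ni 0} |A|^{-1} \Psi_A$.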
Proof. We fix any O e ®, 4* e l 0 , and œ e Q. For each t e R we let
- yt+tv{f\
and
sup Z sup q/y™) < l. ieS
j e S \t\ S t 0
Therefore, if Dtj = sup Dii(y9+ev) then sup Z A; < °o|r|£r0
ieS
jeS
Turning to the main part of the proof, we conclude from Theorem (8.23)(ii) that /VH4-(0)
= lim yA +,4, (0M
for all t with \t\ ^ t0. On the other hand, it is easily checked that
whenever \t\ ^ t0. It is therefore sufficient to show that
supK/tf,^-!
sup
\
£
|»|£»o AeyU{S},A=>A
keA\A
In view of Lemma (8.22)(ii) we can apply Proposition (8.34) to estimate T±. Since *J(HJ\
I fv°e-k) = ôj( £ keA
/
\A\A\\A\-^A
V^nA^O
S2
£
TOI,
A3J:A\A*Q
we obtain that
The inner sum tends to zero as A runs through y . The dominated convergence theorem thus shows that TY -> 0 as A -> S. To estimate T2 we observe that fv°6_k e S for all keA. Theorem (8.23)(ii) (together with a version of equation (8.1)) thus ensures that T2 -> 0 as A -> S. As for the term T3, we can again use Lemma (8.22) and Proposition (8.34). We obtain the estimate £•
Since
i,jeS,keS\A
Z ^ ) ^ Z keS
keS
I
i^r1ii^ii = 2ii^ii0<œ,
A=>{0,k}
we conclude that T3 can be made arbitrarily small when A is chosen sufficiently large. The proof is thus complete, D
8.3 Uniqueness in one dimension
A main advantage of Dobrushin's uniqueness condition (8.6) is its universality: It is not supposed that the parameter set $S$ has any specific structure. Here we shall prove a more particular uniqueness result. We shall assume that $S$ has a chain structure. That is, we take $S = \mathbb{Z}$, the integers, or $S = \mathbb{N}$, the natural numbers. We shall also confine ourselves to the case of Gibbsian specifications. But then we shall see that Dobrushin's condition of small total interaction can be replaced by a condition on the decay of the interaction. In contrast to the former the latter condition remains unaffected when the potential is multiplied by a scalar factor $\beta$. (Note that a uniqueness condition for general $S$ can never have this property. This is exemplified by the phase transition in the two-dimensional Ising model.) Whilst Dobrushin's uniqueness theorem (8.7) relied on a general contraction argument we shall now use a special feature of one-dimensional systems. $\mathbb{Z}$ and $\mathbb{N}$ may be exhausted by intervals, and intervals have but a bounded number of boundary sites. This fact will allow us to find a uniform bound on the oscillation of energy in arbitrarily large regions when the boundary condition varies. The usefulness of such a uniform bound is demonstrated by the following proposition. The hypothesis of this proposition is similar to the concept of macroscopic equivalence which has been studied in Section 7.4. In fact, these conditions are both designed for implying a (uniform) mutual absolute continuity of Gibbs measures. A further condition of this type will appear in (9.1).
(8.38) Proposition. Let $\gamma$ be any specification. Suppose there exists a constant $c > 0$ with the following property. For every cylinder event $A \in \mathcal{F}$ there is a set $\Lambda \in \mathcal{S}$ such that $\gamma_\Lambda(A|\zeta) \ge c\, \gamma_\Lambda(A|\eta)$ for all $\zeta, \eta \in \Omega$. Then $|\mathcal{G}(\gamma)| \le 1$.
Proof. Suppose $\mathcal{G}(\gamma)$ contains two distinct elements $\mu$ and $\nu$. Then $(\mu + \nu)/2 \in \mathcal{G}(\gamma) \setminus \mathrm{ex}\, \mathcal{G}(\gamma)$. Therefore it is sufficient to show that $\mathcal{G}(\gamma) \subset \mathrm{ex}\, \mathcal{G}(\gamma)$. Let $\mu \in \mathcal{G}(\gamma)$, $B \in \mathcal{T}$, and suppose $\mu(B) > 0$. We put $\nu = \mu(\cdot\,|B)$. By Theorem (7.7)(b), $\nu \in \mathcal{G}(\gamma)$. Our hypothesis implies that $\nu \ge c\mu$. Indeed, let $A$ be any cylinder event. Then we have for some suitable $\Lambda \in \mathcal{S}$
$\nu(A) = \nu\gamma_\Lambda(A) = \int \nu(d\zeta) \int \mu(d\eta)\, \gamma_\Lambda(A|\zeta) \ge c \int \nu(d\zeta) \int \mu(d\eta)\, \gamma_\Lambda(A|\eta) = c\,\mu(A).$
Using the monotone class theorem we may conclude that $\nu \ge c\mu$ on all of $\mathcal{F}$. In particular, $0 = \nu(\Omega \setminus B) \ge c\,\mu(\Omega \setminus B)$ and therefore $\mu(B) = 1$. Consequently, $\mu$ is trivial on $\mathcal{T}$. Hence Theorem (7.7)(a) ensures that $\mu$ is extreme. □
We now state the uniqueness theorem for Gibbs specifications in one dimension. As before, $\delta(\Phi_A)$ denotes the oscillation of $\Phi_A$.
(8.39) Theorem. Let $S = \mathbb{Z}$ or $\mathbb{N}$, $(E, \mathcal{E})$ be any state space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ an a priori measure, and $\Phi$ a $\lambda$-admissible potential such that
(8.40) $\sup_{i \in S} \sum_{A \in \mathcal{S}:\ \min A \le i < \max A} \delta(\Phi_A) < \infty.$
Then $|\mathcal{G}(\Phi)| \le 1$. If $(E, \mathcal{E})$ is standard Borel then $|\mathcal{G}(\Phi)| = 1$.
Proof. 1) To prove the first statement we shall check that $\gamma^\Phi$ satisfies the hypothesis of Proposition (8.38). We let $s$ denote the finite supremum in (8.40). We consider an interval $\Lambda$ in $S$ of the form $\Lambda = \,]-n, n] \cap S$, $n \ge 1$. For all $\zeta, \eta \in \Omega$ and $\omega \in E^\Lambda$ we have
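$|H_\Lambda^\Phi(\omega\zeta_{S \setminus \Lambda}) - H_\Lambda^\Phi(\omega\eta_{S \setminus \Lambda})| \le \sum_{A \cap \Lambda \ne \emptyset,\ A \setminus \Lambda \ne \emptyset} \delta(\Phi_A) \le \sum_{k = \pm n}\ \sum_{\min A \le k < \max A} \delta(\Phi_A) \le 2s.$
(The term with $k = -n$ does not appear when $S = \mathbb{N}$.) Thus $e^{-2s} h_\Lambda^\Phi(\omega\eta_{S \setminus \Lambda}) \le h_\Lambda^\Phi(\omega\zeta_{S \setminus \Lambda}) \le e^{2s} h_\Lambda^\Phi(\omega\eta_{S \setminus \Lambda})$. Integrating over $\omega$ with respect to $\lambda^\Lambda$ we obtain $Z_\Lambda^\Phi(\zeta) \le e^{2s} Z_\Lambda^\Phi(\eta)$ and $\lambda_\Lambda(1_A h_\Lambda^\Phi)(\zeta) \ge e^{-2s} \lambda_\Lambda(1_A h_\Lambda^\Phi)(\eta)$ for all $A \in \mathcal{F}_\Lambda$. Consequently, putting $c = e^{-4s}$ we have
$\gamma_\Lambda^\Phi(A|\zeta) \ge c\, \gamma_\Lambda^\Phi(A|\eta)$
whenever $A \in \mathcal{F}_\Lambda$ and $\zeta, \eta \in \Omega$. But for each cylinder event $A$ there is an interval $\Lambda$ with $A \in \mathcal{F}_\Lambda$. The hypothesis of Proposition (8.38) is thus verified. Consequently, $|\mathcal{G}(\Phi)| \le 1$. 2) To prove the second statement we assume that $(E, \mathcal{E})$ is standard Borel.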
Passing to an equivalent potential if necessary we can assume that \\<J>A\\ = ô(%)/2 for all ,4 e 5^ with \A\ > 1. Condition (8.40) then implies that |H% — YJ ®{i}\\ < °° for all A e 5^. Therefore the /l-admissibility of
means that >l(exp( —O^)) < oo for all i. Suppose for a moment that O^ does not depend on i, in that 0{j} = cp o at for all i e S and some measurable cp: E -* U. y® then coincides with the Gibbs specification for the potential Ô = (^{\A\>i}(^A)Ae^ ar*d the finite a priori measure I = e_<"/l; cf. (1.28)(3) and (2.18). By (8.40), $ e f . Thus Theorem (4.23) guarantees that ^(
£
diamA 0(®A) < oo.
Here diam /I = max A — min A is the diameter of A. Indeed, the sum in (8.40) takes separate account of diam A translates of each set A, whereas in (8.42) one of these translates is singled out by the requirement min A = 0. By similar reasoning, (8.42) is equivalent to the condition „ diam A AsO
I-A |
(2) Let S = Z, and suppose O is a shift-invariant pair potential of the form
ttA = {i,j},i¥=j, otherwise,
where q>: E x E -> M is measurable, symmetric, and bounded. Then the expression in (8.42) takes the form ô(q>) £ n1"". nil
Thus
Theorem (8.39) provides a second proof of the uniqueness statement in Theorem (3.5). On the other hand, if $\Phi_{\{0,1\}}$ is unbounded then $\Phi$ may exhibit a phase transition; see Chapters 11 and 13. (4) The uniqueness result in Comment (6.7)(2) is not covered by Theorem (8.39). This does not detract from Proposition (8.38). In fact, looking at (6.6) we easily see that the hypothesis of Proposition (8.38) is satisfied when
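As an illustration of Theorem (8.39) and of the constant $c = e^{-4s}$ from its proof, the following Python sketch checks the hypothesis of Proposition (8.38) by brute force in a small case. The setting is an assumption made for the example, not taken from the text: a nearest-neighbour Ising chain with $E = \{-1, 1\}$, counting measure as a priori measure, and $\Phi_{\{i,i+1\}} = -J\sigma_i\sigma_{i+1}$, so that $\delta(\Phi_{\{i,i+1\}}) = 2J$ and the supremum in (8.40) is $s = 2J$. The function names are hypothetical. Because both sides of the inequality are additive in $A$, it suffices to compare single configurations.

```python
import itertools
import math


def gibbs_distribution(n, coupling, left, right):
    """Finite-volume Gibbs distribution on Lambda = {-n+1, ..., n}.

    The boundary condition enters only through the spins `left` (site -n)
    and `right` (site n+1), because the interaction is nearest-neighbour.
    """
    weights = {}
    for config in itertools.product((-1, 1), repeat=2 * n):
        chain = (left,) + config + (right,)
        energy = -coupling * sum(a * b for a, b in zip(chain, chain[1:]))
        weights[config] = math.exp(-energy)
    partition = sum(weights.values())
    return {config: w / partition for config, w in weights.items()}


def worst_ratio(n, coupling):
    """Smallest ratio gamma(omega | zeta) / gamma(omega | eta) over all
    configurations omega in Lambda and all pairs of boundary conditions."""
    boundaries = list(itertools.product((-1, 1), repeat=2))
    dists = {b: gibbs_distribution(n, coupling, *b) for b in boundaries}
    return min(
        dists[zeta][config] / dists[eta][config]
        for zeta in boundaries
        for eta in boundaries
        for config in dists[zeta]
    )


def centre_spin_gap(n, coupling):
    """Difference of the mean spin at site 0 under all-plus and all-minus
    boundary conditions."""
    plus = gibbs_distribution(n, coupling, 1, 1)
    minus = gibbs_distribution(n, coupling, -1, -1)
    origin = n - 1  # tuple index of site 0, since index k is site -n+1+k

    def mean(dist):
        return sum(config[origin] * p for config, p in dist.items())

    return mean(plus) - mean(minus)


if __name__ == "__main__":
    J = 0.7
    for n in (1, 2, 3, 4):
        # The first number should stay above the second, uniformly in n.
        print(n, worst_ratio(n, J), math.exp(-8 * J), centre_spin_gap(n, J))
```

In this sketch the lower bound $e^{-8J}$ does not depend on $n$, which is the uniformity that Proposition (8.38) requires, while the influence of the boundary condition on the spin at the origin shrinks as the interval grows.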
Chapter 9 Absence of symmetry breaking. Non-existence
As we have pointed out in Chapter 5, the phenomenon of symmetry breaking is an important special type of phase transition. In fact, the discussion of this phenomenon is one of the main themes of this book. Three examples were already presented in Chapter 6, and quite a number of further examples will be provided later on. Here we shall deal with the converse problem, the conservation of symmetries. More precisely, we shall ask for conditions under which a given symmetry $\tau$ of a specification $\gamma$ preserves each element $\mu$ of $\mathcal{G}(\gamma)$. By (5.11), this is trivially true when $|\mathcal{G}(\gamma)| = 1$. Therefore we shall only be interested in conditions which are weak enough to cover at least some cases of phase transitions. For example, we want to consider specifications $\gamma$ with two (or more) symmetries, one of which is broken whilst we will show that the other remains unbroken. We shall distinguish between two kinds of symmetry: discrete symmetry and continuous symmetry. A symmetry $\tau$ of $\gamma$ is said to be continuous if $\tau$ belongs to the image of a homomorphism from the additive group $\mathbb{R}$ into the symmetry group of $\gamma$. Otherwise $\tau$ is said to be discrete. (For example, if the symmetry group of $\gamma$ is isomorphic to $(\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{R}/\mathbb{Z})$ then the transformations that correspond to an element of $\{0\} \times \mathbb{R}/\mathbb{Z}$ are continuous symmetries whereas all other symmetries are discrete.) It is not surprising that it is much more difficult to break a continuous symmetry than to break a discrete symmetry. The former requires a slower decay of interaction or a higher spatial dimension than the latter. Suppose, for example, that $S = \mathbb{Z}^d$ and $\gamma$ is Gibbsian for some shift-invariant bounded nearest-neighbour potential. If $d = 1$ then $|\mathcal{G}(\gamma)| = 1$, by Theorem (8.39). In particular, there is no symmetry breaking in one dimension. Two dimensions are sufficient for a discrete symmetry breaking in bounded nearest-neighbour systems. This is exemplified by the two-dimensional Ising model of Section 6.2. A continuous symmetry breaking is possible in three or more dimensions, as will be shown in Chapter 20. What about the breaking of continuous symmetries in two dimensions? One of the main results of this chapter states that this is impossible, at least when we impose some natural conditions. We shall confine ourselves to Gibbsian specifications. The underlying parameter space $S$ will be assumed to be either one-dimensional or the square lattice $\mathbb{Z}^2$. In the first case we shall look at single symmetries $\tau$ such as spatial translations and pure spin transformations, regardless of whether the latter are discrete or continuous. $\tau$ will be shown to remain unbroken, provided the potential satisfies a mild decay condition in terms of $\tau$. The case $S = \mathbb{Z}^2$ will be treated in Section 9.2. In this case a discrete symmetry breaking occurs even in simple models. Therefore we shall only consider continuous symmetries. Accordingly, we shall look at (possibly periodic) one-parameter families $(\tau^t)_{t \in \mathbb{R}}$ of pure spin transformations. A breaking of such a family of symmetries will turn out to be impossible for any potential that depends smoothly on $(\tau^t)_{t \in \mathbb{R}}$ and satisfies a decay condition. As a by-product, the preceding results on the absence of symmetry breaking also imply the non-existence of Gibbs measures for certain potentials $\Phi$. Indeed, suppose $\tau \in T$ is dissipative. Roughly speaking, this means that under the action of $\tau$ each point eventually tends towards some sort of infinity. Such a $\tau$ does not admit an invariant probability measure. Consequently, if $\tau$ can be shown to preserve each $\mu \in \mathcal{G}(\Phi)$ then $\mathcal{G}(\Phi)$ must be empty.
9.1 Discrete symmetries in one dimension
As a starting point for all the results of this chapter we shall prove a general proposition which is closely related to Theorem (7.33) and Proposition (8.38).
(9.1) Proposition. Let $(E, \mathcal{E})$ be a standard Borel space, $\gamma$ a specification for $E$ and $S$, and $\tau \in T$ a symmetry of $\gamma$. Suppose there exist two numbers $a, b \ge 0$ such that for each cylinder event $A \in \mathcal{F}$ there is a set $\Lambda \in \mathcal{S}$ with
(9.2) $a\, \gamma_\Lambda(\tau^{-1}A|\cdot) + b\, \gamma_\Lambda(\tau A|\cdot) \ge \gamma_\Lambda(A|\cdot).$
Then $\tau$ preserves each $\mu \in \mathcal{G}(\gamma)$.
Proof. By the extreme decomposition theorem (7.26) it is sufficient to show that each $\mu \in \mathrm{ex}\, \mathcal{G}(\gamma)$ is $\tau$-invariant. So we let $\mu \in \mathrm{ex}\, \mathcal{G}(\gamma)$ be given. Let $A \in \mathcal{F}$ be a cylinder event and $\Lambda \in \mathcal{S}$ be such that (9.2) holds. Then
$a\, \mu(\tau^{-1}A) + b\, \mu(\tau A) = a\, \mu\gamma_\Lambda(\tau^{-1}A) + b\, \mu\gamma_\Lambda(\tau A) \ge \mu\gamma_\Lambda(A) = \mu(A).$
This means that $a\, \tau(\mu) + b\, \tau^{-1}(\mu) \ge \mu$ on the algebra of all cylinder events. The monotone class theorem then implies that this inequality holds even on all of $\mathcal{F}$. Now suppose that $\tau(\mu) \ne \mu$. Then also $\tau^{-1}(\mu) \ne \mu$. By Remark (7.2), $\tau(\mu)$ and $\tau^{-1}(\mu)$ are extreme in $\mathcal{G}(\gamma)$. Thus Theorem (7.7)(d) ensures that there are sets $A_1, A_2 \in \mathcal{T}$ such that $\mu(A_1) = \mu(A_2) = 1$ and $\tau(\mu)(A_1) = \tau^{-1}(\mu)(A_2) = 0$. Putting $A = A_1 \cap A_2$ we conclude that
$0 = a\, \tau(\mu)(A) + b\, \tau^{-1}(\mu)(A) \ge \mu(A) = 1.$
This contradiction proves the proposition. □
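(We note that a misunderstanding of the notion of a $\tau$-invariant specification might lead to the erroneous conclusion that hypothesis (9.2) was a trivial consequence of the assumed $\tau$-invariance of $\gamma$. We can avoid this confusion by again looking at (5.4) and (5.7)(b).) We intend applying the preceding proposition to Gibbsian specifications. To this end we need to replace (9.2) by suitable conditions on potentials. Our next proposition is a first step in this direction.
(9.3) Proposition. Let $(E, \mathcal{E})$ be a standard Borel space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ an a priori measure and $\Phi$ be a $\lambda$-admissible potential. Also, let $\tau$ be a $\lambda$-preserving symmetry of $\Phi$. Suppose we can find two constants $0 \le c \le 1$ and $0 \le C < \infty$ with the property below. For each $\Delta \in \mathcal{S}$ there are a set $\Lambda \in \mathcal{S}$ with $\Delta \subset \Lambda$ and a $\lambda$-preserving transformation $\tilde\tau \in T$ such that
(i) $\tilde\tau$ is a localized version of $\tau$, in that $(\tilde\tau\omega)_\Delta = (\tau\omega)_\Delta$, $(\tilde\tau^{-1}\omega)_\Delta = (\tau^{-1}\omega)_\Delta$, and $(\tilde\tau\omega)_{S \setminus \Lambda} = \omega_{S \setminus \Lambda}$ for all $\omega \in \Omega$; and
(ii) $c\, H_\Lambda^\Phi \circ \tilde\tau + (1 - c)\, H_\Lambda^\Phi \circ \tilde\tau^{-1} - H_\Lambda^\Phi \le C.$
Under this hypothesis, each $\mu \in \mathcal{G}(\Phi)$ is $\tau$-invariant.
Proof. Let $a = (1 - c)e^C$, $b = c\, e^C$ and $A \in \mathcal{F}$ be any cylinder event. We choose $\Delta \in \mathcal{S}$ so large that $A \in \mathcal{F}_\Delta$ and $\Lambda \in \mathcal{S}$ with $\Delta \subset \Lambda$ as in the hypothesis. We will show that $\gamma = \gamma^\Phi$ satisfies (9.2). Condition (ii) implies that
$c\, h_\Lambda^\Phi \circ \tilde\tau + (1 - c)\, h_\Lambda^\Phi \circ \tilde\tau^{-1} \ge (h_\Lambda^\Phi \circ \tilde\tau)^c\, (h_\Lambda^\Phi \circ \tilde\tau^{-1})^{1-c} \ge e^{-C} h_\Lambda^\Phi.$
On the other hand, as $\tilde\tau$ equals the identity outside $\Lambda$ we have $Z_\Lambda^\Phi \circ \tilde\tau = Z_\Lambda^\Phi \circ \tilde\tau^{-1} = Z_\Lambda^\Phi$. Thus
$a\, \rho_\Lambda^\Phi \circ \tilde\tau^{-1} + b\, \rho_\Lambda^\Phi \circ \tilde\tau \ge \rho_\Lambda^\Phi.$
Further, we have $\tilde\tau^{-1}A = \tau^{-1}A$ and $\tilde\tau A = \tau A$ because $A \in \mathcal{F}_\Delta$ and $\tilde\tau = \tau$ on $\Delta$ and $\tilde\tau^{-1} = \tau^{-1}$ on $\Delta$. From this we conclude that
$a\, \gamma_\Lambda^\Phi(\tau^{-1}A|\cdot) + b\, \gamma_\Lambda^\Phi(\tau A|\cdot) = \lambda_\Lambda(\rho_\Lambda^\Phi\, (a\, 1_A \circ \tilde\tau + b\, 1_A \circ \tilde\tau^{-1})) = \lambda_\Lambda((a\, \rho_\Lambda^\Phi \circ \tilde\tau^{-1} + b\, \rho_\Lambda^\Phi \circ \tilde\tau)\, 1_A) \ge \lambda_\Lambda(\rho_\Lambda^\Phi 1_A) = \gamma_\Lambda^\Phi(A|\cdot).$
The second equality follows from the hypotheses that $\tilde\tau$ is $\lambda$-preserving and preserves all spins outside of $\Lambda$. Thus (9.2) holds, and an application of Proposition (9.1) completes the proof. □
After these general preliminaries we turn to a study of one-dimensional systems. We shall look at two kinds of transformation: spatial translations and transformations without spatial part. We start with translations.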
Let $S = \mathbb{Z}$. We seek conditions on a potential such that given these conditions each Gibbs measure is shift-periodic with a fixed period $p$. For simplicity we shall look only at shift-invariant pair potentials. (An extension to more general potentials is possible. We leave this to the interested reader.) Without loss we shall assume that the self-potential part is incorporated into the a priori measure; see (1.28)(3) and (2.18). Consequently, we shall confine ourselves to potentials $\Phi$ of the form
(9.4) $\Phi_A = \varphi_k(\sigma_i, \sigma_{i+k})$ if $A = \{i, i+k\}$ and $k \ge 1$, and $\Phi_A = 0$ otherwise,
for any choice of measurable real functions $\varphi_k$ on $E \times E$. Such a potential $\Phi$ is absolutely summable if and only if $\sum_{k \ge 1} \|\varphi_k\| < \infty$.
(9.5) Theorem. Let $S = \mathbb{Z}$, $(E, \mathcal{E})$ be any standard Borel space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ a finite a priori measure, and $\Phi \in \mathcal{B}_\Theta$ a potential of the form (9.4). Let $p \ge 1$ be given and suppose that
(9.6) $\sum_{k \ge 1} k\, \|\varphi_{k+p} - \varphi_k\| < \infty.$
Then each $\mu \in \mathcal{G}(\Phi)$ is $\theta_p$-invariant.
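Proof. We shall verify the hypothesis of Proposition (9.3) with the constants $c = 1$ and
$C = 4p \sum_{k \ge 1} \|\varphi_k\| + 2 \sum_{k \ge 1} k\, \|\varphi_{k+p} - \varphi_k\|.$
Let $\Delta \in \mathcal{S}$ be given. We choose two integers $m, n$ with $\Delta \subset [m + p, n - p]$, and we put $\Lambda = [m, n] \cap \mathbb{Z}$. We also define a localized version $\tilde\theta_p$ of $\theta_p$ by
$(\tilde\theta_p\omega)_i = \omega_{i-p}$ if $m + p \le i \le n$, $(\tilde\theta_p\omega)_i = \omega_{i-p+n-m+1}$ if $m \le i < m + p$, and $(\tilde\theta_p\omega)_i = \omega_i$ if $i \in \mathbb{Z} \setminus \Lambda$.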
Condition (i) of Proposition (9.3) then evidently holds. We are thus left with proving condition (ii). We write Ht = Zi + E 2 + E 3 + E with E
i =
Z
m ^ i <j ^ n — p m ^ i ^ n — p,j>n E
3 =
Z
«Pj-I^J» 0))»
and £4 =
Z
®A-
An[n-p+l,n]#0
Similarly, with £l =
Z
^2 =
Z
^3=
(Pj-iiOi-p'Oj-p)'
Z
i<m,m+pi^j^n
and
£4 =
Z
^ ° 0,.
An[m,m + p - l ] # 0
Clearly, we have Z t = t,1. Moreover, lS2-^2l^
Z
l^-ifa.^-^-i-pfa»«^)
s Z ( k -p)ll%-%- P ll k>p
= Z
k
\\
k>l
and IZ3 - £ s l ^
Z
l<ü>f-.-(ffi»^-)-<ü!/+p-£(^»<»))l
i < m ^j ^ n — p
^ Z
feii%+p-%ll-
Finally, | Z 4 | v | £ 4 | ^ p | | ( D | | 0 = 2p Z ll%llfc>i
Combining these estimates we conclude that \\H*oÖp-H*\\ZC.
The theorem thus follows from Proposition (9.3).
a
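Hypothesis (9.6) can be probed numerically. The sketch below reads (9.6) as $\sum_k k\,\|\varphi_{k+p} - \varphi_k\| < \infty$, as in the constant $C$ of the proof, and makes the following assumptions for the sake of illustration: $E = \{-1, 1\}$ and $\varphi_k(x, y) = -J(k)\,xy$, so that $\|\varphi_{k+p} - \varphi_k\| = |J(k+p) - J(k)|$. The couplings `ferro` and `antiferro` and the exponent 1.5 are hypothetical choices. Partial sums cannot prove convergence or divergence; they only indicate the trend.

```python
def ferro(k, alpha=1.5):
    """Ferromagnetic long-range coupling J(k) = k^(-alpha)."""
    return k ** -alpha


def antiferro(k, alpha=1.5):
    """Alternating coupling J(k) = (-1)^k k^(-alpha)."""
    return (-1) ** k * k ** -alpha


def partial_sum(coupling, p, n):
    """Partial sum over k = 1..n of k * |J(k+p) - J(k)|."""
    return sum(k * abs(coupling(k + p) - coupling(k)) for k in range(1, n + 1))


def telescoped(coupling, n):
    """Summation by parts for decreasing J and p = 1:
    sum_{k<=n} k (J(k) - J(k+1)) = sum_{k<=n} J(k) - n J(n+1)."""
    return sum(coupling(k) for k in range(1, n + 1)) - n * coupling(n + 1)


if __name__ == "__main__":
    for n in (1000, 4000, 16000):
        print(
            n,
            partial_sum(ferro, 1, n),      # levels off
            partial_sum(antiferro, 1, n),  # keeps growing
            partial_sum(antiferro, 2, n),  # levels off
        )
```

In this sketch the alternating coupling fails the test for $p = 1$ but passes it for $p = 2$, which matches the role of the period in Theorem (9.5).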
We now make some simple remarks on hypothesis (9.6) of Theorem (9.5). (9.7) Comments. (1) The set of all integers p that satisfy (9.6) is a subgroup of Z. Thus condition (9.6) is consistent with the fact that {p e Z: 6p(n) = n for all p. e <&(&)}
is also a subgroup of Z.
(2) Condition (9.6) is closely related to the condition
Z \\(pk\\ < oo of
absolute summability. On the one hand, if lim ||
Z ll%ll ^ Z kll%+p-%ll for all p ^ 1. This is because
li%li ^ Z \\
Z &H%+i - %H = II9II Z Z VW - J{k + 1)) til
nil tin
^ Z li%iifcfel
Equality holds whenever the last sum is finite. (3) Condition (9.6) is weaker than the uniqueness condition (8.42). More precisely, suppose O is of the form (9.4) and satisfies (8.42). Let q>k = q>k — inf
oA = \~ßv ~jr(7i(7j
iïA = { u }
' '' *j'
} 0 otherwise. By Comment (9.7)(2), (9.6) holds for p = 1. Consequently, 0(O) = %(0). This result is trivial when a > 2 or ß is small since |^(0)| = 1 in these cases. But it is non-trivial when 1 < a ^ 2 and /? is large because then the spin-flip symmetry of <5 is broken; see (20.21) and Comment (8.41)(2). (2) Long range Ising antiferromagnets. Let S, E, X, a, ß be as above and define a potential Ô by
Ô, = j - ^ - 1 ) ^ - Jl""^ \
0
if
^ = {U}, i * J,
otherwise.
Ô satisfies (9.6) for p = 2. Consequently, each p. e $(<&) is periodic with period at most two. As a matter of fact, the present model is isomorphic to the
preceding example. For let x e T be defined by (TCO), = (— l)'cOj, co e Q, i e S. Then <ï> = x(O). Hence Proposition (5.6) and Remark (5.10) imply that the involution fi -> x(fi) is a bijection between ^(O) and (S{^>). As x commutes with 62, the 62-invariance of the Gibbs measures in ^(O) implies that of those in ^(0), and vice versa. The fact that ^(O) and ^(<ï>) are isomorphic also shows that (S{^>) exhibits a breaking of 61 whenever 1 < a ^ 2 and ß is large enough. Thus it is no accident that Ô fails to satisfy (9.6) for p = 1. o We now turn to the second subject of this section, the conservation of pure spin transformations in one-dimensional systems. Here we say that a transformation x e T is a pure spin transformation if x can be written in the form (9.9)
T.œ^iZiœùçs,
with a family (x,-, i e S) of invertible transformations of E. In other words, x is a pure spin transformation if its spatial part x,,, (cf. (5.1)) is the identity. As in (5.20), we let T° denote the set of all pure spin transformations which preserve the underlying a priori measure X e Ji{E, S). We put S = Z or M. As before, we shall confine ourselves to the consideration of pair potentials, and the extension of our results to more general potentials is left as an exercise. But now it is not appropriate to assume shift-invariance. So we shall look at all potentials O of the form (9 10)
G i )
1ÎA
= & j} otherwise,
and
'
<
j
'
with arbitrary measurable functions %/. E x E -> R. A pure spin transformation x = (x;, i e S) is a symmetry of such a O if and only if
sup l
+
x,yeE
for i < j , and suppose that (9.12)
C(0,x)^sup X
J(hJ)
ne S i^n<j
Then x preserves each /J. e ^(O). Proof. We shall again verify the hypothesis of Proposition (9.3). This time we choose the constants c = 1/2 and C = C(0, x). Let A e y be arbitrary and m < n be two sites in S such that A ^ [m,n]nS3A.
We define a localized version £ e T° of x by xœ = (TCO)ACOS\A (CO e Q). £ clearly satisfies condition (i) of Proposition (9.3). To prove condition (ii) we will show that Hi o £ + H* o r 1 - 2Ht ^ 2C(d>, T). The expression on the left equals the sum of the three terms Zi =
l>ufo
Z
_ 2(
PiMh o/)].
m ^ i<j ^ n
and
As O is T-invariant, Z x = 0. The summands of 2 2 and Z 3 are dominated by J(i,j) because (p^xj1 a}) = (p^x^aj) and (ft/rf1
sup Z
sup
\(pij{Tix,y)-(pij(x,y)\
(provided O is i-invariant). This follows from the identity \
+
(pij(xf-'x,y)-2
V\<\k\
shows that C(
Conditions (9.12) and (9.14) are spatially uniform. This does not exclude inhomogeneities, as is shown in the example below. (9.15) Example. Let S = N, £ = {-1,1} 2 , and X be equidistribution. We think of E as the four vertices of a square in U2. Let us write x = (x1,x2), œ i = (œn> œn\ a n d Oi = {on,oi2) whenever x e E, œ e Q, and i e S. We define a nearest-neighbour potential $ of the form (9.10) by , ij
v '
=
f - ^ i y i - Kxlx2yly2 \ 0
if j = i + 1, otherwise.
Here 1 ^ i < j < co, K ^ 0, and (J„)„ èl is a sequence in ]0, oo[ such that £ exp(-2J„) < oo. näl
3> is invariant under the two reflections T (1) , T (2) e T° which are given by (T^'CO); = ( - C O ; ! , ^ ) , ^
2
^ ) ; = ((On, ~
(Ot2),
coeQieS. For this $ and T = T<2) the expression at (9.14) equals 2K. Thus Theorem (9.11) implies that each p e <$(<&) is T(2)-invariant. On the other hand, the symmetry T (1) is broken. This is easy to see when K — 0. For let Sj and S2 be two disjoint copies of N. By the natural identification of EN and {— 1, l} s i us 2 ; y® can be thought of as a specification with state space { — 1,1} and parameter set S t U S2. In the case K = 0, y® is a product specification. Its factors are the Gibbs specification for the potential (6.2) and the independent specification with single spin distribution (ôl + 3.^/2. Consequently, (7.19) and (the proof of) Theorem (6.4) show that ^($) contains a measure /i + which satisfies n+{an) > 0 for all i e N and is given by n+ = lim ?*,...,#}(• |co+). Here a>+ is the constant configuration with value + 1 . Such a /i + also exists in the nontrivial case K > 0. This can be deduced from a famous inequality of Griffiths (1967a). This inequality implies that each y*,...,jv}(
we refer the reader to section VI of Griffiths (1972), for example.) Consequently, the inequality fi+((rn) > 0 also holds when K > 0. In particular, t (1) (/i + ) # fi+. Thus ^($) exhibits a breaking of the symmetry T (1) . O As we have said in the introduction to this chapter, the absence of symmetry breaking entails the non-existence of Gibbs measures for potentials with dissipative symmetries. This is made precise by the next corollary to Theorem (9.11). (9.16) Corollary. Let S = Z or N, {E,S) be a standard Bor el space, Xe Ji(E, S), and t e T° a pure spin transformation. Suppose T is dissipative, in that
there is a bounded measurable function / ^ 0 on E such that X(f) > 0 and l i m / O T ' = 0 A-a.s. for some i e S. Also, let O be a x-invariant X-admissible pair potential of the form (9.10) with the property (9.12). Then <${<$>) = 0. Proof. Suppose that ^(
In applications of Corollary (9.16) it will often be natural to choose f{x) = 1 A h*i}(xœs\{i})
(x e E)
for suitable i e S and t a e Q . For instance, we may do so in the following example. (9.17) Example. Let S = Z, E = Z or IR, and X be counting resp. Lebesgue measure. Also, let u: E -> [0, oo[ be a measurable function. We assume that u diverges at infinity in such a way that X(e~") < oo, but satisfies the quadratic growth condition c(u) = sup [u(x + 1) + u(x — 1) — 2u(x)] + < oo. xe E
For example, we may take u{x) = ß\x\p (x e E) with ß > 0 and 0 < p ^ 2. We define a potential O of the form (9.10) by putting
$\varphi_{ij}(x, y) = u(y - x)$ if $j = i + 1$, and $\varphi_{ij}(x, y) = 0$ if $j > i + 1$.
Here i, j e S and x j e £ . We will show that $ is 1-admissible but <S(^>) = 0. On the one hand, for each A e y and œ e f i w e have | AA(dC|
|A|
= l(e~") . Hence 0 is ^-admissible. On the other hand,
= e~2uM
(x e E),
where œ = 0.) Moreover, C(, T) = c(u) < oo. We thus conclude from Corollary (9.16) that (S(^>) is empty. (This example shows the advantage of
condition (9.12) over (9.14). The latter only holds when the growth of u is at most linear.) An additional remark is in order. If E = Z and u(x) = ßx2 for some ß > 0 then $ is nothing other than the one-dimensional version of the potential (6.16) which was investigated in Section 6.3. In that section, the underlying parameter set was Z2, and we have shown that, for sufficiently large ß, ^(ß<5)) is infinite dimensional (and thereby non-empty). We thus have an example of a potential which admits no Gibbs measure in one dimension but infinitely many extreme Gibbs measures in two dimensions, o
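The quadratic growth condition of Example (9.17) is easy to explore numerically. The following Python sketch assumes $E = \mathbb{Z}$ and $u(x) = \beta|x|^p$, and evaluates $c(u) = \sup_x [u(x+1) + u(x-1) - 2u(x)]^+$ over a finite window; the window radius and the function names are hypothetical choices made for the illustration. For comparison it also evaluates the supremum of the first differences $|u(x+1) - u(x)|$, a quantity that stays bounded only when $u$ grows at most linearly.

```python
def u(x, beta, p):
    """Pair interaction profile u(x) = beta * |x|^p."""
    return beta * abs(x) ** p


def second_difference_sup(beta, p, radius):
    """Maximum over |x| <= radius of [u(x+1) + u(x-1) - 2u(x)]^+."""
    return max(
        max(0.0, u(x + 1, beta, p) + u(x - 1, beta, p) - 2.0 * u(x, beta, p))
        for x in range(-radius, radius + 1)
    )


def first_difference_sup(beta, p, radius):
    """Maximum over |x| <= radius of |u(x+1) - u(x)|."""
    return max(
        abs(u(x + 1, beta, p) - u(x, beta, p))
        for x in range(-radius, radius + 1)
    )


if __name__ == "__main__":
    beta = 0.5
    for p in (1, 1.5, 2, 3):
        for radius in (100, 1000):
            print(
                p,
                radius,
                second_difference_sup(beta, p, radius),
                first_difference_sup(beta, p, radius),
            )
```

For $p = 2$ the second difference is constant and equal to $2\beta$, while the first difference grows with the window. For $p = 3$ the second difference itself grows with the window, so that choice falls outside the range $0 < p \le 2$ considered in the example.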
9.2 Continuous symmetries in two dimensions
Throughout this section we set $S = \mathbb{Z}^2$, the square lattice. We assume we are given a family $(\tau^t)_{t \in \mathbb{R}}$ of $\lambda$-preserving pure spin transformations $\tau^t = (\tau_i^t, i \in S) \in T^0$ which enjoy the property
(9.18) $\tau^s \circ \tau^t = \tau^{s+t}$ for all $s, t \in \mathbb{R}$.
That is, the mapping $t \to \tau^t$ is a homomorphism from the additive group $\mathbb{R}$ into $T^0$. Note that $t \to \tau^t$ is not assumed to be an isomorphism. In other words, $(\tau^t)_{t \in \mathbb{R}}$ may be periodic in that some $\tau^p$ is the identity. Thus our setting includes an action of $\mathbb{R}$ as well as an action of the compact circle group $\mathbb{R}/\mathbb{Z}$. (We do not need to require that $\tau^t\omega$ be jointly measurable in $t$ and $\omega$, although this will generally hold in applications.) We now turn to the problem of showing that it is impossible to break the continuous symmetry $(\tau^t)_{t \in \mathbb{R}}$. For simplicity we shall again only consider pair potentials. So we shall look at potentials $\Phi$ of the form
(9.19) $\Phi_A = \varphi_{ij}(\sigma_i, \sigma_j)$ if $A = \{i, j\}$, $i \ne j$, and $\Phi_A = 0$ otherwise.
The $\varphi_{ij}$'s are assumed to be measurable real functions on $E \times E$ such that $\varphi_{ij}(x, y) = \varphi_{ji}(y, x)$ for all $i \ne j$ and $x, y \in E$. The theorem below is the main result of this section. Its proof is postponed until the end.
(9.20) Theorem. Let $S = \mathbb{Z}^2$, $(E, \mathcal{E})$ be a standard Borel state space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ an a priori measure, and $(\tau^t)_{t \in \mathbb{R}}$ a family of $\lambda$-preserving pure spin transformations that satisfy (9.18). Also, let $\Phi$ be a $(\tau^t)_{t \in \mathbb{R}}$-invariant $\lambda$-admissible pair potential of the form (9.19). Suppose that for $\Phi$ and $(\tau^t)_{t \in \mathbb{R}}$ the following conditions of smoothness and decay hold.
(i) For all $i \ne j$ and $x, y \in E$, the function $t \to \varphi_{ij}(x, \tau_j^t y)$ is twice continuously differentiable.
(ii) There exists a symmetric function $J: S \times S \to [0, \infty[$ and a constant $K < \infty$ such that
$\sup_{x, y \in E}\ \sup_{t \in \mathbb{R}}\ \frac{d^2}{dt^2}\, \varphi_{ij}(x, \tau_j^t y) \le J(i, j)$
for all $i \ne j$, and
(9.21) $\sum_{j \in S:\ 0 < |i - j| \le n} |i - j|^2 J(i, j) \le K \log n$
for all $i \in S$ and $n \ge 2$.
Under these hypotheses, each $\mu \in \mathcal{G}(\Phi)$ is $(\tau^t)_{t \in \mathbb{R}}$-invariant.
mod 2n
(i e S,œ
e Q).
r
We define a (T )(eR-invariant pair potential $ of the form (9.19) by
- x).
Here $x, y \in E$, $i, j \in S$ with $i \ne j$, and $\beta > 0$ and $p > 2$ are fixed parameters. (The condition $p > 2$ implies that $\Phi$ is absolutely summable.) Proposition (8.8) and Dobrushin's uniqueness theorem show that $|\mathcal{G}(\Phi)| = 1$ whenever $\beta$ is small enough. What about the case of large $\beta$? We shall see in (20.21) that the continuous symmetry $(\tau^t)_{t \in \mathbb{R}}$ of $\Phi$ is broken whenever $2 < p < 4$ and $\beta$ is sufficiently large. Here we obtain an intermediate result: For $p \ge 4$ and arbitrary $\beta$ there is no breaking of $(\tau^t)_{t \in \mathbb{R}}$. Indeed, in the latter case Theorem (9.20) applies because
$\frac{d^2}{dt^2}\, \varphi_{ij}(x, \tau_j^t y) = -\beta |i - j|^{-p}\, \frac{d^2}{dt^2} \cos(y + t - x) \le \beta |i - j|^{-p}$
and
$\sum_{j \in S:\ 0 < |j| \le n} |j|^2 |j|^{-p} \le \sum_{k=1}^{n}\ \sum_{|j_1| \vee |j_2| = k} k^{2-p} = 8 \sum_{k=1}^{n} k^{3-p} \le 8 \sum_{k=1}^{n} k^{-1} \le 24 \log n$
for all $n \ge 2$.
(2) Shlosman's rotator. Let $E$, $\lambda$, and $(\tau^t)_{t \in \mathbb{R}}$ be as above. Define $\Phi$ by (9.19) and
$\varphi_{ij}(x, y) = -\beta \cos(x - y)$ if $|i - j| = \sqrt{2}$, $\varphi_{ij}(x, y) = \beta \cos 2(x - y)$ if $|i - j| = 1$, and $\varphi_{ij}(x, y) = 0$ otherwise;
here i, j e S,x,y e E, and ß > 0. It follows immediately from Theorem (9.20) that each ß e ^(<5) is preserved by the symmetries T' of $. The point of this example is that O exhibits a phase transition when ß is large enough. The phase transition depends on the fact that O has a further (discrete) symmetry, namely the rotation by n of all spins on the even sublattice of S. For large ß this additional symmetry is broken, as will be proved in Subsection 18.3.8. o Theorem (9.20) may also be used to prove the absence of symmetry breaking for symmetry groups / which are larger than K o r a factor group of U, provided each element of / can be imbedded into a one-parameter subgroup of L The latter is possible whenever / is a connected Lie group. We will not discuss these generalities, however. Instead we shall illustrate the basic principle by a further example. (9.23) Example. Classical Heisenberg models. Let JV ^ 2, E be the unit sphere in UN, and X be normalized surface measure on E. Let <5 be defined via (9.19) by putting
{x,yeE,i^
j).
Here $K(\cdot, \cdot)$ is any symmetric real function on $S \times S$, and the dot denotes the inner product on $\mathbb{R}^N$. We assume that $|K(i, j)| \le K |i - j|^{-p}$ for some $K > 0$ and $p \ge 4$ and all $i \ne j$. (Note that the present example includes a reformulation of Example (9.22)(1) as a special case.) Let $r$ be any rotation of $\mathbb{R}^N$ and $\tau \in T^0$ be defined by $\tau\omega = (r\omega_i)_{i \in S}$. We will show that every $\mu \in \mathcal{G}(\Phi)$ is $\tau$-invariant. In a suitable orthonormal basis of $\mathbb{R}^N$, $r$ is represented by a block-diagonal matrix of the form
$M(r_1, \ldots, r_n) = \mathrm{diag}(1, M(r_1), \ldots, M(r_n)),$
where the leading entry 1 is present only when $N$ is odd. Here
$M(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} \quad (t \in \mathbb{R})$
and $r_1, \ldots, r_n \in\, ]0, 2\pi[$ are suitable rotation angles. The mapping $(t_1, \ldots, t_n) \to M(t_1, \ldots, t_n)$ establishes a homomorphism (with kernel $2\pi\mathbb{Z}^n$) from the additive group $\mathbb{R}^n$ onto a (necessarily abelian) subgroup of the rotation group $SO(N)$. For each $t \in \mathbb{R}$ we let $\tau^t \in T^0$ denote the simultaneous rotation of all spins by means of the rotation that is determined by $M(tr_1, \ldots, tr_n)$. Then $\tau^1 = \tau$, and $(\tau^t)_{t \in \mathbb{R}}$ satisfies (9.18). (The group $(\tau^t)_{t \in \mathbb{R}}$ is isomorphic to $\mathbb{R}$ if and only if there is no $s \ne 0$ such that $(sr_1, \ldots, sr_n) \in 2\pi\mathbb{Z}^n$. Otherwise $(\tau^t)_{t \in \mathbb{R}}$ is isomorphic to $\mathbb{R}/\mathbb{Z}$.) $\Phi$ and $(\tau^t)_{t \in \mathbb{R}}$ satisfy the assumptions (i) and (ii) of Theorem (9.20). Indeed, a straightforward calculation shows that
$\left| \frac{d^2}{dt^2}\, (x \cdot M(tr_1, \ldots, tr_n)\, y) \right| \le \max_{1 \le k \le n} r_k^2$
for all $x, y \in E$. Thus (9.21) follows in the same way as in Example (9.22)(1). Consequently, Theorem (9.20) implies that each $\mu \in \mathcal{G}(\Phi)$ is $\tau$-invariant. We have thus proved that the $SO(N)$-symmetry of $\Phi$ is not broken. ○
As a final application of Theorem (9.20) we state a result on the non-existence of Gibbs measures for potentials with dissipative continuous symmetries. This result is a two-dimensional counterpart of Corollary (9.16) and is proved in the very same way.
(9.24) Corollary. Let $S = \mathbb{Z}^2$ be the square lattice, $(E, \mathcal{E})$ a standard Borel space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ and $(\tau^t)_{t \in \mathbb{R}}$ a one-parameter family in $T^0$ that satisfies (9.18). Suppose $(\tau^t)_{t \in \mathbb{R}}$ is dissipative, in that there exists a site $i \in S$ and a bounded measurable function $f \ge 0$ on $E$ such that $\lambda(f) > 0$ and $\lim_{k \to \infty} f \circ \tau_i^{t(k)} = 0$ $\lambda$-a.s.
for some sequence $(t(k))_{k\in\mathbf{N}}$ in $\mathbf{R}$. Also, let $\Phi$ be a $(\tau^t)_{t\in\mathbf{R}}$-invariant pair potential of the form (9.19) which meets the hypotheses (i) and (ii) of Theorem (9.20). Then $\mathcal{G}(\Phi)$ is empty.

(9.25) Example. Let $E = \mathbf{R}$ and $\lambda$ be Lebesgue measure. Also, let $u: E \to [0,\infty[$ be an even twice continuously differentiable function such that $\lambda(e^{-u}) < \infty$ and $\sup_{x\in E} u''(x) < \infty$. (For example, we may take $u(x) = \beta|x|^p/(1+|x|^q)$ with $\beta > 0$ and $0 < p-q \le 2 \le p$.) We define a potential $\Phi$ via (9.19) by setting

$$\varphi_{ij}(x,y) = \begin{cases} u(y-x) & \text{if } |i-j| = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Here $x, y \in E$ and $i, j \in S$. The same argument as in Example (9.17) shows that $\Phi$ is $\lambda$-admissible. On the other hand, $\Phi$ is invariant under the dissipative one-parameter group $(\tau^t)_{t\in\mathbf{R}}$ of spin translations that are given by $\tau^t\omega = (\omega_i + t)_{i\in S}$, $t \in \mathbf{R}$, $\omega \in \Omega$. The hypotheses (i) and (ii) of Theorem (9.20) evidently hold. Thus Corollary (9.24) shows that $\mathcal{G}(\Phi) = \emptyset$.

This result is particularly interesting when compared with the behaviour of the discrete Gaussian model which was studied in Section 6.3. Suppose that $u(x) = \beta x^2$ for some $\beta > 0$. (This choice of $u$ implies that $\gamma^\Phi$ is Gaussian; cf. (13.13). It gives us a model for a system of harmonic oscillators with nearest neighbour coupling.) The resulting potential $\Phi$ is then identical with the potential (6.16) of Section 6.3. But in that section the underlying a priori measure $\lambda$ was counting measure on $\mathbf{Z}$ instead of Lebesgue measure. The choice of $\lambda$ has dramatic consequences. Whilst $\mathcal{G}(\Phi) = \emptyset$ when $\lambda$ is Lebesgue measure, we have $|\mathrm{ex}\,\mathcal{G}(\Phi)| = \infty$ when $\lambda$ is counting measure on $\mathbf{Z}$ and $\beta$ is large enough; cf. Theorem (6.21) (and Corollary (7.29)). The former result sheds some light on the latter. The restriction to integer valued spins prevents $\gamma^\Phi$ from being invariant under the full continuous translation group $(\tau^t)_{t\in\mathbf{R}}$ which cannot be broken in two dimensions. The remaining discrete symmetry group $(\tau^t)_{t\in\mathbf{Z}}$ of the discrete Gaussian model is small enough to admit existence, and large enough to imply (together with the spatial symmetry group $\Theta$) an interesting phase structure of $\mathcal{G}(\Phi)$. The picture changes when $S = \mathbf{Z}^d$ for $d \ge 3$. Then $\mathcal{G}(\Phi) \ne \emptyset$ even when $\lambda$ is Lebesgue measure. This will be seen in (13.43). ○

In the rest of this section we shall prove Theorem (9.20). So we assume that its hypotheses hold. We fix a parameter $t \in \mathbf{R}$ and look at the symmetry $\tau^t$ of $\Phi$. We will show that $\Phi$ and $\tau^t$ satisfy the conditions of Proposition (9.3). That is, we shall construct, for each square $\Lambda \in \mathcal{S}$, a transformation $\tilde\tau \in \mathcal{T}^0$ which coincides with $\tau^t$ on $\Lambda$, preserves all spins outside some larger square $\Delta$, and is such that (9.26)
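$$H^\Phi_\Delta \circ \tilde\tau + H^\Phi_\Delta \circ \tilde\tau^{-1} - 2H^\Phi_\Delta$$

admits an upper bound that does not depend on $\Lambda$ and $\Delta$. It is fairly clear that we shall take a transformation $\tilde\tau$ of the form (9.27)

$$\tilde\tau = \bigl(\tau_i^{t(i)}\bigr)_{i\in S},$$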
where t( • ) is a real function on S that equals t on A, gradually decreases outside A, and is identically zero outside some large square A. The point will be to find a proper rate of decrease. The main part of the proof will consist of two lemmas. In the first lemma we shall exploit the smoothness hypothesis (i) of Theorem (9.20) to obtain an upper bound for (9.26). Then we shall use the decay condition (9.21) to estimate that bound when t(-) is suitably chosen. (9.28) Lemma. Consider the setting of Theorem (9.20). Let t(-) be any real function on S, and let f e TA° be given by (9.27). For any A e if, the expression in (9.26) then admits the upper bound
(9.29)

$$\sum_{\{i,j\}\cap\Delta \ne \emptyset} J(i,j)\,\bigl(t(i)-t(j)\bigr)^2.$$
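Proof. By definition, the expression in (9.26) is equal to the sum

$$\sum_{\{i,j\}\cap\Delta \ne \emptyset} \psi_{ij}(\omega_i,\omega_j).$$

Here

$$\psi_{ij}(x,y) = \varphi_{ij}\bigl(\tau^{t(i)}x, \tau^{t(j)}y\bigr) + \varphi_{ij}\bigl(\tau^{-t(i)}x, \tau^{-t(j)}y\bigr) - 2\varphi_{ij}(x,y)$$

for all $x, y \in E$ and $i \ne j$. Thus, we need only show that

$$\psi_{ij} \le J(i,j)\,\bigl(t(i)-t(j)\bigr)^2$$

for all $i \ne j$. This follows from the $(\tau^t)_{t\in\mathbf{R}}$-invariance of $\Phi$ and Taylor's formula. Indeed, let $i \ne j$ be given and write $s = t(j) - t(i)$. Then

$$\begin{aligned}
\psi_{ij}(x,y) &= \varphi_{ij}(x,\tau^{s}y) + \varphi_{ij}(x,\tau^{-s}y) - 2\varphi_{ij}(x,y) \\
&= \int_{-|s|}^{|s|} (|s| - |u|)\,\frac{d^2}{du^2}\varphi_{ij}(x,\tau^{u}y)\,du \\
&\le J(i,j) \int_{-|s|}^{|s|} (|s| - |u|)\,du = J(i,j)\,s^2
\end{aligned}$$

for all $x, y \in E$. This proves the lemma. ○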
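The Taylor identity used in the proof above can be checked numerically. The following is only a sketch under stated assumptions: the test function $f(u) = -\cos u$ is a hypothetical stand-in for $u \mapsto \varphi_{ij}(x,\tau^u y)$ (it mimics a plane-rotator interaction with $f'' \le 1$, so that $J(i,j) = 1$), and the helper names are illustrative, not taken from the text.

```python
import math

def second_difference(f, s):
    """f(s) + f(-s) - 2 f(0), the quantity bounded in Lemma (9.28)."""
    return f(s) + f(-s) - 2.0 * f(0.0)

def taylor_integral(f2, s, steps=20000):
    """Midpoint-rule value of the integral of (|s| - |u|) f''(u) over [-|s|, |s|]."""
    a = abs(s)
    h = 2.0 * a / steps
    total = 0.0
    for m in range(steps):
        u = -a + (m + 0.5) * h
        total += (a - abs(u)) * f2(u)
    return total * h

# Assumed test function: f(u) = -cos(u), so f''(u) = cos(u) <= 1.
f = lambda u: -math.cos(u)
f2 = lambda u: math.cos(u)

for s in (0.3, 1.0, 2.5):
    lhs = second_difference(f, s)
    rhs = taylor_integral(f2, s)
    print(f"s={s}: second difference={lhs:.8f}, integral={rhs:.8f}, s^2={s * s:.4f}")
```

For each value of `s` the two computed quantities should agree up to the quadrature error, and both should stay below `s**2`, which is the bound $J(i,j)s^2$ with $J(i,j) = 1$.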
Lemma (9.28) leaves us with the problem of defining t(-) in such a way that t(-) = t on a given A, t(-) = 0 outside some A, and (9.29) remains bounded. This will be achieved by means of the following key quantities. We define
(9 30)
-
««-{JHO,*)-
Uli,
and for each $L \ge 1$ (9.31)

$$Q(L) = \sum_{k=1}^{L} q(k)$$

and (9.32)

$$r(n,L) = \begin{cases} 1 & \text{if } n < 0, \\ \sum_{k=n+1}^{L} q(k)/Q(L) & \text{if } 0 \le n \le L, \\ 0 & \text{if } n > L. \end{cases}$$

For each $L \ge 1$, $(r(n,L))_{n\ge 1}$ is a decreasing sequence. Its rate of decrease is precisely adjusted to the decay condition (9.21), as will become evident from the next lemma.
We shall work with the maximum norm on $S$. So we put $\|i\| = |i_1| \vee |i_2|$ when $i = (i_1,i_2) \in S$.

(9.33) Lemma. Let $N \ge 1$ and $C > 0$ be given. Under the decay condition (9.21) on $J$ there exists an integer $L \ge 1$ such that the following conclusion holds. If $\Delta = \{\|\cdot\| \le N+L\}$ and $t(\cdot) = r(\|\cdot\| - N, L)$ then the sum at (9.29) is at most $C$.

Proof. 1) We start with an estimate for $|t(i) - t(j)|$. Clearly, $t(i) = t(j)$ whenever $\|i\| = \|j\|$ or $\|i\| \le N$ and $\|j\| \le N$. So we may assume that $\|j\| > \|i\| \vee N$. In this case we have

$$0 \le t(i) - t(j) = \sum_{N\vee\|i\| < k \le \|j\|\wedge(N+L)} q(k-N)/Q(L).$$

As $q(\cdot)$ is decreasing and $\|j\| - \|i\| \le \|j-i\|$, the last sum is at most $\|j-i\|\,q(\|i\|-N)/Q(L)$. Also for the same reasons it is not larger than

$$\sum_{\ell=1}^{\|j-i\|} q(\ell)/Q(L) = Q(\|j-i\|)/Q(L).$$
2) Next we observe that $\log\log L < Q(L) < 1 + \log L$ for all $L \ge 2$. Indeed,

$$Q(L) > \int_2^L \frac{dx}{x\log x} = \log\log L - \log\log 2 > \log\log L$$

and

$$Q(L) < 1 + \int_1^L \frac{dx}{x} = 1 + \log L.$$
3) For every $0 < p < 2$ there is a constant $K(p) < \infty$ such that

$$\sum_{j\in S} J(i,j)\,\|j-i\|^p \le K(p)$$

for all $i \in S$. This follows from (9.21). To prove this we write the preceding sum in the form

$$\sum_{k\ge 0}\ \sum_{j\in S:\ 2^k \le \|j-i\| < 2^{k+1}} J(i,j)\,\|j-i\|^{2+(p-2)}.$$

As $\|j\| \le |j| \le 2\|j\|$ for all $j$, the above expression is less than

$$\sum_{k\ge 0} 2^{k(p-2)} \sum_{|j-i| \le 2^{k+2}} J(i,j)\,|j-i|^2 \le \sum_{k\ge 0} 2^{k(p-2)}\,K\log 2^{k+2} = K\log 2 \sum_{k\ge 0} (k+2)\,2^{-(2-p)k} \equiv K(p) < \infty.$$
4) After the preceding preparations we are ready to consider the sum (9.29). We split it into three partial sums, each of which will be shown to converge to zero as $L$ tends to infinity. The first partial sum (which will be denoted by $\Sigma_1$) extends over all $i, j \in S$ with $\|i\| \le N < \|j\|$. The second sum, $\Sigma_2$, comprises all pairs $\{i,j\}$ which satisfy $N < \|i\| \le N+L$, $\|j\| > \|i\|$, and $\|j-i\| \le (\|i\|-N)^2$. The third one, $\Sigma_3$, runs over all remaining pairs $\{i,j\}$ with $t(i) \ne t(j)$. These are characterized by the properties $N < \|i\| \le N+L$, $\|j\| > \|i\|$, and $\|j-i\| > (\|i\|-N)^2$. Applying the estimates in the preceding steps 1), 2), and 3) successively we see that

$$\begin{aligned}
\Sigma_1 &\le \sum_{\|i\| \le N < \|j\|} J(i,j)\,Q(\|j-i\|)^2/Q(L)^2 \\
&\le Q(L)^{-2} \sum_{\|i\|\le N}\ \sum_{j\in S} J(i,j)\,(1+\log\|j-i\|)^2 \\
&\le 2\,Q(L)^{-2} \sum_{\|i\|\le N}\ \sum_{j\in S} J(i,j)\,\|j-i\| \\
&\le 2(2N+1)^2 K(1)\,Q(L)^{-2}.
\end{aligned}$$

Because of 2), the last expression tends to zero as $L$ tends to infinity. Next, 1) and (9.21) yield

$$\begin{aligned}
\Sigma_2 &\le \sum_{N<\|i\|\le N+L}\ \sum_{\|j-i\| \le (\|i\|-N)^2} J(i,j)\,\|j-i\|^2\,q(\|i\|-N)^2/Q(L)^2 \\
&\le Q(L)^{-2} \sum_{N<\|i\|\le N+L} q(\|i\|-N)^2\,K\log\bigl[2(\|i\|-N)^2\bigr] \\
&= K\,Q(L)^{-2} \sum_{k=1}^{L} 8(N+k)\,q(k)^2\log(2k^2) \\
&\le 8(N+1)K\,Q(L)^{-2} \sum_{k=1}^{L} k\,q(k)^2\log(2k^2).
\end{aligned}$$

We have used the fact that $|\{\|\cdot\| = \ell\}| = 8\ell$ for all $\ell \ge 1$. As $k\log(2k^2) \le 3/q(k)$ for all $k \ge 1$, the last sum is less than $3Q(L)$. Thus $\Sigma_2 \le 24(N+1)K/Q(L) \to 0$ as $L \to \infty$. Finally, 1), 2), and 3) also give

$$\begin{aligned}
\Sigma_3 &\le \sum_{N<\|i\|\le N+L}\ \sum_{\|j-i\| > (\|i\|-N)^2} J(i,j)\,Q(\|j-i\|)^2/Q(L)^2 \\
&\le Q(L)^{-2} \sum_{k=1}^{L} 8(N+k) \sup_{i\in S} \sum_{\|j-i\|>k^2} J(i,j)\,(1+\log\|j-i\|)^2 \\
&\le 8(N+1)\,Q(L)^{-2} \sum_{k=1}^{L} k \sup_{i\in S} \sum_{\|j-i\|>k^2} J(i,j)\,16\,\|j-i\|^{1/4} \\
&\le 2^7(N+1)\,Q(L)^{-2} \sum_{k=1}^{L} k^{-2} \sup_{i\in S} \sum_{\|j-i\|>k^2} J(i,j)\,\|j-i\|^{7/4} \\
&\le 2^7(N+1)K(7/4)\,Q(L)^{-2} \sum_{k\ge 1} k^{-2} \longrightarrow 0
\end{aligned}$$
as L -> oo. Consequently, if L is sufficiently large then Ej + E 2 + E 3 ^ C. This proves the lemma. • It is now easy to complete the proof of Theorem (9.20). Proof of Theorem (9.20). We need to show that each x' preserves each ß £ ^(<ï>). This will be achieved by checking that T' and $ satisfy the assumptions of Proposition (9.3). To this end we can and will assume that t = 1. Let A £ y be given and N ^ 1 be so large that A a {|| • || ^ N}. We put C = 1. For this choice of iV and C we select an integer L according to Lemma (9.33), and we let A = {|| • || ^ N + L] and
$$\tilde\tau = \bigl(\tau_i^{\,r(\|i\|-N,\,L)}\bigr)_{i\in S}.$$
Clearly, f e TA°. From (9.32) we immediately see that f is a localization oft 1 in the sense of condition (i) of Proposition (9.3). We further deduce from Lemma (9.28) and Lemma (9.33) that the expression at (9.26) is not larger than 1. Thus condition (ii) of Proposition (9.3) also holds, with the constants c = C = 1/2. We therefore conclude that each ß e $(<&) is T1-invariant, D We conclude this chapter with two comments on the validity of Lemma (9.33) in dimensions d =£ 2. (9.34) Comments. (1) As we have mentioned before, a continuous symmetry breaking does occur among short-range systems in three or more dimensions. This will be proved in Chapter 20. Nevertheless, it is interesting to see how Lemma (9.33) fails in that case. So let S = Zd with d ^ 3, and suppose J is such that J(i,j)^. 1 for all nearest-neighbour pairs {i,j}, and J(i,j) = 0 whenever \i — j \ =£ 1. Such a J satisfies any (spatially homogeneous) decay condition. Let N ^ 1, rN = £ /T 2 , and t(-) = f*(||i|) be any real function k^N
on $S$ that only depends on the maximum norm $\|\cdot\|$ on $S$, equals 1 on $\{\|\cdot\| \le N\}$, and vanishes outside some large cube $\Delta = \{\|\cdot\| \le N+L\}$. For such a $t(\cdot)$ and $\Delta$ the sum at (9.29) is at least $2d/r_N$. To see this we note first that $|\{\|\cdot\| = k\}| \ge 2dk^{d-1} \ge 2dk^2$ for all $k \ge 0$. Combining this and the Cauchy-Schwarz inequality we obtain

$$\begin{aligned}
r_N \sum_{\{i,j\}\cap\Delta\ne\emptyset} J(i,j)\,\bigl(t(i)-t(j)\bigr)^2
&\ge 2d\,r_N \sum_{k=N}^{N+L} k^2\,\bigl(t^*(k) - t^*(k+1)\bigr)^2 \\
&\ge 2d \Bigl[\sum_{k=N}^{N+L} k^{-1}\,k\,\bigl(t^*(k) - t^*(k+1)\bigr)\Bigr]^2 \\
&= 2d\,\bigl(t^*(N) - t^*(N+L+1)\bigr)^2 = 2d.
\end{aligned}$$

Since $r_N \to 0$ as $N \to \infty$ we conclude that in three or more dimensions it is impossible to choose a function $t(\cdot)$ in such a way that (9.29) becomes bounded uniformly in $N$.

(2) The proof of Theorem (9.20) does extend to one-dimensional systems. In fact, the main idea of the proof can be adapted to prove the absence of a continuous symmetry breaking for certain one-dimensional systems which are not covered by Theorem (9.11). To be precise let us assume that $S = \mathbf{Z}$ and that all other hypotheses of Theorem (9.20) hold, except that (9.21) is replaced by the condition that (9.35)

$$\sum_{j\in\mathbf{Z}:\ 0<|i-j|\le n} |i-j|^2\,J(i,j) \le Kn$$
for all $i \in \mathbf{Z}$ and $n \ge 1$. (Note that this condition includes the quadratic falloff $J(i,j) = |i-j|^{-2}$ which is just beyond the decay condition (9.12).) Then the conclusion of Theorem (9.20) is still valid.

Since Lemma (9.28) neither depends on the dimension nor on the decay properties of $J$, the claim above will be proved once we have established a counterpart to Lemma (9.33). To this end we redefine $q(k)$ by putting $q(k) = 1$ for all $k$, instead of (9.30). (9.31) and (9.32) remain unchanged. Let $N \ge 1$ be given, $L = N$, $\Delta = \{|\cdot| \le 2N\}$, and $t(\cdot) = r(|\cdot| - N, N)$. The sum at (9.29) can then be estimated as follows. As before, we split this sum into three parts $\Sigma_1$, $\Sigma_2$, and $\Sigma_3$. $\Sigma_1$ extends over all $i, j \in \mathbf{Z}$ with $|i| \le N$ and $|j| > 2N$. $\Sigma_2$ takes account of all pairs $i, j \in \mathbf{Z}$ with $N < |i| \le 2N$ and $|j| \le N$ or $|j| > 2N$, and $\Sigma_3$ comprises all pairs $i, j$ with $N < |i| < |j| \le 2N$. To estimate these partial sums we first conclude from (9.35) that (9.36)

$$n \sum_{j\in\mathbf{Z}:\ |i-j|\ge n} J(i,j) \le 8K$$

for all $i \in \mathbf{Z}$ and $n \ge 1$. Indeed, choosing an integer $k \ge 0$ with $2^k \le n \le 2^{k+1}$ we see that the left side of (9.36) is at most

$$2^{k+1} \sum_{|j-i|\ge 2^k} J(i,j) \le 2\sum_{m\ge 0} 2^{-m}\,2^{-k-m} \sum_{2^{k+m}\le|j-i|<2^{k+m+1}} |i-j|^2\,J(i,j) \le 4K\sum_{m\ge 0} 2^{-m} = 8K.$$

(By the way, (9.36) also implies (9.35) with a different constant $K$.) Using (9.36), we obtain the estimate

$$\Sigma_1 \le \sum_{|i|\le N}\ \sum_{|j-i|>N} J(i,j) \le (2N+1)\,8K/N \le 24K.$$

Similarly,

$$\begin{aligned}
\Sigma_2 &= N^{-2} \sum_{N<|i|\le 2N} \Bigl[\,\sum_{|j|\le N} J(i,j)\,(|i|-N)^2 + \sum_{|j|>2N} J(i,j)\,(2N-|i|)^2\Bigr] \\
&\le N^{-2} \sum_{k=1}^{N}\ \sum_{i=\pm(N+k)} \Bigl[\,k^2 \sum_{|j-i|\ge k} J(i,j) + (N-k)^2 \sum_{|j-i|\ge N-k} J(i,j)\Bigr] \\
&\le 16K.
\end{aligned}$$

Finally,

$$\Sigma_3 = N^{-2} \sum_{N<|i|<|j|\le 2N} J(i,j)\,\bigl||i|-|j|\bigr|^2 \le N^{-2}\,\frac12 \sum_{N<|i|\le 2N}\ \sum_{|j-i|\le 4N} J(i,j)\,|i-j|^2 \le 4K$$

because of (9.35). Consequently, the expression in (9.29) is less than or equal to $C = 44K$. This result provides us with a counterpart of Lemma (9.33) and thus proves the claim that there is no continuous symmetry breaking in one-dimensional systems which satisfy the hypotheses of Theorem (9.20) with (9.35) instead of (9.21). ○
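The one-dimensional estimate can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the quadratic falloff $J(i,j) = |i-j|^{-2}$ (for which (9.35) holds with $K = 2$), uses the linear ramp $t(\cdot) = r(|\cdot|-N,N)$ obtained from $q \equiv 1$, and truncates the infinite sum to sites $|i| \le 10N$. The truncation only drops nonnegative terms, so the computed value is a lower bound for the full sum; the function names and the cutoff factor are hypothetical choices.

```python
def ramp(i, n):
    """t(i) = r(|i| - N, N) for q = 1: equals 1 on |i| <= N, falls linearly to 0 at |i| = 2N."""
    a = abs(i)
    if a <= n:
        return 1.0
    if a >= 2 * n:
        return 0.0
    return (2 * n - a) / n

def truncated_sum(n, cutoff_factor=10):
    """Sum of J(i,j) (t(i) - t(j))^2 over unordered pairs with |i|, |j| <= cutoff_factor * n,
    where J(i,j) = |i - j|^(-2) is the assumed coupling."""
    m = cutoff_factor * n
    t = [ramp(i, n) for i in range(-m, m + 1)]
    total = 0.0
    for a in range(len(t)):
        for b in range(a + 1, len(t)):
            d = t[a] - t[b]
            if d != 0.0:
                total += d * d / (b - a) ** 2
    return total

K = 2.0              # constant in (9.35) for J(i,j) = |i-j|^(-2)
BOUND = 44.0 * K     # the constant C = 44K from the estimate above

for n in (5, 10, 20):
    print(n, round(truncated_sum(n), 4), "<=", BOUND)
```

The printed values should stay well below $44K$ for every $N$, in line with the claim that (9.29) remains bounded uniformly in $N$ for this choice of $t(\cdot)$.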
Part II Markov chains and Gauss fields as Gibbs measures
As its title suggests, the main development of this part centres around the probabilistic key words Markov chains and Gauss fields. Of course, we shall consider these classical subjects from a Gibbsian point of view. That is, we shall be concerned with specifications $\gamma$ which are such that $\mathcal{G}(\gamma)$ contains a Markov chain or a Gauss field, respectively. The extreme boundary $\mathrm{ex}\,\mathcal{G}(\gamma)$ then consists only of Markov chains resp. Gauss fields, and this fact will enable us to analyse $\mathcal{G}(\gamma)$ by Markovian resp. Gaussian techniques. Needless to say, the motivation for these investigations comes primarily from Probability Theory rather than from Statistical Mechanics.

Chapters 10 and 11 deal with the case where $\gamma$ is a Markov specification with infinite state space and parameter set $\mathbf{Z}$, the integers. The topics which are discussed there include the identification of the Markov chains in $\mathcal{G}(\gamma)$, the existence and uniqueness of a shift-invariant Gibbs measure for $\gamma$, and the non-existence of Gibbs measures for $\gamma$. Some explicitly solvable examples of phase transition are also provided. Chapter 12 is devoted to Markov specifications and Markov chains on infinite trees. In particular, this chapter contains an analysis of the phase transition region of the Ising model on a Cayley tree. The subjects discussed in Chapter 13, the last chapter of this part, are Gaussian random fields and specifications. For a Gaussian specification $\gamma$, the extreme boundary of $\mathcal{G}(\gamma)$ consists of Gaussian random fields which can be described explicitly. The main problem is that of finding effective criteria on $\gamma$ for which $\mathcal{G}(\gamma)$ is non-empty. This problem admits a satisfactory solution when $\gamma$ has finite range or $S = \mathbf{Z}^d$ and $\gamma$ is shift-invariant.

Whilst Chapter 11 depends on Chapter 10, Chapter 12 can be read independently of Chapters 10 and 11. Likewise, Chapter 13 is independent of the Markov chapters 10 to 12. The only chapters of Part I which are indispensable in order to understand this part are Chapters 1, 2, and 7.
With a few exceptions which can be read when the need arises, the results of this part will not be used in Parts III and IV.
Chapter 10 Markov fields on the integers I
In this and the next chapter we shall be concerned with random fields which have parameter set $S = \mathbf{Z}$, the integers. In this chapter, the state space will be any measurable space $(E,\mathcal{E})$, which is endowed with an a priori measure $\lambda \in \mathcal{M}(E,\mathcal{E})$. In Chapter 11, $E$ will be countable. We shall look at $\lambda$-specifications $\gamma = \rho\lambda_\cdot$ which are Markovian in the following sense: For each interval $\Lambda = \{i, i+1, \dots, k\}$ in $\mathbf{Z}$ and each $A \in \mathcal{F}_\Lambda$, $\gamma_\Lambda(A|\omega)$ depends on $\omega$ via $\omega_{i-1}$ and $\omega_{k+1}$ only. This condition is clearly satisfied whenever $\gamma$ is Gibbsian for some nearest-neighbour potential. If $\gamma$ is Markovian then each $\mu \in \mathcal{G}(\gamma)$ exhibits the property that

$$\mu(A|\mathcal{T}_\Lambda) = \mu(A|\mathcal{F}_{\{i-1,k+1\}}) \quad \mu\text{-a.s.}$$

whenever $A$ and $\Lambda$ are as above. This property is called the two-sided Markov property, and each $\mu$ that enjoys this property is called a Markov field. It is natural to ask how the two-sided Markov property is related to the familiar Markov property of Markov chains. The latter states that

$$\mu(A|\mathcal{F}_{\{\dots,i-2,i-1\}}) = \mu(A|\mathcal{F}_{\{i-1\}}) \quad \mu\text{-a.s.}$$
for all i G Z and A e &{i
10.1 Two-sided and one-sided Markov property
Throughout this chapter we set $S = \mathbf{Z}$. We fix an arbitrary measurable space $(E,\mathcal{E})$ and an a priori measure $\lambda$ on $(E,\mathcal{E})$. As far as $\lambda$-specifications are concerned, Remark (1.28)(3) asserts that there is no loss in assuming that $\lambda \in \mathcal{P}(E,\mathcal{E})$. We shall do so throughout the proofs of this chapter. Since $S = \mathbf{Z}$, we shall often work with intervals in $\mathbf{Z}$. It will therefore be convenient to also use the standard interval notations for lattice intervals in $\mathbf{Z}$, in that (10.1)

$$]i,k[\, = \{j\in\mathbf{Z}: i<j<k\}, \qquad ]i,k] = \{j\in\mathbf{Z}: i<j\le k\}, \qquad [i,k] = \{j\in\mathbf{Z}: i\le j\le k\}.$$
Here $-\infty \le i < k \le \infty$. The following definition is basic to this chapter.

(10.2) Definition. A specification $\gamma$ on $\mathbf{Z}$ is said to be a Markov specification if $\gamma_{]i,k[}(A|\cdot)$ is $\mathcal{F}_{\{i,k\}}$-measurable for all $A \in \mathcal{F}_{]i,k[}$ and all $i, k \in \mathbf{Z}$ with $i+1 < k$. A $\lambda$-modification $\rho$ is called Markovian if $\rho_{]i,k[}$ is $\mathcal{F}_{[i,k]}$-measurable whenever $i, k \in \mathbf{Z}$ are such that $i+1 < k$.

Clearly, a $\lambda$-specification $\gamma = \rho\lambda_\cdot$ is Markovian whenever $\rho$ is Markovian. $\rho$ in turn is Markovian whenever $\rho = \rho^\Phi$ for some $\lambda$-admissible nearest-neighbour potential $\Phi$, and Corollary (2.32) provides a converse of this assertion: Each positive Markovian pre-modification is Gibbsian for some nearest-neighbour potential. What about Markovian $\lambda$-modifications $\rho$ which are not necessarily positive? Such $\rho$'s are constructed in the example below. The details are rather lengthy and should be skipped at a first reading.

(10.3) Example. Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ be an a priori measure, and suppose that for each $j \in \mathbf{Z}$ we are given a measurable function $p_j: E\times E \to [0,\infty[$ such that $p_j(x,\cdot)$ is a probability density with respect to $\lambda$ for all $x \in E$. We write

$$g_{i,k}(\omega) = \prod_{j=i+1}^{k} p_j(\omega_{j-1},\omega_j)$$

whenever $i < k$ and $\omega \in \Omega$, and we assume that $Z_{i,k} = \lambda_{]i,k[}\,g_{i,k} < \infty$ for all $i < k$. (Since $\lambda_{]i,k[}\,g_{i,k-1} = 1$, the latter holds whenever all $p_j$'s are bounded.) We define a family $\rho = (\rho_\Lambda)_{\Lambda\in\mathcal{S}}$ as follows. For any finite interval $\Lambda = \,]i,k[$ we put

$$\rho_{]i,k[} = \begin{cases} g_{i,k}/Z_{i,k} & \text{if } Z_{i,k} > 0, \\ g_{i,k-1} & \text{otherwise.} \end{cases}$$

If $\Lambda$ is the union of pairwise disjoint and non-adjacent intervals $I_1,\dots,I_N$, we put

$$\rho_\Lambda = \prod_{n=1}^{N} \rho_{I_n}.$$

(This construction generalizes that of Comment (3.8)(1). In fact, if $p_j > 0$ for all $j$ then $\rho$ is the Gibbsian specification for the nearest-neighbour potential $\Phi$ with $\Phi_{\{j-1,j\}}(\omega) = -\log p_j(\omega_{j-1},\omega_j)$.)
We claim that p is a Markovian A-modification. Indeed, it is evident that p is Markovian and AApA = 1 for all A e ^ . We need only check the consistency of the pA's. We shall do this by verifying condition (c) of Proposition (1.30). So let A , A e ^ with A c A and a e Q b e given. We must show that J AA(dC|o>) J AA(d^|co)|pA(C)pA(^) -
PMPA(0\
=0
for AA\A(- |a)-almost all co. We only do this when A and À are intervals. The extension to general A and A is straightforward. So let m S i < k S n, A = ]i,/c[, and A = ]m,n[. We distinguish the following six cases for co: (i) ZmJco) > 0, Z u (eo) > 0, (ii) k < n, ZmJco) = 0 < ZtJco), (iii) fe = n, Z m ,„M = Ziik(œ) = 0, (iv) Zm>n(co) > 0 = ZLk{co), (v) k < n, ZmJco) = Z i,k(œ) = 0' (yi) ^ = n, Zmi„(o)) = 0 < Ziifc(co). In the first case we have PA(0PA(>?)
= ö'm,iMö't,nMö'i,k(0ö'!-,k(»7)/-Zm,nMZi,t(co) =
PMPÀQ
for all Ç,rçe Q with CS\A = 1s\\ = % \ A - The cases (ii) and (iii) are similar. In case (iv) we have PA(0 = 9mA(O)9i.k(0gkA(°yzm,«((O) = ° for AA(- |co)-almost all Ç because Zitk(co) = AA{gUk\co) = 0. Thus pA(£)pA(>?) = 0 = pA(rç)pA(Ç) for 1A(- |co)2-almost all (Ç,rj). Case (v) is similar. To treat case (vi) we first note that ^AV\(l{Z m „ = 0}^A9m,n)
= l{Z m „ = 0}Zm,n
= 0-
Thus AA(gm „ > 0|co) = 0 for AA\A(- |a)-almost all co e {Zmn = 0 < Zu„}. But for these co we have PA(0PA(>?)
= ö'm,;Mö'i,n-i(0ö'i,n('?)/Zi,nM = 0
for 1A(- |co)2-almost all (Ç,rç).This completes the proof that p is a Markovian 1-modification. o Having introduced the notion of a Markov specification we recall the notion of a (possibly inhomogeneous) Markov chain. In contrast to common usage,
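we shall use the term Markov chain for a probability measure on $(E,\mathcal{E})^{\mathbf{Z}}$ rather than a sequence of random variables on a suitable probability space. Of course, the former is to be thought of as the joint distribution (or "canonical version") of the latter.

(10.4) Definition. Let $(P_j)_{j\in\mathbf{Z}}$ be a family of probability kernels from $\mathcal{E}$ to $\mathcal{E}$. A probability measure $\mu$ on $(\Omega,\mathcal{F}) = (E,\mathcal{E})^{\mathbf{Z}}$ is called a Markov chain for $(P_j)_{j\in\mathbf{Z}}$ if it satisfies the following equivalent conditions. (i) For all $i \in \mathbf{Z}$, $n \ge 0$ and $A_0,\dots,A_n \in \mathcal{E}$, (10.5)

$$\mu(\sigma_{i+j} \in A_j \text{ for all } 0 \le j \le n) = \int_{A_0} \mu(\sigma_i \in dx_0) \int_{A_1} P_{i+1}(x_0,dx_1) \cdots \int_{A_n} P_{i+n}(x_{n-1},dx_n).$$

(ii) For all $i \in \mathbf{Z}$ and $A \in \mathcal{E}$,

$$\mu(\sigma_i \in A\,|\,\mathcal{F}_{]-\infty,i[}) = P_i(\sigma_{i-1},A) \quad \mu\text{-a.s.}$$

If all $P_j$'s are equal to some $P$, $\mu$ is called simply a Markov chain for $P$.

Definitions (10.2) and (10.4) involve two different kinds of Markov property. Suppose $\gamma$ is a Markov specification and $\mu \in \mathcal{G}(\gamma)$. Then

$$\mu(A|\mathcal{T}_{]i,k[}) = \gamma_{]i,k[}(A|\cdot) \quad \mu\text{-a.s.}$$

for all $i, k \in \mathbf{Z}$ with $i+1 < k$ and $A \in \mathcal{F}_{]i,k[}$. Taking conditional expectations with respect to $\mathcal{F}_{\{i,k\}}$ we obtain

$$\mu(A|\mathcal{F}_{\{i,k\}}) = \gamma_{]i,k[}(A|\cdot) \quad \mu\text{-a.s.}$$

and therefore (10.6)

$$\mu(A|\mathcal{T}_{]i,k[}) = \mu(A|\mathcal{F}_{\{i,k\}}) \quad \mu\text{-a.s.}$$

for all $i, k \in \mathbf{Z}$ with $i+1 < k$ and all $A \in \mathcal{F}_{]i,k[}$. Property (10.6) is called the two-sided Markov property, and each $\mu$ which satisfies this property is called a Markov field. On the other hand, if $\mu$ is a Markov chain then (10.4)(ii) implies the following left-sided Markov property: If $i \in \mathbf{Z}$ and $A \in \mathcal{F}_{]i,\infty[}$ then (10.7)

$$\mu(A|\mathcal{F}_{]-\infty,i]}) = \mu(A|\mathcal{F}_{\{i\}}) \quad \mu\text{-a.s.}$$

It is well-known that the left-sided Markov property is equivalent to the right-sided Markov property which states that (10.8)

$$\mu(A|\mathcal{F}_{[i,\infty[}) = \mu(A|\mathcal{F}_{\{i\}}) \quad \mu\text{-a.s.}$$

for all $i \in \mathbf{Z}$ and $A \in \mathcal{F}_{]-\infty,i[}$. (10.7) and (10.8) together are therefore called the one-sided Markov property. (To derive (10.8) from (10.7) let $B \in \mathcal{F}_{\{i\}}$ and $C \in \mathcal{F}_{]i,\infty[}$. Then

$$\mu(A\cap B\cap C) = \mu\bigl(1_{A\cap B}\,\mu(C|\mathcal{F}_{\{i\}})\bigr) = \mu\bigl(\mu(A|\mathcal{F}_{\{i\}})\,1_B\,\mu(C|\mathcal{F}_{\{i\}})\bigr) = \mu\bigl(\mu(A|\mathcal{F}_{\{i\}})\,1_{B\cap C}\bigr).$$

(10.7) entered into the first equality.)

Recall that two further Markov properties were already discussed in (8.24): the global and the local Markov property. In fact, the two-sided and the one-sided Markov property are nothing other than the specialization to the simple graph $\mathbf{Z}$ of the local and global Markov property, respectively. This is shown in the remarks below. For each $V \subset \mathbf{Z}$ we put

$$\partial V = \{i \in \mathbf{Z}\setminus V: |i-j| = 1 \text{ for some } j \in V\}.$$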
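(10.9) Remarks. (1) The two-sided Markov property is equivalent to the local Markov property on $\mathbf{Z}$ which asserts that (10.10)

$$\mu(A|\mathcal{T}_V) = \mu(A|\mathcal{F}_{\partial V}) \quad \mu\text{-a.s. for all } A \in \mathcal{F}_V$$

whenever $V \subset \mathbf{Z}$ is finite. (2) The one-sided Markov property is equivalent to the global Markov property on $\mathbf{Z}$ which is defined by the requirement that (10.10) holds for all $V \subset \mathbf{Z}$. (3) The one-sided Markov property implies the two-sided Markov property.

Proof. 1) Suppose $V, W \subset \mathbf{Z}$ are such that $V \cap (W \cup \partial W) = \emptyset$ and (10.10) holds for both $V$ and $W$. Then (10.10) holds for $V \cup W$. To prove this let $B \in \mathcal{F}_V$, $C \in \mathcal{F}_W$, and $A = B \cap C$. For any $D \in \mathcal{T}_{V\cup W}$ we have

$$\mu(A\cap D) = \mu\bigl(\mu(B|\mathcal{F}_{\partial V})\,1_{C\cap D}\bigr) = \mu\bigl(\mu(B|\mathcal{F}_{\partial V})\,\mu(C|\mathcal{F}_{\partial W})\,1_D\bigr)$$

and therefore

$$\mu(A|\mathcal{T}_{V\cup W}) = \mu(B|\mathcal{F}_{\partial V})\,\mu(C|\mathcal{F}_{\partial W}) \quad \mu\text{-a.s.}$$

Since $\partial(V\cup W) = \partial V \cup \partial W$, we conclude that

$$\mu(A|\mathcal{T}_{V\cup W}) = \mu(A|\mathcal{F}_{\partial(V\cup W)}) \quad \mu\text{-a.s.}$$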
By a standard argument, the last identity extends to all A e JVuw2) According to Step 1) above, equation (10.10) holds for any finite union of non-adjacent finite intervals whenever it holds for finite intervals. This proves Remark (1). 3) Remark (3) follows directly from Remarks (1) and (2), but we shall prove (3) before (2). Suppose /i satisfies the one-sided Markov property, and let i + 1 < k, A e &v
ß(c, n B, n A n B2 n c 2 ) = M i c ^ r w ^ M Q i J^ } )) = MMC1|^'{£})iBl/iMl^'{,.t})iB2/x(c2|^{t})) = /^(lcInji1^(^l^{i.t})1Ji2nc2)The first equality follows from (10.7), the second from (10.8), and the last from both (10.7) and (10.8). This proves (10.6) and thereby Remark (3). 4) To prove Remark (2) we first note that the one-sided Markov property simply means that (10.10) holds for any infinite interval V. By Steps 3) and 1) above, this implies (10.10) for every finite union of non-adjacent finite or infinite intervals. If V is an infinite union of arbitrary non-adjacent intervals, we let W be a finite subunion and A e !FW. Since dW cz dV and (10.10) holds for A and W, (10.10) also holds for A and V. A standard extension argument then gives (10.10) for V and all A e J'y. a For our purposes, the main point of the preceding remark is assertion (3): Every Markov chain is a Markov field. The converse does not hold in general. This is fairly clear from a comparison of assertions (1) and (2). In fact, we shall provide a counterexample in (11.33). We are thus led to ask if the converse holds, at least, under suitable conditions. The next section is devoted to this question. We conclude this section with a more explicit version of Remark (10.9)(3). (10.11) Example. Suppose n is a Markov chain for transition kernels (P;);6z, where P,(x, •) = p;(x, -)X for all x and i and the p/s satisfy the hypotheses of Example (10.3). Define p as in that example, and let y = pX.. Then ß e
J
d/x y,i,t[(ojj,t[ e B\ •)
whenever A,C eS,Be ê]i-K, and i + 1 < k. But using the notation of Example (10.3) we obtain from (10.5) ß{aieA,a]Uk[eB,akeC) =
J
n(dœ)^(dy)\{Zik(azx{k]y)>0]
{<7i
=
J
C
=
B
/i(dtö) J Â(dy)Zik(œzx{k]y)y]
{ateA}
$ *]i'k[(dQgao>z\v,k](y) e B\cozx{k}y)
C
J
n(dœ)\(Pi+i...Pk)(œi,dy)y]
{"it A]
1
C
v(dco)y]Uk[{<j]iMeB\œ).
{
This shows that ß is specified by y.
o
£B\uj_\{k]y)
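The statement that a Markov chain is a Markov field can be checked by brute force on a small finite state space. The sketch below is an illustration only: the 3-state stochastic matrix `P`, the initial distributions, and all function names are hypothetical choices. It compares the conditional law of $\sigma_1$ given $(\sigma_0,\sigma_2,\sigma_3)$ under a chain started at time 0 with the "bridge" expression $P(x,y)P(y,z)/P^2(x,z)$, which is what the construction of Example (10.3) gives for $\Lambda = \,]0,2[$ with counting measure as a priori measure.

```python
from itertools import product

# Hypothetical 3-state transition matrix (rows sum to 1).
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.4, 0.0, 0.6]]
STATES = range(3)

def two_step(x, z):
    """P^2(x, z)."""
    return sum(P[x][y] * P[y][z] for y in STATES)

def bridge(x, y, z):
    """P(x,y) P(y,z) / P^2(x,z): the density of the middle spin given both neighbours."""
    return P[x][y] * P[y][z] / two_step(x, z)

def conditional(init, x, y, z, w):
    """mu(sigma_1 = y | sigma_0 = x, sigma_2 = z, sigma_3 = w) for a chain with
    initial law `init`; returns None when the conditioning event has probability 0."""
    def joint(v):
        return init[x] * P[x][v] * P[v][z] * P[z][w]
    denom = sum(joint(v) for v in STATES)
    return None if denom == 0.0 else joint(y) / denom

def max_deviation(init):
    """Largest gap between the chain's conditional law and the bridge formula."""
    worst = 0.0
    for x, y, z, w in product(STATES, repeat=4):
        c = conditional(init, x, y, z, w)
        if c is not None:
            worst = max(worst, abs(c - bridge(x, y, z)))
    return worst

print(max_deviation([0.2, 0.3, 0.5]))
print(max_deviation([1.0, 0.0, 0.0]))
```

Both printed deviations should be zero up to rounding: the conditional law depends neither on the extra coordinate $\sigma_3$ nor on the initial distribution, which is the two-sided Markov property in this toy setting.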
10.2 Markov fields which are Markov chains
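Let $\rho$ be a Markovian $\lambda$-modification and $\gamma = \rho\lambda_\cdot$. We pose the following question: Are there any elements of $\mathcal{G}(\gamma)$ which have the one-sided Markov property? There is a natural approach to this problem. For given $\mu \in \mathcal{G}(\gamma)$, we shall try to define transition kernels $P_i$ by the formula

$$P_i(\sigma_{i-1},A) = \lim_{k\to\infty} \gamma_{[i,k[}(\sigma_i \in A\,|\,\cdot) \quad \mu\text{-a.s.}$$

Here $i \in \mathbf{Z}$ and $A \in \mathcal{E}$. Of course, this is assuming that the limit on the right only depends on $\sigma_{i-1}$ $\mu$-almost surely. Since this limit is measurable with respect to $\bigcap_{k>i} \mathcal{F}_{\{i-1\}\cup[k,\infty[}$, this amounts to the requirement that there is no influence of the right tail $\bigcap_{k>i} \mathcal{F}_{[k,\infty[}$.

To be more specific we define (10.12)

$$\rho^j_{]i,k[} = \lambda_{]i,k[\setminus\{j\}}\,\rho_{]i,k[}$$

whenever $i < j < k$. Since $\rho$ is Markovian, $\rho^j_{]i,k[}$ is measurable with respect to $\mathcal{F}_{\{i,j,k\}}$. Clearly,

$$\gamma_{]i,k[}(\sigma_j \in A\,|\,\omega) = \int_A \lambda(dx)\,\rho^j_{]i,k[}(x\,\omega_{\mathbf{Z}\setminus\{j\}})$$

for all $A \in \mathcal{E}$ and $\omega \in \Omega$. Next, we fix any $\mu \in \mathcal{G}(\gamma)$ and put (10.13)

$$\nu_j = \mu\lambda_{\{j\}} \qquad (j \in \mathbf{Z}).$$

(At this point it is convenient to assume without loss that $\lambda \in \mathcal{P}(E,\mathcal{E})$.) Since $\gamma_{\{j\}} = \rho_{\{j\}}\lambda_{\{j\}}$ we have (10.14)

$$\rho_{\{j\}}\nu_j = \mu\gamma_{\{j\}} = \mu.$$

On the other hand, we also have (10.15)

$$\mu = \rho^j_{]i,k[}\,\nu_j \quad \text{on } \mathcal{T}_{]i,k[\setminus\{j\}}.$$

Indeed, let $f$ be bounded and measurable with respect to $\mathcal{T}_{]i,k[\setminus\{j\}}$. Then

$$\nu_j(\rho^j_{]i,k[}f) = \nu_j\bigl((\lambda_{]i,k[\setminus\{j\}}\rho_{]i,k[})f\bigr) = \nu_j\lambda_{]i,k[\setminus\{j\}}(\rho_{]i,k[}f) = \mu\lambda_{]i,k[}(\rho_{]i,k[}f) = \mu\gamma_{]i,k[}(f) = \mu(f).$$

From (10.14) and (10.15) we conclude that (10.16)

$$\rho^j_{]i,k[} = \nu_j\bigl(\rho_{\{j\}}\,|\,\mathcal{T}_{]i,k[\setminus\{j\}}\bigr) \quad \nu_j\text{-a.s.}$$

Taking conditional expectations with respect to $\mathcal{F}_{\{i,j\}\cup[k,\infty[}$ we also obtain (10.17)

$$\rho^j_{]i,k[} = \nu_j\bigl(\rho_{\{j\}}\,|\,\mathcal{F}_{\{i,j\}\cup[k,\infty[}\bigr) \quad \nu_j\text{-a.s.}$$

Consequently, for any two fixed integers $i, j$ with $i < j$ the sequence $(\rho^j_{]i,k[})_{k>j}$ is a backward martingale with respect to $\nu_j$. This martingale converges $\nu_j$-almost surely and in $L^1(\nu_j)$-norm towards a limit $\rho^j_{]i,\infty[}$ which satisfies (10.18)

$$\rho^j_{]i,\infty[} = \nu_j\Bigl(\rho_{\{j\}}\,\Big|\,\bigcap_{k>j}\mathcal{T}_{]i,k[\setminus\{j\}}\Bigr) = \nu_j\Bigl(\rho_{\{j\}}\,\Big|\,\bigcap_{k>j}\mathcal{F}_{\{i,j\}\cup[k,\infty[}\Bigr) \quad \nu_j\text{-a.s.}$$

Now take $i = j-1$ and suppose that $\rho^j_{]j-1,\infty[}$ does not depend on the right tail, in that (10.19)

$$\rho^j_{]j-1,\infty[} = \nu_j\bigl(\rho_{\{j\}}\,|\,\mathcal{F}_{\{j-1,j\}}\bigr) \quad \nu_j\text{-a.s. for all } j \in \mathbf{Z}.$$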
Under this condition, n is a Markov chain. This is ensured by the lemma below. (10.20) Lemma. Let p be a Markovian X-modification, and suppose fi e ^(pX) satisfies (10.19). Then there exists a family (p,) jeZ of measurable functions pf. E x E -> [0, oo [ with the properties below. (0 Pj(
on E x S is a probability kernel. (Hi) ß is a Markov chain for (Pj)jeZ. Proof. Without loss we assume that X e 0>(E, S). For given j e Z we let q/. E x E -> [0, oo [ be any measurable function such that v
j(P{j]I^{j-uj}) = aMj-i, °))
v a
r -s- •
Define J
'
\l
otherwise.
Pj clearly satisfies (ii). Also, for each B e ^{j-X} we have j vj{dœ)À(qj(œJ_1, •)) = J /z(dco) J A(dx)q.(co,_t,x) B
B
= Jdv,.q>,._1,<7J.) = f dv,.p w B
B
= ß(B) = v,.(B) and therefore kiq^co^y, •)) = I for v,-almost all a>. Thus pj(aj_l,aJ) = q/oy-i.CT,-) v,-a.s., and this gives (i). Finally, (10.18) and (10.19) imply that
k>j
and therefore Pj(°>-i» °j) = vj(P{j)l-^i-oo.j]) v r a.s.. Consequently, \{ A e S and B e 3F^_m^ then j d/z P/oy.!,^) =• v,(lBn{«,,e^}Pj(oy-i,oy)) B =
V/(lBn{ffj.e^}P{j})
= /z(Bn{o > e^}). This gives property (10.4) (ii) and completes the proof of the lemma,
o
By virtue of the lemma above, the problem of establishing the one-sided Markov property has been reduced to the problem of finding sufficient conditions for (10.19). We shall discuss two such conditions, namely (I) extremality of $\mu$ in $\mathcal{G}(\gamma)$, and (II) shift-invariance of $\mu$, combined with an irreducibility condition on $\gamma$. Case (I) is treated in the theorem below.

(10.21) Theorem. Let $\rho$ be a Markovian $\lambda$-modification on $\mathbf{Z}$ and $\gamma = \rho\lambda_\cdot$. Then each $\mu \in \mathrm{ex}\,\mathcal{G}(\gamma)$ is a Markov chain for suitable transition kernels $P_j$ of the form $P_j(x,\cdot) = p_j(x,\cdot)\lambda$. Here $j \in \mathbf{Z}$, $x \in E$, and $p_j$ is a measurable nonnegative function on $E \times E$.

Proof. We verify (10.19). Let $j \in \mathbf{Z}$ be fixed. We write $\Lambda = \{j-1,j\}$, and we let $f$ be any nonnegative function on $\Omega$ which is measurable with respect to $\bigcap_{k>j}\mathcal{F}_{\Lambda\cup[k,\infty[}$. We define a function $g$ on $E^\Lambda$ by

$$g(\zeta) = \int \mu(d\omega)\,f(\zeta\,\omega_{\mathbf{Z}\setminus\Lambda}) \qquad (\zeta \in E^\Lambda).$$

We will show that $f = g(\sigma_\Lambda)$ $\nu_j$-almost surely. Applying this to $f = \rho^j_{]j-1,\infty[}$ and using (10.18) we shall arrive at (10.19). We fix any $\zeta \in E^\Lambda$. Since $f$ is measurable with respect to $\bigcap_{k>j}\mathcal{F}_{\Lambda\cup[k,\infty[}$, Fubini's theorem implies that $f(\zeta\,\sigma_{\mathbf{Z}\setminus\Lambda})$ is measurable with respect to the right tail $\bigcap_{k>j}\mathcal{F}_{[k,\infty[}$. By Theorem (7.7)(a), $\mu$ is trivial on $\mathcal{T}$ and, a fortiori, on the right tail. Thus $f(\zeta\,\sigma_{\mathbf{Z}\setminus\Lambda})$ is constant $\mu$-almost surely, and therefore $f(\zeta\,\sigma_{\mathbf{Z}\setminus\Lambda}) = g(\zeta)$ $\mu$-a.s. Since $\zeta$ was arbitrary, we conclude from Fubini's theorem that

$$\lambda^\Lambda \otimes \mu\bigl((\zeta,\omega) \in E^\Lambda\times\Omega: f(\zeta\,\omega_{\mathbf{Z}\setminus\Lambda}) \ne g(\zeta)\bigr) = 0.$$

This can be rewritten as

$$\lambda \otimes \nu_{j-1}\bigl((x,\omega) \in E\times\Omega: f(x\,\omega_{\mathbf{Z}\setminus\{j\}}) \ne g(\omega_{j-1}x)\bigr) = 0.$$

Because of (10.14), $\mu$ is absolutely continuous with respect to $\nu_{j-1}$. Thus $\nu_{j-1}$ may be replaced by $\mu$ in the last equation, and this gives

$$\nu_j\bigl(f \ne g(\sigma_\Lambda)\bigr) = \lambda \otimes \mu\bigl((x,\omega) \in E\times\Omega: f(x\,\omega_{\mathbf{Z}\setminus\{j\}}) \ne g(\omega_{j-1}x)\bigr) = 0.$$

The proof is thus complete. □
We note in passing that the preceding proof has shown that

$$\mathcal{F}_\Lambda = \bigcap_{k>j}\mathcal{F}_{\Lambda\cup[k,\infty[} \quad \mu\text{-a.s.}$$

whenever $\bigcap_{k>j}\mathcal{F}_{[k,\infty[}$ is trivial and $\Lambda = \{j-1,j\}$. The local absolute continuity of $\mu$ with respect to $\lambda^S$ is an essential prerequisite for this to hold. In general, the preceding statement fails. This problem has been discussed by von Weizsäcker (1983).

One might suspect that Theorem (10.21) has a converse, that each Markov chain in $\mathcal{G}(\gamma)$ was extreme in $\mathcal{G}(\gamma)$. This is not the case, as we shall see in (11.33) and (11.39). Combining Theorems (7.26) and (10.21) we obtain the following corollary.

(10.22) Corollary. Suppose $(E,\mathcal{E})$ is a standard Borel space, and let $\rho$ be a Markovian $\lambda$-modification and $\gamma = \rho\lambda_\cdot$. If $\mathcal{G}(\gamma) \ne \emptyset$ then $\mathcal{G}(\gamma)$ contains a Markov chain.

Turning to the above-mentioned case (II), we shall now assume that $\rho$ is homogeneous (i.e., shift-invariant). We shall also need that $\rho$ is irreducible in the sense below.

(10.23) Definition. Let $\rho$ be a homogeneous Markovian $\lambda$-modification. We say that $\rho$ is irreducible if for each $N \ge 1$ there exists a set $C_N \in \mathcal{E}$, an integer $n(N) \ge 1$, and a measurable function $h_N: E \to [0,\infty[$ such that $C_N \uparrow E$ as $N \uparrow \infty$, $\lambda(h_N) > 0$ for sufficiently large $N$, and

$$\rho^0_{]-n(N),n(N)[}(\omega) \ge h_N(\omega_0)$$

whenever $\omega \in \Omega$ and $N \ge 1$ are such that $\omega_{-n(N)} \in C_N$ and $\omega_{n(N)} \in C_N$.

The following examples should give some feeling for the condition of irreducibility.

(10.24) Examples. (1) Let $\lambda \in \mathcal{M}(E,\mathcal{E})$ and $\Phi$ be a $\lambda$-admissible shift-invariant nearest-neighbour potential. We may assume without loss that
Suppose that
j « ^ + (%} + 0>{i+1})/2 [0
if A = {/, i + 1}, otherwise.)
200
Markov fields on the integers I C = s u p Zf0}(co) < oo meiï
and cN^
sup
0 {0>1} (to)< co
co:coo,coi e CN
for some sequence CN ] E. Then p * is irreducible. For we can just take n{N) = 1 and hN = l c „e- 2c Vc (2) Let £ be a countable set and À be counting measure. Also, let P be a stochastic matrix on £. Define p in terms of P by putting pj(x, y) — P(x, y) in Example (10.3). Thus k
P];,k[M =
n
fK-i,fflj)
'^"'(to,-.^)
Lj=i+1
whenever P k '(GO,-, eoj > 0, co e Q, and i + 1 < k. In particular, we have for all n ^ 1 and to e Œ with P2n(co_n,co„) > 0. We claim that p is irreducible whenever P is aperiodic and irreducible in the usual sense. (See, for example, Chapter 7 of Breiman (1968).) Indeed, it is well-known that such a P enjoys the following property: For all x, y e E there exists an integer n(x, y) ^ 1 such that P"(x, y) > 0 for all n 5: n(x, y). Therefore we can take any sequence (Qv)jväi of finite subsets of £ with CN1E, together with n(N) = max n(x,y) x eC and -» " hN=lcN
min
Pnm(x,y)P"m(y,z)/P2nm(x,z).
o
x,y,zeC„
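The choice of $n(N)$ and $h_N$ in Example (2) can be checked numerically on a small chain. The following Python sketch is only an illustration: the 3-state matrix `P`, the choice $C_N = E$, and the helper names `mat_mul`, `mat_pow`, `first_positive_power` and `bridge_lower_bound` are assumptions made for this example and do not come from the text. It looks for the first power of $P$ with strictly positive entries, uses it as $n(N)$, and evaluates the minimum of $P^{n}(x,y)P^{n}(y,z)/P^{2n}(x,z)$.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(p, n):
    """Return the n-th power of p for n >= 1 by repeated multiplication."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result


def first_positive_power(p, max_n=100):
    """Smallest n <= max_n such that every entry of p^n is positive."""
    power = p
    for n in range(1, max_n + 1):
        if all(entry > 0 for row in power for entry in row):
            return n
        power = mat_mul(power, p)
    raise ValueError("no strictly positive power found up to max_n")


def bridge_lower_bound(p, n):
    """Minimum over x, y, z of P^n(x,y) P^n(y,z) / P^{2n}(x,z)."""
    pn = mat_pow(p, n)
    p2n = mat_mul(pn, pn)
    size = len(p)
    return min(pn[x][y] * pn[y][z] / p2n[x][z]
               for x in range(size) for y in range(size) for z in range(size))


# Hypothetical aperiodic irreducible chain on E = {0, 1, 2}
# (it has cycles of length 2 and 3, so the period is 1).
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]

n_N = first_positive_power(P)     # plays the role of n(N) with C_N = E
h_N = bridge_lower_bound(P, n_N)  # constant value of h_N on C_N
print(n_N, h_N)
```

Since every column of an irreducible $P$ contains a positive entry, all later powers stay strictly positive once one power is, so the first strictly positive power is an admissible $n(N)$ in this finite setting.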
We now state the theorem which deals with case (II) above. Recall from (5.2)(1) and (5.13) that $\mathcal{G}_\Theta(\gamma)$ stands for the set of all shift-invariant elements of $\mathcal{G}(\gamma)$.

(10.25) Theorem. Let $(E, \mathcal{E})$ be a standard Borel space, $\lambda \in \mathcal{M}(E, \mathcal{E})$ an a priori measure, $\rho$ an irreducible homogeneous Markovian $\lambda$-modification on $\mathbb{Z}$, and $\gamma = \rho\lambda$. Under these conditions, each $\mu \in \mathcal{G}_\Theta(\gamma)$ is a Markov chain for some transition kernel $P$ satisfying $P(x, \cdot) = p(x, \cdot)\lambda$ for all $x \in E$ and some measurable function $p: E \times E \to [0, \infty[$.

To prove this theorem we shall verify (10.19). A key to accomplishing this is provided by the proposition below.

(10.26) Proposition. Suppose the hypotheses of Theorem (10.25) are satisfied, and define $\nu = \mu\lambda_{\{0\}}$. Let $(f_k)_{k \ge 1}$ be a sequence of $\nu$-integrable functions which exhibit the following properties:
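(i) As $k \to \infty$, $f_k$ tends in $L^1(\nu)$-norm towards some limit $f_\infty$.
(ii) There is a set $\Delta \in \mathcal{S}$ containing 0 such that $f_k$ is measurable with respect to $\mathcal{F}_{\Delta\cup\{k\}}$ (resp. $\mathcal{F}_{\Delta\cup\{-k\}}$) for all $k \ge 1$.
Then $f_\infty$ is measurable with respect to $\mathcal{F}_\Delta$ $\nu$-almost surely.

Proof. We assume without loss that $\lambda \in \mathcal{P}(E, \mathcal{E})$, and we only look at the case where $f_k$ is $\mathcal{F}_{\Delta\cup\{k\}}$-measurable for all $k$. (The other case is similar.) As $(E, \mathcal{E})$ is standard Borel, there exists a regular version $\pi(A|\omega)$ of the conditional probability $\nu(A|\mathcal{F}_\Delta)(\omega)$, where $A \in \mathcal{F}$ and $\omega \in \Omega$. We extend $\pi$ to a proper probability kernel from $\mathcal{F}_\Delta$ to $\mathcal{F}$ in the natural way, and we define a probability measure $\bar\nu$ on $(\Omega \times \Omega, \mathcal{F} \times \mathcal{F})$ by
$$\bar\nu = \int \nu(d\omega)\, \pi(\cdot\,|\,\omega) \times \pi(\cdot\,|\,\omega).$$
Clearly, $\bar\nu(A \times \Omega) = \bar\nu(\Omega \times A) = \nu(A)$ for all $A \in \mathcal{F}$. We also have
$$\bar\nu\big((\zeta,\eta) \in \Omega^2: \zeta_\Delta = \eta_\Delta\big) \ge \int \nu(d\omega)\, \pi(\sigma_\Delta = \omega_\Delta\,|\,\omega)^2 = 1$$
because $\pi$ is proper. We intend to show that (10.27)
$$\lim_{k\to\infty} \int \bar\nu(d\zeta, d\eta)\, |f_k(\zeta) - f_k(\eta)| = 0.$$
In fact, combining (10.27) with assumption (i) and the inequality
$$\nu(|f_k - \pi f_k|) = \int \nu(d\omega) \int \pi(d\zeta|\omega)\, \Big| \int \pi(d\eta|\omega)\, [f_k(\zeta) - f_k(\eta)] \Big| \le \int \bar\nu(d\zeta, d\eta)\, |f_k(\zeta) - f_k(\eta)|$$
we obtain that
$$\lim_{k\to\infty} \nu(|f_\infty - \pi f_k|) = 0,$$
and this proves the proposition because each $\pi f_k$ is $\mathcal{F}_\Delta$-measurable. The proof of (10.27) is broken up into three steps.

Step 1. For given $\zeta, \eta \in \Omega$, $x \in E$, $k \in \mathbb{Z}$ and $n \ge 1$ we define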
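$$\varphi_n^k(x, \zeta) = \rho^{\{k\}}_{]k-n,k+n[}(x\zeta_{\mathbb{Z}\setminus\{k\}}),$$
$\varphi_n^k(x, \zeta, \eta) = \varphi_n^k(x, \zeta) \wedge \varphi_n^k(x, \eta)$, and $\bar\varphi_n^k(\zeta, \eta) = \lambda(\varphi_n^k(\cdot, \zeta, \eta))$. We claim that (10.28)
$$\lim_{k\to\infty} \int \bar\nu(d\zeta, d\eta)\, \bar\varphi_n^{k-n}(\zeta, \eta)\, |f_k(\zeta) - f_k(\eta)| = 0$$
for all $n \ge 1$. Let $k$ be so large that $[k - 2n, k] \cap \Delta = \emptyset$. For all $x \in E$ and $\zeta^1, \zeta^2 \in \Omega$ with $\zeta^1_\Delta = \zeta^2_\Delta$ we then have
$$|f_k(\zeta^1) - f_k(\zeta^2)| \le \sum_{i=1}^{2} \big|f_k(\zeta^i) - f_{k-n}(x\zeta^i_{\mathbb{Z}\setminus\{k-n\}})\big|.$$
Therefore the integral in (10.28) is at most
$$\int \bar\nu(d\zeta^1, d\zeta^2) \int \lambda(dx) \sum_{i=1}^{2} \varphi_n^{k-n}(x, \zeta^i)\, \big|f_k(\zeta^i) - f_{k-n}(x\zeta^i_{\mathbb{Z}\setminus\{k-n\}})\big|$$
$$= 2 \int \nu(d\zeta) \int \lambda(dx)\, \varphi_n^{k-n}(x, \zeta)\, \big|(f_k - f_{k-n})(x\zeta_{\mathbb{Z}\setminus\{k-n\}})\big|$$
$$= 2\, \nu\lambda_{\{k-n\}}\big(\rho^{\{k-n\}}_{]k-2n,k[}\, |f_k - f_{k-n}|\big) = 2\, \nu\lambda_{]k-2n,k[}\big(\rho_{]k-2n,k[}\, |f_k - f_{k-n}|\big)$$
$$= 2\, \nu\gamma_{]k-2n,k[}(|f_k - f_{k-n}|) = 2\, \nu(|f_k - f_{k-n}|).$$
In the above, the first equality comes from the fact that $f_k(\omega)$ does not depend on $\omega_{k-n}$. The third equality depends on (10.12) and the fact that $|f_k - f_{k-n}|$ is measurable with respect to $\mathcal{T}_{]k-2n,k[\setminus\{k-n\}}$. The final equality is justified as follows. Let $g$ be any bounded measurable function and $\Lambda = ]k - 2n, k[$. Then
$$\nu\gamma_\Lambda(g) = \mu\lambda_{\Lambda\cup\{0\}}(\rho_\Lambda g) = \mu\lambda_\Lambda(\rho_\Lambda \lambda_{\{0\}} g) = \mu\gamma_\Lambda(\lambda_{\{0\}} g) = \nu(g)$$
because $\rho_\Lambda$ is measurable with respect to $\mathcal{F}_{\Lambda\cup\partial\Lambda}$ which is contained in $\mathcal{T}_{\{0\}}$. The proof of (10.28) is completed by noting that $\lim_{k\to\infty} \nu(|f_k - f_{k-n}|) = 0$ because of assumption (i).

Step 2. For all $\varepsilon > 0$ there is a number $\delta > 0$ and an integer $n \ge 1$ such that
$$\sup_{|k|>n} \bar\nu(\bar\varphi_n^k < \delta) < \varepsilon.$$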
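This comes from the irreducibility of $\rho$. For let $C_N$, $h_N$ and $n(N)$ be as in Definition (10.23), and let $\alpha = \sigma_0(\mu)$ be the one-dimensional marginal distribution of $\mu$. We choose $N$ so large that $\lambda(h_N) > 0$ and $2\alpha(C_N) - 1 > (1 - \varepsilon)^{1/2}$, and we put $n = n(N)$ and $\delta = \lambda(h_N)$. Then
$$\bar\nu(\bar\varphi_n^k \ge \delta) \ge \int \nu(d\omega)\, \pi(\sigma_{k-n}, \sigma_{k+n} \in C_N\,|\,\omega)^2 \ge \nu\pi(\sigma_{k-n}, \sigma_{k+n} \in C_N)^2 = \nu(\sigma_{k-n}, \sigma_{k+n} \in C_N)^2.$$
But if $|k| > n$ then
$$\nu(\sigma_{k-n}, \sigma_{k+n} \in C_N) = \mu(\sigma_{k-n}, \sigma_{k+n} \in C_N)$$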
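$$\ge \mu(\sigma_{k-n} \in C_N) + \mu(\sigma_{k+n} \in C_N) - 1 = 2\alpha(C_N) - 1.$$
This completes Step 2.

Step 3. It is fairly obvious how Step 2 can be used to derive (10.27) from (10.28). Let us introduce the abbreviation
$$g_k(\zeta, \eta) = |f_k(\zeta) - f_k(\eta)| \qquad (1 \le k \le \infty,\ \zeta, \eta \in \Omega).$$
For fixed $\varepsilon > 0$ we let $c > 0$ be so large that $\bar\nu(g_\infty 1_{\{g_\infty \ge c\}}) < \varepsilon$. By virtue of Step 2, we can find an $n \ge 1$ and $\delta > 0$ such that $\bar\nu(\bar\varphi_n^{k-n} < \delta) < \varepsilon/c$ for all $k > 2n$. Assumption (i) and (10.28) ensure that
$$\nu(|f_k - f_\infty|) < \varepsilon/2 \quad\text{and}\quad \bar\nu(\bar\varphi_n^{k-n} g_k) < \varepsilon\delta$$
for all sufficiently large $k$. For these $k$ we obtain, using the triangle inequality
$$g_k(\zeta, \eta) \le g_\infty(\zeta, \eta) + |f_k(\zeta) - f_\infty(\zeta)| + |f_k(\eta) - f_\infty(\eta)|,$$
$$\bar\nu(g_k) \le \bar\nu(1_{\{\bar\varphi_n^{k-n} < \delta\}}\, g_k) + \delta^{-1}\, \bar\nu(\bar\varphi_n^{k-n} g_k)$$
$$\le \bar\nu(1_{\{\bar\varphi_n^{k-n} < \delta\}}\, g_\infty) + 2\nu(|f_k - f_\infty|) + \varepsilon$$
$$\le c\, \bar\nu(\bar\varphi_n^{k-n} < \delta) + \bar\nu(g_\infty 1_{\{g_\infty \ge c\}}) + 2\varepsilon < 4\varepsilon.$$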
This completes the proof of (10.27) and the proposition. □

With the help of Proposition (10.26), the proof of Theorem (10.25) is straightforward.

Proof of Theorem (10.25). Let $\mu \in \mathcal{G}_\Theta(\gamma)$ be given. We apply Proposition (10.26) to the sequence $f_k = \rho^{\{0\}}_{]-1,k[}$, $k \ge 1$. Assumption (i) of (10.26) holds because $(f_k)_{k \ge 1}$ is a backward martingale with respect to $\nu = \mu\lambda_{\{0\}}$. (Recall (10.17).) In particular, $f_\infty = \rho^{\{0\}}_{]-1,\infty[}$ $\nu$-a.s. Assumption (ii) of (10.26) holds with $\Delta = \{-1, 0\}$ because $\rho$ is Markovian. Hence Proposition (10.26) implies that $\rho^{\{0\}}_{]-1,\infty[}$ is equal to some $\mathcal{F}_{\{-1,0\}}$-measurable function $g$ $\nu$-almost surely. Consequently, (10.19) holds for $j = 0$. For arbitrary $j \in \mathbb{Z}$ we use the shift-invariance of $\rho$ and $\mu$ to deduce that
$$\nu_j\big(\rho^{\{j\}}_{]j-1,\infty[} = g \circ \theta_{-j}\big) = \nu\big(\rho^{\{j\}}_{]j-1,\infty[} \circ \theta_j = g\big) = 1.$$
The theorem thus follows from Lemma (10.20).
o
10.3 Uniqueness of the shift-invariant Markov field
Throughout this section we shall assume that the hypotheses of Theorem (10.25) are satisfied. Thus $(E, \mathcal{E})$ is assumed to be a standard Borel space, $\lambda$ is any a priori measure (without loss: probability measure) on $(E, \mathcal{E})$, $\rho$ an irreducible homogeneous Markovian $\lambda$-modification, and $\gamma = \rho\lambda$. We intend to show that there is at most one $\mu \in \mathcal{G}_\Theta(\gamma)$. This will be achieved as follows. First of all, Theorem (10.25) asserts that any $\mu \in \mathcal{G}_\Theta(\gamma)$ is a Markov chain for a suitable transition kernel $P$. Clearly, the one-dimensional marginal distribution $\alpha = \sigma_0(\mu)$ is $P$-invariant, in that $\alpha P = \alpha$. We shall show that $P$ is even ergodic in the following sense: For $\alpha$-almost all $x$, $P^n(x, \cdot)$ tends to $\alpha$ in total variation distance as $n \to \infty$. Of course, this ergodic theorem is interesting in its own right. We then shall use the ergodic theorem to show that each $\mu \in \mathcal{G}_\Theta(\gamma)$ has short-range correlations and is thus trivial on the tail $\sigma$-field. By virtue of Theorem (7.7), this means that $\mathcal{G}_\Theta(\gamma) \subset \operatorname{ex} \mathcal{G}(\gamma)$. But this implies that $|\mathcal{G}_\Theta(\gamma)| \le 1$.

To carry out this program we fix any $\mu \in \mathcal{G}_\Theta(\gamma)$. By Theorem (10.25), there is a measurable function $p: E \times E \to [0, \infty[$ such that
$$P(x, A) = \int_A \lambda(dy)\, p(x, y) \qquad (x \in E,\ A \in \mathcal{E})$$
is a probability kernel and $\mu$ is a Markov chain with transition kernel $P$. Let us look at the iterates $P^n$ of $P$. For each $n \ge 1$ we define a measurable function $p^n: E \times E \to [0, \infty[$ recursively by (10.29)
$$p^1(x, y) = p(x, y), \qquad p^{n+1}(x, y) = \lambda\big(p^n(x, \cdot)\, p(\cdot, y)\big),$$
where $x, y \in E$. Then $P^n(x, \cdot) = p^n(x, \cdot)\lambda$ for all $x \in E$ and $n \ge 1$. Looking at (10.5) we see immediately that
$$P^n(\sigma_{-n}, A) = \mu(\sigma_0 \in A\,|\,\mathcal{F}_{\{-n\}}) = \mu(\sigma_0 \in A\,|\,\mathcal{F}_{]-\infty,-n]}) \quad \mu\text{-a.s.}$$
for all $A \in \mathcal{E}$ and $n \ge 1$. Because of (10.14) this is equivalent to the statement that (10.30)
$$p^n(\sigma_{-n}, \sigma_0) = \nu(\rho_{\{0\}}\,|\,\mathcal{F}_{\{-n,0\}}) = \nu(\rho_{\{0\}}\,|\,\mathcal{F}_{]-\infty,-n]\cup\{0\}}) \quad \nu\text{-a.s.}$$
for all $n \ge 1$. Here, as before, $\nu = \mu\lambda_{\{0\}}$. Next we consider the one-dimensional marginal distribution $\alpha = \sigma_0(\mu)$ of $\mu$. We set (10.31)
$$r(x) = \int \alpha(du)\, p(u, x) \qquad (x \in E).$$
Then (10.32)
$$\alpha = r\lambda, \quad\text{and}\quad r(\sigma_0) = \nu(\rho_{\{0\}}\,|\,\mathcal{F}_{\{0\}}) \quad \nu\text{-a.s.}$$
Indeed, for all $A \in \mathcal{E}$ we have
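$$r\lambda(A) = \int \alpha(du)\, P(u, A) = \alpha P(A) = \alpha(A),$$
and (10.30) (for $n = 1$) gives
$$\int_{\{\sigma_0 \in A\}} r(\sigma_0)\, d\nu = \int_A \lambda(dx) \int \mu(d\omega)\, p(\omega_{-1}, x) = \int_{\{\sigma_0 \in A\}} d\nu\; \nu(\rho_{\{0\}}\,|\,\mathcal{F}_{\{-1,0\}}) = \int_{\{\sigma_0 \in A\}} d\nu\; \rho_{\{0\}}.$$
We also remark that (10.33)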
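$$r > 0 \quad \lambda\text{-a.s. on } \{h_N > 0\}$$
for all sufficiently large $N$, where $h_N$ is as in Definition (10.23). In particular, if $\{h_N > 0\} \uparrow E$ (as is the case in the examples in (10.24)) then $\alpha$ and $\lambda$ are mutually absolutely continuous. To prove (10.33) it is sufficient to show that
$$r \ge (2\alpha(C_N) - 1)\, h_N \quad \lambda\text{-a.s.}$$
for all $N \ge 1$. From (10.32) and (10.16) we obtain, writing $n = n(N)$,
$$r(\sigma_0) = \nu\big(\nu(\rho_{\{0\}}\,|\,\mathcal{T}_{]-n,n[\setminus\{0\}})\,\big|\,\mathcal{F}_{\{0\}}\big) \ge \mu(\sigma_{-n}, \sigma_n \in C_N)\, h_N(\sigma_0) \quad \nu\text{-a.s.}$$
The proof of (10.33) is completed by noting that $\mu(\sigma_{-n}, \sigma_n \in C_N) \ge 2\alpha(C_N) - 1$.

(10.34) Theorem. In the situation described above,
$$\lim_{n\to\infty} \int \alpha(dx)\, \lambda(|p^n(x, \cdot) - r|) = 0$$
and
$$\lim_{n\to\infty} \lambda(|p^n(x, \cdot) - r|) = 0 \quad\text{for } \alpha\text{-almost all } x \in E.$$

Proof. First of all we note that
$$\int \alpha(dx)\, \lambda(|p^n(x, \cdot) - r|) = \int \mu(d\omega) \int \lambda(dy)\, |p^n(\omega_{-n}, y) - r(y)| = \nu(|p^n(\sigma_{-n}, \sigma_0) - r(\sigma_0)|)$$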
for all $n \ge 1$. To prove the convergence to zero we put $f_n = p^n(\sigma_{-n}, \sigma_0)$. From (10.30) we conclude that $(f_n)_{n \ge 1}$ is a backward martingale with respect to $\nu$, and its limit in $L^1(\nu)$ is
$$f_\infty = \nu\Big(\rho_{\{0\}}\,\Big|\,\bigcap_{n \ge 1} \mathcal{F}_{]-\infty,-n]\cup\{0\}}\Big).$$
Also, $f_n$ is measurable with respect to $\mathcal{F}_{\{-n,0\}}$. That is, $(f_n)_{n \ge 1}$ satisfies condition (ii) of Proposition (10.26) with $\Delta = \{0\}$. We thus obtain, from that proposition, that $f_\infty$ has an $\mathcal{F}_{\{0\}}$-measurable version. From this and (10.32) we conclude that
$$f_\infty = \nu(\rho_{\{0\}}\,|\,\mathcal{F}_{\{0\}}) = r(\sigma_0) \quad \nu\text{-a.s.}$$
This proves the first part of the theorem. To prove the second part we note that $\lambda(|p^n(x, \cdot) - r|)$ is decreasing in $n$ for all $x$. Indeed, we have
$$r\lambda = \alpha = \alpha P = \Big(\int \lambda(dy)\, r(y)\, p(y, \cdot)\Big)\lambda$$
and therefore
$$r = \int \lambda(dy)\, r(y)\, p(y, \cdot) \quad \lambda\text{-a.s.}$$
Combining this with (10.29) we get
$$\lambda(|p^{n+1}(x, \cdot) - r|) = \int \lambda(dz)\, \Big| \int \lambda(dy)\, p(y, z)\, [p^n(x, y) - r(y)] \Big|$$
$$\le \int \lambda(dy) \int \lambda(dz)\, p(y, z)\, |p^n(x, y) - r(y)| = \lambda(|p^n(x, \cdot) - r|)$$
for all $n \ge 1$ and $x \in E$. Using Fatou's lemma, we finally arrive at the conclusion
$$\int \alpha(dx) \lim_{n\to\infty} \lambda(|p^n(x, \cdot) - r|) \le \lim_{n\to\infty} \int \alpha(dx)\, \lambda(|p^n(x, \cdot) - r|) = 0$$
which proves the second part of the theorem.
•
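The two facts used in this proof, namely that $\lambda(|p^n(x,\cdot) - r|)$ is decreasing in $n$ and tends to zero, are easy to observe on a finite state space with counting measure, where $p^n = P^n$ and $r = \alpha$. The Python sketch below is an illustration under assumptions that are not in the text: the positive 3-state matrix `P` is made up, the invariant vector is approximated by iterating $\alpha \mapsto \alpha P$, and the helper names are hypothetical.

```python
def row_times_matrix(row, p):
    """Return the row vector row * p."""
    size = len(p)
    return [sum(row[k] * p[k][j] for k in range(size)) for j in range(size)]


def stationary_vector(p, iterations=500):
    """Approximate the invariant probability vector by iterating alpha -> alpha P."""
    size = len(p)
    alpha = [1.0 / size] * size
    for _ in range(iterations):
        alpha = row_times_matrix(alpha, p)
    return alpha


def l1_distances(p, alpha, x, n_max):
    """List of sum_y |P^n(x, y) - alpha(y)| for n = 1, ..., n_max."""
    row = list(p[x])
    distances = []
    for _ in range(n_max):
        distances.append(sum(abs(a - b) for a, b in zip(row, alpha)))
        row = row_times_matrix(row, p)
    return distances


# Hypothetical positive stochastic matrix on E = {0, 1, 2}.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

alpha = stationary_vector(P)
for x in range(3):
    d = l1_distances(P, alpha, x, 30)
    print(x, d[0], d[-1])  # the distance shrinks as n grows
```

Here the sum over $y$ plays the role of $\lambda(|p^n(x,\cdot) - r|)$, which equals twice the total variation distance between $P^n(x,\cdot)$ and $\alpha$ under the usual convention.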
We observe that Orey's ergodic theorem for countable state Markov chains is a special case of Theorem (10.34). To see this let $E$ be countable, $\lambda$ be counting measure, and $P$ be an aperiodic irreducible stochastic matrix on $E$. Suppose $P$ is positive recurrent. As is well-known, this means that there is a unique probability vector $\alpha$ on $E$ with $\alpha P = \alpha$. By virtue of (10.5), there is a unique Markov chain $\mu$ for $P$ with one-dimensional marginal distribution $\alpha$. In view of Example (10.24)(2), Theorem (10.34) can be applied to $\mu$ and gives
$$\lim_{n\to\infty} \sum_{y \in E} |P^n(x, y) - \alpha(y)| = 0$$
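for all $x \in E$. This is Orey's ergodic theorem (cf. Blackwell and Friedman (1964)). Note that a special case of this theorem was already stated and proved in (3.A3) and (7.15).

We are now ready to turn to the main result of this section.

(10.35) Theorem. Let $(E, \mathcal{E})$ be a standard Borel space, $\lambda \in \mathcal{M}(E, \mathcal{E})$, $\rho$ an irreducible homogeneous Markovian $\lambda$-modification, and $\gamma = \rho\lambda$. Then either $\mathcal{G}_\Theta(\gamma) = \emptyset$ or $|\mathcal{G}_\Theta(\gamma)| = 1$. In the latter case there exists a probability kernel $P$ from $\mathcal{E}$ to $\mathcal{E}$ with the following properties:
(i) There is a measurable function $p: E \times E \to [0, \infty[$ such that $P(x, \cdot) = p(x, \cdot)\lambda$ for all $x \in E$.
(ii) There is a unique $\alpha \in \mathcal{P}(E, \mathcal{E})$ with $\alpha P = \alpha$ such that $\lim_{n\to\infty} P^n(x, \cdot) = \alpha$ in total variation distance for $\alpha$-almost all $x$. $\alpha$ has a $\lambda$-density $r$ which is positive on $\{h_N > 0\}$ for sufficiently large $N$.
(iii) $\mathcal{G}_\Theta(\gamma) = \{\mu_P\} \subset \operatorname{ex} \mathcal{G}(\gamma)$, where $\mu_P$ is the Markov chain with transition kernel $P$ and one-dimensional marginal distribution $\alpha$.

Proof. We shall prove that $\mathcal{G}_\Theta(\gamma) \subset \operatorname{ex} \mathcal{G}(\gamma)$. This will imply that $|\mathcal{G}_\Theta(\gamma)| \le 1$. For if $\mathcal{G}_\Theta(\gamma)$ contained two distinct elements then any non-trivial mixture of these would belong to $\mathcal{G}_\Theta(\gamma) \setminus \operatorname{ex} \mathcal{G}(\gamma)$. Let $\mu \in \mathcal{G}_\Theta(\gamma)$ be given. According to Theorem (10.25), $\mu$ is a Markov chain for a suitable transition kernel $P$ with $\lambda$-density $p$. To show that $\mu$ is extreme in $\mathcal{G}(\gamma)$ it is sufficient to prove that (10.36)
$$\lim_{m \to -\infty,\ n \to \infty}\ \sup_{B \in \mathcal{T}_{]m,n[}} |\mu(A \cap B) - \mu(A)\mu(B)| = 0$$
for all cylinder events $A \in \mathcal{F}$. Let us take any such $A$. We pick two integers $i < k$ such that $A \in \mathcal{F}_{[i,k]}$, and we choose arbitrary integers $m, n$ with $m < i$ and $n > k$. We shall prove that (10.37)
$$\sup_{B \in \mathcal{T}_{]m,n[}} |\mu(A \cap B) - \mu(A)\mu(B)| \le \int \alpha(dx)\, \lambda\big(|p^{i-m}(x, \cdot) - r| + |p^{n-k}(x, \cdot) - r| + |p^{n-m}(x, \cdot) - r|\big),$$
where $\alpha$, $r$, and $p^\ell$ are defined in terms of $\mu$ and $p$ as before. According to Theorem (10.34), the expression on the right of (10.37) tends to zero in the limit $m \to -\infty$, $n \to \infty$. Thus (10.37) gives (10.36). To prove (10.37) we let $A^* \in \mathcal{E}^{[i,k]}$ be such that $A = \{\sigma_{[i,k]} \in A^*\}$, and we choose any $B \in \mathcal{T}_{]m,n[}$. For $j \in \mathbb{Z}$ and $z \in E$ we define a probability measure $\nu_j^z$ on $(E^{[j,\infty[}, \mathcal{E}^{[j,\infty[})$ by the requirement that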
$$\nu_j^z(\sigma_j \in dx_0, \ldots, \sigma_{j+\ell} \in dx_\ell) = \delta_z(dx_0)\, P(x_0, dx_1) \cdots P(x_{\ell-1}, dx_\ell)$$
for all $\ell \ge 0$. Thus $\nu_j^z$ is a Markov chain for $P$ which starts from $z$ at time $j$. Then we can write
$$\mu(A \cap B) = \int \mu(d\omega) \int \lambda(dx)\, p^{i-m}(\omega_m, x) \int \nu_i^x(d\zeta)\, 1_{A^*}(\zeta_{[i,k]}) \int \lambda(dz)\, p^{n-k}(\zeta_k, z) \int \nu_n^z(d\eta)\, 1_B(\omega_{]-\infty,m]}\eta)$$
and
$$\mu(A)\mu(B) = \int \lambda(dx)\, r(x) \int \nu_i^x(d\zeta)\, 1_{A^*}(\zeta_{[i,k]}) \times \int \mu(d\omega) \int \lambda(dz)\, p^{n-m}(\omega_m, z) \int \nu_n^z(d\eta)\, 1_B(\omega_{]-\infty,m]}\eta).$$
Thus
$$|\mu(A \cap B) - \mu(A)\mu(B)| \le \int \mu(d\omega) \int \lambda(dx) \int \nu_i^x(d\zeta) \int \lambda(dz)\, |p^{i-m}(\omega_m, x)\, p^{n-k}(\zeta_k, z) - r(x)\, p^{n-m}(\omega_m, z)|$$
$$= \int \alpha(dw) \int \lambda(dx) \int P^{k-i}(x, dy) \int \lambda(dz)\, |p^{i-m}(w, x)\, p^{n-k}(y, z) - r(x)\, p^{n-m}(w, z)|.$$
Estimating the last term by means of the triangle inequality
$$|p^{i-m}(w, x)\, p^{n-k}(y, z) - r(x)\, p^{n-m}(w, z)| \le |p^{i-m}(w, x) - r(x)|\, p^{n-k}(y, z) + r(x)\, |p^{n-k}(y, z) - r(z)| + r(x)\, |r(z) - p^{n-m}(w, z)|$$
and using the fact that $\int \lambda(dx)\, r(x)\, P^{k-i}(x, \cdot) = \alpha P^{k-i} = \alpha$ we arrive at (10.37). This completes the proof that $\mu$ is extreme and thus unique. The additional assertions (i) to (iii) of the Theorem follow from Theorem (10.25), (10.32), (10.33), and Theorem (10.34). □

We conclude this chapter with a comment on Theorems (10.25), (10.34), and (10.35). These theorems were stated for a Markovian homogeneous $\lambda$-specification $\gamma$. However, we made no use of the consistency property $\gamma_\Delta \gamma_\Lambda = \gamma_\Delta$, $\Lambda \subset \Delta$, of $\gamma$. Moreover, it is possible to dispense with the assumption of absolute continuity with respect to $\lambda$. (See Papangelou (1983).) Consequently, the conclusions of the above-mentioned theorems hold for any shift-invariant Markov field $\mu$ on $\mathbb{Z}$ with a standard Borel state space $(E, \mathcal{E})$ provided $\mu$ admits suitable versions of the conditional probabilities $\mu(\cdot\,|\,\mathcal{T}_{]-n,n[})$ which satisfy an irreducibility condition similar to (10.23).
Chapter 11 Markov fields on the integers II
In this chapter we will continue the study of Markov fields on $S = \mathbb{Z}$ under more restrictive assumptions. The state space $E$ will be countable, and we shall look only at Markov specifications $\gamma$ which are positive and homogeneous. Such specifications $\gamma$ are always Gibbsian for a suitable shift-invariant nearest-neighbour potential
11.1 Boundary laws, uniqueness, and non-existence
Throughout this chapter we assume that $S = \mathbb{Z}$ and $E$ is a countable state space. The a priori measure $\lambda$ on $E$ is counting measure. We look at Markov specifications $\gamma$ which are homogeneous (i.e., shift-invariant) and positive. Positivity means that $\rho_\Lambda(\omega) = \gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda\,|\,\omega) > 0$ for all $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$. Recall from Remark (1.28)(5) that $\rho = (\rho_\Lambda)_{\Lambda \in \mathcal{S}}$ is the unique $\lambda$-modification with $\gamma = \rho\lambda$. $\rho$ is a Markovian pre-modification. In particular, $\rho$ is quasilocal. Theorem (2.30), Corollary (2.32), and Corollary (5.9) thus show that $\rho = \rho^\Phi$ for some $\lambda$-admissible shift-invariant nearest-neighbour potential $\Phi$ (which may be chosen to be a gas potential for some vacuum state $a \in E$). It will be convenient to pass from $\Phi$ to the positive matrix $Q = (Q(x, y))_{x,y \in E}$ on $E$ which is defined by
$$Q(\omega_{i-1}, \omega_i) = \exp\big[-\Phi_{\{i-1,i\}}(\omega) - \tfrac12 \Phi_{\{i-1\}}(\omega) - \tfrac12 \Phi_{\{i\}}(\omega)\big].$$
The powers of $Q$ are finite, in that (11.1)
$$Q^n(x, y) < \infty \quad\text{for all } x, y \in E \text{ and } n \ge 1.$$
Indeed, for all $\omega \in \Omega$ and $n > 1$ we have
$$Q^n(\omega_0, \omega_n) = Z^\Phi_{]0,n[}(\omega)\, \exp\big[-\tfrac12 \Phi_{\{0\}}(\omega) - \tfrac12 \Phi_{\{n\}}(\omega)\big] < \infty.$$
The specification $\gamma$ can be re-expressed in terms of $Q$ as follows. If $\omega \in \Omega$ and $\Lambda = ]i, k[$ then (11.2)
$$\gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda\,|\,\omega) = \prod_{j=i+1}^{k} Q(\omega_{j-1}, \omega_j) \Big/ Q^{k-i}(\omega_i, \omega_k).$$
If $\Lambda$ is the union of pairwise disjoint and non-adjacent intervals $I_1, \ldots, I_N$ then (11.3)
$$\gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda\,|\,\omega) = \prod_{n=1}^{N} \gamma_{I_n}(\sigma_{I_n} = \omega_{I_n}\,|\,\omega).$$
We agree to write $\gamma = \gamma^Q$ if $\gamma$ is defined by means of $Q$ via equations (11.2) and (11.3). Note that these equations are a special case of the construction in Example (10.3). Consequently, equations (11.2) and (11.3) establish a well-defined mapping $Q \to \gamma^Q$ from the set of all positive matrices on $E$ which satisfy (11.1) to the set of all positive homogeneous Markov specifications. The discussion above has shown that this mapping is onto. However, it is certainly not one-to-one. To see this more clearly we introduce an equivalence relation between positive matrices $P$ and $Q$ with finite powers by writing $P \sim Q$ if and only if $\gamma^P = \gamma^Q$. Of course, this equivalence relation should be viewed as a special case of the equivalence relation between potentials which was defined in (2.33).
(11.4) Remark. Let $P$ and $Q$ be two positive matrices on $E$ which satisfy (11.1). Then $P \sim Q$ if and only if there is a number $q > 0$ and a function $r: E \to\ ]0, \infty[$ such that (11.5)
$$P(x, y) = Q(x, y)\, r(y) / q\, r(x) \quad\text{for all } x, y \in E.$$
=
Q(x,y)Q(y,a)/Q2(x,a)
for all x, y, a e E. We fix a reference element a e E and define q = ß( fl) a)/P(a, a), r(x) = ß 2 (x, a)/P2(x,a)
(x e E).
Then P(x,y) =
Q(x,y)Q(y,a)/P(y,a)r(x)
for all x, y e E, and for y = a we obtain P{x,a)/Q{x,a) = qlr{x). Replacing x by y in the last equation and inserting this into the next to last equation we arrive at (11.5). • If £ is finite, the Perron-Frobenius theorem (3.A1) implies that every positive matrix Q is equivalent to a stochastic matrix P; cf. Comment (3.8) (2). (This observation was a basic tool in the proof of Theorem (3.5).) In the case of an infinite E, an analoguous result was obtained by Vere-Jones (1962). To state it we need to introduce the quantity (11.6)
L(Q) = lim Q"(x, x)1/n e [0, oo]
(x e E).
(The existence of the limit follows readily from the supermodularity relation ß n + m ( x , x ) ^ ß"(x,x)ß m (x,x), m, n^l. The limit does not depend on x because ß n + 2 (x, x) ^ ß(x, y)Q"(y,y)Q{y, x) for all n ^ 1 and x, y e E) We note that L(ß)" 1 is the radius of convergence of the power series £ ß"(x, x)z". nâl
The result of Vere-Jones now asserts that every positive matrix ß with L(ß) < oo is equivalent to a substochastic matrix P with L(P) = 1. As we shall not need this result, we refer the interested reader to Chapter 6 of Seneta (1973). What we shall need is the simple remark below which states that equivalent positive stochastic matrices P and ß with L(P) = L(ß) = 1 exhibit the same recurrence properties. Let us recall that a positive stochastic matrix P either satisfies £ P"(x, y) = n£l
oo for all x, y e E, or £ P"(x, y) < oo for all x, y e E. In the former case, P is »gl
212
Markov fields on the integers II
called recurrent, whilst in the latter case P is said to be transient. P is recurrent if and only if Hp(ty < GO) = 1 for all (or some) x, y e E. Here /i£ e é?(Ez+, é>z*) is the Markov chain with transition matrix P and initial point x, and T = min {n ^ 1: an = y} is the time of the first visit at y. A recurrent P is called positive recurrent if Hp(xy) < GO for some (and thereby for all) x, y e E or, equivalently, if there is a probability vector a on E such that a.P = a. Otherwise, P is said to be null recurrent. (A reader who is not familiar with all this is referred to Breiman (1968), for example.) We note that a recurrent P always satisfies L(P) = 1. (11.7) Remark. Suppose P and Q are positive stochastic matrices with L(P) = L(Q) = 1 and P ~ Q. If P is recurrent then P = Q, and if P is transient then so is Q. In particular, a positive stochastic matrix Q which is null recurrent or transient with L(Q) = 1 can never be equivalent to a positive recurrent positive stochastic matrix. Proof. Suppose P and Q are related to each other by (11.5). Then Pn(x, x) = Qn(x, x)/qn for all x e E and n ^ 1. Hence q = L{Q)/L{P) = 1. Thus £ P"(x, x) = Y, Q"(X> X ) f° r a ll *•> a n d this shows that P is recurrent if and nil
nil
only if Q is. To show that in this case P — Q we need to prove that r is constant. But for all x, y e E and n ^ l w e have from (11.5) r(x) = Q"r(x) = ^ ( r ° *„)
^ J ^(T, = k)Q-kr(y) k= l
= r(y)/ig(^ ^ n). Since Q is recurrent, we conclude that r(x) ^ r(y) for all x, y e £. This completes the proof, D From now on we shall assume that we are given a positive matrix Q for which (11.1) holds, and we shall investigate the set ^(Q) = ^(yQ) of all Markov fields for Q. Our first result provides a characterization of all Markov chains in ^(Q) and, in particular, a description of all extreme elements of <&(Q). In (11.31) and (11.39) we shall see how this description can be used to determine ex <&(Q) in particular cases. The characterization of the Markov chains in <&(Q) involves the notion of a boundary law for Q. (11.8) Definition. Let Q be a positive matrix. A boundary law for Q is any family {;, rt: i e Z} of row vectors *?,- e ]0, co[ £ and column vectors rt e ]0, co[ £ such that *f;Q =
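Suppose $Q$ is stochastic and $\{\ell_i, r_i: i \in \mathbb{Z}\}$ is a boundary law for $Q$. If $r_i = 1$ for all $i \in \mathbb{Z}$, then $\{\ell_i: i \in \mathbb{Z}\}$ is a family of probability vectors on $E$ such that $\ell_i Q = \ell_{i+1}$ for all $i \in \mathbb{Z}$.

(11.9) Theorem. Let $Q$ be a positive matrix on $E$ which satisfies (11.1).
(a) Each boundary law $\{\ell_i, r_i: i \in \mathbb{Z}\}$ for $Q$ defines a unique Markov chain $\mu \in \mathcal{G}(Q)$ via the equation (11.10)
$$\mu(\sigma_i = x_0, \ldots, \sigma_{i+n} = x_n) = \ell_i(x_0)\, Q(x_0, x_1) \cdots Q(x_{n-1}, x_n)\, r_{i+n}(x_n).$$
Here $i \in \mathbb{Z}$, $n \ge 0$, and $x_0, \ldots, x_n \in E$.
(b) Every Markov chain $\mu \in \mathcal{G}(Q)$ admits a representation of the form (11.10) in terms of a boundary law for $Q$.
(c) Every $\mu \in \operatorname{ex} \mathcal{G}(Q)$ has a representing boundary law $\{\ell_i, r_i: i \in \mathbb{Z}\}$. Moreover, for any fixed $a \in E$ there exist two sequences $(x_n)_{n \ge 1}$ and $(y_n)_{n \ge 1}$ in $E$ such that
$$\ell_i(x)/\ell_0(a) = \lim_{n\to\infty} Q^{n+i}(x_n, x)/Q^n(x_n, a)$$
for all $x \in E$ and $i < 0$, and
$$r_i(y)/r_0(a) = \lim_{n\to\infty} Q^{n-i}(y, y_n)/Q^n(a, y_n)$$
for all $y \in E$ and $i > 0$.

Proof. (a) By Kolmogorov's extension theorem, there exists a unique $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ which satisfies (11.10). Comparing (10.5) and (11.10) we see that $\mu$ is a Markov chain with transition matrices (11.11)
$$P_i = \big(Q(x, y)\, r_i(y)/r_{i-1}(x)\big)_{x,y \in E},$$
$i \in \mathbb{Z}$. From Example (10.11) we know that $\mu \in \mathcal{G}(\gamma)$, where $\gamma$ is defined in terms of the functions $(x, y) \to p_i(x, y) = Q(x, y)\, r_i(y)/r_{i-1}(x)$ as in Example (10.3). But a glance at (11.2) shows that $\gamma = \gamma^Q$. Hence $\mu \in \mathcal{G}(Q)$.

(b) Let $\mu \in \mathcal{G}(Q)$ be a Markov chain with transition matrices $(P_i)_{i \in \mathbb{Z}}$. We look for vectors $r_i \in\ ]0, \infty[^E$ which are such that (11.11) holds for all $i \in \mathbb{Z}$. This will give us the required representation of $\mu$. Indeed, since $P_i$ is stochastic, (11.11) implies that $Q r_i = r_{i-1}$ for all $i \in \mathbb{Z}$. Also, putting $\ell_i(x) = \mu(\sigma_i = x)/r_i(x)$ we have $\ell_i r_i = 1$,
$$\ell_i Q(y) = \sum_{x \in E} \mu(\sigma_i = x)\, P_{i+1}(x, y)/r_{i+1}(y) = \ell_{i+1}(y)$$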
for all $y$ and $i$, and
$$\mu(\sigma_i = x_0, \ldots, \sigma_{i+n} = x_n) = \mu(\sigma_i = x_0)\, P_{i+1}(x_0, x_1) \cdots P_{i+n}(x_{n-1}, x_n) = \ell_i(x_0)\, Q(x_0, x_1) \cdots Q(x_{n-1}, x_n)\, r_{i+n}(x_n)$$
for all $i \in \mathbb{Z}$, $n \ge 0$, and $x_0, \ldots, x_n \in E$. To find the vectors $r_i$ we first note that each $P_i$ is positive. This is because
$$0 < \mu\gamma_{\{i-1,i\}}(\sigma_{i-1} = x, \sigma_i = y) = \mu(\sigma_{i-1} = x, \sigma_i = y) = \mu(\sigma_{i-1} = x)\, P_i(x, y)$$
for all $i \in \mathbb{Z}$ and $x, y \in E$. Next we observe that (11.12)
$$Q(x, y)\, Q(y, a)/Q^2(x, a) = \mu(\sigma_i = y\,|\,\sigma_{i-1} = x, \sigma_{i+1} = a) = P_i(x, y)\, P_{i+1}(y, a)/P_i P_{i+1}(x, a)$$
for all $x, y, a \in E$ and $i \in \mathbb{Z}$. We exploit this identity in a similar fashion as in Remark (11.4). We fix an $a \in E$ and define numbers $q_i > 0$ in such a way that $q_i/q_{i-1} = P_{i+1}(a, a)/Q(a, a)$ for all $i \in \mathbb{Z}$. We put
$$r_i(x) = q_{i+1}\, Q^2(x, a)/P_{i+1} P_{i+2}(x, a) \qquad (i \in \mathbb{Z},\ x \in E).$$
Equation (11.12) can then be rewritten as
$$P_i(x, y)\, P_{i+1}(y, a)/Q(x, y)\, Q(y, a) = q_i/r_{i-1}(x).$$
Setting $y = a$ we get
$$P_i(x, a)/Q(x, a) = q_{i-1}/r_{i-1}(x).$$
Replacing $x$ by $y$ and $i$ by $i + 1$ in the above and inserting this into the next to last identity we obtain (11.11).

(c) Fix any $\mu \in \operatorname{ex} \mathcal{G}(Q)$. According to Theorem (10.21), $\mu$ is a Markov chain. Assertion (b) thus shows that $\mu$ satisfies (11.10) for a suitable boundary law $\{\ell_i, r_i: i \in \mathbb{Z}\}$. Let $a \in E$, $i < 0$, and $x \in E$ be given. For $\mu$-almost all $\omega \in \{\sigma_0 = a\}$ we have
$$\ell_i(x)\, Q^{-i}(x, a)/\ell_0(a) = \mu(\sigma_i = x\,|\,\sigma_0 = a) = \lim_{n\to\infty} \mu(\sigma_i = x\,|\,\mathcal{F}_{]-\infty,-n]\cup\{0\}})(\omega)$$
$$= \lim_{n\to\infty} \gamma_{]-n,0[}(\sigma_i = x\,|\,\omega) = \lim_{n\to\infty} Q^{n+i}(\omega_{-n}, x)\, Q^{-i}(x, a)/Q^n(\omega_{-n}, a).$$
The passage to the limit follows from the backward martingale theorem and the $\mu$-triviality of the left tail (cf. the proof of Theorem (10.21)). Since $\mu(\sigma_0 = a) = \mu\gamma_{\{0\}}(\sigma_0 = a) > 0$, there is an $\omega \in \{\sigma_0 = a\}$ such that the preceding identities hold for all $x \in E$ and $i < 0$. Putting $x_n = \omega_{-n}$ we arrive at the statement for the $\ell_i$'s. The statement for the $r_i$'s is obtained similarly by looking at $\mu(\sigma_i = x\,|\,\sigma_0 = a)$ for $i > 0$. □

We now turn to the questions of uniqueness and non-existence of Markov fields for a positive matrix $Q$. As far as shift-invariant Markov fields are concerned, a complete answer to these questions is provided by Theorem (10.35). Let us restate that theorem in the present setting. Recall that a positive recurrent positive stochastic matrix $P$ admits a unique probability vector $\alpha \in\ ]0, \infty[^E$ with $\alpha P = \alpha$. Putting $\ell_i = \alpha$, $r_i = 1$, and $Q = P$ in (11.10) we obtain a Markov chain $\mu \in \mathcal{G}(P)$ which is the unique shift-invariant Markov chain for $P$. This Markov chain will be denoted by $\mu_P$.

(11.13) Theorem. Let $Q$ be a positive matrix on $E$ which satisfies (11.1). Then either $\mathcal{G}_\Theta(Q) = \emptyset$ or $|\mathcal{G}_\Theta(Q)| = 1$. The latter case occurs if and only if $Q$ is equivalent to a positive recurrent stochastic matrix $P$ with positive entries. In this case $P$ is unique and $\mathcal{G}_\Theta(Q) = \{\mu_P\} \subset \operatorname{ex} \mathcal{G}(Q)$.

Proof. If $Q \sim P$ for some positive recurrent positive stochastic matrix $P$ then $\mu_P \in \mathcal{G}_\Theta(P) = \mathcal{G}_\Theta(Q)$, as was shown above. Conversely, suppose there is some $\mu \in \mathcal{G}_\Theta(Q)$. By Example (10.24)(2) and Theorem (10.35), $\mu$ is unique, extreme in $\mathcal{G}(Q)$, and a Markov chain for some transition matrix $P$. $P$ is positive (cf. the proof of Theorem (11.9)(b)) and admits an invariant probability vector. Hence $P$ is positive recurrent and $\mu = \mu_P$. Also, since $\mu \in \mathcal{G}(Q) \cap \mathcal{G}(P)$ we have $Q \sim P$. Indeed, we can either refer to Theorem (2.34), or simply note that
$$\gamma^Q_\Lambda(\sigma_\Lambda = \omega_\Lambda\,|\,\omega) = \mu(\sigma_\Lambda = \omega_\Lambda\,|\,\sigma_{\partial\Lambda} = \omega_{\partial\Lambda}) = \gamma^P_\Lambda(\sigma_\Lambda = \omega_\Lambda\,|\,\omega)$$
for all $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$. ($\partial\Lambda$ was defined above (10.9).)
The uniqueness of $P$ follows from the uniqueness of $\mu$ and also from (11.7). □

As a consequence of Theorem (11.13) we obtain the result that $\mathcal{G}(Q)$ is either empty, or a singleton, or infinite dimensional. Later on we shall, in fact, see that all these possibilities occur.

(11.14) Corollary. Let $Q$ be a positive matrix which satisfies (11.1).
(a) For each $\mu \in \mathcal{G}(Q)$ we have the following alternative: Either $\mu$ is shift-invariant, or its translates $\theta_i(\mu)$ ($i \in \mathbb{Z}$) are pairwise distinct.
(b) If $Q \sim P$ for some positive recurrent stochastic matrix $P$ with positive entries then either $\mathcal{G}(Q) = \{\mu_P\}$ or $|\operatorname{ex} \mathcal{G}(Q)| = \infty$.
(c) If $Q$ is not equivalent to any positive recurrent stochastic matrix with positive entries then either $\mathcal{G}(Q) = \emptyset$ or $|\operatorname{ex} \mathcal{G}(Q)| = \infty$.

Proof. To prove (a) we let $\mu \in \mathcal{G}(Q)$ be given. We assume that $\theta_i(\mu) = \theta_j(\mu)$ for distinct integers $i, j$. We will show that $\mu$ is shift-invariant. First of all we note that $\mu$ is periodic with period $p = |i - j| \ge 1$, in that $\theta_p(\mu) = \mu$. Next we put $\mu' = \theta_1(\mu)$. From Remark (5.10) we know that $\mu' \in \mathcal{G}(Q)$. We need to show that $\mu' = \mu$. This will follow once we have shown that $\mu' = \mu$ on $\mathcal{F}_{p\mathbb{Z}}$. For this gives that
$$\mu = \mu\gamma_{]pm,pn[} = \mu'\gamma_{]pm,pn[} = \mu' \quad\text{on } \mathcal{F}_{[pm,pn]}$$
for all integers $m < n$, and therefore $\mu = \mu'$. We let $\nu$ and $\nu'$ denote the distribution of $(\sigma_{pi})_{i \in \mathbb{Z}}$ under $\mu$ and $\mu'$, respectively. The assertion $\mu' = \mu$ on $\mathcal{F}_{p\mathbb{Z}}$ is equivalent to the statement $\nu = \nu'$. This identity will follow from Theorem (11.13) provided we can show that $\nu, \nu' \in \mathcal{G}_\Theta(Q^p)$. To see this we first note that $\nu$ and $\nu'$ are shift-invariant because $\mu$ and thus $\mu'$ are periodic. To show that $\nu, \nu' \in \mathcal{G}(Q^p)$ we let $j \in \mathbb{Z}$, $x \in E$, $\Lambda = [j - n, j + n] \setminus \{j\}$, and $\zeta \in E^\Lambda$ be given and define $\zeta^p \in E^{p\Lambda}$ by $\zeta^p_{pi} = \zeta_i$ ($i \in \Lambda$). Then
$$\nu\gamma^{Q^p}_{\{j\}}(\sigma_\Lambda = \zeta, \sigma_j = x) = \int_{\{\sigma_\Lambda = \zeta\}} \nu(d\omega)\, Q^p(\omega_{j-1}, x)\, Q^p(x, \omega_{j+1})/Q^{2p}(\omega_{j-1}, \omega_{j+1})$$
$$= \mu(\sigma_{p\Lambda} = \zeta^p, \sigma_{pj} = x) = \nu(\sigma_\Lambda = \zeta, \sigma_j = x).$$
Hence $\nu\gamma^{Q^p}_{\{j\}} = \nu$ for all $j$, and Theorem (1.33) implies that $\nu \in \mathcal{G}(Q^p)$. Similarly, $\nu' \in \mathcal{G}(Q^p)$. This completes the proof of (a). To prove (b) and (c) we suppose that either $Q$ is equivalent to some positive recurrent stochastic matrix $P$ and $\mathcal{G}(Q) \neq \{\mu_P\}$, or $Q$ is not equivalent to such a $P$ and
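(11.15) Theorem. Let $Q$ be a positive matrix on $E$ which satisfies (11.1). If
$$\inf_{x \in E} \sum_{n=1}^{N} Q^n(x, x) > 0$$
for some integer $N \ge 1$ then $\mathcal{G}(Q) = \mathcal{G}_\Theta(Q)$.

Proof. We put $p = N!$. Then
$$\inf_{x \in E} Q^p(x, x) \ge \inf_{x \in E} \max_{1 \le n \le N} Q^n(x, x)^{p/n} \ge \inf_{x \in E} N^{-1} \sum_{n=1}^{N} Q^n(x, x)^{p/n}$$
$$\ge \inf_{x \in E} N^{-1} \sum_{n=1}^{N} (1 \wedge Q^n(x, x))^p \ge \inf_{x \in E} N^{-p} \Big(\sum_{n=1}^{N} 1 \wedge Q^n(x, x)\Big)^p > 0.$$
We shall use this inequality to show that $\theta_p(\mu) = \mu$ for all $\mu \in \operatorname{ex} \mathcal{G}(Q)$. Corollary (11.14) then will imply that $\operatorname{ex} \mathcal{G}(Q) \subset \mathcal{G}_\Theta(Q)$, and this will prove the theorem because of the extreme decomposition theorem (7.26). So we take any $\mu \in \operatorname{ex} \mathcal{G}(Q)$. According to Theorem (11.9)(c), $\mu$ admits a representation of the form (11.10) in terms of a boundary law $\{\ell_i, r_i: i \in \mathbb{Z}\}$. This representation shows that $\theta_p(\mu) = \mu$ provided we can find a constant $c > 0$ such that $\ell_{i-p} = c\,\ell_i$ and $r_{i+p} = c\, r_i$ for all $i \in \mathbb{Z}$. To show the latter we choose any number $\delta$ with $0 < \delta < \inf_{x \in E} Q^p(x, x)$. For all $x \in E$ and $i \in \mathbb{Z}$ we have
$$\ell_i(x) = \ell_{i-p} Q^p(x) \ge \ell_{i-p}(x)\, Q^p(x, x) > \delta\, \ell_{i-p}(x)$$
and
$$r_i(x) = Q^p r_{i+p}(x) \ge Q^p(x, x)\, r_{i+p}(x) > \delta\, r_{i+p}(x).$$
Thus (11.16)
$$\ell_i > \delta\, \ell_{i-p} \quad\text{and}\quad r_i > \delta\, r_{i+p} \quad\text{for all } i \in \mathbb{Z}.$$
We put $c = \ell_i r_{i+p}$. This is independent of $i$ because
$$\ell_i r_{i+p} = (\ell_{i-1} Q)\, r_{i+p} = \ell_{i-1} (Q r_{i+p}) = \ell_{i-1} r_{i+p-1}$$
for all $i \in \mathbb{Z}$. (11.16) shows that
$$0 < c\delta = \ell_i(\delta r_{i+p}) < \ell_i r_i = 1.$$
Now we define $\ell'_i = \ell_{i-p}/c$ and $\ell''_i = (\ell_i - \delta \ell_{i-p})(1 - c\delta)^{-1}$, $i \in \mathbb{Z}$. The systems $\{\ell'_i, r_i: i \in \mathbb{Z}\}$ and $\{\ell''_i, r_i: i \in \mathbb{Z}\}$ are boundary laws for $Q$. Indeed, (11.16) ensures that each $\ell''_i$ is positive, and we have
$$\ell'_i Q = \ell_{i-p} Q/c = \ell'_{i+1}, \qquad \ell'_i r_i = \ell_{i-p} r_i/c = 1,$$
and similarly $\ell''_i Q = \ell''_{i+1}$ and $\ell''_i r_i = 1$ for all $i \in \mathbb{Z}$. Writing $\mu'$ and $\mu''$ for the Markov chains in $\mathcal{G}(Q)$ which are defined by these boundary laws via (11.10), we have $\mu = c\delta\, \mu' + (1 - c\delta)\, \mu''$. As $\mu$ is extreme and $0 < c\delta < 1$, we conclude that $\mu = \mu' = \mu''$. This shows that $\ell'_i = \ell_i$ and therefore $\ell_{i-p} = c\, \ell_i$ for all $i \in \mathbb{Z}$. A similar argument works for the $r_i$'s: The boundary laws $\{\ell_i, r_{i+p}/c: i \in \mathbb{Z}\}$ and $\{\ell_i, (r_i - \delta r_{i+p})(1 - c\delta)^{-1}: i \in \mathbb{Z}\}$ define two measures $\nu', \nu'' \in \mathcal{G}(Q)$ which satisfy $\mu = c\delta\, \nu' + (1 - c\delta)\, \nu''$. The extremality of $\mu$ gives $\mu = \nu'$, whence $r_{i+p} = c\, r_i$ for all $i$. The proof is thus complete. □

Combining Theorems (11.13) and (11.15) we arrive at the following result.

(11.17) Corollary. Let $Q$ be a positive matrix on $E$ which satisfies (11.1) and is such that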
$$\inf_{x \in E} \sum_{n=1}^{N} Q^n(x, x) > 0$$
for some $N \ge 1$. Then $\mathcal{G}(Q) = \{\mu_P\}$ when $Q$ is equivalent to a positive recurrent stochastic matrix $P$ with positive entries, and $\mathcal{G}(Q) = \emptyset$ otherwise.

This corollary provides both a sufficient condition for the existence of a unique Markov field for $Q$ and a sufficient condition for the non-existence of such a Markov field. Some comments on these conditions are in order.

(11.18) Comments. (1) The uniqueness condition of Corollary (11.17) certainly holds whenever $Q$ can be written as $Q = tP + (1 - t)I$ with $0 < t < 1$, $P$ being any positive recurrent stochastic matrix with positive entries, and $I$ being the identity matrix. Indeed, such a $Q$ satisfies $\inf_{x \in E} Q(x, x) \ge 1 - t > 0$ and is positive recurrent because the probability vector $\alpha$ with $\alpha P = \alpha$ also satisfies $\alpha Q = \alpha$. If $E$ is finite, each positive stochastic matrix $Q$ can be written as above. We thus obtain Theorem (3.5) again.

(2) It is interesting to compare the uniqueness condition of Corollary (11.17) with the uniqueness condition which is obtainable from Theorem (8.39) (resp. Comment (8.41)(3)): The identity $|\mathcal{G}(Q)| = 1$ holds whenever there are functions $u, v$ from $E$ to $]0, \infty[$ and a number $C > 1$ such that
$$C^{-1} \le Q(x, y)/u(x)v(y) \le C$$
for all $x, y \in E$. (To check this it is sufficient to note that $\gamma^Q$ coincides with
the Gibbs specification for the bounded nearest-neighbour potential
$$\Phi_A = \begin{cases} -\log[Q(\sigma_i, \sigma_{i+1})/u(\sigma_i)v(\sigma_{i+1})] & \text{if } A = \{i, i+1\}, \\ 0 & \text{otherwise} \end{cases}$$
and the (necessarily finite) a priori measure $(uv)\lambda$.) The condition above implies that
$$\sum_{x \in E} \sum_{n=1}^{N} Q^n(x, x) < \infty \quad\text{for all } N \ge 1$$
because
$$\sum_{x \in E} Q^n(x, x) \le \Big[C \sum_{x \in E} u(x)v(x)\Big]^n$$
for all $n \ge 1$, and
$$C^{-2}\, u(a)v(a) \sum_{x \in E} u(x)v(x) \le Q^2(a, a) < \infty$$
for any $a \in E$. Consequently, if $E$ is infinite then the uniqueness conditions of Theorem (8.39) and Corollary (11.17) exclude each other.

(3) The non-existence condition of Corollary (11.17) is satisfied whenever $Q = tP + (1 - t)I$ for some $0 < t < 1$ and any positive stochastic matrix $P$ which is either null recurrent or transient with $L(P) = 1$. In fact, the equation
t
{nk)tkPk(x,x)(\-tr-k
k=0
is readily seen to imply that L(Q) = tL(P) + 1 — t = 1, and Q is not positive recurrent because otherwise there is a probability vector a with <xQ = a and thereby ccP = a. Remark (11.7) thus implies that Q is not equivalent to a positive recurrent positive stochastic matrix, o The non-existence condition of Corollary (11.17) also holds when Q is the transition matrix of a random walk on ZN, N ^ 1. This will be shown in the final corollary below. Under a growth condition on Q, the same result was already obtained in Example (9.17). Recall that a matrix Q on ZN is said to be homogeneous if Q(x, y) = g(0, y — x) for all x, y € ZN. (11.19) Corollary. Let £ = ZN for some N ^ 1 and Q be a homogeneous positive matrix on E such that
C= Z ß(0,x)
Then <${Q) = 0. Proof. 1) Since inf Q(x, x) = *g(0,0) > 0, the result will follow from Corollary xeE
(11.17) provided we can show that $Q$ is not equivalent to a positive recurrent stochastic matrix. According to Remark (11.7), the latter certainly holds when $Q$ is equivalent to a positive stochastic matrix $P$ with $L(P) = 1$ which is not positive recurrent. It is well-known that a homogeneous $P$ can never be positive recurrent. (For suppose there was a probability vector $\alpha$ with $\alpha P = \alpha$. The ergodic theorem (10.34) then would imply that
$$\alpha(y) = \lim_{n \to \infty} P^n(0,y) = \lim_{n \to \infty} P^n(-y,0) = \alpha(0)$$
for all $y \in E$. This is impossible because a constant vector on the infinite set $E$ cannot be a probability vector.) To prove the corollary it is therefore sufficient to find a homogeneous positive stochastic matrix $P$ with $L(P) = 1$ and $P \sim Q$. 2) For each $s \in \mathbf{R}^N$ we define
$$\varphi(s) = \sum_{x \in E} Q(0,x) \exp(s \cdot x)$$
and, if $\varphi(s) < \infty$,
$$Q_s(x,y) = Q(x,y) \exp[s \cdot (y-x)]/\varphi(s) \qquad (x, y \in E).$$
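(The dot denotes the inner product.) $Q_s$ is stochastic and homogeneous, and Remark (11.4) shows that $Q_s \sim Q$. Therefore the corollary will be proved once we have found an $s_0$ with $\varphi(s_0) < \infty$ such that $L(Q_{s_0}) = 1$. Since $Q_{s_0}^n(0,0) = Q^n(0,0)/\varphi(s_0)^n$ gives $L(Q_{s_0}) = L(Q)/\varphi(s_0)$, and since $L(P) \le 1$ for each stochastic matrix $P$, it is sufficient to show that $L(Q) \ge \varphi(s_0)$.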
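3) In order to find $s_0$ we shall use an argument which is well-known in large deviation theory. We fix an integer $k \ge 1$ and introduce the truncated matrix
$$\tilde Q(x,y) = \begin{cases} Q(x,y) & \text{if } |x-y| \le k,\\ 0 & \text{otherwise,} \end{cases}$$
as well as the associated generating function $\tilde\varphi$. We note that $\tilde\varphi(0) \le C$. Putting $\delta = \min_{|x|=1} Q(0,x)$, we have
$$\tilde\varphi(s) \ge \delta \sum_{|x|=1} e^{s \cdot x} = 2\delta \sum_{i=1}^{N} \cosh s_i \ge \delta |s|^2$$
for all $s = (s_1, \dots, s_N) \in \mathbf{R}^N$. As $\tilde\varphi$ is continuous, $\tilde\varphi$ attains its minimum at some $\tilde s$ with $|\tilde s|^2 \le C/\delta$. Also,
$$0 = \operatorname{grad} \tilde\varphi(\tilde s) = \sum_{x} x\, \tilde Q(0,x) \exp(\tilde s \cdot x).$$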
Consequently, the vector $\tilde Q_{\tilde s}(0,\cdot)$ belongs to the set $K$ of all probability vectors $\alpha$ on $E$ which are such that $\sum_{x \in E} x\,\alpha(x) = 0$ and $\alpha(x) = 0$ when $|x| > k$. The set of all such $\alpha$ with rational coordinates is dense in $K$. For each $n \ge 1$ we can therefore find an $\alpha_n \in K$ such that $n\alpha_n(x) \in \mathbf{Z}$ and $\lim_{n \to \infty} \alpha_n(x) = \tilde Q_{\tilde s}(0,x)$ for all $x \in E$. Looking at the paths of length $n$ in $E$ from 0 to 0 which contain $n\alpha_n(x)$ jumps of size $x$ we obtain the inequality
$$\tilde Q_{\tilde s}^n(0,0) \ \ge\ n! \prod_{x} \frac{\tilde Q_{\tilde s}(0,x)^{n\alpha_n(x)}}{(n\alpha_n(x))!}.$$
Using Stirling's formula, we arrive at the inequality $L(\tilde Q_{\tilde s}) \ge 1$ which is equivalent to the inequality $L(\tilde Q) \ge \tilde\varphi(\tilde s)$. Since $Q^n(0,0) \ge \tilde Q^n(0,0)$ for all $n$, we end up with the inequality $L(Q) \ge \tilde\varphi(\tilde s)$. Finally, we let $k$ tend to infinity. The associated minimal points $\tilde s$ have a cluster point $s_0$. Letting $k$ run through an appropriate subsequence and using Fatou's lemma we obtain the desired result
$$\varphi(s_0) \ \le\ \liminf_{k \to \infty} \tilde\varphi(\tilde s) \ \le\ L(Q).$$
This proves the corollary. □

In the next three sections we shall present three examples of positive stochastic matrices $Q$ which exhibit a phase transition. In each of these examples, the state space will be $E = \mathbf{Z}_+$, the set of all nonnegative integers.
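The exponential change of measure used in the proof of Corollary (11.19) can be tried out numerically. The following Python sketch is an illustration only and not part of the proof. It assumes $N = 1$ and a hypothetical nearest-neighbour matrix with $Q(0,-1) = 0.4$, $Q(0,0) = 0.3$, $Q(0,1) = 0.1$ (values chosen for this example, not taken from the text), and it reads $L(Q)$ as the exponential growth rate of $Q^n(0,0)$, which is an assumption about a definition given earlier in the chapter. In this simple case the minimiser of the generating function is explicit, the tilted row has zero drift, and $Q^n(0,0)^{1/n}$ approaches the minimal value of the generating function from below.

```python
import math

# Hypothetical example: homogeneous positive matrix on Z with jumps in {-1, 0, +1}.
# The weights are illustrative assumptions, not values from the text.
WEIGHTS = {-1: 0.4, 0: 0.3, 1: 0.1}   # Q(0, x) for x = -1, 0, 1


def phi(s):
    """Generating function phi(s) = sum_x Q(0, x) exp(s * x)."""
    return sum(w * math.exp(s * x) for x, w in WEIGHTS.items())


def tilted_weights(s):
    """Row Q_s(0, .) of the tilted matrix Q_s(x, y) = Q(x, y) exp(s (y - x)) / phi(s)."""
    z = phi(s)
    return {x: w * math.exp(s * x) / z for x, w in WEIGHTS.items()}


def return_weight(n):
    """Q^n(0, 0), obtained by propagating the row Q^m(0, .) for m = 1, ..., n."""
    row = {0: 1.0}
    for _ in range(n):
        new_row = {}
        for x, mass in row.items():
            for dx, w in WEIGHTS.items():
                new_row[x + dx] = new_row.get(x + dx, 0.0) + mass * w
        row = new_row
    return row[0]


# For nearest-neighbour jumps the minimiser of phi is explicit.
s0 = 0.5 * math.log(WEIGHTS[-1] / WEIGHTS[1])
drift = sum(x * w for x, w in tilted_weights(s0).items())
n = 200
growth = math.exp(math.log(return_weight(n)) / n)   # Q^n(0,0)^(1/n)

print("s0 =", s0, " phi(s0) =", phi(s0))
print("drift of the tilted row:", drift)
print("Q^n(0,0)^(1/n) for n = 200:", growth)
```

The gap between the printed growth rate and $\varphi(s_0)$ shrinks only slowly in $n$, because the return probability of the tilted walk decays polynomially.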
11.2
The Spitzer-Cox example of phase transition
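We begin by introducing the notations
(11.20)  $\mathfrak{b}(n,p,k) = \binom{n}{k} p^k (1-p)^{n-k} \qquad (n, k \ge 0,\ 0 \le p \le 1)$
and
(11.21)  $\mathfrak{p}(q,k) = e^{-q} q^k / k! \qquad (k \ge 0,\ q > 0)$
for the binomial and Poisson distributions, respectively. Later on, we shall use the elementary formulas
(11.22)  $\mathfrak{b}(m,p,\cdot) * \mathfrak{b}(n,p,\cdot) = \mathfrak{b}(m+n,p,\cdot) \qquad (m, n \ge 0,\ 0 \le p \le 1),$
(11.23)  $\sum_{k \ge 0} \mathfrak{b}(n,p_1,k)\, \mathfrak{b}(k,p_2,\cdot) = \mathfrak{b}(n,p_1p_2,\cdot) \qquad (n \ge 0,\ 0 \le p_1, p_2 \le 1),$
(11.24)  $\mathfrak{p}(q_1,\cdot) * \mathfrak{p}(q_2,\cdot) = \mathfrak{p}(q_1+q_2,\cdot) \qquad (q_1, q_2 > 0),$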
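and
(11.25)  $\sum_{k \ge 0} \mathfrak{p}(q,k)\, \mathfrak{b}(k,p,\cdot) = \mathfrak{p}(pq,\cdot) \qquad (q > 0,\ 0 \le p \le 1).$
Here $\alpha * \alpha'$ denotes the convolution of two probability vectors $\alpha$, $\alpha'$ on $\mathbf{Z}_+$. That is,
$$\alpha * \alpha'(k) = \sum_{\ell=0}^{k} \alpha(\ell)\, \alpha'(k-\ell) \qquad (k \ge 0).$$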
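The four identities (11.22)-(11.25) are easy to check numerically before they are used. The Python sketch below does this for one hypothetical choice of parameters. The helper names, the parameter values, and the cut-off `K_MAX` for the infinite sum in (11.25) are assumptions of this example; a numerical check of this kind does not replace the generating-function argument.

```python
import math


def binom_pmf(n, p, k):
    """Binomial weight from (11.20); zero unless 0 <= k <= n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(q, k):
    """Poisson weight from (11.21); zero for k < 0."""
    if k < 0:
        return 0.0
    return math.exp(-q) * q**k / math.factorial(k)


def convolve(a, b):
    """Convolution of two vectors of equal length, truncated to that length."""
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(len(a))]


def max_gap(a, b):
    """Largest coordinate-wise difference between two vectors."""
    return max(abs(s - t) for s, t in zip(a, b))


SIZE = 40     # number of coordinates compared
K_MAX = 120   # cut-off for the infinite sum in (11.25); an assumption of this sketch
m, n, p, p1, p2, q, q1, q2 = 3, 5, 0.3, 0.7, 0.6, 2.0, 0.7, 1.1
ks = range(SIZE)

# (11.22): convolution of two binomial vectors with the same p
gap_22 = max_gap(
    convolve([binom_pmf(m, p, k) for k in ks], [binom_pmf(n, p, k) for k in ks]),
    [binom_pmf(m + n, p, k) for k in ks],
)
# (11.23): thinning a binomial vector
gap_23 = max_gap(
    [sum(binom_pmf(n, p1, k) * binom_pmf(k, p2, j) for k in range(n + 1)) for j in ks],
    [binom_pmf(n, p1 * p2, j) for j in ks],
)
# (11.24): convolution of two Poisson vectors
gap_24 = max_gap(
    convolve([poisson_pmf(q1, k) for k in ks], [poisson_pmf(q2, k) for k in ks]),
    [poisson_pmf(q1 + q2, k) for k in ks],
)
# (11.25): thinning a Poisson vector
gap_25 = max_gap(
    [sum(poisson_pmf(q, k) * binom_pmf(k, p, j) for k in range(K_MAX)) for j in ks],
    [poisson_pmf(p * q, j) for j in ks],
)
print(gap_22, gap_23, gap_24, gap_25)
```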
The preceding formulas are easily checked using moment generating functions. For example, (11.25) is verified by noting that
$$\sum_{\ell \ge 0} \sum_{k \ge 0} \mathfrak{p}(q,k)\, \mathfrak{b}(k,p,\ell)\, z^{\ell} = \sum_{k \ge 0} \mathfrak{p}(q,k) (1 - p + pz)^k = \exp[-q + q(1 - p + pz)] = \exp[-pq(1-z)]$$
for all $0 \le z \le 1$, and the right-hand side is the generating function of $\mathfrak{p}(pq,\cdot)$. After these preliminaries we can define a matrix $Q$ which will be shown to exhibit uncountably many phases. This example of phase transition was invented by Spitzer (1975b) and solved completely by Cox (1977a). We put $E = \mathbf{Z}_+$, fix a parameter $0 < p < 1$, let $q = 1 - p$, and define a stochastic matrix $Q$ on $E$ by
(11.26)  $Q(x,\cdot) = \mathfrak{b}(x,p,\cdot) * \mathfrak{p}(q,\cdot) \qquad (x \in E).$
More explicitly,
(11.27)  $Q(x,y) = \sum_{k=0}^{x \wedge y} \mathfrak{b}(x,p,k)\, \mathfrak{p}(q,y-k) \qquad (x, y \in E).$
(The choice $q = 1 - p$ is just for convenience. For arbitrary $q > 0$, the ratio $q(1-p)^{-1}$ would appear in many of the expressions below.) We note that $Q$ may be viewed as the transition matrix of a Markov chain which describes the number of individuals in a population that evolves as follows. At each time unit, the individuals die independently of each other with probability $q$, and survive with probability $p$. After that, but at the same time unit, an independent Poisson number of immigrants is added to the population. $Q$ is evidently positive. Moreover, $Q$ is positive recurrent. Indeed, the probability vector $\alpha = \mathfrak{p}(1,\cdot)$ satisfies $\alpha Q = \alpha$. In fact, even more is true: $\alpha$ is a reversible probability measure for $Q$ in the sense that
(11.28)  $\alpha(x) Q(x,y) = \alpha(y) Q(y,x)$ for all $x, y \in E$.
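Before turning to the justification of (11.28), the reversibility can be observed on a truncated state space. The Python sketch below builds $Q$ from (11.27) on $\{0, \dots, M-1\}$ and compares both sides of (11.28). The truncation level `M`, the value of `p`, and the helper names are assumptions of this example; the rows with small index are stochastic only up to a negligible truncated tail.

```python
import math


def binom_pmf(n, p, k):
    """Binomial weight from (11.20); zero unless 0 <= k <= n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(q, k):
    """Poisson weight from (11.21); zero for k < 0."""
    if k < 0:
        return 0.0
    return math.exp(-q) * q**k / math.factorial(k)


def transition(x, y, p):
    """Entry Q(x, y) from (11.27) with q = 1 - p."""
    q = 1.0 - p
    return sum(binom_pmf(x, p, k) * poisson_pmf(q, y - k) for k in range(min(x, y) + 1))


M = 30    # truncation level of the state space (assumption of this sketch)
p = 0.4   # hypothetical survival probability
alpha = [poisson_pmf(1.0, x) for x in range(M)]
Q = [[transition(x, y, p) for y in range(M)] for x in range(M)]

# Detailed balance (11.28), entry by entry.
balance_gap = max(
    abs(alpha[x] * Q[x][y] - alpha[y] * Q[y][x]) for x in range(M) for y in range(M)
)
# Stationarity alpha Q = alpha, checked on the first few coordinates.
stationarity_gap = max(
    abs(sum(alpha[x] * Q[x][y] for x in range(M)) - alpha[y]) for y in range(10)
)
# Rows with small index sum to one up to the truncated tail.
row_sum_gap = max(abs(sum(Q[x]) - 1.0) for x in range(10))

print(balance_gap, stationarity_gap, row_sum_gap)
```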
This follows from (11.27) because for each $0 \le k \le x \wedge y$ we have
$$\mathfrak{p}(1,x)\, \mathfrak{b}(x,p,k)\, \mathfrak{p}(q,y-k) = c_k\, q^{x+y} / (x-k)!\,(y-k)!,$$
where $c_k = e^{-1-q} p^k q^{-2k} / k!$ is independent of $x$ and $y$. Next we introduce a two-parameter family $\{\mu^{u,v} : u, v \ge 0\}$ of probability measures which will turn out to coincide with $\mathrm{ex}\,\mathcal{G}(Q)$. Let $u, v \ge 0$ be given. For each $i \in \mathbf{Z}$ and $x \in E$ we put
(11.29)  $\ell_i^u(x) = \mathfrak{p}(1 + up^i, x), \qquad r_i^v(x) = \mathfrak{p}(1 + vp^{-i}, x)/\alpha(x).$
Then
$$\ell_i^u Q = \sum_{x \in E} \mathfrak{p}(1 + up^i, x)\, \mathfrak{b}(x,p,\cdot) * \mathfrak{p}(q,\cdot) = \mathfrak{p}(p + up^{i+1}, \cdot) * \mathfrak{p}(q,\cdot) = \mathfrak{p}(p + q + up^{i+1}, \cdot) = \ell_{i+1}^u.$$
The second equality comes from (11.25), and the third from (11.24). Combining this and (11.28) we also obtain
$$Q r_i^v(x) = \sum_{y \in E} Q(x,y)\, \ell_{-i}^v(y)/\alpha(y) = \sum_{y \in E} Q(y,x)\, \ell_{-i}^v(y)/\alpha(x) = \ell_{-i}^v Q(x)/\alpha(x) = \ell_{-i+1}^v(x)/\alpha(x) = r_{i-1}^v(x)$$
for all $i \in \mathbf{Z}$ and $x \in E$. Finally, a short computation shows that
$$\ell_i^u r_i^v = \sum_{x \in E} \ell_i^u(x)\, r_i^v(x) = e^{uv}$$
for all $i$. Consequently, the system $\{\ell_i^u, e^{-uv} r_i^v : i \in \mathbf{Z}\}$ is a boundary law for $Q$ which defines a unique Markov chain $\mu^{u,v} \in \mathcal{G}(Q)$ via (11.10). We note that $\theta_i(\mu^{u,v}) = \mu^{up^{-i},\, vp^{i}}$ for all $i \in \mathbf{Z}$ and $u, v \ge 0$. In particular, $\mu^{0,0} = \mu_Q$, the unique shift-invariant Markov chain for $Q$. Moreover, the measures $\mu^{u,v}$ are pairwise distinct. This can be seen from the equation
(11.30)  $\mu^{u,v}(\sigma_i = x) = \ell_i^u(x)\, r_i^v(x)\, e^{-uv} = \mathfrak{p}((1 + up^i)(1 + vp^{-i}), x)$
which holds for all $i \in \mathbf{Z}$, $x \in E$, and $u, v \ge 0$. We are now ready to state the theorem of Cox which gives a complete description of $\mathcal{G}(Q)$. In particular, this theorem shows that $\mathrm{ex}\,\mathcal{G}(Q)$ has the cardinality of the continuum. But $\mu^{0,0}$ is the only element $\mu$ of $\mathcal{G}(Q)$ which is tempered in the sense that $\sup_{i \in \mathbf{Z}} \mu(\sigma_i)\,|i|^{-r} < \infty$ for some $r \ge 1$.
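The boundary-law relations around (11.29) and the marginal formula (11.30) can also be checked numerically. The Python sketch below is an illustration under assumptions: the parameters `p`, `u`, `v`, the truncation level `M`, and the helper names `left` and `right` (standing for the left and right parts of the boundary law in (11.29)) are chosen for this example. It compares both sides of the two invariance relations, of the normalisation $e^{uv}$, and of (11.30) for $i \in \{-1, 0, 1\}$.

```python
import math


def binom_pmf(n, p, k):
    """Binomial weight from (11.20); zero unless 0 <= k <= n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(q, k):
    """Poisson weight from (11.21); zero for k < 0."""
    if k < 0:
        return 0.0
    return math.exp(-q) * q**k / math.factorial(k)


p, u, v = 0.5, 0.8, 0.5   # hypothetical parameters
q = 1.0 - p
M = 50                    # truncation level of the state space (assumption)


def left(i, x):
    """Left part of the boundary law in (11.29)."""
    return poisson_pmf(1 + u * p**i, x)


def right(i, x):
    """Right part of the boundary law in (11.29), before the factor exp(-uv)."""
    return poisson_pmf(1 + v * p ** (-i), x) / poisson_pmf(1.0, x)


# Transition matrix (11.27) on the truncated state space.
Q = [
    [sum(binom_pmf(x, p, k) * poisson_pmf(q, y - k) for k in range(min(x, y) + 1))
     for y in range(M)]
    for x in range(M)
]

gaps = {"left": 0.0, "right": 0.0, "norm": 0.0, "marginal": 0.0}
for i in (-1, 0, 1):
    for y in range(15):
        # left vector times Q should be the left vector with index i + 1
        lq = sum(left(i, x) * Q[x][y] for x in range(M))
        gaps["left"] = max(gaps["left"], abs(lq - left(i + 1, y)))
        # Q times right vector should be the right vector with index i - 1
        qr = sum(Q[y][z] * right(i, z) for z in range(M))
        gaps["right"] = max(gaps["right"], abs(qr - right(i - 1, y)))
        # marginal formula (11.30)
        mean = (1 + u * p**i) * (1 + v * p ** (-i))
        marginal = left(i, y) * right(i, y) * math.exp(-u * v)
        gaps["marginal"] = max(gaps["marginal"], abs(marginal - poisson_pmf(mean, y)))
    # normalisation: the pairing of left and right vectors equals exp(uv)
    total = sum(left(i, x) * right(i, x) for x in range(M))
    gaps["norm"] = max(gaps["norm"], abs(total - math.exp(u * v)))

print(gaps)
```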
(11.31) Theorem. Consider the matrix $Q$ at (11.26). $Q$ satisfies
$$\mathrm{ex}\,\mathcal{G}(Q) = \{\mu^{u,v} : u, v \ge 0\}.$$
Moreover, for each $\mu \in \mathcal{G}(Q)$ the limits
$$U = \lim_{i \to -\infty} \sigma_i p^{-i} \quad \text{and} \quad V = \lim_{i \to \infty} \sigma_i p^{i}$$
exist $\mu$-almost surely, and $\mu$ has the representation
$$\mu = \int_{[0,\infty[^2} \mu^{u,v}\, m(du, dv),$$
where $m$ is the distribution of $(U, V)$ under $\mu$.

Before entering into the proof we observe that
(11.32)  $Q^n(x,\cdot) = \mathfrak{b}(x,p^n,\cdot) * \mathfrak{p}(1 - p^n, \cdot)$
for all $n \ge 1$ and $x \in E$. For $n = 1$, (11.32) is identical with (11.26). Thus suppose that (11.32) holds for some $n \ge 1$. Then (11.32) holds for $n + 1$ because
$$\begin{aligned} Q^{n+1}(x,\cdot) &= \sum_{y \in E} Q^n(x,y)\, Q(y,\cdot)\\ &= \sum_{k,\ell \ge 0} \mathfrak{b}(x,p^n,k)\, \mathfrak{p}(1-p^n,\ell)\, \mathfrak{b}(\ell+k,p,\cdot) * \mathfrak{p}(q,\cdot)\\ &= \sum_{k,\ell \ge 0} \mathfrak{b}(x,p^n,k)\, \mathfrak{p}(1-p^n,\ell)\, \mathfrak{b}(\ell,p,\cdot) * \mathfrak{b}(k,p,\cdot) * \mathfrak{p}(q,\cdot)\\ &= \sum_{k \ge 0} \mathfrak{b}(x,p^n,k)\, \mathfrak{p}(p - p^{n+1},\cdot) * \mathfrak{b}(k,p,\cdot) * \mathfrak{p}(q,\cdot)\\ &= \mathfrak{b}(x,p^{n+1},\cdot) * \mathfrak{p}(1 - p^{n+1},\cdot) \end{aligned}$$
for all $x \in E$. The third equality follows from (11.22), the fourth from (11.25), and the last from (11.23) and (11.24).

Proof of Theorem (11.31). The proof is broken up into four steps. In the first and second step, we shall use (11.32) and Theorem (11.9)(c) to show that $\mathrm{ex}\,\mathcal{G}(Q) \subset \{\mu^{u,v} : u, v \ge 0\}$. In Steps 3 and 4 we shall establish the reverse inclusion and the integral representation in terms of $U$ and $V$.

1) Let $(x_n)_{n \ge 1}$ be any sequence in $E$, and suppose that $c = \lim_{n \to \infty} Q^{n-1}(x_n,0)/Q^n(x_n,0)$ exists and is positive. Then $u = \lim_{n \to \infty} x_n p^n$ exists, and for all $i \in \mathbf{Z}$ and $x \in E$ we have $\lim_{n \to \infty} Q^{n+i}(x_n,x) = \ell_i^u(x)$. To prove this claim we first note that the sequence $(x_n p^n)_{n \ge 1}$ is bounded. This follows from the hypothesis $c > 0$ and the estimate
$$\begin{aligned} Q^{n-1}(x_n,0)/Q^n(x_n,0) &= \mathfrak{b}(x_n,p^{n-1},0)\, \mathfrak{p}(1-p^{n-1},0) / \mathfrak{b}(x_n,p^n,0)\, \mathfrak{p}(1-p^n,0)\\ &\le e\, \mathfrak{b}(x_n,p^{n-1},0)/\mathfrak{b}(x_n,p^n,0)\\ &= e\,(1-p^{n-1})^{x_n}(1-p^n)^{-x_n} = e\,[1 - p^{n-1}q(1-p^n)^{-1}]^{x_n}\\ &\le e \exp[-x_n p^{n-1} q]. \end{aligned}$$
In the above, we have used (11.32) for the first equality. We conclude that the sequence $(x_n p^n)_{n \ge 1}$ has cluster points. Let $u$ be any cluster point. Letting $n$ run through a subsequence with $x_n p^n \to u$, we conclude from the Poisson convergence theorem that $\mathfrak{b}(x_n,p^{n+i},\cdot) \to \mathfrak{p}(up^i,\cdot)$ and therefore
$$Q^{n+i}(x_n,\cdot) = \mathfrak{b}(x_n,p^{n+i},\cdot) * \mathfrak{p}(1 - p^{n+i},\cdot) \ \to\ \mathfrak{p}(up^i,\cdot) * \mathfrak{p}(1,\cdot) = \ell_i^u.$$
Using this for $i = -1$ and $i = 0$ we obtain $c = \ell_{-1}^u(0)/\ell_0^u(0) = \exp(-uq/p)$. Thus $u$ is uniquely determined by $c$. This shows that the sequence $(x_n p^n)_{n \ge 1}$ has a unique cluster point and is therefore convergent.

2) Let $\mu \in \mathrm{ex}\,\mathcal{G}(Q)$ be given. We show that $\mu = \mu^{u,v}$ for suitable $u, v \ge 0$. According to Theorem (11.9)(c), $\mu$ is represented by a boundary law $\{\ell_i, r_i : i \in \mathbf{Z}\}$ which satisfies
$$\ell_i = \ell_0(0) \lim_{n \to \infty} Q^{n+i}(x_n,\cdot)/Q^n(x_n,0) \qquad (i < 0)$$
and
$$r_i = r_0(0) \lim_{n \to \infty} Q^{n-i}(\cdot,y_n)/Q^n(0,y_n) \qquad (i > 0)$$
for suitable sequences $(x_n)_{n \ge 1}$ and $(y_n)_{n \ge 1}$ in $E$. Step 1) shows that $u = \lim_{n \to \infty} x_n p^n$ exists and $\ell_i = c_1 \ell_i^u$ for all $i < 0$, where $c_1 = \ell_0(0)/\ell_0^u(0)$. On the other hand, we conclude from (11.28) that
$$Q^{n-i}(\cdot,y_n)/Q^n(0,y_n) = \alpha(0)\, Q^{n-i}(y_n,\cdot)/\alpha(\cdot)\, Q^n(y_n,0)$$
for all $n$ and $i > 0$. Thus Step 1) again shows that $v = \lim_{n \to \infty} y_n p^n$ exists and $r_i = c_2 r_i^v$ for all $i > 0$, where $c_2 = r_0(0)\alpha(0)/\ell_0^v(0) = r_0(0)/r_0^v(0)$. Since $\ell_i Q = \ell_{i+1}$, $\ell_i^u Q = \ell_{i+1}^u$, $Q r_i = r_{i-1}$, and $Q r_i^v = r_{i-1}^v$ for all $i$, we arrive at the conclusion that
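$\ell_i = c_1 \ell_i^u$ and $r_i = c_2 r_i^v$ for all $i \in \mathbf{Z}$. But $1 = \ell_0 r_0 = c_1 c_2\, \ell_0^u r_0^v = c_1 c_2 e^{uv}$. Equation (11.10) thus shows that the boundary laws $\{\ell_i, r_i : i \in \mathbf{Z}\}$ and $\{\ell_i^u, e^{-uv} r_i^v : i \in \mathbf{Z}\}$ represent the same measure. Hence $\mu = \mu^{u,v}$.

3) For fixed $u, v \ge 0$, the limits $U = \lim_{i \to -\infty} \sigma_i p^{-i}$ and $V = \lim_{i \to \infty} \sigma_i p^{i}$ exist $\mu^{u,v}$-almost surely, and $\mu^{u,v}(U = u, V = v) = 1$. To see this we introduce the shorthand notation $m_i = (1 + up^i)(1 + vp^{-i})$, $i \in \mathbf{Z}$. Clearly, $\lim_{i \to -\infty} m_i p^{-i} = u$ and $\lim_{i \to \infty} m_i p^{i} = v$. From (11.30) we know that $\sigma_i(\mu^{u,v}) = \mathfrak{p}(m_i,\cdot)$ for all $i \in \mathbf{Z}$. In particular, we have $\mu^{u,v}(\sigma_i) = m_i$ and $\mu^{u,v}(|\sigma_i - m_i|^2) = m_i$ for all $i$. Hence
$$\sum_{i<0} \mu^{u,v}(|\sigma_i - m_i|\, p^{-i} \ge \varepsilon) \ \le\ \varepsilon^{-2} \sum_{i<0} p^{-2i} m_i < \infty$$
for all $\varepsilon > 0$, and the Borel-Cantelli lemma gives $\mu^{u,v}(\lim_{i \to -\infty} |\sigma_i p^{-i} - m_i p^{-i}| = 0) = 1$, whence $U$ exists and equals $u$ $\mu^{u,v}$-almost surely. The result for $V$ is obtained similarly.

4) We define a map $\varphi$ from $\mathrm{ex}\,\mathcal{G}(Q)$ to $[0,\infty[^2$ by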
i->oo
(v e ex S(ß)).
/
$\varphi$ is well-defined. This is because Step 2) has shown that each $\nu \in \mathrm{ex}\,\mathcal{G}(Q)$ is equal to some $\mu^{u,v}$, and in Step 3) we have seen that
j" v w(dv) ex»(Q)
which exists because of Theorem (7.26). We put $m = \varphi(w)$. Then
$$\mu = \int_{\mathrm{ex}\,\mathcal{G}(Q)} \mu^{\varphi(\nu)}\, w(d\nu) = \int_{[0,\infty[^2} \mu^{u,v}\, m(du, dv).$$
Together with Step 3) we obtain
$$\mu(U \text{ and } V \text{ exist}) = \int_{[0,\infty[^2} \mu^{u,v}(U \text{ and } V \text{ exist})\, m(du, dv) = 1.$$
Moreover, for each Borel set $B \subset [0,\infty[^2$ we have
$$\mu((U,V) \in B) = \int_{[0,\infty[^2} \mu^{u,v}((U,V) \in B)\, m(du, dv) = \int_{[0,\infty[^2} 1_B(u,v)\, m(du, dv) = m(B).$$
This shows that $m$ is the distribution of $(U, V)$ under $\mu$.
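It remains to show that each $\mu^{u,v}$ is extreme in $\mathcal{G}(Q)$. From Step 3) we know that $\delta_{(u,v)}$ is the distribution of $(U, V)$ under $\mu^{u,v}$. By the above, $\delta_{(u,v)} = \varphi(w^{u,v})$, where $w^{u,v}$ is the weight in the extreme decomposition of $\mu^{u,v}$, and therefore $1 = w^{u,v}(\varphi^{-1}\{(u,v)\}) = w^{u,v}(\{\mu^{u,v}\} \cap \mathrm{ex}\,\mathcal{G}(Q))$. This shows that $\mu^{u,v} \in \mathrm{ex}\,\mathcal{G}(Q)$, and the proof is complete. □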
Theorem (11.31) enables us to characterize the class of all Markov chains in $\mathcal{G}(Q)$. This characterization shows that not every Markov field for $Q$ is a Markov chain, and that not every Markov chain in $\mathcal{G}(Q)$ is extreme in $\mathcal{G}(Q)$. Thus both the converse of Remark (10.9)(3) and the converse of Theorem (10.21) fail in general.

(11.33) Corollary. Let $Q$ be given by (11.26), and let $\mu \in \mathcal{G}(Q)$. The following statements are equivalent.
(a) $\mu$ is a Markov chain.
(b) The random variables $U$ and $V$ are independent with respect to $\tilde\mu = e^{-UV}\mu/\mu(e^{-UV})$.
(c) $\mu$ has a representation of the form
$$\mu = \int m_1(du) \int m_2(dv)\, e^{uv}\, \mu^{u,v}$$
with suitable measures $m_1$, $m_2$ on $[0,\infty[$.

Proof. The equivalence of (b) and (c) is immediate from Theorem (11.31). We thus need only prove the equivalence of (a) and (c).
(a) implies (c). Suppose $\mu$ is a Markov chain, and let $m$ be its representing probability measure on $[0,\infty[^2$. We put $\tilde m(du, dv) = e^{-(1+u)(1+v)}\, m(du, dv)$. Then
$$\begin{aligned} \mu(U \in A, \sigma_0 = 0, V \in B) &= \int m(du, dv)\, \mu^{u,v}(U \in A, \sigma_0 = 0, V \in B)\\ &= \int_{A \times B} m(du, dv)\, e^{-(1+u)(1+v)} = \tilde m(A \times B) \end{aligned}$$
for all Borel sets $A, B \subset [0,\infty[$. Since $U$ is $\mathcal{F}_{]-\infty,0[}$-measurable and $V$ is $\mathcal{F}_{]0,\infty[}$-measurable, the one-sided Markov property (10.7) of $\mu$ implies that
$$\tilde m(A \times B)/\tilde m(A \times [0,\infty[) = \mu(V \in B \mid \sigma_0 = 0, U \in A) = \mu(V \in B \mid \sigma_0 = 0) = \tilde m([0,\infty[ \times B)/\tilde m([0,\infty[^2)$$
for all $A, B$ with $\tilde m(A \times [0,\infty[) > 0$. This shows that $\tilde m = \tilde m_1 \times \tilde m_2$, where $\tilde m_1 = \tilde m(\cdot \times [0,\infty[)$ and $\tilde m_2 = \tilde m([0,\infty[ \times \cdot)/\tilde m([0,\infty[^2)$. Putting $m_1(du) = e^{1+u}\, \tilde m_1(du)$ and $m_2(dv) = e^{v}\, \tilde m_2(dv)$ we arrive at statement (c).
(c) implies (a). Suppose (c) holds. For $i \in \mathbf{Z}$ we define
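$$\ell_i = \int m_1(du)\, \ell_i^u, \qquad r_i = \int m_2(dv)\, r_i^v.$$
Then
$$\begin{aligned} \mu(\sigma_i = x_0, \dots, \sigma_{i+n} = x_n) &= \int m_1(du) \int m_2(dv)\, e^{uv}\, \ell_i^u(x_0)\, Q(x_0,x_1) \cdots Q(x_{n-1},x_n)\, e^{-uv}\, r_{i+n}^v(x_n)\\ &= \ell_i(x_0)\, Q(x_0,x_1) \cdots Q(x_{n-1},x_n)\, r_{i+n}(x_n) \end{aligned}$$
whenever $i \in \mathbf{Z}$, $n \ge 0$, and $x_0, \dots, x_n \in E$. This shows that $\mu$ is a Markov chain; cf. Theorem (11.9)(a). □

11.3
Kalikow's example of phase transition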
As before, we take $E = \mathbf{Z}_+$. We fix two numbers $p, q$ with $0 < q < p < 1$, and we define two further numbers $a, b > 0$ by the two requirements
(11.34)  $a/b = p/q, \qquad a(1-p)^{-1} - b(1-q)^{-1} = 1.$
Thus $a = p(1-p)(1-q)(p-q)^{-1}$ and $b = q(1-p)(1-q)(p-q)^{-1}$. We also put $c = a - b = (1-p)(1-q)$. Next we introduce a (row) vector $\alpha \in\, ]0,\infty[^E$ by
(11.35)  $\alpha(x) = ap^x - bq^x = c\,(p^{x+1} - q^{x+1})(p-q)^{-1} = c \sum_{k=0}^{x} p^k q^{x-k} \qquad (x \in E).$
The equality of the first and second expression on the right comes from the first requirement in (11.34), whilst the second requirement in (11.34) ensures that $\alpha$ is a probability vector on $E$. Looking at the second expression for $\alpha$ we realize that $\alpha$ satisfies the recursion relation
(11.36)  $\alpha(0) = c, \qquad \alpha(x) = p\,\alpha(x-1) + cq^x \qquad (x \ge 1).$
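The relations (11.34)-(11.36) involve only elementary algebra and can be confirmed numerically. The Python sketch below does so for one hypothetical pair $(p, q)$; the parameter values, the cut-off used for the total mass, and the function names are assumptions of this example.

```python
p, q = 0.6, 0.25   # hypothetical parameters with 0 < q < p < 1

a = p * (1 - p) * (1 - q) / (p - q)
b = q * (1 - p) * (1 - q) / (p - q)
c = (1 - p) * (1 - q)


def alpha_first(x):
    """First expression in (11.35)."""
    return a * p**x - b * q**x


def alpha_second(x):
    """Second expression in (11.35)."""
    return c * (p ** (x + 1) - q ** (x + 1)) / (p - q)


def alpha_third(x):
    """Third expression in (11.35), a finite sum."""
    return c * sum(p**k * q ** (x - k) for k in range(x + 1))


# The two requirements (11.34) and the identity c = a - b.
requirement_gaps = (
    abs(a / b - p / q),
    abs(a / (1 - p) - b / (1 - q) - 1.0),
    abs(a - b - c),
)
# Agreement of the three expressions for alpha on the first 50 states.
expression_gap = max(
    max(abs(alpha_first(x) - alpha_second(x)), abs(alpha_first(x) - alpha_third(x)))
    for x in range(50)
)
# Total mass (the tail beyond 400 is far below double precision).
mass_gap = abs(sum(alpha_first(x) for x in range(400)) - 1.0)
# Recursion (11.36).
recursion_gap = max(
    abs(alpha_first(x) - (p * alpha_first(x - 1) + c * q**x)) for x in range(1, 50)
)
print(requirement_gaps, expression_gap, mass_gap, recursion_gap, alpha_first(0) - c)
```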
We define a positive matrix $Q$ on $E$ by
(11.37)  $Q(x,y) = p\,\alpha(x-1)\,\alpha(x)^{-1}\,\delta_{x-1}(y) + cq^x\,\alpha(x)^{-1}\,\alpha(y) = [p\,\delta_{x-1}(y) + cq^x]\,\alpha(y)/\alpha(x).$
Here $x, y \in E$, and $\delta_{x-1}$ is Kronecker's delta. The matrix $Q$ was invented and studied by Kalikow (1977) for a specific choice of $p$ and $q$. We note that $Q(0,\cdot) = \alpha$ because $\delta_{-1}(y) = 0$ for all $y \in E$. Moreover, $Q$ is stochastic. Indeed, (11.36) shows that $Q(x,\cdot)$ is a convex combination of the probability vectors $\delta_{x-1}$ and $\alpha$ for all $x \ge 1$. To get a feeling for the characteristic properties of $Q$ it is useful to look at the case $q = 0$, although this case
was excluded at the beginning because $Q$ then fails to be positive. In the case $q = 0$, $Q$ is given by
$$Q(x,y) = \begin{cases} \delta_{x-1}(y) & \text{if } x \ge 1,\ y \in E,\\ (1-p)\,p^y & \text{if } x = 0,\ y \in E \end{cases}$$
and can be thought of as describing the evolution of the number of inhabitants of a fixed territory: The population loses one individual per time unit until the time of extinction, at which time a geometrically distributed number of immigrants enters the territory. In the case $q > 0$, this process is shortened, in that the inhabitants of the territory may be dislodged by an invading population of size distribution $\alpha$ even before extinction (with a positive probability which depends on the number of inhabitants). It is clear from the preceding interpretation that $Q$ is positive recurrent. In fact, $\alpha$ is the equilibrium distribution of $Q$, in that $\alpha Q = \alpha$. To check this we fix some $y \in E$. Then
$$\alpha Q(y) = \alpha(0)\alpha(y) + p\,\alpha(y) + \sum_{x \ge 1} cq^x \alpha(y) = [c + p + cq(1-q)^{-1}]\,\alpha(y) = [c(1-q)^{-1} + p]\,\alpha(y) = \alpha(y).$$
According to Theorem (11.13), the positive recurrence of $Q$ implies that $\mathcal{G}_\Theta(Q) = \{\mu_Q\} \subset \mathrm{ex}\,\mathcal{G}(Q)$. Kalikow's interesting discovery was that $Q$ admits a non-trivial entrance law $\{\alpha_i : i \in \mathbf{Z}\}$ which reaches equilibrium, in that $\alpha_i = \alpha$ for all $i \ge 1$. Let us introduce this entrance law. We put $s = q/p$. For $i \in \mathbf{Z}$ and $x \in E$ we define
(11.38)
$$\alpha_i(x) = \begin{cases} \alpha(x) & \text{if } i \ge 1,\\ (1 - s^{1-i})\,\delta_{-i}(x) + s^{1-i}\,\alpha(x) & \text{if } i \le 0. \end{cases}$$
Clearly, each $\alpha_i$ is a probability vector on $E$, and the $\alpha_i$'s with $i \le 0$ are pairwise distinct. Let us check that $\alpha_i Q = \alpha_{i+1}$ for all $i \in \mathbf{Z}$. We already know this when $i \ge 1$. So let $i \le 0$. For each $y \in E$ we can write
$$\begin{aligned} \alpha_i Q(y) &= (1 - s^{1-i})\, Q(-i,y) + s^{1-i}\, \alpha Q(y)\\ &= (1 - s^{1-i})\, p\,\alpha(-i-1)\,\alpha(-i)^{-1}\,\delta_{-i-1}(y) + [(1 - s^{1-i})\, cq^{-i}\alpha(-i)^{-1} + s^{1-i}]\,\alpha(y) \end{aligned}$$
because $\alpha Q = \alpha$. A glance at the second expression on the right of (11.35) shows that
$$cq^{-i}\alpha(-i)^{-1} = q^{-i}(p-q)/(p^{1-i} - q^{1-i}) = s^{-i}(1-s)/(1 - s^{1-i}).$$
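Consequently, the expression in square brackets equals $s^{-i}(1-s) + s^{1-i} = s^{-i}$. Thus
$$\alpha_i Q = (1 - s^{1-i})\, p\,\alpha(-i-1)\,\alpha(-i)^{-1}\,\delta_{-i-1} + s^{-i}\alpha.$$
Since $\alpha_i Q$ and $\alpha$ are probability vectors, we conclude that
$$\alpha_i Q = (1 - s^{-i})\,\delta_{-i-1} + s^{-i}\alpha = \alpha_{i+1}.$$
We have thus proved that $\{\alpha_i : i \in \mathbf{Z}\}$ is an entrance law for $Q$. We let $\mu_0$ denote the Markov chain for $Q$ which is defined by this entrance law. That is, $\mu_0$ is given by (11.10) with $\ell_i = \alpha_i$ and $r_i = 1$. Theorem (11.9)(a) asserts that $\mu_0 \in \mathcal{G}(Q)$. The properties of $\{\alpha_i : i \in \mathbf{Z}\}$ imply that $\theta_j(\mu_0) \ne \theta_k(\mu_0)$ for all $j, k \in \mathbf{Z}$ with $j \ne k$. In particular, $\mu_0$ is not shift-invariant. But $\mu_0 = \mu_Q$ on $\mathcal{F}_{]0,\infty[}$. Thus $\mu_0$ is one-sided shift-invariant, in that $\mu_0(\theta_i A) = \mu_0(A)$ whenever $i \in \mathbf{Z}$ and $A$ is such that $A \in \mathcal{F}_{]0,\infty[}$ and $\theta_i A \in \mathcal{F}_{]0,\infty[}$. Moreover, $\lim_{j \to -\infty} \theta_j(\mu_0) = \mu_Q$. Since $\mu_0 \in \mathcal{G}(Q) \setminus \mathcal{G}_\Theta(Q)$, Corollary (11.14)(b) shows that $|\mathrm{ex}\,\mathcal{G}(Q)| = \infty$. However, much more can be said.

(11.39) Theorem. Consider the matrix $Q$ at (11.37), and let $\mu_0$ be as above. Define $\mu_j = \theta_j(\mu_0)$, $j \in \mathbf{Z}$. Then
$$\mathrm{ex}\,\mathcal{G}(Q) = \{\mu_Q\} \cup \{\mu_j : j \in \mathbf{Z}\}.$$
For each $\mu \in \mathcal{G}(Q)$, the limit $Z = \lim_{i \to -\infty} (\sigma_i + i)$ exists in $\{-\infty\} \cup \mathbf{Z}$ $\mu$-almost surely, and the unique extreme decomposition of $\mu$ is given by
$$\mu = \mu(Z = -\infty)\,\mu_Q + \sum_{j \in \mathbf{Z}} \mu(Z = j)\,\mu_j.$$
In particular, each $\mu \in \mathcal{G}(Q)$ is a Markov chain.

Proof. This proof is similar to that of Theorem (11.31). 1) For all $n \ge 1$ and $x, y \in E$ we have
$$Q^n(x,y) = [p^n \delta_{x-n}(y) + q^{(x-n+1) \vee 0}\, \alpha(x \wedge (n-1))]\, \alpha(y)/\alpha(x).$$
By the definition (11.37) of $Q$, this equation holds for $n = 1$. The induction step runs as follows. For any $x, z \in E$ we can write
$$\begin{aligned} Q^{n+1}(x,z) &= \alpha(z)\alpha(x)^{-1} \sum_{y \in E} [p^n \delta_{x-n}(y) + q^{(x-n+1) \vee 0}\, \alpha(x \wedge (n-1))] \times [p\,\delta_{y-1}(z) + cq^y]\\ &= \alpha(z)\alpha(x)^{-1} [p^{n+1} \delta_{x-n-1}(z) + cp^n q^{x-n} 1_{[0,x]}(n) + q^{(x-n+1) \vee 0}\, \alpha(x \wedge (n-1))\,(p + c(1-q)^{-1})]\\ &= \alpha(z)\alpha(x)^{-1} [p^{n+1} \delta_{x-n-1}(z) + q^{(x-n) \vee 0}\, \alpha(x \wedge n)]. \end{aligned}$$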
The last equality comes from the identities $p + c(1-q)^{-1} = 1$ and
$$cp^n q^{x-n} 1_{[0,x]}(n) + q^{(x-n+1) \vee 0}\, \alpha(x \wedge (n-1)) = c \sum_{k=0}^{x \wedge n} p^k q^{x-k} = q^{(x-n) \vee 0}\, \alpha(x \wedge n).$$
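The closed formula for the powers of $Q$ stated in Step 1), and the entrance-law property of the vectors in (11.38), can be cross-checked on a truncated state space. The Python sketch below is an illustration only: the parameters `p`, `q`, the truncation level `M`, the starting points, and all function names are assumptions of this example, and the truncation introduces an error far below the tolerance used.

```python
p, q = 0.6, 0.25   # hypothetical parameters with 0 < q < p < 1
c = (1 - p) * (1 - q)
s = q / p
M = 80             # truncation level of the state space (assumption)

# The vector alpha from (11.35), second expression.
ALPHA = [c * (p ** (x + 1) - q ** (x + 1)) / (p - q) for x in range(M)]


def q_entry(x, y):
    """Entry Q(x, y) from (11.37)."""
    delta = 1.0 if y == x - 1 else 0.0
    return (p * delta + c * q**x) * ALPHA[y] / ALPHA[x]


Q = [[q_entry(x, y) for y in range(M)] for x in range(M)]


def times_q(row):
    """Row vector times the truncated matrix Q."""
    return [sum(row[x] * Q[x][y] for x in range(M)) for y in range(M)]


def n_step_formula(n, x, y):
    """Closed formula for Q^n(x, y) from Step 1) of the proof."""
    delta = 1.0 if y == x - n else 0.0
    return (p**n * delta + q ** max(x - n + 1, 0) * ALPHA[min(x, n - 1)]) * ALPHA[y] / ALPHA[x]


def entrance(i):
    """The vector with index i from (11.38), restricted to the truncated state space."""
    if i >= 1:
        return list(ALPHA)
    w = s ** (1 - i)
    return [(1 - w) * (1.0 if x == -i else 0.0) + w * ALPHA[x] for x in range(M)]


# Compare matrix powers with the closed formula for a few starting points.
power_gap = 0.0
for x in (0, 3, 7):
    row = [1.0 if y == x else 0.0 for y in range(M)]
    for n in range(1, 9):
        row = times_q(row)
        power_gap = max(
            power_gap, max(abs(row[y] - n_step_formula(n, x, y)) for y in range(M))
        )

# Entrance-law property: vector i times Q equals vector i + 1.
entrance_gap = 0.0
for i in range(-4, 2):
    image = times_q(entrance(i))
    target = entrance(i + 1)
    entrance_gap = max(entrance_gap, max(abs(s1 - s2) for s1, s2 in zip(image, target)))

print(power_gap, entrance_gap)
```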
The induction proof is thus complete. 2) Let $\mu \in \mathrm{ex}\,\mathcal{G}(Q)$ be given. Theorem (11.9)(c) asserts that $\mu$ is represented by a boundary law $\{\ell_i, r_i : i \in \mathbf{Z}\}$ via (11.10). Moreover, there are sequences $(x_n)_{n \ge 1}$ and $(y_n)_{n \ge 1}$ in $E$ such that
(11.40)  $\ell_i/\ell_0(0) = \lim_{n \to \infty} Q^{n+i}(x_n,\cdot)/Q^n(x_n,0)$
for $i < 0$, and
(11.41)  $r_i/r_0(0) = \lim_{n \to \infty} Q^{n-i}(\cdot,y_n)/Q^n(0,y_n)$
for $i > 0$. Finally, there is no loss in assuming that $r_0(0) = 1$. Combining Step 1) and (11.41), we obtain
$$r_i(x) = \lim_{n \to \infty} [p^{n-i} \delta_{x-n+i}(y_n) + q^{(x-n+i+1) \vee 0}\, \alpha(x \wedge (n-i-1))]/\alpha(x) = \alpha(x)/\alpha(x) = 1$$
for all $x \in E$ and $i > 0$. Thus $r_i = 1$ for all $i \in \mathbf{Z}$. Next, we identify the $\ell_i$'s. From (11.40) and Step 1) we conclude that
(11.42)  $c\,\ell_i(y)/\ell_0(0)\alpha(y) = \lim_{n \to \infty} t_{n,i}(y)/t_{n,0}(0)$
for all $y \in E$ and $i < 0$. Here
$$t_{n,i}(y) = p^{n+i} \delta_{x_n-n-i}(y) + q^{(x_n-n-i+1) \vee 0}\, \alpha(x_n \wedge (n+i-1)).$$
Suppose now that $\limsup_{n \to \infty} (x_n - n) = \infty$. Then for each $i < 0$ and $y \in E$ there are infinitely many $n$ (namely all $n$ with $x_n - n - i > y$) such that $t_{n,i}(y)/t_{n,0}(0) = q^{-i}\alpha(n+i-1)/\alpha(n-1)$. Since $\alpha(k) \sim ap^k$ as $k \to \infty$, it follows from (11.42) that $c\,\ell_i(y)/\ell_0(0)\alpha(y) = s^{-i}$ for all $i < 0$ and $y \in E$. As $r_i = 1$ and $\ell_i r_i = 1$, a summation over $y$ shows that $c/\ell_0(0) = s^{-i}$ for all $i < 0$. Since this is impossible we conclude that $\limsup_{n \to \infty} (x_n - n) < \infty$.
Suppose next that $\liminf_{n \to \infty} (x_n - n) = -\infty$. Then for each $i < 0$ there are infinitely many $n$ such that $t_{n,i} = \alpha(x_n)$. Inserting this into (11.42) we obtain $c\,\ell_i = \ell_0(0)\,\alpha$ for all $i < 0$. Since $\ell_i$ and $\alpha$ are probability vectors, we arrive at the conclusion that $\ell_i = \alpha$ for all $i < 0$. This implies that $\mu = \mu_Q$.
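Finally, suppose that the sequence $(x_n - n)_{n \ge 1}$ has a cluster point $j \in \mathbf{Z}$. Then $x_n - n = j$ for infinitely many $n$. For these $n$ and all $i < j \wedge 0$ and $y \in E$ we have $t_{n,i}(y) = p^{n+i} \delta_{j-i}(y) + q^{j-i+1} \alpha(n+i-1)$. Arguing as in the derivation of (11.42), we conclude that
$$c\,\ell_i(y)/\alpha(y)\ell_i(0) = \lim_{n \to \infty} t_{n,i}(y)/t_{n,i}(0) = \lim_{n \to \infty} [q^{i-j-1} p^{n+i} \alpha(n+i-1)^{-1} \delta_{j-i}(y) + 1] = q^{i-j-1} p\,a^{-1} \delta_{j-i}(y) + 1$$
for all $i < j \wedge 0$ and $y \in E$. Multiplying this equation by $\alpha(y)$ and summing over $y$ we obtain
$$c/\ell_i(0) = q^{i-j} b^{-1} \alpha(j-i) + 1 = q^{i-j} b^{-1} a p^{j-i} = s^{i-j-1}.$$
Hence $\ell_i = a^{-1} p^{i-j} \alpha(j-i)\, \delta_{j-i} + s^{1-i+j} \alpha$ for all $i < j \wedge 0$. Since $\ell_i$, $\delta_{j-i}$, and $\alpha$ are probability vectors, we finally arrive at the result that
$$\ell_i = (1 - s^{1-i+j})\, \delta_{j-i} + s^{1-i+j} \alpha = \alpha_{i-j}$$
for all $i < j \wedge 0$ and therefore all $i \in \mathbf{Z}$. This shows that $\mu = \mu_j$. To summarize the results at this stage, we have proved that each $\mu \in \mathrm{ex}\,\mathcal{G}(Q)$ belongs to the set $\{\mu_Q\} \cup \{\mu_j : j \in \mathbf{Z}\}$.
3) We have $\mu_Q(\lim_{i \to -\infty} (\sigma_i + i) = -\infty) = 1$. This follows from the Borel-Cantelli lemma because
$$\sum_{i<0} \mu_Q(\sigma_i + i \ge k) = \sum_{i<0} \sum_{x \ge k-i} \alpha(x) \ \le\ a(1-p)^{-1} \sum_{i<0} p^{k-i} < \infty$$
for all $k \in \mathbf{Z}$. Similarly, $\mu_j(\lim_{i \to -\infty} (\sigma_i + i) = j) = 1$ for all $j \in \mathbf{Z}$ because
$$\sum_{i < j \wedge 0} \mu_j(\sigma_i + i \ne j) \ \le\ \sum_{i < j \wedge 0} s^{1-i+j} < \infty.$$
Together with Step 2), this shows that the limit $Z = \lim_{i \to -\infty} (\sigma_i + i)$ exists in $\{-\infty\} \cup \mathbf{Z}$ $\mu$-almost surely for all $\mu \in \mathrm{ex}\,\mathcal{G}(Q)$. In view of the extreme decomposition theorem (7.26), we further conclude that $Z$ exists $\mu$-almost surely for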
all $\mu \in \mathcal{G}(Q)$, and the extreme decomposition of any $\mu \in \mathcal{G}(Q)$ is directed by the distribution of $Z$ under $\mu$. This in turn implies that each $\mu_j$ is extreme in $\mathcal{G}(Q)$; cf. Step 4) in the proof of Theorem (11.31). Alternatively, we can argue as follows. Since $\mathcal{G}(Q) \setminus \{\mu_Q\} \ne \emptyset$, there is some $\mu \in \mathrm{ex}\,\mathcal{G}(Q) \setminus \{\mu_Q\}$. By Step 2), $\mu = \mu_j$ for some $j \in \mathbf{Z}$. Thus $\mu_j \in \mathrm{ex}\,\mathcal{G}(Q)$ for some $j \in \mathbf{Z}$. But the $\mu_j$'s are translates of each other. Remark (7.2) therefore shows that $\mu_j \in \mathrm{ex}\,\mathcal{G}(Q)$ for all $j \in \mathbf{Z}$. To prove the final sentence of the theorem, we note that each $\mu \in \mathcal{G}(Q)$ admits a representation of the form (11.10) with
$$\ell_i = \mu(Z = -\infty)\,\alpha + \sum_{j \in \mathbf{Z}} \mu(Z = j)\,\alpha_{i-j}$$
and $r_i = 1$ ($i \in \mathbf{Z}$). Hence $\mu$ is a Markov chain. This completes the proof. □
11.4
Spitzer's example of totally broken shift-invariance
In the preceding examples of phase transition, $Q$ was a positive recurrent matrix. Therefore it is natural to ask if there is any matrix $Q$ which exhibits a phase transition but is not equivalent to a positive recurrent stochastic matrix. In view of Theorem (11.13), such a $Q$ can never admit a shift-invariant Markov field. Since $\gamma_Q$ is shift-invariant, one might wonder if such a $Q$ can admit any Markov field at all. In fact, Corollary (11.14) asserts that such a Markov field, if it exists, has pairwise distinct translates. Is this case possible? F. Spitzer (1975b) has answered this question in the affirmative by providing an example which we are now going to discuss. It is clear from the above that in contrast to the preceding examples the main point of this example is the existence problem. As in the examples above, we take $E = \mathbf{Z}_+$. We consider a stochastic matrix $P$ of the form
(11.43)  $P(x,y) = \begin{cases} \mathfrak{b}(x,p,y) & \text{if } x \ge 1,\ y \in E,\\ \alpha(y) & \text{if } x = 0,\ y \in E. \end{cases}$
Here $0 < p < 1$, $\mathfrak{b}(x,p,\cdot)$ is given by (11.20), and $\alpha$ is a strictly positive probability vector on $E$. Since $P$ is not positive, we shall consider $Q = P^2$. $Q$ is positive because $Q(x,y) \ge \mathfrak{b}(x,p,0)\,\alpha(y) > 0$ for all $x, y \in E$. In particular, $P$ is irreducible. As in the preceding examples, it is convenient to think of $P$ in terms of population dynamics. $P$ describes the evolution of the number of inhabitants of a territory. At each time unit, the inhabitants survive independently of each other with probability $p$ and die with probability $1 - p$. At the time of extinction, a new population of size distribution $\alpha$ immigrates into the territory.
In the following we shall also need to look at the process with the same survival mechanism but without immigration. This process is described by the stochastic matrix
(11.44)  $\tilde P(x,y) = \mathfrak{b}(x,p,y) \qquad (x, y \in E).$
According to equation (11.23), the powers of $\tilde P$ are given by
(11.45)  $\tilde P^n(x,y) = \mathfrak{b}(x,p^n,y) \qquad (n \ge 1,\ x, y \in E).$
It is evident from the intuitive description of $P$ that $P$ is recurrent. A formal proof runs as follows. For each $x \in E$, we let $\mu_P^x \in \mathcal{P}(E^{\mathbf{Z}_+}, \mathcal{E}^{\mathbf{Z}_+})$ denote the Markov chain with transition matrix $P$ and starting point $x$. We look at the extinction time $\tau = \min\{n \ge 1 : \sigma_n = 0\}$. We obtain
$$\begin{aligned} \mu_P^0(\tau < \infty) &= \alpha(0) + \sum_{x \ge 1} \alpha(x)\, \mu_P^x(\tau < \infty)\\ &= \alpha(0) + \sum_{x \ge 1} \alpha(x) \lim_{n \to \infty} \mu_P^x(\tau \le n)\\ &= \alpha(0) + \sum_{x \ge 1} \alpha(x) \lim_{n \to \infty} \mu_{\tilde P}^x(\sigma_n = 0)\\ &= \alpha(0) + \sum_{x \ge 1} \alpha(x) \lim_{n \to \infty} (1 - p^n)^x = 1. \end{aligned}$$
The next to last equality is a consequence of (11.45). Since $P$ is irreducible, the equation $\mu_P^0(\tau < \infty) = 1$ implies that $P$ is recurrent. Next we will show that $\alpha$ can be chosen in such a way that $P$ is null recurrent. Indeed, for each $x \ge 1$ we have
$$\mu_P^x(\tau) = \sum_{n \ge 0} \mu_P^x(\tau > n) = \sum_{n \ge 0} (1 - (1 - p^n)^x)$$
and therefore, by Fatou's lemma, $\liminf_{x \to \infty} \mu_P^x(\tau) = \infty$. Consequently, we can find an increasing sequence $(x(k))_{k \ge 1}$ in $E$ such that $\mu_P^{x(k)}(\tau) \ge 2^k$ for all $k$. Thus, if $\alpha$ is any positive probability vector with $\alpha(x(k)) \ge c\,2^{-k}$ for some $c > 0$ and all $k$ then
$$\mu_P^0(\tau) = 1 + \sum_{x \ge 1} \alpha(x)\, \mu_P^x(\tau) \ \ge\ \sum_{k \ge 1} \alpha(x(k))\, \mu_P^{x(k)}(\tau) = \infty,$$
and this means that $P$ is null recurrent. Finally, we observe that $Q$ is null recurrent whenever $P$ is null recurrent. For on the one hand we have $Q^n(0,0) = P^{2n}(0,0) \ge \alpha(0)\, P^{2n-1}(0,0)$ for all $n \ge 1$ and therefore
$$2 \sum_{n \ge 1} Q^n(0,0) \ \ge\ \alpha(0) \sum_{n \ge 1} P^n(0,0) = \infty.$$
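Thus $Q$ is recurrent. On the other hand, $Q$ is null recurrent because
$$\infty = \mu_P^0(\tau) = \sum_{n \ge 0} \mu_P^0(\tau > n) \ \le\ 2 \sum_{k \ge 0} \mu_P^0(\tau > 2k) \ \le\ 2 \sum_{k \ge 0} \mu_Q^0(\tau > k) = 2\,\mu_Q^0(\tau).$$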
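The null-recurrence argument above rests on the fact that the mean extinction time $\sum_{n \ge 0} (1 - (1 - p^n)^x)$ of a population of initial size $x$ is unbounded in $x$. The Python sketch below evaluates this series numerically. The function name, the stopping tolerance, the value $p = 1/2$, and the particular sequence $x(k) = 2^{2^k}$ are assumptions of this example; the text only asserts that some increasing sequence with mean extinction time at least $2^k$ exists.

```python
import math


def mean_extinction_time(x, p, tol=1e-15):
    """Evaluate sum_{n >= 0} (1 - (1 - p^n)^x) for an initial population x >= 1.

    The n = 0 term equals 1. The later terms are computed with log1p and expm1
    to avoid cancellation, and the series is cut off once a term drops below tol.
    """
    total = 1.0
    n = 1
    while True:
        term = -math.expm1(x * math.log1p(-(p**n)))
        if term < tol:
            return total
        total += term
        n += 1


p = 0.5
# Hypothetical choice of the sequence x(k): doubly exponential growth.
ks = range(1, 6)
sizes = [2 ** (2**k) for k in ks]
times = [mean_extinction_time(x, p) for x in sizes]

for k, x, t in zip(ks, sizes, times):
    print("k =", k, " x(k) =", x, " mean extinction time =", t)
```

With these values the mean extinction time grows roughly like the logarithm of the initial size to base $1/p$, which is why the sequence $x(k)$ has to grow so fast.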
We are now ready to state Spitzer's result.

(11.46) Theorem. Let $P$ be given by (11.43), and suppose $\alpha$ is chosen in such a way that $P$ is null recurrent. Define $Q = P^2$. Then $|\mathrm{ex}\,\mathcal{G}(Q)| = \infty$, but $\theta_j(\mu) \ne \mu$ for all $j \in \mathbf{Z} \setminus \{0\}$ and $\mu \in \mathcal{G}(Q)$.

Proof. Since $Q$ is null recurrent, Remark (11.7) ensures that $Q$ is not equivalent to a positive recurrent stochastic matrix. Thus $\mathcal{G}_\Theta(Q) = \emptyset$. Consequently, the theorem will follow from Corollary (11.14) provided we can show that $\mathcal{G}(Q) \ne \emptyset$. The proof of this is broken up into four steps.
1) The powers of $P$ satisfy the equation
(11.47)  $P^n(x,y) = \mathfrak{b}(x,p^n,y) + \sum_{k=1}^{n-1} \mu_P^x(\tau = n-k)\, [P^k(0,y) - \delta_0(y)].$
Here $n \ge 1$, $x \ge 1$, $y \ge 0$, and $\delta_0$ is Kronecker's delta. For we can write
$$P^n(x,y) = \mu_P^x(\sigma_n = y, \tau > n) + \mu_P^x(\sigma_n = y, \tau \le n).$$
The first term on the right is equal to
$$\mu_{\tilde P}^x(\sigma_n = y, \tau > n) = (1 - \delta_0(y))\, \mu_{\tilde P}^x(\sigma_n = y) = \tilde P^n(x,y) - \delta_0(y)\, \mu_{\tilde P}^x(\sigma_n = 0) = \mathfrak{b}(x,p^n,y) - \delta_0(y)\, \mu_P^x(\tau \le n).$$
Because of the Markov property of $\mu_P^x$, the second term can be written as
$$\sum_{m=1}^{n} \mu_P^x(\tau = m)\, \mu_P^0(\sigma_{n-m} = y) = \sum_{k=0}^{n-1} \mu_P^x(\tau = n-k)\, P^k(0,y).$$
Combining all this and noting that $P^0(0,y) = \delta_0(y)$ we arrive at (11.47).
2) Let $(x_n)_{n \ge 1}$ be any sequence in $E$ such that $x_n p^n \ge 1$ and $x_n p^n \to 1$ as $n \to \infty$. (For example, $x_n$ might be the smallest integer exceeding $p^{-n}$.) Then the limit
$$\alpha_i(y) = \lim_{n \to \infty} P^n(x_{n-i},y)$$
exists for all $y \in E$ and $i \in \mathbf{Z}$, and $\alpha_i$ is a probability vector on $E$. To prove this we put $x = x_{n-i}$ in (11.47). Since $\lim_{n \to \infty} x_{n-i}\, p^n = p^i$, the Poisson convergence theorem implies that
$$\lim_{n \to \infty} \mathfrak{b}(x_{n-i},p^n,y) = \mathfrak{p}(p^i,y),$$
where $\mathfrak{p}(p^i,\cdot)$ is given by (11.21). As for the second term on the right of (11.47), we note that
$$\begin{aligned} \mu_P^{x_{n-i}}(\tau = n-k) &= \mu_P^{x_{n-i}}(\tau \le n-k) - \mu_P^{x_{n-i}}(\tau \le n-k-1)\\ &= \tilde P^{n-k}(x_{n-i},0) - \tilde P^{n-k-1}(x_{n-i},0)\\ &= (1 - p^{n-k})^{x_{n-i}} - (1 - p^{n-k-1})^{x_{n-i}}. \end{aligned}$$
Hence $\lim_{n \to \infty} \mu_P^{x_{n-i}}(\tau = n-k) = \exp(-p^{i-k}) - \exp(-p^{i-k-1})$ for all $i \in \mathbf{Z}$ and $k \ge 1$. Moreover,
$$\mu_P^{x_{n-i}}(\tau = n-k) \le (1 - p^{n-k})^{x_{n-i}} \le \exp(-p^{i-k}) \quad \text{and} \quad \sum_{k \ge 1} \exp(-p^{i-k}) < \infty.$$
The dominated convergence theorem thus gives
$$\lim_{n \to \infty} \sum_{k=1}^{n-1} \mu_P^{x_{n-i}}(\tau = n-k)\, [P^k(0,y) - \delta_0(y)] = \sum_{k \ge 1} [\exp(-p^{i-k}) - \exp(-p^{i-k-1})]\, [P^k(0,y) - \delta_0(y)]$$
for all $i \in \mathbf{Z}$ and $y \in E$. This proves the existence of $\alpha_i$. The sum over $y$ of the last expression equals zero, and $\mathfrak{p}(p^i,\cdot)$ is a probability vector. Hence $\alpha_i$ is a probability vector.
3) $\{\alpha_i : i \in \mathbf{Z}\}$ is an entrance law for $P$. Indeed, from Fatou's lemma we conclude that
$$\alpha_i P(y) = \sum_{x \in E} \lim_{n \to \infty} P^n(x_{n-i},x)\, P(x,y) \ \le\ \liminf_{n \to \infty} P^{n+1}(x_{n-i},y) = \alpha_{i+1}(y)$$
for all $i \in \mathbf{Z}$ and $y \in E$. The strict inequality is impossible because $\alpha_i P$ and $\alpha_{i+1}$ are probability vectors. Hence $\alpha_i P = \alpha_{i+1}$ for all $i$. In particular, $\alpha_{i-2} Q = \alpha_i$ for all $i$, and this implies that the $\alpha_i$'s are strictly positive.
4) To complete the proof we put $\ell_i = \alpha_{2i}$ and $r_i \equiv 1$, $i \in \mathbf{Z}$. According to Step 3), $\{\ell_i, r_i : i \in \mathbf{Z}\}$ is a boundary law for $Q$. Theorem (11.9)(a) thus shows that $\mathcal{G}(Q) \ne \emptyset$. □

In the proof above, we have constructed an element $\mu^1 \in \mathcal{G}(Q)$ by means of a sequence $(x_n)_{n \ge 1}$ in $E$ with $x_n p^n \to 1$. The same reasoning applies to any sequence $(x_n)_{n \ge 1}$ with $x_n p^n \to c$ for some $c > 0$. This gives us a Markov chain
$\mu^c \in \mathcal{G}(Q)$. The construction shows that $\theta_j(\mu^c) = \mu^{cp^{-2j}}$ for all $j \in \mathbf{Z}$ and $c > 0$. Moreover, using the fact that
$$\sum_{i<0} \sum_{x \in E} |\mu^c(\sigma_i = x) - \mathfrak{p}(cp^{2i},x)| < \infty$$
and applying a Borel-Cantelli argument similar to that in Step 3) of the proof of Theorem (11.31), we see that
$$\mu^c\Big(\lim_{i \to -\infty} \sigma_i\, p^{-2i} = c\Big) = 1$$
for all $c > 0$. In view of the extreme decomposition theorem, this implies that $\mathrm{ex}\,\mathcal{G}(Q)$ is uncountable. It would be interesting to know if the Markov chains $\mu^c$ are extreme in $\mathcal{G}(Q)$, and if in this case there are any further extreme elements of $\mathcal{G}(Q)$. We suggest this problem to the reader. Finally, we note that the construction of $\mu^c$ does not rely on the hypothesis that $Q$ is null recurrent. Consequently, choosing $\alpha$ in such a way that $Q$ is positive recurrent we obtain a further example of phase transition which exhibits uncountably many phases but also admits a shift-invariant phase.
Chapter 12 Markov fields on trees
In the preceding two chapters we have seen that Markov chains are very useful in the study of Markov fields on the simple graph $\mathbf{Z}$. We are thus led to ask if the set $\mathbf{Z}$ of integers can be replaced by a parameter set $S$ with a richer graph structure which is still simple enough to admit the notion of a Markov chain. Since it is impossible to define Markov chains on graphs containing loops, we must require that $S$ be a tree. A particular tree of special interest is $\mathcal{CT}(d)$, the Cayley tree of degree $d \ge 1$. $\mathcal{CT}(d)$ is the unique connected tree with $d + 1$ edges emanating from each vertex. Note that $\mathcal{CT}(1) = \mathbf{Z}$. It will turn out that the Cayley trees of degree $d > 1$ are large enough to admit a phase transition for Markov fields even when the state space $E$ is finite. For this reason we shall be content with looking at finite state spaces only. In Section 12.1 we shall first introduce a suitable notion of a Markov chain on a tree. Then we shall extend two basic results of Chapters 10 and 11 from the case $S = \mathbf{Z}$ to the case when $S$ is a tree. In analogy with Theorem (10.21) we shall show that a Gibbs measure $\mu$ for a Markov specification $\gamma$ is a Markov chain whenever $\mu$ is extreme in $\mathcal{G}(\gamma)$, and as a counterpart to Theorem (11.9) we shall obtain a characterization of the Markov chains in $\mathcal{G}(\gamma)$ in terms of a suitable concept of a boundary law. In Section 12.2 we shall apply these results to the Ising model on a Cayley tree $\mathcal{CT}(d)$. We shall again obtain the result of Section 3.2 that no phase transition occurs when $d = 1$. In the case $d > 1$, however, the Ising model exhibits a phase transition for suitable values of the coupling constant and the external field. These critical values can be specified explicitly. In addition, we shall identify some special elements of the associated simplices of Gibbs measures.
12.1
Markov chains and boundary laws
Let $E$ be a finite state space and $S$ the vertex set of a locally finite connected tree. By definition, this means that there is a distinguished set $B \subset \{b \subset S: |b| = 2\}$ of "bonds" or "edges" $b = \{i,j\}$ between "adjacent" sites $i, j \in S$ which exhibits the three properties below. (i) Local finiteness. For each $i \in S$, the set
$$\partial i = \{j \in S: \{i,j\} \in B\}$$
of all neighbours of $i$ is finite. Of course, this implies that
$$\partial\Lambda = \bigcup_{i\in\Lambda} \partial i \setminus \Lambda$$
is finite for all $\Lambda \in \mathcal{S}$. (ii) Connectedness. For any two sites $i, j \in S$ there is a sequence $i = i_0, i_1, \ldots, i_n = j$ in $S$ such that $\{i_{k-1}, i_k\} \in B$ for all $1 \le k \le n$. Such a sequence is called a path from $i$ to $j$. (iii) Tree property. For all $i, j \in S$, there is only one path from $i$ to $j$. Consequently, we can introduce a metric $d$ on $S$ by letting $d(i,j)$ be the length $n$ of the unique path from $i$ to $j$. In this chapter we shall be concerned with specifications on $S$ which are Markovian in the following sense. (12.1) Definition. Let $\gamma$ be a specification for $E$ and $S$. $\gamma$ is said to be a Markov specification if $\gamma_\Lambda(\sigma_\Lambda = \zeta \,|\, \cdot)$ is $\mathcal{F}_{\partial\Lambda}$-measurable for all $\zeta \in E^\Lambda$ and $\Lambda \in \mathcal{S}$. Clearly, each Gibbs specification for a nearest-neighbour potential is Markovian. Also, if $\gamma$ is Markovian then each $\mu \in \mathcal{G}(\gamma)$ is a Markov field, in that $\mu$ satisfies the local Markov property
$$\mu(\sigma_\Lambda = \zeta \,|\, \mathcal{T}_\Lambda) = \mu(\sigma_\Lambda = \zeta \,|\, \mathcal{F}_{\partial\Lambda}) \quad \mu\text{-a.s.} \qquad (\zeta \in E^\Lambda,\ \Lambda \in \mathcal{S}).$$
Besides this Markov property, we can also introduce a stronger Markov property which corresponds to the one-sided Markov property of a Markov chain. In order to do so we need some further notation. For each bond $\{i,j\} \in B$ we let $ij$ denote the associated oriented bond which points from $i$ to $j$. The symbol $\vec{B}$ will stand for the set of all oriented bonds. Each site $k \in S$ induces a splitting of $\vec{B}$ into the sets
$$\vec{B}_k = \{ij \in \vec{B}: d(k,i) = d(k,j) + 1\} \quad\text{and}\quad \vec{B}^k = \{ij \in \vec{B}: d(k,j) = d(k,i) + 1\}$$
of oriented bonds that point towards $k$ and away from $k$, respectively. Similarly, each oriented bond $ij \in \vec{B}$ defines a splitting of $S$ into the "future interval"
$$]ij, \infty[ \;=\; \{k \in S: ij \in \vec{B}_k\}$$
and the "past interval" $]-\infty, ij[ \;=\; \{k \in S: ij \in \vec{B}^k\}$. (12.2) Definition. A probability measure $\mu$ on $(\Omega, \mathcal{F})$ will be called a Markov chain if
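$$\mu(\sigma_j = y \,|\, \mathcal{F}_{]-\infty, ij[}) = \mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) \quad \mu\text{-a.s.}$$
for all $ij \in \vec{B}$ and $y \in E$. Any stochastic matrix $P_{ij}$ on $E$ with
$$\mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) = P_{ij}(\sigma_i, y) \quad \mu\text{-a.s.}$$
for all $y \in E$ will then be called a transition matrix from $i$ to $j$ for $\mu$. A Markov chain $\mu$ will be said to be completely homogeneous with transition matrix $P$ if
$$\mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) = P(\sigma_i, y) \quad \mu\text{-a.s.}$$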
for all $y \in E$ and all $ij \in \vec{B}$. (12.3) Comments. (1) Standard arguments show that every Markov chain $\mu$ satisfies $\mu(A \,|\, \mathcal{F}_{]-\infty, ij[}) = \mu(A \,|\, \mathcal{F}_{\{i\}})$ $\mu$-a.s. for all $A \in \mathcal{F}_{]ij,\infty[}$ and all $ij \in \vec{B}$. (2) Let $\mu$ be a Markov chain with transition matrices $(P_{ij})_{ij\in\vec{B}}$, and let $\alpha_k = \sigma_k(\mu)$ be the marginal distribution of $\mu$ at $k \in S$. Then
(12.4)
$$\mu(\sigma_\Lambda = \zeta) = \alpha_k(\zeta_k) \prod_{ij \in \vec{B}^k:\ i,j \in \Lambda} P_{ij}(\zeta_i, \zeta_j)$$
for all connected sets $\Lambda \in \mathcal{S}$ and all $\zeta \in E^\Lambda$ and $k \in \Lambda$. This is easily seen by induction on $|\Lambda|$. (3) Let $\mu$ be a Markov chain, and suppose that $V$ is a copy of $\mathbb{Z}$ which is imbedded (as a graph) into $S$. Then the marginal distribution $\sigma_V(\mu)$ of $\mu$ on $V$ is a Markov chain in the sense of Definition (10.4). This follows readily from equation (12.4). (4) Let $(P_{ij})_{ij\in\vec{B}}$ be a family of stochastic matrices on $E$. $(P_{ij})_{ij\in\vec{B}}$ is a family of transition matrices for a Markov chain $\mu$ if and only if there exists a family $(\alpha_k)_{k\in S}$ of probability vectors on $E$ such that
(12.5)
$$\alpha_i(x) P_{ij}(x,y) = \alpha_j(y) P_{ji}(y,x) \qquad (ij \in \vec{B},\ x, y \in E).$$
This is because (12.5) is equivalent to the statement that the expression on the right of (12.4) is independent of the choice of $k \in \Lambda$ for all connected sets $\Lambda \in \mathcal{S}$ and all $\zeta \in E^\Lambda$. (5) Let $P$ be a positive stochastic matrix on $E$. $P$ is the transition matrix of a completely homogeneous Markov chain $\mu$ if and only if $P$ is reversible, in that there exists a probability vector $\alpha$ on $E$ such that
$$\alpha(x) P(x,y) = \alpha(y) P(y,x) \qquad (x, y \in E),$$
and in this case we have $\alpha = \sigma_k(\mu)$ for all $k \in S$. This follows from (12.5). Moreover, equation (12.4) shows that $\mu$ is invariant under the group $I(B)$ of all graph automorphisms of $S$. $I(B)$ is defined as the set of all transformations $\tau$ of $\Omega$ of the form $\tau\omega = (\omega_{\tau_* i})_{i\in S}$ $(\omega \in \Omega)$ with a bijection $\tau_*: S \to S$ which is such that $\{\tau_* i, \tau_* j\} \in B$ if and only if $\{i,j\} \in B$. In the special case $S = \mathbb{Z}$ (with the usual graph structure), $I(B)$ consists of both translations and reflections.
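This is the reason why we speak of a completely homogeneous (rather than a homogeneous) Markov chain. (6) Every Markov chain $\mu$ is a Markov field. For let $\Lambda \in \mathcal{S}$ be given, and let $\Delta \in \mathcal{S}$ be a connected set with $\Lambda \cup \partial\Lambda \subset \Delta$. Equation (12.4) shows that
$$\mu(\sigma_\Delta = \zeta\omega\eta)\,\mu(\sigma_\Delta = \zeta'\omega\eta') = \mu(\sigma_\Delta = \zeta\omega\eta')\,\mu(\sigma_\Delta = \zeta'\omega\eta)$$
for all $\zeta, \zeta' \in E^\Lambda$, $\omega \in E^{\partial\Lambda}$, and $\eta, \eta' \in E^{\Delta\setminus(\Lambda\cup\partial\Lambda)}$. Summing over $\zeta'$ and $\eta'$ we obtain $\mu(\sigma_\Delta = \zeta\omega\eta)\,\mu(\sigma_{\partial\Lambda} = \omega) = \mu(\sigma_{\Delta\setminus\Lambda} = \omega\eta)\,\mu(\sigma_{\Lambda\cup\partial\Lambda} = \zeta\omega)$. Therefore, if $\mu(\sigma_{\Delta\setminus\Lambda} = \omega\eta) > 0$ then
$$\mu(\sigma_\Lambda = \zeta \,|\, \sigma_{\Delta\setminus\Lambda} = \omega\eta) = \mu(\sigma_\Lambda = \zeta \,|\, \sigma_{\partial\Lambda} = \omega),$$
and this means that
$$\mu(\sigma_\Lambda = \zeta \,|\, \mathcal{F}_{\Delta\setminus\Lambda}) = \mu(\sigma_\Lambda = \zeta \,|\, \mathcal{F}_{\partial\Lambda}) \quad \mu\text{-a.s.}$$
Since $\mathcal{T}_\Lambda$ is generated by the union of all these $\mathcal{F}_{\Delta\setminus\Lambda}$'s, we conclude finally that
$$\mu(\sigma_\Lambda = \zeta \,|\, \mathcal{T}_\Lambda) = \mu(\sigma_\Lambda = \zeta \,|\, \mathcal{F}_{\partial\Lambda}) \quad \mu\text{-a.s.}$$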
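Hence $\mu$ is a Markov field. □ In Section 12.2 we shall see that the converse of the last comment fails: There exist examples of finite state Markov fields on a tree which are not Markov chains. Nevertheless, the following analogue of Theorem (10.21) holds. (12.6) Theorem. Let $\gamma$ be a Markov specification. Then each $\mu \in \operatorname{ex}\mathcal{G}(\gamma)$ is a Markov chain. Proof. Let $ij \in \vec{B}$ and $y \in E$ be given. For each $n \ge 1$ we write $\Delta(n) = \{k \in S: d(k,j) \le n\}$ and $\Lambda(n) = \Delta(n) \cap\, ]ij, \infty[$. In view of Theorem (7.7), $\mu$ is trivial on the tail $\sigma$-field $\mathcal{T}$. A fortiori, $\mu$ is trivial on the smaller $\sigma$-algebra
$$\bigcap_{n\ge1} \mathcal{F}_{]ij,\infty[\,\setminus\Lambda(n)}.$$
This implies that
$$\mathcal{F}_{\{i\}} = \bigcap_{n\ge1} \mathcal{F}_{\{i\}\cup\,]ij,\infty[\,\setminus\Lambda(n)} \quad \mu\text{-a.s.}$$
Indeed, the $\sigma$-algebra on the left is clearly contained in that on the right. Conversely, if $f$ is bounded and measurable with respect to the latter $\sigma$-algebra then
$$f(x\sigma_{S\setminus\{i\}}) = \mu(f(x\sigma_{S\setminus\{i\}})) \quad \mu\text{-a.s.}$$
for all $x \in E$ because $f(x\sigma_{S\setminus\{i\}})$ is measurable with respect to $\bigcap_{n\ge1} \mathcal{F}_{]ij,\infty[\,\setminus\Lambda(n)}$. Using the backward martingale convergence theorem, we conclude that
$$\mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) = \lim_{n\to\infty} \mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}\cup\,]ij,\infty[\,\setminus\Lambda(n)}) \quad \mu\text{-a.s.}$$
Since $\{i\} \cup\, ]ij,\infty[\,\setminus\Lambda(n) \supset \partial\Lambda(n)$ and $\mu \in \mathcal{G}(\gamma)$, the expression on the right is almost surely equal to
$$\lim_{n\to\infty} \gamma_{\Lambda(n)}(\sigma_j = y \,|\, \cdot) = \lim_{n\to\infty} \mu(\sigma_j = y \,|\, \mathcal{T}_{\Lambda(n)}) = \mu\Big(\sigma_j = y \,\Big|\, \bigcap_{n\ge1}\mathcal{T}_{\Lambda(n)}\Big).$$
By virtue of the inclusion
$$\mathcal{F}_{\{i\}} \subset \mathcal{F}_{]-\infty,ij[} \subset \bigcap_{n\ge1} \mathcal{T}_{\Lambda(n)}$$
this implies that
$$\mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) = \mu(\sigma_j = y \,|\, \mathcal{F}_{]-\infty,ij[}) \quad \mu\text{-a.s.}$$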
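Hence $\mu$ is a Markov chain. □ We now work towards obtaining a characterization of the Markov chains in $\mathcal{G}(\gamma)$ which is similar to Theorem (11.9). For simplicity we shall only consider positive Markov specifications. In view of Corollary (2.32), a positive specification $\gamma$ is Markovian if and only if $\gamma = \gamma^\Phi$ for some nearest-neighbour potential $\Phi$. Setting
(12.7)
$$Q_b(\zeta) = \exp\big[-\Phi_b(\zeta) - |\partial i|^{-1}\Phi_{\{i\}}(\zeta_i) - |\partial j|^{-1}\Phi_{\{j\}}(\zeta_j)\big]$$
when $b = \{i,j\} \in B$ and $\zeta \in E^b$, we see that each positive Markov specification $\gamma$ can be written in the form
(12.8)
$$\gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda \,|\, \omega) = Z_\Lambda(\omega)^{-1} \prod_{b \cap \Lambda \ne \emptyset} Q_b(\omega_b),$$
where $\Lambda \in \mathcal{S}$, $\omega \in \Omega$, and $Z_\Lambda(\omega)$ is a normalizing constant. It will often be convenient to think of $Q_b$ as a transfer matrix along the bond $b$. To stress this aspect we introduce a family $\{Q_{ij}: ij \in \vec{B}\}$ of positive matrices by writing
(12.9)
$$Q_{ij}(x,y) = Q_{ji}(y,x) = Q_b(\zeta)$$
whenever $b = \{i,j\} \in B$, $\zeta \in E^b$, and $x = \zeta_i$, $y = \zeta_j$. (12.10) Definition. A family $\{\ell_{ij}: ij \in \vec{B}\}$ of (row) vectors $\ell_{ij} \in\, ]0,\infty[^E$ will be called a boundary law for $\{Q_{ij}: ij \in \vec{B}\}$ (or $\gamma$) if for each $ij \in \vec{B}$ there is a number $c_{ij} > 0$ such that
$$\ell_{ij}(x) = c_{ij} \prod_{k \in \partial i\setminus\{j\}} \ell_{ki}Q_{ki}(x) \quad\text{for all } x \in E,$$
where $\ell_{ki}Q_{ki}(x) = \sum_{y\in E} \ell_{ki}(y)Q_{ki}(y,x)$. This definition is consistent with Definition (11.8), as is explained in the example below.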
(12.11) Example. Let $S = \mathbb{Z}$ (with the usual tree structure) and $Q$ be a positive matrix on $E$. Define a family $\{Q_{ij}: ij \in \vec{B}\}$ by $Q_{i,i+1} = Q$ and $Q_{i,i-1} = Q^T$, where $i \in \mathbb{Z}$ and $Q^T$ is the transpose of $Q$. Let $\{l_i, r_i: i \in \mathbb{Z}\}$ be a boundary law for $Q$ in the sense of Definition (11.8). Then $\{\ell_{ij}: ij \in \vec{B}\}$ defined by $\ell_{i,i+1} = l_i$ and $\ell_{i,i-1} = r_i^T$ is a boundary law for $\{Q_{ij}: ij \in \vec{B}\}$ in the sense of (12.10). This is a straightforward computation. The next theorem extends Theorem (11.9) to Markov specifications on trees. To state it we observe that if $\Lambda \in \mathcal{S}$ is connected and $k \in \partial\Lambda$ then $\Lambda \cap \partial k$ consists of a unique element which will be denoted by $k_\Lambda$. (12.12) Theorem. Consider a Markov specification $\gamma$ of the form (12.8), and let $\{Q_{ij}: ij \in \vec{B}\}$ be the associated family of transfer matrices. (a) Each boundary law $\{\ell_{ij}: ij \in \vec{B}\}$ for $\{Q_{ij}: ij \in \vec{B}\}$ defines a unique Markov chain $\mu \in \mathcal{G}(\gamma)$ via the equation
(12.13)
$$\mu(\sigma_{\Lambda\cup\partial\Lambda} = \zeta) = z_\Lambda \prod_{k\in\partial\Lambda} \ell_{kk_\Lambda}(\zeta_k) \prod_{b\cap\Lambda\ne\emptyset} Q_b(\zeta_b).$$
Here $\Lambda \in \mathcal{S}$ is any connected set, $\zeta \in E^{\Lambda\cup\partial\Lambda}$, and $z_\Lambda > 0$ a suitable normalizing constant. (b) Each Markov chain $\mu \in \mathcal{G}(\gamma)$ admits a representation of the form (12.13) in terms of a boundary law $\{\ell_{ij}: ij \in \vec{B}\}$ which is unique in the sense that each $\ell_{ij}$ is unique up to a positive factor. Proof. (a) The first step in the proof of assertion (a) consists in showing that the expressions on the right of (12.13) are consistent. This means that
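(12.14)
$$\sum_{\zeta_{(\Delta\cup\partial\Delta)\setminus(\Lambda\cup\partial\Lambda)}} z_\Delta \prod_{k\in\partial\Delta} \ell_{kk_\Delta}(\zeta_k) \prod_{b\cap\Delta\ne\emptyset} Q_b(\zeta_b) = z_\Lambda \prod_{k\in\partial\Lambda} \ell_{kk_\Lambda}(\zeta_k) \prod_{b\cap\Lambda\ne\emptyset} Q_b(\zeta_b)$$
whenever $\Lambda, \Delta \in \mathcal{S}$ are connected sets with $\Lambda \subset \Delta$ and $\zeta \in E^{\Lambda\cup\partial\Lambda}$. By induction it is sufficient to consider the case $\Delta = \Lambda \cup \{i\}$ for some $i \in \partial\Lambda$; we write $j = i_\Lambda$. In this case the left side of (12.14) equals
$$z_\Delta \prod_{k\in\partial i\setminus\{j\}} \ell_{ki}Q_{ki}(\zeta_i) \prod_{k\in\partial\Lambda\setminus\{i\}} \ell_{kk_\Lambda}(\zeta_k) \prod_{b\cap\Lambda\ne\emptyset} Q_b(\zeta_b).$$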
Since $\{\ell_{ij}: ij \in \vec{B}\}$ is a boundary law, the last expression coincides with the right side of (12.14) up to a factor $z_\Delta/c_{ij}z_\Lambda$. Summing over $\zeta_{\Lambda\cup\partial\Lambda}$ we see that this factor is 1. This establishes (12.14). As a consequence of (12.14), equation (12.13) defines a unique finitely additive measure on the algebra of all cylinder events, and thereby a unique probability measure $\mu$ on $(\Omega, \mathcal{F})$. By definition, $\mu$ is positive on cylinder events. To show that $\mu$ is a Markov chain we fix any $ij \in \vec{B}$, $x, y \in E$ and $\omega \in \Omega$, and we let $\Lambda \in \mathcal{S}$ be a connected set with $i \in \Lambda \subset\, ]-\infty, ij[$. We set $\Delta = \Lambda \cup \partial\Lambda \setminus \{j\}$. Equation (12.13) shows that
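$$\mu(\sigma_j = x \,|\, \sigma_\Delta = \omega_\Delta)\,/\,\mu(\sigma_j = y \,|\, \sigma_\Delta = \omega_\Delta) = \ell_{ji}(x)Q_{ji}(x,\omega_i)\,/\,\ell_{ji}(y)Q_{ji}(y,\omega_i).$$
Summing over $x \in E$ we obtain
$$\mu(\sigma_j = y \,|\, \sigma_\Delta = \omega_\Delta) = \ell_{ji}(y)Q_{ji}(y,\omega_i)\,/\,\ell_{ji}Q_{ji}(\omega_i).$$
The expression on the right depends on $\omega_\Delta$ via $\omega_i$ only. We thus conclude that
$$\mu(\sigma_j = y \,|\, \mathcal{F}_{]-\infty,ij[}) = \mu(\sigma_j = y \,|\, \mathcal{F}_{\{i\}}) \quad \mu\text{-a.s.}$$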
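It remains to prove that $\mu \in \mathcal{G}(\gamma)$. Let $\Lambda \in \mathcal{S}$ be given. Take any two configurations $\zeta, \omega \in \Omega$ with $\zeta_{S\setminus\Lambda} = \omega_{S\setminus\Lambda}$. Let $\Delta \in \mathcal{S}$ be an arbitrary connected set with $\Lambda \subset \Delta$. Then we can write, using (12.13) and (12.8),
$$\frac{\mu(\sigma_\Lambda = \zeta_\Lambda \,|\, \sigma_{(\Delta\cup\partial\Delta)\setminus\Lambda} = \omega_{(\Delta\cup\partial\Delta)\setminus\Lambda})}{\mu(\sigma_\Lambda = \omega_\Lambda \,|\, \sigma_{(\Delta\cup\partial\Delta)\setminus\Lambda} = \omega_{(\Delta\cup\partial\Delta)\setminus\Lambda})} = \frac{\mu(\sigma_{\Delta\cup\partial\Delta} = \zeta_{\Delta\cup\partial\Delta})}{\mu(\sigma_{\Delta\cup\partial\Delta} = \omega_{\Delta\cup\partial\Delta})} = \frac{\prod_{b\cap\Lambda\ne\emptyset} Q_b(\zeta_b)}{\prod_{b\cap\Lambda\ne\emptyset} Q_b(\omega_b)} = \frac{\gamma_\Lambda(\sigma_\Lambda = \zeta_\Lambda \,|\, \omega)}{\gamma_\Lambda(\sigma_\Lambda = \omega_\Lambda \,|\, \omega)}.$$
Summing over $\zeta_\Lambda \in E^\Lambda$ we see that $\mu \in \mathcal{G}(\gamma)$. (b) To prove assertion (b) we fix any Markov chain $\mu \in \mathcal{G}(\gamma)$. $\mu$ is positive on cylinder events because $\gamma$ is positive. For $ij \in \vec{B}$ and $x, y \in E$ we define $P_{ij}(x,y) = \mu(\sigma_j = y \,|\, \sigma_i = x)$. Let $\Lambda \in \mathcal{S}$ be connected, $\zeta \in \Omega$, and $a \in E$ be any fixed reference state. We write
$$\mu(\sigma_{\Lambda\cup\partial\Lambda} = \zeta_{\Lambda\cup\partial\Lambda}) = \mu(A)\,\mu(B|A)\,\mu(C|B)\,/\,\mu(A|B),$$
where $A = \{\sigma_\Lambda = a\}$, $B = \{\sigma_{\partial\Lambda} = \zeta_{\partial\Lambda}\}$, $C = \{\sigma_\Lambda = \zeta_\Lambda\}$. Equation (12.4) shows that
$$\mu(B|A) = \prod_{k\in\partial\Lambda} P_{k_\Lambda k}(a, \zeta_k).$$
On the other hand, we conclude from (12.8) that
$$\mu(C|B)/\mu(A|B) = \gamma_\Lambda(C|\zeta)/\gamma_\Lambda(A|\zeta) = \prod_{b\cap\Lambda\ne\emptyset} Q_b(\zeta_b) \Big/ \prod_{b\subset\Lambda} Q_b(aa) \prod_{k\in\partial\Lambda} Q_{k_\Lambda k}(a,\zeta_k).$$
Consequently, equation (12.13) holds with
$$z_\Lambda = \mu(\sigma_\Lambda = a) \Big/ \prod_{b\subset\Lambda} Q_b(aa)$$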
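and
$$\ell_{ij}(x) = P_{ji}(a,x)/Q_{ji}(a,x) \qquad (ij \in \vec{B},\ x \in E).$$
In particular, replacing $\Lambda$ by $\Delta = \Lambda \cup \{i\}$ in (12.13) and comparing the resulting equation with (12.13) we obtain the consistency equation (12.14). But (12.14) implies that $\{\ell_{ij}: ij \in \vec{B}\}$ is a boundary law; cf. the first part of this proof. To prove the uniqueness of $\{\ell_{ij}: ij \in \vec{B}\}$ we suppose that $\{\ell'_{ij}: ij \in \vec{B}\}$ is a second boundary law which represents $\mu$ via (12.13), with normalizing constants $z'_\Lambda$. Applying (12.13) to $\Lambda = \{i\}$ and to configurations $\zeta$ with $\zeta_j = x$ and $\zeta_k = a$ for all other $k \in \{i\} \cup \partial i$, we find
$$\ell'_{ji}(x)/\ell_{ji}(x) = (z_{\{i\}}/z'_{\{i\}}) \prod_{k\in\partial i\setminus\{j\}} \ell_{ki}(a)/\ell'_{ki}(a).$$
It follows that $\ell'_{ji}$ is a positive multiple of $\ell_{ji}$. This completes the proof of Theorem (12.12). □ We have just seen that a boundary law is only determined up to a positive factor. It is sometimes useful, therefore, to introduce a normalization. The simplest normalization is that at a given reference state $a \in E$. We will say that a boundary law $\{\ell_{ij}: ij \in \vec{B}\}$ is normalized at $a$ if $\ell_{ij}(a) = 1$ for all $ij \in \vec{B}$. By Definition (12.10), such a boundary law satisfies
(12.15)
$$\ell_{ij}(x) = \prod_{k\in\partial i\setminus\{j\}} \ell_{ki}Q_{ki}(x)/\ell_{ki}Q_{ki}(a) \qquad (ij \in \vec{B},\ x \in E).$$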
We conclude this section with two corollaries of Theorem (12.12). The first deals with completely homogeneous Markov chains in a homogeneous setting. Suppose there is an integer $d \ge 1$ such that $|\partial i| = d + 1$ for all $i \in S$. $S$ is then called the Cayley tree or Bethe lattice of degree $d$ and will be denoted by $\mathcal{CT}(d)$. Clearly, $\mathcal{CT}(1)$ equals $\mathbb{Z}$ with its usual graph structure. Two embeddings of a part of $\mathcal{CT}(2)$ into the plane are shown in Figure 1 below.
Figure 12.1 Two embeddings of the Cayley tree, panels (a) and (b)
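As an informal illustration (not part of the original text), a finite ball of the Cayley tree of degree d can be built explicitly, and the tree metric d(i,j) can be computed by breadth-first search along the unique path. The helper names `cayley_ball` and `tree_distance` and the integer vertex labels are assumptions made for this sketch only.

```python
from collections import deque


def cayley_ball(d, radius):
    """Adjacency lists of the ball of the given radius around a root (vertex 0)
    in the Cayley tree of degree d, in which every vertex has d + 1 neighbours.
    Vertices on the surface of the ball keep only their single inner neighbour."""
    adj = {0: []}
    frontier = [0]
    next_id = 1
    for _ in range(radius):
        new_frontier = []
        for v in frontier:
            # the root needs d + 1 new neighbours, any other vertex only d,
            # because it is already joined to its parent
            missing = (d + 1) - len(adj[v])
            for _ in range(missing):
                adj[v].append(next_id)
                adj[next_id] = [v]
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj


def tree_distance(adj, i, j):
    """Length of the unique path from i to j, found by breadth-first search."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if v == j:
            return dist[v]
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    raise ValueError("vertices are not connected")
```

For d = 1 the construction yields a segment of the integer line, in agreement with the remark that the Cayley tree of degree 1 is Z.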
A Markov specification $\gamma$ on $\mathcal{CT}(d)$ will be said to be completely homogeneous with transfer matrix $Q$ if $\gamma$ satisfies (12.8) with functions $Q_b$ which are given by (12.9) with $Q_{ij} = Q$ for all $ij \in \vec{B}$. (Note that the matrix $Q$ is necessarily symmetric.) Let such a $\gamma$ be given, and suppose $\mu \in \mathcal{G}(\gamma)$ is a Markov chain. An inspection of the proof of Theorem (12.12) shows that $\mu$ is completely homogeneous if and only if the unique boundary law $\{\ell_{ij}: ij \in \vec{B}\}$ which is normalized at some $a \in E$ and represents $\mu$, via (12.13), is completely homogeneous, in that $\ell_{ij} = \ell$ for all $ij \in \vec{B}$ and some $\ell \in\, ]0,\infty[^E$. In view of (12.15), $\ell$ is then a solution of the equation
(12.16)
$$\ell(x) = \big(\ell Q(x)/\ell Q(a)\big)^d \qquad (x \in E).$$
We thus arrive at the following conclusion. (12.17) Corollary. Let $d \ge 1$ and $S = \mathcal{CT}(d)$, and consider a completely homogeneous Markov specification $\gamma$ with positive transfer matrix $Q$. For any fixed reference state $a \in E$ there is a one-to-one correspondence between the completely homogeneous Markov chains $\mu \in \mathcal{G}(\gamma)$ and the solutions $\ell \in\, ]0,\infty[^E$ of equation (12.16). This correspondence is established by equation (12.13) with $\ell_{ij} = \ell$ for all $ij \in \vec{B}$. The second corollary will provide a condition under which a convex mixture of Markov chains fails to be a Markov chain. This condition will be used in the next section to obtain examples of Markov fields which are not Markov chains. (12.18) Corollary. Let $S = \mathcal{CT}(d)$ for some $d \ge 1$, and let $\gamma$ be a completely homogeneous positive Markov specification. Also, let $\mu_1, \ldots, \mu_N \in \mathcal{G}(\gamma)$ be pairwise distinct Markov chains and $\{\ell^{(1)}_{ij}: ij \in \vec{B}\}, \ldots, \{\ell^{(N)}_{ij}: ij \in \vec{B}\}$ be the associated boundary laws for $\gamma$ which are normalized at some $a \in E$. Suppose that for each $ij \in \vec{B}$ there exists some $k \in \partial i\setminus\{j\}$ such that $\ell^{(n)}_{ki} = \ell^{(n)}_{ji}$ for all $1 \le n \le N$. Under this condition, a non-trivial convex mixture of $\mu_1, \ldots, \mu_N$ cannot be a Markov chain.
Proof. Let $0 \le t_1, \ldots, t_N < 1$ be such that $\sum_{n=1}^N t_n = 1$, and suppose that $\mu = \sum_{n=1}^N t_n\mu_n$ is a Markov chain. Let $\{\ell_{ij}: ij \in \vec{B}\}$ be the boundary law which describes $\mu$ and is normalized at $a$. Take any $ij \in \vec{B}$ and let $k \in \partial i\setminus\{j\}$ be as in the hypothesis. Applying (12.13) to the measures $\mu, \mu_1, \ldots, \mu_N$, the singleton $\Lambda = \{i\}$, and all configurations $\zeta$ with $\zeta_m = a$ unless $m = j$ or $k$, we see that there are numbers $c_1, \ldots, c_N > 0$ (depending on $i$) such that
$$\ell_{ji}(x)\,\ell_{ki}(y) = \sum_{n=1}^N t_n c_n\, \ell^{(n)}_{ji}(x)\, \ell^{(n)}_{ji}(y)$$
for all $x, y \in E$. Since all boundary laws are normalized at $a$, it follows that
$$\sum_{m,n=1}^N t_m t_n c_m c_n \big[\ell^{(m)}_{ji}(z) - \ell^{(n)}_{ji}(z)\big]^2 = 0$$
for all $z \in E$. Let $m \ne n$ be such that $t_m t_n > 0$. Then $\ell^{(m)}_{ji} = \ell^{(n)}_{ji}$ for all $ji \in \vec{B}$, and therefore $\mu_m = \mu_n$, in contradiction to our assumption. □
12.2
The Ising model on Cayley trees
In this section we assume that $S = \mathcal{CT}(d)$ for some $d \ge 1$. Thus $S$ is the unique connected tree with $|\partial i| = d + 1$ for all $i \in S$. For the state space we choose $E = \{-1, 1\}$. We shall sometimes write $+$ and $-$ instead of 1 and $-1$, respectively. The a priori measure on $E$ is counting measure. For given parameters $J \in \mathbb{R}$ (the "coupling constant") and $h \in \mathbb{R}$ (the "external field") we shall consider the Ising potential
(12.19)
$$\Phi^{J,h}_A = \begin{cases} -J\sigma_i\sigma_j & \text{if } A = \{i,j\} \in B, \\ -h\sigma_i & \text{if } A = \{i\}, \\ 0 & \text{otherwise} \end{cases}$$
as well as the associated set $\mathcal{G}(J,h) = \mathcal{G}(\Phi^{J,h})$ of Gibbs measures. We shall use Theorem (12.12) to identify some particular Markov chains in $\mathcal{G}(J,h)$. By virtue of Theorem (12.6) and some specific properties of the model we shall then be able to determine the phase transition region
$$\{(J,h) \in \mathbb{R}^2: |\mathcal{G}(J,h)| > 1\}.$$
To begin with, we note that the Gibbs specification $\gamma^{J,h}$ for $\Phi^{J,h}$ is a completely homogeneous Markov specification with a transfer matrix $Q = Q^{J,h}$ which, in view of (12.7) and (12.9), is given by
(12.20)
$$Q(-,-) = \exp(J - 2h(d+1)^{-1}), \quad Q(-,+) = Q(+,-) = \exp(-J), \quad Q(+,+) = \exp(J + 2h(d+1)^{-1}).$$
According to Corollary (12.17), there is a one-to-one correspondence between the completely homogeneous Markov chains in $\mathcal{G}(J,h)$ and the numbers $s > 0$ which satisfy the equation
(12.21)
$$s = \left(\frac{Q(-,+) + sQ(+,+)}{Q(-,-) + sQ(+,-)}\right)^d.$$
(Just put $a = -1$ in the corollary and identify $s$ with the vector $\ell = (1, s)$.) To exploit this equation we pass from $s$ to the new variable
$t = h(d+1)^{-1} + \tfrac12\log s$. Inserting (12.20) into (12.21), expressing $s$ in terms of $t$, and taking logarithms, we arrive at the equation
(12.22)
$$t = h + d\,\varphi_J(t).$$
Here
(12.23)
$$\varphi_J(t) = \tfrac12 \log\frac{\cosh(t+J)}{\cosh(t-J)} = \operatorname{artanh}(w \tanh t)$$
with $w = \tanh J$. This gives us the following result. (12.24) Proposition. For given parameters $d \ge 1$ and $J, h \in \mathbb{R}$ there is a one-to-one correspondence $t \to \mu_t$ between the real solutions of equation (12.22) and the completely homogeneous Markov chains in $\mathcal{G}(J,h)$. $\mu_t$ is determined by equation (12.13) with $\ell_{ij}(-) = 1$, $\ell_{ij}(+) = \exp(2t - 2h(d+1)^{-1})$, $ij \in \vec{B}$. The transition matrix $P_t$ of $\mu_t$ is given by
$$P_t = \begin{pmatrix} e^{J-t}/2\cosh(J-t) & e^{t-J}/2\cosh(J-t) \\ e^{-J-t}/2\cosh(J+t) & e^{J+t}/2\cosh(J+t) \end{pmatrix},$$
with rows and columns ordered $(-, +)$, and the one-dimensional marginal distribution $\alpha_t = \sigma_i(\mu_t)$ $(i \in S)$ of $\mu_t$ is
$$\alpha_t = (2e^{-2J} + 2\cosh 2t)^{-1}\,(e^{-2J} + e^{-2t},\ e^{-2J} + e^{2t}).$$
Proof. We only need to check the formulas for $P_t$ and $\alpha_t$. The first follows from the equations $P_t(+,+)/P_t(+,-) = \exp(2t + 2J)$, $P_t(-,+)/P_t(-,-) = \exp(2t - 2J)$ which are easily derived by looking at the proof of Theorem (12.12)(a). The second formula then follows from Comment (12.3)(5). □ The formula for $P_t$ has the surprising consequence that
$$\mu_t(\sigma_i = \sigma_j = +)\,\mu_t(\sigma_i = \sigma_j = -)\,/\,\mu_t(\sigma_i = -\sigma_j = +)^2 = \exp 4J$$
for all $\{i,j\} \in B$. In other words, the coupling constant $J$ is easily recovered from the bond marginal distribution of an arbitrary completely homogeneous Markov chain in $\mathcal{G}(J,h)$. A similar formula holds for $\exp 4t$, and this determines $h$ via (12.22). For later purposes we also note that the magnetization of $\mu_t$ is given by
(12.25)
$$\mu_t(\sigma_i) = (e^{-2J} + \cosh 2t)^{-1}\sinh 2t \qquad (i \in S).$$
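The formulas of Proposition (12.24) lend themselves to a quick numerical sanity check. The Python sketch below is an illustration only and not part of the original text; the function names are hypothetical, and the entries of $P_t$ follow the reading $P_t(-,-) = e^{J-t}/2\cosh(J-t)$, $P_t(-,+) = e^{t-J}/2\cosh(J-t)$, $P_t(+,-) = e^{-J-t}/2\cosh(J+t)$, $P_t(+,+) = e^{J+t}/2\cosh(J+t)$, with $\alpha_t$ proportional to $(e^{-2J} + e^{-2t},\ e^{-2J} + e^{2t})$. The identities tested hold for every real $t$, whether or not $t$ solves (12.22).

```python
import math


def transition_matrix(J, t):
    """P_t as read from Proposition (12.24); rows and columns ordered (-, +)."""
    c_minus = 2.0 * math.cosh(J - t)
    c_plus = 2.0 * math.cosh(J + t)
    return [
        [math.exp(J - t) / c_minus, math.exp(t - J) / c_minus],
        [math.exp(-J - t) / c_plus, math.exp(J + t) / c_plus],
    ]


def marginal(J, t):
    """One-site marginal alpha_t = (alpha_t(-), alpha_t(+))."""
    norm = 2.0 * math.exp(-2.0 * J) + 2.0 * math.cosh(2.0 * t)
    return [
        (math.exp(-2.0 * J) + math.exp(-2.0 * t)) / norm,
        (math.exp(-2.0 * J) + math.exp(2.0 * t)) / norm,
    ]


def magnetization(J, t):
    """Right-hand side of (12.25)."""
    return math.sinh(2.0 * t) / (math.exp(-2.0 * J) + math.cosh(2.0 * t))


def bond_ratio(J, t):
    """mu(++) mu(--) / (mu(+-) mu(-+)) for a nearest-neighbour pair."""
    P = transition_matrix(J, t)
    a = marginal(J, t)
    pp = a[1] * P[1][1]
    mm = a[0] * P[0][0]
    pm = a[1] * P[1][0]
    mp = a[0] * P[0][1]
    return pp * mm / (pm * mp)
```

Reversibility in the sense of Comment (12.3)(5) amounts to `marginal(J, t)[0] * P[0][1]` agreeing with `marginal(J, t)[1] * P[1][0]`, and `bond_ratio(J, t)` should reproduce exp 4J.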
In particular, JU,(
The maximal slope of q>j is
$$J(d) = \operatorname{arcoth} d = \tfrac12 \log\frac{d+1}{d-1}.$$
If $J \le J(d)$ then $dw \le 1$. Since $dw - 1$ is the maximal slope of $d\varphi_J(t) - t$, this means that $d\varphi_J(t) - t$ is decreasing in $t$. Hence $h(J,d) = 0$ when $J \le J(d)$. In the other case $J > J(d)$, the function $d\varphi_J(t) - t$ has a positive slope at $t = 0$. Thus $h(J,d) > 0$. To find the explicit value of $h(J,d)$ we note that $d\varphi_J(t) - t$ has a local maximum at
(12.29)
$$t_{J,d} = \operatorname{artanh}\left(\frac{d - \bar w}{d - w}\right)^{1/2},$$
where $\bar w = \coth J = w^{-1}$. Indeed, using (12.26) we see that $d\varphi_J'(t) = 1$ if and only if $\sinh^2 t = (dw - 1)/(1 - w^2)$ and therefore
$$\tanh^2 t = (1 + \sinh^2 t)^{-1}\sinh^2 t = (d - \bar w)/(d - w).$$
Evaluating $\varphi_J$ at $t_{J,d}$ we finally get
(12.30)
$$h(J,d) = \begin{cases} 0 & \text{if } J \le J(d), \\ d \operatorname{artanh}\left(\dfrac{dw - 1}{d\bar w - 1}\right)^{1/2} - \operatorname{artanh}\left(\dfrac{d - \bar w}{d - w}\right)^{1/2} & \text{if } J > J(d). \end{cases}$$
In particular, $h(J,1) = 0$ for all $J > 0$. In the case $d > 1$, a straightforward calculation shows that
$$h(J,d) = (d-1)J + \frac{d-1}{2}\log(d-1) - \frac{d}{2}\log d + o(1) \quad\text{as } J \to \infty$$
and
$$h(J,d) = \tfrac23 (d^2-1)\,d^{-1/2}\,(J - J(d))^{3/2}(1 + o(1)) \quad\text{as } J \downarrow J(d).$$
So far we have discussed equation (12.22). Now the point is that $|\mathcal{G}(J,h)| > 1$ if and only if equation (12.22) has more than one solution. This will be shown in the following theorem. In other words, the thorn-shaped region
$$\{(J,h) \in \mathbb{R}^2: J > J(d),\ |h| \le h(J,d)\}$$
is the phase transition region of the ferromagnetic Ising model on $\mathcal{CT}(d)$. (This region is marked with an F in Figure 2 below.) In addition, this theorem will clarify the rôle of the completely homogeneous Markov chains in $\mathcal{G}(J,h)$. In order to state it we need some notation. We let $\mathcal{P}_{I(B)}(\Omega, \mathcal{F})$ denote the set of all $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ which are invariant with respect to all $\tau \in I(B)$; cf. Comment (12.3)(5). Similarly, we put $\mathcal{G}_{I(B)}(J,h) = \mathcal{G}(J,h) \cap \mathcal{P}_{I(B)}(\Omega, \mathcal{F})$. Moreover, combining the notations of Proposition (12.24) and Lemma (12.27) we shall write $\mu_*$, $\mu_-$, $\mu_+$, and $\mu_\#$ instead of $\mu_{t_*}$, $\mu_{t_-}$, $\mu_{t_+}$, and $\mu_{t_\#}$, respectively. That is, $\mu_*$ is the unique, $\mu_-$ the minimal, $\mu_+$ the maximal, and $\mu_\#$ the intermediate Markov chain in $\mathcal{G}_{I(B)}(J,h)$. (12.31) Theorem. Consider the ferromagnetic Ising potential (12.19) with parameters $J > 0$ and $h \in \mathbb{R}$ on the Cayley tree $S = \mathcal{CT}(d)$ of degree $d \ge 1$. Let $J(d)$ and $h(J,d)$ be defined by (12.28) and (12.30). (a) If $J \le J(d)$ or $|h| > h(J,d)$ then $\mathcal{G}(J,h) = \mathcal{G}_{I(B)}(J,h) = \{\mu_*\}$. (b) In the opposite case we have $\mu_-, \mu_+ \in \operatorname{ex}\mathcal{G}(J,h) \cap \operatorname{ex}\mathcal{G}_{I(B)}(J,h)$ and
$$|\operatorname{ex}\mathcal{G}(J,h) \setminus \operatorname{ex}\mathcal{G}_{I(B)}(J,h)| = \infty.$$
Also, if $|h| < h(J,d)$ then $\mu_\#$ is a third extreme element of $\mathcal{G}_{I(B)}(J,h)$.
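The count of solutions of the fixed point equation (12.22), which underlies the case distinction of the theorem, can be explored numerically. The following Python sketch is an illustration under stated assumptions and not the author's method: it takes $\varphi_J(t) = \operatorname{artanh}(\tanh J \cdot \tanh t)$ (which is what inserting (12.20) into (12.21) yields), reads (12.30) as $h(J,d) = d\operatorname{artanh}\sqrt{(dw-1)/(d\bar w-1)} - \operatorname{artanh}\sqrt{(d-\bar w)/(d-w)}$ for $dw > 1$, and locates roots by a plain grid search with bisection. The function names, the grid parameters `t_max` and `steps`, and the restriction to moderate $J$ (so that $\tanh J < 1$ in floating point) are choices made here.

```python
import math


def phi(J, t):
    """phi_J(t) = artanh(tanh(J) * tanh(t)); assumes moderate J so tanh(J) < 1."""
    return math.atanh(math.tanh(J) * math.tanh(t))


def critical_field(J, d):
    """Critical external field h(J, d) as read from (12.30), for J > 0."""
    w = math.tanh(J)
    if d * w <= 1.0:  # J <= J(d) = arcoth(d)
        return 0.0
    w_bar = 1.0 / w
    first = math.atanh(math.sqrt((d * w - 1.0) / (d * w_bar - 1.0)))
    second = math.atanh(math.sqrt((d - w_bar) / (d - w)))
    return d * first - second


def solutions(J, h, d, t_max=50.0, steps=20000):
    """Roots of F(t) = h + d*phi_J(t) - t on [-t_max, t_max], found from sign
    changes on a uniform grid and refined by bisection. Tangential (double)
    roots, which occur at |h| = h(J, d), are not detected by this method."""

    def F(t):
        return h + d * phi(J, t) - t

    grid = [-t_max + 2.0 * t_max * k / steps for k in range(steps + 1)]
    roots = []
    for left, right in zip(grid, grid[1:]):
        f_left, f_right = F(left), F(right)
        if f_left == 0.0:
            roots.append(left)
        elif f_left * f_right < 0.0:
            a, b = left, right
            for _ in range(60):
                m = 0.5 * (a + b)
                if F(a) * F(m) <= 0.0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots
```

For example, with d = 2 and J = 1 one expects three solutions for a small field such as h = 0.1 and a single solution once |h| exceeds `critical_field(1.0, 2)`.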
Proof. 1) For given $J$ and $h$ we let $t_-$ be the smallest and $t_+$ be the largest solution of (12.22). (The case $t_- = t_+ = t_*$ is not excluded.) Also, we let $\{\ell_{ij}: ij \in \vec{B}\}$ be any boundary law which is normalized at $-1$, and we put
$$t_{ij} = h(d+1)^{-1} + \tfrac12 \log \ell_{ij}(+) \qquad (ij \in \vec{B}).$$
We claim that $t_- \le t_{ij} \le t_+$ for all $ij \in \vec{B}$. This follows from the equation
(12.32)
$$t_{ij} = h + \sum_{k\in\partial i\setminus\{j\}} \varphi_J(t_{ki}) \qquad (ij \in \vec{B})$$
which is obtained from (12.15) in the same way as equation (12.22). Indeed, since $\varphi_J \le J$ we conclude from (12.32) that $t_{ij} \le h + dJ = \psi(\infty)$ for all $ij \in \vec{B}$. As $\varphi_J$ is increasing, a second application of (12.32) shows that $t_{ij} \le \psi^2(\infty)$ for all $ij$, where $\psi = h + d\varphi_J$. Repeating this argument, we see that $t_{ij} \le \psi^n(\infty)$ for all $n \ge 1$ and $ij \in \vec{B}$. The sequence $(\psi^n(\infty))_{n\ge1}$ is decreasing and bounded from below by $t_+$. Its limit is a fixed point of $\psi$ and thus equal to $t_+$. This proves that $t_{ij} \le t_+$ for all $ij \in \vec{B}$. The lower estimate of $t_{ij}$ is similar. 2) To prove assertion (a) we assume that $J \le J(d)$ or $|h| > h(J,d)$. Lemma (12.27) then shows that $t_- = t_+ = t_*$. Step 1) above and Theorem (12.12) thus imply that there is only one Markov chain in $\mathcal{G}(J,h)$. In view of Theorem (12.6), this is only possible when $|\operatorname{ex}\mathcal{G}(J,h)| = 1$. Using Theorem (7.26), we conclude that $|\mathcal{G}(J,h)| = 1$. 3) Next we assume that $J > J(d)$ and $|h| \le h(J,d)$. Then $t_-$
= EI 4,A(CJ kedA
EI
-1)
LQ(Ci,Cj)/Q{-,-)l
{i,j'}nA#0
According to Step 1) above, $\{\ell_{ij}: ij \in \vec{B}\}$ is dominated by the normalized boundary law defining $\mu_+$. Hence $r_{\Lambda,\zeta}(\mu) \le r_{\Lambda,\zeta}(\mu_+)$. In view of Theorem (7.26), this inequality even holds for all $\mu \in \mathcal{G}(J,h)$. Suppose now that $\mu_+$ was not extreme. Then $\mu_+$ is a non-trivial mixture of two distinct measures $\mu, \mu' \in \mathcal{G}(J,h)$. This implies that $r_{\Lambda,\zeta}(\mu_+)$ is a non-trivial convex combination of the numbers $r_{\Lambda,\zeta}(\mu)$ and $r_{\Lambda,\zeta}(\mu')$ which are at most $r_{\Lambda,\zeta}(\mu_+)$. From this we conclude that $r_{\Lambda,\zeta}(\mu_+) = r_\Lambda$
tn = h+d
zeI(B,n)
In view of standard extension arguments, it suffices to check this when $A$ and $C$ are cylinder events. In this case we can find connected sets $\Lambda, \Delta \in \mathcal{S}$ with $A \in \mathcal{F}_\Lambda$ and $C \in \mathcal{F}_\Delta$. For given $k \ge 1$ we let $n \ge 1$ be so large that $d(\tau_*^{-1}\Delta, \Lambda) \ge k$ for all $\tau \in I(B,n)$. Formula (12.4) then shows that
$$\sup_{\tau\in I(B,n)} |\mu(A \cap \tau^{-1}C) - \mu(A)\mu(C)| \le \max_{x\in E} \sum_{y\in E} |P^k(x,y) - \alpha(y)|.$$
By Theorem (3.A3), the last expression tends to zero as $k \to \infty$. This establishes the mixing property of $\mu$. Now let $A \in \mathcal{F}$ be such that $\mu(\tau^{-1}A \,\triangle\, A) = 0$ for all $\tau \in I(B)$, and put $C = A$. The mixing property then implies that $\mu(A) = \mu(A)^2$ and thus $\mu(A) = 0$ or 1. Applying Corollary (7.4) to the set
$$\Pi = \{\Omega \times \mathcal{F} \ni (\omega, A) \to 1_A(\tau\omega): \tau \in I(B)\}$$
of all deterministic probability kernels which are induced by the transformations in $I(B)$, we conclude that $\mu \in \operatorname{ex}\mathcal{P}_{I(B)}(\Omega, \mathcal{F})$. This completes the proof of the theorem. □ As a by-product, the preceding theorem provides us with examples of Markov fields which belong to $\mathcal{P}_{I(B)}(\Omega, \mathcal{F})$ but are not Markov chains. Indeed, if $J > J(d)$ and $|h| < h(J,d)$ then Corollary (12.18) ensures that a non-trivial mixture of $\mu_-$, $\mu_\#$, and $\mu_+$ can never be a Markov chain. But each such mixture belongs to $\mathcal{G}_{I(B)}(J,h)$. This observation shows that Theorem (10.25) fails when $\mathbb{Z}$ is replaced by a Cayley tree of degree $d > 1$. It is worthwhile to compute the magnetization $\mu_+(\sigma_i)$ in the critical case $J > J(d)$, $h = -h(J,d)$. In this case we have $t_+ = t_{J,d}$; cf. (12.29). The derivation of (12.29) shows that
$$\sinh 2t_{J,d} = 2\sinh^2 t_{J,d} \coth t_{J,d} = 2w(1-w^2)^{-1}(d - \bar w)^{1/2}(d - w)^{1/2}$$
and
$$e^{-2J} + \cosh 2t_{J,d} = (1-w)/(1+w) + 1 + 2\sinh^2 t_{J,d} = 2w(1-w^2)^{-1}(d-1).$$
Equation (12.25) thus implies that
(12.33)
$$\mu_+(\sigma_i) = (d - \bar w)^{1/2}(d - w)^{1/2}(d-1)^{-1} \qquad (i \in S)$$
when $J > J(d)$ and $h = -h(J,d)$. Note that the expression on the right of (12.33) is positive, tends to 1 as $J \to \infty$, and behaves like $(d+1)d^{-1/2}(J - J(d))^{1/2}$ as $J \downarrow J(d)$. Let us summarize our results in the ferromagnetic case $J > 0$. If $d > 1$ and $J > J(d)$, the Ising ferromagnet on $\mathcal{CT}(d)$ exhibits the phenomenon of spontaneous magnetization against an external field. More precisely, if $h < -h(J,d)$ then $\mathcal{G}(J,h) = \{\mu_*\}$ for some completely homogeneous Markov chain $\mu_*$ with magnetization $\mu_*(\sigma_i) < 0$; cf. (12.25). At the negative external field $h = -h(J,d)$, there exists a completely homogeneous phase $\mu_-$ of negative magnetization. This $\mu_-$ is the limit of the unique phases for $h < -h(J,d)$. However, in addition to $\mu_-$ a new phase $\mu_+ \in \mathcal{G}_{I(B)}(J, -h(J,d))$ emerges. $\mu_+$ has a positive magnetization which is given by (12.33). If $h$ increases, $\mu_+$ splits into two completely homogeneous Markov chains $\mu_\#$ and $\mu_+$. The magnetization $\mu_+(\sigma_i)$ increases with $h$, whilst $\mu_\#(\sigma_i)$ is a decreasing function of $h$ and reaches 0 at $h = 0$. The negative phase $\mu_-$ continues to exist, and besides $\mu_-$, $\mu_\#$, and $\mu_+$ there is an infinite number of inhomogeneous phases. At $h = h(J,d)$, the Markov chains $\mu_-$ and $\mu_\#$ coalesce. All phases except $\mu_+$ disappear when $h > h(J,d)$. Finally, let us turn to the antiferromagnetic case $J < 0$. In this case $\varphi_J$ is decreasing. As a consequence, equation (12.22) has only one solution $t_*$. In view of Proposition (12.24), this means that there exists only one completely homogeneous Markov chain $\mu_*$ in $\mathcal{G}(J,h)$ for all $J < 0$ and $h \in \mathbb{R}$. However, in contrast to the ferromagnetic case this does not imply that the Gibbs measure is unique.
In order to determine the antiferromagnetic phase transition region we introduce the notion of an alternating Markov chain. This notion is suggested by the fact that $S = \mathcal{CT}(d)$ is a bipartite graph. By definition, this means that $S$ can be decomposed into two disjoint subsets $S_0$ and $S_1$ such that $|b \cap S_0| = |b \cap S_1| = 1$ for all $b \in B$. A boundary law $\{\ell_{ij}: ij \in \vec{B}\}$ will be called alternating if there are vectors $\ell_0, \ell_1 \in\, ]0,\infty[^E$ such that
$$\ell_{ij} = \begin{cases} \ell_0 & \text{if } j \in S_0, \\ \ell_1 & \text{if } j \in S_1. \end{cases}$$
Similarly, a Markov chain in $\mathcal{G}(J,h)$ will be called alternating whenever it can be represented by an alternating boundary law. An obvious modification of the derivation of equation (12.22) shows that an alternating boundary law for $\Phi^{J,h}$ is described by a pair $(t_0, t_1)$ of real numbers satisfying
(12.34)
$$t_0 = h + d\varphi_J(t_1), \qquad t_1 = h + d\varphi_J(t_0).$$
Of course, $t_0$ then solves the equation
(12.35)
$$t = \psi_{J,h}(t) \triangleq h + d\varphi_J(h + d\varphi_J(t)).$$
Conversely, if $t$ solves (12.35) then the pair $(t, h + d\varphi_J(t))$ defines a unique alternating boundary law. Consequently, there is a one-to-one correspondence between the solutions of (12.35) and the alternating Markov chains in $\mathcal{G}(J,h)$. Since $\psi_J$
+
2t„Jtd>0.
In fact, numerical calculations suggest that $I(J,d) =\; ]-h(J,d), h(J,d)[$. An explicit formula for $h(J,d)$ is easily derived from (12.29). The resulting picture of the antiferromagnetic phase transition region is shown in Figure 2 below.
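The kind of numerical experiment alluded to above can be sketched as follows. This is a hedged illustration, not the calculation behind the text: it assumes $\varphi_J(t) = \operatorname{artanh}(\tanh J \cdot \tanh t)$ and iterates the map $t \mapsto h + d\varphi_J(h + d\varphi_J(t))$ of (12.35). For $J < 0$ this composed map is increasing and bounded by $|h| + d|J|$, so the iteration started above that bound decreases monotonically to the largest solution $t_1$; the partner value is then $t_0 = h + d\varphi_J(t_1)$. The function names and the iteration count are choices made for this sketch.

```python
import math


def phi(J, t):
    """phi_J(t) = artanh(tanh(J) * tanh(t)); assumes moderate |J|."""
    return math.atanh(math.tanh(J) * math.tanh(t))


def psi(J, h, d, t):
    """One step t -> h + d * phi_J(t) of the alternating recursion (12.34)."""
    return h + d * phi(J, t)


def alternating_pair(J, h, d, iterations=2000):
    """Return (t0, t1), where t1 is the limit of the iteration of psi o psi
    started above its upper bound |h| + d|J|, and t0 = psi(t1).
    If t0 and t1 coincide, only the homogeneous solution t_* was found."""
    t = abs(h) + d * abs(J) + 1.0
    for _ in range(iterations):
        t = psi(J, h, d, psi(J, h, d, t))
    return psi(J, h, d, t), t
```

With d = 2, J = -1 and h = 0 the sketch returns a genuinely alternating pair with t0 = -t1, whereas for a weak coupling such as J = -0.3 both values collapse to the single solution of (12.22).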
Figure 12.2 The phase transition regions of the Ising model on $\mathcal{CT}(2)$. The antiferromagnetic region AF is open, whilst the ferromagnetic region F includes its boundary line except for the singular point.
In analogy to Theorem (12.31) we can also obtain some information about the structure of phases when $(J,h)$ belongs to the antiferromagnetic region. Suppose that $J < -J(d)$ and $h \in I(J,d)$, and let $t_0$ and $t_1$ be the smallest resp. largest solution of (12.35). Then (12.34) holds, and $t_0 < t_* < t_1$. Whilst $t_*$ corresponds to the unique completely homogeneous Markov chain $\mu_*$ in $\mathcal{G}(J,h)$, the pairs $(t_0, t_1)$ and $(t_1, t_0)$ define two alternating boundary laws and thereby two distinct Markov chains $\mu_{-+}, \mu_{+-} \in \mathcal{G}(J,h) \setminus \mathcal{G}_{I(B)}(J,h)$. With the help of ideas similar to those in Step 3) of the proof of Theorem (12.31), one can show that $\mu_{-+}$ and $\mu_{+-}$ are extreme in
Chapter 13 Gaussian fields
The theory of Gaussian fields is itself a well-known branch of Probability Theory. In this chapter we will show how this classical subject fits into our Gibbsian framework. Section 13.1 will be devoted to the question of whether a given Gauss field $\mu$ on a countable parameter set $S$ is a Gibbs measure for a suitable specification $\gamma$. This will turn out to be the case whenever $\mu$ is non-degenerate and satisfies a weak kind of Markov property. The specification $\gamma$ with $\mu \in \mathcal{G}(\gamma)$ will be Gaussian, in that each $\gamma_\Lambda(\cdot\,|\,\omega)$ is a Gaussian field. In addition, $\gamma$ will be Gibbsian for a pair potential $\Phi$.
shift-invariant. In this case, the existence problem admits a satisfactory solution in terms of Fourier analysis. We shall prove that $\mathcal{G}(\gamma^{J,h}) \ne \emptyset$ if and only if $M_{J,h} \ne \emptyset$ and the reciprocal $\hat{J}^{-1}$ of the Fourier transform $\hat{J}$ of $J$ is integrable with respect to Haar measure. One corollary of this theorem will assert that there is a unique shift-invariant Gibbs measure $\mu \in \mathcal{G}(\gamma^{J,0})$ with $\mu(\sigma_0^2) < \infty$ if and only if $\hat{J}$ has no root. At the end of this chapter there is an appendix which contains the proofs of some auxiliary results about Gaussian random vectors and positive definite functions on $\mathbb{Z}^d$.
13.1
Gauss fields as Gibbs measures
Throughout this chapter, the state space $E$ is chosen to be $\mathbb{R}$, the real line. $E$ is equipped with the Borel $\sigma$-algebra $\mathcal{E}$, and the a priori measure $\lambda$ is Lebesgue measure. In this and the next section, the parameter set $S$ is an arbitrary countably infinite set. (13.1) Definition. A probability measure $\mu$ on $(\Omega, \mathcal{F}) = (E, \mathcal{E})^S$ is called a Gaussian field or a Gauss measure if all finite dimensional marginal distributions $\sigma_\Lambda(\mu)$ $(\Lambda \in \mathcal{S})$ are Gaussian. The vector $m = (m_i)_{i\in S} = (\mu(\sigma_i))_{i\in S}$ is then called the mean of $\mu$, and the symmetric function
$$C(i,j) = \mu((\sigma_i - m_i)(\sigma_j - m_j)) = \mu(\sigma_i\sigma_j) - m_i m_j \qquad (i, j \in S)$$
is called the covariance function of $\mu$. $\mu$ is said to be centered if $m = 0$. As is well-known, the preceding definition can be restated in Fourier analytic terms as follows. A probability measure $\mu$ on $(\Omega, \mathcal{F})$ is Gaussian with mean $m$ and covariance function $C$ if and only if
(13.2)
$$\mu\Big(\exp\, \mathrm{i}\sum_{i\in S} t_i\sigma_i\Big) = \exp\Big[-\tfrac12 \sum_{i,j\in S} t_i C(i,j) t_j + \mathrm{i}\sum_{i\in S} t_i m_i\Big]$$
for all real sequences $(t_i)_{i\in S}$ for which $\{i \in S: t_i \ne 0\}$ is finite. Here $\mathrm{i}$ stands for the imaginary unit. Recall that a symmetric function $C: S \times S \to \mathbb{R}$ is said to be nonnegative definite resp. positive definite if
(13.3)
$$\sum_{i,j\in S} \bar{u}_i\, C(i,j)\, u_j \ge 0 \quad\text{resp. } > 0$$
for all complex sequences $(u_i)_{i\in S}$ with $\{i \in S: u_i \ne 0\} \in \mathcal{S}$. It is well-known, and easily seen, that a covariance function is always nonnegative definite. If a Gaussian field $\mu$ has a positive definite covariance function then each marginal distribution $\sigma_\Lambda(\mu)$ of $\mu$ has the well-known Gaussian density with respect to $\lambda^\Lambda$ $(\Lambda \in \mathcal{S})$.
In this section we shall address ourselves to the following question. Is every Gauss field a Gibbs measure? More precisely, is it possible to associate to a given Gaussian field $\mu$ a Gibbsian specification $\gamma$ in such a way that $\mu \in \mathcal{G}(\gamma)$? To answer this question we shall proceed in three steps. First, we shall look at the conditional expectations
(13.4) $\xi_i^\mu = \mu(\sigma_i \mid \mathcal{T}_{\{i\}})$ $(i \in S)$,
and we shall assume that $\mu(\sigma_i \neq \xi_i^\mu) > 0$ and that each $\xi_i^\mu$ only depends on finitely many spins. By virtue of general properties of Gaussian distributions, this will imply that
(13.5) $\sigma_i - \xi_i^\mu = J(i,i)^{-1}\Big(h_i + \sum_{j \in S} J(i,j)\,\sigma_j\Big)$ $\mu$-a.s.
for a suitable vector $h = (h_i)_{i \in S} \in \Omega$ and a positive definite symmetric function $J : S \times S \to \mathbb{R}$. (Equation (13.5) asserts in particular that the conditional expectation $\xi_i^\mu$ is affine.) We then shall conclude from equation (13.5) that the conditional distribution of $\sigma_i$ given $\mathcal{T}_{\{i\}}$ is Gibbsian for a quadratic pair potential $\Phi = \Phi^{J,h}$. Finally, we shall conclude from the positive definiteness of $J$ that $\Phi$ is $\lambda$-admissible. This will allow us to define the Gibbsian specification $\gamma^\Phi$, and Theorem (1.33) will imply that $\mu \in \mathcal{G}(\gamma^\Phi)$. Turning to the first step of this program, we introduce the covariance function
(13.6) $\Gamma(i,j) = \mu\big((\sigma_i - \xi_i^\mu)(\sigma_j - \xi_j^\mu)\big)$ $(i,j \in S)$
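of $\mu$. ($\Gamma(i,i)$ is often called the mean square interpolation error.)
(13.7) Proposition. Let $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ be a Gaussian field with mean $m$. Suppose (i) $\Gamma(i,i) > 0$ for all $i \in S$, and (ii) $\mu$ is Markovian, in that for each $i \in S$ there is some $\partial i \in \mathcal{S}$ with $i \notin \partial i$ such that $\xi_i^\mu$ has an $\mathcal{F}_{\partial i}$-measurable version. Define
$J(i,j) = \Gamma(i,j) / \big(\Gamma(i,i)\,\Gamma(j,j)\big)$ $(i,j \in S)$
and
$h_i = -\sum_{j \in S} J(i,j)\, m_j$ $(i \in S)$.
Then $J(i,j) = 0$ unless $j \in \{i\} \cup \partial i$, $J$ is positive definite, and equation (13.5) holds for all $i \in S$.
Proof. First of all, hypothesis (ii) implies that $\xi_i^\mu = \mu(\sigma_i \mid \mathcal{F}_{\partial i})$ $\mu$-a.s. for all $i \in S$. Therefore, if $j \notin \{i\} \cup \partial i$ then $\sigma_i - \xi_i^\mu$ is $\mathcal{T}_{\{j\}}$-measurable, and this gives
$\Gamma(i,j) = \mu\big((\sigma_i - \xi_i^\mu)\,\mu(\sigma_j - \xi_j^\mu \mid \mathcal{T}_{\{j\}})\big) = 0.$
Thus $J(i,j) = 0$ unless $j \in \{i\} \cup \partial i$, and this shows that the sums in the definition of $h_i$ and on the right of (13.5) are finite.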
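Next we take advantage of the fact that $\sigma_{\{i\} \cup \partial i}(\mu)$ is a Gaussian distribution. As is well-known, this ensures the existence of real numbers $u_i$ and $v_{ij}$ such that
$\mu(\sigma_i \mid \mathcal{F}_{\partial i}) = -u_i - \sum_{j \in \partial i} v_{ij}\,\sigma_j$ $\mu$-a.s.
See Proposition (13.A4) in the appendix. Letting $v_{ii} = 1$ and $v_{ij} = 0$ when $j \notin \{i\} \cup \partial i$, this can be rewritten as
(13.8) $\sigma_i - \xi_i^\mu = u_i + \sum_{j \in S} v_{ij}\,\sigma_j$ $\mu$-a.s.
In order to identify the $v_{ij}$'s we observe that
$\mu\big(\sigma_k(\sigma_j - \xi_j^\mu)\big) = \mu\big(\sigma_k\, \mu(\sigma_j - \xi_j^\mu \mid \mathcal{T}_{\{j\}})\big) = 0$ $(k \neq j)$
and
$\mu\big(\sigma_j(\sigma_j - \xi_j^\mu)\big) = \Gamma(j,j) + \mu\big(\xi_j^\mu(\sigma_j - \xi_j^\mu)\big) = \Gamma(j,j)$ $(j \in S)$.
Equation (13.8) thus implies that
$\Gamma(i,j) = \mu\Big(\Big(u_i + \sum_{k \in S} v_{ik}\,\sigma_k\Big)(\sigma_j - \xi_j^\mu)\Big) = v_{ij}\,\Gamma(j,j)$
for all $i,j \in S$, so that $v_{ij} = \Gamma(i,j)/\Gamma(j,j) = J(i,j)/J(i,i)$. To identify the $u_i$'s it is sufficient to integrate equation (13.8) with respect to $\mu$. Equation (13.5) then follows from (13.8).
It remains to show that $J$ is positive definite. It is evident from the definition of $J$ that $J$ is nonnegative definite. To prove the strict inequality in (13.3) we look at the symmetric matrices $\mathcal{J}_\Lambda = (J(i,j))_{i,j \in \Lambda}$, $\Lambda \in \mathcal{S}$. The nonnegative definiteness of $J$ just means that all eigenvalues of all $\mathcal{J}_\Lambda$'s are nonnegative, and $J$ is positive definite if and only if all eigenvalues of all $\mathcal{J}_\Lambda$'s are positive. Consequently, we need only show that each $\mathcal{J}_\Lambda$ is invertible. To this end we fix any $\Lambda \in \mathcal{S}$ and define a matrix $\Gamma_\Lambda = (\Gamma_\Lambda(i,j))_{i,j \in \Lambda}$ by
$\Gamma_\Lambda(i,j) = \mu\big((\sigma_i - \eta_i^\Lambda)(\sigma_j - \eta_j^\Lambda)\big) = \mu\big(\sigma_i(\sigma_j - \eta_j^\Lambda)\big)$ $(i,j \in \Lambda)$.
Here $\eta_i^\Lambda = \mu(\sigma_i \mid \mathcal{T}_\Lambda)$. We claim that $\Gamma_\Lambda$ is an inverse of $\mathcal{J}_\Lambda$. Indeed, for each $i, k \in \Lambda$ we can write
$\sum_{j \in \Lambda} J(i,j)\,\Gamma_\Lambda(j,k) = \sum_{j \in \Lambda} J(i,j)\,\mu\big(\sigma_j(\sigma_k - \eta_k^\Lambda)\big) = \sum_{j \in S} J(i,j)\,\mu\big(\sigma_j(\sigma_k - \eta_k^\Lambda)\big) = J(i,i)\,\mu\big((\sigma_i - \xi_i^\mu)(\sigma_k - \eta_k^\Lambda)\big).$
The last equality follows from (13.5). If $i \neq k$ then
$\mu\big((\sigma_i - \xi_i^\mu)(\sigma_k - \eta_k^\Lambda)\big) = \mu\big((\sigma_i - \mu(\sigma_i \mid \mathcal{T}_{\{i\}}))(\sigma_k - \eta_k^\Lambda)\big) = 0.$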
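In the case $i = k$ we have
$\mu\big((\sigma_i - \xi_i^\mu)(\sigma_i - \eta_i^\Lambda)\big) = \mu\big((\sigma_i - \xi_i^\mu)\,\sigma_i\big) = \Gamma(i,i) = J(i,i)^{-1}.$
This shows that $\mathcal{J}_\Lambda \Gamma_\Lambda$ is the identity matrix and proves that $J$ is positive definite. □
Next we will use equation (13.5) in order to show that the conditional distribution of $\sigma_i$ given $\mathcal{T}_{\{i\}}$ is Gibbsian for a pair potential $\Phi = \Phi^{J,h}$. For later purposes, we shall forget about the particular definition of $J$ in Proposition (13.7). In particular, we shall dispense with the finite range property of $J$ which was a consequence of the Markov hypothesis (ii) at (13.7). However, we still need to make sure that the sum on the right side of (13.5) is well-defined almost surely. We shall do this by simply assuming that the set
(13.9) $\Omega_J = \Big\{\omega \in \Omega : \sum_{j \in S} |J(i,j)\,\omega_j| < \infty \text{ for all } i \in S\Big\}$
has probability 1. Note that $\Omega_J \in \mathcal{T}$.
(13.10) Lemma. Let $J : S \times S \to \mathbb{R}$ be symmetric and positive definite, and let $h \in \Omega$. Also, let $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ be a Gaussian field with the properties
(i) $\mu(\Omega_J) = 1$,
(ii) $\mu\big((\sigma_i - \xi_i^\mu)^2\big) = J(i,i)^{-1}$ for all $i \in S$, and
(iii) $\mu$ satisfies (13.5) for all $i \in S$.
Then, for each $i \in S$, the conditional distribution of $\sigma_i$ given $\mathcal{T}_{\{i\}}$ with respect to $\mu$ is Gaussian with expectation $\xi_i^\mu$ and variance $J(i,i)^{-1}$. More explicitly, for each $A \in \mathcal{F}$ we have
$\mu(A \mid \mathcal{T}_{\{i\}}) = \lambda_{\{i\}}(\rho_{\{i\}} 1_A)$ $\mu$-a.s.
Here $\rho_{\{i\}} : \Omega_J \to [0, \infty[$ is given by
$\rho_{\{i\}} = Z_{\{i\}}^{-1} \exp\Big[-\tfrac{1}{2} J(i,i)\,\sigma_i^2 - h_i \sigma_i - \sum_{j \neq i} J(i,j)\,\sigma_i \sigma_j\Big]$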
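with a $\mathcal{T}_{\{i\}}$-measurable normalizing function $Z_{\{i\}}$.
Proof. Fix any $i \in S$. By definition (13.4), the random variables $\sigma_i - \xi_i^\mu$ and $\sigma_j$ are uncorrelated for all $j \neq i$. Also, for each $\Lambda \in \mathcal{S}$ the random vector $(\sigma_i - \xi_i^\mu, \sigma_\Lambda)$ has a Gaussian distribution. This follows from (13.5) because the image of a Gaussian distribution under a linear mapping is again Gaussian, and the almost sure limit of Gaussian random vectors is also Gaussian. (Look at (13.A1) for the first statement, and Proposition (13.A5) for the latter.) Remark (13.A3) thus implies that $\sigma_i - \xi_i^\mu$ and $\mathcal{T}_{\{i\}}$ are independent. Consequently, for all $t \in \mathbb{R}$,
$\mu(e^{\mathrm{i}t\sigma_i} \mid \mathcal{T}_{\{i\}}) = e^{\mathrm{i}t\xi_i^\mu}\,\mu(e^{\mathrm{i}t(\sigma_i - \xi_i^\mu)} \mid \mathcal{T}_{\{i\}}) = e^{\mathrm{i}t\xi_i^\mu}\,\mu(e^{\mathrm{i}t(\sigma_i - \xi_i^\mu)}) = \exp\big[\mathrm{i}t\xi_i^\mu - \tfrac{1}{2} t^2 \mu\big((\sigma_i - \xi_i^\mu)^2\big)\big] = \exp\big[\mathrm{i}t\xi_i^\mu - \tfrac{1}{2} t^2 J(i,i)^{-1}\big]$ $\mu$-a.s.
But this means that the conditional distribution of $\sigma_i$ given $\mathcal{T}_{\{i\}}$ is Gaussian with expectation $\xi_i^\mu$ and variance $J(i,i)^{-1}$. By (13.5), $\xi_i^\mu$ coincides $\mu$-a.s. with
$\xi_i = -J(i,i)^{-1}\Big(h_i + \sum_{j \neq i} J(i,j)\,\sigma_j\Big).$
With the help of the Gaussian density function $\tilde\rho_{\{i\}} = (J(i,i)/2\pi)^{1/2} \exp[-(\sigma_i - \xi_i)^2 J(i,i)/2]$ on $\Omega_J$, the last equation can then be rewritten as
$\mu(A \mid \mathcal{T}_{\{i\}}) = \lambda_{\{i\}}(\tilde\rho_{\{i\}} 1_A)$ $\mu$-a.s.
$\tilde\rho_{\{i\}}$ clearly coincides with the function $\rho_{\{i\}}$ in the statement of the lemma. The proof is thus complete. □
The conclusion of the preceding lemma can be restated as follows. For each $i \in S$, the conditional distribution of $\sigma_i$ given $\mathcal{T}_{\{i\}}$ is Gibbsian with respect to the potential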
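(13.11) $\Phi_A^{J,h} = \begin{cases} \tfrac{1}{2} J(i,i)\,\sigma_i^2 + h_i \sigma_i & \text{if } A = \{i\}, \\ J(i,j)\,\sigma_i \sigma_j & \text{if } A = \{i,j\},\ i \neq j, \\ 0 & \text{otherwise.} \end{cases}$
There is, however, one technical point to which we must pay attention. If $J$ has infinite range then the Hamiltonians $H_\Lambda^{J,h}(\omega)$ for $\Phi^{J,h}$ do not exist for all $\omega \in \Omega$. Thus $\Phi^{J,h}$ is not a potential in the strict sense of Definition (2.2). Nevertheless, $H_\Lambda^{J,h}(\omega)$ clearly exists for all $\omega \in \Omega_J$, and this will be sufficient for our purposes. The main question concerning $\Phi^{J,h}$ is this: Does $\Phi^{J,h}$ define a Gibbsian specification? For this we need to check if $\Phi^{J,h}$ is $\lambda$-admissible in the (slightly modified) sense that the partition functions $Z_\Lambda^{J,h}$ for $\Phi^{J,h}$ are finite on $\Omega_J$. Recall that $J$ is positive definite if and only if it is nonnegative definite and all the matrices
(13.12) $\mathcal{J}_\Lambda = (J(i,j))_{i,j \in \Lambda}$ $(\Lambda \in \mathcal{S})$
are invertible. (Cf. the proof of (13.7).)
(13.13) Proposition. Let $J : S \times S \to \mathbb{R}$ be a symmetric function and $h \in \Omega$. Then $\Phi^{J,h}$ is $\lambda$-admissible (in the sense above) if and only if $J$ is positive definite. In this case, the Gibbs distribution $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$ for $\Phi^{J,h}$ and $\Lambda \in \mathcal{S}$ is the unique Gaussian field with mean
$m_i(\Lambda, \omega) = \begin{cases} -\sum_{k \in \Lambda} \mathcal{J}_\Lambda^{-1}(i,k)\Big(h_k + \sum_{j \notin \Lambda} J(k,j)\,\omega_j\Big) & \text{if } i \in \Lambda, \\ \omega_i & \text{if } i \notin \Lambda \end{cases}$
and covariance function
$\begin{cases} \mathcal{J}_\Lambda^{-1}(i,j) & \text{if } i,j \in \Lambda, \\ 0 & \text{otherwise.} \end{cases}$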
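Proof. Let $\Lambda \in \mathcal{S}$ be fixed. We will show that $Z_\Lambda^{J,h}(\omega) < \infty$ if and only if the matrix $\mathcal{J}_\Lambda$ is positive definite. Since $\omega \in \Omega_J$, the Hamiltonian
$H_\Lambda^{J,h}(\zeta \omega_{S \setminus \Lambda}) = \sum_{i \in \Lambda} h_i \zeta_i + \tfrac{1}{2} \sum_{i,j \in \Lambda} J(i,j)\,\zeta_i \zeta_j + \sum_{i \in \Lambda} \zeta_i \sum_{j \notin \Lambda} J(i,j)\,\omega_j$
is well-defined for all $\zeta \in E^\Lambda$. Let us think of each $\zeta \in E^\Lambda$ as a column vector, and define a row vector $a(\Lambda, \omega) = (a_i(\Lambda, \omega))_{i \in \Lambda}$ by
$a_i(\Lambda, \omega) = h_i + \sum_{j \notin \Lambda} J(i,j)\,\omega_j.$
Then we can write $H_\Lambda^{J,h}(\zeta \omega_{S \setminus \Lambda}) = a(\Lambda, \omega)\,\zeta + \tfrac{1}{2} \zeta^T \mathcal{J}_\Lambda \zeta$, where $\zeta^T$ is the transpose of $\zeta$. Let $O_\Lambda$ be an orthogonal matrix such that $\mathcal{D}_\Lambda = O_\Lambda^T \mathcal{J}_\Lambda O_\Lambda$ is a diagonal matrix. The diagonal entries $q_i$ $(i \in \Lambda)$ of $\mathcal{D}_\Lambda$ are the eigenvalues of $\mathcal{J}_\Lambda$. We put $b(\Lambda, \omega) = (b_i(\Lambda, \omega))_{i \in \Lambda} = a(\Lambda, \omega)\, O_\Lambda$. As $\lambda^\Lambda$ is invariant under orthogonal transformations, we obtain
$Z_\Lambda^{J,h}(\omega) = \int \lambda^\Lambda(d\zeta) \exp\big[-a(\Lambda,\omega)\,\zeta - \tfrac{1}{2} \zeta^T \mathcal{J}_\Lambda \zeta\big] = \int \lambda^\Lambda(d\zeta) \exp\big[-b(\Lambda,\omega)\,\zeta - \tfrac{1}{2} \zeta^T \mathcal{D}_\Lambda \zeta\big] = \prod_{i \in \Lambda} \int \lambda(dx) \exp\big[-b_i(\Lambda,\omega)\,x - q_i x^2/2\big].$
The last product is finite if and only if $q_i > 0$ for all $i \in \Lambda$. This proves the first part of the proposition. We now assume that $\mathcal{J}_\Lambda$ is positive definite. Then we can write, putting $m(\Lambda, \omega) = -\mathcal{J}_\Lambda^{-1} a(\Lambda, \omega)^T$,
$H_\Lambda^{J,h}(\zeta \omega_{S \setminus \Lambda}) = \tfrac{1}{2} (\zeta - m(\Lambda,\omega))^T \mathcal{J}_\Lambda (\zeta - m(\Lambda,\omega)) - \tfrac{1}{2}\, m(\Lambda,\omega)^T \mathcal{J}_\Lambda\, m(\Lambda,\omega).$
The second term on the right depends only on $\omega_{S \setminus \Lambda}$, whilst the negative exponential of the first term is equal, up to a normalizing constant, to the Gaussian density with mean $m(\Lambda, \omega)$ and covariance $\mathcal{J}_\Lambda^{-1}$. Combining this with the fact that $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$ is supported on $\{\sigma_{S \setminus \Lambda} = \omega_{S \setminus \Lambda}\}$, we obtain the second part of the proposition. □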
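As a concrete illustration of Proposition (13.13), the following sketch computes the mean and covariance of a finite-volume Gaussian Gibbs distribution by solving the linear system $\mathcal{J}_\Lambda m = -a(\Lambda,\omega)^T$ exactly. The nearest-neighbour coupling on $\mathbb{Z}$ ($J(i,i) = 2$, $J(i,i\pm 1) = -1$), the volume $\Lambda = \{1,2,3\}$, the boundary values, and the helper names `solve`, `gibbs_mean` and `gibbs_cov` are all illustrative assumptions and do not come from the text.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # first usable pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def J(i, j):
    """Hypothetical coupling on Z: 2 on the diagonal, -1 between neighbours."""
    if i == j:
        return 2
    return -1 if abs(i - j) == 1 else 0

def gibbs_mean(Lam, omega, h):
    """Mean on Lam: solves J_Lam m = -a, a_i = h_i + sum_{j not in Lam} J(i,j) omega_j."""
    a = [h[i] + sum(J(i, j) * w for j, w in omega.items() if j not in Lam) for i in Lam]
    J_Lam = [[J(i, j) for j in Lam] for i in Lam]
    return solve(J_Lam, [-x for x in a])

def gibbs_cov(Lam):
    """Covariance on Lam: the inverse of J_Lam, computed column by column."""
    J_Lam = [[J(i, j) for j in Lam] for i in Lam]
    n = len(Lam)
    cols = [solve(J_Lam, [1 if r == c else 0 for r in range(n)]) for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(n)]

Lam = [1, 2, 3]
omega = {0: 0, 4: 4}      # boundary values; only the neighbours of Lam matter here
h = {1: 0, 2: 0, 3: 0}    # zero external field
mean = gibbs_mean(Lam, omega, h)
cov = gibbs_cov(Lam)
print(mean)
print(cov)
```

In this toy case the external field vanishes, so the mean interpolates linearly between the two boundary values, while the covariance does not depend on the boundary condition at all.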
We note that $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$ depends on $\omega$ via its mean only. The formula for this mean admits a nice interpretation in potential theoretic terms, as we shall explain now.
(13.14) Comment. Consider the setting of the preceding proposition. The formula for the mean $m(\Lambda, \omega)$ immediately shows that $m(\Lambda, \omega)$ is the unique solution of the equations
$\sum_{j \in S} J(i,j)\, m_j(\Lambda, \omega) = -h_i$ $(i \in \Lambda)$,
$m_i(\Lambda, \omega) = \omega_i$ $(i \notin \Lambda)$.
These equations can be considered to be a discrete Poisson boundary value problem. (If $h_i = 0$ for all $i$, we have a discrete Dirichlet problem.) It is a standard fact of potential theory that such a boundary value problem can often be solved by means of a related Markov process. To be precise, let us assume that $J$ is symmetric and such that
(i) $J(i,j) \le 0$ for all $i \neq j$,
(ii) $\sum_{j \in S} J(i,j) \ge 0$ for all $i$, and
(iii) $J$ is irreducible, i.e., $S$ is connected if any two distinct sites $i, j$ with $J(i,j) \neq 0$ are joined by an edge.
These conditions imply that $J$ is positive definite. Indeed, if $(u_i)_{i \in S}$ is any complex sequence with at most finitely many non-zero terms then condition (ii) gives
$\sum_{i,j \in S} u_i\, J(i,j)\, \bar{u}_j \ge -\tfrac{1}{2} \sum_{i \neq j} J(i,j)\, |u_i - u_j|^2.$
Because of (i), the expression on the right is nonnegative and vanishes if and only if $u_i = u_j$ whenever $J(i,j) \neq 0$. In view of (iii), the latter means that $u_i = 0$ for all $i \in S$. By virtue of our assumptions (i) and (ii), we can define a stochastic matrix $P$ on $\bar{S} = S \cup \{\infty\}$ by
(13.15) $P(i,j) = \begin{cases} -J(i,j)/J(i,i) & \text{if } i,j \in S,\ i \neq j, \\ \sum_{k \in S} J(i,k)/J(i,i) & \text{if } i \in S,\ j = \infty, \\ 1 & \text{if } i = j = \infty, \\ 0 & \text{otherwise.} \end{cases}$
Our Poisson equation for $m(\Lambda, \omega)$ then takes the form
(13.16) $\sum_{j \in \bar{S}} (P(i,j) - \delta_{ij})\, m_j(\Lambda, \omega) = -g_i$ $(i \in \Lambda)$, $m_i(\Lambda, \omega) = \omega_i$ $(i \in \bar{S} \setminus \Lambda)$.
Here $g_i = -h_i/J(i,i)$, $\omega_\infty = 0$, and $\delta_{ij}$ is Kronecker's delta. To solve this equation we fix an arbitrary $i \in \bar{S}$ and let $(X_n)_{n \ge 0}$ be a Markov chain on $\bar{S}$ with transition matrix $P$ and starting point $i$. The underlying probability measure will be denoted by $\mu_P^i$. Let
$\tau_\Lambda = \min\{n \ge 0 : X_n \in \bar{S} \setminus \Lambda\}$
be the first exit time from $\Lambda$. Since $P$ is irreducible, each $i \in \Lambda$ is a transient state for the Markov chain with transition matrix $P$ and absorption in $\bar{S} \setminus \Lambda$. As $\Lambda$ is finite, we conclude that $\mu_P^i(\tau_\Lambda) < \infty$ for all $i \in \bar{S}$. Suppose now that $m(\Lambda, \omega)$ is a solution of (13.16). Then for each $i \in \Lambda$ and $N \ge 0$ we have
$m_i(\Lambda, \omega) = \mu_P^i\Big(\sum_{n=0}^{(\tau_\Lambda - 1) \wedge N} g_{X_n} + m_{X_{\tau_\Lambda \wedge (N+1)}}(\Lambda, \omega)\Big).$
Indeed, for $N = 0$ this is identical to the first equation in (13.16), and the induction step is an immediate consequence of (13.16) and the Markov property of $(X_n)_{n \ge 0}$. The Markov property also shows that
$\mu_P^i\big(|\omega_{X_{\tau_\Lambda}}|\big) = \sum_{n \ge 0} \mu_P^i\big(1_{\{\tau_\Lambda > n,\ X_{n+1} \notin \Lambda\}} |\omega_{X_{n+1}}|\big) = \sum_{n \ge 0} \mu_P^i\Big(1_{\{\tau_\Lambda > n\}} \sum_{j \notin \Lambda} P(X_n, j)\, |\omega_j|\Big) \le \mu_P^i(\tau_\Lambda) \max_{k \in \Lambda} \sum_{j \notin \Lambda} |J(k,j)\,\omega_j| / J(k,k) < \infty$
and thus
$\mu_P^i\Big(\sup_{N \ge 0} |m_{X_{\tau_\Lambda \wedge N}}(\Lambda, \omega)|\Big) < \infty$
for all $i \in \Lambda$. Letting $N$ tend to infinity, we can therefore apply the dominated convergence theorem to get the formula
(13.17) $m_i(\Lambda, \omega) = \mu_P^i\Big(\sum_{n=0}^{\tau_\Lambda - 1} g_{X_n} + \omega_{X_{\tau_\Lambda}}\Big)$ $(i \in \bar{S})$
for the mean of $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$. This formula is completely analogous to the solution of the Poisson boundary value problem in terms of the Wiener process. (See Section 4.6 of Chung (1980), for example.) ○
After digressing to potential theory we now return to our main point, the definition of a Gibbsian specification for the potential $\Phi^{J,h}$ in (13.11). In view of Proposition (13.13), we need to assume that $J$ is positive definite. We can then take the natural Gibbsian definition whenever the boundary condition $\omega$ belongs to $\Omega_J$. What do we do when $\omega \notin \Omega_J$? The answer is simple: Since $\Omega_J$ is tail measurable, the definitions of $\gamma^{J,h}$ on $\Omega_J$ and $\Omega \setminus \Omega_J$ do not affect each other. $\gamma^{J,h}$ can thus be defined on $\Omega \setminus \Omega_J$ in any consistent way we like.
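The Markov chain description of the mean in Comment (13.14) can be checked on a small example. The sketch below iterates the fixed-point form of (13.16), $m_i = g_i + \sum_j P(i,j)\, m_j$ on $\Lambda$, starting from zero, which amounts to summing the contributions of longer and longer path segments before the exit from $\Lambda$. The coupling ($J(i,i) = 2$, $J(i,i\pm 1) = -1$ on $\mathbb{Z}$, so that $P$ is the simple random walk), the volume $\Lambda = \{1,2,3\}$, the field $h_i = -2$, and the name `mean_by_iteration` are assumptions chosen for illustration only.

```python
def J(i, j):
    """Hypothetical coupling on Z: 2 on the diagonal, -1 between neighbours."""
    if i == j:
        return 2
    return -1 if abs(i - j) == 1 else 0

def mean_by_iteration(Lam, omega, h, sweeps=200):
    """Iterate m_i <- g_i + sum_j P(i,j) m_j on Lam, with m = omega outside Lam.

    Here P(i,j) = -J(i,j)/J(i,i) for i != j and g_i = -h_i/J(i,i),
    as in (13.15) and (13.16). Row sums of J vanish for this coupling,
    so the extra state "infinity" is never reached and is omitted.
    """
    sites = list(Lam) + list(omega)
    m = {i: 0.0 for i in Lam}
    for _ in range(sweeps):
        m = {
            i: -h[i] / J(i, i)
            + sum(-J(i, j) / J(i, i) * (m[j] if j in m else omega[j])
                  for j in sites if j != i)
            for i in Lam
        }
    return m

Lam = [1, 2, 3]
omega = {0: 0.0, 4: 0.0}        # zero boundary condition
h = {i: -2.0 for i in Lam}      # gives g_i = 1 for every i in Lam
m = mean_by_iteration(Lam, omega, h)
print(m)
```

With $g \equiv 1$ and $\omega = 0$, formula (13.17) reduces to the expected exit time of the walk from $\Lambda$, which for the simple random walk on $\{1,2,3\}$ started at $i$ equals $i(4-i)$; the iteration reproduces these values.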
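(13.18) Definition. Suppose $J : S \times S \to \mathbb{R}$ is symmetric and positive definite, and let $h \in \Omega$. The Gaussian specification $\gamma^{J,h}$ for $J$ and $h$ is then defined by the equations
$\gamma_\Lambda^{J,h}(A \mid \omega) = \begin{cases} Z_\Lambda^{J,h}(\omega)^{-1}\, \lambda_\Lambda\big(1_A \exp(-H_\Lambda^{J,h}) \mid \omega\big) & \text{if } \omega \in \Omega_J, \\ \delta_{0_\Lambda \omega_{S \setminus \Lambda}}(A) & \text{otherwise.} \end{cases}$
In the above, $\Lambda \in \mathcal{S}$, $\omega \in \Omega$ and $A \in \mathcal{F}$ are arbitrary, $H_\Lambda^{J,h}$ and $Z_\Lambda^{J,h}$ are respectively the Hamiltonian and the partition function with respect to the potential $\Phi^{J,h}$ at (13.11), $\delta_{0_\Lambda \omega_{S \setminus \Lambda}}$ is the Dirac measure at the configuration which is zero on $\Lambda$ and equal to $\omega$ on $S \setminus \Lambda$, and $\Omega_J$ is defined by (13.9). As $\Omega_J \in \mathcal{T}$, the same arguments as in Proposition (2.5) and Remark (1.32) show that $\gamma^{J,h}$ is indeed a specification. Proposition (13.13) asserts that each $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$ is a Gaussian field. Moreover, the definition of $\gamma^{J,h}$ on $\Omega \setminus \Omega_J$ has been tailored in such a way that
(13.19) $\mu(\Omega_J) = 1$ for all $\mu \in \mathcal{G}(\gamma^{J,h})$.
In other words, the Gibbs measures for $\gamma^{J,h}$ "feel" only the Gibbsian part of their specification $\gamma^{J,h}$. To prove (13.19) we assume there was some $\mu \in \mathcal{G}(\gamma^{J,h})$ with $\mu(\Omega_J) < 1$. Then $\nu \equiv \mu(\cdot \mid \Omega \setminus \Omega_J) \in \mathcal{G}(\gamma^{J,h})$ because $\Omega_J \in \mathcal{T}$. Thus $\nu(\sigma_\Lambda = 0) = \nu\gamma_\Lambda^{J,h}(\sigma_\Lambda = 0) = 1$ for all $\Lambda$, and therefore $\nu = \delta_0$, the Dirac measure at the zero configuration 0. This is impossible because $0 \in \Omega_J$ and $\nu(\Omega_J) = 0$.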
We are now ready to state two theorems on the representation of Gaussian fields as Gibbs measures. The first deals with the Markovian case in which $\Omega_J = \Omega$.
(13.20) Theorem. Let $\mu \in \mathcal{P}(\Omega, \mathcal{F})$ be a Gaussian field. Suppose that
(i) $\mu\big(\sigma_i \neq \mu(\sigma_i \mid \mathcal{T}_{\{i\}})\big) > 0$ for all $i \in S$, and
(ii) for each $i \in S$ there exists some $\partial i \in \mathcal{S}$ with $i \notin \partial i$ such that $\mu(\sigma_i \mid \mathcal{T}_{\{i\}}) = \mu(\sigma_i \mid \mathcal{F}_{\partial i})$ $\mu$-a.s.
Define $J : S \times S \to \mathbb{R}$ and $h \in \Omega$ in terms of the conditional covariances and the mean of $\mu$ as in Proposition (13.7). Then $\gamma^{J,h}$ is well-defined and $\mu \in \mathcal{G}(\gamma^{J,h})$.
Proof. Combine Proposition (13.7), Lemma (13.10), Proposition (13.13), and Theorem (1.33). □
In contrast to the preceding theorem, the next one will become important in the next section. It characterizes the Gibbsian Gauss fields in terms of their means and covariances. It implies that a centered Gauss field $\mu$ with covariance function $C$ is a Gibbs measure if and only if $C$, considered as an infinite matrix, has a positive definite inverse $J$ such that $\mu(\Omega_J) = 1$. We need to introduce some notation. For given $J : S \times S \to \mathbb{R}$ and $h \in \Omega$ we put
(13.21) $M_{J,h} = \Big\{m \in \Omega_J : h_i + \sum_{j \in S} J(i,j)\, m_j = 0 \text{ for all } i \in S\Big\}.$
Note that $M_{J,0}$ is a linear subspace of $\Omega = \mathbb{R}^S$, and $M_{J,h} = m + M_{J,0}$ for each $m \in M_{J,h}$.
(13.22) Theorem. Let $\mu$ be a Gaussian field with mean $m$ and covariance function $C$. Also, let $h \in \Omega$ and $J : S \times S \to \mathbb{R}$ be a positive definite symmetric function. Then the following conditions are equivalent.
(a) $\mu \in \mathcal{G}(\gamma^{J,h})$.
(b) $\mu(\Omega_J) = 1$, $m \in M_{J,h}$, and
$\sum_{j \in S} J(i,j)\, C(j,k) = \delta_{ik}$ $(i,k \in S)$,
where $\delta_{ik}$ is Kronecker's delta. (In either case, the sum above is absolutely convergent.)
Proof. (a) implies (b). At (13.19) we have already shown that $\mu(\Omega_J) = 1$. Consequently, the limits
$X_i = \lim_{\Lambda \in \mathcal{S}} \sum_{j \in \Lambda} J(i,j)\,\sigma_j$ $(i \in S)$
exist $\mu$-almost surely. Since $\mu$ is Gaussian, these limits also exist in the $L^2(\mu)$-sense and, a fortiori, in the $L^1(\mu)$-sense; see Corollary (13.A6). Hence
$\mu(X_i) = \lim_{\Lambda \in \mathcal{S}} \sum_{j \in \Lambda} J(i,j)\, m_j$ $(i \in S)$.
By virtue of a well-known theorem of Riemann, the existence of the limit on the right side implies that $m \in \Omega_J$. Similarly, we have
$\mu\big((X_i - \mu(X_i))(\sigma_k - m_k)\big) = \lim_{\Lambda \in \mathcal{S}} \sum_{j \in \Lambda} J(i,j)\, C(j,k)$ $(i,k \in S)$.
Since $\mu \in \mathcal{G}(\gamma^{J,h})$, we conclude from Proposition (13.13) that
$\xi_i^\mu = m_i(\{i\}, \cdot) = -J(i,i)^{-1}\Big(h_i + \sum_{j \neq i} J(i,j)\,\sigma_j\Big)$ $\mu$-a.s.
and therefore $\sigma_i - \xi_i^\mu = (h_i + X_i)/J(i,i)$ $\mu$-a.s. Taking expectations with respect to $\mu$, we obtain that $h_i + \mu(X_i) = 0$. When combined with our previous equation for $\mu(X_i)$, this shows that $m \in M_{J,h}$. We also obtain, using Proposition (13.13) in the last step, that
$\mu\big((X_i - \mu(X_i))(\sigma_k - m_k)\big) = J(i,i)\,\mu\big((\sigma_i - \xi_i^\mu)(\sigma_k - m_k)\big) = \delta_{ik}\, J(i,i)\,\mu\big(\sigma_i^2 - (\xi_i^\mu)^2\big) = \delta_{ik}\, J(i,i)\,\mu\big(\gamma_{\{i\}}(\sigma_i^2) - \gamma_{\{i\}}(\sigma_i)^2\big) = \delta_{ik}.$
Together with our previous equation for the covariance of $X_i$ and $\sigma_k$, this proves that $J$ is an inverse of $C$.
(b) implies (a). Since $\mu(\Omega_J) = 1$, the random variable
$\xi_i = -J(i,i)^{-1}\Big(h_i + \sum_{j \neq i} J(i,j)\,\sigma_j\Big)$
is well-defined $\mu$-almost surely, and we can write
$\sigma_i - \xi_i = J(i,i)^{-1} \sum_{j \in S} J(i,j)(\sigma_j - m_j)$
because $m \in M_{J,h}$. By Corollary (13.A6), the sum on the right also exists in $L^2(\mu)$. Thus $\mu(\sigma_i - \xi_i) = 0$ and
$J(i,i)\,\mu\big((\sigma_i - \xi_i)(\sigma_k - m_k)\big) = \sum_{j \in S} J(i,j)\, C(j,k) = \delta_{ik}$
for all $i, k \in S$. In particular, if $i \neq k$ then $\sigma_i - \xi_i$ and $\sigma_k$ are uncorrelated, and the proof of Lemma (13.10) has shown that in this case $\sigma_i - \xi_i$ and $\mathcal{T}_{\{i\}}$ must be independent. Hence
$\xi_i^\mu - \xi_i = \mu(\sigma_i - \xi_i \mid \mathcal{T}_{\{i\}}) = \mu(\sigma_i - \xi_i) = 0$ $\mu$-a.s.,
and this establishes equation (13.5). We also have
$J(i,i)\,\mu\big((\sigma_i - \xi_i^\mu)^2\big) = J(i,i)\,\mu\big((\sigma_i - \xi_i)(\sigma_i - m_i)\big) = 1$
for all $i$. Thus all hypotheses of Lemma (13.10) are satisfied. From this lemma and Theorem (1.33) we then conclude that $\mu \in \mathcal{G}(\gamma^{J,h})$. (Theorem (1.33) does not apply as it stands because $\gamma^{J,h}$ is a positive $\lambda$-specification on $\Omega_J$ only. However, since $\mu(\Omega_J) = 1$ we are allowed to modify the definition of $\gamma^{J,h}$ on $\Omega \setminus \Omega_J$ at (13.18) in such a way that $\gamma^{J,h}$ becomes a positive $\lambda$-specification. For example, this can be achieved by letting $\alpha$ be the standard normal distribution and putting $\gamma_\Lambda^{J,h}(\cdot \mid \omega) = \alpha^\Lambda \times \delta_{\omega_{S \setminus \Lambda}}$ on $\Omega \setminus \Omega_J$.) □
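The relation between $J$ and $C$ in Theorem (13.22) and the conditional variance in Lemma (13.10) can be made tangible in a finite toy analogue. The sketch below takes a three-site system (an assumption: in the text $S$ is infinite), inverts a hypothetical positive definite matrix `J` to get the covariance `C`, and then computes the conditional variance of each $\sigma_i$ given the other two spins as a Schur complement of `C`. The helper names `solve`, `inverse` and `conditional_variance` are illustrative.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # first usable pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def inverse(A):
    """Matrix inverse, obtained by solving A x = e_c for each unit vector e_c."""
    n = len(A)
    cols = [solve(A, [1 if r == c else 0 for r in range(n)]) for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(n)]

def conditional_variance(C, i):
    """Variance of coordinate i given all others: Schur complement of C."""
    rest = [k for k in range(len(C)) if k != i]
    C_rr = [[C[a][b] for b in rest] for a in rest]
    C_ri = [C[a][i] for a in rest]
    w = solve(C_rr, C_ri)  # regression coefficients of sigma_i on the other spins
    return C[i][i] - sum(wk * ck for wk, ck in zip(w, C_ri))

J = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # hypothetical interaction matrix
C = inverse(J)                               # covariance with J C = identity
variances = [conditional_variance(C, i) for i in range(3)]
print(C)
print(variances)
```

In this example every conditional variance equals $1/J(i,i) = 1/2$, in agreement with hypothesis (ii) of Lemma (13.10) and with the last display of the proof above.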
13.2 Gibbs measures for Gaussian specifications
In the preceding section we started from a given Gaussian field $\mu$ and sought a Gaussian specification $\gamma^{J,h}$ with $\mu \in \mathcal{G}(\gamma^{J,h})$. Here we shall take the opposite point of view: For a given Gaussian specification $\gamma^{J,h}$ we shall consider the associated set $\mathcal{G}(\gamma^{J,h})$ of Gibbs measures. We begin with a remark which shows the significance of the sets $M_{J,h}$ in (13.21) for the specifications $\gamma^{J,h}$ (provided these sets are non-empty). For each $m \in \Omega$ we let $\tau_m \in T$ denote the spin translation by $m$, i.e.,
$\tau_m \omega = \omega + m = (\omega_i + m_i)_{i \in S}$ $(\omega \in \Omega)$.
(13.23) Remark. Let $J : S \times S \to \mathbb{R}$ be symmetric and positive definite, and let $h, \tilde{h} \in \Omega$. Then the following conclusions hold.
(a) If $m \in M_{J,\tilde{h}}$ then $\gamma_\Lambda^{J,h+\tilde{h}}(\cdot \mid \tau_m \omega) = \tau_m\big(\gamma_\Lambda^{J,h}(\cdot \mid \omega)\big)$ for all $\Lambda \in \mathcal{S}$ and $\omega \in \Omega_J$.
(b) For each $m \in M_{J,h}$, $\mathcal{G}(\gamma^{J,h}) = \{\tau_m(\mu) : \mu \in \mathcal{G}(\gamma^{J,0})\}$.
(c) $\mathcal{G}(\gamma^{J,h})$ is preserved by the transformations $\tau_m$ with $m \in M_{J,0}$.
(d) If $M_{J,0}$ contains an element $m \neq 0$ then either $\mathcal{G}(\gamma^{J,h}) = \emptyset$, or $\mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$ is uncountable.
Proof. (a) Let $\Lambda \in \mathcal{S}$, $\omega \in \Omega_J$ and $m \in M_{J,\tilde{h}}$ be given. From Proposition (13.13) we conclude that $\gamma_\Lambda^{J,h+\tilde{h}}(\cdot \mid \tau_m \omega)$ and $\tau_m(\gamma_\Lambda^{J,h}(\cdot \mid \omega))$ are Gaussian fields with the same covariance function. It is therefore sufficient to prove that the mean values of these fields coincide. This is immediately checked, using the explicit expression for these mean values. (b), (c) In view of (13.19), these statements are obtained by an obvious modification of Remark (5.10). (d) $M_{J,0}$ is a linear space. Therefore, if $M_{J,0} \neq \{0\}$ then $\gamma^{J,h}$ exhibits a continuous symmetry, because of (c). The assertion thus follows from Theorem (7.26) and Remark (7.2). □
Our first theorem on $\mathcal{G}(\gamma^{J,h})$ will show that a probability measure $\mu$ belongs to $\mathcal{G}(\gamma^{J,h})$ if and only if $\mu$ admits a representation as a random translation of a particular Gaussian field $\mu_C$. This means that $\mu = \int \nu(dm)\, \tau_m(\mu_C)$ for a suitable $\nu \in \mathcal{P}(\Omega, \mathcal{F})$. Obviously, the expression on the right is just the convolution $\mu_C * \nu$ of $\mu_C$ and $\nu$. (Recall that for any two probability measures $\mu_1, \mu_2$ on $(\Omega, \mathcal{F})$ the convolution $\mu_1 * \mu_2$ is defined by the formula
$\mu_1 * \mu_2(f) = \int \mu_1(d\xi) \int \mu_2(d\eta)\, f(\xi + \eta).$
Here $f$ is any bounded measurable function, and $\xi + \eta = (\xi_i + \eta_i)_{i \in S}$.) We remind the reader that the matrix $\mathcal{J}_\Lambda$ was defined in (13.12).
(13.24) Theorem. Let $J : S \times S \to \mathbb{R}$ be symmetric and positive definite, and let $h \in \Omega$. Suppose that $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$. Then the limits (13.25)
$C(i,j) = \lim_{\Lambda \in \mathcal{S}} \mathcal{J}_\Lambda^{-1}(i,j)$ $(i,j \in S)$
exist, and a random field $\mu$ belongs to $\mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$ if and only if $\mu$ is a Gaussian field with covariance function $C$ and a mean $m \in M_{J,h}$. Consequently,
$\mathcal{G}(\gamma^{J,h}) = \{\mu_C * \nu : \nu \in \mathcal{P}(\Omega, \mathcal{F}),\ \nu(M_{J,h}) = 1\},$
where $\mu_C$ is the unique centered Gaussian field with covariance function $C$.
Proof. Theorem (7.26) ensures that $\mathrm{ex}\,\mathcal{G}(\gamma^{J,h}) \neq \emptyset$. So let $\mu \in \mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$. If $(\Lambda_n)_{n \ge 1}$ is any increasing cofinal sequence in $\mathcal{S}$, then Theorem (7.12) implies the existence of an $\omega \in \Omega_J$ such that $\mu = \lim_{n \to \infty} \gamma_{\Lambda_n}^{J,h}(\cdot \mid \omega)$ in the topology of local convergence. In view of Propositions (13.13) and (13.A5), this implies that the limiting covariance function $C$ at (13.25) exists and $\mu$ is a Gaussian field with this covariance function. Theorem (13.22) then shows that the mean $m$ of $\mu$ belongs to $M_{J,h}$. Now let $\mu_C$ be the centered Gauss field with covariance function $C$. The random field $\tau_m(\mu_C) = \mu_C * \delta_m$ is then again Gaussian and has the same mean and covariance function as $\mu$. Hence $\mu = \mu_C * \delta_m$. In the above, we have shown that
$\mathrm{ex}\,\mathcal{G}(\gamma^{J,h}) \subset \{\mu_C * \delta_m : m \in M_{J,h}\}.$
In particular, this implies that $\mu_C = \tau_{-m}(\mu)$ for some $m \in M_{J,h}$ and $\mu \in \mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$. Remarks (13.23)(b) and (7.2) thus imply that $\mu_C \in \mathrm{ex}\,\mathcal{G}(\gamma^{J,0})$. The same remarks then allow us to conclude further that $\mu_C * \delta_m = \tau_m(\mu_C) \in \mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$ for all $m \in M_{J,h}$. Hence
$\mathrm{ex}\,\mathcal{G}(\gamma^{J,h}) = \{\mu_C * \delta_m : m \in M_{J,h}\}.$
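The final statement of the theorem now follows from Theorem (7.26). □
Theorem (13.24) provides a complete description of $\mathcal{G}(\gamma^{J,h})$: $\mathcal{G}(\gamma^{J,h})$ is isomorphic to the set $\mathcal{P}(M_{J,h}, \mathcal{F}|_{M_{J,h}})$ of all probability measures on $M_{J,h}$, provided $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$. We are thus left with the existence problem. Combining Theorems (13.22) and (13.24) with Remark (13.23)(b) we see that $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$ if and only if $M_{J,h} \neq \emptyset$ and $J$ has an inverse $C$ such that $\mu_C(\Omega_J) = 1$. (As before, $\mu_C$ stands for the unique centered Gauss field with covariance $C$. The existence of $\mu_C$ is guaranteed by Proposition (13.A7).) Moreover, this inverse $C$ should be given by (13.25). If $J$ has finite range, the condition $\mu_C(\Omega_J) = 1$ is trivial. Thus in this case we have $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$ if and only if $M_{J,h} \neq \emptyset$ and the limit (13.25) exists. The next theorem is a slight improvement of this statement.
(13.26) Theorem. Let $h \in \Omega$ and $J : S \times S \to \mathbb{R}$ be a positive definite symmetric function such that $\{j \in S : J(i,j) \neq 0\} \in \mathcal{S}$ for all $i \in S$. Then $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$ if and only if $M_{J,h} \neq \emptyset$ and
(13.27) $\sup_{\Lambda \in \mathcal{S}:\, \Lambda \ni i} \mathcal{J}_\Lambda^{-1}(i,i) < \infty$ for all $i \in S$.
Proof. In view of Theorem (13.24), we need only prove the sufficiency. We will show that condition (13.27) implies the existence of the limits in (13.25). To this end we shall establish the following monotonicity property: If $\emptyset \neq \Lambda \subset \Delta \in \mathcal{S}$ and $t = (t_i)_{i \in \Lambda} \in \mathbb{R}^\Lambda$ then
$\sum_{i,j \in \Lambda} \mathcal{J}_\Lambda^{-1}(i,j)\, t_i t_j \le \sum_{i,j \in \Lambda} \mathcal{J}_\Delta^{-1}(i,j)\, t_i t_j.$
To see this we recall from Proposition (13.13) that $\mathcal{J}_\Lambda^{-1}$ is the covariance of $\gamma_\Lambda^{J,h}(\cdot \mid \omega)$ when $\omega \in \Omega_J$. Therefore we have, putting $f = \sum_{i \in \Lambda} t_i \sigma_i$,
$\sum_{i,j \in \Lambda} \mathcal{J}_\Lambda^{-1}(i,j)\, t_i t_j = \gamma_\Lambda (f - \gamma_\Lambda f)^2$ on $\Omega_J$.
(We omit the superscripts $J$ and $h$ of $\gamma$, for simplicity.) As the function on the right is constant, it coincides on $\Omega_J$ with
$\gamma_\Delta \gamma_\Lambda (f - \gamma_\Lambda f)^2 = \gamma_\Delta \gamma_\Lambda (f^2) - \gamma_\Delta (\gamma_\Lambda f)^2.$
By Jensen's inequality, this is at most
$\gamma_\Delta \gamma_\Lambda (f^2) - (\gamma_\Delta \gamma_\Lambda f)^2 = \gamma_\Delta (f^2) - (\gamma_\Delta f)^2 = \sum_{i,j \in \Lambda} \mathcal{J}_\Delta^{-1}(i,j)\, t_i t_j.$
This completes the proof of the monotonicity property. Our first conclusion is that $\mathcal{J}_\Lambda^{-1}(i,i)$ is an increasing function of $\Lambda$. Assumption (13.27) thus implies that $C(i,i) = \lim_{\Lambda} \mathcal{J}_\Lambda^{-1}(i,i)$ exists for all $i \in S$. Secondly, if $i \neq j$ then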
$\mathcal{J}_\Lambda^{-1}(i,i) + \mathcal{J}_\Lambda^{-1}(j,j) + 2 \mathcal{J}_\Lambda^{-1}(i,j)$
is an increasing function of $\Lambda$. But (13.27) implies that this function is bounded. For, the Cauchy-Schwarz inequality gives
$\mathcal{J}_\Lambda^{-1}(i,j)^2 \le \mathcal{J}_\Lambda^{-1}(i,i)\, \mathcal{J}_\Lambda^{-1}(j,j)$ $(i,j \in \Lambda)$
because $\mathcal{J}_\Lambda^{-1}$ is a covariance. We thus conclude that the limiting function $C$ at (13.25) exists. Since $J$ has finite range, $C$ satisfies
$\sum_{j \in S} J(i,j)\, C(j,k) = \lim_{\Lambda \in \mathcal{S}} \sum_{j \in \Lambda} J(i,j)\, \mathcal{J}_\Lambda^{-1}(j,k) = \delta_{ik}$
for all $i, k \in S$. We also have $\Omega_J = \Omega$. Moreover, $C$ is evidently nonnegative definite. Thus there exists a centered Gauss field $\mu_C$ with covariance $C$; cf. (13.A7). $\mu_C$ satisfies condition (b) of Theorem (13.22) with $h = 0$. Thus $\mu_C \in \mathcal{G}(\gamma^{J,0})$, and Remark (13.23)(b) implies that $\tau_m(\mu_C) \in \mathcal{G}(\gamma^{J,h})$ for each $m \in M_{J,h}$. This completes the proof. □
(13.28) Comment. Suppose $J : S \times S \to \mathbb{R}$ has finite range and satisfies the hypotheses (i) to (iii) of Comment (13.14). Let $P$ be the stochastic matrix of (13.15). We claim that condition (13.27) is equivalent to the statement that each $i \in S$ is transient for $P$. Of course, the latter means (in the notation of Comment (13.14)) that
$\mu_P^i\Big(\sum_{n=0}^{\infty} 1_{\{X_n = i\}}\Big) < \infty$
for all $i \in S$. Indeed, it is easily checked that
$\mathcal{J}_\Lambda^{-1}(i,j) = \mu_P^i\Big(\sum_{n=0}^{\tau_\Lambda - 1} 1_{\{X_n = j\}}\Big) \Big/ J(j,j)$ $(i,j \in \Lambda)$.
(For example, you might set $\omega = 0$ and $g_k = \delta_{kj}$ in (13.17) and insert the result in Proposition (13.13).) Thus (13.27) is equivalent to the condition
$\sup_{\Lambda \in \mathcal{S}} \mu_P^i\Big(\sum_{n=0}^{\tau_\Lambda - 1} 1_{\{X_n = i\}}\Big) < \infty$ $(i \in S)$.
Since $\sup_\Lambda \tau_\Lambda$ is the hitting time of the additional site $\infty$, the claim follows. We thus obtain the result that $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$ if and only if $M_{J,h} \neq \emptyset$ and all $i \in S$ are transient for $P$. In this case, the covariance function $C$ of the Gibbs measures for $\gamma^{J,h}$ is given by
$C(i,j) = \sum_{n \ge 0} P^n(i,j) / J(j,j)$ $(i,j \in S)$.
i f | i - ; i = i, ifi=j, otherwise.
We are thus in the setting of Comment (13.28). The stochastic matrix P at (13.15) is just the transition matrix of the simple random walk on Zd. By Pôlya's well-known theorem, P is transient if and only if d ^ 3. Thus (&(yJ'h) = 0 for all h when d ^ 2, whilst ex <3{yJ,h) is isomorphic to M, h when d ^ 3. The set Mj 0 contains all constant configurations. By Remark (13.23), this implies that the specifications yJ,h exhibit the continuous symmetry co -> (co; -(- Oie s (t e R). It is also easy to see that MJh ^ 0 for all h. Thus, if d ^ 3 then yJ'h exhibits a continuous symmetry breaking. We shall return to this example in (13.43). o (13.30) Example. Let S = Z + be the set of nonnegative integers and (c„)„â0 be any sequence of positive real numbers. We define -c.-Aj
J(iJ) = i
if \i-j\
= 1,
Ci + c^
if
0
otherwise.
i=j,
272
Gaussian fields
(Weputc.j = 0.)Then Y.JiUj) = 0 for alii. The stochastic matrix P a t (13.15) is given by P(i j) = J c ' A j/( c i }0
+ C;
-^
if | !
~-J'l = *' otherwise.
This is the transition matrix of a so-called birth and death process on Z + . A theorem of Harris (1952) states that P is transient if and only if
" P(fc,fc-1) ntitÀP{k,k+l)
C0
"
(See also Breiman (1968), Proposition 7.37, for example.) The product term in the sum above equals c0/cn. Thus P is transient if and only if £ l/c„ < oo. Suppose now we are given any h e Q. It is easily seen that each m0 e R can be continued to a unique m e Mv>Ä. Thus MJh is isomorphic to R. In particular, Mv>0 consists of all constant configurations and yJ,h exhibits the continuous symmetry &)-•(&); + t)ieS(te R). We thus arrive at the following conclusion: (&(yJ,H) is non-empty if and only if £ l/c„ < oo, and in this case ex (&(yJ'H) is isomorphic to R. o "êl If J has infinite range, the existence problem is more difficult. A necessary and sufficient condition for existence in the infinite range case is only known when S = Zd and J is homogeneous; see Theorem (13.36) below. In the nonhomogeneous case, the next theorem provides a sufficient condition, at least. (13.31) Theorem. Let J: S x S -> R be a symmetric positive definite function, and let he Q be such that Ms h j= 0. Suppose J satisfies the conditions 1
$c \equiv \sup_{i \in S} J(i,i)^{-1} \sum_{j \neq i} |J(i,j)| < 1$
and
$\sum_{j \neq i} |J(i,j)|\, J(j,j)^{-1/2} < \infty$ $(i \in S)$.
Then $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$.
Proof. We introduce the matrices
$P_V(i,j) = \begin{cases} -J(i,j)/J(i,i) & \text{if } i,j \in V,\ i \neq j, \\ 0 & \text{otherwise} \end{cases}$
with $V \subset S$, and $Q(i,j) = |P_S(i,j)|$, $i,j \in S$. By hypothesis, the powers $Q^n$ of $Q$ satisfy $\sum_{j \in S} Q^n(i,j) \le c^n$ for all $i \in S$ and $n \ge 0$. Thus
$\sum_{n \ge 0} \sum_{j \in S} Q^n(i,j) \le (1 - c)^{-1} < \infty$ $(i \in S)$.
Since $|P_V^n(i,j)| \le Q^n(i,j)$ for all $n, i, j$, the matrices
$C_V(i,j) = \sum_{n \ge 0} P_V^n(i,j) / J(j,j)$ $(i,j \in S,\ V \subset S)$
are well-defined. It is easily checked that $C_\Lambda(i,j) = \mathcal{J}_\Lambda^{-1}(i,j)$ when $\Lambda \in \mathcal{S}$ and $i,j \in \Lambda$. The dominated convergence theorem therefore implies that the limiting covariance $C$ of (13.25) exists and is equal to $C_S$. (13.25) implies that $C$ is nonnegative definite. Thus there exists a centered Gauss field $\mu_C$ with covariance $C$. $\mu_C$ satisfies condition (b) of Theorem (13.22) with $h = 0$. Indeed, it is obvious that $C = C_S$ is an inverse of $J$, and the property $\mu_C(\Omega_J) = 1$ follows from the estimate
$\mu_C\Big(\sum_{j \in S} |J(i,j)\,\sigma_j|\Big) = \sum_{j \in S} |J(i,j)|\, \mu_C(|\sigma_j|) \le \sum_{j \in S} |J(i,j)|\, C(j,j)^{1/2} \le \sum_{j \in S} |J(i,j)|\, J(j,j)^{-1/2} \Big(\sum_{n \ge 0} Q^n(j,j)\Big)^{1/2} \le (1 - c)^{-1/2} \sum_{j \in S} |J(i,j)|\, J(j,j)^{-1/2} < \infty.$
Theorem (13.22) and Remark (13.23)(b) thus show that $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$. □
Using Theorem (13.36) below, one can easily see that the hypothesis of Theorem (13.31) is far from necessary. For example, if $d = 1$ then the function
$J(i,j) = \begin{cases} 1 & \text{when } i = j, \\ 1/2 & \text{when } |i - j| = 1, \\ 1/4 & \text{when } |i - j| = 2, \\ 0 & \text{otherwise} \end{cases}$
violates the condition of Theorem (13.31) but satisfies the existence condition of Theorem (13.36). (The latter holds because $\hat{J} \ge 1/4$.)
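Both claims about this example are easy to confirm numerically. The sketch below evaluates the constant $c$ of Theorem (13.31) and scans the Fourier transform $\hat{J}(z_p) = \sum_i J(i) \cos \pi p i$ on a grid in $]-1,1]$; the grid resolution and the variable names are arbitrary choices made for illustration.

```python
import math

# Homogeneous coupling of the example, indexed by the difference k = i - j.
J = {0: 1.0, 1: 0.5, -1: 0.5, 2: 0.25, -2: 0.25}

# Constant c of Theorem (13.31): sup_i J(i,i)^{-1} * sum_{j != i} |J(i,j)|.
c = sum(abs(v) for k, v in J.items() if k != 0) / J[0]

def J_hat(p):
    """Fourier transform of J at z_p = exp(i*pi*p), real because J is even."""
    return sum(v * math.cos(math.pi * p * k) for k, v in J.items())

# Grid of 2000 points in ]-1, 1].
grid = [-1 + 2 * (n + 1) / 2000 for n in range(2000)]
min_J_hat = min(J_hat(p) for p in grid)

print(c)
print(min_J_hat)
```

Here $c$ exceeds 1, so the hypothesis of Theorem (13.31) fails, whereas the grid minimum of $\hat{J}$ stays at or above 1/4 up to rounding. This agrees with the closed form $\hat{J}(z_p) = (\cos \pi p + 1/2)^2 + 1/4$, which also shows that $\hat{J}^{-1}$ is bounded and hence integrable.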
13.3 The homogeneous case
Here we assume that $S = \mathbb{Z}^d$, the $d$-dimensional integer lattice. We shall look at homogeneous Gaussian specifications. In view of Proposition (13.13), a Gaussian specification $\gamma^{J,h}$ is homogeneous if and only if the underlying quantities $J$ and $h$ are homogeneous, in that there exists an even function $J' : S \to \mathbb{R}$ and a real number $h'$ such that $J(i,j) = J'(i - j)$ and $h_i = h'$ for all $i, j \in S$. In the following, we shall assume that $J$ and $h$ are homogeneous. We shall replace $J$ and $h$ by $J'$ and $h'$, and we shall drop the primes. In other words, we shall use the letters $J$ and $h$ in a new meaning which can be related to their earlier meaning in a unique way. It is therefore clear what the positive definiteness of an even function $J : S \to \mathbb{R}$ means, and we can still use the notations $\Omega_J$, $M_{J,h}$, and $\gamma^{J,h}$ when $J$ and $h$ have their new meaning. The sets $\Omega_J$ and $M_{J,h}$ are then invariant under the shifts of $S$, and the Gaussian specifications $\gamma^{J,h}$ are homogeneous. As a consequence of our homogeneity assumptions, we can now use the efficient tools of Fourier analysis. We let $K = \{z \in \mathbb{C} : |z| = 1\}$ denote the complex unit circle, and we put $G = K^d$. Note that $G$ is the dual group of $S = \mathbb{Z}^d$. We shall write
(13.32) $z^i = z_1^{i_1} \cdots z_d^{i_d}$
whenever $z = (z_1, \ldots, z_d) \in G$ and $i = (i_1, \ldots, i_d) \in S$. The normalized Haar measure on $G$ will be denoted by $dz$. By definition, $dz$ is the image of the normalized Lebesgue measure on $]-1,1]^d$ under the group isomorphism
(13.33) $]-1,1]^d \ni p = (p_1, \ldots, p_d) \to z_p = (e^{\mathrm{i}\pi p_1}, \ldots, e^{\mathrm{i}\pi p_d}).$
Now let $J : S \to \mathbb{R}$ be any function which is absolutely summable, in that
(13.34) $\sum_{i \in S} |J(i)| < \infty.$
We then can define its Fourier transform
(13.35) $\hat{J}(z) = \sum_{i \in S} z^i J(i)$ $(z \in G)$.
$\hat{J}$ is a continuous function on $G$. If $J$ is even then $\hat{J}$ is real and satisfies
$\hat{J}(z_p) = \sum_{i \in S} J(i) \cos \pi p \cdot i$ $(p \in\, ]-1,1]^d)$.
(The dot denotes scalar product.) In this case, $J$ is positive definite if and only if $\hat{J}$ is nonnegative and not identically zero. A proof of this fact is given in (13.A8) of the appendix. If $\hat{J} \ge 0$, we can define its reciprocal $\hat{J}^{-1}$ by using the convention $1/0 = \infty$. The main theorem on homogeneous Gauss specifications connects the existence of a Gibbs measure for $\gamma^{J,h}$ with the integrability of $\hat{J}^{-1}$. Recall from (5.2)(1), (5.12) and (5.13) that $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ resp. $\mathcal{G}_\Theta(\gamma^{J,h})$ stand for the sets of all shift-invariant random fields resp. Gibbs measures for $\gamma^{J,h}$.
(13.36) Theorem. Let $S = \mathbb{Z}^d$, $h \in \mathbb{R}$, and $J : S \to \mathbb{R}$ be a positive definite even function which satisfies (13.34). Then $\mathcal{G}(\gamma^{J,h}) \neq \emptyset$ if and only if $M_{J,h} \neq \emptyset$ and
$\int_G \hat{J}(z)^{-1}\, dz < \infty.$
In this case we have
$\mathcal{G}(\gamma^{J,h}) = \{\mu_C * \nu : \nu \in \mathcal{P}(\Omega, \mathcal{F}),\ \nu(M_{J,h}) = 1\}$
and
$\mathcal{G}_\Theta(\gamma^{J,h}) = \{\mu_C * \nu : \nu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}),\ \nu(M_{J,h}) = 1\}.$
Here $\mu_C$ stands for the centered Gauss field with covariance function
(13.37) $C(i,j) = \int_G z^{i-j}\, \hat{J}(z)^{-1}\, dz$ $(i,j \in S)$.
HC(T \JU - *m) ^ z \Ju - i)\cuj)112 \jeS
/
jeS
= c(o,oy>2 x w)i
Thus Hc(Qj) = 1- Moreover, we can write X J(7 - i)C(j, k) = X •/(./ - 0 1 ^ ^ z " ^ J ( z ) - 1 dz = |j(z)zk-iJ(z)^1dz
= Jzk-jdz = <5.-t
for all i, fceS. We have used that 0 < J < oo almost surely. Therefore, if m e Mj Ä then Theorem (13.22) ensures that [ic*ôme <${yJ'h). 2) Conversely, suppose that <$(yJ,h) # 0. Theorem (13.24) then shows that Mjh # 0 and the limit C in (13.25) exists. C is evidently homogeneous and nonnegative definite. A well-known lemma of Herglotz (cf. Proposition (13.A9)) thus ensures the existence of a finite measure a on G such that (13.38)
C(i,j) = J z^a(dz)
(ij e S).
To identify a we use the fact that C is an inverse of J in the sense of Theorem (13.22). (This theorem applies because Theorem (13.24) shows that '&(yJ'h) contains a Gaussian field.) For each k e S we have | zkJ(z)a(dz) = J X J(j)z k ^a(dz) = XJ(j)C(J,/c) = <50k = j V d z . Using the fact that the trigonometric polynomials on G are dense in the set of all continuous functions on G, we conclude that J(z)a(dz) = dz. We can thus write
$\alpha(dz) = 1_{\{\hat{J} > 0\}}(z)\,\alpha(dz) + 1_{\{\hat{J} = 0\}}(z)\,\alpha(dz) = \hat{J}(z)^{-1}\, dz + 1_{\{\hat{J} = 0\}}(z)\,\alpha(dz).$
Since $\alpha(G) = C(0,0) < \infty$, the integrability of $\hat{J}^{-1}$ follows.
3) To prove the final statement we will show that the right sides of equations (13.25) and (13.37) coincide. In view of (13.38), this will follow provided we can show that $\alpha(dz) = \hat{J}(z)^{-1}\, dz$. In Step 1) above we have seen that $\mathcal{G}(\gamma^{J,h})$ contains a Gauss field $\mu$ with the covariance in (13.37). Let
$\mu = \int \nu\, w(d\nu)$
be the extreme decomposition of $\mu$. Combining Theorem (13.24) and Step 2) above, we obtain that
$\nu(\sigma_0^2) - \nu(\sigma_0)^2 = \int \hat{J}(z)^{-1}\, dz + \alpha(\hat{J} = 0)$
for all $\nu \in \mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$. We can thus write, using Jensen's inequality,
$\int \hat{J}(z)^{-1}\, dz = \mu(\sigma_0^2) - \mu(\sigma_0)^2 = \int \nu(\sigma_0^2)\, w(d\nu) - \Big[\int \nu(\sigma_0)\, w(d\nu)\Big]^2 \ge \int \big[\nu(\sigma_0^2) - \nu(\sigma_0)^2\big]\, w(d\nu) = \int \hat{J}(z)^{-1}\, dz + \alpha(\hat{J} = 0).$
Hence $\alpha(\hat{J} = 0) = 0$, and this implies that the right sides of (13.25) and (13.37) are identical. The representation of $\mathcal{G}(\gamma^{J,h})$ thus follows from Theorem (13.24). To obtain the characterization of $\mathcal{G}_\Theta(\gamma^{J,h})$ we need to show that a Gibbs measure $\mu_C * \nu \in \mathcal{G}(\gamma^{J,h})$ is shift-invariant if and only if $\nu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$. This follows from Corollary (7.28) because $\nu$ can be identified with the probability measure $w$ on $\mathrm{ex}\,\mathcal{G}(\gamma^{J,h})$ which represents $\mu_C * \nu$. The proof is thus complete. □
Theorem (13.36) has several implications which will be discussed now. We start with a remark on the sets $M_{J,h}$.
(13.39) Remark. Let $h \in \mathbb{R}$ and $J : S \to \mathbb{R}$ be even and positive definite. Suppose $J$ has finite range, in that $J(i) = 0$ when $|i|$ is large enough. Then $M_{J,h} \neq \emptyset$. Moreover, if $J(i) \neq 0$ for at least one $i \neq 0$ then $M_{J,h}$ is uncountable.
Proof. 1) For each $p \in\, ]-1,1]^d$ we let $z_p$ be defined by (13.33), and we look at the function $\tilde{J}(p) = \hat{J}(z_p)$ on $]-1,1]^d$. $\tilde{J}$ is even, analytic, and not identically zero. Hence there exists an $n = (n_1, \ldots, n_d) \in \mathbb{Z}_+^d$ such that
$c(n) \equiv \Big(\frac{\partial}{\partial p_1}\Big)^{n_1} \cdots \Big(\frac{\partial}{\partial p_d}\Big)^{n_d} \tilde{J}(p)\Big|_{p=0} \neq 0.$
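$$c(n) = \Big(\frac{\partial}{\partial p_1}\Big)^{n_1}\cdots\Big(\frac{\partial}{\partial p_d}\Big)^{n_d}\tilde J(p)\Big|_{p=0} \ne 0.$$
The number $\|n\| = n_1 + \cdots + n_d$ is necessarily even. Without loss we can assume that $\|n\|$ is minimal. Let $m \in \Omega$ be defined by
$$m_i = -c(n)^{-1}\,h\,\pi^{\|n\|}(-1)^{\|n\|/2}\, i_1^{n_1}\cdots i_d^{n_d} \qquad (i \in S).$$
Then $m \in M_{J,h}$ because
$$\sum_{j\in S} J(j-i)\,m_j = -c(n)^{-1} h \Big(\frac{\partial}{\partial p_1}\Big)^{n_1}\cdots\Big(\frac{\partial}{\partial p_d}\Big)^{n_d} \sum_{j\in S} J(j-i)\,z_p^{\,j}\,\Big|_{p=0} = -c(n)^{-1} h \Big(\frac{\partial}{\partial p_1}\Big)^{n_1}\cdots\Big(\frac{\partial}{\partial p_d}\Big)^{n_d}\big[z_p^{\,i}\,\tilde J(p)\big]\Big|_{p=0} = -h$$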
for all $i \in S$.

2) We next assume that $J(i) \ne 0$ for some $i \ne 0$. Without loss we can assume that $i_1 \ne 0$. Let $N = \max\{i_1 : J(i) \ne 0\}$, and let $z_2, \dots, z_d$ be any complex numbers $\ne 0$. The function
$$f(w) = \sum_{i\in S} J(i)\, w^{i_1+N} z_2^{i_2}\cdots z_d^{i_d} \qquad (w \in \mathbb{C})$$
on $\mathbb{C}$ is a non-constant polynomial. $z_2, \dots, z_d$ can be chosen in such a way that $f(0) \ne 0$. By virtue of the fundamental theorem of algebra, $f$ has a root $z_1 \ne 0$. Consequently, there exists some $z \in (\mathbb{C}\setminus\{0\})^d$ such that $\hat J(z) = 0$. Here $\hat J$, as defined at (13.35), is considered to be a function on $(\mathbb{C}\setminus\{0\})^d$. We let $\mathrm{Re}(w)$ denote the real part of a complex number $w$, and we put $m = (\mathrm{Re}(z^i))_{i\in S}$. Then $m \in M_{J,0}$ because
$$\sum_{j\in S} J(j-i)\,m_j = \mathrm{Re}\big(z^i \hat J(z)\big) = 0 \qquad (i \in S).$$
Thus $M_{J,0}$ contains a nonzero element. Since $M_{J,0}$ is a linear space and $M_{J,h} = m' + M_{J,0}$ for every $m' \in M_{J,h}$, we conclude that $M_{J,h}$ is uncountable. □
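Step 2) of the proof above can be traced numerically in the simplest case. The following Python sketch takes $d = 1$ and a hypothetical finite-range interaction $J(0) = 2.5$, $J(\pm1) = -1$ (an assumption chosen for illustration; the variable names are likewise not from the text). It finds the roots of the polynomial $f(w) = \sum_i J(i)\,w^{i+N}$ with `numpy.roots` and checks that $m_i = \mathrm{Re}(z^i)$ satisfies $\sum_j J(j-i)\,m_j = 0$. Here both roots lie off the unit circle, so the resulting element of $M_{J,0}$ is unbounded.

```python
import numpy as np

# Hypothetical even, finite-range interaction on Z (an assumption for illustration).
J = {-1: -1.0, 0: 2.5, 1: -1.0}
N = max(J)  # N = max{i : J(i) != 0}

# f(w) = sum_i J(i) w^(i + N); numpy.roots expects the highest power first.
coeffs = [J.get(i, 0.0) for i in range(N, -N - 1, -1)]
roots = sorted((complex(r) for r in np.roots(coeffs)), key=abs)

z = roots[-1]  # a nonzero root of f, hence a zero of J_hat on C \ {0}

def m(i):
    """Candidate element of M_{J,0}: m_i = Re(z^i)."""
    return (z ** i).real

def residual(i):
    """sum_j J(j - i) m_j, rewritten as sum_k J(k) m_{i + k}."""
    return sum(v * m(i + k) for k, v in J.items())

residuals = [residual(i) for i in range(-5, 6)]
print(roots, max(abs(r) for r in residuals))
```

Real multiples of this $m$, and of the configuration built from the other root, already give uncountably many elements of $M_{J,0}$.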
Combining the preceding remark with Theorem (13.36) we see that only three cases are possible when J has finite range. Either
m = ( — h/J{l))ieS Remark (13.23)(b) thus shows that there is a one-to-one correspondence between ^&{yJ'h) and ^&(yJ,°). On the other hand, if h ^ 0 and J(l) = 0 then Ms h cannot contain a constant element, and this implies that ^&{yJ'h) does not contain any ß such that ^(|
(i G S)
jeS
exist almost surely and are square integrable. This is because 2N
E \JU - i)oj\ ' ) = /
E
\J(j-i)J{k-i)\n(\
j.keS
2
E UU)\
<
OO.
The same argument as in the proof of Theorem (13.22) therefore shows that the covariance function $C$ of $\mu$ is an inverse of $J$ in the sense of this theorem. Moreover, the constant mean $m$ of $\mu$ belongs to $M_{J,0}$. Since $\hat J(1) \ne 0$, we conclude that $m = 0$. Further, since $C$ is homogeneous the Herglotz lemma (13.A9) ensures the existence of a measure $\alpha$ on $G$ such that equation (13.38) holds. The argument following (13.38) then implies that $\alpha(dz) = \hat J(z)^{-1}\,dz$. Now let $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ be such that $\mu = \mu^C * \nu$. Then $\nu(\sigma_i) = \mu(\sigma_i) - \mu^C(\sigma_i) = 0$ and
$$\int \hat J(z)^{-1}\,dz = \mu(\sigma_i^2) - \mu(\sigma_i)^2 = \mu^C(\sigma_i^2) - \mu^C(\sigma_i)^2 + \nu(\sigma_i^2) - \nu(\sigma_i)^2 = \int \hat J(z)^{-1}\,dz + \nu(\sigma_i^2)$$
for all $i$. Hence $\nu(\sigma_i = 0 \text{ for all } i) = 1$ and $\mu = \mu^C$. □
As a matter of fact, the second assertion of the preceding corollary can be improved. If $\hat J$ has no root in $G$ then, by a classical result of Wiener (1932), the absolute summability of $J$ implies that the covariance function $C$ at (13.37) is also absolutely summable, in that
$$\sum_{j\in S} |C(0,j)| < \infty.$$
(A modern approach to Wiener's theorem using Gelfand's representation of commutative Banach algebras can be found in Loomis (1953), for example.) The absolute summability of $C$ then implies that 0 is the only bounded element of $M_{J,0}$. Indeed, if $m \in M_{J,0}$ is bounded then
$$\sum_{j,k\in S} |J(k-j)\,C(j,i)\,m_k| < \infty$$
and thus
$$m_i = \sum_{k\in S}\Big(\sum_{j\in S} J(k-j)\,C(j,i)\Big) m_k = \sum_{j\in S} C(j,i)\Big(\sum_{k\in S} J(k-j)\,m_k\Big) = 0$$
for all $i \in S$. We thus conclude from Theorem (13.36) that there exists only one $\mu \in$
Dobrushin (1980): If $\hat J$ has no root in $G$ and $J(i) \sim c\,|i|^{-a}$ as $|i| \to \infty$ with suitable numbers $c \ne 0$ and $a > d$ then $\lim_{|j|\to\infty} |j|^a C(0,j)$ exists. The same interchange of summations as above then allows us to conclude that $M_{J,0} = \{0\}$. Hence $\mathcal{G}(\gamma^{J,0}) = \{\mu^C\}$. This result thus provides us with a non-trivial example of uniqueness, even without any temperedness condition as above. It also shows that Remark (13.39) fails when $J$ has infinite range.

Turning to the next corollary of Theorem (13.36), we assume that $\hat J$ has a root $z \in G$. Step 2 in the proof of Remark (13.39) then shows that $m = (\mathrm{Re}(z^i))_{i\in S}$ is a bounded non-zero element of $M_{J,0}$. Applying Corollary (9.24) to the dissipative continuous symmetry group $(\tau_{tm})_{t\in\mathbb{R}}$, we can thus conclude that $\mathcal{G}(\gamma^{J,h}) = \emptyset$ for all $h \in \mathbb{R}$ when $d \le 2$ and $J$ satisfies the decay condition (9.21) resp. (9.35). The same result can be derived directly from Theorem (13.36). In fact, when $d = 1$ we can replace (9.35) by a slightly weaker condition.

(13.41) Corollary. Let $d \le 2$, $h \in \mathbb{R}$, and $J: \mathbb{Z}^d \to \mathbb{R}$ be as above. Suppose there is a constant $L < \infty$ such that
$$\sum_{0 < |i| \le n} |i|^d\,|J(i)| \le L \log n \qquad \text{for all } n \ge 2.$$
If $\hat J(z) = 0$ for some $z \in G$ then $\mathcal{G}(\gamma^{J,h}) = \emptyset$.
Proof. We will prove that $\int \hat J(z)^{-1}\,dz = \infty$. Let $\tilde J$ be defined as in the proof of (13.39), and let $q \in\, ]-1,1]^d$ be such that $z_q = z$. In the case $d = 2$, we conclude from Step 3) in the proof of Lemma (9.33) that $\sum_{i\in S} |i|\,|J(i)| < \infty$. Hence $\tilde J$ is differentiable with gradient
$$\mathrm{grad}\,\tilde J(p) = -\pi \sum_{i\in S} i\,J(i)\sin \pi p\cdot i.$$
Since $\tilde J \ge 0$ and $\tilde J(q) = 0$, we have $\mathrm{grad}\,\tilde J(q) = 0$. Consequently, in either dimension we can write
$$\tilde J(p) = \sum_{i\in S} J(i)\big[\cos \pi p\cdot i - \cos \pi q\cdot i + (d-1)\,\pi (p-q)\cdot i\,\sin \pi q\cdot i\big]$$
$$\le \sum_{|i| \le |p-q|^{-2}} |J(i)|\,|\pi(p-q)\cdot i|^d/d \;+ \sum_{|i| > |p-q|^{-2}} |J(i)|\,\big[2 + (d-1)\,|\pi(p-q)\cdot i|\big]$$
$$\le \pi^d |p-q|^d\, L \log |p-q|^{-2} + |p-q|^d \sum_{|i| > |p-q|^{-2}} |J(i)|\,\big[2|i|^{d/2} + (d-1)\,\pi\,|i|^{3/2}\big]$$
$$\le M\,|p-q|^d \log |p-q|^{-1}$$
for some constant $0 < M < \infty$ and all $p$ with $0 < |p-q| \le \delta = 2^{-1/2}$. Thus
$$\int \tilde J(p)^{-1}\,dp \ge M^{-1}\int_{|p-q|\le\delta} dp\,|p-q|^{-d}\big(\log |p-q|^{-1}\big)^{-1} \ge M^{-1}(2\pi)^{d-1}\int_0^\delta dr\, r^{-1}\big(\log r^{-1}\big)^{-1} = M^{-1}(2\pi)^{d-1}\big(-\log\log r^{-1}\big)\Big|_0^\delta = \infty.$$
This completes the proof of the corollary. □
Our final corollary characterizes the centered Gauss fields in $\mathcal{G}_\Theta(\gamma^{J,0})$ in terms of their spectral measure. Recall that a centered Gauss field $\mu$ is shift-invariant if and only if its covariance function $C$ is homogeneous. In this case, $C$ admits a unique representation of the form (13.38) in terms of a finite measure $\alpha$ on $G$ (cf. Proposition (13.A9)). $\alpha$ is called the spectral measure of $\mu$.

(13.42) Corollary. Let $\mu$ be a shift-invariant centered Gauss field on $\mathbb{Z}^d$, and let $J: \mathbb{Z}^d \to \mathbb{R}$ be as before. Then $\mu \in \mathcal{G}_\Theta(\gamma^{J,0})$ if and only if its spectral measure $\alpha$ can be written in the form $\alpha(dz) = \hat J(z)^{-1}\,dz + \alpha_0(dz)$, where $\alpha_0$ is supported on the set $\{\hat J = 0\}$.

Proof. "only if". The proof of this is identical to Step 2) in the proof of Theorem (13.36). "if". Let $\mu^C$ be as in Theorem (13.36). The measure $\alpha_0$ is the spectral measure of a unique shift-invariant centered Gauss field $\nu$ (cf. Proposition (13.A7)). The convolution $\mu^C * \nu$ is centered Gaussian and has the same covariance as $\mu$. Hence $\mu = \mu^C * \nu$. Also, if $i \in S$ then
$$\nu\Big(\Big(\sum_{j\in S} J(j-i)\,\sigma_j\Big)^2\Big) = \sum_{j,k\in S} J(j-i)\,J(k-i)\int z^{k-j}\,\alpha_0(dz) = \int \hat J(z)^2\,\alpha_0(dz) = 0.$$
Thus $\nu(M_{J,0}) = 1$, and Theorem (13.36) shows that $\mu \in \mathcal{G}_\Theta(\gamma^{J,0})$. □
We conclude this chapter with three examples.

(13.43) Example. The harmonic oscillator. (Continuation of Example (13.29).) Let $\beta > 0$ and
$$J(i) = \begin{cases} -\beta & \text{if } |i| = 1,\\ 2d\beta & \text{if } i = 0,\\ 0 & \text{otherwise.}\end{cases}$$
Since $\hat J(1) = 0$, $M_{J,0}$ contains all constant configurations, and $\gamma^{J,0}$ exhibits the continuous symmetry $\omega \to (\omega_i + t)_{i\in S}$, $t \in \mathbb{R}$. By the way, $\gamma^{J,0}$ can be
written in the form yA•" = pAXA with pA(œ) = ZA(œ) *exp
( A e y , f f l e Q). Ii-Jl = l
This is an immediate consequence of Proposition (13.13) and shows that $\gamma^{J,0}$ coincides with the model of Section 6.3, except for the choice of the a priori measure. If $d \le 2$ then $\mathcal{G}(\gamma^{J,0}) = \emptyset$, because of Corollary (13.41). So let $d \ge 3$ and $\tilde J$ be as in the proof of Remark (13.39). Then
$$\tilde J(p) = 2\beta \sum_{l=1}^d (1 - \cos \pi p_l) = 4\beta \sum_{l=1}^d \sin^2(\pi p_l/2) \ge 4\beta \sum_{l=1}^d p_l^2 = 4\beta\,|p|^2$$
for all $p \in\, ]-1,1]^d$. Since $|p|^{-2}$ is integrable over $]-1,1]^d$ when $d \ge 3$ we conclude that $\hat J^{-1}$ is integrable. Theorem (13.36) thus implies that $\mathcal{G}_\Theta(\gamma^{J,0}) \cap \mathrm{ex}\,\mathcal{G}(\gamma^{J,0})$ contains uncountably many Gibbs measures which break the continuous symmetry of $\gamma^{J,0}$. (Recall that the same result has already been obtained by different arguments at (13.29).) ○

(13.44) Example. Long range interaction in one dimension. Let $d = 1$, $\beta > 0$, $a > 1$, and $J: \mathbb{Z} \to \mathbb{R}$ be such that $\hat J(1) = 0$ and $J(i) = -\beta\,|i|^{-a}$ when $i \ne 0$. $J$ is positive definite. This can be derived either directly as in Comment (13.14), or by using Proposition (13.A8) and the fact that
$$\tilde J(p) = \beta \sum_{i \ne 0} |i|^{-a}(1 - \cos \pi p i)$$
is nonnegative. If $a \ge 2$ then Corollary (13.41) shows that $\mathcal{G}(\gamma^{J,0}) = \emptyset$. So let $1 < a < 2$. Then $\tilde J^{-1}$ is integrable over $]-1,1]$. Indeed, if $0 < |p| \le 1$ then
$$\tilde J(p) \ge 2\beta \sum_{0 < |i| \le 1/|p|} |i|^{-a}\sin^2(\pi p i/2) \ge 2\beta \sum_{0 < |i| \le 1/|p|} |i|^{-a}\,|pi|^2 = 2\beta\,|p|^{a-1} \sum_{0 < |i| \le 1/|p|} |pi|^{2-a}\,|p| \ge 2\beta\,|p|^{a-1}\int_{-1/2}^{1/2} dx\,|x|^{2-a} = M^{-1}\,|p|^{a-1}.$$
Hence
$$\int_{-1}^{1} \tilde J(p)^{-1}\,dp \le M \int_{-1}^{1} |p|^{1-a}\,dp < \infty.$$
Theorem (13.36) thus gives the result that $\mathrm{ex}\,\mathcal{G}(\gamma^{J,0})$ contains uncountably many shift-invariant Gauss fields which are related to each other by the continuous symmetry $\omega \to (\omega_i + t)_{i\in S}$ ($t \in \mathbb{R}$) of $\gamma^{J,0}$. ○

(13.45) Example. Long range interaction in two dimensions. Let $d = 2$, $\beta > 0$, $a > 2$, and $J: \mathbb{Z}^2 \to \mathbb{R}$ be given by $J(i) = -\beta\,|i|^{-a}$ ($i \ne 0$) and $\hat J(1) = 0$. $J$ is positive definite. From Corollary (13.41) we know that $\hat J^{-1}$ fails to be integrable when $a \ge 4$. We will show here that $\hat J^{-1}$ is integrable when $2 < a < 4$. Consider $\tilde J(p) = \hat J(z_p)$. Since $\tilde J$ is bounded away from zero on the complement of each neighbourhood of the origin, we only need to find a lower bound for $\tilde J(p)$ when $|p|$ is small. We fix any $0 < \delta < 1$. For each $p \ne 0$ we have, writing $r = |p|$ and $u = p/r$,
$$|p|^{2-a}\,\tilde J(p) \ge \beta\, r^{2-a} \sum_{i\in S:\ \delta \le r|i| \le 1,\ u\cdot i \ge |i|/2} |i|^{-a}(1 - \cos \pi p\cdot i) \ge \frac{\beta}{2} \sum_{i\in S:\ \delta \le |ri| \le 1,\ u\cdot i \ge |i|/2} |ri|^{2-a}\, r^2 \equiv M(p).$$
We have used the fact that $1 - \cos \pi t = 2\sin^2(\pi t/2) \ge 2t^2$ when $|t| \le 1$. If $r = |p|$ is small, $M(p)$ is approximately equal to the integral
$$\frac{\beta}{2}\int_{\{x\in\mathbb{R}^2:\ \delta \le |x| \le 1,\ u\cdot x \ge |x|/2\}} |x|^{2-a}\,dx = \frac{\beta}{2}\int_\delta^1 ds\, s \int_{-\pi/3}^{\pi/3} dt\, s^{2-a} = \frac{\beta\pi\,(1 - \delta^{4-a})}{3(4-a)} \equiv 2M > 0.$$
More precisely, since the function $x \to |x|^{2-a}$ is uniformly continuous on the annulus $\{\delta \le |\cdot| \le 1\}$, there exists an $\varepsilon > 0$ such that $M(p) \ge M$ whenever $|p| \le \varepsilon$. Since $a < 4$, the function $p \to |p|^{2-a}$ is integrable over the disc $\{|\cdot| \le \varepsilon\}$. This shows that $\hat J^{-1}$ is integrable over $]-1,1]^2$, and Theorem (13.36) yields the same conclusion as in the preceding examples. ○
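In all three examples the integrability of $\hat J^{-1}$ comes down to the integrability of a power $|p|^{-s}$ near the origin of $\mathbb{R}^d$: $s = 2$ in (13.43), $s = a - 1$ with $d = 1$ in (13.44), and $s = a - 2$ with $d = 2$ in (13.45). In polar coordinates this is, up to the surface-area constant that is dropped here, the one-dimensional integral $\int_\varepsilon^1 r^{d-1-s}\,dr$, which stays bounded as $\varepsilon \to 0$ exactly when $s < d$. The following Python sketch evaluates this integral in closed form; the function name and the sample parameters are assumptions made for illustration.

```python
import math

def radial_integral(d, s, eps):
    """Closed form of int_eps^1 r^(d - 1 - s) dr (surface-area constant omitted)."""
    k = d - s
    if abs(k) < 1e-12:
        # Borderline case s = d: logarithmic divergence as eps -> 0.
        return math.log(1.0 / eps)
    return (1.0 - eps ** k) / k

# Harmonic oscillator, s = 2: bounded for d = 3, logarithmically divergent for d = 2.
print(radial_integral(3, 2.0, 1e-12), radial_integral(2, 2.0, 1e-6))
# Long range interaction in d = 1 with a = 1.5, so s = a - 1 = 0.5: bounded.
print(radial_integral(1, 0.5, 1e-12))
# Long range interaction in d = 2 with a = 3, so s = a - 2 = 1: bounded.
print(radial_integral(2, 1.0, 1e-12))
```

The borderline cases $s = d$ correspond to $d = 2$ in (13.43), $a = 2$ in (13.44) and $a = 4$ in (13.45), where Corollary (13.41) excludes Gibbs measures.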
13.A
Appendix. Some tools of Gaussian analysis
In this appendix we list some elementary facts about Gaussian random vectors and positive definite functions. Let $n \ge 1$ and $U = (U_i)_{1\le i\le n}$ be an $\mathbb{R}^n$-valued random vector on a probability space $(\Omega,\mathcal{F},\mu)$. $U$ is said to be Gaussian (or, more precisely, to have a Gaussian or multivariate normal distribution) if there is a symmetric matrix $C = (C(i,j))_{1\le i,j\le n}$ and a vector $m = (m_i)_{1\le i\le n} \in \mathbb{R}^n$ such that the Fourier transform of $U$ is given by
(13.A1) $$\mu(\exp[i\,t\cdot U]) = \exp\big[-\tfrac12\, t\cdot Ct + i\,t\cdot m\big] \qquad (t \in \mathbb{R}^n).$$
Here $i$ is the imaginary unit, and the dot denotes scalar product. The matrix $C$ is necessarily nonnegative definite because the left side of (13.A1) is at most 1 in modulus. By Fatou's lemma,
$$\mu(U_i^2) \le 2\lim_{t_i\to 0} t_i^{-2}\,\mu(1 - \cos t_iU_i) = -\frac{\partial^2}{\partial t_i^2}\varphi(t)\Big|_{t=0} < \infty \qquad (1 \le i \le n),$$
where $\varphi$ denotes the right side of (13.A1). Differentiating (13.A1) at $t = 0$ we thus find that
(13.A2) $$m_i = \mu(U_i), \qquad C(i,j) = \mu\big((U_i - m_i)(U_j - m_j)\big) \qquad (1 \le i,j \le n).$$
Thus $m$ is the mean and $C$ the covariance matrix of $U$. If $C$ is positive definite, then the distribution of $U$ has the well-known Gaussian density
$$x \to (2\pi)^{-n/2}\,|\det C|^{-1/2}\exp\big[-\tfrac12 (x-m)\cdot C^{-1}(x-m)\big]$$
with respect to Lebesgue measure on $\mathbb{R}^n$. Gaussian random vectors enjoy the following simple but important property.

(13.A3) Remark. Let $U$ be Gaussian, $1 \le k < n$, $U' = (U_i)_{1\le i\le k}$ and $U'' = (U_j)_{k<j\le n}$. Then the following statements are equivalent.
(i) $U'$ and $U''$ are uncorrelated, in that $C(i,j) = 0$ for all $1 \le i \le k < j \le n$.
(ii) $U'$ and $U''$ are independent.

Proof. The implication (ii) $\Rightarrow$ (i) is trivial. To prove the converse it is sufficient to note that (i) and (13.A1) imply that
$$\mu\big(e^{i t'\cdot U'}e^{i t''\cdot U''}\big) = \mu\big(e^{i t'\cdot U'}\big)\,\mu\big(e^{i t''\cdot U''}\big)$$
for all $t' \in \mathbb{R}^k$ and $t'' \in \mathbb{R}^{n-k}$. □
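The factorization used in this proof can be checked directly from formula (13.A1). The following Python sketch evaluates the right side of (13.A1) for a hypothetical mean vector and two hypothetical covariance matrices on $\mathbb{R}^3$ (all numerical values and function names are assumptions for illustration), with $U' = (U_1, U_2)$ and $U'' = U_3$. The Fourier transform factorizes when the cross-covariances vanish and fails to do so otherwise.

```python
import numpy as np

def char_fn(t, m, C):
    """Right side of (13.A1): exp[-t.Ct/2 + i t.m]."""
    t = np.asarray(t, dtype=float)
    return np.exp(-0.5 * (t @ C @ t) + 1j * (t @ m))

# Hypothetical parameters (assumptions for illustration only).
m = np.array([1.0, -1.0, 0.5])
C_block = np.array([[2.0, 1.0, 0.0],
                    [1.0, 3.0, 0.0],
                    [0.0, 0.0, 4.0]])   # U' and U'' uncorrelated
C_corr = np.array([[2.0, 1.0, 1.0],
                   [1.0, 3.0, 0.0],
                   [1.0, 0.0, 4.0]])    # C(1, 3) != 0

t1 = np.array([0.3, -0.7, 0.0])   # t = (t', 0)
t2 = np.array([0.0, 0.0, 1.1])    # t = (0, t'')

def factorization_gap(C):
    """|phi(t', t'') - phi(t', 0) phi(0, t'')| for the fixed test vectors."""
    return abs(char_fn(t1 + t2, m, C) - char_fn(t1, m, C) * char_fn(t2, m, C))

print(factorization_gap(C_block), factorization_gap(C_corr))
```

A single pair of test vectors only illustrates the identity; independence requires the factorization for all $t'$ and $t''$.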
One useful consequence of this remark is stated in the next proposition.

(13.A4) Proposition. Let $U = (U_i)_{1\le i\le n}$ be a Gaussian random vector and $\hat U_1 = \mu(U_1 \mid U_2,\dots,U_n)$ the conditional expectation of $U_1$ with respect to the $\sigma$-algebra which is generated by $U_2,\dots,U_n$. Then there are real numbers $a, b_2, \dots, b_n$ such that
$$\hat U_1 = a + \sum_{i=2}^n b_i U_i \qquad \mu\text{-a.s.}$$

Proof. Consider the real Hilbert space $L^2(\mu)$ of all square integrable functions on the underlying probability space $(\Omega,\mathcal{F},\mu)$, and let $\mathcal{H} \subset L^2(\mu)$ be the linear span of $U_2,\dots,U_n$ and the constant function 1. Let $\tilde U_1$ be the orthogonal projection of $U_1$ onto $\mathcal{H}$. That is, $\tilde U_1$ is the unique linear combination of $1, U_2, \dots, U_n$ which satisfies $\mu(U_1 - \tilde U_1) = 0$ and $\mu((U_1 - \tilde U_1)U_i) = 0$ for all $2 \le i \le n$. The random vector $(U_1 - \tilde U_1, U_2, \dots, U_n)$ is a linear image of the extended vector $(1, U)$, and $U$ is Gaussian. A glance at (13.A1) thus shows that $(U_1 - \tilde U_1, U_2, \dots, U_n)$ is Gaussian. But $U_1 - \tilde U_1$ and $(U_2,\dots,U_n)$ are uncorrelated. Remark (13.A3) therefore implies that $U_1 - \tilde U_1$ and $(U_2,\dots,U_n)$ are independent. Consequently, if $A$ belongs to the $\sigma$-algebra generated by $U_2,\dots,U_n$ then
$$\mu\big(1_A\,(U_1 - \tilde U_1)\big) = \mu(A)\,\mu(U_1 - \tilde U_1) = 0.$$
This means that $\tilde U_1$ is a version of $\hat U_1$. □

(13.A5) Proposition. Let $(U^{(k)})_{k\ge1}$ be a sequence of $\mathbb{R}^n$-valued Gaussian random vectors $U^{(k)}$ with mean $m^{(k)}$ and covariance matrix $C^{(k)}$. Suppose $U^{(k)}$ converges in distribution to a random vector $U$. Then the limits $m = \lim_{k\to\infty} m^{(k)}$ and $C = \lim_{k\to\infty} C^{(k)}$ exist, and $U$ is Gaussian with mean $m$ and covariance $C$.

Proof. Looking at (13.A1) we see that the limit
$$\lim_{k\to\infty}\exp\big[-\tfrac12\, t\cdot C^{(k)}t + i\,t\cdot m^{(k)}\big] \qquad (t \in \mathbb{R}^n)$$
exists and is equal to the Fourier transform of $U$. Taking the modulus and letting $t = (0,\dots,1,0,\dots)$ be any basis vector, we find that the limits $C(i,i) = \lim_{k\to\infty} C^{(k)}(i,i)$ exist in $[0,\infty]$, $1 \le i \le n$. As the Fourier transform of $U$ is continuous at the origin, the $C(i,i)$'s must be finite. If we let $t$ be the sum of two basis vectors, we obtain that $C(i,j) = \lim_{k\to\infty} C^{(k)}(i,j)$ exists for all $i \ne j$. We further conclude that $\lim_{k\to\infty}\exp[i\,t\cdot m^{(k)}]$ exists and is continuous in $t$. This implies the existence of $m = \lim_{k\to\infty} m^{(k)}$. □

(13.A6) Corollary. Let $(U_n)_{n\ge1}$ be a sequence of real random variables on a probability space $(\Omega,\mathcal{F},\mu)$. Suppose $U_n$ converges in probability to a random variable $U_\infty$. If $U_n - U_k$ is Gaussian for all $n, k \ge 1$ then $\lim_{n\to\infty}\mu(|U_n - U_\infty|^2) = 0$.
Proof. For each $n \ge 1$, $U_n - U_k$ converges in distribution to $U_n - U_\infty$ as $k \to \infty$. Proposition (13.A5) thus shows that $U_n - U_\infty$ is Gaussian. In the limit $n \to \infty$, $U_n - U_\infty$ tends to the (Gaussian) random variable 0. A second application of Proposition (13.A5) thus shows that $\lim_{n\to\infty}\mu(U_n - U_\infty) = 0$ and
$$\lim_{n\to\infty}\big[\mu((U_n - U_\infty)^2) - \mu(U_n - U_\infty)^2\big] = 0.$$
This proves the corollary. □

The next proposition ensures the existence of a centered Gauss field with a given covariance function.

(13.A7) Proposition. Let $S$ be a countably infinite set and $C: S \times S \to \mathbb{R}$ a nonnegative definite symmetric function. Then there exists a unique centered Gaussian field $\mu^C$ on $S$ with covariance function $C$.

Proof. For each $\Lambda \in \mathcal{S}$ we let $\nu_\Lambda$ be the centered Gaussian distribution on $\mathbb{R}^\Lambda$ with covariance matrix $C_\Lambda = (C(i,j))_{i,j\in\Lambda}$. The $\nu_\Lambda$'s are consistent. For, if $\Delta \subset \Lambda$ then equation (13.A1) shows that the $\Delta$-projection $\sigma_\Delta(\nu_\Lambda)$ of $\nu_\Lambda$ is centered Gaussian with covariance $C_\Delta$. Hence $\sigma_\Delta(\nu_\Lambda) = \nu_\Delta$. By Kolmogorov's extension theorem there exists a unique probability measure $\mu^C$ on $\mathbb{R}^S$ such that $\sigma_\Lambda(\mu^C) = \nu_\Lambda$ for all $\Lambda \in \mathcal{S}$. $\mu^C$ satisfies equation (13.2) with $m = 0$. □

We conclude this appendix with two propositions which deal with the characterization of nonnegative definite functions on $\mathbb{Z}^d$ in Fourier analytic terms. We let $K$ denote the complex unit circle.

(13.A8) Proposition. Let $J: \mathbb{Z}^d \to \mathbb{R}$ be an even, absolutely summable function and $\hat J$ its Fourier transform. $J$ is positive definite if and only if $\hat J \ge 0$ and $\hat J(z) \ne 0$ for at least one $z \in K^d$.

Proof. "if". Let $u_i \in \mathbb{C}$ ($i \in \mathbb{Z}^d$) be such that $u_i \ne 0$ for at least one but at most finitely many $i$, and define
$$g(z) = \sum_{k\in\mathbb{Z}^d} u_k z^k \qquad (z \in K^d).$$
Then
$$\sum_{i,j\in\mathbb{Z}^d} u_i\,J(i-j)\,\bar u_j = \sum_{i,j,k\in\mathbb{Z}^d} u_i\,\bar u_j\,J(k)\int_{K^d} z^{k+j-i}\,dz = \int |g(z)|^2\,\hat J(z)\,dz.$$
Since $\hat J$ is continuous, $\hat J > 0$ on a set of positive measure. On the other hand, $|g|^2$ can only vanish on a null set. This is because there is an $n \in \mathbb{Z}^d$ such that $g(z)z^n$ is a polynomial. Consequently, for any choice of $z_2, \dots, z_d$ there are at most finitely many $z_1$'s such that $g(z) = 0$. Combining these facts we see that the last integral is positive. "only if". The equation above shows that $\int |g(z)|^2 \hat J(z)\,dz > 0$ for every nontrivial trigonometric polynomial $g$ on $K^d$. Using the Stone-Weierstrass theorem, we conclude that $\int f(z)\hat J(z)\,dz \ge 0$ for all continuous functions $f: K^d \to [0,\infty[$. This gives us that $\hat J(z) \ge 0$ for almost all $z \in K^d$. Since $\hat J$ is continuous, this implies that $\hat J \ge 0$ everywhere. □

Last but not least, we quote a well-known lemma of Herglotz. This lemma is the discrete counterpart of Bochner's famous theorem.

(13.A9) Proposition. Let $C: \mathbb{Z}^d \to \mathbb{C}$ be any even function. $C$ is nonnegative definite if and only if there exists a finite measure $\alpha$ on $K^d$ such that
$$C(i) = \int z^i\,\alpha(dz) \qquad \text{for all } i \in \mathbb{Z}^d.$$

Proof. "if". This is easy. "only if". For each finite $\Lambda \subset \mathbb{Z}^d$ we define
$$g_\Lambda(z) = |\Lambda|^{-1}\sum_{i,j\in\Lambda} C(j-i)\,z^{i-j} \qquad (z \in K^d).$$
By hypothesis, $g_\Lambda(z) \ge 0$ for all $z$. Writing
$$g_\Lambda(z) = \sum_{k\in\mathbb{Z}^d} C(k)\,z^{-k}\,|\Lambda \cap (\Lambda + k)|/|\Lambda|$$
we see that $\int z^i g_\Lambda(z)\,dz = C(i)\,|\Lambda \cap (\Lambda + i)|/|\Lambda|$ for all $i \in \mathbb{Z}^d$. Here $\Lambda + i$ is the translate of $\Lambda$ by the vector $i$. In particular, we have $\int g_\Lambda(z)\,dz = C(0)$ for all $\Lambda$. Now we let $\Lambda$ run through a sequence of cubes. As $K^d$ is compact, the measures $g_\Lambda(z)\,dz$ have a cluster point $\alpha$ in the weak topology. This $\alpha$ has the required property. □
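The Fourier criterion of Proposition (13.A8) can be observed on finite sections of the matrix $(J(i-j))$. The following Python sketch, for $d = 1$, compares the minimum of the Fourier transform over a grid with the smallest eigenvalue of an $8 \times 8$ section. The two sample interactions, the section size and the function names are assumptions made for illustration.

```python
import numpy as np

def toeplitz_section(J, n):
    """The n x n matrix (J(i - j))_{0 <= i, j < n} for an interaction J on Z."""
    return np.array([[J.get(i - j, 0.0) for j in range(n)] for i in range(n)])

def symbol_min(J, grid=2048):
    """Minimum over a uniform grid of J_hat(e^{i theta}) = sum_k J(k) cos(k theta)."""
    theta = 2.0 * np.pi * np.arange(grid) / grid
    J_hat = sum(v * np.cos(k * theta) for k, v in J.items())
    return float(J_hat.min())

# Hypothetical examples (assumptions): J_hat = 2 - 2 cos(theta) >= 0 for J_good,
# while J_hat = 1 - 2 cos(theta) takes negative values for J_bad.
J_good = {-1: -1.0, 0: 2.0, 1: -1.0}
J_bad = {-1: -1.0, 0: 1.0, 1: -1.0}

eig_good = np.linalg.eigvalsh(toeplitz_section(J_good, 8)).min()
eig_bad = np.linalg.eigvalsh(toeplitz_section(J_bad, 8)).min()
print(symbol_min(J_good), eig_good)   # symbol >= 0, section positive definite
print(symbol_min(J_bad), eig_bad)     # symbol < 0 somewhere, section indefinite
```

Note that `J_good` is positive definite although its Fourier transform vanishes at $z = 1$, in line with the proposition, which only requires $\hat J \ge 0$ and $\hat J \not\equiv 0$.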
Part III Shift-invariant Gibbs measures
Shift-invariant random fields on an integer lattice Zd are a well-studied area of Probability Theory. Within the Gibbsian theory, a similarly important role is played by the shift-invariant Gibbs measures on Zd. Actually, such Gibbs measures appeared already in various places in Parts I and II, and they will also be crucial for our investigations in Part IV. Here we shall look at shift-invariant Gibbs measures from a general, and in some sense global, point of view. We commence Chapter 14 by noting several classical facts concerning the (shift-) ergodic random fields. (These constitute the extreme boundary of the simplex of all shift-invariant random fields.) This chapter then provides some general results on ergodic Gibbs measures for shift-invariant specifications. These include the decomposition of shift-invariant Gibbs measures into ergodic components, as well as the approximation of ergodic Gibbs measures by averaged Gibbs distributions. In the two further chapters of this part we shall concentrate on shift-invariant Gibbs measures for shift-invariant and absolutely summable potentials. In Chapter 15 we shall introduce and discuss the specific entropy and specific free energy of shift-invariant random fields, and the central result is the variational principle which characterizes the shift-invariant Gibbs measures as the minimizers of the specific free energy. We shall further see that the minimum free energy principle also determines the exponential decay rates of Gibbsian probabilities of untypical behaviour. Chapter 16 presents a geometric view of Gibbs measures. Namely, as a consequence of the variational principle, each shift-invariant Gibbs measure can be identified with a tangent hyperplane to a convex function (the "pressure") on the space of potentials. This identification opens the door to a convexity approach to both the existence of phase transitions, and the uniqueness problem for shift-invariant Gibbs measures.
Whilst much of Chapter 15 can be read independently of Chapter 14, Chapter 16 depends on the preceding chapters 14 and 15. These in turn are based on Chapters 4, 5, 7 (and, of course, 1 and 2). The other chapters of Parts I and II do not enter into the main stream of this part.
Chapter 14 Ergodicity
Let $S = \mathbb{Z}^d$ and $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ be the set of all shift-invariant probability measures on $(\Omega,\mathcal{F})$. Also, let $\gamma$ be a homogeneous specification and $\mathcal{G}_\Theta(\gamma)$ the set of all shift-invariant Gibbs measures for $\gamma$. The objective of this chapter is to provide a number of basic facts about the sets $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $\mathcal{G}_\Theta(\gamma)$ as well as their extreme boundaries $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $\mathrm{ex}\,\mathcal{G}_\Theta(\gamma)$. The elements of $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ are called ergodic. From an abstract point of view, the definitions of $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $\mathcal{G}_\Theta(\gamma)$ are very similar to that of $\mathcal{G}(\gamma)$. As a consequence, most results of this chapter have a direct counterpart in Chapter 7. In fact, some fundamental results of Chapter 7 have been stated in sufficient generality to include the present cases. Hence, the reader should refresh his memory of Sections 7.1 and 7.3 before continuing.

Section 14.1 is devoted to a study of the simplex $\mathcal{P}_\Theta(\Omega,\mathcal{F})$. We shall start with some standard characterizations of ergodic probability measures. Then we shall show that each $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ admits a unique ergodic decomposition, in that $\mu$ is the barycenter of a unique probability weight on $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$. Moreover, we shall establish the remarkable fact that the ergodic measures are dense in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ relative to the $\mathcal{L}$-topology.

In Section 14.2 we shall look at $\mathcal{G}_\Theta(\gamma)$. We shall see that $\mathcal{G}_\Theta(\gamma)$ is a face of the simplex $\mathcal{P}_\Theta(\Omega,\mathcal{F})$. In particular, each $\mu \in \mathcal{G}_\Theta(\gamma)$ has a unique extreme decomposition, and this extreme decomposition coincides with the ergodic decomposition of $\mu$. Finally, we shall prove that each ergodic Gibbs measure can be approximated by averaged Gibbs distributions in finite volumes with suitable boundary conditions.

Several results of this chapter rely on the multidimensional ergodic theorem. For the sake of convenience, this is stated and proved in Appendix 14.A.
14.1
Ergodic random fields
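Throughout this chapter we put $S = \mathbb{Z}^d$, the $d$-dimensional integer lattice, and we let $(E,\mathcal{E})$ be an arbitrary measurable space. As before, we put $(\Omega,\mathcal{F}) = (E,\mathcal{E})^S$. $(\Omega,\mathcal{F})$ is acted on by the shift group $\Theta$ which was defined in Example (5.2)(1). We look at the convex set
(14.1) $$\mathcal{P}_\Theta(\Omega,\mathcal{F}) = \{\mu \in \mathcal{P}(\Omega,\mathcal{F}) : \theta_i(\mu) = \mu \text{ for all } i \in S\}$$
of all shift-invariant random fields on $\mathbb{Z}^d$. $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is always non-empty. We also consider the $\sigma$-algebra
(14.2) $$\mathcal{I} = \{A \in \mathcal{F} : \theta_iA = A \text{ for all } i \in S\}$$
of all shift-invariant events.

(14.3) Remark. (1) An $\mathcal{F}$-measurable function $f: \Omega \to \mathbb{R}$ is $\mathcal{I}$-measurable if and only if $f$ is invariant, in that $f\circ\theta_i = f$ for all $i \in S$.
(2) For each $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$, the $\sigma$-algebra
$$\mathcal{I}(\mu) = \{A \in \mathcal{F} : \theta_iA = A \ \ \mu\text{-a.s. for all } i \in S\}$$
of all $\mu$-almost surely invariant events is the $\mu$-completion of $\mathcal{I}$.

Proof. (1) If $f$ is invariant then $\{f \le c\} \in \mathcal{I}$ for all $c \in \mathbb{R}$. Conversely, any $\mathcal{I}$-measurable $f$ is the limit of $\mathcal{I}$-measurable step functions, and each such step function is invariant. (2) For given $A \in \mathcal{I}(\mu)$ let $B = \bigcup_{i\in S}\theta_iA$. Then $B \in \mathcal{I}$ and
$$\mu(A \,\triangle\, B) \le \sum_{i\in S}\mu(A \,\triangle\, \theta_iA) = 0.$$ □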
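In Section 7.1 we have seen that the structure of a Gibbs simplex $\mathcal{G}(\gamma)$ is governed by the tail $\sigma$-field $\mathcal{T}$. The set $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is linked to the invariant $\sigma$-algebra $\mathcal{I}$ in the same way. To see this we consider the set $\Pi = \{\hat\theta_i : i \in S\}$ of probability kernels
(14.4) $$\hat\theta_i(A\mid\omega) = 1_A(\theta_i\omega) \qquad (i \in S,\ A \in \mathcal{F},\ \omega \in \Omega).$$
Using the notation of Corollary (7.4), we then have $\mathcal{P}_\Theta(\Omega,\mathcal{F}) = \mathcal{P}_\Pi$ and $\mathcal{I}(\mu) = \mathcal{I}_\Pi(\mu)$. Combining this corollary with Remark (14.3)(2) and mimicking the proof of Theorem (7.7), we arrive at the following result.

(14.5) Theorem. (a) A probability measure $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is extreme in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ if and only if $\mu$ is trivial on the invariant $\sigma$-algebra $\mathcal{I}$.
(b) Suppose $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $\nu \in \mathcal{P}(\Omega,\mathcal{F})$ is absolutely continuous relative to $\mu$. Then $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ if and only if $\nu = f\mu$ for some $\mathcal{I}$-measurable function $f$.
(c) Each $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is uniquely determined (within $\mathcal{P}_\Theta(\Omega,\mathcal{F})$) by its restriction to $\mathcal{I}$.
(d) Distinct probability measures $\mu, \nu \in \mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ are mutually singular on $\mathcal{I}$, in that there exists an $A \in \mathcal{I}$ such that $\mu(A) = 1$ and $\nu(A) = 0$.

Needless to say, Theorem (14.5) remains true when the shift-group $\Theta$ is replaced by any countable subgroup of T. The following definition is standard.

(14.6) Definition. A probability measure $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is said to be ergodic (with respect to the shift-group $\Theta$) if $\mu$ is trivial on $\mathcal{I}$. In the language of mathematical physics, any such $\mu$ is often called a pure state.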
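Ergodicity can be characterized by a mixing property which resembles that in Proposition (7.9). Let us agree to call a set $\Lambda \in \mathcal{S}$ a cube if
$$\Lambda = \mathbb{Z}^d \cap \prod_{i=1}^d [m_i, m_i + n]$$
with $m = (m_1,\dots,m_d) \in \mathbb{Z}^d$ and $n \ge 0$.

(14.7) Proposition. Let $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $(\Lambda_n)_{n\ge1}$ be any sequence of cubes in $S$ such that $|\Lambda_n| \to \infty$ as $n \to \infty$. Then the following statements are equivalent.
(i) $\mu$ is ergodic.
(ii) For all $A \in \mathcal{F}$,
$$\lim_{n\to\infty}\ \sup_{B\in\mathcal{F}}\ \Big|\,|\Lambda_n|^{-1}\sum_{i\in\Lambda_n}\mu(A \cap \theta_iB) - \mu(A)\mu(B)\Big| = 0.$$
(iii) For arbitrary cylinder events $A$ and $B$,
$$\lim_{n\to\infty} |\Lambda_n|^{-1}\sum_{i\in\Lambda_n}\mu(A \cap \theta_iB) = \mu(A)\mu(B).$$

Proof. (i) implies (ii). If $A, B \in \mathcal{F}$ then
$$\Big|\,|\Lambda_n|^{-1}\sum_{i\in\Lambda_n}\mu(A \cap \theta_iB) - \mu(A)\mu(B)\Big| = \Big|\int d\mu\ 1_B\Big(|\Lambda_n|^{-1}\sum_{i\in\Lambda_n} 1_A\circ\theta_i - \mu(A)\Big)\Big| \le \int d\mu\ \Big|\,|\Lambda_n|^{-1}\sum_{i\in\Lambda_n} 1_A\circ\theta_i - \mu(A)\Big|.$$
By virtue of the mean ergodic theorem (14.A5), the last integral tends to zero as $n \to \infty$. (ii) implies (iii). This is trivial. (iii) implies (i). For fixed $A \in \mathcal{F}$ we let $\mathcal{D}_A$ denote the set of all $B \in \mathcal{F}$ such that
$$\lim_{n\to\infty} |\Lambda_n|^{-1}\sum_{i\in\Lambda_n}\mu(A \cap \theta_iB) = \mu(A)\mu(B).$$
It is easily seen that $\mathcal{D}_A$ is a Dynkin system. Similarly, the set of all $A \in \mathcal{F}$ with $\mathcal{D}_A = \mathcal{F}$ is a Dynkin system. Since any Dynkin system containing the cylinder events is equal to $\mathcal{F}$, we conclude from (iii) that $\mathcal{D}_A = \mathcal{F}$ for all $A \in \mathcal{F}$. Now let $A \in \mathcal{I}$. Then $A \in \mathcal{D}_A$, and this implies that $\mu(A) = \mu(A)^2$. Hence $\mu(A) = 0$ or 1. □

It is not difficult to see that condition (ii) of Proposition (7.9) implies condition (iii) of Proposition (14.7). This observation is sharpened by the following result. In its proof and later on, we shall use the notation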
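(14.8) $$\Lambda + j = \{i + j : i \in \Lambda\}$$
for the translate of a set $\Lambda \subset S$ by a vector $j \in S$.

(14.9) Proposition. Let $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$. Then
$$\mathcal{I} \subset \mathcal{T} \qquad \mu\text{-a.s.},$$
in that for each $A \in \mathcal{I}$ there exists a set $B \in \mathcal{T}$ such that $\mu(A \,\triangle\, B) = 0$. In particular, if $\mu$ is trivial on $\mathcal{T}$ then $\mu$ is ergodic.

Proof. Let $A \in \mathcal{I}$ be given. Then we can find a sequence $(B_n)_{n\ge1}$ of cylinder events such that $\mu(A \,\triangle\, B_n) \le 2^{-n}$ for all $n \ge 1$. (This is a well-known corollary to Carathéodory's extension theorem. It can also be proved by checking that the set of all $A$ with this property is a Dynkin system which contains all cylinder events.) For any $i \in S$ and $n \ge 1$ we have
$$\mu(A \,\triangle\, \theta_iB_n) = \mu(\theta_iA \,\triangle\, \theta_iB_n) = \mu(A \,\triangle\, B_n) \le 2^{-n}.$$
Let $\Lambda_n \in \mathcal{S}$ be such that $B_n \in \mathcal{F}_{\Lambda_n}$. Without loss we can assume that $(\Lambda_n)_{n\ge1}$ is a cofinal sequence. For each $n \ge 1$ we choose any $i(n) \in S$ such that $\Lambda_n \cap (\Lambda_n + i(n)) = \emptyset$, and we set $\tilde B_n = \theta_{i(n)}B_n$. Then $\tilde B_n \in \mathcal{F}_{\Lambda_n + i(n)} \subset \mathcal{T}_{\Lambda_n}$ and $\mu(A \,\triangle\, \tilde B_n) \le 2^{-n}$. Therefore, if $B = \bigcap_{m\ge1}\bigcup_{n\ge m}\tilde B_n$ then $B \in \mathcal{T}$ and
$$\mu(A \,\triangle\, B) \le \mu\Big(\bigcap_{m\ge1}\bigcup_{n\ge m}(A \,\triangle\, \tilde B_n)\Big) \le \lim_{m\to\infty}\sum_{n\ge m} 2^{-n} = 0.$$
This proves the proposition. □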
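As is evident from the proof above, Proposition (14.9) remains valid when the full shift-group $\Theta$ is replaced by any infinite subgroup of $\Theta$. Next we will show that $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is completely determined by its extreme boundary (provided $(E,\mathcal{E})$ is standard Borel). More precisely, each $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ can be represented as a mixture of ergodic probability measures. This ergodic decomposition of $\mu$ is governed by a $(\mathcal{P}_\Theta(\Omega,\mathcal{F}), \mathcal{I})$-kernel in the sense of Definition (7.21). In accordance with Section 7.3, $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is endowed with the $\sigma$-algebra $e(\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F}))$ which is generated by the evaluation mappings $\nu \to \nu(A)$, $A \in \mathcal{F}$.

(14.10) Theorem. Let $(E,\mathcal{E})$ be a standard Borel space. Then there exists a $(\mathcal{P}_\Theta(\Omega,\mathcal{F}), \mathcal{I})$-kernel $\pi^{\cdot}$. Consequently, there is an affine bijection $\mu \to w_\mu$ from $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ onto $\mathcal{P}(\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F}), e(\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})))$ such that
$$\mu = \int \nu\ w_\mu(d\nu)$$
for all $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$. $w_\mu$ is the image of $\mu$ under the mapping $\omega \to \pi^\omega$. In particular, for any $A \in \mathcal{F}$ and $c \in \mathbb{R}$ we have
$$w_\mu\big(\nu \in \mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F}) : \nu(A) \le c\big) = \mu\big(\mu(A\mid\mathcal{I}) \le c\big).$$

Proof. In view of Remark (4.A3), $(\Omega,\mathcal{F})$ is a standard Borel space. Theorem (4.A11) thus ensures that $\mathcal{F}$ has a countable core $\mathcal{C}$. We let $(\Lambda_n)_{n\ge1}$ be an increasing sequence of cubes in $S$ such that $|\Lambda_n| \to \infty$, and we look at the invariant set
$$\Omega_0 = \Big\{\omega \in \Omega : \lim_{n\to\infty} |\Lambda_n|^{-1}\sum_{i\in\Lambda_n} 1_A(\theta_i\omega) \text{ exists for all } A \in \mathcal{C}\Big\}.$$
According to Definition (4.A9) of a core, there is a well-defined mapping $\omega \to \pi^\omega = \pi(\cdot\mid\omega)$ from $\Omega_0$ into $\mathcal{P}(\Omega,\mathcal{F})$ such that
$$\pi(A\mid\omega) = \lim_{n\to\infty} |\Lambda_n|^{-1}\sum_{i\in\Lambda_n} 1_A(\theta_i\omega) \qquad \text{for all } A \in \mathcal{C}.$$
This mapping can be extended to all of $\Omega$ by choosing any $\nu_0 \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and putting $\pi^\omega = \pi(\cdot\mid\omega) = \nu_0$ when $\omega \in \Omega\setminus\Omega_0$. The resulting function $\pi: \mathcal{F} \times \Omega \to [0,1]$ is a probability kernel from $\mathcal{I}$ to $\mathcal{F}$. Indeed, if $A \in$
$$\{\pi^{\cdot} \in \mathcal{P}_\Theta(\Omega,\mathcal{F})\} = \bigcap_{A\in\mathcal{C},\ i\in S}\{\pi^{\cdot}(\theta_iA) = \pi^{\cdot}(A)\}$$
is $\mathcal{I}$-measurable and has $\mu$-probability one. This is a consequence of the obvious fact that $\mu(\theta_iA\mid$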
(14.11) Corollary. Let $(E,\mathcal{E})$ be a standard Borel space and $\tau \in$ T be such that $\tau\circ\Theta = \Theta\circ\tau$. Let $w$ be as in Theorem (14.10). Then $w$ commutes with $\tau$, in that $w_{\tau(\mu)} = \tau(w_\mu)$ for all $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$.

Proof. Same as Corollary (7.28). □

Theorem (14.10) can be summarized in a short phrase: $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is a simplex. This simplex, however, cannot be visualized in the naive way. This follows from the next theorem which states that $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is dense in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ relative to the topology of local convergence. Of course, this result depends on the chosen topology. For example, assertion (14.5)(d) implies that any two distinct ergodic measures have total variation distance 2. Therefore, if $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is endowed with the total variation metric then $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is certainly not dense in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ (except when $|E| = 1$). Recall from the paragraph below (5.12) that $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is closed in the $\mathcal{L}$-topology.

(14.12) Theorem. Relative to the $\mathcal{L}$-topology, $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ has a dense extreme boundary.

Proof. Fix any $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$. For each $n \ge 1$ we define
$$\Lambda(n) = [-n,n]^d \cap S, \qquad\text{and}\qquad \nu_n = |\Lambda(n)|^{-1}\sum_{j\in\Lambda(n)}\theta_j(\mu_n).$$
Let $f \in \mathcal{L}$ and $\Lambda \in \mathcal{S}$ be such that $f \in \mathcal{L}_\Lambda$. Then
$$|\nu_n(f) - \mu(f)| = |\Lambda(n)|^{-1}\Big|\sum_{j\in\Lambda(n)}\big[\mu_n(f\circ\theta_j) - \mu(f\circ\theta_j)\big]\Big| \le 2\|f\|\ \big|\{j \in \Lambda(n) : (\Lambda - j)\setminus\Lambda(n) \ne \emptyset\}\big|\,/\,|\Lambda(n)|$$
because $f\circ\theta_j \in \mathcal{L}_{\Lambda - j}$. The last expression tends to zero as $n \to \infty$. Thus $\mu = \lim_{n\to\infty}\nu_n$ in the $\mathcal{L}$-topology.

On the other hand, we have $\nu_n \in \mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ for all $n \ge 1$. Indeed, since $\theta_{(2n+1)i}(\mu_n) = \mu_n$ for all $i \in S$ it is easily checked that $\nu_n$ is shift-invariant. We thus only need to show that $\nu_n$ is ergodic. So let $A \in \mathcal{I}$ be given. According to Proposition (14.9), there is a set $B \in \mathcal{T}$ such that $\nu_n(A \,\triangle\, B) = 0$. In particular, $\mu_n(A \,\triangle\, B) = 0$. Since $A \in \mathcal{I}$, we conclude that $\theta_j(\mu_n)(A) = \mu_n(A) = \mu_n(B)$ for all $j \in \Lambda(n)$. Thus $\nu_n(A) = \mu_n(B)$. But Kolmogorov's 0-1 law (cf. (7.14)) implies that $\mu_n(B) = 0$ or 1. Hence $\nu_n(A) = 0$ or 1. □

(14.13) Comment. Suppose $E$ is a compact metric space and $\mathcal{E}$ its Borel $\sigma$-algebra. It is well-known that in this case $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is compact and metrizable
in the weak topology. From Remark (4.3)(3) we know that the weak topology is coarser than the $\mathcal{L}$-topology. Theorem (14.12) therefore implies that $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is weakly dense in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$. Moreover, $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is a dense $G_\delta$ set in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$. In other words, $\mathcal{P}_\Theta(\Omega,\mathcal{F})\setminus\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is of first Baire category, i.e., a countable union of nowhere dense sets. To see this we let $d(\cdot,\cdot)$ denote any metric for the weak topology. Then we can write
$$\mathcal{P}_\Theta(\Omega,\mathcal{F})\setminus\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F}) = \bigcup_{n\ge1} K_n$$
with $K_n = \{(\mu + \nu)/2 : \mu, \nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F}),\ d(\mu,\nu) \ge 1/n\}$. In view of the compactness of $\mathcal{P}_\Theta(\Omega,\mathcal{F})$, $K_n$ is easily seen to be closed. Hence $\mathrm{ex}\,\mathcal{P}_\Theta(\Omega,\mathcal{F})$ is the countable intersection of the open sets $\mathcal{P}_\Theta(\Omega,\mathcal{F})\setminus K_n$, $n \ge 1$. A set of first Baire category should be considered as being exceptional. We thus arrive at the surprising conclusion that a "typical" shift-invariant probability measure is ergodic. There is a further result which should be mentioned in connection with this. With reference to a paper of Poulsen (1961), a compact metrizable simplex with a dense extreme boundary is sometimes called a Poulsen simplex. Lindenstrauss et al. (1978) have shown that all Poulsen simplices are affinely homeomorphic to each other. ○
14.2 Ergodic Gibbs measures
Let $\gamma$ be a shift-invariant specification. (See Example (5.8) for the definition.) We will consider the set

(14.14) $\mathscr{G}_\Theta(\gamma) = \mathscr{G}(\gamma) \cap \mathscr{P}_\Theta(\Omega,\mathscr{F})$

of all shift-invariant Gibbs measures for $\gamma$. As we have seen in (11.46), $\mathscr{G}_\Theta(\gamma)$ can be empty even when $\mathscr{G}(\gamma) \ne \emptyset$. Sufficient conditions for $\mathscr{G}_\Theta(\gamma)$ to be non-empty have been given in Section 5.2. By definition, a probability measure $\mu$ belongs to $\mathscr{G}_\Theta(\gamma)$ if and only if $\mu$ is preserved by all probability kernels in the countable set $\Pi = \{\gamma_\Lambda : \Lambda \in \mathscr{S}\} \cup \{\hat\theta_i : i \in S\}$. ($\hat\theta_i$ was defined in (14.4).) In this case, the σ-algebra $\mathscr{I}_\Pi(\mu)$ of all $\mu$-almost surely $\Pi$-invariant events is the $\mu$-completion of $\mathscr{I}$. Indeed, Remarks (7.6) and (14.3)(2) imply that $\mathscr{I}_\Pi(\mu)$ is the intersection of the $\mu$-completion of $\mathscr{T}$ and the $\mu$-completion of $\mathscr{I}$. But Proposition (14.9) asserts that the latter completion is contained in the former. We thus obtain the following counterpart to Theorems (7.7) and (14.5).
(14.15) Theorem. Let $\gamma$ be a shift-invariant specification.
(a) A Gibbs measure $\mu \in \mathscr{G}_\Theta(\gamma)$ is extreme in $\mathscr{G}_\Theta(\gamma)$ if and only if $\mu$ is ergodic. Thus $\operatorname{ex}\mathscr{G}_\Theta(\gamma) = \mathscr{G}_\Theta(\gamma) \cap \operatorname{ex}\mathscr{P}_\Theta(\Omega,\mathscr{F})$.
(b) If $\mu \in \mathscr{G}_\Theta(\gamma)$ and $\nu \in \mathscr{P}_\Theta(\Omega,\mathscr{F})$ is absolutely continuous relative to $\mu$ then $\nu \in \mathscr{G}_\Theta(\gamma)$.
(c) $\mathscr{G}_\Theta(\gamma)$ is a face of $\mathscr{P}_\Theta(\Omega,\mathscr{F})$. I.e., if $\mu, \nu \in \mathscr{P}_\Theta(\Omega,\mathscr{F})$ and $0 < s < 1$ are such that $s\mu + (1-s)\nu \in \mathscr{G}_\Theta(\gamma)$ then $\mu, \nu \in \mathscr{G}_\Theta(\gamma)$.

Proof. Assertion (a) follows from Corollary (7.4) and the remarks above. Combining Theorems (14.5)(b) and (7.7)(b) with Proposition (14.9), we obtain (b). Statement (c) is an immediate consequence of (b). □

We note that assertion (c) implies (a). We also note that the theorem above remains valid when the shift-group $\Theta$ is replaced by any countable subgroup of $T$ which contains a (non-trivial) shift. For in this case Proposition (14.9) still holds. As a last remark we point out that statements (c) and (d) of Theorem (14.5) remain true when $\mathscr{P}_\Theta(\Omega,\mathscr{F})$ is replaced by $\mathscr{G}_\Theta(\gamma)$.

In view of (14.14) and (14.15)(c), one might ask whether $\mathscr{G}_\Theta(\gamma)$ is also a face of $\mathscr{G}(\gamma)$. For example, this happens under the conditions of Theorem (10.35). In general, however, $\mathscr{G}_\Theta(\gamma)$ fails to be a face of $\mathscr{G}(\gamma)$. There are ergodic Gibbs measures which are not trivial on $\mathscr{T}$, and we now give a simple example.

(14.16) Example. The Ising antiferromagnet in one dimension at zero temperature. Let $S = \mathbb{Z}$, $E = \{-1, 1\}$, and $\lambda$ be counting measure. Define a shift-invariant $\lambda$-specification $\gamma$ as in Example (10.3) by means of the functions

$$g_i(x,y) = \begin{cases} 1 & \text{if } x \ne y, \\ 0 & \text{if } x = y \end{cases} \qquad (x, y \in E,\ i \in S).$$

From Section 3.2, Case 2, we know that $\gamma$ describes the zero temperature limit of the Ising antiferromagnet without external field. Let $\delta_{+-}$ and $\delta_{-+}$ be the Dirac measures at the alternating configurations $\omega^{+-}$ and $\omega^{-+}$ which are defined by

$$\omega^{+-}_i = \omega^{-+}_{i+1} = \begin{cases} 1 & \text{if } i \text{ is even}, \\ -1 & \text{if } i \text{ is odd}. \end{cases}$$

It is easily verified that $\delta_{+-}, \delta_{-+} \in \operatorname{ex}\mathscr{G}(\gamma)$ and $(\delta_{+-} + \delta_{-+})/2 \in \operatorname{ex}\mathscr{G}_\Theta(\gamma)$. Thus $\operatorname{ex}\mathscr{G}_\Theta(\gamma) \setminus \operatorname{ex}\mathscr{G}(\gamma) \ne \emptyset$. ○

A further, more interesting example with the same property is provided by the two-dimensional Ising antiferromagnet at low but positive temperatures; cf. the discussion at the end of Section 6.2. Also, in (16.30) we shall obtain a general theorem on the existence of ergodic but non-extreme Gibbs measures. Thus, an extreme element $\mu$ of $\mathscr{G}_\Theta(\gamma)$ is always a pure state in the sense of
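Definition (14.6) but not necessarily a phase in the sense of Comment (7.8). In spite of this, any such $\mu$ is often called a pure phase.

We emphasize that Theorem (14.15) relies primarily on Proposition (14.9). In addition, this proposition implies that the ergodic components of a shift-invariant Gibbs measure are also Gibbs measures. More precisely, if $(E,\mathscr{E})$ is standard Borel then assertion (14.15)(c) can be sharpened as follows.

(14.17) Theorem. Suppose $(E,\mathscr{E})$ is a standard Borel space and $\gamma$ a shift-invariant specification with $\mathscr{G}_\Theta(\gamma) \ne \emptyset$. Let $\mu \in \mathscr{G}_\Theta(\gamma)$ and $w_\mu$ be the unique probability measure on $\operatorname{ex}\mathscr{P}_\Theta(\Omega,\mathscr{F})$ which represents $\mu$; cf. Theorem (14.10). Then $w_\mu$ is supported on $\operatorname{ex}\mathscr{G}_\Theta(\gamma)$. Consequently, $\mu$ has a unique extreme decomposition within $\mathscr{G}_\Theta(\gamma)$, namely

$$\mu = \int_{\operatorname{ex}\mathscr{G}_\Theta(\gamma)} \nu \, w_\mu(d\nu).$$

Proof. Let $\mu \in \mathscr{G}_\Theta(\gamma)$ be given. In view of Theorem (14.10), the representing weight $w_\mu$ on $\operatorname{ex}\mathscr{P}_\Theta(\Omega,\mathscr{F})$ is defined in terms of $\mu$ via a $(\mathscr{P}_\Theta(\Omega,\mathscr{F}), \mathscr{I})$-kernel $\pi^\cdot$. We thus need only show that $\pi^\cdot \in \mathscr{G}_\Theta(\gamma)$ $\mu$-a.s. (In other words, any $(\mathscr{P}_\Theta(\Omega,\mathscr{F}), \mathscr{I})$-kernel is also a $(\mathscr{G}_\Theta(\gamma), \mathscr{I})$-kernel.) Since $\pi^\cdot \in \mathscr{P}_\Theta(\Omega,\mathscr{F})$ $\mu$-a.s., it is sufficient to check that $\pi^\cdot \in \mathscr{G}(\gamma)$ $\mu$-a.s. Let $\mathscr{C}$ be any countable generator of $\mathscr{F}$ which is stable under finite intersections. Such a $\mathscr{C}$ exists because $(\Omega,\mathscr{F})$ is standard Borel. For each $A \in \mathscr{C}$ and $\Lambda \in \mathscr{S}$ we can write, using Proposition (14.9) for the third and fifth equality,

$$\pi^\cdot\gamma_\Lambda(A) = \mu(\gamma_\Lambda(A|\cdot)\,|\,\mathscr{I}) = \mu(\mu(A|\mathscr{T}_\Lambda)\,|\,\mathscr{I}) = \mu(\mu(\mu(A|\mathscr{T}_\Lambda)\,|\,\mathscr{T})\,|\,\mathscr{I}) = \mu(\mu(A|\mathscr{T})\,|\,\mathscr{I}) = \mu(A|\mathscr{I}) = \pi^\cdot(A) \quad \mu\text{-a.s.}$$

Thus

$$\mu(\pi^\cdot \in \mathscr{G}(\gamma)) = \mu\bigl(\pi^\cdot\gamma_\Lambda(A) = \pi^\cdot(A) \text{ for all } A \in \mathscr{C} \text{ and } \Lambda \in \mathscr{S}\bigr) = 1,$$

and the proof is complete. □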
(14.18) Corollary. Let $(E,\mathscr{E})$ be a standard Borel space, $\gamma$ a shift-invariant specification, and $N \ge 1$ an integer. Then $|\operatorname{ex}\mathscr{G}_\Theta(\gamma)| \ge N$ if and only if $\mathscr{G}_\Theta(\gamma)$ contains at least $N$ linearly independent elements.

Proof. Just as Corollary (7.29). □

Theorem (14.17) and the preceding corollary still hold when the shift-group $\Theta$ is replaced by any finitely generated abelian subgroup of $T$ which contains a non-trivial shift. This follows from the fact that Theorems (14.5), (14.10), and (14.15) as well as Proposition (14.9) remain valid under this replacement.

In (7.12) we have seen that each extreme Gibbs measure is a limit of
finite volume Gibbs distributions with suitable boundary conditions. What about ergodic Gibbs measures? The ergodic theorem (14.A8) provides us with the following preliminary answer: If $\mu \in \operatorname{ex}\mathscr{P}_\Theta(\Omega,\mathscr{F})$ and $(\Lambda_n)_{n \ge 1}$ is any increasing sequence of cubes with $|\Lambda_n| \to \infty$ then

$$\mu(f) = \lim_{n\to\infty} |\Lambda_n|^{-1} \sum_{i \in \Lambda_n} f(\theta_i\omega)$$

for $\mu$-almost all $\omega$ and any bounded measurable function $f$. Thus

$$\mu = \lim_{n\to\infty} |\Lambda_n|^{-1} \sum_{i \in \Lambda_n} \delta_{\theta_i\omega} \quad \text{for } \mu\text{-a.a. } \omega$$

in any topology which is generated by countably many evaluation mappings $\nu \to \nu(f)$. (When $E$ is compact metrizable, the weak topology has this property.)

The above approximation result is unsatisfactory when $\mu$ is an ergodic Gibbs measure for $\gamma$. In this case, the approximating measures should be related to $\gamma$. We will give such an approximation result in the final theorem of this section. In it we will show that each $\mu \in \operatorname{ex}\mathscr{G}_\Theta(\gamma)$ is the limit of averaged finite volume Gibbs distributions with suitable boundary configurations. For the proof we shall need the following refinement of the backward martingale convergence theorem.

(14.19) Lemma. Let $(\Omega,\mathscr{F},\mu)$ be a probability space, $(f_n)_{n \ge 1}$ a sequence of measurable real functions on $\Omega$, and $(\mathscr{A}_n)_{n \ge 1}$ a decreasing sequence of σ-subalgebras of $\mathscr{F}$. Suppose (i) $|f_n| \le g$ for all $n$ and some integrable function $g$, and (ii) $f_\infty = \lim_{n\to\infty} f_n$ exists $\mu$-a.s. Then

$$\mu\Bigl(f_\infty \,\Big|\, \bigcap_{n \ge 1} \mathscr{A}_n\Bigr) = \lim_{n\to\infty} \mu(f_n | \mathscr{A}_n) \quad \mu\text{-a.s.}$$
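Proof. We claim that

$$\mu\Bigl(f_\infty \,\Big|\, \bigcap_{n \ge 1} \mathscr{A}_n\Bigr) \le \liminf_{n\to\infty} \mu(f_n | \mathscr{A}_n) \quad \mu\text{-a.s.}$$

Once this is proved, the lemma will follow by the same argument which is used to derive the dominated convergence theorem from Fatou's lemma. For each $N \ge 1$ we put $g_N = \inf_{n \ge N} f_n$. Since $|g_N| \le g$, $g_N$ is integrable. The backward martingale convergence theorem gives

$$\mu\Bigl(g_N \,\Big|\, \bigcap_{n \ge 1} \mathscr{A}_n\Bigr) = \lim_{n\to\infty} \mu(g_N | \mathscr{A}_n) \le \liminf_{n\to\infty} \mu(f_n | \mathscr{A}_n) \quad \mu\text{-a.s.}$$

Since $g_N \uparrow f_\infty$ as $N \uparrow \infty$, the result follows in the limit $N \to \infty$. □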
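(14.20) Theorem. Let $\gamma$ be a shift-invariant specification, $\mu \in \operatorname{ex}\mathscr{G}_\Theta(\gamma)$, and $\Lambda_n = [-n,n]^d \cap S$ for all $n \ge 1$. Then the following conclusions hold in the limit $n \to \infty$.

(a) $\gamma_{\Lambda_n}\Bigl(|\Lambda_n|^{-1} \sum_{i \in \Lambda_n} f \circ \theta_i\Bigr) \to \mu(f)$ $\mu$-a.s. for all bounded measurable functions $f$ on $\Omega$.

(b) If $E$ is a compact metric space and $\mathscr{E}$ its Borel σ-algebra then $|\Lambda_n|^{-1} \sum_{i \in \Lambda_n} \gamma_{\Lambda_n + i}(\cdot\,|\theta_i\omega) \to \mu$ weakly for $\mu$-a.a. $\omega$.

(c) If $\gamma$ is a $\lambda$-specification for some $\lambda \in \mathscr{M}(E,\mathscr{E})$ then $|\Lambda_n|^{-1} \sum_{i \in \Lambda_n} \gamma_{\Lambda_n + i}(\cdot\,|\theta_i\omega) \to \mu$ in the $\mathscr{L}$-topology for $\mu$-a.a. $\omega$.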
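Proof. (a) For given $f$ and $n$, we put

$$f_n = |\Lambda_n|^{-1} \sum_{i \in \Lambda_n} f \circ \theta_i.$$

The ergodic theorem (14.A8) ensures that $\mu(f) = \lim_{n\to\infty} f_n$ $\mu$-a.s. On the other hand, since $\mu \in \mathscr{G}(\gamma)$ we have

$$\gamma_{\Lambda_n}(f_n|\cdot) = \mu(f_n | \mathscr{T}_{\Lambda_n}) \quad \mu\text{-a.s.}$$

Assertion (a) thus follows from Lemma (14.19).

(b) In view of equation (5.5), statement (b) is an immediate consequence of (a). (Cf. the proof of Theorem (7.12)(b).)

(c) We can assume without loss that $\lambda \in \mathscr{P}(E,\mathscr{E})$, cf. Remark (1.28)(3). Let $\rho$ be a $\lambda$-modification with $\gamma = \rho\lambda$, and let $\Delta \in \mathscr{S}$ be fixed. We set $\Lambda'_n = \{i \in \Lambda_n : \Lambda_n + i \supset \Delta\}$, and we consider the function

$$\rho^n_\Delta(\zeta,\omega) = |\Lambda'_n|^{-1} \sum_{i \in \Lambda'_n} \lambda_{(\Lambda_n + i) \setminus \Delta}\bigl(\rho_{\Lambda_n + i} \,\big|\, \zeta_\Delta (\theta_i\omega)_{S \setminus \Delta}\bigr) \qquad (\zeta, \omega \in \Omega)$$

on $\Omega^2$. By definition, we have

(14.21) $|\Lambda'_n|^{-1} \sum_{i \in \Lambda'_n} \gamma_{\Lambda_n + i}(f|\theta_i\omega) = \int \lambda^S(d\zeta)\, f(\zeta)\, \rho^n_\Delta(\zeta,\omega)$

for all $f \in \mathscr{L}_\Delta$. We will show that $\rho^n_\Delta(\cdot,\omega)$ converges in $L^1(\lambda^S)$-norm for $\mu$-almost all $\omega$ when $n$ tends to infinity. To this end we introduce the probability measure $\nu = \lambda^S \times \mu$ on $(\Omega,\mathscr{F})^2$, the measurable function $\rho_\Delta(\zeta,\omega) = \rho_\Delta(\zeta_\Delta\omega_{S \setminus \Delta})$ on $\Omega^2$, and the measurable transformations $\bar\theta_i(\zeta,\omega) = (\zeta, \theta_i\omega)$ of $\Omega^2$. (Here $\zeta, \omega \in \Omega$ and $i \in S$.) We claim that

(14.22) $\rho^n_\Delta = \nu\Bigl(|\Lambda'_n|^{-1} \sum_{i \in \Lambda'_n} \rho_\Delta \circ \bar\theta_i \,\Big|\, \mathscr{F}_\Delta \times \mathscr{T}_{\Lambda_n}\Bigr)$ $\nu$-a.s.

for all $n$. In order to prove this we pick any $n$ and $i \in \Lambda'_n$, and we let $A \in \mathscr{F}_\Delta$ and $B \in \mathscr{T}_{\Lambda_n}$ be given. Then

$$\begin{aligned}
\int_{A \times B} \nu(d\zeta,d\omega)\, \lambda_{(\Lambda_n + i) \setminus \Delta}\bigl(\rho_{\Lambda_n + i} \,\big|\, \zeta_\Delta(\theta_i\omega)_{S \setminus \Delta}\bigr)
&= \int_B \mu(d\omega)\, \gamma_{\Lambda_n + i}(A|\theta_i\omega) \\
&= \int_{\theta_i B} \mu(d\omega)\, \gamma_{\Lambda_n + i}(A|\omega) \\
&= \mu(A \cap \theta_i B) \\
&= \mu\gamma_\Delta(A \cap \theta_i B) \\
&= \int_{\theta_i B} \mu(d\omega) \int_A \lambda^S(d\zeta)\, \rho_\Delta(\zeta,\omega) \\
&= \int_{A \times B} \nu(d\zeta,d\omega)\, \rho_\Delta \circ \bar\theta_i(\zeta,\omega).
\end{aligned}$$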
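The shift-invariance of $\mu$ has been used in the second and the last step. Since each term in the sum defining $\rho^n_\Delta$ is measurable with respect to $\mathscr{F}_\Delta \times \mathscr{T}_{\Lambda_n}$, the equation above proves that (14.22) holds as claimed.

Next we look at the right side of (14.22). As $\nu$ is invariant under $(\bar\theta_i)_{i \in S}$, the ergodic theorem (14.A8) implies that

(14.23) $\lim_{n\to\infty} |\Lambda'_n|^{-1} \sum_{i \in \Lambda'_n} \rho_\Delta \circ \bar\theta_i = \nu(\rho_\Delta | \bar{\mathscr{I}})$ $\nu$-a.s.

Here $\bar{\mathscr{I}}$ is the σ-algebra of all $(\bar\theta_i)_{i \in S}$-invariant events in $\mathscr{F} \times \mathscr{F}$. Letting

$$\bar\rho_\Delta(\zeta) = \int \mu(d\eta)\, \rho_\Delta(\zeta,\eta) \qquad (\zeta \in \Omega),$$

we conclude from the ergodicity of $\mu$ that

(14.24) $\nu(\rho_\Delta | \bar{\mathscr{I}})(\zeta,\omega) = \bar\rho_\Delta(\zeta)$ for $\nu$-a.a. $(\zeta,\omega)$.

Indeed, let $g$ be any bounded $\bar{\mathscr{I}}$-measurable function. Applying Remark (14.3)(1) twice we see that, for any $\zeta \in \Omega$, $g(\zeta,\cdot)$ is $\mathscr{I}$-measurable and thereby constant $\mu$-almost surely. We thus obtain

$$\nu(g\rho_\Delta) = \int \lambda^S(d\zeta) \int \mu(d\eta) \int \mu(d\omega)\, g(\zeta,\omega)\, \rho_\Delta(\zeta,\eta) = \int \nu(d\zeta,d\omega)\, g(\zeta,\omega)\, \bar\rho_\Delta(\zeta).$$

This proves (14.24).

We now combine equations (14.22) to (14.24) with Lemma (14.19). Since the function $(\zeta,\omega) \to \bar\rho_\Delta(\zeta)$ is measurable with respect to $\bigcap_{n \ge 1} \mathscr{F}_\Delta \times \mathscr{T}_{\Lambda_n}$, we get the result that

$$\lim_{n\to\infty} \rho^n_\Delta(\zeta,\omega) = \bar\rho_\Delta(\zeta) \quad \text{for } \nu\text{-a.a. } (\zeta,\omega).$$

Finally, we note that the functions $\rho^n_\Delta(\cdot,\omega)$ and $\bar\rho_\Delta$ are probability densities relative to $\lambda^S$. Using Fatou's lemma in the same way as in the proof of Theorem (7.12)(c), we therefore obtain that

$$\lim_{n\to\infty} \lambda^S\bigl(|\rho^n_\Delta(\cdot,\omega) - \bar\rho_\Delta|\bigr) = 0$$

for all $\omega$ in a set $\Omega_\Delta \in \mathscr{F}$ with $\mu(\Omega_\Delta) = 1$. As $\mu \in \mathscr{G}(\gamma)$, $\bar\rho_\Delta$ is the probability density of $\mu|\mathscr{F}_\Delta$ relative to $\lambda^S|\mathscr{F}_\Delta$. Consequently, the left side of (14.21) tends to $\mu(f)$ for all $f \in \mathscr{L}_\Delta$ and $\omega \in \Omega_\Delta$. Since $|\Lambda_n \setminus \Lambda'_n| / |\Lambda_n| \to 0$ and $\Delta$ is arbitrary, assertion (c) follows. □

Theorems (14.17) and (14.20) have a corollary which is similar to (7.30). Suppose $\gamma$ is a quasilocal shift-invariant specification. We let $\mathscr{G}_{\Theta,\lim}(\gamma)$ denote the set of all $\mu \in \mathscr{P}(\Omega,\mathscr{F})$ which are such that

$$\mu = \lim_{n\to\infty} |\Lambda_n|^{-1} \sum_{i \in \Lambda_n} \gamma_{\Lambda_n + i}(\cdot\,|\theta_i\omega)$$

for a suitable $\omega \in \Omega$. Here $\Lambda_n = [-n,n]^d \cap S$, and the convergence is understood in the sense of the $\mathscr{L}$-topology. Each $\mu \in \mathscr{G}_{\Theta,\lim}(\gamma)$ is called an averaged limiting Gibbs measure. It is easily checked that $\mathscr{G}_{\Theta,\lim}(\gamma) \subset \mathscr{P}_\Theta(\Omega,\mathscr{F})$. By an argument of Example (5.20)(1), it even follows that $\mathscr{G}_{\Theta,\lim}(\gamma) \subset \mathscr{G}_\Theta(\gamma)$. (The last inclusion even holds when $\mathscr{G}_{\Theta,\lim}(\gamma)$ is defined to be the set of all cluster points of averaged Gibbs distributions with possibly varying boundary conditions.)

(14.25) Corollary. Let $(E,\mathscr{E})$ be a standard Borel space, $\lambda \in \mathscr{M}(E,\mathscr{E})$, and $\gamma$ a shift-invariant quasilocal $\lambda$-specification. Then $\mathscr{G}_\Theta(\gamma)$ is the closed convex hull of $\mathscr{G}_{\Theta,\lim}(\gamma)$ (in the $\mathscr{L}$-topology).

Proof. By virtue of Theorems (14.17) and (14.20), the proof of Corollary (7.30) applies unchanged. □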
14.A Appendix. The multidimensional ergodic theorem
Here we will prove the ergodic theorem which has been used in this chapter and will also be applied later on. Let $(\Omega,\mathscr{F},\mu)$ be a probability space, $d \ge 1$, and $\Theta = (\theta_i)_{i \in \mathbb{Z}^d}$ be a group of measurable $\mu$-preserving transformations of $\Omega$ which satisfy $\theta_i \circ \theta_j = \theta_{i+j}$ for all $i, j \in \mathbb{Z}^d$. Let $\mathscr{I} \subset \mathscr{F}$ be the σ-algebra of all $\Theta$-invariant events. Also, let $(\Lambda_n)_{n \ge 1}$ be any sequence of cubes in $\mathbb{Z}^d$ such that $|\Lambda_n| \to \infty$ as $n \to \infty$. For a given measurable function $f: \Omega \to \mathbb{R}$ we put

(14.A1) $R_n f = |\Lambda_n|^{-1} \sum_{i \in \Lambda_n} f \circ \theta_i \qquad (n \ge 1).$

We investigate the limiting behaviour of $R_n f$ as $n \to \infty$. We shall start by proving mean square convergence. Then we shall obtain convergence in mean, and finally we shall prove almost sure convergence. To prove convergence in $L^2(\mu)$ we shall apply the following Hilbert space lemma.

(14.A2) Lemma. Let $\mathscr{H}$ be a Hilbert space with norm $\|\cdot\|$ and $\mathscr{C}$ be a non-empty closed convex subset of $\mathscr{H}$. Then there exists a unique element $f \in \mathscr{C}$ such that $\|f\| = \min_{g \in \mathscr{C}} \|g\|$. Moreover, if $(f_n)_{n \ge 1}$ is any sequence in $\mathscr{C}$ such that $\lim_{n\to\infty} \|f_n\| = \|f\|$ then $\lim_{n\to\infty} \|f_n - f\| = 0$.

Proof. Define $c = \inf_{g \in \mathscr{C}} \|g\|$. Take any sequence $(f_n)_{n \ge 1}$ in $\mathscr{C}$ with $\lim_{n\to\infty} \|f_n\| = c$. By hypothesis, $\|\cdot\|$ satisfies the parallelogram law which can be written in the form

$$\|f - g\|^2/4 = (\|f\|^2 + \|g\|^2)/2 - \|(f + g)/2\|^2 \qquad (f, g \in \mathscr{H}).$$

Consequently, if $\varepsilon > 0$ is arbitrarily fixed and $m, n \ge 1$ are so large that $\|f_m\|^2 \le c^2 + \varepsilon$ and $\|f_n\|^2 \le c^2 + \varepsilon$ then $\|f_n - f_m\|^2/4 \le \varepsilon$ because $(f_m + f_n)/2 \in \mathscr{C}$. This shows that $(f_n)_{n \ge 1}$ is a Cauchy sequence. Let $f$ be its limit. By continuity, $\|f\| = c$. As $\mathscr{C}$ is closed, $f \in \mathscr{C}$. If $g \in \mathscr{C}$ is an arbitrary element with $\|g\| = c$ then the parallelogram law again implies that $\|f - g\| = 0$. Hence $f$ is unique. □

The theorem below is usually referred to as the $L^2$ ergodic theorem.

(14.A3) Theorem. For each measurable $f$ with $\mu(|f|^2) < \infty$,

$$\lim_{n\to\infty} \mu\bigl(|R_n f - \mu(f|\mathscr{I})|^2\bigr) = 0.$$
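Proof. We consider the Hilbert space $L^2(\mu)$ of all (equivalence classes of) square integrable functions with norm $\|g\|_2 = \mu(|g|^2)^{1/2}$, $g \in L^2(\mu)$. Let $\mathscr{C} \subset L^2(\mu)$ be the closed convex hull of the set $\{f \circ \theta_j : j \in \mathbb{Z}^d\}$. Define $c = \inf_{g \in \mathscr{C}} \|g\|_2$. We will show that

(14.A4) $\lim_{n\to\infty} \|R_n f\|_2 = c.$

This will prove the theorem. Indeed, Lemma (14.A2) then will imply that

$$\lim_{n\to\infty} \|R_n f - \bar f\|_2 = 0,$$

where $\bar f$ is the unique element of $\mathscr{C}$ such that $\|\bar f\|_2 = c$. $\bar f$ is nothing other than the conditional expectation $\mu(f|\mathscr{I})$. For let $i \in \mathbb{Z}^d$ be given. Since $\bar f \circ \theta_i \in \mathscr{C}$ and $\|\bar f \circ \theta_i\|_2 = \|\bar f\|_2 = c$, Lemma (14.A2) shows that $\bar f \circ \theta_i = \bar f$ in $L^2(\mu)$. In view of Remark (14.3)(2), this implies that $\bar f$ has an $\mathscr{I}$-measurable representative. Further, if $g$ is any $\mathscr{I}$-measurable function in $L^2(\mu)$ then

$$|\mu((f - \bar f)g)| = \lim_{n\to\infty} |\mu((R_n f - \bar f)g)| \le \|g\|_2 \lim_{n\to\infty} \|R_n f - \bar f\|_2 = 0.$$

This proves that $\bar f = \mu(f|\mathscr{I})$.

We now turn to the proof of (14.A4). Let $\varepsilon > 0$ be given. By the definition of $\mathscr{C}$, we can find a finite set $\Delta \subset \mathbb{Z}^d$ and nonnegative numbers $t_j$, $j \in \Delta$, with $\sum_{j \in \Delta} t_j = 1$ such that the function

$$g = \sum_{j \in \Delta} t_j\, f \circ \theta_j$$

is close to $\bar f$, in that $\|g - \bar f\|_2 \le \varepsilon$. Clearly, $\|g\|_2 \le c + \varepsilon$. Moreover, for each $n \ge 1$ we can write

$$\begin{aligned}
\|R_n f - R_n g\|_2 &= \Bigl\| \sum_{j \in \Delta} t_j \bigl[R_n f - R_n(f \circ \theta_j)\bigr] \Bigr\|_2 \\
&\le \sum_{j \in \Delta} t_j\, \|R_n f - R_n(f \circ \theta_j)\|_2 \\
&= |\Lambda_n|^{-1} \sum_{j \in \Delta} t_j\, \Bigl\| \sum_{i \in \Lambda_n} f \circ \theta_i - \sum_{i \in \Lambda_n + j} f \circ \theta_i \Bigr\|_2 \\
&\le \|f\|_2\, |\Lambda_n|^{-1} \sum_{j \in \Delta} t_j\, |(\Lambda_n + j) \,\triangle\, \Lambda_n|.
\end{aligned}$$

The last expression is less than $\varepsilon$ when $n$ is sufficiently large. For such $n$ we have

$$c \le \|R_n f\|_2 \le \|R_n g\|_2 + \|R_n f - R_n g\|_2 \le \|g\|_2 + \varepsilon \le c + 2\varepsilon.$$

This proves (14.A4) and thereby the theorem. □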
As an immediate consequence, we obtain the mean ergodic theorem.

(14.A5) Corollary. Suppose that $\mu(|f|) < \infty$. Then

$$\lim_{n\to\infty} \mu\bigl(|R_n f - \mu(f|\mathscr{I})|\bigr) = 0.$$

Proof. In view of Theorem (14.A3) and Jensen's inequality, the conclusion of the corollary holds whenever $f \in L^2(\mu)$. Since $L^2(\mu)$ is dense in $L^1(\mu)$, this result can be extended to all $f \in L^1(\mu)$ by a straightforward application of the inequalities

$$\mu(|R_n f - R_n g|) \le \mu(|f - g|) \quad \text{and} \quad \mu\bigl(|\mu(f|\mathscr{I}) - \mu(g|\mathscr{I})|\bigr) \le \mu(|f - g|), \qquad f, g \in L^1(\mu).$$ □

Our next objective is the individual ergodic theorem which ensures the almost sure convergence of $R_n f$. We still assume that $(\Lambda_n)_{n \ge 1}$ is a sequence of cubes with $|\Lambda_n| \to \infty$. In addition, we now impose the condition that $(\Lambda_n)_{n \ge 1}$ be increasing. (But $(\Lambda_n)_{n \ge 1}$ is not required to exhaust the lattice $\mathbb{Z}^d$.) The following maximal inequality is the key to the individual ergodic theorem.
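(14.A6) Lemma. Suppose that $\mu(|f|) < \infty$ and $c > 0$. Then

$$\mu\Bigl(\sup_{n \ge 1} |R_n f| > c\Bigr) \le 3^d \mu(|f|)/c.$$

Proof. Since $|R_n f| \le R_n|f|$, we can assume without loss that $f \ge 0$. We also can and do assume that $0 \in \Lambda_1$, and thus $0 \in \Lambda_n$ for all $n$. (Otherwise we translate all $\Lambda_n$'s by a suitable vector and use the $\Theta$-invariance of $\mu$.) Let $N \ge 1$ be an integer, $\Delta$ any cube in $\mathbb{Z}^d$, and $\omega \in \Omega$ a fixed configuration. We define

$$A_N = \Bigl\{\sup_{n \le N} R_n f > c\Bigr\} \quad \text{and} \quad \Delta_\omega = \{i \in \Delta : \theta_i\omega \in A_N\}.$$

For each $i \in \Delta_\omega$ there is some $n(i,\omega) \le N$ such that $R_{n(i,\omega)} f(\theta_i\omega) > c$. With the notation $U_{i,\omega} = \Lambda_{n(i,\omega)} + i$, the last inequality can be rewritten as

(14.A7) $\sum_{j \in U_{i,\omega}} f(\theta_j\omega) > c\,|U_{i,\omega}|.$

To exploit this inequality we intend to cover a positive percentage of $\Delta_\omega$ with disjoint members of the family $(U_{i,\omega})_{i \in \Delta_\omega}$. For each $i \in \Delta_\omega$ we let $V_{i,\omega}$ be the cube in $\mathbb{Z}^d$ which is concentric with $U_{i,\omega}$ and satisfies $|V_{i,\omega}| = 3^d|U_{i,\omega}|$. More precisely, if $r \ge 1$ and $j \in \mathbb{Z}^d$ are such that $U_{i,\omega} = ([0,r[^d \cap \mathbb{Z}^d) + j$ then $V_{i,\omega} = ([-r,2r[^d \cap \mathbb{Z}^d) + j$. We define a finite subset $W_\omega = \{i_1, \ldots, i_k\}$ of $\Delta_\omega$ recursively as follows. Let $i_1 \in \Delta_\omega$ be such that $n(i_1,\omega) = \max\{n(i,\omega) : i \in \Delta_\omega\}$. If $i_1, \ldots, i_\ell$ are already defined and $\Delta_\omega \setminus \bigcup_{m=1}^{\ell} V_{i_m,\omega} \ne \emptyset$ then we take any $i_{\ell+1} \in \Delta_\omega \setminus \bigcup_{m=1}^{\ell} V_{i_m,\omega}$ such that

$$n(i_{\ell+1},\omega) = \max\Bigl\{n(i,\omega) : i \in \Delta_\omega \setminus \bigcup_{m=1}^{\ell} V_{i_m,\omega}\Bigr\}.$$

Since $i_\ell \in U_{i_\ell,\omega} \subset V_{i_\ell,\omega}$, this recursion stops after finitely many steps. The resulting set $W_\omega$ is such that

$$\bigcup_{i \in W_\omega} V_{i,\omega} \supset \Delta_\omega.$$

Moreover, it is easily seen that the sets $U_{i,\omega}$, $i \in W_\omega$, are pairwise disjoint. We let $U_\omega$ denote their union. Then

$$3^d|U_\omega| = \sum_{i \in W_\omega} 3^d|U_{i,\omega}| = \sum_{i \in W_\omega} |V_{i,\omega}| \ge |\Delta_\omega|,$$

and a summation of (14.A7) over all $i \in W_\omega$ gives

$$\sum_{j \in U_\omega} f(\theta_j\omega) \ge c\,|U_\omega|.$$

Since $U_\omega \subset \Delta + \Lambda_N = \{i + j : i \in \Delta, j \in \Lambda_N\}$ and $f \ge 0$, we arrive at the key inequality

$$\sum_{j \in \Delta + \Lambda_N} f(\theta_j\omega) \ge c\,3^{-d}|\Delta_\omega|.$$

The rest of the proof is easy. As

$$|\Delta_\omega| = \sum_{i \in \Delta} 1_{A_N}(\theta_i\omega),$$

an integration over $\omega$ with respect to $\mu$ yields $|\Delta + \Lambda_N|\,\mu(f) \ge c\,3^{-d}|\Delta|\,\mu(A_N)$. Letting first $\Delta$ run through any cofinal sequence of cubes and then $N$ tend to infinity, we finally obtain the desired maximal inequality. □

The following is the multidimensional individual ergodic theorem. As before, we let $(\Lambda_n)_{n \ge 1}$ be an increasing sequence of cubes in $\mathbb{Z}^d$ such that $|\Lambda_n| \to \infty$ as $n \to \infty$. $R_n f$ is defined by (14.A1).

(14.A8) Theorem. For any measurable $f$ with $\mu(|f|) < \infty$,

$$\lim_{n\to\infty} R_n f = \mu(f|\mathscr{I}) \quad \mu\text{-a.s.}$$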
n-as..
n-*ao
Proof. Let e > 0 be given. We first choose a bounded measurable function g with ji(\f — g\) ^ e23~d. Then we apply the mean ergodic theorem (14.A5) to obtain an integer N ^ 1 such that H(\RNg-ti{g\J)\)£e23-i. By Lemma (14.A6), AM s u p | Ä n ( / - 0 ) l > e j ^ £ and J sup \Rn(RNg - ii(g\J))\ > e) g e. We also have n(\n(f — g\
limsuplAJ- 1 n->oo
£
|0o0.|
ie(A n +j')AA„
= 0
because g is bounded. Since R„/j,(g\J) = n(g\S) /x-a.s. for all n, we conclude that
Appendix. The multidimensional ergodic theorem
lim&up
307
\Rnf-n(f\S)\
n-^co
^\imsup\R„(f-g)\ n-^oo
+ lim sup \R„(RNg - v(g\s))\ + W-
g\s)\.
Combining this with the estimates above we obtain J lim sup \Rnf-n(f\J)\
> 3É ) ^ 3a.
The proof is completed by letting e tend to zero.
•
A final remark is in order. In the above we have always assumed that each $\Lambda_n$ is a cube in $\mathbb{Z}^d$. This was just for simplicity. It is not difficult to extend all proofs of this appendix to sequences $(\Lambda_n)_{n \ge 1}$ of a more general shape.
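The convergence asserted in Theorem (14.A8) can be observed numerically in a simple case. The following Python sketch is only an illustration under stated assumptions: it takes $d = 2$, uses an i.i.d. Bernoulli field as an example of an ergodic measure, evaluates the local function $f(\omega) = \omega_0\,\omega_{e_1}$, and adopts the shift convention $(\theta_i\omega)_j = \omega_{j+i}$. The function names are hypothetical and not taken from the text. In this setting $\mu(f) = p^2$, so the averages $R_n f$ over the cubes $\Lambda_n = [-n,n]^2 \cap \mathbb{Z}^2$ should approach $p^2$.

```python
import random


def sample_iid_field(n, p, rng):
    """Sample i.i.d. Bernoulli(p) values omega_j in {0, 1} on the box [-n, n+1] x [-n, n]."""
    return {
        (x, y): 1 if rng.random() < p else 0
        for x in range(-n, n + 2)
        for y in range(-n, n + 1)
    }


def shifted_f(omega, i):
    """Value f(theta_i omega) of the local function f(omega) = omega_0 * omega_{e_1}.

    Assumed convention (for illustration only): (theta_i omega)_j = omega_{j+i}.
    """
    x, y = i
    return omega[(x, y)] * omega[(x + 1, y)]


def ergodic_average(omega, n):
    """R_n f(omega) = |Lambda_n|^{-1} * sum over i in Lambda_n = [-n, n]^2 of f(theta_i omega)."""
    cube = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]
    return sum(shifted_f(omega, i) for i in cube) / len(cube)


def run_demo(p=0.5, sizes=(5, 20, 60), seed=0):
    """Return {n: R_n f(omega)} for one sampled configuration omega."""
    rng = random.Random(seed)
    omega = sample_iid_field(max(sizes), p, rng)
    return {n: ergodic_average(omega, n) for n in sizes}


if __name__ == "__main__":
    for n, value in run_demo().items():
        print(f"n = {n:3d}   R_n f = {value:.4f}   (mu(f) = 0.25)")
```

The same configuration is reused for all cube sizes, so the printed values follow one sample path of $R_n f$, as in the statement of the theorem.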
Chapter 15 The specific free energy and its minimization
A cornerstone of the Gibbsian theory is the variational principle for homogeneous Gibbs measures. It asserts that the simplex $\mathscr{G}_\Theta(\Phi)$ of all homogeneous Gibbs measures for a shift-invariant potential $\Phi$ coincides with the class of all homogeneous random fields which minimize the specific free energy, viz. the difference of the specific (internal) energy relative to $\Phi$ and the specific entropy. The minimality of the free energy is precisely the classical equilibrium condition for a physical system at constant temperature. The variational principle thus supports the common belief that the Gibbs measures provide a proper description of physical systems in thermodynamic equilibrium.

In fact, the variational principle for Gibbs measures is an extension to infinite homogeneous systems of an elementary variational principle for finite systems. Let us describe the latter in order to get some feeling for the subject. Suppose $S$ is a finite set. For simplicity, we will also assume that the state space $E$ is finite. Thus $\Omega = E^S$ is finite. Let $\Phi$ be any potential and $H = \sum_{\emptyset \ne A \subset S} \Phi_A$ the associated Hamiltonian. The unique Gibbs measure for $\Phi$ is then given by

$$\nu(\omega) = Z^{-1} \exp[-H(\omega)] \qquad (\omega \in \Omega).$$

Here $Z = \sum_{\omega \in \Omega} \exp[-H(\omega)]$ is the partition function. For each probability vector $\mu$ on $\Omega$ we let

$$\mu(H) = \sum_{\omega \in \Omega} \mu(\omega) H(\omega) \quad \text{and} \quad \mathscr{H}(\mu) = -\sum_{\omega \in \Omega} \mu(\omega) \log \mu(\omega)$$

be the energy and the entropy of $\mu$, respectively. The difference $\mu(H) - \mathscr{H}(\mu)$ is called the free energy of $\mu$. The variational principle for finite systems now reads as follows: For all probability vectors $\mu$ on $\Omega$, the inequality

$$\mu(H) - \mathscr{H}(\mu) \ge -\log Z$$
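holds true. Equality holds if and only if $\mu = \nu$. Indeed, for each $\mu$ we can write, applying Jensen's inequality to the convex function $\varphi(x) = x \log x$,

$$\begin{aligned}
\mu(H) - \mathscr{H}(\mu) + \log Z &= \sum_{\omega \in \Omega} \mu(\omega) \log \frac{\mu(\omega)}{\nu(\omega)} \\
&= \sum_{\omega \in \Omega} \nu(\omega)\, \varphi\bigl(\mu(\omega)/\nu(\omega)\bigr) \\
&\ge \varphi\Bigl(\sum_{\omega \in \Omega} \nu(\omega)\, \mu(\omega)/\nu(\omega)\Bigr) = \varphi(1) = 0.
\end{aligned}$$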
Since $\varphi$ is strictly convex, equality holds if and only if the function $\omega \to \mu(\omega)/\nu(\omega)$ is constant, and the latter means that $\mu = \nu$. Let us note that the second expression above is just the relative entropy of $\mu$ with respect to $\nu$. This observation will be the starting point of our approach to the variational principle for infinite systems.

The variational principle will be stated and proved in Section 15.4. In the first section we shall deal with some basic properties of relative entropy. These will be used in Section 15.2 to prove the existence of the specific entropy $h$ per site relative to a given a priori measure on the state space. $h$ will turn out to be an affine, upper semicontinuous functional on the simplex of all homogeneous random fields. For standard Borel state spaces, we shall derive the surprising fact that $h$ can be represented as the integral of a suitable measurable function. Then, in Section 15.3, we shall turn to the specific free energy of a homogeneous random field relative to a homogeneous potential $\Phi$.
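The finite variational principle described above is easy to check numerically. The following Python sketch is a minimal illustration under assumptions that are not part of the text: it uses three sites with $E = \{-1, 1\}$ and a hypothetical nearest-neighbour Hamiltonian `ising_chain_hamiltonian` with arbitrarily chosen coupling and field. It computes the Gibbs measure $\nu$, the free energy $\mu(H) - \mathscr{H}(\mu)$ of a probability vector $\mu$, and the relative entropy of $\mu$ with respect to $\nu$, so that the identity "free energy plus $\log Z$ equals relative entropy" can be compared term by term.

```python
import itertools
import math
import random


def ising_chain_hamiltonian(omega, coupling=1.0, field=0.3):
    """Hypothetical example: H(omega) = -J * sum_i omega_i omega_{i+1} - h * sum_i omega_i."""
    pair_term = sum(a * b for a, b in zip(omega, omega[1:]))
    return -coupling * pair_term - field * sum(omega)


def gibbs_measure(configs, hamiltonian):
    """Return (nu, Z) with nu(omega) = exp(-H(omega)) / Z."""
    weights = [math.exp(-hamiltonian(w)) for w in configs]
    z = sum(weights)
    return [w / z for w in weights], z


def free_energy(mu, configs, hamiltonian):
    """Energy minus entropy, mu(H) - H(mu), with the convention 0 log 0 = 0."""
    energy = sum(m * hamiltonian(w) for m, w in zip(mu, configs))
    entropy = -sum(m * math.log(m) for m in mu if m > 0)
    return energy - entropy


def relative_entropy(mu, nu):
    """sum_omega mu log(mu / nu); nu is assumed strictly positive here."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0)


def random_probability_vector(size, rng):
    """A random probability vector, used only as a test candidate for mu."""
    raw = [rng.random() for _ in range(size)]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    configs = list(itertools.product((-1, 1), repeat=3))
    nu, z = gibbs_measure(configs, ising_chain_hamiltonian)
    mu = random_probability_vector(len(configs), random.Random(1))
    print("-log Z           :", -math.log(z))
    print("free energy of nu:", free_energy(nu, configs, ising_chain_hamiltonian))
    print("free energy of mu:", free_energy(mu, configs, ising_chain_hamiltonian))
    print("relative entropy :", relative_entropy(mu, nu))
```

For any choice of $\mu$ the third printed value should exceed the first by exactly the fourth, up to rounding.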
15.1 Relative entropy
Throughout this first section we let $(\Omega,\mathscr{F})$ be an arbitrary measurable space and $\mu$, $\nu$ any two finite measures on $(\Omega,\mathscr{F})$.

(15.1) Definition. Suppose $\mathscr{A}$ is a σ-subalgebra of $\mathscr{F}$. Define

$$\mathscr{H}_{\mathscr{A}}(\mu|\nu) = \begin{cases} \nu(f_{\mathscr{A}} \log f_{\mathscr{A}}) & \text{if } \mu \ll \nu \text{ on } \mathscr{A}, \\ \infty & \text{otherwise.} \end{cases}$$

Here $f_{\mathscr{A}}$ is any Radon-Nikodym density of $\mu|\mathscr{A}$ relative to $\nu|\mathscr{A}$. $\mathscr{H}_{\mathscr{A}}(\mu|\nu)$ is called the relative entropy of $\mu$ with respect to $\nu$ on $\mathscr{A}$.

$\mathscr{H}_{\mathscr{A}}(\mu|\nu)$ is also known under the names Kullback-Leibler information, informational divergence, and information gain.

In the definition above, we adopt the usual convention that $0 \log 0 = 0$. Since the function $x \to x \log x$ on $[0,\infty[$ is bounded from below, the integral $\nu(f_{\mathscr{A}} \log f_{\mathscr{A}})$ is always well-defined (possibly $+\infty$). If $\mathscr{A}$ is generated by a finite $\mathscr{F}$-measurable partition $\pi$ of $\Omega$ then the relative entropy is given by the formula

$$\mathscr{H}_{\mathscr{A}}(\mu|\nu) = \sum_{A \in \pi} \mu(A) \log \mu(A)/\nu(A).$$

We also note that

(15.2) $\mathscr{H}_{\mathscr{A}}(a\mu|b\nu) = a\,\mathscr{H}_{\mathscr{A}}(\mu|\nu) + a\,\mu(\Omega) \log a/b$

for all $a, b > 0$. Consequently, we can (and will) assume henceforth that $\mu$ and $\nu$ are probability measures. Under this assumption, we can write

(15.3) $\mathscr{H}_{\mathscr{A}}(\mu|\nu) = \nu(\psi \circ f_{\mathscr{A}})$

whenever $\mu = f_{\mathscr{A}}\nu$ on $\mathscr{A}$ for some $\mathscr{A}$-measurable function $f_{\mathscr{A}} \ge 0$. Here $\psi$ stands for the function

(15.4) $\psi(x) = 1 - x + x \log x \qquad (x \ge 0).$

This function is nonnegative and strictly convex. Moreover, $\psi$ attains its minimum 0 at $x = 1$ only. These properties of $\psi$ immediately imply the following basic facts.

(15.5) Proposition. Let $\mu, \nu \in \mathscr{P}(\Omega,\mathscr{F})$ and $\mathscr{A}$ be a σ-subalgebra of $\mathscr{F}$. Then the following conclusions hold.
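(a) $\mathscr{H}_{\mathscr{A}}(\mu|\nu) \ge 0$.
(b) $\mathscr{H}_{\mathscr{A}}(\mu|\nu) = 0$ if and only if $\mu = \nu$ on $\mathscr{A}$.
(c) $\mathscr{H}_{\mathscr{A}}(\mu|\nu)$ is an increasing function of $\mathscr{A}$.
(d) $\mathscr{H}_{\mathscr{A}}(\mu|\nu)$ is a convex function of the pair $(\mu,\nu)$.

Proof. (a) This follows from (15.3) because $\psi \ge 0$. It will also follow from assertion (c) because $\mathscr{H}_{\{\emptyset,\Omega\}}(\mu|\nu) = 0$.

(b) If $\mathscr{H}_{\mathscr{A}}(\mu|\nu) = 0$ then $\mu = f_{\mathscr{A}}\nu$ on $\mathscr{A}$ for some $f_{\mathscr{A}} \ge 0$, and (15.3) shows that $\psi \circ f_{\mathscr{A}} = 0$ $\nu$-almost surely. Hence $f_{\mathscr{A}} = 1$ $\nu$-almost surely, and thereby $\mu = \nu$ on $\mathscr{A}$. The converse is obvious.

(c) Let $\mathscr{A}_1 \subset \mathscr{A}_2$ be two σ-subalgebras of $\mathscr{F}$. Without loss we can assume that $\mu = f_{\mathscr{A}_2}\nu$ on $\mathscr{A}_2$ for some $\mathscr{A}_2$-measurable function $f_{\mathscr{A}_2} \ge 0$. Then $\mu = f_{\mathscr{A}_1}\nu$ on $\mathscr{A}_1$, where $f_{\mathscr{A}_1} = \nu(f_{\mathscr{A}_2}|\mathscr{A}_1)$. Using (15.3) and Jensen's inequality for conditional expectations, we obtain

$$\mathscr{H}_{\mathscr{A}_1}(\mu|\nu) = \nu\bigl(\psi \circ \nu(f_{\mathscr{A}_2}|\mathscr{A}_1)\bigr) \le \nu\bigl(\nu(\psi \circ f_{\mathscr{A}_2}|\mathscr{A}_1)\bigr) = \mathscr{H}_{\mathscr{A}_2}(\mu|\nu).$$

(d) Let $\mu, \mu', \nu, \nu' \in \mathscr{P}(\Omega,\mathscr{F})$, $0 < s < 1$, $\bar\mu = s\mu + (1-s)\mu'$, and $\bar\nu = s\nu + (1-s)\nu'$. There is no loss in assuming that $\mu = f_{\mathscr{A}}\nu$ and $\mu' = f'_{\mathscr{A}}\nu'$ on $\mathscr{A}$, where $f_{\mathscr{A}}, f'_{\mathscr{A}} \ge 0$ are $\mathscr{A}$-measurable. By the Radon-Nikodym theorem, there exist $\mathscr{A}$-measurable functions $g_{\mathscr{A}}, g'_{\mathscr{A}} \ge 0$ such that $\nu = g_{\mathscr{A}}\bar\nu$ and $\nu' = g'_{\mathscr{A}}\bar\nu$ on $\mathscr{A}$. Since $s g_{\mathscr{A}} + (1-s) g'_{\mathscr{A}} = 1$, we conclude from the convexity of $\psi$ that

$$\begin{aligned}
\mathscr{H}_{\mathscr{A}}(\bar\mu|\bar\nu) &= \bar\nu\bigl(\psi(s g_{\mathscr{A}} f_{\mathscr{A}} + (1-s) g'_{\mathscr{A}} f'_{\mathscr{A}})\bigr) \\
&\le \bar\nu\bigl(s g_{\mathscr{A}}\,\psi(f_{\mathscr{A}}) + (1-s) g'_{\mathscr{A}}\,\psi(f'_{\mathscr{A}})\bigr) \\
&= s\,\mathscr{H}_{\mathscr{A}}(\mu|\nu) + (1-s)\,\mathscr{H}_{\mathscr{A}}(\mu'|\nu').
\end{aligned}$$

The proof is thus complete. □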
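The properties just proved can be tried out on a finite space, where every σ-subalgebra is generated by a partition and the relative entropy reduces to the partition formula given after Definition (15.1). The following Python sketch is only an illustration; the function names and the sample measures are assumptions made for the example. Measures are stored as lists indexed by the points of $\Omega = \{0, \ldots, 5\}$, and a partition is a list of blocks.

```python
import math


def block_mass(measure, block):
    """Mass that a measure (list indexed by points) gives to a block of points."""
    return sum(measure[x] for x in block)


def partition_relative_entropy(mu, nu, partition):
    """Relative entropy of mu with respect to nu on the sigma-algebra generated by a partition.

    Conventions: 0 log 0 = 0, and the value is +infinity when mu is not
    absolutely continuous with respect to nu on the partition.
    """
    total = 0.0
    for block in partition:
        m, n = block_mass(mu, block), block_mass(nu, block)
        if m == 0:
            continue
        if n == 0:
            return math.inf
        total += m * math.log(m / n)
    return total


def mix(s, first, second):
    """Convex combination s * first + (1 - s) * second of two vectors."""
    return [s * a + (1 - s) * b for a, b in zip(first, second)]


if __name__ == "__main__":
    mu = [0.10, 0.20, 0.30, 0.10, 0.20, 0.10]
    nu = [1 / 6] * 6
    trivial = [[0, 1, 2, 3, 4, 5]]
    coarse = [[0, 1, 2], [3, 4, 5]]
    fine = [[0], [1, 2], [3, 4], [5]]
    for name, partition in (("trivial", trivial), ("coarse", coarse), ("fine", fine)):
        print(name, partition_relative_entropy(mu, nu, partition))
```

Since `fine` refines `coarse`, which refines `trivial`, the printed values should be nondecreasing, in line with assertion (c), and the first should vanish up to rounding.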
The next proposition provides us with a continuity property of relative entropy. This property will be applied in the next section but will not enter into the proof of the variational principle. Therefore, it can be skipped at a first reading.

(15.6) Proposition. Let $\mu, \nu \in \mathscr{P}(\Omega,\mathscr{F})$, $(\mathscr{A}_\alpha)_{\alpha \in D}$ be an increasing net of σ-subalgebras of $\mathscr{F}$, and $\mathscr{A} = \sigma\bigl(\bigcup_{\alpha \in D} \mathscr{A}_\alpha\bigr)$ the smallest σ-algebra that contains all $\mathscr{A}_\alpha$'s. Then

$$\mathscr{H}_{\mathscr{A}}(\mu|\nu) = \lim_{\alpha \in D} \mathscr{H}_{\mathscr{A}_\alpha}(\mu|\nu) = \sup_{\alpha \in D} \mathscr{H}_{\mathscr{A}_\alpha}(\mu|\nu).$$

Proof. Proposition (15.5)(c) immediately shows that

$$\mathscr{H}_{\mathscr{A}}(\mu|\nu) \ge \sup_{\alpha \in D} \mathscr{H}_{\mathscr{A}_\alpha}(\mu|\nu) = \lim_{\alpha \in D} \mathscr{H}_{\mathscr{A}_\alpha}(\mu|\nu) \equiv c.$$

We thus only need to prove that $\mathscr{H}_{\mathscr{A}}(\mu|\nu) \le c$. Clearly, we can assume that $c < \infty$. Then $\mu$ is $\nu$-continuous on each $\mathscr{A}_\alpha$. Hence there exists a net $(f_\alpha)_{\alpha \in D}$ of $\mathscr{A}_\alpha$-measurable functions $f_\alpha \ge 0$ such that $\mu = f_\alpha\nu$ on $\mathscr{A}_\alpha$. Plainly, $(f_\alpha)_{\alpha \in D}$ is a martingale relative to $\nu$. We will show that this martingale converges in $L^1(\nu)$-norm to some $\mathscr{A}$-measurable function $f \ge 0$.

First of all, we note that $(f_\alpha)_{\alpha \in D}$ is uniformly $\nu$-integrable. This follows from the estimate

$$\nu(f_\alpha 1_{\{f_\alpha \ge r\}}) \le (\log r)^{-1}\, \nu(f_\alpha \log f_\alpha\, 1_{\{f_\alpha \ge r\}}) \le (c + 1)/\log r$$

which holds for all $\alpha \in D$ and $r > 1$. (The last inequality follows from the fact that $x \log x \ge -1$ when $x \ge 0$.) Next we claim that $(f_\alpha)_{\alpha \in D}$ is a Cauchy net in $L^1(\nu)$. For suppose the contrary. Then there exists an $\varepsilon > 0$ and an increasing sequence $(\alpha_n)_{n \ge 1}$ in $D$ such that $\nu(|f_{\alpha_{n+1}} - f_{\alpha_n}|) \ge \varepsilon$ for all $n \ge 1$. This is impossible because $(f_{\alpha_n})_{n \ge 1}$ is a uniformly integrable martingale and thereby convergent in $L^1(\nu)$. Since $L^1(\nu)$ is complete, we conclude that there exists an $\mathscr{A}$-measurable function $f \ge 0$ such that $\lim_{\alpha \in D} \nu(|f_\alpha - f|) = 0$. Evidently, $\mu = f\nu$ on $\mathscr{A}$.

Using these facts, it is now easy to derive the inequality $\mathscr{H}_{\mathscr{A}}(\mu|\nu) \le c$. For let $N$ be any positive integer. Consider the functions $g = N \wedge (f \log f)$ and $g_\alpha = N \wedge (f_\alpha \log f_\alpha)$. Since $f_\alpha$ converges in probability to $f$, $g_\alpha$ converges in probability to $g$. As $-1 \le g_\alpha, g \le N$, this implies that $\lim_{\alpha \in D} \nu(|g_\alpha - g|) = 0$. Hence

$$\nu(g) = \lim_{\alpha \in D} \nu(g_\alpha) \le \lim_{\alpha \in D} \mathscr{H}_{\mathscr{A}_\alpha}(\mu|\nu) = c.$$

Letting $N$ tend to infinity we arrive at the desired inequality. □

(15.7) Corollary. Let $\mu, \nu \in \mathscr{P}(\Omega,\mathscr{F})$ and $\mathscr{A}$ be a σ-subalgebra of $\mathscr{F}$. Then

$$\mathscr{H}_{\mathscr{A}}(\mu|\nu) = \sup_{\pi \in \Pi_{\mathscr{A}}} \sum_{A \in \pi} \mu(A) \log \mu(A)/\nu(A).$$

Here $\Pi_{\mathscr{A}}$ stands for the collection of all finite $\mathscr{A}$-measurable partitions of $\Omega$.

Proof. For each $\pi \in \Pi_{\mathscr{A}}$ we let $\mathscr{A}_\pi$ be the finite σ-algebra which is generated by $\pi$. With the usual partial order on $\Pi_{\mathscr{A}}$, $(\mathscr{A}_\pi)_{\pi \in \Pi_{\mathscr{A}}}$ is an increasing net. Since $\mathscr{A} = \sigma(\mathscr{A}_\pi : \pi \in \Pi_{\mathscr{A}})$, the corollary follows from (15.6). □
15.2 Specific entropy
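After the general preliminaries of the preceding section we return now to our standard setting. Thus $(E,\mathscr{E})$ is supposed to be an arbitrary measurable space, $S = \mathbb{Z}^d$ the $d$-dimensional integer lattice, and $(\Omega,\mathscr{F}) = (E,\mathscr{E})^S$ the associated configuration space. For each $\Lambda \subset S$ and any two finite measures $\mu$, $\nu$ on $(\Omega,\mathscr{F}_\Lambda)$ we let

(15.8) $\mathscr{H}_\Lambda(\mu|\nu) = \mathscr{H}_{\mathscr{F}_\Lambda}(\mu|\nu)$

be the relative entropy of $\mu$ with respect to $\nu$ on $\mathscr{F}_\Lambda$.

Let us assume that $(E,\mathscr{E})$ is equipped with a finite a priori measure $\lambda$. $\lambda$ will be held fixed throughout the following. For each $\Lambda \in \mathscr{S}$ we let $\sigma_\Lambda^{-1}(\lambda^\Lambda)$ denote the finite measure on $(\Omega,\mathscr{F}_\Lambda)$ which is given by

$$\sigma_\Lambda^{-1}(\lambda^\Lambda)(\sigma_\Lambda \in A) = \lambda^\Lambda(A) \qquad (A \in \mathscr{E}^\Lambda).$$

Clearly, if $\lambda \in \mathscr{P}(E,\mathscr{E})$ then $\sigma_\Lambda^{-1}(\lambda^\Lambda) = \lambda^S|\mathscr{F}_\Lambda$.

(15.9) Definition. For each $\mu \in \mathscr{P}(\Omega,\mathscr{F})$ and $\Lambda \in \mathscr{S}$, the quantity

$$\mathscr{H}_\Lambda(\mu) = -\mathscr{H}_\Lambda\bigl(\mu \,\big|\, \sigma_\Lambda^{-1}(\lambda^\Lambda)\bigr)$$

is called the entropy of $\mu$ in $\Lambda$ relative to $\lambda$.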
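Note that the $\lambda$-dependence of $\mathscr{H}_\Lambda(\mu)$ is dropped in our notation. If $E$ is finite and $\lambda$ is counting measure on $E$ then

$$\mathscr{H}_\Lambda(\mu) = -\sum_{\zeta \in E^\Lambda} \mu(\sigma_\Lambda = \zeta) \log \mu(\sigma_\Lambda = \zeta) \ge 0.$$

Thus in this case the entropy is given by Shannon's well-known formula. Looking at equation (15.2), we also see that multiplication of $\lambda$ by a factor $b > 0$ amounts to an addition of the constant $|\Lambda| \log b$ to $\mathscr{H}_\Lambda(\cdot)$. Proposition (15.5)(a) thus shows that $\mathscr{H}_\Lambda(\cdot) \le |\Lambda| \log \lambda(E)$ for all $\Lambda$, and we can assume without loss that $\lambda$ is a probability measure whenever this is convenient.

Suppose now we are given a shift-invariant random field $\mu$. Let $(\Lambda_n)_{n \ge 1}$ be any sequence of cubes with $|\Lambda_n| \to \infty$. (Recall that a cube was defined in the paragraph before (14.7).) We intend to show that the limit

$$h(\mu) = \lim_{n\to\infty} |\Lambda_n|^{-1} \mathscr{H}_{\Lambda_n}(\mu) \in [-\infty, \log \lambda(E)]$$

exists and is independent of the choice of the sequence $(\Lambda_n)_{n \ge 1}$. It is this limit which we will call the specific entropy of $\mu$. The existence of $h(\mu)$ will be an easy consequence of the strong subadditivity property below. We put $\mathscr{H}_\emptyset(\mu) = 0$ when $\mu \in \mathscr{P}(\Omega,\mathscr{F})$.

(15.10) Proposition. For each $\mu \in \mathscr{P}(\Omega,\mathscr{F})$ and all $\Lambda, \Delta \in \mathscr{S}$,

$$\mathscr{H}_\Lambda(\mu) + \mathscr{H}_\Delta(\mu) \ge \mathscr{H}_{\Lambda \cap \Delta}(\mu) + \mathscr{H}_{\Lambda \cup \Delta}(\mu).$$

Proof. We can assume that $\lambda \in \mathscr{P}(E,\mathscr{E})$ and $\mathscr{H}_{\Lambda \cup \Delta}(\mu) > -\infty$. Then $\mu = f_{\Lambda \cup \Delta}\lambda^S$ on $\mathscr{F}_{\Lambda \cup \Delta}$ for some $\mathscr{F}_{\Lambda \cup \Delta}$-measurable function $f_{\Lambda \cup \Delta} \ge 0$. Let $f_\Lambda = \lambda^S(f_{\Lambda \cup \Delta}|\mathscr{F}_\Lambda)$ be the $\lambda^S$-density of $\mu$ on $\mathscr{F}_\Lambda$, and let $f_\Delta$ and $f_{\Lambda \cap \Delta}$ be defined similarly. (If $\Lambda \cap \Delta = \emptyset$ we put $f_{\Lambda \cap \Delta} = 1$.) In view of Proposition (15.5)(c), all these functions have $\mu$-integrable logarithms. Consider the measure $\nu = \mu\lambda_{\Delta \setminus \Lambda}$. Since $\nu(f_\Lambda > 0) = \mu(f_\Lambda > 0) = 1$, the function $f_{\Lambda \cup \Delta|\Lambda} = f_{\Lambda \cup \Delta}/f_\Lambda$ is well-defined $\nu$-almost surely. It is easily checked that $\mu = f_{\Lambda \cup \Delta|\Lambda}\nu$ on $\mathscr{F}_{\Lambda \cup \Delta}$. Similarly, we have $\mu = f_{\Delta|\Lambda \cap \Delta}\nu$ on $\mathscr{F}_\Delta$, where $f_{\Delta|\Lambda \cap \Delta} = f_\Delta/f_{\Lambda \cap \Delta}$. Therefore we can write, using Proposition (15.5)(c) again,

$$\mathscr{H}_\Lambda(\mu) - \mathscr{H}_{\Lambda \cup \Delta}(\mu) = \mu(\log f_{\Lambda \cup \Delta|\Lambda}) = \mathscr{H}_{\Lambda \cup \Delta}(\mu|\nu) \ge \mathscr{H}_\Delta(\mu|\nu) = \mu(\log f_{\Delta|\Lambda \cap \Delta}) = \mathscr{H}_{\Lambda \cap \Delta}(\mu) - \mathscr{H}_\Delta(\mu).$$

The proof is thus complete. □

Since $\mathscr{H}_\emptyset(\mu) = 0$, the preceding proposition implies in particular that $\mathscr{H}_\Lambda(\mu) + \mathscr{H}_\Delta(\mu) \ge \mathscr{H}_{\Lambda \cup \Delta}(\mu)$ whenever $\Lambda, \Delta \in \mathscr{S}$ are disjoint. On the other hand, if $\mu$ is shift-invariant then $\mathscr{H}_{\Lambda + i}(\mu) = \mathscr{H}_\Lambda(\mu)$ for all $\Lambda \in \mathscr{S}$ and $i \in S$. The function $a: \Lambda \to \mathscr{H}_\Lambda(\mu)$ thus satisfies the hypotheses (i) and (ii) of the following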
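well-known lemma on the limiting behaviour of homogeneous subadditive functions. We let $\mathscr{S}_\Box$ denote the system of all rectangular boxes of the form

$$\Lambda = \mathbb{Z}^d \cap \prod_{k=1}^{d} [m_k, n_k]$$

with $m_k, n_k \in \mathbb{Z}$, $m_k \le n_k$.

(15.11) Lemma. Let $a: \mathscr{S}_\Box \to [-\infty,\infty[$ be a function which satisfies the conditions (i) $a(\Lambda + i) = a(\Lambda)$ for all $\Lambda \in \mathscr{S}_\Box$ and $i \in S$, and (ii) $a(\Lambda) + a(\Delta) \ge a(\Lambda \cup \Delta)$ whenever $\Lambda, \Delta \in \mathscr{S}_\Box$ are such that $\Lambda \cup \Delta \in \mathscr{S}_\Box$ and $\Lambda \cap \Delta = \emptyset$. Also, let $(\Lambda_n)_{n \ge 1}$ be a sequence of cubes such that $|\Lambda_n| \to \infty$ as $n \to \infty$. Then

$$\lim_{n\to\infty} |\Lambda_n|^{-1} a(\Lambda_n) = \inf_{\Delta \in \mathscr{S}_\Box} |\Delta|^{-1} a(\Delta) \quad \text{in } [-\infty,\infty[.$$

Proof. We choose any real $c$ with $c > \bar a = \inf_{\Delta \in \mathscr{S}_\Box} |\Delta|^{-1} a(\Delta)$, and let $\Delta \in \mathscr{S}_\Box$ be such that $|\Delta|^{-1} a(\Delta) < c$. Each cube $\Lambda_n$ may be split into a number $N_n$ of disjoint translates of $\Delta$ and a remainder which is contained in a boundary layer of $\Lambda_n$. We assume that $N_n$ is chosen as large as possible. Then $\lim_{n\to\infty} |\Lambda_n| / N_n|\Delta| = 1$. Applying the subadditivity property (ii) successively (in the right order) and using (i) we obtain the estimate

$$a(\Lambda_n) \le N_n\, a(\Delta) + (|\Lambda_n| - N_n|\Delta|)\, a(\{0\}), \qquad n \ge 1.$$

Hence

$$\bar a \le \limsup_{n\to\infty} |\Lambda_n|^{-1} a(\Lambda_n) = \limsup_{n\to\infty} N_n^{-1} |\Delta|^{-1} a(\Lambda_n) \le |\Delta|^{-1} a(\Delta) < c.$$

Since $c > \bar a$ was arbitrary and $|\Lambda_n|^{-1} a(\Lambda_n) \ge \bar a$ for all $n$, the lemma follows. □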
exists in $[-\infty, \log\lambda(E)]$ and satisfies the equation $h(\mu) = \inf_{\Delta\in\mathcal{S}_\square} |\Delta|^{-1}\mathcal{H}_\Delta(\mu)$.
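The limit statement of Lemma (15.11) can be checked numerically in a one-dimensional toy case. The sketch below is an illustration only and is not taken from the text: it assumes $d = 1$ and uses the hypothetical function $a(n) = \log(\text{number of 0/1 words of length } n \text{ without two adjacent 1s})$, which is translation invariant and subadditive in the sense of (i) and (ii). Its normalized values $a(n)/n$ should approach their infimum, the logarithm of the golden ratio.

```python
import math


def count_words(n):
    """Number of 0/1 words of length n that contain no two adjacent 1s."""
    a, b = 1, 2  # counts for length 0 and length 1
    for _ in range(n):
        a, b = b, a + b
    return a


def a(n):
    """Toy subadditive function on intervals of length n (our choice, not the book's)."""
    return math.log(count_words(n))


def is_subadditive(max_len):
    """Check a(m + n) <= a(m) + a(n) for all 1 <= m, n <= max_len."""
    return all(
        a(m + n) <= a(m) + a(n) + 1e-12
        for m in range(1, max_len + 1)
        for n in range(1, max_len + 1)
    )


def normalized(n):
    """The quantity |Lambda|^{-1} a(Lambda) for an interval of length n."""
    return a(n) / n


# Known growth rate of the word count: log of the golden ratio.
LIMIT = math.log((1 + math.sqrt(5)) / 2)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, normalized(n), LIMIT)
```

The printed values decrease towards `LIMIT` without ever falling below it, which is the behaviour the lemma asserts for the limit and the infimum.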
Specific entropy
(15.13) Definition. For each $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$, the quantity $h(\mu)$ above is called the specific entropy per site (or the mean entropy or entropy rate) of $\mu$ relative to the a priori measure $\lambda$.

In the rest of this section we shall establish some properties of the specific entropy functional $h(\cdot)$. Whilst Proposition (15.14) will enter into the proof of Proposition (16.11), Proposition (15.16) will only be used in Theorem (15.20) and Example (15.41), and Theorem (15.20) is only included here because it is interesting in its own right. On a first reading it might thus be preferable to jump to Section 15.3. (One can return when the need arises.)

(15.14) Proposition. The function $h(\cdot)$ on $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ is affine and upper semicontinuous (relative to the $\mathcal{L}$-topology). Moreover, if $(E, \mathcal{E})$ is a standard Borel space then the sets $\{h(\cdot) \ge c\}$ are compact ($c \in \mathbb{R}$).

Proof. As before, we can assume that $\lambda \in \mathcal{P}(E, \mathcal{E})$.

1) To begin, we shall prove that $h(\cdot)$ is affine. By Proposition (15.5)(d), the functions $\mathcal{H}_\Lambda(\cdot)$ are concave. Hence $h(\cdot)$ is concave. We thus only need to show that $h(\cdot)$ is convex, too. So let $\mu_1, \mu_2 \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$, $0 < s < 1$, and $\mu = s\mu_1 + (1-s)\mu_2$. We must check that $h(\mu) \le s\,h(\mu_1) + (1-s)\,h(\mu_2)$. Clearly, we can assume that $h(\mu) > -\infty$. In this case, $\mu$ is absolutely continuous relative to $\lambda^S$ on each $\mathcal{F}_\Lambda$. Since $\mu_1$ and $\mu_2$ are $\mu$-continuous, we conclude that for each $\Lambda \in \mathcal{S}$ and $k = 1$ or $2$ there exists an $\mathcal{F}_\Lambda$-measurable function $f_\Lambda^k \ge 0$ such that $\mu_k = f_\Lambda^k\,\lambda^S$ on $\mathcal{F}_\Lambda$. We can thus write, using the monotonicity of the logarithm,
$$\begin{aligned}
\mathcal{H}_\Lambda(\mu) &= -\mu\big(\log[s f_\Lambda^1 + (1-s) f_\Lambda^2]\big) \\
&\le -s\,\mu_1\big(\log[s f_\Lambda^1]\big) - (1-s)\,\mu_2\big(\log[(1-s) f_\Lambda^2]\big) \\
&= s\,\mathcal{H}_\Lambda(\mu_1) + (1-s)\,\mathcal{H}_\Lambda(\mu_2) - s\log s - (1-s)\log(1-s).
\end{aligned}$$
Dividing by $|\Lambda|$ and letting $\Lambda$ run through a sequence of cubes we obtain the desired inequality.

2) Next we will prove that $h(\cdot)$ is upper semicontinuous. In view of the last equation of Theorem (15.12), it is sufficient to verify that each $\mathcal{H}_\Lambda(\cdot)$ is upper semicontinuous. This is an immediate consequence of Corollary (15.7). Alternatively, we can use the following argument which does not rely on Proposition (15.6). Let $\Lambda \in \mathcal{S}$, $c \in \mathbb{R}$, and $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ with $\mathcal{H}_\Lambda(\mu) < c$ be given. We will show that $\mu$ has an open neighbourhood $U$ which is contained in the set $\{\mathcal{H}_\Lambda(\cdot) < c\}$. To this end we note that there exists an $\mathcal{F}_\Lambda$-measurable function $g > 0$ such that $\log g$ is bounded and
$$-\mu(\log g) < c + 1 - \lambda^S(g).$$
Now we put
$$U = \{\nu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}) : -\nu(\log g) < c + 1 - \lambda^S(g)\}.$$
$U$ contains $\mu$ and is open because $\log g \in \mathcal{L}$. Moreover, if $\nu \in U$ and $\nu = f_\Lambda\,\lambda^S$ on $\mathcal{F}_\Lambda$ for some $\mathcal{F}_\Lambda$-measurable $f_\Lambda \ge 0$ then
$$\mathcal{H}_\Lambda(\nu) = -\nu(\log g) - \nu(\log f_\Lambda/g) = -\nu(\log g) + \lambda^S(g) - 1 - \lambda^S\big(g\,\psi(f_\Lambda/g)\big) < c.$$
Here $\psi$ is given by (15.4). Hence $U \subset \{\mathcal{H}_\Lambda(\cdot) < c\}$.

3) To prove the last statement we fix a number $c \le 0$ and look at the set $K = \{\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}) : h(\mu) \ge c\}$. We will show that $K$ is compact in the $\mathcal{L}$-topology. From Step 2) above we know that $K$ is closed. By virtue of Proposition (4.9) it is therefore sufficient to show that each net in $K$ is locally equicontinuous. So let $\Lambda \in \mathcal{S}_\square$ and $(A_m)_{m\ge1}$ be a sequence in $\mathcal{F}_\Lambda$ with $A_m \downarrow \emptyset$. For each $\mu \in K$ we have $\mathcal{H}_\Lambda(\mu) \ge c|\Lambda|$. In particular, each $\mu \in K$ is $\lambda^S$-continuous on $\mathcal{F}_\Lambda$ with a density $f_\Lambda^\mu$. For given $\varepsilon > 0$ we let $\delta > 0$ be such that $\varepsilon \log(\varepsilon/\delta) \ge 1 - c|\Lambda|$. If $m$ is so large that $\lambda^S(A_m) \le \delta$ then
$$\begin{aligned}
\mu(A_m) &= \lambda^S\big(1_{A_m\cap\{f_\Lambda^\mu \le \varepsilon/\delta\}}\, f_\Lambda^\mu\big) + \mu\big(A_m\cap\{f_\Lambda^\mu > \varepsilon/\delta\}\big) \\
&\le \varepsilon + (\log \varepsilon/\delta)^{-1}\,\lambda^S\big(1_{A_m\cap\{f_\Lambda^\mu > \varepsilon/\delta\}}\, f_\Lambda^\mu \log f_\Lambda^\mu\big) \\
&\le \varepsilon + (\log \varepsilon/\delta)^{-1}\,\big(1 - \mathcal{H}_\Lambda(\mu)\big) \\
&\le 2\varepsilon
\end{aligned}$$
for all $\mu \in K$. The next to last inequality follows from the inequality $x \log x \ge -1$. The proof is thus complete. □

We note that the proof of semicontinuity can easily be adapted to give the following result: If $E$ is a complete separable metric space with Borel $\sigma$-algebra $\mathcal{E}$ then $h(\cdot)$ is upper semicontinuous with respect to the weak topology on $\mathcal{P}_\Theta(\Omega, \mathcal{F})$. (Just replace $g$ by a continuous function which is sufficiently close to $g$ in $L^1(\mu + \lambda^S)$.)

Next we shall use Proposition (15.6) to show that $h(\mu)$ can be represented as the conditional entropy of a single spin, conditioned on all spins in its "lexicographic past". (Cf. equation (15.18) below.) As a by-product, we shall also find that the system $\mathcal{S}_\square$ can be replaced by $\mathcal{S}$ in Theorem (15.12).

We introduce the lexicographic order on $S = \mathbb{Z}^d$ by writing $i < j$ whenever there exists some $1 \le k \le d$ such that $i_k < j_k$ and $i_\ell = j_\ell$ when $1 \le \ell < k$. Here $i = (i_1, \ldots, i_d)$, $j = (j_1, \ldots, j_d) \in S$. We shall use the notation
$$(15.15)\qquad V(j) = \{i \in S : i < j \text{ or } i = j\}$$
for the lexicographic past of a site $j \in S$. We also put $V^*(j) = V(j)\setminus\{j\}$.
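(15.16) Proposition. For each $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ and $j \in S$,
$$h(\mu) = \inf_{\Lambda\in\mathcal{S}} |\Lambda|^{-1}\mathcal{H}_\Lambda(\mu) = -\mathcal{H}_{V(j)}(\mu\,|\,\mu\lambda_{\{j\}}).$$

Proof. We can assume that $\lambda \in \mathcal{P}(E, \mathcal{E})$ and $j = 0$. We put $\nu = \mu\lambda_{\{0\}}$. By Theorem (15.12), we only need to verify the inequalities
$$(15.17)\qquad \inf_{\Lambda\in\mathcal{S}} |\Lambda|^{-1}\mathcal{H}_\Lambda(\mu) \ge -\mathcal{H}_{V(0)}(\mu\,|\,\nu) \ge h(\mu).$$
To prove the first inequality we can assume that $\mathcal{H}_{V(0)}(\mu\,|\,\nu) < \infty$. Then for each $\Delta \subset V(0)$ there exists an $\mathcal{F}_\Delta$-measurable function $g_\Delta$ such that $\mu = g_\Delta\,\nu$ on $\mathcal{F}_\Delta$. Let $\Lambda \in \mathcal{S}$ be arbitrary and define
$$f_\Lambda = \prod_{j\in\Lambda} g_{(V(j)\cap\Lambda)-j} \circ \theta_{-j}.$$
Using the shift-invariance of $\mu$ and the lexicographic order of $\Lambda$, it is easily checked that $\mu = f_\Lambda\,\lambda^S$ on $\mathcal{F}_\Lambda$. Thus, noting that $(V(j)\cap\Lambda) - j = V(0)\cap(\Lambda - j)$,
$$|\Lambda|^{-1}\mathcal{H}_\Lambda(\mu) = -|\Lambda|^{-1}\sum_{j\in\Lambda} \mu\big(\log g_{(V(j)\cap\Lambda)-j}\big) = -|\Lambda|^{-1}\sum_{j\in\Lambda} \mathcal{H}_{V(0)\cap(\Lambda-j)}(\mu\,|\,\nu) \ge -\mathcal{H}_{V(0)}(\mu\,|\,\nu)$$
because of Proposition (15.5)(c).

Turning to the proof of the second inequality in (15.17), we assume that $h(\mu) > -\infty$. Together with Theorem (15.12) and Proposition (15.5)(c), this implies that $\mathcal{H}_\Lambda(\mu) > -\infty$ for all $\Lambda \in \mathcal{S}$. Take any $\Lambda, \Delta \in \mathcal{S}$ with $\Delta \subset V(0)$. Then we can write
$$\begin{aligned}
|\Lambda|^{-1}\mathcal{H}_\Lambda(\mu) &= |\Lambda|^{-1}\sum_{i\in\Lambda} \big[\mathcal{H}_{\Lambda\cap V(i)}(\mu) - \mathcal{H}_{\Lambda\cap V^*(i)}(\mu)\big] \\
&= |\Lambda|^{-1}\sum_{i\in\Lambda} \big[\mathcal{H}_{(\Lambda-i)\cap V(0)}(\mu) - \mathcal{H}_{(\Lambda-i)\cap V^*(0)}(\mu)\big] \\
&= -|\Lambda|^{-1}\sum_{i\in\Lambda} \mathcal{H}_{(\Lambda-i)\cap V(0)}(\mu\,|\,\nu) \\
&\le -|\Lambda|^{-1}\sum_{i\in\Lambda:\ \Lambda-i\supset\Delta} \mathcal{H}_{(\Lambda-i)\cap V(0)}(\mu\,|\,\nu) \\
&\le -|\Lambda|^{-1}\,\big|\{i\in\Lambda : \Delta+i\subset\Lambda\}\big|\;\mathcal{H}_\Delta(\mu\,|\,\nu).
\end{aligned}$$
The third equality was established in the proof of Proposition (15.10), and the inequalities follow from Proposition (15.5)(a) and (c). Letting $\Lambda$ run through a sequence of cubes with $|\Lambda| \to \infty$, we conclude that $h(\mu) \le -\mathcal{H}_\Delta(\mu\,|\,\nu)$. Finally, we let $\Delta$ increase to $V(0)$. Proposition (15.6) then implies that $h(\mu) \le -\mathcal{H}_{V(0)}(\mu\,|\,\nu)$. The proof is thus complete. □

The expression $\mathcal{H}_{V(0)}(\mu\,|\,\mu\lambda_{\{0\}})$ in Proposition (15.16) can be rewritten in a more appealing manner. Suppose $\mathcal{H}_{V(0)}(\mu\,|\,\mu\lambda_{\{0\}}) < \infty$. Then $\mu = g\,(\mu\lambda_{\{0\}})$ for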
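some $\mathcal{F}_{V(0)}$-measurable function $g$. In less condensed form, this means that
$$\mu(\sigma_0 \in A\,|\,\mathcal{F}_{V^*(0)})(\omega) = q_\omega(A) = \int_A \lambda(dx)\, g(x\,\omega_{S\setminus\{0\}})$$
for $\mu$-almost all $\omega$ and each $A \in \mathcal{E}$. We also can write
$$\mathcal{H}_{V(0)}(\mu\,|\,\mu\lambda_{\{0\}}) = \mu\lambda_{\{0\}}(g\log g) = \mu\big(\mathcal{H}(q_\cdot\,|\,\lambda)\big).$$
We thus end up with the formula
$$(15.18)\qquad h(\mu) = -\mu\big(\mathcal{H}(q_\cdot\,|\,\lambda)\big).$$
That is, $h(\mu)$ is equal to the conditional entropy of $\sigma_0$ given the lexicographic past $\mathcal{F}_{V^*(0)}$. In particular, let $S = \mathbb{Z}$, $E$ be any finite set, and $\lambda$ counting measure on $E$. Then
$$(15.19)\qquad h(\mu) = -\mu\Big(\sum_{x\in E} \mu(\sigma_0 = x\,|\,\mathcal{F}_{]-\infty,0[})\,\log \mu(\sigma_0 = x\,|\,\mathcal{F}_{]-\infty,0[})\Big).$$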
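Formula (15.19) becomes fully explicit for a stationary Markov chain, where the conditional law of $\sigma_0$ given the past is a row of the transition matrix. The following sketch is an illustration under that assumption; the function names and the example matrix are ours, not the book's. It compares the value $-\sum_{x,y}\pi(x)P(x,y)\log P(x,y)$ obtained from (15.19) with the block entropies $H_n$ of words of length $n$ (entropy relative to counting measure), whose increments $H_{n+1}-H_n$ should equal the same number.

```python
import math
from itertools import product


def stationary(P, iters=10_000):
    """Stationary distribution of a positive transition matrix via power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]
    return pi


def entropy_rate(P):
    """Right-hand side of (15.19) for the stationary Markov chain with matrix P."""
    pi = stationary(P)
    n = len(P)
    return -sum(
        pi[x] * P[x][y] * math.log(P[x][y])
        for x in range(n)
        for y in range(n)
        if P[x][y] > 0
    )


def block_entropy(P, n):
    """Shannon entropy of the law of (sigma_1, ..., sigma_n), by enumeration."""
    pi = stationary(P)
    total = 0.0
    for word in product(range(len(P)), repeat=n):
        p = pi[word[0]]
        for x, y in zip(word, word[1:]):
            p *= P[x][y]
        if p > 0:
            total -= p * math.log(p)
    return total


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.4, 0.6]]  # hypothetical example matrix
    print("entropy rate via (15.19):", entropy_rate(P))
    for n in (1, 2, 4, 8):
        print(n, block_entropy(P, n) / n)
```

The per-site block entropies decrease towards the entropy rate, in line with the infimum formula of Theorem (15.12).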
We emphasize that the probability kernel $q$ in equation (15.18) depends on $\mu$. It is therefore surprising that the specific entropy can be represented as the expectation value of a measurable function which does not depend on $\mu$. In particular, this result sharpens the conclusion of Proposition (15.14) that $h(\cdot)$ is affine. Recall from (14.2) that $\mathcal{I}$ stands for the $\sigma$-algebra of all shift-invariant events.

(15.20) Theorem. Suppose $(E, \mathcal{E})$ is a standard Borel space. Then there exists a function $\mathsf{h}: \Omega \to [-\infty, \log\lambda(E)]$ which is measurable with respect to $\mathcal{I}\cap\mathcal{T}$ and satisfies the equation $h(\mu) = \mu(\mathsf{h})$ for all $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$. In particular, if $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ admits a representation
$$\mu = \int \nu\, w(d\nu)$$
in terms of some weight $w \in \mathcal{P}\big(\mathcal{P}_\Theta(\Omega, \mathcal{F}), e(\mathcal{P}_\Theta(\Omega, \mathcal{F}))\big)$ then
$$h(\mu) = \int_{\mathcal{P}_\Theta(\Omega,\mathcal{F})} h(\nu)\, w(d\nu).$$
Proof. 1) We define $\mathsf{h} = h(\pi^\cdot)$, where $\pi^\cdot$ is the $(\mathcal{P}_\Theta(\Omega, \mathcal{F}), \mathcal{I})$-kernel which was constructed in the proof of Theorem (14.10). As we have noticed there, we can achieve that $\pi^\cdot$ is measurable relative to $\mathcal{I}\cap\mathcal{T}\cap\mathcal{F}_{V(0)}$. Hence $\mathsf{h}$ is measurable relative to $\mathcal{I}\cap\mathcal{T}\cap\mathcal{F}_{V(0)}$, provided we can prove that $h(\cdot)$ is measurable with respect to $e(\mathcal{P}_\Theta(\Omega, \mathcal{F}))$. This will follow once we have shown that each $\mathcal{H}_\Lambda(\cdot)$ is measurable with respect to $e(\mathcal{P}_\Theta(\Omega, \mathcal{F}))$. Since $(E, \mathcal{E})$ is standard Borel, each $\mathcal{F}_\Lambda$ is countably generated. Consequently, if we set $\mathcal{A} = \mathcal{F}_\Lambda$ then the supremum in Corollary (15.7) can be replaced by the supremum over a countable subcollection. This implies the required measurability.
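2) Now we let $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ be given. We will prove that $h(\mu) = \mu(\mathsf{h})$. To this end we can assume that $h(\mu) > -\infty$ or $\mu(\mathsf{h}) > -\infty$. Proposition (15.16) then shows that $\mathcal{H}_{V(0)}(\mu\,|\,\mu\lambda_{\{0\}}) < \infty$ or $\mathcal{H}_{V(0)}(\pi^\cdot\,|\,\pi^\cdot\lambda_{\{0\}}) < \infty$ $\mu$-a.s. In each case there exists an $\mathcal{F}_{V(0)}$-measurable function $g$ such that $\mu = g\,(\mu\lambda_{\{0\}})$ on $\mathcal{F}_{V(0)}$. This is obvious in the first case. In the second case we have $\pi^\omega \ll \pi^\omega\lambda_{\{0\}}$ on $\mathcal{F}_{V(0)}$ for $\mu$-almost all $\omega$. Since $\mu = \mu\pi^\cdot$, this implies that $\mu \ll \mu\lambda_{\{0\}}$ on $\mathcal{F}_{V(0)}$. The existence of $g$ thus follows from the Radon-Nikodym theorem.

Next we will show that $\pi^\omega = g\,(\pi^\omega\lambda_{\{0\}})$ on $\mathcal{F}_{V(0)}$ for $\mu$-almost all $\omega$. Since $\mathcal{F}_{V(0)}$ is countably generated, we need only prove that $\pi^\cdot(B) = \pi^\cdot\lambda_{\{0\}}(g 1_B)$ $\mu$-a.s. for any $B \in \mathcal{F}_{V(0)}$. But for each $A \in \mathcal{I}\cap\mathcal{T}\cap\mathcal{F}_{V(0)}$ we have
$$\begin{aligned}
\mu(1_A\,\pi^\cdot(B)) &= \mu(1_A\,\mu(B\,|\,\mathcal{I})) = \mu(A\cap B) = \mu\lambda_{\{0\}}(g 1_{A\cap B}) \\
&= \mu(1_A\,\lambda_{\{0\}}(g 1_B)) = \mu\big(1_A\,\mu(\lambda_{\{0\}}(g 1_B)\,|\,\mathcal{I})\big) = \mu(1_A\,\pi^\cdot\lambda_{\{0\}}(g 1_B)).
\end{aligned}$$
As $\pi^\cdot$ is $\mathcal{I}\cap\mathcal{T}\cap\mathcal{F}_{V(0)}$-measurable, this gives us the desired result. We now combine the above results with Proposition (15.16) to obtain the final identity
$$h(\mu) = -\mu(\log g) = -\mu\pi^\cdot(\log g) = \mu\big(-\mathcal{H}_{V(0)}(\pi^\cdot\,|\,\pi^\cdot\lambda_{\{0\}})\big) = \mu(\mathsf{h}).$$
The second assertion of the theorem follows immediately. □

15.3 Specific energy and free energy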
In the preceding section we have shown that the limit
$$-h(\mu) = \lim_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\lambda^S)$$
exists whenever $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$, $\lambda \in \mathcal{P}(E, \mathcal{E})$, and $(\Lambda_n)_{n\ge1}$ is a sequence of cubes with $|\Lambda_n| \to \infty$. Of course, the product measure $\lambda^S$ can be considered to be the unique Gibbs measure for the potential $\Phi = 0$. It is thus natural to ask what happens to the above limit when $\lambda^S$ is replaced by a Gibbs measure for a non-trivial potential $\Phi$. This is the question which we will discuss in this section. We shall see that the proposed replacement amounts to the addition of two terms to $-h(\mu)$. The first expression is a bilinear form in $\mu$ and $\Phi$ and is called the specific energy. The second term is a function of $\Phi$ only and is called the pressure.

As in Example (5.8), we let $\mathcal{B}_\Theta$ denote the Banach space of all shift-invariant potentials $\Phi$ with finite norm
$$(15.21)\qquad \|\Phi\|_0 = \sum_{A\ni 0} \|\Phi_A\|.$$
For each $\Phi \in \mathcal{B}_\Theta$ we introduce the function
$$(15.22)\qquad f_\Phi = \sum_{A\ni 0} |A|^{-1}\,\Phi_A.$$
= lim | A n r V ( H * J = lim | A n r V ( < , K , ^ v ü ) -
(15.24) Definition. $\mu(f_\Phi)$ is called the specific (internal) energy per site of $\mu$ relative to
A ) = iArv(z/«.°0 for all A e y , it is sufficient to show that the expression (15.25)
sup meCl
Z / • ° 9-i - ^ ( ^ » s u )
ieA
is of order ^(|A|) when A runs through (A„) nàl . For each A e £f and œ e Q we can write J3AK<05\A)=I Z ieA i'e/lcA
i^r1«^
Z
^>A«S\A)
/lnA#0,/l\A^0
and
Z/*°0_,.= z I i^r<^+Z
ieA
I E A i€^lcA
Z
i € A A 3 i: /1\A ^ 0
i^r1^
Consequently, the expression in (15.25) is bounded by
-(A,0) = 2 Z
Z
ieA
110),
A3i:A\Ait
But for each A e y we have r(A,
Z
Z
i e A : A + i
g2|A|
Z
H^II+2
Z j'eA:A+t<£A
Z H^ll A^i
l|OJ+2||O||0|{ieA:A + i^A}|.
For each e > 0 there exists some A e y such that the next to last term is at
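most $\varepsilon|\Lambda|$. The last term has order $o(|\Lambda|)$. Hence $r(\Lambda, \Phi) = o(|\Lambda|)$ when $\Lambda$ runs through $(\Lambda_n)_{n\ge1}$. □

(15.26) Remarks. (1) Under the conditions of Theorem (15.23), we even know that
$$\mu(f_\Phi\,|\,\mathcal{I}) = \lim_{n\to\infty} |\Lambda_n|^{-1} H^\Phi_{\Lambda_n} = \lim_{n\to\infty} |\Lambda_n|^{-1} H^\Phi_{\Lambda_n}(\sigma_{\Lambda_n}\omega^n_{S\setminus\Lambda_n}) \quad \mu\text{-a.s.}$$
(provided $(\Lambda_n)_{n\ge1}$ is increasing). This follows from the preceding estimate of (15.25) together with the ergodic theorem (14.A8).

(2) The mapping $(\mu, \Phi) \to \mu(f_\Phi)$ on $\mathcal{P}_\Theta(\Omega, \mathcal{F}) \times \mathcal{B}_\Theta$ is affine in $\mu$ and linear in $\Phi$. If $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ is endowed with the $\mathcal{L}$-topology and $\mathcal{B}_\Theta$ with the norm-topology, this mapping is jointly continuous in $\mu$ and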
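$$\sum_{A\ni 0} |A|^{-1}\,\mu(\Phi_A)$$
for the specific energy. ○

We shall now assume again that $(E, \mathcal{E})$ is equipped with a finite a priori measure $\lambda$. The finiteness of $\lambda$ ensures that each $\Phi \in \mathcal{B}_\Theta$ is $\lambda$-admissible. The following lemma provides a key to the question which was posed at the beginning of this section.

(15.28) Lemma. Consider any $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ and $\Phi \in \mathcal{B}_\Theta$. Let $(\nu_n)_{n\ge1}$ and $(\tilde\nu_n)_{n\ge1}$ be two sequences in $\mathcal{P}(\Omega, \mathcal{F})$, and let $(\Lambda_n)_{n\ge1}$ be a sequence of cubes with $|\Lambda_n| \to \infty$. Under these conditions, the limit
$$\lim_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu_n\gamma^\Phi_{\Lambda_n})$$
exists if and only if the limit
$$\lim_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\tilde\nu_n\gamma^\Phi_{\Lambda_n})$$
exists, and in this case both limits coincide.

Proof. We can assume that $\lambda \in \mathcal{P}(E, \mathcal{E})$. We fix any $\Lambda \in \mathcal{S}$ and $\nu, \tilde\nu \in \mathcal{P}(\Omega, \mathcal{F})$. On $\mathcal{F}_\Lambda$, $\nu\gamma^\Phi_\Lambda$ is $\lambda^S$-continuous with positive density $\rho^{\Phi,\nu}_\Lambda = \int \nu(d\omega)\,\rho^\Phi_\Lambda(\sigma_\Lambda\,\omega_{S\setminus\Lambda})$. Similarly, $\tilde\nu\gamma^\Phi_\Lambda$ is $\lambda^S$-continuous on $\mathcal{F}_\Lambda$ with density $\rho^{\Phi,\tilde\nu}_\Lambda$. From this we conclude that either $\mathcal{H}_\Lambda(\mu\,|\,\nu\gamma^\Phi_\Lambda) = \infty = \mathcal{H}_\Lambda(\mu\,|\,\tilde\nu\gamma^\Phi_\Lambda)$, or there exists an $\mathcal{F}_\Lambda$-measurable function $f_\Lambda \ge 0$ such that $\mu = f_\Lambda\,\lambda^S$ on $\mathcal{F}_\Lambda$. In the second case we have
$$(15.29)\qquad \mathcal{H}_\Lambda(\mu\,|\,\nu\gamma^\Phi_\Lambda) = \mu\big(\log f_\Lambda/\rho^{\Phi,\nu}_\Lambda\big) = \mathcal{H}_\Lambda(\mu\,|\,\tilde\nu\gamma^\Phi_\Lambda) + \mu\big(\log \rho^{\Phi,\tilde\nu}_\Lambda/\rho^{\Phi,\nu}_\Lambda\big).$$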
Let r(A,
P
> A Ç S X A ) / P A > A « ; S X A ) g e 2 * A -«>
for all $\xi, \omega \in \Omega$. Hence $\|\log \rho^{\Phi,\tilde\nu}_\Lambda/\rho^{\Phi,\nu}_\Lambda\| \le 2r(\Lambda, \Phi)$. Inserting this estimate into (15.29) we obtain the lemma. □
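(15.30) Theorem. Let $\Phi \in \mathcal{B}_\Theta$ and $(\Lambda_n)_{n\ge1}$ be a sequence of cubes with $|\Lambda_n| \to \infty$. Then the following conclusions hold.
(a) For any sequence $(\omega^n)_{n\ge1}$ in $\Omega$, the limit
$$(15.31)\qquad P(\Phi) = \lim_{n\to\infty} |\Lambda_n|^{-1} \log Z^\Phi_{\Lambda_n}(\omega^n)$$
exists and depends only on $\Phi$ (and $\lambda$).
(b) For each $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ and $\nu \in \mathcal{G}_\Theta(\Phi)$, the limit
$$h(\mu\,|\,\nu) = \lim_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu)$$
exists and is equal to the quantity
$$(15.32)\qquad h(\mu\,|\,\Phi) = P(\Phi) + \langle\mu, \Phi\rangle - h(\mu).$$

Proof. From Theorem (4.23) and Corollary (5.16) we know that there exists some $\mu \in \mathcal{G}_\Theta(\Phi)$. Putting $\nu_n = \mu$ and $\tilde\nu_n = \delta_{\omega^n}$ in Lemma (15.28), we see that
$$(15.33)\qquad \lim_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}\big(\mu\,|\,\gamma^\Phi_{\Lambda_n}(\cdot\,|\,\omega^n)\big) = 0.$$
On the other hand, for each $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$ we can write
$$(15.34)\qquad \mathcal{H}_\Lambda\big(\mu\,|\,\gamma^\Phi_\Lambda(\cdot\,|\,\omega)\big) = -\mathcal{H}_\Lambda(\mu) + \mu\big(H^\Phi_\Lambda(\sigma_\Lambda\,\omega_{S\setminus\Lambda})\big) + \log Z^\Phi_\Lambda(\omega)$$
(cf. equation (15.29)). Together with Theorems (15.12) and (15.23), equations (15.33) and (15.34) imply that the limit in (15.31) exists in $]-\infty, \infty]$. In fact, this limit must be finite. This follows from the easily verified fact that $|\Lambda|^{-1}\,\|\log Z^\Phi_\Lambda\| \le \|\Phi\|_0 + |\log\lambda(E)|$ for all $\Lambda \in \mathcal{S}$. Assertion (a) is thus proved.

Turning to the proof of (b), we let $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ be arbitrary. Combining Theorems (15.12) and (15.23) with assertion (a), we can conclude from (15.34) that the limit on the left side of (15.33) exists and is equal to the expression in (15.32). Assertion (b) thus follows from Lemma (15.28), as applied to $\nu_n = \nu$ and $\tilde\nu_n = \delta_{\omega^n}$. □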
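Assertion (a) of Theorem (15.30), the independence of the pressure from the boundary condition, can be observed in the simplest example. The sketch below is an illustration under assumptions of our own: $S = \mathbb{Z}$, $E = \{-1, +1\}$ with counting measure, and the nearest-neighbour Ising Hamiltonian $H = -J\sum \sigma_i\sigma_{i+1} - h\sum\sigma_i$ with Gibbs weight $e^{-H}$. The function names and parameter values are hypothetical. It computes $n^{-1}\log Z_n$ for all four choices of boundary spins and compares with the logarithm of the largest transfer-matrix eigenvalue.

```python
import math

SPINS = (-1, 1)


def log_partition(n, J, h, left, right):
    """log Z for n spins on a line, with fixed boundary spins `left` and `right`.

    The state vector is renormalized at every step so that large n does not overflow.
    """
    # Weight of the first spin, including its bond to the left boundary spin.
    v = [math.exp(J * left * s + h * s) for s in SPINS]
    log_scale = 0.0
    for _ in range(n - 1):
        v = [
            sum(v[i] * math.exp(J * s * t + h * t) for i, s in enumerate(SPINS))
            for t in SPINS
        ]
        norm = sum(v)
        v = [x / norm for x in v]
        log_scale += math.log(norm)
    # Bond of the last spin to the right boundary spin.
    tail = sum(v[i] * math.exp(J * s * right) for i, s in enumerate(SPINS))
    return log_scale + math.log(tail)


def pressure_exact(J, h):
    """Logarithm of the largest eigenvalue of the 2x2 Ising transfer matrix."""
    lam = math.exp(J) * math.cosh(h) + math.sqrt(
        math.exp(2 * J) * math.sinh(h) ** 2 + math.exp(-2 * J)
    )
    return math.log(lam)


if __name__ == "__main__":
    J, h, n = 0.7, 0.3, 400
    print("exact pressure:", pressure_exact(J, h))
    for left in SPINS:
        for right in SPINS:
            print(left, right, log_partition(n, J, h, left, right) / n)
```

The four finite-volume values differ only by a boundary term of order $1/n$, which is the one-dimensional shadow of the estimate $r(\Lambda, \Phi) = o(|\Lambda|)$ used in the proof.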
(15.35) Corollary. For all $\Phi \in \mathcal{B}_\Theta$ and $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ we have $h(\mu\,|\,\Phi) \ge 0$. If $\mu \in \mathcal{G}_\Theta(\Phi)$ then $h(\mu\,|\,\Phi) = 0$.

Proof. Combine Theorem (15.30)(b) with Proposition (15.5)(a). □
The preceding corollary is the first half of the variational principle: The function $h(\cdot\,|\,\Phi)$ at (15.32) attains its minimum $0$ on $\mathcal{G}_\Theta(\Phi)$. In the next section we shall prove the other half: Each $\mu$ which minimizes $h(\cdot\,|\,\Phi)$ is a Gibbs measure for $\Phi$. Corollary (15.35) also clarifies the significance of the function $P(\Phi)$ in (15.31): $-P(\Phi)$ is the common value of the specific free energy of all $\mu \in \mathcal{G}_\Theta(\Phi)$. In the lattice gas interpretation of our mathematical setting, P(
15.4 The variational principle
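Here we shall prove the converse of Corollary (15.35): If $h(\mu\,|\,\Phi) = 0$ then $\mu \in \mathcal{G}_\Theta(\Phi)$. Surprisingly, there are no difficulties in proving this interesting half of the variational principle in much greater generality.

(15.37) Theorem. Let $\gamma$ be a quasilocal specification, $\nu \in \mathcal{G}_\Theta(\gamma)$, and $(\Lambda_n)_{n\ge1}$ a sequence of cubes with $|\Lambda_n| \to \infty$. Suppose $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ is such that
$$\liminf_{n\to\infty} |\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu) = 0.$$
Then $\mu \in \mathcal{G}_\Theta(\gamma)$.

Proof. To begin, we notice that $\mathcal{H}_\Delta(\mu\,|\,\nu) < \infty$ for all $\Delta \in \mathcal{S}$. This follows from Proposition (15.5)(c) and the shift-invariance of $\mu$ and $\nu$. Consequently, for each $\Delta \in \mathcal{S}$ there exists an $\mathcal{F}_\Delta$-measurable function $f_\Delta \ge 0$ such that $\mu = f_\Delta\,\nu$ on $\mathcal{F}_\Delta$. Let us fix an arbitrary set $\Lambda \in \mathcal{S}$. We shall prove, in three steps, that $\mu\gamma_\Lambda = \mu$. As we shall see in the third step, we only need to show that there exist arbitrarily large sets $\Delta \in \mathcal{S}$ for which $\nu(|f_\Delta - f_{\Delta\setminus\Lambda}|)$ is arbitrarily small. This will be verified in the first two steps.

Step 1. For each $\delta > 0$ and each cube $C \supset \Lambda$ there exists a set $\Delta \in \mathcal{S}$ with $\Delta \supset C$ such that
$$\mathcal{H}_\Delta(\mu\,|\,\nu) - \mathcal{H}_{\Delta\setminus\Lambda}(\mu\,|\,\nu) \le \delta.$$
To show this we let $n \ge 1$ be such that $|\Lambda_n| \ge |C|$ and $|\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu) \le \delta/2^d|C|$. Next we choose an integer $m \ge 1$ such that
$$m^d|C| \le |\Lambda_n| \le (2m)^d|C|.$$
Finally, we choose $m^d$ lattice sites $i(1), \ldots, i(m^d) \in S$ in such a way that the translates $C(k) = C + i(k)$, $1 \le k \le m^d$, are pairwise disjoint subsets of $\Lambda_n$. For each $1 \le k \le m^d$ we put $W(k) = C(1)\cup\cdots\cup C(k)$ and $\Lambda(k) = \Lambda + i(k)$; we also set $W(0) = \emptyset$. Then we can write, using Proposition (15.5)(c) in the first and third step,
$$\begin{aligned}
m^{-d}\sum_{k=1}^{m^d} \big[\mathcal{H}_{W(k)}(\mu\,|\,\nu) - \mathcal{H}_{W(k)\setminus\Lambda(k)}(\mu\,|\,\nu)\big]
&\le m^{-d}\sum_{k=1}^{m^d} \big[\mathcal{H}_{W(k)}(\mu\,|\,\nu) - \mathcal{H}_{W(k-1)}(\mu\,|\,\nu)\big] \\
&= m^{-d}\,\mathcal{H}_{W(m^d)}(\mu\,|\,\nu) \\
&\le m^{-d}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu) \\
&\le 2^d|C|\,|\Lambda_n|^{-1}\,\mathcal{H}_{\Lambda_n}(\mu\,|\,\nu) \le \delta.
\end{aligned}$$
Consequently, there exists an index $k$ such that
$$\mathcal{H}_{W(k)}(\mu\,|\,\nu) - \mathcal{H}_{W(k)\setminus\Lambda(k)}(\mu\,|\,\nu) \le \delta.$$
The claim of Step 1 thus follows by putting $\Delta = W(k) - i(k)$ and using the shift-invariance of $\mu$ and $\nu$.

Step 2. For each $\varepsilon > 0$ there exists some $\delta > 0$ such that $\nu(|f_\Delta - f_{\Delta\setminus\Lambda}|) \le \varepsilon$ whenever $\Lambda \subset \Delta \in \mathcal{S}$ and $\mathcal{H}_\Delta(\mu\,|\,\nu) - \mathcal{H}_{\Delta\setminus\Lambda}(\mu\,|\,\nu) \le \delta$. For let $\psi$ be the function at (15.4). Then
$$(15.38)\qquad \mathcal{H}_\Delta(\mu\,|\,\nu) - \mathcal{H}_{\Delta\setminus\Lambda}(\mu\,|\,\nu) = \mu\big(\log f_\Delta/f_{\Delta\setminus\Lambda}\big) = \nu\big(f_\Delta \log f_\Delta/f_{\Delta\setminus\Lambda}\big) = \nu\big(f_{\Delta\setminus\Lambda}\,\psi(f_\Delta/f_{\Delta\setminus\Lambda})\big).$$
(Note that $f_\Delta = 0$ $\nu$-a.s. on $\{f_{\Delta\setminus\Lambda} = 0\}$ because $f_{\Delta\setminus\Lambda} = \nu(f_\Delta\,|\,\mathcal{F}_{\Delta\setminus\Lambda})$ $\nu$-a.s.) A glance at the graph of $\psi$ shows that there is a number $0 < r < \infty$ such that $|x - 1| \le r\,\psi(x) + \varepsilon/2$ for all $x \ge 0$. Inserting this inequality into (15.38) we complete the proof of Step 2 by putting $\delta = \varepsilon/2r$. (In fact, we might invoke an inequality of Csiszár (1967) to obtain the stronger result that the expression in (15.38) is not less than $\nu(|f_\Delta - f_{\Delta\setminus\Lambda}|)^2/2$.)

Step 3. To prove that $\mu\gamma_\Lambda = \mu$ we fix any $g \in \mathcal{L}$ and $\varepsilon > 0$. Since $\gamma$ is assumed to be quasilocal, there exists a $\mathcal{T}_\Lambda$-measurable function $\tilde g \in \mathcal{L}$ such that $\|\gamma_\Lambda g - \tilde g\| < \varepsilon$. Let $C \supset \Lambda$ be a cube such that $g \in \mathcal{L}_C$ and $\tilde g \in \mathcal{L}_{C\setminus\Lambda}$. Choose $\delta$ in terms of $\varepsilon$ as in Step 2, and define $\Delta$ in terms of $C$ and $\delta$ as in Step 1. Then
$$\begin{aligned}
|\mu\gamma_\Lambda(g) - \mu(g)| \le{}& \mu(|\gamma_\Lambda g - \tilde g|) + |\mu(\tilde g) - \nu(f_{\Delta\setminus\Lambda}\,\tilde g)| + \nu(f_{\Delta\setminus\Lambda}\,|\tilde g - \gamma_\Lambda g|) \\
&+ |\nu(f_{\Delta\setminus\Lambda}(\gamma_\Lambda g - g))| + \|g\|\,\nu(|f_{\Delta\setminus\Lambda} - f_\Delta|) + |\nu(f_\Delta\,g) - \mu(g)|.
\end{aligned}$$
Since $\tilde g \in \mathcal{L}_{\Delta\setminus\Lambda}$ and $g \in \mathcal{L}_\Delta$, the second and the last term on the right are zero. The fourth term vanishes because $\nu \in \mathcal{G}(\gamma)$ and $f_{\Delta\setminus\Lambda}$ is $\mathcal{T}_\Lambda$-measurable. Due to the choice of $\tilde g$, the first and the third term are each at most $\varepsilon$. The only non-trivial term is the fifth one. This term is not larger than $\|g\|\,\varepsilon$ because of our choice of $\Delta$. As $\varepsilon$ was arbitrary, we conclude that $\mu\gamma_\Lambda(g) = \mu(g)$. The proof is thus complete. □

Combining the theorem above with Example (2.25), Theorem (15.30)(b), and Corollary (15.35) we arrive at the following variational principle.

(15.39) Theorem. For each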
= a\coa)
n-*oo
= - l i m n'1 logQ{a,a)n+1/Qn+1 {a,a) n-*ao
= - l o g £(tf, a). The last identity follows from the fact that Q"{a,a) tends to a positive limit (cf. Theorem (3.A3)). On the other hand, if n ^ 1 and œ e Q are such that
o»s\[i. B ] =
a
t h e n
H*Un](co) = H?Un](ti) - H*Un](co°) logp*Un](coa)/p*,n](co)
=
= logf\
[Q(a,a)/Q(a>t,a>t+1).-]
i=0
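Hence
$$\langle\mu, \Phi\rangle = \mu\big(\log Q(a,a)/Q(\sigma_{-1}, \sigma_0)\big)$$
for all $\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$. This, together with equation (15.19), yields the identity
$$\begin{aligned}
h(\mu\,|\,\Phi) &= \mu\Big(-\log Q(\sigma_{-1}, \sigma_0) + \sum_{x\in E} \mu(\sigma_0 = x\,|\,\mathcal{F}_{]-\infty,0[})\,\log\mu(\sigma_0 = x\,|\,\mathcal{F}_{]-\infty,0[})\Big) \\
&= \int \mu(d\omega)\;\mathcal{H}\big(\mu(\sigma_0 = \cdot\,|\,\mathcal{F}_{]-\infty,0[})(\omega)\;\big|\;Q(\omega_{-1}, \cdot)\big).
\end{aligned}$$
This identity provides an easy proof of the variational principle for $\mu_Q$: By Proposition (15.5), $h(\mu\,|\,\Phi) = 0$ if and only if $\mu(\sigma_0 = x\,|\,\mathcal{F}_{]-\infty,0[}) = Q(\sigma_{-1}, x)$ $\mu$-a.s. for all $x \in E$. Since $\mu$ is shift-invariant, the latter condition just means that $\mu = \mu_Q$. ○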
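The identity at the end of the example can be evaluated in closed form when $\mu$ is itself a stationary Markov chain. The sketch below rests on that extra assumption, which is ours: if $\mu$ has transition matrix $R$ and stationary vector $\pi_R$, the conditional law of $\sigma_0$ given the past is $R(\sigma_{-1}, \cdot)$, so the specific free energy relative to $\mu_Q$ reduces to $\sum_{x,y}\pi_R(x)R(x,y)\log\big(R(x,y)/Q(x,y)\big)$. The function names and matrices are hypothetical.

```python
import math


def stationary(P, iters=10_000):
    """Stationary distribution of a positive transition matrix via power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]
    return pi


def specific_free_energy(R, Q):
    """Averaged relative entropy of the rows of R with respect to the rows of Q.

    R is the transition matrix of the test measure, Q (strictly positive) that of
    the reference Markov chain.
    """
    pi = stationary(R)
    n = len(R)
    return sum(
        pi[x] * R[x][y] * math.log(R[x][y] / Q[x][y])
        for x in range(n)
        for y in range(n)
        if R[x][y] > 0
    )


if __name__ == "__main__":
    Q = [[0.9, 0.1], [0.4, 0.6]]
    R = [[0.5, 0.5], [0.5, 0.5]]
    print("h(mu_Q | Q) =", specific_free_energy(Q, Q))  # vanishes at the Gibbs measure
    print("h(mu_R | Q) =", specific_free_energy(R, Q))  # strictly positive otherwise
```

Within the class of Markov chains this reproduces both halves of the variational principle: the value is never negative, and it is zero only when $R = Q$.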
The next chapter is devoted to a number of consequences of the variational principle. Here we will just mention that the variational principle (15.39) sheds a new light on several results of Sections 14.2 and 4.4. Firstly, we have shown in (14.15) that each Gibbs simplex $\mathcal{G}_\Theta(\Phi)$ is a face of $\mathcal{P}_\Theta(\Omega, \mathcal{F})$. This is an immediate consequence of Theorem (15.39) and the fact that the functional $h(\cdot\,|\,\Phi)$ is affine (cf. Proposition (15.14)). Secondly, in the case of a standard Borel state space it has been shown in (14.17) that the extreme decomposition in $\mathcal{G}_\Theta(\Phi)$ of a given $\mu \in \mathcal{G}_\Theta(\Phi)$ coincides with the ergodic decomposition of $\mu$. This follows directly from the variational principle together with Theorem (15.20). (Note, however, that Theorem (15.20) is much deeper than Theorem (14.17).) Next, we have seen in (4.23)(c) that the graph $\{(\Phi, \mu) : \Phi \in \mathcal{B},\ \mu \in \mathcal{G}(\Phi)\}$ of the Gibbs correspondence is closed. Plainly, this implies that the set
{((D,//):
last assertion of Proposition (15.14). For, %(M) c L
PB(0,F):
A(n) ^ P(0) - 2 sup ||O|| 0 l
(
<SeM
)
because |<-,0>>| ^ ||O|| 0 and |P(
%\F) = j© £ ^ 0 : inf 4n\®) = ol, and Remark (15.26)(2) and Proposition (16.1) imply that inf Â(H\-) is conMeF tinuous.
15.5 Large deviations and equivalence of ensembles
The final topic of this chapter is the role played by the specific free energy for the large deviation probabilities of Gibbs measures. Recall that the ergodic theorems of Section 14.A deal with spatial averages of the form
$$R_\Lambda f = |\Lambda|^{-1}\sum_{i\in\Lambda} f\circ\theta_{-i}$$
in the limit when $\Lambda$ runs through a sequence of cubes with $|\Lambda| \to \infty$. (In contrast to Section 14.A, we use here the backward shifts $\theta_{-i}$ rather than the forward shifts $\theta_i$, so that $R_\Lambda f$ depends on the spins around $\Lambda$ itself instead of its reflection $-\Lambda$.) It will be convenient to modify $R_\Lambda$ as follows.

(15.41) Definition. Let $\Lambda$ be a cube in $S = \mathbb{Z}^d$ of side length $p \in \mathbb{N}$, i.e., $\Lambda = S \cap \prod_{k=1}^{d} [m_k, m_k + p[$ with $m \in S$. For each configuration $\omega \in \Omega$, let $\omega^\circ_\Lambda$ be the periodic continuation of $\omega_\Lambda$, which at site $i \in S$ takes the value $\omega_{j(i)}$, the spin of $\omega$ at the unique site $j(i) \in \Lambda$ with $i_k \equiv j(i)_k \bmod p$ for all $1 \le k \le d$. The probability measure
$$(15.42)\qquad {}^\circ\!R^\omega_\Lambda = |\Lambda|^{-1}\sum_{i\in\Lambda} \delta_{\theta_{-i}\,\omega^\circ_\Lambda}$$
on $(\Omega, \mathcal{F})$ is then called the periodic empirical field of $\omega$ in $\Lambda$. Since ${}^\circ\!R^\omega_\Lambda$ depends on $\omega_\Lambda$ only, we will use the same notation when $\omega \in E^\Lambda$.
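The advantage of the periodization is that each ${}^\circ\!R^\omega_\Lambda$ is translation invariant, so that ${}^\circ\!R^\omega_\Lambda \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ for every cube $\Lambda$ and all $\omega$. On the other hand, the next remark shows that this change makes no real difference when $\Lambda$ becomes large. Throughout this section we write $|\Lambda| \to \infty$ for the infinite volume limit along an arbitrarily chosen sequence of cubes. Note also that the probability kernels ${}^\circ\!R_\Lambda : (\omega, A) \to {}^\circ\!R^\omega_\Lambda(A)$ can be applied as usual to functions $f$ on $\Omega$.

(15.43) Remarks. (1) For each $f \in \mathcal{L}$, $\|{}^\circ\!R_\Lambda f - R_\Lambda f\| \to 0$ when $|\Lambda| \to \infty$. To see this, let $\Delta \in \mathcal{S}$ be such that $f \in \mathcal{L}_\Delta$. Then $f(\theta_{-i}\,\omega) = f(\theta_{-i}\,\omega^\circ_\Lambda)$ for all $i \in \Lambda$ with $\Delta + i \subset \Lambda$ and all $\omega \in \Omega$. Hence $\|{}^\circ\!R_\Lambda f - R_\Lambda f\|$ is not larger than $2\|f\|\,|\{i \in \Lambda : \Delta + i \not\subset \Lambda\}|/|\Lambda|$, which tends to zero as $|\Lambda| \to \infty$.

(2) For each $\mu \in \operatorname{ex}\mathcal{P}_\Theta(\Omega, \mathcal{F})$, the random measures ${}^\circ\!R_\Lambda : \omega \to {}^\circ\!R^\omega_\Lambda$ in $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ converge to $\mu$ in $\mu$-probability when $|\Lambda| \to \infty$. That is, for every neighbourhood $U \in e(\mathcal{P}_\Theta(\Omega, \mathcal{F}))$ of $\mu$ we have
$$(15.44)\qquad \mu({}^\circ\!R_\Lambda \in U) \to 1 \quad\text{as } |\Lambda| \to \infty.$$
Indeed, by (4.2) we can assume that $U$ consists of all $\nu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})$ with $|\nu(f_j) - \mu(f_j)| < \varepsilon$ for all $1 \le j \le k$, where $k \in \mathbb{N}$, $f_1, \ldots, f_k \in \mathcal{L}$ and $\varepsilon > 0$ are arbitrarily chosen. The claim then follows from the first remark together with the ergodic theorem (14.A5).

(3) Let $\Phi \in \mathcal{B}_\Theta$ and ${}^\circ\!H^\Phi_\Lambda$ be the Hamiltonian in $\Lambda$ with periodic boundary conditions; recall its definition in Example (4.20)(2). It is then easily checked that ${}^\circ\!H^\Phi_\Lambda$ can be expressed in terms of ${}^\circ\!R_\Lambda$ via the simple formula
$${}^\circ\!H^\Phi_\Lambda = |\Lambda|\,\langle{}^\circ\!R_\Lambda, \Phi\rangle.$$
○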
The following large deviation principle implies that the convergence in (15.44) is exponentially fast when $\mu$ is Gibbsian for some $\Phi$ and $U$ is a neighbourhood of the whole set $\mathcal{G}_\Theta(\Phi)$. Equivalently, a behaviour of ${}^\circ\!R_\Lambda$ that is untypical for the whole class of Gibbs measures can only occur with a probability decaying exponentially fast to zero with speed $|\Lambda|$. The rate of this exponential decay is determined by the excess free energy functional $h(\cdot\,|\,\Phi)$. Throughout this section we assume that $(E, \mathcal{E})$ is standard Borel. For each set $C \subset \mathcal{P}_\Theta(\Omega, \mathcal{F})$ we let $\bar C$ (resp. $C^\circ$) denote the closure (resp. the interior) of $C$ in the $\mathcal{L}$-topology.

(15.45) Theorem. For each $\Phi \in \mathcal{B}_\Theta$ and $C \in e(\mathcal{P}_\Theta(\Omega, \mathcal{F}))$, the inequalities
$$(15.46)\qquad \limsup_{|\Lambda|\to\infty} |\Lambda|^{-1}\log\sup_{\omega\in\Omega} \gamma^\Phi_\Lambda({}^\circ\!R_\Lambda \in C\,|\,\omega) \le -\inf_{\nu\in\bar C} h(\nu\,|\,\Phi),$$
$$(15.47)\qquad \liminf_{|\Lambda|\to\infty} |\Lambda|^{-1}\log\inf_{\omega\in\Omega} \gamma^\Phi_\Lambda({}^\circ\!R_\Lambda \in C\,|\,\omega) \ge -\inf_{\nu\in C^\circ} h(\nu\,|\,\Phi)$$
hold. In particular, (15.46) and (15.47) remain true when the measures $\gamma^\Phi_\Lambda(\cdot\,|\,\omega)$ are replaced by any $\mu \in \mathcal{G}(\Phi)$.
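Note that the infimum on the right side of (15.46) is attained whenever it is finite. This is because the level sets $\{h(\cdot\,|\,\Phi) \le c\}$ are compact; see Proposition (15.14) and recall that $\langle\cdot, \Phi\rangle$ is bounded and continuous. In particular, the variational principle (15.39) implies that $\inf_{\nu\in\bar C} h(\nu\,|\,\Phi) > 0$ when $\mathcal{G}_\Theta(\Phi)\cap\bar C = \emptyset$.

We remark further that the inequalities (15.46) and (15.47) remain true when the Gibbs distributions $\gamma^\Phi_\Lambda(\cdot\,|\,\omega)$ with configurational boundary conditions $\omega \in \Omega$ are replaced by the Gibbs distributions ${}^\circ\!\gamma^\Phi_\Lambda$ with periodic boundary conditions. The latter are given by ${}^\circ\!\gamma^\Phi_\Lambda = \exp[-{}^\circ\!H^\Phi_\Lambda]\,\lambda^\Lambda/{}^\circ\!Z^\Phi_\Lambda$, where ${}^\circ\!H^\Phi_\Lambda$ is as in Remark (15.43)(3) above and ${}^\circ\!Z^\Phi_\Lambda$ the appropriate normalizing constant. In fact, since ${}^\circ\!H^\Phi_\Lambda$ is a function of ${}^\circ\!R_\Lambda$, the proof of Theorem (15.45) then becomes slightly simpler because there is no need to consider the energy errors (15.25).

We postpone the proof of the theorem and consider first a classical special case. For simplicity we confine ourselves to the case of periodic boundary conditions. Let us set up the stage. We fix some $\Phi \in \mathcal{B}_\Theta$ as well as $k$ potentials $\Psi^1, \ldots, \Psi^k \in \mathcal{B}_\Theta$, which are combined to a vector-valued potential $\Psi = (\Psi^1, \ldots, \Psi^k)$. Let $t\cdot\Psi \in \mathcal{B}_\Theta$ be the usual inner product of $\Psi$ with a vector $t \in \mathbb{R}^k$. We shall look at the probabilities that the $\mathbb{R}^k$-valued Hamiltonians ${}^\circ\!H^\Psi_\Lambda = |\Lambda|\,\langle{}^\circ\!R_\Lambda, \Psi\rangle$ with periodic boundary conditions take prescribed values.

(15.48) Corollary. In the setup above, we have for any Borel set $B \subset \mathbb{R}^k$
$$\limsup_{|\Lambda|\to\infty} |\Lambda|^{-1}\log{}^\circ\!\gamma^\Phi_\Lambda\big(|\Lambda|^{-1}\,{}^\circ\!H^\Psi_\Lambda \in B\big) \le -\inf_{x\in\bar B} J_\Psi(x\,|\,\Phi),$$
$$\liminf_{|\Lambda|\to\infty} |\Lambda|^{-1}\log{}^\circ\!\gamma^\Phi_\Lambda\big(|\Lambda|^{-1}\,{}^\circ\!H^\Psi_\Lambda \in B\big) \ge -\inf_{x\in B^\circ} J_\Psi(x\,|\,\Phi).$$
Here,
$$(15.49)\qquad J_\Psi(x\,|\,\Phi) = \inf_{\nu\in\mathcal{P}_\Theta(\Omega,\mathcal{F}):\ \langle\nu,\Psi\rangle = x} h(\nu\,|\,\Phi) = \sup_{t\in\mathbb{R}^k}\big[t\cdot x - P(\Phi - t\cdot\Psi)\big] + P(\Phi)$$
is a convex function of $x \in \mathbb{R}^k$ which has compact level sets $\{J_\Psi(\cdot\,|\,\Phi) \le c\}$ and satisfies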
Let $\Phi \in \mathcal{B}_\Theta$ be arbitrary and $\Psi \in \mathcal{B}_\Theta$ the self-potential where $f : E \to \mathbb{R}$ is any bounded measurable function.
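The simplest instance of Corollary (15.48) is the classical Cramér setting, which can be checked by direct computation. The sketch below makes assumptions that are ours and not the text's: $\Phi = 0$, $E = \{0, 1\}$ with the Bernoulli($p$) law as a priori measure, and a self-potential built from $f(x) = x$, so that the Hamiltonian of a block of $n$ sites is the number of ones $S_n$. Under these assumptions $P(\Phi) = 0$ and $P(\Phi - t\Psi) = \log(p e^t + 1 - p)$, and the rate function in (15.49) is both a relative entropy and a Legendre transform. All function names are hypothetical.

```python
import math


def rate_function(x, p):
    """Relative entropy of Bernoulli(x) with respect to Bernoulli(p)."""
    total = 0.0
    if x > 0:
        total += x * math.log(x / p)
    if x < 1:
        total += (1 - x) * math.log((1 - x) / (1 - p))
    return total


def legendre_rate(x, p, lo=-30.0, hi=30.0, steps=200):
    """sup over t of t*x - log(p*e^t + 1 - p), by ternary search (concave objective)."""

    def objective(t):
        return t * x - math.log(p * math.exp(t) + 1 - p)

    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)


def log_tail_probability(n, p, x):
    """log P(S_n >= x*n) for S_n ~ Binomial(n, p), computed with log-sum-exp."""
    k_min = math.ceil(x * n)
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


if __name__ == "__main__":
    p, x = 0.3, 0.5
    print("relative entropy  :", rate_function(x, p))
    print("Legendre transform:", legendre_rate(x, p))
    for n in (100, 1000, 2000):
        print(n, -log_tail_probability(n, p, x) / n)
```

The empirical decay rates approach the common value of the two expressions from above, with a correction of order $(\log n)/n$.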
Step 2. Here we establish a weaker form of (15.46), which is obtained by replacing C on its right-hand side with the closed convex hull ex C of C. The basic observation is that the measures Y^^RA belong to ex C. Indeed, for any choice of cylinder events A\,..., At and £ > 0, a convex combination v of measures in C with max]<_,-
-
i
+ P(0)]^mcA
l
J
inf
^
^
vecxC
where co e Q is allowed to vary arbitrarily with A. Suppose this inequality fails. There is then a constant c < mc and a sequence of A's and &>'s satisfying
- * ( y £ j +
inf 4(v|0) = inf 4(v|4>). vecxC
vec
This relies on the fact that $h(\cdot\,|\,\Phi)$ is affine and its level sets $L_{\Phi,c} = \{h(\cdot\,|\,\Phi) \le c\}$ are compact. We use some convex analysis. To meet the standard hypotheses we need to change the topology on $\mathcal{P}_\Theta(\Omega, \mathcal{F})$. By Corollary (4.A13) we can choose a countable core $\mathcal{C}$ of $\mathcal{F}$ with $\mathcal{C} \subset \mathcal{F}^0$. The $\mathcal{C}$-topology on $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ is then defined as the coarsest topology making all maps $\nu \to \nu(A)$ with $A \in \mathcal{C}$ continuous. It is coarser than the $\mathcal{L}$-topology and, by Comment (4.A10)(1), turns $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ into a compact metrizable set. As the level sets $L_{\Phi,c}$ are compact in the $\mathcal{L}$-topology, they are so in the $\mathcal{C}$-topology, so that $h(\cdot\,|\,\Phi)$ is still lower semicontinuous in the latter topology. We conclude further that both topologies coincide on each $L_{\Phi,c}$, and thus on $\{h(\cdot\,|\,\Phi) < \infty\}$. We write $\tilde C$ for the closure of $C$ in the $\mathcal{C}$-topology. Suppose now $m_C$ is finite (since otherwise (15.54) is trivial) and let $\mu \in \overline{\operatorname{co}}\,C$ be such that $h(\mu\,|\,\Phi) = m_C$. Proposition 1.2 and Lemma 9.7 of Phelps (1966) then show that there exists a probability measure $w_\mu$ on (the $\mathcal{C}$-Borel sets of) $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ which is supported on $\tilde C$ and satisfies
$$h(\mu\,|\,\Phi) = \int h(\nu\,|\,\Phi)\, w_\mu(d\nu).$$
So, $w_\mu$ is in fact concentrated on $\tilde C \cap \{h(\cdot\,|\,\Phi) < \infty\}$, and $\tilde C$ can be replaced by $\bar C$ here. Hence $m_C \ge \inf_{\nu\in\bar C} h(\nu\,|\,\Phi)$, and (15.54) follows. □

Proof of the lower bound (15.47). We still use the abbreviations introduced in the proof of the upper bound. It is sufficient to show that
$$(15.55)\qquad \liminf_{|\Lambda|\to\infty} |\Lambda|^{-1}\log\gamma_{\Lambda|\omega}({}^\circ\!R_\Lambda \in C) \ge -h(\nu\,|\,\Phi)$$
for every v e C° with 4(v|4>) < oo and arbitrary boundary conditions co depending on A. Let£ > 0 be given. By Proposition (15.52) and the continuity of <•, 0> one can find a measure v G C° D e x ^ 0 ( Œ , ^") with 4 ( v | 0 ) < 4(v|<ï>) -f £. On the other hand, using (15.25) and (15.31) we obtain logKA|Xi?A e C) = logX A ( e - w A^A^A) A
1(ORA6C))
_ logZ*(o>)
R
£ log^(,-' l<° -*> l,oRAeC)) - | A | ( P ( * ) + e ) if A is large enough. To estimate this further we introduce the event AA = {°/?A e C°, <°RA, $> <
-A(v)+ej.
Here, / A is the Radon-Nikodym density of ÛIJ^A relative to a A "'(^ A ), which exists because A(v) is finite. Then we can write lAp'logKAiXtfA e C) ^ l A r ' l o g ^ / " 1 1AA) -<Î5, 0 > - P ( 0 ) - 2 £ ^
1
|Ar logv(AA)-/f(v|0)-4£
when A is sufficiently large. Now, the point is that AA is typical for v, in that v(A A ) —> 1 as |A| —> oo. This follows from the ergodicity of v together with (15.44), the continuity of <•, 0>, and the theorem of McMillan. The latter asserts that vdlAr'log/A + ^ v ) ! ) - » ^
as|A|^oo;
see Theorem 9.2.4 of Krengel (1985) or the references in the Notes on Section 15.2. So, letting first |A| -> oo and then e ^ O w e arrive at (15.55). • Proof of Corollary (15.48). We apply the so-called contraction principle to the continuous mapping ey : v ->
inf [ P ( 0 - * ) + L ( 0 - * ) ] ^ - c + P(O) + L(O).
Hence
veC
In view of (15.47), the conditional probabilities y^w are then eventually welldefined, and the identities in the proof of (15.46) together with (15.47) give lim sup [(y£lw°RA, O) - A(y^J
+ P(O)]
|A|-
S - lim inf | A p 1 log yA]a>(°RA G C) ^ mc . \A\—KyD
Chapter 16 Convex geometry and the phase diagram
One of the principal problems of Statistical Mechanics is the investigation of the sets $\{\Phi \in \mathcal{B}_\Theta : |\operatorname{ex}\mathcal{G}_\Theta(\Phi)| = N\}$ ($1 \le N \le \infty$) in the Banach space $\mathcal{B}_\Theta$ of all shift-invariant absolutely summable potentials. These sets form a partition of $\mathcal{B}_\Theta$ which is called the phase diagram. Whilst the true nature of the phase diagram is still unknown, we can make some statements on its general structure, and this is the subject of this chapter.

We shall establish two different kinds of results on the phase diagram. First, we shall be concerned with potentials $\Phi \in \mathcal{B}_\Theta$ that admit at least $N$ distinct pure phases which can be distinguished by means of $N - 1$ prescribed local observables. Such potentials will be shown to exist in every closed convex subcone of $\mathcal{B}_\Theta$ which is related to the given observables in a suitable way. Moreover, such potentials are not isolated. Using the same method, we shall also be able to establish the existence of potentials which exhibit a breaking of shift-invariance or a breaking of a continuous symmetry. All this will be the subject of Section 16.3.

In Section 16.4 (which can be read independently of Section 16.3) we will study a different feature of the phase diagram. Namely, we shall consider all those potentials which admit only one shift-invariant Gibbs measure. These potentials will be shown to be generic. Moreover, the associated unique ergodic Gibbs measures will turn out to be dense in the set of all shift-invariant random fields with finite specific entropy.

The above-mentioned results depend on two key facts: the convexity of the pressure, and the variational principle. The latter implies that each shift-invariant random field $\mu$ with finite specific entropy can be identified with a unique closed hyperplane in $\mathcal{B}_\Theta \times \mathbb{R}$ which is a tangent or an asymptote to the graph of the pressure $P$. Moreover, $\mu$ is a Gibbs measure for a potential
16.1
The pressure and its tangent functionals
Throughout this chapter we let $S = \mathbb{Z}^d$ be the integer lattice of arbitrary dimension $d \ge 1$. Also, we let $(E, \mathcal{E})$ be any measurable space which is equipped with a finite a priori measure $\lambda$. First of all, we shall look at the pressure $P$ which was defined in (15.31) and (15.36). $P$ is a real function on the Banach space $\mathcal{B}_\Theta$ of all shift-invariant absolutely summable potentials with norm (15.21); cf. Example (5.8).
(16.1) Proposition. The pressure $P: \mathcal{B}_\Theta \to \mathbb{R}$ is convex and satisfies the inequality
$|P(\Phi) - P(\Psi)| \le \|\Phi - \Psi\|_0$ $(\Phi, \Psi \in \mathcal{B}_\Theta)$.
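Proof. Fix any $\Phi, \Psi \in \mathcal{B}_\Theta$, $\Lambda \in \mathcal{S}$, $\omega \in \Omega$, and $0 < s < 1$. Then
$\log Z_\Lambda^{s\Phi + (1-s)\Psi}(\omega) \le s \log Z_\Lambda^{\Phi}(\omega) + (1 - s) \log Z_\Lambda^{\Psi}(\omega)$,
by Hölder's inequality. In view of equation (15.31), we see that $P$ is convex. On the other hand, we have
$Z_\Lambda^{\Phi}(\omega) \le Z_\Lambda^{\Psi}(\omega) \exp \|H_\Lambda^{\Phi} - H_\Lambda^{\Psi}\| \le Z_\Lambda^{\Psi}(\omega) \exp(|\Lambda|\,\|\Phi - \Psi\|_0)$.
Taking the logarithm, dividing by $|\Lambda|$, letting $\Lambda$ run through a cofinal sequence of cubes, and interchanging the roles of $\Phi$ and $\Psi$ we obtain the asserted inequality.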
ôTP(
t
a£P(
oo
exist and satisfy the inequality 3^P(
Proof. By (16.1) and (16.2), the functions *P Q) = (~) Gy, where
dyP(
"fe(f
GT = {
V
=
D
U
"Jl
s>0
G
T,n,S-
Here Gv,njS = {
Let us show that the sets $G_{\Psi,n}$ are dense. The Baire category theorem then implies that $\mathcal{D}$ is dense. (See Reed and Simon (1972), Theorem III.8, for example.) Fix any $\Psi$ and $n$ and suppose $G_{\Psi,n}$ were not dense. Then there is some $\Phi^0 \in \mathcal{B}_\Theta$ and $\varepsilon > 0$ such that $\Phi \notin G_{\Psi,n}$ whenever $\|\Phi - \Phi^0\|_0 < \varepsilon$. In particular, if $|t| < \delta = \varepsilon/\|\Psi\|_0$ then $\Phi^0 + t\Psi \notin G_{\Psi,n}$. Hence, the convex function $t \to P(\Phi^0 + t\Psi)$ would be nowhere differentiable on the interval $]-\delta, \delta[$. This is impossible. □
The preceding proposition shows that the restriction of the pressure to any separable subspace of $\mathcal{B}_\Theta$ is differentiable on a dense $G_\delta$ set. In particular, if $E$ is finite then $\mathcal{B}_\Theta$ has a countable dense subset, namely the set of all finite range potentials with rational values. Thus, in this case, the pressure is differentiable on a dense $G_\delta$ subset of $\mathcal{B}_\Theta$. We shall now consider linear functionals on $\mathcal{B}_\Theta$ which are related to the pressure as follows.
(16.4) Definition. Suppose $L: \mathcal{B}_\Theta \to \mathbb{R}$ is linear. $L$ is said to be P-bounded if there exists a constant $c < \infty$ such that
(a) $L \le P + c$.
$L$ is called a tangent functional to $P$ at $\Phi \in \mathcal{B}_\Theta$ if
(b) $L(\Psi) \le P(\Phi + \Psi) - P(\Phi)$
for all $\Psi \in \mathcal{B}_\Theta$. The set of all tangent functionals to $P$ at $\Phi$ will be denoted by $\partial P(\Phi)$. Clearly, a linear functional $L: \mathcal{B}_\Theta \to \mathbb{R}$ is P-bounded if and only if the quantity
(16.5) $h(L) = \inf_{\Psi \in \mathcal{B}_\Theta} [P(\Psi) - L(\Psi)]$
is finite. (In Section 16.2 we shall see that $h(L)$ is the specific entropy of a shift-invariant random field which is associated with $L$.) $L$ is tangent to $P$ at some
tangent functional is P-bounded. Moreover, each P-bounded functional $L$ can be identified with the hyperplane $H_L = \{(\Psi, t) \in \mathcal{B}_\Theta \times \mathbb{R} : t = L(\Psi) + h(L)\}$ in $\mathcal{B}_\Theta \times \mathbb{R}$. Two cases are possible: Either $H_L$ is disjoint from the graph $G_P = \{(\Psi, t) \in \mathcal{B}_\Theta \times \mathbb{R} : t = P(\Psi)\}$ of $P$. Since $H_L$ and $G_P$ have distance zero, we then might think of $H_L$ as an asymptote to $G_P$. Or $H_L$ is a tangent to $G_P$ at some point $(\Phi, P(\Phi))$. This case occurs if and only if $L$ is a tangent functional to $P$ at $\Phi$. We add two further remarks. (16.6) Remarks. (1) If
(16.8) $\|\Phi - \Phi^0\|_0 \le \varepsilon^{-1} [P(\Phi^0) - L_0(\Phi^0) - h(L_0)]$
and
(16.9) $L(\Psi) \ge L_0(\Psi) - \varepsilon \|\Psi\|_0$ for all $\Psi \in C$.
(If $C$ is a linear subspace of $\mathcal{B}_\Theta$ then the last inequality just means that $\|L - L_0\|_C \equiv \sup_{\Psi \in C,\ \|\Psi\|_0 \le 1} |L(\Psi) - L_0(\Psi)| \le \varepsilon$.)
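The two inequalities (16.8) and (16.9) can be made concrete in a one-dimensional toy analogue. The sketch below is only an illustration under assumptions that are not part of the text: the potential space is replaced by the real line, the pressure by the convex function P(x) = log(2 cosh x), the cone C by the whole line, the P-bounded functional by L_0(x) = x (an asymptote, with inf [P - L_0] = 0), and the base point by x_0 = 0. It picks a touching point whose tangent slope lies within eps of L_0 and evaluates the distance bound.

```python
import math

def pressure(x: float) -> float:
    """Toy convex 'pressure' P(x) = log(2 cosh x) on the real line (assumption)."""
    return math.log(2.0 * math.cosh(x))

def approximating_tangent(eps: float):
    """Return (x, slope, bound) for L0(x) = x and base point x0 = 0.

    bound is the right-hand side of the analogue of (16.8);
    slope = P'(x) is the tangent functional at the touching point x.
    """
    x0 = 0.0
    h_L0 = 0.0  # inf_x [P(x) - x] = 0, approached as x -> +infinity
    bound = (pressure(x0) - x0 - h_L0) / eps
    x = math.atanh(1.0 - eps)   # solves P'(x) = tanh(x) = 1 - eps
    slope = math.tanh(x)
    return x, slope, bound

x, slope, bound = approximating_tangent(0.1)
```

In this toy case the touching point moves off to infinity as eps tends to 0, which mirrors the remark that an asymptote can only be approximated, never attained, by tangents.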
Proof. There is no loss in assuming that $L_0 = 0$. Otherwise we replace $P$ by the convex continuous function $P - L_0$ and the resulting $L$ by $L + L_0$.
1) As a first step, we shall construct the potential $\Phi$. For each $\Phi \in \mathcal{B}_\Theta$ we set $C(\Phi) = \{\Psi \in \Phi + C : P(\Psi) \le P(\Phi) - \varepsilon \|\Psi - \Phi\|_0\}$. We will show that there exists some $\Phi \in \Phi^0 + C$ which satisfies (16.8) and is such that $C(\Phi) = \{\Phi\}$. First of all, we note that each $C(\Phi)$ is closed and convex. Moreover, the $C(\Phi)$'s are nested, in that $C(\Psi) \subset C(\Phi)$ when $\Psi \in C(\Phi)$. For if $\Psi' \in C(\Psi)$ then $\Psi' \in \Psi + C \subset \Phi + C + C = \Phi + C$ and
$P(\Psi') \le P(\Psi) - \varepsilon \|\Psi' - \Psi\|_0 \le P(\Phi) - \varepsilon \|\Psi - \Phi\|_0 - \varepsilon \|\Psi' - \Psi\|_0 \le P(\Phi) - \varepsilon \|\Psi' - \Phi\|_0$.
Starting from the given potential $\Phi^0$, we define a sequence $(\Phi^n)_{n \ge 0}$ recursively by choosing $\Phi^{n+1} \in C(\Phi^n)$ in such a way that
$P(\Phi^{n+1}) < \inf_{\Psi \in C(\Phi^n)} P(\Psi) + \varepsilon\, 2^{-n}$.
$(\Phi^n)_{n \ge 1}$ is a Cauchy sequence. For if $n > m$ then $\Phi^n \in C(\Phi^m)$ and therefore
$\varepsilon \|\Phi^n - \Phi^m\|_0 \le P(\Phi^m) - P(\Phi^n) < \varepsilon\, 2^{-(m-1)}$.
Since $\mathcal{B}_\Theta$ is complete, the sequence $(\Phi^n)_{n \ge 1}$ converges to a limit $\Phi \in \mathcal{B}_\Theta$. By construction, $\Phi \in C($
$A = \{(\Psi, t) \in \mathcal{B}_\Theta \times \mathbb{R} : t > P(\Psi)\}$ and $B = \{(\Psi, t) \in \mathcal{B}_\Theta \times \mathbb{R} : \Psi \in \Phi + C,\ t \le P(\Phi) - \varepsilon \|\Psi - \Phi\|_0\}$.
Since $P$ is convex and continuous, its epigraph $A$ is convex and open. Similarly, since $\Phi + C$ is convex and the mapping $\Psi \to \|\Psi - \Phi\|_0$ is convex, the set $B$ is also convex. Moreover, $A$ and $B$ are disjoint. For suppose that $(\Psi, t) \in A \cap B$. Then $\Psi \in C(\Phi)$, whence $\Psi = \Phi$ and $P(\Phi) < t \le P(\Phi)$. This is impossible. Consequently, the sets $A$ and $B$ can be separated by a closed hyperplane. More precisely, there exists a continuous linear functional $\bar{L}$ on $\mathcal{B}_\Theta \times \mathbb{R}$ and a number $c \in \mathbb{R}$ such that $\bar{L}(\Psi, t) \le c$ when $(\Psi, t) \in A$ and $\bar{L}(\Psi, t) \ge c$ when $(\Psi, t) \in B$. Clearly, $\bar{L}$ can be written in the form
$\bar{L}(\Psi, t) = L(\Psi) + at$ $(\Psi \in \mathcal{B}_\Theta,\ t \in \mathbb{R})$,
where $L: \mathcal{B}_\Theta \to \mathbb{R}$ is a continuous linear functional and $a \in \mathbb{R}$. It is easy to see that $a < 0$. Rescaling $\bar{L}$ and $c$ if necessary we can thus assume that $a = -1$. Then $L(\Psi) - t \le c$ for all $\Psi \in \mathcal{B}_\Theta$ and all $t > P(\Psi)$. Hence $L \le P + c$. On the other hand, for each $\Psi \in \Phi + C$ we obtain, putting $t = P(\Phi) - \varepsilon \|\Psi -$
16.2
A geometric view of Gibbs measures
In this section we shall derive the interesting fact that each shift-invariant Gibbs measure can be identified with a tangent to the pressure. Moreover, each shift-invariant random field with finite specific entropy will turn out to be identifiable with a P-bounded linear functional on $\mathcal{B}_\Theta$. This identification depends on the variational principle as well as the duality of $\mathcal{B}_\Theta$ and $\mathcal{P}_\Theta(\Omega, \mathcal{F})$ which is established by the specific energy $\langle\cdot,\cdot\rangle$; recall (15.27) and (15.23). To begin with, we introduce the set
$\{\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}) : h(\mu) > -\infty\}$
of all shift-invariant random fields with finite specific entropy (relative to the a priori measure $\lambda$). If $E$ is finite and $\lambda$ is counting measure then this set coincides with $\mathcal{P}_\Theta(\Omega, \mathcal{F})$. We also consider the linear space $\mathcal{B}_\Theta^{*}$ of all linear functionals $L: \mathcal{B}_\Theta \to \mathbb{R}$. The specific energy $\langle\cdot,\cdot\rangle$ establishes an affine mapping $j$ from this set into $\mathcal{B}_\Theta^{*}$ by
$j(\mu) = -\langle \mu, \cdot \rangle$.
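We will show that $j$ is a bijection from the set of all shift-invariant random fields with finite specific entropy onto the set of all P-bounded linear functionals on $\mathcal{B}_\Theta$. The simple lemma below implies that $j(\mu)$ determines all integrals $\mu(f)$ $(f \in \mathcal{L})$ and thus $\mu$. In other words, $j$ is injective.
(16.10) Lemma. Let $f$ be a bounded local function on $\Omega$. Choose any $\Lambda \in \mathcal{S}$ with $f \in \mathcal{L}_\Lambda$ and define a potential $\Phi = \Phi^{(\Lambda, f)}$ by
$\Phi_A = f \circ \theta_i$ if $A = \Lambda - i$ for some $i \in S$, and $\Phi_A = 0$ otherwise.
Then $\Phi \in \mathcal{B}_\Theta$, $\|\Phi\|_0 = |\Lambda|\,\|f\|$, and $\langle \mu, \Phi \rangle = \mu(f)$. □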
Next we turn to the identification of the $j$-image of the set of shift-invariant random fields with finite specific entropy. To this end we investigate the quantity $h(L)$ which was defined in (16.5). By definition, a function $L \in \mathcal{B}_\Theta^{*}$ is P-bounded if and only if $h(L) > -\infty$. This occurs if and only if $L = j(\mu)$ for some shift-invariant random field $\mu$ with finite specific entropy, as we will now show. The main ingredients of the proof are the first half of the variational principle which is stated as Corollary (15.35), as well as the properties of specific entropy which appear in (15.14).
(16.11) Proposition. Suppose $(E, \mathcal{E})$ is a standard Borel space, and let $L \in \mathcal{B}_\Theta^{*}$ be given. If $L = j(\mu)$ for a (necessarily unique) shift-invariant random field $\mu$ with finite specific entropy then $h(L) = h(\mu) > -\infty$. In the alternative case, $h(L) = -\infty$.
Proof. We extend the specific entropy to a function on $\mathcal{B}_\Theta^{*}$ by putting
$\tilde{h}(L) = h(\mu)$ if $L = j(\mu)$ for some shift-invariant random field $\mu$ with finite specific entropy, and $\tilde{h}(L) = -\infty$ otherwise,
when $L \in \mathcal{B}_\Theta^{*}$. The statement of the proposition is then equivalent to the identity $h = \tilde{h}$. The proof of this identity consists of three steps.
1) We start from the equation
$P(\Phi) = \sup_{\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F})} [-\langle \mu, \Phi \rangle + h(\mu)]$ $(\Phi \in \mathcal{B}_\Theta)$
which is an immediate consequence of Corollary (15.35). This equation can be rewritten in the form
$P(\Phi) = \sup_{L \in \mathcal{B}_\Theta^{*}} [L(\Phi) + \tilde{h}(L)]$ $(\Phi \in \mathcal{B}_\Theta)$.
That is, the pressure $P$ is the Legendre-Fenchel transform of the function $-\tilde{h}$. On the other hand, the claimed identity $h = \tilde{h}$ is equivalent to the converse statement that $-\tilde{h}$ is the Legendre-Fenchel transform of $P$, in that
$-\tilde{h}(L) = \sup_{\Phi \in \mathcal{B}_\Theta} [L(\Phi) - P(\Phi)]$ $(L \in \mathcal{B}_\Theta^{*})$.
In the next step we establish the essential ingredient which allows us to deduce this equation from the previous one.
2) The function $\tilde{h}: \mathcal{B}_\Theta^{*} \to [-\infty, \infty[$ is concave and upper semicontinuous relative to the weak* topology on $\mathcal{B}_\Theta^{*}$. (By definition, the weak* topology is the locally convex topology on $\mathcal{B}_\Theta^{*}$ which is generated by the seminorms $L \to |L(\Phi)|$, $\Phi \in \mathcal{B}_\Theta$.) The concavity of $\tilde{h}$ is an immediate consequence of the concavity of the specific entropy; cf. Proposition (15.14) or, even simpler, Proposition (15.5)(d). To prove the upper semicontinuity of $\tilde{h}$ we need to show that the sets $\{L \in \mathcal{B}_\Theta^{*} : \tilde{h}(L) \ge c\}$ $(c \in \mathbb{R})$ are closed. In fact, these sets are even compact. For they are the $j$-images of the sets $\{\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}) : h(\mu) \ge c\}$ which, by Proposition (15.14), are compact (in the $\mathcal{L}$-topology). The compactness of the image sets thus follows from the easily verified fact that $j$ is continuous relative to the topologies chosen.
3) In view of Step 2 above, the identity $h = \tilde{h}$ follows from a general duality theorem for Fenchel transforms; see Ekeland and Temam (1976), Proposition 4.1, for example. For the convenience of the reader we provide a proof. First of all, we conclude from Step 1 that $P(\Phi) - L(\Phi) \ge \tilde{h}(L)$ for all $\Phi$ and $L$. Hence $h \ge \tilde{h}$. Suppose there exists some $L_0 \in \mathcal{B}_\Theta^{*}$ with $\tilde{h}(L_0) < h(L_0)$. Consider the locally convex space $\mathcal{B}_\Theta^{*} \times \mathbb{R}$ and its subset
$C = \{(L, t) \in \mathcal{B}_\Theta^{*} \times \mathbb{R} : t \le \tilde{h}(L)\}$.
$C \ne \emptyset$ because there exist shift-invariant random fields with finite specific entropy. In view of Step 2, $C$ is convex and closed. By assumption, $(L_0, h(L_0)) \notin C$. The sets $C$ and $\{(L_0, h(L_0))\}$ can thus be strictly separated by a closed hyperplane; cf. Reed and Simon (1972), Theorem V.4(c). This hyperplane is described by a continuous linear functional on $\mathcal{B}_\Theta^{*} \times \mathbb{R}$. Moreover, a well-known theorem of Banach asserts that each continuous linear functional on $\mathcal{B}_\Theta^{*}$ is of the form $L \to L(\Phi)$ for some $\Phi \in \mathcal{B}_\Theta$; see Reed and Simon (1973), Theorem IV.20, for example. Consequently, there exists some $\Phi \in \mathcal{B}_\Theta$ and real numbers $a$, $c$ such that $L(\Phi) + at < c$ for all $(L, t) \in C$, and $L_0($
$-P(\Phi) = \min_{\mu} [\langle \mu, \Phi \rangle - h(\mu)]$ $(\Phi \in \mathcal{B}_\Theta)$
which follows from Corollary (15.35), but also by the converse identity
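$h(\mu) = \inf_{\Phi \in \mathcal{B}_\Theta} [\langle \mu, \Phi \rangle + P(\Phi)]$ $(\mu \in \mathcal{P}_\Theta(\Omega, \mathcal{F}))$.
That is, $-P$ and $h$ are conjugate concave functions relative to the $\langle\cdot,\cdot\rangle$-duality of $\mathcal{B}_\Theta$ and $\mathcal{P}_\Theta(\Omega, \mathcal{F})$. In terms of the quantity $h(\mu|\Phi) = P(\Phi) + \langle \mu, \Phi \rangle - h(\mu)$, these two identities read
$\min_{\mu} h(\mu|\Phi) = 0$, $\inf_{\Phi} h(\mu|\Phi) = 0$.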
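The conjugacy of pressure and entropy can be checked by hand in a finite toy model. The following sketch is an illustration only and rests on assumptions that are not in the text: a single site with finite state space and counting measure as a priori measure, so that the pressure reduces to log sum_x exp(-phi(x)) and the specific entropy to the Shannon entropy. All function names are hypothetical. It evaluates the excess free energy P(phi) + <mu, phi> - h(mu) and confirms that it vanishes both at the Gibbs distribution for a given phi and at the potential phi = -log mu for a given mu.

```python
import math

def pressure(phi):
    """Single-site pressure log sum_x exp(-phi(x)) for counting measure (toy model)."""
    return math.log(sum(math.exp(-v) for v in phi))

def entropy(mu):
    """Shannon entropy -sum mu log mu, with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in mu if p > 0)

def energy(mu, phi):
    """Expected energy <mu, phi> = sum_x mu(x) phi(x)."""
    return sum(p * v for p, v in zip(mu, phi))

def excess_free_energy(mu, phi):
    """h(mu | phi) = P(phi) + <mu, phi> - h(mu); nonnegative in this toy model."""
    return pressure(phi) + energy(mu, phi) - entropy(mu)

def gibbs(phi):
    """Gibbs distribution proportional to exp(-phi)."""
    z = sum(math.exp(-v) for v in phi)
    return [math.exp(-v) / z for v in phi]

phi = [0.0, 1.0, 2.5]
mu = [0.2, 0.5, 0.3]
gap_at_gibbs = excess_free_energy(gibbs(phi), phi)                  # minimum over mu
gap_at_dual = excess_free_energy(mu, [-math.log(p) for p in mu])    # infimum over phi
```

In the toy model both extrema are attained; on the lattice the infimum over potentials need not be, which is why the text distinguishes tangents from asymptotes.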
The second result which is contained in Proposition (16.11) is the following one: For each $L \in \mathcal{B}_\Theta^{*}$, $h(L)$ is finite if and only if $L = j(\mu)$ for some shift-invariant random field $\mu$ with finite specific entropy. So we have the following theorem.
(16.13) Theorem. Suppose $(E, \mathcal{E})$ is a standard Borel space. The mapping $j: \mu \to -\langle \mu, \cdot \rangle$ then establishes a one-to-one correspondence between the set of all shift-invariant random fields with finite specific entropy and the set of all P-bounded linear functionals on $\mathcal{B}_\Theta$.
The theorem above also admits an alternate proof: Starting from a P-bounded $L$, one can define a linear form $\mu$ on $\mathcal{L}$ by setting $\mu(f) = -L(\Phi)$ when $\Phi$ is associated to $f$ via Lemma (16.10). The P-boundedness of $L$ then implies that $\mu$ is uniquely defined, positive, normalized, shift-invariant, and $\sigma$-continuous. The Daniell-Stone theorem (cf. Bauer (1981), Theorem 39.4) thus shows that $\mu$ is a shift-invariant random field with finite specific entropy, and the construction of $\mu$ implies that $j(\mu) = L$. This argument is carried out in Israel (1979), Theorem II.1.2.
So far, we have only used that half of the variational principle which is stated as Corollary (15.35). We shall now use the other half to obtain the following geometric characterization of shift-invariant Gibbs measures.
(16.14) Theorem. Suppose $(E, \mathcal{E})$ is a standard Borel space. For each $\Phi \in \mathcal{B}_\Theta$, the mapping $j$ establishes a one-to-one correspondence between $\mathcal{G}_\Theta(\Phi)$ and the set $\partial P(\Phi)$ of all tangent functionals to $P$ at $\Phi$.
In view of (16.12), this implies that $h(\mu|\Phi) = 0$. Theorem (15.39) thus shows that $\mu \in \mathcal{G}_\Theta(\Phi)$. Combining this result with Theorem (16.13) we conclude that $j(\mathcal{G}_\Theta(\Phi)) = \partial P(\Phi)$, and the proof is complete. □
We conclude this section with some simple consequences of the preceding theorem. The first result states that the pressure is strictly convex on suitable subspaces of $\mathcal{B}_\Theta$. For each $\alpha \in \mathcal{P}(E, \mathcal{E})$ we let $\mathcal{B}_\Theta^{\alpha}$ denote the space of all $\alpha$-normalized potentials in $\mathcal{B}_\Theta$. We know from (2.35) that two distinct elements of $\mathcal{B}_\Theta^{\alpha}$ can never be equivalent.
(16.15) Corollary. Suppose $E$ is a complete separable metric space, $\mathcal{E}$ the Borel $\sigma$-algebra, and $\lambda$ is everywhere dense.
(a) Let $\Phi, \Psi \in \mathcal{B}_\Theta$ be continuous, in that all $\Phi_A$'s and $\Psi_A$'s are continuous. Then $\Phi \sim \Psi$ if and only if the pressure $P$ is affine on the interval $[\Phi, \Psi] = \{t\Phi + (1 - t)\Psi : 0 \le t \le 1\}$.
(b) For each $\alpha \in \mathcal{P}(E, \mathcal{E})$, the pressure is strictly convex on the space of all continuous potentials in $\mathcal{B}_\Theta^{\alpha}$.
Proof. (a) By Theorem (2.34), $\Phi \sim \Psi$ if and only if $\mathcal{G}_\Theta(\Phi) \cap \mathcal{G}_\Theta(\Psi) \ne \emptyset$. In view of Theorem (16.14), the latter means that there exists a tangent functional $L$ to $P$ which touches $P$ at $\Phi$ and $\Psi$ simultaneously. Such an $L$ exists if and only if $P$ is affine on $[\Phi, \Psi]$. (b) This follows from (a) and Theorem (2.35). □
As an application of the corollary above, we can state that each
= o. Since P is convex, this inequality can only hold when P is affine on [
Our final corollary combines Theorem (16.14) with Dobrushin's uniqueness theorem. As in (8.36) we let J?@ denote the space of all O £ i 0 with finite norm
lll®lll= I
\A\\\<S>A\\,
ABO
and we look at the region 2 = { O e l 0 : HI(D|| < 1}. We know from (8.7) and (8.8), together with (5.17)(1) or (5.20)(3), that S(0) = S e (0) = { ^ } whenever £> e 3 (and (E, S) is standard Borel). Recall the notation (15.22). (16.17) Corollary. Suppose (E, S) is standard Borel. Then the pressure P is twice continuously differentiate on S in all directions of J?@. Its derivatives are given by the formulas
and dsdt
ieS
Here 0> e 3> and *F, ¥ e Proo/. Fix any (D e Q) and ¥ , »F e J@. Theorem (16.14) shows that dP(Q>) = {-<>
16.3
Phase transitions with prescribed order parameters
In this section, we will use the material of the preceding sections to establish the existence of phase transitions. More precisely, we shall start from a finite system $\{f_1, \ldots, f_N\}$ of local observables, and seek potentials $\Phi \in \mathcal{B}_\Theta$ which admit a family $\{\mu_j : j \in J\}$ of distinct Gibbs measures which can be discriminated by means of the associated expectation values $\{\mu_j(f_1), \ldots, \mu_j(f_N)\}$. In the language of Statistical Physics, this condition on $\Phi$ is usually expressed by saying that $\{f_1, \ldots, f_N\}$ is an order parameter for $\Phi$. Under weak hypotheses on $\{f_1, \ldots, f_N\}$ we shall find that such potentials $\Phi$ exist. We shall also obtain some information on the general type of these potentials, but we shall be unable to specify any of these. This is because Proposition (16.7) provides rather incomplete information on the touching points of the approximating tangent functionals.
To begin, we introduce a concept of discrimination which underlies Theorem (16.20) below. Throughout the following we put $f_0 = 1$. (16.18) Definition. Let
$f_a = \sum_{n=1}^{N} a_n f_n$ when $a = (a_1, \ldots, a_N) \in \mathbb{R}^N$. Let $\mathcal{I}$ be defined by (14.2).
(16.20) Lemma. Let
(ONI/,,...,/,}.
(ii) There exists some $\mu \in \mathcal{G}_\Theta(\Phi)$ such that $\mu(\mu(f_a|\mathcal{I})^2) > \mu(f_a)^2$ for all $a \in \mathbb{R}^N$ with $|a| = 1$.
Proof. (i) implies (ii). Let $\mu_0, \ldots, \mu_N \in \mathcal{G}_\Theta(\Phi)$ be such that $(\mu_m(f_n))_{0 \le m, n \le N}$ has full rank. Define $\mu = (N + 1)^{-1} \sum_{m=0}^{N} \mu_m$. For each $a \in \mathbb{R}^N$ we let $\bar{f}_a$ be a version of $\mu(f_a|\mathcal{I})$. Since $\mu(\bar{f}_a^{\,2}) - \mu(\bar{f}_a)^2$ is the variance of $\bar{f}_a$ with respect to $\mu$, we only need to show that none of the functions $\bar{f}_a$ is constant $\mu$-almost surely. Suppose the contrary. Then there exists a vector $a = (a_1, \ldots, a_N) \in \mathbb{R}^N$ with $|a| = 1$ and a number $a_0 \in \mathbb{R}$ with $\mu(\bar{f}_a = -a_0) = 1$. By Theorem (14.5)(b), there exist $\mathcal{I}$-measurable functions $g_m \ge 0$ with $\mu_m = g_m \mu$, $0 \le m \le N$. Thus $\mu_m(f_a) = \mu(g_m f_a) = \mu(g_m \bar{f}_a) = -a_0$ for all $m$. Consequently, $\sum_{n=0}^{N} a_n \mu_m(f_n) = 0$ for all $m$. This is impossible because of our choice of $\mu_0, \ldots, \mu_N$.
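The link between the rank condition and the variance condition can be made explicit in the simplest case. The sketch below is an illustration under assumptions that are not in the text: N = 1, and two ergodic measures mu0, mu1 with expectations m0, m1 of a single observable f, so that for the mixture mu = (mu0 + mu1)/2 the conditional expectation mu(f|I) takes the values m0 and m1 with probability 1/2 each. The function names are hypothetical.

```python
def rank_matrix_det(m0: float, m1: float) -> float:
    """Determinant of [[1, m0], [1, m1]], the matrix (mu_m(f_n)) with f_0 = 1."""
    return 1.0 * m1 - m0 * 1.0

def conditional_variance(m0: float, m1: float) -> float:
    """mu(mu(f|I)^2) - mu(f)^2 for mu = (mu0 + mu1)/2 with mu0, mu1 ergodic."""
    second_moment = 0.5 * (m0 ** 2 + m1 ** 2)
    mean = 0.5 * (m0 + m1)
    return second_moment - mean ** 2

det = rank_matrix_det(-0.8, 0.8)
var = conditional_variance(-0.8, 0.8)
```

In this two-measure case the variance equals det squared divided by 4, so the matrix has full rank exactly when the variance in condition (ii) is strictly positive.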
(ii) implies (i). For any fixed versions $\bar{f}_n$ of $\mu(f_n|\mathcal{I})$ we let $\mathcal{A}_\infty = \sigma(\bar{f}_1, \ldots, \bar{f}_N)$ be the smallest $\sigma$-algebra relative to which $\bar{f}_1, \ldots, \bar{f}_N$ are measurable. Plainly, $\mathcal{A}_\infty \subset \mathcal{I}$. Moreover, $\mathcal{A}_\infty$ is countably generated. Thus we can find an increasing sequence $(\mathcal{A}_k)_{k \ge 1}$ of finite subalgebras of $\mathcal{A}_\infty$ such that $\mathcal{A}_\infty$ is generated by $\bigcup_{k \ge 1} \mathcal{A}_k$. For each $k \ge 1$, the atoms of $\mathcal{A}_k$ constitute an $\mathcal{I}$-measurable partition of $\Omega$ which will be denoted by $\pi_k$. We claim that there exists an integer $k \ge 1$ such that the matrix $M_k = (\mu(1_A f_n))_{A \in \pi_k,\ 0 \le n \le N}$ has rank $N + 1$. For otherwise there exists a sequence of vectors $b^{(k)} = (b_0^{(k)}, \ldots, b_N^{(k)}) \in \mathbb{R}^{N+1}$ with $|b^{(k)}| = 1$ such that
$\sum_{n=0}^{N} b_n^{(k)} \mu(1_A f_n) = 0$
for all $k \ge 1$ and $A \in \pi_k$ (and thus all $A \in \mathcal{A}_k$). By compactness, the sequence $(b^{(k)})_{k \ge 1}$ has a cluster point $b \in \mathbb{R}^{N+1}$ with $|b| = 1$. Since $(\mathcal{A}_k)_{k \ge 1}$ is increasing, we conclude that
$\sum_{n=0}^{N} b_n \mu(1_A f_n) = 0$
for all $A \in \bigcup_{k \ge 1} \mathcal{A}_k$ and therefore all $A \in \mathcal{A}_\infty$. Since $\sum_{n=1}^{N} b_n \bar{f}_n$ is $\mathcal{A}_\infty$-measurable, we get the result that $\sum_{n=1}^{N} b_n \bar{f}_n = -b_0$ $\mu$-almost surely. The vector $b' = (b_1, \ldots, b_N)$ is non-zero because otherwise $b = 0$. Therefore, if $a = b'/|b'|$ then $\mu(\bar{f}_a^{\,2}) = \mu(\bar{f}_a)^2$, in contradiction to hypothesis (ii). This proves the claim. Now let $k \ge 1$ be such that $M_k$ has rank $N + 1$. By a well-known theorem of linear algebra, there exist sets $A_0, \ldots, A_N \in \pi_k$ such that the vectors $v_m = (\mu(1_{A_m} f_n))_{0 \le n \le N}$ $(0 \le m$
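(i) $\Phi_{\Lambda_0}$ is a linear combination of the functions $f_n$, $1 \le n \le N$.
(ii) For each $i \ne 0$, $\Phi_{\Lambda_i}$ is a linear combination of the functions $f_m \cdot f_n \circ \theta_i$, $1 \le m, n \le N$.
(iii) If $A \in \mathcal{S}$ is not a translate of some $\Lambda_i$ then $\Phi_A = 0$.
It is easily seen that $\mathcal{B}_\Theta(f_1, \ldots, f_N)$ is closed relative to $\|\cdot\|_0$. More precisely, $\mathcal{B}_\Theta(f_1, \ldots, f_N)$ is the closed linear span of the potentials $\Phi^{(\Lambda_0, f_n)}$ and $\Phi^{(\Lambda_i, f_m \cdot f_n \circ \theta_i)}$ $(i \ne 0,\ 1 \le m, n \le N)$ which appear in Lemma (16.10). Recall the definition of the set of shift-invariant random fields with finite specific entropy at the beginning of Section 16.2.
(16.21) Theorem. Let $(E, \mathcal{E})$ be a standard Borel space, $N \ge 1$, $f_1, \ldots, f_N \in \mathcal{L}$, and $\mathcal{B}_\Theta(f_1, \ldots, f_N)$ be as above.
(a) Suppose there exist shift-invariant random fields $\mu_0, \ldots, \mu_N$ with finite specific entropy such that the matrix $(\mu_m(f_n))_{0 \le m, n \le N}$ has full rank. Then for each $\Phi^0 \in \mathcal{B}_\Theta$ there exists a potential $\Phi \in$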
lemma shows that we can take $\mu = (N + 1)^{-1} \sum_{m=0}^{N} \mu_m$.
Since the unit sphere in $\mathbb{R}^N$ is compact and the mappings $a \to \mu(\mu(f_a|\mathcal{I})^2) - \mu(f_a)^2$ and $a \to \|f_a\|$ are continuous, we can find an $\varepsilon \in\, ]0, 1[$ such that
$\mu(\mu(f_a|\mathcal{I})^2) - \mu(f_a)^2 > 4\varepsilon\, |\Lambda_0|\, \|f_a\|^2$
for all $a \in \mathbb{R}^N$ with $|a| = 1$. In case (b) we also require that $\varepsilon \le \varepsilon_0$, and we put $\delta = \varepsilon^2/4$. Now let $\Phi^0 \in \mathcal{B}_\Theta$ be given. In case (b) we assume that $\|\Phi^0 - \Psi\|_0 < \delta$. We look at the linear functional $L_0 = -\langle \mu, \cdot \rangle$ on $\mathcal{B}_\Theta$. By Theorem (16.13), $L_0$ is P-bounded. We apply Proposition (16.7) to $L_0$ and the closed convex cone $C = \mathcal{B}_\Theta(f_1, \ldots, f_N)$, and we use Proposition (16.11) and Theorem (16.14) to interpret the conclusion of (16.7). We conclude that there exist a potential
claim we fix an arbitrary aeUN with \a\ = 1. We let T ° , ¥ ° , and ¥< (i ^ 0) be the potentials which are defined by Lemma (16.10) in terms of the functions v{fa)fa, n(fa)fa, - / a / a ° 0 ; and the sets A 0 , A 0 , A,., respectively. All these potentials belong to @&{fi,...,fN), and we have the estimates li^llo v ||¥°||o S |A 0 | l!/J!2 and ||¥'|| 0 rg |A ; | H/J 2 ^ 2|A 0 | ||/J| 2 , i # 0 . Hence (16.22)
v(fafaoei) - v(fa)2 = - < v , ^ + T°> ^ - < M , T i + T0>-3£|A0|||/a||2 = Malawi)
-
v(fa)Kfa)
~ ^\A0\
\\fa\\2
= Ma/ a °ö 1 .)-
-
4s\A0\\\fJ2
= M/ u / a °ö i )-M/ a ) 2 -4e|A 0 ||!/ a || 2 for all i # 0. We average the above inequality over all i # 0 in a cube A and let j A | tend to infinity. The mean ergodic theorem (14.A3) then gives (16.23)
v(v(ftt\J)2) - v(fa)2 = v(fa v{fa\S)) - v(fa)2 = Urn |Al"1 A
X
[va/ao0;)-v(/a)2]
O^ieA
^KKfa\*)2)-Kfa)2-te\*0\\\fa\\2
>0. Lemma (16.20) thus shows that
the first expression in (16.23) does not vanish. This is all what is needed to complete the proof. (2) Consider the case TV = 1 in Theorem (16.21). So we are given just one function f e S£. In this case, Theorem (16.21) remains true when @)&{f) is replaced by the closed convex cone ^©(/) of all O e @&{f) which are such that O A is a nonpositive multiple of ffoOi when i ^ 0. To see this it is sufficient to note that in the case N = 1 the potentials T, (i ^ 0) of (16.22) belong to &&(/)• Combining this observation with comment (1) above we even see that âS@(f) can be replaced by J£(/)H^@(f) for all R > 0. Also, if / ^ 0 then the potentials T 0 , T 0 in (16.22) belong to the closed convex cone ^e> + (/) °f all ® 6 ^®{f) f° r which 0 A o is a nonnegative multiple of/. Thus in this case Theorem (16.21) still holds when &@(f) is replaced by 3t@ + (f) H # £ ( / ) for any R > 0. (3) Consider the special case when / l 5 ..., fN are J^-measurable. Thus fn — f* ° °o f° r suitable bounded measurable functions /„* on E and all 1 ^ n ^ N. @)&{fi,...,fN) then consists of pair potentials, provided we put A0 = {0}. Moreover, the hypothesis of statement (16.21)(a) is then equivalent to the simple requirement that the functions 1, ff, . . . , / / are linearly independent modulo 2-null sets. (By definition, this requirement means that Hf* + c) > 0 for all a e UN\{0] and all c e U.) For suppose the last condition holds. Then we can imitate the proof of the implication (ii)=>(i) of Lemma (16.20) to obtain the existence of disjoint sets A0, ..., ANeS such that the matrix {X{\Amf*))oèmnûN has full rank. We put ßm = X{- \Am)s, O^m^N. Then nm e>^(Q,#") because A(\im) = Jtr{0){fiJ = log X(Am) > - o o . Also,A*m(/B) = HlAJn*)ß(AJ for all m,n. Thus has (^m(/»))osm,»äJv full rankConversely, suppose that X(f* j= c) = 0 for some a e UN\{0} and some ceU. Then the hypothesis of assertion (16.21)(a) cannot hold. 
For otherwise we can use the proof of the implication (i) => (ii) of Lemma (16.20) to construct a random field fi e <^(Q,#") with fi(fa2) > n(fa)2. But Jf {0} (^) ^ Â(\i) > - c o and thus
that (i)
(iii)
®<{fu...,fN}.
The point of this result is that any two distinct a-normalized potentials can never be equivalent. (See Theorem (2.35)(a).) Property (i) above thus implies that the potentials O which correspond to distinct O 0 's are certainly not equivalent. On the other hand, there are infinitely many degrees of freedom for the choice of O 0 . Therefore we can conclude that there exists an infinitedimensional entity of pairwise non-equivalent pair potentials with discriminating system {/i,...,/*}. o (16.25) Examples. (1) Let E = {0,...,N}, X be counting measure, and /„ = l{ffo=„}, 1 <^n z% N. ^@{fY,...,fN) then coincides with the space of all shiftinvariant absolutely summable pair potentials which are gas potentials with vacuum state 0. By Comment (16.24)(3), the hypothesis of assertion (16.21)(a) holds. (In fact, we can put nm = ôm, the Dirac measure at the constant configuration with value m.) We thus conclude that for each O 0 e 3ß% there exists a gas pair potential O 1 e @@ with vacuum state 0 such that |ex &&(®° + O x )| ^ N + 1. In the case N = 1 we can even achieve that O 1 e ^© + (/i). In the lattice gas language, this means that the pair interaction between distinct particles is attractive, whilst the self-potential tries to push the particles out of the system. Applying Comment (16.24)(4), we also see that the set of all gas pair potentials O with |ex^ 0 (®)| ^ N + 1 forms something like an infinitedimensional manifold. (2) Suppose E c= M is bounded, even, and measurable. Let S be the restriction of the Borel er-algebra to E and A be finite and non-degenerate. We put / = <70. The cone ^® (/) then consists of all pair potentials O which can be written in the form ^
$\Phi_A = -J(i - j)\,\sigma_i \sigma_j$ if $A = \{i, j\}$, $i \ne j$; $\Phi_A = -h\sigma_i$ if $A = \{i\}$; $\Phi_A = 0$ otherwise.
Here $h \in \mathbb{R}$ is an "external field", and $J: S \to [0, \infty[$ is even and absolutely summable. The nonnegativity of $J$ means that $\Phi$ is ferromagnetic. Each such $\Phi$ is normalized by every even probability measure $\alpha \in \mathcal{P}(E, \mathcal{E})$ such as $\alpha = (\delta_x + \delta_{-x})/2$, $x \in E$; cf. Example (2.38). Comment (16.24)(3) ensures that the hypothesis of Theorem (16.21)(a) is satisfied. Consequently, there exists some $\Phi$ of the above form such that $\mathcal{G}_\Theta(\Phi)$ contains two measures $\mu^-$, $\mu^+$ of different magnetization (i.e., $\mu^-(\sigma_0) < \mu^+(\sigma_0)$). Moreover, the associated function $J$ can be required to take prescribed values on any finite subset of $S$, and $\Phi$ belongs to an infinite-dimensional ensemble of potentials with the same properties. This follows again from the comments above. □
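The form of these ferromagnetic pair potentials can be made tangible on a finite window. The sketch below is an illustration under assumptions that are not in the text: dimension d = 1, spins in {-1, 1}, free boundary conditions on sites 0..n-1, and a hypothetical coupling J(k) = 1/k^2. It sums the pair terms and the self-potential terms to obtain the energy of a configuration, which makes two features visible: without external field the energy is invariant under a global spin flip, and the aligned configuration has lower energy than a mixed one.

```python
from itertools import combinations

def hamiltonian(spins, J, h):
    """Energy of a finite spin configuration on sites 0..n-1 (free boundary).

    Sums the pair terms -J(i - j) s_i s_j over all pairs {i, j} and the
    self-potential terms -h s_i over all sites, following the form above.
    """
    n = len(spins)
    pair = -sum(J(i - j) * spins[i] * spins[j]
                for i, j in combinations(range(n), 2))
    field = -h * sum(spins)
    return pair + field

def coupling(k):
    """Hypothetical even, nonnegative, summable coupling J(k) = 1/k^2."""
    return 0.0 if k == 0 else 1.0 / (k * k)

config = [1, 1, -1, 1, -1]
flipped = [-s for s in config]
```

With a positive field h the two flipped configurations are no longer degenerate, and the one with positive total spin is favoured.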
In the rest of this section we shall present two variants of Theorem (16.21) which take advantage of the presence of symmetries. The first variant concerns the breaking of the rotational symmetry in long range Heisenberg models. For given $N \ge 1$ we let $E = \{x \in \mathbb{R}^N : |x| = 1\}$ be the unit sphere in $\mathbb{R}^N$ and $\lambda$ the surface measure. (In particular, if $N = 1$ then $E = \{-1, 1\}$ and $\lambda$ is counting measure.) We shall consider potentials
(16.26)
*A
=
(16.27) Theorem. Let $E$ be the unit sphere in $\mathbb{R}^N$ and $\lambda$ as above. Given any $\delta > 0$ and $R < \infty$, there exists a potential $\Phi \in \mathcal{B}_\Theta$ of the form (16.26) with the following properties: (i) $J(i) = 0$ when $0 < |i| < R$. (ii) $\|$
(iii), and in Step 4) we shall apply the former inequality to prove property (ii). 3) For given i with \i\ ^ R we let 4" e C be the potential of the form (16.26) with J = l{j,_,-}. Inserting 4" into the last inequality and using the shiftinvariance of n and vh we find that Mo"o • °";) ^ vh(
H(o0-n(o0\S)) Vfc(ffo-Vfc(ffol-^))-2e
= |v,(a0)|2-2£
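$> 0$. The last equality follows from the fact that $\nu_h$ is ergodic. By Theorem (14.10), there exists a $(\mathcal{P}_\Theta(\Omega, \mathcal{F}), \mathcal{I})$-kernel $\pi^{\cdot}$. Thus $\mu(|\pi^{\cdot}(\sigma_0)|^2) > 0$. From Theorem (14.17) we know that $\pi^{\cdot} \in \mathrm{ex}\,\mathcal{G}_\Theta(\Phi)$ $\mu$-almost surely. Consequently, there exists some $\omega \in \Omega$ with $\pi^{\omega} \in \mathrm{ex}\,\mathcal{G}_\Theta(\Phi)$ and $|\pi^{\omega}(\sigma_0)|^2 > 0$. We put $c = |\pi^{\omega}(\sigma_0)|$, $v = \pi^{\omega}(\sigma_0)/c \in E$, and $\mu_v = \pi^{\omega}$. For arbitrary $u \in E$ we let $\mu_u$ be the image of $\mu_v$ under any rotation which maps $v$ to $u$. Remark (5.10) and the observations before Corollary (14.11) then ensure that $\mu_u \in \mathrm{ex}\,\mathcal{G}_\Theta(\Phi)$. Clearly, $\mu_u(\sigma_0) = cu$ for all $u$. This completes the proof of (iii).
4) In order to verify property (ii) we exploit the estimate $\|\Phi\|_0 \le \varepsilon^{-1} h(\nu_h|0)$. We shall prove that $h$ and $\varepsilon$ can be chosen in such a way that $\varepsilon^{-1} h(\nu_h|0) < N + \delta$. According to our condition on $\varepsilon$ we only need to find an $h$ such that $2h(\nu_h|0)\,|\nu_h(\sigma_0)|^{-2} < N + \delta$. We write $x = (x_1, \ldots, x_N)$ when $x \in E$. Then
$\lim_{|h| \to 0} |\nu_h(\sigma_0)|/|h| = \lim_{|h| \to 0} |\nu_h(\sigma_0 \cdot h)|/|h|^2 = \lim_{|h| \to 0} \int x \cdot h\, e^{x \cdot h}\, \lambda(dx)/|h|^2 \lambda(E)$
$= \lim_{|h| \to 0} \int x_1 (e^{x_1 |h|} - 1)\, \lambda(dx)/|h| \lambda(E)$
$= 1/N$.
The second equality follows from the relation $\lim_{|h| \to 0} Z_h = \lambda(E)$, the third from the rotational invariance of $\lambda$, and the last from the identity
$N \int x_1^2\, \lambda(dx) = \int \sum_{n=1}^{N} x_n^2\, \lambda(dx) = \lambda(E)$.
On the other hand, we have
$h(\nu_h|0) = \log \lambda(E) - h(\nu_h) = |\nu_h(\sigma_0)|\,|h| - \log Z_h/\lambda(E)$.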
Writing
$Z_h/\lambda(E) = 1 + |h|^2 \lambda(E)^{-1} \int \lambda(dx)\, [e^{x_1 |h|} - 1 - x_1 |h|]/|h|^2$
we see that $\lim_{|h| \to 0} h(\nu_h|0)/|h|^2 = 1/N - 1/2N = 1/2N$. Hence $\lim_{|h| \to 0} 2h(\nu_h|0)\,|\nu_h(\sigma_0)|^{-2} = N$. The proof is thus complete. □
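The limits used in Step 4 can be checked numerically in the simplest case. The sketch below is an illustration under the assumption N = 1, where E = {-1, 1}, the a priori measure is counting measure, Z_h = 2 cosh h, and the single-site measure nu_h is proportional to exp(h x). The function names are hypothetical. In this case the magnetization is tanh h and the excess free energy h(nu_h|0) equals h tanh h - log cosh h, so the ratio 2 h(nu_h|0)/|nu_h(sigma_0)|^2 should approach N = 1 as h tends to 0.

```python
import math

def magnetization(h: float) -> float:
    """nu_h(sigma_0) for N = 1: E = {-1, 1}, nu_h proportional to exp(h x)."""
    return math.tanh(h)

def excess_free_energy(h: float) -> float:
    """h(nu_h | 0) = |nu_h(sigma_0)| |h| - log(Z_h / lambda(E)) for N = 1."""
    return h * math.tanh(h) - math.log(math.cosh(h))

def ratio(h: float) -> float:
    """2 h(nu_h | 0) / |nu_h(sigma_0)|^2, which should tend to N = 1."""
    return 2.0 * excess_free_energy(h) / magnetization(h) ** 2

small_field_ratio = ratio(0.01)
small_field_slope = magnetization(0.01) / 0.01   # should be close to 1/N = 1
```

A Taylor expansion gives a ratio of roughly 1 + h^2/6 for small h, so the limit N is approached from above in this case.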
We note that a Heisenberg potential of the form (16.26) does not include a self-potential part that corresponds to the action of an external field. Property (iii) of the theorem above can thus be summarized by saying that
4 of the proof above. It is also worthwhile to recall the results of Chapter 9: If $d = 1$ and $N \ge 1$ or $d = 2$ and $N \ge 2$ then property (iii) can only be satisfied when $\Phi$ has infinite range and a rather slow decay.
It might seem surprising that the calculus of shift-invariant Gibbs measures can also be used to establish the existence of potentials which show a breaking of their shift-invariance. The following lemma explains why this is possible. (16.28) Lemma. Suppose $(E,$
lim sup /*(//o0,) > liminf /i(//o0j).
(ii) There exists a measure v e ex %()\ex &((&) with v(v(f\&~) ^ v(/)) > 0. (iii) There exists a measure v e ex ^( (ii) => (iii) hold. Proof, (i) implies (ii). In view of the ergodic decomposition theorem (14.17) and the dominated convergence theorem, there exists v e ex %(<$) such that lim v(ffoßt) does not exist. We pick any A e ^ with fe J^A, and we set |ij->oo,ie J A
n = [ - " , n\d fi S, n ^ 1. For each n ^ 1 and i e I with (A - i) fl A„ = 0 we
have v(//°0i) - v(/) 2 = v(/o0,. ( v ( / | ^ A J - v(/))). Hence O
^||/|| limsupv(|v(/|^)-v(/)|) n-»oo
= 11/11 v ( | v ( / | ^ ) - v ( / ) | ) . The last identity follows from the backward martingale convergence theorem. We conclude that v(f\&~) ^ v(/) with positive probability. In particular, v is not trivial on 2T and thereby not extreme in ^{Q>). (ii) implies (iii). Let v be as in statement (ii). Suppose that v(foQj\3~) = v(f\3T) v-a.s. for all; e S. Then the ergodic theorem (14.A8) shows that v(/m=lim|AJ-1
X v(/o0 ; |^)
= v| limlAJ" 1 "X = v(/)
" e"j sr f°
v-almost surely,
in contradiction to our hypothesis. Therefore we can find j e S such that v(v(foBj\^)¥= v{f\2T))>0. By Theorem (7.26), there exists a ( ^ ( 0 ) , ^ > kernel n. The set
has positive v-probability and is therefore non-empty. This proves statement (iii). D We are now ready to state a second variant of Theorem (16.21). We fix any function / e jSf and a symmetric infinite subset / of S. We choose any cube A 0 e y with / e i? Ao , and we let A; (i ^ 0) and &@(f) be defined as in the paragraph before (16.21). We introduce the linear space âa®{f) of all potentials G âS&(f) which are such that 0 A j = 0 when 0 ^ i $ I. Let us agree to call a random field p f-I-oscillating if p satisfies (16.29). (16.30) Theorem. Suppose (E, S) is a standard Borel space, and let f and I be as above. Suppose further that there exists an j-1-oscillating p. G ^®(Q, 3F\ Then for each <1>0 e &% there exists a potential O e O 0 + &®(f) which admits an f-I-oscillating shift-invariant Gibbs measure. In particular, ex ^0(
Gibbs measure v e ^©(<1>) such that |
16.4
Ubiquity of pure phases
This section is devoted to the phenomenon that ergodic Gibbs measures are ubiquitous. Our first result will show that a "typical" potential $\Phi \in \mathcal{B}_\Theta$ admits only one shift-invariant Gibbs measure $\mu_\Phi$. By Theorem (14.15)(a), $\mu_\Phi$ is necessarily ergodic. Then we shall prove that the collection of all these $\mu_\Phi$'s is dense in the set of all shift-invariant random fields with finite specific entropy. This proof will be broken up into three stages. First we shall show that the $\mu_\Phi$'s are dense in the set of all ergodic Gibbs measures. These in turn will then be shown to be dense in the set of all shift-invariant Gibbs measures. Finally we shall see that $\bigcup \{\mathcal{G}_\Theta(\Phi):$
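is a dense $G_\delta$ set in $\mathcal{B}_\Theta$. In other words, the potentials
$\mathcal{G}_\Theta(\Phi) = \{\mu_\Phi\}$ whenever $\Phi$ belongs to the uniqueness region $\mathcal{U}_\Theta$. The next two theorems will show that the set $\{(\Phi, \mu_\Phi) : \Phi \in \mathcal{U}_\Theta\}$ is dense in the graph of the correspondence $\mathcal{G}_\Theta$: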
C(G>) = {p e ^ ( Q , ^ ) : (®,p) e C} is a closed subset of %{0). We need to show that ex %(<£) c C(S>). This will be done in two steps. Step 1. Given any /i e "^©(S*) a n d / e if, there exists some v e C(O) such that v(/) ^ ju(/). Indeed, let *F e ^ @ be any potential which is associated to / via Lemma (16.10). By Proposition (16.32), there exists a sequence (0")„ à l in ^l& such that ||O — n" lv P — O"||0 < n~2 for all n ^ 1. The last condition implies that
^ lim p(g„s) /-co
= MK„) for all k and n. Hence v„(K„) ^ /x(K„) for all n and thus v„(7f = p) -• 1 as n -> co. This in turn implies that v„ -> p as n -> oo. For if g e if then \vn(g)~K9)\Svn(\n\g)-p(g)\) S2\\g\\vn(n-^p)^0 because v„ = v„n. Since C(«D) is closed, we conclude that p e C(«D). D Next we combine Propositions (15.52) and (16.7) to show that the ergodic Gibbs measures are dense in the set of all shift-invariant Gibbs measures.
(16.34) Theorem. Suppose (E, é>) is standard Borel. Then the set {(«D,M):$eÄ 0 ,^ex^((J))} is dense in the graph
of the Gibbsian correspondence ^ 0 : O -» ^©(O). Proof. We fix any <5 e ^ @ and fi e ^ 0 (O). By Proposition (16.34), there exists a sequence (^„)nä i in ex ^©(Q, #") such that fi„-> fi and /l(^„) -> /l(^) as n -> oo. In particular, /l(^„|0) -> Â{H\Q>) = 0. Passing to a subsequence if necessary we can therefore assume that A(n„\<$>) < n~2 for all n 2; 1. For each n 2> 1, we apply Proposition (16.7) to the cone C = @@, the P-bounded functional L 0 = — <^„, • >, the potential <5° = <5, and e = 1/n. Together with Theorem (16.14), we obtain a potential O" e ^ @ with (16.35)
||4>" - O||o ^/ii«(Ai I I |0)< l / n
and a Gibbs measure v„ e ^©(O") such that (16.36)
|
for all f e l 9 . To exploit the last inequality we fix any A e ^ and / e ifA. We consider the potentials which are associated to the local functions ff°Ot and the sets AU (A — i) in the sense of Lemma (16.10). Inserting these potentials into (16.36) we obtain that |v„(//oei)-^(//oei)|^2|A|||/||2/n for all i e S. Just as in the proof of Theorem (16.21), the mean ergodic theorem (14.A3) now implies that |v>n(/|^)2)-^(/)2|^2|A|||/||>. (Note that fi„ is ergodic). Therefore, if % is a (^©(Q, #"), J)-kernel then |v„(7r-(/)2)-^(/)2|^2|A|||/||>. Inequality (16.36) also yields the estimate k(/)-AU/)|g|A|||/H/n. We thus conclude that
v„([X(/)-M/)]2) ^2^(/)2 + 2|A|||/||>-2vn(/K(/) ^4|A|||/||2/n for all n > 1.
Now we let {Ak: k ^ 1} be a countable generator of #" which consists of cylinder events and is stable under finite intersections. For each k we choose a set Ake£f with Ak e J \ k , and we define fk = 2"k \AJ\Aft|. Then
v»( t I W / t ) - ^ ( / » ) ] 2 ) ^ 4 / n for all n ^ 1. From Theorem (14.17) we know that v„{if e e x ^ O " ) ) = 1. Consequently, for each n ^ 1 there exists some v„ e ex ^0(") such that (16.37)
X [v„(/k)-M„(/k)]2^4/n.
Since
OLSD
for all k ^ 1. Hence v = n and therefore (
By Comment (4.14)(1), the sequence (/v>)„ è i is locally equicontinuous. In view of Proposition (4.15), we thus can assume that (/v>)„Ê1 converges to a
limit v. (Otherwise we can pass to a subsequence.) Clearly, v(Ak) = /J.{Ak) for all k ^ 1. Hence v = /x and thereby (<ï>, ^) = lim ($", n^„). o n-*co
The next and final corollary completes our proof of the ubiquity of unique ergodic Gibbs measures. Since $\{\mu_\Phi: \Phi \in \mathscr{U}_\Theta\}$ is a subset of $\operatorname{ex} \mathscr{P}_\Theta(\Omega, \mathscr{F})$, this corollary provides a remarkable refinement of Theorem (14.12). (16.40) Corollary. If $(E, \mathscr{E})$ is standard Borel then $\{\mu_\Phi: \Phi \in \mathscr{U}_\Theta\}$ is dense in
Proof. By Corollary (16.38), we only need to show that &@(@@) is dense in ^e(Q, #"). Let n e ^©(Q, 3F) be given. Pick any cofinal sequence (A„)„Ä1 in y. For each n ^ 1, we obtain from Proposition (16.7) and Theorems (16.13) and (16.14) a potential
^„
for all $n \ge 1$. Hence $\nu_n \to \mu$ as $n \to \infty$. □
Part IV Phase transitions in reflection positive models
The phenomenon of non-uniqueness of Gibbs measures has already been studied in several places throughout this book: In Chapter 6 we examined three specific models which exhibit a breaking of one or several discrete symmetries. In Chapters 11 and 12, we encountered several Markovian examples of phase transition. (In some of these examples, the phase transition was not, or only to some extent, accompanied by a breaking of symmetries.) The breakdown of a continuous symmetry group was established in some Gaussian models of Chapter 13. Finally, in Chapter 16 we derived a (non-constructive) existence result concerning phase transitions with prescribed order parameters. In this part we shall provide quite a number of further examples of phase transition. Specifically, we shall deal with shift-invariant Gibbs specifications on a lattice $\mathbb{Z}^d$. In Chapters 18 and 19 we shall assume that $d \ge 2$. The spins will be allowed to take values in an arbitrary space, but their interaction will be assumed to have at most the range $\sqrt{d}$. Our analysis of such systems will take advantage of some concepts of percolation theory, i.e., we shall be concerned with the existence and uniqueness of infinite clusters in certain random subgraphs of the lattice. This geometrical analysis will culminate in some general theorems on the existence of phase transitions. On the one hand, we shall provide stability conditions under which a degeneracy of ground states gives rise to a phase transition at low temperatures. On the other hand, we shall show how a conflict of energy and entropy can lead to a phase transition at a distinguished value of temperature. These results can be easily applied to various specific models, as we shall indeed show in various examples. The final chapter, 20, will be devoted to a discussion of a particular highlight in the theory of phase transitions: the proof of spontaneous magnetization in systems with continuous symmetries.
Specifically, we shall consider systems of $\mathbb{R}^N$-valued spins whose interaction is invariant under the group of all rotations of $\mathbb{R}^N$. Under some hypotheses on the interaction, the rotational symmetry of these systems can be shown to break down at low temperatures. The proof uses some techniques of harmonic analysis. A common tool for both the geometric methods of Chapters 18 and 19 and the Fourier analytic techniques of Chapter 20 is provided by a certain inequality which refines the Cauchy-Schwarz inequality and is usually referred to as the chessboard estimate. It involves the correlations of homogeneous
random fields on discrete tori and relies on a definiteness condition which is called reflection positivity. This condition will be studied in Chapter 17. The chapters of this part are linked together according to the following scheme: Chapter 17 leads on the one hand to Chapters 18 and 19 (in this order), and on the other hand to Chapter 20. To read these chapters it is sufficient to know Chapters 1, 2, 4 and 5 and the main results of Chapters 7 and 14. (In Subsection 19.3.2 we shall also use some facts from Chapters 15 and 16, but one can do without them if one is willing to accept less complete results.) A deeper understanding also requires some familiarity with the essential ideas of Section 6.2 (for Chapters 18 and 19) and Sections 9.2 and 13.3 (for Chapter 20). The remaining sections of Parts I to III are not related to the contents of this part.
Chapter 17 Reflection positivity
The objective of this chapter is to establish a certain key inequality which underlies the results of Chapters 18 to 20. This inequality is a refinement of the well-known Cauchy-Schwarz inequality. It applies to random fields $\mu$ which have a torus $\Lambda$ for their parameter set, and exhibit the following two properties. (i) $\mu$ is preserved by the rotations of the torus $\Lambda$. This property will be called $\Lambda$-periodicity. (ii) $\mu$ is reflection positive. This property means that a certain bilinear form which is defined in terms of $\mu$ and a reflection of $\Lambda$ is nonnegative definite. Of course, the latter property will provide us with a Cauchy-Schwarz inequality. Repeated use of this inequality, combined with the application of suitable rotations of $\Lambda$, will give us the desired key inequality. (See Theorem (17.11).) The main idea is contained in a little combinatorial lemma which is called the chessboard estimate. All this will be the contents of Section 17.1. Section 17.2 will deal with sufficient conditions for property (ii) to hold. More precisely, we shall look at Gibbs distributions on a cube $\Lambda$ in $\mathbb{Z}^d$ with respect to shift-invariant potentials. To guarantee property (i) we shall impose periodic boundary conditions. We then shall distinguish between reflections in planes through lattice sites and reflections in planes between lattice sites. In each case we shall find a class of potentials which are reflection positive in that the associated periodic Gibbs distributions are reflection positive.
17.1 The chessboard estimate
Let $d, N \ge 1$ be any two integers. We consider the $d$-dimensional cube
(17.1) $\Lambda = \Lambda(N) = \,]-N, N]^d \cap \mathbb{Z}^d = \{i = (i_1, \dots, i_d) \in \mathbb{Z}^d : -N < i_k \le N \text{ for } 1 \le k \le d\}.$
$\Lambda$ is thought of as a torus. Accordingly, we equip $\Lambda$ with the addition modulo $\Lambda$ which comes from the identification of $\Lambda$ with the factor group $\mathbb{Z}^d / 2N\mathbb{Z}^d$. More precisely, if $i, j \in \Lambda$ then the notation (17.2) $a = i + j \mod \Lambda$ means that $a$ is the unique element of $\Lambda$ which satisfies $a_k = i_k + j_k \mod 2N$ for
all $1 \le k \le d$. Similarly, for each $\Delta \subset \Lambda$ and $i \in \Lambda$ we write $\Delta' = \Delta + i \mod \Lambda$ when $\Delta' = \{a \in \Lambda : a = i + j \mod \Lambda \text{ for some } j \in \Delta\}$. Next we let $(E, \mathscr{E})$ be an arbitrary measurable space. We shall be concerned with finite measures on the product space $(E^\Lambda, \mathscr{E}^\Lambda)$ which exhibit a certain behaviour with regard to two particular classes of transformations of $E^\Lambda$. The first of these classes consists of the rotations (or periodic shifts) $\theta_i^\Lambda : E^\Lambda \to E^\Lambda$ which are defined by
(17.3) $(\theta_i^\Lambda \omega)_j = \omega_k \quad (\omega \in E^\Lambda,\ i, j \in \Lambda,\ k = j - i \mod \Lambda).$
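The addition modulo the torus in (17.2) and the rotations in (17.3) can be made concrete with a small sketch. The function names below (`torus_sites`, `wrap`, `add_mod`, `rotate`) are hypothetical helpers introduced only for illustration; configurations are represented as Python dictionaries from sites to spin values, which is an assumption of this sketch and not part of the text.

```python
from itertools import product

def torus_sites(N, d):
    """All sites i with -N < i_k <= N for every coordinate, as in (17.1)."""
    return list(product(range(-N + 1, N + 1), repeat=d))

def wrap(x, N):
    """Representative of x modulo 2N in the range -N+1, ..., N."""
    return (x + N - 1) % (2 * N) - N + 1

def add_mod(i, j, N):
    """Coordinatewise addition modulo the torus, as in (17.2)."""
    return tuple(wrap(a + b, N) for a, b in zip(i, j))

def rotate(omega, i, N):
    """Rotation of (17.3): the new spin at j is the old spin at j - i."""
    minus_i = tuple(-a for a in i)
    return {j: omega[add_mod(j, minus_i, N)] for j in omega}
```

Under these conventions, rotating by $i$ and then by $j$ agrees with rotating by $i + j$ modulo the torus, which is the group property behind the notion of periodicity used in the sequel.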
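A finite measure $\mu \in \mathscr{M}(E^\Lambda, \mathscr{E}^\Lambda)$ will be called $\Lambda$-periodic if $\theta_i^\Lambda(\mu) = \mu$ for all $i \in \Lambda$. The second class of transformations consists of generalized reflections which are defined as follows. Take any $1 \le k \le d$ and look at the reflection $r_k : \Lambda \to \Lambda$ in the plane $\{x = (x_1, \dots, x_d) \in \mathbb{R}^d : x_k = 1/2\}$. $r_k$ is given by
(17.4) $(r_k i)_\ell = 1 - i_k$ if $\ell = k$, and $(r_k i)_\ell = i_\ell$ otherwise,
where $i = (i_1, \dots, i_d) \in \Lambda$. Clearly, $r_k$ is an involution which maps the set $\Lambda_{+,k} = \{i \in \Lambda : 1 \le i_k \le N\}$
onto the set $\Lambda_{-,k} = \Lambda \setminus \Lambda_{+,k} = \{i \in \Lambda : -N < i_k \le 0\}$, and vice versa. $r_k$ induces a measurable involution of $(E^\Lambda, \mathscr{E}^\Lambda)$ which will be denoted by the same letter:
(17.5) $(r_k \omega)_j = \omega_{r_k j} \quad (\omega \in E^\Lambda,\ j \in \Lambda).$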
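The reflection (17.4) and its action (17.5) on configurations can be sketched in the same spirit. The names `reflect_site`, `reflect_config` and `halves` are hypothetical, the coordinate index `k` is 0-based here (unlike the 1-based index of the text), and configurations are again assumed to be dictionaries keyed by sites.

```python
from itertools import product

def torus_sites(N, d):
    """All sites i with -N < i_k <= N for every coordinate."""
    return list(product(range(-N + 1, N + 1), repeat=d))

def reflect_site(i, k):
    """Reflection (17.4): replace the k-th coordinate i_k by 1 - i_k."""
    return i[:k] + (1 - i[k],) + i[k + 1:]

def reflect_config(omega, k):
    """Induced map (17.5): the new spin at j is the old spin at r_k j."""
    return {j: omega[reflect_site(j, k)] for j in omega}

def halves(N, d, k):
    """The two half-tori exchanged by the reflection in direction k."""
    sites = torus_sites(N, d)
    plus = {i for i in sites if 1 <= i[k] <= N}
    minus = set(sites) - plus
    return plus, minus
```

Since $1 - i_k$ again lies in the range $-N+1, \dots, N$, no reduction modulo $2N$ is needed in `reflect_site`.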
Combining this mapping with an arbitrary measurable involution $\tau_k : E \to E$ we arrive at the generalized reflection
(17.6) $\hat r_k : \omega \to (\tau_k\, \omega_{r_k j})_{j \in \Lambda} \quad (\omega \in E^\Lambda)$
which is a measurable involution of $E^\Lambda$. Using $\hat r_k$ we shall now introduce the notion of reflection positivity. Let $\mathscr{A}_{+,k}$ be the space of all bounded measurable functions on $E^\Lambda$ which depend only on the coordinates in $\Lambda_{+,k}$. For each $f \in \mathscr{A}_{+,k}$ we consider the reflected function $f^* = f \circ \hat r_k$. Clearly, $f^*$ is a function of the coordinates in $\Lambda_{-,k}$. (17.7) Definition. A finite measure $\mu$ on $(E^\Lambda, \mathscr{E}^\Lambda)$ is said to be $\hat r_k$-positive if $\mu(f f^*) \ge 0$ for all $f \in \mathscr{A}_{+,k}$. Examples of $\hat r_k$-positive measures will be provided in Section 17.2. Here we ask for the use of this notion. By definition, a given measure $\mu$ on $(E^\Lambda, \mathscr{E}^\Lambda)$ is $\hat r_k$-positive if and only if the bilinear form
$(f, g) \to \mu(f g^*)$
on $\mathscr{A}_{+,k}$ is nonnegative definite. Of course, this property implies the Cauchy-Schwarz inequality (17.8) $|\mu(f g^*)|^2 \le \mu(f f^*)\, \mu(g g^*)$ $(f, g \in \mathscr{A}_{+,k})$. When combined with the property of $\Lambda$-periodicity, the last inequality will give us an improved inequality which is a fundamental tool for the derivation of the results of Part IV. At the heart of this improvement is the combinatorial lemma below which is known as the chessboard estimate. We consider the set
$A(N) = \{1, \dots, 2N\} \times \{0, 1\} = \{(n, \varepsilon) : 1 \le n \le 2N,\ \varepsilon = 0 \text{ or } 1\}.$
The change of the second coordinate is an involution of $A(N)$ which is denoted by $a \to \bar a$. Each $1 \le n \le 2N$ is identified with $(n, 0) \in A(N)$. Accordingly we write $\bar n$ instead of $(n, 1)$. (17.9) Lemma. Let $F : A(N)^{2N} \to [0, \infty[$ be any map which satisfies the following two conditions. (i) For all $a_1, \dots, a_{2N} \in A(N)$,
$F(a_1, \dots, a_{2N}) = F(a_2, \dots, a_{2N}, a_1).$
(ii) For all $a_\ell, b_\ell \in A(N)$,
$F(a_1, \dots, a_N, b_N, \dots, b_1)^2 \le F(a_1, \dots, a_N, \bar a_N, \dots, \bar a_1)\, F(\bar b_1, \dots, \bar b_N, b_N, \dots, b_1).$
Then the inequality
$F(a_1, \dots, a_{2N}) \le \prod_{\ell=1}^{2N} F(a_\ell, \bar a_\ell, \dots, a_\ell, \bar a_\ell)^{1/2N}$
holds for all $a_1, \dots, a_{2N} \in A(N)$. In proving this lemma it will be helpful to visualize an element $(a_1, \dots, a_{2N}) \in A(N)^{2N}$ as a configuration of white and black chips on a $2N$ by $2N$ chessboard as follows. For each $1 \le k \le 2N$, we put a chip at position $n_k$ in the $k$'th column if $a_k = (n_k, \varepsilon_k)$. We take a white or black chip according to whether $\varepsilon_k = 0$ or $1$. In this picture, hypothesis (i) means that $F$ remains invariant under cyclic horizontal shifts of all chips. In hypothesis (ii), a configuration of chips is compared with the two configurations that result from it by means of the following procedure: remove all chips on the right (resp. left) half of the board, take a colour reversed copy of the chips on the other half, and put a reflection image of this copy onto the empty half. Condition (ii) then implies that if $F$ is positive or attains its maximum at the starting configuration then the same holds for each of the two resulting configurations. It is this
consequence of (ii) that will be exploited in the proof. In the conclusion of the lemma, the value of $F$ at any starting configuration $(a_1, \dots, a_{2N})$ is compared with the values of $F$ at purely horizontal configurations which fill one row of the board with alternating black and white chips. (In view of (i), it is no matter whether the leftmost chip is black or white.) Each such alternating row configuration appears with a multiplicity which is equal to the number of chips of $(a_1, \dots, a_{2N})$ in that row. Proof of (17.9). We introduce the set
$M = \{a \in A(N) : F(a, \bar a, \dots, a, \bar a) > 0\}$
and prove the claimed inequality in the two cases $(a_1, \dots, a_{2N}) \in A(N)^{2N} \setminus M^{2N}$ and $(a_1, \dots, a_{2N}) \in M^{2N}$ separately. 1) In the first case we need to prove that $F(a_1, \dots, a_{2N}) = 0$ when $a_\ell \notin M$ for some $\ell$. So let $(a_1, \dots, a_{2N}) \in A(N)^{2N}$ be such that $F(a_1, \dots, a_{2N}) > 0$, and pick any $a = (n, \varepsilon) \in \{a_1, \dots, a_{2N}\}$. We will show that $a \in M$. In view of hypothesis (i) we can assume that $a = a_N$. For each $1 \le \ell \le N$ we look at the set
$R_{a,\ell} = \{(b_1, \dots, b_{2N}) \in A(N)^{2N} : F(b_1, \dots, b_{2N}) > 0,\ b_{N-\ell+1} = \bar b_{N-\ell+2} = b_{N-\ell+3} = \dots = \bar b_{N+\ell} = a\}.$
Thus each element of $R_{a,\ell}$ contains a strip of alternating black and white chips at the $2\ell$ middle positions of the $n$'th row, the leftmost chip having colour $\varepsilon$. Since $a \in M$ when $R_{a,N} \ne \emptyset$, we will show by induction on $\ell$ that each $R_{a,\ell}$ is non-empty. First of all, we conclude from (ii) that $(a_1, \dots, a_N, \bar a_N, \dots, \bar a_1) \in R_{a,1}$. Hence $R_{a,1} \ne \emptyset$. Next we assume that $R_{a,\ell} \ne \emptyset$ and set $k = N \wedge 2\ell$. Then $R_{a,k} \ne \emptyset$, as can be seen by the following procedure. We take any configuration of chips from $R_{a,\ell}$ and shift it to the left in such a way that the positions $N - k + 1, \dots, N$ of the $n$'th row are occupied by a horizontal strip of chips of alternating colour. Using (ii), we obtain a new configuration which contains a horizontal strip of chips of alternating colour at the $2k$ middle positions of the $n$'th row and still belongs to the domain of positivity of $F$. Hence $R_{a,k} \ne \emptyset$, and the induction step is complete. 2) For $a_1, \dots, a_{2N} \in M$ we define
$G(a_1, \dots, a_{2N}) = F(a_1, \dots, a_{2N}) \Big/ \prod_{\ell=1}^{2N} F(a_\ell, \bar a_\ell, \dots, a_\ell, \bar a_\ell)^{1/2N}.$
We need to show that $G \le 1$. Since $G(a, \bar a, \dots, a, \bar a) = 1$ for all $a \in M$, this will be proved once we have shown that $G$ attains its maximum, say $c$, at some vector of the form $(a, \bar a, \dots, a, \bar a)$. To this end we observe that $G$ satisfies the same conditions (i) and (ii) as $F$, provided $A(N)$ is replaced by $M$. (Note that $\bar a \in M$ when $a \in M$, because of (i).) Therefore, if
$R_\ell = \{(a_1, \dots, a_{2N}) \in M^{2N} : G(a_1, \dots, a_{2N}) = c,\ a_{N-\ell+1} = \bar a_{N-\ell+2} = a_{N-\ell+3} = \dots = \bar a_{N+\ell}\}$
then the same procedure as in Step 1) shows that $R_1 \ne \emptyset$ and $R_{N \wedge 2\ell} \ne \emptyset$ when $R_\ell \ne \emptyset$. Thus $R_N \ne \emptyset$ and thereby $c = 1$. □ Suppose now we are given $d$ commuting measurable involutions $\tau_1, \dots, \tau_d$ of $E$. For each $i = (i_1, \dots, i_d) \in \Lambda$ we consider the iterated mapping
(17.10) $\tau^i = \tau_1^{i_1} \circ \dots \circ \tau_d^{i_d}.$
Clearly, $\tau^i = \tau^j$ when $i_k = j_k \mod 2$ for all $1 \le k \le d$. For each $k$ we let $\hat r_k$ be defined by (17.6).
(17.11) Theorem. Let $\mu$ be a finite measure on $(E^\Lambda, \mathscr{E}^\Lambda)$. Suppose $\mu$ is $\Lambda$-periodic and $\hat r_k$-positive for all $1 \le k \le d$. Then for each choice $(f_i)_{i \in \Lambda}$ of bounded or nonnegative measurable functions $f_i$ on $E$, the inequality
$\Big| \mu\Big( \prod_{i \in \Lambda} f_i \circ \sigma_i \Big) \Big| \le \prod_{j \in \Lambda} \mu\Big( \prod_{i \in \Lambda} f_j \circ \tau^i \circ \sigma_i \Big)^{1/|\Lambda|}$
holds. Just as Lemma (17.9), the preceding inequality is referred to as the chessboard estimate, this name being also suggested by the structure of the mappings $\tau^i$ ($i \in \Lambda$) in dimension $d = 2$. Occasionally, the above inequality is also called generalized Cauchy-Schwarz inequality. In fact, the standard Cauchy-Schwarz inequality for integrals corresponds to the particular case when $d = 1$, $N = 1$, $\tau_1 = \mathrm{id}$, and $\mu$ is the image of any finite measure on $(E, \mathscr{E})$ under the mapping $x \to (x, x)$. Proof of (17.11). We can assume without loss that all $f_i$'s are bounded. In this case we proceed by induction on the dimension $d$. 1) Let $d = 1$. We define a family $(f_a)_{a \in A(N)}$ of functions on $E$ by putting $f_{(n,0)} = f_{n-N}$ and $f_{(n,1)} = f_{n-N} \circ \tau_1$ when $1 \le n \le 2N$. We also define a function $F : A(N)^{2N} \to [0, \infty[$ by
$F(a_1, \dots, a_{2N}) = \Big| \mu\Big( \prod_{\ell=1}^{2N} f_{a_\ell} \circ \sigma_{\ell - N} \Big) \Big|.$
(By the very definition of $\hat r_1$-positivity, the absolute value sign can be dropped when $(a_1, \dots, a_{2N}) = (n, \bar n, \dots, n, \bar n)$, for example.) The claimed inequality then takes the form
$F(1, \dots, 2N) \le \prod_{n=1}^{2N} F(n, \bar n, \dots, n, \bar n)^{1/2N}.$
We thus need only check that $F$ satisfies the hypotheses (i) and (ii) of Lemma
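(17.9). Condition (i) is an immediate consequence of the $\Lambda$-periodicity of $\mu$. Hypothesis (ii) follows from the Cauchy-Schwarz inequality (17.8). Indeed, with the help of the functions
$f = f_{b_N} \circ \sigma_1 \cdots f_{b_1} \circ \sigma_N, \qquad g = f_{\bar a_N} \circ \sigma_1 \cdots f_{\bar a_1} \circ \sigma_N$
in $\mathscr{A}_{+,1}$ we can write
$F(a_1, \dots, a_N, b_N, \dots, b_1)^2 = \mu(f g^*)^2 \le \mu(f f^*)\, \mu(g g^*) = F(\bar b_1, \dots, \bar b_N, b_N, \dots, b_1)\, F(a_1, \dots, a_N, \bar a_N, \dots, \bar a_1).$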
2) Now let d > 1. We write A = A 0 x A 1 , where A 0 = ] — iV, iV] and A1 = l - N . N ] " - 1 . Let (£0,
AM n.Wi) = H n ( n Aio.^0^.^)) \ieA
J /
VieA, VOEAO
^ n À n n/(w 1 »^ w,, °^ 1 » h eAi
Vi eA, ioeA0
/ / \l/|Ail
• /
Interchanging the rôles of $\Lambda_0$ and $\Lambda_1$ and again applying the induction hypothesis we obtain the result. □ Theorem (17.11) applies to functions $f_i \circ \sigma_i$ of single spins $\sigma_i$. However, in Chapters 18 and 19 we shall need a similar estimate for functions of the joint behaviour of several spins. (This should not seem surprising because each interaction potential consists of such functions.) To obtain such an estimate we shall subject Theorem (17.11) to a coarse-graining. It is for this coarse-graining that the involutions $\tau_k$ are indispensable. Let
(17.12) $C = \Lambda(1) = \{0, 1\}^d$
be the unit cube in $\mathbb{Z}^d$. For each $i \in \Lambda$ we consider the elementary cube
(17.13) $C(i) = C + i \mod \Lambda$
in the torus $\Lambda$. (We suppress the $\Lambda$-dependence of $C(i)$ in our notation.) For example, if $i = (N, \dots, N)$ then $C(i)$ consists of the $2^d$ corner sites of $\Lambda$. For the sake of brevity we identify $E^{C(i)}$ with $E^C$ in the natural way. That is, we shall suppress the transformation $\theta_i^\Lambda$ which establishes this identification. Accordingly, the projection
reflection of $\omega^\#$, the first stage being a reflection of $\Lambda$ and the second one a reflection of $C$. From this it is obvious that the rôle of the involutions $\tau_1, \dots, \tau_d$ will be taken over by the reflections of $E^C$. So we shall now use the letters $r_1, \dots, r_d$ to denote the reflections of $E^C$. (This corresponds to putting $N = 1$ in (17.5).) Of course, these reflections are commuting involutions of $E^C$. In accordance with (17.10) we also introduce the iterated reflections
(17.14) $r^i = r_1^{i_1} \circ \dots \circ r_d^{i_d} \quad (i = (i_1, \dots, i_d) \in \mathbb{Z}^d)$
of $E^C$. Suppose next we are given a finite measure $\mu$ on $(E^\Lambda, \mathscr{E}^\Lambda)$. We then let $\mu^\#$ denote the image of $\mu$ under the mapping $\omega \to \omega^\#$ from $E^\Lambda$ to $(E^C)^\Lambda$. We seek a condition on $\mu$ which implies that $\mu^\#$ is $\hat r_k$-positive, provided the transformation $\hat r_k$ of $(E^C)^\Lambda$ is defined by choosing $\tau_k = r_k$ in (17.6). To this end we observe that the set
$L_k = \bigcup_{i \in \Lambda_{+,k}} C(i) \cap \bigcup_{i \in \Lambda_{-,k}} C(i) = \{i \in \Lambda : i_k = 1 \text{ or } -N + 1\}$
consists of a pair of (hyper-)planes in $\Lambda$. If $\Lambda$ is thought of as being bent to a torus in $\mathbb{R}^{d+1}$, this pair of planes is the intersection of $\Lambda$ and a hyperplane in $\mathbb{R}^{d+1}$. The $\hat r_k$-positivity of $\mu^\#$ should be related to the behaviour of $\mu$ under reflection in this hyperplane. So we let $\bar r_k$ denote this reflection. That is, we set
(17.15) $(\bar r_k i)_\ell = 2 - i_k \mod 2N$ if $\ell = k$, and $(\bar r_k i)_\ell = i_\ell$ otherwise,
when $i = (i_1, \dots, i_d) \in \Lambda$ and $1 \le k \le d$, and we use the same symbol $\bar r_k$ to denote the associated transformation of $E^\Lambda$. We emphasize that $\bar r_k$ is a reflection in a plane that meets the sites of $\Lambda$, whereas $r_k$ was a reflection in a plane between the sites of $\Lambda$. Thus $\bar r_k$ preserves the position of some spins, namely all those in $L_k$, whilst $r_k$ does not. In analogy to (17.7) we will say that a finite measure $\mu$ on $(E^\Lambda,$
= fo (rkoC(rki))ieA
= ß(ggo fk) ^ 0. n
= / * o (oc (/) ), 6A .
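(17.17) Corollary. Let $\mu \in \mathscr{M}(E^\Lambda, \mathscr{E}^\Lambda)$ be finite, and suppose $\mu$ is $\Lambda$-periodic and $\bar r_k$-positive for all $1 \le k \le d$. Then for each choice $(f_i)_{i \in \Lambda}$ of bounded or nonnegative measurable functions $f_i$ on $E^C$ we have
$\Big| \mu\Big( \prod_{i \in \Lambda} f_i \circ \sigma_{C(i)} \Big) \Big| \le \prod_{j \in \Lambda} \mu\Big( \prod_{i \in \Lambda} f_j \circ r^i \circ \sigma_{C(i)} \Big)^{1/|\Lambda|}.$
Proof. Since $\mu^\#$ is evidently $\Lambda$-periodic, this follows immediately from (17.11) and (17.16). □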
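As a numerical illustration of the chessboard estimate of Theorem (17.11) in its simplest setting, the following sketch treats $d = 1$ with all involutions equal to the identity and spins in $\{-1, +1\}$. The measure used is an assumption of the sketch: the unnormalised nearest-neighbour measure on a ring in which each bond carries the weight $\exp(J \sigma_i \sigma_{i+1})$ with $J \ge 0$, which is periodic and reflection positive (compare Example (17.30), whose normalisation of $J$ may differ). The names `ring_expectation` and `chessboard_sides` are hypothetical, and each single-spin function is passed as a dictionary from spin values to real numbers.

```python
import math
from itertools import product

def ring_expectation(fs, J):
    """Unnormalised integral of prod_i f_i(sigma_i) over a ring of len(fs) sites."""
    n = len(fs)
    total = 0.0
    for sigma in product((-1, 1), repeat=n):
        # Nearest-neighbour bonds on the ring, with periodic closure.
        energy = sum(sigma[i] * sigma[(i + 1) % n] for i in range(n))
        weight = math.exp(J * energy)
        total += weight * math.prod(f[s] for f, s in zip(fs, sigma))
    return total

def chessboard_sides(fs, J):
    """Left and right sides of the chessboard estimate for tau = identity."""
    n = len(fs)
    lhs = abs(ring_expectation(fs, J))
    # Guard against tiny negative rounding errors before taking roots.
    rhs = math.prod(max(0.0, ring_expectation([f] * n, J)) ** (1.0 / n) for f in fs)
    return lhs, rhs
```

The ring length `len(fs)` plays the role of $2N$ and should therefore be even. When all functions coincide and are nonnegative, the two sides agree, in line with the role of the constant configurations on the right-hand side of the estimate.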
17.2 Gibbs distributions with periodic boundary condition
In this section we shall provide examples of finite measures on $(E^\Lambda, \mathscr{E}^\Lambda)$ which are both $\Lambda$-periodic and reflection positive. As we shall soon see, there is a trivial example of such a measure, namely the product measure $\lambda^\Lambda$ of any finite $\lambda \in \mathscr{M}(E, \mathscr{E})$. We can thus hope to find nontrivial examples by looking at measures with Gibbsian densities relative to $\lambda^\Lambda$. In order to ensure $\Lambda$-periodicity we need to confine ourselves to shift-invariant potentials, and we also need to impose periodic boundary conditions. So we shall consider Gibbs distributions with periodic boundary condition, as were introduced in Example (4.20)(2). We shall first deal with a particular class of shift-invariant short-range potentials which will be called C-potentials. The associated Gibbs distributions with periodic boundary condition will be shown to be $\bar r_k$-positive. This fact will be essential for the results of the next two chapters, but will not be needed in Chapter 20. Then we shall establish the $r_k$-positivity of the Gibbs distributions with periodic boundary condition relative to certain Heisenberg potentials. This result will not be used until Chapter 20. Throughout this section we let $\lambda \in \mathscr{M}(E, \mathscr{E})$ be a fixed a priori measure. For simplicity we will assume that $\lambda$ is finite. For each potential $\Phi$ and $A \subset \Lambda$ we shall freely think of $\Phi_A$ as a function on $E^A$, or as a function on $E^\Lambda$ which only depends on the coordinates in $A$. This is possible because of assumption (2.2)(i). Let $C$ be given by (17.12). (17.18) Definition. Let us agree to call a potential $\Phi$ with parameter set $\mathbb{Z}^d$ and state space $(E, \mathscr{E})$ a C-potential, and to write $\Phi \in \mathscr{C}_\Theta$, if $\Phi$ meets the conditions (i) to (iv) below. (i) $\Phi_A = 0$ unless $A = C + i$ for some $i \in \mathbb{Z}^d$. (ii) $\Phi_{C+i} = \Phi_C \circ \theta_{-i}$ for all $i \in \mathbb{Z}^d$. (iii) $\Phi_C = \Phi_C \circ r^i$ for all $i \in C$. (iv) Either $\|\Phi_C\| < \infty$, or there exists a sequence $(K_\ell)_{\ell \ge 1}$ in $\mathscr{E}$ with $K_\ell \uparrow E$ as $\ell \to \infty$ such that
$\sup_{\omega \in K_\ell^C} |\Phi_C(\omega)| < \infty$ for all $\ell \ge 1$
and
$\inf_{\omega \notin K_\ell^C} \Phi_C(\omega) \to \infty$ as $\ell \to \infty$.
(17.19) Comments. (1) For each <ï> e ^ e , assumption (iv) implies that all /J*'S are bounded. Thus $ is A-admissible. As we shall show in (18.12), condition (iv) also ensures that ^0(<ï>) ^ 0 (provided (£, ^) is standard Borel). (2) Up to equivalence, %>@ contains all next-nearest neighbour potentials which are invariant under reflections and translations and satisfy a version of assumption (iv). Suppose, for example, that d = 2 and can be written in the form "(p^Oi) \iA = {i}, (p2(<Ti,
c = T Z
Z
2fo,o))+
{i,j}<=C
Z
*»3(o"i»o>)
{W}<=Ç_
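satisfies (iii) and is equivalent to $\Phi$. ○ Suppose now we are given a C-potential $\Phi$. We then let ${}^\circ\gamma_\Lambda^\Phi$ denote the projection image on $E^\Lambda$ of the Gibbs distribution for $\Phi$ in $\Lambda$ with periodic boundary condition, and we write ${}^\circ\rho_\Lambda^\Phi$ for its density with respect to $\lambda^\Lambda$. According to Example (4.20)(2), ${}^\circ\rho_\Lambda^\Phi$ is given by
(17.20) ${}^\circ\rho_\Lambda^\Phi = ({}^\circ Z_\Lambda^\Phi)^{-1} \exp\Big[ -\sum_{i \in \Lambda} \Phi_C \circ \sigma_{C(i)} \Big].$
Here ${}^\circ Z_\Lambda^\Phi > 0$ is a normalization constant, and $C(i)$ is defined by (17.13). Of course, if $i$ is such that $C(i) \ne C + i$ then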
1 <; k ^ d, °y% is fk-positive.
Proof. The exponential in (17.20) can be written in the form h hofk with exp
c(i)<=A +ik
and h only depends on the coordinates in Â+ k. We thus only need to show
that $\lambda^\Lambda$ is $\bar r_k$-positive, which is quite easy. Indeed, let $L = L_k$ be defined as in the paragraph above (17.15), and let $\Delta = \bar\Lambda_{+,k} \setminus L$. Also, let $f$ be any bounded measurable function which only depends on the coordinates in $\bar\Lambda_{+,k}$. Since $f$ can be written as $f = g(\sigma_L, \sigma_\Delta)$, we conclude from Fubini's theorem that $\lambda^\Lambda(f\, f \circ \bar r_k) = \int \lambda^L(d\omega)\, \big[\lambda^\Delta(g(\omega, \cdot))\big]^2 \ge 0$. The proof is thus complete. □
In the following, we intend to provide examples of periodic Gibbs distributions which are $r_k$-positive, i.e., $\hat r_k$-positive when $\tau_k$ is the identity. We confine ourselves to the case when $E$ is a rotationally invariant subset of some Euclidean space $\mathbb{R}^n$, and we only look at shift-invariant Heisenberg potentials. By definition, these are all pair potentials of the form (17.22) O A = J
otherwise.
0
Here $J : \mathbb{Z}^d \to \mathbb{R}$ is any even function with $J(0) = 0$ and
(17.23) $\sum_{k \in \mathbb{Z}^d} |J(k)| < \infty,$
and the dot denotes the usual inner product. For such a
y
AO"-;) * • o)]
i,./eA
relative to $\lambda^\Lambda$. In the above,
(17.25) $J_\Lambda(k) = \sum_{\ell \in \mathbb{Z}^d} J(k + 2N\ell) \quad (k \in \mathbb{Z}^d),$
Mi-ntOftoj]
1,7'eA
is finite. We seek conditions on J which imply that °yf is r*-positive. The following lemma is a first step towards answering this question. (17.26) Lemma. Let fibe a finite measure on (£ A ,
(i) m is a finite measure on some measurable space (W, W); (ii) h and hw, w e W, are measurable complex functions on EA which only depend on the coordinates in A + k, hw(oS) is measurable jointly in w and <x>, and sup \hw\ is dominated by a measurable function; and wsW
(Hi) * indicates reflection by rk combined with complex conjugation. Then fi is rk-positive. Proof. We can assume that X is finite. By (ii), there exists a measurable function cp : EA -> [0, oo [ which only depends on the coordinates in A+Ji and satisfies \h\ ^ cp and \hw\ ^ q> for all w e W. For a given function fe j / + k and any number c > Owe set fc = fl(vic}. Expanding the exponential density of fi and using Fubini's theorem we then see that Kfef*)
= Z
7TJm(dwi)---jm(dw
where gCWl...Wl = fehhWi ...hW/. Clearly, the functions gCWl...We can be regarded as measurable functions on £ A+ k. Hence ÀA(a
a*
) = \XA+-k(a
)\2 > 0
Thus $\mu(f_c f_c^*) \ge 0$, and the lemma follows by letting $c$ tend to infinity. □ Let $K = \{z \in \mathbb{C} : |z| = 1\}$ denote the unit circle in the complex plane. (17.27) Definition. Let us agree to call an even function $J$ on $\mathbb{Z}^d$ nonnegative definite relative to $r_1$ if there exists a finite measure $\alpha$ on $]-1, 1[\, \times K^{d-1}$ which represents $J$, in that
$J(i) = \int \alpha(dz_1, \dots, dz_d)\, z_1^{i_1 - 1} z_2^{i_2} \cdots z_d^{i_d}$
for all $i \in \mathbb{N} \times \mathbb{Z}^{d-1}$. For any other coordinate direction $1 < k \le d$ the nonnegative definiteness relative to $r_k$ is defined analogously by an interchange of coordinates $1$ and $k$. (17.28) Comments. (1) An even and absolutely summable function $J$ on $\mathbb{Z}^d$ is nonnegative definite relative to $r_1$ if and only if
$\sum_{i, j \in \mathbb{N} \times \mathbb{Z}^{d-1}} z_i \bar z_j\, J(i_1 + j_1 - 1, i_2 - j_2, \dots, i_d - j_d) \ge 0$
for each choice of complex numbers $z_i$ such that $\{i \in \mathbb{N} \times \mathbb{Z}^{d-1} : z_i \ne 0\}$ is finite. This result, which combines the Herglotz lemma (13.A9) and the solution of the Hamburger moment problem, is a particular case of a theorem of Berg and Maserick (1984). (See also p. 96 of Berg, Christensen and Ressel (1984).) To apply this theorem it is sufficient to note that the operation
$(i, j) \to (i_1 + j_1 - 1, i_2 + j_2, \dots, i_d + j_d)$
turns $\mathbb{N} \times \mathbb{Z}^{d-1}$ into a commutative semigroup with identity $(1, 0, \dots, 0)$ and involution $i \to (i_1, -i_2, \dots, -i_d)$. The mappings $i \to z_1^{i_1 - 1} z_2^{i_2} \cdots z_d^{i_d}$ with $z \in [-1, 1] \times K^{d-1}$ are nothing but the associated bounded semicharacters. (Since $\sum_{i_1 \ge 1} |J(i_1, 0, \dots, 0)| < \infty$, the semicharacters with $|z_1| = 1$ do not contribute to $J$.) (2) Suppose $J$ and $J'$ are nonnegative definite for $r_1$. Then so is the product $J J'$. For let $\alpha$ and $\alpha'$ be the measures that represent $J$ resp. $J'$. Then $J J'$ is represented by the image of $\alpha \times \alpha'$ under the mapping
$(z_1, \dots, z_d, z_1', \dots, z_d') \to (z_1 z_1', \dots, z_d z_d').$
(3) Any $J$ which is nonnegative definite for $r_1$ is necessarily ferromagnetic to some extent, in that $J(i_1, 0, \dots, 0) \ge 0$ whenever $i_1$ is odd. On the other hand, if the representing measure $\alpha$ is supported on $]-1, 0[\, \times K^{d-1}$ then $J(i_1, 0, \dots, 0) < 0$ when $i_1 \ne 0$ is even. Also, if $J$ is nonnegative definite for $r_1$ then so is the function $i \to -(-1)^{i_1} J(i)$. Its representing measure is a reflection of $\alpha$. ○ (17.29) Theorem. Suppose $\Phi$ is a Heisenberg potential of the form (17.22) with an even and absolutely summable function $J$ which is nonnegative definite relative to some $r_k$. Then ${}^\circ\gamma_\Lambda^\Phi$ is $r_k$-positive. Proof. We can assume without loss that $\lambda$ is supported on a bounded set. (This is because the general case then follows by an obvious limiting argument.) For notational convenience we will also assume that $d = 2$ and $k = 1$. (In all other cases the proof is similar.) For each $L \ge 1$ and $(u, v) \in \mathbb{Z}^2$ we put
Z
J(u + 2Nk,v + 2N(S
-1')\
and we let \xKL denote the finite measure which is obtained from the measure °Z* °y* when JA is replaced by JA L. Using hypothesis (17.23) we see that lim J A , L (0 = JA(i) for all i e Zd. Hence lim fiA L = °ZA 0y* in variational L—•oo
L~*co
distance. Consequently, it is sufficient to show that each fiA L is r r positive. To this end we fix any L ^ 1. In order to apply Lemma (17.26) we write the AA-density of [iA L in the form exp(h + h* + H), where h
= \
Z
Jh,L{i-j)Oi -CTj
'.J6A + ,i
and H
=
Z
4,L(*-./)
ieA.,i,J£At,i
We need to find a measure space (W,W,m) and complex measurable func-
tions hw (w e W) which only depend on the spins in A + j l and are such that H — j m(dw)hwh*. By definition of the inner product, H is the sum of n terms which correspond to the n coordinates of the a^s. We can look at each of these terms separately. Equivalently, we can assume that n = 1. In this case, H is equal to the double sum (2*>)
ZJ
2J
keZ
u,v,f
(J
(U-N,V)G(N
+
1-U',V')
u'.v'.r x J(u + u' - 1 + 2Nk, v-v' + 2N{t - /')), where the inner sum is taken over the range 1 :g u, u' ^ JV, —N
V Lu
u,v,e
n
0
n
(u-N,v)0(N
x2N)~l rxu+u'-2
+ l-u',v')Lx
I
z
x2N-u-u'
v+2NS -v'-
2NC
L
z~v-2Nt
zv'
+ 2Nr~\
The last expression has the desired form. Indeed, let W = {0,1} x ] — 1,1[ x K and W be the Borel er-algebra on W. Define a (finite) measure m on (W, W) by m(0,dx,dz) = m(l,dx,dz) = (2L) _1 (l - x 2N )~ 1 a(dx,dz), and consider the measurable functions "(0,x,z) — Z J
cr
(Af + l-u,i;)- x
Z
and I, _ V n xN-u "(l,x,z) — Z J ^JV+l-u,!))"*' u,v,(
Z
v + 2N( )
{x, z) e ] — 1,1[ x K. The functions hw (w e W) satisfy condition (ii) of Lemma (17.26). Moreover, since (u — N,v) = /^(iV + 1 — M,U) we have 7/ = jm(dw)/iM,/i*. The theorem thus follows from Lemma (17.26). D (17.30) Example. Ferromagnetic nearest-neighbour potentials. Suppose J is an even real function on Zd such that J(i) = 0 if ] i\ > 1 and J(i) ^ 0 if | i\ = 1, and
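let $\Phi$ be given by (17.22). Then ${}^\circ\gamma_\Lambda^\Phi$ is $r_k$-positive for all $1 \le k \le d$. Indeed, $J$ is nonnegative definite for $r_1$ with representing measure $\alpha = J(1, 0, \dots, 0)\, \delta_0 \times \nu^{d-1}$. Here $\delta_0$ is Dirac measure at $0$ and $\nu$ is normalized Haar measure on $K$. Thus Theorem (17.29) shows that ${}^\circ\gamma_\Lambda^\Phi$ is $r_1$-positive. For $k \ne 1$ the $r_k$-positivity follows by symmetry. We note that the nonnegativity of $J$ is not only necessary for $J$ to be nonnegative definite for $r_k$ (cf. Comment (17.28)(3)) but also necessary for ${}^\circ\gamma_\Lambda^\Phi$ to be $r_k$-positive. To see this we choose $\lambda = \delta_1 + \delta_{-1}$ for the a priori measure, and for fixed $1 \le k \le d$ we let $u$ denote the $k$'th unit vector in $\mathbb{Z}^d$. We put
$h = \sum_{\{i,j\} \subset \Lambda_{+,k}} \sigma_i \sigma_j\, J_\Lambda(i - j)$
and
$f = e^{-h}\, \sigma_u\, 1_{\{\sigma_i = \sigma_u \text{ for all } i \in \Lambda_{+,k}\}}.$
Then
${}^\circ Z_\Lambda^\Phi\; {}^\circ\gamma_\Lambda^\Phi(f f^*) = \sum_{x, y = \pm 1} x y \exp\big[(2N)^{d-1} J(u)\, x y\big] = 4 \sinh\big[(2N)^{d-1} J(u)\big].$
Thus, if $J(u) < 0$ then ${}^\circ\gamma_\Lambda^\Phi(f f^*) < 0$. ○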
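The sign argument of Example (17.30) can be checked by brute force on the smallest nontrivial torus. The sketch below takes $d = 1$ and $N = 2$, so that the sites are $-1, 0, 1, 2$ and the reflection $r_1$ exchanges $1 \leftrightarrow 0$ and $2 \leftrightarrow -1$. It assumes, as a convention of this sketch only, that every nearest-neighbour bond of the ring carries the weight $\exp(J \sigma_i \sigma_j)$; this normalisation of $J$ may differ from that of the text by a constant factor, which does not affect the sign. The name `mu_f_fstar` is hypothetical.

```python
import math
from itertools import product

SITES = (-1, 0, 1, 2)                          # torus with N = 2, d = 1
BONDS = ((-1, 0), (0, 1), (1, 2), (2, -1))     # nearest neighbours on the ring

def mu_f_fstar(J):
    """Unnormalised value of mu(f f*) for the test function of the example."""
    total = 0.0
    for spins in product((-1, 1), repeat=4):
        s = dict(zip(SITES, spins))
        weight = math.exp(J * sum(s[i] * s[j] for i, j in BONDS))
        # f depends on the sites 1, 2 only: sign of the common spin,
        # with the interaction inside the right half divided out.
        f = s[1] * (s[1] == s[2]) * math.exp(-J * s[1] * s[2])
        # f* = f composed with the reflection 1 <-> 0, 2 <-> -1.
        f_star = s[0] * (s[0] == s[-1]) * math.exp(-J * s[0] * s[-1])
        total += weight * f * f_star
    return total
```

Under this convention only the two bonds crossing the reflection planes survive, and the sum reduces to $\sum_{x,y = \pm 1} x y\, e^{2Jxy} = 4 \sinh(2J)$, which is negative exactly when $J < 0$.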
(17.31) Example. Next-nearest neighbour potentials. Let $d \ge 2$, and suppose $J$ is given by $J(i) = a$ if $|i| = 1$, $J(i) = b$ if $|i|^2 = 2$, and $J(i) = 0$ if $|i|^2 > 2$. If $a \ge 2(d - 1)|b|$ then ${}^\circ\gamma_\Lambda^\Phi$ is $r_k$-positive for all $k$. For we can take
$\alpha(dz_1, \dots, dz_d) = \delta_0(dz_1) \Big[ a + b \sum_{k=2}^{d} (z_k + \bar z_k) \Big] \nu(dz_2) \cdots \nu(dz_d),$
$\delta_0$ and $\nu$ being as in the preceding example. Conversely, if $a < 2(d - 1)|b|$ then ${}^\circ\gamma_\Lambda^\Phi$ fails to be $r_k$-positive. This can be seen in a similar way as in Example (17.30) by looking at the same function $f$ and also at J
=
e
<7
iil{
In this connection it is interesting to note that no restriction on a and b is required for "y® to be rk-positive. This follows from Theorem (17.21). o (17.32) Example. Long range potentials. Let d = 1 or 2, and suppose $ has the form (17.22) with
$$ J(i) = \beta\,|i|^{-a} \quad\text{for } |i| \ge 1, $$
where β ≥ 0 and a > d. Then °γ_Λ^Φ is r_k-positive for 1 ≤ k ≤ d. To show this we only need to look at the case k = 1. By Theorem (17.29) it is sufficient to show that J admits a representing measure α on ]−1,1[ × K^{d−1}, and to this end we may assume that β = 1.

1) First we consider the case d = 1. From the formula
$$ \Gamma(a) = \int_0^\infty e^{-s}s^{a-1}\,ds $$
for the gamma function we obtain, by a change of variables,
$$ k^{-a} = \Gamma(a)^{-1}\int_0^\infty e^{-ks}s^{a-1}\,ds $$
for each k ≥ 1. Thus the measure α on ]−1,1[ can be taken to be the image of the gamma distribution Γ(a)^{−1} e^{−s} s^{a−1} ds on ]0,∞[ under the mapping s → e^{−s}.

2) Now let d = 2. Because of Comment (17.28)(2) it is sufficient to look for a representing measure α in the case when 1 < a < 2. In this case the integral
$$ c(a) = \int_0^\infty dt\; t^{1-a}/(1+t^2) $$
is finite, and the substitution s = (1 + t²(1 + ℓ²/k²))^{1/2} yields the formula
$$ (k^2+\ell^2)^{-a/2} = k^{1-a}\int_1^\infty \rho(ds)\; sk/(\ell^2+s^2k^2), $$
where k ∈ N, ℓ ∈ Z, and ρ(ds) = c(a)^{−1}(s² − 1)^{−a/2} ds. From the case d = 1 above we know that the function (k, ℓ) → k^{1−a} admits a representing measure on ]−1,1[ × K (which is supported on [0,1[ × {1}). In view of Comment (17.28)(2) we thus only need to find a measure kernel s → π_s such that π_s represents the function
$$ (k,\ell) \to sk/(\ell^2+s^2k^2), $$
s ≥ 1. If k is held fixed and ℓ is thought of as a real variable, this function is just the Cauchy density for the parameter sk (up to a positive factor). The Fourier transform of the Cauchy density is known to be exp[−sk|u|]. Thus
$$ sk/(\ell^2+s^2k^2) = c\int_{-\infty}^{\infty} du\; e^{i\ell u}\,e^{-sk|u|} $$
for some constant c > 0, and π_s can be taken to be the image of the finite measure c e^{−s|u|} du under the mapping u → (e^{−s|u|}, e^{iu}). (In the above, the symbol i stands for the imaginary unit.) ○
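The integral representation of k^{−a} used in the case d = 1 can be checked numerically. The sketch below is not taken from the text: it uses a plain midpoint rule with a truncated upper limit, and the function name, step count and cut-off are illustrative assumptions.

```python
import math

def laplace_power(k, a, steps=200_000, upper=60.0):
    """Midpoint-rule approximation of (1/Gamma(a)) * int_0^inf exp(-k*s) * s**(a-1) ds.

    The integrand is negligible beyond s = upper / k, so the integral
    is truncated there. The result should be close to k**(-a).
    """
    width = upper / k
    h = width / steps
    total = 0.0
    for n in range(steps):
        s = (n + 0.5) * h
        total += math.exp(-k * s) * s ** (a - 1.0)
    return total * h / math.gamma(a)
```

Comparing `laplace_power(k, a)` with `k ** (-a)` for a few values of k and a > 1 illustrates that the moments of the image measure reproduce the power law.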
Chapter 18 Low energy oceans and discrete symmetry breaking
This chapter is devoted to the proof of the existence of phase transitions for certain short-range potentials which exhibit a specific kind of ground state degeneracy. Specifically, we shall deal with the case when the parameter set S is a simple cubic lattice Z^d of any dimension d ≥ 2, and we shall focus on C-potentials in the sense of Definition (17.18). A C-potential Φ will be said to exhibit a ground state degeneracy of order N > 1 if the function Φ_C attains its minimum at N pairwise distinct configurations on C. (Recall that C stands for the elementary cell of Z^d.) These N configurations will be called the local ground states of Φ. Assuming that this occurs, we shall also stipulate that the local ground states are incompatible in a certain sense but are related to each other by suitable symmetries of Φ. (The first condition will imply that the relating Φ-symmetries are necessarily discrete.)

To get a feeling for the significance of these conditions it is worthwhile to look at a familiar example: the Ising potential at zero external field (cf. (6.8)). This potential has two local ground states, namely the constant configurations on C which are identically +1 resp. −1. These local ground states can be distinguished from each other at every single site but are interrelated via a symmetry, the spin-flip transformation. As should be clear from Section 6.2, it is precisely these features of the Ising potential which give rise to a breaking of the spin-flip symmetry at low temperatures in two (or more) dimensions. It is thus natural to ask whether these features are the driving force behind a more general mechanism of symmetry breaking which lies behind the phase transitions in the Ising model. The answer is: yes, there is such a mechanism. In order to describe it we start again from the Peierls argument which has been used in the Ising case. As we know from Chapter 6, the heart of the Peierls argument is the concept of a contour.
The Peierls contours indicate the boundaries of the connected regions in which the local spin patterns coincide with a local ground state and thus require a minimal energy. Keeping this in mind, it is natural to look at the random subgraph of Z^d which consists of all those spins whose Φ-energy relative to the adjacent spins is minimal (or almost minimal, at least). For reasons similar to those which ensure that a Peierls contour is typically short, the "low energy graph" for Φ should contain an infinite cluster which resembles an ocean with small (and, in particular, finite) islands, at least at low temperatures. This conjecture falls into the realm of what is known as percolation theory: the study of infinite clusters in random subgraphs of
regular lattices. Using some techniques of percolation theory, and also using the chessboard estimate as a counterpart to the Peierls contour estimate, it will be possible to confirm the above conjecture as follows. If the spin distribution is governed by a limit of Gibbs distributions °γ_Λ^{βΦ} with periodic boundary conditions, and if β is large enough, then each two-dimensional layer in Z^d contains a "low energy ocean" as described above. Now the point is this: If Φ exhibits a ground state degeneracy as described at the beginning then the low energy ocean in any fixed layer is bound to show a pattern which corresponds to one of the N distinct local ground states. This pattern can be observed at infinity, and its N possible values are symmetry-related. Consequently, there exist N mutually disjoint but symmetry-related tail events of necessarily equal and thus positive probabilities. In view of the results of Chapter 7, this can only occur when the relating symmetries are broken. To summarize, there exists a mechanism of symmetry breaking which consists of two parts: the existence of a low energy ocean when β is large, and the ground state degeneracy which implies that the pattern of the ocean is a fluctuating tail variable. This mechanism will give us the following general theorem. If Φ is a C-potential which exhibits a ground state degeneracy as above then, for any sufficiently large β, the potential βΦ shows a breaking of all symmetries which relate the local ground states to each other. This theorem will be stated and proved in Section 18.2. The first section will be devoted to the existence of low energy oceans. In Section 18.3 we shall present various classical examples of phase transition that fit into the theory just outlined.
18.1 Percolation of spin patterns
Throughout this chapter we put S = Z^d, the dimension d being at least two. We also let (E, ℰ) be an arbitrary state space, and we assume that (E, ℰ) is endowed with a finite a priori measure λ. (The case of an infinite a priori measure can be reduced to the finite case using the remarks in (2.18).) As in (17.12) we let C denote the unit cube in Z^d, and we let
$$ m(\Phi) = \lambda\text{-}\inf \Phi_C = \sup\{c \in \mathbb{R} : \lambda^C(\Phi_C \le c) = 0\} $$
denote the essential infimum of Φ_C relative to λ^C. (In concrete cases,
a local ground state of Φ if Φ_C(ω) = m(Φ). More generally, we shall consider the sets G_ε(Φ) of all local ε-ground states for Φ. These are defined by
$$ G_\varepsilon(\Phi) = \{\Phi_C \le m(\Phi) + \varepsilon\}, \qquad \varepsilon \ge 0. \tag{18.2} $$
By definition of the essential infimum, λ^C(G_ε(Φ)) > 0 for all ε > 0. Of course, if
$$ V(G,\omega) = \{\, i \in S : (\theta_{-i}\omega)_C \in r^i G \,\} $$
be the set of all elementary cubes C + i in S for which ω_{C+i} exhibits the pattern that is described by the appropriate reflection of G. To understand the role of the reflections it might be worthwhile to look at an example. Suppose that d = 2, E = {−1, 1}, and G = {ζ}, where ζ
Figure 18.1 A configuration ω of +'s and −'s. The origin is encircled. The plaquettes C + i with i ∈ V({ζ^k}, ω) are marked with a k. Here k = 1 or 2 and ζ^k ∈ {−1, +1}^C is given by ζ^k_i = (−1)^{i_1+i_2+k} (i ∈ C).
The set V(G, ω) will be regarded as the vertex set of a subgraph of S. The set of edges is simply the set
$$ \{\{i,j\} \subset V(G,\omega) : |i-j| = 1\} $$
of all undirected nearest-neighbour bonds between the sites of V(G, ω). Accordingly, we shall examine paths and clusters in V(G, ω). By definition, a (self-avoiding) path in a set V ⊂ S is an injective mapping n → i^{(n)} from a finite or infinite interval I ⊂ Z into V such that |i^{(n+1)} − i^{(n)}| = 1 whenever n, n + 1 ∈ I. If I has a minimum k resp. a maximum ℓ then the sites i^{(k)} and i^{(ℓ)} are called the starting point resp. the end point of the path. A cluster of V is any non-empty subset ξ of V which exhibits the properties below: (i) ξ is connected, i.e., for any two distinct sites i, j ∈ ξ there exists a path in ξ with starting point i and end point j, and (ii) if i ∈ ξ and j ∈ V are such that |i − j| = 1 then j ∈ ξ. In other words, a cluster of V is nothing but a connected component of V. We shall be interested only in infinite clusters of V(G, ω). So we let
$$ \xi(G,\omega) = \{\, i \in V(G,\omega) : \text{there exists an infinite path in } V(G,\omega) \text{ with starting point } i \,\} \tag{18.4} $$
denote the union of all infinite clusters of V(G, ω). If ξ(G, ω) ≠ ∅ then V(G, ω) is said to percolate. We shall also need to deal with the restriction of V(G, ω) to an infinite subset R of S. To this end we introduce the notation V_R(G, ω) = V(G, ω) ∩ R, and we let ξ_R(G, ω) denote the union of all infinite clusters of V_R(G, ω). ξ_R(G, ω) is defined by (18.4) with V(G, ω) being replaced by V_R(G, ω). (One should note that ξ_R(G, ω) ≠ ξ(G, ω) ∩ R in general.) In particular, we shall be concerned with the case when R is a two-dimensional layer in S. For the sake of definiteness we shall look at the plane
$$ P = \{\, i = (i_1,\dots,i_d) \in S : i_3 = \dots = i_d = 0 \,\}. \tag{18.5} $$
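The notion of a cluster as a connected component with respect to nearest-neighbour bonds can be made concrete on a finite window. The sketch below is an illustration only: it works with a finite set of sites (so it cannot decide whether a cluster is infinite), and the function name and the representation of sites as integer tuples are assumptions made here.

```python
from collections import deque

def clusters(vertices):
    """Split a finite set of lattice sites (integer tuples) into nearest-neighbour clusters."""
    remaining = set(vertices)
    result = []
    while remaining:
        start = remaining.pop()
        cluster = {start}
        queue = deque([start])
        while queue:
            site = queue.popleft()
            # Nearest neighbours differ by +1 or -1 in exactly one coordinate.
            for axis in range(len(site)):
                for step in (-1, 1):
                    nb = site[:axis] + (site[axis] + step,) + site[axis + 1:]
                    if nb in remaining:
                        remaining.remove(nb)
                        cluster.add(nb)
                        queue.append(nb)
        result.append(cluster)
    return result
```

Sites that touch only diagonally, such as (0, 0) and (1, 1), end up in different clusters, in line with the requirement |i − j| = 1 for bonds.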
(18.6) Definition. A subset ξ of P will be called an ocean in P if ξ is connected and all ξ-islands in P are finite. Here a ξ-island is meant to be any subset V of P∖ξ which is maximal relative to the following property: For any two i, j ∈ V there exists a finite sequence (i^{(1)}, ..., i^{(ℓ)}) in V with i^{(1)} = i, i^{(ℓ)} = j, and |i^{(k+1)} − i^{(k)}| = 1 or √2 for all 1 ≤ k < ℓ.

Clearly, an ocean in P is always infinite. Also, if a subset V of P contains an ocean then V contains a unique infinite cluster, and this cluster is an ocean. We put
$$ \xi^{\circ}_P(G,\omega) = \begin{cases} \xi_P(G,\omega) & \text{if } \xi_P(G,\omega) \text{ is an ocean},\\ \emptyset & \text{otherwise}. \end{cases} \tag{18.7} $$
Thus ξ°_P(G, ω) is the unique maximal ocean in V_P(G, ω) whenever such an ocean exists. In particular, we have
{ξ°_P(G, ·) ≠ ∅} = {V_P(G, ·) contains an ocean}. In this section we will seek to provide a condition on Φ which ensures that μ(ξ°_P(G, ·) ≠ ∅) > 0 for a suitable μ ∈ 𝒢(Φ). In fact, we shall only look at Gibbs measures for Φ which are limits of Gibbs distributions with periodic boundary condition. There are two reasons for this restriction. On the one hand, these limits have the advantage of inheriting all symmetries of Φ. On the other hand, we shall need to make use of the chessboard estimate. So we let 𝒢°(Φ) denote the set of these limits. That is, 𝒢°(Φ) is the set of all cluster points (in the ℒ-topology) of any sequence of the form (°γ^Φ_{Λ(N)} × δ_{ω_N})_{N≥1}. Here Λ(N) is the cube (17.1), °γ^Φ_{Λ(N)} is given by (17.20), and ω_N ∈ E^{S∖Λ(N)} is arbitrary. Recall that the set 𝒢°(Φ) has already been considered in Example (5.20)(3). In particular, we know that 𝒢°(Φ) ⊂ 𝒢_Θ(Φ). In (18.12) below we shall prove that 𝒢°(Φ) ≠ ∅ whenever t(G, Φ) is sufficiently small. We put (18.8)
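$$ t(G,\Phi) = \exp\Big[-\bar\lambda\text{-}\inf_{E^C\setminus G}\Phi_C\Big]\;\inf_{\delta>0}\;\exp[m(\Phi)+\delta]\,\big/\,\bar\lambda^C(G_\delta(\Phi)). $$
Here G ∈ ℰ^C, Φ is a C-potential, λ̄ = λ(E)^{−1} λ, and λ̄-inf_{E^C∖G} Φ_C denotes the essential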
infimum of <5C on the set E \G relative to Xe. Here are some elementary properties of t(G, $). (18.9) Remarks. (1) For each $ e ^ e , the function t(-,(D) on Sc is decreasing. (2) For each $ e # e we have inf e7lc(G^(
inf
exp [2-sup
where A-sup <5)c = inf {c e U: XC(M D {C ^ c}) = 0} is the essential supM
remum of C on M. (3) For each G e Sc, the function t(G, •) is upper semicontinuous, in that f (G, *F) < t(G,
asß^oo.
Proof. (1) and (2) are obvious. (3) is an immediate consequence of (2) and definition (18.8). To prove (4) we set δ = ε/2 and observe that
$$ t(G_\varepsilon(\Phi),\beta\Phi) \le e^{-\beta\varepsilon}\,e^{\beta\delta}\,\big/\,\bar\lambda^C(G_\delta(\Phi)). $$
□
The significance of the quantity t(G, Φ) comes from the following key estimate which might be regarded as a general version of the Peierls contour estimate. This estimate is the only result of this chapter which makes use of reflection positivity.

(18.10) Lemma. Suppose G ∈ ℰ^C is r-symmetric, and let a C-potential Φ, μ ∈ 𝒢°(Φ), and D ∈ 𝒮 be given. Then the inequality
$$ \mu(D \cap V(G,\cdot) = \emptyset) \le t(G,\Phi)^{|D|} $$
holds.

Proof. It is sufficient to prove that
$$ {}^{\circ}\gamma_\Lambda^\Phi(D \cap V(G,\cdot) = \emptyset) \le t(G,\Phi)^{|D|} $$
whenever Λ = Λ(N) is so large a cube that Λ ⊃ ∪_{i∈D} (C + i). In this case we can write
$$ 1_{\{D\cap V(G,\cdot)=\emptyset\}} = \prod_{i\in\Lambda} f_i\circ\sigma_{C(i)}, $$
where C(i) is defined by (17.13) and f_i = f = 1_{E^C∖G} when i ∈ D and f_i = 1 otherwise. Applying Theorem (17.21) and the chessboard estimate (17.17), we thus obtain that
$$ {}^{\circ}\gamma_\Lambda^\Phi(D \cap V(G,\cdot) = \emptyset) \le \prod_{i\in\Lambda} {}^{\circ}\gamma_\Lambda^\Phi\Big(\prod_{j\in\Lambda} f_i\circ\sigma_{C(j)}\Big)^{1/|\Lambda|} = t_\Lambda^{|D|}. $$
Here
$$ t_\Lambda = {}^{\circ}\gamma_\Lambda^\Phi\Big(\prod_{i\in\Lambda} f\circ\sigma_{C(i)}\Big)^{1/|\Lambda|}. $$
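To estimate t_Λ we can assume without loss that λ(E) = 1. (Otherwise we replace λ by λ̄ = λ(E)^{−1} λ.) Then
$$ {}^{\circ}Z_\Lambda^\Phi\; t_\Lambda^{|\Lambda|} = \lambda^\Lambda\Big(\prod_{i\in\Lambda} (f e^{-\Phi_C})\circ\sigma_{C(i)}\Big) \le \exp\Big[-|\Lambda|\;\lambda\text{-}\inf_{E^C\setminus G}\Phi_C\Big]\;\lambda^\Lambda\Big(\prod_{i\in\Lambda} f\circ\sigma_{C(i)}\Big). $$
Letting ∏' denote the product over all i = (i_1, ..., i_d) ∈ Λ with i_k ≡ 0 mod 2 for all 1 ≤ k ≤ d, we also have
$$ \lambda^\Lambda\Big(\prod_{i\in\Lambda} f\circ\sigma_{C(i)}\Big) \le \lambda^\Lambda\Big({\prod_i}' f\circ\sigma_{C(i)}\Big) = \lambda^C(f)^{|\Lambda|/|C|}. $$
On the other hand, for each δ > 0 we can write, letting g = 1_{G_δ(Φ)},
$$ {}^{\circ}Z_\Lambda^\Phi \ge \exp\big[-|\Lambda|(m(\Phi)+\delta)\big]\;\lambda^\Lambda\Big(\prod_{i\in\Lambda} g\circ\sigma_{C(i)}\Big). $$
To estimate the last integral we can again apply Corollary (17.17) (with μ = λ^Λ). This gives
$$ \lambda^C(G_\delta(\Phi)) = \lambda^\Lambda(g\circ\sigma_C) \le \lambda^\Lambda\Big(\prod_{i\in\Lambda} g\circ\sigma_{C(i)}\Big)^{1/|\Lambda|}. $$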
Putting together all preceding estimates we see that t_Λ ≤ t(G, Φ). The proof is thus complete. □

We insert a comment which will be needed in the next chapter.

(18.11) Comment. The conclusion of Lemma (18.10) remains true when t(G, Φ) is replaced by the quantity
$$ \tilde t(G,\Phi) = \exp\Big[-\bar\lambda\text{-}\inf_{E^C\setminus G}\Phi_C\Big]\;\bar\lambda^C(E^C\setminus G)^{1/|C|} \times \inf_R\; \exp\Big[\bar\lambda\text{-}\sup_R \Phi_C\Big]\,\big/\,\bar\lambda^C(R)^{1/|C|}, $$
where the inf is taken over all rectangles R = ∏_{i∈C} R_i with R_i ∈ ℰ and λ(R_i) > 0. Indeed, in estimating °Z_Λ^Φ from below we can take advantage of the identity
$$ \lambda^\Lambda\Big(\prod_{i\in\Lambda} 1_R\circ\sigma_{C(i)}\Big) = \lambda^\Lambda\Big({\prod_i}' 1_R\circ\sigma_{C(i)}\Big) = \lambda^C(R)^{|\Lambda|/|C|}. $$
(In view of Remark (18.9)(2), the main difference between t(G, Φ) and t̃(G, Φ) is in the power 1/|C| of the last factor of t(G,
as
( -> oo.
The last condition implies that t{Kce,
z(t) = 1 A X ^(5075
(t ^ 0).
Clearly, z(t) → 0 as t → 0.

(18.14) Lemma. Let Q be any of the four quadrants with vertex 0 in the plane P. Suppose G ∈ ℰ^C, μ ∈ 𝒫(Ω, ℱ), and t ≥ 0 are such that μ(D ∩ V(G, ·) = ∅) ≤ t^{|D|} for all D ∈ 𝒮. Then
$$ \mu(0 \in \xi_Q(G,\cdot)) \ge 1 - z(t). $$
Proof. If 0 ∉ ξ_Q(G, ·) then either 0 ∉ V(G, ·), or 0 belongs to a finite cluster of V_Q(G, ·). In either case, there exists a finite set D = {u^{(1)}, ..., u^{(ℓ)}} in Q such that (i) u^{(1)}_1 = 0 and u^{(ℓ)}_2 = 0, (ii) |u^{(k)} − u^{(k+1)}| = 1 or √2 for all 1 ≤ k < ℓ, (iii) |u^{(k)} − u^{(k+2)}| > √2 for all 1 ≤ k < ℓ − 1, and (iv) D ∩ V(G, ·) = ∅. (Thus D is the vertex set of a path in Q which connects the two half-axes at the border of Q and is allowed to make next-nearest neighbour jumps.) For in the first case we can simply set D = {0}, and in the second case we let D be a suitable subset of the "outer boundary" of the finite cluster of V_Q(G, ·) that contains the origin; cf. the proof of Lemma (6.14). Hence
$$ \{0 \notin \xi_Q(G,\cdot)\} \subset \bigcup_D \{D \cap V(G,\cdot) = \emptyset\}, $$
where the union is taken over all finite sets D ⊂ Q with properties (i) to (iii) above. Therefore we only need to count all such D with a given cardinality |D| =
for each quadrant Q ⊂ P, r-symmetric G ∈ ℰ^C, C-potential Φ, and μ ∈ 𝒢°(Φ). We shall now combine this estimate with the shift-invariance of μ to obtain a lower bound for μ(0 ∈ ξ°_P(G, ·)). As a tool we shall need the well-known recurrence theorem of Poincaré which reads as follows.

(18.15) Lemma. Let (Ω, ℱ, μ) be an arbitrary probability space and θ a measurable μ-preserving transformation of Ω. Also, let A ∈ ℱ and A_∞(θ) = {ω ∈ Ω: θ^n ω ∈ A for infinitely many n ≥ 0}. Then
$$ \mu(A \setminus A_\infty(\theta)) = 0. $$
Proof. For each ω ∈ Ω we let
$$ \tau(\omega) = \sup\{n \ge 0 : \theta^n\omega \in A\} $$
whenever there is some n with θ^n ω ∈ A, and τ(ω) = −1 otherwise. Then the inclusion
$$ A \setminus A_\infty(\theta) \subset \{0 \le \tau < \infty\} = \bigcup_{n\ge 0} \{\tau = n\} $$
holds. For each n ≥ 0 we have θ^{−1}{τ = n} = {τ = n + 1} and thus μ(τ = n) = μ(τ = n + 1). Since the sets {τ = n} are pairwise disjoint and their union has at most probability one, we conclude that μ(τ = n) = 0 for all n ≥ 0, and the lemma follows. □

Let us agree to call a random field μ quasi-Gibbsian if it enjoys the following property: There exists some λ ∈ 𝒫(E, ℰ) such that for each A ∈ 𝒯 with μ(A) > 0 the measure μ(· | A) is equivalent to λ^S on ℱ_Λ for all Λ ∈ 𝒮. It follows from Theorem (7.7)(b) and Remarks (1.28)(2) and (3) that each Gibbs measure relative to an arbitrary potential is quasi-Gibbsian.

(18.16) Lemma. Let G ∈ ℰ^C be r-symmetric, and suppose μ ∈ 𝒫_Θ(Ω, ℱ) is quasi-Gibbsian and satisfies μ(0 ∈ ξ_Q(G, ·)) ≥ 1 − z for some z < 1/4 and each quadrant Q in P with vertex 0. Then
$$ \mu(0 \in \xi^{\circ}_P(G,\cdot)) \ge 1 - 4z. $$
Proof. Let λ ∈ 𝒫(E, ℰ) be such that μ is quasi-Gibbsian with reference measure λ. If λ^C(G) = 1 then μ(i ∈ V(G, ·)) = 1 for all i ∈ S and thus μ(V(G, ·) = S) = 1. So we can assume that λ^C(G) < 1. To simplify our notation we shall further assume that d = 2. Hence P = S. We write Q_1, ..., Q_4 for the four quadrants with vertex 0 and consider the event
$$ X = \Big\{0 \in \bigcap_{n=1}^{4} \xi_{Q_n}(G,\cdot)\Big\} = \bigcap_{n=1}^{4} \{0 \in \xi_{Q_n}(G,\cdot)\} $$
that the origin is the point of intersection of an infinite "cross" in V(G, ·).
By the subadditivity of μ, μ(X) ≥ 1 − 4z. We thus need only to show that X is contained in {ξ°_P(G, ·) ≠ ∅} μ-almost surely. This will be done in three steps. 1) For each k < 0 <
i1 ^ £, i2 = 0},
and we consider the events B(k,/) = {D(k,/)nV(G,-) 5W = # / ) J , O , - D ) .
= 0}, and
B=
f] k<0<(
BkJ.
(Here we use the notation of Lemma (18.15).) The significance of B is the following: The event B guarantees the existence of barriers which prevent V(G, •) from containing an infinite cluster which is limited to a strip of the form {i G 5:fc^ ij ^
Using the chessboard estimate (17.17) (or a direct Fubini-type argument) we conclude that λ^C(E^C∖G) = 0, in contradiction to the assumption at the beginning of this proof. Hence μ(B) = 1.

2) In this step we will show that, μ-almost surely on X, V(G, ·) contains a doubly infinite path in the upper half-plane which meets infinitely many points of each of the horizontal half-axes. For definiteness, we put Q_1 = {i ∈ S: i_1 ≥ 0, i_2 ≥ 0} and Q_2 = {i ∈ S: i_1 ≤ 0, i_2 ≥ 0}. We look at the events A = {0 ∈ ξ_{Q_1}(G, ·) ∩ ξ_{Q_2}(G, ·)} and A_∞ = A_∞(θ_{(1,0)}) ∩ A_∞(θ_{(−1,0)}). Clearly, X ⊂ A. By Poincaré's theorem (18.15), A ⊂ A_∞ μ-almost surely. Combining this with Step 1), we conclude that X ⊂ A_∞ ∩ B μ-almost surely. Now let F denote the event that V_{Q_1∪Q_2}(G, ·) contains an infinite cluster which meets each of the half-axes {i_2 = 0, i_1 ≥ 0} and {i_2 = 0, i_1 ≤ 0} infinitely often. The reader should check that F is measurable. We claim that A_∞ ∩ B ⊂ F. Indeed, if ω ∈ A_∞ then there exists a strictly increasing sequence (k_n)_{n∈Z} in Z with the property below: For each n ∈ Z there exist two infinite paths π_n^+ and π_n^− in V_{Q_1∪Q_2}(G, ω) with starting point (k_n, 0).
π_n^− runs on the left side of the vertical axis {i_1 = k_n, i_2 ∈ Z}, and π_n^+ extends on the right side. If, in addition, ω ∈ B then π_n^+ is bound to intersect π_{n+1}^−, for all n ∈ Z. Consequently, for each n ∈ Z there exists a path in V_{Q_1∪Q_2}(G, ω) which runs from (k_n, 0) to (k_{n+1}, 0). Sticking these paths together we see that ω ∈ F. This completes the proof of the inclusion A_∞ ∩ B ⊂ F. Combining this result with the previous inclusions we finally conclude that X ⊂ F μ-almost surely.

3) To complete the proof of the lemma we consider the event F_1 = F_∞(θ_{(0,−1)}) as well as the events F_2, F_3, F_4 which are obtained from F_1 by a rotation of S by the angles π/2, π, and 3π/2, respectively. By Poincaré's lemma, F ⊂ F_1 μ-almost surely. Hence, by Step 2), X ⊂ F_1 μ-almost surely. Interchanging the roles of the four quadrants we obtain by the same argument that X ⊂ F_n μ-almost surely for n = 2, 3, 4. Hence X ⊂ F_1 ∩ ⋯ ∩ F_4 μ-almost surely. However, if ω ∈ F_1 ∩ ⋯ ∩ F_4 then V(G, ω) contains an infinite network of doubly infinite paths which run along some horizontal or vertical line, in that they keep to the "outer" side of that line and return to it infinitely often; see Figure 18.2. Clearly this network is an ocean. Thus F_1 ∩ ⋯ ∩ F_4 ⊂ {ξ°_P(G, ·) ≠ ∅}, and the proof is complete. □
Figure 18.2 The event F_1 ∩ ⋯ ∩ F_4; cf. the proof of (18.16).

Combining the preceding lemmas we arrive at the following theorem. (18.17) Theorem. Let G ∈
$$ \mu(0 \in \xi^{\circ}_P(G,\cdot)) \ge 1 - 4z(t(G,\Phi)) $$
holds. In particular, if (E, ℰ) is a standard Borel space and z(t(G, Φ)) < 1/4 then there exists some μ ∈ 𝒢°(Φ) which is invariant under the shift-group (θ_i)_{i∈P} and the reflections r_1, ..., r_d and satisfies
M#(G,-)*0)=1. Proof. The first assertion is immediate from Lemmas (18.10), (18.14), and (18.16). The second statement then follows from Proposition (18.12) by setting H = v(- \&(G, •) ± 0) for some v e %( lim
inf
0,
/i(0e^(G£(
In particular, if (E, ℰ) is standard Borel and β is large enough then there exists some μ ∈ 𝒢°(βΦ) which inherits the symmetries (θ_i)_{i∈P} and r_1, ..., r_d of Φ and satisfies μ(ξ°_P(G_ε(Φ), ·) ≠ ∅) = 1. The above corollary is most appealing when d = 2 and thus P = S. It can then be rephrased as follows. In the low temperature limit β → ∞, the set 𝒢°(βΦ) approaches the set {
m
for all ieS} = K°(G0(
of all absolute ground states. More specifically, if β increases then with increasing probability the origin belongs to an ocean of spins which try to form an absolute ground state. The following example might serve as an illustration of this kind of approach to the absolute ground states. In particular, this example will show that a low energy ocean need not exist at high temperatures.

(18.19) Example. The classical Heisenberg model. Let d = 2, n ≥ 2, E be the unit sphere in R^n, and λ the surface measure on E. We define a C-potential Φ by putting
$$ \Phi_C(\omega) = -\sum_{\{i,j\}\subset C:\,|i-j|=1} \omega_i\cdot\omega_j. $$
(In view of Comment (17.19)(2), Φ is equivalent to a multiple of the potential
$$ \Phi_A(\omega) = \begin{cases} -\,\omega_i\cdot\omega_j & \text{if } A = \{i,j\},\ |i-j| = 1,\\ 0 & \text{otherwise} \end{cases} $$
which is usually called the Heisenberg potential.) Clearly, m(Φ) = −4, and the local ground states are the constant configurations on C. For each ε > 0 we have
$$ G_\varepsilon(\Phi) \subset \{\omega \in E^C : \omega_i\cdot\omega_j \ge 1-\varepsilon \text{ whenever } |i-j| = 1\}. $$
Corollary (18.18) thus shows that for sufficiently large β there exists some μ ∈ 𝒢°(βΦ) such that with probability 1 there exists an ocean with the property that any two adjacent spins in this ocean point in almost the same direction. Of course, this property does not imply any relation between the orientations of spins which are far apart from each other. This fact contrasts with the behaviour of those low energy oceans which will be considered in the next sections.

Next, we will show that μ(ξ(G_ε, ·) ≠ ∅) = 0 for all μ ∈ 𝒢(βΦ) when both β and ε are small. To this end we fix some integer ℓ ≥ 1. For each μ ∈ 𝒢(βΦ) and a ∈ S we can write
$$ \mu(a \in \xi(G_\varepsilon,\cdot)) \le \sum_D \mu\big(\omega_{i^{(k)}}\cdot\omega_{i^{(k+1)}} \ge 1-\varepsilon \text{ for all } 0 \le k < \ell\big), $$
where the sum runs over all self-avoiding paths D = {i^{(0)}, ..., i^{(ℓ)}} with i^{(0)} = a. There are at most 4·3^{ℓ−1} such paths. Moreover, for all i, j ∈ S with |i − j| = 1 and μ-almost all ω we obtain by a straightforward estimate
$$ \mu(\omega_i\cdot\omega_j \ge 1-\varepsilon \mid \mathcal{F}_{S\setminus\{i\}})(\omega) = \gamma^{\beta\Phi}_{\{i\}}(\omega_i\cdot\omega_j \ge 1-\varepsilon \mid \omega) \le e^{16\beta}\,\lambda((x_1,\dots,x_n)\in E : x_1 \ge 1-\varepsilon)/\lambda(E) = r(\beta,\varepsilon). $$
An iterated application of this inequality shows that
$$ \mu(a \in \xi(G_\varepsilon,\cdot)) \le 4\cdot 3^{\ell-1}\, r(\beta,\varepsilon)^{\ell} $$
for all ℓ ≥ 1 and a ∈ S. Consequently, if β and ε are so small that 3r(β, ε) < 1 then μ(a ∈ ξ(G_ε, ·)) = 0 for all a ∈ S. By the subadditivity of μ, this implies that μ(ξ(G_ε, ·) ≠ ∅) = 0. ○
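The combinatorial input of the last estimate is the bound 4·3^{ℓ−1} on the number of self-avoiding nearest-neighbour paths with ℓ steps in Z² that start at a fixed site. A small brute-force count illustrates it; the function name is an illustrative choice and the enumeration is only feasible for short paths.

```python
def count_self_avoiding_paths(length):
    """Count self-avoiding nearest-neighbour paths with `length` steps in Z^2 starting at the origin."""
    def extend(site, visited, remaining):
        if remaining == 0:
            return 1
        x, y = site
        total = 0
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb not in visited:
                visited.add(nb)
                total += extend(nb, visited, remaining - 1)
                visited.remove(nb)
        return total
    return extend((0, 0), {(0, 0)}, length)
```

The bound comes from 4 choices for the first step and at most 3 for each later step, since a path may not step straight back; it is attained for ℓ ≤ 3 and strict afterwards.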
18.2 Discrete symmetry breaking at low temperatures
In this section we will show how the existence of low energy oceans, together with a degeneracy of the local ground states, can give rise to a phase transition. As we shall see, for this to occur it is first essential that the degeneracy of local ground states induces a long range order on the ocean, in that knowledge of the actual spin pattern on any part of the ocean provides some information on the pattern of the ocean at infinity. The low energy ocean then establishes
a link between the microscopic and the macroscopic behaviour of the spins. In the presence of symmetries, this link will imply a phase transition.

We begin with specifying a class of transformations which we will consider below. Roughly speaking, this class consists of all transformations in T which are completely determined by their restriction to E^C. Specifically, we shall deal with the following two classes of transformations of E^C.

(i) Spatial transformations. These are the reflections r_1, ..., r_d which have been defined in (17.5), as well as the P-preserving rotation r_0: E^C → E^C which is given by the formula
$$ (r_0\omega)_i = \omega_{(i_2,\,1-i_1,\,i_3,\dots,i_d)} \qquad (i = (i_1,\dots,i_d) \in C,\ \omega \in E^C). \tag{18.20} $$
(ii) Pure spin transformations. These are all transformations τ: E^C → E^C of the form
$$ \tau\omega = (\tau_i\omega_i)_{i\in C} \qquad (\omega \in E^C), \tag{18.21} $$
where the τ_i's are λ-preserving invertible transformations of (E, ℰ).

Each of the preceding transformations can be extended in a natural way to a measurable invertible transformation of (Ω, ℱ) which will be denoted by the same letter. This is obvious for the transformations of type (i). In case (ii) we put τω = (τ_j ω_j)_{j∈S} (ω ∈ Ω), where τ_j = τ_i whenever j ∈ S and i ∈ C are such that j_k ≡ i_k mod 2 for all 1 ≤ k ≤ d. We let T_{C,λ} denote the transformation group on E^C resp. Ω which is generated by the transformations in (i) and (ii) above. Also, for each C-potential Φ we let I_{C,λ}(Φ) denote the subgroup of T_{C,λ} which consists of all symmetries of Φ in T_{C,λ}. In view of hypothesis (17.18)(iii), I_{C,λ}(Φ) certainly contains the iterated reflections r^i, i ∈ C. We also recall the following fact which was proved in Example (5.20)(3) and will soon turn out to be important. (18.22) Remark. If
(b) {G_1, ..., G_N} will be called Φ-symmetric if for all 1 ≤ m, n ≤ N there exists some τ ∈ I_{C,λ}(Φ) with τG_m = G_n.

What is the significance of a stable ground state degeneracy? To answer this we assume that the conditions of Definition (18.23)(a) hold, and we let ω ∈ Ω be any configuration. Suppose i, j ∈ V_P(G_ε(Φ), ω) are adjacent (i.e., |i − j| = 1). Since r^i(θ_{−i}ω)_C, r^j(θ_{−j}ω)_C ∈ G_ε(Φ), we can find two indices 1 ≤ m, n ≤ N such that r^i(θ_{−i}ω)_C ∈ G_m and r^j(θ_{−j}ω)_C ∈ G_n. However, the configurations r^i(θ_{−i}ω)_C and r^j(θ_{−j}ω)_C coincide on some P-face of C. The stability condition (18.23)(a)(ii) thus implies that m = n. In other words, there exists some 1 ≤ n ≤ N with i, j ∈ V_P(G_n, ω). This in turn implies that each connected subset of V_P(G_ε(Φ), ω) is a subset of V_P(G_n, ω) for some n. Consequently, there exists some n with ξ°_P(G_ε(Φ), ω) ⊂ ξ°_P(G_n, ω). Hence
$$ \{\xi^{\circ}_P(G_\varepsilon(\Phi),\cdot) \ne \emptyset\} \subset \bigcup_{n=1}^{N} \{\xi^{\circ}_P(G_n,\cdot) \ne \emptyset\}. \tag{18.24} $$
One should note that the sets on the right side of (18.24) are pairwise disjoint. Indeed, let m ≠ n be given. Then ξ°_P(G_m, ω) ∩ ξ°_P(G_n, ω) = ∅ for all ω ∈ Ω because G_m and G_n are disjoint. But the definition of an ocean implies that any two oceans have a non-void intersection. Hence ξ°_P(G_m, ω) = ∅ or ξ°_P(G_n, ω) = ∅ for all ω, and thus
$$ \{\xi^{\circ}_P(G_m,\cdot) \ne \emptyset\} \cap \{\xi^{\circ}_P(G_n,\cdot) \ne \emptyset\} = \emptyset. $$
The inclusion (18.24) provides a key to the following theorem which is the main result of this chapter.

(18.25) Theorem. Let (E, ℰ) be a standard Borel space and Φ any C-potential. Suppose Φ admits a Φ-symmetric P-stable partition {G_1, ..., G_N} of its local ground states. If β is sufficiently large, there exist N distinct Gibbs measures μ_1, ..., μ_N ∈ 𝒢(βΦ) which enjoy the following properties.
(i) For all 1 ≤ n ≤ N, μ_n(ξ°_P(G_n, ·) ≠ ∅) = 1. Moreover, if β → ∞ then μ_n(0 ∈ ξ°_P(G_n, ·)) → 1.
(ii) For each 1 ≤ n ≤ N, the measure μ_n is preserved by the shift-group Θ_n = {θ_i: i ∈ P, r^i G_n = G_n} as well as all transformations τ ∈ I_{C,λ}(Φ) which satisfy τG_n = G_n.
(iii) If 1 ≤ m, n ≤ N and τ ∈ I_{C,λ}(Φ) are such that τG_m = G_n then τ(μ_m) = μ_n. If, in particular, τ = r_1 or r_2 then, in addition, θ_u(μ_m) = μ_n, where u is the unit vector in direction 1 resp. 2.
The potential βΦ thus exhibits a breaking of all symmetries τ ∈ I_{C,λ}(Φ) which map some G_m onto a different G_n, n ≠ m. In particular, if Θ = ∩_{n=1}^{N} Θ_n then |ex 𝒢_Θ(βΦ)| ≥ N.

Proof. Let ε > 0 be so small that G_ε(Φ) ⊂ G_1 ∪ ⋯ ∪ G_N and β so large that
δ(β) = 4z(t(G_ε(Φ), βΦ)) < 1. (This is possible because of Remark (18.9)(4).) By Theorem (18.17), there exists some μ ∈ 𝒢°(βΦ) with μ(ξ°_P(G_ε(Φ), ·) ≠ ∅) ≥ 1 − δ(β). We put A_n = {ξ°_P(G_n, ·) ≠ ∅}, 1 ≤ n ≤ N. The inclusion (18.24) then ensures that
$$ \sum_{n=1}^{N} \mu(A_n) > 0. $$
Since {G_1, ..., G_N} is Φ-symmetric, the events A_1, ..., A_N are related to each other by suitable transformations in I_{C,λ}(Φ). In view of Remark (18.22), this implies that μ(A_1) = ⋯ = μ(A_N). We thus conclude that μ(A_n) > 0 for all n. The rest of the proof is routine. We introduce the conditional probabilities μ_n = μ(· | A_n), 1 ≤ n ≤ N. Since A_n ∈ 𝒯, Theorem (7.7)(b) yields that μ_n ∈ 𝒢(βΦ). As A_1, ..., A_N are pairwise disjoint, the measures μ_1, ..., μ_N are mutually singular and thereby distinct. Also, since A_n and μ are invariant under the shifts in Θ_n as well as all G_n-preserving transformations in I_{C,λ}(Φ), μ_n is invariant under the same class of transformations. This proves assertion (ii), and the same kind of argument establishes (iii). To complete the proof of (i) we put G = G_1 ∪ ⋯ ∪ G_N. Using the hypothesis of Φ-symmetry and the I_{C,λ}(Φ)-invariance of μ, we obtain μ_n(0 ∈ ξ°_P(G_n, ·)) = μ(0 ∈ ξ°_P(G_n, ·))/μ(A_n) = μ(0 ∈ ξ°_P(G, ·))/μ(ξ°_P(G, ·) ≠ ∅) ≥ μ(0 ∈ ξ°_P(G_ε(
18.3 Examples
In this section we shall visit a zoo of various classical models which are known for their symmetry breaking at low temperatures. In all these models it is easy to find a stable symmetric partition of the local ground states, so that the symmetry breaking is an immediate consequence of Theorem (18.25). Most of the models are usually defined in terms of a nearest-neighbour or next-nearest neighbour potential. Therefore, in order to apply Theorem (18.25) we need to pass to an equivalent C-potential. This is possible because of Comment (17.19)(2). Except in the last example which needs three dimensions, we shall always assume for simplicity that d = 2. (The extension to higher dimensions provides no difficulties.) A configuration ω ∈ E^C can then be described conveniently by writing
$$ \begin{pmatrix} w & z\\ u & v \end{pmatrix} $$
when u = ω_{(0,0)}, v = ω_{(1,0)}, w = ω_{(0,1)}, z = ω_{(1,1)}.

18.3.1 The Ising ferromagnet
Let E = {−1, 1}, λ be counting measure, and Φ be the unique C-potential with
$$ \Phi_C(\omega) = -\sum_{\{i,j\}\subset C:\,|i-j|=1} \omega_i\,\omega_j. \tag{18.27} $$
Up to a factor of 2,
18.3.2 The Ising antiferromagnet in an external field
Let E and λ be as above. We consider the C-potential Φ which is given by
$$ \Phi_C(\omega) = \sum_{\{i,j\}\subset C:\,|i-j|=1} \omega_i\,\omega_j \;-\; h\sum_{i\in C}\omega_i. \tag{18.28} $$
Thus the first term differs from the ferromagnetic case (18.27) only in its sign. The real parameter h represents the action of an external magnetic field. We assume that |h| < 2.
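$$ \Phi_C = -J_1 \sum_{\{i,j\}\subset C:\,|i-j|=1} \omega_i\,\omega_j \;-\; J_2 \sum_{\{i,j\}\subset C:\,|i-j|=\sqrt{2}} \omega_i\,\omega_j \;-\; h\sum_{i\in C}\omega_i. $$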
Thus h is an external field, and J_1 and J_2 are the nearest-neighbour resp. next-nearest neighbour coupling constants. (Positive values of J_1 or J_2 correspond to a ferromagnetic coupling.) I_{C,λ}(Φ) always contains the rotation r_0 and the reflections r_1 and r_2. If h = 0 then I_{C,λ}(Φ) also contains the spin-flip, and if h = J_1 = 0 then the sublattice spin-flip ω → ((−1)^{i_1+i_2} ω_i)_{i∈S} is a further symmetry of
Figure 18.3 The phase diagram of the Gertsik-Dobrushin model. The "etc." stands for the rotation images of the local ground state shown.
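The local ground states of a plaquette energy with nearest-neighbour coupling J_1, next-nearest neighbour coupling J_2 and field h can be found by brute force, since C carries only 16 spin configurations. The sketch below assumes the form Φ_C = −J_1 Σ_{edges} ω_i ω_j − J_2 Σ_{diagonals} ω_i ω_j − h Σ_{i∈C} ω_i for d = 2 and E = {−1, 1}; the function and variable names are illustrative and not taken from the text.

```python
from itertools import product

# Sites of the unit square C in Z^2.
SITES = [(0, 0), (1, 0), (0, 1), (1, 1)]

def plaquette_energy(omega, j1, j2, h):
    """Energy on C of a configuration omega (dict: site -> +1 or -1)."""
    energy = -h * sum(omega.values())
    for a in range(4):
        for b in range(a + 1, 4):
            i, j = SITES[a], SITES[b]
            dist2 = (i[0] - j[0]) ** 2 + (i[1] - j[1]) ** 2
            # Squared distance 1 is an edge of C, squared distance 2 a diagonal.
            coupling = j1 if dist2 == 1 else j2
            energy -= coupling * omega[i] * omega[j]
    return energy

def local_ground_states(j1, j2, h):
    """Return all configurations on C that minimize the plaquette energy."""
    configs = [dict(zip(SITES, spins)) for spins in product((-1, 1), repeat=4)]
    energies = [plaquette_energy(w, j1, j2, h) for w in configs]
    lowest = min(energies)
    return [w for w, e in zip(configs, energies) if abs(e - lowest) < 1e-12]
```

For instance, `local_ground_states(1.0, 0.5, 0.0)` yields the two constant configurations, while `local_ground_states(-1.0, 0.0, 0.5)` yields the two chessboard configurations.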
Theorem (18.25) applies in each of the following four cases.
(a) J_1 > 0, J_2 + J_1 > 0, h = 0. The set of local ground states then consists of the configurations $\begin{pmatrix} + & +\\ + & + \end{pmatrix}$ and $\begin{pmatrix} - & -\\ - & - \end{pmatrix}$, and we obtain the same result as in the case of the Ising ferromagnet.
(b) J_1 < 0, J_2 > J_1, |h| < −2J_1 + (J_2 ∧ 0). In this case there are again two ground states, namely $\begin{pmatrix} + & -\\ - & + \end{pmatrix}$ and $\begin{pmatrix} - & +\\ + & - \end{pmatrix}$. Theorem (18.25) thus implies the same result as in Example 18.3.2 (which corresponds to the particular choice J_1 = −1, J_2 = 0).
(c) J_1 = 0, J_2 > 0, h = 0. These parameter values constitute a half-line in R³ which is the common border of the regions (a) and (b). The Gibbs specifications yß