= g, and τ0 = RC. Wτ,
(Qτ); (b) the relation Qτ = Wτ − ΔUτ as the apparent origin of a Gaussian-like central part of P(Qτ), as well as of the linear initial behavior of p up to 1. Physically, these features have the appearance of being more general and could possibly lead to a new, broader picture of non-equilibrium fluctuations. 3. However, no consistent picture has as yet been obtained of the theoretical connection between the two Fluctuation Theorems, the CFT and the EFR, nor of the precise implications of their contents. It seems possible, though, that they will provide an important hint on Fluctuation Theorems in non-equilibrium stationary states in general, so that a deeper investigation of them seems warranted.

Acknowledgement
The authors are indebted to Professors Garnier and Ciliberto of the ENS-Lyon for providing us with Figs. 5a,b. Support from the Office of Basic Energy Sciences of the US Department of Energy under grant DE-FG02-88-ER13847 is also gratefully acknowledged.
References
1. D. Ruelle, J. Stat. Phys. 95, 393 (1999).
2. D. Ruelle, Physics Today, May 2004, p. 48.
3. S. R. de Groot and P. Mazur, Non-Equilibrium Thermodynamics (Dover Publications, Inc., New York, 1984).
4. D. J. Evans, E. G. D. Cohen, and G. P. Morriss, Phys. Rev. Lett. 71, 2401 (1993).
5. G. Gallavotti and E. G. D. Cohen, Phys. Rev. Lett. 74, 2694 (1995).
6. G. Gallavotti and E. G. D. Cohen, J. Stat. Phys. 80, 931 (1995).
7. R. van Zon and E. G. D. Cohen, Phys. Rev. E 67, 046102 (2003).
8. R. van Zon and E. G. D. Cohen, Phys. Rev. Lett. 91, 110601 (2003).
9. R. van Zon and E. G. D. Cohen, Phys. Rev. E 69, 056121 (2004).
10. R. van Zon and E. G. D. Cohen, Physica A 340, 66 (2004).
11. R. van Zon, S. Ciliberto, and E. G. D. Cohen, Phys. Rev. Lett. 92, 130601 (2004).
12. F. Bonetto, G. Gallavotti, and P. L. Garrido, Physica D 105, 226 (1997).
13. S. Lepri, R. Livi, and A. Politi, Physica D 119, 140 (1998).
14. G. Ayton and D. J. Evans, J. Stat. Phys. 97, 811 (1999).
15. F. Bonetto and J. L. Lebowitz, Phys. Rev. E 64, 056129 (2001).
16. E. Mittag, D. J. Searles, and D. J. Evans, J. Chem. Phys. 116, 6875 (2002).
17. S. Ciliberto and C. Laroche, J. Phys. IV (France) 8, Pr6-215 (1998).
18. S. Ciliberto, N. Garnier, S. Hernandez, C. Lacpatia, J.-F. Pinton, and G. Ruiz Chavarria, Physica A 340, 240 (2004).
19. K. Feitosa and N. Menon, Phys. Rev. Lett. 92, 164301 (2004).
20. N. Garnier and S. Ciliberto, preprint arXiv:cond-mat/0407574 (2004).
21. D. J. Evans and D. J. Searles, Phys. Rev. E 50, 1645 (1994).
22. E. G. D. Cohen and G. Gallavotti, J. Stat. Phys. 96, 1343 (1999).
23. G. Gallavotti, Math. Phys. Electronic J. 1, 1 (1995).
24. G. Gallavotti, Phys. Rev. Lett. 77, 4334 (1996).
25. G. Gallavotti and D. Ruelle, Commun. Math. Phys. 190, 279 (1997).
26. D. J. Searles and D. J. Evans, J. Chem. Phys. 112, 9727 (2000); D. J. Evans and D. J. Searles, Advances in Phys. 51, 1529 (2002).
27. D. J. Evans and G. P. Morriss, Statistical Mechanics of Nonequilibrium Liquids (Academic Press, London, 1990).
28. J. Kurchan, J. Phys. A: Math. Gen. 31, 3719 (1998).
29. J. L. Lebowitz and H. Spohn, J. Stat. Phys. 95, 333 (1999).
30. D. J. Searles and D. J. Evans, Phys. Rev. E 60, 159 (1999).
31. G. M. Wang, E. M. Sevick, E. Mittag, D. J. Searles, and D. J. Evans, Phys. Rev. Lett. 89, 050601 (2002).
32. R. M. Mazo, Brownian Motion (Clarendon Press, Oxford, 2003), p. 178.
IS THE ENTROPY S_q EXTENSIVE OR NONEXTENSIVE?
CONSTANTINO TSALLIS

Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
E-mail: [email protected]
and
Centro Brasileiro de Pesquisas Fisicas, Rua Xavier Sigaud 150, Urca 22290-180, Rio de Janeiro, Brazil

The cornerstones of Boltzmann-Gibbs and nonextensive statistical mechanics respectively are the entropies S_BG ≡ −k Σ_{i=1}^{W} p_i ln p_i and S_q ≡ k (1 − Σ_{i=1}^{W} p_i^q)/(q − 1) (q ∈ ℝ; S_1 = S_BG). Through them we revisit the concept of additivity, and illustrate the (not always clearly perceived) fact that (thermodynamical) extensivity has a well defined sense only if we specify the composition law that is being assumed for the subsystems (say A and B). If the composition law is not explicitly indicated, it is tacitly assumed that A and B are statistically independent. In this case, it immediately follows that S_BG(A + B) = S_BG(A) + S_BG(B), hence extensive, whereas S_q(A + B)/k = [S_q(A)/k] + [S_q(B)/k] + (1 − q)[S_q(A)/k][S_q(B)/k], hence nonextensive for q ≠ 1. In the present paper we illustrate the remarkable changes that occur when A and B are specially correlated. Indeed, we show that, in such case, S_q(A + B) = S_q(A) + S_q(B) for the appropriate value of q (hence extensive), whereas S_BG(A + B) ≠ S_BG(A) + S_BG(B) (hence nonextensive). We believe that these facts substantially improve the understanding of the mathematical need and physical origin of nonextensive statistical mechanics, and its interpretation in terms of the effective occupation of the W a priori available microstates of the full phase space. In particular, we can appreciate the origin of the following important fact. In order to have entropic extensivity (i.e., lim_{N→∞} S(N)/N < ∞, where N ≡ number of elements of the system), we must use (i) S_BG, if the number W_eff of effectively occupied microstates increases with N like W_eff ∼ W ∼ μ^N (μ ≥ 1); (ii) S_q with q = 1 − 1/ρ, if W_eff ∼ N^ρ < W (ρ ≥ 0). We had previously conjectured the existence of these two markedly different classes.
The contribution of the present paper is to illustrate, for the first time as far as we can tell, the derivation of these facts directly from the set of probabilities of the W microstates.
1. Introduction
A quantity X(A) associated with a system A is said to be additive with regard to a (specific) composition of A and B if it satisfies

X(A + B) = X(A) + X(B),   (1)

where the + inside the argument of X precisely indicates that composition. For example, suppose we partition the interior of a single closed bottle into two parts. If no chemical or other reactions occur between the gas molecules that might be inside the bottle, nor between these molecules and the bottle itself (and its internal physical partition), the number of gas molecules is an additive quantity with regard to
the elimination of the partition surface. The same happens with the total energy of an ideal gas, where all interactions have been neglected, including the gravitational one. More trivially, the total height of various (rectangular) doors is, practically speaking, an additive quantity if we pile them one above the other. Not so if we put them side by side! On an abstract level, it is clear that this additivity just corresponds to the number of elements of the union of two sets A and B that have no common elements. If, instead of two subsystems A and B, we have N of them (A_1, A_2, ..., A_N), then we have that

X(Σ_{i=1}^{N} A_i) = Σ_{i=1}^{N} X(A_i).   (2)
If the subsystems happen to be all equal (a quite common case), then we have that

X(N) = N X(1),   (3)

with the notations X(N) ≡ X(Σ_{i=1}^{N} A_i) and X(1) ≡ X(A_1). An intimately related concept is that of extensivity. It appears frequently in thermodynamics and elsewhere, and corresponds to a weaker demand, namely that of

lim_{N→∞} X(N)/N < ∞.   (4)
Clearly, all quantities that are additive with regard to a given composition are also extensive with regard to that same composition (and lim_{N→∞} X(N)/N = X(1)), whereas the opposite is not necessarily true. For example, the total energy, the total entropy and the total magnetization of the standard Ising ferromagnetic model with N spins on a square lattice are extensive but not additive quantities. In other words, they are asymptotically additive, but not strictly additive. Of course, there are quantities that are neither additive nor even extensive. They are called nonextensive. All types of behaviors can exist, such as X(N) ∝ N^γ (γ ≥ 0). For instance, thermodynamical quantities that, with regard to some specific composition, exhibit γ = 0 are called intensive. Such is the case of the temperature, pressure, chemical potential and similar quantities in a great variety of (thermodynamically equilibrated) systems observed in nature. A less trivial example of a nonextensive quantity emerges within a spatially homogeneous d-dimensional classical gas whose N particles (exclusively) interact through a two-body interaction potential that is strongly repulsive at short distances whereas it is attractive at long distances, decaying like 1/r^α (r ≡ distance between two particles), with α/d ≥ 0. The total potential energy of such a system corresponds to γ = 2 − α/d if 0 ≤ α/d < 1 (i.e., nonextensive), and to γ = 1 for α/d > 1 (i.e., extensive). The total potential energy of this particular model has a logarithmic N-dependence (i.e., nonextensive) at the limiting value α/d = 1. The Lennard-Jones model for gases corresponds to (α, d) = (6, 3), and therefore has an extensive total energy. In contrast, if we assume a cluster of
stars gravitationally interacting (together with some physical mechanism effectively generating repulsion at short distances), we have (α, d) = (1, 3), hence nonextensivity for the total potential energy. The physical nonextensivity which naturally emerges in such anomalous systems is, in some theoretical approaches, disguised by artificially dividing the two-body coupling constant (which has in fact no means of "knowing" the total number of particles of the entire system) by N^{1−α/d}. For the particular case α = 0 this yields the widely (and wildly!) used division by N of the coupling constant, typical of a variety of mean-field approaches. See the literature for more details. Boltzmann-Gibbs (BG) statistical mechanics is based on the entropy

S_BG = −k Σ_{i=1}^{W} p_i ln p_i,   (5)

with

Σ_{i=1}^{W} p_i = 1,   (6)
where p_i is the probability associated with the i-th microscopic state of the system, and k is Boltzmann's constant. In the particular case of equiprobability, i.e., p_i = 1/W (∀i), Eq. (5) yields the celebrated Boltzmann principle (as named by Einstein [3]):
S_BG = k ln W.   (7)
From now on, and without loss of generality, we shall take k equal to unity. Nonextensive statistical mechanics, first introduced in 1988 [4,5,6] (see [7-15] for reviews), is based on the so-called "nonextensive" entropy S_q, defined as follows:

S_q ≡ (1 − Σ_{i=1}^{W} p_i^q)/(q − 1)   (q ∈ ℝ; S_1 = S_BG).   (8)
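As a quick numerical sanity check of Eq. (8) and of its q → 1 limit, the entropy can be sketched in a few lines of Python (the probability values below are arbitrary illustrations, not taken from the paper):

```python
import math

def S_q(probs, q):
    """Tsallis entropy S_q = (1 - sum_i p_i^q)/(q - 1), with k = 1.
    Zero-probability states are skipped (0^q -> 0 convention)."""
    if q == 1.0:  # Boltzmann-Gibbs limit S_1 = -sum p ln p
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

p = [0.5, 0.3, 0.2]            # an arbitrary illustrative distribution
s_bg = S_q(p, 1.0)
s_near1 = S_q(p, 1.0 + 1e-8)   # S_q is continuous at q = 1
print(abs(s_near1 - s_bg) < 1e-6)  # True
```

The continuity check at q = 1 reflects the statement S_1 = S_BG above.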
For equiprobability (i.e., p_i = 1/W, ∀i), Eq. (8) yields

S_q = ln_q W,   (9)

with the q-logarithm function defined [16] as

ln_q z ≡ (z^{1−q} − 1)/(1 − q)   (ln_1 z = ln z).   (10)

The inverse function, the q-exponential, is given by

e_q^z ≡ [1 + (1 − q) z]^{1/(1−q)}   (11)

if the argument 1 + (1 − q)z is positive, and equals zero otherwise. The present paper is entirely dedicated to the analysis of the additivity or nonadditivity of S_BG and of its generalization S_q. However, following a common (and
sometimes dangerous) practice, we shall from now on cease distinguishing between additive and extensive, and use exclusively the word extensive in the sense of strictly additive.

2. The case of two subsystems
Consider two systems A and B having respectively W_A and W_B possible microstates. The total number of possible microstates for the system A + B is then in principle W = W_{A+B} = W_A W_B. We emphasize the expression "in principle" because, as we shall see, a more or less severe reduction of the full phase space might occur in the presence of strong correlations between A and B. We shall use the notation p_{ij}^{A+B} (i = 1, 2, ..., W_A; j = 1, 2, ..., W_B) for the joint probabilities, hence

Σ_{i=1}^{W_A} Σ_{j=1}^{W_B} p_{ij}^{A+B} = 1.   (12)
The marginal probabilities are defined as follows:

p_i^A ≡ Σ_{j=1}^{W_B} p_{ij}^{A+B},   (13)

hence

Σ_{i=1}^{W_A} p_i^A = 1,   (14)

and

p_j^B ≡ Σ_{i=1}^{W_A} p_{ij}^{A+B},   (15)

hence

Σ_{j=1}^{W_B} p_j^B = 1.   (16)
These quantities are indicated in the following Table.
We shall next illustrate the importance of the specification of the composition law. Let us consider two cases, namely independent and (specially) correlated subsystems.
2.1. Two independent subsystems

Consider a system composed of two independent subsystems A and B, i.e., such that the joint probabilities are given by

p_{ij}^{A+B} = p_i^A p_j^B   (∀(i, j)).   (17)
With the definitions

S_BG(A + B) ≡ −Σ_{i=1}^{W_A} Σ_{j=1}^{W_B} p_{ij}^{A+B} ln p_{ij}^{A+B},   (18)

S_BG(A) ≡ −Σ_{i=1}^{W_A} p_i^A ln p_i^A,   (19)

S_BG(B) ≡ −Σ_{j=1}^{W_B} p_j^B ln p_j^B,   (20)

we immediately verify that
S_BG(A + B) = S_BG(A) + S_BG(B)   (21)

and, analogously, that

S_q(A + B) = S_q(A) + S_q(B) + (1 − q) S_q(A) S_q(B).   (22)
Therefore, S_BG is extensive. Consistently, S_q is, unless q = 1, nonextensive. It is in fact from property (22) that the q ≠ 1 statistical mechanics we are referring to has been named nonextensive.
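Both composition laws (21) and (22) can be verified directly from an independent joint distribution; a minimal numerical sketch (the marginal distributions are arbitrary illustrations):

```python
import math
from itertools import product

def S_q(probs, q):
    """Tsallis entropy (k = 1); q = 1 reduces to -sum p ln p."""
    if q == 1.0:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

pA, pB = [0.7, 0.3], [0.6, 0.4]              # arbitrary marginal distributions
joint = [a * b for a, b in product(pA, pB)]  # independence: p_ij = p_i^A p_j^B

# Eq. (21): S_BG is additive
assert abs(S_q(joint, 1.0) - (S_q(pA, 1.0) + S_q(pB, 1.0))) < 1e-12

# Eq. (22): S_q picks up the cross term (1 - q) S_q(A) S_q(B)
q = 0.5
lhs = S_q(joint, q)
rhs = S_q(pA, q) + S_q(pB, q) + (1 - q) * S_q(pA, q) * S_q(pB, q)
assert abs(lhs - rhs) < 1e-12
print("both composition laws verified")
```

Both identities are exact for product distributions, so the tolerances only absorb floating-point noise.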
2.2. Two specially correlated subsystems

Consider now that A and B are correlated, i.e.,

p_{ij}^{A+B} ≠ p_i^A p_j^B.   (23)

Assume moreover, for simplicity, that both systems A and B are equal, and that W_A = W_B = 2. Assume finally that the joint probabilities are given by the following Table (with 1/2 < p < 1), where the right column and the bottom row indicate the marginal probabilities:

A\B |    1    |   2   ||
 1  | 2p − 1  | 1 − p ||   p
 2  |  1 − p  |   0   || 1 − p
    ||   p    | 1 − p ||
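Taking the Table entries at face value (p_11 = 2p − 1, p_12 = p_21 = 1 − p, p_22 = 0), its marginals and its q = 0 entropy can be checked in a few lines; a minimal sketch with an arbitrary p in (1/2, 1):

```python
p = 0.8  # any value in (1/2, 1); chosen arbitrarily for illustration
joint = [2*p - 1, 1 - p, 1 - p, 0.0]     # Table entries p11, p12, p21, p22

def S0(probs):
    """S_0 simply counts occupied states: S_0 = W_eff - 1 (k = 1)."""
    return sum(1 for x in probs if x > 0) - 1

marginal_A = [joint[0] + joint[1], joint[2] + joint[3]]   # rows -> [p, 1 - p]
assert abs(marginal_A[0] - p) < 1e-12

# S_0 additivity for this special correlation: 2 = 1 + 1
assert S0(joint) == S0(marginal_A) + S0(marginal_A)
print("W_eff =", sum(1 for x in joint if x > 0))  # 3 occupied states
```

Note that the state (2,2) carries zero probability, which is exactly what makes only three of the four a priori possible states effective.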
It can be trivially verified that Eq. (21) is not satisfied. Therefore, for this special correlation, S_BG is nonextensive. It can also be verified that, for q = 0 and only for q = 0, the following additivity is satisfied:

S_0(A + B) = S_0(A) + S_0(B),   (24)

therefore S_0 is extensive. Indeed S_0(A + B) = 2 S_0(A) = 2. We immediately see that, depending on the type of correlation (or lack of it) between A and B, the entropy which is extensive (reminder: as previously announced, we are using here and in the rest of the paper "extensive" to strictly mean "additive") can be S_BG or a different one. Before going on, let us introduce right away the distinction between a priori possible states (in number W) and allowed or effective states (in number W_eff). Let us consider the above case of two equal binary subsystems A and B, and consequently W = 4. If they are independent (i.e., the q = 1 case), their generic case corresponds to 0 < p < 1, hence W_eff = 4. But if they have the above special correlation (i.e., the q = 0 case), their generic case corresponds to 1/2 < p < 1, hence W_eff = 3 (indeed, the state (2,2), although possible a priori, has zero probability). This type of distinction is at the basis of this entire paper. Notice also that the q = 1 and q = 0 cases can be unified through W_eff = [2^{1−q} + 2^{1−q} − 1]^{1/(1−q)} = [2^{2−q} − 1]^{1/(1−q)}. This specific unification will be commented on later. Let us further construct on the above observations. Is it possible to unify, at the level of the joint probabilities, the case of independence (which corresponds to q = 1) with the specially correlated case that we just analyzed (which corresponds to q = 0)? Yes, it is possible. Consider the following Table:
A\B |      1       |        2         ||
 1  |   f_q(p)     |   p − f_q(p)     ||   p
 2  | p − f_q(p)   | 1 − 2p + f_q(p)  || 1 − p
    ||     p       |      1 − p       ||

where f_q(p) is given by the following relation:

2p^q + 2(1 − p)^q − [f_q]^q − 2[p − f_q]^q − [1 − 2p + f_q]^q = 1,   (25)

with f_q(1) = 1, and 0 ≤ q ≤ 1 (later on we shall comment on values outside this interval). Typical curves f_q(p) are indicated in Fig. 1. Since Eq. (25) is an implicit one, they have been calculated numerically. It can be checked, for instance, that f_q(1/2) smoothly increases from zero to 1/4 when q increases from zero to unity, being very flat in the neighborhood of q = 0, and rather steep in the neighborhood of q = 1. The interesting point, however, is that it can be straightforwardly verified that, for the value of q chosen in the f_q(p) defined through Eq. (25) (and only for that
q),

S_q(A + B) = 2 S_q(A) = 2 [1 − p^q − (1 − p)^q]/(q − 1),   (26)

where we have used the fact that A = B. In other words, we are facing a whole family of entropies that are extensive for the respective special correlations indicated in the Table just above.
Figure 1. The function f_q(p), corresponding to the two-system A = B case (with W_A = W_B = 2), for typical values of q ∈ [0,1]. A few typical nontrivial (q, f_q(1/2)) points are (0.4, 0.043295), (0.5, 0.064765), (0.6, 0.087262), (0.7, 0.111289), (0.8, 0.138255), (0.9, 0.171838), (0.99, 0.225630). It can be easily verified that these values satisfy the relation 2^{1−q} − [f_q(1/2)]^q − [(1/2) − f_q(1/2)]^q = 1/2, which is the simple form that Eq. (25) takes for the p = 1/2 particular case. We also remind the trivial values f_0(1/2) = 0 and f_1(1/2) = 1/4.
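Since Eq. (25) is implicit, f_q(p) must be obtained numerically, as the caption notes. A minimal bisection sketch (the bracketing below is an assumption tailored to the regime 0 ≤ q ≤ 1, 1/2 ≤ p < 1 discussed here; it follows the branch continuously connected to f_0(p) = 2p − 1), together with a check of the extensivity property (26):

```python
import math

def S_q(probs, q):
    """Tsallis entropy (k = 1)."""
    if q == 1.0:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def eq25(p, q, f):
    """LHS minus RHS of Eq. (25); f_q(p) is a root of this function."""
    return 2*p**q + 2*(1 - p)**q - f**q - 2*(p - f)**q - (1 - 2*p + f)**q - 1.0

def f_q(p, q):
    """Root of Eq. (25) in [max(0, 2p - 1), p], found by bisection."""
    lo = max(0.0, 2*p - 1) + 1e-12   # all Table entries must stay >= 0
    hi = p - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if eq25(p, q, lo) * eq25(p, q, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print(f_q(0.5, 0.5))  # approx 0.064765, cf. the Fig. 1 caption

# Eq. (26): with this f_q(p), S_q is strictly additive for the correlated Table
p, q = 0.7, 0.5
f = f_q(p, q)
joint = [f, p - f, p - f, 1 - 2*p + f]
assert abs(S_q(joint, q) - 2 * S_q([p, 1 - p], q)) < 1e-9
```

The final assertion is just Eq. (25) restated: imposing it is algebraically equivalent to S_q(A + B) = 2 S_q(A).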
Let us proceed and generalize the previous examples to two-state systems A and B that are not necessarily equal. The case of independence is trivial, and is indicated in the following Table:
A\B |      1      |      2      ||
 1  | p_1^A p_1^B | p_1^A p_2^B ||  p_1^A
 2  | p_2^A p_1^B | p_2^A p_2^B ||  p_2^A
    ||   p_1^B    |    p_2^B    ||

Of course, Eq. (21) is satisfied. Let us consider now the following Table (with p_1^A + p_1^B > 1):

A\B |         1         |     2     ||
 1  | p_1^A + p_1^B − 1 | 1 − p_1^B ||   p_1^A
 2  |     1 − p_1^A     |     0     || 1 − p_1^A
    ||      p_1^B       | 1 − p_1^B ||
We verify that Eq. (24) is satisfied. Is it possible to unify the above anisotropic q = 1 and q = 0 cases? Yes, it is. The special correlations for these cases are indicated in the following Table:
A\B |            1             |                  2                   ||
 1  |    f_q(p_1^A, p_1^B)     |      p_1^A − f_q(p_1^A, p_1^B)       ||   p_1^A
 2  | p_1^B − f_q(p_1^A, p_1^B) | 1 − p_1^A − p_1^B + f_q(p_1^A, p_1^B) || 1 − p_1^A
    ||          p_1^B          |              1 − p_1^B               ||

where f_q(p_1^A, p_1^B) = f_q(p_1^B, p_1^A), f_q(p, 1) = p, f_q(p, p) = f_q(p), f_1(p_1^A, p_1^B) = p_1^A p_1^B, and f_0(p_1^A, p_1^B) = p_1^A + p_1^B − 1. For any value of q in the interval [0, 1], and for any probabilistic pair (p_1^A, p_1^B), the function f_q(p_1^A, p_1^B) is (implicitly) defined through

(p_1^A)^q + (1 − p_1^A)^q + (p_1^B)^q + (1 − p_1^B)^q − [f_q(p_1^A, p_1^B)]^q − [p_1^A − f_q(p_1^A, p_1^B)]^q − [p_1^B − f_q(p_1^A, p_1^B)]^q − [1 − p_1^A − p_1^B + f_q(p_1^A, p_1^B)]^q = 1.   (27)
(We remind that, for the q = 0 particular case, it must be p_1^A + p_1^B > 1.) We notice that the special correlations we are addressing here make all joint probabilities expressible as functions of only one of them, say p_{11}^{A+B}, which is determined once and for all. More explicitly, we have that p_{12}^{A+B} = p_1^A − p_{11}^{A+B}, p_{21}^{A+B} = p_1^B − p_{11}^{A+B}, and p_{22}^{A+B} = 1 − p_1^A − p_1^B + p_{11}^{A+B}. Eq. (27) recovers Eq. (25) as the particular instance p_1^A = p_1^B. And we can easily verify that, for 0 ≤ q ≤ 1,

S_q(A + B) = S_q(A) + S_q(B).   (28)

So, we still have extensivity for the appropriate value of q, i.e., the value of q which has been chosen in Eq. (27) to define the function f_q(x, y) reflecting the special type of correlations assumed to exist between A and B. In other words, when the marginal probabilities carry all the information, then the appropriate entropy is S_BG. But this happens only when A and B are independent. In all the other cases addressed within the above Table, the important information is by no means contained in the marginal probabilities, and we have to rely on the full set of joint probabilities. In such cases, S_BG is nonextensive, whereas S_q is extensive. Before closing this section dedicated to the case of two systems, let us indicate the Table associated with the q = 0 entropy for arbitrary systems A and B:
A\B |         1         |   2   | ... |    W_B    ||
  1 | p_1^A + p_1^B − 1 | p_2^B | ... | p_{W_B}^B ||  p_1^A
  2 |      p_2^A        |   0   | ... |     0     ||  p_2^A
... |       ...         |  ...  | ... |    ...    ||   ...
W_A |    p_{W_A}^A      |   0   | ... |     0     || p_{W_A}^A
    ||      p_1^B       | p_2^B | ... | p_{W_B}^B ||
We easily verify that Eq. (24) is satisfied. For example, the generic case corresponds to all probabilities in the Table being nonzero, excepting those explicitly indicated in the Table. For this case we have S_0(A) = W_A − 1, S_0(B) = W_B − 1, and S_0(A + B) = W_A + W_B − 2. This is a neat illustration of the fact that, although the full space admits in principle W = W_A W_B microstates, the strong correlations reflected in the Table make the system use appreciably fewer, namely, in this example, W_eff = W_A + W_B − 1. It is tempting to conjecture the generalization of this expression into W_eff = [W_A^{1−q} + W_B^{1−q} − 1]^{1/(1−q)} for 0 ≤ q ≤ 1. It is clear that W_eff ≤ W_A W_B, the equality holding only for q = 1. Since, strictly speaking, W_A, W_B and W_eff are integer numbers, this expression for W_eff can only be generically valid for real q ≠ 0, 1 in some appropriate asymptotic sense. This sense has to be for W_A, W_B ≫ 1, which however is not fully addressed in the present paper for q ≠ 0, 1. For the particular instance A = B, we have W_eff = [2 W_A^{1−q} − 1]^{1/(1−q)}. We also verify another interesting aspect. If A and B are independent, equal values in the marginal probabilities are perfectly compatible with equal values in the joint probabilities. In the most general independent two-system case, we can simultaneously have p_i^A = 1/W_A (∀i), p_j^B = 1/W_B (∀j), and p_{ij}^{A+B} = 1/(W_A W_B) (∀(i, j)). This is not possible in the above Table. Indeed, equal probability values for all allowed microstates in the Table imply p_{ij}^{A+B} = 1/(W_A + W_B − 1) for the allowed (i, j), which is incompatible with equal values for the marginal probabilities. This fact starts pointing to the kind of (irreducibly correlated) situation for which the usual BG microcanonical hypothesis "equal probability occupation of the entire phase space" for thermal equilibrium might become inadequate. It is very plausible that a variety of microscopic dynamical situations must exist (e.g., long-range-interacting Hamiltonian systems) for which the standard equilibrium hypothesis is an oversimplification for physically relevant stationary states that do not correspond to thermal equilibrium.
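The generic q = 0 Table above can be sanity-checked numerically; a small sketch with arbitrary W_A = W_B = 3 marginals (chosen so that p_1^A + p_1^B > 1):

```python
def q0_table(pA, pB):
    """Joint probabilities of the generic q = 0 Table:
    only the first row and first column are occupied."""
    WA, WB = len(pA), len(pB)
    joint = [[0.0] * WB for _ in range(WA)]
    joint[0][0] = pA[0] + pB[0] - 1.0        # requires pA[0] + pB[0] > 1
    for j in range(1, WB):
        joint[0][j] = pB[j]
    for i in range(1, WA):
        joint[i][0] = pA[i]
    return joint

pA = [0.7, 0.2, 0.1]   # arbitrary illustrative marginals
pB = [0.8, 0.15, 0.05]
J = q0_table(pA, pB)

# the marginals are recovered
assert all(abs(sum(J[i]) - pA[i]) < 1e-12 for i in range(3))
assert all(abs(sum(J[i][j] for i in range(3)) - pB[j]) < 1e-12 for j in range(3))

# W_eff = W_A + W_B - 1 occupied states, and S_0 = W_eff - 1 is additive
Weff = sum(1 for row in J for x in row if x > 0)
assert Weff == 3 + 3 - 1
assert (Weff - 1) == (3 - 1) + (3 - 1)   # Eq. (24): S_0(A+B) = S_0(A) + S_0(B)
print("q = 0 construction verified")
```

The state count 1 + (W_B − 1) + (W_A − 1) = W_A + W_B − 1 reproduces the W_eff quoted in the text.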
3. The case of three subsystems

Consider now three systems A, B and C, having respectively W_A, W_B and W_C possible microstates. The total number of possible microstates for the system A + B + C is then in principle W = W_{A+B+C} = W_A W_B W_C. As for the case of two systems, we shall see that strong collective correlations between A, B and C may cause a severe reduction of the allowed phase space. We shall use the notation p_{ijk}^{A+B+C} (i = 1, 2, ..., W_A; j = 1, 2, ..., W_B; k = 1, 2, ..., W_C) for the joint probabilities, hence

Σ_{i=1}^{W_A} Σ_{j=1}^{W_B} Σ_{k=1}^{W_C} p_{ijk}^{A+B+C} = 1.   (29)

The AB-marginal probabilities are defined as follows:

p_{ij}^{A+B} ≡ Σ_{k=1}^{W_C} p_{ijk}^{A+B+C},   (30)

hence

Σ_{i=1}^{W_A} Σ_{j=1}^{W_B} p_{ij}^{A+B} = 1.   (31)

Similar expressions exist for the AC- and BC-marginal probabilities. The joint probabilities for the W_A = W_B = W_C = 2 case are indicated in the following Table, where the numbers without parentheses correspond to system C in state 1, and the numbers within parentheses correspond to system C in state 2.
The corresponding AB-marginal probabilities are indicated in the Table below:
A\B |      1       |      2       |
 1  | p_{11}^{A+B} | p_{12}^{A+B} |
 2  | p_{21}^{A+B} | p_{22}^{A+B} |

which of course reproduces the situation we had for the two-system (A + B) problem. This is to say p_{11}^{A+B} = p_{111}^{A+B+C} + p_{112}^{A+B+C}, and so on.

3.1. Three independent subsystems

Consider first the case where all three subsystems A, B and C are binary and statistically independent, i.e., such that the joint probabilities are given by

p_{ijk}^{A+B+C} = p_i^A p_j^B p_k^C   (∀(i, j, k)).   (32)
The corresponding Table is of course as follows:

A\B |                  1                  |                  2                  |
 1  | p_1^A p_1^B p_1^C (p_1^A p_1^B p_2^C) | p_1^A p_2^B p_1^C (p_1^A p_2^B p_2^C) |
 2  | p_2^A p_1^B p_1^C (p_2^A p_1^B p_2^C) | p_2^A p_2^B p_1^C (p_2^A p_2^B p_2^C) |
We immediately verify that

S_BG(A + B + C) = S_BG(A) + S_BG(B) + S_BG(C).   (33)

Therefore, S_BG is extensive. Consistently, S_q is, unless q = 1, nonextensive.

3.2. Three specially correlated subsystems

Consider now that the three binary subsystems are correlated as indicated in the next Table (with p_1^A + p_1^B + p_1^C > 2):
A\B |                    1                    |      2       |
 1  | p_1^A + p_1^B + p_1^C − 2  (1 − p_1^C)  | 1 − p_1^B (0) |
 2  |            1 − p_1^A (0)                |    0 (0)     |
We easily verify that

S_0(A + B + C) = S_0(A) + S_0(B) + S_0(C).   (34)
For example, if A = B = C and 2/3 < p < 1, we have that S_0(A + B + C) = 3 S_0(A) = 3. Let us next unify the q = 1 and the q = 0 cases. We heuristically found the solution. It is indicated in the following Table, where the function f_q(x, y) is defined in Eq. (27). Interestingly enough, it has been possible to find a three-subsystem solution in terms of the two-subsystem and one-system ones. More explicitly, we have, for example, that

p_{111}^{A+B+C} = f_q(p_1^A, p_1^B) + f_q(p_1^A, p_1^C) − p_1^A (p_1^B + p_1^C) + p_1^A f_q(p_1^B, p_1^C) = p_{11}^{A+B} + p_{11}^{A+C} − p_1^A (p_1^B + p_1^C) + p_1^A p_{11}^{B+C},

and similarly for the other seven three-subsystem joint probabilities. Of course, all eight joint probabilities associated with the above Table are nonnegative; whenever the values of (p_1^A, p_1^B, p_1^C) replaced within one or the other of these analytic expressions yield negative numbers, the corresponding probabilities are to be taken equal to zero. The AB-marginal probabilities precisely recover the joint probabilities of the
two-system Table. For example,

p_{121}^{A+B+C} + p_{122}^{A+B+C} = p_1^A − f_q(p_1^A, p_1^B) = p_{12}^{A+B},

and so on. Finally, we verify that

S_q(A + B + C) = S_q(A) + S_q(B) + S_q(C).   (35)
For the particular case A = B = C, the above Table simplifies accordingly, where we have used f_q(p, p) = f_q(p). For the generic case of three subsystems with W_A, W_B and W_C states respectively, we have that W = W_A W_B W_C, whereas in the appropriate asymptotic sense we expect

W_eff = [W_A^{1−q} + W_B^{1−q} + W_C^{1−q} − 2]^{1/(1−q)} ≤ W

for 0 ≤ q ≤ 1 (the equality generically holds only for q = 1). In the particular instance A = B = C, this expression becomes W_eff = [3 W_A^{1−q} − 2]^{1/(1−q)}.
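The two-into-three composition quoted above can be sanity-checked in the two exactly solvable limits, using the closed forms f_1(x, y) = xy and f_0(x, y) = x + y − 1 from Section 2.2; a minimal sketch with arbitrary marginals:

```python
def p111(pA, pB, pC, f):
    """p_111 of the unified three-subsystem Table, built from a two-subsystem f_q."""
    return f(pA, pB) + f(pA, pC) - pA * (pB + pC) + pA * f(pB, pC)

f1 = lambda x, y: x * y           # independence (q = 1)
f0 = lambda x, y: x + y - 1.0     # the special q = 0 correlation

pA, pB, pC = 0.9, 0.85, 0.8       # satisfies pA + pB + pC > 2 for the q = 0 case
assert abs(p111(pA, pB, pC, f1) - pA * pB * pC) < 1e-12
assert abs(p111(pA, pB, pC, f0) - (pA + pB + pC - 2.0)) < 1e-12
print("q = 1 and q = 0 limits of the three-subsystem construction agree")
```

At q = 1 the expression collapses to the product p_1^A p_1^B p_1^C of Eq. (32), and at q = 0 it collapses to the entry p_1^A + p_1^B + p_1^C − 2 of the Table in Section 3.2, as it should.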
4. Enlarging the scenario

4.1. The case of N subsystems

The three-system case discussed above is a generic one under the assumption that W_A = W_B = W_C = 2. We have not attempted to generalize its corresponding special-correlation Table to the generic (W_A, W_B, W_C) case, and even less to the even more generic case of N such systems (A_1, A_2, ..., A_N). It is clear however that, assuming that this (not necessarily trivial) task was satisfactorily accomplished, the result would lead to

S_q(Σ_{r=1}^{N} A_r) = Σ_{r=1}^{N} S_q(A_r),   (36)

where q = 1 if all N systems are mutually independent, i.e.,

p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} = Π_{r=1}^{N} p_{i_r}^{A_r}   (∀(i_1, i_2, ..., i_N)),   (37)
and q ≠ 1 otherwise. This is to say, if we have independence, the only entropy which is extensive is S_BG. If we do not have independence but instead the special type of (collective) correlations focused on in this paper, then only S_q for a special value of q is extensive. For the case of independence, the generic composition law for S_q is given by

1 + (1 − q) S_q(Σ_{r=1}^{N} A_r) = Π_{r=1}^{N} [1 + (1 − q) S_q(A_r)],   (38)

or, equivalently,

S_q(Σ_{r=1}^{N} A_r) = {Π_{r=1}^{N} [1 + (1 − q) S_q(A_r)] − 1}/(1 − q).   (39)
Eq. (38) exhibits in fact the well known (monotonic) connection between S_q and the Renyi entropy S_q^R ≡ [ln Σ_{i=1}^{W} p_i^q]/(1 − q) = ln[1 + (1 − q) S_q]/(1 − q) (we remind that, for independent systems, S_q^R is extensive, ∀q). We have generically W = Π_{r=1}^{N} W_{A_r}, which corresponds of course to the total number of a priori possibly occupied states (i.e., whose joint probabilities are generically nonzero) for the generic q = 1 case. In contrast, the generic q = 0 case has only W_eff = (Σ_{r=1}^{N} W_{A_r}) − (N − 1) nonzero joint probabilities. These are p_{11...1}^{A_1+A_2+...+A_N} = (Σ_{r=1}^{N} p_1^{A_r}) − (N − 1) ≥ 0, p_{i_1 1...1} = p_{i_1}^{A_1} (i_1 = 2, 3, ..., W_{A_1}), p_{1 i_2 1...1} = p_{i_2}^{A_2} (i_2 = 2, 3, ..., W_{A_2}), ..., p_{1 1...1 i_N} = p_{i_N}^{A_N} (i_N = 2, 3, ..., W_{A_N}). The generic q = 1 and q = 0 cases can, analogously to what has been done before, be unified through

W_eff = [(Σ_{r=1}^{N} W_{A_r}^{1−q}) − (N − 1)]^{1/(1−q)} ≤ W   (0 ≤ q ≤ 1),

where the equality holds only for q = 1. In the particular instance A_1 = A_2 = ... = A_N = A, this expression becomes W_eff = [N W_A^{1−q} − (N − 1)]^{1/(1−q)}. Furthermore, for N equal subsystems (a quite frequent case, as already mentioned), Eq. (36) becomes
S_q(N) = N S_q(1),   (40)
where the change of notation is transparent. This is an extremely interesting relation since it already has the shape that accommodates well within standard thermodynamics, even if the entropic index q is not necessarily the usual one, i.e., q = 1. We may think that Clausius would perhaps have been as satisfied with this relation as he surely was with the same relation with S_BG! One might also quite safely speculate that if the system is such that its Table of joint probabilities is not exactly of the type we have discussed here, but close to it, then we might have, not exactly relation (40), but rather only asymptotically S_q(N) ∝ N. In other words, as long as the system belongs to what we may refer to as the q-universality class, we should expect lim_{N→∞} S_q(N)/N < ∞, in total analogy with the usual BG case. To geometrically interpret Eq. (40), we may consider the case of equal probabilities in the allowed phase space, i.e., in that part of phase space which is expected to have, not necessarily W microstates, but generically W_eff microstates (with W_eff ≤ W). The effective number W_eff is expected (at least in the N ≫ 1 limit) to be precisely the number of all those states that the special collective correlations allow to be visited. So, if we assume equal probabilities in Eq. (40) (i.e., p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} = 1/W_eff), we obtain

S_q(N) = ln_q W_eff,   (41)

or, equivalently,

W_eff = e_q^{S_q(N)} = [1 + (1 − q) N S_q(1)]^{1/(1−q)}.   (42)
Two cases are possible for this relation, namely q = 1 and q < 1. In the first case, we have the usual result

W_eff = e^{N S_BG(1)} = μ^N,   (43)

with

μ ≡ e^{S_BG(1)} ≥ 1.   (44)

In the second case, we have an unusual result, namely

W_eff = [1 + N S_q(1)/ρ]^ρ,   (45)

with

ρ ≡ 1/(1 − q) ≥ 0.   (46)

In the N → ∞ limit, this relation becomes the following one:

W_eff ∝ N^ρ.   (47)
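The contrast between the two growth regimes of Eqs. (43)-(47) is easy to see numerically; a small sketch (the per-subsystem entropy value S_q(1) = 0.6 is an arbitrary illustration):

```python
import math

def W_eff(N, S1, q):
    """Eq. (42): W_eff = [1 + (1 - q) N S_q(1)]^(1/(1-q)); q = 1 gives e^(N S1)."""
    if q == 1.0:
        return math.exp(N * S1)
    return (1.0 + (1.0 - q) * N * S1) ** (1.0 / (1.0 - q))

S1 = 0.6   # an arbitrary per-subsystem entropy value
for N in (10, 100, 1000):
    # q = 1: growth like mu^N with mu = e^S1; q = 0.5: growth like N^rho, rho = 2
    print(N, W_eff(N, S1, 1.0), W_eff(N, S1, 0.5))
```

For q = 0.5 one has ρ = 1/(1 − q) = 2, so W_eff grows only quadratically in N, in sharp contrast with the exponential q = 1 case.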
This (physically quite appealing) possibility was informally advanced by us long ago, and formally in [12]. It has now been obtained along an appropriate probabilistic path.

4.2. The q → −∞ case
From Eq. (46) we expect the q → −∞ case to correspond to the limiting situation where W_eff is constant. To realize this situation, let us first consider the A = B two-system case with the following Table (W_A = W_B):

A\B |  1  |  2  | ... |   W_A    |
 1  | p_1 |  0  | ... |    0     |
 2  |  0  | p_2 | ... |    0     |
... | ... | ... | ... |   ...    |
W_A |  0  |  0  | ... | p_{W_A}  |

This Table corresponds to p_{ij}^{A+B} = p_i δ_{ij}. Its generalization to N equal systems is trivial: p_{i_1 i_2 ... i_N} = p_{i_1} if all N indices coincide, and zero otherwise. The corresponding entropy therefore asymptotically approaches the relation

S_{−∞}(N) = S_{−∞}(1)   (∀N),   (48)

thus corresponding to ρ = 0 as anticipated. It appears then that all cases equivalent (through permutations) to the above Table should yield the same limit q → −∞ (for further analysis of this case, see [17]).
4.3. Connection with the Borges-Nivanen-Le Mehaute-Wang q-product

Let us mention at this point an interesting connection that can be established between the present problem and the q-product introduced by L. Nivanen, A. Le Mehaute and Q. A. Wang, and by E. P. Borges [18]. It is defined as follows:

x ⊗_q y ≡ [x^{1−q} + y^{1−q} − 1]^{1/(1−q)}   (x ⊗_1 y = xy).   (49)

It has the elegant, extensive-like, property

ln_q(x ⊗_q y) = ln_q x + ln_q y,   (50)

to be compared with the by now quite usual, nonextensive-like, property

ln_q(xy) = ln_q x + ln_q y + (1 − q)(ln_q x)(ln_q y).   (51)
This type of structure had long (at least since 1999) been informally discussed by A. K. Rajagopal, E. K. Lenzi, S. Abe, myself, and probably others. But only very recently was it beautifully formalized [18]. It has immediately been followed and considerably extended by Suyari in a relevant set of papers [19]. Let us now go back to the main topic of the present paper. Consider the following joint probabilities associated with N generic subsystems:

p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} = [Σ_{r=1}^{N} (p_{i_r}^{A_r})^{q−1} − (N − 1)]^{1/(q−1)} + φ_{i_1 i_2 ... i_N},   (52)

where φ_{i_1 i_2 ... i_N} is a nontrivial function which ensures that

Σ_{i_1 i_2 ... i_N} p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} = 1.   (53)

In the limit q → 1, Eq. (52) must recover the independent-systems one, namely

p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} = Π_{r=1}^{N} p_{i_r}^{A_r},   (54)

which implies φ^{(q→1)}_{i_1 i_2 ... i_N} = 0. Notice that, excepting for the function φ_{i_1 i_2 ... i_N}, Eq. (52) associates p_{i_1 i_2 ... i_N}^{A_1+A_2+...+A_N} with 1/[⊗_{q; r=1}^{N} (1/p_{i_r}^{A_r})], where ⊗_{q; r=1}^{N} x_r ≡ [x_1^{1−q} + x_2^{1−q} + ... + x_N^{1−q} − N + 1]^{1/(1−q)}. It follows from Eq. (52) that

[p_{i_1 i_2 ... i_N}^{A_1+...+A_N} − φ_{i_1 i_2 ... i_N}]^{q−1} = Σ_{r=1}^{N} (p_{i_r}^{A_r})^{q−1} − (N − 1),   (55)
hence

Σ_{i_1 i_2 ... i_N} (p_{i_1 i_2 ... i_N})^q = Σ_{r=1}^{N} Σ_{i_1 i_2 ... i_N} p_{i_1 i_2 ... i_N} (p_{i_r}^{A_r})^{q−1} − (N − 1),   (56)

where we have imposed one more nontrivial condition on φ_{i_1 i_2 ... i_N}, namely that

Σ_{i_1 i_2 ... i_N} p_{i_1 i_2 ... i_N} {[p_{i_1 i_2 ... i_N} − φ_{i_1 i_2 ... i_N}]^{q−1} − [p_{i_1 i_2 ... i_N}]^{q−1}} = 0.   (57)
One might naturally have the impression that no function φ might exist satisfying simultaneously Eqs. (53) and (57). This is not so, however, at least for particular cases, since we have explicitly shown in the present paper solutions of this nontrivial problem. Using the definition of S_q in the left-hand member of the equality, we obtain

1 − (q − 1) S_q(A_1 + A_2 + ... + A_N) = Σ_{r=1}^{N} Σ_{i_1 i_2 ... i_N} p_{i_1 i_2 ... i_N} (p_{i_r}^{A_r})^{q−1} − (N − 1).   (58)

Let us now introduce in Eq. (58) the definition of marginal probabilities, namely

p_{i_r}^{A_r} = Σ_{all indices except i_r} p_{i_1 i_2 ... i_N}.   (59)

We obtain

1 − (q − 1) S_q(A_1 + A_2 + ... + A_N) = Σ_{r=1}^{N} Σ_{i_r=1}^{W_{A_r}} (p_{i_r}^{A_r})^q − (N − 1).   (60)

Using once again the definition of S_q on the right-hand member, we finally obtain

S_q(A_1 + A_2 + ... + A_N) = Σ_{r=1}^{N} S_q(A_r),   (61)
as desired. It should, however, be clear that this remarkable mathematical fact by no means exhausts the problem of the search for explicit Tables of joint probabilities that would lead to extensivity of S_q for nontrivial values of q. The constraints imposed by the very definition of the concept of marginal probabilities are of such complexity that the search for solutions is by no means trivial, at least at our present degree of knowledge. Indeed, one easily appreciates this fact by looking at the explicit solutions indicated in Sections 2.2 and 3.2.
5. Conclusions
Let us summarize the obvious conclusion of the present paper: unless the composition law is specified, the question whether an entropy (or some similar quantity) is or is not extensive has no sense. Allow us a quick digression. The situation is in fact quite analogous to the quick or slow motion of a body. The Ancient Greeks considered motion to be an absolute property. It was not until Galileo that it was clearly perceived that motion has no sense unless the referential is specified. In Galileo's time, and even now, when no referential is indicated, one tacitly assumes that the referential is the Earth. In total analogy, when no composition law is indicated for analyzing the extensivity of an entropy, one tacitly assumes that the subsystems that we are composing are independent. It is only - a big only! - in this sense that we can say that S_BG is extensive, and that S_q (for q ≠ 1) is nonextensive. Once we have established the point above, the next natural question is: are there classes of collective correlations for which we know which specific entropy is extensive? (knowing, of course, that absence of all correlations leads to S_BG). For this operationally important question, nontrivial illustrations of how the entropic form is dictated by the type of special collective correlations that might (or might not) exist in the system have been explicitly presented in Sections 2.2 and 3.2. From this discussion, two vast categories of systems are identified (at the most microscopic possible level, i.e., that of the joint probabilities), namely those whose allowed phase space increases (in size) with N like an exponential or like a power-law, corresponding respectively to q = 1 and to q < 1.
However, it should be clear that the present paper is only exploratory in what concerns this hard task (further work along the present lines is in fact in progress). Indeed, we have not found the generic answer for N (not necessarily equal) systems, and we have basically concentrated only on the interval 0 ≤ q ≤ 1. We do not even know beyond doubt whether the answer is unique (excepting, of course, trivial permutations), or whether it admits a variety of forms all belonging to the same universality class of nonextensivity (i.e., sharing the same value of the entropic index q). Even worse, we still do not know what specifically happens to the structure of the allowed phase space in the (thermodynamically) most important limit N → ∞, or in the frequent limit W_A → ∞ (which would provide a precise geometrical interpretation of a formula such as W_eff = [N W^{1-q} - (N-1)]^{1/(1-q)} for, say, 0 ≤ q ≤ 1). It is precisely this structure which is crucial for fully understanding nonextensive statistical mechanics and its related applications in terms of nonlinear dynamical systems. For example, an interesting situation might occur if we compare the distribution which optimizes S_q(N) and then consider N >> 1, with the distribution corresponding to having first considered N >> 1 in S_q(N) and only then optimizing. We certainly expect the thermodynamic limit and the optimization operation to commute for a system composed of N independent (or nearly independent) subsystems. But the situation seems to be more subtle if our system is composed of N subsystems correlated in that special, collective manner which demands q ≠ 1 [20,21]
in order to have entropy extensivity. Such a situation would be consistent with a property which emerges again and again for nonextensive systems, namely that the N → ∞ and t → ∞ limits do not necessarily commute. One more relevant issue concerns the specific dynamical nature required for a physical system to "live", in phase space, within a structure close to one of those that we have presently analyzed. It is our conjecture that this occurs for nonlinear dynamical systems whose Lyapunov spectrum is either zero or close to it, i.e., under circumstances similar to the edge of chaos, where many of the so-called complex systems are expected to occur. We leave all these questions as open points needing further progress. Let us finally mention the following point. It is by no means trivial to find sets of joint probabilities (associated with relevant statistical correlations) that produce very simple marginal probabilities (such as p and 1 - p for binary variables) and which simultaneously admit the imposition (as we have done here) of strict additivity of the corresponding entropy. This has been possible for S_q. It might in principle be possible for other entropic forms as well. The fact, however, that, like S_BG, S_q simultaneously (i) admits such solutions, (ii) is concave (for all q > 0), (iii) is Lesche-stable, and (iv) leads to finite entropy production per unit time [12], constitutes - we believe - a strong mathematical basis for its being physically meaningful in the thermostatistical sense [7-15].
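As a quick numerical illustration (my own sketch, not part of the original text): the effective number of states W_eff = [N W^{1-q} - (N-1)]^{1/(1-q)} quoted above is constructed precisely so that the q-entropy S_q = ln_q W_eff is strictly additive over N equal subsystems, and it reduces to the independent-subsystem value W^N as q → 1.

```python
import math

def ln_q(w, q):
    """q-logarithm: ln_q(w) = (w**(1-q) - 1)/(1-q); reduces to ln(w) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(w)
    return (w**(1.0 - q) - 1.0) / (1.0 - q)

def w_eff(n, w, q):
    """Effective phase-space volume for N equal subsystems,
    W_eff = [N W^(1-q) - (N-1)]^(1/(1-q))."""
    if abs(q - 1.0) < 1e-12:
        return w**n
    return (n * w**(1.0 - q) - (n - 1.0))**(1.0 / (1.0 - q))

N, W, q = 5, 10.0, 0.5
# S_q built on W_eff is strictly additive over the N subsystems by construction:
assert abs(ln_q(w_eff(N, W, q), q) - N * ln_q(W, q)) < 1e-9
# As q -> 1 the effective phase-space volume approaches the independent value W^N:
assert abs(w_eff(N, W, 0.999999) / W**N - 1.0) < 1e-3
```

The first assertion is an algebraic identity (ln_q W_eff = N ln_q W holds exactly); the second checks the q → 1 limit, where S_q reduces to the extensive S_BG = N ln W.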
Acknowledgments
It is with pleasure that I acknowledge very fruitful discussions with S. Abe, C. Anteneodo, F. Baldovin, E.P. Borges, J.P. Crutchfield, J.D. Farmer, M. Gell-Mann, H.J. Haubold, L. Moyano, A.K. Rajagopal, Y. Sato and D.R. White. I have also benefited from a question put long ago by M.E. Vares related to the possible difference between W and W_eff.
References
1. C. Tsallis, Chaos, Solitons and Fractals 13, 371 (2002).
2. C. Anteneodo and C. Tsallis, Phys. Rev. Lett. 80, 5313 (1998); C. Anteneodo, Physica A 342, 112 (2004).
3. A. Einstein, Annalen der Physik 33, 1275 (1910) ["Usually W is put equal to the number of complexions ... In order to calculate W, one needs a complete (molecular-mechanical) theory of the system under consideration. Therefore it is dubious whether the Boltzmann principle has any meaning without a complete molecular-mechanical theory or some other theory which describes the elementary processes. S = log W + const. seems without content, from a phenomenological point of view, without giving in addition such an Elementartheorie." (Translation: Abraham Pais, Subtle is the Lord..., Oxford University Press, 1982)].
4. C. Tsallis, J. Stat. Phys. 52, 479 (1988); for updated bibliography see http://tsallis.cat.cbpf.br/biblio.htm.
5. E.M.F. Curado and C. Tsallis, J. Phys. A 24, L69 (1991) [Corrigenda: 24, 3187 (1991) and 25, 1019 (1992)].
6. C. Tsallis, R.S. Mendes and A.R. Plastino, Physica A 261, 534 (1998).
7. S.R.A. Salinas and C. Tsallis, eds., Nonextensive Statistical Mechanics and Thermodynamics, Braz. J. Phys. 29, Number 1 (Brazilian Physical Society, Sao Paulo, 1999).
8. S. Abe and Y. Okamoto, eds., Nonextensive Statistical Mechanics and its Applications, Series Lecture Notes in Physics 560 (Springer-Verlag, Heidelberg, 2001).
9. G. Kaniadakis, M. Lissia and A. Rapisarda, eds., Non Extensive Statistical Mechanics and Physical Applications, Physica A 305, Issue 1/2 (Elsevier, Amsterdam, 2002).
10. P. Grigolini, C. Tsallis and B.J. West, eds., Classical and Quantum Complexity and Nonextensive Thermodynamics, Chaos, Solitons and Fractals 13, Number 3 (Pergamon-Elsevier, Amsterdam, 2002).
11. M. Sugiyama, ed., Nonadditive Entropy and Nonextensive Statistical Mechanics, Continuum Mechanics and Thermodynamics 16 (Springer, Heidelberg, 2004).
12. M. Gell-Mann and C. Tsallis, eds., Nonextensive Entropy - Interdisciplinary Applications (Oxford University Press, New York, 2004).
13. H.L. Swinney and C. Tsallis, eds., Anomalous Distributions, Nonlinear Dynamics and Nonextensivity, Physica D 193, Issue 3-4 (Elsevier, Amsterdam, 2004).
14. G. Kaniadakis and M. Lissia, eds., News and Expectations in Thermostatistics, Physica A 340, Issue 1/3 (Elsevier, Amsterdam, 2004).
15. H.J. Herrmann, M. Barbosa and E.M.F. Curado, eds., Trends and Perspectives in Extensive and Non-extensive Statistical Mechanics, Physica A 344, Issue 3/4 (Elsevier, Amsterdam, 2004).
16. C. Tsallis, Quimica Nova 17, 468 (1994).
17. C. Tsallis, Milan Journal of Mathematics 73 (2005), in press [cond-mat/0412132].
18. L. Nivanen, A. Le Mehaute and Q.A. Wang, Rep. Math. Phys. 52, 437 (2003); E.P. Borges, Physica A 340, 95 (2004). To the best of my knowledge, the two groups independently proposed this interesting generalization of the usual product.
In any case, Borges has been advancing this idea to me in private conversations since around 2001.
19. H. Suyari and M. Tsukada, cond-mat/0401540; H. Suyari, cond-mat/0401541; H. Suyari, cond-mat/0401546; H. Suyari, On the central limit theorem in Tsallis statistics, in Complexity, Metastability and Nonextensivity, Proc. 31st Workshop of the International School of Solid State Physics (20-26 July 2004, Erice, Italy), eds. C. Beck, A. Rapisarda and C. Tsallis (World Scientific, Singapore, 2005), in press.
20. Y. Sato and C. Tsallis, On the extensivity of the entropy S_q for systems with N ≤ 3 specially correlated subsystems, Proc. Summer School and Conference on Complexity in Science and Society, ed. T. Bountis (Patras and Olympia, 14-26 July, 2004), International Journal of Bifurcation and Chaos (2005), in press [cond-mat/0411073].
21. C. Tsallis, M. Gell-Mann and Y. Sato, Scale-invariant occupancy of phase space can lead to additive entropy S_q, preprint (2005).
SUPERSTATISTICS: RECENT DEVELOPMENTS AND APPLICATIONS
CHRISTIAN BECK
School of Mathematical Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK

We review some recent developments which make use of the concept of 'superstatistics', an effective description for nonequilibrium systems with a varying intensive parameter such as the inverse temperature. We describe how the asymptotic decay of stationary probability densities can be determined using a variational principle, and present some new results on the typical behaviour of correlation functions in dynamical superstatistical models. We briefly describe some recent applications of the superstatistics concept in hydrodynamics, astrophysics, and finance.
1. Introduction

Complex nonequilibrium systems often exhibit dynamical behaviour that is characterized by spatio-temporal fluctuations of an intensive parameter β. This intensive parameter may be the inverse temperature, or an effective friction constant, or the amplitude of Gaussian white noise, or the energy dissipation in turbulent flows, or simply a local variance parameter extracted from a signal. A nonhomogeneous spatially extended system with fluctuations in β can be thought of as consisting of a partition of spatial cells with a given β in each cell. If there is local equilibrium in each cell (so that statistical mechanics can be applied locally), and if the fluctuations of β evolve on a sufficiently large time scale, then in the long-term run the entire system is described by a superposition of different Boltzmann factors with different β, or in short, a 'superstatistics' [1]. The superstatistics approach has been the subject of various recent papers [2-13]. Superstatistical techniques can be successfully applied to a variety of physical problems, such as Lagrangian [14,15] and Eulerian [16,17] turbulence, defect turbulence [18], cosmic ray statistics [19], plasmas [20], statistics of wind velocity differences [21,22], and mathematical finance. Experimentally measured non-Gaussian stationary distributions with 'fat tails' can often be successfully described by simple models that exhibit a superstatistical spatio-temporal dynamics. If the intensive parameter in the various cells is distributed according to a particular probability distribution, the χ²-distribution, then the corresponding superstatistics, obtained by integrating over all β, is given by Tsallis statistics [26-29]. For other distributions of the intensive parameter β, one ends up with more general superstatistics, which contain Tsallis statistics as a special case. Generalized entropies (analogues of the Tsallis entropies) can also be defined for general superstatistics and are indeed a useful tool. The ultimate goal is to proceed from simple models to more general versions of statistical mechanics, which are applicable to wide classes of complex nonequilibrium systems, thus further generalizing Tsallis' original ideas [26].

This paper is organized as follows: First, we briefly review the superstatistics concept. We then show how one can deduce the asymptotic decay rate of the stationary probability densities of general superstatistics from a variational principle [31]. In section 4 we consider dynamical realizations of superstatistics and investigate the typical behaviour of correlation functions. Many different types of decay of correlations (e.g. power laws, stretched exponentials) are possible. Section 5 summarizes some recent applications of the superstatistics concept in hydrodynamics, astrophysics, and finance.
2. What is superstatistics?

The superstatistics approach is applicable to a large variety of driven nonequilibrium systems with spatio-temporal fluctuations of an intensive parameter β, for example, the inverse temperature. Locally, i.e. in spatial regions (cells) where β is approximately constant, the system is described by ordinary statistical mechanics, i.e. ordinary Boltzmann factors e^{-βE}, where E is an effective energy in each cell. In the long-term run, the system is described by a spatio-temporal average of various Boltzmann factors with different β. One may define an effective Boltzmann factor B(E) as

B(E) = ∫_0^∞ f(β) e^{-βE} dβ,   (1)

where f(β) is the probability distribution of β in the various cells. For so-called type-A superstatistics [1], one normalizes this effective Boltzmann factor and obtains the stationary long-term probability distribution

p(E) = (1/Z) B(E),   (2)

where

Z = ∫_0^∞ B(E) dE.   (3)
For type-B superstatistics, the β-dependent normalization constant of each local Boltzmann factor is included in the averaging process. In this case the invariant long-term distribution is given by

p(E) = ∫_0^∞ f(β) (1/Z(β)) e^{-βE} dβ,   (4)

where Z(β) is the normalization constant of e^{-βE} for a given β. Eq. (4) is just a simple consequence of calculating marginal distributions. Type-B superstatistics can easily be mapped into type-A superstatistics by redefining f(β).
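As a numerical illustration of the defining integral B(E) = ∫ f(β) e^{-βE} dβ (a sketch I am adding, not part of the original text): for a χ²-distributed β with n degrees of freedom and mean β₀, the integral has the known closed form B(E) = (1 + (q-1)β₀E)^{-1/(q-1)} with q = 1 + 2/n, i.e. the Tsallis-statistics case mentioned in the introduction.

```python
import numpy as np

def trap(y, x):
    """Simple trapezoidal quadrature (avoids version-dependent numpy helpers)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# chi^2 (Gamma-type) distribution of beta with n degrees of freedom and mean beta0
n, beta0 = 4.0, 1.0
beta = np.linspace(1e-8, 60.0, 400_000)
f = beta**(n/2.0 - 1.0) * np.exp(-n * beta / (2.0 * beta0))
f /= trap(f, beta)                     # normalize the density f(beta)

def B(E):
    """Effective Boltzmann factor: B(E) = integral of f(beta)*exp(-beta*E) over beta."""
    return trap(f * np.exp(-beta * E), beta)

# closed form: chi^2-superstatistics reproduces a q-exponential with q = 1 + 2/n
q = 1.0 + 2.0/n
for E in (0.5, 2.0, 10.0):
    exact = (1.0 + (q - 1.0) * beta0 * E)**(-1.0/(q - 1.0))
    assert abs(B(E) - exact) < 1e-4
```

The grid endpoints (cutoff at β = 60) and the parameters n, β₀ are arbitrary choices for the demonstration; the agreement confirms the integral representation rather than any property of those choices.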
A superstatistics can be dynamically realized by considering Langevin equations whose parameters fluctuate on a relatively large time scale (see [32] for details). For example, for turbulence applications one may consider a superstatistical extension of the Sawford model of Lagrangian turbulence [14,15,33]. This model consists of suitable stochastic differential equations for the position, velocity and acceleration of a Lagrangian test particle in the turbulent flow, and the parameters of this model then become random variables as well. Experimental data are well reproduced by these types of models. Often, a superstatistics just consists of a superposition of Gaussian distributions with varying variance. The parameter β can then be estimated from an experimentally measured signal u(t) as

β = 1 / (⟨u²⟩_T - ⟨u⟩_T²),   (5)

where ⟨...⟩_T denotes an average over a finite time interval T of the signal, corresponding to the 'cell size' of the superstatistics. It is then easy to make histograms of β and thus empirically determine f(β).

3. Asymptotic behaviour for large energies
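A small synthetic-data sketch of this estimation procedure (my own construction, not from the text): draw a slowly varying β from a Gamma (χ²-type) distribution, generate locally Gaussian data in each cell, and recover β cell by cell from the windowed variance as in eq. (5).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic superstatistical signal: in each "cell" of length T the signal is
# Gaussian with variance 1/beta, where beta is drawn from a Gamma distribution.
n_cells, T = 2000, 500
beta_true = rng.gamma(shape=2.0, scale=1.0, size=n_cells)   # mean <beta> = 2
u = np.concatenate([rng.normal(0.0, 1.0/np.sqrt(b), T) for b in beta_true])

# Estimate beta cell by cell via eq. (5): beta = 1/(<u^2>_T - <u>_T^2)
segments = u.reshape(n_cells, T)
beta_est = 1.0 / segments.var(axis=1)

# The histogram of beta_est approximates f(beta); here we only check its mean.
assert abs(beta_est.mean() - beta_true.mean()) < 0.1
```

The window length T must be large enough for a reliable local variance yet shorter than the superstatistical time scale on which β changes; the values above are illustrative only.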
Superstatistical invariant densities, as given by eq. (1) or (4), typically exhibit 'fat tails' for large E, but what is the precise functional form of this large-energy behaviour? The answer depends on the distribution f(β) and can be obtained from a variational principle. Details are described in [31]; here we just summarize some results. For large E we may use the saddle point approximation and write

B(E) ~ exp( sup_β { -βE + ln f(β) } ).   (6)

The expression

sup_β { -βE + ln f(β) }   (8)

corresponds to a Legendre transform of ln f(β). The result of this transform is a function of E which can be thought of as representing a kind of entropy function if we consider the function ln f(β) to represent a free energy function. This entropy function, however, is different from other entropy functions used e.g. in nonextensive statistical mechanics. It describes properties related to the fluctuations of inverse temperature. In the case where f(β) is smooth and has only a single maximum, we can obtain the supremum by differentiating, i.e.

sup_β { -βE + ln f(β) } = -β_E E + ln f(β_E),   (9)

where β_E satisfies the equation

(d/dβ) ln f(β) |_{β=β_E} = E.   (10)

By taking into account the next-order contributions around the maximum, eq. (6) can be improved to

B(E) ~ f(β_E) e^{-β_E E} ( 2π / |(d²/dβ²) ln f(β)|_{β=β_E} )^{1/2}.   (11)
Let us consider a few examples. Consider an f(β) of the power-law form f(β) ~ β^γ, γ > 0, for small β. An example is a χ²-distribution of n degrees of freedom,

f(β) = (1/Γ(n/2)) (n/(2β₀))^{n/2} β^{n/2-1} e^{-nβ/(2β₀)}   (12)

(β₀ > 0, n > 1), which behaves for β → 0 as

f(β) ~ β^{n/2-1},   (13)

i.e.

γ = n/2 - 1.   (14)

Other examples exhibiting this power-law form are F-distributions. With the above formalism one obtains from eq. (10)

β_E = γ/E   (15)

and

B(E) ~ E^{-γ-1}.   (16)

These types of f(β) form the basis for power-law generalized Boltzmann factors (q-exponentials) B(E), with the relation [26-29]

γ + 1 = 1/(q-1).   (17)
Another example would be an f(β) which for small β behaves as f(β) ~ e^{-c/β}, c > 0. In this case one obtains

B(E) ~ e^{-2(cE)^{1/2}}.

The above example can be generalized to stretched exponentials: for f(β) of the form e^{-c/β^δ} one obtains after a short calculation

B(E) ~ e^{-a E^{δ/(1+δ)}},

where a is some factor depending on δ and c. Of course, which type of f(β) is relevant depends on the physical system under consideration. For many problems in hydrodynamic turbulence, log-normal superstatistics seems to work as a rather good approximation. In this case f(β) is given by

f(β) = (1/(β s (2π)^{1/2})) exp( -(ln(β/m))² / (2s²) ),

where s and m are parameters [1,14,15,16,17,35].
4. Superstatistical correlation functions
To obtain statements on correlation functions, one has to postulate a concrete dynamics that generates the superstatistical distributions. The simplest dynamical model of this kind is a Langevin equation with parameters that vary on a long time scale, as introduced in [32]. Let us consider a Brownian particle of mass m and a Langevin equation of the form

v̇ = -γv + σL(t),   (23)

where v denotes the velocity of the particle, and L(t) is normalized Gaussian white noise with the following expectations:

⟨L(t)⟩ = 0,   (24)

⟨L(t)L(t')⟩ = δ(t - t').   (25)

We assume that the parameters σ and γ are constant for a sufficiently long time scale T, and then change to new values, either by an explicit time dependence, or by a change of the environment through which the Brownian particle moves. Formal identification with local equilibrium states in the cells (ordinary statistical mechanics at temperature β^{-1}) yields during the time scale T the relation [36]

β = 2γ/(σ²m)   (26)

or

σ² = 2γ/(βm).   (27)

Again, we emphasize that after the time scale T, γ and σ will take on new values. During the time interval T, the probability density P(v, t) obeys the Fokker-Planck equation

∂P/∂t = γ ∂(vP)/∂v + (σ²/2) ∂²P/∂v²,   (28)

with the local stationary solution

P(v|β) = (βm/(2π))^{1/2} e^{-(1/2)βmv²}.   (29)

In the adiabatic approximation, valid for large T, one assumes that the local equilibrium state is reached very fast, so that relaxation processes can be neglected. Within a cell in local equilibrium the correlation function is given by [36]

⟨v(t)v(t')⟩ = (1/(βm)) e^{-γ|t-t'|}.   (30)

Clearly, for t = t' and setting m = 1 we have

⟨v²⟩ = 1/β,   (31)

in agreement with eq. (5). It is now interesting to see that the long-term invariant distribution P(v), given by

P(v) = ∫_0^∞ f(β) (βm/(2π))^{1/2} e^{-(1/2)βmv²} dβ,   (32)

depends only on the probability distribution of β = 2γ/(σ²m), and not on that of the single quantities γ and σ². This means one can obtain the same stationary distribution from different dynamical models based on a Langevin equation with fluctuating parameters. Either γ may fluctuate and σ² is constant, or the other way round. On the other hand, the superstatistical correlation function

C(t - t') = ∫_0^∞ f(β) (1/(βm)) e^{-γ|t-t'|} dβ   (33)
can distinguish between these two cases. The study of correlation functions thus yields more information for any superstatistical model. Let us illustrate this with a simple example. Assume that σ fluctuates and γ is constant, such that β = 2γ/σ² is χ²-distributed. Since γ is constant, we can take the exponential e^{-γ|t-t'|} out of the integral in eq. (33), meaning that the superstatistical correlation function still decays in an exponential way:

C(t - t') ~ e^{-γ|t-t'|}.   (34)
On the other hand, if σ is constant and γ fluctuates, and β is still χ²-distributed with degree n, we get a completely different answer. In this case, in the adiabatic approximation, the integration over β yields a power-law decay of C(t - t'):

C(t - t') ~ |t - t'|^{-η},   (35)

where

η = n/2 - 1.   (36)

Note that this decay rate is different from the asymptotic power-law decay rate of the invariant density P(v), which, using (29) and (32), is given by P(v) ~ v^{-2/(q-1)}, with

1/(q-1) = n/2 + 1/2.   (37)
In general, we may generate many different types of correlation functions for general choices of f(β). By letting both σ and γ fluctuate we can also construct intermediate cases between the exponential decay (34) and the power-law decay (35), so that strictly speaking we only have the inequality

0 ≤ η ≤ n/2 - 1,   (38)

depending on the type of parameter fluctuations considered. One may also proceed to the position

x(t) = ∫_0^t v(t') dt'   (39)

of the test particle. One has

⟨x²(t)⟩ = ∫_0^t ∫_0^t ⟨v(t₁)v(t₂)⟩ dt₁ dt₂.   (40)

Thus asymptotic power-law velocity correlations with an exponent η < 1 are expected to imply asymptotically anomalous diffusion of the form

⟨x²(t)⟩ ~ t^α   (41)

with

α = 2 - η.   (42)
This relation simply results from the two time integrations. It is interesting to compare our model with other dynamical models generating Tsallis statistics. Plastino and Plastino [37] and Tsallis and Bukman [38] study a generalized Fokker-Planck equation of the form

∂P/∂t = -∂/∂x [F(x)P] + D ∂²/∂x² [P^ν]   (43)

with a linear force F(x) = k₁ - k₂x and ν ≠ 1. Basically this model means that the diffusion constant becomes dependent on the probability density. The probability densities generated by eq. (43) are q-exponentials with the exponent

q = 2 - ν.   (44)

The model generates anomalous diffusion with α = 2/(3-q). Assuming the validity of α = 2 - η̄, i.e. the generation of anomalous diffusion by slowly decaying velocity correlations with exponent η̄, one obtains

η̄ = (4 - 2q)/(3 - q).   (45)

On the other hand, for the χ²-superstatistical Langevin model one obtains by combining eq. (36) and (37) the different relation

η̄ = (5 - 3q)/(2q - 2).   (46)

Interestingly enough, there is a distinguished q-value where both models yield the same answer:

q ≈ 1.453  ↔  η̄ ≈ 0.707.   (47)
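Equating the nonlinear Fokker-Planck relation (from α = 2/(3-q) and α = 2 - η̄) with the χ²-superstatistical one reduces to the quadratic 7q² - 26q + 23 = 0 (an algebraic step I am supplying); its physical root reproduces the quoted crossing values, and in fact η̄ = 1/√2 there exactly.

```python
import math

# eta(q) from the nonlinear Fokker-Planck model: alpha = 2/(3-q), alpha = 2 - eta
def eta_fp(q):
    return 2.0 - 2.0 / (3.0 - q)

# eta(q) from the chi^2-superstatistical Langevin model
def eta_ss(q):
    return (5.0 - 3.0 * q) / (2.0 * q - 2.0)

# eta_fp(q) = eta_ss(q)  <=>  7q^2 - 26q + 23 = 0; the physical root is
q_star = (13.0 - math.sqrt(8.0)) / 7.0

assert abs(eta_fp(q_star) - eta_ss(q_star)) < 1e-12
assert abs(q_star - 1.453) < 1e-3
assert abs(eta_fp(q_star) - 1.0 / math.sqrt(2.0)) < 1e-12   # = 0.7071...
```

The second root of the quadratic, q = (13 + √8)/7 ≈ 2.26, lies outside the physically sensible range here and is discarded.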
These values of q and η̄ correspond to realistic, experimentally observed numbers, for example in defect turbulence [18]. So far we have mainly studied correlation functions with power-law behaviour. But in fact one can construct superstatistical Langevin models that exhibit more complicated types of asymptotic behaviour of the correlation functions. To see this, we notice that the asymptotic analysis of section 3 applies to correlation functions as well, by formally defining

Ẽ := (1/2)σ²m |t - t'|   (48)

f̃(β) := f(β)/(βm)   (49)

and writing

C(t - t') ~ ∫_0^∞ f̃(β) e^{-βẼ} dβ.   (50)

To obtain statements on the asymptotic decay rate of the superstatistical correlation function, we may just use the same techniques as described in section 3, with the replacement E → Ẽ and f → f̃. In this way one can construct models that have, for example, stretched-exponential asymptotic decays of correlations, etc. (see also [39]). Asymptotic means here that |t - t'| is large compared to the local equilibrium relaxation time scale, but still smaller than the superstatistical time scale T, such that the adiabatic approximation is valid.
5. Some Applications
We end this paper by briefly mentioning some recent applications of the superstatistics concept. Rizzo and Rapisarda [21,22] study experimental data of wind velocities at Florence airport and find that χ²-superstatistics does a good job. Jung and Swinney [35] study velocity differences in a turbulent Taylor-Couette flow, which is well described by log-normal superstatistics. They also find a simple scaling relation between the superstatistical parameter β and the fluctuating energy dissipation ε. Paczuski et al. [40] study data of solar flares on various time scales and embed this into a superstatistical model based on χ²-superstatistics (= Tsallis statistics). Human behaviour when sending off print jobs might also stand in connection with such a superstatistics [41]. Bodenschatz et al. [42] have detailed experimental data on the acceleration of a single test particle in a turbulent flow, which is well described by log-normal superstatistics, with a Reynolds number dependence as derived in a superstatistical Lagrangian turbulence model studied by Reynolds [15]. The statistics of cosmic rays is well described by χ²-superstatistics, with n = 3 due to the three spatial dimensions [19]. In mathematical finance, superstatistical techniques are well known and come under the heading 'volatility fluctuations'; see e.g. [23] for a nice introduction and [24,25] for some more recent work. Possible applications also include granular media, which could be described by different types of superstatistics, depending on the boundary conditions [43]. The observed generalized Tsallis statistics of solar wind speed fluctuations [44] is a further candidate for a superstatistical model. Chavanis [30] points out analogies between superstatistics and the theory of violent relaxation for collisionless stellar systems. Most superstatistical models assume that the superstatistical time scale T is very large, so that a quasi-adiabatic approach is valid, but Luczka and Zaborek [45] have also studied a simple model of dichotomous fluctuations of β, where everything can be calculated for finite time scales T as well.
References
1. C. Beck and E.G.D. Cohen, Physica 322A, 267 (2003)
2. E.G.D. Cohen, Physica 193D, 35 (2004)
3. E.G.D. Cohen, Einstein und Boltzmann - Dynamics and Statistics, Boltzmann award lecture at Statphys 22, Bangalore, to appear in Pramana (2005)
4. C. Beck, Cont. Mech. Thermodyn. 16, 293 (2004)
5. H. Touchette, Temperature fluctuations and mixtures of equilibrium states in the canonical ensemble, in M. Gell-Mann, C. Tsallis (eds.), Nonextensive Entropy - Interdisciplinary Applications, Oxford University Press, 2004
6. F. Sattin, Physica 338A, 437 (2004)
7. C. Beck and E.G.D. Cohen, Physica 344A, 393 (2004)
8. J. Luczka, P. Talkner, and P. Hanggi, Physica 278A, 18 (2000)
9. S. Abe, cond-mat/0211437
10. V.V. Ryazanov, cond-mat/0404357
11. A.K. Aringazin and M.I. Mazhitov, cond-mat/0301245
12. C. Tsallis and A.M.C. Souza, Phys. Rev. 67E, 026106 (2003)
13. C. Tsallis and A.M.C. Souza, Phys. Lett. 319A, 273 (2003)
14. C. Beck, Europhys. Lett. 64, 151 (2003)
15. A. Reynolds, Phys. Rev. Lett. 91, 084503 (2003)
16. B. Castaing, Y. Gagne, and E.J. Hopfinger, Physica 46D, 177 (1990)
17. C. Beck, Physica 193D, 195 (2004)
18. K.E. Daniels, C. Beck, and E. Bodenschatz, Physica 193D, 208 (2004)
19. C. Beck, Physica 331A, 173 (2004)
20. F. Sattin and L. Salasnich, Phys. Rev. 65E, 035106(R) (2002)
21. S. Rizzo and A. Rapisarda, in Proceedings of the 8th Experimental Chaos Conference, Florence, AIP Conf. Proc. 742, 176 (2004) (cond-mat/0406684)
22. S. Rizzo and A. Rapisarda, cond-mat/0502305
23. J.-P. Bouchaud and M. Potters, Theory of Financial Risk and Derivative Pricing, Cambridge University Press, Cambridge (2003)
24. M. Ausloos and K. Ivanova, Phys. Rev. 68E, 046122 (2003)
25. Y. Ohtaki and H.H. Hasegawa, cond-mat/0312568
26. C. Tsallis, J. Stat. Phys. 52, 479 (1988)
27. C. Tsallis, R.S. Mendes and A.R. Plastino, Physica 261A, 534 (1998)
28. C. Tsallis, Braz. J. Phys. 29, 1 (1999)
29. S. Abe, Y. Okamoto (eds.), Nonextensive Statistical Mechanics and Its Applications, Springer, Berlin (2001)
30. P.-H. Chavanis, cond-mat/0409511
31. H. Touchette and C. Beck, Phys. Rev. 71E, 016131 (2005)
32. C. Beck, Phys. Rev. Lett. 87, 180601 (2001)
33. B.L. Sawford, Phys. Fluids A3, 1577 (1991)
34. G. Wilk and Z. Wlodarczyk, Phys. Rev. Lett. 84, 2770 (2000)
35. S. Jung and H.L. Swinney, Velocity difference statistics in turbulence, Preprint University of Austin (2005)
36. N.G. van Kampen, Stochastic Processes in Physics and Chemistry, North Holland, Amsterdam (1981)
37. A.R. Plastino and A. Plastino, Physica 222A, 347 (1995)
38. C. Tsallis and D.J. Bukman, Phys. Rev. 54E, R2197 (1996)
39. R.G. Palmer et al., Phys. Rev. Lett. 53, 958 (1984)
40. M. Baiesi, M. Paczuski, and A.L. Stella, cond-mat/0411342
41. U. Harder and M. Paczuski, cs/PF/0412027
42. N. Mordant, A.M. Crawford, E. Bodenschatz, Physica 193D, 245 (2004)
43. J.S. van Zon et al., cond-mat/0405044
44. L.F. Burlaga and A.F. Vinas, J. Geophys. Res. 109, A12107 (2004)
45. J. Luczka and B. Zaborek, Acta Phys. Polon. B 35, 2151 (2004)
TWO STORIES OUTSIDE BOLTZMANN-GIBBS STATISTICS: MORI'S Q-PHASE TRANSITIONS AND GLASSY DYNAMICS AT THE ONSET OF CHAOS
A. ROBLEDO¹*, F. BALDOVIN¹,²† AND E. MAYORAL¹‡

¹Instituto de Física, Universidad Nacional Autónoma de México, Apartado Postal 20-364, México 01000 D.F., Mexico
²Dipartimento di Fisica, Università di Padova, Via Marzolo 8, I-35131 Padova, Italy
First, we analyze trajectories inside the Feigenbaum attractor and obtain the atypical weak sensitivity to initial conditions and loss of information associated with their dynamics. We identify the Mori singularities in its Lyapunov spectrum with the appearance of a special value for the entropic index q of the Tsallis statistics. Secondly, the dynamics of iterates at the noise-perturbed transition to chaos is shown to exhibit the characteristic elements of the glass transition, e.g. two-step relaxation, aging, subdiffusion and arrest. The properties of the bifurcation gap induced by the noise are seen to be comparable to those of a supercooled liquid above a glass transition temperature.
Key words: Edge of chaos, q-phase transitions, nonextensive statistics, external noise, glassy dynamics
PACS: 05.45.Ac, 64.60.Ak, 05.40.Ca, 64.70.Pf

1. Introduction
Evidence for the incidence of nonextensive dynamical properties at critical attractors in low-dimensional nonlinear maps has accumulated and advanced over the last few years, especially with regard to the onset of chaos in logistic maps - the Feigenbaum attractor [1-7] - and at the accompanying pitchfork and tangent bifurcations [8,9]. The more general chaotic attractors with positive Lyapunov coefficients have full-grown phase-space ergodic and mixing properties, and their dynamics is compatible with the Boltzmann-Gibbs (BG) statistics. As a difference, critical attractors have

*email: robledo@fisica.unam.mx
†email: baldovin@pd.infn.it
‡email: estela@eros.pquim.unam.mx
vanishing Lyapunov coefficients, exhibit memory-retentive nonmixing properties, and are therefore to be considered outside BG statistics. Naturally, some basic questions about the understanding of the dynamics at critical attractors are of current interest. We mention the following: Why do the anomalous sensitivity to initial conditions ξ_t and its matching Pesin identity obey the expressions suggested by the nonextensive formalism? How does the value of the entropic index q arise? Or is there a preferred set of q values? Does this index, or these indexes, point to some specific observable properties at the critical attractor? From a broader point of view it is of interest to know whether the anomalous dynamics found for critical attractors bears some correlation with the dynamical behavior at extremal or transitional states in systems with many degrees of freedom. Two specific suggestions have been recently advanced: in one case the dynamics at the onset of chaos has been demonstrated to be closely analogous to the glassy dynamics observed in supercooled molecular liquids [10], and in the second case the dynamics at the tangent bifurcation has been shown to be related to that at thermal critical states [11]. With regard to the above comments, here we briefly recount the following developments:
(i) The finding that the dynamics at the onset of chaos is made up of an infinite family of Mori's q-phase transitions [12,13], each associated with orbits that have common starting and finishing positions located at specific regions of the attractor. Every one of these transitions is related to a discontinuity in the σ function of 'diameter ratios' [14], and this in turn implies a q-exponential ξ_t and a spectrum of q-Lyapunov coefficients equal to the Tsallis rate of entropy production for each set of attractor regions. The transitions come in pairs with conjugate indexes q and Q = 2 - q, as these correspond to switching starting and finishing orbital positions. The amplitude of the discontinuities in σ diminishes rapidly, and consideration only of its dominant one, associated with the most crowded and most sparse regions of the attractor, provides a very reasonable description of the dynamics, consistent with that found in earlier studies [1-4].

(ii) The realization [10] that the dynamics at the noise-perturbed edge of chaos in logistic maps is analogous to that observed in supercooled liquids close to vitrification. Four major features of glassy dynamics in structural glass formers - two-step relaxation, aging, a relationship between relaxation time and configurational entropy, and evolution from diffusive to subdiffusive behavior and finally arrest - are shown to be displayed by the properties of orbits with vanishing Lyapunov coefficient. The previously known properties in control-parameter space of the noise-induced bifurcation gap play a central role in determining the characteristics of dynamical relaxation at the chaos threshold [14,15].
2. Mori's q-phase transitions at onset of chaos
The dynamics at the chaos threshold p = pc of the z-logistic map fM(X)= 1 - p 1x1=, z
> 1,-1 5 x 5 1,
(1)
has been analyzed recently. The orbit with initial condition x_0 = 0 (or, equivalently, x_0 = 1) consists of positions ordered as intertwined power laws that asymptotically reproduce the entire period-doubling cascade that occurs for μ < μ_c. This orbit is the last of the so-called 'superstable' periodic orbits at μ̄_n < μ_c, n = 1, 2, ...,14 a superstable orbit of period 2^∞. There, the ordinary Lyapunov coefficient λ_1 vanishes and instead a spectrum of q-Lyapunov coefficients λ_q^{(k)} develops. This spectrum, originally studied in Refs. 13 when z = 2, has been shown4,7 to be associated to a sensitivity to initial conditions ξ_t (defined as ξ_t(x_0) ≡ lim_{Δx_0 → 0}(Δx_t/Δx_0), where Δx_0 is the initial separation of two orbits and Δx_t that at time t) that obeys the q-exponential form

ξ_t(x_0) = exp_q[λ_q(x_0) t] = [1 - (q - 1)λ_q(x_0) t]^{-1/(q-1)}  (2)
suggested by the Tsallis statistics. Notably, the appearance of a specific value for the q index (and actually also that for its conjugate value Q = 2 - q) works out to be due to the occurrence of Mori's 'q-phase transitions'12 between 'local attractor structures' at μ_c. As shown in Fig. 1, the absolute values of the positions x_τ of the trajectory with x_{t=0} = 0 at time-shifted τ = t + 1 have a structure consisting of subsequences with a common power-law decay of the form τ^{-1/(1-q)}, with q = 1 - ln 2/[(z - 1) ln α(z)], where α(z) is the Feigenbaum universal constant that measures the period-doubling amplification of iterate positions. That is, the attractor can be decomposed into position subsequences generated by the time subsequences τ = (2k + 1)2^n, each obtained by proceeding through n = 0, 1, 2, ... for a fixed value of k = 0, 1, 2, .... See Fig. 1. The k = 0 subsequence can be written as x_t = exp_{2-q}(-λ_q^{(0)} t) with λ_q^{(0)} = (z - 1) ln α(z)/ln 2.

q-Lyapunov coefficients. The sensitivity ξ_t(x_0) can be obtained from ξ_t(m) ≃ |σ_n(m - 1)/σ_n(m)|^n, t = 2^n - 1, n large, where σ_n(m) = d_{n+1,m}/d_{n,m} and where the d_{n,m} are the diameters that measure adjacent position distances that form the period-doubling cascade sequence.14 Above, the choices Δx_0 = d_{n,m} and Δx_t = d_{n,m+t}, t = 2^n - 1, have been made for the initial and the final separation of the trajectories, respectively. In the large-n limit σ_n(m) develops discontinuities at each rational m/2^{n+1},14 and according to our expression for ξ_t(m) the sensitivity is determined by these discontinuities. For each discontinuity of σ_n(m) the sensitivity can be written in the forms ξ_t = exp_q[λ_q t] and ξ_t = exp_{2-q}[λ_{2-q} t], with λ_q > 0 and λ_{2-q} < 0.7 This result reflects the multi-region nature of the multifractal attractor and the memory retention of these regions in the dynamics.
The pair of q-exponentials correspond to a departing position in one region and arrival at a different region and vice versa; the trajectories expand in one sense and contract in the other. The largest discontinuity of σ_n(m), at m = 0, is associated to trajectories that start and finish at the most crowded (x ≃ 1) and the most sparse (x ≃ 0) regions of the attractor. In this case one obtains

λ_q^{(k)} = 2(z - 1) ln α(z) / [(2k + 1) ln 2],  k = 0, 1, 2, ...,  (3)

the positive branch of the Lyapunov spectrum, when the trajectories start at x ≃ 1 and finish at x ≃ 0. By inverting the situation one obtains

λ_{2-q}^{(k)} = -2(z - 1) ln α(z) / [(2k + 1) ln 2],  k = 0, 1, 2, ...,  (4)
the negative branch of the Lyapunov spectrum. Notice that exp_{2-q}(y) = 1/exp_q(-y). So, when considering these two dominant families of orbits, all the q-Lyapunov coefficients appear associated to only two specific values of the Tsallis index, q and Q = 2 - q.

Mori's q-phase transitions. As a function of the running variable -∞ < q < ∞ the q-Lyapunov coefficients become a function λ(q) with two steps located at q = 1 - ln 2/[(z - 1) ln α(z)] and at Q = 2 - q. In this manner contact can be established with the formalism developed by Mori and coworkers12 and the q-phase transition obtained in Refs. 13. The step function for λ(q) can be integrated to obtain the spectrum φ(q) (λ(q) ≡ dφ(q)/dq) and its Legendre transform ψ(λ) (= φ - (1 - q)λ), the dynamic counterparts of the Renyi dimensions D(q) and the spectrum f(α̃) that characterize the geometry of the attractor. The result for ψ(λ) is

ψ(λ) = (1 - Q)λ,  λ_{2-q}^{(0)} < λ < 0;
ψ(λ) = (1 - q)λ,  0 < λ < λ_q^{(0)}.  (5)
As with ordinary thermal first-order phase transitions, a 'q-phase' transition is indicated by a section of linear slope m = 1 - q in the spectrum (free energy) ψ(λ), a discontinuity at q in the Lyapunov function (order parameter) λ(q), and a divergence at q in the variance (susceptibility) v(q). For the onset of chaos at μ_c(z = 2) a q-phase transition was numerically determined.12,13 According to ψ(λ) above we obtain a conjugate pair of q-phase transitions that correspond to trajectories linking two regions of the attractor, the most crowded and the most sparse. See Fig. 2. Details appear in Ref. 7.

Generalized Pesin identity. Ensembles of trajectories with starting points close to the attractor point x_0 expand in such a way that a uniform distribution of initial conditions remains uniform for all later times t. As a consequence of this we established4,7 the identity of the rate of entropy production K_q^{(k)} with λ_q^{(k)}. The q-generalized rate of entropy production K_q is defined via K_q t = S_q(t) - S_q(0), t large, where
S_q ≡ Σ_i p_i ln_q(1/p_i) = (1 - Σ_i p_i^q)/(q - 1)

is the Tsallis entropy, p_i is the trajectories' distribution, and ln_q y ≡ (y^{1-q} - 1)/(1 - q) is the inverse of exp_q(y). See Figs. 2 and 3 in Ref. 4.
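The special index values quoted above can be checked with a few lines of code. The sketch below assumes the standard numerical value α(2) ≈ 2.502907875 of the Feigenbaum constant (not quoted explicitly in the text), evaluates q = 1 - ln 2/[(z - 1) ln α(z)] for z = 2, and verifies the identity exp_{2-q}(y) = 1/exp_q(-y) used in the derivation:

```python
import math

def exp_q(x, q):
    """Tsallis q-exponential exp_q(x) = [1 + (1-q)x]^(1/(1-q)) (used while the base is positive)."""
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

ALPHA_2 = 2.502907875  # Feigenbaum constant alpha(z) for z = 2 (assumed standard value)
z = 2
q = 1.0 - math.log(2.0) / ((z - 1) * math.log(ALPHA_2))
Q = 2.0 - q
print(round(q, 4), round(Q, 4))  # -> 0.2445 1.7555, the index values of Fig. 2

# Identity used in the text: exp_{2-q}(y) = 1 / exp_q(-y)
for y in (0.3, 0.8, 1.2):
    assert abs(exp_q(y, Q) - 1.0 / exp_q(-y, q)) < 1e-9
```

The y values are kept inside the interval where both q-exponentials have a positive base, so the real-valued power is well defined.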
3. Glassy dynamics at the noise-perturbed onset of chaos

We describe now the effect of additive noise on the dynamics at the onset of chaos. The logistic map with z = 2 now reads

x_{t+1} = f_μ(x_t) + σ ξ_t,

where ξ_t is Gaussian-distributed with ⟨ξ_t ξ_{t'}⟩ = δ_{t,t'}, and σ is the noise intensity. For σ > 0 the noise fluctuations wipe out the fine features of the periodic attractors as these widen into bands similar to those in the chaotic attractors; nevertheless there remains a well-defined transition to chaos at μ_c(σ) where the Lyapunov exponent λ_1 changes sign. The period doubling of bands ends at a finite maximum period 2^{N(σ)} as μ → μ_c(σ) and then decreases at the other side of the transition. This effect displays scaling features and is referred to as the bifurcation gap.14,15 When σ > 0 the trajectories visit sequentially a set of 2^n disjoint bands or segments leading to a cycle, but the behavior inside each band is fully chaotic. These trajectories represent ergodic states, as the accessible positions have a fractal dimension equal to the dimension of phase space. When σ = 0 the trajectories correspond to a nonergodic state, since as t → ∞ the positions form only a Cantor set of fractal dimension d_f = 0.5338.... Thus the removal of the noise, σ → 0, leads to an ergodic-to-nonergodic transition in the map. As shown in Ref. 10, at μ_c(σ > 0) there is a 'crossover' or 'relaxation' time t_x = σ^{r-1}, r ≃ 0.6332, between two different time-evolution regimes. This crossover occurs when the noise fluctuations begin suppressing the fine structure of the attractor as displayed by the superstable orbit with x_0 = 0 described previously. For t < t_x the fluctuations are smaller than the distances between the neighboring subsequence positions of the x_0 = 0 orbit at μ_c(0), and the iterate position with σ > 0 falls within a small band around the σ = 0 position for that t. The bands for successive times do not overlap. Time evolution follows a subsequence pattern close to that in the noiseless case.
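The noise-perturbed iteration is easy to sketch. The example below is illustrative (not the authors' code): the numerical value μ_c ≈ 1.40115519 for the z = 2 map and the clamping of iterates to [-1, 1] are assumptions made here for the sketch.

```python
import random

MU_C = 1.40115519  # chaos threshold of the z = 2 logistic map (assumed standard numerical value)

def noisy_logistic_orbit(x0, sigma, n_steps, seed=0):
    """Iterate x_{t+1} = 1 - mu_c*x_t^2 + sigma*xi_t with Gaussian white noise xi_t."""
    rng = random.Random(seed)
    x, orbit = x0, [x0]
    for _ in range(n_steps):
        x = 1.0 - MU_C * x * x + sigma * rng.gauss(0.0, 1.0)
        x = max(-1.0, min(1.0, x))  # illustrative safeguard: keep iterates inside [-1, 1]
        orbit.append(x)
    return orbit

# sigma = 0: the x0 = 0 orbit visits the positions of the period-2^infinity attractor
orbit0 = noisy_logistic_orbit(0.0, 0.0, 64)
assert orbit0[1] == 1.0 and abs(orbit0[2] - (1.0 - MU_C)) < 1e-15

# sigma > 0: structure finer than the noise scale is washed out, as described above
orbit1 = noisy_logistic_orbit(0.0, 1e-3, 64, seed=7)
assert all(-1.0 <= x <= 1.0 for x in orbit1)
```

With σ = 0 the first iterates reproduce the x_0 = 0 superstable-orbit positions exactly; with σ > 0 they fall in small bands around those positions until the crossover time t_x.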
When t ≈ t_x the width of the noise-generated band reached at time τ_x = 2^{N(σ)} matches the distance between adjacent positions, and this implies a cutoff in the progress along the position subsequences. At longer times t > t_x the orbits no longer trace the precise period-doubling structure of the attractor. The iterates now follow increasingly chaotic trajectories as bands merge with time. This is the dynamical image, observed along the time evolution for the orbits of a single state μ_c(σ), of the static bifurcation gap initially described in terms of the variation of the control parameter μ.15

Two-step relaxation. Amongst the main dynamical properties displayed by supercooled liquids on approach to glass formation is the growth of a plateau, and for that reason a two-step process of relaxation, in the time evolution of two-time correlations.16 This consists of a primary power-law decay in the time difference Δt
(the so-called β relaxation) that leads into the plateau, the duration t_x = τ_x - 1 of which diverges as a power law of the difference T - T_g as the temperature T decreases to a glass temperature T_g. After t_x there is a secondary power-law decay (the so-called α relaxation) away from the plateau.16 In Fig. 3 we show17 the behavior of the correlation function
for different values of the noise amplitude. Above, ⟨...⟩ represents an average over the ensemble of trajectories. The development of the two power-law relaxation regimes and their intermediate plateau can be clearly appreciated. See Ref. 10 for the interpretation of the map analogs of the α and β relaxation processes.

Aging scaling. A second important (nonequilibrium) dynamical property of glasses is the loss of time-translation invariance observed for T ≤ T_g, a characteristic known as aging. The relaxation functions and correlations display a scaling dependence on the ratio t/t_w, where t_w is a waiting time. In Fig. 4a we show17 the correlation function
for different values of σ, and in Fig. 4b the same data where the rescaled variable t/t_w, with t = 2^n - 1 and t_w = 2^k + 1, k = 0, 1, ..., has been used. The characteristic aging scaling behavior is patent. See Ref. 10 for an analytical description of the built-in aging properties of the trajectories at μ_c(σ).

Adam-Gibbs relation. A third notable property is that the experimentally observed relaxation behavior of supercooled liquids is well described, via standard heat-capacity assumptions,16 by the so-called Adam-Gibbs equation, t_x = A exp(B/T S_c), where t_x is the relaxation time at T, and the configurational entropy S_c is related to the number of minima of the fluid's potential-energy surface.16 See Ref. 10 for the derivation of the analog expression for the nonlinear map. Instead of the exponential Adam-Gibbs equation, this expression turned out to have the power-law form

t_x ∝ S_c^{-(1-r)/r}.

Since (1 - r)/r ≃ 0.5792, t_x → ∞ as S_c → 0 when σ → 0.

Subdiffusion and arrest. A fourth distinctive property of supercooled liquids on approach to vitrification is the progression from normal diffusivity to subdiffusive behavior and finally to a halt in the growth of the molecular mean-square displacement. To investigate this aspect of vitrification in the map at μ_c(σ), we constructed17 a periodic map with repeated cells of the form x_{t+1} = F(x_t), with F(l + x) = l + F(x), l = ..., -1, 0, 1, ..., and F(-x) = F(x).
Fig. 5a shows this map together with a portion of one of its trajectories, while Fig. 5b shows the mean-square displacement ⟨x_t^2⟩ as obtained from an ensemble of trajectories with x_0 = 0 for several values of the noise amplitude. The progression from normal diffusion to subdiffusion and to final arrest can be plainly observed as σ → 0.17
4. Summary
We reviewed recent understanding of the dynamics at the onset of chaos in the logistic map. We exhibited links between previous developments, such as Feigenbaum's σ function, Mori's q-phase transitions and the noise-induced bifurcation gap, and more recent advances, such as the q-exponential sensitivity to initial conditions,3,4 the q-generalized Pesin identity,4,7 and the dynamics of glass formation.10 An important finding is that the dynamics is constituted by an infinite family of Mori's q-phase transitions, each associated to orbits that have common starting and finishing positions located at specific regions of the attractor. Thus, the special values for the Tsallis entropic index q in ξ_t and S_q are equal to the special values of the variable q at which the q-phase transitions take place. As described, the dynamics of noise-perturbed logistic maps at the chaos threshold presents the characteristic features of glassy dynamics observed in supercooled liquids. The limit of vanishing noise amplitude σ → 0 (the counterpart of the limit T - T_g → 0 in the supercooled liquid) leads to loss of ergodicity. This nonergodic state with λ_1 = 0 corresponds to the limiting state, σ → 0, t_x → ∞, of a family of small-σ states with glassy properties, which are expressed for t < t_x via the q-exponentials of the Tsallis formalism.

Acknowledgments. FB warmly acknowledges hospitality at UNAM where part of this work has been done. Work partially supported by DGAPA-UNAM and CONACyT (Mexican Agencies).

References
1. C. Tsallis, A.R. Plastino and W.-M. Zheng, Chaos, Solitons and Fractals 8, 885 (1997).
2. M.L. Lyra and C. Tsallis, Phys. Rev. Lett. 80, 53 (1998).
3. F. Baldovin and A. Robledo, Phys. Rev. E 66, 045104 (2002).
4. F. Baldovin and A. Robledo, Phys. Rev. E 69, 045202 (2004).
5. E. Mayoral and A. Robledo, Physica A 340, 219 (2004).
6. G.F.J. Añaños and C. Tsallis, Phys. Rev. Lett. 93, 020601 (2004).
7. E. Mayoral and A. Robledo, submitted; cond-mat/0501366.
8. A. Robledo, Physica A 314, 437 (2002); Physica D 193, 153 (2004).
9. F. Baldovin and A. Robledo, Europhys. Lett. 60, 518 (2002).
10. A. Robledo, Phys. Lett. A 328, 467 (2004); Physica A 342, 104 (2004).
11. A. Robledo, Physica A 344, 631 (2004).
12. H. Mori, H. Hata, T. Horita and T. Kobayashi, Prog. Theor. Phys. Suppl. 99, 1 (1989).
13. H. Hata, T. Horita and H. Mori, Prog. Theor. Phys. 82, 897 (1989); G. Anania and A. Politi, Europhys. Lett. 7 (1988).
14. See, for example, H.G. Schuster, Deterministic Chaos. An Introduction, 2nd Revised Edition (VCH Publishers, Weinheim, 1988).
15. J.P. Crutchfield, J.D. Farmer and B.A. Huberman, Phys. Rep. 92, 45 (1982).
16. For a review see P.G. Debenedetti and F.H. Stillinger, Nature 410, 259 (2001).
17. F. Baldovin and A. Robledo, to be submitted.
Figure 1. Absolute values of positions, in logarithmic scales, versus iteration times τ, for a trajectory at μ_c with initial condition x_0 = 0. The numbers correspond to iteration times.
Figure 2. q-phase transitions with index values q = 0.2445 and Q = 2 - q = 1.7555 obtained for z = 2 from the main discontinuity in σ_n(m). See text for details.
Figure 3. Two-time correlation function c(t_2 - t_1) for an ensemble of trajectories with x_0 = 0, for different values of the noise amplitude σ. See text for details.
Figure 4. a) Two-time correlation function c(t + t_w, t_w) for different values of σ. b) The same data in terms of the rescaled variable t/t_w. See text for details.
Figure 5. a) Repeated-cell map and trajectory. b) Mean-square displacement ⟨x_t^2⟩ for trajectories with x_0 = 0 for several values of the noise amplitude σ. See text for details.
TIME-AVERAGES AND THE HEAT THEOREM
A. CARATI
Dipartimento di Matematica, Via Saldini 50, Milano, I-20132, Italy
E-mail: [email protected]

In this paper it is illustrated how to compute the time-average of a generic dynamical variable in the limit of large systems. It is also shown how to use this result to deduce an analogue of the second principle of thermodynamics, even in the presence of metastable phenomena, for which it is not granted that the standard Gibbs measure can be used.
1. Introduction

The aim of this paper is to discuss the possibility of having a thermodynamic behavior also in the presence of metastable phenomena which prevent reaching thermal equilibrium within a given time scale. We place ourselves in the most general setup, considering an abstract dynamical system with phase space M and dynamics Φ : M → M, where Φ is a suitable map. Now, given a dynamical variable f : M → ℝ, in statistical mechanics one is interested in its time average

f̄(x_0) = (1/N) Σ_{n=1}^{N} f(Φ^n x_0).
In equilibrium statistical mechanics one considers the limit N → +∞, but in the presence of metastable phenomena one has to consider time averages on some large but still finite time scale. In the latter case it is meaningful to think of N as a parameter having a fixed "large" value. In this case there is nothing analogous to the ergodic theorem, i.e. the function f̄(x_0) is not almost constant but does depend on the initial datum x_0. Now, if we give an a priori probability distribution μ(x_0) on the initial data (for example the Lebesgue one), f̄(x_0) turns out to be a random variable, in the sense that it will assume different values with different probabilities. It is then natural to consider the expectation value ⟨f⟩ of the time average f̄(x_0) with respect to the initial-data distribution, i.e. the quantity

⟨f⟩ ≡ ∫_M f̄(x_0) dμ.
In this paper we illustrate the results of paper [1], in which it was shown how the expectation can be computed using a large-deviation principle, in the limit of a large system, and it was also shown how an analogue of the second principle of thermodynamics can be derived. This will be done in Section 3. In the next section we show how to numerically compute the probability distribution function (p.d.f. for short) of the occupation number, a quantity which plays a fundamental role in computing the thermodynamical quantities of interest (like the entropy). We also give an application to the standard map, in the two cases of weak and strong chaos.

2. The p.d.f. of the occupation number
The time average of the function f can also be computed in this way (more suited to our purpose): consider a partition {Z_j} of the phase space M into disjoint cells, so that M = ∪_j Z_j, and let n_j(x_0) be the number of points of the orbit {x_n} which belong to the cell Z_j (i.e. the cardinality of the intersection of the orbit with the cell). This number will be called the occupation number in the rest of the paper. Then one has

f̄(x_0) = (1/N) Σ_j f_j n_j(x_0),
with f_j being the value of f at a given point of the cell Z_j. This formula shows that the time average of every dynamical variable can be expressed in terms of the random variables n_j(x_0), so that the probability distribution function F_j(n) of the occupation number n_j turns out to have a fundamental role. Let us recall that F_j(n) is the probability that n_j ≤ n, i.e. the measure of the set of initial data x_0 which give rise to orbits having a number of points n_j in the cell Z_j less than n. In particular, from the knowledge of F_j(n), one can compute not only the expectation value

⟨n_j⟩ = ∫ n dF_j,
but also all the higher-order moments, such as the standard deviation and so on. Note that this probabilistic description is different from the usual one (see for example Ref. [2]), in which one considers the "probability that a given cell Z_j is occupied". In that case one considers "how many initial data" give rise to orbits that are actually in the given cell at a given fixed time, and this fraction is just the probability assigned to the cell. There is nothing like a p.d.f. associated to the cell. This is because one takes no care of the fact that the system can visit the same cell a different number of times during the motion. This fact is crucial for what concerns the time averages of the dynamical variables, but obviously has no importance if one limits oneself to considering their phase-space averages at any fixed time. The function F_j, corresponding to a given cell Z_j, can be numerically computed in the following way: extract a number m of initial data x_0^l, l = 1, ..., m, at random with respect to the given a priori distribution, then compute the corresponding orbits {x_n^l} and let n_j^l be the number of points of the l-th orbit which belong to
Figure 1. The histogram for ε = 1.0 and the curve e^{-p} p^k/k! versus k, for p = 500.
the cell Z_j. Having determined the sequence {n_j^l}, one builds up the corresponding histogram, i.e. for every k = 0, ..., N one reports the number of times (divided by m) the value k appears in the sequence {n_j^l}. From the histogram, the empirical distribution^a is then computed. As one increases the number m of initial data, the empirical distribution tends to the distribution function F_j(n). For illustration, Figure 1 and Figure 2 report (in semilogarithmic scale) the histograms built up for the case of the standard map
x' = x + y (mod 1),
y' = y + ε sin(2πx') (mod 1),
for two different values of ε. We take, as a priori distribution, the uniform distribution on the torus, and we consider the cell Z_j = {(x, y) ∈ T² : x ∈ [0.4, 0.5), y ∈ [0.5, 0.6)}. We choose m = 10^4 different initial points and the length of any orbit is fixed at N = 5·10^4. The two figures refer to different values of the parameter ε: Figure 1 corresponds to the case ε = 1, in which the dynamics is very chaotic; Figure 2 corresponds instead to the case ε = 0.5, a much less chaotic one. For comparison, the figures also report (dashed line) the plot of the function e^{-p} p^k/k! versus k, which is the histogram for a Poisson process (namely when the different visits of the same cell are independent events). The parameter p is chosen by a best fit. One sees that for ε = 1 the histogram is well approximated by the dashed line, while in the case ε = 0.5 the curve and the histogram exhibit rather relevant differences for small n. This implies that the Laplace transform exp(χ_j(z))

^a We recall that the histogram is nothing but the plot of F_j(k + 1) - F_j(k) versus k.
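The numerical procedure just described can be sketched as follows. This is a toy version with far fewer initial data and shorter orbits than the m = 10^4 and N = 5·10^4 used for the figures; the function names are ours:

```python
import math, random

def standard_map_orbit(x, y, eps, n):
    """Iterate x' = x + y (mod 1), y' = y + eps*sin(2*pi*x') (mod 1), returning n points."""
    pts = []
    for _ in range(n):
        x = (x + y) % 1.0
        y = (y + eps * math.sin(2.0 * math.pi * x)) % 1.0
        pts.append((x, y))
    return pts

def occupation_numbers(eps, m=200, n=2000, seed=1):
    """Occupation numbers n_j of the cell [0.4,0.5) x [0.5,0.6), one per random initial datum."""
    rng = random.Random(seed)
    counts = []
    for _ in range(m):
        orbit = standard_map_orbit(rng.random(), rng.random(), eps, n)
        counts.append(sum(1 for (px, py) in orbit if 0.4 <= px < 0.5 and 0.5 <= py < 0.6))
    return counts

counts = occupation_numbers(eps=1.0)
mean = sum(counts) / len(counts)
# the cell has area 0.01, so in the strongly chaotic case the mean count is close to 0.01*n
print(mean)
```

A histogram of `counts` (number of occurrences of each value k, divided by m) is then the empirical version of F_j(k + 1) - F_j(k).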
Figure 2. The histogram for ε = 0.5 and the curve e^{-p} p^k/k! versus k, for p = 575.
of F_j(n) (whose importance will be illustrated in the next section) will be different, for large z, from the Laplace transform of a Poisson process.

3. The Thermodynamics
To arrive at the thermodynamics one needs one more concept: in fact, in thermodynamics the expectation value U of the internal energy plays the role of a parameter the value of which can be fixed at will. In particular, as the energy is not bounded from above, the a priori mean energy ⟨E⟩ is infinite, and thus, since U is instead finite, one has in general U ≠ ⟨E⟩. In other terms, denoting by ε_j the value of the internal energy in the cell Z_j, one has the condition

(1/N) Σ_j n_j ε_j = U  with  U ≠ ⟨E⟩.

This indeed is a condition on the initial data, or equivalently on the variables n_j. So, one is confronted with a large-deviation problem, i.e. with the problem that one should compute not the expected value ⟨f⟩, but rather the conditional expectation ⟨f⟩_U of f, given the mean energy U. To this end it is sufficient to compute the mean occupation number (which we denote by n̄_j) when the mean energy is U, because one obviously has

⟨f⟩_U = (1/N) Σ_j f_j n̄_j.
Note that n̄_j satisfies the conditions

N = Σ_j n̄_j,  U = (1/N) Σ_j ε_j n̄_j.  (1)
This problem can be solved, under suitable hypotheses (see Ref. [1]), in the limit of large systems. In other terms, one can provide an asymptotic expansion for the mean occupation number n̄_j, the remainder of which tends to zero in the thermodynamic limit. If one assumes that the quantities n_j, for different values of j, are independent random variables, one can give a simple expression for the principal term of the expansion. In fact, in such a case one has, neglecting the remainder,

n̄_j = -χ'_j((θ/N) ε_j + α),  (2)
where the prime denotes the derivative, and the function χ_j(z) is the logarithm of the moment function, i.e. is defined by

exp(χ_j(z)) ≡ ∫_0^{+∞} e^{-nz} dF_j(n).

The parameters θ and α are determined by imposing the conditions (1), i.e. by requiring that the sums Σ_j n̄_j and (1/N) Σ_j ε_j n̄_j, with n̄_j given by (2), take the values N and U respectively.
We can now state the main result of the theory. If one defines the exchanged heat as the difference δQ = dU - δW, where δW is the mean work performed by the system when an external parameter is changed, then one finds that this expression admits θ/N as an integrating factor (where θ is the same quantity entering formula (2)). In fact, introducing v_j ≡ -χ'_j(z) as an independent variable and the Legendre transform h_j(v) of the function χ_j(z), one indeed has

δQ = (N/θ) d( (1/N) Σ_j h_j(n̄_j) ).

As a consequence the quantity S = Σ_j h_j(n̄_j)/N can be identified with the entropy, and β = θ/N with the inverse temperature. It is easy to verify that if the p.d.f. of the occupation number corresponds to a Poisson process (i.e. if F_j(n) = Σ_{k≤n} e^{-p} p^k/k!, to which there corresponds χ_j(z) = p e^{-z} - p), one gets

h_j = -(v_j log v_j - v_j) + v_j log p - p,
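The Poisson computation can be verified numerically. The sketch below assumes the Legendre-transform convention h(v) = χ(z) + vz evaluated at the z for which v = -χ'(z), and checks both the moment function and the resulting entropy density against closed forms:

```python
import math

def chi_poisson(z, p):
    """chi(z) = p*e^{-z} - p, the log moment function of a Poisson p.d.f. with parameter p."""
    return p * math.exp(-z) - p

def legendre_h(v, p):
    """h(v) = chi(z) + v*z at the z solving v = -chi'(z) = p*e^{-z} (convention assumed here)."""
    z = math.log(p / v)
    return chi_poisson(z, p) + v * z

# 1) exp(chi(z)) really is the Laplace transform sum_k e^{-p} p^k/k! e^{-k z}
p_small, z = 5.0, 0.3
direct = sum(math.exp(-p_small + k * math.log(p_small) - math.lgamma(k + 1) - k * z)
             for k in range(200))
assert abs(math.log(direct) - chi_poisson(z, p_small)) < 1e-9

# 2) the Legendre transform reproduces the Boltzmann-like entropy density
p = 500.0
for v in (50.0, 200.0, 450.0):
    closed_form = -(v * math.log(v) - v) + v * math.log(p) - p
    assert abs(legendre_h(v, p) - closed_form) < 1e-8
```

The additive constant -p in the closed form drops out of δQ, which only involves the differential of Σ_j h_j(n̄_j) at fixed p.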
i.e. the Gibbs distribution for the energy and (obviously) the Boltzmann formula for the entropy. Different p.d.f.'s will give rise to different expressions for both the entropy and the energy distribution. In particular, Figure 2 suggests that χ_j(z) could decrease more slowly than an exponential for increasing z, for example as an inverse power. As an illustration, one can consider the function

χ(z) = p e_q(-z) - p,

where e_q(z) ≡ [1 + (1 - q)z]^{1/(1-q)} is the Tsallis q-deformation of the exponential, and one obtains

v_j = C(β)(1 + β_q(q - 1)ε_j)^{q/(1-q)},

where C(β) is a suitable normalizing constant, and β_q ≡ β/(1 + (q - 1)α). This distribution coincides with the Tsallis q-distribution (see Refs. [3]) for the energy, while the expression for the entropy h also coincides with the Tsallis q-entropy S_q if we express h not in terms of v_j, but in terms of the quantities p_j ≡ p^{1/q} q^{1/(q-1)} v_j^{1/q}.
References
1. A. Carati, Physica A 348, 110-120 (2005).
2. M. Baranger, V. Latora and A. Rapisarda, Chaos, Solitons and Fractals 13, 471 (2002).
3. C. Tsallis, J. Stat. Phys. 52, 479 (1988); C. Tsallis, An. Acad. Br. Cienc. 74, no. 3, 393-414 (2002).
FUNDAMENTAL FORMULAE AND NUMERICAL EVIDENCES FOR THE CENTRAL LIMIT THEOREM IN TSALLIS STATISTICS *
HIROKI SUYARI Department of Information and Image Sciences, Chiba University, 263-8522, Japan E-mail: [email protected]
On the way to finding the mathematical structure behind Tsallis statistics, the rigorous formulation of the q-central limit theorem and its proof are expected to play important roles in mathematical physics. This short paper reports some numerical evidences revealing the existence of the central limit theorem in Tsallis statistics, reviewing some fundamental formulas such as the law of error and the q-Stirling's formula.
1. q-product uniquely determined by Tsallis entropy

Since the birth of Tsallis entropy S_q := (1 - Σ_{i=1}^n p_i^q)/(q - 1),^1 the main approach to the generalization of the traditional Boltzmann-Gibbs statistics has been the maximum entropy principle (MEP) along the same lines as Jaynes' original ideas, which leads us to a variety of successful theoretical foundations and their applications to unify power-law behaviors in nature.^{2,3} Nowadays, the generalized statistical physics is called Tsallis statistics, including the Boltzmann-Gibbs statistics as a special case. Through the history of science, we have learned an important lesson: there always exists a beautiful mathematical structure behind a new physics. This lesson stimulates us to find it in Tsallis statistics.^4 On the way to this goal, we obtain some fundamental theoretical results in Tsallis statistics.^{5,6,7} The key concept leading to our results is the "q-product", uniquely determined by Tsallis entropy, which was independently introduced by Nivanen et al.^8 and Borges.^9 The q-product ⊗_q is defined as follows:

x ⊗_q y := [x^{1-q} + y^{1-q} - 1]_+^{1/(1-q)}.  (1)

The definition of the q-product originates from the requirement that the following relations be satisfied:

ln_q(x ⊗_q y) = ln_q x + ln_q y,  exp_q(x) ⊗_q exp_q(y) = exp_q(x + y),  (2)
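A minimal sketch of the q-product and the two characterizing identities (2); the function names are ours:

```python
def ln_q(x, q):
    """q-logarithm ln_q(x) = (x^(1-q) - 1)/(1-q); tends to ln x as q -> 1."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    """q-exponential exp_q(x) = [1 + (1-q)x]_+^(1/(1-q))."""
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0

def q_product(x, y, q):
    """q-product x (x)_q y = [x^(1-q) + y^(1-q) - 1]_+^(1/(1-q))."""
    base = x ** (1.0 - q) + y ** (1.0 - q) - 1.0
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0

q = 0.8
# ln_q(x (x)_q y) = ln_q(x) + ln_q(y)
assert abs(ln_q(q_product(2.0, 3.0, q), q) - (ln_q(2.0, q) + ln_q(3.0, q))) < 1e-9
# exp_q(x) (x)_q exp_q(y) = exp_q(x + y)
assert abs(q_product(exp_q(0.5, q), exp_q(0.7, q), q) - exp_q(1.2, q)) < 1e-9
```

Both identities follow directly from the definition, since the q-product adds the quantities x^{1-q} - 1 that ln_q extracts.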
*This work was partially supported by the ministry of education, science, sports and culture, grant-in-aid for encouragement of young scientists(b), 14780259, 2004.
where ln_q x is the q-logarithm function ln_q x := (x^{1-q} - 1)/(1 - q) (x > 0, q ∈ ℝ) and exp_q(x) is the q-exponential function exp_q(x) := [1 + (1 - q)x]_+^{1/(1-q)}, with the notation [z]_+ := max{0, z}. These functions, ln_q x and exp_q(x), are originally determined by Tsallis entropy and its maximization.^{2,3} Moreover, exp_q(x) can be rewritten by means of the q-product:

exp_q(x) = lim_{n→∞} (1 + x/n) ⊗_q ⋯ ⊗_q (1 + x/n)  (n factors).  (3)

This representation (3) is a natural generalization of the famous definition of the usual exponential function: exp(x) = lim_{n→∞} (1 + x/n)^n. This fundamental property (3) reveals the conclusive validity of the q-product in Tsallis statistics. In the following sections, we briefly review the applications of the q-product to the fundamental formulations in Tsallis statistics.

2. Law of error, q-Stirling's formula, and q-multinomial coefficient in Tsallis statistics
The Gaussian distribution, the most important distribution, is known to be mathematically characterized in three ways: (i) MEP for Shannon entropy under the second-moment constraint, (ii) Gauss' law of error, and (iii) the central limit theorem. MEP for Tsallis entropy under the second-moment constraint yields a q-Gaussian as a generalization of a Gaussian distribution. Therefore, we expect a law of error and a central limit theorem in Tsallis statistics. The law of error in Tsallis statistics has already been obtained using the q-product.^5
2.1. Law of error in Tsallis statistics

Consider the following situation. We obtain n observed values x_1, x_2, ..., x_n ∈ ℝ as a result of n measurements of a certain observable, where we do not necessarily assume independence. However, the infinitesimal probability that the value (X_1, ..., X_n) lies in the infinitesimal cube around (x_1, ..., x_n) is assumed to be proportional to a function L_q(θ) defined by

L_q(θ) := f(x_1 - θ) ⊗_q f(x_2 - θ) ⊗_q ⋯ ⊗_q f(x_n - θ)  (4)

for some θ.

Theorem 2.1. If the function L_q(θ) of θ, for any fixed x_1, x_2, ..., x_n, attains its maximum value at θ = θ* := (1/n) Σ_{i=1}^n x_i, then the probability density function f must be a q-Gaussian:

f(e) ∝ exp_q(-β_q e²),

where β_q is a q-dependent positive constant. This q-Gaussian coincides with the probability distribution derived from MEP for Tsallis entropy under the second-moment constraint, which recovers a Gaussian distribution when q → 1. See Ref. 5 for the proof.
2.2. q-Stirling's formula in Tsallis statistics

Using the q-product, we naturally obtain the q-Stirling's formula. For the q-factorial n!_q, for n ∈ ℕ and q > 0, defined by

n!_q := 1 ⊗_q 2 ⊗_q ⋯ ⊗_q n,  (6)

the q-Stirling's formula (q ≠ 1) is

ln_q(n!_q) = (n/(2 - q) + 1/2)·(n^{1-q} - 1)/(1 - q) - n/(2 - q) + (1/(2 - q) - δ_q)  if q > 0 and q ≠ 1, 2,
ln_2(n!_2) = n - ln n - δ_2  if q = 2,  (7)

where δ_q is a q-dependent parameter which does not depend on n. A slightly rougher expression of the q-Stirling's formula (q ≠ 1) is

ln_q(n!_q) ≃ (n/(2 - q))(ln_q n - 1)  if q > 0 and q ≠ 1, 2,
ln_2(n!_2) ≃ n - ln n  if q = 2.  (8)

These q-Stirling's formulas recover the famous Stirling's formula when q → 1. See Ref. 6 for the proof.
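The rough q-Stirling expression is easy to probe numerically, since ln_q turns the q-factorial into the plain sum Σ_{k=1}^n ln_q k (a sketch with illustrative parameter values):

```python
def ln_q(x, q):
    """q-logarithm ln_q(x) = (x^(1-q) - 1)/(1-q)."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def ln_q_factorial(n, q):
    """ln_q(n!_q) = sum_{k=1}^{n} ln_q(k), since ln_q turns the q-product into an ordinary sum."""
    return sum(ln_q(k, q) for k in range(1, n + 1))

def stirling_q_rough(n, q):
    """Rough q-Stirling approximation (q > 0, q != 1, 2): n/(2-q) * (ln_q(n) - 1)."""
    return n / (2.0 - q) * (ln_q(n, q) - 1.0)

n, q = 10000, 0.5
exact = ln_q_factorial(n, q)
approx = stirling_q_rough(n, q)
print(abs(exact - approx) / abs(exact))  # relative error, small and shrinking as n grows
assert abs(exact - approx) / abs(exact) < 1e-3
```

For q → 1 both sides reduce to the familiar ln(n!) ≈ n(ln n - 1).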
2.3. q-multinomial coefficient in Tsallis statistics

The q-multinomial coefficient in Tsallis statistics is defined by

[n; n_1 ⋯ n_k]_q := (n!_q) ⊘_q (n_1!_q ⊗_q ⋯ ⊗_q n_k!_q),  (9)

where n = Σ_{i=1}^k n_i, n_i ∈ ℕ (i = 1, ..., k). ⊘_q is the inverse operation to ⊗_q, which is defined by
x ⊘_q y := [x^{1-q} - y^{1-q} + 1]_+^{1/(1-q)}  if x > 0, y > 0 and x^{1-q} - y^{1-q} + 1 > 0;
x ⊘_q y := 0  otherwise.  (10)
⊘_q is also introduced by the following requirements, similarly to ⊗_q:

ln_q(x ⊘_q y) = ln_q x - ln_q y,  exp_q(x) ⊘_q exp_q(y) = exp_q(x - y).  (11)
Applying the definitions of ⊗_q and ⊘_q to (9), the q-multinomial coefficient is explicitly written as

[n; n_1 ⋯ n_k]_q = exp_q( ln_q(n!_q) - Σ_{i=1}^k ln_q(n_i!_q) ).  (12)

From the definition (9), the q-multinomial coefficient clearly recovers the usual multinomial coefficient when q → 1. When n goes to infinity, the q-multinomial coefficient (9) has a surprising relation to Tsallis entropy, as follows:

ln_q [n; n_1 ⋯ n_k]_q ≃ (n^{2-q}/(2 - q)) · S_{2-q}(n_1/n, ..., n_k/n).  (13)

The present relation (13) tells us some significant mathematical structures: (i) There always exists a one-to-one correspondence between Tsallis entropy and the q-multinomial coefficient. In particular, (13) reveals the following important equivalence: "Maximization of S_{2-q}(n_1/n, ..., n_k/n) is equivalent to that of the q-multinomial coefficient [n; n_1 ⋯ n_k]_q when n is large." (ii) The relation (13) reveals a surprising symmetry: (13) is equivalent to

ln_{2-q} [n; n_1 ⋯ n_k]_{2-q} ≃ (n^q/q) · S_q(n_1/n, ..., n_k/n)

for q > 0 and q ≠ 2. This expression shows that behind Tsallis statistics there exists a symmetry with a factor 1 - q around q = 1. Substitution of some concrete values of q helps us understand this symmetry.

3. Numerical computations revealing the existence of the central limit theorem in Tsallis statistics
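As a warm-up for the numerical computations of this section, the correspondence between the q-multinomial coefficient and Tsallis entropy stated in (13) can be probed directly; the tolerances below are illustrative:

```python
def ln_q(x, q):
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def ln_q_factorial(n, q):
    return sum(ln_q(k, q) for k in range(1, n + 1))

def ln_q_multinomial(ns, q):
    """ln_q of the q-multinomial coefficient: ln_q(n!_q) - sum_i ln_q(n_i!_q)."""
    n = sum(ns)
    return ln_q_factorial(n, q) - sum(ln_q_factorial(ni, q) for ni in ns)

def tsallis_entropy(ps, q):
    """S_q(p_1, ..., p_k) = (1 - sum_i p_i^q)/(q - 1)."""
    return (1.0 - sum(p ** q for p in ps)) / (q - 1.0)

q = 0.9
ns = [500, 300, 200]
n = sum(ns)
lhs = ln_q_multinomial(ns, q)
rhs = n ** (2.0 - q) / (2.0 - q) * tsallis_entropy([ni / n for ni in ns], 2.0 - q)
print(lhs, rhs)  # the two sides agree closely already at n = 1000
assert abs(lhs - rhs) / abs(rhs) < 0.05
```

At n = 1000 the two sides already agree to within a fraction of a percent, and the agreement improves as n grows.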
It is well known that any binomial distribution converges to a Gaussian distribution when n goes to infinity. This is a typical example of the central limit theorem in the usual probability theory. By analogy with this famous result, each set of normalized q-binomial coefficients is expected to converge to the q-Gaussian distribution with the same q when n goes to infinity. As shown in this section, the present numerical results come up to our expectations. In Fig. 1 - Fig. 3, each set of bars and each solid line represent a set of normalized q-binomial coefficients and the q-Gaussian distribution with normalized q-mean 0 and normalized q-variance 1, for each n, when q = 0.1, 0.5, 0.9, respectively. Each of the three graphs on the first row of each figure represents the two kinds of probability distributions stated above, and the three graphs on the second row represent the corresponding cumulative probability distributions, respectively. From these three figures, we expect the convergence of a set of normalized q-binomial coefficients to a q-Gaussian distribution when n goes to infinity. Other cases with different q exhibit similar convergences. In order to confirm these convergences, required for the proof of the central limit theorem in Tsallis statistics, we compute the maximal difference Δ_{q,n} between the values of the two cumulative probabilities (a set of normalized q-binomial coefficients and the q-Gaussian distribution) for each q = 0.1, 0.2, ..., 0.9 and n. Δ_{q,n} is defined by Δ_{q,n} := max_{j=0,...,n} |F_{q-bino}(x_j) - F_{q-Gauss}(x_j)|, where F_{q-bino}(x) and F_{q-Gauss}(x) are the cumulative probability distributions of a set of normalized q-binomial coefficients and its corresponding q-Gaussian distribution, respectively. Fig. 4 shows convergence of Δ_{q,n} to 0 when n → ∞ for q = 0.1, 0.2, ..., 0.9. This result indicates that the limit of every convergence is a q-Gaussian distribution with the same q ∈ (0, 1] as that of the given set of normalized q-binomial coefficients.
[Fig. 1 - Fig. 4: Convergences of a set of normalized q-binomial coefficients to a q-Gaussian distribution.]
The present convergences point to the existence of a central limit theorem in Tsallis statistics. Such a theorem would provide not only a mathematical result in Tsallis statistics but also a physical reason why power-law behaviors appear universally in many physical systems. In other words, the central limit theorem in Tsallis statistics would mathematically explain the ubiquitous existence of power-law behaviors in nature.
References
1. C. Tsallis, J. Stat. Phys. 52, 479-487 (1988).
2. C. Tsallis et al., Nonextensive Statistical Mechanics and Its Applications, edited by S. Abe and Y. Okamoto (Springer-Verlag, Heidelberg, 2001).
3. C. Tsallis et al., Nonextensive Entropy: Interdisciplinary Applications, edited by M. Gell-Mann and C. Tsallis (Oxford Univ. Press, New York, 2004).
4. H. Suyari, IEEE Trans. Inform. Theory 50, 1783 (2004).
5. H. Suyari and M. Tsukada, Law of error in Tsallis statistics, to appear in IEEE Trans. Inform. Theory.
6. H. Suyari, q-Stirling's formula in Tsallis statistics, LANL e-print cond-mat/0401541.
7. H. Suyari, Mathematical structure derived from the q-multinomial coefficient in Tsallis statistics, LANL e-print cond-mat/0401546.
8. L. Nivanen, A. Le Mehaute, Q.A. Wang, Rep. Math. Phys. 52, 437 (2003).
9. E.P. Borges, Physica A 340, 95 (2004).
GENERALIZING THE PLANCK DISTRIBUTION
ANDRE M. C. SOUZA
Departamento de Fisica, Universidade Federal de Sergipe, 49100-000 Sao Cristovao-SE, Brazil
E-mail: amcsouza@ufs.br
CONSTANTINO TSALLIS
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
E-mail: [email protected]
and Centro Brasileiro de Pesquisas Fisicas, Rua Xavier Sigaud 150, 28290-180 Rio de Janeiro-RJ, Brazil
Along the lines of nonextensive statistical mechanics, based on the entropy S_q = k(1 − Σ_i p_i^q)/(q − 1) (S_1 = −k Σ_i p_i ln p_i), and of Beck-Cohen superstatistics, we heuristically generalize Planck's statistical law for the black-body radiation. The procedure is based on the discussion of the differential equation dy/dx = −a_1 y − (a_q − a_1) y^q (with y(0) = 1), whose q = 2 particular case leads to the celebrated law, as originally shown by Planck himself in his October 1900 paper. Although the present generalization is mathematically simple and elegant, we have unfortunately no physical application of it at the present moment. It opens nevertheless the door to a type of approach that might be of some interest in more complex, possibly out-of-equilibrium, phenomena.
We normally obtain the statistical mechanical equilibrium distribution by optimizing, under appropriate constraints, an entropic functional, namely the Boltzmann-Gibbs (BG) entropy S_BG = −k Σ_i p_i ln p_i. The success and elegance of this variational method are unquestioned. But at least one more possibility exists, namely through differential equations. Such a path is virtually never followed. Indeed, such an approach might seem quite bizarre at first sight. But we should by no means overlook that it has at least one distinguished predecessor: Planck's law for the black-body radiation. Indeed, Planck published two papers on the subject in 1900, the first one in October, the second one in December. The bases of both of them were considered at the time as totally heuristic, although somewhat different in nature. The second paper might be considered as a primitive form of what has now become the standard approach to statistical mechanics, based on the optimization of an entropy functional, the connection with Bose-Einstein statistics, and, ultimately, with the Boltzmann-Gibbs thermal theory for a quantum harmonic oscillator. The first paper, however, is totally based on simple arguments regarding an ordinary differential equation. It is along this line that the present paper
is constructed. If S_BG is extremized under appropriate constraints, we obtain the famous BG weight p(E) = p(0) e^{−βE}. This distribution can be seen as the solution of the differential equation dp/dE = −βp. For more than a decade, a great deal of effort has been dedicated to the study of so-called "nonextensive statistical mechanics", based on the generalized entropy S_q = k(1 − Σ_i p_i^q)/(q − 1) (S_1 = S_BG) (for a review, see 5). The extremization of this entropy under appropriate constraints yields p(E) = p(0) e_q^{−βE}, where e_q^z ≡ [1 + (1 − q)z]^{1/(1−q)} (e_1^z = e^z). This distribution, which has been shown to emerge in many natural and artificial systems 5, can be seen as the solution of the differential equation d[p/p(0)]/dE = −β[p/p(0)]^q. As a next step, we may consider even more complex systems, namely those which exhibit, for increasing E, a crossover from nonextensive to BG statistics. Such appears to be the case of cosmic rays 6. Such situations can be handled with a differential equation which unifies the previous two, as follows:
d[p/p(0)]/dE = −β_1 [p/p(0)] − (β_q − β_1) [p/p(0)]^q.  (1)
Except for the fact that here q may be noninteger, this differential equation is a particular case of Bernoulli's differential equation. Its solution is given by
p(E) = p(0) / [1 + (β_q/β_1)(e^{(q−1)β_1 E} − 1)]^{1/(q−1)},  (2)

which precisely exhibits the desired crossover for q > 1 and 0 < β_1 << β_q. Indeed, for (q − 1)β_1 E << 1 we have p/p(0) ≈ e_q^{−β_q E}, whereas, for (q − 1)β_1 E >> 1, we have p ∝ e^{−β_1 E}. In the limit β_q/β_1 → ∞ with p(0)(β_1/β_q)^{1/(q−1)} → C, where C is a constant, Eq. (2) becomes

p(E) = C / [e^{(q−1)β_1 E} − 1]^{1/(q−1)},  (3)
which, for q = 2, becomes

p(E) = C / (e^{β_1 E} − 1).  (4)
If we multiply this statistical weight by the photon density of states g(E) ∝ E² and by the energy E, we obtain the celebrated frequency spectral density

u(ν) ∝ ν³ / (e^{hν/k_B T} − 1),  (5)
where we have identified β_1 = 1/k_B T and E = hν. It is in this precise sense that Eq. (3) (hence Eq. (2)) can be seen as a generalization of Planck statistics. For q > 1, Eq. (3) can be written as
p(E) = C Σ_{n=0}^{∞} d(n, q) e^{−β_1 E_n},  (6)
[Figure 1. Degeneracy d(n, q) as a function of q (n = 0, 1, 2, 3); q = 2 corresponds to the Planck law.]
where

E_n = [(q − 1)n + 1] E ∝ n + 1/(q − 1),  (7)

and

d(n, q) = Γ(n + 1/(q − 1)) / [n! Γ(1/(q − 1))],  (8)

Γ(z) being the Gamma function. We may now follow Planck's path in his December 1900 paper, where he introduced the discretization of energy that eventually led to the formulation of quantum mechanics. Consistently, we may interpret E_n as a discretized energy and d(n, q) as its degeneracy. We see that, ∀q > 1, the spectrum is made of equidistant levels, like that of the quantum one-dimensional harmonic oscillator. The situation is definitively different in what concerns the degeneracy (see Fig. 1). Only for q = 2 do we have the remarkable property d(n, 2) = 1 (∀n), which recovers the harmonic oscillator problem. At this point, let us emphasize that any thermostatistical weight (that of thermal equilibrium for instance) reflects the microscopic dynamics of the system. This
fact was addressed by Einstein in 1910 7, and was recently revisited by several authors (see 8, for instance). It was also shown, on quite general grounds, in 9. In the same vein, a dynamical theory of a system of weakly coupled harmonic oscillators was recently used for deducing the functional relation between energy variance and mean energy that was conjectured by Einstein in connection with Planck's formula, thus exhibiting that it is a consequence of pure dynamics 10. It is within this dynamical interpretation that Beck and Cohen introduced their superstatistics 11. Indeed, nonequilibrium systems might exhibit spatio-temporal fluctuations of intensive quantities, e.g., the temperature. They assumed then that the inverse temperature β might itself be a stochastic variable, such that the generalized distribution of energy is expressed as

p(E)/p(0) = ∫ dβ f(β) e^{−βE},  (9)

where the distribution f(β) satisfies ∫ dβ f(β) = 1. The effective statistical mechanics of such systems depends on the statistical properties of the fluctuations of the temperature and similar intensive quantities. Naturally, if there are no fluctuations of the intensive quantities at all, the system must obey the BG distribution (i.e., f(β) = δ(β − 1/k_B T)). They also showed that, if f(β) is the γ-distribution (see also 12), one obtains the q-exponential weight of nonextensive statistical mechanics. Moreover, for small variance of the fluctuations, the nonextensive statistical distribution is once again reobtained. See 13 for an entropic functional which, extremized under appropriate constraints, recovers the distribution of superstatistics. We straightforwardly obtain, through Laplace transform, that the superstatistical distribution f(β) corresponding to the p(E)/p(0) given by Eq. (2) is
Moreover, we define q_BC ≡ ⟨β²⟩/⟨β⟩², where ⟨...⟩ ≡ ∫_0^∞ dβ f(β)(...). The notation q_BC (BC stands for Beck-Cohen) has been introduced to avoid confusion with the present q. Only when f(β) equals the γ-distribution do we have q_BC = q. Using Eq. (10) and integrating we obtain
Replacing (12) into (11) we obtain
It is worth remarking that, for all admissible f(β), we can write the asymptotic expression p(E)/p(0) = ⟨e^{−βE}⟩ = e^{−⟨β⟩E}(1 + σ²E²/2 + ...), where σ ≡ √(⟨β²⟩ − ⟨β⟩²) = √(q_BC − 1) ⟨β⟩.
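The statement above, that a γ-distributed f(β) returns the q-exponential weight, can be checked by direct quadrature. The sketch below uses our own parametrization of the γ-distribution (shape 1/(q − 1) and scale (q − 1)⟨β⟩, chosen so that the mean is ⟨β⟩); it is an illustration of the Beck-Cohen mechanism, not an implementation of Eqs. (10)-(13):

```python
from math import gamma, exp

def qbc_average(E, q, beta_mean=1.0, nlam=100000, lam_max=40.0):
    """<e^{-beta E}> over a gamma-distributed inverse temperature beta,
    with shape a = 1/(q-1) and scale s = (q-1)*<beta> (our parametrization)."""
    a, s = 1.0 / (q - 1.0), (q - 1.0) * beta_mean
    h = lam_max / nlam
    total = 0.0
    for i in range(1, nlam + 1):      # simple rectangle rule; the beta = 0 endpoint vanishes
        b = i * h
        pdf = b**(a - 1.0) * exp(-b / s) / (gamma(a) * s**a)
        total += pdf * exp(-b * E) * h
    return total

def q_exponential(E, q, beta_mean=1.0):
    # e_q^{-beta E} = [1 + (q-1) beta E]^{-1/(q-1)}
    return (1.0 + (q - 1.0) * beta_mean * E) ** (-1.0 / (q - 1.0))

for E in (0.5, 2.0, 5.0):
    assert abs(qbc_average(E, 1.5) - q_exponential(E, 1.5)) < 1e-3
```

With no fluctuations (a Dirac delta instead of the γ-distribution), the same average collapses to the pure BG weight e^{−⟨β⟩E}, as stated in the text.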
[Figure 2. Functions p(E)/p(0) (left) and f(β) (right) for ⟨β⟩ = 1. (a) Boltzmann-Gibbs distribution [p(E)/p(0) = e^{−E}; f(β) = δ(β − 1)]; (b) q = q_BC = 1.8 distribution; (c) (q, q_BC) = (2, 3/2) distribution; (d) (q, q_BC) = (3/2, 5/4) distribution. In the cases (a, c, d), what is represented is not f(β) strictly speaking, but rather the weights of the Dirac deltas.]
Finally, we may rewrite distribution (2) as follows:
hence, through Laplace transform,
(15) Observe that, for all q, if q_BC → 1 we obtain the BG distribution. In addition, we see that β generically assumes discrete values in f(β). If we focus on the limit of continuous values for β, we must have (using Eq. (10)) Δβ ≡ β(n+1) − β(n) = β_1(q − 1) → 0, and this is obtained (see Eq. (13)) when ⟨β⟩ → 0 (i.e., high temperature) or q_BC → q (i.e., q-statistics). In Fig. 2 we present typical examples of pairs (p(E)/p(0), f(β)). Summarizing, we obtained the distribution corresponding to the differential equation (1), expected to characterize a class of physical stationary states where a
crossover occurs between nonextensive and BG statistics. This led us to a possible generalization of the Planck law. We also obtained the Beck-Cohen superstatistical distribution f(β) associated with such crossovers between statistics. Along similar lines, it is possible to study crossovers between q and q′ statistics, with eventual applications in turbulence and other complex phenomena.
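As a numerical sanity check, the closed-form solution (2) can be compared with a direct Runge-Kutta integration of the differential equation (1); the parameter values below are illustrative choices of ours:

```python
from math import exp

q, b1, bq = 2.0, 0.1, 1.0          # illustrative values, 0 < beta_1 << beta_q

def rhs(y):                        # Eq. (1): d[p/p(0)]/dE = -b1*y - (bq - b1)*y**q
    return -b1 * y - (bq - b1) * y**q

def closed_form(E):                # Eq. (2) with p(0) = 1
    return (1.0 + (bq / b1) * (exp((q - 1.0) * b1 * E) - 1.0)) ** (-1.0 / (q - 1.0))

y, E, h = 1.0, 0.0, 1e-3           # y = p(E)/p(0); fourth-order Runge-Kutta steps
worst = 0.0
while E < 5.0:
    k1 = rhs(y)
    k2 = rhs(y + 0.5 * h * k1)
    k3 = rhs(y + 0.5 * h * k2)
    k4 = rhs(y + h * k3)
    y += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    E += h
    worst = max(worst, abs(y - closed_form(E)))
print("max deviation:", worst)
```

For these values the numerical trajectory and Eq. (2) agree to within the integrator's accuracy over the whole crossover region.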
Acknowledgments Partial support from PCI/MCT, CNPq, PRONEX, FAPERJ and FAP-SE (Brazilian agencies) is acknowledged.
References
1. L.J. Boya, physics/0402064.
2. R. Balian, From Microphysics to Macrophysics, Vol. I, 140 and Vol. II, 218 (Springer-Verlag, Berlin, 1991/1992).
3. M. Planck, Verhandlungen der Deutschen Physikalischen Gesellschaft 2, 202 and 237 (1900) [English translation: D. ter Haar, S. G. Brush, Planck's Original Papers in Quantum Physics (Taylor and Francis, London, 1972)]. In his "Ueber eine Verbesserung der Wien'schen Spectralgleichung" 19 October 1900 paper, Planck writes the following equations: d²S/dU² = α/[U(β + U)] (S and U being the entropy and internal energy respectively; α and β are constants), and dS/dU = 1/T (T being Kelvin's absolute temperature). Replacing the latter into the former leads to dU/d(1/T) = (β/α)U + (1/α)U². From this differential equation, he eventually obtains his famous law, namely E = Cλ^{−5}/(e^{c/λT} − 1) (λ being the wavelength; C and c are constants). If, as usually done nowadays, we express this spectral density in terms of the frequency ν ∝ 1/λ, we obtain the familiar expression, proportional to ν³/(e^{c′ν/T} − 1) (c′ > 0 being a constant).
4. C. Tsallis, J. Stat. Phys. 52, 479 (1988); E.M.F. Curado and C. Tsallis, J. Phys. A24, L69 (1991) [Corrigenda: A24, 3187 (1991) and A25, 1019 (1992)]; C. Tsallis, R.S. Mendes and A.R. Plastino, Physica A261, 534 (1998). For a regularly updated bibliography of the subject see http://tsallis.cat.cbpf.br/biblio.htm.
5. M. Gell-Mann and C. Tsallis, Nonextensive Entropy - Interdisciplinary Applications (Oxford University Press, New York, 2004).
6. C. Tsallis, J.C. Anjos and E.P. Borges, Phys. Lett. A310, 372 (2003).
7. A. Einstein, Annalen der Physik 33, 1275 (1910).
8. E.G.D. Cohen, Physica A305, 19 (2002); E.G.D. Cohen, Boltzmann and Einstein: Statistics and dynamics - An unsolved problem, Boltzmann Award Communication at Statphys-Bangalore-2004, Pramana (2005), in press.
9. A. Carati, Physica A348, 110 (2005).
10. A. Carati and L. Galgani, Phys. Rev. E61, 4791 (2000).
11. C. Beck and E.G.D. Cohen, Physica A321, 267 (2003).
12. G. Wilk and Z. Wlodarczyk, Phys. Rev. Lett. 84, 2770 (2000); C. Beck, Phys. Rev. Lett. 87, 180601 (2001).
13. C. Tsallis and A.M.C. Souza, Phys. Rev. E67, 026106 (2003).
THE PHYSICAL ROOTS OF COMPLEXITY: RENEWAL OR MODULATION?
PAOLO GRIGOLINI*
Center for Nonlinear Science, University of North Texas, P.O. Box 311427, Denton, Texas 76203-1427, E-mail: grigo@unt.edu
Dipartimento di Fisica "E. Fermi", Via Buonarroti 2, I-56127 Pisa, Italy
Istituto dei Processi Chimico Fisici del CNR, Area della Ricerca di Pisa, Via G. Moruzzi 1, 56124 Pisa, Italy
We show that the emergence of a non-Poisson distribution might have different physical origins. We study two distinct ways to generate a non-Poisson distribution, the first from within the renewal theory, and the second based on infinitely slow modulation, a condition that makes this second perspective equivalent to superstatistics. We prove that these different origins yield different physical effects, aging in the former case, and no aging in the latter.
1. Introduction
Here we adopt a simple-minded definition of complexity science, as the field of investigation of multi-component systems characterized by non-Poisson statistics. On intuitive grounds, this means that we trace the deviation from the canonical form of equilibrium and relaxation back to the breakdown of the conditions on which Boltzmann's view is based: short-range interaction, no memory and no cooperation. Thus, the deviation from the canonical form, which implies total randomness, is a measure of the system's complexity. However, this definition of complexity does not touch the delicate problem of the origin of the departure from Poisson statistics. Here we limit ourselves to considering two different proposals, which we shall refer to as renewal and modulation, both generating non-Poisson distributions. Thus, at first sight, one might be tempted to conclude that they are indistinguishable, leaving no motivation whatsoever to prefer one to the other. We shall prove that this is not so, and that an aging experiment can be done to distinguish modulation from renewal.
*Work supported by grant 70525 of the Welch Foundation
2. An example of a complex system: the blinking quantum dots
The physical process that we adopt here as a paradigm of complexity is the phenomenon of non-Poisson intermittent fluorescence, producing a sequence of "light on" and "light off" states. The well known experiment by Dehmelt 1 on a single ion, studied by Cook and Kimble 2, is an example of non-complex intermittency, given the fact that the distribution of sojourn times is exponential. An example of fluorescence intermittency to be termed complex is given instead by the blinking phenomenon in semiconductor nanocrystallites 3. In fact, in this case the waiting time distributions are found to fit an inverse power law for several time decades. In this paper, for simplicity's sake, we assume that the "light on" and "light off" time distributions are identical, and are thus described by the same waiting time distribution ψ(τ). Throughout this paper we adopt the form

ψ(τ) = (μ − 1) T^{μ−1} / (T + τ)^μ,  (1)
with μ > 1. This distribution is properly normalized, and the parameter T, making this normalization possible, gives information on the lapse of time necessary to reach the asymptotic regime where ψ(τ) becomes identical to an inverse power law. We shall see that the main conclusions of this paper are not confined to the inverse power law form of Eq. (1), being valid for any form of non-Poisson distribution. The choice of the form of Eq. (1) is dictated by the simplicity criterion. This form has been known for many years 4, see for instance Ref. 5, and following Metzler and Nonnenmacher 4 and Metzler and Klafter 6 we shall be referring to it as the Nutting law. This form is also obtained by means of entropy maximization from a non-extensive form of entropy 7 and, for this reason, is referred to by an increasing number of researchers as the Tsallis distribution. The theoretical discussion of this paper rests on a time series {τ_i}, created in such a way as to correspond to the distribution of Eq. (1). After creating this time series, according to either the renewal or the modulation prescription, we use it to generate a sequence of events in time. The first event occurs at time t = τ_1, the second at time t = τ_1 + τ_2, and so on. The time intervals between two consecutive events are called laminar regions, and the reason for this name will be made clear by the discussion of Section 3.
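For readers wishing to generate such a time series numerically, Eq. (1) can be sampled by inverting its cumulative distribution: the survival probability is (T/(T + τ))^{μ−1}, so a uniform deviate u maps to τ = T(u^{−1/(μ−1)} − 1). A minimal Python sketch (function names and parameter values are ours):

```python
import random

def draw_tau(mu, T, rng=random.random):
    """Draw a waiting time from psi(tau) = (mu-1) T^(mu-1) / (T+tau)^mu
    by inverse-CDF sampling: the survival function is S(tau) = (T/(T+tau))^(mu-1)."""
    u = rng()
    return T * (u ** (-1.0 / (mu - 1.0)) - 1.0)

mu, T = 2.5, 1.0

# Consistency check: the survival function evaluated at the mapped tau
# returns the uniform deviate we started from.
for u in (0.1, 0.5, 0.9):
    tau = T * (u ** (-1.0 / (mu - 1.0)) - 1.0)
    survival = (T / (T + tau)) ** (mu - 1.0)
    assert abs(survival - u) < 1e-12

# Event times t = tau_1, tau_1 + tau_2, ... (the renewal prescription)
random.seed(0)
events, t = [], 0.0
for _ in range(5):
    t += draw_tau(mu, T)
    events.append(t)
print(events)
```

Each new waiting time is drawn independently of the previous ones, which is exactly the renewal prescription discussed in Section 3.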
3. Renewal
As a prototype of renewal model we shall refer to the following dynamic process. Let us consider a particle moving within the interval I = [0, 1], driven by the following equation of motion

dy/dt = α y^z,  (2)

with

z ≥ 1,  (3)

and

α > 0.  (4)

Due to the positivity of α the particle moves from left to right, and when it reaches the border y = 1 it is injected back to a generic initial condition y_0, fitting the condition

0 < y_0 < 1.  (5)

The distribution density of sojourn times, ψ(τ), is evaluated as follows. First of all, we solve Eq. (2) to determine the time necessary for the particle to reach the border moving from a given initial condition y_0. This time is given by

τ = (y_0^{1−z} − 1) / [α(z − 1)].  (6)

The probability for the particle to reach the border in the infinitesimal interval (τ, τ + dτ) is determined by

ψ(τ) dτ = p_0(y_0) dy_0.  (7)

We make the assumption of uniform back injection, which yields p_0(y_0) = 1. Thus, we obtain the form of Eq. (1) with

μ = z / (z − 1)  (8)

and

T = (μ − 1) / α.  (9)
In practice, we first create the sequence {y_0(i)} by means of a succession of random drawings of numbers from within the interval I. Then, using the transformation of Eq. (6), we associate y_0(i) with τ_i, thereby creating the sequence {τ_i}, which allows us to define the times of occurrence of events. The first event occurs at t = τ_1, the second at time t = τ_1 + τ_2, and so on. The time intervals between two consecutive events are called laminar regions, the reason being that the dynamical model here under study is an idealization of the celebrated Manneville map 8, with the time interval between two events representing the fluid regular state. In view of the discussion on aging, which will be made in Sections 5 and 6, it is convenient to notice that the uniform distribution p_0(y) is a non-equilibrium condition that will evolve in time, due to the joint action of the deterministic prescription of Eq. (2) and the back injection process. To evaluate the time evolution of the probability distribution density p(y, t), we use the following equation of motion

∂p(y, t)/∂t = −(∂/∂y)[α y^z p(y, t)] + α p(1, t).  (10)
It is evident that this is the proper density representation of the renewal model of this section. In fact, the first term on the right hand side of Eq. (10) corresponds to the deterministic rule of Eq. (2), while the second term, independent of the position y, is a fair way to reproduce the process of uniform back injection. It is interesting to notice that for z < 2 the invariant distribution is given by

p(y) = (2 − z) y^{1−z}.  (11)
The transition from z < 2 to z > 2 provokes the breakdown of this invariant distribution. The renewal character of the model is made evident by Eq. (6). In fact, the values of y_0 are randomly chosen from a uniform distribution, 0 < y_0 < 1. Any drawing does not have memory of the earlier drawings. Consequently, a laminar region does not have any memory of the earlier laminar regions. In the literature there are many examples of renewal models 9,10,11,12 yielding the distribution of Eq. (1), one of them recently proposed for blinking quantum dots 10.
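The correspondence between this dynamic model and Eq. (1) is easy to verify numerically: with uniform back injection, the probability that the exit time exceeds t equals y_0(t), which by Eqs. (8) and (9) must coincide with the survival function (T/(T + t))^{μ−1} of Eq. (1). A small Python check, with illustrative values of z and α:

```python
z, alpha = 1.8, 0.5                 # illustrative parameters, z > 1, alpha > 0
mu = z / (z - 1.0)                  # Eq. (8)
T = (mu - 1.0) / alpha              # Eq. (9)

def exit_time(y0):                  # Eq. (6): time to reach the border y = 1 from y0
    return (y0 ** (1.0 - z) - 1.0) / (alpha * (z - 1.0))

def survival_eq1(t):                # survival function of the Nutting law, Eq. (1)
    return (T / (T + t)) ** (mu - 1.0)

# With uniform back injection, P(tau > t) = P(y0 < y0(t)) = y0(t);
# inverting Eq. (6) and comparing with Eq. (1):
for y0 in (0.05, 0.3, 0.7, 0.95):
    t = exit_time(y0)
    assert abs(survival_eq1(t) - y0) < 1e-12
print("Manneville-type exit-time statistics match the Nutting law of Eq. (1)")
```

The agreement is exact, not asymptotic: the uniform injection turns Eq. (6) into the inverse-CDF map of Eq. (1).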
4. Modulation theory
We define as modulation theory any approach to a non-Poisson distribution based on the modulation of Poisson processes. For instance, a double-well potential under the influence of white noise yields a Poisson distribution for the time of sojourn in the two wells 13. In the case of a symmetric double-well potential we have

ψ(τ) = λ exp(−λτ).  (12)
The parameter λ is determined by the Arrhenius formula

λ = k exp(−Q/k_B T).  (13)
In the case when either the barrier intensity Q 13 or the temperature T 14 is slowly modulated, the resulting waiting time distribution becomes a superposition of infinitely many exponentials. At least since the important work of Shlesinger and Hughes 15, and probably earlier, it has been known that a superposition of infinitely many exponentially decaying functions can generate an inverse power law. This, by itself, does not qualify the theory as modulation. It depends on the criterion adopted to generate the sequence {τ_i} discussed in Section 2. If changing the subscript of τ_i automatically implies also the random selection of the rate parameter λ, the resulting process is no doubt renewal. To turn it into superstatistics, we have to draw N_d time values, i_λ ≤ i ≤ i_λ + N_d, with N_d >> 1, from the same Poisson distribution ψ(τ) = λ exp(−λτ). In recent times, the term superstatistics has been coined 16 to denote an approach to non-Poisson statistics of any form, not only the Nutting (Tsallis) form, as in the original work of Beck 17. We note that Cohen 16 points out explicitly that the time scale of the change from one Poisson distribution to another must be much larger
than the time scale of each Poisson process. Thus, we can qualify superstatistics as a form of modulation. Therefore, from now on we shall refer to this approach to complexity indifferently either as modulation or as superstatistics. In conclusion, according to the modulation theory we write the waiting time distribution ψ(τ) in the following form:

ψ(τ) = ∫ dλ Π(λ) λ exp(−λτ),  (14)

where Π(λ) is the Γ distribution of order μ − 1 given by

Π(λ) = [T^{μ−1} / Γ(μ − 1)] λ^{μ−2} exp(−λT).  (15)
This formula was proposed by Beck 17 and used in a later work 18. To help the reader to understand the physical consequences of modulation, let us imagine that we have a box with infinitely many labelled balls. The label of any ball is a given number λ. There are many balls with the same λ, so as to fit the probability density of Eq. (15). We randomly draw the balls from the box and, after reading the label, we place the ball back in the box. Of course, this procedure implies that we are working with discrete rather than continuous numbers. However, we make the assumption that it is possible to freely increase the number of balls so as to come arbitrarily close to the continuous prescription of Eq. (15). After creating the sequence {λ_j}, we create the sequence {τ_i} with the following protocol. For any number λ_j, the reader must imagine that we have available a box with another set of infinitely many balls. Each ball is labelled with a number τ, and in this case the distribution density is given by ψ(τ) = λ_j exp(−λ_j τ). We create a sequence {τ_i^{(j)}} by making N_d drawings from this box, with N_d being very large, virtually infinite. Notice that the correlation function of the fluctuation ξ, Φ_ξ(τ), for any λ is equal to exp(−λτ). We note also that the smaller λ, the larger the time interval corresponding to it. Consequently, for a proper definition of the effect of modulation on Φ_ξ(τ), we have to use the statistical weight Π(λ)/λ 18, which yields

Φ_ξ(τ) = [T / (T + τ)]^{μ−2}.  (16)
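That the superposition (14), with the Γ distribution (15), reproduces the Nutting law (1) can be confirmed by direct numerical quadrature; the following Python sketch uses an illustrative μ and T:

```python
from math import gamma, exp

mu, T = 2.5, 1.0                    # illustrative parameters, mu > 1

def Pi(lam):                        # Eq. (15): Gamma distribution of order mu - 1
    return (T ** (mu - 1.0)) * lam ** (mu - 2.0) * exp(-lam * T) / gamma(mu - 1.0)

def psi_modulation(tau, nlam=100000, lam_max=40.0):
    """Eq. (14): psi(tau) as a superposition of exponentials, by rectangle rule."""
    h = lam_max / nlam
    return sum(Pi(i * h) * (i * h) * exp(-i * h * tau) * h for i in range(1, nlam + 1))

def psi_nutting(tau):               # Eq. (1)
    return (mu - 1.0) * T ** (mu - 1.0) / (T + tau) ** mu

for tau in (0.0, 1.0, 5.0):
    assert abs(psi_modulation(tau) - psi_nutting(tau)) < 1e-4
```

This identity is exactly why the waiting time distribution alone cannot distinguish modulation from renewal, and why the aging experiment of Section 5 is needed.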
This means that the waiting time distribution ψ(τ) of Eq. (14) is proportional to the second time derivative of Eq. (16), this being a known consequence of renewal theory 19. Thus, for μ > 2, modulation and renewal not only yield the same ψ(τ), but also the same correlation function. We shall see, however, that the aging experiment of Section 5 allows us to assess whether renewal or modulation applies.

5. Aging
To establish whether the deviation from exponential decay is of renewal kind or not, we proceed as follows. With the same prescription as that illustrated in Section 2, we create a number of sequences {τ_i} large enough to realize a reliable Gibbs ensemble.
If we set the observation time equal to t = 0, and we count how many of the laminar regions beginning at t = 0 end at a later time, we create a histogram that, obviously, yields the waiting time distribution ψ(τ) of Eq. (1). Then, we begin observing the sequences at a given time t_a > 0. For any sequence we do not know whether the first sojourn begins at the moment when we start observing, or not. Using renewal theory, it is shown 20 that the resulting distribution is accurately described by the following formula
where K_{t_a} is the normalization constant. This formula is approximate, but, in the case of an inverse power law, it has been shown 20 to be very accurate. In the exponential case this formula does not yield any deviation from the original waiting time distribution, showing therefore that with Poisson statistics no aging is possible, even if the process is renewal. In the case of blinking quantum dots, we have at our disposal only one sequence. Yet, it is possible to create a very large set of distinct sequences as follows. We shift the original sequence in time by the quantity τ_1, thus creating the second sequence. This means that with the second sequence the first event occurs at t = τ_2, the second at time t = τ_2 + τ_3, and so on. The third sequence is created by shifting the second by the quantity τ_2, and so on. With this method it is possible to assess not only the existence of aging, a fact already assessed by the authors of 21. It is also shown 22 that the predictions of Eq. (17) are fulfilled with surprising accuracy. This seems to be a compelling proof that the blinking quantum dots obey renewal theory.

6. Modulation: no aging
Probably, the most direct way to prove that modulation does not yield any aging is based on the following remark. Let us assume that the renewal condition applies and that Eq. (10) can be used. Let us assume that the initial condition is the flat distribution, p(y, 0) = 1 for any value of y from y = 0 to y = 1. We decide to start the observation process at t = t_a > 0. The waiting time distribution of the first sojourn times is then given by

p(y, t_a) dy = ψ_{t_a}(t) dt,  (18)

yielding

ψ_{t_a}(t) = p(y, t_a) |dy/dt|,  (19)

which defines ψ_{t_a}(t), once on the right side y is expressed as a function of the time necessary to reach the border moving from y, this function being the inverse of Eq. (6).
The authors of Ref. 23 have proved that this approach can be used to derive the same analytical expressions as those derived by Barkai 24 in an earlier publication. According to Ref. 20 these analytical formulas can be proven to be equivalent to Eq. (17). In conclusion, Eq. (18) is a reliable way to express aging. We use this approach to prove that superstatistics yields no aging. In fact, the time evolution of p(y, t) within modulation theory reads
This equation can be studied along the lines suggested by Kubo 25, through the enlarged equation of motion

where the operator R_λ takes care of the time evolution of the fluctuating parameter λ, establishing also the conditions for fast or slow modulation. The general solution of this equation is not straightforward, but it simplifies in the case of fast modulation, (a), and in the case of very slow modulation, (b). In the former case, (a), we have the following time evolution
In case (b) we get

p(y, t) = ∫_0^∞ dλ Π(λ) p_λ(y, t),

with p_λ(y, t) being the solution of the differential equation
We see that in both case (a), Eq. (23), and case (b), Eq. (26), the flat distribution coincides with the equilibrium distribution. Consequently,

p(y, t_a) = p(y, 0).  (27)

Thus, using Eq. (18), we get

ψ_{t_a}(t) = ψ(t).  (28)
This means that both very fast and very slow modulation yield no aging. As we have seen in Section 4, superstatistics is equivalent to infinitely slow modulation.
Thus, we can conclude that superstatistics yields no aging. On the other hand, since blinking quantum dots are characterized by aging 21, we rule out superstatistics as a proper approach to complexity in the case of blinking quantum dots. Research work is currently being done to study the case where the modulation time scale lies between condition (a) and condition (b) 26.
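The contrast between Sections 5 and 6 can be made concrete at the level of the conditional (aged) survival probability of a single sojourn: for the exponential (12) the condition of having survived up to t_a leaves the statistics unchanged, while for the inverse power law (1) it rescales T into T + t_a, i.e., it broadens the distribution. The following elementary Python illustration (not a substitute for the full aging formula of Eq. (17)) makes this explicit:

```python
from math import exp

mu, T, lam = 2.5, 1.0, 1.0          # illustrative parameters

S_pow = lambda t, T0: (T0 / (T0 + t)) ** (mu - 1.0)   # survival of Eq. (1)
S_exp = lambda t: exp(-lam * t)                        # survival of Eq. (12)

ta = 3.0                            # "age" of the sojourn when observation starts
for t in (0.5, 2.0, 10.0):
    # Poisson case: memoryless, conditioning leaves the survival unchanged
    assert abs(S_exp(ta + t) / S_exp(ta) - S_exp(t)) < 1e-12
    # Power-law case: conditioning rescales T -> T + ta ...
    aged = S_pow(ta + t, T) / S_pow(ta, T)
    assert abs(aged - S_pow(t, T + ta)) < 1e-12
    # ... and the aged sojourn survives longer than a fresh one
    assert aged > S_pow(t, T)
print("exponential: no aging; inverse power law: aging")
```

This single-sojourn calculation already exhibits the renewal aging effect; the full experiment of Section 5 extends it to the ensemble of shifted sequences.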
7. Concluding remarks
We want to point out that in no way do we mean to draw general conclusions about the origin of complexity. Our conclusions are limited to the specific case of intermittent fluorescence in colloidal semiconductor quantum dots. In this specific case non-Poisson statistics seems to obey renewal theory, and consequently modulation (or superstatistics) cannot be invoked to explain the emergence of this kind of deviation from the canonical statistical condition. Even in this case, however, additional research work is required, even if preliminary results have been found 22 which confirm with surprising accuracy the renewal character of these non-Poisson processes, recently revealed by the statistical analysis of Ref. 21. There are, furthermore, plausible reasons to believe that the conversion of the blinking quantum dot intermittent fluorescence into a symbolic sequence, and of this into a diffusion process 27, generates different scaling properties, according to whether renewal or modulation is adopted: modulation with no events is expected 26 to yield the generalized diffusion equation of Ref. 28, conflicting with the renewal picture of Ref. 29, thereby shedding light on the trajectory-density conflict 28,30,31 produced by the transition from the Poisson to the non-Poisson condition 32,33. We think, therefore, that it would be convenient for the researchers of this area to study complex processes with the help of a technique of statistical analysis converting experimental sequences into diffusion processes 27, as well as by means of the aging experiment proposed in this paper.
References
1. H. Dehmelt, Bull. Am. Phys. Soc. 20, 60 (1975). 2. R.J. Cook, H.J. Kimble, Phys. Rev. Lett. 54, 1023 (1985). 3. M. Kuno, D.P. Fromm, H.F. Hamann, A. Gallagher, and D.J. Nesbitt, J. Chem. Phys. 115, 1028 (2001). 4. R. Metzler, T. F. Nonnenmacher, International Journal of Plasticity 19, 941 (2003). 5. G. W. Scott, Nature 152, 152 (1943). 6. R. Metzler, J. Klafter, Journal of Non-Crystalline Solids 305, 81 (2002). 7. C. Tsallis, J. Stat. Phys. 52, 469 (1988). 8. P. Manneville, J. Physique 41, 1235 (1980). 9. G.M. Zaslavsky, Phys. Rep. 371, 461 (2002). 10. R. Verberk, A. M. van Oijen, and M. Orrit, Phys. Rev. B 66, 233202 (2002). 11. J.P. Bouchaud, J. Phys. I (France) 2, 1705 (1992). 12. J. P. Bouchaud and D.S. Dean, J. Phys. I (France) 5, 265 (1995). 13. P. Allegrini, P. Grigolini, and A. Rocco, Phys. Lett. A 233, 309 (1997). 14. M. Compiani, T. Fonseca, P. Grigolini, and R. Serra, Chem. Phys. Lett. 114, 503 (1985). 15. M. F. Shlesinger and B. D. Hughes, Physica A 109, 597 (1981). 16. C. Beck, E.G.D. Cohen, Physica A 322, 267 (2003) and E.G.D. Cohen, Physica D 193, 35 (2004). 17. C. Beck, Phys. Rev. Lett. 87, 180601 (2001). 18. M. Bologna, P. Grigolini, M. Pala, L. Palatella, Chaos, Solitons & Fractals 17, 601 (2003). 19. T. Geisel, A. Zacherl, and G. Radons, Phys. Rev. Lett. 59, 2503 (1987). 20. G. Aquino, M. Bologna, P. Grigolini and B. J. West, Phys. Rev. E 70, 036105 (2004). 21. X. Brokmann, J.-P. Hermier, G. Messin, P. Desbiolles, J.-P. Bouchaud, and M. Dahan, Phys. Rev. Lett. 90, 120601 (2003). 22. S. Bianco, P. Grigolini, P. Paradisi, in preparation. 23. P. Allegrini, G. Aquino, P. Grigolini, L. Palatella, and A. Rosa, Phys. Rev. E 68, 056123 (2003). 24. E. Barkai, Phys. Rev. Lett. 90, 104101 (2003). 25. R. Kubo, Adv. Chem. Phys. 15, 101 (1969). 26. F. Barbi, P. Grigolini, P. Paradisi, work in progress. 27. N. Scafetta, P. Grigolini, Phys. Rev. E 66, 036130 (2002). 28. M. Bologna, P. Grigolini and B. J. West, Chem. Phys. 284, 115 (2002). 29. P. Allegrini, J. Bellazzini, G. Bramanti, M. Ignaccolo, P. Grigolini, and J. Yang, Phys. Rev. E 66, 015101 (2002). 30. G. Aquino, L. Palatella, P. Grigolini, Phys. Rev. Lett. 93, 050601 (2004). 31. I.M. Sokolov, A. Blumen and J. Klafter, Europhys. Lett. 56, 175 (2001). 32. P. Allegrini, P. Grigolini, L. Palatella and B. J. West, Phys. Rev. E 70, 046118 (2004). 33. P. Allegrini, G. Aquino, P. Grigolini, L. Palatella, A. Rosa, B. J. West, cond-mat/0409600. 34. P. Grigolini, Adv. Chem. Phys. 62, 1 (1985).
NONEQUIVALENT ENSEMBLES AND METASTABILITY
HUGO TOUCHETTE
School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, UK
RICHARD S. ELLIS
Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003, USA
This paper reviews a number of fundamental connections that exist between nonequivalent microcanonical and canonical ensembles, the appearance of first-order phase transitions in the canonical ensemble, and thermodynamic metastable behavior.
1. Introduction
The goal of this short paper is to trace a line of relationships that goes from the phenomenon of nonequivalent microcanonical and canonical ensembles to that of thermodynamic metastability. Our approach will aim for the most part at stressing the physics of these relationships, but care will also be taken to formulate them in a precise mathematical language. However, due to the limitation in available space, many mathematical details will have to be left aside, including the proofs of all the results stated here. References containing these proofs, when they exist, will be mentioned to assist the reader. Another more complete paper that treats these relationships with full mathematical details is also in preparation,12 based on the recent doctoral dissertation of one of us.21 Far from being exhaustive, we hope that this short review can serve as a starting point in the literature for readers interested in knowing about nonequivalent ensembles, as well as for those interested in phase transitions and metastable behavior in many-body systems. These proceedings are a testament to the fact that there remain at present many unsolved problems related to metastability, and it is our belief that what has been learned in studies of nonequivalent ensembles could yield useful clues for solving these problems. Perhaps the most obvious of these clues is the fact that many of the systems described in these pages (see, e.g., the contributions on the HMF model) exhibit negative values of the heat capacity at fixed energies at the same time that they exhibit metastable states. The negativity of the heat capacity at fixed energy is well known to be related to the nonequivalence of the microcanonical and canonical ensembles. It is also, as we will see here, a direct indication of metastable behavior.
2. Nonequivalent ensembles
The equivalence of the microcanonical and canonical ensembles is most usually explained by saying that although the canonical ensemble is not a fixed-mean-energy ensemble like the microcanonical ensemble, it must 'converge' to a fixed-mean-energy ensemble in the thermodynamic limit, and so must become or must realize a microcanonical ensemble in that limit.11,22 This explanation is not far from being entirely valid, but there is a problem with it: the canonical ensemble may not in fact realize at equilibrium all the mean energies that can be realized in the microcanonical ensemble.22 In other words, the range of the equilibrium mean energy u_β realized in the canonical ensemble by fixing the inverse temperature β may be only a subset of the range of definition of the mean energy u itself. If this is the case, then the microcanonical ensemble must be richer than the canonical ensemble because there are values of the mean energy that can be assessed within the microcanonical ensemble, but not within the canonical ensemble. The two ensembles must therefore be nonequivalent. To see how this possibility can arise, and how it is related in fact to the nonconcavity of the microcanonical entropy function, let us introduce some notation. We consider, as is usual in statistical mechanics, an n-body system with Hamiltonian U and mean entropy s(u) = S(U)/n, where u = U/n is the mean energy. To state our first result, we need to define an important concept in convex analysis known as a supporting line.19,9 This is done as follows: we say that s admits a supporting line at u if there exists β ∈ R such that s(v) ≤ s(u) + β(v − u) for all admissible v. From a geometric point of view, the requirement of a supporting line should be clear: it means that we can draw a line above the graph of s(u) that passes only through the point (u, s(u)); see Fig. 1. The slope of this line is β.
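The supporting-line condition can be checked numerically by comparing s with its concave envelope, obtained from a double Legendre-Fenchel transform. The following sketch (with an illustrative toy entropy of our own choosing, not a model from the text) flags the grid points u at which s admits a supporting line:

```python
import numpy as np

# Toy nonconcave entropy (illustrative choice): a double-hump function whose
# concave envelope bridges the two maxima with a straight segment.
u = np.linspace(0.0, 1.0, 2001)
s = -16.0 * (u - 0.5)**4 + 4.0 * (u - 0.5)**2

# phi(b) = inf_u [b*u - s(u)]  (Legendre-Fenchel transform of s)
beta = np.linspace(-6.0, 6.0, 2401)
phi = np.min(np.outer(beta, u) - s, axis=1)

# s**(u) = inf_b [b*u - phi(b)] is the concave envelope of s; s admits a
# supporting line at u exactly where s coincides with its envelope.
s_env = np.min(np.outer(u, beta) - phi, axis=1)
supports = np.isclose(s, s_env, atol=1e-3)

print("fraction of u values admitting a supporting line: %.2f" % supports.mean())
```

For this toy function the interval between the two humps admits no supporting line, so the printed fraction is strictly between 0 and 1.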
Theorem 2.1. Let u_β be the value of the mean Hamiltonian realized at equilibrium in the canonical ensemble with inverse temperature β. (There can be more than one equilibrium value.) Then, for any admissible mean energy value u, there exists β such that u_β = u if and only if s admits a supporting line at u with slope β. This simple result seems to have floated in the minds of physicists for a long time. It is implicit, for example, when considering the physical meaning of first-order phase transitions in the canonical ensemble and their connection with nonconcave entropies.20,17,13,18,14,15,6,16 However, to the best of our knowledge, there has never been a clear formulation of this result until recently.21,22 This can be explained in part by the fact that the concept of a supporting line is not well known in physics. The full application of our first theorem is presented in Fig. 1, which shows the plot of a generic entropy function having a nonconcave part. This figure depicts three possible cases: (a) Mean energy value u for which s admits a supporting line. In this case, the value u can be realized at equilibrium in the canonical ensemble by setting β equal to the slope of the supporting line passing through (u, s(u)). We naturally expect
Figure 1. (a) Concave point of the microcanonical entropy function s(u) which admits a supporting line. (b) Two concave points of s(u) which admit the same supporting line. (c) Nonconcave point of s(u) which does not admit a supporting line.
in this case to see the microcanonical ensemble at u give the same equilibrium predictions as the canonical ensemble at β, since the latter ensemble reduces to a single-mean-energy ensemble with u_β = u. Note that β must be such that s′(u) = β if s is differentiable. (b) There exists a single supporting line that touches two points of the graph of the microcanonical entropy. In this case, not one but two values of the mean energy (e.g., u_l and u_h in Fig. 1) are realized at equilibrium in the canonical ensemble for β corresponding to the slope of the supporting line. This situation, as will be clear in the next section, corresponds to a state of coexisting phases which universally signals the onset of a first-order phase transition in the canonical ensemble. (c) Mean energy u for which s admits no supporting line. This also applies for all u ∈ (u_l, u_h). Theorem 2.1 states for this case that the canonical ensemble must be blind to the properties of the microcanonical ensemble since it cannot realize at equilibrium any of the mean energies u ∈ (u_l, u_h) for any value of β. This means in particular that the standard thermodynamic relation β = s′(u) ceases to be valid in this case. The next theorem relates this case of nonequivalent ensembles with the occurrence of negative values of the heat capacity in the microcanonical ensemble.14,15 Theorem 2.2. Define the microcanonical heat capacity at the mean energy value u by c(u) = −s′(u)² s″(u)⁻¹. If c(u) < 0, then s does not have a supporting line at u. This result is a new formulation (again because of the use of supporting lines) of an old result that relates the negativity of c with the nonequivalence of the microcanonical and canonical ensembles.
Usually what is concluded is that these two ensembles must be nonequivalent when c < 0 because the heat capacity can never be negative in the canonical ensemble.20,17,18,14,15 Our formulation has the advantage of stressing the physical root of negative heat capacities, namely that the mean energies u which are such that c(u) < 0 are not equilibrium mean energies in the canonical ensemble. This point will be discussed further in Section 4. For now, let us note in closing this section that the negativity of c is only a sufficient
condition for ensemble nonequivalence, not a necessary one.11,22 Thus it is not true that the canonical ensemble is blind to the microcanonical ensemble only for those mean energy values u such that c(u) < 0, as is often claimed.20,17,18,14,15 As we have seen, the canonical ensemble is in fact blind to the microcanonical ensemble for all u at which s admits no supporting lines, and that, in general, comprises more values of u than only those having c(u) < 0; see, e.g., Fig. 1.
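This containment (the set of u with c(u) < 0 sits strictly inside the set of u without supporting lines) can be illustrated numerically; the entropy used below is an illustrative toy choice, not a model from the text:

```python
import numpy as np

# Toy double-hump entropy (illustrative choice).  Its concave envelope bridges
# the two maxima, so the interval between them admits no supporting line.
u = np.linspace(0.0, 1.0, 2001)
x = u - 0.5
s = -16.0 * x**4 + 4.0 * x**2
spp = -192.0 * x**2 + 8.0   # s''(u); c(u) = -s'(u)^2/s''(u) < 0 where s'' > 0 (and s' != 0)

# supporting-line test via the double Legendre-Fenchel transform
beta = np.linspace(-6.0, 6.0, 2401)
phi = np.min(np.outer(beta, u) - s, axis=1)
s_env = np.min(np.outer(u, beta) - phi, axis=1)
no_line = ~np.isclose(s, s_env, atol=1e-3)

# every u with s'' > 0 (negative heat capacity) lacks a supporting line ...
print(np.all(no_line[spp > 0]))
# ... but some u lacking a supporting line still have s'' < 0, i.e. c(u) > 0
print(np.any(no_line & (spp < 0)))
```

Both tests print True: the no-supporting-line interval contains the whole negative-heat-capacity region plus concave margins on either side of it.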
3. Nonequivalent ensembles and first-order phase transitions
The previous section makes it clear that what is responsible for the nonequivalence of the microcanonical and canonical ensembles is the occurrence of a first-order phase transition in the canonical ensemble. To be sure, just replace the word 'blind' with the word 'skip' to obtain a sentence such as: the microcanonical and canonical ensembles are nonequivalent because the canonical ensemble skips over an interval of mean energies which can be accessed microcanonically.20,17,18,14,15 The inverse temperature at which the canonical ensemble skips over the microcanonical ensemble corresponds, not surprisingly, to the inverse temperature at which a first-order phase transition appears. This is the subject of the next theorem, which relates the nonconcavity property of s(u) with the differentiability property of the free energy function φ(β), the central thermodynamic quantity of the canonical ensemble which is taken here to be defined by the limit
φ(β) = lim_{n→∞} −(1/n) ln Z_n(β),   (1)
where Z_n(β) is the standard n-body partition function.
Theorem 3.1. Assume that s admits no supporting lines for all u ∈ (u_l, u_h). Then φ is non-differentiable at a critical value β_c equal to the slope of the supporting line that bridges u_l and u_h. The left- and right-derivatives of φ at β_c equal u_h and u_l, respectively. This theorem is a direct result of the fact that φ(β) is the Legendre-Fenchel transform of s(u), whether s(u) is concave or not, and of some basic properties of these transforms.19,9 It can be found in many works20,17,18,14,15 which do not define, however, the concavity of s(u) in terms of supporting lines. This is a minor omission because most of these works use an equivalent method for defining the range of nonconcavity of s(u) based on the so-called Maxwell construction. In any case, it is clear in all the works just cited that the nonequivalence of the microcanonical and canonical ensembles arises as a consequence of first-order phase transitions in the canonical ensemble. The nonconcavity of s(u), which translates into a 'back-bending' shape of s′(u), is in fact sometimes taken as a definition or a probe of canonical first-order phase transitions.13,16 The opposite is also possible; that is, it is possible to relate the absence of a first-order phase transition in the canonical ensemble with the equivalence of the microcanonical and canonical ensembles.11 This is done in the next theorem.
Theorem 3.2. If φ is differentiable at β, then s admits a strict supporting line that touches the graph of s only at u = φ′(β). This result implies the following standard result: if φ is differentiable at β, then u = φ′(β) is the unique mean energy value realized at equilibrium in the canonical ensemble with inverse temperature β.
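The kink of φ described in Theorem 3.1 can be seen in a small numerical sketch: for a toy nonconcave entropy (our own illustrative choice, constructed so that β_c = 1), the left- and right-derivatives of φ at β_c converge to the two bridged energies u_h and u_l:

```python
import numpy as np

# Toy double-hump entropy with a linear tilt, chosen so that the supporting
# line bridging the two humps has slope beta_c = 1 (illustrative, not from
# the text); the bridged energies are u_l ~ 0.146 and u_h ~ 0.854.
u = np.linspace(0.0, 1.0, 20001)
s = -16.0 * (u - 0.5)**4 + 4.0 * (u - 0.5)**2 + u

def phi(beta):
    # phi(beta) = inf_u [beta*u - s(u)], evaluated on the grid
    return np.min(beta * u - s)

eps = 1e-4
left = (phi(1.0) - phi(1.0 - eps)) / eps    # -> u_h (left-derivative at beta_c)
right = (phi(1.0 + eps) - phi(1.0)) / eps   # -> u_l (right-derivative at beta_c)
print(left, right)   # left ~ 0.854, right ~ 0.146: phi has a kink at beta_c
```

The jump of φ′ from u_h to u_l as β crosses β_c is precisely the latent-heat signature of a canonical first-order transition.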
4. Nonequivalent ensembles and metastability
The last set of results that we will discuss directly pertains to the mean energies which can be assessed microcanonically but not canonically. What we want to show is that these nonequivalent mean energies correspond to nonequilibrium critical mean energies of the canonical ensemble. This is somewhat obvious given that they cannot be equilibrium mean energies; however, what we want to discuss more specifically is the physical nature of these nonequilibrium critical points. To do so, we have to note that the values u_β of the mean energy that are realized at equilibrium in the canonical ensemble at β are, by definition, the global minima of the function F_β(u) = βu − s(u), which we call the nonequilibrium free energy function.21,22 This implies in particular that u_β must satisfy ∂_u F_β(u_β) = 0, or equivalently β = s′(u_β), assuming that s is differentiable. Note however (and this is the crucial point here) that not all the points u satisfying β = s′(u) may globally minimize F_β(u); some of these critical points may actually correspond to a local minimum of F_β(u) or even a local maximum of F_β(u). To determine the precise nature of these nonequilibrium canonical critical points, we can look at the sign of the second u-derivative of F_β(u) to obtain the following result. Theorem 4.1. Suppose that s does not admit a supporting line at u. (a) If c(u) > 0, then u is a metastable mean energy of the canonical ensemble, in the sense that it is a local but not global minimum of F_β(u) for β = s′(u). (b) If c(u) < 0, then u is an unstable mean energy of the canonical ensemble, in the sense that it is a local maximum of F_β(u) for β = s′(u). While this result applies to the mean energy, it is interesting to see if anything can be said about general macrostates: e.g., the magnetization or the distribution of states.
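A minimal numerical illustration of Theorem 4.1 (using a toy nonconcave entropy of our own choosing; the stable/metastable/unstable labels are read off from the shape of F_β(u)):

```python
import numpy as np

# Toy nonconcave entropy (illustrative choice): two humps at u ~ 0.25 and
# u ~ 0.90 separated by a nonconcave dip.
u = np.linspace(0.0, 1.0, 20001)

def s(v):   return -16.0 * (v - 0.5)**4 + 4.0 * (v - 0.5)**2 + v
def sp(v):  return -64.0 * (v - 0.5)**3 + 8.0 * (v - 0.5) + 1.0   # s'(u)

def classify(u0):
    # u0 is a critical point of F_beta(u) = beta*u - s(u) when beta = s'(u0);
    # its nature follows from the local and global shape of F_beta.
    F = sp(u0) * u - s(u)
    i = int(round(u0 * (len(u) - 1)))
    local_min = F[i] <= F[i - 1] and F[i] <= F[i + 1]
    if not local_min:
        return "unstable"        # local maximum of F_beta, i.e. c(u0) < 0
    if np.isclose(F[i], F.min(), atol=1e-6):
        return "stable"          # global minimum: true canonical equilibrium
    return "metastable"          # local but not global minimum, i.e. c(u0) > 0

print(classify(0.25))   # hump inside the nonconcave region -> metastable
print(classify(0.50))   # dip with s'' > 0 -> unstable
print(classify(0.95))   # concave region with a supporting line -> stable
```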
We all know, for instance, that phase transitions in spin systems can be revealed at the level of the mean energy (thermodynamic level) or at the level of the magnetization (macrostate level) since both levels are related in a one-to-one fashion. Is the same true for metastability? That is, can the metastable behavior of a system be revealed at the macrostate level? If so, can this macrostate level of metastability be related to the thermodynamic level of metastability defined with respect to the mean energy? The answer to these questions is yes, so long as we are concerned with mean-field systems, which are basically systems for which the Hamiltonian U can be expressed as a function of some macrostate m of interest. In this case, we can formulate the
following result about the macrostate values m_u which are realized at equilibrium in the microcanonical ensemble with mean energy u, but not in the canonical ensemble for any β. The result is formulated in terms of the nonequilibrium free energy F_β(m), which is the macrostate generalization of F_β(u).
Theorem 4.2. Suppose that s does not admit a supporting line at u. (a) If c(u) > 0, then m_u is a metastable macrostate of the canonical ensemble, in the sense that it is a local but not global minimum of F_β(m) for β = s′(u). (b) If c(u) < 0, then m_u is an unstable macrostate of the canonical ensemble, in the sense that it is a saddle-point of F_β(m) for β = s′(u). What this result says physically is that a macrostate value m_u which is stable in the microcanonical ensemble can become unstable and thus decay in time if we release the energy constraint and fix the inverse temperature instead, as in the canonical ensemble. The precise way in which m_u decays in the canonical ensemble to a different equilibrium value m_β is determined by the local geometry of F_β(m) around m_u, which is determined, in turn, by the sign of c(u). For more details on this result, the reader is referred to two papers9,10 which contain the result of Theorem 4.2 in a more or less conjectured form. A proof of this theorem can be found in a recent proceedings paper of Campa and Giansanti.5 Another proof will be presented elsewhere.12
5. Concluding remarks
The present paper hardly exhausts the subject of nonequivalent ensembles and metastability. In going further, we could have reviewed recent works on the dynamics of nonequivalent states in the canonical ensemble,1,2,4 as well as the dynamical stability of these states, which is discussed, for example, in Anteneodo's contribution to these proceedings using an approach based on Vlasov's equation. We could have alluded also to the fact that nonconcave entropies are seen in fields as disconnected as string theory7 and multifractal analysis.3 Finally, we could have mentioned our recent work on a generalization of the canonical ensemble which aims at converting unstable and metastable states of the canonical ensemble into stable, equilibrium states of a modified canonical ensemble so as to recover equivalence with the microcanonical ensemble.8 Research is ongoing on this topic.
Acknowledgments
One of us (H.T.) would like to thank the organizing committee of the Complexity, Metastability and Nonextensivity Conference for its hospitality and for financial support. The research of H.T. was supported by NSERC (Canada) and the Royal Society of London, while that of R.S.E. was supported by the National Science Foundation (NSF-DMS-0202309).
References
1. M. Antoni, S. Ruffo, A. Torcini, Phys. Rev. E 66, 025103 (2002).
2. M. Antoni, S. Ruffo, A. Torcini, Europhys. Lett. 66, 645 (2004).
3. C. Beck, F. Schlögl, Thermodynamics of Chaotic Systems (Cambridge University Press, Cambridge, 1993).
4. F. Bouchet, Phys. Rev. E 70, 036113 (2004).
5. A. Campa, A. Giansanti, Physica A 340, 170 (2004).
6. Ph. Chomaz, F. Gulminelli, V. Duflot, Phys. Rev. E 64, 046114 (2001).
7. M.A. Cobas, M.A.R. Osorio, M. Suárez, Phys. Lett. B 601, 99 (2004).
8. M. Costeniuc, R.S. Ellis, H. Touchette, B. Turkington, cond-mat/0408681.
9. R.S. Ellis, K. Haven, B. Turkington, J. Stat. Phys. 101, 999 (2000).
10. R.S. Ellis, K. Haven, B. Turkington, Nonlinearity 15, 239 (2002).
11. R.S. Ellis, H. Touchette, B. Turkington, Physica A 335, 518 (2004).
12. R.S. Ellis, H. Touchette, Nonequivalent ensembles, metastability, and first-order phase transitions, in preparation (2005).
13. G.L. Eyink, H. Spohn, J. Stat. Phys. 70, 833 (1993).
14. D.H.E. Gross, Phys. Rep. 279, 119 (1997).
15. D.H.E. Gross, Microcanonical Thermodynamics: Phase Transitions in "Small" Systems, Lecture Notes in Physics, Vol. 66 (World Scientific, Singapore, 2001).
16. F. Gulminelli, Ph. Chomaz, Phys. Rev. E 66, 046108 (2002).
17. P. Hertel, W. Thirring, Ann. Phys. (NY) 63, 520 (1971).
18. D. Lynden-Bell, Physica A 263, 293 (1999).
19. R.T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, 1970).
20. W. Thirring, Z. Physik 235, 339 (1970).
21. H. Touchette, Equivalence and nonequivalence of the microcanonical and canonical ensembles: a large deviations study, Ph.D. Thesis, McGill University (2003).
22. H. Touchette, R.S. Ellis, B. Turkington, Physica A 340, 138 (2004).
Applications in Physics
STATISTICAL PHYSICS FOR COSMIC STRUCTURES
LUCIANO PIETRONERO
Dipartimento di Fisica, Università di Roma "La Sapienza", P.le A. Moro 2, 00185 Rome, Italy
FRANCESCO SYLOS LABINI
"Enrico Fermi Center", Via Panisperna 89 A, Compendio del Viminale, 00184 Rome, Italy and Istituto dei Sistemi Complessi CNR, Via dei Taurini 19, 00185 Rome, Italy
Galaxy structures form a complex irregular pattern, characterized by clusters of galaxies which are organized in filaments around large voids. Recently the Sloan Digital Sky Survey project has been extending our knowledge of the galaxy distribution by discovering new and larger structures. The fractal properties are becoming clearer and better established, and the issue of a possible crossover toward homogeneity at large scales is being revived. We review the main statistical properties of galaxy distributions, discussing the main open theoretical questions from the perspective of complex systems.
1. Introduction
In the past twenty years observations have provided several three-dimensional maps of the galaxy distribution, from which there is growing evidence of large scale structures. This important discovery has been possible thanks to the advent of large redshift surveys: angular galaxy catalogs are in fact essentially smooth and structureless. In Fig. 1 we show a slice of the Center for Astrophysics galaxy catalog (CfA2), which was completed in the early nineties, and a slice constructed from the recent observations of the Sloan Digital Sky Survey (SDSS) project. In the CfA2 catalog, which was one of the first maps surveying the local universe, the giant "Great Wall" was discovered, a filament linking several groups and clusters of galaxies, with an extension of about 200 Mpc/h^a and whose size is limited by the sample boundaries. Recently the SDSS project has revealed the existence of structures larger than the
^a The typical mean separation between nearest galaxies is about 0.1 Mpc. By local universe one means scales in the range [1, 100] Mpc/h, where space geometry is basically Euclidean and dynamics is Newtonian, i.e. effects of General Relativity are negligible. On larger scales instead, space geometry becomes an important element and relativistic corrections start to play a role in the determination of the geometry and dynamics. The size of the universe, according to standard cosmological models, is about 5000 Mpc/h. 1 Mpc ≈ 3 × 10^22 m; distances are given in units of h, a parameter which is in the range [0.5, 0.75] reflecting the uncertainty in the value of the Hubble constant (H = 100 h km/sec/Mpc) used to convert redshift z into distance d = cz/H (where c is the velocity of light).
Great Wall, and in particular in Fig. 1 one may notice the so-called "Sloan Great Wall", which is almost twice as long as the Great Wall. Nowadays this is the most extended structure ever observed, covering about 400 Mpc/h, and its size is again limited by the boundaries of the sample13.
Figure 1. Progress in redshift surveys: shown are the "slice of the universe" from the CfA2 redshift survey (De Lapparent, Huchra, Geller, 1989) and the new SDSS data (Gott et al. 2003). This cone diagram represents the reconstruction of a thin slice observed from the Earth, which is at the bottom. The CfA2 slice has a depth of 150 Mpc/h, while the SDSS slice has a depth of 300 Mpc/h. The "Great Wall" in the CfA2 slice and the new "Sloan Great Wall" in the SDSS slice are the dominant structures in these maps and they are clearly recognizable. For comparison we also show a small circle of size 5 Mpc/h, the typical clustering length separating the regime of large and small fluctuations according to the standard analysis (elaboration from GSLJP).
The search for the "maximum" size of galaxy structures and voids, beyond which the distribution becomes essentially smooth, is still an open problem. Instead, the fact that galaxy structures are strongly irregular and form complex patterns has become well established. From the theoretical point of view, the understanding of the statistical characterization of these structures represents the key element to be considered by a physical theory dealing with their formation. The primary questions
that such a situation raises are therefore: (i) what is the nature of galaxy structures and (ii) what is the maximum size of structures? A number of statistical concepts can be used to answer these questions: in general one wants to characterize n-point correlation properties which are able to capture the main features of point distributions12. The main statistical tool traditionally used to describe these structures is the reduced two-point correlation function, which measures the correlation between density fluctuations,
ξ(r) = ⟨n(0) n(r)⟩ / ⟨n⟩² − 1.   (1)
Since the early sixties2’ it has been found that in the range of scales from about 0.1 Mpc/h to 10 Mpc/h, this is a simple power-law function
with γ ≈ 1.8 and r₀ ≈ 4.7 Mpc/h. Note that ξ(r₀) = 1, and the length scale r₀ separates the regime of large fluctuations from the regime of weak clustering (or of small fluctuations). This would correspond to the fact that large scale structures of size of hundreds of Mpc, such as the ones observed in Fig. 1, can be treated as small perturbations of a smooth matter distribution with a well-defined average density beyond r₀ ≈ 5 Mpc/h. Despite the fact that this result has been found through the reconstruction in real space of the angular correlation function estimated in a catalog where only the two angular coordinates and the apparent flux were measured, it has been confirmed by many other authors in many different redshift surveys5, with the additional evidence that its amplitude changes for galaxies of different luminosity in samples of different sizes. This variation in amplitude is usually ascribed a posteriori to an intrinsic difference in the correlation properties of galaxies of different luminosity ("luminosity bias"6): brighter galaxies present larger values of r₀. Theoretically this is interpreted through the concept of biasing15 or, more recently, in terms of "halo models". In simple terms, one may explain the amplification of the amplitude of ξ(r) by sampling, in a correlated manner, a (correlated) point distribution resulting from a gravitational N-body simulation26. In such a way, when one selects only the points lying in the largest non-linear structures present in a simulation, one does not change the small scale correlation properties, but one does alter the value of the average density to which fluctuations are normalized, lowering it23. The net effect is a change of the amplitude of ξ(r), without changing the typical size of strongly correlated structures26. However such a qualitative statement has never been put on quantitative grounds, i.e.
there is no prescription which may say what the amplification is for fluctuations of a given amplitude in the interesting regime of strong clustering (i.e. ξ(r) ≫ 1). Although "luminosity bias" plays the fundamental role of reconciling the presence of large scale structures of hundreds of
Mpc with the detection of a typical clustering scale of 5 Mpc/h, almost twenty years after its definition6, it remains an undetermined fitting scheme to interpret galaxy correlations. The variation in amplitude of ξ_E(r), however, has a much simpler explanation in the context of irregular distributions, for the following reasons (see GSLJP for a comprehensive discussion of the problem). At scales where ξ_E(r) has a power-law behavior and its amplitude is larger than unity, the average density is ill defined: fluctuations are still too large to allow a fair estimation of the average density. For this reason the scale r₀ beyond which ξ(r) < 1 has a crucial importance for the characterization of structures, both phenomenologically, as it sets the minimal size of a sample needed to have a "good" estimation of the average density, and theoretically, as it separates the regimes of large and small fluctuations. However, in general, there can be a problem with the estimation of the homogeneity scale by ξ_E(r₀) = 1: this is related to the fact that the average density itself could have a power-law behavior at scales of the order of, or larger than, the sample size, and thus in such a situation any estimation of the average density, or of the scale r₀, becomes sample size dependent: in this case such a length scale is not intrinsic to galaxy clustering. A direct test to look for the homogeneity scale consists in the study of the conditional density12 defined, for stochastic and isotropic distributions, as
⟨n(r)⟩_p = ⟨n(0) n(r)⟩ / ⟨n⟩,   (3)
so that ⟨n(r)⟩_p dV gives the a priori probability of finding a particle in the infinitesimal volume dV around r⃗, with the condition that the origin of coordinates is occupied by a particle; i.e., it represents the average density of particles seen by a fixed particle at a distance r from it. The symbol ⟨X⟩ indicates the ensemble average. One may also consider the integrated conditional density,
⟨n(< r)⟩_p = (1/V(r)) ∫₀^r ⟨n(s)⟩_p 4πs² ds,
which measures the average density of particles in spheres of volume V(r) and radius r, instead of in spherical shells as ⟨n(r)⟩_p does. The characteristic scale λ₀ which marks the transition from the regime of large fluctuations to the one of small fluctuations can be defined as the length scale beyond which the average density becomes well-defined, or beyond which fluctuations become small with respect to the average density, i.e. for example |⟨n(r)⟩_p − n₀| < n₀ for all r > λ₀. The operative way to detect the homogeneity scale (i.e. without knowing a priori the value of n₀) consists in the determination of the flattening of the conditional density ⟨n(r)⟩_p, i.e. when ⟨n(r)⟩_p ≈ n₀ > 0 for all r ≥ λ₀. In such a situation, the exact relation between r₀ and λ₀ depends on the functional behavior of the crossover to homogeneity of ⟨n(r)⟩_p: when it has an abrupt change from a power-law with exponent γ to a constant value, then r₀ ≈ λ₀/2^(1/γ).
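The stated relation between r₀ and λ₀ for an abrupt crossover can be checked directly; the numbers below (γ, λ₀, B) are illustrative values, not fits to data:

```python
import numpy as np

# Conditional density with an abrupt crossover at lambda0:
#   <n(r)>_p = B r^-gamma           for r < lambda0,
#   <n(r)>_p = n0 = B lambda0^-gamma  beyond,
# so that xi(r) = <n(r)>_p / n0 - 1, and xi(r0) = 1 gives
# r0 = lambda0 / 2^(1/gamma).
gamma, lambda0, B = 1.8, 70.0, 1.0     # illustrative values (lambda0 in Mpc/h)
n0 = B * lambda0**-gamma

r = np.linspace(1.0, lambda0, 200001)
xi = (B * r**-gamma) / n0 - 1.0

r0_numeric = r[np.argmin(np.abs(xi - 1.0))]
r0_formula = lambda0 / 2.0**(1.0 / gamma)
print(r0_numeric, r0_formula)          # both close to 47.6 Mpc/h here
```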
We noticed that in many galaxy samples, up to a depth of tens of Mpc where the full volume average could be performed, ⟨n(r)⟩_p = Br^(−γ), where B is a constant related to the lower cut-off of the distribution. The most evident interpretation of such power-law correlations is that galaxy structures form a statistically isotropic stochastic fractal: in such a case the exponent is related to the fractal dimension D through the equation γ = 3 − D. Such a fact opens a simple way to understand the variation of r₀ observed in different catalogs. In fact, it is simple to show24 that in a fractal point distribution contained in a spherical sample of radius R_s and volume V = 4πR_s³/3, with N points, the estimator of ξ(r) can be written as
ξ_E(r) = ⟨n(r)⟩_p^E / n_E − 1,   (4)
given that the estimator of the sample density is, for instance, n_E = N/V ≈ (3/D) B R_s^(D−3), and, for isotropic self-averaging fractals, ⟨n(r)⟩_p^E = Br^(−γ) as long as the volume average can be properly performed. The latter equation makes clear that ξ_E(r) has a power-law behavior up to the sample-dependent scale r₀ = (D/6)^(1/γ) R_s. Its exponent is, however, slightly distorted when estimated at scales of order r₀. Recently a team of the SDSS collaboration has measured the full-sphere estimator of ⟨n(r)⟩_p as a function of scale in a sample of the SDSS survey which covers, to date, the largest volume of space ever considered for such an analysis, with very robust statistics and precise photometric calibration (see Fig. 2). They found that: (i) There is clearly a "fractal regime", with a dimension D ≈ 2, which appears to terminate somewhere between 20 and 30 Mpc/h; this behavior agrees very well with what we found at the scales we could probe properly (i.e. by making the full volume average) with the samples at our disposal a few years ago24,18. Note that the exponent (γ = 1) is different from the standard quoted value γ ≈ 1.8, and this is due to the sharp break of ξ_E(r) around r₀ evidenced by Eq. 4. (ii) The data then show a slow transition to homogeneity in the range 30 < r < 70 Mpc/h, where a flattening of the conditional density seems to occur for scales larger than λ₀ ≈ 70 Mpc/h: the tendency toward homogeneity shown near 70 Mpc/h in the SDSS data occurs at a scale comparable with the sample size, precisely where its statistical validity becomes weaker. Often in the past, samples have shown finite size effects which produced this type of behavior, which was then eliminated by deeper samples. For example, such a high value of λ₀ implies that all previous determinations of r₀ are biased by the finite size effect expressed by Eq. 4. In fact the estimated r₀ has grown by about a factor of 3, from 5 Mpc/h to about 13 Mpc/h in the most recent data. Whether the latest measurement will remain stable in future larger samples is a key issue to be determined, and this is directly related to the reality of the flattening at 70 Mpc/h. The issue will be clarified soon, as the volume surveyed by the SDSS will increase rapidly in the near future.
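Eq. 4 implies the sample-size dependence r₀ = (D/6)^(1/γ) R_s quoted above; a quick numerical check (with illustrative values D = 2, γ = 1):

```python
import numpy as np

# For a fractal sample of radius R_s: xi_E(r) = (D/3) (r/R_s)^-gamma - 1
# (Eq. 4 with n_E = (3/D) B R_s^(D-3) and <n(r)>_p^E = B r^-gamma), so the
# apparent correlation length solving xi_E(r0) = 1 scales linearly with R_s.
D = 2.0
gamma = 3.0 - D
for R_s in (20.0, 50.0, 100.0):            # sample radii in Mpc/h (illustrative)
    r = np.linspace(0.01 * R_s, R_s, 100001)
    xi_E = (D / 3.0) * (r / R_s)**-gamma - 1.0
    r0 = r[np.argmin(np.abs(xi_E - 1.0))]
    print(R_s, r0, (D / 6.0)**(1.0 / gamma) * R_s)   # r0 ~ R_s / 3 for D = 2
```

The estimated r₀ tracks the sample radius, which is the finite-size effect invoked in the text to explain the growth of r₀ in deeper catalogs.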
Figure 2. Behavior of the conditional density in the SDSS data and in a sample of the CfA2 catalog. The agreement between the two different samples is excellent up to 20 to 30 Mpc/h.
Figure 3. Fluctuation field of the cosmic microwave background radiation measured by the WMAP satellite (Bennett et al. 2003). Different colors refer to different temperature fluctuations with respect to the mean temperature of T = 2.73 K and have an amplitude of hundreds of micro Kelvin. The WMAP data have been measured to be consistent with primordial Gaussian fluctuations (Komatsu et al., 2003).
Irrespective of the scale to which a simple fractal behavior is finally found to persist, we would like to stress that the understanding of the main observational issues mentioned above (scaling regime, homogeneity scale, "luminosity bias", etc.) can be achieved by studying the conditional density. In this respect we emphasize three key questions for the clarification of the problem of large scale galaxy correlations. (i) Do galaxies of different luminosity show the same behavior for (n(r))_p? The question can be rephrased as whether galaxy samples constructed with different luminosity thresholds show (or not) the same (n(r))_p behavior and particularly the same homogeneity scale. If this is the case, a variation of the amplitude B is clearly expected for galaxies of different luminosity, while the fractal dimension D can show a weak dependence on luminosity 24. In the past we have collected evidence that (n(r))_p is roughly the same for galaxies of different luminosity, and concluded that the variation of r0 is indeed due to a finite-size effect described by Eq.4. Actually we also related the higher values of r0 found for galaxy clusters (r0 ≈ 15-25 Mpc/h) to the same finite-size effect 24. In this situation a map of galaxy clusters is the coarse-grained representation of the galaxy distribution. (ii) Will the complete SDSS confirm these results at large scales? On large scales (r > 30 Mpc/h) Hogg et al. (2004) provided evidence that there is a slow crossover to homogeneity, which is reached at about 70 Mpc/h. The problem is related to the statistical robustness of these results and to whether the volume is large enough to allow an unambiguous determination of the conditional density at those scales. (iii) What is the nature of the power-law?
As discussed in GSLJP, a power-law behavior of (n(r))_p can be associated with a genuine fractal nature of the underlying distribution, or with a particular distribution of spherically symmetric clusters (known as halos) with power-law density profiles, with the additional presence of other finite-size effects. These are different structures (the first non-analytical and the second analytical) and thus a specific test is needed which would discriminate between the two. The behavior of the box-counting dimension 26 or of the conditional variance 9 can be ways to study this problem, especially when comparing galaxy data with results of gravitational N-body simulations (see discussion below). We have collected evidence, with a limited statistical confidence, that the box-counting dimension is consistent with the exponent found by the conditional density, thus supporting the fractal nature of galaxy structures. On the other hand, particle distributions extracted from N-body simulations do show a box-counting dimension equal to the space dimension and hence they are not fractal, although their conditional density shows a power-law behavior 26. As for the problem of the homogeneity scale, with the data now becoming available from the SDSS it should be possible to determine the answer to this basic question about the nature of the galaxy distribution more definitively. If a real crossover to homogeneity were to happen at some scale, this would identify the "maximum" size of structures. For example, a homogeneity scale of order 100 Mpc/h would be more in agreement with observations of large scale structures
extending over some hundreds of Mpc, like the Great Wall and the Sloan Great Wall previously mentioned. In this view, such structures are not "isolated" fluctuations or "rare" events, but rather the typical structures expected given the measured behavior of the conditional density.

2. The problem of gravitational structure formation
Let us now discuss the relation of the observed galaxy structures with the initial matter density field in standard cosmological models like the Cold Dark Matter (CDM) one. In these models dark matter plays the dominant role of providing density fluctuations which, on the one hand, are compatible with observations of the Cosmic Microwave Background Radiation (CMBR - see Fig.3) and, on the other hand, are large enough to allow the formation of non-linear structures at the present time. These density fluctuations represent the seed which will give rise, through a complex non-linear dynamics, to the galaxy structures we observe today. In fact the fluctuation field observed in the CMBR, a relic of the high energy processes occurred in the early universe according to standard models, is related to the three dimensional dark matter distribution by considering the (gravitational) interaction between matter and photons. Note that in the CDM scenario (non-baryonic) dark matter has one hundred times the density of baryonic (i.e., visible) matter. Both on large and small scales the correlation properties of the matter distribution would have a quantitative correspondence to the correlation properties of the CMBR fluctuations. Let us discuss the large-scale situation, which corresponds to the large-angle correlation properties of the CMBR fluctuations. The most prominent feature of the initial conditions derived from inflation in the matter distribution is that it presents on large scales super-homogeneous features 10. This means the following. If one considers the paradigm of uniform distributions, the so-called Poisson process where particles are placed completely randomly in space, the mass fluctuations in a sphere of radius R grow as R^3, i.e. like the volume of the sphere. A super-homogeneous distribution is a system where the average density is well defined (i.e. it is uniform) and where fluctuations in a sphere grow slower than in the Poisson case, e.g.
like R^2: in this case one talks about "surface fluctuations" to differentiate them from Poisson-like volume fluctuations. (Note that a uniform system with positive correlations presents fluctuations which grow faster than Poisson.) For example, a perfect cubic lattice of particles is a super-homogeneous system: it is a long-range ordered distribution. In statistical physics, systems of this kind are found, for example, in plasma physics (the so-called one-component plasma 11), and are in general described by a dynamics which at thermal equilibrium gives rise to such configurations. Note that small fluctuations in the one-component plasma are Gaussian 11 and that the WMAP data have been measured to be consistent with primordial Gaussian fluctuations 16: for this reason such a model well captures the main statistical properties of the primordial density field. CDM models, in addition to the condition of super-homogeneity, which in the literature is called the Harrison-Zeldovich condition for the power spectrum (Fourier conjugate of the real space correlation function ξ(r)), have a particular feature in the correlation function: it is positive at small scales, it crosses zero at scales of order 100 Mpc/h and then it is negative, approaching zero with a tail which goes as r^-4 12. The super-homogeneity condition says that the volume integral over all space of the correlation function is zero
∫ d³r ξ(r) = 0    (5)
This means that there is a fine-tuned balance between small-scale positive correlations and large-scale negative anti-correlations 10,12. This is the behavior that one would like to detect in the data in order to confirm inflationary models. Up to now this search has been done through the analysis of the power spectrum, which has to go correspondingly as P(k) ∝ k at small k (large scales). However, in the usual perspective galaxies result from a sampling of the underlying density field (i.e. one selects only the highest fluctuations of the field), and sampling a super-homogeneous fluctuation field may change the nature of correlations 8. The reason can be found in the property of super-homogeneity of such distributions: the sampling, as for instance in the so-called "bias model" (selection of the highest peaks of the fluctuation field), necessarily destroys the surface nature of the fluctuations, as it introduces a volume (Poisson-like) term in the mass fluctuations, giving rise to a Poisson-like power spectrum on large scales, P(k) ∝ const. The "primordial" form of the power spectrum is thus not apparent in that which one would expect to measure from objects selected in this way. This conclusion should hold for any generic model of bias, and its quantitative importance has to be established in any given model 8. On the other hand, one may show 9 that the negative r^-4 tail in the correlation function does not change under sampling: actually one may show that on large enough scales, where in these models correlations are small enough, the biased fluctuation field has a correlation function which is linearly amplified with respect to the underlying dark matter correlation function. For this reason the detection of such a negative tail would be the main confirmation of CDM models 10. Up to now, no quantitative evidence in this respect has been reported, as clearly one should first establish the homogeneity scale of the conditional density in a firm way.
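The contrast between Poisson-like volume fluctuations and super-homogeneous surface fluctuations discussed above can be illustrated with a toy numerical experiment (a sketch with assumed parameters; the helper `count_variance` is our own): the variance of the number of points inside spheres of radius R grows like R^3 for a Poisson point set, but much more slowly for a cubic lattice of the same mean density.

```python
import numpy as np

rng = np.random.default_rng(2)

def count_variance(points, R, box_size, n_centers=400):
    """Variance of the number of points inside randomly placed spheres of
    radius R, with centers kept at distance > R from the sample boundary."""
    centers = rng.uniform(R, box_size - R, size=(n_centers, 3))
    counts = [np.sum(np.linalg.norm(points - c, axis=1) <= R)
              for c in centers]
    return np.var(counts)

L = 20.0
n = int(L ** 3)  # unit mean density
# Poisson sample: completely random positions -> volume fluctuations (~R^3)
poisson = rng.uniform(0.0, L, size=(n, 3))
# cubic lattice at the same density -> "surface" fluctuations (~R^2)
g = np.arange(L)
lattice = np.array(np.meshgrid(g, g, g)).reshape(3, -1).T + 0.5

var_p = {R: count_variance(poisson, R, L) for R in (2.0, 4.0)}
var_l = {R: count_variance(lattice, R, L) for R in (2.0, 4.0)}
```

In a single small realization the variance estimates are noisy, but the lattice variance is systematically far below the Poisson one at the same radius, which is the defining signature of super-homogeneity.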
In other words, according to these models, one expects that there is a transition from a strongly clustered (fractal) regime on small scales to a highly uniform (super-homogeneous) regime on large scales. Let us now briefly discuss the problem of the formation of strongly clustered structures. The main instrument for treating the theoretical problem of gravitational clustering in the regime of strong (non-linear) fluctuations is numerical (in the form of N-body simulations - NBS), and the analytic understanding of this crucial problem is very limited. The study of NBS is interesting for fundamental reasons concerning the nature of the gravitational many-body problem 22 and for the comparison of the formed structures with observations 17. Concerning the former
issue, we have already mentioned the problem of "biasing", which is the underlying scheme for the comprehension of the compatibility of the observed structures with the ones formed in NBS. Moreover, we also noticed that structures in the NBS appear to be non-fractal, although they show a power-law decaying conditional density 26. If it is confirmed that galaxy structures are instead fractal, this will pose a fundamental problem for the pure gravitational explanation of the nature of the observed strong clustering of galaxies. On the other hand, NBS are a useful tool to study some fundamental issues in the gravitational many-body problem. One of the central questions is whether one may have some analytical predictions which relate the observed power-law in the correlation function of the particles at late times with some features of the initial particle configuration. For example, it has been recently observed 25 that a broad class of gravitational NBS shows a universal behavior in the non-linear clustering which develops, characterized by the exponent of the two-point power-law correlation function. The nature of clustering in the non-linear regime is associated with what is common to all these simulations: their evolution in the non-linear regime is dominated by fluctuations at small scales, which are similar in all cases at the time this clustering develops. Such "shot-noise" fluctuations are in fact intrinsic to any particle distribution. This corresponds to domination by nearest-neighbor interactions when the first non-linear structures are formed. The open problem is that of understanding whether large non-linear structures, which at late times contain many particles, are produced solely by collision-less fluid dynamics, or whether the particle collisional processes are important also in the long term, or whether they are made by a mix of these two effects.
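The box-counting test invoked above for discriminating a genuinely fractal point set from a homogeneous one can be sketched as follows (a minimal illustration with our own helper; a statistically homogeneous sample should return a dimension equal to the space dimension):

```python
import numpy as np

def box_counting_dimension(points, epsilons):
    """Estimate the box-counting dimension: count occupied boxes N(eps)
    at several box sizes eps and fit the slope of log N vs log eps."""
    pts = np.asarray(points)
    counts = []
    for eps in epsilons:
        # map each point to the integer index of the box containing it
        idx = np.floor(pts / eps).astype(int)
        counts.append(len({tuple(i) for i in idx}))
    # linear fit: log N(eps) = -D log eps + const
    slope, _ = np.polyfit(np.log(epsilons), np.log(counts), 1)
    return -slope

# homogeneous points in the unit square: expect D close to 2
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 1.0, size=(20000, 2))
D = box_counting_dimension(pts, [0.05, 0.1, 0.2])
```

For a fractal distribution the occupied-box count would instead scale with a non-integer exponent D smaller than the space dimension, even if the conditional density of both distributions shows the same power law.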
Acknowledgments We thank T. Baertschiger, A. Gabrielli, and M. Joyce for useful discussions.
References
1. Baertschiger, T. & Sylos Labini, F., Phys.Rev.D 69, 123001 (2004)
2. Bennett, C.L., et al., Astrophys.J.Suppl. 148, 1 (2003)
3. Coleman, P.H. & Pietronero, L., Phys.Rep. 231, 311 (1992)
4. Cooray, A. & Sheth, R., Phys.Rep. 372, 1 (2002)
5. Davis, M. & Peebles, P.J.E., Astrophys.J. 267, 465 (1983)
6. Davis, M. et al., Astrophys.J.Lett. 333, L9 (1988)
7. De Lapparent, V., Geller, M. & Huchra, J., Astrophys.J. 343, 1 (1989)
8. Durrer, R., Gabrielli, A., Joyce, M. & Sylos Labini, F., Astrophys.J.Lett. 585, L1 (2003)
9. Gabrielli, A. & Sylos Labini, F., Europhys.Lett. 54, 1 (2001)
10. Gabrielli, A., Joyce, M. & Sylos Labini, F., Phys.Rev.D 65, 083523 (2002)
11. Gabrielli, A., Jancovici, B., Joyce, M., Lebowitz, J., Pietronero, L. & Sylos Labini, F., Phys.Rev.D 67, 043406 (2003)
12. Gabrielli, A., Sylos Labini, F., Joyce, M. & Pietronero, L., "Statistical Physics for Cosmic Structures" (Springer-Verlag, Berlin, 2004) (GSLJP)
13. Gott, J.R. III, Jurić, M., Schlegel, D., Hoyle, F., Vogeley, M., Tegmark, M., Bahcall, N. & Brinkmann, J., astro-ph/0310571
14. Hogg, D.W., Eisenstein, D.J., Blanton, M.R., Bahcall, N.A., Brinkmann, J., Gunn, J.E. & Schneider, D.P., astro-ph/0411197
15. Kaiser, N., Astrophys.J.Lett. 284, L9 (1984)
16. Komatsu, E. et al., Astrophys.J.Suppl. 148, 119 (2003)
17. Jenkins, A., et al., Astrophys.J. 499, 20 (1998)
18. Joyce, M., Montuori, M. & Sylos Labini, F., Astrophys.J. 514, L5 (1999)
19. Joyce, M., Sylos Labini, F., Gabrielli, A., Montuori, M. & Pietronero, L., astro-ph/0501583
20. Joyce, M. & Sylos Labini, F., Astrophys.J.Lett. 554, L1 (2001)
21. Padmanabhan, T., "Structure Formation in the Universe" (Cambridge University Press, Cambridge, 1993)
22. Saslaw, W.C., "The Distribution of the Galaxies" (Cambridge University Press, Cambridge, 2000)
23. Sheth, R.V., Diaferio, A., Hui, L. & Scoccimarro, R., Mon.Not.R.Astron.Soc. 326, 463 (2001)
24. Sylos Labini, F., Montuori, M. & Pietronero, L., Phys.Rep. 293, 66 (1998)
25. Sylos Labini, F., Baertschiger, T. & Joyce, M., Europhys.Lett. 66, 171 (2004)
26. Sylos Labini, F. & Baertschiger, T., preprint
27. Totsuji, H. & Kihara, T., Publ.Astron.Soc.Jpn 21, 221 (1969)
28. York, D., et al., Astron.J. 120, 1579 (2000)
29. Zehavi, I. et al., astro-ph/0411557
METASTABILITY AND ANOMALOUS BEHAVIOR IN THE HMF MODEL: CONNECTIONS TO NONEXTENSIVE THERMODYNAMICS AND GLASSY DYNAMICS
ALESSANDRO PLUCHINO, ANDREA RAPISARDA, VITO LATORA
Cactus Group, Dipartimento di Fisica e Astronomia, Università di Catania and INFN sezione di Catania, Via S. Sofia 64, I-95123 Catania, Italy

We review some of the most recent results on the dynamics of the Hamiltonian Mean Field (HMF) model, a system of N planar spins with ferromagnetic infinite-range interactions. We show, in particular, how some of the dynamical anomalies of the model can be interpreted and characterized in terms of the weak-ergodicity breaking proposed in the framework of glassy systems. We also discuss the connections with the nonextensive thermodynamics proposed by Tsallis.
1. Introduction
Metastability, nonextensivity and glassy dynamics are features so ubiquitous in complex systems that they are often used to characterize them or, more in general, to define complexity 1. In this work we will discuss such features in the context of the so-called Hamiltonian Mean Field (HMF) model, a system of inertial spins with long-range interaction. The model, originally introduced by Antoni and Ruffo 2, has been thoroughly investigated and generalized in recent years for its anomalous dynamical behavior 3,4,5,6,7,8,9,10,11,12,13. With respect to systems with short-range interactions, the dynamics and the thermodynamics of many-body systems of particles interacting with long-range forces, such as the HMF, are particularly rich and interesting. In fact, more and more frequently nowadays, the out-of-equilibrium dynamics of systems with long-range interactions or long-term correlations has shown physical situations which are badly described within the ergodic assumption that is at the basis of the Boltzmann-Gibbs thermostatistics. In all such cases it happens, for instance, that a system of particles kept at constant total energy E does not visit all the a-priori available phase space (the surface of constant energy E), but seems to remain trapped in a restricted portion of that space, giving rise to anomalous distributions that differ from those expected. A few years ago, Tsallis introduced a generalized thermodynamic formalism based on a nonextensive definition of entropy 14. This nonextensive thermodynamics is very useful in describing all those situations characterized by long-range correlations or fractal structures in phase space 6,15,16. On the other hand, the latter feature is
also connected with the so-called "weak ergodicity breaking" scenario, which is at the basis of the long-term relaxation and aging observed in glassy systems. Such systems show competing interactions (frustration) and are characterized by a complex landscape and a hierarchical topology in some high-dimensional configuration space 17, which, in turn, generates a strong increase of relaxation times together with metastable states and weak chaos. The Hamiltonian Mean Field model, considered in this paper, is exactly solvable at equilibrium and exhibits a series of anomalies in the dynamics, such as the presence of quasi-stationary states (QSS) characterized by anomalous diffusion, vanishing Lyapunov exponents, non-Gaussian velocity distributions, aging and fractal-like phase space structure. Furthermore, the model is easily accessible by means of both molecular dynamics and Monte Carlo simulations. Thus, it represents a very useful "laboratory" for exploring metastability and glassy dynamics in systems with long-range interactions. The model can be considered as a minimal and pedagogical model for a large class of complex systems, among which one can surely include self-gravitating systems 13 and glassy systems 10, but also systems apparently belonging to different fields such as biology or sociology. In fact, we recently found similar features also in the context of the Kuramoto model 18, one of the simplest models for synchronization in biological systems 19. Moreover, the proliferation of metastable states just below the critical point in a phase diagram seems to be responsible for the onset of complexity and diverging calculation times in many different kinds of algorithms 20. In this paper we focus on two different aspects of the HMF model: its glassy dynamics and the possible connections with the generalized thermodynamics. The paper is divided into two parts.
In Section 2.1 we investigate the model following the analogy with glassy systems and the "weak ergodicity breaking" scenario. In previous works we have shown that the "thermal explosion", characteristic of initial conditions with finite magnetization, drives the system into a metastable glassy-like regime which exhibits "dynamical frustration". With the aim of characterizing this behavior in a quantitative way, we have explicitly suggested introducing a new order parameter, the "polarization", able to measure the degree of freezing of the rotators (or particles). Here we present new numerical results reinforcing the glassy nature of the QSS's metastability and the hierarchical organization of phase space. In Section 2.2 we investigate the links with nonextensive thermostatistics. In ref.11 we have found that, for a particular class of initial conditions with constant velocity distribution and finite magnetization, the velocity correlations obtained by integrating the equations of motion of the HMF model are well reproduced by q-exponential curves. Here, we show numerical evidence that the superdiffusion observed in the anomalous QSS regime (ref. 5) can be linked with the q-exponential long-term decay of the velocity correlations, as analytically suggested by a formula obtained by Tsallis and Bukman for a nonlinear Fokker-Planck (FP) equation, using an ansatz based on the generalized entropy.
2. Anomalous dynamics in the HMF model
The HMF model was introduced originally in ref.2 with the aim of studying clustering phenomena in N-body systems in one dimension. The Hamiltonian of the ferromagnetic HMF model is:

H = Σ_{i=1}^N p_i²/2 + 1/(2N) Σ_{i,j=1}^N [1 − cos(θ_i − θ_j)]    (1)
where the potential energy is rescaled by 1/N in order to get a finite specific energy in the thermodynamic limit N → ∞. This model can be seen as classical XY spins (inertial rotators) with unitary masses and infinite-range coupling, but it also represents particles moving on the unit circle. In the latter interpretation the coordinate θ_i of particle i is its position on the circle and p_i its conjugate momentum (or velocity). Associating to each particle the spin vector m_i = (cos θ_i, sin θ_i),
it is possible to introduce the following mean-field order parameter:

M = (1/N) |Σ_{i=1}^N m_i|    (2)
representing the modulus of the total magnetization. The equilibrium thermodynamical solution in the canonical ensemble predicts a second-order phase transition from a low-energy condensed (ferromagnetic) phase with magnetization M ≠ 0 to a high-energy (paramagnetic) one, where the spins are homogeneously oriented on the unit circle and M = 0. The caloric curve, i.e. the dependence of the energy density U = H/N on the temperature T, is given by U = T/2 + (1 − M²)/2 2,5. The critical point is at energy density Uc = 3/4, corresponding to a critical temperature Tc = 1/2. At variance with the equilibrium scenario, the out-of-equilibrium dynamics shows, just below the phase transition, several anomalies before complete equilibration. More precisely, if we adopt the so-called M1 initial conditions (i.c.), i.e. θ_i = 0 for all i (M(0) = 1) and velocities uniformly distributed (water bag), the results of the simulations, in a special region of energy values (in particular for 0.68 < U < Uc), show a disagreement with the canonical prediction for a transient regime whose length depends on the system size N. In such a regime, the system remains trapped in metastable states (QSS) with vanishing magnetization at a temperature lower than the canonical equilibrium one, until it slowly relaxes towards the Boltzmann-Gibbs (BG) equilibrium, showing strong memory effects, correlations and aging. This transient QSS regime becomes stable if one takes the infinite-size limit before the infinite-time limit.
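A minimal molecular-dynamics sketch of the HMF equations of motion can be written in a few lines (this is our own symplectic leapfrog implementation with assumed parameters, not the production code behind the figures). The mean-field force on rotator i reduces to −Mx sin θ_i + My cos θ_i, so each step costs O(N) rather than O(N²); energy and total momentum conservation provide a basic correctness check.

```python
import numpy as np

def hmf_force(theta):
    """Mean-field force: theta_i'' = -Mx sin(theta_i) + My cos(theta_i),
    with (Mx, My) = (1/N) sum_j (cos theta_j, sin theta_j)."""
    mx, my = np.cos(theta).mean(), np.sin(theta).mean()
    return -mx * np.sin(theta) + my * np.cos(theta)

def evolve(theta, p, dt, n_steps):
    """Symplectic velocity-Verlet integration of the HMF dynamics."""
    f = hmf_force(theta)
    for _ in range(n_steps):
        p += 0.5 * dt * f
        theta += dt * p
        f = hmf_force(theta)
        p += 0.5 * dt * f
    return theta, p

def energy_per_particle(theta, p):
    """U = <p^2>/2 + (1 - M^2)/2, the specific energy of the model."""
    mx, my = np.cos(theta).mean(), np.sin(theta).mean()
    return 0.5 * (p ** 2).mean() + 0.5 * (1.0 - (mx ** 2 + my ** 2))

rng = np.random.default_rng(3)
N = 500
theta = np.zeros(N)                  # M1 initial condition: M(0) = 1
p = rng.uniform(-2.03, 2.03, N)      # water bag tuned so that U ~ 0.69
p -= p.mean()                        # zero total momentum
U0 = energy_per_particle(theta, p)
theta, p = evolve(theta, p, 0.05, 2000)
U1 = energy_per_particle(theta, p)
```

The water-bag half-width 2.03 is an assumed value chosen so that the specific energy lands near U = 0.69, the region where the QSS anomalies discussed below are observed.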
2.1. Dynamical frustration and hierarchical structure

Given the discovery of correlations and aging, which in turn imply complex trajectories of the system in phase space, it is interesting to explore directly the microscopic evolution of the QSS. This can easily be done by plotting the time evolution of the Boltzmann μ-space, where each particle of the system is represented by a point in a plane characterized by the conjugate variables θ_i and p_i, respectively the angular position and the velocity of the i-th particle. It has been shown 8 that, during the QSS regime, correlations, structures and cluster formation in μ-space appear for the M1 i.c., but not for initial conditions with zero magnetization, the so-called M0 i.c.: in the latter case both the angle and velocity distributions remain homogeneous from the beginning, and a very slow mixing of the rotators has been observed. For the M1 case, the dynamics in μ-space can be clarified through the concept of "dynamical frustration": the clusters appearing and disappearing on the unit circle compete with each other in trapping more and more particles, thus generating a dynamically frustrated situation that puts the system in a glassy-like regime. In Fig.1 we show a molecular dynamics simulation where the complete distribution function f(θ,p,t) is considered for different values of time. In fact, we plot - for M1 i.c., N=10000 and U=0.69 - a sequence of snapshots of f(θ,p,t) for six different times: at the beginning of the simulation (t=0), in the QSS regime (t=50-500) and when the system has definitively relaxed to the canonical BG equilibrium (t=1000000). In the QSS regime one clearly observes the presence of competing clusters, each cluster being composed of particles with both angle and velocity included in the same μ-space cell (notice that, in our simulations, we considered a total of 100x100 cells for the μ-space lattice).
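The "one cell - one cluster" bookkeeping on the 100x100 μ-space grid can be sketched as follows (a toy snapshot with assumed cluster positions and an assumed momentum range, not actual HMF data): each particle is binned into a (θ, p) cell and the occupation numbers of the non-empty cells play the role of cluster sizes.

```python
import numpy as np
from collections import Counter

def cell_cluster_sizes(theta, p, n_cells=100, p_range=(-3.0, 3.0)):
    """'One cell - one cluster' counting: bin each particle into a (theta, p)
    cell of an n_cells x n_cells mu-space grid; return cell occupations."""
    ti = np.clip((theta % (2.0 * np.pi)) / (2.0 * np.pi) * n_cells,
                 0, n_cells - 1).astype(int)
    lo, hi = p_range
    pi_ = np.clip((p - lo) / (hi - lo) * n_cells, 0, n_cells - 1).astype(int)
    return Counter(zip(ti.tolist(), pi_.tolist()))

def cumulative_size_distribution(sizes, min_size=5):
    """Number of clusters of size >= s, for each observed size s > min_size."""
    big = np.array([s for s in sizes if s > min_size])
    return {int(s): int(np.sum(big >= s)) for s in np.unique(big)}

rng = np.random.default_rng(4)
# toy snapshot: two tight clusters plus a homogeneous background (assumed data)
theta = np.concatenate([rng.normal(0.0, 0.05, 400),
                        rng.normal(3.0, 0.05, 200),
                        rng.uniform(0.0, 2.0 * np.pi, 400)])
p = np.concatenate([rng.normal(0.0, 0.05, 600),
                    rng.uniform(-1.0, 1.0, 400)])
occ = cell_cluster_sizes(theta, p)
cum = cumulative_size_distribution(occ.values())
```

Applied to actual QSS snapshots, the same bookkeeping produces the cumulative cluster-size distributions whose power-law behavior is discussed around Fig.2.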
For t=1000000, instead, the equilibrium has been reached and any trace of macroscopic glassy behavior has disappeared (actually, at equilibrium - for U=0.69 - only one big rotating cluster survives, but it spreads over many cells of the μ-space, thus it cannot be detected with the 'one cell - one cluster' method). In Fig.2 we show the power-law behavior of the cluster-size cumulative distributions calculated in the case U = 0.69 for several snapshots in the QSS regime, at times t=200, 350 and 500. For each one of the 100x100 cells a sum over 20 different realizations (events) has been performed. Then, for each cluster size (greater than 5 particles) the sum of all the clusters with that size has been calculated and plotted. As one can see from Fig.2, the distribution does not change significantly in the plateau region, as expected. We report also a power-law fit (drawn as a straight dashed line above the data points) which indicates that the cluster distribution has an approximate power-law decay with exponent -1.6. The cluster-size distribution closely resembles that of percolation at the critical point, where a length scale, or time scale, diverges leaving the system in a self-similar state 22. More in general, it has also been suggested 23 that, optimizing Tsallis' entropy with natural constraints in a regime of long-range correlations, it is possible to derive a power-law hierarchical
Figure 1. Snapshots of f(θ,p,t) for t=0 (upper left), 50 (upper right), 100 (center left), 200 (center right), 500 (lower left) and 1000000 (lower right). In this case we considered N=10000 at U=0.69 and M1 i.c.. See text.
Figure 2. Cumulative distribution of the cluster sizes for the case U = 0.69 and N = 10000 (using M1 i.c., with sums over 20 realizations), calculated at three times (t=200, 350, 500) in the QSS region. See text.
cluster-size distribution, which can be considered as paradigmatic of physical systems where multiscale interactions and geometric (fractal) properties play a key role in the relaxation behavior of the system. Therefore, we can say that the power-law scaling resulting in the distributions of Fig.2 strongly suggests a non-ergodic topology of a region of phase space in which the system remains trapped during the QSS regime (for the M1 i.c.), thus supporting the weak-ergodicity breaking scenario. A weak breakdown of ergodicity, as originally proposed by Bouchaud for glassy systems 24, occurs when the phase space is a-priori not broken into mutually inaccessible regions in which local equilibrium may be achieved, as in the true ergodicity-breaking case, but nevertheless the system can remain trapped for very long times in some regions of the complex energy landscape. In fact it is widely accepted that the energy landscape of a finite disordered (or frustrated) system is extremely rough, with many local minima corresponding to metastable configurations. Since the energy landscape is rough, these local minima are surrounded by rather high energy barriers, and we thus expect that these states would act as "traps" which get hold of the system during a certain time τ. In ref.24 such a mechanism has been proposed in order to explain the aging phenomenon, i.e. the dependence of the relaxation time on the history of the system, i.e. on its age t_w. Actually, it results that τ_max ≈ t_w, τ_max being the longest
trapping time actually encountered during a waiting time t_w. In other words, the deepest state encountered traps the system during a time which is comparable to the overall waiting time, a result that - in turn - allows one to quantitatively describe the relaxation laws observed in glassy systems 17. The aging phenomenon has been found also in the HMF model for M1 i.c., more precisely in the decay of the autocorrelation functions for both the angles and velocities, and for the velocities only, thus reinforcing the hypothesis that a weak ergodicity breaking could really occur in the metastable QSS regime and could be related to the complex dynamics generated by the vanishing of the largest Lyapunov exponent and by the dynamical frustration due to the many different small clusters observed in this regime. Such a scenario is in agreement also with the results about anomalous diffusion shown in ref.4, where the probability distribution of the trapping times, calculated for a test particle in the transient QSS regime for M1 i.c., shows a clear power-law decay

P_trap ∝ t^(−ν)    (3)
characterized by an exponent ν related to the anomalous diffusion coefficient. In the next section we will show that the anomalous diffusion coefficient can in turn be connected with the velocity correlation decay by means of the nonextensive formalism, thus suggesting a deeper link between the latter and the weak ergodicity breaking.

2.2. Nonextensive thermodynamics and the HMF model
In previous works it was shown that the majority of the dynamical anomalies of the QSS regime, among which μ-space correlations, clusters and dynamical frustration, are present not only for M1 initial conditions, but also when the initial magnetization M(t = 0) is in the range (0,1] 11. In order to prepare the initial magnetization in the range 0 < M ≤ 1, we distribute the particles uniformly over a variable portion of the unit circle. In this way we fix the initial potential energy V(0) and, in turn, the magnetization. Finally, we assign the remaining part of the total energy as kinetic energy by using a water-bag uniform distribution for the velocities. The velocity correlations can be calculated by using the following autocorrelation function 11

C(t) = (1/N) < Σ_{j=1}^N p_j(t) p_j(0) >    (4)
where p_j(t) is the velocity of the j-th particle at time t. In Fig.3-left we plot the velocity autocorrelation function (4) for N = 1000 and M(0) = 1, 0.8, 0.6, 0.4, 0.2, 0. An ensemble average over 500 different realizations was performed. For M(0) ≥ 0.4 the correlation functions are very similar, while the decay is faster for M(0) = 0.2 and even more so for M(0) = 0. If we fit these relaxation functions by means of
Figure 3. (Left) Correlation functions vs time for different initial magnetizations (symbols). The solid lines are normalized q-exponentials. (Right) Mean square displacement of the angular motion, σ² ∝ t^γ, vs time for different initial magnetizations. The exponent γ which fits the data and characterizes the behavior in the QSS regime is also reported for the different initial conditions used. The dashed lines have a slope corresponding to these values.
Tsallis' q-exponential function

e_q(z) = [1 + (1 − q) z]^(1/(1−q))    (5)

with z = −t/τ, where τ is a characteristic time, we can quantitatively discriminate between the different initial conditions. In fact we get a q-exponential with q = 1.5 for M(0) ≥ 0.4, while we get q = 1.2 and q = 1.1 for M(0) = 0.2 and M(0) = 0 respectively. Notice that for q = 1 one recovers the usual exponential decay. Thus for M(0) > 0 correlations exhibit a long-range nature and a slow power-law decay. This decay is very similar for M(0) ≥ 0.4, but diminishes progressively below M(0) = 0.4 to become almost exponential for M(0) = 0. In order to study diffusion, one can consider the mean square displacement of phases σ²(t) defined as 15,5,6,8

σ²(t) = < [θ_j(t) − θ_j(0)]² >    (6)
where the symbol < ... > represents the average over all the N rotators. The quantity σ²(t) typically scales as σ²(t) ∝ t^γ. The diffusion is normal when γ = 1 (corresponding to Einstein's law for Brownian motion) and ballistic for γ = 2 (corresponding to free particles). For other values of γ the diffusion is anomalous; in particular, for 1 < γ < 2 one has superdiffusion. We notice that the quantity σ²(t) can be rewritten by using the velocity correlation function C(t) as

σ²(t) = 2 ∫_0^t (t − t') C(t') dt'    (7)
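The relation between σ²(t) and C(t) just stated can be checked on a toy stationary process (an Ornstein-Uhlenbeck momentum sequence, our own stand-in with exponentially decaying correlations, not the actual HMF dynamics): the mean square displacement computed directly from the trajectories should agree with the double-time integral of the measured correlation function.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy stationary momenta with exponential decorrelation (AR(1) discretization
# of an Ornstein-Uhlenbeck process); tau is the correlation time.
n_steps, n_part, dt, tau = 400, 4000, 0.05, 1.0
a = np.exp(-dt / tau)
p = np.empty((n_steps, n_part))
p[0] = rng.normal(0.0, 1.0, n_part)
for t in range(1, n_steps):
    p[t] = a * p[t - 1] + np.sqrt(1.0 - a * a) * rng.normal(0.0, 1.0, n_part)

# velocity autocorrelation, C(t) = (1/N) < sum_j p_j(t) p_j(0) >
C = (p * p[0]).mean(axis=1)

# mean square displacement of the phases, sigma^2(t) = <(theta(t)-theta(0))^2>
disp = np.cumsum(p, axis=0) * dt
sigma2 = (disp ** 2).mean(axis=1)

# right-hand side of the relation: 2 Int_0^t (t - t') C(t') dt' (trapezoid rule)
t_grid = np.arange(n_steps) * dt
pred = np.array([np.trapz(2.0 * (t_grid[i] - t_grid[: i + 1]) * C[: i + 1],
                          t_grid[: i + 1]) for i in range(n_steps)])
```

For this exponentially correlated toy process both sides cross over from the ballistic regime at short times to normal diffusion at long times; a q-exponential C(t) would instead produce the superdiffusive intermediate regime discussed below.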
Figure 4. Check of the validity of the γ-q conjecture for the HMF model: the ratio 2/[(3−q)γ] is plotted as a function of the exponent γ for different initial conditions and system sizes (results averaged over 30 realizations). The numerical simulations show that 2/[(3−q)γ] = 1 ± 0.1, confirming the conjecture. See text for further details.
\[ \sigma^2(t) = 2 \int_0^t dt_1 \int_0^{t_1} dt_2\, C(t_2) , \qquad (7) \]
where C(t) is defined as in Eq. (4). Superdiffusion has already been observed in the HMF model for M(0) = 1 initial conditions [4]. Recently we have also checked that, even on decreasing the initial magnetization, the system continues to show superdiffusion [11]. We illustrate this behavior in Fig. 3-right, where one sees that, after an initial ballistic regime (γ = 2) characteristic of the initial fast relaxation, the system shows superdiffusion in correspondence with the QSS plateau region and afterwards. The exponent goes progressively from γ = 1.4−1.5 for 0.4 ≤ M(0) ≤ 1 to γ = 1.2 for M(0) = 0. In the latter case, we have checked that, on increasing the size of the system, diffusion tends to become normal (γ → 1 for N = 10000). The slow decay and the superdiffusive behavior illustrated in Fig. 3 can be connected by means of a conjecture based on a theoretical result found by Tsallis and Bukman [21]. In that paper the authors show, on general grounds, that nonextensive thermostatistics constitutes a theoretical framework within which the unification of normal and correlated (driven) anomalous diffusion can be achieved. They obtain, for a generic linear force F(x), the physically relevant exact (space, time)-dependent solutions of a generalized Fokker-Planck (FP) equation
\[ \frac{\partial P^{\mu}}{\partial t} = -\frac{\partial}{\partial x}\left[ F(x)\, P^{\mu} \right] + D\, \frac{\partial^2}{\partial x^2} P^{\nu} \]

by means of an ansatz based on the Tsallis entropy. For our purpose, we recall here that such a FP equation, in the nonlinear "norm conservation" case (ν ≠ 1 and μ = 1), generates Tsallis space-time distributions with the entropic index q related to the parameter ν by q = 2 − ν. By means of the latter, and following again ref. 21, it is possible to recover the following relation between the exponent γ of anomalous diffusion (being σ² ∝ t^γ) and the entropic index q:

\[ \gamma = \frac{2}{1+\nu} = \frac{2}{3-q} . \qquad (9) \]

Hence, the space-time distributions in diffusive processes being linked to the respective velocity correlations by the relation (7), one can investigate whether the relation (9) is satisfied on choosing the entropic index q characterizing the correlation decay together with the corresponding anomalous diffusion exponent. This is done in Fig. 4 where, in order to check the latter hypothesis, which we call the γ-q conjecture, we report the ratio 2/[(3−q)γ] vs the exponent γ for various initial conditions ranging from M(0) = 1 to M(0) = 0 and different sizes at U = 0.69. Both q and γ have been taken from the results shown in Fig. 3. Within an uncertainty of ±0.1, the data show that this ratio is always one, thus providing a strong indication in favor of this conjecture, which is satisfied for the HMF model. Summarizing, we have shown numerical simulations which connect the superdiffusion observed in the anomalous QSS regime of the HMF model to the q-exponential long-term decay of the velocity correlations in the same regime. This new result is very interesting because it opens a way to set a rigorous analytical link between the entropic index q and the dynamical properties of nonextensive Hamiltonian many-body systems.
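As an illustrative aside (not from the original paper), the two ingredients just discussed can be sketched in a few lines: the q-exponential (with its q → 1 exponential limit) and the γ-q ratio of Eq. (9), evaluated on representative (q, γ) pairs quoted in the text.

```python
import numpy as np

def q_exponential(z, q):
    """Tsallis q-exponential e_q(z) = [1 + (1-q) z]^(1/(1-q)); e_1(z) = exp(z)."""
    if abs(q - 1.0) < 1e-12:
        return np.exp(z)
    base = 1.0 + (1.0 - q) * z
    return np.where(base > 0.0, base ** (1.0 / (1.0 - q)), 0.0)

# For q -> 1 the ordinary exponential decay is recovered.
t = np.linspace(0.0, 10.0, 201)
err = np.max(np.abs(q_exponential(-t / 2.0, 1.0 + 1e-6) - np.exp(-t / 2.0)))

# gamma-q conjecture, Eq. (9): gamma = 2/(3-q), i.e. 2/[(3-q)*gamma] = 1.
# Representative (q, gamma) pairs read off from the simulations in the text.
pairs = [(1.5, 1.4), (1.2, 1.2)]
ratios = [2.0 / ((3.0 - q) * g) for q, g in pairs]
print(err, ratios)
```

Both ratios fall within the ±0.1 band quoted for Fig. 4.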
Conclusions

We have briefly reviewed some of the anomalous features observed in the dynamics of the HMF model, a kind of minimal model for the study of complex behavior in systems with long-range interactions. We have also discussed how the anomalous behavior can be interpreted within the nonextensive thermostatistics introduced by Tsallis, or in the weak-ergodicity-breaking framework typical of glassy systems. The two pictures are not in contradiction and probably have stricter links than previously thought, which deserve to be further explored in the future.

References
1. Y. Bar-Yam, Dynamics of Complex Systems, Addison-Wesley, Reading, Mass. (1997).
2. M. Antoni and S. Ruffo, Phys. Rev. E 52, 2361 (1995).
3. T. Dauxois, V. Latora, A. Rapisarda, S. Ruffo and A. Torcini, in Dynamics and Thermodynamics of Systems with Long Range Interactions, T. Dauxois, S. Ruffo, E. Arimondo, M. Wilkens Eds., Lecture Notes in Physics Vol. 602, Springer (2002), p. 458, and refs. therein.
4. V. Latora, A. Rapisarda and S. Ruffo, Phys. Rev. Lett. 83, 2104 (1999).
5. A. Pluchino, V. Latora, A. Rapisarda, Continuum Mechanics and Thermodynamics 16, 245 (2004) and refs. therein.
6. C. Tsallis, A. Rapisarda, V. Latora and F. Baldovin, in Dynamics and Thermodynamics of Systems with Long Range Interactions, T. Dauxois, S. Ruffo, E. Arimondo, M. Wilkens Eds., Lecture Notes in Physics Vol. 602, Springer (2002), p. 140, and refs. therein.
7. For a generalized version of the HMF model see: C. Anteneodo and C. Tsallis, Phys. Rev. Lett. 80, 5313 (1998); F. Tamarit and C. Anteneodo, Phys. Rev. Lett. 84, 208 (2000); A. Campa, A. Giansanti, D. Moroni, Phys. Rev. E 62, 303 (2000) and Physica A 305, 137 (2002); B. J. C. Cabral and C. Tsallis, Phys. Rev. E 66, 065101(R) (2002).
8. A. Pluchino, V. Latora, A. Rapisarda, Physica D 193, 315 (2004).
9. M. A. Montemurro, F. A. Tamarit and C. Anteneodo, Phys. Rev. E 67, 031106 (2003).
10. A. Pluchino, V. Latora, A. Rapisarda, Phys. Rev. E 69, 056113 (2004) and Physica A 340, 187 (2004).
11. A. Pluchino, V. Latora, A. Rapisarda, Physica A 338, 60 (2004).
12. Y. Yamaguchi, J. Barré, F. Bouchet, T. Dauxois and S. Ruffo, Physica A 337, 653 (2004).
13. P. H. Chavanis, J. Vatteville and F. Bouchet, cond-mat/0408117.
14. C. Tsallis, J. Stat. Phys. 52, 479 (1988).
15. Nonextensive Entropy: Interdisciplinary Applications, C. Tsallis and M. Gell-Mann Eds., Oxford University Press (2004).
16. A. Cho, Science 297, 1268 (2002); S. Abe, A. K. Rajagopal; A. Plastino; V. Latora, A. Rapisarda and A. Robledo, Science 300, 249 (2003).
17. M. Mézard, G. Parisi and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific Lecture Notes in Physics Vol. 9 (1987); J. P. Bouchaud, L. F. Cugliandolo, J. Kurchan and M. Mézard, in Spin Glasses and Random Fields, ed. A. P. Young, World Scientific, Singapore (1998).
18. Y. Kuramoto, in International Symposium on Mathematical Problems in Theoretical Physics, Vol. 39 of Lecture Notes in Physics, edited by H. Araki (Springer-Verlag, Berlin, 1975); Chemical Oscillations, Waves, and Turbulence (Springer-Verlag, Berlin, 1984).
19. A. Pluchino, V. Latora, A. Rapisarda, Metastability hindering synchronization in HMF and Kuramoto models (2005), in preparation.
20. M. Mézard, G. Parisi and R. Zecchina, Science 297, 812 (2002).
21. C. Tsallis and D. J. Bukman, Phys. Rev. E 54, R2197 (1996).
22. J. J. Binney, N. J. Dowrick, A. J. Fisher and M. E. J. Newman, The Theory of Critical Phenomena, Oxford Science Publications (1992).
23. O. Sotolongo-Costa, A. H. Rodriguez and G. J. Rodgers, Entropy 2, 77 (2000); Physica A 286, 638 (2000).
24. J. P. Bouchaud, Weak ergodicity breaking and aging in disordered systems, J. Phys. I France 2 (1992).
VLASOV ANALYSIS OF RELAXATION AND META-EQUILIBRIUM
CELIA ANTENEODO
Departamento de Física, Pontifícia Universidade Católica do Rio de Janeiro, CP 38071, 22452-970, Rio de Janeiro, Brazil, and Centro Brasileiro de Pesquisas Físicas, R. Dr. Xavier Sigaud 150, 22290-180, Rio de Janeiro, Brazil
Email: [email protected]

RAUL O. VALLEJOS
Centro Brasileiro de Pesquisas Físicas, R. Dr. Xavier Sigaud 150, 22290-180, Rio de Janeiro, Brazil
Email: vallejos@cbpf.br

The Hamiltonian Mean-Field model (HMF), an inertial XY ferromagnet with infinite-range interactions, has been extensively studied in the last few years, especially due to its long-lived meta-equilibrium states, which exhibit a series of anomalies, such as breakdown of ergodicity, anomalous diffusion, aging, and non-Maxwell velocity distributions. The most widely investigated meta-equilibrium states of the HMF arise from special (fully magnetized) initial conditions that evolve to a spatially homogeneous state with well defined macroscopic characteristics and whose lifetime increases with the system size, eventually reaching equilibrium. These meta-equilibrium states have been observed for specific energies close below the critical value 0.75, corresponding to a ferromagnetic phase transition, and disappear below a certain energy close to 0.68. In the thermodynamic limit, the μ-space dynamics is governed by a Vlasov equation. For finite systems this is an approximation to the exact dynamics. However, it provides an explanation, for instance, for the violent initial relaxation and for the disappearance of the homogeneous states at energies below 0.68.
1. Introduction
Consider the one-dimensional Hamiltonian
\[ H = \frac{1}{2} \sum_{i=1}^{N} p_i^2 + \frac{J}{2N} \sum_{i,j=1}^{N} \left[ 1 - \cos(\theta_i - \theta_j) \right] . \qquad (1) \]
It represents a lattice of classical spins with infinite-range interactions. Each spin rotates in a plane and is therefore described by an angle −π ≤ θ_i < π and its conjugate angular momentum p_i, with i = 1, …, N; the constant J is the interaction strength. Of course, one can also think of point particles of unit mass moving on a circle. This model is known in the literature as the mean-field XY Hamiltonian (HMF) [1].
The HMF has been extensively studied in the last few years (see ref. 2 for a review). The reasons for such interest are various. From a general point of view, the HMF can be considered the simplest prototype for complex, long-range systems like galaxies and plasmas (in fact, the HMF is a descendant of the mass-sheet gravitational model). But the HMF is also interesting for its anomalies, be they model-specific or not. Especially worthy of mention are the long-lived meta-equilibrium states (MESs) observed in the ferromagnetic HMF. These states exhibit breakdown of ergodicity, anomalous diffusion, and non-Maxwell velocity distributions, among other anomalies (see also the contribution by A. Rapisarda et al. in this volume). It has been conjectured that it may be possible to give a thermodynamic description of these MESs by extending standard statistical mechanics along the lines proposed by Tsallis [5]. The simplicity of the HMF makes possible a full analysis of its equilibrium statistical properties, either in the canonical or microcanonical ensembles. If interactions are attractive (J > 0), the system exhibits a ferromagnetic transition at the critical energy E_c = 0.75JN. Here we will focus on the out-of-equilibrium behavior of the ferromagnetic HMF (J > 0), when the system is prepared in a fully magnetized configuration, at an energy close below E_c, with uniformly distributed momenta ("water-bag" initial conditions). Under these initial conditions the system evolves to a spatially homogeneous state with well defined macroscopic characteristics and whose lifetime increases with the system size, eventually reaching equilibrium. Numerical experiments have shown the disappearance of the family of homogeneous MESs below a certain energy close to 0.68JN [3,4].
2. Equations of motion
It is convenient to write the Hamiltonian (1) in the simplified form

\[ H = \frac{1}{2} \sum_{i=1}^{N} p_i^2 + \frac{N}{2} \left( 1 - m^2 \right) , \]

where we have introduced the magnetization per particle

\[ \mathbf{m} = \frac{1}{N} \sum_{i=1}^{N} \left( \cos\theta_i , \sin\theta_i \right) , \]

and for simplicity we have taken J = 1. The equations of motion read
\[ \dot{\theta}_i = p_i , \qquad \dot{p}_i = -m_x \sin\theta_i + m_y \cos\theta_i , \]

for i = 1, …, N. Without loss of generality, we can set the axes such that m_y(t = 0) = 0. If, additionally, the distribution of momenta is symmetrical, then m_y(t) = 0 for all t. In that case, the equations of motion become

\[ \ddot{\theta}_i = -m(t)\, \sin\theta_i . \]
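These equations of motion lend themselves to direct integration; a minimal sketch (not from the paper; N, dt and the initial momenta are illustrative choices) uses a leapfrog scheme and monitors the conserved quantity per particle ε = ⟨p²⟩/2 + (1 − m²)/2 as a sanity check.

```python
import numpy as np

def force(theta):
    # Simplified dynamics: theta_i'' = -m(t) sin(theta_i), with m = <cos(theta)>
    return -np.cos(theta).mean() * np.sin(theta)

def leapfrog_step(theta, p, dt):
    # Kick-drift-kick leapfrog: symplectic, second order in dt.
    p_half = p + 0.5 * dt * force(theta)
    theta_new = theta + dt * p_half
    return theta_new, p_half + 0.5 * dt * force(theta_new)

def energy(theta, p):
    # Conserved by the simplified dynamics, which only involves m = <cos(theta)>.
    m = np.cos(theta).mean()
    return 0.5 * (p ** 2).mean() + 0.5 * (1.0 - m * m)

rng = np.random.default_rng(0)
N = 500
theta = np.zeros(N)                 # fully magnetized start, m = 1
p = rng.uniform(-1.0, 1.0, N)       # symmetric water-bag momenta
e0 = energy(theta, p)
for _ in range(2000):
    theta, p = leapfrog_step(theta, p, 0.01)
drift = abs(energy(theta, p) - e0)
print(drift)   # stays small for the symplectic integrator
```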
Notice that these equations can be seen as those of a pendulum with a time-dependent length.

3. First stage of relaxation

Fully magnetized states violently relax to a state of vanishing magnetization, within finite-size corrections. The most elementary approach to describing the relaxation of m from a given initial condition is to perform a series expansion around t = 0, i.e.,

\[ m(t) = \sum_{k \ge 0} \frac{c_k}{k!}\, t^k . \]
In our case, the initial condition is such that m = 1 (with m_y = 0); one then obtains the following coefficients for m_x(t):

\[ c_0 = 1 , \qquad c_2 = -\langle p^2 \rangle_0 , \qquad c_4 = \langle p^4 \rangle_0 + 4 \langle p^2 \rangle_0 , \]
\[ c_6 = -\left( \langle p^6 \rangle_0 + 26 \langle p^4 \rangle_0 + 16 \langle p^2 \rangle_0 + 18 \langle p^2 \rangle_0^2 \right) , \]

and c_odd = 0, where the averages are calculated with the initial distribution of momenta h(p). If h(p) at t = 0 is symmetrical around p = 0, then m(t) is an even function of t. In particular, if the initial condition is water-bag, i.e., θ_i = 0 for all i, and additionally the p_i are uniformly distributed in the interval [−p_m, p_m] (from Hamiltonian (1), p_m = √(6ε), with ε the energy per particle), one obtains
\[ m(t) = 1 - \varepsilon\, t^2 + \left( \frac{3 \varepsilon^2}{10} + \frac{\varepsilon}{3} \right) t^4 + \cdots \qquad (2) \]
The convergence of this series is very slow and, given that a general expression for the coefficients is not available, only the very short-time part of the relaxation can be described in this way.

4. Vlasov equation
On the other hand, the evolution equation of the reduced (one-particle) probability density function (PDF) in μ-space is formally equivalent to the Vlasov-Poisson system [7]:

\[ \frac{\partial f}{\partial t} + p\, \frac{\partial f}{\partial \theta} - \frac{\partial V}{\partial \theta}\, \frac{\partial f}{\partial p} = 0 , \qquad (3) \]

where V = −m·(cos θ, sin θ) and m = ∫dθ (cos θ, sin θ) ∫dp f(θ, p, t). If m = m x̂, then

\[ \frac{\partial f}{\partial t} + p\, \frac{\partial f}{\partial \theta} - m \sin\theta\, \frac{\partial f}{\partial p} = 0 , \qquad (4) \]
with

\[ m(t) = \int d\theta \cos\theta \int dp\, f(\theta, p, t) . \qquad (5) \]
The Vlasov equation (4) can be cast in the form

\[ \frac{\partial f}{\partial t} = \left[ L_0 + L_1(t) \right] f , \]

where L_0 = −p ∂_θ and L_1(t) = m(t) sin θ ∂_p. We will consider states (for instance, with vanishing magnetization) for which the term L_1(t) can be treated as a perturbation. It is convenient to switch to the interaction representation, i.e., to define f̃(t) = e^{−L_0 t} f(t); then

\[ \frac{\partial \tilde{f}}{\partial t} = \tilde{L}_1(t)\, \tilde{f} , \]

where L̃_1 = e^{−L_0 t} L_1 e^{L_0 t}. The equation for the propagator G̃, such that f̃(t) = G̃(t) f̃(0), is ∂G̃/∂t = L̃_1 G̃; therefore

\[ \tilde{G}(t) = 1 + \int_0^t dt_1\, \tilde{L}_1(t_1)\, \tilde{G}(t_1) , \]

and, recursively, one has

\[ \tilde{G}(t) = 1 + \int_0^t dt_1\, \tilde{L}_1(t_1) + \int_0^t dt_1 \int_0^{t_1} dt_2\, \tilde{L}_1(t_1)\, \tilde{L}_1(t_2) + \cdots \qquad (6) \]

The solution at order k of the Vlasov Eq. (4) is

\[ f^{(k)}(\theta, p, t) = e^{L_0 t}\, \tilde{G}^{(k)}(t)\, f(\theta, p, 0) , \qquad (7) \]
where the index (k) indicates the order at which the expansion (6) is truncated. From here on, we will deal with continuous distributions; hence our treatment is valid in the thermodynamic limit.
4.1. Lowest-order truncation
At zeroth order, the propagator is approximated by G̃ ≈ G̃^(0) = 1. This is equivalent to neglecting the magnetization; thus, if m = 0, the truncation is exact. For the initial distribution f(θ, p, 0) = g(θ)h(p), where g(θ) is uniform in [−π, π] (hence m = 0) and h(p) is an arbitrary even function, both distributions remain unaltered in time, consistently with the numerical simulations in Fig. 7 of ref. 8. In fact, if m = 0 for all times, there are no forces to drive the system out of the macroscopic state. If the initial condition is f(θ, p, 0) = δ(θ)h(p), where δ is the Dirac delta function and h(p) an arbitrary even function of p (as in our case of interest), although
m ≠ 0, L_1 is small (it is null at t = 0 and remains small for later times), allowing a perturbative treatment. Then we have

\[ f^{(0)}(\theta, p, t) = e^{-p t \partial_\theta} f(\theta, p, 0) = h(p)\, \delta(\theta - p t) = \frac{h(p)}{2\pi} \sum_{k=-\infty}^{\infty} e^{i k (\theta - p t)} . \]

Therefore,
\[ h^{(0)}(p, t) = \int_{-\pi}^{\pi} d\theta\, f^{(0)}(\theta, p, t) = h(p) , \]

that is, at zeroth order the distribution of momenta, whatever it is, does not change in time. The angular distribution, however, does change. For instance, in the particular case of the water-bag distribution, where h(p) is uniform in [−p_m, p_m], we obtain

\[ g^{(0)}(\theta, t) = \int dp\, f^{(0)}(\theta, p, t) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{i k \theta}\, \frac{\sin(k p_m t)}{k p_m t} . \qquad (8) \]
Notice that the distribution of angles becomes uniform in the long-time limit. It becomes uniform through a mechanism of phase mixing, in which the particles do not interact (remember that the magnetization has been neglected). From Eqs. (5) and (8), the zeroth-order magnetization is

\[ m^{(0)}(t) = \frac{\sin(p_m t)}{p_m t} , \qquad (9) \]

whose expansion in powers of time yields

\[ m^{(0)}(t) = 1 - \varepsilon\, t^2 + \frac{3}{10}\, \varepsilon^2 t^4 + \cdots \]
Observe that this expansion up to second order coincides with the exact one, given by Eq. (2), for any ε.

4.2. First-order truncation
Now, recalling that G̃^(1) = 1 + ∫₀ᵗ dt₁ L̃₁(t₁), from (7), at first order we have

\[ f^{(1)}(\theta, p, t) = e^{-p t \partial_\theta} \left[ 1 + \int_0^t dt_1\, \tilde{L}_1(t_1) \right] f(\theta, p, 0) = f^{(0)}(\theta, p, t) + e^{-p t \partial_\theta} \int_0^t dt_1\, \tilde{L}_1(t_1)\, f(\theta, p, 0) , \]

where

\[ \tilde{L}_1(t) = e^{p t \partial_\theta}\, m(t) \sin\theta\, \partial_p\, e^{-p t \partial_\theta} . \]
Figure 1. Squared magnetization as a function of time. The initial state is fully magnetized with uniformly distributed (regular) momenta, for ε = 0.69. Symbols correspond to numerical simulations with N = 1000. The dashed line corresponds to the zeroth-order approximation obtained from Eq. (9); the full line corresponds to the first-order approximation given by Eq. (10).
After some algebra, for the case of h(p) uniform in [−p_m, p_m], one obtains a closed expression for the magnetization at first order (Eq. (10)); substituting m(t) by m^(0)(t) there makes it explicit. Moreover, the momentum distribution becomes

\[ h^{(1)}(p, t) = h(p) + h'(p) \int_0^t dt_1\, t_1\, m(t_1) \cos(p t_1) + \cdots \]
We recall that, for the uniform distribution, h'(p) ∝ [δ(p + p_m) − δ(p − p_m)]. This explains why h(p, t) presents two spikes at p = ±p_m. Fig. 1 shows the first stage of the relaxation of the magnetization. Numerical simulations were performed for N = 1000; increasing the system size does not change the numerical curve in the time interval considered (t ≤ 50). Of course, for longer times the curve becomes size dependent [3,4]. The squared magnetization rapidly decreases from its initial value m² = 1 down to zero at t ≃ 2. It then remains very close to zero up to t ≃ 20; from then on, one observes bursts of small amplitude. Since the Vlasov equation is exact in the thermodynamic limit, it describes the exact N = 1000 evolution up to times t ≈ 50. The zeroth-order approximation describes m² vs t correctly only for a very short time (t ≃ 0.4). The first-order approximation describes satisfactorily the violent initial relaxation (up to t ≃ 2), but it does not reproduce the structure appearing later; higher-order corrections are required to describe that behavior. Extrapolation of numerical simulations [3,4] shows that m → 0 in the thermodynamic limit. This regime sets in at times beyond the scope of our approximation.
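The zeroth-order (free-streaming) prediction (9) can be checked directly: with interactions switched off, θ_i(t) = p_i t, and the magnetization of an initially fully magnetized water bag is m^(0)(t) = sin(p_m t)/(p_m t). A sketch with illustrative numbers:

```python
import numpy as np

eps = 0.69
pm = np.sqrt(6.0 * eps)
p = np.linspace(-pm, pm, 100001)   # water-bag momenta, theta_i(0) = 0

worst = 0.0
for t in (0.5, 1.0, 2.0, 5.0):
    m_stream = np.cos(p * t).mean()         # free streaming: theta_i = p_i t
    m_formula = np.sin(pm * t) / (pm * t)   # phase-mixing prediction, Eq. (9)
    worst = max(worst, abs(m_stream - m_formula))
print(worst)   # discretization error only
```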
4.3. Equilibrium
For completeness, let us discuss the distributions at thermal equilibrium [11]. If the system has already attained equilibrium, then ∂f/∂t = 0. Let us also assume that the equilibrium distribution can be factorized, i.e., f(θ, p) = g(θ) h(p). Then, from (4),

\[ p\, \frac{\partial g}{\partial \theta}\, h(p) = m \sin\theta\, g(\theta)\, \frac{\partial h}{\partial p} . \qquad (11) \]

Assuming h(p) = A exp(−βp²/2), Eq. (11) reduces to ∂g/∂θ = −βm sin θ g(θ). Thus

\[ g(\theta) = C\, e^{\beta m \cos\theta} , \qquad (12) \]

with the normalization constant C = 1/[2π I_0(βm)], where I_0 is the modified Bessel function of zeroth order. The equilibrium magnetization can be obtained from the consistency condition (5):

\[ m = \frac{I_1(\beta m)}{I_0(\beta m)} , \]

thus recovering the results of canonical calculations [1].
4.4. Meta-equilibrium
Although we have not found the long-time solution of the Vlasov equation starting from fully magnetized initial conditions, numerical simulations [3] indicate that in the thermodynamic limit the system tends to a spatially homogeneous state. We have seen in Sect. 4.1 that, once a homogeneous state is reached, the distribution of momenta, whatever it is, does not change in time. The question, however, is whether the homogeneous solutions are stable or not under perturbations. On the one hand, the Vlasov approach is a good approximation to the discrete dynamics; on the other hand, finite-size effects may be the source of perturbations that may take the system out of a Vlasov steady state. Therefore, we will perform a stability test (valid in the thermodynamic limit) and discuss the results in the light of the discrete dynamics. There is the well known Landau analysis, which concerns linear stability. A more powerful stability criterion for homogeneous equilibria has been proposed by Yamaguchi et al. [8]. This is a nonlinear criterion specific to the HMF. It states that f(p) is stable if and only if the quantity

\[ I[f] = 1 + \frac{1}{2} \int_{-\infty}^{\infty} dp\, \frac{f'(p)}{p} \qquad (13) \]

is positive (it is assumed that f is an even function of p). This condition is equivalent to the zero-frequency case of Landau's recipe. Yamaguchi et al. showed that a distribution which is spatially homogeneous and Gaussian in momentum becomes unstable below the transition energy ε_c = 3/4, in agreement with analytical and numerical results for finite-N systems. They also showed that homogeneous states with zero-mean uniform f(p) are stable above ε = 7/12 = 0.58… In the same spirit, it is instructive to analyze the stability of the family of q-Gaussian distributions
\[ f(p) \propto \exp_q(-a p^2) = \left[ 1 - a (1-q)\, p^2 \right]^{1/(1-q)} , \qquad (14) \]
which allows one to scan a wide spectrum of PDFs, from finite-support to power-law-tailed ones, containing as particular cases the Gaussian (q = 1) and the water bag (q = −∞). In Eq. (14) the normalization constant has been omitted, and the parameter a > 0 is related to the second moment ⟨p²⟩, which is finite only for q < 5/3. In the homogeneous states of the HMF one has ⟨p²⟩ = 2ε − 1, as can easily be derived from Eq. (1). Then the stability indicator I as a function of the energy for the q-Gaussian family reads

\[ I = 1 - \frac{3-q}{2(5-3q)(2\varepsilon - 1)} . \qquad (15) \]

Therefore, stability occurs for energies above

\[ \varepsilon_q = \frac{3}{4} + \frac{q-1}{2(5-3q)} . \qquad (16) \]
The stability diagram is exhibited in Fig. 2. It is easy to verify that one recovers the known stability thresholds for the uniform and Gaussian distributions. We remark that Eq. (16) states that only finite-support distributions, corresponding to q < 1, are stable below ε_c = 3/4. This agrees with numerical studies in the meta-equilibrium regimes of the HMF.
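Eqs. (15) and (16) can be checked in a few lines (a sketch; the very negative q below stands in for the water-bag limit q → −∞):

```python
def stability_index(q, eps):
    """Eq. (15): I for the q-Gaussian family (requires q < 5/3, eps > 1/2)."""
    return 1.0 - (3.0 - q) / (2.0 * (5.0 - 3.0 * q) * (2.0 * eps - 1.0))

def threshold(q):
    """Eq. (16): energy eps_q below which the homogeneous state is unstable."""
    return 0.75 + (q - 1.0) / (2.0 * (5.0 - 3.0 * q))

gauss = threshold(1.0)               # Gaussian: 3/4
waterbag = threshold(-1e12)          # q -> -infinity: 7/12 = 0.5833...
zero = stability_index(1.0, gauss)   # I vanishes exactly at the threshold
print(gauss, waterbag, zero)
```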
Figure 2. Stability diagram of the q-Gaussian ansatz for the momentum PDFs.
We have also shown recently [12] that a similar analysis can be performed for a very simple family of functions exhibiting the basic structure of the observed f(p): basically, a uniform distribution plus a cosine. Fitting of the numerical distributions leads to points in parameter space that fall close to the boundary of Vlasov stability, and exit the stability region for energies below the limiting value ε ≈ 0.68. This result is confirmed when the stability criterion is applied to the discrete distributions arising from numerical simulations [12], although for the discrete dynamics the magnetization is not strictly zero. The stability index I is positive for energies above ε ≈ 0.68; the fact that it becomes negative below ε ≈ 0.68 signals the disappearance of the homogeneous metastable phase at that energy. In fact, extrapolation of numerical simulations to the thermodynamic limit confirms this result. The present stability test only applies to homogeneous states. Strictly speaking, m = 0 does not imply that the states are homogeneous. However, the sudden relaxation that leads to the present MESs mixes particles in such a way that m = 0 and spatial homogeneity are expected to be synonymous. Below ε = 0.68 the measured distributions are evidently inhomogeneous (m ≠ 0); in these cases, negative stability refers to hypothetical homogeneous states having the measured f(p).
5. Final remarks

We have seen that, although our approach is valid in the continuum limit, it gives useful hints on the finite-size dynamics. Of course, it cannot predict complex details of the discrete dynamics. However, the present approach gives information on the violent initial relaxation from fully magnetized states, for sufficiently large systems. It also explains the disappearance of homogeneous MESs below a certain energy, observed by extrapolation of numerical simulations to the thermodynamic limit. Moreover, the identification of MESs with Vlasov solutions is also consistent with the fact that, when the thermodynamic limit is taken before the limit t → ∞, the system never relaxes to true equilibrium, remaining forever in a disordered state.
Acknowledgements

C.A. is very grateful to the organizers for the opportunity of participating in the meeting at the Ettore Majorana Foundation and Centre for Scientific Culture in Erice.
References
1. M. Antoni and S. Ruffo, Phys. Rev. E 52, 2361 (1995).
2. T. Dauxois, V. Latora, A. Rapisarda, S. Ruffo and A. Torcini, in Dynamics and Thermodynamics of Systems with Long Range Interactions, edited by T. Dauxois, S. Ruffo, E. Arimondo and M. Wilkens, Lecture Notes in Physics Vol. 602, Springer (2002).
3. A. Torcini and M. Antoni, Phys. Rev. E 59, 2746 (1999); V. Latora, A. Rapisarda, and S. Ruffo, Physica A 280, 81 (2000); V. Latora, A. Rapisarda, and C. Tsallis, Phys. Rev. E 64, 056134 (2001); V. Latora and A. Rapisarda, Chaos, Solitons and Fractals 13, 401 (2002); A. Giansanti, D. Moroni, and A. Campa, ibid., p. 407; V. Latora, A. Rapisarda, and C. Tsallis, Physica A 305, 129 (2002); M. Montemurro, F. A. Tamarit and C. Anteneodo, Phys. Rev. E 67, 031106 (2003).
4. A. Pluchino, V. Latora and A. Rapisarda, Physica D 193, 315 (2004).
5. C. Tsallis, J. Stat. Phys. 52, 479 (1988); C. Tsallis, in Nonextensive Statistical Mechanics and its Applications, edited by S. Abe and Y. Okamoto, Lecture Notes in Physics Vol. 560 (Springer-Verlag, Heidelberg, 2001); Chaos, Solitons and Fractals 13, 371 (2002); Non Extensive Thermodynamics and Physical Applications, edited by G. Kaniadakis, M. Lissia, and A. Rapisarda, Physica A 305 (Elsevier, Amsterdam, 2002). See http://tsallis.cat.cbpf.br/biblio.htm for further bibliography on the subject.
6. M. Antoni, H. Hinrichsen, and S. Ruffo, Chaos, Solitons and Fractals 13, 393 (2002).
7. R. Balescu, Statistical Dynamics (Imperial College Press, London, 2000).
8. Y. Y. Yamaguchi, J. Barré, F. Bouchet, T. Dauxois and S. Ruffo, Physica A 337, 36 (2004).
9. M. Y. Choi and J. Choi, Phys. Rev. Lett. 91, 124101 (2003).
10. S. Inagaki, Prog. Theor. Phys. 90, 577 (1993).
11. V. Latora, A. Rapisarda and S. Ruffo, Physica D 131, 38 (1999).
12. C. Anteneodo and R. O. Vallejos, Physica A 344, 383 (2004).
WEAK CHAOS IN LARGE CONSERVATIVE SYSTEMS: INFINITE-RANGE COUPLED STANDARD MAPS
LUIS G. MOYANO
Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, Urca, 22290-180 Rio de Janeiro, Brazil
E-mail: moyano@cbpf.br

ANA P. MAJTEY
Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina
E-mail: [email protected]

CONSTANTINO TSALLIS
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
E-mail: tsallis@santafe.edu
and Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, Urca, 22290-180 Rio de Janeiro, Brazil

We study, through a new perspective, a globally coupled map system that essentially interpolates between simple discrete-time nonlinear dynamics and certain long-range many-body Hamiltonian models. In particular, we exhibit relevant similarities between the present model and the Hamiltonian Mean Field model, a strong candidate for a nonextensive statistical mechanical approach, namely (i) the existence of long-standing quasistationary states (QSS), and (ii) the emergence of weak chaos in the thermodynamic limit.
PACS numbers: 05.10.-a, 05.20.Gg, 05.45.-a, 05.90.+m

1. Introduction
In the last few years, considerable effort has been made to clarify the role that nonextensive statistical mechanics [1] plays in physics. In this context, there has been significantly growing evidence relating it to several physically motivated nonlinear dynamical systems. It has been repeatedly put forward that the statistical behaviour of a physical system descends from its microscopic dynamics [2]. Consequently, the study of paradigmatic nonlinear dynamical systems is important in order to describe and
understand anomalies and deviations from the well known Boltzmann-Gibbs (BG) statistical mechanics. The scenario within which we are working tries to capture the most relevant features of nonextensive statistical mechanics over the complete range of dynamical systems: from extremely simple dissipative low-dimensional maps to complex conservative Hamiltonian dynamics [3]. In the present paper we make specific progress along these lines by focusing on a model which illustrates the deep similarities that can exist between nonlinear coupled maps and many-body Hamiltonian dynamics. Let us first recall a paradigmatic and intensively studied many-body infinite-range coupled conservative system, namely the Hamiltonian Mean Field (HMF) model [4,5]:

\[ H = \frac{1}{2} \sum_{i=1}^{N} p_i^2 + \frac{1}{2N} \sum_{i,j=1}^{N} \left[ 1 - \cos(\theta_i - \theta_j) \right] . \qquad (1) \]

The HMF model may be thought of as N globally coupled classical spins (an inertial version of the XY ferromagnetic model). Its molecular dynamics exhibits a remarkably rich behaviour. When the initial conditions are out of equilibrium (for example, the so-called waterbag initial conditions [6]), it can present an anomalous temperature evolution (we consider T = 2K/N, where K is the total kinetic energy). These states are characterized by a first stage (quasistationary state, QSS) whose temperature is different from that predicted by the BG theory, followed by a crossover to the expected final temperature. These QSS appear to be a consequence of the long-range coupling. They are important because their duration diverges with N, thus becoming the only relevant state for a macroscopic system [7]. At the other end of the range of dynamical systems we may consider a dissipative, one-dimensional model, such as the logistic map (and its universality class):
The HMF model may be thought of as N globally coupled classical spins (inertial version of the XY ferromagnetic model). Its molecular dynamics exhibit a remarkably rich behaviour. When the initial conditions are out of equilibrium (for example, the so called waterbag initial conditions6), it can present an anomalous temperature evolution (we consider T = 2 K / N , being K the total kinetic energy). These states are characterized by a first stage (quasistationary state, Q S S ) whose temperature is different from that predicted by the BG theory, followed by a crossover t o the expected final temperature. These QSS appear to be a consequence of the longrange coupling. They are important because their duration diverges with N , thus becoming the only relevant state for a macroscopic system'. At the other end of the range of dynamical systems we may consider a dissipative, one-dimensional model, such as the logistic map (and its universality class): 2t+l
= 1 - px:
(t = 0 , 1 , 2...; 2 E [-1,1]; p E [0,2]).
(2)
Because of its physical importance, the logistic map is one of the most studied low-dimensional maps. Despite its apparently simple form, it exhibits quite complex behaviour. Important progress has recently been made which places this model as an important example of the applicability of nonextensive statistical mechanical concepts. Indeed, Baldovin and Robledo [8] rigorously proved that, at the edge of chaos (as well as at the period-doubling and tangent bifurcations), the sensitivity to initial conditions is given by a q-exponential function. Furthermore, they proved [8] the q-generalization of a Pesin-like identity concerning the entropy production per unit time. For the stationary state at the edge of chaos, the entropic index q can be obtained analytically. Moreover, when a small external noise is added, a two-step relaxation evolution is found [9], similarly to what occurs in the HMF case. At this point a natural question arises: is it possible to relate the results found for such simple maps to the anomalies found in the HMF model? Furthermore, can we treat various nonlinear dynamical systems within the nonextensive statistical mechanics theory? Many studies are presently addressing such questions. A first step in this direction is to move closer to a Hamiltonian dynamics by considering a symplectic, conservative map. This is the case of the widely investigated Taylor-Chirikov standard map [10]:
\[ \theta(t+1) = \theta(t) + p(t+1) \pmod 1 , \qquad p(t+1) = p(t) + \frac{a}{2\pi} \sin[2\pi \theta(t)] \pmod 1 . \qquad (3) \]
This map may be obtained, for instance, by approximating the differential equation of a simple pendulum by a centered difference equation, converting a second-order equation into two first-order ones. The standard map was studied along the present lines by Baldovin et al. [11]. For symplectic maps, what plays a role analogous to the temperature is the variance of the angular momentum: T ≡ σ_p² = ⟨p²⟩ − ⟨p⟩², where ⟨·⟩ denotes the ensemble average. Beginning with the same type of initial conditions as before (waterbag), we observe once again a two-plateaux relaxation process, suggesting a connection with the phenomena already described for the HMF model. A step forward towards capturing the behaviour of the HMF system of rotors is to consider N standard maps, coupled in such a way as to maintain their symplectic (hence conservative) structure. However, there are several ways to achieve this. A particular coupling in the momenta has been recently addressed [9], with quite interesting results such as QSS relaxation and nonergodic occupation of phase space. A different type of coupling is addressed in the next Section.
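The ensemble experiment just described for the single map (3) can be sketched directly (parameter values are illustrative): with both variables taken mod 1, a strongly chaotic kick strength spreads an initially narrow waterbag of momenta over the whole unit interval, so σ_p² approaches the equal-probability value 1/12.

```python
import numpy as np

def standard_map_step(theta, p, a):
    """One iteration of map (3); theta and p both live on [0, 1)."""
    p_new = (p + (a / (2.0 * np.pi)) * np.sin(2.0 * np.pi * theta)) % 1.0
    theta_new = (theta + p_new) % 1.0
    return theta_new, p_new

rng = np.random.default_rng(1)
M = 5000                                       # ensemble size
theta = rng.random(M)
p = 0.3 + 0.01 * (2.0 * rng.random(M) - 1.0)   # waterbag around p = 0.3
a = 10.0                                       # strongly chaotic regime
for _ in range(500):
    theta, p = standard_map_step(theta, p, a)
var = p.var()                                  # temperature analogue sigma_p^2
print(var)   # near 1/12 ~ 0.0833 once p equidistributes
```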
As before, we consider N standard maps but, this time, with a global, symplectic coupling in the coordinates :
This coupling arises as a natural choice. In fact, applying to the HMF model the difference procedure mentioned above for the standard map, we obtain precisely the a = 0 particular instance of model (4). This model has already been addressed in the literature12, but in a quite different context, related to the study of the Lyapunov exponents in the completely chaotic regime. We present next numerical simulations of the map system (4).We calculated the evolution of the variance of the momenta 0;. Our results show that, for waterbag initial conditions and appropriate ranges for the parameters a and b, two-step relaxation processes are again found. In Fig. 1 we show these results for different sizes of the system. It can be seen that the crossover time t , grows as t , N N1.07* thus never reaching BG equilibrium when N 00. In other words, the N + co and t + co limits do not commute. --f
Figure 1. Temperature evolution illustrating the presence of two-step relaxation (QSS) for typical system sizes. We have used a = 0.05, b = 2, and waterbag initial conditions within p_0 = 0.3 ± 0.01. Ensemble averages were done, typically over 100 realizations. Only much longer simulations could confirm, or exclude, the possibility that all curves, i.e. for all N, saturate at the equal-probability value 1/12 ≃ 0.083. Inset: The crossover time t_c corresponds to the inflexion point of T versus log t.
Finally, we calculated the largest Lyapunov exponent (LLE) λ_L (we recall that Lyapunov exponents measure the instability of dynamical trajectories, and provide a quantitative idea of the sensitivity to the initial conditions of the system). Indeed, for (a, b) parameters in the range illustrated in Figs. 1 and 2, we found that the dependence of the LLE on the system size is consistent with λ_L ∼ N^−0.40, i.e., a clear indication of weak chaos in the thermodynamic limit.
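A standard way to estimate the LLE quoted above is the Benettin tangent-vector method: evolve a perturbation with the Jacobian of the map and average the logarithm of its growth. A minimal sketch for a single standard map (the coupled N-map case follows the same scheme with a larger Jacobian; the value of K is an illustrative assumption):

```python
import numpy as np

def lle_standard_map(K=2.0, n_steps=200_000, seed=1):
    # Benettin method: evolve a tangent vector v = (dx, dp) with the
    # Jacobian of the map, accumulating log-growth and renormalizing.
    rng = np.random.default_rng(seed)
    x, p = rng.random(), rng.random()
    v = np.array([1.0, 0.0])
    acc = 0.0
    for _ in range(n_steps):
        c = K * np.cos(2.0 * np.pi * x)     # d/dx of (K/2pi) sin(2pi x)
        p = (p + (K / (2.0 * np.pi)) * np.sin(2.0 * np.pi * x)) % 1.0
        x = (x + p) % 1.0
        # Jacobian at the pre-update point; det J = 1 (symplectic).
        J = np.array([[1.0 + c, 1.0],
                      [c,       1.0]])
        v = J @ v
        norm = np.linalg.norm(v)
        acc += np.log(norm)
        v /= norm
    return acc / n_steps
```

Repeating this for increasing N is how a scaling such as λ_L ∼ N^−0.40 would be extracted.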
Figure 2. Time dependence of the effective largest Lyapunov exponent λ_L for typical sizes (same parameters as in Fig. 1). Inset: N-dependence of the asymptotic value of λ_L, consistent with weak chaos in the thermodynamic limit.
Summarizing, we presented a conservative model consisting of N standard maps symplectically coupled through the coordinates. We have found results suggestively similar to those obtained for other nonlinear dynamical systems, including the HMF model. More specifically, we found the double plateaux in the time evolution of the temperature, and a LLE which approaches zero with increasing size. We are currently studying several other quantities (e.g., correlation functions and momenta probability distribution functions), as well as the influence of (a, b) on the present ones. These results place the present system naturally within a series of nonlinear dynamical systems which starts with one-dimensional dissipative maps, follows with low-dimensional conservative maps, then many symplectically coupled maps, and ends with long-range many-body Hamiltonians. They all share important phenomena, typically related, in one way or another, to weak chaos and long-standing nonergodic occupation of phase space. These features precisely constitute the scenario within which nonextensive statistical mechanics appears to be the adequate thermostatistical theory, in analogy to Boltzmann-Gibbs statistical mechanics, successfully used for more than a century for strongly chaotic and ergodic systems. LGM thanks the organizers for warm hospitality at the meeting in Erice, Italy, in particular A. Rapisarda. Partial financial support from CNPq, Faperj and Pronex/MCT (Brazilian agencies) is acknowledged as well.
References
1. C. Tsallis, J. Stat. Phys. 52, 479 (1988). For a review see M. Gell-Mann and C. Tsallis, eds., Nonextensive Entropy - Interdisciplinary Applications (Oxford University Press, New York, 2004). For bibliography see http://tsallis.cat.cbpf.br/biblio.htm
2. E. G. D. Cohen, Physica A 305, 19 (2002); C. Tsallis, Physica A 340, 1 (2004); E. G. D. Cohen, Boltzmann Award Communication at Statphys-Bangalore-2004, Pramana (2005), in press.
3. C. Tsallis, A. Rapisarda, V. Latora and F. Baldovin, in Dynamics and Thermodynamics of Systems with Long-Range Interactions, eds. T. Dauxois, S. Ruffo, E. Arimondo and M. Wilkens, Lecture Notes in Physics 602 (Springer, Berlin, 2002), p. 140.
4. M. Antoni and S. Ruffo, Phys. Rev. E 52, 2361 (1995).
5. T. Dauxois, V. Latora, A. Rapisarda, S. Ruffo and A. Torcini, in Dynamics and Thermodynamics of Systems with Long-Range Interactions, eds. T. Dauxois, S. Ruffo, E. Arimondo and M. Wilkens, Lecture Notes in Physics 602 (Springer, Berlin, 2002), p. 458.
6. By waterbag initial conditions we mean totally aligned spins, with momenta taken from a uniform distribution. See, for instance, A. Pluchino, V. Latora and A. Rapisarda, Physica A 338, 60 (2004).
7. A. Pluchino, V. Latora and A. Rapisarda, Physica D 193, 315 (2004).
8. F. Baldovin and A. Robledo, Phys. Rev. E 66, 045104(R) (2002); F. Baldovin and A. Robledo, Phys. Rev. E 69, 045202(R) (2004).
9. F. Baldovin, L. G. Moyano, A. P. Majtey, A. Robledo and C. Tsallis, Physica A 340, 205 (2004).
10. E. Ott, Chaos in Dynamical Systems (Cambridge University Press, Cambridge, 1993).
11. F. Baldovin, E. Brigatti and C. Tsallis, Phys. Lett. A 320, 254 (2004).
12. V. Ahlers, R. Zillmer and A. Pikovsky, Phys. Rev. E 63, 036213 (2001).
DETERMINISTIC AGING *
ELI BARKAI
Department of Physics, Bar-Ilan University, Ramat-Gan 52900, Israel
E-mail: barkaie@mail.biu.ac.il

We investigate aging behavior in a dynamical system: a non-linear map which generates sub-diffusion deterministically. Behaviors of the diffusion process are described using aging continuous time random walks. We briefly relate the aging behavior to other anomalous features of the map: q-exponential sensitivity of trajectories to initial conditions, divergence of escape times from unstable fixed points, anomalous diffusion, breaking of ergodicity, and the absence of an invariant measure.
There is growing interest in physical systems which exhibit aging behavior. Aging is found in glasses, polymers, and in random walks in random environments. These disordered complex systems are composed of many interacting units, and stochastic forces govern their dynamics. In contrast, we recently showed that a low-dimensional model, a deterministic non-linear map which has no element of disorder built into it, exhibits aging behavior¹. The aging behavior is related to the divergence of the average waiting time in the vicinity of the unstable fixed points of the map under investigation (see details below). The diverging waiting time is also responsible for a non-stationary evolution which leads to anomalous diffusion and ergodicity breaking, which are behaviors related to aging. The dynamics of the map in the vicinity of the unstable fixed point is governed by q-exponential sensitivity of the trajectories to initial conditions, i.e. weak chaos (see details below). The relation of such q-exponential behavior in models related to ours and Tsallis statistics is a subject of ongoing research³,⁴,⁵. Probably the simplest theoretical tool which generates normal and anomalous diffusion deterministically is one-dimensional maps
x_{t+1} = x_t + F(x_t)    (1)
with the following symmetry properties of F(x): (i) F(x) is periodic with a periodicity interval set to 1, F(x) = F(x + N), where N is an integer; (ii) F(x) has inversion anti-symmetry, namely F(x) = −F(−x); t in Eq. (1) is the discrete time. Geisel and Thomae considered a rather wide family of such maps which behave like
F(x) = a x^z  for x → 0+,    (2)

*This work is supported by the Center of Complexity, Jerusalem.
where z > 1. Eq. (2) defines the property of the map close to its unstable fixed point. In numerical experiments soon to be discussed I will use the map

F(x) = (2x)^z,  0 ≤ x ≤ 1/2,    (3)
which together with the symmetry properties of the map defines the mapping for all x. In Fig. 1 I show the map for three unit cells. It is important to emphasize that the main properties of the aging behavior I investigate will not be sensitive to the detailed shape of the map, besides its behavior in the vicinity of the fixed points, Eq. (2). To investigate aging, e.g. numerically, I choose an ensemble of initial conditions x_{−t_a}, chosen randomly and uniformly in the interval −1/2 < x_{−t_a} < 1/2. The quantity of interest is the displacement in the interval (0, t), x = x_t − x_0, which is obtained using the map Eq. (1). Previous work⁶ considered the non-aging regime, namely t_a = 0. In numerical simulations, averages like ⟨x²(t_a, t)⟩ are averages over the set of initial conditions, which generally depend both on t and on t_a. In an ongoing process a walker following the iteration rules may get stuck close to the vicinity of unstable fixed points of the map (see Fig. 1). It has been shown, both analytically and numerically, that the probability density function (PDF) of escape times of trajectories from the vicinity of the fixed points decays like a power law⁶. To see this, one considers the dynamics in half a unit cell, say 0 < x < 1/2. Assume that at time t = 0 the particle is on x*, residing in the vicinity of the fixed point x = 0. Close to the fixed point we may approximate the map Eq. (1) with the differential equation dx/dt = F(x) = a x^z. This equation is reminiscent of the equation defining the q-generalized Lyapunov exponent³. The solution is written in terms of the q-exponential function, exp_q(y) = [1 + (1 − q)y]^{1/(1−q)}, where q = z and
x_t = x* exp_q(a x*^{z−1} t).    (4)

We invert Eq. (4) and obtain the escape time from x* to a boundary b (x* < b < 1/2): t ≃ ∫_{x*}^{b} [F(x)]^{−1} dx; using Eq. (2), t ≃ [a(z − 1)]^{−1} [x*^{−z+1} − b^{−z+1}], a ln_q behavior. The PDF of escape times ψ(t) is related to the unknown PDF of injection points η(x*) through the chain rule ψ(t) = η(x*) |dx*/dt|. Expanding η(x*) around the unstable fixed point x* = 0, one finds that for large escape times

ψ(t) ∼ A t^{−1−α} / |Γ(−α)|,  α = 1/(z − 1),    (5)
where A depends on the PDF of injection points, namely on how trajectories are injected from one cell to the other. The parameter A will be sensitive to the detailed shape of the map, which implies that it is non-universal. In contrast, the parameter z depends only on the behavior of the map close to the unstable fixed points. When z > 2, corresponding to α < 1, the average escape time diverges. The consequence of this is a non-stationary evolution which leads to anomalous diffusion, ergodicity
breaking, and aging. In turn these behaviors are related to the observation that the invariant time-independent density is never reached (the latter is defined only on 0 < x < 1/2 with suitable boundary conditions; see also⁷). Since in our problem q = z > 1, the relation between q and the anomalous diffusion exponent is α = 1/(q − 1).
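The heavy-tailed escape-time statistics of Eq. (5) can be probed directly by iterating the map in half a unit cell. The uniform injection density and its lower cutoff below are assumptions for illustration (the true η(x*) depends on the inter-cell dynamics, and the cutoff merely keeps run times finite):

```python
import numpy as np

def escape_times(z=3.0, b=0.5, n=3000, x_min=1e-3, seed=0):
    # Iterate x -> x + (2x)^z (Eq. (3) restricted to half a unit cell)
    # from injection points x* until the trajectory passes b.
    # Uniform injection on (x_min, b) is an assumption; x_min truncates
    # the tail so the simulation terminates quickly.
    rng = np.random.default_rng(seed)
    times = np.empty(n, dtype=np.int64)
    for i in range(n):
        x = rng.uniform(x_min, b)
        t = 0
        while x < b:
            x = x + (2.0 * x) ** z
            t += 1
        times[i] = t
    return times
```

For z = 3 (α = 1/2) a log-log histogram of these times exhibits the t^{−3/2} decay predicted by Eq. (5).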
Figure 1. The map x_{t+1} = x_t + F(x_t), defined by Eq. (3) with z = 3. The linear dash-dot curve is x_{t+1} = x_t. The unstable fixed points are at x_t = 0, 1, 2.
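The full map sketched in Fig. 1 follows from Eq. (3) together with symmetry properties (i) and (ii), and iterating an ensemble of walkers gives the displacement statistics ⟨x²(t_a, t)⟩ analyzed in the text. A minimal sketch (ensemble sizes are illustrative):

```python
import numpy as np

def F(x, z=3.0):
    # Eq. (3) extended to the whole line via the stated symmetries:
    # periodicity F(x) = F(x + N) and anti-symmetry F(x) = -F(-x).
    xr = x - np.round(x)                  # reduce to the cell [-1/2, 1/2)
    return np.sign(xr) * (2.0 * np.abs(xr)) ** z

def displacements(n=2000, t_steps=2000, z=3.0, seed=0):
    # Ensemble of walkers started uniformly in one cell; returns x_t - x_0.
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(-0.5, 0.5, n)
    x = x0.copy()
    for _ in range(t_steps):
        x = x + F(x, z)
    return x - x0
```

Averaging the squared output over the ensemble at successive times yields the sub-diffusive growth discussed below.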
To consider stochastic properties of the aging dynamics I now investigate aging continuous time random walks (ACTRW)¹⁰,¹¹, deriving an explicit expression for the asymptotic behavior of the Green function. ACTRW describes the aging properties of the well known CTRW, and was introduced in the context of aging of the trap model by Monthus and Bouchaud¹⁰. ACTRW considers a one-dimensional nearest-neighbor lattice random walk, where lattice points correspond to the cells of the iterated maps. Waiting times on each lattice point are assumed to be described by ψ(t). Note that after each jumping event it is assumed that the process is renewed; namely, we neglect correlations between motions in neighboring cells. This assumption will be justified later using numerical simulations. As mentioned, the start of the ACTRW process is at t = −t_a, and our goal is to find the ACTRW Green function P(x, t_a, t), where x is the random displacement in the interval (0, t) after the random walk was aged for a period t_a. In ACTRW we must introduce the distribution of the first waiting time t_1: the time elapsing between the start of observation at t = 0 and the first jump event in the interval (0, t). Let h_{t_a}(t_1) be the PDF of t_1, and let h_s(u) be the double Laplace
transform (t_a → s, t_1 → u) of h_{t_a}(t_1)¹¹,¹²:

h_s(u) = [ψ̂(s) − ψ̂(u)] / [(u − s)(1 − ψ̂(s))].    (6)

When z > 2, corresponding to α < 1 in Eq. (5),

h_{t_a}(t_1) ∼ [sin(πα)/π] t_a^α / [t_1^α (t_1 + t_a)],    (7)
which is valid in the long-aging-time limit. Note that Eq. (7) is independent of the exact form of ψ(t), besides the exponent α. When α → 1 the mass of the PDF h_{t_a}(t_1) is concentrated in the vicinity of t_1 → 0, as expected from a 'normal process'. I have checked numerically the predictions of Eq. (7) for z = 3, analyzing trajectories generated by the map Eq. (3) with three different aging times. In Fig. 2 I show the probability of making at least one step in the interval (0, t): ∫_0^t h_{t_a}(t_1) dt_1 = 1 − p_0(t_a, t), where p_0(t_a, t) is the probability of making no steps, i.e. the persistence probability. The results show a good agreement between numerical results and the theoretical prediction Eq. (7) without fitting. Fig. 2 clearly demonstrates that as the aging time becomes larger, the time for the first jumping event, from one cell to its neighbor, becomes larger in a statistical sense (i.e. the older the particle gets, the smaller its tendency to make a jump). The aging behavior is clearly related to the slow escape times from the vicinity of fixed points (when z > 2): as the age of the process is increased, there is more likelihood of finding the particles very close to the
Figure 2. The probability of making at least one step in a time interval (0, t) for different aging times specified in the figure. The solid curve is the theoretical prediction Eq. (7); the dotted, dashed, and dot-dashed curves are obtained from numerical solution of the map with z = 3.
unstable fixed points, which in turn means that they become more localized. The interesting observation is that this aging behavior is captured by the limit theorem Eq. (7). We now investigate the ACTRW Green function. Let P(k, s, u) be the double-Laplace-Fourier transform (x → k, t_a → s, t → u) of P(x, t_a, t); then it was shown¹,¹¹ that
P(k, s, u) = 1/(su) + [ψ̂(u) − ψ̂(s)][1 − cos(k)] / {u (u − s)[1 − ψ̂(s)][1 − ψ̂(u) cos(k)]}.    (8)
Eq. (8) is a generalization of the well known Montroll-Weiss equation describing the non-equilibrium CTRW process¹³,¹⁴. Note that only if the underlying process is a Poisson process is the Green function P(x, t_a, t) independent of the age of the process t_a. Before considering the behavior of the Green function P(x, t_a, t), let me consider the second moment. By differentiating Eq. (8) with respect to k twice, setting k = 0, and using a Tauberian theorem, I obtain the mean square displacement of the random walk for t, t_a ≫ A^{1/α}:

⟨x²(t_a, t)⟩ ∝ (t + t_a)^α − t_a^α.    (9)
For times t ≫ t_a I recover the standard CTRW behavior, ⟨x²(t_a, t)⟩ ∝ t^α ¹⁴. For t ≪ t_a I find ⟨x²(t_a, t)⟩ ∝ t/t_a^{1−α}; hence as t_a becomes larger, the diffusion in this regime is slowed down. This ACTRW behavior was compared to numerical simulations of the map, and good agreement between ACTRW and the numerical simulations is found. In ¹,¹¹ we investigate properties of the Green function
Figure 3. The Green function obtained from simulation of the deterministic map in scaling form for: (i) t_a = 10⁴, t = 10³, circles; (ii) t_a = 10⁴, t = 10⁴, triangles; (iii) t_a = 0, t = 4 × 10⁵, diamonds. The curves are our theoretical results, which are in good agreement with the simulations with z = 3.
P(x, t_a, t) in the limit of long t and t_a, and one finds that the Green function is a sum of two terms:
P(x, t_a, t) ≃ p_0(t_a, t) δ(x) + h_{t_a}(t) ⊗ G_{α/2}(x, t),    (10)

where p_0(t_a, t) is the persistence probability introduced above. The first term on the right hand side of Eq. (10) is a singular term. It corresponds to a random walk which did not make a jump in the time interval (0, t). The symbol ⊗ in the second term in Eq. (10) is the Laplace convolution operator with respect to the forward time t, and G_{α/2}(x, t) denotes the Green function of the non-aged walk, expressed through the one-sided Lévy stable PDF l_{α/2}(t), whose Laplace pair is exp(−u^{α/2}). In the limit α → 1 we get a Gaussian Green function which is independent of t_a. The behavior of the Green function Eq. (10) is shown in Fig. 3. A good agreement between simulations and the ACTRW Green function is obtained. Not shown is the singular behavior at the origin [i.e., the δ(x) term in Eq. (10)]. The behavior of this singular term is displayed in Fig. 2. The applications of the ACTRW stochastic model are not limited to the non-linear map under investigation. For example, Refs. 10, 11, 15 discuss transport in disordered media and glassy dynamics. Elsewhere we discuss an aging fractional diffusion equation, which generalizes the results in ¹³,¹⁴ to the aging regime, while in ¹⁶ a related stochastic aging of blinking quantum dots is considered. The ACTRW model is an extension of the well known CTRW model. The CTRW is used to model many systems when non-equilibrium initial conditions are considered. When aging initial conditions apply, aging CTRW equations must be considered. Hence we expect the domain of validity of the aging CTRW framework to be wide. Here we demonstrated the validity of the ACTRW framework for the example of deterministic diffusion generated using non-linear maps. It would be interesting to give other examples of simple dynamical systems which exhibit statistical aging behavior, to see how general the aging behavior in this manuscript is. In particular, more general relations and precise connections between q-exponential behavior of weakly chaotic systems and aging, anomalous diffusion and ergodicity breaking will continue to be of interest in future research.
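The two ACTRW signatures discussed above — the aging of the first waiting time, Eq. (7), and the slowed mean square displacement, Eq. (9) — can be checked with a generic aged renewal simulation. Pareto waiting times are used here as a stand-in for the escape-time law Eq. (5) (an assumption: any waiting-time PDF with the same tail exponent should give the same asymptotics):

```python
import numpy as np

def aged_ctrw(alpha=0.5, t_age=1.0e4, t_obs=100.0, n=1500, seed=0):
    # Nearest-neighbor CTRW started at time -t_age with heavy-tailed
    # waiting times psi(tau) ~ alpha tau^(-1-alpha) (shifted Pareto).
    # Returns (first waiting times t1, displacements x(t_obs) - x(0)).
    rng = np.random.default_rng(seed)
    t1 = np.empty(n)
    disp = np.empty(n)
    for i in range(n):
        clock, pos, pos0, first = -t_age, 0, None, None
        while True:
            clock += rng.pareto(alpha) + 1.0      # next jump epoch
            if clock > 0.0 and pos0 is None:
                pos0 = pos                        # position at t = 0
                first = clock                     # forward recurrence time
            if clock >= t_obs:
                break
            pos += 1 if rng.random() < 0.5 else -1
        t1[i] = first
        disp[i] = pos - pos0
    return t1, disp
```

Comparing runs with small and large t_age reproduces both trends: older processes make their first jump later, and diffuse less within the same observation window.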
References
1. E. Barkai, Phys. Rev. Lett. 90, 104101 (2003).
2. J. Klafter, M. F. Shlesinger, and G. Zumofen, Phys. Today 49 (2), 33 (1996).
3. F. Baldovin and A. Robledo, Europhys. Lett. 60, 518 (2002).
4. A. Robledo, Physica A 342, 104 (2004).
5. G. F. J. Ananos and C. Tsallis, Phys. Rev. Lett. 93, 020601 (2004).
6. T. Geisel and S. Thomae, Phys. Rev. Lett. 52, 1936 (1984).
7. G. Zumofen and J. Klafter, Phys. Rev. E 47, 851 (1993).
8. M. Ignaccolo, P. Grigolini, and A. Rosa, Phys. Rev. E 64, 026210 (2001).
9. P. Allegrini, G. Aquino, P. Grigolini, et al., Phys. Rev. E 68, 056123 (2003).
10. C. Monthus and J. P. Bouchaud, J. Phys. A 29, 3847 (1996).
11. E. Barkai and Y. C. Cheng, J. Chem. Phys. 118, 6167 (2003).
12. C. Godrèche and J. M. Luck, J. Stat. Phys. 104, 489 (2001).
13. G. M. Zaslavsky, Phys. Rep. 371, 461 (2002).
14. R. Metzler and J. Klafter, Phys. Rep. 339, 1 (2000).
15. E. M. Bertin and J. P. Bouchaud, Phys. Rev. E 67, 065105 (2003).
16. G. Margolin and E. Barkai, J. Chem. Phys. 121, 1566 (2004).
EDGE OF CHAOS OF THE CLASSICAL KICKED TOP MAP: SENSITIVITY TO INITIAL CONDITIONS
SÍLVIO M. DUARTE QUEIRÓS
Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, 22290-180 Rio de Janeiro-RJ, Brazil
E-mail: [email protected]

CONSTANTINO TSALLIS
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
E-mail: [email protected]
and Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, 22290-180 Rio de Janeiro-RJ, Brazil

We focus on the frontier between the chaotic and regular regions for the classical version of the quantum kicked top. We show that the sensitivity to the initial conditions is numerically well characterised by ξ = e_q^{λ_q t}, where e_q^x ≡ [1 + (1 − q) x]^{1/(1−q)} (e_1^x = e^x) and λ_q is the q-generalization of the Lyapunov coefficient, a result that is consistent with nonextensive statistical mechanics, based on the entropy S_q = (1 − Σ_i p_i^q)/(q − 1) (S_1 = −Σ_i p_i ln p_i). Our analysis shows that q monotonically increases from zero to unity when the kicked-top perturbation parameter α increases from zero (unperturbed top) to α_c, where α_c ≃ 3.2. The entropic index q remains equal to unity for α ≥ α_c, parameter values for which the phase space is fully chaotic.
1. Introduction
Non-linearity is ubiquitously present in nature, e.g., fluid turbulence¹, extinction/survival of species in ecological systems², finance³, the rings of Saturn⁴, and others. Consistently, the study of low-dimensional non-linear maps plays a significant role in a better understanding of complex problems like the ones just stated. In a classical context, a main characterisation of the dynamical state of a non-linear system consists in the analysis of its sensitivity to initial conditions. From this standpoint, the concept of chaos emerged as tantamount to strong sensitivity to initial conditions⁵. In other words, a system is said to be chaotic if the distance between two close initial points increases exponentially in time. The appropriate theoretical frame to study chaotic and regular behaviour of non-linear dynamical systems has long been well established. It is not so for the region in between, the edge of chaos, which has only recently started to be appropriately characterised, by means of the so-called nonextensive statistical mechanical concepts⁶. In this article we study the sensitivity to initial conditions at this intermediate region for the classical
kicked top map and its dependence on the perturbation parameter α. The sensitivity to initial conditions is defined through ξ(t) ≡ lim_{||Δr⃗(0)||→0} ||Δr⃗(t)||/||Δr⃗(0)||, where Δr⃗(t) represents the difference, at time t, between two trajectories in phase space. When the system is in a chaotic state, ξ increases, as stated previously, through an exponential law, i.e.,
ξ(t) = e^{λ_1 t},    (1)

where λ_1 is the maximum Lyapunov exponent (the subscript 1 will become transparent right ahead). Equation (1) can also be regarded as the solution of the differential equation dξ/dt = λ_1 ξ. In addition to this, the exponential form has a special relation with the Boltzmann-Gibbs entropy S_BG = −Σ_i p_i ln p_i. Indeed, the optimization of S_BG under a fixed mean value of some variable x yields p(x) ∝ e^{−βx}, where β is the corresponding Lagrange parameter. Excepting some pioneering work⁷, for many years the references to the sensitivity at the edge of chaos were restricted to mentioning that it corresponds to λ_1 = 0, with trajectories diverging as a power law of time⁵. With the emergence of nonextensive statistical mechanics⁶, a new interest in the subject has appeared, and it has been possible to show details, first numerically⁹,¹⁰,¹¹,¹²,¹³ and then analytically (for one-dimensional dissipative unimodal maps)¹⁴. Albeit λ_1 vanishes, it is possible to express the divergence between trajectories with a form which conveniently generalizes the exponential function, namely the q-exponential form
ξ(t) = exp_{q_s}(λ_{q_s} t)  (λ_{q_s} > 0; q_s < 1),    (2)
where exp_q x ≡ [1 + (1 − q)x]^{1/(1−q)} (exp_1 x = e^x), and λ_{q_s} represents the generalised Lyapunov coefficient (the subscript s stands for sensitivity)¹². Equation (2) can be viewed as the solution of dξ/dt = λ_{q_s} ξ^{q_s}. Analogously to strong chaos (i.e., λ_1 > 0), if we optimize the entropy¹⁵ S_q = (1 − Σ_i p_i^q)/(q − 1) (S_1 = S_BG) under the same type of constraint as before, we obtain p(x) ∝ exp_q(−β_q x), where β_q generalizes β.
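The claim that Eq. (2) solves dξ/dt = λ_{q_s} ξ^{q_s} is easy to verify numerically: a crude Euler integration of the differential equation should converge to the closed-form q-exponential (step size and parameter values below are illustrative):

```python
import numpy as np

def exp_q(x, q):
    # q-exponential: [1 + (1-q) x]^(1/(1-q)); reduces to exp(x) as q -> 1.
    if abs(q - 1.0) < 1e-12:
        return np.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

def integrate_sensitivity(lam, q, t_max, dt=1e-4):
    # Euler integration of d(xi)/dt = lam * xi**q with xi(0) = 1.
    xi = 1.0
    for _ in range(int(t_max / dt)):
        xi += dt * lam * xi ** q
    return xi
```

For instance, with q = 1/2 and λ = 0.3, the integrated value at t = 2 agrees with exp_q(0.6, 0.5) = (1 + 0.3)² = 1.69 to within the discretization error.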
2. The classical kicked top map
The classical kicked top corresponds to a map on the unit sphere x² + y² + z² = 1, namely

x_{t+1} = z_t
y_{t+1} = x_t sin(α z_t) + y_t cos(α z_t)    (3)
z_{t+1} = −x_t cos(α z_t) + y_t sin(α z_t),

where α denotes the kick strength. It is straightforward to verify that the determinant of the Jacobian matrix of (3) equals one, meaning that this map is conservative. It is therefore quite analogous to Hamiltonian conservative systems, the phase space of which consists of a mixture of regular (the famous KAM tori) and chaotic regions characterised, respectively, by a linear (q_s = 0) and exponential (q_s = 1) time
Figure 1. (a) Orbit of the α = 2.3 kicked top, where chaotic and regular regions are visible. The spherical phase space is projected onto the x-z plane by multiplying the x and z coordinates of each point by R/r. (b-d) Time dependence of the sensitivity to initial conditions (with ||Δr⃗(0)|| = 10⁻¹⁰) at (b) a regular region (△; linear evolution), (c) the edge of chaos (+; q_s-exponential evolution), and (d) a chaotic region (▽; exponential evolution).
evolution of ξ¹². The region of separation presents a q_s-exponential law for its sensitivity. In Fig. 1(a) we exhibit a trajectory where the various regions can be seen. In Figs. 1(b,c,d) we see the time evolution of ξ for the three possible stages: regular, edge of chaos and chaotic, respectively. It is worth mentioning at this point that the quantum version of this map constitutes a paradigmatic example of quantum chaos. At its threshold to chaos, a nonextensive behaviour has been verified (for details see ¹⁶). We analysed here the sensitivity to initial conditions on the verge of chaos of (3), for several values of the kick strength α ∈ [0, 4], averaged over a set of (typically 50) initial conditions for each value of α. More precisely, for fixed α, aided by its typical orbits, we determined a set of points on the regular-chaos border and then, for these points, determined the average value of ξ at time t. See typical results in Figs. 2 and 3. We verify that the increase of α induces a gradual approach of q_s to 1. This behaviour is in accordance with what was verified¹² for a non-linear system composed of two symplectically coupled standard maps. Summarising, we numerically analysed the sensitivity to initial conditions at the edge of chaos of the conservative classical kicked top, and found that its time evolution exhibits a q_s-exponential behavior in all cases. For α = 0, the phase space is composed of a regular region, where the sensitivity depends linearly on time, hence q_s = 0. As α increases, the top is more perturbed, hence chaotic regions emerge in phase space. Above some critical value α_c ≃ 3.2, the chaotic region fills the entire phase space. Consistently, the usual exponential dependence (i.e., q_s = 1)
Figure 2. Time dependence of the averaged (over close to 50 points of the edge-of-chaos region) sensitivity ⟨ξ⟩ to initial conditions, for typical values of α. Insets: Same data but using a ln_{q_s} ordinate, where ln_q(x) ≡ (x^{1−q} − 1)/(1 − q) (ln_1 = ln). With this q-logarithm ordinate, the slope of the straight line is simply λ_{q_s}.
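The quantity plotted in Fig. 2 can be sketched directly from the map of Eq. (3): iterate two nearby points on the sphere and track the growth of their separation. The initial condition and kick strength below are illustrative, not the edge-of-chaos points used for the figures:

```python
import numpy as np

def kicked_top(x, y, z, alpha):
    # One iteration of the classical kicked top map, Eq. (3).
    s, c = np.sin(alpha * z), np.cos(alpha * z)
    return z, x * s + y * c, -x * c + y * s

def sensitivity(alpha, r0, t_max, delta=1e-10):
    # xi(t) = ||Delta r(t)|| / ||Delta r(0)|| for two nearby trajectories.
    a = np.asarray(r0, dtype=float)
    b = a + np.array([delta, 0.0, 0.0])
    b /= np.linalg.norm(b)               # keep the companion on the sphere
    d0 = np.linalg.norm(a - b)
    xi = np.empty(t_max)
    for t in range(t_max):
        a = np.array(kicked_top(*a, alpha))
        b = np.array(kicked_top(*b, alpha))
        xi[t] = np.linalg.norm(a - b) / d0
    return xi
```

Plotting ln_{q_s}(⟨ξ⟩) versus t for edge-of-chaos points, as in the insets of Fig. 2, is then a matter of averaging this output over the selected border points.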
Figure 3. The α-dependence of q_s. At some critical value α_c ≃ 3.2, q_s reaches unity (corresponding to a fully chaotic phase space), and maintains this value for all α ≥ α_c.
is recovered. These results can be useful to understand, within a nonextensive statistical mechanical framework, the everlasting metastable states that are known to exist in systems composed of many symplectically coupled maps¹⁷, as well as in isolated many-body long-range-interacting classical Hamiltonians¹⁸. We would like to thank F. Baldovin and G. F. J. Añaños for fruitful discussions, as well as FAPERJ, PRONEX and CNPq (Brazilian agencies) and FCT/MCES (Portuguese agency) for partial financial support.

References
1. C. Beck, Physica A 233, 419 (1996).
Figure 4. Representation of the orbits for typical values of α. As α increases we verify the emergence of chaotic regions that, for α ≥ α_c ≃ 3.2, fill the entire phase space.

2. V. Volterra, Leçon sur la théorie mathématique de la lutte pour la vie (Gauthier-Villars, Paris, 1931); G. A. Tsekouras, A. Provata and C. Tsallis, Phys. Rev. E 69, 016120 (2004).
3. L. Borland, Phys. Rev. Lett. 89, 098701 (2002); J. D. Farmer, Toward Agent-Based Models for Investment, in Developments in Quantitative Investment Models, ed. R. M. Darnell (Assn for Investment Management, 2001).
4. J. Froyland, Introduction to Chaos and Coherence (IOP Publishing, Bristol, 1992).
5. Ya. Pesin, Russ. Math. Surveys 32, 55 (1977); Hamiltonian Dynamical Systems: A Reprint Selection, eds. R. S. MacKay and J. D. Meiss (Adam Hilger, Bristol, 1987).
6. M. Gell-Mann and C. Tsallis, eds., Nonextensive Entropy - Interdisciplinary Applications (Oxford University Press, New York, 2004).
7. P. Grassberger and M. Scheunert, J. Stat. Phys. 26, 697 (1981); T. Schneider, A. Politi and D. Wurtz, Z. Phys. B 66, 469 (1987); G. Anania and A. Politi, Europhys. Lett. 7, 119 (1988); H. Hata, T. Horita and H. Mori, Progr. Theor. Phys. 82, 897 (1989).
8. G. M. Zaslavsky, R. Z. Sagdeev, D. A. Usikov and A. A. Chernikov, Weak Chaos and Quasi-Regular Patterns (Cambridge University Press, Cambridge, 1992).
9. C. Tsallis, A. R. Plastino and W.-M. Zheng, Chaos, Solitons & Fractals 8, 885 (1997).
10. M. L. Lyra and C. Tsallis, Phys. Rev. Lett. 80, 53 (1998).
11. E. P. Borges, C. Tsallis, G. F. J. Añaños, and P. M. C. de Oliveira, Phys. Rev. Lett. 89, 254103 (2002).
12. G. F. J. Añaños, F. Baldovin and C. Tsallis, cond-mat/0403656.
13. G. F. J. Añaños and C. Tsallis, Phys. Rev. Lett. 93, 020601 (2004).
14. F. Baldovin and A. Robledo, Europhys. Lett. 60, 518 (2002), and Phys. Rev. E 66, R045104 (2002).
15. C. Tsallis, J. Stat. Phys. 52, 479 (1988).
16. Y. Weinstein, S. Lloyd and C. Tsallis, Phys. Rev. Lett. 89, 214101 (2002); also in Decoherence and Entropy in Complex Systems, ed. H. T. Elze, Lecture Notes in Physics (Springer, Heidelberg, 2003).
17. F. Baldovin, E. Brigatti and C. Tsallis, Phys. Lett. A 320, 254 (2004); F. Baldovin, L. G. Moyano, A. P. Majtey, A. Robledo and C. Tsallis, Physica A 340, 205 (2004).
18. V. Latora, A. Rapisarda and C. Tsallis, Phys. Rev. E 64, 056134 (2001); F. D. Nobre and C. Tsallis, Phys. Rev. E 68, 036115 (2003).
WHAT ENTROPY AT THE EDGE OF CHAOS? *
MARCELLO LISSIA, MASSIMO CORADDU AND ROBERTO TONELLI
Ist. Naz. Fisica Nucleare (I.N.F.N.), Dipart. di Fisica dell'Università di Cagliari, INFM-SLACS Laboratory, I-09042 Monserrato (CA), Italy
E-mail: marcello.lissia@ca.infn.it
Numerical experiments support the interesting conjecture that statistical methods are applicable not only to fully-chaotic systems, but also at the edge of chaos, by using Tsallis' generalizations of the standard exponential and entropy. In particular, the entropy increases linearly and the sensitivity to initial conditions grows as a generalized exponential. We show that this conjecture actually has a broader validity, using a large class of deformed entropies and exponentials and the logistic map as test cases.
Chaotic systems at the edge of chaos constitute natural experimental laboratories for extensions of Boltzmann-Gibbs statistical mechanics. The concept of generalized exponential could unify power-law and exponential sensitivity to initial conditions, leading to the definition of generalized Lyapunov exponents¹: the sensitivity ξ ≡ lim_{t→∞} lim_{Δx(0)→0} Δx(t)/Δx(0) = ẽxp(λ̃t), where the generalized exponential can be chosen as ẽxp(x) = exp_q(x) = [1 + (1 − q)x]^{1/(1−q)}; the exponential behavior of the fully-chaotic regime is recovered for q → 1: lim_{q→1} exp_q(λ_q t) = exp(λt). Analogously, a generalization of the Kolmogorov entropy should describe the relevant rate of loss of information. A general discussion of the relation between the Kolmogorov-Sinai entropy rate and the statistical entropy of fully-chaotic systems can be found in Ref. 2: asymptotically and for ideal coarse graining, the entropy grows linearly with time. The generalized entropy proposed by Tsallis, S_q = (1 − Σ_i p_i^q)/(q − 1), with p_i the fraction of the ensemble found in the i-th cell, reproduces this picture at the edge of chaos; it grows linearly for a specific value of the entropic parameter q = q_sens = 0.2445 in the logistic map: lim_{t→∞} lim_{L→0} S_q(t)/t = K_q. The same exponent describes the asymptotic power-law sensitivity to initial conditions⁴. This conjecture includes an extension of the Pesin identity, K_q = λ_q. Numerical evidence with the entropic form S_q exists for the logistic and generalized logistic-like maps⁴. Renormalization group methods yield the asymptotic exponent of the sensitivity to initial conditions in the logistic and generalized logistic maps for specific
*This work was partially supported by MIUR (Ministero dell'Istruzione, dell'Università e della Ricerca) under MIUR-PRIN-2003 project "Theoretical physics of the nucleus and the many-body systems".
initial conditions on the attractor; the Pesin identity for Tsallis' entropy has also been studied⁶. Sensitivity and entropy production have been studied in one-dimensional dissipative maps using ensemble-averaged initial conditions⁸ and for two symplectic standard maps⁹: the statistical picture has been confirmed with a different q = q_sens^av = 0.35⁸. The ensemble averaging over initial conditions is relevant for the relation between ergodicity and chaos and for practical experiments. The present study demonstrates the broader applicability of the above-described picture by using the consistent statistical mechanics arising from the two-parameter family of logarithms¹⁰,¹¹,¹²,¹³.
Physical requirements [14] on the resulting entropy select [15] 0 ≤ α ≤ 1 and 0 ≤ β < 1. All the entropies of this class: (i) are concave [12], (ii) are Lesche stable [16], and (iii) yield normalizable distributions [15]; in addition, we shall show that they (iv) yield a finite non-zero asymptotic rate of entropy production for the logistic map with the appropriate choice of α. We have considered the whole class, but we shall here report results for three interesting one-parameter cases: (1) the original Tsallis proposal (α = 1 − q, β = 0);
(2) Abe's logarithm, with α_A = 1/(1 + a) and β_A = a/(1 + a), which has the same quantum-group symmetry as, and is related to, the entropy introduced in Ref. 17; (3) Kaniadakis' logarithm, α = β = κ, which shares the symmetry group of the relativistic momentum transformation [18].
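The three one-parameter cases can be sketched as special cases of a two-parameter deformed logarithm; we assume here the form ln_{α,β}(x) = (x^α − x^{−β})/(α + β), consistent with the limits quoted in the text (the parametrization of Abe's case is our reading; function names are ours):

```python
import math

def gen_log(x, alpha, beta):
    """Two-parameter deformed logarithm (x**alpha - x**(-beta)) / (alpha + beta);
    it reduces to ln(x) as alpha, beta -> 0."""
    if alpha + beta == 0.0:
        return math.log(x)
    return (x ** alpha - x ** (-beta)) / (alpha + beta)

def tsallis_log(x, q):
    """Case (1): alpha = 1 - q, beta = 0, i.e. (x**(1-q) - 1)/(1 - q)."""
    return gen_log(x, 1.0 - q, 0.0)

def abe_log(x, a):
    """Case (2), our reading: alpha = 1/(1+a), beta = a/(1+a)."""
    return gen_log(x, 1.0 / (1.0 + a), a / (1.0 + a))

def kaniadakis_log(x, kappa):
    """Case (3): alpha = beta = kappa, i.e. (x**kappa - x**(-kappa))/(2*kappa)."""
    return gen_log(x, kappa, kappa)
```

All three reduce to the ordinary logarithm when their deformation parameter vanishes, which is the α = β = 0 Shannon limit used below.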
The sensitivity to initial conditions and the entropy production have been studied in the logistic map x_{i+1} = 1 − μx_i² at the infinite-bifurcation point μ∞ = 1.401155189. The deformed logarithm ln_{α,β}(ξ) of the sensitivity, ξ(t) = (2μ)^t Π_{i<t} |x_i| for 1 ≤ t ≤ 80, has been uniformly averaged by randomly choosing 4 × 10^7 initial conditions −1 < x_0 < 1. Analogously to the chaotic regime, the deformed logarithm of ξ should yield a straight line, ln_{α,β}(ξ(t)) = λt. Following Ref. 8, where the exponent obtained with this averaging procedure, indicated by ⟨...⟩, was denoted q_sens^av for Tsallis' entropy, each of the generalized logarithms ⟨ln_{α,β}(ξ(t))⟩ has been fitted to a quadratic function for 1 ≤ t ≤ 80 and α
has been chosen such that the coefficient of the quadratic term is zero: we call this value α*. Statistical errors, estimated by repeating the whole procedure with sub-samples of the 4 × 10^7 initial conditions, and systematic uncertainties, estimated by including different numbers of points in the fit, have been quadratically combined. We find that the asymptotic exponent α* = 0.650 ± 0.005 is consistent with the value q_sens^av = 0.36 of Ref. 8 (α = 1 − q). The error on α* is dominated by the systematic one (choice of the number of points), due to the inclusion of small values of ξ, which produces 1% discrepancies from the common asymptotic behavior. Figure 1 shows the straight-line behavior of ⟨ln_{α,β}(ξ)⟩ for all formulations when α = α* (left frame); the corresponding slopes λ (generalized Lyapunov exponents) are 0.271 ± 0.004 (Tsallis), 0.185 ± 0.004 (Abe) and 0.148 ± 0.004 (Kaniadakis). While α* is a universal characteristic of the map, the slope λ strongly depends on the choice of the logarithm.
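The averaging procedure for the sensitivity can be sketched as follows (a scaled-down illustration with far fewer initial conditions than the 4 × 10^7 of the text; the Tsallis logarithm, α = 1 − q, is used for concreteness):

```python
import math, random

MU = 1.401155189  # infinite-bifurcation (Feigenbaum) point of x -> 1 - mu*x^2

def ln_q(x, q):
    """Tsallis logarithm (x**(1-q) - 1)/(1-q), the inverse of exp_q."""
    return math.log(x) if q == 1.0 else (x ** (1.0 - q) - 1.0) / (1.0 - q)

def avg_lnq_sensitivity(t_max, n_init, q, seed=1):
    """Uniform average of ln_q(xi(t)), xi(t) = prod_i |f'(x_i)| = (2*mu)^t prod |x_i|."""
    rng = random.Random(seed)
    acc = [0.0] * (t_max + 1)
    for _ in range(n_init):
        x, log_xi = rng.uniform(-1.0, 1.0), 0.0
        for t in range(1, t_max + 1):
            log_xi += math.log(abs(2.0 * MU * x) + 1e-300)  # guard against x = 0
            x = 1.0 - MU * x * x
            acc[t] += ln_q(math.exp(log_xi), q)
    return [a / n_init for a in acc]

# at the edge of chaos the curve is expected to grow roughly linearly for q near 0.35
curve = avg_lnq_sensitivity(t_max=40, n_init=2000, q=0.35)
```

With this small sample the curve is noisy; the quadratic-fit selection of α* described in the text requires the full statistics.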
Figure 1. Generalized logarithm of the sensitivity to initial conditions (left) and generalized entropy (right) as a function of time. From top to bottom, Tsallis', Abe's and Kaniadakis' logarithms (entropies) for α = α*. In the left frame the slopes λ (generalized Lyapunov exponents) are 0.271 ± 0.004, 0.185 ± 0.004 and 0.148 ± 0.004; in the right frame the slopes K (generalized Kolmogorov entropies) are 0.267 ± 0.004, 0.186 ± 0.004 and 0.152 ± 0.004.
The entropy has been calculated by dividing the interval (−1, 1) into W = 10^5 equal-size boxes, putting at the initial time N = 10^6 copies of the system with a uniform random distribution within one box, and then letting the systems evolve according to the map. At each time, p_i(t) = n_i(t)/N, where n_i(t) is the number of systems found in box i at time t; the entropy of the ensemble is

S(t) = ⟨ Σ_{i=1}^{W} p_i(t) ln_{α,β}(1/p_i(t)) ⟩ ,   (5)

where ⟨...⟩ is an average over 2 × 10^4 experiments, each one starting from one box randomly chosen among the W boxes. The application of the MaxEnt principle to the entropy (5) yields as distribution the deformed exponential that is the inverse function of the corresponding logarithm of Eq. (1): exp_{α,β}(x) = ln_{α,β}^{-1}(x) [15].
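The box-counting entropy production can be sketched in the same spirit (a scaled-down, single-experiment illustration of the procedure, using the Tsallis entropy for concreteness; sizes and names are ours):

```python
import random

MU = 1.401155189  # infinite-bifurcation point of the logistic map x -> 1 - mu*x^2

def tsallis_entropy(counts, n_total, q):
    """S_q = (1 - sum_i p_i**q) / (q - 1) over occupied boxes, p_i = n_i/n_total."""
    s = sum((n / n_total) ** q for n in counts if n > 0)
    return (1.0 - s) / (q - 1.0)

def entropy_curve(n_boxes=1000, n_copies=10000, t_max=15, q=0.35, seed=1):
    rng = random.Random(seed)
    i0 = rng.randrange(n_boxes)            # all copies start inside one box of (-1, 1)
    lo = -1.0 + 2.0 * i0 / n_boxes
    xs = [lo + 2.0 / n_boxes * rng.random() for _ in range(n_copies)]
    out = []
    for _ in range(t_max):
        counts = [0] * n_boxes
        for x in xs:
            counts[min(int((x + 1.0) * n_boxes / 2.0), n_boxes - 1)] += 1
        out.append(tsallis_entropy(counts, n_copies, q))
        xs = [1.0 - MU * x * x for x in xs]
    return out

S = entropy_curve()
```

The entropy starts at zero (all copies in one box) and grows as the ensemble spreads; the linear-growth test of the text additionally averages over many randomly chosen starting boxes.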
Figure 2. Tsallis' (top left), Abe's (top right) and Kaniadakis' (bottom) entropies as functions of time for α = 1 − q = 0.80, 0.74, 0.64, 0.56, 0.52 (from top to bottom). Straight lines are guides for the eye when α = 0.64 ≈ α*.
Analogously to the strongly chaotic case, where an exponential sensitivity (α = β = 0) is associated with a linearly rising Shannon entropy, which is defined in terms of the usual logarithm (α = β = 0), we use the same values of α and β of the sensitivity for the entropy of Eq. (5). Fig. 1 shows (right frame) that this choice leads to entropies that grow linearly: the corresponding slopes K (generalized Kolmogorov entropies) are 0.267 ± 0.004 (Tsallis), 0.186 ± 0.004 (Abe) and 0.152 ± 0.004 (Kaniadakis). This linear behavior disappears when α ≠ α*, as shown in Fig. 2 for Tsallis', Abe's and Kaniadakis' entropies. In addition, the whole class of entropies and logarithms verifies the Pesin identity K = λ, confirming what was already known for Tsallis' formulation. The values of λ and K for the specific Tsallis', Abe's and Kaniadakis' formulations are given in the caption to Fig. 1 as important explicit examples of this identity. An intuitive explanation of the dependence of the value of K on β, and details of the calculations, can be found in Ref. 19. In summary, numerical evidence corroborates and extends Tsallis' conjecture that, analogously to strongly chaotic systems, also weakly chaotic systems can be described by an appropriate statistical formalism. In addition to sharing the same
asymptotic power-law behavior to correctly describe chaotic systems, extended formalisms should verify precise theoretical requirements. These requirements define a large class of entropies; within this class we use the two-parameter formula (5), which includes Tsallis's seminal proposal. Its simple power-law form describes both small- and large-probability behaviors. Specifically, the logistic map shows: (a) a power-law sensitivity to initial conditions with a specific exponent, ξ ~ t^{1/α}, where α = 0.650 ± 0.005; this sensitivity can be described by deformed exponentials with the same asymptotic behavior, ξ(t) = exp_{α,β}(λt) (see Fig. 1, left frame); (b) a constant asymptotic entropy-production rate (see Fig. 1, right frame) for trace-form entropies that go as p^{1−α} in the limit of small probabilities p, where α is the same exponent of the sensitivity; (c) the asymptotic exponent α is related to the parameters of known entropies: α = 1 − q, where q is the entropic index of Tsallis' thermodynamics [3]; α = 1/q_A − 1, where q_A appears in the generalization (3) of Abe's entropy [17]; α = κ, where κ is the parameter in Kaniadakis' statistics [18]; (d) the Pesin identity holds, S_β/t → K_β = λ_β, for each choice of entropy and corresponding exponential in the class, even if the value of K_β = λ_β depends on the specific entropy and is not characteristic of the map as α is [19]; (e) this picture is not valid for every entropy: an important counterexample is the Renyi entropy^a, S_q^(R)(t) = ⟨(1 − q)^{−1} log [Σ_i p_i^q(t)]⟩, which has a non-linear behavior for any choice of the parameter q = 1 − α (see Fig. 3).
Figure 3. Renyi's entropy for 0.1 ≤ α = 1 − q ≤ 0.95 (from top to bottom).
We gratefully thank S. Abe, F. Baldovin, G. Kaniadakis, G. Mezzorani, P. Quarati, A. Robledo, A. M. Scarfone, U. Tirnakli, and C. Tsallis for suggestions and comments.

^a A comparison of Tsallis' and Renyi's entropies for the logistic map can also be found in Ref. 20.

References
1. C. Tsallis, A. R. Plastino, and W.-M. Zheng, Chaos Solitons Fractals 8, 885 (1997).
2. V. Latora and M. Baranger, Phys. Rev. Lett. 82, 520 (1999).
3. C. Tsallis, J. Stat. Phys. 52, 479 (1988).
4. V. Latora, M. Baranger, A. Rapisarda, and C. Tsallis, Phys. Lett. A 273, 97 (2000).
5. U. M. S. Costa, M. L. Lyra, A. R. Plastino, and C. Tsallis, Phys. Rev. E 56, 245 (1997).
6. F. Baldovin and A. Robledo, Phys. Rev. E 66, 045104 (2002); Europhys. Lett. 60, 518 (2002).
7. F. Baldovin and A. Robledo, Phys. Rev. E 69, 045202 (2004).
8. G. F. J. Ananos and C. Tsallis, Phys. Rev. Lett. 93, 020601 (2004).
9. G. F. J. Ananos, F. Baldovin, and C. Tsallis, arXiv:cond-mat/0403656.
10. D. P. Mittal, Metrika 22, 35 (1975); B. D. Sharma and I. J. Taneja, Metrika 22, 205 (1975).
11. E. P. Borges and I. Roditi, Phys. Lett. A 246, 399 (1998).
12. G. Kaniadakis, M. Lissia, and A. M. Scarfone, Physica A 340, 41 (2004).
13. G. Kaniadakis and M. Lissia, Physica A 340, xv (2004) [arXiv:cond-mat/0409615].
14. J. Naudts, Physica A 316, 323 (2002) [arXiv:cond-mat/0203489].
15. G. Kaniadakis, M. Lissia, and A. M. Scarfone, arXiv:cond-mat/0409683.
16. S. Abe, G. Kaniadakis, and A. M. Scarfone, J. Phys. A: Math. Gen. 37, 10513 (2004).
17. S. Abe, Phys. Lett. A 224, 326 (1997).
18. G. Kaniadakis, Physica A 296, 405 (2001); Phys. Rev. E 66, 056125 (2002).
19. R. Tonelli, G. Mezzorani, F. Meloni, M. Lissia, and M. Coraddu, arXiv:cond-mat/0412730.
20. R. S. Johal and U. Tirnakli, Physica A 331, 487 (2004).
FRACTAL GROWTH OF CARBON SCHWARZITES
GIORGIO BENEDEK
Dipartimento di Fisica, Università di Milano-Bicocca e INFM UdR Milano-Bicocca, Via Cozzi 53, 20125 Milano, Italy

HOOMAN VAHEDI TAFRESHI
College of Textiles, North Carolina State University, 2401 Research Dr., Raleigh, NC 27695-8301, USA

ALESSANDRO PODESTÀ, PAOLO MILANI
INFM UdR Milano-Università e CIMAINA, Università di Milano, Dipartimento di Fisica, Via Celoria 16, 20133 Milano, Italy
The potential energy, the thermodynamic properties and the growth conditions of random carbon schwarzites are theoretically investigated in connection with their topological properties and self-affine structure. An analysis based on numerical simulations of transmission electron microscopy images permits to assign certain carbon foams, recently produced by means of supersonic cluster beam deposition, to self-affine random schwarzites. It is shown that self-affinity makes their thermodynamic properties non-extensive. The fractal growth exponent is shown to be related to the parameter q − 1 of the Tsallis non-extensive entropy.
1 Introduction
The synthesis of new sp²-bonded carbon forms, such as fullerenes [1] and nanotubes [2], and the observation of important properties such as superconductivity in alkali-metal-doped fullerenes [3], field emission [4], and supercapacitance [5] from arrays of nanotubes, are opening fascinating perspectives for nanostructured carbon as a novel all-purpose material [6]. However, fullerenes and nanotubes, as well as graphite, aggregate into van der Waals three-dimensional (3D) solids. Recently clear evidence has been obtained that a highly connected, fully covalent sp² form, combining certain valuable properties of fullerenes and nanotubes with a robust 3D architecture, can be grown by means of supersonic cluster beam deposition (SCBD) [7,8]. This new form of spongy carbon is characterized by a nanometric porosity and, as appears from the numerical simulations of the TEM images [9] and of the SCBD process [10], by the structure of a random schwarzite [11] grown in the form of a fractal self-affine minimal surface [12]. Thus, besides offering appealing technological perspectives, this novel material shows intriguing aspects of differential geometry and topology. In what follows (Sec. 2) it is shown that the growth of this kind of complex structure is actually determined by simple initial topological conditions [9]. The growth of random schwarzites as a minimal surface is shown to constantly keep at a minimum of the free energy, and the combination of the self-affinity and minimality properties is found to imply a non-extensivity of the thermodynamic potential [13-15]. The free energy of a self-affine random schwarzite is briefly discussed in connection with the observed porosity in the limit of a small non-extensivity.
2. Structure and Growth of Random Schwarzites
2.1 Topology
A basic question is whether a graphite sheet can be transformed into a surface characterized by a negative Gauss curvature everywhere through the creation of a sufficient number of negative disclinations, which occur wherever a hexagonal ring is replaced by a larger polygon. A special case of negative Gauss curvature occurs when the mean curvature is zero everywhere, which corresponds to what in differential geometry is known as a minimal surface. The conjecture that a minimal surface is particularly stable has stimulated much theoretical work on hypothetical graphite sheets (graphenes) with the shape of a periodic minimal surface (Figure 1) [16-20]. These structures have been called schwarzites, after the name of the mathematician H. A. Schwarz, who investigated the differential geometry of this class of surfaces at the end of the 19th century [21]. Similar theoretical sp² carbon structures like polybenzenes [22] and hollow graphites [23,24] can be ascribed to the general family of schwarzites.
Figure 1. The carbon schwarzite fcc(C84), obtained from a tiling with carbon hexagonal and heptagonal rings of a three-periodic D-type minimal surface [21,23]. The unit elements of a D-type schwarzite are centred at the sites of a diamond fcc lattice. Each unit cell contains two elements and each element is made of 12 heptagons and any number h (≥ 1) of hexagons. Here h = 28. This is the smallest schwarzite with non-abutting heptagons (equivalent to C60, which is the smallest fullerene with non-abutting pentagons and has h = 20).
From the topological point of view, graphenes like fullerenes, graphite sheets, nanotubes and schwarzites are described as a polygonal tiling of surfaces with only hexagons, pentagons and heptagons, where each vertex corresponds to a carbon atom, each edge to a covalent bond and each polygon to a carbon ring. Each atom has a threefold coordination. The surface covered by the polygonal tiling of carbon rings is characterized by its connectivity, or order of connection, k. According to Hilbert [25], the order of connection is the number, plus one, of the closed cuts which can be made on the given surface without breaking it apart in two pieces. The surface topology may be alternatively
characterized by the Euler-Poincaré characteristic χ = 3 − k or the genus g = (k − 1)/2. For example, a simple (one-hole) torus can be cut along two closed lines without splitting it in two pieces, and therefore k = 3, or χ = 0, g = 1. For a sphere k = 1 (g = 0, χ = 2), whereas for an n-hole torus k = 1 + 2n, χ = 2(1 − n) and g = n. Thus the genus represents the number of "holes" (or "handles") of a generalized torus. While fullerenes are represented by a closed surface topologically equivalent to a sphere (k = 1), uncapped nanotubes, graphite sheets and schwarzites are open surfaces with an infinite extension in one, two or three dimensions, respectively. However, graphenes characterized by a periodic atomic structure can be reduced to a closed surface by applying cyclic boundary conditions. In this way uncapped nanotubes and graphite sheets become topologically equivalent to an ordinary (one-hole) torus (k = 3). On the other hand, the connectivity of an infinite periodic surface is infinite. However, if cyclic boundary conditions are applied on a finite portion of the periodic surface, the Euler-Poincaré (EP) characteristic is finite and proportional to the actual number of unit cells. Thus it is convenient to define the EP characteristic per unit cell, χ_cell. This is obtained by closing the portion of surface contained in the unit cell on itself, as implied by the cyclic boundary conditions, and the number g_cell of handles generated by the closure operation gives χ_cell = 2(1 − g_cell) and k_cell = 2g_cell + 1. The simplest schwarzite forms have the periodicity of a simple cubic lattice (P-type) or of a diamond lattice (D-type, Figure 1) [17]. A D-type schwarzite is made by the conjunction of two identical unit elements, each one having four branches in the tetrahedral directions, which makes six branches per unit cell and therefore χ_cell = −4 and k_cell = 7. On the other hand, the closure of a D-element is obtained by connecting two pairs of branches so as to form a two-hole torus.
Thus the EP characteristic and the connectivity per element are χ_el = −2 and k_el = 5, respectively. The unit cell of a P-type schwarzite contains one element with six branches in the orthogonal directions and therefore χ_cell = χ_el = −4 and k_cell = k_el = 7. According to Euler's theorem applied to graphenes, the numbers of atoms (v), of bonds (e) and of rings (f) are related to the connectivity by

v − e + f = χ = 3 − k .   (1)

For periodic schwarzites all quantities in Equation (1) may refer to the unit cell as well as to the unit element. For the sp² threefold coordination e = 3v/2, and for graphenes with only six-, five- and seven-fold rings (let their numbers be f6, f5 and f7, respectively) it is v = (5f5 + 6f6 + 7f7)/3. Thus

f5 − f7 = 6χ .   (2)

For fullerenes (χ = 2) with no 7-fold rings the well-known result f5 = 12 is obtained. For open nanotubes and graphite sheets f5 = f7, whereas for D- and P-type schwarzites with no 5-fold rings f7 = 24 in each unit cell (f7 = 12 per element for the D-type schwarzite). The shapes of P- and D-type schwarzites can be analytically described by the Weierstrass-Enneper representation in the complex plane [26,27] and are well approximated by the lowest terms of a Fourier expansion as
cos x + cos y + cos z = 0 ,   (3)

cos x cos y cos z + sin x sin y sin z = 1 ,   (4)

respectively, where the coordinates x, y, z are in units of some conventional length a₀.
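The ring-counting relations (1) and (2) are easy to check numerically; a small sketch (the function name is ours), using the D-type element of Figure 1 (f7 = 12 heptagons, h = 28 hexagons):

```python
def schwarzite_counts(f6, f7):
    """Atoms v, bonds e, rings f and Euler-Poincare characteristic chi
    for a threefold-coordinated graphene tiled by f6 hexagons and f7 heptagons."""
    v = (6 * f6 + 7 * f7) // 3   # each atom is shared by three rings
    e = 3 * v // 2               # threefold coordination: e = 3v/2
    f = f6 + f7
    chi = v - e + f              # Equation (1)
    return v, e, f, chi

# D-type element of Figure 1: 12 heptagons and h = 28 hexagons -> chi_el = -2
v, e, f, chi = schwarzite_counts(28, 12)
```

The result, 84 atoms and χ = −2 per element, reproduces the two-hole-torus closure (g_el = 2) and, via f5 − f7 = 6χ with f5 = 0, the twelve heptagons per element quoted in the text.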
Figure 2. A transmission electron microscope (TEM) image of a random carbon schwarzite obtained by supersonic cluster beam deposition. It is described as a single, highly connected graphene sheet with an average pore diameter in the range of 100 nm.
The spongy carbon structures (Figure 2) obtained by means of SCBD in the presence of nanocrystalline transition-metal catalysts [7,9] have been assigned to a random schwarzite on the basis of a numerical simulation of the TEM images. It has been shown by means of an AFM analysis during the deposition process [28-30] that SCBD carbon films are self-affine along the growth direction. The surface of the self-affine film is characterized by a height-height correlation function w(R) associated with the surface roughness, R being the distance between two surface points; w(R) ∝ R^α for R smaller than the surface correlation length ξ, and tends to w_sat for R >> ξ. The saturation value w_sat increases with the film thickness t as t^β [31,12]. The roughness exponent α and the growth exponent β have been measured [28] and found to be α = 0.66 ± 0.02 and β = 0.50 ± 0.03. In order to simulate extended regions of the observed graphene surfaces as they appear in TEM images, a model analytical surface is adopted in the form of a P-type schwarzite, Equation (3), distorted along the growth direction (z axis) by a scaling factor z^(−β). Its equation is written as:
cos(x z^(−β)) + cos(y z^(−β)) + cos( z^(1−β) / (1 − β) ) = 0 .   (5)
Figure 3 shows a comparison between a portion of the TEM image of an SCBD film depicted in Figure 2 (top) and a simulation based on a distorted P-type schwarzite, Equation (5), with β = 0.50 and a₀ = 1 nm (bottom) [9]. Figure 4 displays a similar simulation for a distorted D-type schwarzite, also for β = 0.50. There is a clear visual resemblance between the experiment and the distorted P-type simulation, whereas the comparison with the D-type image is less convincing. The latter, however, shows certain quasi-circular features which are seen in the TEM image but not in the P-type simulation.
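The height-height correlation scaling w(R) ∝ R^α underlying this self-affine model can be illustrated on a synthetic profile (a sketch: a random walk, whose roughness exponent is exactly 1/2, stands in for the measured SCBD surface with α = 0.66):

```python
import math, random

def roughness(h, R):
    """Height-difference correlation w(R) = sqrt(<(h(x+R) - h(x))^2>)."""
    n = len(h) - R
    return math.sqrt(sum((h[i + R] - h[i]) ** 2 for i in range(n)) / n)

# synthetic self-affine profile: a 1D random walk has roughness exponent alpha = 1/2
rng = random.Random(0)
h, x = [], 0.0
for _ in range(200000):
    x += rng.choice((-1.0, 1.0))
    h.append(x)

# estimate alpha from the slope of log w(R) versus log R
alpha_est = math.log(roughness(h, 1024) / roughness(h, 64)) / math.log(1024 / 64)
```

The same two-point estimate applied to AFM profiles of the films is what yields the measured α = 0.66 ± 0.02 (a proper analysis fits many values of R below the correlation length).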
Figure 3. Comparison between a portion of a random carbon schwarzite as observed by TEM (top) and a simulation by means of a distorted P-type schwarzite, Equation (5) (bottom). The contrast of the simulated image has been chosen so as to give a field depth comparable to that of the TEM image (from Ref. [9]).
Figure 4. Simulation analogous to that of Figure 3 for a distorted D-type schwarzite.
It is noted that the minimality condition for a surface represented by the equation x = x(y, z) is fulfilled when [32]

x_yy (1 + x_z²) − 2 x_y x_z x_yz + x_zz (1 + x_y²) = 0 ,   (6)
which corresponds to a vanishing mean curvature at any point of the surface. It is found that the transformation in Equation (5) violates Equation (6) by terms of order β/z, whereas the principal curvatures decrease like an inverse power of z. Thus for β < 1 the minimality condition is progressively recovered in the distorted schwarzite model at large z, say at the mesoscopic scale.
2.2 Thermodynamics
The general question of why and when a catalyzed growth of pure sp² carbon leads to a schwarzite rather than to single-walled nanotubes or fullerenes can be explored by considering the total energy of a curved single-walled graphene in the form suggested by Helfrich for membranes and foams [33-35]:
E = ∫_A dA ( γ + κH² − κ̄K ) ,   (10)

where A is the (portion of the) surface to which the total energy refers, γ = 2.82 eV/Ų is the energy per unit flat surface (a graphite sheet) [36], and

H = (1/R₁ + 1/R₂)/2 ,   K = 1/(R₁R₂)   (11)

are the mean and Gaussian curvatures, respectively, with R₁ and R₂ the principal radii of curvature. Minimal surfaces are characterized by R₁ = −R₂ at all positions. The constants κ and κ̄ are two elastic constants associated with cylindrical and elliptical/hyperbolic deformations of the surface, respectively. Density-functional (DF) calculations in the local density approximation for nanotubes of variable radius R (where H = 1/(2R) and K = 0) [36,37] and for C60 (where H = 1/R and K = 1/R²) [36] permit to extract the values κ = 3.1 eV and κ̄ = 1.7 eV. Consistently, a value of κ̄ ≅ 1.5 eV can be extracted from the available calculations of the cohesive energy of schwarzites [11,19,22] by means of the Gauss-Bonnet theorem [32]:
∫_A dA K = 2πχ .   (12)
This shows that Helfrich's form for the total energy, Equation (10), approximately holds also for graphenes. The total energy expressed by Equation (10) has the important property, if κ̄ is constant, of having a stable local minimum for a minimal surface, since for κ > 0 the integral of H² is always positive unless H = 0, while the integral over K is, according to the Gauss-Bonnet theorem, independent of any small continuous deformation of the surface. Thus graphenes taking the shape of a minimal surface like schwarzites are stable forms (up to effects of the contour, where κ̄ may change, as discussed below). If the negative disclinations yielding a negative Gauss curvature are exclusively due to heptagons, the number of disclinations N_d is fixed by the Euler-Poincaré characteristic as N_d = 6(2 − χ), independently of the length scale of the surface [38]. The values of κ and κ̄ give general indications about whether the growth process of sp² carbon preferentially leads to fullerenes, nanotubes or schwarzites. In an incipient growth process from an initial radius R₁ on the surface of a catalyst, the second principal radius R₂ corresponding to a local energy minimum is obtained from Equation (10) as
1/R₂ = (2κ̄/κ − 1) (1/R₁) .   (13)
The surface deformation energy densities for (spherical) fullerenes (R₁ = R₂ = R), nanotubes (R₁ = R, R₂ → ∞) and schwarzites (R₁ = −R₂ = R) are (κ − κ̄)/R², κ/(4R²) and κ̄/R², respectively, and therefore for any given R₁ the values of κ and κ̄ define three different topological domains: schwarzites are favoured for κ̄ < ¼κ, nanotubes for ¼κ < κ̄ < ¾κ, and fullerenes for κ̄ > ¾κ. When κ̄ ≅ κ/2, as found from DF calculations, nanotubes are more likely to occur. However, the local values of κ and κ̄, either at the surface termination into vacuum, where the growth takes place by cluster addition, or at the contact with a catalyst, are likely to be quite different from the above values (which have been fitted to regular structures) and should be obtained from ab-initio calculations. One should consider that the local change in the electronic structure, e.g. a π bond-charge depletion or accretion, can substantially modify κ̄. The charge redistribution produced by a catalyst depends on the actual size of the catalyst nanoparticles, which may explain why the growth of schwarzites supersedes that of nanotubes when metallorganic precursors are used: in this case the metallic particles are in general very small and highly dispersed. The ordinary configurational entropy of a schwarzite made of f6 6-fold rings and f7 7-fold rings can be derived from the number of possible tiling combinations for f6 + f7 = f, and is given by

S = k_B ln [ f! / (f6! f7!) ] .   (14)
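The comparison of the three deformation energy densities gives a quick numerical criterion for which structure is favoured; a small sketch (the function name and the equal-radius comparison are our illustrative choices):

```python
def favoured_structure(kappa, kappa_bar):
    """Helfrich deformation energy densities (in units of 1/R^2) at equal radius R."""
    densities = {
        "fullerene": kappa - kappa_bar,  # H = 1/R, K = 1/R^2
        "nanotube": kappa / 4.0,         # H = 1/(2R), K = 0
        "schwarzite": kappa_bar,         # H = 0, K = -1/R^2
    }
    return min(densities, key=densities.get)

# DF values quoted in the text: kappa = 3.1 eV, kappa_bar = 1.7 eV
print(favoured_structure(3.1, 1.7))
```

With the DF values the nanotube density κ/4 is the smallest, in agreement with the κ̄ ≅ κ/2 argument of the text; lowering κ̄ below ¼κ switches the minimum to the schwarzite.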
Thus the free energy at temperature T can be obtained from Eqs. (10) and (12) and the minimality condition H = 0 as

F(A, T) = γA + 2πκ̄|χ| − TS ,   (15)

with S given by Equation (14) evaluated for f ≅ A/A6 rings, of which f7 = 6|χ| are heptagons, where A6 is the area of a 6-fold ring. The total area A is related to the connectivity and the total number of atoms N_a by the equation

A = (A6/2) N_a + A* |χ| ,   (16)

where A* = 6A7 − 7A6 ≈ 3.6 d², with A7 the area of the 7-fold ring and d the average interatomic distance. For a periodic schwarzite |χ| is proportional to A and both are proportional to N_a. Thus also F(A, T), Equation (15), is proportional to A, and can be written as

F(A, T) = (γ − κ̄K̄) A − TS ,   (17)
where K̄ is the average Gauss curvature, defined through the Gauss-Bonnet theorem, Equation (12), as K̄ = 2πχ/A. This ensures an exact extensivity of the thermodynamic functions of periodic schwarzites. For a fixed N_a and T there are equilibrium values for the connectivity and for the average pore size R̄, which are obtained from Equation (15) by setting ∂F(A, T)/∂χ = 0. Both are found to be controlled by an activation potential Φ = 2πκ̄ + γA* ≅ 29.8 eV. For SCBD experiments with an average deposition energy E₀ = 0.15 eV per atom it may be assumed that kT ≅ 2E₀ = 0.3 eV, which gives R̄ = 880 nm. The present calculated mesoscopic size of the pores has the right order of magnitude of, though is somewhat larger than, that observed in TEM images for the same deposition energy.

2.3 Self-affinity and Non-extensivity
The observed random schwarzites, however, are far from being periodic. Probably a better model is the self-affine construction described by Equation (5). For this model the total area A is no longer simply proportional to χ: one finds (Equation (20)) that the film thickness t (also in units of a₀) is also dependent on χ. This clearly makes, for β > 0, the free energy, Equation (15), non-extensive, and yields a correcting factor in the expression of the equilibrium connectivity (Equation (21)), and similarly for the average pore size. There is not enough information about the actual value of a₀ (something of the order of the initial pore size at the catalyst surface) for a quantitative comparison with experiment, though Equation (21) introduces an interesting dependence of the ratio |χ|_eq/N_a (constant in the periodic case) on the film thickness through the (non-zero) growth exponent β. These aspects would deserve further investigation. The non-extensivity of the thermodynamic functions of the self-affine structures suggests an analysis in terms of the Tsallis non-extensive entropy [13,14], which in the present case is written as

S_q = k_B (1 − Ω^(1−q)) / (q − 1) ,   Ω = f! / (f6! f7!) ,   (22)

with the parameter q ≠ 1 and f6 + f7 = f. It is easy to show that for q → 1 the q-entropy S_q tends to the ordinary entropy S, Equation (14). By constructing F_q = E − TS_q, with E given by Equation (10), H = 0 and the integration made with the Gauss-Bonnet theorem, and by minimizing F_q with respect to χ at constant N_a, the equilibrium connectivity and average pore size can be obtained in a rather involved algebraic form; a simple correcting factor is recovered in the limit of small q − 1 (Equation (23)).
The comparison of Equation (23) with Equation (21) shows a link between the deviation from extensivity, q − 1, and the emergence of self-affinity, as argued from Equation (20).
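The q → 1 limit of the tiling q-entropy can be checked numerically; a minimal sketch (we assume equiprobable tiling configurations, Ω = f!/(f6! f7!), so that S = ln Ω and S_q = (1 − Ω^(1−q))/(q − 1) in units of k_B):

```python
import math

def log_omega(f6, f7):
    """ln of the number of tiling configurations, ln[f!/(f6! f7!)]."""
    f = f6 + f7
    return math.lgamma(f + 1) - math.lgamma(f6 + 1) - math.lgamma(f7 + 1)

def s_q(f6, f7, q):
    """Tsallis q-entropy (in units of k_B) for equiprobable tiling configurations."""
    ln_om = log_omega(f6, f7)
    if q == 1.0:
        return ln_om                      # ordinary configurational entropy, Eq. (14)
    return (1.0 - math.exp((1.0 - q) * ln_om)) / (q - 1.0)
```

Using lgamma keeps the computation stable for the large ring counts of a macroscopic film, where f! itself would overflow.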
Acknowledgement
One of us (G.B.) acknowledges partial support by MIUR, Italy, under the program PP.IN03.
References
1. H. W. Kroto, J. R. Heath, S. C. O'Brien, R. F. Curl and R. E. Smalley, Nature 318, 162 (1985).
2. S. Iijima, Nature 354, 56 (1991).
3. L. D. Rotter, Z. Schlesinger, J. P. McCauley, N. Coustel, J. E. Fischer and A. B. Smith, Nature 355, 532 (1992).
4. H. Wang, A. A. Setlur, J. M. Lauerhaas, J. W. Dai, E. W. Seelig and R. P. H. Chang, Appl. Phys. Lett. 72, 2912 (1998).
5. C. Niu, E. K. Sichel, R. Hoch, D. Moy and H. Tennent, Appl. Phys. Lett. 70, 1480 (1997).
6. G. Benedek and M. Bernasconi, in Encyclopaedia of Nanoscience and Nanotechnology (Marcel Dekker, Inc., New York 2004) p. 1235.
7. E. Barborini, P. Piseri, P. Milani, G. Benedek, C. Ducati and J. Robertson, Appl. Phys. Lett. 81, 3359 (2002); E. Gerstner, Nature Materials Update, 7 Nov 2002.
8. P. Milani and S. Iannotta, Cluster Beam Synthesis of Nanostructured Materials (Springer, Berlin 1999).
9. G. Benedek, H. Vahedi Tafreshi, E. Barborini, P. Piseri, P. Milani, C. Ducati and J. Robertson, Diamond and Rel. Mater. 12, 768 (2003).
10. D. Donadio, L. Colombo, P. Milani and G. Benedek, Phys. Rev. Lett. 84, 776 (1999).
11. T. Lenosky, X. Gonze, M. Teter and V. Elser, Nature 355, 333 (1992).
12. M. Bogana, D. Donadio, G. Benedek and L. Colombo, Europhys. Lett. 54, 72 (2001).
13. C. Tsallis, J. Stat. Phys. 52, 479 (1988).
14. C. Tsallis, in Non-extensive Entropy - Interdisciplinary Applications, M. Gell-Mann and C. Tsallis, Eds. (Oxford University Press, 2004) p. 1, and present volume.
15. C. Tsallis, present volume.
16. A. L. Mackay, Nature 314, 604 (1985).
17. A. L. Mackay and H. Terrones, Nature 352, 762 (1991).
18. H. Terrones and A. L. Mackay, in The Fullerenes, H. W. Kroto, J. E. Fischer and D. E. Cox, Eds. (Pergamon Press, Oxford 1993) p. 113.
19. D. Vanderbilt and J. Tersoff, Phys. Rev. Lett. 68, 511 (1992).
20. S. Gaito, L. Colombo and G. Benedek, Europhys. Lett. 44, 525 (1998).
21. H. A. Schwarz, Gesammelte Mathematische Abhandlungen (Springer, Berlin 1890).
22. M. O'Keeffe, G. B. Adams and O. F. Sankey, Phys. Rev. Lett. 68, 2325 (1992).
23. G. Benedek, L. Colombo, S. Gaito, E. Galvani and S. Serra, J. Chem. Phys. 106, 2311 (1997).
24. M. Côté, J. C. Grossman, M. L. Cohen and S. G. Louie, Phys. Rev. B 58, 664 (1998).
25. D. Hilbert and S. Cohn-Vossen, Anschauliche Geometrie (Springer, Berlin 1932).
26. S. T. Hyde, in Sponges, Foams and Emulsions, J. F. Sadoc and N. Rivier, Eds. (Kluwer, Dordrecht 1999) p. 437.
27. D. Hoffman, Nature 384, 28 (1996).
28. R. Buzio, E. Gnecco, C. Boragno, U. Valbusa, P. Piseri, E. Barborini and P. Milani, Surf. Sci. 444, L1 (2000).
29. P. Milani, A. Podestà, P. Piseri, E. Barborini, C. Lenardi and C. Castelnova, Diamond and Rel. Mater. 10, 240 (2001).
30. C. Castelnova, A. Podestà, P. Piseri and P. Milani, Phys. Rev. E 65, 021601 (2001).
31. A. L. Barabasi and H. E. Stanley, Fractal Concepts in Surface Growth (Cambridge University Press, Cambridge 1995).
32. R. Osserman, A Survey of Minimal Surfaces (Dover, New York 1986).
33. W. Helfrich, Z. Naturforsch. 28, 768 (1973).
34. S. T. Hyde, in Foams and Emulsions, J. F. Sadoc and N. Rivier, Eds. (Kluwer, Dordrecht 1999) p. 437.
35. C. Oguey, in Foams and Emulsions, J. F. Sadoc and N. Rivier, Eds. (Kluwer, Dordrecht 1999) p. 471.
36. J. M. Sullivan, in Foams and Emulsions, J. F. Sadoc and N. Rivier, Eds. (Kluwer, Dordrecht 1999) p. 379.
37. C. T. White et al., in Buckminsterfullerenes, W. E. Billups and M. A. Ciufolini, Eds. (VCH, New York 1993) p. 125.
38. J. F. Sadoc, in Foams and Emulsions, J. F. Sadoc and N. Rivier, Eds. (Kluwer, Dordrecht 1997) p. 511.
CLUSTERING AND INTERFACE PROPAGATION IN INTERACTING PARTICLE DYNAMICS
A. PROVATA
Institute of Physical Chemistry, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
E-mail: aprovata@limnos.chem.demokritos.gr

V. K. NOUSSIOU
Institute of Physical Chemistry, National Center for Scientific Research "Demokritos", 15310 Athens, Greece, and Department of Chemistry, University of Athens, 10679 Athens, Greece

We study the development of rough interfaces in lattice models with multispecies nearest-neighbour interactions. In particular, we study a bimolecular and a quadrimolecular interacting particle model and the Ziff-Gulari-Barshad model, which involves both spontaneous (single-particle) and cooperative (multiparticle) reactive steps. We show that interface roughening follows a scaling function in all models and that the critical exponents depend on the particular type of interactions and the number of species involved.
1. Introduction
In recent studies considerable interest is devoted to the development of models which describe reactive processes taking place on low-dimensional lattices. As has been shown, when a process takes place on a low-dimensional support, the vacancies of the support must also be taken into account to properly describe the steady-state properties and the dynamics. The lattice models which consider the support vacancies as an independent species, and can thus be directly implemented on a lattice, are called "lattice compatible models". As an example of a lattice compatible model we have studied the Lattice Lotka-Volterra (LLV) model, which is described by the following scheme:
X1 + X2 → 2X2    (1a)
X2 + S → 2S    (1b)
X1 + S → 2S    (1c)
where X1 and X2 are the reactive species while S represents the empty lattice sites. In the reactive scheme (1), Eq. (1a) corresponds to reaction between X1 and X2, provided they reside in neighbouring sites, while Eq. (1b) corresponds to desorption of X2 from the lattice leaving an empty site S, provided that a neighbouring empty site S already exists. Similarly, Eq. (1c) corresponds to desorption of X1. In this lattice compatible LLV mechanism a lattice site is allocated for every species, X1, X2, or S. Equivalently, the total number N of sites of the lattice that are empty (S) or covered by X1 or X2 is constant:
N = X1 + X2 + S    (2)

As a counter example, the original Lotka-Volterra (LV) model [X1 + X2 → 2X2, X2 → 0, X1 → 2X1] is not lattice compatible. The fact that the LV model does not take vacant sites into account makes it inappropriate for the direct implementation of on-lattice interactions. In the third step, for example, X1 gives 2X1. If this step were to be implemented on a lattice, there would be no available site for the extra X1 produced to adsorb. In order to use the LV model on a lattice, modifications of the mechanism and the introduction of empty sites are necessary. The LLV model is therefore a special modification of the LV model that is applicable on a lattice. Besides the LLV model, many other lattice compatible models have been used in the literature, such as the Lattice Limit Cycle (LLC) model,13 the epidemic model,14-16 and the Ziff-Gulari-Barshad (ZGB) model.1,2 Some of these models, e.g. the LLV and the LLC, have been designed to study basic pattern formation mechanisms, while others have been designed to simulate specific reactive processes; e.g. the ZGB model was used to simulate the catalytic CO oxidation on a Pt surface. Heterogeneous catalytic reactions are important examples of processes that are best described by lattice compatible models. Clustering and pattern formation (including oscillations) are observed as a result of molecule interactions on catalysts. Various examples of patterns arise in experiment, such as in the CO oxidation on Pt,17 the NO reduction on Rh or Pt,17,18 the NO + CO reaction on Pt,17,19 etc. (The NO + H2 reaction has also been studied on substrates with different properties.)19 In the case of reactions on catalysts, patterns are concentration gradients on the surface which evolve in time. Stripes, target patterns and spiral waves are the usual patterns that appear in the above experimental processes.

By far the richest variety of spatiotemporal patterns has been observed in catalytic CO oxidation on Pt(110): target patterns, rotating spiral waves, solitary oxygen pulses, standing waves and chemical turbulence.17 Simulating realistic processes like the ones above demands more complicated models than the LLV, the LLC, or even the ZGB model. In the meantime, both approaches, that is studying basic mechanisms as well as simulating simplified specific processes, are necessary in the effort to find the true mechanistic pathways of real processes. In particular, the ZGB model is very important in that it predicts spatiotemporal phenomena such as kinetic phase
transitions and interface propagation through a minimal mechanism, and simultaneously it corresponds to a real chemical system. On the other hand, the LLV and the LLC models have been very successful in producing patterns, thus shedding light in the direction of identifying the mechanisms that are responsible for pattern formation in general. In the current study we will focus on the interface propagation between the different species in bimolecular and quadrimolecular reactive schemes and on determining the scaling of the interface width. We will show that the characteristic exponents depend on the number of species and also on the parameter values. In the next section we study the bimolecular reactive scheme, while in section 3 we study the quadrimolecular reactive scheme. Section 4 we devote to pattern formation and interface propagation in the ZGB model, while in the concluding section we summarize our main results and discuss open problems.

2. Bimolecular Reactive Schemes
In the case of bimolecular reactions, as in the LLV model, previous studies have demonstrated the formation of fractal clusters in free LLV systems and stripes and spiral patterns in LLV systems with specific initial conditions. An important element for the creation of such patterns was the autocatalytic nature of all the steps involved in the LLV scheme (Eqs. (1a), (1b), (1c)), which are of the form
A + B → B + B    (3)
This model, known as the epidemic model, has been extensively studied in the literature.14-16 The bimolecular character of this kind of reactions results in competition between the domains of species A and B and intrusion of the B domains within the A domains. This particular type of interaction gives rise to a rough interface when two phases A and B interact, even if we start from a completely linear interface. In this section we will study the roughening of a 1-dimensional linear interface, when the system (3) is realised on a 2-d square lattice via Kinetic Monte Carlo (KMC) simulations. The realisation of system (3) is as follows.
(1) Start with a 2-d square lattice of size LxL filled with particles A or B, with given initial conditions and concentrations.
(2) At every Elementary Time Step (ETS) choose one lattice site at random.
(3) If the lattice site chosen contains a B particle then disregard the site and go to algorithm step 5.
(4) If the lattice site chosen contains an A particle then select one of the four neighbours at random. If the selected neighbour is A go to algorithm step 5. If the selected neighbour is B then with probability p change A to B.
(5) CONTINUE. One ETS is completed and the algorithm returns to step 2 starting a new ETS.
The time unit we use is the Monte Carlo Step (MCS), which is equivalent to LxL ETS; that is, one MCS is the time required for a number of trials equal to the total number of lattice sites (LxL) to be completed.
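The algorithm above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the authors' code; the function name, lattice size, band width, number of MCS, random seed and the periodic boundaries are our own choices:

```python
import random

def kmc_epidemic(L=64, p=1.0, band=4, mcs=100, rng=None):
    """KMC sketch of the bimolecular scheme A + B -> B + B (Eq. (3)).

    Starts from an L x L lattice of 'A' with a linear band of 'B'
    (the flat interface of Fig. 1); one MCS = L*L elementary time steps.
    """
    rng = rng or random.Random(0)
    lat = [['B' if i < band else 'A' for _ in range(L)] for i in range(L)]
    for _ in range(mcs * L * L):
        i, j = rng.randrange(L), rng.randrange(L)   # step 2: random site
        if lat[i][j] != 'A':
            continue                                # step 3: B sites are disregarded
        di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        ni, nj = (i + di) % L, (j + dj) % L         # periodic boundaries (assumed)
        if lat[ni][nj] == 'B' and rng.random() < p:
            lat[i][j] = 'B'                         # step 4: A -> B with probability p
    return lat
```

Since the scheme only converts A into B, the number of B particles is non-decreasing, which is a convenient sanity check on any implementation.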
Figure 1. Interface roughening in the bimolecular model (3). (a) Initial stages of evolution (after 2 MCS), (b) snapshot after 20 MCS, (c) 50 MCS, (d) 100 MCS. Parameter values are L = 2^8, p = 1.0.
To study the interface roughening in model (3) we start the algorithm with initial conditions in which the whole system is covered by A particles (coloured gray) except for a linear band which is covered by B (black), so that the interface between the A and B phases is linear (see Fig. 1). In Fig. 1 we present some representative stages of the interface roughening as the KMC algorithm proceeds. Figure 1(a) represents initial stages of evolution (after only 2 MCS), while in Figs. 1(b), 1(c) and 1(d) we observe how the surface has evolved after 20 MCS, 50 MCS and 100 MCS respectively. The system size is L = 2^8, while p = 1.0. To describe the interface roughening we calculate the average height ⟨h(t)⟩ and the width of the interface w(t), which are defined as
⟨h(t)⟩ = (1/L) Σ_{i=1}^{L} h(i, t)    (4)
and
w²(t) = (1/N) Σ_{i=1}^{N} (h(i, t) − ⟨h(t)⟩)²    (5)
where h(i, t) is the height of the i-th column at time t.
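Equations (4) and (5) translate directly into code. The sketch below (our own naming, not from the paper) computes ⟨h⟩ and w² from a list of column heights, and estimates the growth exponent as the least-squares slope of log w² versus log t:

```python
import math

def interface_width(heights):
    """Mean height and squared width of an interface profile, per
    Eqs. (4)-(5): <h> = (1/L) sum_i h_i, w^2 = (1/L) sum_i (h_i - <h>)^2."""
    L = len(heights)
    mean_h = sum(heights) / L
    w2 = sum((h - mean_h) ** 2 for h in heights) / L
    return mean_h, w2

def loglog_slope(times, w2_values):
    """Least-squares slope of log w^2 versus log t; in the roughening
    regime this slope estimates twice the growth exponent."""
    xs = [math.log(t) for t in times]
    ys = [math.log(w) for w in w2_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```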
Figure 2. The scaling of the width as a function of time for the bimolecular reaction scheme. The straight line represents a power law with an exponent of 2/3.
In Fig. 2 we present w² as a function of time for p = 1.0. In a double logarithmic scale, the function w² first shows a linear increase, then a second linear region, while for larger times w² reaches a plateau. The behaviour of w² after the initial transitory phase can be described by a scaling function

w(t) = L^α f(t/L^z)    (6)

where α is called the roughness exponent and z is called the dynamic exponent. The scaling function w(t) behaves as

w(t) ∝ t^β,  for t ≪ L^z    (7)

while

w(t) ∝ L^α,  for t ≫ L^z    (8)

where β is the growth exponent. Calculating the exponents, we find that after an initial transitory regime where w²(t) ∝ t, the growth enters a roughening phase with

w²(t) ∝ t^{2β}    (9)

and

2β = 0.65 ± 0.05    (10)
The exponent β value is very close to 0.33, which is the β value for the Eden model, the LLV model and the Kardar-Parisi-Zhang (KPZ) equation.20 This is a strong indication that the epidemic model is in the same universality class described by the KPZ equation.

3. Quadrimolecular Reactive Schemes

We will now consider a more complicated quadrimolecular reactive scheme in which two A particles and two B particles are involved, as follows:

2A + 2B → 4B    (11)

that is, when two A particles are found to be neighbours with two B particles, both A particles change into B particles. The KMC scheme which simulates the process (11) has the following form.
(1) We start with a 2-d square lattice of size LxL filled with particles A or B, with given initial conditions and concentrations.
(2) At every ETS one lattice site is chosen at random.
(3) If the lattice site contains a B particle the algorithm jumps to step 5.
(4) If the lattice site contains an A particle and amongst the 4 first nearest neighbours there is another A particle and 2 B particles, then with probability p both A particles change into B particles.
(5) CONTINUE. One ETS is completed and the algorithm returns to step 2 starting a new ETS.
Following the above algorithm we have realised the quadrimolecular scheme on a 2-d square lattice with the following initial condition: all lattice sites are covered by A particles (gray) except for a linear band which is covered by B (black) (see Fig. 3). In Fig. 3 we present some representative stages of the interface roughening as the KMC algorithm proceeds. Fig. 3(a) represents initial stages of evolution (after only 10 MCS), while in Figs. 3(b), 3(c) and 3(d) we observe the surface evolution after 100 MCS, 200 MCS and 300 MCS respectively. The system size is L = 2^8, while p = 1.0. This initial configuration will initiate reaction while having almost zero roughness. Note that a perfectly linear interface cannot initiate reaction. For this reason
Figure 3. Interface roughening in the quadrimolecular reaction scheme (11). (a) Initial stages of evolution (after 10 MCS), (b) snapshot after 100 MCS, (c) 200 MCS, (d) 300 MCS. Parameter values are L = 2^8, p = 1.0.
the initial state contains a toothy interface between the A and B phases which slowly develops considerable roughness. In Fig. 4 we present the evolution of the width w² as a function of time in this process. In a double logarithmic plot, the scaling follows a power law of the form of Eq. (9) with β = 0.5 ± 0.05. This exponent is distinctly different from the one calculated for the bimolecular model, indicating that these two models do not belong to the same universality class. We can then conclude that the number of interacting species is important in defining the roughening exponents in growth models.

4. Interface Formation in the ZGB model
The ZGB model, which is a lattice compatible model as well, has been introduced for the simulation of the Langmuir-Hinshelwood (LH) mechanism of the CO oxidation that takes place on the surface of various metals, e.g. Pt, Pd, Rh. Our realisation of the LH mechanism is inspired by the ZGB model, yet the implementation is slightly different. The lattice compatible LH mechanism has the following form
Figure 4. The scaling of the width as a function of time for the quadrimolecular reaction scheme. The straight line represents a power law with an exponent of 1.
CO(g) + S → CO(ads)    (12a)
O2(g) + 2S → 2O(ads)    (12b)
CO(ads) + O(ads) → CO2(g) + 2S    (12c)
where S are the vacant lattice sites, and the subscripts g and ads imply that the molecules are in the gaseous and adsorbed state respectively. The particles in the gas phase affect the on-lattice interactions only through the adsorption probabilities of CO and O2, which express the mole fractions of CO(g) and O2(g). The KMC algorithm we used for the LH simulation on a 2-d lattice is the following.
(1) We start with a 2-d square lattice of size LxL, either vacant (S) or filled with CO or O particles, with given initial concentrations and conditions.
(2) At every Elementary Time Step (ETS) one lattice site is chosen at random.
(3) If the lattice site chosen is vacant (S) then: a) with probability yco, CO adsorbs; b) with probability 1 − yco two O adsorb, one on this site and one on a randomly selected neighbour (provided it is vacant (S)). (yco is the "mole fraction" of CO in the "gas phase".)
(4) If the lattice site chosen contains a CO particle then one of the four neighbours is selected at random. If this neighbour contains an O then both CO and O change to S. If the selected neighbour is CO or S then the site is disregarded and the algorithm goes to step 6.
(5) If the lattice site chosen contains an O particle then one of the four neighbours is selected at random. If this neighbour is CO then both O and CO change to S. If the selected neighbour is O or S then the site is disregarded and the algorithm goes to step 6.
(6) CONTINUE. One ETS is completed and the algorithm returns to step 2 starting a new ETS.
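A hedged Python sketch of one elementary time step of this algorithm (steps 2-6); the names and the periodic-boundary choice are ours, not from the paper:

```python
import random

def lh_step(lat, yco, rng):
    """One ETS of the LH/ZGB-type algorithm above.
    Site states: 'S' (vacant), 'CO', 'O'."""
    L = len(lat)
    i, j = rng.randrange(L), rng.randrange(L)                 # step 2: random site
    nbrs = [((i + 1) % L, j), ((i - 1) % L, j),
            (i, (j + 1) % L), (i, (j - 1) % L)]
    if lat[i][j] == 'S':                                      # step 3
        if rng.random() < yco:
            lat[i][j] = 'CO'                                  # CO adsorption (12a)
        else:
            ni, nj = rng.choice(nbrs)
            if lat[ni][nj] == 'S':                            # O2 needs two vacancies (12b)
                lat[i][j] = lat[ni][nj] = 'O'
    else:                                                     # steps 4-5
        ni, nj = rng.choice(nbrs)
        if {lat[i][j], lat[ni][nj]} == {'CO', 'O'}:
            lat[i][j] = lat[ni][nj] = 'S'                     # reaction (12c): CO + O -> CO2 + 2S
    # step 6: one ETS completed
```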
Figure 5. Interface roughening in the ZGB reaction scheme. (a) Initial condition of the system at t = 0 MCS, (b) snapshots after t = 80 MCS, (c) t = 270 MCS and (d) t = 390 MCS. Parameter values are L = 200, p = 0.3.
More generally we can denote A = CO, B = O, S = (vacant lattice site) and yco = p. Using this mechanism we have simulated systems of various sizes ranging from L = 30 to L = 500 and with parameter values ranging from p = 0 to 1. As can be seen in Figs. 5 and 6, phase separation and wave propagation take place in this model for specific parameter regions. In particular, in Fig. 5 we have started with the specific initial conditions presented in Fig. 5(a), i.e. a circular area (disc) of B (coloured gray) surrounded by A (black). Due to the reactive step (12c), reaction between A and B takes place and vacant sites (coloured white) arise at the disc circumference. Reactions take place only on these regions of vacant sites, which travel both towards the center of the disc and towards the outside. However, for small p such as p = 0.3 these areas have a tendency to propagate within black areas (towards the outside of the disc, which is covered by A particles) while the originally circular circumference of the disc roughens. In Figs. 5(b), 5(c) and 5(d) we observe various
Figure 6. Interface roughening and propagation in the ZGB reaction scheme. (a) Initial condition of the system at t = 0 MCS, (b) snapshots at t = 20 MCS, (c) t = 60 MCS, and (d) t = 120 MCS. Parameter values are L = 200, p = 0.5.
stages of roughening where the effect of the bimolecular step (12c) is predominant. The intrusion of the A domains towards the center of the disc (covered by B) happens more slowly due to the small adsorption probability of A. Also, as soon as an A is deposited next to one or more B particles it is very likely to disappear, since the reaction step (12c) happens with high probability, r = 1. Thus the area of the disc that contains only B particles grows with time. For larger values of p the system goes through a critical point (pc ≈ 0.4) where propagation towards the center and towards the outside of the disc happens at the same rate. In this case we have a reactive steady state of the system and so it never reaches a poisoned state. Thus for values of p near the critical point stable, fractal clusters are observed. For still larger values of p, intrusion of phase A within phase B prevails. In Fig. 6 we also present the diagram of phase separation for different initial conditions, shown in Fig. 6(a), that is a random deposition on a band enclosed in a bulk region of B particles. The boundary conditions are periodic in both the x and y directions, the system size is L = 200 and the adsorption probability p of A is 0.5. For these parameter values the phase A gradually dominates the initially random area, while vacant sites (S) arise at the interface between A and B regions because of the reaction step (12c) between them (as in the previous case of Fig. 5). The adsorption of A particles is favoured on the vacant sites, since the adsorption of B (12b) demands the simultaneous selection of two neighbouring vacant sites. Thus when p = 0.5 the A regions grow at the expense of the B regions. The roughness of the interface between the A and B regions increases
with time at first (see Fig. 6(b)) until the roughness reaches a plateau (Figs. 6(c) and 6(d)). The steady-state roughness depends on the system size L. Thus, as in the above cases, when cooperative phenomena are involved in reactive dynamics, clustering, interface propagation and roughening are observed. More detailed study is expected to shed light on the roughening transition away from the critical point as well as on the cluster structure at the critical point.
Conclusions

In the current study we examine the interface propagation and roughening on a surface between phases in competition. We present three models of interacting particle systems with a variable degree of interactions, that is a) the bimolecular model A + B → B + B, b) the quadrimolecular reactive model 2A + 2B → 4B, and c) the more complex ZGB model which involves spontaneous and cooperative reaction steps. In all these models we observe clustering of homologous species due to the cooperative character of the interactions. The various clusters compete and we observe interface roughening between the different clusters. The roughness of the surface follows a scaling function and the roughening exponents depend on the type of the interactions and the number of species involved. Namely, for the bimolecular interaction model the dynamic scaling exponent is β = 0.32 ± 0.03, while for the quadrimolecular interaction model it is β = 0.5 ± 0.03. It is therefore clear that the dynamic scaling exponent (and the universality class) depends crucially on the degree of interactions. More detailed study needs to be carried out in order to investigate the values of the α exponent for all the models and to determine the clustering characteristics of the ZGB model at the critical point of p and far from it.
Acknowledgments The authors would like to thank Dr. G. A. Tsekouras, and Profs. V. Havredaki and A. A. Tsekouras for helpful discussions.
References
1. R. M. Ziff, E. Gulari and Y. Barshad, Phys. Rev. Lett. 56, 2553 (1986).
2. B. J. Brosilow, E. Gulari and R. M. Ziff, J. Chem. Phys. 98, 674 (1993); C. A. Voigt and R. M. Ziff, Phys. Rev. E 56, R6241 (1997).
3. J. W. Evans and M. S. Miesch, Phys. Rev. Lett. 66, 833 (1991); M. Tammaro and J. W. Evans, Phys. Rev. E 52, 2310 (1995); M. Tammaro and J. W. Evans, J. Chem. Phys. 108, 762 (1998).
4. D. J. Liu and J. W. Evans, Phys. Rev. Lett. 84, 955 (2000).
5. V. P. Zhdanov, Phys. Rev. E 59, 6292 (1999); V. P. Zhdanov, Surf. Sci. Rep. 45, 231 (2002).
6. E. V. Albano and J. Marro, J. Chem. Phys. 113, 10279 (2000).
7. H. Rose, H. Hempel and L. Schimansky-Geier, Physica A 206, 421 (1994).
8. A. Provata, J. W. Turner and G. Nicolis, J. Stat. Phys. 70, 1195 (1993).
9. A. Tretyakov, A. Provata and G. Nicolis, J. Phys. Chem. 99, 2770 (1995).
10. A. Provata and G. A. Tsekouras, Phys. Rev. E 67, 056602 (2003).
11. A. Provata, G. Nicolis and F. Baras, J. Chem. Phys. 110, 8361 (1999).
12. L. Frachebourg, P. L. Krapivsky and E. Ben-Naim, Phys. Rev. E 54, 6186 (1996).
13. A. V. Shabunin, F. Baras and A. Provata, Phys. Rev. E 66, 036219 (2002).
14. W. Wang and X. Q. Zhao, Math. Biosci. 190, 97 (2004).
15. O. Alves, C. E. Ferreira and F. P. Machado, Math. Comput. Simulat. 64, 609 (2004).
16. J. D. Murray, Mathematical Biology (Springer-Verlag, 2002).
17. R. Imbihl and G. Ertl, Chem. Rev. 95, 697 (1995).
18. Y. De Decker, F. Baras, N. Kruse and G. Nicolis, J. Chem. Phys. 117, 22 (2002).
19. N. Hartmann, Y. Kevrekidis and R. Imbihl, J. Chem. Phys. 112, 15 (2000).
20. M. Kardar, G. Parisi and Y.-C. Zhang, Phys. Rev. Lett. 56, 889 (1986).
RESONANT ACTIVATION AND NOISE ENHANCED STABILITY IN JOSEPHSON JUNCTIONS
A. L. PANKRATOV
Institute for Physics of Microstructures of the Russian Academy of Sciences, GSP-105, Nizhny Novgorod 603950, Russia
E-mail: alp@ipm.sci-nnov.ru
B. SPAGNOLO
Dipartimento di Fisica e Tecnologie Relative and INFM, Group of Interdisciplinary Physics,* Università di Palermo, Viale delle Scienze pad. 18, I-90128 Palermo, Italy
E-mail: spagnolo@unipa.it
We investigate the interplay of two noise-induced effects on the temporal characteristics of short overdamped Josephson junctions in the presence of a periodic driving. We find that: (i) the mean lifetime of the superconductive state has a minimum as a function of the driving frequency, and near the minimum it practically does not depend on the noise intensity (resonant activation phenomenon); (ii) the noise enhanced stability phenomenon increases the switching time from the superconductive to the resistive state. As a consequence there is a suitable frequency range of clock pulses at which the noise has a minimal effect on pulse propagation in RSFQ electronic devices.
1. Introduction and Basic Formulas
The investigation of thermal fluctuations and nonlinear properties of Josephson junctions (JJs) is very important owing to their broad applications in logic devices. Superconducting devices in fact are natural qubit candidates for quantum computing because they exhibit robust, macroscopic quantum behavior.1 Recently, a lot of attention was devoted to Josephson logic devices with high damping because of their high-speed switching.2,3 The rapid single flux quantum logic (RSFQ), for example, is a superconductive digital technique in which the data are represented by the presence or absence of a flux quantum Φ0 = h/2e in a cell which comprises Josephson junctions. The voltage pulse from a moving single flux quantum is the unit of information. The short voltage pulse corresponds to a single flux quantum moving across a Josephson junction, that is a 2π phase flip. However, the operating temperatures of the high-Tc superconductors lead to higher noise levels by increasing the probability of thermally-induced switching errors. Moreover, during

*Electronic address: http://gip.dft.unipa.it
the propagation within the Josephson transmission line the fluxon accumulates a time jitter. These noise-induced errors are one of the main constraints to obtaining higher clock frequencies in RSFQ microprocessors.2 In this work, after a short introduction with the basic formulas of the Josephson devices, the model used to study the dynamics of a short overdamped Josephson junction is described. In the next section two main noise-induced phenomena observed in metastable states, namely the resonant activation and the noise enhanced stability, are shortly presented. Finally, in the last section the results and the interplay of these noise-induced phenomena on the temporal characteristics of the Josephson devices are discussed. The role played by these noise-induced effects in the accumulation of timing errors in RSFQ logic devices is analyzed.

The Josephson tunneling junction is made up of two superconductors separated from each other by a thin layer of oxide.4 The phase difference φ between the wave functions of the left and right superconductors is given by the Josephson equation

dφ/dt = (2e/ℏ) V(t)    (1)

where V(t) is the potential difference across the junction, e is the electron charge, and ℏ = h/2π is Planck's constant. A small junction can be modelled by a resistance R in parallel with a capacitance C, across which are connected a bias generator and a phase-dependent current generator, I_c sin φ, representing the Josephson supercurrent due to the Cooper pairs tunnelling through the junction. Since the junction operates at a temperature above absolute zero, there will be a white Gaussian noise current superimposed on the bias current. Therefore the dynamics of a short overdamped JJ, widely used in logic elements with high-speed switching and corresponding to a negligible capacitance C, is obtained from Eq. (1) and from the current continuity equation of the equivalent circuit of the Josephson junction. The resulting equation is the following Langevin equation

(1/ω_c) dφ/dt = −du(φ, t)/dφ + i_F(t)    (2)

valid for β << 1, with β = 2eI_cR²C/ℏ the McCumber-Stewart parameter and I_c the critical current. Here

u(φ, t) = 1 − cos φ − i(t)φ,  i(t) = i0 + f(t),  f(t) = A sin(ωt)    (3)

is the dimensionless potential profile (see Fig. 1), φ is the difference in the phases of the order parameter on opposite sides of the junction, f(t) is the driving signal, i = I/I_c, i_F(t) = I_F/I_c, where I_F is the random component of the current, and ω_c = 2eRI_c/ℏ is the characteristic frequency of the JJ. In the case when only thermal fluctuations are taken into account,4 the random current may be represented by white Gaussian noise, and the phase dynamics is analogous to the
Figure 1. The potential profile u(φ) = 1 − cos φ − iφ, for two values of the current, namely i = 0.5 (solid line) and i = 1.2 (dashed line).
motion of a Brownian particle moving in a washboard potential (see Fig. 1). A junction initially trapped in a zero-voltage state, with the particle localized in one of the potential wells, can escape out of the potential well by thermal fluctuations. The phase difference φ fluctuates around the equilibrium positions (minima of the potential u(φ)) and randomly performs jumps of 2π across the potential barrier towards a neighbouring potential minimum. The resulting time phase variation produces a nonzero voltage across the junction with marked spikes. For a bias current less than the critical current I_c, these metastable states correspond to "superconductive" states of the JJ. The mean time between two sequential jumps is the lifetime of the superconductive metastable state. For an external current greater than I_c, the JJ switches from the superconductive state to the resistive one and the phase difference slides down the potential profile, which now has no equilibrium steady states. A Josephson voltage output will be generated at a later time. Such a time is the switching time, which is a random quantity. In the presence of thermal noise a Josephson voltage appears even if the current is less than the critical one (i < 1), therefore we can identify the lifetime of the metastable states with the mean switching time.3,5
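The escape dynamics described here can be reproduced with a simple Euler-Maruyama integration of the overdamped Langevin dynamics, in units of the characteristic time 1/ω_c. This is an illustrative sketch; the function name and all parameter values are our own choices, not taken from the paper:

```python
import math
import random

def simulate_phase(i0=0.5, A=1.0, omega=0.05, gamma=0.02,
                   dt=0.01, n_steps=20000, rng=None):
    """Euler-Maruyama integration of dphi/dt = i(t) - sin(phi) + xi(t),
    with i(t) = i0 + A*sin(omega*t) and white Gaussian noise xi of
    intensity gamma, starting at the bottom of a potential well."""
    rng = rng or random.Random(0)
    phi = math.asin(min(i0, 1.0))          # minimum of u(phi) for i0 < 1
    traj = [phi]
    for n in range(n_steps):
        t = n * dt
        drift = (i0 + A * math.sin(omega * t)) - math.sin(phi)
        phi += drift * dt + math.sqrt(2.0 * gamma * dt) * rng.gauss(0.0, 1.0)
        traj.append(phi)
    return traj
```

For i < 1 and weak noise the phase stays near a minimum of u(φ) (superconductive state); for i > 1 the washboard tilts enough that the phase runs downhill, which corresponds to the resistive state.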
The initial and boundary conditions of the probability density and of the probability current for the potential profile (3) are as follows W(cp,O)= S((p - PO),
W(+∞, t) = 0, G(−∞, t) = 0. Let the JJ initially be biased by a current smaller than the critical one, that is i0 < 1, so that the junction is in the superconductive state. The current pulse f(t), such that i(t) = i0 + f(t) > 1, switches the junction into the resistive state. An output voltage pulse will appear after a random switching time. We will calculate the mean value and the standard deviation of this quantity for two different periodic driving signals: (i) a dichotomous signal, and (ii) a sinusoidal one. We will consider different values of the bias current i0 and of the signal amplitude A. Depending on the values of i0 and A, as well as on the values of the signal frequency and the noise intensity, two noise-induced effects may be observed, namely the resonant activation (RA) (see Refs. [6-8,11]) and the noise enhanced stability (NES) (see Refs. [5,10-12]). These effects have different roles on the temporal characteristics of the Josephson junction and occur because of the presence of metastable states in the periodic potential profile of the Josephson tunnel junction and of the thermal noise. Specifically, the RA phenomenon minimizes the switching time and therefore also the timing errors in RSFQ logic devices, while the NES phenomenon increases the mean switching time, producing a negative effect.
2. Noise induced effects

2.1. Resonant Activation
The escape of a Brownian particle moving in a fluctuating metastable potential shows the resonant activation phenomenon, that is, the average escape time has a minimum as a function of the oscillating frequency of the potential barrier. This effect was theoretically predicted in Ref. [6], where random fluctuations of the potential were considered, and experimentally observed in tunnel diodes and in underdamped Josephson tunnel junctions. The fluctuations of the potential barrier can be random or periodic between two limiting configurations of the potential, upper and lower positions respectively. The average frequency of fluctuations must be less than the natural frequency of the system at the metastable state. Recently the RA effect was obtained theoretically in a piece-wise linear dichotomously fluctuating potential with a metastable state.11 If the potential fluctuations are very slow, the average escape time is equal to the average of the crossing times over the upper and lower configurations of the barrier; in this case the slowest process determines the value of the average escape time. In the limit of very fast fluctuations, the Brownian particle "sees" the average barrier and the average escape time is equal to the crossing time over the average barrier. In the intermediate regime, the crossing is strongly correlated with the potential fluctuations and the average escape time exhibits a minimum at a resonant fluctuation rate. In Fig. 2 we show a typical picture of the RA phenomenon observed in a metastable fluctuating potential.11
Figure 2. Semilogarithmic plot of the average escape time as a function of the mean switching rate of the piecewise linear metastable potential profile for seven different values of the noise intensity D.
2.2. Noise Enhanced Stability
The noise-enhanced stability (NES) phenomenon was observed experimentally and numerically in various physical systems (see Refs. [3,5,10,11] and, as a recent review, Ref. [12]). The investigated systems were subjected to the action of two forces: an additive white noise and a driving force. The driving force was fixed, periodical or random. The noise enhanced stability effect implies that, under the action of additive noise, a system remains in the metastable state for a longer time than in the deterministic case, and the escape time has a maximum as a function of noise intensity. We can lengthen or shorten the mean lifetime of the metastable state of our physical system by acting on the white noise intensity. The noise-induced stabilization, the noise-induced slowing down in a periodical potential, the noise-induced order in the one-dimensional map of the Belousov-Zhabotinsky reaction, and the transient properties of a bistable kinetic system driven by two correlated noises are akin to the NES phenomenon.12 In Fig. 3 we report the behavior of the average escape time as a function of the noise intensity for a piece-wise linear metastable potential subjected to dichotomous random fluctuations.11
Figure 3. Semilogarithmic plot of the normalized average escape time as a function of the white noise intensity D for three values of the dimensionless mean switching rate of the piece-wise linear metastable potential profile.
173 3. Temporal characteristics Now we investigate the following temporal characteristics: the mean switching time (MST) and its standard deviation (SD) of the Josephson junction described by Eq. (2). These quantities may be introduced as characteristic scales of the evolution
s W(p, t)dp, to find the phase within one period of the
9 2
of the probability P ( t ) =
9 1
potential profile of Eq. (3). We choose therefore p2 = T , (PI = -T and we put the initial distribution on the bottom of a potential well: po = arcsin(i0). A widely used definition of such characteristic time scales is the integral relaxation time '. The mean switching time T = ( t )may be introduced in the form
where w(t) = −dP(t)/dt is the probability density of the switching time, and the SD of the switching time is σ = √(⟨t²⟩ − ⟨t⟩²).
Let us focus on the case of dichotomous driving, f(t) = A sign(sin(ωt)). The results of computer simulations are shown in Fig. 4. Both the MST and its SD do not depend on the driving frequency below a certain cut-off frequency, above which the characteristics degrade. In the frequency range from 0 to 0.2ωc we can therefore describe the effect of dichotomous driving by the time characteristics in a constant potential. The exact analytical expression of the MST, Eq. (7), as well as its asymptotic representation for γ ≪ 1, have been obtained in Ref. [5]. Using the approach of Ref. [9], the exact expression for τ2 = ⟨t²⟩ in a time-constant potential, which corresponds also to a single unit-step pulse, may be derived in closed form.
Figure 4. The MST τ(ω) and SD σ(ω) as functions of the driving frequency for dichotomous driving, for two values of the noise intensity, γ = 0.2 and 0.02, with i0 = 0.5, i = 1.5. Results of computer simulations: τ(ω), solid lines; σ(ω), diamonds and circles. Dashed lines: the theoretical results given by Eqs. (7) and (10).
where H(v) is the threefold quadrature ∫ e^{u(v)/γ} ∫ e^{−u(y)/γ} ∫ e^{u(z)/γ} dz dy dv, with integration limits set by the potential profile, and τc(φ0) is given by Eq. (6). The asymptotic expression of the SD σ = √(⟨t²⟩ − ⟨t⟩²) in the small noise limit γ ≪ 1 is given by Eq. (10).
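Before comparing with the analytics, the MST and SD are straightforward to estimate by direct simulation. The sketch below integrates a minimal overdamped phase equation with a dichotomous drive; the reduced equation, units and default parameter values are our illustrative assumptions, not the exact Eq. (2) of the text:

```python
import numpy as np

def mean_switching_time(gamma, i0=0.5, A=1.0, omega=0.1,
                        n_traj=300, dt=5e-3, t_max=100.0, seed=1):
    """Overdamped sketch of the junction phase: dphi = (i(t) - sin phi) dt
    + sqrt(2 gamma) dW, with dichotomous drive i(t) = i0 + A sign(sin wt).
    Illustrative reduction with assumed parameters, not the exact Eq. (2).
    Switching is counted when the phase leaves (-pi, pi); unswitched
    trajectories are censored at t_max."""
    rng = np.random.default_rng(seed)
    phi = np.full(n_traj, np.arcsin(i0))
    t_sw = np.full(n_traj, t_max)
    alive = np.ones(n_traj, dtype=bool)
    for k in range(int(t_max / dt)):
        if not alive.any():
            break
        drive = i0 + A * np.sign(np.sin(omega * k * dt))
        p = phi[alive]
        p = p + (drive - np.sin(p)) * dt \
              + np.sqrt(2.0 * gamma * dt) * rng.standard_normal(p.size)
        phi[alive] = p
        out = np.flatnonzero(alive)[np.abs(p) >= np.pi]
        t_sw[out] = (k + 1) * dt
        alive[out] = False
    return t_sw.mean(), t_sw.std()
```

Scanning ω well below the cut-off should reproduce the frequency-independent plateau of Fig. 4.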
It is interesting to see that the SD of the switching time scales as the square root of the noise intensity. The comparison of the approximate expression of σ (Eq. (10)) with the results of computer simulation is presented in Fig. 5 for i = 1.2, i = 1.5 and γ = 0.001. We can see that formula (10) works rather well up to γ = 0.05. Not only low-temperature devices (γ ≤ 0.001), but also high-temperature devices may therefore be described by formulas (7) and (10). It is important to mention that, since the largest contribution to the MST comes from the deterministic term τc(φ0) = f1(φ2) − f1(φ0), the low-noise-limit formula (7) actually gives the same results as the linear approach. However, formula (10) in some cases deviates significantly from the results of linearized calculations [13]. If the noise intensity is rather large, the phenomenon of NES may be observed in our system: the MST increases with the noise intensity, as may easily be seen from Eq. (7). Here we note that it is very important to consider this effect in the design of large arrays of RSFQ elements operating at high frequencies. To neglect this
Figure 5. The MST and SD vs bias current for the time-constant case for γ = 0.001. Solid lines: results of computer simulation; dashed lines: formulas (7) and (10). Inset: the SD vs noise intensity for the time-constant case (i > 1). Solid line: formula (10), circles: results of computer simulation for i = 1.5; dashed line: formula (10), diamonds: results of computer simulation for i = 1.2.
Figure 6. The MST vs frequency for f(t) = A sin(ωt) (computer simulations) for i = 1.5. Long-dashed line: γ = 0.02; short-dashed line: γ = 0.05; solid line: γ = 0.5; from top to bottom: i0 = 0.5, A = 1 and i0 = 0.8, A = 0.7. Inset: comparison between simulations and theoretical results obtained from Eq. (12) for i0 = 0.5, A = 1, and γ = 0.02 (diamonds), γ = 0.05 (circles) and γ = 0.5 (crosses).
noise-induced effect in such nonlinear devices may lead to malfunctions due to the accumulation of errors. Now let us consider the case of sinusoidal driving. The corresponding time characteristics may be derived using the modified adiabatic approximation, Eq. (12), with τc(φ0, t′) given by Eq. (6). We focus now on the current value i = 1.5, because i = 1.2 is too small for high-frequency applications. In Fig. 6 the MST as a function of the driving frequency for different values of bias current is shown. For smaller
Figure 7. The SD vs frequency for f(t) = A sin(ωt) and γ = 0.02. Computer simulations: dash-dotted line, i0 = 0.3, A = 1.2; short-dashed line, i0 = 0.5, A = 1; long-dashed line, i0 = 0.8, A = 0.7. Solid line: formula (10).
i0 the switching time is larger, since φ0 = arcsin(i0) depends on i0. On the other hand, the bias current i0 must not be too large, since in the absence of driving it would lead to a reduction of the mean lifetime of the superconductive state, i.e., to increasing storage errors (Eq. (6)). Therefore, there must be an optimal value of the bias current i0, giving a minimal switching time and acceptably small storage errors. We observe the phenomenon of resonant activation: the MST has a minimum as a function of the driving frequency. The approximation (12) works rather well below 0.1ωc, which is enough for practical applications. It is interesting to see that near the minimum the MST has a very weak dependence on the noise intensity, i.e., in this signal frequency range the noise is effectively suppressed. We also observe the NES phenomenon. There is a frequency range, around 0.2-0.4ωc for i0 = 0.5 and around 0.3-0.5ωc for i0 = 0.8, where the switching time increases with the noise intensity. The NES effect increases for smaller i0 because the potential barrier disappears for a short time interval within the driving period T = 2π/ω and the potential is more flat [10], so the noise has more chances to prevent the phase from moving down and to delay the switching process. This effect may be avoided if the operating frequency does not exceed 0.2ωc. Besides (see Fig. 7), the SD also increases above 0.2ωc. The plots of the SD as a function of the driving frequency for γ = 0.02, i = 1.5 and different values of i0 are shown in Fig. 7. The approximation (12) is not as good for the SD as for the MST, even if the qualitative behaviour of the SD is recovered. We see that the minimum of σ(ω), for γ = 0.02, is located near the corresponding minimum of τ(ω) in Fig. 6. For the SD the optimal frequency range, where the noise-induced error will be minimal, is from 0.1 to 0.3 for the considered range of parameters.
It is interesting to see that, near the minimum, the SD for sinusoidal driving actually coincides with the SD for dichotomous driving (Eq. (10)). The close location of the minima of the MST and of its SD means that optimization of an RSFQ circuit for fast operation will simultaneously lead to minimization of timing errors in the circuit.
4. Conclusions
In the present paper we reported an analytical and numerical analysis of the influence of fluctuations and periodic driving on the temporal characteristics of the JJ. For dichotomous driving, the analytical expression of the standard deviation of the switching time works in the practically interesting frequency range and for arbitrary noise intensity. For sinusoidal driving, the resonant activation effect has been observed in the considered system: the mean switching time has a minimum as a function of the driving frequency. Near this minimum the standard deviation of the switching time also takes a minimum value. Utilization of this effect allows one to suppress the time jitter in practical RSFQ devices and, therefore, to significantly increase the working frequencies of RSFQ circuits. The NES phenomenon was also observed, and its effect on the dynamics of the JJ was discussed. Our study is important not only for understanding the physics of fluctuations in a Josephson junction and for improving the performance of complex digital systems, but also for the nonequilibrium statistical mechanics of dissipative systems, where noise-assisted switching between metastable states takes place.
Acknowledgments
This work has been supported by INTAS Grant 01-450, INFM, MIUR, by the RFBR (Project No. 03-02-16533) and by the Russian Science Support Foundation.
References
1. Y. Makhlin, G. Schön, and A. Shnirman, Rev. Mod. Phys. 73, 357 (2001); Y. Yu et al., Science 296, 889 (2002); R. W. Simmonds et al., Phys. Rev. Lett. 93, 077003 (2004); A. N. Cleland and M. R. Geller, Phys. Rev. Lett. 93, 070501 (2004).
2. M. Dorojevets, P. Bunyk and D. Zinoviev, IEEE Trans. Appl. Supercond. 11, 326 (2001); V. Kaplunenko, Physica C 372-376, 119 (2002); T. Ortlepp, H. Toepfer and H. F. Uhlmann, IEEE Trans. Appl. Supercond. 13, 515 (2003).
3. Andrey L. Pankratov and Bernardo Spagnolo, Phys. Rev. Lett. 93, 177001 (2004).
4. A. Barone and G. Paterno, Physics and Applications of the Josephson Effect (Wiley, New York, 1982); K. K. Likharev, Dynamics of Josephson Junctions and Circuits (Gordon and Breach, New York, 1986).
5. A. N. Malakhov and A. L. Pankratov, Physica C 269, 46 (1996).
6. C. R. Doering and J. C. Gadoua, Phys. Rev. Lett. 69, 2318 (1992).
7. R. N. Mantegna and B. Spagnolo, Phys. Rev. Lett. 84, 3025 (2000); Yang Yu and Siyuan Han, Phys. Rev. Lett. 91, 127003 (2003).
8. A. L. Pankratov and M. Salerno, Phys. Lett. A 273, 162 (2000).
9. A. N. Malakhov and A. L. Pankratov, Adv. Chem. Phys. 121, 357 (2002).
10. R. N. Mantegna and B. Spagnolo, Phys. Rev. Lett. 76, 563 (1996); N. V. Agudov and A. N. Malakhov, Phys. Rev. E 60, 6333 (1999); N. V. Agudov and B. Spagnolo, Phys. Rev. E 64, 035102(R) (2001); N. V. Agudov, A. A. Dubkov and B. Spagnolo, Physica A 325, 144 (2003); A. Fiasconaro, D. Valenti, and B. Spagnolo, ibid. 325, 136 (2003).
11. B. Spagnolo, A. A. Dubkov, N. V. Agudov, Physica A 340, 265 (2004); A. A. Dubkov, N. V. Agudov and B. Spagnolo, Phys. Rev. E 69, 061103 (2004).
12. B. Spagnolo, A. A. Dubkov, N. V. Agudov, Acta Physica Polonica B 35, 1419 (2004).
13. A. V. Rylyakov and K. K. Likharev, IEEE Trans. Appl. Supercond. 9, 3539 (1999).
SYMMETRY BREAKING INDUCED DIRECTED MOTIONS
CHENG-HUNG CHANG
National Center for Theoretical Sciences, Physics Division, Hsinchu 300, Taiwan
National Chiao Tung University, Institute of Physics, Hsinchu 300, Taiwan
E-mail: [email protected]

TIAN YOW TSONG
Institute of Physics, Academy of Sciences, Taipei 115, Taiwan
University of Minnesota, College of Biological Science, St. Paul, Minnesota 55108
E-mail: [email protected]

A variety of directed motions in microsystems are ascribed to symmetry breakings. Well-known examples include the spatial symmetry breaking of ratchets in biological motors [1] and the temporal symmetry breaking of quantum pumping in quantum dots [2]. Since these two mechanisms are often mixed together in a real system, an interesting question emerges, namely, which kind of symmetry breaking is dominant and decides the directed motion of a system. This question will be illustrated in a simple model with an asymmetric potential and a driving force generated by deterministic chaotic maps. The analysis reveals that the driving-force frequency is the most crucial parameter, which decides whether the directed motion is determined by the spatial or by the temporal symmetry breaking.
1. Introduction

Condensed materials usually contain certain symmetries giving rise to periodic structures. The periodicity is asymmetric if the unit cell of the periodic structure does not have reflection symmetry, which is quite often the case in soft and hard matter. Recently a widely discussed problem related to asymmetric structure is the ratchet effect [3,4]. The most characteristic behavior in this effect is that a biased particle movement can be induced by an unbiased driving source, if the particle is exposed to an asymmetric background. This phenomenon is expected to account for a variety of physical and biological systems. However, such spatial symmetry breaking is not the only mechanism which can lead to biased movement. If the driving source has a certain correlation, biased movement can also happen, even when the structure is spatially symmetric and the driving force is of zero mean. This correlation effect is called temporal symmetry breaking. Accordingly, if we discover a certain biased movement, say, on some asymmetric biological structure, it is too hasty to conclude immediately that the movement is due to the ratchet effect (spatial symmetry breaking). More carefully, one should ask which symmetry breaking is the dominant effect for the biased movement.
To illustrate this problem, let us consider a simple model [6] which describes the motion of a particle with mass m on a periodic asymmetric potential

V(x) = −[4 sin(2π(x − x0)) + sin(4π(x − x0))]/(16π²d) + c,

under a damping force with damping coefficient γ, where d = 1.6, x0 = −0.190, and c = 0.028/d, such that the position x = 0 is a minimum of the potential. The particle is exposed to a temporally discrete driving force of kicks,

f(t) = σ Σn an δ(t − nβ),   (1)

where β denotes the period of the kicks, σ represents the strength of the force, and the an are pure numbers with zero mean, i.e., ⟨a⟩ = lim_{τ→∞} (1/τ) Σ_{n=1}^{τ} an = 0. For simplicity the period of the kicks is kept constant. However, the amplitudes an of the kicks are determined by chaotic maps, including the circle map TC, the baker map TB, and the logistic map TL, defined on the unit interval I := [0, 1):

Circle:   TC : z ↦ z + a mod 1,   invariant measure PC(z) = 1 (Lebesgue)
Baker:    TB : z ↦ 2z mod 1,     invariant measure PB(z) = 1 (Lebesgue)
Logistic: TL : z ↦ 4z(1 − z),    invariant measure PL(z) = 1/(π√(z(1 − z)))

The number a = √2/10 is chosen to be irrational, so that TC is ergodic. The other two maps are not only ergodic, but also mixing and exact [5]. All these maps are deterministic and belong to different hierarchies of chaos. The last two have a positive Lyapunov exponent and their long-time behavior is unpredictable. After many iterations, the distribution of the positions in the orbit {T^n z0, n = 0, 1, ...} approaches an invariant probability density for almost all initial points z0 [5]. For the above-mentioned maps, the densities are listed in the above table and plotted in Fig. 1. Since all these densities are symmetric with respect to the axis z = 0.5, the points in the orbits of the maps can be used to generate the amplitudes an of the deterministic driving force in (1) by the replacement an = T^n z0 − 0.5, for almost all z0 ∈ I. Obviously, this force has zero mean, ⟨a⟩ = 0, with an ∈ [−0.5, 0.5). Without loss of generality, we set m = 1. For a large ratio γ/σ, the trajectories x(t) will be trapped around a minimum of the potential and cannot hop over the potential barriers into the other unit cells. For a small ratio γ/σ, the particle motion is a random walk on an asymmetric potential: the trajectories x(t) wander between different unit cells. For a ratio γ/σ between these two regimes, unidirectional net transport becomes apparent, which is of interest here. To make it concrete, we take the damping coefficient γ = 1 and the period β and the strength σ of the kicks as follows: (I) β = 8, σ = 1.17 for all maps; (II) β = 1, σ = 0.9 for TC, σ = 0.3 for TB, and σ = 0.4 for TL. Since the long-time behaviors of the trajectories are similar for different initial conditions, we show only one trajectory for every map. Their initial conditions are (x, ẋ) = (0, 0) for (I) and (x, ẋ) = (−50, 0) for (II), with z0 = √2/10 for both cases. Interestingly, the following observations can be made (Fig. 2): (i) For kicks with a long period, i.e., β = 8, all maps induce negative transport.
Figure 1. (a) Circle map, (b) baker map, (c) logistic map, and the invariant probability densities for (d) the circle map, (e) the baker map and (f) the logistic map.
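The maps and the zero-mean amplitudes an = T^n z0 − 0.5 of Fig. 1 are easy to reproduce numerically. A minimal sketch (the noise re-injection in the baker map is our own workaround for the loss of mantissa bits in binary floating point, not part of the model):

```python
import numpy as np

RNG = np.random.default_rng(42)
ALPHA = np.sqrt(2.0) / 10.0   # irrational rotation number of the circle map

def circle(z):
    return (z + ALPHA) % 1.0

def baker(z):
    # Pure 2z mod 1 shifts mantissa bits out and collapses to z = 0 after
    # ~53 float64 iterations; re-injecting noise in the lost low-order
    # bits preserves the Lebesgue (uniform) statistics of the map.
    return (2.0 * z + 1e-12 * RNG.random()) % 1.0

def logistic(z):
    return 4.0 * z * (1.0 - z)

def kick_amplitudes(T, z0, n):
    """Driving amplitudes a_n = T^n(z0) - 0.5; their mean vanishes because
    each invariant density is symmetric about z = 0.5."""
    a = np.empty(n)
    z = z0
    for i in range(n):
        a[i] = z - 0.5
        z = T(z)
    return a
```

For each map the empirical mean of the an vanishes, as required for an unbiased drive.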
(ii) For kicks with a short period, i.e., β = 1, the baker map and the circle map prefer positive transport and the logistic map prefers negative transport. The reason for (i) is simple. The damping force tends to drag the particle in a potential unit cell toward its left barrier, since the potential is asymmetric and the potential minimum is closer to the left potential maximum of its unit cell than to the right one [6]. Thereafter, a random kick has a higher likelihood of pushing the particle over the left barrier than over the right barrier, assuming that the time span between two kicks is not too short. This effect is significant when the damping γ is strong. Of course, σ must be enhanced simultaneously to maintain the ratio γ/σ. Therefore, due to the asymmetry effect, the system prefers to induce a negative current, independent of whether the kicks are random or deterministic. This current is apparent as long as the kick period is large. For a short kick period β, the asymmetry effect is slight. It can be realized by observing the evolution of an ensemble of 4969 uniformly distributed states (x, v) in the basin of the attractor (0, 0) bounded by |v| < 0.5 (Fig. 3(a)). Therein, 50.51% of these states are located on the left-hand side of the center xc of the unit cell. Owing to the dissipative nature of the system, all states are contracted into the attractor. However, the contraction is mainly along the direction of the stable manifolds for most initial states. Only those states with a small |v| obtain
Figure 2. Directed net transport for different maps. Three trajectories with short period β = 1 begin with x = −50; three trajectories with long period β = 8 begin with x = 0. For the short period β = 1, the first few steps of the maps TC and TL are magnified in the two insets, where the amplitudes σan, n = 1, 2, ..., of the driving force are connected by two thick zigzag curves.
a stronger contraction parallel to the x-direction toward the attractor, as indicated in the ensemble evolution shown in Fig. 3(a), (b), (c), and (d) for t = 0, 1, 3, and 8. The corresponding histograms of the position distribution are shown in Fig. 3(e), (f), (g), and (h), with 50.51%, 52.59%, 61.58%, and 89.92% of the states on the left-hand side of xc. For a system with a short period, e.g., around β = 1, a state obtains successive kicks before becoming trapped in the attractor. In this case, the transport into the left and right basins induced by kicks is nearly equal. In this β regime, the asymmetry effect contributes only weakly to the directed transport. Therefore, the deterministic property of the driving force induced by the different chaotic maps becomes decisive for the transport direction. In summary, a biased movement can be ascribed to both spatial and temporal symmetry breaking. Which factor is dominant is decided by the frequency of the driving force. This suggests that one should examine the driving force before concluding the mechanism for a biased movement on an asymmetric structure.
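The geometric claim used in the argument above — that the potential minimum near x = 0 sits closer to the left barrier of its unit cell than to the right one — can be checked numerically, assuming the reconstructed form of V(x) quoted earlier:

```python
import numpy as np

# Check of the reconstructed ratchet potential: x = 0 should be a local
# minimum, lying closer to the left barrier of its unit cell than to the
# right one.
d, x0 = 1.6, -0.190
c = 0.028 / d

def V(x):
    y = x - x0
    return -(4*np.sin(2*np.pi*y) + np.sin(4*np.pi*y)) / (16*np.pi**2*d) + c

x = np.linspace(-0.5, 1.0, 300001)
v = V(x)
s = np.sign(np.diff(v))
minima = x[1:-1][(s[:-1] < 0) & (s[1:] > 0)]   # local minima
maxima = x[1:-1][(s[:-1] > 0) & (s[1:] < 0)]   # local maxima (barriers)
x_min = minima[np.argmin(np.abs(minima))]      # the minimum near x = 0
left = maxima[maxima < x_min].max()            # nearest barrier on the left
right = maxima[maxima > x_min].min()           # nearest barrier on the right
```

With these parameters the minimum indeed lies at x ≈ 0, with the left barrier roughly 0.38 of a unit cell away and the right one roughly 0.62 away.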
References
1. R. D. Astumian, Science 276, 917 (1997); F. Jülicher, A. Ajdari, and J. Prost, Rev. Mod. Phys. 69, 1269 (1997).
2. M. Switkes, C. M. Marcus, K. Campman, A. C. Gossard, Science 283, 1905 (1999).
Figure 3. Evolution of an ensemble of initial states in the phase space and the corresponding evolution of positions in the configuration space. (a), (b), (c), and (d) are the distributions of states in the phase space at time t = 0, 1, 3, and 8. (e), (f), (g), and (h) are the corresponding position histograms.

3. P. Reimann, Phys. Report 361, 57 (2002).
4. T. Y. Tsong and T. D. Xie (in press in Appl. Phys. A); T. Y. Tsong, R. D. Astumian, Bioelectrochem. Bioenerg. 15, 457 (1986); T. D. Xie, P. Marszalek, Y. d. Chen, and T. Y. Tsong, Biophys. J. 67, 1247 (1994); T. D. Xie, Y. d. Chen, P. Marszalek, and T. Y. Tsong, Biophys. J. 72, 2496 (1997); M. O. Magnasco, Phys. Rev. Lett. 71, 1477 (1993); R. D. Vale and F. Oosawa, Adv. Biophys. 26, 97 (1990); S. C. Kuo and M. P. Sheetz, Science 260, 232 (1993); G. Lattanzi and A. Maritan, Phys. Rev. Lett. 86, 1134 (2001); B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson, Molecular Biology of the Cell, Garland (1994).
5. A. Lasota and M. C. Mackey, Chaos, Fractals, and Noise, Springer-Verlag (1994).
6. C.-H. Chang, Phys. Rev. E 66, 015203(R) (2002).
Granular Media, Glasses and Turbulence
GENERAL THEORY OF GALILEAN-INVARIANT ENTROPIC LATTICE BOLTZMANN MODELS
B. M. BOGHOSIAN
Department of Mathematics, Bromfield-Pearson Hall, Tufts University, Medford, MA 02155, USA
E-mail: bruce.boghosian@tufts.edu
In recent works [1,2], it was shown that the requirement of Galilean invariance led to a unique form of the H function used in entropic lattice Boltzmann models for the incompressible Navier-Stokes equations. In the first of these works [1], this result was derived for single-speed models on Bravais lattices, while in the second [2] it was generalized to multispeed models for which the lattice vectors of each speed present separately satisfied a certain isotropy condition. In this work, we further generalize the result to include all entropic lattice Boltzmann models for the incompressible Navier-Stokes equations. We find that the H function always has the form of the Tsallis entropy, we make contact with another class of entropic lattice Boltzmann models due to Ansumali and Karlin [3], and we correct the form of the q parameter reported in previous works on this subject.
1. Introduction

Lattice Boltzmann models of fluids [4,5] evolve discrete-velocity single-particle distribution functions in discrete time steps on a regular spatial lattice. In the limit of small Mach and Knudsen numbers, the Chapman-Enskog analysis may be used to derive the hydrodynamic equations corresponding to the conserved quantities of the model. The method takes advantage of the fact that the form of the incompressible Navier-Stokes equations that emerges from this analysis is very robust with respect to radical simplifications of the underlying kinetic model. An earlier paper on this subject [1] assumed that the collection of velocities all had the same magnitude, or speed. A subsequent study [2] generalized this result to multispeed models for which the lattice vectors of each speed present separately satisfy a certain isotropy condition. Though a substantial generalization of the earlier work, this restriction ruled out certain important lattice Boltzmann models, such as the D2Q9, D3Q15 and D3Q19 models [5]. In the present work, we generalize the result to any lattice Boltzmann equation whatsoever that yields the incompressible Navier-Stokes equations in the hydrodynamic limit.
2. Discrete Kinetic Equation

We suppose that there are b discrete velocities, denoted by cj, along which move particles of mass mj, where j = 1, ..., b. Since the particles traverse these vectors in time Δt = 1, these velocities must be linear combinations of lattice vectors with integer coefficients, so that the particles remain on the lattice. We do not assume that all velocities have the same magnitude. In contrast to the notation used in earlier work [2], we do not employ a separate subscript to denote the magnitude of the discrete velocities. The α-th Cartesian component of a velocity vector is denoted by cjα. The single-particle distribution function component associated with lattice vector cj at lattice position r and time step t is denoted by Nj(r, t). The simplest lattice Boltzmann models are the so-called lattice-BGK models, with evolution equation

Nj(r + cj, t + Δt) = Nj(r, t) + (1/τ)[Nj^(eq)(r, t) − Nj(r, t)]   (1)

for j = 1, ..., b. Here Nj^(eq)(r, t) is a specified local equilibrium distribution function with the same hydrodynamic moments as Nj, and τ is a specified collisional relaxation time. In what follows, we often suppress the functional dependence on r and t for notational convenience. The mass and momentum moments of the distribution function,

ρ = Σj mj Nj   (2)

and

ρu = Σj mj cj Nj,   (3)

are then conserved, and we shall require that they obey the incompressible Navier-Stokes equations for certain choices of equilibrium distribution. In an entropic lattice Boltzmann model, we must also require that the dynamics decrease a Lyapunov function of trace form,

H = Σj hj(Nj),   (4)

where hj″(x) ≥ 0 for x > 0. In past work, we have restricted attention to the case where hj is independent of j, but in the interests of generality we now drop that restriction. We demand that extremizing H under the constraints of fixed ρ and ρu yield the equilibrium distribution used in the BGK collision operator,

Nj^(eq) = φj(mj(μ + β · cj)),   (5)

where φj is the inverse function of hj′, and where μ(r, t) and β(r, t) are Lagrange multipliers determined by the constraints

ρ = Σj mj Nj^(eq)   (6)

and

ρu = Σj mj cj Nj^(eq).   (7)
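For orientation, the sketch below shows a lattice-BGK update of the form of Eq. (1) for the standard D2Q9 model, with the usual second-order polynomial equilibrium and unit masses; it illustrates the evolution equation only, not the entropic equilibrium of Eq. (5):

```python
import numpy as np

# Minimal single-relaxation-time (BGK) step on the D2Q9 lattice, using the
# standard second-order polynomial equilibrium with unit particle masses.
C = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def equilibrium(rho, u):
    cu = np.einsum('ja,xya->xyj', C, u)                   # c_j . u
    uu = np.einsum('xya,xya->xy', u, u)[..., None]        # |u|^2
    return rho[..., None] * W * (1 + 3*cu + 4.5*cu**2 - 1.5*uu)

def bgk_step(N, tau):
    rho = N.sum(axis=-1)                                  # Eq. (2), m_j = 1
    u = np.einsum('xyj,ja->xya', N, C) / rho[..., None]   # Eq. (3)
    N = N + (equilibrium(rho, u) - N) / tau               # BGK collision
    for j, (cx, cy) in enumerate(C):                      # stream along c_j
        N[..., j] = np.roll(N[..., j], (cx, cy), axis=(0, 1))
    return N
```

A uniform, zero-velocity equilibrium state is a fixed point of this update, and the collision step conserves the moments (2) and (3) by construction.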
3. Lattice Symmetries

To show that the hydrodynamic mass and momentum densities obey the incompressible Navier-Stokes equations, we must impose certain requirements on the moments of the outer products of the velocity vectors. Specifically, we suppose that

Σj mj φj(mjμ) = Φ0(μ),   (8)

Σj mj φj(mjμ) cjα = 0,   (9)

Σj mj φj(mjμ) cjα cjβ = Φ2(μ) δαβ,   (10)

Σj mj φj(mjμ) cjα cjβ cjγ = 0,   (11)

Σj mj φj(mjμ) cjα cjβ cjγ cjδ = Φ4(μ)(δαβ δγδ + δαγ δβδ + δαδ δβγ).   (12)

That is, we assume that odd-rank moments of the velocity vectors vanish, and that even-rank moments up to the fourth rank are completely symmetric isotropic tensors. Since the moments are weighted by the functions φj, these requirements involve both the lattice symmetries and the form of the H function. For unit-speed particles on sufficiently symmetric lattices, they may hold even if hj is independent of j; this was the assumption made in the earliest work on this subject [1]. Subsequent models [2] allowed for more than one speed, but needed to assume that the above requirements held for each speed separately. We claim that the requirements as stated above subsume and generalize the models treated in all earlier works, and allow us to consider entropic versions of all known lattice Boltzmann models of the Navier-Stokes equations, including the D2Q9, D3Q15 and D3Q19 models, along with the other entropic lattice Boltzmann models due to Ansumali and Karlin [3].
We note that these requirements, Eqs. (8) through (12), define the functions Φj(μ) for j ∈ {0, 2, 4}. Since the requirements are identities that must hold for all μ, we may also take their derivatives with respect to μ.
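Isotropy requirements of this type can be checked concretely. For the 27-velocity Cartesian model of Ansumali and Karlin discussed in Section 8, whose weights are products of the one-dimensional weights (1/6, 2/3, 1/6), the odd moments vanish and the second and fourth moments are isotropic; a numerical verification:

```python
import itertools
import numpy as np

# D3Q27 velocities and the Ansumali-Karlin product-form weights:
# one-dimensional weights (1/6, 2/3, 1/6) for components -1, 0, +1.
w1 = {-1: 1/6, 0: 2/3, 1: 1/6}
vels = list(itertools.product((-1, 0, 1), repeat=3))
C = np.array(vels, dtype=float)
W = np.array([w1[x] * w1[y] * w1[z] for x, y, z in vels])

speed2 = (C**2).sum(axis=1)                 # squared speeds: 0, 1, 2, 3
second = np.einsum('j,ja,jb->ab', W, C, C)  # weighted second moment
third = np.einsum('j,ja,jb,jc->abc', W, C, C, C)
fourth = np.einsum('j,ja,jb,jc,jd->abcd', W, C, C, C, C)
```

The weights are normalized, the speed classes have the multiplicities 1, 6, 12, 8 quoted in Section 8, the second moment is (1/3)δαβ, the odd moments vanish, and the fourth moment is (1/9)(δαβδγδ + δαγδβδ + δαδδβγ).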
4. Equilibrium Distribution Function

We expand the equilibrium distribution to second order in Mach number by using β as a formal expansion parameter,

Nj^(eq) = φj(mjμ) + mj φj′(mjμ) β · cj + (mj²/2) φj″(mjμ) ββ : cjcj + ···   (13)

Eqs. (2) and (3) are then used to derive the constraints

ρ = Σj mj Nj^(eq) = Φ0(μ) + ···   (14)

and

ρu = Σj mj cj Nj^(eq) = Φ2′(μ) β + ···,   (15)

where we have used the requirements Eqs. (8) through (11). Solving these order by order for the Lagrange multipliers, we find

μ = Φ0⁻¹(ρ) + ···   (16)

and

β = ρu/Φ2′(μ) + ···,   (17)

where we have assumed that Φ0 is invertible. Inserting these Lagrange multipliers into the Mach-expanded equilibrium distribution, and retaining terms to second order in the Mach number, we find the explicit form of the equilibrium distribution, Eq. (18), where 1 denotes the rank-two unit tensor. This equation is the generalization of Eqs. (A.12) through (A.14) of the previously mentioned reference [2]. The use of this equilibrium distribution in the BGK collision operator, Eq. (1), completely defines the model. We do not describe the algorithmic details of the model here, since those were discussed in both of the earlier references [1,2].
5. Hydrodynamic Equations

The Chapman-Enskog analysis deriving the hydrodynamic equations is carried out in Appendix A of one of the earlier references [2], and so we do not repeat it here. The very same analysis applies to the more general model described above, and leads to the incompressible Navier-Stokes equations, ∇ · u = 0 and

∂u/∂t + g u · ∇u = −∇P + ν∇²u,   (19)

where we have defined the scalar pressure P (20), the kinematic viscosity ν (21), and the factor multiplying the convective derivative,

g = Φ0 Φ4″ / (Φ2′)²,   (22)

where all of the functions Φn are understood to be evaluated at μ = Φ0⁻¹(ρ). We note that the correct form of the convective derivative, and therefore Galilean invariance, is recovered when g = 1, which leads to the requirement

Φ0 Φ4″ = (Φ2′)².   (23)

While this equation seems identical in form to that derived in one of the earlier references [2], that is only because we have chosen our notation shrewdly. It is important to keep in mind that the above analysis is substantially more general than the earlier ones, since it does not impose Eqs. (8) through (12) on each particle speed separately. The notation does have the virtue of making manifest the fact that the present analysis reduces to the earlier one under the more restrictive conditions.
6. Galilean Invariance Requirement on H

We now take the trace of Eq. (10) by setting α = β and summing, and of Eq. (12) by setting α = β and γ = δ and summing. Together with Eq. (8), we obtain

Φ0(μ) = Σj mj φj(mjμ),   (24)

Φ2(μ) = (1/D) Σj mj cj² φj(mjμ),   (25)

and

Φ4(μ) = (1/(D(D + 2))) Σj mj cj⁴ φj(mjμ),   (26)
showing that the Φj functions are just different linear combinations of the φj(mjμ). We note that Eqs. (25) and (26) are weaker than Eqs. (8) through (12); the former are implied by, but do not imply, the latter, which must hold in any case. Using the above equations, the condition of Galilean invariance, Eq. (23), becomes a relation among the φj themselves, Eq. (27). This is a functional differential equation, since the unknown functions φj are evaluated at multiple locations mjμ. In one of the earlier references [2], it was noted that the more restrictive version of the above analysis, in which there was only one unknown function φ, admitted a power-law solution. In this more general context we have more than one dependent-variable function φj, but if we try the solution

φj(x) = Wj(x − B)^β,   (28)

we quickly find that Eq. (27) is satisfied provided β obeys a transcendental equation, Eq. (29). This solution also exhibits invariance under uniform scaling of the Wj; the normalization of the single-particle distribution then determines the overall magnitude of the Wj. The constant B is entirely arbitrary. Thus, even in this very general context, we see that there will be power-law solutions for the single-particle distribution function. For a single-speed model with Wj = 1, the right-hand side of Eq. (29) will be one. The equation is then seen to be satisfied by β = D/2, in agreement with earlier work [1,2].
7. Connection with the Tsallis Entropy

In earlier references [1,2] it has been noted that a power-law solution for the single-particle distribution function yields an H function that has the form of a Tsallis entropy. The italicized words in the last sentence are essential, since it must be emphasized that an H function is not an entropy; the former involves only the single-particle distribution function, while the latter involves the full phase-space density. Nevertheless, Boltzmann's H involves the logarithm function, as does the Boltzmann-Gibbs entropy; likewise, the H of Galilean-invariant lattice Boltzmann models involves the q-logarithm function,

ln_q(x) = (x^{1−q} − 1)/(1 − q),   (30)
as does the Tsallis entropy. In this section, we relate the power β used above to Tsallis' q parameter; in doing so, we correct an algebraic mistake in an earlier work on this subject. We first recall that φj is the inverse function of hj′, so that

z = hj′(φj(z)) = hj′(Wj(z − B)^β).   (31)

If we set x = Wj(z − B)^β, this becomes

hj′(x) = B + (x/Wj)^{1/β}.   (32)

Integrating this yields

hj(x) = Bx + (β/(β + 1)) Wj^{−1/β} x^{(β+1)/β} + C,   (33)

where C is a constant of integration which we may set to zero with no loss of generality. The result may be rewritten, Eq. (34), in a form proportional to x ln_q(x/Wj), if we make the identification 1 − q = 1/β, or

q = 1 − 1/β.   (35)
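The inversion and the q-identification above can be verified numerically. In the sketch below, the values W = 0.7, B = 0.2 and β = 1.5 are arbitrary illustrative choices, not parameters from the text:

```python
import numpy as np

# Numerical check of the inverse-function relation and of q = 1 - 1/beta.
W, B, beta = 0.7, 0.2, 1.5          # arbitrary illustrative values
q = 1.0 - 1.0 / beta

def phi(z):                          # power-law solution, Eq. (28)
    return W * (z - B)**beta

def h_prime(x):                      # its inverse, h'(x) = B + (x/W)^(1/beta)
    return B + (x / W)**(1.0 / beta)

def ln_q(x):                         # q-logarithm, Eq. (30)
    return (x**(1.0 - q) - 1.0) / (1.0 - q)

z = np.linspace(0.5, 3.0, 7)
assert np.allclose(h_prime(phi(z)), z)       # h' inverts phi exactly

# The power-law part of h(x) equals (x ln_q(x/W) + beta x) / (beta + 1):
# h has the q-logarithmic (Tsallis) form up to terms linear in x.
x = np.linspace(0.1, 2.0, 5)
h_pl = beta / (beta + 1.0) * W**(-1.0 / beta) * x**((beta + 1.0) / beta)
assert np.allclose(h_pl, (x * ln_q(x / W) + beta * x) / (beta + 1.0))
```

Both power laws scale as x^{(β+1)/β} = x^{2−q}, which is exactly the identification made in the text.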
This differs from the relation q = 1 + 1/β that was reported in the earlier reference [2]. We assert that the above is correct, and note that applying it to the case of a single-speed model, for which Wj = 1, we find q = 1 − 2/D, which was also reported in earlier references [1,2].

8. Condition for Boltzmann-Gibbs Entropy
The generality of the present treatment makes it possible for us to inquire under what circumstances the Boltzmann-Gibbs entropy is appropriate. Because of the restricted nature of the previous studies, this question has not heretofore been answered. We see that q → 1 as β → ∞. If we suppose that all particles have unit mass, Eq. (29) then reduces to the condition of Eq. (36). Indeed, one model that obeys this condition was put forth by Ansumali and Karlin [3]. They considered a lattice Boltzmann model on a Cartesian grid in three dimensions (D = 3 and b = 27) with one rest particle having W = (2/3)³, six speed-one particles having W = (2/3)²(1/6), twelve speed-√2 particles having W = (2/3)(1/6)², and eight speed-√3 particles having W = (1/6)³. With these choices, we find
that the condition for Galilean invariance, Eq. (36), is indeed satisfied. We thus finally make contact between this class of Galilean-invariant entropic lattice Boltzmann models and that of Ansumali and Karlin [3]; such contact had heretofore been an outstanding theoretical problem. Moreover, the present analysis suggests generalizations of those authors' very clever model, since it is clear that there are many other ways to solve Eq. (36).

9. Conclusions
A previous analysis of the H function for Galilean-invariant entropic lattice Boltzmann models posed two outstanding theoretical challenges in its Conclusion section. The first was the need to find a more general theoretical framework that would render unnecessary the restriction to models for which a separate symmetry condition had to hold for particles of each speed present. The second was the need to make contact with the work of Ansumali and Karlin who have studied entropic lattice Boltzmann models with H functions of the form H = Cj”Nj In ( N j / W j ) , where the Wj are speed-dependent weights. In this work, we have met both of these two theoretical challenges. Indeed, the theoretical framework described in this model is powerful enough to subsume all previously known entropic lattice Boltzmann models for the incompressible Navier-Stokes equations, and to suggest new and heretofore unknown models of this type. The Tsallis entropic form has often been reported as arising from a lack of ergodicity, or a fractal spatiotemporal structure. There is no clear reason to believe that either of these ingredients are present in entropic lattice Boltzmann models, yet the Tsallis entropic form arises quite naturally from our mathematical development, and appears to be rather robust in that it holds for an entire family of such models. A clear and illuminating physical interpretation of our result, or at least a simpler mathematical explanation for it, remains an important outstanding challenge.
’
Acknowledgments BMB would like to thank the organizers of the 31st Workshop of the International School of Solid State Physics on Complexity, Metastability and Nonextensivity, held at the Ettore Majorana Foundation and Centre for Scientific Culture in Erice,
193 Sicily from 20-26 July 2004, for their hospitglity and for providing an excellent and congenial atmosphere conducive to scientific inquiry. Particular thanks are due to Professor Constantino Tsallis of the Centro Brasileiro de Pesquisas Fisicas and the Santa Fe Institute for helpful discussions and for noting the above-mentioned sign error in the earlier work. BMB was supported in part by the U.S. Air Force Office of Scientific Research under grant number FA9550-04-1-0176. He performed part of this work while visiting the Centre for Computational Science, Department of Chemistry, University College London as an EPSRC Visiting Fellow under RealityGrid contract GR/R67699.
References 1. B.M. Boghosian, P.J. Love, P.V. Coveney, S. Succi, I.V. Karlin, J. Yepez, “GalileanInvariant Lattice Boltzmann Models with H-Theorem,” Phys. Rev. E Rapid Communications 68 (2): Art. No. 025103 Part 2 (2003). 2. B.M. Boghosian, P.J. Love, J. Yepez, “Galilean-InvariantMulti-speed Entropic Lattice Boltzmann Models,” Physica D 193 (2003) 169-181. 3. S. Ansumali, I.V. Karlin, Phys. Rev.E 62 (2000)7999; S. Ansumali, I.V. Karlin, Phys. Rev. E 65 (2002) 056312. 4. R. Benzi, S. Succi, M. Vergassola, Phys. Reports, 222 (1992) 145. 5. S. Succi, “The Lattice Boltzmann Equation - For Fluid Dynamics and Beyond,” Oxford University Press (2001). 6. Y.-H. Qian, D. d’Humieres, P. Lallemand, Europhys. Lett. 17 (1992) 479.
UNIFYING APPROACH TO THE JAMMING TRANSITION IN GRANULAR MEDIA AND THE GLASS TRANSITION IN THERMAL SYSTEMS
A. CONIGLIO*~+,A. DE CANDIA*, A. FIERRO*~+, M. NICODEMI*~+,M. PICA CIAMARRA* AND M. TARZIA* Dipartimento di Scienze Fisiche, Universitci degli Studi d i Napoli “Federico 11”, INFM and INFN, Complesso Universitario d i Monte Sant ’Angelo, via Cinthia, 80126 Napoli, Italy INFM - Coherentia, Napoli, Italy We discuss some recent results on Statistical Mechanics approach to dense granular media. In particular, by analytical mean field investigation we derive the phase diagram of monodisperse and bydisperse granular assemblies. We show that “jamming” corresponds to a phase transition from a “fluid” to a “glassy” phase, observed when crystallization is avoided. The nature of such a “glassy” phase turns out to be the same found in mean field models for glass formers. This gives quantitative evidence to the idea of a unified description of the “jamming” transition in granular media and thermal systems, such as glasses.
1. Introduction
An important conceptual open problem concerning granular media, is the absence of an established theoretical framework where they might be described. Several methods and theories were put forward in the last years. Edwards1,2, in particular, proposed first that a Statistical Mechanics approach might be feasible t o describe dense granular media. He introduced the hypothesis that time averages of a system, exploring its mechanically stable states subject to some external drive (e.g., “tapping”), coincide with suitable ensemble averages over its “jammed states”. The Statistical Mechanics approach to dense granular media was later supported by observations from experiment^^!^ and simulations5f’ which suggested that when the system approaches stationarity during its “tapping” dynamics, its macroscopic properties are univocally characterized by a few control parameters and do not depend on the system initial configuration or dynamical protocol. Of course, the open problem remains to understand and predict the features of the “suitable” ensemble average for the system. This is a very important current research issue in granular media which has recently seen interesting contributions from both computer simulations and experiments. We discuss here the basic ideas in the Statistical Mechanics of dense granular
194
195 media at stationarity and recent results about its extensions. A central concept in this approach is the configurational entropy, SC,f = lnR, where R ( E , V ) is the number of mechanically stable states corresponding t o the volume V and energy E. From S,,, conjugated thermodynamic parameters can be derived: the compactivand the configurational temperature Tc;kj = aSc,f/dE. ity, X-l = aSc,f/aV, The “thermodynamic” parameters should completely characterize the macroscopic properties of the system, as much as pressure or ordinary temperatures in gases. Methods have been developed, thus, to measure these parameters by exploiting different techniques. In the stationary regime we consider here, for instance, one can show that Tc,,f can be related to an equilibrium Fluctuation-Dissipation (FD) T h e ~ r e m ~ , ~This , ~ , allows ~ , ~ ~a .simple evaluation of Tc,,j from measures, for example, of the sample bulk density (or height) and its fluctuations, taken in the stationary regime of, e.g., a tap dynamics. The knowledge of the system distribution function and its parameters can be exploited to depict a first theoretical comprehensive picture of the vast phenomenology of powders, ranging from their phase diagrams to segregation properties. This was partially accomplished in Ref.s6i11>12. A different a p p r ~ a c h ’ ~to~ measure ’ ~ ~ ~ an “effective temperature”, Tdyn,in granular media which are far from stationarity, is based on the out-ofequilibrium extension of the Fluctuation-Dissipation Theorem discovered in glassy theoryl59l6. Interestingly, it was shown14>5,10 that in the limit of small shaking amplitudes, T d y n coincides with the above “configurational temperature” TConf. We review below the basic ideas in the Statistical Mechanics of dense monodisperse granular media at stationarity and in such a framework derive their “phase diagram” in mean field approximation. 
This allows to discuss the nature of jamming in non-thermal sy~tems’~7’~ and the origin of its close connections to glassy phenomena in thermal ones2.
2. Statistical mechanics approach to granular media
Granular media are strongly dissipative systems not affected by temperature, since thermal fluctuations are usually negligible. Therefore, in absence of driving, the usual temperature of the external bath can be considered zero. Edwards’ suggested that, by gently shaking the system under the constraint of fixed volume V , the probability distribution, P,., over the mechanically stable states would be uniform. As usual in statistical mechanics, the knowledge of P,. allows t o make theoretical predictions substituting time averages with ensemble ones. Following Edwards’ original ideas, we suggested for a granular system under taps that at stationarity the probability, P,., t o find the system in a blocked state, r , satisfy the principle of maximum entropy, S = - C,.P,. In P,., with suitable macroscopic constraints. We use the canonical ensemble approach as experimentally the energy of the system under an external driving is not conserved. Thus, maximizing the entropy under the constraint that the system energy, E = C,.PrE,., is fixed, a generalized
196
Gibbs distribution is obtained:
P, 0: e-PEr (1) where p = %&@is a Lagrange multiplier enforcing the constraint on the energy, R(E) is the number of mechanically stable states with energy E, and Tc;Ar = ,B is the configurational temperature. In conclusion the partition function of the system is
C
Z=
e-oEr .II,,
(2)
TERTot
where R T represents ~ ~ all microstates. II, is 1 if the state T is mechanically stable and 0 otherwise. In general more than one Lagrange multiplier is necessary to assign the macroscopic status of the system. In particular for a hard sphere binary mixture under gravity we have found6 that at least two configurational temperatures must be introduced. In this case P, is obtained maximizing the entropy with two independent constraints on the gravitational energies of the two species of grains, El and E2. This gives two Lagrange multipliers:
P1 =
aInR(E1,Ez)
dEl
7
P2 =
dInR(El,E2)
(3)
where R(E1, Ez) is the number of mechanically stable states with energies respectively E l , E2. In this case the partition function of the system is
z=
C
. n,
e-P~E~r-P~E2r
(4)
VERTot
where again II, is 1 if the state T is mechanically stable and 0 otherwise. We note that Eq. (3) implies the existence of two distinct Lagrange multiplier, one for each species. This pose the question whether it is possible that the configurational temperature satisfy the zero principle of thermodynamics. We note that in this approach, in which the total energy is not conserved, the zero principle of thermodynamics does not necessarily hold. Indeed, in the previous example only if the total energy El Ez could be somehow kept constant, by maximizing the entropy one would obtain 01 = Pz. Note that in real systems the stationary states where these distributions are supposed to work may be very difficult to reach and many out-of-equilibrium effects appear. Nevertheless this approach, by allowing to apply all the techniques used in statistical mechanics, suggests possible interpretations of phenomena experimentally observed, and theoretical predictions which can be experimentally verified.
+
3. Monodisperse hard sphere model for granular materials
The simplest model for granular media we considered6 is a system of hard-spheres of equal diameter a0 = f i ,subjected to gravity. We have studied this model on a
197
time
Figure 1. Monte Carlo dynamics: the system is subjected to a sequence of “taps”. A “tap” is a period of time, of length TO (the t a p duration), during which the system evolves a t a finite bath temperature Tr (the t a p amplitude); after each “tap’ ’ the system evolves a t Tr = 0 and reaches a mechanically stable state. By cyclically repeating this procedure the system explores the space of its mechanically stable configuration
lattice, constraining the centers of mass of the spheres on the sites of a cubic lattice. The Hamiltonian of the system is:
where the height of site i is zi,g = 1 is gravity acceleration, m = 1 the grains mass, ni = 0,1 the usual occupancy variable (i.e., ni = 0 or 1if site i is empty of filled by a grain) and H ~ c ( { n ~an} )hard-core interaction term that prevents the overlapping of nearest neighbor grains (this term can be written as H ~ c ( { n i }=) J C(ij, ninj, where the limit J + co is taken). We perform a standard Metropolis algorithm on the system. The particles, initially prepared in a random configuration, are subject to taps (see Fig. l),each one followed by a relaxation process. During a tap, for a time TO (called tap duration), the temperature is set to the value Tr (called tap amplitude), so that particles have a finite probability, pup ecrnglTr, to move upwards. During the relaxation the temperature is set to zero, so that particles can only reduce the energy, and therefore can move only downwards. The relaxation stops when the system has reached a blocked state, where no grain can move downwards. Our measurements are performed at this stage when the shake is off and the system is at rest. The time, t , is the number of taps applied to the system. Under such a tap dynamics the systems reaches a stationary state where the Statistical Mechanics approach to granular media can be tested. In particular, it
-
198 has been verified6 that the ensemble averages of Eq. (1) coincide with time averages. 4. Mean field solution in the Bethe-Peierls approximation
Having shown in previous works6 that in the model Eq. (5) the partition function is given by Eq. (2), in the present section we show the phase diagram of the model, Eq. (5), obtained using a mean field theory in the Bethe-Peierls approximation (see19'20and refs therein), based on a random graph (plotted in Fig. 2) which keeps into account that the gravity breaks up the symmetry along the z axis. This lattice is made up by H horizontal layers (i.e., z E (1, ....H } ) . Each layer is a random graph of connectivity, k - 1 = 3. Each site in layer z is also connected to its homologous site in z - 1and z + 1 (the total connectivity is thus k + 1). Locally the graph has a tree-like structure but there are loops whose length is of order In N, insuring geometric frustration. In the thermodynamic limit only very long loops are present. The details of calculations are given in2' (see also Ref.sl1>l2where this mean field theory was first introduced). .................................
z+l
- .......................... ,-..........................
z
__....._ ................ ___,
...........................
...............................
z- 1
.'
Figure 2. In the mean field approximation, the grains are located on a Bethe lattice, sketched in the figure, where each horizontal layer is a random graph of given connectivity. Homologous sites on neighboring layers are also linked and the overall connectivity, c, of the vertices is c k+ 1 = 5.
We solve the recurrence equations found in the Bethe-Peierls approximation in three cases: 1) A fluid-like homogeneous phase; 2) a crystalline-like phase characterized by the breakdown of the horizontal translational invariance; 3) a glassy phase described by a 1-step Replica Symmetry Breaking (1RSB). The results of the calculations are summarized in Fig. 3, where the bulk density at equilibrium, @ = N , / ( 2 ( z ) - 1)22 (where (2) is the average height) is plotted as a function of the configurational temperature, Tconf, for a given value of the number of grains per unit surface, N,.We found that at high Tconf a homogeneous solution at T, a phase corresponding to the fluid-like phase is found. By lowering Tconf transition to a crystal phase (an anti-ferromagnetic solution with a breakdown of the translation invariance) occurs. The fluid phase still exist below T, as a metastable phase corresponding to a supercooled fluid when crystallization is avoided. Finally a lRSB solution (found with the cavity methodlg), characterized by the presence
199 of a large number of local minima in the free energylg, appears at TO,and becomes stable at a lower point T K ,where a thermodynamic transition from the supercooled fluid to a lRSB glassy phase takes place. The temperature To, which is interpreted in mean field as the location of a dynamical transition where the relaxation time diverges, in real systems might instead correspond to a crossover in the dynamics has a shape very similar to that observed in (see16,20,23 and Refs therein). @(Tconf) the “reversible regime” of tap experiment^^>^. The location of the glass transition, T K ,corresponds to a cusp in the function @(Tcmf). The dynamical crossover point To might correspond to the position of a characteristic shaking amplitude I?’ found in experiments and simulations where the “irreversible” and “reversible” regimes approximately meet.
Fluid
I
i
2
I
I
I
I
I , ,
2.5 Tconf
Figure 3. The density, = N , / ( 2 ( 2 ) - l), for N s = 0.6 as a function of T,,,f. amazis the maximum density reached by the system in the crystal phase.
5. Monte Carlo tap dynamics
The model, Eq. (5), simulated in 3d by means of Monte Carlo tap dynamics” presents a transition from a fluid to a crystal as predicted by the mean field approximation, density profiles in good agreement with the mean field ones, and in the fluid phase a large increase of the relaxation time as a function of the inverse tap amplitude. In the following section we study a more complex model for hard spheres, where an internal degree of freedom allows to avoid cry~tallization~’ In the following the tap duration is fixed, TO = lOMCsteps/particle, and
200
0
0
0
0.7
Figure 4. The bulk density, iP E N/L2(2(z) - l), is plotted as function of Tr for TO = 10 MCsteps/partzcZe. The empty circles correspond to stationary states, and the black stars to out of stationarity ones. iPmax is the maximum density reached by the system in the crystal phase, QmaX = 6/7.
different tap amplitudes, Tr, are considered. In Fig. 4 the bulk density, N / L 2 ( 2 ( z )- l),is plotted as a function of Tr: @(Tr)has a shape resembling that found in the “reversible regime’’ of tap experiments314, and moreover very similar to that obtained in the mean field calculations and shown in Fig. 3. At low shaking amplitudes (corresponding to high bulk densities) a strong growth of the equilibration time (i.e. the time necessary to reach stationarity) is observed, and for the lowest values here considered (the black stars in Fig. 4) the system remains out of stationarity. In conclusions the system here studied presents a jamming transition at low tap amplitudes as found in real granular media. In order to test the predictions of the mean field calculations, in the following we measure quantities usually important in the study of glass transition: The relaxation functions, the relaxation time and the dynamical susceptibility, connected to the presence a dynamical correlation length. The autocorrelation function has a behavior very similar to that found in usual glass-formers: At low values of the tap amplitudes, Tr, two-step decays appear, well fitted in the intermediate time region, by the 0-correlator predicted by the mode coupling theory for supercooled liquidsz5 and at long time by stretched exponentids’l . In Fig. 5 the relaxation time, 7,is plotted as a function of the tap amplitude, Tr: A clear crossover from a power law to a different regime is observed around a tap amplitude To. The power law divergence can be interpreted as a mean field behavior, followed by a hopping regime. We note that a similar behavior is found inz6 where the escape time of a system in a sub-diffusive medium has a similar
201
Figure 5. The relaxation time, T , as function of the t a p amplitude inverse, T;’. The dashed line is a power law, (Tr - T ~ ) - 7 2with , TD = 0.40 f 0.01 and 7 2 = 1.52 f0.10. The continuous line is an Arrhenius fit, eAITr, with A = 17.4f 0.5 (the data in this region are also well fitted by both a super-Arrhenius and Vogel-Fulcher laws).
shape as a function of the inverse diffusion coefficient (i.e. l / T ) . In this case the escape time obeys a generalized Arrhenius law. The divergence of the relaxation time at vanishing tap amplitude is consistent with the experimental data of Philippe and Bideau4 and D’Anna et aLZ7. Their findings are in fact consistent with an Arrhenius behavior as function of the experimental tap amplitude intensity. However a direct comparison with our data is not possible since we do not know the relation between the experimental tap amplitude and the tap amplitude in our simulations. In Fig. 6 we plot the dynamical non linear susceptibility, x(t)’l at different Tr, which exhibits a maximum at a time, t*(Tr).The presence of a maximum in the dynamical non linear susceptibility is typical of glassy system^^^,^'. In particular the value of the maximum, x ( t * ) ,diverges in the pspin model as the dynamical transition is approached from above, signaling the presence of a diverging dynamical correlation length. In the present case the value of the maximum increases as Tr decreases (except at very low Tr where the maximum seems t o decrease30). The growth of X ( t * ) in our model suggests the presence of a growing dynamical length also in granular media.
6. Conclusions In conclusions using standard methods of statistical mechanics we have investigated the jamming transition in a model for granular media. We have shown a deep connection between the jamming transition in granular media and the glass
202
Figure 6. The dynamical non linear susceptibility, X ( t ) , (normalized by X ( t o ) , the value at t o = 1) as a function of t , for tap amplitudes T y = 0.60, 0.50, 0.425,0.41,0.40,0.385,0.3825(from left to right).
transition in usual glass formers. As in usual glass formers the mean field calculations obtained using a statistical mechanics approach t o granular media predict a dynamical transition a t a finite temperature, T D ,and, at a lower temperature, T K , a thermodynamics discontinuous phase transition t o a glass phase. In finite dimensions 1) t h e dynamical transition becomes only a dynamical crossover as also found in usual glass formers15~20~23 (here the relaxation time, 7 ,as a function of both the density and the t a p amplitude, presents a crossover from a power law t o a different regime); and 2) the thermodynamics transition temperature, T K ,seems t o go t o zero (the relaxation time, 7,seems t o diverge only at Tr P 0, even if a very low value of the transition temperature is consistent with the data).
References 1. S. F. Edwards and R. B. S. Oakeshott, Physica A 157 (1989) 1080. A. Mehta and S. F. Edwards, Physica A 157 (1989) 1091; S.F. Edwards, in (Disorder in Condensed Matter Physics” p. 148, Oxford Science Pubs (1991); and in Granular Matter: a n interdisciplinary approach, (Springer-Verlag, New York, 1994), A. Mehta ed. 2. “Unifying concepts in granular media and glasses”, (Elsevier Amsterdam, 2004), Edt.s A. Coniglio, A. Fierro, H.J. Herrmann, M. Nicodemi. 3. J. B. Knight, C. G . Fandrich, C. N. Lau, H. M. Jaeger and S. R. Nagel, Phys. Rev. E 51 (1995) 3957; E. R. Nowak, J. B. Knight, E. Ben-Naim, H. M. Jaeger and S. R. Nagel, Phys. Rev. E 57 (1998) 1971; E. R. Nowak, J B. Knight, M. Povinelli, H. M. Jaeger and S. R. Nagel, Powder Technology 94 (1997) 79. 4. P. Philippe and D. Bideau, Europhys. Lett. 60 (2002) 677. 5. H. A. Makse and J. Kurchan, Nature 415 (2002) 614. 6. A. Fierro, M. Nicodemi and A. Coniglio, Europhys. Lett. 59 (2002) 642; Phys. Rev. E
203 66 (2002) 061301; Europhys. Lett. 60 (2002) 684. 7. J . J . Brey, A. Prados and B. Shchez-Rey, Physica A 275 (2000) 310. 8. D. S. Dean and A. Lefevre, Phys. Rev. Lett. 86 (2001) 5639. 9. G. Tarjus and P. Viot, Phys. Rev. E, 69:011307, 2004. 10. J . Berg, S. Franz and M. Sellitto, Eur. Phys. J. B 26 (2002) 349. 11. M. Taraia, A. de Candia, A. Fierro, M. Nicodemi, A. Coniglio, Europhys. Lett. 66, 531 (2004). 12. M. Tarzia, A. Fierro, M. Nicodemi, A. Coniglio, Phys. Rev. Lett. 93, 198002 (2004). 13. M. Nicodemi, Phys. Rev. Lett. 82 (1999) 3734. 14. A. Barrat et at., Phys. Rev. Lett. 85 (2000) 5034. 15. L. F . Cugliandolo and J. Kurchan. Phys. Rev. Lett., 71:173-176, 1993. 16. L. F. Cugliandolo, J. Kurchan, and L. Peliti. Phys. Rev. E, 55:3898-3914, 1997. 17. A. J . Liu and S. R. Nagel, Nature 396 (1998) 21. 18. C. S. O’Hern, S. A. Langer, A. J. Liu and S. R. Nagel, Phys. Rev. Lett. 86 (2001) 111. C. S. O’Hern, L. E. Silbert, A. J . Liu and S. R. Nagel, Phys. Rev. E 68, 011306 (2003). 19. M. MCzard and G. Parisi, Eur. Phys. J. B 20, 217 (2001). 20. G. Biroli and M. MBzard, Phys. Rev. Lett. 88,025501 (2002). 21. A. Fierro, M. Nicodemi, M. Tarzia, A. de Candia and A. Coniglio, cond-mat/O~l2l20. 22. In the case of uniform density profile, i.e. u ( z ) = const., we have u ( z ) = @ (where = N/L2(2(z) - 1)) below the maximum height and zero above. 23. C. Toninelli, G. Biroli, D. S. Fisher, Phys. Rev. Lett. 92, 185504 (2004). 24. M. PicaCiamarra, M. Tarzia, A. de Candia, and A. Coniglio, Phys. Rev. E 67,057105 (2003); Phys. Rev. E 68,066111 (2003). 25. W. Gotze, in Liquids, Freezing and Glass Transition, eds. J.P. Hansen, D. Levesque, and Zinn-Justin, Elsevier (1991). T . Franosch, M. Fuchs, W. Gotze, M.R. Mayr and A.P. Singh, Phys. Rev. E 55,7153 (1997). M. Fuchs, W. Gotze and M. R. Mayr, Phys. Rev. E 58,3384 (1998). 26. E. K. Lenzi, C. Anteneodo and L. Borland, Phys. Rev. E 63,051109 (2001). 27. G. D’Anna and G. Gremaud, Nature 413, 407 (2001); G. D’Anna, P. Mayor, A. 
Barrat, V. Loreto, and F. Nori, Nature 424,909 (2003). 28. S. Franz, C. Donati, G. Parisi and S. C. Glotzer, Philos. Mag. B 79,1827 (1999). C. Donati, S. Franz, S. C. Glotzer and G. Parisi, J. Non-cryst. Solids, 307,215 (2002) 29. S. C. Glotzer, V. N. Novikov, and T . B. S c h r ~ d e rJ, . Chem. Phys. 112,509 (2000). 30. Interestingly this anomalous behavior seems to occur around the crossover temperature TD previously calculated. The origin of this behavior, also observed in molecular dynamics simulations of a usual glass former2’, is still unclear.
SUPERSYMMETRY AND METASTABILITY IN DISORDERED SYSTEMS
IRENE GIARDINA, ANDREA CAVAGNA AND GIORGIO PARIS1 Institute for Complex Systems INFM-CNR and Department of Physics, University of Rome La Sapienza, P.le A . Moro 2, 00185 Rome, Italy E-mail:irene.giardinaQromal.infn.it The presence of metastable states is a well known feature of disordered systems and plays a crucial role in the slowing down of the dynamics and the occurrence of the glass transition. A deep understanding of the geometric structure of these states and its implications on the dynamical behaviour therefore represents a very important issue. We will show that when analyzing the properties of metastable states, and in particular their entropic contribution, a supersymmetry is revealed at a formal level, which has a clear physical interpretation. Systems that have different structures of metastable states seem t o behave differently in terms of this supersymmetry: for some of them the supersymmetry is obeyed, for others it is spontaneously broken. We will discuss the physical meaning of the supersymmetry breaking and its connection with the cavity method.
1. Introduction Disordered systems are often said to have a complex landscape, which may be thought as the fundamental reason of the non trivial behaviour exhibited at low temperature. This statement can actually be made more precise, and a whole series of analytical and numerical analysis have been performed t o show the deep connections between the topological structure of the energy (or, when it is possible, free energy) function and thermodynamical properties. One feature that seems ubiquitous is the presence of a huge number of metastable states, i.e. states that are locally stable or quasi-stable but have energy density higher than the equilibrium one. These states have been studied in details via analytical techniques and numerical simulations for a great variety of models. The general scenario that emerges is the following. There is an exponential number of metastable states, i.e. the number of metastable states at a given free energy density f scales as N ( f )= expNC(f), where the entropy C has a finite support f E [ f o , f t h ] , and is an increasing function being zero at fo (the ground-states) and maximal at f t h (the threshold highest energy states). This situation is quite generic, however there are cases where these metastable states do have a crucial role, and some others where they seem much less relevant. One can then distinguish two classes of systems: i) A first class, where metastable states influence the asymptotic dynamics. In
204
205 this case if the system is started at low temperature with a high energy (random) initial condition, with very high probability at long times it will remain trapped at the threshold level: the dynamics is consequently slowed down, until activation processes enable to cross barriers and explore lower energy regions. This activation time-scale may however be quite large, and for mean-field systems is infinite, meaning that the system never reaches the equilibrium energy. Another important feature, much related to the one above, concerns the behaviour in temperature: these systems exhibit a dynamical crossover (which becomes a true transition in mean-field) t o an activated Arrhenius-like behaviour at a certain temperature T d . Interestingly, Td is higher than the critical temperature where thermodynamical anomalies occur, and is therefore a purely dynamical phenomenon. Examples of systems belonging to this class, are pspin models glass forming super-cooled liquids of the fragile kind *, K-satisfiability problems for certain ranges of the control parameter ‘. ii) For systems in the second class instead, metastable states, despite having a finite entropy, are not relevant at all. They can sometimes be computed by analytical or numerical means, but do not influence the dynamical behaviour. In this case then, also at low temperature, the system is always able to asymptotically reach the equilibrium level at the bottom of the metastable states energy band. Dynamically, anomalies (i.e. divergence of the relaxation time) only occurr at the static transition, as one would naturally expect. Examples of systems in this class are mean-fiels models of spin-glass (e.g. the Sherrington-Kirkpatrick model 16); certain disordered problems on random graphs and satisfiability problems for K = 3 and low values of the control parameter ‘. An intriguing question is then why metastable states have such a different role in these two classes. 
What we have understood in the last years through a series of analytical works on mean-field models and numerical studies on finite dimensional ones, is that what really matters is the nature and the stability properties of the metastable states. For systems of the first class, one can show that metastable states are indeed locally stable, and therefore correspond to confining regions of the configuration space. Besides, their global structure is very robust and does not change much when external perturbation are applied to the system. On the other hand, the situation is completely different for models of the second class. In this case, metastable states are not “truly” stable: when a description in terms of some free-energy functional is possible (for example for mean-field systems) one can show that they have an almost soft-mode l 2 9 l 3 determining a quasi-instability. As a consequence, they are not completely confining. Also, some arguments indicate that they have small attraction basins 14. Finally, as we shall better discuss, their whole structure is very fragile to external perturbations. Interestingly, for mean-field models of spin-glasses, where analytical computations can be performed, these features of metastable states and the difference in their physical role are captured in an elegant formal description where the model
’,
206
belongs t o one class or to the other according to whether a certain supersymmetry is obeyed or not. To understand how this supersymmetry comes into play we have to explain more in details how to deal with metastable states and how t o investigate their properties. 2. Metastable states in mean-field spin glass models
Metastable states in mean-field spin glasses can be identified with the local minima of a mean-field free energy F (also known as the TAP free energy^5), which is a function of the local magnetizations m_i of the system. The number of local minima of F(m) can be written as

\mathcal{N} = \int \prod_i dm_i \; \prod_i \delta\big(\partial_i F(m)\big)\, \big|\det \partial_i \partial_j F(m)\big| , \qquad (1)

where the delta-functions enforce the stationarity of the free energy, \partial_i F(m) = 0, and the determinant of the second derivative (the Hessian) gives the appropriate normalization. By using an integral representation for the delta-functions and the determinant, the number \mathcal{N} can be expressed in terms of an effective action S given by^{10}

S = x_i\, \partial_i F(m) + \bar\psi_i \psi_j\, \partial_i \partial_j F(m) , \qquad (2)
where x_i is a standard commuting variable, while \bar\psi_i, \psi_i are anticommuting Grassmann variables, and sums over repeated indices are understood. This action S is invariant under a generalized form of the Becchi-Rouet-Stora-Tyutin supersymmetry^{6,7,8}, namely under the transformation \delta m_i = \epsilon\, \psi_i; \; \delta \bar\psi_i = -\epsilon\, x_i, where \epsilon is an infinitesimal Grassmann parameter. The physical meaning of this supersymmetry becomes clearer if we look at the Ward identities it generates. One of them, in particular, reads \langle m_i x_j \rangle + \langle \bar\psi_i \psi_j \rangle = 0, which, with some simple algebra, can be rewritten as^{10,11}

\frac{\partial \langle m_i \rangle}{\partial h_j} = \big\langle [\partial_i \partial_j F(m)]^{-1} \big\rangle . \qquad (3)
Here the brackets indicate an average over all metastable states, i.e. \langle \dots \rangle = (1/\mathcal{N}) \sum_\alpha \dots, with \alpha a state label. This relation is nothing else than the static, state-averaged fluctuation-dissipation theorem (FDT), and shows that this formal supersymmetry encodes an important physical property of the system.

3. Supersymmetry breaking and its physical interpretation
It turns out that, while for mean-field models belonging to the first class described in the Introduction this supersymmetry is always obeyed, for models of the second class it is in fact spontaneously broken when metastable states with high enough free energy are considered^{12,13,15}. However, as we have seen, this supersymmetry is intimately connected to a fluctuation-dissipation relation that we expect on physical
grounds to hold. How is it then possible that supersymmetry is broken, and what is the interpretation of this breaking? To answer this question we must understand under what circumstances such a general relation as the FDT of Eq. (3) may be violated. We have said that for models of the second class metastable states (at least most of them) have an almost soft mode in one direction. Indeed, recent studies^{12,13,15} show that at low temperatures metastable states, i.e. minima of the free-energy functional, are organized into minimum-saddle pairs. The minimum and the saddle are connected along a mode that is softer the larger the system size N, corresponding then to marginal directions in the thermodynamic limit. Also, their free-energy difference vanishes as N → ∞. One may wonder whether this peculiar structure and the presence of marginal modes may be related to FDT violations. Let us then start by looking at the fluctuation-dissipation relation for a single metastable state (we recall that in (3) we have an average over all the states). A metastable state is defined as a particular solution m of the stationarity equations ∂F(m)/∂m_i = 0. Let us add an external field h; the equation that identifies the metastable state then becomes ∂F(m)/∂m_i − h_i = 0. If we now differentiate with respect to h_j we get ∂m_i/∂h_j = [∂_i∂_j F(m)]^{-1}, i.e. the FDT. Thus, for each state, the FDT is a very natural relationship between the susceptibility and the curvature of the minima of the free energy. Note that this relation also holds when marginal modes are present, since both sides diverge. Thus, clearly, if a connection exists between features of metastable states and the validity of the average FDT, it must not concern the states considered individually, but rather their global structure: the FDT is always valid inside one state, even if marginal, but something goes wrong when we consider averages over all of them, such as in Eq. (3).
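The single-state relation ∂m_i/∂h_j = [∂_i∂_j F(m)]^{-1} is easy to verify numerically. The following Python sketch uses an illustrative two-component quartic free energy of our own choosing (not a TAP free energy): it solves the stationarity condition ∂F/∂m_i − h_i = 0 by Newton iteration and checks that a finite-difference susceptibility matches the inverse Hessian at the state.

```python
import numpy as np

# Illustrative free energy: F(m) = sum_i (m_i^4/4 - m_i^2/2) + c * m_0 * m_1
C = np.array([[0.0, 0.1], [0.1, 0.0]])  # weak coupling between components

def grad_F(m):
    return m**3 - m + C @ m

def hess_F(m):
    return np.diag(3 * m**2 - 1) + C

def find_state(h, m0):
    """Newton iteration for the stationarity condition grad F(m) - h = 0."""
    m = m0.copy()
    for _ in range(50):
        m -= np.linalg.solve(hess_F(m), grad_F(m) - h)
    return m

# locate the zero-field metastable state near m = (1, 1)
m_star = find_state(np.zeros(2), np.array([1.2, 1.2]))

# susceptibility chi_ij = dm_i/dh_j by finite differences...
delta = 1e-6
chi = np.empty((2, 2))
for j in range(2):
    h = np.zeros(2)
    h[j] = delta
    chi[:, j] = (find_state(h, m_star) - m_star) / delta

# ...equals the inverse Hessian of F at the state: the single-state FDT
assert np.allclose(chi, np.linalg.inv(hess_F(m_star)), atol=1e-4)
```

For this toy landscape the state itself is known in closed form (m_i = sqrt(0.9) on the symmetric branch), which makes the check fully reproducible.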
Indeed, another important consequence of marginality is, for these systems, an extreme fragility towards external perturbations. Since minima and saddles of the free energy come in pairs and are connected via an almost zero mode, it is clear that even an infinitesimal (i.e. of order 1/N) external field may destabilize some states: if the applied field is opposite to the local magnetization of the state and its intensity is of the same order as the free-energy difference with the nearby saddle, the minimum-saddle pair will "merge" and the state will disappear. On the other hand, virtual states, i.e. inflection points of the free energy with a very small second derivative, may be stabilized by the field, giving rise to pairs of new states. Thus, the number of states may change dramatically when an external field is applied. In such a situation we can then understand what may go wrong when doing averages. If we want to compute the average FDT, as in (3), we must start from the average magnetization:

\langle m_i \rangle = \frac{1}{\mathcal{N}} \sum_\alpha m_i^\alpha . \qquad (4)
Then, to get the l.h.s. of equation (3) we must differentiate with respect to a magnetic field. The problem is that, due to marginality, some elements in the sum
defining the average magnetization may disappear or appear as the field goes to zero. More precisely, we have

\frac{\partial \langle m_i \rangle}{\partial h_j} = \lim_{h_j \to 0} \frac{1}{h_j} \left[ \frac{1}{\mathcal{N}(h)} \sum_{\alpha \in \mathcal{A}(h)} m_i^\alpha(h) \;-\; \frac{1}{\mathcal{N}(0)} \sum_{\alpha \in \mathcal{A}(0)} m_i^\alpha(0) \right] , \qquad (5)

where \mathcal{A}(h) denotes the set of metastable states at field h and \mathcal{N}(h) their number.
The key point is that the elements of the two sums in the relation above may differ, because of the action of the field. Therefore, even though the static fluctuation-dissipation relation holds for each individual state, when we sum over all states an anomalous contribution arises, due to the instability of the whole structure with respect to the field, and relation (3) is thus violated. Supersymmetry breaking is thus the mathematical expression of a great instability in the structure of metastable states. It also has important consequences for a well-known and widely applied approach to disordered systems known as the cavity method^{18,1}. The basic assumption of this method is that by adding one extra degree of freedom (for example, a spin) to a large system, its physical properties do not change dramatically, and that it is therefore possible to write recursive equations connecting the system with one extra spin to the old one. In the thermodynamic limit, these relations become self-consistent equations for the physical observables, for example for the distribution of the local magnetizations P(m_i). Unfortunately, the main assumption of this method, that is, stability under the addition of one new degree of freedom, is no longer valid if the supersymmetry is broken. In fact, in this case, even the small field produced by the new spin and acting on the rest of the system may, as we have discussed, completely change the structure of metastable states of the system. When this happens, it becomes harder to write equations connecting the old and new properties of the system. For example, this can no longer be done for the distribution P(m_i). However, the cavity approach can be modified to deal with this problem. The crucial point is that the effect of a new spin is somehow analogous to that of an external field. Thus, in a way, a system with one more spin can be compared with a system that has one spin less and an appropriate field acting on it.
Or, in other words, if one considers a system in the presence of an external magnetic field, the addition of one spin can be balanced by appropriately tuning the field. As a result, self-consistency equations can be written for P(m_i|h_i), i.e. for the conditional distribution of the local magnetizations at given external magnetic field^{19}.
References
1. M. Mézard, G. Parisi and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore (1987).
2. See, e.g., M. Mézard, Physica A 306, 25 (2002).
3. See, e.g., O. C. Martin, R. Monasson, R. Zecchina, Theoretical Computer Science 265, 3 (2001).
4. A. Montanari, G. Parisi, F. Ricci-Tersenghi, Preprint cond-mat/0308147 (2003).
5. D. J. Thouless, P. W. Anderson and R. G. Palmer, Phil. Mag. 35, 593 (1977).
6. C. Becchi, A. Rouet and R. Stora, Comm. Math. Phys. 42, 127 (1975); I. V. Tyutin, Lebedev preprint FIAN 39 (1975).
7. G. Parisi and N. Sourlas, Phys. Rev. Lett. 43, 744 (1979).
8. J. Kurchan, J. Phys. A: Math. Gen. 24, 4969 (1991).
9. A. Cavagna, J. P. Garrahan and I. Giardina, J. Phys. A 32, 711 (1998).
10. A. Cavagna, I. Giardina, G. Parisi and M. Mézard, J. Phys. A 36, 1175 (2003).
11. A. Crisanti, L. Leuzzi, G. Parisi, and T. Rizzo, Phys. Rev. B 68, 174401 (2003).
12. T. Aspelmeier, A. J. Bray, M. A. Moore, Phys. Rev. Lett. 92, 087203 (2004).
13. A. Cavagna, I. Giardina and G. Parisi, Phys. Rev. Lett. 92, 120603 (2004).
14. M. Moore, private communication.
15. G. Parisi and T. Rizzo, Preprint cond-mat/0401509.
16. D. Sherrington and S. Kirkpatrick, Phys. Rev. Lett. 32, 1792 (1975).
17. A. J. Bray and M. A. Moore, J. Phys. C 13, L469 (1980).
18. M. Mézard, G. Parisi and M. A. Virasoro, Europhys. Lett. 1, 77 (1986).
19. A. Cavagna, I. Giardina and G. Parisi, Phys. Rev. B, to be published (February 2005).
THE METASTABLE LIQUID-LIQUID PHASE TRANSITION: FROM WATER TO COLLOIDS AND LIQUID METALS.
GIANCARLO FRANZESE
Departament de Fisica Fonamental, Universitat de Barcelona, Diagonal 647, 08028 Barcelona, Spain, E-mail: gfranzese@ub.edu

H. EUGENE STANLEY
Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215 USA, E-mail: [email protected]

The possibility of a liquid-liquid (LL) phase transition has been proposed as an interpretation of the anomalous behavior of liquids such as water, whose isobaric density has a maximum for decreasing temperature. By using theoretical models and numerical simulations, we show (i) that this property for molecular liquids implies the existence of a LL critical point, and (ii) that the existence of a LL critical point does not imply the anomalous density behavior in systems whose effective potential resembles an isotropic soft-core attractive potential, such as protein solutions, colloids, star-polymers and, to some extent, liquid metals.
1. Water Anomalies and Their Interpretations
We all know that ice floats on water, while the solid forms of normal substances are denser than their liquid forms. This anomalous property of water is a manifestation of the density maximum at 4 °C at ambient pressure. Many other anomalies are known for water, especially in the supercooled liquid region, where the liquid is metastable with respect to the crystal. For example, the absolute magnitudes of the isobaric heat capacity^1, isothermal compressibility^2, and thermal expansivity^3 appear as if they might diverge to infinity at a temperature of about −45 °C, while in normal liquids all three response functions decrease as temperature decreases. These anomalies have been interpreted in different ways. (i) The stability-limit interpretation^4 assumes that in the pressure-temperature (P-T) plane the limits of stability of supercooled, superheated and stretched liquid water form a single retracing spinodal line. However, no experimental or numerical evidence of a retracing spinodal has thus far been found^5. (ii) The singularity-free interpretation^{6,7} predicts no retracing spinodal and envisages that the experimental data represent apparent singularities, due to anticorrelated fluctuations of volume and entropy, responsible for the anomalies. (iii) Finally, the liquid-liquid (LL) phase transition hypothesis^8 proposes the presence of a first-order line of phase transitions, possibly ending in a critical point,
separating two liquids differing in density, the high-density liquid (HDL) and the low-density liquid (LDL), and responsible for the anomalies. The last two interpretations can be recovered within the same model by smoothly tuning a parameter, i.e. they could be complementary, describing different physical situations^{9,10,11}. The two interpretations seem to suggest a one-way implication going from the LL phase transition to the density anomaly: the LL phase transition hypothesis apparently implies the presence of the density anomaly, but the singularity-free interpretation suggests that the density anomaly does not imply the occurrence of a LL phase transition. Understanding the relation between the density anomaly and the LL phase transition is relevant to understanding this transition, and to predicting in which systems a LL phase transition can be found experimentally. The interest in this issue has been renewed by recent experimental evidence of LL phase transitions in one-component systems such as phosphorus^{12}, triphenyl phosphite^{13} and Y2O3-Al2O3^{14}.

2. Does the Density Anomaly Imply the LL Phase Transition?
To answer this question, we consider a model for a water-like molecular fluid with density anomaly and with intermolecular and intramolecular interactions^{11}. To mimic the density anomaly we assume, motivated by experimental observations^{11}, that the formation of a hydrogen bond (HB) leads to an expansion in local volume^7, V = V_0 + N_HB v_HB. Here V_0 is the volume of the liquid with no HBs, N_HB = Σ_⟨i,j⟩ n_i n_j δ_{σ_ij,σ_ji} is the total number of HBs in the system, and v_HB is the specific volume per HB. We partition the fluid into N cells of equal size and associate a variable n_i with each cell i = 1, …, N, where n_i = 1 if a molecule occupies the cell and n_i = 0 otherwise. The cells have a size comparable to that of a water molecule, and the molecules have four arms, one per possible HB. Experiments^{11} show that the relative orientations of the arms of a water molecule are correlated, suggesting an orientational intramolecular interaction between the arms, with a finite interaction energy. Hence, we introduce the Hamiltonian^{11}

H = -\epsilon \sum_{\langle i,j \rangle} n_i n_j \;-\; J \sum_{\langle i,j \rangle} n_i n_j\, \delta_{\sigma_{ij},\sigma_{ji}} \;-\; J_\sigma \sum_i n_i \sum_{(k,l)_i} \delta_{\sigma_{ik},\sigma_{il}} .

The first two terms describe the isotropic and orientational contributions, respectively, of the intermolecular interaction, where ε > 0 is the van der Waals attraction energy for molecules in nearest-neighbor (NN) cells, summed over all the possible NN cells, and J > 0 is the energy gain per HB formed between molecules in NN cells. The Potts variable σ_ij = 1, …, q, with a finite number q of possible orientations, represents, for the molecule in cell i, the orientation of the arm facing cell j, with molecules in NN cells forming a HB only if they are correctly oriented, i.e. if δ_{σ_ij,σ_ji} = 1 (δ_{a,b} = 1 if a = b and δ_{a,b} = 0 otherwise). The third term accounts for the intramolecular interaction, with an energy gain J_σ > 0 for each of the (4 choose 2) = 6 different pairs (k,l)_i of arms of the same molecule
i with the appropriate orientation (δ_{σ_ik,σ_il} = 1). For J_σ = 0 we recover the model introduced in Ref. 7, which predicts the singularity-free scenario. We perform analytic calculations in mean-field approximation^{11} and off-lattice Monte Carlo (MC) simulations^{11}, finding for J_σ > 0 a LL phase transition, ending in a critical point whose temperature decreases to zero as J_σ vanishes (Fig. 1)^{11}. General considerations suggest that the liquid-liquid phase transition for water occurs below the glass temperature, i.e. outside the accessible experimental range^{11}. Therefore, this model predicts that the singularity-free scenario for anomalous liquids is strictly valid only for molecular liquids with vanishing intramolecular interaction, while for a finite intramolecular interaction the density anomaly implies a LL phase transition. Once this implication is established, one can explore the implication in the other direction, in order to clarify whether experimental investigation of a possible LL phase transition should be limited only to anomalous liquids.
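The three terms of the Hamiltonian above can be sketched in a few lines of Python. The array layout (four arms per cell, indexed 0/1 for the right/left-facing arm and 2/3 for the up/down-facing arm, periodic boundaries) and the parameter values are our own illustrative choices, not taken from the paper.

```python
import itertools
import numpy as np

def energy_and_volume(n, arms, eps=1.0, J=0.5, J_sigma=0.05, V0=1.0, v_HB=0.5):
    """Energy of the water-like cell model and volume V = V0 + N_HB * v_HB.

    n    : (L, L) occupancies (0 or 1)
    arms : (L, L, 4) Potts orientation of each arm; 0/1 face the right/left
           neighbor, 2/3 face the up/down neighbor (periodic boundaries).
    """
    L = n.shape[0]
    E, N_HB = 0.0, 0
    for x in range(L):
        for y in range(L):
            if not n[x, y]:
                continue
            # NN bonds to the right and upward neighbor (each pair counted once)
            for (dx, dy, a, b) in [(1, 0, 0, 1), (0, 1, 2, 3)]:
                xn, yn = (x + dx) % L, (y + dy) % L
                if n[xn, yn]:
                    E -= eps                       # isotropic van der Waals term
                    if arms[x, y, a] == arms[xn, yn, b]:
                        E -= J                     # HB: facing arms match
                        N_HB += 1
            # intramolecular term: the 6 arm pairs of the same molecule
            for k, l in itertools.combinations(range(4), 2):
                if arms[x, y, k] == arms[x, y, l]:
                    E -= J_sigma
    return E, V0 + N_HB * v_HB
```

As a quick check: for a fully occupied 3×3 lattice with all arms in the same Potts state, every one of the 2L² = 18 NN bonds carries a HB, so E = −18(ε + J) − 6·9·J_σ and V = V0 + 18 v_HB.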
3. Does a LL Phase Transition Imply the Density Anomaly?

To answer this question we investigate^{15}, by molecular dynamics (MD) simulations^{15,16}, integral equations^{15,17} and a modified van der Waals approach^{16}, the phase behavior of an isotropic soft-core attractive potential for a single-component system in 3 dimensions (inset of Fig. 2), similar to potentials used to describe systems such as colloids, protein solutions or, to some extent, liquid metals. For a large-enough attractive range we find a gas-LDL phase transition and, at lower T, higher ρ and higher P, a LDL-HDL phase transition^{16}. For a short attractive range and a small repulsive shoulder (Fig. 2), we predict a phase diagram with a gas-LDL phase transition and a gas-HDL phase transition, both ending in critical points and both metastable with respect to the crystal phase. The latter phase diagram is reminiscent of the experimental phase diagram for fluid phosphorus, showing a gas-LDL phase transition and a gas-HDL phase transition^{12}. In all the cases we studied, this potential does not show a density anomaly, so our work shows that, at least theoretically, the occurrence of a LL critical point, or more generally the coexistence of two liquids, does not necessarily imply the presence of a density anomaly. This result suggests the possibility of finding a LL phase transition in a class of systems wider than liquids with density anomalies.

4. Conclusions
Our results for a water-like model predict that a liquid with a density anomaly and non-zero intramolecular interaction has a LL phase transition, which may be pre-empted by inevitable freezing. This conclusion applies to network-forming systems, such as water or silica^{18}. Recently an analogous conclusion has been reached by other authors with a different approach^{19}. On the other hand, by studying an isotropic soft-core attractive potential, we show^{15,16,17} that a system with a LL phase transition, and possibly a LL critical
[Figure 1 appears here; its axes are pressure versus temperature T/ε, with the TMD (temperature of maximum density) line marked. The caption follows below Fig. 1.]
point, does not necessarily have a density anomaly. Since this kind of potential describes systems like colloids, solutions of biomolecules and liquid metals, within specific approximations, it is reasonable to predict that a LL phase transition could be experimentally investigated in these systems. In particular, we found results reminiscent of the recently investigated experimental phase diagram of phosphorus^{12}. We thank our collaborators, S. V. Buldyrev, G. Malescio, M. I. Marqués, F. Sciortino, A. Skibinsky, and M. Yamada. We thank the Ministerio de Ciencia y Tecnología (Spain) and NSF Chemistry Program CHE0096892 and CHE0404673 for support.
References
1. C. A. Angell, M. Oguni, and W. J. Sichina, J. Phys. Chem. 86, 998 (1982).
2. R. J. Speedy and C. A. Angell, J. Chem. Phys. 65, 851 (1976).
3. D. E. Hare and C. M. Sorensen, J. Chem. Phys. 87, 4840 (1987).
4. R. J. Speedy, J. Phys. Chem. 86, 3002 (1982); M. C. D'Antonio and P. G. Debenedetti, J. Chem. Phys. 86, 2229 (1987).
5. S. Sastry, F. Sciortino, and H. E. Stanley, J. Chem. Phys. 98, 9863 (1993).
6. H. E. Stanley and J. Teixeira, J. Chem. Phys. 73, 3404 (1980).
7. S. Sastry et al., Phys. Rev. E 53, 6144 (1996); L. P. N. Rebelo et al., J. Chem. Phys. 109, 626 (1998); E. La Nave et al., Phys. Rev. E 59, 6348 (1999).
8. P. H. Poole et al., Nature (London) 360, 324 (1992).
9. P. H. Poole et al., Phys. Rev. Lett. 73, 1632 (1994); S. S. Borick et al., J. Phys. Chem. 99, 3781 (1995); C. J. Roberts et al., Phys. Rev. Lett. 77, 4386 (1996); C. J. Roberts and P. G. Debenedetti, J. Chem. Phys. 105, 658 (1996); T. M. Truskett et al., ibid. 111, 2647 (1999).
10. G. Franzese and H. E. Stanley, J. Phys. Condens. Matter 14, 2201 (2002);
Figure 2. The MD P-ρ phase diagram for the potential in the inset (with U_R/U_A = 0.5, w_R/a = 1, w_A/a = 0.2), showing the gas phase, the LDL phase and the HDL phase, with a gas-LDL critical point (C₁) and a gas-HDL critical point (C₂) and the corresponding coexistence regions; the horizontal axis is the number density ρ (in units of a⁻³). Panel (b) is a blow-up of panel (a) in the vicinity of C₁.
11. G. Franzese, M. I. Marqués, and H. E. Stanley, Phys. Rev. E 67, 011103 (2003).
12. Y. Katayama et al., Nature (London) 403, 170 (2000); Science 306, 848 (2004); G. Monaco et al., Phys. Rev. Lett. 90, 255701 (2003).
13. R. Kurita and H. Tanaka, Science 306, 845 (2004).
14. S. Aasland and P. F. McMillan, Nature 369, 633 (1994); M. C. Wilding et al., J. Non-Cryst. Solids 297, 143 (2002).
15. G. Franzese et al., Nature 409, 692 (2001); Phys. Rev. E 66, 051206 (2002).
16. A. Skibinsky et al., Phys. Rev. E 69, 061206 (2004).
17. G. Malescio et al., cond-mat/0412159 (2004).
18. I. Saika-Voivod, F. Sciortino, and P. H. Poole, Phys. Rev. E 63, 011202 (2001).
19. F. Sciortino, E. La Nave, and P. Tartaglia, Phys. Rev. Lett. 91, 155701 (2003).
OPTIMIZATION BY THERMAL CYCLING
A. MÖBIUS
Leibniz Institute for Solid State and Materials Research Dresden, PF 270116, D-01171 Dresden, Germany
E-mail: [email protected]
K. H. HOFFMANN
TU Chemnitz, Institute of Physics, D-09107 Chemnitz, Germany
E-mail: [email protected]

C. SCHÖN
Max Planck Institute for Solid State Research, D-70569 Stuttgart, Germany
E-mail: [email protected]
Thermal cycling is a heuristic optimization algorithm which consists of cyclically heating and quenching by Metropolis and local search procedures, respectively, where the amplitude slowly decreases. In recent years, it has been successfully applied to two combinatorial optimization tasks, the traveling salesman problem and the search for low-energy states of the Coulomb glass. In these cases, the algorithm is far more efficient than usual simulated annealing. In its original form the algorithm was designed only for the case of discrete variables. Its basic ideas are, however, also applicable to a problem with continuous variables, the search for low-energy states of Lennard-Jones clusters.
1. Introduction
Optimization problems with large numbers of local minima occur in many fields of physics, engineering, and economics. They are closely related to statistical physics, see e.g. Ref. 1. In the case of discrete variables, such problems often arise from combinatorial optimization tasks. Many of them are difficult to solve since they are NP-hard, i.e., there is no algorithm known which finds the exact solution with an effort proportional to any power of the problem size. One of the most popular such tasks is the traveling salesman problem: how to find the shortest roundtrip through a given set of cities^2. Many combinatorial optimization problems are of considerable practical importance. Thus, algorithms are needed which yield good approximations of the exact solution within a reasonable computing time, and which require only a modest effort in programming. Various deterministic and probabilistic approaches, so-called search heuristics, have been proposed to construct such approximation algorithms. A considerable part of them borrows ideas from physics and biology. Thus simulated annealing and relatives such as threshold accepting, as well as various genetic algorithms, have successfully been applied to many problems. Particularly effective seem to be genetic algorithms in which the individuals are local minima^{5,6}. For recent physically motivated heuristic approaches we refer to thermal cycling^7, optimization by renormalization^8, and extremal optimization^9. For problems with continuous variables, approaches which combine Monte-Carlo procedures for global search with deterministic local search by standard numerical methods, for example the basin-hopping algorithm, have proved to be particularly efficient^{10,11}. They can be considered as relatives of the genetic local search approaches for the case of discrete variables. Here we focus on the thermal cycling algorithm and illuminate the reasons for its efficiency.
2. Thermal cycling algorithm

Simulated annealing can be understood as a random journey of the sample (i.e. the approximate solution) through a hilly landscape formed by the states of its configuration space. The altitude, in the sense of a potential energy, corresponds to the quantity to be optimized. In the course of the journey, the altitude region accessible with a certain probability within a given number of steps shrinks gradually due to the decrease of the temperature in the Metropolis simulation involved. The accessible area, i.e., the corresponding configuration space volume, thus shrinks until the sample gets trapped in one of the local minima.
Figure 1. Time dependence of the energy E (quantity to be optimized) of the sample currently treated in the cyclic process. Gaps in the curve refer to cycles where the final state has a higher energy than the initial state, so that the latter is used as the initial state of the next cycle too.
Deep valleys attract the sample mainly by their area. However, it is tempting to make use of their depth as well. For that, we substitute the slow cooling by a cyclic process: First, starting from the lowest state obtained so far, we randomly deposit energy into the sample by means of a Metropolis process with a certain temperature T, which is terminated, however, after a small number of steps. This
part is referred to as heating. Then we quench the sample by means of a local search algorithm. Heating and quenching are cyclically repeated, where the amount of energy deposited in a cycle decreases gradually, see Fig. 1. This process continues until, within a 'reasonable' CPU time, no further improvement can be found. It is an essential feature of the thermal cycling algorithm that two contradicting demands are met in heating: the gains of the previous cycles have to be retained, but the modifications must be sufficiently large, so that another valley can be reached. Thus the heating process has to be terminated in an early stage of the equilibration. An effective method is to stop it after a fixed number of successful Metropolis steps. The efficiency of the proposed algorithm depends to a large extent on the move class considered in the local search procedure. For discrete optimization problems, it is a great advantage of our approach that far more complex moves can be taken into account than in simulated annealing, so that the number of local minima is considerably reduced. The local search concerning complex moves can be enormously sped up by use of branch-and-bound type algorithms. Their basic idea is to construct new trial states following a decision tree: At each branching point, a lower bound of the energy of the trial state is calculated. The search within the current branch is terminated as soon as this bound exceeds the energy of the initial state. The basic thermal cycling procedure can easily be accelerated in three ways: (i) partition of the computational effort into several search processes in order to minimize the failure risk^{14}, (ii) restricting the moves in heating to the 'sensible sample regions' by analyzing previous cycles, or by comparing with samples considered in parallel, and (iii) combining parts of different states^{7,12}.
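The heat/quench cycle can be sketched in a few lines of Python for the traveling salesman problem, using segment reversals as the only move (a simplified version of the move classes discussed in the next section). Parameters such as the number of cycles, the initial temperature, and the cooling factor are our own illustrative guesses, not tuned values from the paper.

```python
import math
import random

def tour_length(tour, D):
    return sum(D[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def quench(tour, D):
    """Local search: accept any improving segment reversal until none is left."""
    improved = True
    while improved:
        improved = False
        for i in range(len(tour) - 1):
            for j in range(i + 2, len(tour)):
                trial = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(trial, D) < tour_length(tour, D) - 1e-12:
                    tour, improved = trial, True
    return tour

def heat(tour, D, T, n_accept=3, max_try=2000):
    """Metropolis steps at temperature T, stopped after a few accepted moves."""
    accepted = tries = 0
    while accepted < n_accept and tries < max_try:
        tries += 1
        i, j = sorted(random.sample(range(len(tour)), 2))
        if j - i < 2:
            continue
        trial = tour[:i] + tour[i:j][::-1] + tour[j:]
        dE = tour_length(trial, D) - tour_length(tour, D)
        if dE < 0 or random.random() < math.exp(-dE / T):
            tour, accepted = trial, accepted + 1
    return tour

def thermal_cycling(D, n_cycles=40, T0=1.0, cooling=0.9):
    best = quench(list(range(len(D))), D)
    T = T0
    for _ in range(n_cycles):
        trial = quench(heat(best, D, T), D)
        if tour_length(trial, D) < tour_length(best, D):
            best = trial   # retain the gain of this cycle
        # otherwise the next cycle restarts from the old best state
        T *= cooling       # slowly decrease the heating amplitude
    return best
```

On a small random instance this typically returns a tour no longer than the one found by a single quench, since the best state is only ever replaced by a strict improvement.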
3. Applications

The thermal cycling algorithm was first tested on the traveling salesman problem^7. For that, we considered problems of various sizes from the TSPLIB95^{15} for which the exact solutions, or at least related bounds, are known. Fig. 2 gives a comparison of thermal cycling data with results from simulated annealing and from repeated local searches starting from random states. For a meaningful characterization of the algorithms, it relates mean deviations from the optimum tour length to the CPU-time effort for various parameter values. The diagram includes data for two move classes: (a) cutting a roundtrip twice, reversing the direction of one of its parts, and connecting the parts again, or shifting a city from one position in the roundtrip to another; (b) same as (a) and additionally rearrangements by up to four simultaneous cuts as well as Lin-Kernighan realignments^{16}. Fig. 2 shows that, for the traveling salesman problem, thermal cycling is clearly superior to simulated annealing already if the same move class is considered in both procedures - the simulated annealing code had been carefully tuned too. However, when taking advantage of the possibility to incorporate more complex moves, thermal cycling beats simulated annealing by orders of magnitude in CPU time. Applied to an archive of samples instead of to a single one, it can compete
Figure 2. Relation between CPU time, τ_CPU (in seconds for one PA8000 180 MHz processor of an HP K460), and deviation, δL = L_mean − 27686, of the obtained mean approximate solution from the optimum tour length for the Padberg-Rinaldi 532 city problem^7. ○: repeated quench to stability with respect to move class (a) defined in the text; Δ: simulated annealing; × and ■: thermal cycling with ensembles of various size, and local search concerning move classes (a) and (b), respectively. In all cases, averages were taken from 20 runs. Errors (1σ region) are presented if they exceed the symbol size. The lines are guides to the eye only.
Figure 3. Mean deviation of the energy of the lowest state found from the ground-state energy, δE = E_mean − E_ground state, related to the CPU time τ_CPU (180 MHz PA8000 processor of an HP K460) for one realization of the three-dimensional Coulomb glass lattice model with 1000 sites, half filling, and medium disorder strength^{13}. ×: simulated annealing; Δ: multistart local search considering simultaneous occupation changes of up to four sites; ■: thermal cycling. For simulated annealing and multistart local search, averages were taken from 20 runs; for thermal cycling, from 100 runs. In thermal cycling, the ground state was always found within 500 seconds.
with leading genetic local search algorithms^{7,12}. For several years, we have used thermal cycling as a standard approach in numerical investigations of the Coulomb glass, which is basically an Ising model with long-range interactions. Also in this case, thermal cycling proved to be a very efficient tool. Fig. 3 presents data from an investigation comparing several algorithms^{13}. In simulated annealing, we could efficiently treat only particle exchange with a reservoir and one-particle hops inside the sample, that is, occupation changes of one or two sites. However, in the deterministic local search, the simultaneous occupation modification of up to four sites could be considered by means of branch-and-bound approaches. Therefore, the corresponding multistart local search yields significantly better results than simulated annealing. Thermal cycling of the low-energy states proves to be still far more efficient than the local search repeatedly starting from random states. It is tempting to apply the thermal cycling approach also to problems with continuous variables. Thus we have considered Lennard-Jones clusters of various sizes, because the energy landscapes of this system are known to have large numbers of local minima. The heating consisted of simultaneously shifting all atoms by small distances a few times according to a thermal rejection rule, and the quench combined the Powell algorithm with a systematic consideration of symmetry positions. The ground states of several clusters of up to 150 atoms could be reproduced within 'reasonable' CPU times. Further related investigations should be promising.
References
1. Y. Usami and M. Kitaoka, Int. J. Mod. Phys. B 11, 1519 (1997).
2. D. S. Johnson and L. A. McGeoch, in Local Search in Combinatorial Optimization, eds. E. Aarts and J. K. Lenstra (Wiley, Chichester, 1997), p. 215.
3. S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, Science 220, 671 (1983).
4. J. H. Holland, Adaptation in Natural and Artificial Systems: an Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (University of Michigan Press, Ann Arbor, 1975).
5. R. M. Brady, Nature 317, 804 (1985).
6. P. Merz and B. Freisleben, in Proc. 1997 IEEE Int. Conf. on Evolutionary Computation, Indianapolis (IEEE Press, 1997), p. 159.
7. A. Möbius, A. Neklioudov, A. Díaz-Sánchez, K. H. Hoffmann, A. Fachat and M. Schreiber, Phys. Rev. Lett. 79, 4297 (1997).
8. J. Houdayer and O. C. Martin, Phys. Rev. Lett. 83, 1030 (1999).
9. S. Boettcher and A. G. Percus, Phys. Rev. Lett. 86, 5211 (2001).
10. D. J. Wales and J. P. K. Doye, J. Phys. Chem. A 101, 5111 (1997).
11. M. Iwamatsu and Y. Okabe, Chem. Phys. Lett. 399, 396 (2004).
12. A. Möbius, B. Freisleben, P. Merz and M. Schreiber, Phys. Rev. E 59, 4667 (1999).
13. A. Díaz-Sánchez, A. Möbius, M. Ortuño, A. Neklioudov and M. Schreiber, Phys. Rev. B 62, 8030 (2000).
14. B. A. Huberman, R. M. Lukose and T. Hogg, Science 275, 51 (1997).
15. www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95
16. S. Lin and B. Kernighan, Operations Research 21, 498 (1973).
ULTRA-THIN MAGNETIC FILMS AND THE STRUCTURAL GLASS TRANSITION: A MODELLING ANALOGY
S. A. CANNAS AND F. A. TAMARIT Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina E-mail: [email protected] and [email protected] P. M. GLEISER Centro Atómico Bariloche, San Carlos de Bariloche, 8400 Río Negro, Argentina E-mail: gleiser@cab.cnea.gov.ar
D. A. STARIOLO Departamento de Física, Universidade Federal do Rio Grande do Sul, CP 15051, 91501-979, Porto Alegre, Brazil E-mail: stariolo@if.ufrgs.br In this work we study a two dimensional Ising model for ultrathin magnetic films which presents competition between short range ferromagnetic interactions and long-range antiferromagnetic dipolar interactions. We present evidence that the dynamical and thermodynamical properties of the model allow for an alternative interpretation, in terms of glass forming liquids. In particular, the existence of a first order phase transition between a low temperature crystal-like ordered phase and a high temperature liquid-like disordered phase, which can be supercooled below the melting point, together with a drastic slowing down after a quench to low temperatures, suggests that these materials could present a phenomenology similar to that observed in glass forming liquids.
1. Introduction The physics of glass forming liquids, and of structural glasses in general, appears today as a great challenge in statistical mechanics and chemical physics. Despite the huge effort devoted to the field and the enormous improvements obtained in the comprehension of these complex systems, due both to theoretical and experimental studies, there are still many open questions concerning their phenomenology. Most of the theoretical knowledge in the field resides today in two different approaches. On one side, there are different phenomenological and first-principles microscopic theories (for a recent review see Ref. 1). Among the latter,
perhaps the most successful one is the mode coupling theory, in the sense of having more experimentally verified predictions. However, up to now none of the existing theories can account for a complete description of the observed phenomenology. On the other hand, the constant improvement in computational capacity has allowed the implementation of very accurate molecular dynamics simulations. Most of these simulations are based on small binary systems of particles interacting through Lennard-Jones-like potentials. While the existing microscopic theoretical approaches seem to be very limited due to the complexity of the analytical treatment, the numerical approach is limited by the small number of particles that can be considered and the extremely small time span that a simulation can cover, especially when modelling systems that have an astonishingly slow relaxation dynamics 2. In a completely different vein, the statistical physics community has been looking, for many decades now, for a simple lattice model able to capture (independently of the degree of accuracy of its microscopic description) those few relevant ingredients which are responsible for the rich dynamical and thermodynamical phenomenology of these materials. There is a general consensus in the community that a relevant element in the description of structural glasses is the appearance of frustration at the level of the microscopic interactions between molecules. And, unlike what happens in many other complex systems, such as spin glasses, this frustration is not due to the existence of randomness, but to the emergence of competition between attractive and repulsive interactions acting on each particle.
Among the many different approaches presented in the literature for introducing a lattice model of structural glasses, we want to mention in the first place that of Shore, Holzer and Sethna 3 since, as will become clear shortly, it is intimately related to the scope of our paper. They consider an Ising system on square and cubic lattices, with ferromagnetic interactions between nearest-neighbor spins plus antiferromagnetic interactions between next-nearest-neighbor spins, without any kind of randomness in the Hamiltonian of the model. For the two dimensional case they actually found a relatively simple dynamical and thermodynamical behavior, far from being glassy. But the situation was completely different in the three dimensional case, where at very low temperatures they found a drastic slowing down of the relaxation, ruled by a logarithmic domain growth law, characteristic of glassy systems. This simple model, with competition but without imposed disorder, proved able to reproduce, at least partially and qualitatively, the complex phenomenology of glassy processes. Nevertheless, its main limitation was the existence of a second order phase transition between the high temperature disordered phase (analogous to a liquid state) and the ferromagnetic ordered phase (analogous to a crystalline state). This continuous transition without coexistence of phases cannot account for the process of supercooling a liquid, which is intimately related to the process of
forming a structural glass. Since then, many other attempts have been made to improve on that first approach. In a series of papers, Lipowski and co-workers 4 considered the same model introduced by Shore et al. plus a four-spin plaquette ferromagnetic term in the Hamiltonian. This model captures most of the complex dynamics of the original model and presents a first order phase transition, as desired. Nevertheless, its ground state is infinitely degenerate, a fact that can hardly be associated with the crystalline ordering of a solid. Later on, Cavagna, Giardina and Grigera 5 considered a two dimensional model with a two-term Hamiltonian: the previously described four-spin ferromagnetic plaquette plus a five-spin ferromagnetic plaquette. This is actually an excellent model which describes most of the expected features of a two-dimensional structural glass. Another interesting approach 6 considered a system with nearest-neighbor couplings and antiferromagnetic Coulomb interactions, which also proved to be an adequate model for describing three dimensional glass forming liquids. Nevertheless, all these systems, despite their great value as statistical mechanics prototypes for modelling a structural glass without imposing disorder in the Hamiltonian, are not really inspired by any physical realization. In this work, instead, we will present a model which has been extensively analyzed during the last ten years and is considered the proper tool for describing the physics of real ultra thin magnetic films; but instead of paying attention to the magnetic behavior, we will show that the model, and consequently perhaps ultra thin films themselves, presents evidence of displaying a structural glass-like state. 2. A metal on metal ultra thin film model
The physics of ultra-thin magnetic films has attracted great interest in recent years, not only because of their multiple technological applications, such as data storage and catalysis, but also because their study has opened novel and nontrivial questions related to the role of microscopic competing interactions in the overall behavior of a system. In particular, under suitable thermal and magnetic conditions, ultra-thin magnetic films form unusual complex patterns of magnetization 7,8. Precisely, most of the potential technological applications of these materials reside in the ability to control these patterns, both in time and space, with a high degree of accuracy. For instance, in the case of data storage the stabilization of very small metastable magnetic domains could eventually increase the compression obtained nowadays in recording devices. The model we will analyze in this paper has been used mainly in the study of metal films on metal substrates, such as Fe on Cu 9 or Co on Au 10. In these cases, an adequate theoretical description of the system must take into account at least a three-term Heisenberg Hamiltonian, including: i) the usual ferromagnetic exchange interactions between nearest-neighbor spins, ii) the dipole-dipole interactions which, despite their considerably smaller strength (compared with the exchange interactions), become relevant due to their long range, and finally iii) a surface anisotropy term that takes into account the magnetic influence of the substrate on the spins of the film. The anisotropy induced by the dipolar interaction tends to align the spins parallel to the film but, as the thickness of the film is reduced (typically to around five monolayers), the surface anisotropy overcomes that of the dipolar interaction and the system undergoes a reorientation transition at which the spins suddenly align perpendicular to the film plane. Under these conditions, the physics of the material can be appropriately described by replacing the Heisenberg spin variables by much simpler Ising magnetic moments located at the nodes of a square lattice, and the Hamiltonian takes the form:
H = -δ Σ_⟨i,j⟩ S_i S_j + Σ_(i,j) S_i S_j / r_ij³,  (1)

where S_i = ±1 and δ is the ratio between the exchange (J0) and dipolar (Jd) interaction strengths (δ = J0/Jd). Here the first term represents the ferromagnetic exchange interaction, with the sum running only over nearest-neighbor spins, while the second represents the dipole-dipole interaction once the spins have aligned perpendicular to the plane. In this last case, the sum runs over all pairs of spins of the lattice and r_ij is the distance, measured in crystal units, between sites i and j. The system is thus ruled by only two parameters: the usual temperature T and the parameter δ, which depends on the composition and preparation of the sample. We restrict ourselves to the case Jd > 0, in such a way that δ > 0 corresponds to ferromagnetic exchange interactions and δ < 0 to antiferromagnetic ones. Note that the model introduced by Shore et al. can be considered a truncated version of the model defined by Hamiltonian (1). In 1995 MacIsaac and coauthors 11 presented the first study of the thermodynamics of the model (for a complete review of the subject see Ref. 12). Concerning the order of the ground state, it is antiferromagnetic for δ < δ_c ≈ 0.425, but for δ > δ_c the system orders into stripes whose width h depends on the value of δ. In particular, h increases as δ increases and, surprisingly, the ferromagnetic state is always metastable with respect to a striped one. In other words, irrespective of the strength of the dipole-dipole interactions, the frustration induced by the antiferromagnetic term precludes the usual ferromagnetic ordering. In the same paper 11 they also presented a phase diagram of the model, obtained through Monte Carlo simulations on a relatively small system of N = 16 × 16 spins. They observed that, for fixed values of δ, the system undergoes an order-disorder phase transition between a low temperature striped phase and a high temperature tetragonal phase.
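For readers who want to experiment, the energy of Hamiltonian (1) can be evaluated directly on small lattices. The following Python sketch is ours, not the authors' code: it uses open boundaries and a brute-force pair sum (a real simulation would need Ewald-type summation for the long-range part).

```python
import itertools

def dipolar_ising_energy(spins, delta):
    # Energy of Hamiltonian (1) on a small open-boundary square lattice:
    #   -delta * sum_<ij> S_i S_j  +  sum_{i<j} S_i S_j / r_ij^3,
    # with distances measured in lattice units; spins[x][y] = +/-1.
    L = len(spins)
    sites = [(x, y) for x in range(L) for y in range(L)]
    exchange = dipolar = 0.0
    for (x1, y1), (x2, y2) in itertools.combinations(sites, 2):
        s = spins[x1][y1] * spins[x2][y2]
        r2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
        if r2 == 1:                  # nearest neighbours
            exchange += s
        dipolar += s / r2 ** 1.5     # 1 / r^3
    return -delta * exchange + dipolar
```

At δ = 0 the purely dipolar term favors antiferromagnetic order, and large δ shifts the balance toward the exchange term; on lattices this small the striped ground states of the full model cannot yet be resolved, so the sketch only illustrates the competition between the two terms.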
The latter consists of extended magnetic domains characterized by predominantly square corners, which induce a fourfold rotational symmetry (as can be clearly observed in numerical simulations of the structure factor). The existence of this tetragonal phase has recently been verified experimentally in fcc Fe on Cu(100) films and had already been predicted by Abanov et al. by means of a continuous approximation 13.
Figure 1. The phase diagram for intermediate values of δ obtained on L = 24 lattices. Triangles: critical temperature obtained from the maximum in the specific heat; circles: stability line of the h1 stripe phase; open squares: stability line of the h2 stripe phase; diamonds: first order transition lines between low temperature ordered phases. TP indicates a triple point.
Figure 2. Specific heat vs. T for δ = 3 (corresponding to an h = 4 ground state) and three different system sizes. Some typical equilibrium configurations at the indicated temperatures for L = 48 are shown below. Note the sequence of transitions h4 → tetragonal → paramagnetic.
In Fig. 1 we present a detail of the phase diagram obtained in Ref. 14, corresponding to intermediate values of δ. Here h1 and h2 indicate the regions where the ground states have stripe widths h = 1 and h = 2, respectively, and AF indicates the antiferromagnetic phase. The gray region indicates the presence of metastable states. The lines (diamonds) separating the low temperature phases are all first order ones, and TP indicates the existence of a triple point (as will become clear in the next section).
Figure 3. Pseudo-critical temperatures Tc^C (maximum of the specific heat) and Tc^B (minimum of the Binder cumulant) vs. L⁻² for δ = 2.
3. Supercooled tetragonal liquid state
In this section we present some recent, preliminary evidence strongly suggesting that metal on metal ultra thin magnetic films could have a glass transition. Supercooled glass forming liquids have the property of getting trapped in a metastable liquid state (with respect to the crystalline state) when cooled below the melting temperature under suitable conditions. If, moreover, the cooling is fast enough and reaches sufficiently low temperatures, the characteristic relaxation time attains macroscopic scales, the supercooled state is structurally arrested and the material behaves as a solid without any pattern of long range order. Under these conditions the system becomes a glass. A basic ingredient for a glass transition model is therefore the existence of a first order phase transition between a high temperature liquid-like disordered phase and a low temperature crystal-like ordered phase. We now show that the order-disorder phase transition observed in the model described by Hamiltonian (1) is actually a weak first order one, at least for intermediate values of δ (though some approximate analytical results 16 suggest that this could be valid for any finite value of δ). In Fig. 2 we plot the specific heat C_L as a function of the temperature T for δ = 3 and three different system sizes. We can clearly identify two peaks. The low
temperature one grows with the system size L and coincides with the temperature at which the tetragonal phase appears; it is therefore associated with the stripe-tetragonal transition. The second, broader peak shows no dependence on the system size and indicates the continuous decay of the tetragonal phase into the paramagnetic phase. We also present some snapshots of typical configurations of the system at different temperatures. What about the order of the transition? One way of determining it is through the finite size scaling behavior of different moments of the energy, such as the specific heat

C_L = (⟨E²⟩ − ⟨E⟩²) / (N T²)

and the Binder fourth order cumulant

V_L = 1 − ⟨E⁴⟩ / (3⟨E²⟩²),

which make it possible to distinguish between continuous and discontinuous transitions. In a first order phase transition the specific heat presents a maximum at a pseudo-critical temperature Tc^C(L) and the Binder cumulant a minimum at a different pseudo-critical temperature Tc^B(L). These temperatures obey the finite size scaling behavior Tc^C(L) ≈ Tc + A L⁻² and Tc^B(L) ≈ Tc + B L⁻², with B > A, Tc being the transition temperature of the infinite system 15. In Fig. 3 we plot Tc^C and Tc^B vs. 1/L² for δ = 2, clearly identifying the finite size scaling behavior expected in a first order phase transition 16. Moreover, numerical simulations of the energy histogram around the transition point for δ = 2 show a two-peak structure, characteristic of this kind of transition 16. Next we show that it is possible to obtain a supercooled metastable state below the melting temperature, another significant feature of glass forming liquids. In Fig. 4 we plot (full circles) the average internal energy per particle u(T) as a function of the temperature in a quasi-static cooling from a high temperature. Each point corresponds to an average over many different initial conditions and sequences of random numbers. In the same plot (empty triangles) we also display the result of a quasi-static heating from the ground state. We clearly observe the emergence of hysteresis, characteristic of a first order phase transition. From these energy curves we obtained the free energy per spin on cooling and heating, finding that both curves intersect at T_m = 0.805 ± 0.005, a temperature that we identify with the melting point. Finally, let us describe the behavior of the system when it is suddenly quenched to a very low temperature. In Fig. 5 we plot the time evolution of the internal energy per spin along a single Monte Carlo run for δ = 2, L = 32 and T = 0.2.
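The estimators used in this finite-size analysis are straightforward to compute from a time series of total energies. The sketch below is our own minimal Python (the textbook definitions of the specific heat and energy Binder cumulant are assumed); it also shows how two system sizes suffice to extrapolate the pseudo-critical temperatures via the L⁻² law.

```python
def specific_heat(energies, T, N):
    # C_L = (<E^2> - <E>^2) / (N T^2), from total-energy samples
    m1 = sum(energies) / len(energies)
    m2 = sum(e * e for e in energies) / len(energies)
    return (m2 - m1 * m1) / (N * T * T)

def binder_cumulant(energies):
    # V_L = 1 - <E^4> / (3 <E^2>^2)
    m2 = sum(e * e for e in energies) / len(energies)
    m4 = sum(e ** 4 for e in energies) / len(energies)
    return 1.0 - m4 / (3.0 * m2 * m2)

def extrapolate_tc(L1, tc1, L2, tc2):
    # Solve Tc(L) = Tc + A / L^2 for the infinite-size Tc
    # from the pseudo-critical temperatures of two system sizes.
    x1, x2 = 1.0 / L1 ** 2, 1.0 / L2 ** 2
    A = (tc1 - tc2) / (x1 - x2)
    return tc1 - A * x1
```

In practice one locates the maximum of `specific_heat` and the minimum of `binder_cumulant` as functions of T for each L, then feeds the resulting pseudo-critical temperatures to `extrapolate_tc` (or to a least-squares fit over several sizes, as in Fig. 3).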
We observe that the system is stuck in a disordered configuration, with a very slow relaxation rate (almost logarithmic, as can be seen in the inset). The dashed line indicates the energy per spin of the ground state; the system is magnetically arrested in this out-of-equilibrium disordered state. We also present, at different times, snapshots of the corresponding pattern of magnetization, where
Figure 4. Internal energy per spin u(T) obtained by quasi-static cooling from infinite temperature and quasi-static heating from the ground state.
Figure 5. Time evolution of the energy per spin in a single MC run. Snapshots of the spin configurations are shown below the figure. The inset presents the same results for the time evolution of the energy per spin in a log-normal plot.
one can recognize an almost tetragonal phase. This behavior, characterized by a slow relaxation of a disordered liquid-like phase well below the melting temperature,
which can be associated neither with nucleation nor with coarsening, can be clearly interpreted in terms of the appearance of a glassy phase. 4. Conclusions
In this paper we have revisited the two dimensional Ising model with competing nearest-neighbor ferromagnetic interactions and long range antiferromagnetic dipole-dipole interactions from a new point of view. Instead of concentrating on the magnetic properties of the model, we have investigated a possible characterization of the tetragonal phase as a lattice version of a liquid that can give rise to a glass state at very low temperatures. Our simulations establish that the temperature driven order-disorder phase transition between the tetragonal and striped phases is a weakly discontinuous one, as revealed by the scaling law of the pseudo-critical temperatures obtained from the specific heat and the fourth order Binder cumulant. Analytical results on a continuous version of the model give further support to this conclusion 16. We have also shown, by simulating a quasi-static cooling from infinite temperature, that the tetragonal phase can be supercooled well below the melting temperature. Furthermore, when the system is suddenly cooled to a low enough final temperature, it gets stuck in a glass-like state, characterized by an extremely slow relaxation process. It is important to mention that previous papers had already reported some glassy phenomena, for instance aging 17,18 and a logarithmic domain growth law 19, but all of them were connected to the existence of metastable stripe states, which, as is well known, modify the landscape of the free energy function. The results presented in this paper refer instead to a different and novel observation, namely the existence of a first order phase transition between the tetragonal (liquid) and striped (crystal) phases and the supercooling of the former in a long-lived metastable state well below the melting temperature, indicating the emergence of a two dimensional glass.
Finally, let us stress that the present model, unlike all the other lattice models cited in this paper, did not arise as a statistical mechanics toy model built to capture the main features of a glass forming liquid. On the contrary, the model is widely accepted as the proper one for describing the physics of real metal on metal ultra thin magnetic films. In other words, our results strongly suggest that these materials present a glass transition, a fact that would have many relevant technological consequences. As far as we know, this is the first example of a physical realization of a two dimensional magnetic system without imposed disorder that can be considered a glass forming liquid. Nevertheless, this point requires further investigation. In that sense, a careful study of the behavior of the relaxation time as the temperature decreases, as well as an adequate characterization of the nucleation process, will not only provide a clear confirmation of the existence of a glass state but will also shed light on its
nature. Work along these lines is in progress and will be published elsewhere. Experimental checks of our predictions will also be very helpful.
Acknowledgments This work was partially supported by grants from Consejo Nacional de Investigaciones Científicas y Técnicas CONICET (Argentina), Agencia Córdoba Ciencia (Córdoba, Argentina), Secretaría de Ciencia y Tecnología de la Universidad Nacional de Córdoba (Argentina), CNPq (Brazil) and ICTP grant NET-61 (Italy). P.M.G. acknowledges financial support from Fundación Antorchas (Argentina).
References 1. R. Schilling, Theories of the structural glass transition, in "Collective Dynamics of Nonlinear and Disordered Systems", eds. G. Radons, W. Just and P. Häussler, Springer (2003); also in cond-mat/0305565. 2. W. Kob, J. Phys.: Condens. Matter 11, R85 (1999). 3. J. D. Shore and J. P. Sethna, Phys. Rev. B 43, 3782 (1991); J. D. Shore, M. Holzer and J. P. Sethna, Phys. Rev. B 46, 11376 (1992). 4. A. Lipowski, J. Phys. A 30, 7365 (1997); A. Lipowski and D. Johnston, J. Phys. A 33, 4451 (2000); Phys. Rev. E 61, 6375 (2000); Phys. Rev. E 64, 041605 (2001). 5. A. Cavagna, I. Giardina and T. S. Grigera, J. Chem. Phys. 118, 6974 (2003); Europhys. Lett. 61, 74 (2003). 6. P. Viot and G. Tarjus, Europhys. Lett. 44, 423 (1998); G. Tarjus, D. Kivelson and P. Viot, J. Phys.: Cond. Matter, special issue Unifying Concepts in Glass Physics 12, 6497 (2000); M. Grousson, G. Tarjus and P. Viot, Phys. Rev. E 62, 7781 (2000); Phys. Rev. E 64, 036109 (2001); J. Phys.: Cond. Matter 14, 1617 (2002); Phys. Rev. E 65, 065103(R) (2002). 7. A. Vaterlaus, C. Stamm, U. Maier, M. G. Pini, P. Politi and D. Pescia, Phys. Rev. Lett. 84, 2247 (2000). 8. O. Portmann, A. Vaterlaus and D. Pescia, Nature 422, 701 (2003). 9. D. P. Pappas, K. P. Kämper and H. Hopster, Phys. Rev. Lett. 64, 3179 (1990). 10. R. Allenspach, M. Stampanoni and A. Bischof, Phys. Rev. Lett. 65, 3344 (1990). 11. A. B. MacIsaac, J. P. Whitehead, M. C. Robinson and K. De'Bell, Phys. Rev. B 51, 16033 (1995). 12. K. De'Bell, A. B. MacIsaac and J. P. Whitehead, Rev. Mod. Phys. 72, 225 (2000). 13. A. Abanov, V. Kalatsky, V. L. Pokrovsky and W. M. Saslow, Phys. Rev. B 51, 1023 (1995). 14. P. M. Gleiser, F. A. Tamarit and S. A. Cannas, Physica D 168-169, 73 (2002). 15. J. Lee and J. M. Kosterlitz, Phys. Rev. B 43, 3265 (1991). 16. S. A. Cannas, D. A. Stariolo and F. A. Tamarit, Phys. Rev. B 69, 092409 (2004). 17. J. H. Toloza, F. A. Tamarit and S. A. Cannas, Phys. Rev. B 58, R8885 (1998). 18. D. A. Stariolo and S. A. Cannas, Phys. Rev. B 60, 3013 (1999). 19. P. M. Gleiser, F. A. Tamarit, S. A. Cannas and M. A. Montemurro, Phys. Rev. B 68, 134401 (2003).
NON-EXTENSIVITY OF INHOMOGENEOUS MAGNETIC SYSTEMS
M. S. REIS* AND V. S. AMARAL Departamento de Física and CICECO, Universidade de Aveiro, 3810-193 Aveiro, Portugal J. P. ARAÚJO IFIMUP, Departamento de Física, Universidade do Porto, 4150 Porto, Portugal I. S. OLIVEIRA Centro Brasileiro de Pesquisas Físicas, Rua Dr. Xavier Sigaud 150, Urca, 22290-180 Rio de Janeiro-RJ, Brasil
In recent publications we developed the main features of a generalized magnetic system, in the sense of the non-extensive Tsallis thermostatistics. Our mean-field non-extensive models predict phase transitions of first and second order, as well as various magnetic anomalies, as a direct consequence of non-extensivity. These theoretical features are in agreement with the unusual magnetic properties of manganites, materials which are intrinsically inhomogeneous. In the present work, we consider an inhomogeneous magnetic system composed of many homogeneous subsystems, and show that applying the usual Maxwell-Boltzmann statistics to each homogeneous bit and averaging over the whole system is equivalent to using the non-extensive approach. An analytical expression for the Tsallis entropic parameter q is obtained and shown to be related to the moments of the distribution of the inhomogeneous quantity. Finally, it is shown that the description of manganites in terms of the Griffiths phase can be recovered within the non-extensive formalism.
1. Introduction
Tsallis statistics is applicable to systems which present non-extensivity. Broadly speaking, in order to be non-extensive, a system must present at least one of the following properties: (i) long-range interactions, (ii) long-time memory, (iii) fractality and (iv) intrinsic inhomogeneity 1. Manganese oxides, or simply manganites, seem to embody three of these four ingredients: they present long-range Coulomb interactions 2,3,4, they are formed by grains with fractal shapes and they are intrinsically inhomogeneous 5,6,7. In a sequence of previous publications 8,9,10 we have shown that the magnetic properties of manganites, some of them very unusual, can be properly described within a mean-field approach using Tsallis statistics. In Ref. 10 it is pointed out that the value of the entropic parameter q of a system is related to its magnetic susceptibility. In the present work, through an analogy with the papers of Beck 11, Beck and Cohen 12, and Wilk and Wlodarczyk 13, we consider an inhomogeneous magnetic system composed of many homogeneous parts, each of them described by Maxwell-Boltzmann statistics. By averaging the magnetization over the whole system, we recover the Tsallis non-extensivity. We obtain an analytical expression for the entropic parameter q and its dependence on the temperature T. Finally, we show that the description of manganites using the Griffiths phase 14,15 can be incorporated into the non-extensive approach.

* E-mail: marior@fis.ua.pt
2. Model Description
Homogeneous and Non-Extensive (HNE) case: In Ref. 10 an expression for the classical non-extensive magnetization (generalized Langevin function) was obtained:
where q ∈ ℝ is the Tsallis entropic parameter and
x = μH/kT.  (2)

In this model 10, the non-extensive correlations lie inside each cluster, whereas the inter-cluster interactions remain extensive. Thus, μ_ne denotes the magnetic moment of each non-extensive cluster. Inhomogeneous and Extensive (IE) case: Consider an inhomogeneous system formed by smaller Maxwell-Boltzmann homogeneous bits, each of them with magnetization M. We can write the average magnetization considering two types of distribution: A. Distribution of magnetic moments:
B. Distribution of temperature:
where

M(μ, T, H) = μ [coth x − 1/x]  (5)
is the usual Langevin function (which yields the magnetization of each small Maxwell-Boltzmann cluster), f and g represent the respective distribution functions and x is the same as in Eq. 2. Suppose we have the situation described in (A). Equating the magnetic saturation values of the HNE case (Eq. 1) and the IE case (Eq. 3), we obtain:
where ⟨μ⟩ is the first moment of the f(μ) distribution. Equating the susceptibilities χ_q = ⟨χ⟩_μ and using Eq. 6, we find an expression that connects the q parameter to the moments of the f(μ) distribution:
This general result, valid for any f(μ), is analogous to that obtained by Beck and Cohen 12 for the case of a Brownian particle travelling through a distribution of temperatures. Proceeding analogously for case (B), one finds:
where μ is the single value of the magnetic moment present in the system. Equating the susceptibilities χ_q = ⟨χ⟩_T and using Eq. 8, we obtain:
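The extensive averages of case (A) are easy to evaluate numerically for a discrete f(μ). The sketch below is our own minimal Python illustration; in particular, the moment ratio ⟨μ²⟩/⟨μ⟩² is shown only as the Beck-Cohen-style inhomogeneity measure suggested by the analogy, not as the paper's exact Eq. 7.

```python
import math

def langevin(x):
    # L(x) = coth(x) - 1/x, with the x/3 Taylor limit near zero
    if abs(x) < 1e-6:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x

def average_magnetization(moments, weights, H, kT):
    # <M> over a discrete distribution f(mu) of extensive
    # (Maxwell-Boltzmann) clusters, in the spirit of Eqs. 3 and 5
    z = sum(weights)
    return sum(w * mu * langevin(mu * H / kT)
               for mu, w in zip(moments, weights)) / z

def moment_ratio(moments, weights):
    # <mu^2> / <mu>^2: equals 1 for a homogeneous system and grows
    # with the width of f(mu) -- the kind of quantity q is built from
    # by analogy with Beck and Cohen (our assumption, not Eq. 7 itself)
    z = sum(weights)
    m1 = sum(w * mu for mu, w in zip(moments, weights)) / z
    m2 = sum(w * mu * mu for mu, w in zip(moments, weights)) / z
    return m2 / (m1 * m1)
```

Note that each cluster obeys the Curie law χ = μ²/3kT at low field, so averaging over f(μ) replaces μ² by ⟨μ²⟩; this is exactly why the inhomogeneity enters through the moments of the distribution.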
3. Connections with the Griffiths Singularity Salamon and co-workers 14,15 have used the idea of the Griffiths singularity to study manganites. The authors considered a distribution of the inverse magnetic susceptibility Δ:
to explain the sharp downturn in the ⟨χ⟩⁻¹(T) curve (a behavior usually observed in manganites). In the expression above, c is a parameter of the distribution, Γ(·, ·) stands for the incomplete gamma function and
where a is a free parameter, β = 0.38 is the critical exponent of the pure system, assumed to be 3D Heisenberg-like, and T_G is the Griffiths temperature 14,15. From Eq. 10 one can find the average susceptibility:
and, consequently, its inverse:
(13)

which fits the strong downturn usually found in manganites. However, since the Curie law tells us that

χ = μ²/(3kT),  (14)

inhomogeneities in χ can arise from distributions of either μ or T (or both). From the above equation we can obtain, for instance, a corresponding distribution of magnetic moments:
and, consequently, the q parameter:
The dependence of q on the temperature T arises because the first and second moments of the distribution f(μ) have such a dependence. In other words, q(T) is a consequence of ⟨μ⟩(T) and ⟨μ²⟩(T). From the above, one can write the non-extensive magnetic susceptibility:
which is equal to the average one. It is important to note that, once we know f(μ), the value of q can be directly obtained, and consequently χ_q as well. The present model does not treat the entropic index q as a fitting parameter, but as a known quantity, previously determined and directly related to the inhomogeneity of the system. Finally, an analogous reasoning applies to the case of a distribution of T. 4. Connections with Experimental Results
Pr0.05Ca0.95MnO3 (T_c = 110 K) is an example of a manganite that presents a strong downturn, around the Curie temperature, in the curve of the inverse susceptibility as a function of temperature. A sample of this manganite was prepared by the sol-gel method with urea 16,17, starting from stoichiometric amounts of Pr2O3 (99.9% pure), CaCO3 (99%) and MnO2 (99%). The final crushed powder was compressed and sintered in air at 1300 °C for 60 hours, with a subsequent fast quench of the sample. The X-ray diffraction pattern confirmed that the sample belongs to the Pbnm space group, with no trace of spurious phases. The temperature dependence of the susceptibility χ (= M/H at low magnetic field) was measured using a commercial SQUID magnetometer.
Figure 1 presents the experimental data together with the model described above (Eq. 17). To obtain this result, the distribution of magnetic moments f(μ) of Eq. 15 has the following parameters: c = −0.04, a = 0.002 K and T_G = 510 K. The entropic index q (inset of Figure 1) is not a free parameter and could be obtained a priori, using Eqs. 7 and 15.
Figure 1. Experimental (open circles) and theoretical (solid line, Eq. 17) temperature dependence of the inverse susceptibility for the Pr0.05Ca0.95MnO3 manganite. See text for details concerning the theoretical description.
5. Conclusion
In the present work we considered an inhomogeneous magnetic system composed of many homogeneous subsystems, and showed that applying the usual Maxwell-Boltzmann statistics to each homogeneous bit and averaging over the whole system is equivalent to using the non-extensive approach. An analytical expression for the Tsallis entropic parameter q was obtained, related to the moments of the distribution of the inhomogeneous quantity. Finally, it was shown that the description of manganites in terms of the Griffiths phase can be recovered within the non-extensive formalism. 6. Acknowledgments We thank CAPES-Brasil and GRICES-Portugal for financial support of the Brasil-Portugal bilateral cooperation. M.S.R. thanks CNPq-Brasil.
Bibliography

1. For a complete and updated list of references, see the web site: tsallis.cat.cbpf.br/biblio.htm.
2. Lorenzana J., Castellani C. and Castro C. D., Phys. Rev. B 64 (2001) 235127.
3. Lorenzana J., Castellani C. and Castro C. D., Phys. Rev. B 64 (2001) 235128.
4. Moreo A., Yunoki S. and Dagotto E., Science 283 (1999) 2034.
5. Dagotto E., Hotta T. and Moreo A., Phys. Rep. 344 (2001) 1.
6. Dagotto E., Nanoscale Phase Separation and Colossal Magnetoresistance: The Physics of Manganites and Related Compounds (Springer-Verlag, Heidelberg, 2003).
7. Becker T., Streng C., Luo Y., Moshnyaga V., Damaschke B., Shannon N. and Samwer K., Phys. Rev. Lett. 89 (2002) 237203.
8. Reis M. S., Freitas J. C. C., Orlando M. T. D., Lenzi E. K. and Oliveira I. S., Europhys. Lett. 58 (2002) 42.
9. Reis M. S., Araújo J. P., Amaral V. S., Lenzi E. K. and Oliveira I. S., Phys. Rev. B 66 (2002) 134417.
10. Reis M. S., Amaral V. S., Araújo J. P. and Oliveira I. S., Phys. Rev. B 68 (2003) 014404.
11. Beck C., Phys. Rev. Lett. 87 (2001) 180601.
12. Beck C. and Cohen E. G. D., Physica A 322 (2003) 267.
13. Wilk G. and Wlodarczyk Z., Phys. Rev. Lett. 84 (2000) 2770.
14. Salamon M. B., Lin P. and Chun S. H., Phys. Rev. Lett. 88 (2002) 197203.
15. Salamon M. and Chun S., Phys. Rev. B 68 (2003) 014411.
16. Vazquez-Vazquez C., Blanco M. C., Lopez-Quintela M., Sanchez R. D., Rivas J. and Oseroff S. B., J. Mat. Chem. 8 (1998) 991.
17. Reis M. S., Amaral V. S., Araújo J. P., Tavares P. B., Gomes A. M. and Oliveira I. S., submitted to Phys. Rev. B (2004).
MULTIFRACTAL ANALYSIS OF TURBULENCE AND GRANULAR FLOW

T. ARIMITSU
Graduate School of Pure and Applied Sciences, University of Tsukuba, Ibaraki 305-8571, Japan
E-mail: [email protected]
N. ARIMITSU
Graduate School of Environment and Information Sciences, Yokohama National University, Yokohama 240-8501, Japan
E-mail: [email protected]
Abstract

The probability density function of velocity fluctuations of granular turbulence (granulence), observed by Radjai and Roux in their two-dimensional simulation of a slow granular flow under homogeneous quasi-static shearing, is studied by the multifractal analysis (MFA) proposed by the authors. MFA is a unified self-consistent approach for systems with large deviations, which has been constructed based on the Tsallis-type distribution function that provides an extremum of the extensive Rényi or the non-extensive Tsallis entropy under appropriate constraints. It is shown by the present precise analysis that the systems of granulence and of turbulence indeed have common scaling characteristics, as was pointed out by Radjai and Roux.

Keywords: multifractal analysis, velocity fluctuation, turbulence, granulence
1 Introduction
It has been reported that granular materials in the rapid flow regime present non-Gaussian velocity distributions in various situations, e.g., in a vibrated bed [1, 2, 3], in fluidized beds [4], in a fluidized granular medium between two walls [5], in homogeneous granular fluids [6, 7], in granular gases [8] and so on. Radjai and Roux [9] observed a non-Gaussian distribution function in their two-dimensional simulation of a slow granular flow subject to homogeneous quasi-static shearing. They reported that there is an evident analogy between the scaling features of turbulence and of granular turbulence (granulence), in spite of the fundamentally different origins of the fluctuations in these systems.
In this paper, we apply the multifractal analysis (MFA) [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20] of fluid turbulence to granulence, in order to see how far MFA works in the study of the data observed by Radjai and Roux [9]. MFA is a unified self-consistent approach for systems with large deviations, constructed by following the assumption [21] that the strengths of the singularities distribute in a multifractal way in real physical space. The appearance of singularities originates from the invariance of the Navier-Stokes (N-S) equation under a scale transformation. The distribution function of singularities is assumed to be given by the Tsallis-type distribution function [22] that provides an extremum of the extensive Rényi [23] or the non-extensive Tsallis entropy [22, 24] under appropriate constraints. This distribution of singularities determines the tail part of the probability density function (PDF). The parameters appearing in the theory are determined, uniquely, by the intermittency exponent representing the strength of intermittency. On the other hand, the observed PDF should include the effect resulting from the term in the N-S equation violating the invariance under the scale transformation (the dissipative term). However, there has been no ensemble theory of turbulence including this effect, and the situation has remained at the stage where almost all theories just try to explain the observed scaling exponents of the mth order velocity structure function, i.e., the mth moment of the velocity fluctuations. MFA accounts for this effect as what determines the central part of the PDF, narrower than its standard deviation. We assume that the fat-tail part, which the PDFs of intermittent systems take on, is determined by the global characteristics of the system, and that the central part of the PDF is a reflection of the local nature of the constituting eddies.
2 Basic Equations for Granular Flow
A set of basic equations for the flow of granular media is given [25] by the equation of continuity for the mass density ρ,

∂ρ/∂t + ∂_i(ρ u_i) = 0,  (1)

with the notation ∂_j = ∂/∂x_j; the equation of motion for the velocity field u of granular media,

ρ ∂u_i/∂t + ρ (u·∇)u_i = −∂_i p + ∂_j σ′_ji,  (2)

with the fluid pressure p and the dissipative stress tensor

σ′_ji = η̄(g) (3D_ji − D_ij − δ_ji D_kk/2 − 4ω_ji),  (3)

where D_ji = ∂_j u_i; and the equation for the angular velocity field ω_ji of granular media,

I ∂ω_ji/∂t + I (u·∇)ω_ji = ∂_k μ_kji + 2σ_ji,  (4)

with I = ma^2/2 being the moment of inertia for two-dimensional disks with radius a and mass m, the moment

μ_kji = a^2 η̄(g) (δ_kj ∂_l ω_li − δ_ki ∂_l ω_lj + 2∂_k ω_ji + ∂_j ω_ki − ∂_i ω_kj),  (5)

and σ_ji = η̄(g)(D_ji − D_ij − 2ω_ji)/2. Here, the generalized viscosity η̄(g) is defined by

η̄(g) = C(ρ) g,  C(ρ) = 2√π a ρ μ_f / γ,  (6)

with μ_f being the friction coefficient between granular particles (disks), γ the filling factor of the disks, and

g = [E_ji E_ji/4 + Ω_ji Ω_ji/2 + a^2 (Ω_jjk Ω_jjk + Ω_jik Ω_jik + Ω_jik Ω_ijk)/8]^{1/2},  (7)

where E_ji = D^s_ji − δ_ji D_kk/2, Ω_ji = ω_ji − D^a_ji and Ω_jik = ∂_j ω_ik, with D^s_ji = (D_ji + D_ij)/2 and D^a_ji = (D_ji − D_ij)/2. The energy of granular media dissipated per unit mass and per unit time is given by the dissipation function Φ = C(ρ) g^3 / ρ.

We confine ourselves in this paper to the case of an incompressible granular flow, where the mass density ρ is constant in time and space. Then (1) reduces to ∇·u = 0, and (2) to

∂u_i/∂t + (u·∇)u_i = −∂_i p̃ + ν(g) ∂_j(3D_ji − D_ij − 4ω_ji),  (8)

with p̃ = p/ρ and the generalized kinematic viscosity ν(g) defined by ν(g) = η̄(g)/ρ. When the angular velocity is induced by the velocity field u, it is given by ω_ji = (D_ji − D_ij)/2, i.e., ω = ∇ × u / 2. Then the stress tensor (3) reduces to σ′_ji = η̄(g)(D_ji + D_ij). Note that, for an ordinary fluid, η̄(g) becomes a constant viscosity η̄ representing the frictional characteristics of the fluid. In this case, (8) reduces to the N-S equation for an incompressible fluid,

∂u/∂t + (u·∇)u = −∇p̃ + ν∇^2 u,  (9)

with the kinematic viscosity ν = η̄/ρ, and the equation of motion ∂ω/∂t + ∇ × (ω × u) = ν∇^2 ω for ω is no longer an independent equation but is derived from (9).
3 Multifractal Analysis
MFA rests on the invariance of the basic equation of the type

∂u/∂t + (u·∇)u = −∇p + [dissipative term(s)]  (10)

under the scale transformation [21, 26]

x → x′ = λx,  u → u′ = λ^{α/3} u,  t → t′ = λ^{1−α/3} t,  p → p′ = λ^{2α/3} p,  (11)

for an arbitrary real number α, valid when the effect of the dissipative term(s) is negligible in a certain region, and on the assumption that the singularities due to this invariance distribute themselves in a multifractal way in physical space. The scaling invariance leads to the scaling law
δu_n / δu_0 = (l_n / l_0)^{α/3} = δ_n^{α/3},  (12)

with δ_n = l_n/l_0 = δ^{−n} (n = 0, 1, 2, ...) for the velocity fluctuation δu_n = |u(• + l_n) − u(•)| of the nth multifractal step, where u represents a component of the velocity field u. We put δ = 2 in the following in this paper, which is consistent with the energy cascade model of turbulence.¹ The singularity appears in the velocity derivative |u′| = lim_{n→∞} u′_n ∝ lim_{n→∞} δ_n^{α/3−1}, which diverges for α < 3. Here, we introduced the nth velocity difference u′_n = δu_n / l_n for the characteristic length l_n. Note that α is a measure of the strength of the singularities. It is assumed, in the A&A model within MFA, that the singularities due to the scale invariance distribute themselves in a multifractal way in physical space with the Tsallis-type distribution function, i.e., the probability P^{(n)}(α)dα to find in real space a singularity with strength α within the range α ~ α + dα is given by [11, 12, 13]

P^{(n)}(α) = (Z^{(n)})^{−1} {1 − [(α − α₀)/Δα]^2}^{n/(1−q)},  (13)

with (Δα)^2 = 2X/[(1 − q) ln 2]. Here, q is the entropy index introduced in the definitions of the Rényi and the Tsallis entropies.² This distribution function provides us with the multifractal spectrum f(α) = 1 + (1 − q)^{−1} log₂[1 − (α − α₀)^2/(Δα)^2], which then produces the mass exponent

τ(q̄) = 1 − α₀q̄ + 2Xq̄^2 (1 + √C_q̄)^{−1} + (1 − q)^{−1}[1 − log₂(1 + √C_q̄)]  (14)

with C_q̄ = 1 + 2q̄^2 (1 − q) X ln 2. The multifractal spectrum and the mass exponent are related to each other through the Legendre transformation [26]: f(α) = αq̄ + τ(q̄), with α = −dτ(q̄)/dq̄ and q̄ = df(α)/dα. The formula for the PDF Π^{(n)}(u_n) of velocity fluctuations is assumed to consist of two parts, i.e., Π^{(n)}(u_n) = Π^{(n)}_tail(u_n) + ΔΠ^{(n)}(u_n), where the first term is related to P^{(n)}(α) by Π^{(n)}_tail(|u_n|)du_n ∝ P^{(n)}(α)dα with the transformation of the variables (12), and the second term is responsible for the contributions coming from the dissipative term(s) in (10), which violate the invariance under the scale transformation (11). Then, we have the velocity structure function in the form ⟨⟨|u_n|^m⟩⟩ = ∫du_n |u_n|^m Π^{(n)}(u_n) = 2γ^{(n)}_m + (1 − 2γ^{(n)}_0) a_m δ_n^{ζ_m}, with 2γ^{(n)}_m = ∫du_n |u_n|^m ΔΠ^{(n)}(u_n), a_m a constant whose explicit form is found in [17, 18], and the scaling exponent

ζ_m = 1 − τ(m/3)  (15)

given with the mass exponent (14). The PDF Π̂^{(n)}(ξ_n), both of velocity fluctuations and of the velocity derivative, to be compared with observed data, is the one defined through Π̂^{(n)}(ξ_n)dξ_n = Π^{(n)}(u_n)du_n, with the variable ξ_n = u_n/⟨⟨u_n^2⟩⟩^{1/2} scaled by the standard deviation of the velocity
¹ At each step of the cascade, say at the nth step, eddies break up into two pieces, producing the energy cascade with the energy-transfer rate ε_n that represents the rate of transfer of energy per unit mass from eddies with diameter l_n to those with l_{n+1}.
² Regardless of whether the fundamental entropy is the extensive Rényi entropy or the non-extensive Tsallis entropy, the MaxEnt distribution functions which give the extremum of these entropies have a common structure, i.e., the Tsallis-type distribution function.
fluctuations. For velocity fluctuations larger than the order of the standard deviation, ξ*_n ≤ ξ_n (equivalently, |α| ≤ α*), the PDF is given by the tail part [18, 17]

Π̂^{(n)}_tail(ξ_n) dξ_n ∝ P^{(n)}(α) dα,  (16)

with ξ_{n,0} = ū_n δ_n^{2/3−ζ₂/2} and the variables related through the transformation (12). This tail part represents the large deviations, and manifests the multifractal distribution of the singularities due to the scale invariance of (10) when its dissipative term can be neglected. The entropy index q should be unique once an intermittent system is settled. For smaller velocity fluctuations, ξ_n ≤ ξ*_n (equivalently, α* ≤ |α|), we assume a Tsallis-type PDF of the form [18, 17]

Π̂^{(n)}_cntr(ξ_n) ∝ [1 − (1 − q′)(ξ_n/ξ̄_n)^2]^{1/(1−q′)},  (17)

where a new entropy index q′ is introduced as an adjustable parameter. This center part is responsible for fluctuations smaller than the standard deviation, due to the dissipative term(s) violating the scale invariance. The entropy index q′ can depend on the distance between the two measuring points.

The two parts of the PDF, (16) and (17), are connected at ξ*_n = ū_n δ_n^{α*/3−ζ₂/2} with the conditions that they have a common value there and that their slopes coincide. The value α* is the smaller solution of ζ₂/2 − α/3 + 1 − f(α) = 0. The point ξ*_n has the characteristic that the dependence of Π̂^{(n)}(ξ*_n) on n is minimal for n ≫ 1. With the help of the second equalities in (16) and (17), we obtain ΔΠ^{(n)}(u_n), and have the analytical formula to evaluate 2γ^{(n)}_m. Their explicit analytical formulae and the definition of ū_n are found in [17, 18].
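The scale invariance of (10) under the transformation (11), which underlies the whole construction of this section, can be checked by bookkeeping the powers of λ carried by each term. A minimal sketch (the function name and the sampled α values are ours):

```python
from fractions import Fraction

def term_dimensions(alpha):
    # Powers of lambda carried by each field under Eq. (11):
    # x -> lambda*x, u -> lambda^(alpha/3)*u, t -> lambda^(1-alpha/3)*t, p -> lambda^(2*alpha/3)*p
    a = Fraction(alpha)
    u, x, t, p = a / 3, Fraction(1), 1 - a / 3, 2 * a / 3
    return {
        "du/dt": u - t,            # time-derivative term
        "(u.grad)u": 2 * u - x,    # advection term
        "grad p": p - x,           # pressure-gradient term
    }

# Every term of Eq. (10) without the dissipative term scales as lambda^(2*alpha/3 - 1),
# for any alpha, so the equation is invariant under the transformation (11).
for alpha in (Fraction(1), Fraction(1, 3), Fraction(-2), Fraction(7, 5)):
    assert len(set(term_dimensions(alpha).values())) == 1
```

The dissipative term ν∇²u, by contrast, scales as λ^{α/3−2}, which breaks the invariance and is the origin of the central part of the PDF discussed above.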
4 Turbulence
The dependence of the parameters α₀, X and q on the intermittency exponent μ is determined, self-consistently, with the help of three independent equations, i.e., the energy conservation ⟨ε_n/ε⟩ = 1 (equivalently, τ(1) = 0), the definition of the intermittency exponent μ, ⟨ε_n^2/ε^2⟩ = δ_n^{−μ} (equivalently, μ = 1 + τ(2)), and the scaling relation 1/(1 − q) = 1/α₋ − 1/α₊, with α± satisfying f(α±) = 0. Here, ε is the energy input rate to the largest eddies. The average ⟨···⟩ is taken with P^{(n)}(α). We present in Fig. 1 a competition among the PDFs, Π̂^{(n)}(ξ_n), of velocity fluctuations derived by three multifractal models, i.e., the p model, the log-normal model and the A&A model, by analyzing the PDFs extracted from the direct numerical simulation
Figure 1: Competition among PDFs of the velocity fluctuations on (a) log scale and (b) linear scale. PDFs are displayed in sets of two lines from the top, with closed circles representing the observed PDF, broken lines the PDF for the log-normal model, dotted lines for the p model and solid lines for the A&A model. The pair of solid lines in each set are the same PDF derived by the A&A model. For better visibility, each PDF is shifted, vertically, by −2 in (a), and by −0.4 in (b).

(DNS) conducted by Gotoh et al. [27] at Re = 13 000 with the lattice mesh size 1024^3, the largest size at that time. The log scale (a) is good for seeing the tail part, whereas the linear scale (b) is appropriate for studying the central part. In order to secure impartiality, the observed PDFs are analyzed by the least square method with the theoretically derived PDF for each model. The distances r/η of the two measuring points for the three sets of pairs in Fig. 1 are, from the top pair to the bottom one, 2.38, 19.0 and 1220, respectively. For Fig. 1, μ = 0.240, therefore q = 0.391, α₀ = 1.14 and X = 0.285 for the A&A model. The connection points for the three pairs in Fig. 1 are, from the top to the bottom, ξ*_n = 1.10, 1.23 and 1.43 (α* = 1.08, common to the three pairs), respectively. Note that, for α < 3, the velocity derivatives become singular, as stated before. Since the part of the PDF for α < α* is the tail part, it is confirmed that the tail part is a manifestation of the multifractal distribution of singularities. The connection points for the log-normal model and the p model are almost the same as those for the A&A model. As for other parameters, please refer to the original papers [19]. The competition shows the superiority of the A&A model.

Figure 2: Analysis of the experimental PDF of fluctuating velocities (closed circles) measured in the quasistatic flow of granular media, performed with the help of the present theoretical PDF, Π̂^{(n)}(ξ_n), for velocity fluctuations (solid lines), plotted on (a) log scale and (b) linear scale. For better visibility, each PDF is shifted by −1 unit in (a) and by −0.1 in (b) along the vertical axis.

The PDFs within the log-normal model have a rather higher tail part than the observed PDFs. This is a manifestation of the fact that the values of the scaling exponents become smaller for larger m and, finally, take negative values. The PDFs within the p model are very close to those of the A&A model at the tail part, and explain the observed PDFs quite well, but the domains of the variables are, regrettably, too small and the tails of the PDFs of the p model terminate before the observed PDFs end. As the accuracy of the observations has improved, one can say that the role of the p model is finished. Note that the domains of the variables for the PDFs within the A&A model are, usually, about 300 times larger than their standard deviations. Observation of such a large fluctuation is, practically, impossible. Through the analyses of the PDFs for velocity fluctuations in Fig. 1, we extracted a good deal of information about the system [15, 14, 16, 18, 17]. Among it, we only quote here the dependence of q′ on r/η: q′ = −0.05 log₂(r/η) + 1.71 [17].
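The turbulence parameter set quoted above can be cross-checked numerically against the three determining conditions and the Legendre-transform relation. The sketch below is our transcription of the A&A model formulas of the previous section (mass exponent τ(q̄), spectrum f(α), exponents ζ_m); see the cited Arimitsu-Arimitsu papers for the originals, and note that the tolerances are ours:

```python
import math

# Turbulence parameters quoted in the text: mu = 0.240 -> q = 0.391, alpha0 = 1.14, X = 0.285
q, alpha0, X = 0.391, 1.14, 0.285

def C(qbar):
    # C_qbar = 1 + 2*qbar^2*(1-q)*X*ln2, cf. Eq. (14)
    return 1.0 + 2.0 * qbar ** 2 * (1.0 - q) * X * math.log(2.0)

def tau(qbar):
    # Mass exponent, cf. Eq. (14)
    s = math.sqrt(C(qbar))
    return (1.0 - alpha0 * qbar + 2.0 * X * qbar ** 2 / (1.0 + s)
            + (1.0 - math.log2(1.0 + s)) / (1.0 - q))

def f(alpha):
    # Multifractal spectrum derived from the Tsallis-type distribution (13)
    dalpha2 = 2.0 * X / ((1.0 - q) * math.log(2.0))
    return 1.0 + math.log2(1.0 - (alpha - alpha0) ** 2 / dalpha2) / (1.0 - q)

def zeta(m):
    # Scaling exponents of the velocity structure functions, cf. Eq. (15)
    return 1.0 - tau(m / 3.0)

# Energy conservation: tau(1) = 0
assert abs(tau(1.0)) < 5e-3
# Definition of the intermittency exponent: mu = 1 + tau(2) ~ 0.240
assert abs(1.0 + tau(2.0) - 0.240) < 5e-3
# K41 value recovered at m = 3: zeta_3 = 1 when tau(1) = 0
assert abs(zeta(3.0) - 1.0) < 5e-3
# Legendre transform: f(alpha) = alpha*qbar + tau(qbar) with alpha = -dtau/dqbar
for qbar in (0.5, 1.0, 1.5):
    h = 1e-6
    alpha = -(tau(qbar + h) - tau(qbar - h)) / (2.0 * h)
    assert abs(f(alpha) - (alpha * qbar + tau(qbar))) < 1e-6
```

That all the checks pass with the quoted (q, α₀, X) illustrates how the single measured value μ = 0.240 fixes the whole set of model parameters self-consistently.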
5 Granulence
Let us now analyze the velocity fluctuations in the granulence simulated by Radjai and Roux [9]. Since they observed that the fluctuations share the scaling characteristics of fluid turbulence, we try to investigate the system by means of the A&A model within MFA, which extracted, successfully, rich information out of turbulence, as was seen, partially, in the previous section. The power spectrum E_k of the fluctuating velocity field on one-dimensional cross sections exhibits a clear power-law shape
E_k ∝ k^{−β} with β ≈ 1.24 [9], which is quite similar to the power-law behavior with β = 5/3 in the inertial range of the Kolmogorov spectrum [28] for turbulence. However, since the granular model is an assembly of frictional disks, the power law observed in granulence does not imply energy conservation, in contrast with the case of the energy cascade model for fluid turbulence. For the conditions determining the parameters α₀, X and q, we adopt, instead of energy conservation, the slope of the power spectrum, i.e., β = 1 + ζ₂ = 2 − τ(2/3), accompanied by the definition of the intermittency exponent and the scaling relation. The latter two are the same as those utilized for turbulence. As there are, for the present, no experimental data to determine the value of the intermittency exponent μ for granulence, we cannot obtain the values of the three parameters, which are given as functions of μ with the help of the three conditions. Therefore, we determine the value of the intermittency exponent by adjusting the observed PDF to the theoretical formulae (16) and (17) by the least square method, since the accuracy of the formulae in the analysis of the PDFs for turbulence is quite high, as was shown in the previous section. The best fit of the observed PDF of fluctuating velocities by the formulae (16) and (17) is shown in Fig. 2. The experimental data points are symmetrized by taking averages of the left- and right-hand side data. The integration times τ, normalized by the shear rate, for the experimental PDFs are 10^{−3} and 10^{−1}, from top to bottom. We found the value μ = 1.347, giving q = 0.930, α₀ = 0.377 and X = 0.050. Other parameters for the theoretical PDFs are, from top to bottom, (n, q′) = (88.0, 1.28) and (40.0, 1.22), and ξ*_n = 1.14, 1.14 (α* = 0.364).
Figure 3: Scaling exponents of the velocity structure function for turbulence (dotted line) and granulence (solid line).

By making use of the mass exponent with these values, we have

⟨ε_n/ε⟩ = δ_n^{τ(1)},  (18)

with τ(1) = 0.648 representing a breakdown of energy conservation. Let us analyze this at the level of K41 [28]. As the collisions between granular particles are frictional, the energy transfer rate may depend on the size of the eddies, i.e., it may be reasonable to introduce a wavenumber-dependent energy transfer rate ε_k ∝ k^{η₁}. Assuming that the energy spectrum still has the structure E_k ∝ ε_k^{2/3} k^{−5/3}, we obtain the scaling relation

3β = 5 − 2η₁.  (19)

Substituting the measured value of β into (19), we have η₁ = 0.625, which is quite close to the value of the mass exponent τ(1) appearing in (18). Further investigations in this direction are among the attractive future problems, e.g., the application of Heisenberg's approach for turbulence [29] to granulence, having the basic equation of the form (8). We put in Fig. 3 the scaling exponents (15) of the velocity structure functions within the A&A model, both for turbulence with μ = 0.240 (dotted line) and granulence with μ = 1.347 (solid line). We further extracted the relation between τ and δ_n as τ = 1.3 δ_n^{0.131}, by comparing the observed flatness with the one from the theoretical PDFs (16) and (17). This relation may be a manifestation of the fact that Taylor's frozen hypothesis does not work for granulence. This is also one of the attractive future problems to be clarified.
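The granulence parameter set and the K41-level estimate (19) can be cross-checked numerically with the mass exponent formula (14). This is our transcription of the formula, with our tolerances:

```python
import math

# Granulence parameters from the fit: mu = 1.347 -> q = 0.930, alpha0 = 0.377, X = 0.050
q, alpha0, X = 0.930, 0.377, 0.050

def tau(qbar):
    # Mass exponent, cf. Eq. (14), evaluated with the granulence parameter set
    C = 1.0 + 2.0 * qbar ** 2 * (1.0 - q) * X * math.log(2.0)
    s = math.sqrt(C)
    return (1.0 - alpha0 * qbar + 2.0 * X * qbar ** 2 / (1.0 + s)
            + (1.0 - math.log2(1.0 + s)) / (1.0 - q))

beta = 1.24                                        # slope of E_k ~ k^(-beta) observed in [9]

assert abs((2.0 - tau(2.0 / 3.0)) - beta) < 5e-3   # spectral condition beta = 2 - tau(2/3)
assert abs((1.0 + tau(2.0)) - 1.347) < 5e-3        # mu = 1 + tau(2)
assert abs(tau(1.0) - 0.648) < 5e-3                # breakdown of energy conservation, Eq. (18)

# K41-level estimate, Eq. (19): 3*beta = 5 - 2*eta1
eta1 = (5.0 - 3.0 * beta) / 2.0
assert abs(eta1 - 0.625) < 0.02                    # indeed close to tau(1), as noted above
```

The three conditions and the closeness of η₁ to τ(1) are thus reproduced to a few parts in a thousand from the quoted fit values alone.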
6 Summary
We showed with the help of MFA that the systems of turbulence and of granulence have, actually, common scaling features in their velocity fluctuations, as was pointed out by Radjai and Roux [9]. We expect that various observations of granulence with higher statistics will be reported, and that one will be able to extract more information out of the data to determine the underlying dynamics of granulence in the near future. In this respect, the stochastic approach with the multifractal process [30] and the one with the convolution method [31] may provide us with attractive insights. Note, however, that a critical comparison between MFA and the latter approach has been performed [14]. The application of MFA to rapid granular materials [1, 2, 3, 4, 5, 6, 7, 8] is also an attractive future problem. These will be reported elsewhere.
References

[1] Y.-H. Taguchi and H. Takayasu, Europhys. Lett. 30, 499 (1995).
[2] Y. Lan and A. D. Rosato, Phys. Fluids 7, 1818 (1995).
[3] Y. Murayama and M. Sano, J. Phys. Soc. Jpn. 67, 1826 (1998).
[4] K. Ichiki and H. Hayakawa, Phys. Rev. E 52, 658 (1995).
[5] J. J. Brey and D. Cubero, Phys. Rev. E 57, 2019 (1998).
[6] T. P. C. van Noije et al., Phys. Rev. Lett. 79, 411 (1997).
[7] T. P. C. van Noije and M. H. Ernst, Granular Matter 1, 57 (1998).
[8] A. Puglisi et al., Phys. Rev. E 59, 5582 (1999).
[9] F. Radjai and S. Roux, Phys. Rev. Lett. 89, 064302 (2002).
[10] T. Arimitsu and N. Arimitsu, Phys. Rev. E 61, 3237 (2000).
[11] T. Arimitsu and N. Arimitsu, J. Phys. A: Math. Gen. 33, L235 (2001).
[12] T. Arimitsu and N. Arimitsu, Prog. Theor. Phys. 105, 355 (2001).
[13] T. Arimitsu and N. Arimitsu, Physica A 295, 177 (2001).
[14] T. Arimitsu and N. Arimitsu, Physica A 305, 218 (2002).
[15] T. Arimitsu and N. Arimitsu, J. Phys.: Condens. Matter 14, 2237 (2002).
[16] N. Arimitsu and T. Arimitsu, Europhys. Lett. 60, 60 (2002).
[17] T. Arimitsu and N. Arimitsu, AIP Conf. Proc. 695, 135 (2003).
[18] T. Arimitsu and N. Arimitsu, Physica D 193, 218 (2004).
[19] T. Arimitsu and N. Arimitsu, Physica A 340, 347 (2004).
[20] T. Arimitsu and N. Arimitsu, Journal of Physics: Conference Series (2005), in press; http://www.px.tsukuba.ac.jp/home/tcm/arimitsu/MarseillesO4.pdf
[21] U. Frisch and G. Parisi, in Turbulence, Predictability in Geophysical Fluid Dynamics, and Climate Dynamics, ed. M. Ghil, R. Benzi and G. Parisi (North-Holland, New York, 1985) p. 84.
[22] C. Tsallis, J. Stat. Phys. 52, 479 (1988).
[23] A. Rényi, Proc. 4th Berkeley Symp. Maths. Stat. Prob. 1, 547 (1961).
[24] J. H. Havrda and F. Charvat, Kybernetika 3, 30 (1967).
[25] K. Kanatani, Nihon Kikai Gakkai Ronbunshu 45, 507 (1979); ibid. 45, 515 (1979).
[26] C. Meneveau and K. R. Sreenivasan, Nucl. Phys. B 2, 49 (1987).
[27] T. Gotoh, D. Fukayama and T. Nakano, Phys. Fluids 14, 1065 (2002).
[28] A. N. Kolmogorov, Dokl. Akad. Nauk SSSR 30, 301 (1941).
[29] W. Heisenberg, Z. Phys. 124, 628 (1948).
[30] E. Bacry, J. Delour and J. F. Muzy, Phys. Rev. E 64, 026103 (2001).
[31] C. Beck, Phys. Rev. Lett. 87, 180601 (2001).
APPLICATION OF SUPERSTATISTICS TO ATMOSPHERIC TURBULENCE
SALVO RIZZO
E.N.A.V. S.p.A., U.A.A.V. Firenze, Italy
e-mail: [email protected]

ANDREA RAPISARDA
Dipartimento di Fisica e Astronomia, Università di Catania, and INFN sezione di Catania, CACTUS Group, Via S. Sofia 64, 95123 Catania, Italy
e-mail: [email protected]

We successfully apply the recently developed superstatistics theory to a temporal series of turbulent wind measurements recorded by the anemometers of Florence airport. Within this approach we can reproduce very well the fluctuations and the pdfs of wind velocity returns and differences.
1. Introduction
During the last decades an enormous effort has been devoted to understanding the physical origin of turbulence, both at the theoretical and at the experimental level (refs. 1-7). Although many steps forward have been made, a well established theory of turbulence does not yet exist and many fundamental aspects of this phenomenon remain unclear. Atmospheric turbulence is a challenge per se, being characterized by very high Reynolds numbers (Re ~ 10^8) and very intermittent distributions. On the other hand, beyond basic research, its interest lies also in several engineering and meteorological applications. The Kolmogorov hypotheses were verified in the sixties by doing measurements in the atmospheric boundary layer, considering the flow velocity for a relatively short time, with a high sampling rate and for a relatively constant mean flow, in order to control and maintain constant the Reynolds number. In general this is not always possible. Although atmospheric turbulence can have peculiar features regarding the non-stationary character of the wind data and the high turbulence intensity,(a)

(a) The turbulence intensity of the wind speed is typically expressed in terms of the standard deviation σ_v of velocity fluctuations, measured over 10 to 60 minutes, normalized by the mean wind speed V [I_v = σ_v/V]. Typical values for complex terrain are I_v ≳ 0.2 while, on the other hand, for microscale turbulence one usually has I_v ~ O(10^-2).
many similarities with microscopic turbulence exist. For a detailed comparison between the two, one can see the recent paper by Böttcher et al. In this short paper we discuss a study of a turbulent wind data series recently measured at Florence airport over a period of six months. We show by means of a statistical analysis that we can describe this example of atmospheric turbulence with the nonextensive approach adopted in refs. 8, 10, 11, 12, 14, 15, within the more general superstatistics formalism introduced in ref. 13. The latter justifies the successful application of Tsallis statistics in different fields, and more specifically in turbulence experiments. We will show that such an approach is meaningful and can reveal very interesting features, which could also have a very practical utility for safety reasons when applied to air traffic control services. Part of this study has just been published (ref. 17) and a longer paper with a complete and exhaustive analysis is in preparation (ref. 18).
2. Statistical analysis of wind measurements
The wind velocity measurements were taken at Florence airport over a time interval of six months, from October 2002 to March 2003. Data were recorded by using two 3-cup runway-head anemometers, each one mounted on a 10 m high pole, located at a distance of 900 m from each other, with a sampling frequency of one measurement every 5 minutes. Although in our experiment we could not actually control the Reynolds number, as is usually done in microscopic turbulence, and despite our low sampling frequency (3.3 × 10^-3 Hz) and the high intermittency of our wind data, we found several features of canonical turbulence, as we will discuss in the following. We performed, on our time series, a statistical analysis using conventional mathematical tools which are normally adopted for the small-scale physical turbulence studied in the laboratory. In particular we investigated correlations, spectral distributions, as well as probability density functions of velocity components of returns and differences; see refs. 17, 18 for more details. In this short contribution, for simplicity and lack of space, we discuss only returns of the longitudinal velocity component measured by one of the two anemometers (in the present case the one closest to the runway head 05, labeled RWY05), defined by the following expression

x(t)_τ = V_x^RWY05(t + τ) − V_x^RWY05(t),  (1)

V_x(t) being the longitudinal velocity component at time t and τ being a fixed time interval.(b) The same analysis was done also for the transversal components and for the velocity difference between the two anemometers, with similar results (refs. 17, 18).

(b) Returns are here defined in a slightly different way from those used in econophysics, i.e. x(t) = [s(t + τ) − s(t)]/s(t).
2.1. Correlations and power spectra
Our data show very strong correlations and power spectra with the characteristic −5/3 law in the high-mid portion of the entire spectrum; see Fig. 1 in ref. 17. However, the dissipation branch in the high-frequency range, well known in micro-scale (or high-frequency) turbulence analysis, is here missing due to the low-frequency sampling used. Correlation functions also show an initial exponential decay, followed by a power-law decay modulated by the day-night wind periodicity, which is a well known phenomenon. No significant difference was found between day and night periods, when air traffic is almost absent. For more details please see refs. 17, 18.
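The −5/3 spectral check mentioned above can be sketched on a synthetic signal (the wind data themselves are not reproduced here, so a random-phase signal with a prescribed f^(-5/3) spectrum stands in for them):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stationary signal with a prescribed f^(-5/3) power spectrum
# (a stand-in for the wind data, which are not reproduced here).
n = 2 ** 16
freqs = np.fft.rfftfreq(n, d=1.0)
amp = np.zeros_like(freqs)
amp[1:] = freqs[1:] ** (-5.0 / 6.0)      # |FFT| ~ f^(-5/6)  =>  PSD ~ f^(-5/3)
phases = rng.uniform(0.0, 2.0 * np.pi, freqs.size)
signal = np.fft.irfft(amp * np.exp(1j * phases), n)

# Periodogram estimate of the power spectrum
psd = np.abs(np.fft.rfft(signal)) ** 2 / n

# Log-log slope over an intermediate frequency band, as in the -5/3 check
band = (freqs > 1e-3) & (freqs < 1e-1)
slope = np.polyfit(np.log(freqs[band]), np.log(psd[band]), 1)[0]
assert abs(slope - (-5.0 / 3.0)) < 0.05
```

On the real series the same log-log fit is restricted to the resolved high-mid band, since the dissipation range lies above the 3.3 × 10^-3 Hz sampling frequency.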
2.2. The superstatistics approach for wind velocity pdfs
The superstatistics formalism proposed recently by C. Beck and E. G. D. Cohen is a general and effective description for nonequilibrium systems. For more details see the original article and the paper by Beck in this volume (ref. 13). In the superstatistics approach one considers fluctuations of an intensive quantity, for example the temperature, by introducing an effective Boltzmann factor

B(E) = ∫₀^∞ f(β) e^{−βE} dβ,  (2)

where f(β) is the probability distribution of the fluctuating variable β, so that we have for the probability distribution

P(E) = (1/Z) B(E),  (3)

with the normalization given by

Z = ∫₀^∞ B(E) dE.  (4)
One can imagine a collection of many cells, each one with a definite value of the intensive quantity, in which a test particle is moving. In our atmospheric turbulence study, the time series of the wind velocity recordings are characterized by a fluctuating variance, so the returns (1) cannot be assumed to be generated by a "simple" Gaussian process. They show a very intermittent behavior, stronger than that usually found in small-scale fluid turbulence experiments. In our analysis we considered the following quantities: (a) the wind velocity returns x defined by eq. (1); (b) the corresponding variance of the returns x, which we indicate with σ; (c) the fluctuations of σ, whose variance we indicate with the symbol Σ. We extracted from the experimental data, using a fixed time interval τ, the distribution for the fluctuations of the longitudinal wind component variance. The aim is to slice the time series into "small" pieces in which the signal is almost Gaussian, and apply superstatistics theory. This fluctuating behavior of σ is plotted in Fig. 1 for a time interval τ = 1 hour. In Figs. 2 and 3 we then plot the probability
Figure 1. Variance fluctuations of the longitudinal wind velocity component for the anemometer RWY05, obtained with a moving time window τ of one hour.
Figure 2. Standardized pdf of the fluctuating variance corresponding to the previous figure (open points), compared with a Gamma distribution (full line) and with a Log-normal distribution (dashed line) sharing the same mean (σ₀) and variance (Σ) extracted from the experimental data. The Log-normal is not able to reproduce the experimental data.
distribution of the variance σ for τ = 1 and 3 hours, respectively. In the same figures we plot for comparison a Gamma (full curve) and a Log-normal (dashed curve) distribution (ref. 6), characterized by the same average and variance extracted from the experimental data. In this sense, the curves are not fitted to the data. The comparison clearly shows that the Gamma distribution is able to reproduce very nicely the experimental distribution of the σ fluctuations, and that this type of distribution is robust for different time-window choices. This is at variance with the Log-normal distribution, which is usually adopted in microscopic turbulence
Figure 3. Standardized pdf of the fluctuating variance, similar to the previous figure but corresponding to a windowing of three hours in our longitudinal velocity components (open points). We show for comparison also a Gamma distribution (full line) and a Log-normal distribution (dashed line) sharing the same mean (σ₀) and variance (Σ). Also in this case the Gamma distribution reproduces very well the experimental data, at variance with the Log-normal one.
and which in this case is not able to reproduce the experimental data. In general, using the Beck and Cohen notation^13, we have for the Gamma distribution
f(β) = (1/Γ(c)) (1/b)^c β^(c−1) exp(−β/b)    (5)

with

q = 1 + 1/c ,    β₀ = b c ,    (6)

where 2c is the actual number of effective degrees of freedom and b is a related parameter. Inserting this distribution into the generalized Boltzmann factor (2) one gets the q-exponential curve^14

P(z) = [1 − (1 − q) β₀ E(z)]^(1/(1−q)) .    (7)

In our analysis we have E(z) = z²/2, with z defined by eq. (1)^(11,17,18). Considering the fluctuations of the variance σ of the returns z, we get the following correspondence with the original superstatistics formalism:

β = σ_τ ,    σ(β) = Σ(σ_τ) ,    β₀ = ⟨σ_τ⟩ = σ₀ .    (8)
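The internal consistency of eqs. (5)-(7) can be checked numerically: averaging the ordinary Boltzmann factor exp(−βE) over a Gamma distribution of β with shape c reproduces the q-exponential with q = 1 + 1/c. A minimal sketch follows; the values of β₀ and E are arbitrary illustrations, only c is taken from Fig. 2.

```python
import numpy as np
from math import gamma

c, beta0, E = 2.70, 1.0, 0.8      # c as in Fig. 2; beta0 and E are arbitrary
b = beta0 / c                     # eq. (6): beta0 = b c
q = 1.0 + 1.0 / c                 # eq. (6): q = 1 + 1/c  (gives 1.37)

# Gamma distribution f(beta) of eq. (5), integrated against exp(-beta E)
beta = np.linspace(1e-9, 40.0, 200000)
f = beta**(c - 1.0) * np.exp(-beta / b) / (gamma(c) * b**c)
y = f * np.exp(-beta * E)
B_num = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(beta))   # trapezoid rule

# q-exponential of eq. (7)
B_q = (1.0 - (1.0 - q) * beta0 * E) ** (1.0 / (1.0 - q))

print(round(q, 2), abs(B_num - B_q) < 1e-4)
```

The agreement holds for any E in the convergent range, which is the content of the superstatistics construction.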
In the present case, we get for the Gamma distributions which describe the experimental variance fluctuations reported in Figs. 2 and 3 the characteristic values c = 2.70 and c = 3.22, for a time interval of 1 and 3 hours respectively, from which, using eq. (6), we get the corresponding q-values q = 1.37 and q = 1.31. In Fig. 4 we plot the probability density function P(z) of the experimental longitudinal returns for different time intervals, i.e. 1 hour (full circles), 3 hours (open diamonds) and 24 hours (open
squares). For comparison we plot a Gaussian distribution (dashed curve) and the q-exponential curves (7) characterized by the q-values extracted from the Gamma distributions of Figs. 2 and 3 for a time interval τ corresponding to 1 and 3 hours. The q-exponential curves reproduce very well the experimental data, which, on the other hand, are very different from the Gaussian pdf. However, one can notice that for a very long time interval, i.e. τ = 24 hours, the data are not so far from being completely decorrelated and therefore the corresponding experimental pdf is closer to the Gaussian curve. Notice that the theoretical curves are not fitted, and that the superstatistics approach, in a self-consistent and elegant way, is able to explain and characterize in a quantitative way the wind data. In a similar way one can extract theoretical curves which reproduce the wind velocity difference pdfs with similar entropic q-values, although in that case an asymmetry correction has to be considered to better reproduce the tails of the pdfs.
Figure 4. Comparison between standardized longitudinal velocity return pdfs for three different time intervals (τ = 1, 3, 24 hours) and the q-exponential curves with the q-values extracted from the c parameter of the Gamma distributions shown in the previous figures. A Gaussian pdf is also shown as a dashed curve. See text.
In our analysis the large and intermittent wind velocity variance fluctuations are reproduced very well by a Gamma-type superstatistics, excluding the Log-normal one, and this gives exactly the Tsallis q-exponential for the velocity return pdfs. However, one has to say that the situation is much more difficult for the less fluctuating velocity flow of microscale fluid turbulence. We add as a final remark that very recently a similar method has been adopted by a research group at the NASA Goddard Space Flight Center to analyze the solar wind speed fluctuations^16.
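The "no fitting" comparison used in Figs. 2-4 only requires matching the first two moments. The following sketch shows how Gamma and Log-normal densities with a prescribed mean σ₀ and variance Σ can be built; the numerical values are placeholders, not the measured ones.

```python
import numpy as np

sigma0, Sigma = 1.0, 0.37         # placeholder mean/variance of the sigma fluctuations

# Gamma with mean sigma0 and variance Sigma: shape c = sigma0^2/Sigma, scale b = Sigma/sigma0
c = sigma0**2 / Sigma             # note: q = 1 + 1/c would then follow from eq. (6)
b = Sigma / sigma0

# Log-normal with the same two moments
s2 = np.log(1.0 + Sigma / sigma0**2)   # log-variance
mu = np.log(sigma0) - 0.5 * s2         # log-mean

rng = np.random.default_rng(1)
g = rng.gamma(c, b, size=200000)
l = rng.lognormal(mu, np.sqrt(s2), size=200000)

# both samples share mean and variance; the shapes (in particular the tails)
# still differ, which is what discriminates the two candidates in Figs. 2-3
print(abs(g.mean() - sigma0) < 0.02, abs(l.var() - Sigma) < 0.02)
```

Because both candidate densities carry the same first two moments, any difference against the data reflects the functional form alone.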
3. Conclusions
We have studied a temporal series of wind velocity measurements recorded at Florence airport over a period of six months. The statistical analysis of the velocity components shows intermittent fluctuations which exhibit power-law pdfs. Applying the superstatistics formalism, it is possible to extract a Gamma distribution from the probability distributions of the variance fluctuations of the wind data. The characteristic parameter c of this Gamma distribution gives the entropic index q of the Tsallis q-exponential, which is then able to reproduce very well the velocity return and difference pdfs. Beyond the successful application of superstatistics and Tsallis thermostatistics to turbulent phenomena and the corresponding theoretical implications, we think that this work shows a useful and interesting method to characterize and study in a rigorous and quantitative way atmospheric wind data for flight safety in civil and military aviation.
Acknowledgements The authors are indebted to C. Beck, E.G.D. Cohen, S. Ruffo, H.L. Swinney and C. Tsallis for suggestions and discussions.
References
1. A.N. Kolmogorov, J. Fluid Mech. 13, 82 (1962).
2. A.M. Obukhov, J. Fluid Mech. 13, 77 (1962).
3. R.H. Kraichnan, J. Fluid Mech. 62, 305 (1974).
4. B. Castaing, Y. Gagne, E.J. Hopfinger, Physica D 46, 177 (1990).
5. S.B. Pope, Turbulent Flows, Cambridge University Press (2000).
6. U. Frisch, Turbulence, Cambridge University Press (1995).
7. K.R. Sreenivasan, Rev. Mod. Phys. 71, S383 (1999).
8. F.M. Ramos, M.J.A. Bolzan, L.D.A. Sá, R.R. Rosa, Physica D 193, 278 (2004).
9. F. Böttcher, St. Barth and J. Peinke, eprint [nlin.AO/0408005].
10. C. Beck, Physica A 277, 115 (2000); C. Beck, Phys. Rev. Lett. 87, 180601 (2001); C. Beck, Physica A 306, 189 (2002).
11. C. Beck, G.S. Lewis and H.L. Swinney, Phys. Rev. E 63, 035303 (2001).
12. C.N. Baroud and H.L. Swinney, Physica D 184, 21 (2003).
13. C. Beck and E.G.D. Cohen, Physica A 322, 267 (2003); C. Beck, in this volume [cond-mat/0502306].
14. C. Tsallis, J. Stat. Phys. 52, 479 (1988).
15. See for example C. Tsallis, Physica D 193, 3 (2004) and the other papers published in the same volume. For an updated list of references on generalized thermostatistics and its applications, see also http://tsallis.cat.cbpf.br/biblio.htm.
16. L.F. Burlaga and A.F. Viñas, Geophys. Res. Lett. 31, L16807 (2004); L.F. Burlaga and A.F. Viñas, J. Geophys. Res. 109, A12107 (2004).
17. S. Rizzo and A. Rapisarda, in Proc. 8th Experimental Chaos Conference, 14-17 June 2004, Florence, Italy, AIP Conference Proceedings Vol. 742, p. 176 (2004) [cond-mat/0406684].
18. S. Rizzo and A. Rapisarda (2005), to be submitted.
Applications in Other Sciences
COMPLEXITY OF PERCEPTUAL PROCESSES
F. TITO ARECCHI
Department of Physics, University of Firenze, Italy
At the borderline between neuroscience and the physics of complex phenomena, a new paradigm is under investigation, namely feature binding. This terminology denotes how a large collection of coupled neurons combines external signals with internal memories into new coherent patterns of meaning. An external stimulus spreads over an assembly of coupled neurons, building up a corresponding collective state. Thus, the synchronization of spike trains of many individual neurons is the basis of a coherent perception. Based on recent investigations, a novel conjecture for the dynamics of single neurons and, consequently, for neuron assemblies has been formulated. Homoclinic chaos is proposed as the most suitable way to code information in time by trains of equal spikes occurring at apparently erratic times; a new quantitative indicator, called propensity, is introduced to select the most appropriate neuron model. In order to classify the set of different perceptions, the percept space is given a metric structure by introducing a distance measure between distinct percepts. The distance in percept space is conjugate to the duration of the perception, in the sense that an uncertainty relation in percept space is associated with time-limited perceptions. Thus coding of different percepts by synchronized spike trains entails fundamental quantum features. It is conjectured that they are related to the details of the perceptual chain rather than depending on Planck's action.
1. Feature binding
1.1. Neuron synchronization
It is by now established that a holistic perception emerges, out of separate stimuli entering different receptive fields, by synchronizing the corresponding spike trains of neural action potentials [Von der Malsburg, Singer]. Action potentials play a crucial role in communication between neurons [Izhikevich]. They are steep variations in the electric potential across a cell's membrane, and they propagate in essentially constant shape from the soma (neuron's body) along axons toward synaptic connections with other neurons. At the synapses they release an amount of neurotransmitter molecules depending upon the temporal sequence of spikes, thus transforming the electrical carrier into a chemical one. In fact, neural communication is based on a temporal code whereby different cortical areas which have to contribute to the same percept P synchronize their spikes. Limiting for convenience the discussion to the visual system, spike emission in a single neuron of the higher cortical regions results from a trade-off between bottom-up stimuli arriving through the LGN (lateral geniculate nucleus) from the retinal detectors and threshold modulation due to top-down signals sent as conjectures by the semantic
memory. This is the core of ART (Adaptive Resonance Theory [Grossberg]) and of other computational models of perception [Edelman and Tononi], which assume that a stable cortical pattern is the result of a Darwinian competition among different percepts with different strengths. The winning pattern must be confirmed by some matching procedure between bottom-up and top-down signals.

1.2. Perceptions, feature binding and Qualia
The role of elementary feature detectors has been extensively studied in the past decades [Hubel]. By now we know that some neurons are specialized in detecting exclusively vertical or horizontal bars, or a specific luminance contrast, etc. However, a problem arises: how do elementary detectors contribute to a holistic (Gestalt) perception? A hint is provided by [Singer]. Suppose we are exposed to a visual field containing two separate objects. Both objects are made of the same visual elements, horizontal and vertical contour bars, different degrees of luminance, etc. What then are the neural correlates of the identification of the two objects? We have one million fibers connecting the retina to the visual cortex, through the LGN. Each fiber results from the merging of approximately 100 retinal detectors (rods and cones) and as a result it has its own receptive field. Each receptive field isolates a specific detail of an object (e.g. a vertical bar). We thus split an image into a mosaic of adjacent receptive fields. Now the "feature binding" hypothesis consists of assuming that all the cortical neurons whose receptive fields are pointing to a specific object synchronize the corresponding spikes, and as a consequence the visual cortex organizes into separate neuron groups oscillating on two distinct spike trains for the two objects (Fig. 1). Direct experimental evidence of this synchronization is obtained by insertion of microelectrodes in the cortical tissue of animals, each sensing a single neuron [Singer]. Indirect evidence of synchronization has been reached for human beings as well, by processing the EEG (electro-encephalo-gram) data [Rodriguez et al.]. Based on the neurodynamical facts reported above, we can understand how this occurs [Grossberg]. The higher cortical stages where synchronization takes place have two inputs. One (bottom-up) comes from the sensory detectors via the early stages which classify elementary features.
This single input is insufficient, because it would provide the same signal for e.g. horizontal bars belonging indifferently to either one of the two objects. However, as we said already, each neuron is a nonlinear system passing close to a saddle point, and the application of a suitable perturbation can stretch or shrink the interval of time spent around the saddle, and thus lengthen or shorten the interspike interval. The perturbation consists of top-down signals corresponding to conjectures made by the semantic memory (fig.2).
Fig. 1: Feature binding: the lady and the cat are respectively represented by the mosaic of empty and filled circles, each one representing the receptive field of a neuron group in the visual cortex. Within each circle the processing refers to a specific detail (e.g. contour orientation). The relations between details are coded by the temporal correlation among neurons, as shown by the same sequences of electrical pulses for two filled circles or two empty circles. Neurons referring to the same individual (e.g. the cat) have synchronous discharges, whereas their spikes are uncorrelated with those referring to another individual (the lady) [from Singer].
Fig. 2: ART = Adaptive Resonance Theory. Role of bottom-up stimuli from the early visual stages and top-down signals due to expectations formulated by the semantic memory. The focal attention assures the matching (resonance) between the two streams [from Julesz].
In other words, the perception process is not like the passive imprinting of a camera film, but it is an active process whereby the external stimuli are interpreted in terms of past memories. A focal attention mechanism assures that a matching is eventually
reached. This matching consists of resonant or coherent behavior between bottom-up and top-down signals. If matching does not occur, different memories are tried, until the matching is realized. In the presence of a fully new image without memorized correlates, the brain has to accept the fact that it is exposed to a new experience. Notice the advantage of this time-dependent use of neurons, which become available to be active in different perceptions at different times, as compared to the computer paradigm of fixed memory elements which store a specific object and are not available for others (the so-called "grandmother neuron" hypothesis). We have presented above qualitative reasons why the degree of synchronization represents the perceptual salience of an object. Synchronization of neurons located even far away from each other yields a space pattern on the sensory cortex, which can be as wide as a few square millimeters, involving millions of neurons. The winning pattern is determined by dynamic competition (the so-called "winner takes all" dynamics). This model has an early formulation in ART and has later been substantiated by the synchronization mechanisms. Perceptual knowledge appears as a complex self-organizing process. Naively, one might expect that a given "qualia", that is, a private sensation such as the red of a Titian painting, is always coded by the same sequence of spikes. If so, in a near future the corresponding information could be retrieved by a high-resolution detector, and hence a Rosetta stone could be established between the spike sequences and the qualia. Such a naive expectation, which would lead to a world without privacy, is altogether wrong for the following reasons.
After the initial experience of that qualia, the first time one has seen that Titian, any further repetition of the experience, either by memory recollection or by re-watching the painting, occurs in the presence of new experiential elements (one has become older, his/her store of memories has drastically mutated) and these novelties contribute to feature binding by a modified synchronization pattern. Evidence of such a fact has been established by Freeman [Freeman], who reported the synchronization pattern of the olfactory bulb of a rabbit, recorded by a large number of electrodes; as the same odor is presented twice, with an intermediate odor in between, the two patterns are altogether different, even though the animal behavior hints at the same reaction. Freeman's experiment is contrasted by the fact that some olfactory neurons of the locust yield the same bursts of spikes for the same odor [Rabinovich et al.]. Presumably, lower animals such as locusts have a much smaller semantic repertoire than rabbits or humans, and hence for them the dream of the Rosetta stone has some validity.
2. Homoclinic chaos, synchronization and propensity
Let us model the neurodynamics of spike formation. As for the dynamics of the single neuron, a saddle-point instability separates in parameter space an excitable region, where axons are silent, from a periodic region, where the spike train is periodic (equal interspike intervals). If a control parameter is tuned at the saddle
point, the corresponding dynamical behavior (homoclinic chaos) consists of a frequent return to the instability [Allaria]. This manifests itself as a train of geometrically identical spikes, which however occur at erratic times (chaotic interspike intervals). Around the saddle point the system displays a large susceptibility to an external stimulus; hence it is easily adjustable and prone to respond to an input, provided this is at sufficiently low frequencies; this means that such a system is robust against high-frequency noise, as discussed later.
Fig. 3: Schematic view of the phase-space trajectory approaching the saddle S and escaping from it. Chaos is due to the shorter or longer permanence around S; from a geometrical point of view most of the orbit provides a regular spike.
Such a type of dynamics has been recently dealt with in a series of reports that here I recapitulate as the following chain of linked facts. 1) A single spike in a 3D dynamics corresponds to a quasi-homoclinic trajectory around a saddle focus SF (fixed point with 1 (2) stable directions and 2 (1) unstable ones); the trajectory leaves the saddle and returns to it (Fig. 3). We say "quasi-homoclinic" because, in order to stabilize the trajectory away from SF, a second fixed point, namely a saddle node SN, is necessary to assure a heteroclinic connection. The experiment on a CO2 laser confirms this behavior (Fig. 4).
Fig. 4: (a) Experimental time series of the laser intensity for a CO2 laser with feedback in the regime of homoclinic chaos. (b) Time expansion of a single orbit. (c) Phase-space trajectory built by an embedding technique with appropriate delays [from Allaria et al.].
A train of spikes corresponds to the sequential return to, and escape from, the SF. A control parameter can be set at a value BC for which this return is erratic (chaotic interspike intervals). As the control parameter is set above or below BC, the system moves from excitable (single spike triggered by an input signal) to periodic (yielding a regular sequence of spikes without need for an input), with a frequency monotonically increasing with the separation from BC [Meucci]. Around SF, any tiny disturbance provides a large response. Thus the homoclinic spike trains can be synchronized by a periodic sequence of small disturbances (Fig. 5). However, each disturbance has to be applied for a minimal time, below which it is no longer effective; this means that the system is insensitive to broadband noise, which is a random collection of fast positive and negative signals [Zhou et al.]. The above considerations lay the floor for the use of mutual synchronization as the most convenient way to let different neurons respond coherently to the same stimulus, organizing as a space pattern. In the case of a single dynamical system, it can be fed back by its own delayed signal. As the delay is long enough the system is decorrelated with itself and this is equivalent to feeding an
independent system. This process allows one to store meaningful sequences of spikes, as necessary for a short-term memory [Arecchi et al. 2002].
Fig. 5: Left: experimental time series for different synchronization ratios induced by periodic changes of the control parameter: (a) 1:1 locking, (b) 1:2, (c) 1:3, (d) 2:1. Right: when the system is not able to spike for each period of the driver, a phase slip (one spike less or more) occurs; it is a jump of ±2π if the interspike interval is normalized to 2π. The rate of ± phase slips increases with the offset of the driving frequency from the natural frequency (associated with the average interspike interval of the free system).
Several neuron models (integrate-and-fire, Hodgkin-Huxley, FitzHugh-Nagumo, Hindmarsh-Rose) have been used by different investigators. We have introduced the propensity to synchronization as a quantitative indicator of how easy it is for a chaotic system to recognize an external input (Fig. 6) [Arecchi et al. 2003]. In the presence of localized stimuli over a few neurons, the corresponding disturbances propagate by inter-neuron coupling (either excitatory or inhibitory); a synchronized pattern is uniquely associated with each stimulus; the degree of mutual synchronization is measured by the disappearance of phase slips, or defects, in a space-time fabric [Leyva et al.].
Fig. 6: The coherence parameter R is defined as the ratio between the average ISI (interspike interval) and its r.m.s. fluctuation. R is unity for a fully chaotic system and tends to infinity for a periodic system. Here we plot in log scale the ratio between R for a driving periodic disturbance of 1% applied to a control parameter and R_free for the free system, at different frequencies ω away from the natural one ω0 (average of the chaotic spiking in the unperturbed system). For HC (circles) the ratio is 30 at ω0 and it goes up to 10^4 for higher frequencies; for the Lorenz system (squares) the ratio stays flat at 1. We thus take this ratio as a quantitative indicator of the propensity to synchronization.
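The coherence parameter underlying Fig. 6 is easy to compute from a spike train. A minimal sketch follows; the exponential and nearly periodic ISI choices are illustrative stand-ins for the chaotic and locked regimes, not outputs of the laser model.

```python
import numpy as np

def coherence_R(spike_times):
    """Coherence parameter: mean ISI divided by its r.m.s. fluctuation.
    R is about 1 for fully chaotic spiking and diverges for periodic spiking."""
    isi = np.diff(spike_times)
    return isi.mean() / isi.std()

rng = np.random.default_rng(2)
# erratic interspike intervals (Poisson-like surrogate for homoclinic chaos)
chaotic = np.cumsum(rng.exponential(25.0, size=2000))
# almost equal interspike intervals (synchronized / periodic regime)
periodic = np.cumsum(25.0 + 0.05 * rng.normal(size=2000))

print(coherence_R(chaotic) < 2.0, coherence_R(periodic) > 100.0)
```

The propensity indicator of Fig. 6 is then the ratio of R under a small periodic drive to R of the free system.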
These facts have been established experimentally and confirmed by a convenient model in the case of a class B laser with a feedback loop which readjusts the amount of losses depending on the value of the light intensity output [Arecchi 1987b]. I here recall the classification widely accepted in laser physics. Class A lasers are ruled by a single order parameter, the amplitude of the laser field, which obeys a closed dynamical equation, all the other variables having much faster decay rates and thus adjusting almost instantly to the local field value. Class B lasers are ruled by two order parameters, the laser field and the material energy storage providing gain, the two degrees of freedom having comparable characteristic times and behaving as activator and inhibitor in chemical dynamics [Arecchi 1987a]. The above listed facts hold in general for any dynamical system which has a 3-dimensional sub-manifold separating a region of excitability from a region of periodic oscillations: indeed, this separatrix has to be a saddle focus.
3. Time code in neural information exchange
How does a synchronized pattern of neuronal action potentials become a relevant perception?
Not only the different receptive fields of the visual system, but also other sensory channels such as auditory, olfactory, etc. integrate via feature binding into a holistic perception. Its meaning is "decided" in the PFC (pre-frontal cortex), which is a kind of arrival station from the sensory areas and departure station for signals going to the motor areas. On the basis of the perceived information, motor actions are started, including linguistic utterances [Rodriguez et al.]. Sticking to the neurodynamical level, and leaving to psychophysics the investigation of what goes on at higher levels of organization, we stress here a fundamental temporal limitation. Taking into account that each spike lasts about 1 msec, that the minimal interspike separation is 3 msec, and that the decision time at the PFC level is estimated to be T ≈ 200 ms, we can split T into 200/3 ≈ 66 bins of 3 msec duration, which are designated by 1 or 0 depending on whether they contain a spike or not. Thus the a priori total number of different messages which can be transmitted is
2^66 ≈ 7 × 10^19. However, we must also account for the average rate at which spikes proceed in our brain, which is r = 40 Hz (the so-called γ band; average ISI = 25 ms). When we account for this rate we can evaluate a reduction factor a = S/T = 0.54, where S is an entropy [Rieke et al.]; thus there are roughly 2^(aT) ≈ 10^11 words with significant probability. Even though this number is large, we are still within a finitistic realm. Provided we have time enough to ascertain which one of the different messages we are dealing with, we can classify it with the accuracy of a digital processor, without residual error. But suppose we expose the cognitive agent to fast changing scenes, for instance by presenting in sequence unrelated video frames with a time separation less than 200 msec. While small gradual changes induce the sense of motion as in movies, big differences imply completely different subsequent spike trains. Here any spike train gets interrupted after a duration ΔT less than the canonical T. This means that the brain cannot decide among all coded perceptions having the same structure up to ΔT, but different afterwards. Whenever we stop the perceptual task at a ΔT shorter than the total time T, then the bin stretch T − ΔT (we measure the times in bin units) is not explored. This means that all stimuli which provide equal spike sequences up to ΔT, and differ afterwards by at least one spike, will cover an uncertainty region ΔP whose size is given by

ΔP = 2^S 2^(−aΔT) = P_M e^(−aΔT ln 2)    (1)
where P_M = 10^11 is the maximum perceptual size available with the chosen T ≈ 66.6 bins per perceptual session and rate r = 40 Hz. Relation (1) is very different from the standard uncertainty relation

ΔP · ΔT = C    (2)

that we would expect in a word-bin space ruled by Fourier-transform relations. Indeed, the transcendental equation (1) converges more rapidly at short and long ΔT than the hyperbola (2). We fit (1) by (2) in the neighborhood of a small uncertainty ΔP = 10 words, which corresponds to ΔT = 62 bins. Around ΔT = 62 bins the local uncertainty (2) yields a quantum constant

C = 10 · 62 = 620 words × bins    (3)
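The numbers quoted above follow directly from eq. (1); a quick numerical check, with P_M and a as given in the text:

```python
import math

P_M = 10**11        # maximum perceptual size (words)
a = 0.54            # reduction factor (bits per bin)

def delta_P(dT):
    """Eq. (1): uncertainty region after truncating the perceptual task at dT bins."""
    return P_M * math.exp(-a * dT * math.log(2.0))

# around dT = 62 bins the uncertainty has shrunk to roughly ten words,
# which combined with dT = 62 gives the local constant C of eq. (3)
dT = 62
C = 10 * 62
print(1.0 < delta_P(dT) < 100.0, C)
```

At ΔT = 0 the formula returns the full repertoire P_M, and it decays by one factor of 2 for every 1/a ≈ 1.85 bins of additional observation.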
To convert C into J·s, as for Planck's h, consider that: i) 1 bin = 3 ms; ii) in order to jump from an attractor corresponding to one perception to a nearby one, a minimal amount of energy is needed, corresponding to one spike; but one spike requires the energy corresponding to about 10^7 transitions ATP → ADP + P [Laughlin et al.], each one taking 0.3 eV; thus the total energy quantum is about 5 × 10^(−13) joules. The conversion then gives C ≈ 620 × (3 × 10^(−3) s) × (5 × 10^(−13) J) ≈ 10^(−12) J·s.
Quantum limitations were also put forward by Penrose [Penrose], but on a completely different basis. In his proposal, the quantum character was attributed to the physical behavior of the "microtubules", which are microscopic components of the neurons playing a central role in the synaptic activity. However, speaking of quantum coherence at the ℏ level in biological processes is not plausible, if one accounts for the extreme vulnerability of any quantum system to decoherence processes, which make quantum superposition effects observable only in extremely controlled laboratory situations, and at sub-picosecond time ranges, not relevant for synchronization purposes in the 10-100 msec range. Our tenet is that the quantum C-level in a living being emerges from the limited time available in order to take vital decisions; it is logically based on a non-commutative set of relevant variables and hence it requires the logical machinery built for the ℏ quantum description of the microscopic world, where non-commutativity emerges from the use of variables coming from macroscopic experience, such as coordinates and momenta, to account for new facts. A more precise consideration consists in classifying the spike trains. Precisely, if we have a sequence of identical spikes of unit area localized at erratic time positions τ_l, then the whole sequence is represented by

f(t) = Σ_l δ(t − τ_l)    (3)

where {τ_l} is the set of positions of the spikes. A temporal code, based on the mutual position of successive spikes, depends on the moments of the interspike interval distributions

ISI_l = τ_l − τ_(l−1).    (4)

Different ISIs encode different sensory information.
A time ordering within the sequence (3) is established by comparing the overlap of two signals as (3) mutually shifted in time. Weighting all shifts with a phase factor and summing up, this amounts to constructing a Wigner function [Wigner]

W(t, ω) = ∫_(−∞)^(+∞) f(t + τ/2) f(t − τ/2) exp(iωτ) dτ    (5)
If now f is the sum of two packets, f = f1 + f2, as in Fig. 8, the frequency-time plot displays an intermediate interference. Eq. (5) would provide interference whatever the time separation between f1 and f2. In fact, we know that the decision time T truncates a perceptual task, thus we must introduce a cutoff function g(τ) ≈ exp(−τ²/T²), which transforms the Wigner function into

W(t, ω) = ∫_(−∞)^(+∞) f(t + τ/2) f(t − τ/2) exp(iωτ) g(τ) dτ    (5′)

In the quantum jargon, the brain physiology breaks the quantum interference for time separations larger than T (decoherence phenomenon).
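The effect of the cutoff g(τ) in eq. (5′) can be illustrated numerically: for two packets, the uncut Wigner function (5) shows an interference term at the midpoint between them, which the cutoff suppresses. A sketch with Gaussian envelopes standing in for the two spike packets; the widths, separation and value of T are arbitrary choices for the illustration.

```python
import numpy as np

t = np.linspace(-50.0, 50.0, 801)
dt = t[1] - t[0]
# two packets f = f1 + f2, separated in time
f = np.exp(-(t + 15.0)**2 / 8.0) + np.exp(-(t - 15.0)**2 / 8.0)

def wigner_at(f, t, t0, omega, T=None):
    """W(t0, omega) from eq. (5); if T is given, the cutoff g of eq. (5') is applied."""
    tau = t
    fp = np.interp(t0 + tau / 2.0, t, f, left=0.0, right=0.0)
    fm = np.interp(t0 - tau / 2.0, t, f, left=0.0, right=0.0)
    g = np.ones_like(tau) if T is None else np.exp(-tau**2 / T**2)
    return np.real(np.sum(fp * fm * np.exp(1j * omega * tau) * g) * dt)

# at the midpoint t0 = 0 the overlap comes entirely from shifts tau near the
# packet separation; the cutoff with T much smaller than that kills it
W_full = wigner_at(f, t, t0=0.0, omega=0.0)        # eq. (5): interference survives
W_cut = wigner_at(f, t, t0=0.0, omega=0.0, T=5.0)  # eq. (5'): interference suppressed
print(W_full > 10.0 * abs(W_cut))
```

The midpoint term is the "quantum-like" interference of the text; truncating the task at T removes exactly those long mutual shifts.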
4. The role of the Wigner function in brain operations
We have seen that feature binding in perceptual tasks implies the mutual synchronization of axonal spike trains in neurons which can be even far away and yet contribute to a well defined perception by sharing the same pattern of spike sequences. The synchronization conjecture was given experimental evidence by inserting several micro-electrodes, each probing a single neuron, in the cortex of cats and then studying the temporal correlation in response to specific visual inputs [Singer et al.]. In the human case, indirect evidence is acquired by exposing a subject to transient patterns and reporting the time-frequency plots of the EEG signals [Rodriguez et al.]. Even though the space resolution is poor, phase relations among EEG signals coming from different cerebral areas at different times provide indirect evidence of the synchronization mechanism. The dynamics of homoclinic chaos (HC) was motivated by phenomena observed in lasers and then explored in its mathematical aspects, which display strong analogies with the dynamics of many biological clocks, in particular that of a model. HC provides almost equal spikes occurring at variable time positions and presents a region of high sensitivity to external stimuli; perturbations arriving within the sensitivity window easily induce a synchronization, either to an external stimulus or to each other (mutual synchronization) in the case of an array of coupled HC individuals (from now on called neurons).
But who reads the temporal information contained across synchronized and oscillatory spike trains? [MacLeod et al.]. In view of the above facts, we can model the encoding of external information on a sensory cortical area (e.g. V1 in the visual case) as a particular spike train assumed by an input neuron directly exposed to the arriving signal and then propagated by coupling through the array. As shown by the experiments on feature binding [Singer], we must transform the local time information provided by Eqs. (3) and (4) into a spatial information which tells the amount of a cortical area which is synchronized. If many sites have synchronized by mutual coupling, then the read-out problem consists in tracking the pattern of values (3), one for each site. Let us take for simplicity a continuous site coordinate r. In case of two or more different signals applied at different sites, a competition starts, and we conjecture that the winning information (that is, the one channeled to a decision center) corresponds to a "majority rule". Precisely, if the encoding layer is a 1-dimensional chain of N coupled sites activated by external stimuli at the two ends (i = 1 and i = N), the majority rule says that the prevailing signal is that which has synchronized more sites. The crucial question is then: who reads that information in order to decide upon it? We cannot resort to some homunculus who reads the synchronization state. Indeed, in order to be made of physical components, the homunculus itself should have some interpreter, which would be a new homunculus, and so on with a "regressio ad infinitum". On the other hand, it is well known that, as we map the interconnections in the vision system, V1 exits through the Ventral stream and the Dorsal stream toward the Inferotemporal Cortex and the Parietal Cortex, respectively.
The two streams contain a series of intermediate layers characterized by increasing receptive fields; hence they are cascades of layers where each one receives converging signals from two or more neurons of the previous layer. Let us take for the time being this feed-forward architecture as a network enabled to extract relevant information upon which to drive consequent actions. We show how this cascade of layers can localize the interface between two domains corresponding to different synchronizations. It is well known that ON/OFF cells with a center-surround configuration perform a first and second space derivative [Hubel]. Suppose this operation is done at a certain layer. At the successive one, as the converging process goes on, two signals will converge on a simple cell which then performs a higher order derivative, and so on. This way we build a power series of space derivatives. A translated function f(r+ξ) is then reconstructed by adding up many layers, as can be checked by a Taylor expansion. Notice that the alternative of exploring different neighborhoods ξ of r by varying ξ would imply a moving pointer to be set sequentially at different positions, and there is nothing like that in our physiology.
The next step consists in comparing the function f(r+ξ) with a suitable standard, to decide upon its value. Since there are no metrological standards embedded in a living brain, such a comparison must be done by comparing f with a shifted version of itself, something like the product f(r+ξ)·f(r−ξ). Such a product can naturally be performed by converging the two signals f(r±ξ) onto the same neuron, exploiting the nonlinear (Hebbian) response characteristic limited to the lowest quadratic nonlinearity, thus taking the square of the sum of the two converging inputs and isolating the double product. This operation is completed by summing up the different contributions corresponding to different ξ, with a kernel which keeps track of the scanning over different ξ, keeping information on different domain sizes. If this kernel were just a constant, then we would retrieve a trivial average which cancels the ξ information. Without losing generality, we adopt a Fourier kernel exp(ikξ) and hence have built the quantity

W(r, k) = ∫ f(r+ξ) f(r−ξ) e^{ikξ} dξ .
It contains information on both the space position r around which we are exploring the local behavior, and the frequency k which is associated with the space resolution. As is well known, it contains the most complete information compatible with the Fourier uncertainty.
Notice that building a joint information on locality (r) and resolution (k) by physical measuring operations implies such an intrinsic limitation.
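The intrinsic limitation invoked here is the standard Fourier (Gabor) uncertainty between a location variable and its conjugate resolution; in the notation of the text it can be written as follows (the precise numerical factor depends on how the widths are defined, so the constant 1/2 below is the usual Gaussian-optimal convention, not a quotation from the original):

```latex
\Delta r \,\Delta k \;\ge\; \tfrac{1}{2}
```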
In summary, it appears that the Wigner function is the best read-out of a synchronized layer that can be done by exploiting natural machinery, rather than resorting to a homunculus. The local value of the Wigner function represents a decision to be sent to motor areas triggering a suitable action. In order to have a suitable description of the feature binding mechanism in terms of a Wigner function in time (at a single site) and space (over an array of sites) we need an evolution equation. But which are the physical objects whose evolution describes the phenomenology? I conjecture that the transition from de-coupled to fully synchronized neurons is controlled by the dynamics of defects (see Fig. 7). A preliminary treatment of defects as harmonic oscillators is summarized in Fig. 9. The total Hamiltonian then rules the evolution equation for the Wigner function and should provide a complete quantum description.
[Figure 7 here: space-time plots of spike positions ("Synchronization patterns in arrays of homoclinic chaotic systems") for a linear array of 40 sites (horizontal axis: site index), panels for coupling ε = (a) 0.0; (b) 0.05; (c) 0.1; (d) 0.12; (e) 0.25.]
Fig. 7. Space-time representation of spike positions for a linear array of 40 neurons and for different amounts of nearest-neighbor coupling ε. At ε = 0.25 percolation has been reached, in the sense that spikes at all sites are connected (besides a mutual time lag due to the transfer operation); at ε = 0 there is no correlation at all among different sites; in between, a partial synchronization with the evidence of defects as phase slips (one spike more, or less, with respect to the neighbor site).
The oscillating interference of Fig. 8 is crucial for quantum computation. The reason why we don't see it in ordinary life is "decoherence". Let us explain what we mean. If the system under observation is affected by the environment, then the two states of the superposition have a finite lifetime; even worse, however, the interference decays on a much shorter time (the shorter, the larger the separation between the two states): so while the separation is still manageable for the two polarization states of a photon, it becomes too big for the two states of a macroscopic quantum system. Precisely, if we call τ the intrinsic decay time of each one of the two states of the superposition, then the mutual interference in the Wigner function decays with the so-called decoherence time

τ_dec = τ / (D²/ħ)     (9)
where D² is the square of the separation D in phase space between the centers of the Wigner functions of the two separate states. Notice that, in microscopic physics, D² is measured in units of ħ. Usually D²/ħ is much bigger than unity for a macroscopic
system and hence τ_dec is so short that any reasonable observation time is too long to detect a coherent superposition. Instead, in neurodynamics, we perform measurement operations which are limited by the intrinsic sizes of the space and time resolutions peculiar to brain processes. The associated uncertainty constant C is such that it is very easy to have a relation like (9) with D²/C comparable to unity, and hence superposition lifetimes comparable to the times of standard neural processes. This implies the conjecture that for short times or close cortical domains a massive parallelism typical of quantum computation should be possible. The threshold readjustment due to expectations arising from past memory acts as an environmental disturbance, which orthogonalizes different neural states, destroying parallelism.
Fig. 8: Wigner distribution of two localized sinusoidal packets shown at the top. Bottom: frequency-time representation of the levels of the (real but not always positive!) Wigner function. The oscillating interference is centered at the middle time-frequency location.
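The interference structure described in Fig. 8 is easy to check numerically. The sketch below (my own minimal construction, not the authors' code) computes a Wigner transform W(x,k) = ∫ f(x+ξ/2) f(x−ξ/2) e^{−ikξ} dξ of a superposition of two Gaussian packets by direct quadrature and verifies the two properties quoted in the caption: W is real, and the cross term midway between the packets takes negative values. All parameter values are illustrative.

```python
import numpy as np

# Wigner-type transform of a superposition of two Gaussian packets centered
# at +/- x0. Midway between them (x = 0) the cross term oscillates in k and
# becomes negative: the interference discussed in the text.
# Illustrative parameters only.
x0, sigma = 3.0, 1.0
xi = np.linspace(-20.0, 20.0, 4001)          # integration grid for xi

def f(x):
    return (np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))
            + np.exp(-(x + x0) ** 2 / (2 * sigma ** 2)))

def wigner(x, k):
    """Direct quadrature of W(x,k) = int f(x+xi/2) f(x-xi/2) exp(-i k xi) dxi."""
    integrand = f(x + xi / 2) * f(x - xi / 2) * np.exp(-1j * k * xi)
    return np.sum(integrand) * (xi[1] - xi[0])

ks = np.linspace(-3.0, 3.0, 301)
w_mid = np.array([wigner(0.0, k) for k in ks])

print(np.max(np.abs(w_mid.imag)) < 1e-9)   # W is real: f is real and even here
print(np.min(w_mid.real) < 0.0)            # negativity = interference fringes
```

Replacing ħ by the neural uncertainty constant C discussed above leaves this structure unchanged; only the scale of the fringes differs.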
[Figure 9 inset: dynamical model for a synchronized HC array,

∂x/∂t = f(x) + ε ∇²x ;

linearize around the saddle focus.]
Fig. 9. A summary of the strategy to characterize the quantum aspects of the time code. Here x is an N-dimensional (N = 6 for HC) dynamical field depending on position r. Its local dynamics is given by the equation f(x). Furthermore, the space coupling of strength ε is given by the Laplacian. Linearizing the equation at parameter values below the percolation threshold, Fourier transforming and diagonalizing the linearized equation, we obtain harmonic oscillators. Their coordinates and momenta do not commute, because of the fundamental energy-time uncertainty with the quantum constant C. From the k-dependence of the oscillator amplitudes a(k) one reconstructs the space features of the defects.
References
1. Allaria E., Arecchi F.T., Di Garbo A., Meucci R. 2001 "Synchronization of homoclinic chaos", Phys. Rev. Lett. 86, 791.
2. Arecchi F.T. 1987a "Instabilities and chaos in single mode homogeneous line lasers", in: Instabilities and Chaos in Quantum Optics (eds. F.T. Arecchi & R.G. Harrison), Springer Series in Synergetics, Vol. 34, pp. 9-48.
3. Arecchi F.T., Meucci R., Gadomski W. 1987b "Laser dynamics with competing instabilities", Phys. Rev. Lett. 58, 2205.
4. Arecchi F.T., Meucci R., Allaria E., Di Garbo A., Tsimring L.S. 2002 "Delayed self-synchronization in homoclinic chaos", Phys. Rev. E 65, 046237.
5. Arecchi F.T., Allaria E., Leyva I. 2003 "A propensity criterion for networking in an array of coupled chaotic systems", Phys. Rev. Lett. 91, 234101.
6. Edelman G.M. and Tononi G. 1995 "Neural Darwinism: The brain as a selectional system", in "Nature's Imagination: The frontiers of scientific vision", J. Cornwell, ed., pp. 78-100, Oxford University Press, New York.
7. Freeman W.J. 1991 "The Physiology of Perception", Sci. Am. 264(2), 78-85.
8. Grossberg S. 1995 "The attentive brain", The American Scientist 83, 439.
9. Hubel D.H. 1995 "Eye, Brain and Vision", Scientific American Library, n. 22, W.H. Freeman, New York.
10. Izhikevich E.M. 2000 "Neural Excitability, Spiking, and Bursting", Int. J. of Bifurcation and Chaos 10, 1171.
11. Laughlin S.B., de Ruyter van Steveninck R., Anderson J. 1998 "The metabolic cost of neural information", Nature Neuroscience 1, 36.
12. Leyva I., Allaria E., Boccaletti S. and Arecchi F.T. 2003 "Competition of synchronization patterns in arrays of homoclinic chaotic systems" (eprint: nlin.PS/0302008).
13. MacLeod K., Backer A. and Laurent G. 1998 "Who reads temporal information contained across synchronized and oscillatory spike trains?", Nature 395, 693-698.
14. Meucci R., Di Garbo A., Allaria E., Arecchi F.T. 2002 "Autonomous Bursting in a Homoclinic System", Phys. Rev. Lett. 88, 144101.
15. Penrose R. 1994 "Shadows of the Mind", Oxford University Press, New York.
16. Rabinovich M., Volkovskii A., Lecanda P., Huerta R., Abarbanel H. and Laurent G. 2001 "Dynamical encoding by networks of competing neuron groups", Phys. Rev. Lett. 87, 068102.
17. Rieke F., Warland D., de Ruyter van Steveninck R. and Bialek W. 1996 "Spikes: Exploring the Neural Code", MIT Press, Cambridge, Mass.
18. Rodriguez E., George N., Lachaux J.P., Martinerie J., Renault B. and Varela F. 1999 "Perception's shadow: Long-distance synchronization in the human brain", Nature 397, 340-343.
19. Singer W. and Gray C.M. 1995 "Visual feature integration and the temporal correlation hypothesis", Annu. Rev. Neurosci. 18, 555.
20. Von der Malsburg C. 1981 "The correlation theory of brain function", reprinted in E. Domany, J.L. van Hemmen and K. Schulten (eds.), "Models of Neural Networks II", Springer, Berlin.
21. Wigner E. 1932 "On the Quantum Correction For Thermodynamic Equilibrium", Phys. Rev. 40, 749.
22. Zhou C.S., Allaria E., Boccaletti S., Meucci R., Kurths J. and Arecchi F.T. 2003 "Noise induced synchronization and coherence resonance of homoclinic chaos", Phys. Rev. E 67, 015205.
ENERGETIC MODEL OF TUMOR GROWTH
P. CASTORINA AND D. ZAPPALÀ
INFN, Sezione di Catania and Dept. of Physics, University of Catania, Via S. Sofia 64, I-95123 Catania, Italy
E-mail: [email protected]
A macroscopic model of the tumor Gompertzian growth is proposed. This approach is based on the energetic balance among the different cell activities, described by methods of statistical mechanics and related to the growth inhibitor factors. The model is successfully applied to the multicellular tumor spheroid data.
1. Introduction
A microscopic model of tumor growth in vivo is still an open problem. However, in spite of the large set of potential parameters due to the variety of the in situ conditions, tumors have a peculiar growth pattern that is generally described by a Gompertzian curve 1, often considered as a pure phenomenological fit of the data. More precisely, there is an initial exponential growth (until 1-3 mm in diameter) followed by the vascular Gompertzian phase 2. It then seems reasonable to think that cancer growth follows a general pattern that one can hope to describe by macroscopic variables; following this line of research, for example, the universal model proposed in 3 has been recently applied to cancer 4. In this talk we present a macroscopic model of tumor growth, proposed in 5, that: i) gives an energetic basis to the Gompertzian law; ii) clearly distinguishes between the general evolution patterns, which include the internal feedback effects, and the external constraints; iii) can give indications on the different tumor phases during the evolution. The proposed macroscopic approach is not in competition with microscopic models 6, but is a complementary instrument for the description of the tumor growth.

2. Cellular energetic balance and Gompertzian growth
The Gompertzian curve is the solution of the equation

dN/dt = γ N ln(N_max/N)     (1)

where N(t) is the cell number at time t, γ is a constant and N_max is the theoretical saturation value for t → ∞.
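As a numerical check, Eq. (1) can be integrated directly and compared with the closed-form Gompertz solution N(t) = N_max exp[ln(N₀/N_max) e^{−γt}]. The sketch below is my own illustration with arbitrary parameter values, not a fit to any data set.

```python
import math

# Integrate dN/dt = gamma * N * ln(Nmax/N) by simple Euler stepping and
# compare with the closed-form Gompertz solution
#   N(t) = Nmax * exp( ln(N0/Nmax) * exp(-gamma * t) ).
# Parameter values are illustrative only.
gamma, n_max, n0 = 0.5, 1.0e9, 1.0e3

def gompertz_exact(t):
    return n_max * math.exp(math.log(n0 / n_max) * math.exp(-gamma * t))

def gompertz_euler(t_end, dt=1e-4):
    n, t = n0, 0.0
    while t < t_end - 1e-12:
        n += dt * gamma * n * math.log(n_max / n)   # Eq. (1)
        t += dt
    return n

t = 10.0
print(abs(gompertz_euler(t) / gompertz_exact(t) - 1) < 1e-2)  # agree within 1%
```

The early phase is nearly exponential (ln(N_max/N) is almost constant for N << N_max) and the growth rate shuts off as N approaches N_max, which is exactly the feedback discussed next.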
It is quite natural to identify the right hand side of (1) as the number of proliferating cells at time t and then to consider f_p(N) = γ ln(N_max/N) as the fraction of proliferating cells and 1 − f_p(N) = f_np as the fraction of non-proliferating cells. Since f_p(N) depends on N(t), there is a feedback mechanism, usually described by introducing some growth inhibitor factors which increase with the number of non-proliferating cells and are responsible for the saturation of the tumor size. The concentration of inhibitor factors should be proportional to the number of non-proliferating cells, which is maximum at N = N_max 7. If one considers that, during the growth, each cell shares out its available energy at time t, on average, among its metabolic activities, the mechanical work (associated with the change of the tumor size and shape) and the increase of the number of cells, it is conceivable to translate the previous cellular feedback effect in terms of energy content. Indeed, as shown in 5, the specific energy for the growth should be proportional to f_np and the average metabolic plus mechanical energy per cell, M_e, is proportional to f_p. As we shall see, this reproduces the observed cellular feedback. The model 5, based on an analogy with statistical mechanics, assumes that in a larger system B, the body, there is a subsystem A, the tumor, made of N(t) cells at time t, with total energy U, which has specific distributive mechanisms for providing, on average, the amount of energy U/N to each cell. Then we indicate with E_M the energy needed for the metabolic activities of A, with Ω the energy associated with the mechanical work required for any change of size and shape of A, and with μ the specific energy (i.e. per cell) correlated to the change in the number of cells N; by assuming that these three processes summarize the whole cellular activity, we have U = E_M + Ω + μN.
Let us assume that the system A slowly evolves through states of equilibrium with the system B, defined by macroscopic variables, analogous for instance to the inverse temperature β, that have the same value for the two systems, although it should be clear that in our case we do not have real thermodynamical equilibrium because the system B supplies the global energy for the slow evolution of the subsystem A. Within this scheme, there are many microscopic states of the system A compatible with the macroscopic state, defined by β, μ and V, which are built by a large number of states of each single cell. These microscopic states of each cell have minimum total energy ε and, in an extremely simplified picture, we assume an energy spectrum of the form ε_l = ε + lδ, where l is an integer and δ is the minimum energy gap between two states. With this spectrum the grand partition function Z is given by the product Z(β, V, μ) = Π_l exp[e^{−β(ε_l − μ)}], and the corresponding grand potential, which it is natural to associate to the energy Ω related to the mechanical work in our problem, is given by Ω(β, V, μ) = −(R/β) exp[−β(ε − μ)], where R = 1/(1 − e^{−βδ}). The average value of N, defined for constant V and β, turns out to be N = −βΩ. According to the basic rules of statistical mechanics, the product of the "entropy times the temperature", which in our system corresponds to E_M introduced above,
is

E_M = (N/β) [1 + C + ln(R/N)]     (2)

where C is given by C = Rβδ exp(−βδ). From the previous equations it is straightforward to express μ in terms of N: μ = ε + (1/β) ln(N/R). To find the evolution of the system A which takes into account the internal feedback mechanism we recall some results obtained in 5:
the energetic balance requires that the growth with cellular feedback starts at a minimum number of cells N_m and saturates at a maximum value N_max, related by N_max = N_m exp(1 + βε). For N_m >> 1, M_e = E_M/N + Ω/N = (1/β) ln(N_max/N) is a decreasing function of N > N_m and there is a simultaneous reduction of the total metabolic energy per cell and an increase of the specific energy required for the growth: there is an energetic balance between M_e and μ = (1/β)(1 + βε + ln(N/N_max)). Moreover, M_e is proportional to f_p(t).
Then it is possible to derive the Gompertz equation for the growth ΔN in an interval Δt. ΔN is proportional to the number of proliferating cells and then one can write ΔN = c₁ Δt f_p(t) N = c₂ Δt M_e N, where c₁ and c₂ are constants. This gives in the continuum limit Eq. (1) with γ = c₂/β.
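For reference, the chain leading from the assumed single-cell spectrum to the relations used above can be written out explicitly. This is a reconstruction consistent with the OCR-damaged formulas of this section (β the inverse temperature, μ the specific energy per cell, ε_l = ε + lδ the single-cell spectrum), not a verbatim quotation:

```latex
\begin{aligned}
Z(\beta,V,\mu) &= \prod_{l=0}^{\infty} \exp\!\big[e^{-\beta(\epsilon_l-\mu)}\big],\\
\Omega(\beta,V,\mu) &= -\frac{1}{\beta}\ln Z
  = -\frac{R}{\beta}\,e^{-\beta(\epsilon-\mu)},
  \qquad R=\frac{1}{1-e^{-\beta\delta}},\\
N &= -\Big(\frac{\partial\Omega}{\partial\mu}\Big)_{\beta,V}
  = R\,e^{-\beta(\epsilon-\mu)} = -\beta\,\Omega .
\end{aligned}
```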
3. Application to Multicellular Tumor Spheroids

The first step to analyze the phenomenological implications of the model and to describe the dependence of the growth on the external conditions is to consider the multicellular tumor spheroids (MTS). A minimal MTS description consists of a spherical growth where a) the thickness, k, of the layer where the nutrient and oxygen are delivered (the crust) is independent of the spheroid radius R; b) the cell density is constant; c) the cells in the crust receive a constant supply of nutrient per cell; d) at time t the cells are non-proliferating if they are at distances d < R − k (if R > k) from the center of the spheroid. For R < k all cells are proliferating. To separate the effects of the external constraints due to energy supply from those related to biomechanical conditions, it is better to consider first the MTS growth without external and internal stress and to introduce later these effects.
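Under assumptions (a)-(d), the fraction of proliferating cells follows from simple geometry: with constant density, cells proliferate only in the crust of thickness k, so f_p = 1 for R ≤ k and f_p = [R³ − (R−k)³]/R³ for R > k. A minimal sketch of this bookkeeping (my own illustration; the crust thickness and radii are arbitrary numbers, not spheroid data):

```python
# Fraction of proliferating cells in the minimal MTS picture: constant cell
# density, proliferation only within a crust of fixed thickness k below the
# surface. Illustrative numbers; not fitted to any data set.

def proliferating_fraction(radius, crust=100.0):
    """f_p for a spheroid of given radius (same length unit as crust)."""
    if radius <= crust:
        return 1.0                       # no quiescent core yet: all proliferate
    core = (radius - crust) ** 3         # inner-core volume, up to 4*pi/3
    return 1.0 - core / radius ** 3      # crust volume / total volume

print(proliferating_fraction(50.0))      # below crust thickness: f_p = 1
print(proliferating_fraction(200.0))     # core appears, f_p drops below 1
```

As R grows at fixed k this fraction decreases monotonically, which is the geometric counterpart of the feedback that switches on the Gompertzian phase.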
3.1. Energetic MTS growth

In this case the external conditions are experimentally modified by changing the oxygen and nutrient concentration in the environment. At fixed values of these concentrations, the maximum allowed number of cells in the MTS is N_max. For R < k all cells receive the nutrient and oxygen supply, while for R > k there is a fraction of non-proliferating cells and the feedback effect starts. The growth of the MTS is due to the proliferating cells in the crust and one obtains that the MTS radius R, for
R >> k, follows a Gompertzian law as the one in Eq. (1), with N replaced by R and N_max by R_max, where R_max is the maximum radius of the spheroid, corresponding to the maximum number of tumor cells N_max. The experimental results show that after 3-4 days of initial exponential growth the spheroids essentially follow the Gompertzian pattern. According to our model, at time t > t*, such that R(t*) = k, a fraction of the total cells becomes non-proliferating, the feedback effect starts and the growth rate decreases according to the Gompertz law. The number of cells at time t* is fixed by the condition N(t*) = N_m 5. On the other hand, the variation of the concentration of nutrient and/or of oxygen modifies the total energy supply, that is the value of N_max, and, since N_m = N_max exp(−1 − βε), there is a clear correlation among the external energetic "boundary conditions", the value N_max and the thickness of the viable cell rim which corresponds to the radius of the onset of necrosis. It can be shown that (G_c is the glucose concentration):
k(G_c) = α (N_tot^{1/3} − N_max^{1/3}) + k₀     (3)
where α and k₀ are constants depending on the supplied oxygen. From Eq. (3) one obtains the correlation among N_max, G_c and k. In Fig. 1 and Fig. 2 the previous behaviors are compared with data without optimization of the parameters.
Figure 1. Thickness (μm) vs. glucose concentration (mM). Figure (a) is for an oxygen concentration of 0.28 mM and Figure (b) is for an oxygen concentration of 0.07 mM.
Figure 2. Spheroid saturation cell number vs. diameter (μm) at which necrosis first develops. Circles refer to culture in 0.28 mM of oxygen. Triangles refer to culture in 0.07 mM of oxygen.
Table 1. Comparison with the experimental data as discussed in the text. The experimental error is about ±40%.

C_g (percent)   2R_max(P) [μm] exper.   2R_max(P) [μm] fit
0.3             450                     452
0.5             414                     429
0.7             370                     404
0.8             363                     394

3.2. Biomechanical effects
The experimental data indicate that when MTS are under a solid stress, obtained for example by a gel, the cellular density ρ is not constant and depends on the external gel concentration C_g. In particular the results show that: 1) an increase of the gel concentration inhibits the growth of MTS; 2) the cellular density at saturation increases with the gel concentration. In the model the mechanical energy is included in the energetic balance of the system by the term Ω = −N/β = −PV, where the pressure is P(t) = ρ(t)/β. The introduction of this term decreases the value of N_max with respect to the case in Sect. 3.1 and this reduction should also imply a decrease of the maximum size of the spheroids, i.e. of R_max(P), by increasing the pressure. The comparison with the data is reported in Table 1 for C_g in the range 0.3 - 0.8% (see 5 for details).
References
1. B. Gompertz, Phil. Trans. R. Soc. 115, 513 (1825).
2. G.G. Steel, "Growth Kinetics of Tumors", Oxford Clarendon Press, 1977; Cell Tissue Kinet. 13, 451 (1980); T.E. Weldon, "Mathematical Models in Cancer Research", Adam Hilger Publisher, 1988 and refs. therein.
3. G.B. West et al., Nature 413, 628 (2001).
4. C. Guiot et al., J. Theor. Biol. 225, 147 (2003).
5. P. Castorina and D. Zappalà, "Tumor Gompertzian growth by cellular energetic balance", q-bio.CB/0407018.
6. M. Marusic et al., Cell Prolif. 27, 73 (1994); Z. Bajzer et al., in: "Survey of Models for Tumor-Immune System Dynamics", J.A. Adams and N. Bellomo eds., Birkhäuser 1997; A. Bru et al., Phys. Rev. Lett. 81, 4008 (1998); Z. Bajzer, Growth Dev. Aging 63, 3 (1999); N. Bellomo et al., "Mathematical topics on the modelling of complex multicellular systems and tumor immune cells competition", Preprint: Politecnico di Torino, 2004.
7. J.P. Freyer, R.M. Sutherland, Cancer Research 46, 3504 (1986).
8. L. Norton et al., Nature 264, 542 (1976); L. Norton, Cancer Research 48, 7067 (1988).
9. G. Helmlinger et al., Nature Biotechnology 15, 778 (1997).
ACTIVE BROWNIAN MOTION - STOCHASTIC DYNAMICS OF SWARMS
WERNER EBELING AND UDO ERDMANN
Institut für Physik, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany
E-mail: [email protected], [email protected]
We summarize the essential features of the new model of active Brownian dynamics and applications ranging from the dynamics of molecular clusters in non-equilibrium to moving swarms of animals.
1. Characteristics of the dynamics of clusters and swarms
This paper covers a wide range of related phenomena reaching from the dynamics of molecular clusters in non-equilibrium to moving swarms of animals. We introduce the general notion of "swarms" for confined systems of particles (or more generally objects) in non-equilibrium. The study of non-equilibrium clusters of molecules began more than 70 years ago with the pioneering papers of Farkas, Becker and Döring. However most of these studies are restricted to near-equilibrium phenomena, such as moving over a threshold of the free energy. Compared to the theory of clusters, the study of objects like swarms of animals is a rather young field of physical studies (see e.g. Refs. 1, 2, 3, 4). Since the dynamics of swarms of driven particles has captured the interest of theorists, many interesting effects have been revealed and in part already explained. We mention the comprehensive survey of Okubo and Levin 5 on swarm dynamics in biophysical and ecological respect. Further we mention the survey of Helbing 1 covering traffic and related self-driven many-particle systems and the comprehensive books of Vicsek 2, Mikhailov and Calenbuhr 3 and of Schweitzer 4. In the book of Okubo and Levin 5 we find a classification of the modes of collective motion of swarms of animals. It is discussed that animal groups have three typical modes of motion: (i) translational motions, (ii) rotational excitations and (iii) amoeba-like motions. For example, Ordemann, Balazsi and Moss 6,7 studied the modes of motion of Daphnia. Depending on the existence of an external light source a whole swarm of these
animals switches from an uncorrelated type of motion to a very correlated one. The whole swarm then starts to rotate. At present it seems impossible to describe all the complex collective motions observed in nature. Instead we study in the following the collective modes and the distribution functions of a simple model. We investigate finite systems of particles confined by attracting forces, which are self-propelled by active friction and have some interactions. This is considered as a rough model for the collective motion of non-equilibrium clusters and of swarms of cells and organisms as well 8. For alternative models based on velocity-velocity interactions see Refs. 2, 9, 10. From the point of view of statistical mechanics the main purpose of this work is the study of the dynamics of active Brownian particles including interactions. The self-propelling of the particles is modeled by active friction as introduced in earlier work 11. The interaction between the particles is modeled by harmonic (linear) forces or by Morse potentials. The consideration is restricted to 2-d models. Driving the system by negative friction we may bring the system to far-from-equilibrium states. Studies of one-dimensional models have shown that driven interacting systems may have many attractors 12,13. Noise may lead to transitions between the deterministic attractors. In the case of two-dimensional motion of interacting particles, positive or negative angular momenta may be generated. This may lead to left/right rotations of pairs, clusters and swarms. We will show that the collective motion of large clusters of driven Brownian particles is strongly reminiscent of the typical modes of parallel motion in swarms of living entities.

2. Dynamics in external fields in rigid body approximation
We introduce interactions described by the potential U(r₁, ..., r_N) and postulate a dynamics of Brownian particles determined by the Langevin equation

ṙᵢ = vᵢ ,   m v̇ᵢ = Fᵢ(diss) − ∇ᵢU + √(2D) ξᵢ(t) ,

where ξ(t) is a stochastic force with strength D and a δ-correlated time dependence. The dissipative forces are expressed in the form

Fᵢ(diss) = −m γ(vᵢ) vᵢ .
The coefficient γ denotes a velocity-dependent friction, which possibly has a negative part. This way the dynamics of our Brownian particles is determined by a Langevin equation with dissipative contributions. In the case of thermal equilibrium systems we have γ(v) = γ₀ = const. In the general case where the friction is velocity dependent we will assume that the friction is monotonically increasing with the velocity and converges to γ₀ at large velocities. In the following we will use an ansatz based on the depot model for the energy supply 11,14
γ(v) = γ₀ − q d / (c + d v²)     (4)
where c, d, q are certain positive constants characterizing the energy flows from the depot to the particle. Depending on the parameters γ₀, c, d, and q the dissipative force function may have one zero at v = 0 or two more zeros with

v₀² = (c/d) δ ,   δ = q d/(c γ₀) − 1 .     (5)

Here δ is a bifurcation parameter. In the case δ > 0 a finite characteristic velocity v₀ exists. Then we speak about active particles. For |v| < v₀ the dissipative force is positive, i.e. the particle is provided with additional free energy. Hence slow particles are accelerated, while the motion of fast particles is damped (see Fig. 1). The asymptotics for large velocities is passive. Now we will discuss the motion of
Figure 1. The typical form of a friction function with an active (negative) part at small velocities, shown for the parameter values δ = 0, 1, 1.2, 2.
active particles in a two-dimensional space, r = {x₁, x₂}. The case of constant external forces was already treated by Schienbein et al. 15,16. Symmetric parabolic external forces were studied in Refs. 11, 17 and the non-symmetric case is being investigated in Ref. 18. Here we will study 2d-systems of N ≥ 2 particles. Let us imagine a swarm of active Brownian particles which are pairwise bound to a cluster which is rigid. Then the problem is restricted to the motion of the center of mass,

X₁ = (1/N) Σᵢ x_{i1} ,   X₂ = (1/N) Σᵢ x_{i2} .     (6)

We consider here only the free motion of the center of mass and the motion in an external field. The relative motion under the influence of the interaction is neglected so far. The free motion of the center of mass M is described by the equations

Ẋ₁ = V₁ ,   Ẋ₂ = V₂ ,
M V̇₁ = −M γ(V₁, V₂) V₁ + √(2D) ξ₁(t) ,     (7a)
M V̇₂ = −M γ(V₁, V₂) V₂ + √(2D) ξ₂(t) .     (7b)
The stationary solution of the corresponding Fokker-Planck equation reads 11

P⁰(V) = C [1 + (d/c) V²]^{q/(2D_v)} exp(−γ₀ V²/(2D_v)) ,     (8)

where D_v = D/m². For simplification we specify now the potential U as

U(X₁, X₂) = (a/2)(X₁² + X₂²) .     (9)
First, we discuss the deterministic motion, which is described by four coupled first-order differential equations. The motion of the center of mass corresponds to the motion of one particle in an external field:

Ẋ₁ = v₁ ,   Ẋ₂ = v₂ ,
m v̇₁ = −m γ(v₁, v₂) v₁ − a X₁ ,     (10a)
m v̇₂ = −m γ(v₁, v₂) v₂ − a X₂ .     (10b)
For this case we have shown earlier 11 that a limit cycle in the four-dimensional space develops, which corresponds to left/right rotations with the frequency ω₀. The projections of this periodic motion onto the planes {X₁, X₂} and {V₁, V₂} are circles,

X₁² + X₂² = r₀² ,   V₁² + V₂² = v₀² .     (11)
The trajectories converge to the limit cycles and the energy converges to

H → E₀ = m v₀² .     (12)
This corresponds to an equal distribution between kinetic and potential energy. In explicit form we may represent the motion on the limit cycle in the four-dimensional space by the four equations 11
X₁ = r₀ cos(ω₀t + α) ,   X₂ = r₀ sin(ω₀t + α) ,     (13a)
V₁ = −r₀ω₀ sin(ω₀t + α) ,   V₂ = r₀ω₀ cos(ω₀t + α) .     (13b)
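The convergence to this limit cycle can be illustrated by integrating the deterministic equations (10) with the depot friction (4). The sketch below is my own illustration with simple Euler stepping and arbitrary parameter values (chosen so that v₀ = 1 and ω₀ = √(a/m) = 1); it starts off the attractor and checks that the trajectory settles on the circle of radius r₀ = v₀/ω₀ traversed with speed v₀.

```python
import math

# Deterministic motion (10): dx/dt = v, m dv/dt = -m*gamma(|v|)*v - a*x, with
# depot friction gamma(v) = gamma0 - q*d/(c + d*v^2) from Eq. (4).
# For these illustrative parameters v0 = 1 and omega0 = 1, so the attractor
# is the circle r0 = v0/omega0 = 1 with constant speed v0.
gamma0, q, c, d, a, m = 1.0, 2.0, 1.0, 1.0, 1.0, 1.0
v0 = math.sqrt(q / gamma0 - c / d)       # = sqrt((c/d)*delta), Eq. (5)

x1, x2, v1, v2 = 1.5, 0.0, 0.0, 0.4      # initial state off the attractor
dt = 1e-3
for _ in range(200_000):                  # integrate up to t = 200
    gam = gamma0 - q * d / (c + d * (v1 * v1 + v2 * v2))
    a1 = (-m * gam * v1 - a * x1) / m
    a2 = (-m * gam * v2 - a * x2) / m
    x1 += dt * v1
    x2 += dt * v2
    v1 += dt * a1
    v2 += dt * a2

print(abs(math.hypot(v1, v2) - v0) < 0.02)   # speed locked to v0
print(abs(math.hypot(x1, x2) - v0) < 0.02)   # radius locked to r0 = v0/omega0
```

Reversing the initial sense of rotation selects the second limit cycle with opposite angular momentum, in line with the discussion that follows.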
The frequency is given by the time the particle needs for one period moving on the circle of radius r₀ with constant speed v₀. This leads to ω₀ = v₀/r₀ and means that the particles oscillate with the frequency given by the linear oscillator frequency. The trajectory on the limit cycle defined by equations (13) is like a hula hoop in the four-dimensional space. The projections onto the {X₁, X₂} space as well as the projections onto the {V₁, V₂} space are circles. The projections onto the subspaces {X₁, V₂} and {X₂, V₁} are like a rod. In the four-dimensional space the attractor has therefore the form of a hula hoop. A second limit cycle is obtained by reversal of the velocity. This second limit cycle also forms a hula hoop, which is different from the first one; however, both limit cycles have the same projections onto the {X₁, X₂} and the {V₁, V₂} planes. The motion in the {X₁, X₂} plane has the opposite sense of rotation in comparison with the first limit cycle. Therefore both limit cycles correspond to opposite angular momenta, L₃ = +M r₀ v₀ and L₃ = −M r₀ v₀. Applying similar arguments to the stochastic problem we find that the two hoop rings are converted into a distribution looking like two embracing hoops of finite
size, which for strong noise converts into two embracing tires in the four-dimensional space. In order to get the explicite form of the distribution we may introduce amplitude-phase representations". The probability crater is located above the two deterministic limit cycles on the sphere T O = vo/w0. Strictly speaking not the whole spherical set is filled with probability but only two circle-shaped subsets on it, which correspond to a narrow region around the limit sets, The full stationary probability has the form of two hula hoop distributions in the four-dimensional space. This was confirmed by simulations". The projections of the distribution to the (21, 2 2 ) plane and to the {q,v2) plane are smoothed two-dimensional rings. The distributions intersect perpendicular the (21, v2) plane and the {x2,wl} plane. Due t o the noise the Brownian particles may switch between the two limit cycles, this means inversion of the angular momentum (direction of rotation)ll,ls. As a result rotating clusters are getting unstable similar to asymmetric driven ascillators18. 3. Dynamics of self-confined Morse clusters of driven particles The study of the full many-body dynamics of interacting driven particles including drift is an extremely difficult task. Therefore we will present here first the result of some simulations. In particular we studied Morse interactions described by the interaction potential @(T)
Φ(r) = A [exp(-a(r - r0)) - 1]² - A    (14)
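A minimal sketch of the Morse pair potential in the reconstructed form Φ(r) = A[exp(-a(r - r0)) - 1]² - A; the names of the stiffness parameter a and the equilibrium distance r0 are ours.

```python
import math

def morse(r, A=1.0, a=1.0, r0=1.0):
    """Morse pair potential Phi(r) = A*[exp(-a*(r - r0)) - 1]**2 - A.

    The minimum Phi(r0) = -A sits at the equilibrium distance r0, and
    Phi(r) -> 0 for r -> infinity, so A is the binding energy.
    """
    return A * (math.exp(-a * (r - r0)) - 1.0) ** 2 - A

assert morse(1.0) == -1.0        # minimum at r0 with depth -A
assert abs(morse(50.0)) < 1e-10  # potential vanishes at large distance
assert morse(0.5) > morse(1.0)   # repulsive core at short distance
```

The `- A` shift makes the potential vanish at infinity, so bound cluster states have negative energy.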
Our simulations for swarms with Morse interactions show rotating clusters (Fig. 2). We observe rotations changing the sense of rotation from time to time due to
Figure 2. The two possible stationary states of a rotating cluster of 20 particles. The arrows correspond to the velocities of the single particles. In the presence of noise the cluster changes the direction of rotation from time to time.
stochastic effects. Further we see a slow drift of the clusters. The rotating swarms simulated in our numerical experiments are strongly reminiscent of the dynamics of swarms
studied in papers of Vicsek and collaborators and in other recent works. The translation mode corresponds to a driven motion of a free particle located in the center of mass, supplemented by a small oscillatory relative motion against the center of mass. The solutions for the rotational mode are similar to what we have found in the rigid approximation for the case of external fields. The probability is distributed around two limit cycles corresponding to left or right rotations. The result of several simulations, which show cluster configurations and amoeba-like configurations, is shown in Fig. 3.
Figure 3. Rotating and drifting clusters as well as amoeba-like configurations of 625 particles with Morse interactions.
4. The model of harmonic swarms with global coupling
This section is devoted to some analytical estimates. Due to the great complexity of the dynamics of swarms we need further simplifications. In the following we reduce all interactions to a global coupling of the particles. We consider two-dimensional systems of N point masses m with the numbers 1, 2, ..., i, ..., N. We assume that the masses m are connected by linear pair forces mω0²(ri - rj). The dynamics of the system is given by the following equations of motion:

dri/dt = vi,    m dvi/dt + m ω0² (ri - R(t)) = Fi(vi) - a²(vi - V) + m ξi(t)    (15)
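Eq. (15) can be integrated numerically. Below is a minimal Euler-Maruyama sketch, assuming unit mass and a Rayleigh-type driving force F(v) = (α - βv²)v; all parameter values are illustrative, not taken from the paper.

```python
import numpy as np

# Euler-Maruyama sketch of Eq. (15) for N particles in two dimensions.
# The driving F(v) = (alpha - beta*v^2) v is an assumed Rayleigh-type
# velocity-dependent friction; m = 1; parameters are illustrative.
rng = np.random.default_rng(0)
N, dt, steps = 20, 1e-3, 2000
alpha, beta, w0, a2, D = 1.0, 1.0, 1.0, 0.1, 0.01

r = rng.normal(0.0, 0.1, (N, 2))   # positions
v = rng.normal(0.0, 0.1, (N, 2))   # velocities
for _ in range(steps):
    R, V = r.mean(axis=0), v.mean(axis=0)            # centre of mass, mean velocity
    speed2 = np.sum(v * v, axis=1, keepdims=True)
    drive = (alpha - beta * speed2) * v              # F_i(v_i), assumed form
    accel = drive - w0**2 * (r - R) - a2 * (v - V)   # right-hand side of Eq. (15)
    v += accel * dt + np.sqrt(2.0 * D * dt) * rng.normal(size=(N, 2))
    r += v * dt

# the particle speeds relax towards v0 = sqrt(alpha/beta)
speeds = np.sqrt(np.sum(v * v, axis=1))
```

Depending on the initial conditions and the noise, the swarm settles into the translational or the rotational mode discussed below, with individual speeds near v0.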
The term proportional to a2 denotes a small force tending to parallelize the velocities of the particles in the swarm. Again we start with an investigation of the
translational mode of this system. For the mean velocity we find, by summation and expanding around V, in a symbolic representation

dV/dt = F(V) + (1/2) (δv) · F''(V) · (δv) + ...    (16)
For the relative motion δvi = vi - V we get in first order approximation

dδvi/dt + ω0² δri = -Γ δvi + √(2Dv) ξi(t)    (17)
In the translational mode of this system all the particles form a noisy flock which moves with nearly constant velocity modulus
V(t) = dR/dt = v0 n;    ri(t) - R(t) ≈ 0,    i = 1, ..., N    (18)
The direction n may change from time to time due to stochastic influences. The distribution function of the flock is Boltzmann-like. However, this distribution is stable only in the region where the friction tensor Γ has only positive eigenvalues. This is certainly the case if V² = v0². Our solution breaks down if the dispersion ⟨δv²⟩ is so large that the linearization around V is no longer possible. With increasing noise we find a bifurcation. This corresponds to the findings of Mikhailov and Zanette for equivalent one-dimensional systems20. We note that in the two-dimensional system the dispersion of the relative velocity δv is not isotropic. The dispersion in the direction of flight V is smaller than perpendicular to it. We introduce here an isotropic approximation which allows for explicit solutions of the bifurcation problem. Beyond the region of stability of the translational mode the swarm converges to a rotating swarm at rest (see Fig. 4). This second stationary state of the
Figure 4. Possible stable states of a system of N = 100 globally coupled active particles. In the left picture we see a typical elliptical configuration of a swarm for a noise strength below the critical value. The right picture shows the spherical configuration corresponding to a rotational state. The noise strength is beyond the critical one.
swarm corresponds to left/right rotating ring configurations with a center at rest8,19. In order to describe the numerical results semi-quantitatively, we introduce
further approximations. At first we simplify the equation for the mean momentum V, similarly as in Ref. 20, assuming
dV/dt = (α - βV²) V - (β/N) Σi (δvi)² V - (2β/N) Σi (V · δvi) δvi + ... + √(2D) ξ(t)    (19)
In order to find explicit solutions we decouple the center of mass motion from the relative motion. By averaging with respect to δvi and neglecting the tensor character of the coupling to the relative motion we get

dV/dt = (α1 - βV²) V + ...    (20)

Here the effective driving strength α1 (which strictly speaking is a tensor) is approximated by

α1 = α - s (β/N) Σi (δvi)²    (21)
The factor s lies between 1 (corresponding to strictly perpendicular fluctuations) and 3 (corresponding to only parallel fluctuations). As a reasonable average we will assume s ≈ 2. The corresponding velocity distribution is
This way we find the most probable velocity
The most probable velocity of the swarm is shifted to values smaller than for the free motion. The shift with respect to the free mode V0 is proportional to the noise strength D. For the fluctuations around the center of mass of the swarm we find
dδvi/dt + ω0² δri = -Γ δvi + √(2D) ξi(t)    (24)
Here Γ = 2βV0² - α follows from a diagonal approximation of the tensor Γ. In this way the relative distribution can be approximated as
Now we get a quadratic equation for the dispersion ⟨δv²⟩,

(α - 2sβ ⟨δv²⟩) ⟨δv²⟩ = D,

with the solution

⟨δv²⟩ = (α/(4sβ)) [1 - √(1 - 8sβD/α²)].

The corresponding effective friction reads

Γ = (α/2) [1 + √(1 - 8sβD/α²)].
At the critical noise strength Dc = α²/(8sβ) the dispersion has its maximal and the effective friction its minimal value; for larger noise strengths the dispersion and the effective friction become complex. In simulations we found for α = β = 1 a critical noise strength Dc ≈ 0.067 (Ref. 21). This is very close to our theoretical estimate with s = 2, which gives Dc = 1/16. In this way we have given a simple explanation for the transition from translational to rotational modes. A more advanced theory will be developed elsewhere21.
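The estimate Dc = α²/(8sβ) and the associated dispersion and friction can be checked with a short calculation; the function and symbol names below are ours.

```python
import math

def critical_noise(alpha=1.0, beta=1.0, s=2.0):
    """Critical noise strength D_c = alpha^2 / (8 s beta)."""
    return alpha**2 / (8.0 * s * beta)

def dispersion_and_friction(D, alpha=1.0, beta=1.0, s=2.0):
    """Dispersion <dv^2> and effective friction Gamma for D <= D_c."""
    root = math.sqrt(1.0 - 8.0 * s * beta * D / alpha**2)
    dv2 = alpha / (4.0 * s * beta) * (1.0 - root)
    gamma = 0.5 * alpha * (1.0 + root)
    return dv2, gamma

Dc = critical_noise()                    # 1/16 = 0.0625 for alpha = beta = 1, s = 2
dv2, gamma = dispersion_and_friction(Dc)
# at D = D_c: dispersion maximal, alpha/(4*s*beta) = 0.125,
#             friction minimal, alpha/2 = 0.5
```

For D > Dc the square root becomes imaginary, which is the analytical signature of the breakdown of the translational mode.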
5. Conclusions We studied here the active Brownian dynamics of swarms of confined particles with velocity-dependent friction and attracting interactions. Confinement was created either by pair-wise linear attracting forces, or by attracting Morse interactions. The basic results of our observations may be summarized as follows:
cluster drift: In the simulations we see drifting clusters rotating very slowly, and clusters without rotation which move rather fast (see Fig. 3). The latter state corresponds to the translational mode studied in the previous section for N = 2. Here most of the energy is concentrated in the kinetic energy of the translational movement.

generation of rotations: As we see from the simulations, small Morse clusters up to N ≈ 20 generate left/right rotations around their center of mass. The angular momentum distribution is bistable. This corresponds to the rotational mode studied above for N = 1, 2.

breakdown of rotations: The rotation of clusters may come to a stop for several reasons. The first is the anharmonicity of clusters. As we have shown in our previous work18, strong anharmonicity destroys the rotational mode. Another reason, which was investigated here, are noise-induced transitions.

shape distribution: Under special conditions the shape of the clusters is amoeba-like and becomes more and more complicated. A theoretical interpretation of the shape dynamics is still missing.

cluster composition: With increasing noise we observe a distribution of clusters of different sizes. Again a theory of clustering in the two-dimensional case is still missing. For the case of one-dimensional rings with Morse interactions several theoretical results are available13,23.

We have given here first an analysis of several simple cases. In this way we could identify several qualitative modes of movement. Further we have made a numerical study of special N-particle systems. In particular we investigated the rotational and translational modes, the clustering phenomena and noise-induced transitions.
We did not intend here to model any particular problem of biological or social collective movement. We note, however, that the study of dynamic modes of collective movement of swarms may be of some importance for the understanding of many biological and social collective motions. To support this view we refer again to the book of Okubo and Levin5, where the modes of collective motion of swarms of animals are classified in a way which is strongly reminiscent of the theoretical findings for the model investigated here. In particular we mention also the motion of animals in water, for example the collective motion of Daphnia6,7,24,25.
References
1. D. Helbing, Rev. Mod. Phys. 73, 1067 (2001).
2. T. Vicsek, Fluctuations and Scaling in Biology, Oxford University Press, Oxford, 2001.
3. A. S. Mikhailov and V. Calenbuhr, From Cells to Societies, Springer, Berlin, 2002.
4. F. Schweitzer, Brownian Agents and Active Particles, Springer, Berlin, 2003.
5. A. Okubo and S. A. Levin, Diffusion and Ecological Problems: Modern Perspectives, Springer, New York, 2nd edition, 2001.
6. A. Ordemann, G. Balazsi, and F. Moss, Nova Acta Leopoldina NF 88, 87 (2003).
7. A. Ordemann, G. Balazsi, and F. Moss, Physica A 325, 260 (2003).
8. F. Schweitzer, W. Ebeling, and B. Tilch, Phys. Rev. E 64, 021110 (2001).
9. T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet, Phys. Rev. Lett. 75, 1226 (1995).
10. A. Czirók and T. Vicsek, Physica A 281, 17 (2000).
11. U. Erdmann, W. Ebeling, F. Schweitzer, and L. Schimansky-Geier, Eur. Phys. J. B 15, 105 (2000).
12. W. Ebeling, U. Erdmann, J. Dunkel, and M. Jenssen, J. Stat. Phys. 101, 443 (2000).
13. J. Dunkel, W. Ebeling, U. Erdmann, and V. A. Makarov, Int. J. Bif. Chaos 12, 2359 (2002).
14. F. Schweitzer, W. Ebeling, and B. Tilch, Phys. Rev. Lett. 80, 5044 (1998).
15. M. Schienbein and H. Gruler, Bull. Math. Biol. 55, 585 (1993).
16. M. Schienbein, K. Franke, and H. Gruler, Phys. Rev. E 49, 5462 (1994).
17. W. Ebeling, F. Schweitzer, and B. Tilch, BioSystems 49, 17 (1999).
18. U. Erdmann, W. Ebeling, and V. S. Anishchenko, Phys. Rev. E 65, 061106 (2002).
19. W. Ebeling and F. Schweitzer, Theory in Biosciences 120, 207 (2001).
20. A. S. Mikhailov and D. Zanette, Phys. Rev. E 60, 4571 (1999).
21. U. Erdmann, W. Ebeling, and A. S. Mikhailov, Noise induced transition from translational to rotational motion of swarms, http://arxiv.org/abs/physics/0412037, 2004.
22. W. Ebeling and U. Erdmann, Complexity 8, 23 (2003).
23. J. Dunkel, W. Ebeling, and U. Erdmann, Eur. Phys. J. B 24, 511 (2001).
24. A. Ordemann, G. Balazsi, E. Caspari, and F. Moss, Daphnia swarms: from single agent dynamics to collective vortex formation, in Fluctuations and Noise in Biological, Biophysical, and Biomedical Systems, edited by S. M. Bezrukov, H. Frauenfelder, and F. Moss, volume 5110 of Proceedings of SPIE, pages 172-179, Bellingham, 2003, SPIE.
25. U. Erdmann, W. Ebeling, L. Schimansky-Geier, A. Ordemann, and F. Moss, Active Brownian particle and random walk theories of the motions of zooplankton: Application to experiments with swarms of Daphnia, http://arxiv.org/abs/q-bio.PE/0404018, 2004.
COMPLEXITY IN THE COLLECTIVE BEHAVIOUR OF HUMANS

TAMÁS VICSEK

Biological Physics Department and Research Group of the Hung. Acad. Sci., Eötvös University, Pázmány P. stny. 1A, H-1117 Budapest, Hungary
Can we reliably predict and quantitatively describe how large groups of people behave? Here we discuss an emerging approach to this problem which is based on the quantitative methods of statistical physics. We demonstrate that in cases when the interactions between the members of a group are relatively well defined (e.g., pedestrian traffic, synchronization, panic, etc.) the corresponding models reproduce relevant aspects of the observed phenomena. In particular, people moving in the same environment typically develop specific patterns of collective motion, including the formation of lanes, flocking or jamming at bottlenecks. We simulate such phenomena assuming realistic interactions between particles representing humans. The two specific cases to be discussed in more detail are waves produced by crowds at large sporting events and the main features of escape panic under various conditions. Our models allow the prediction of crowd behaviour even in cases when experimental methods are obviously not applicable and, thus, are expected to be useful in assessing the level of security in situations involving large groups of excited people.
1. Introduction It is becoming increasingly evident that the application of ideas, methods and results of statistical physics to a wide range of phenomena occurring outside of the realm of the non-living world is a fruitful approach leading to numerous exciting discoveries. Among many others, examples include the studies of various group activities of people from the physicist’s viewpoint. Here, I shall give a partial account of some of our new investigations in this direction, involving the interpretation of such collective human activities as group motion and synchronization. On the small scale side of the size/complexity spectrum, in the world of atoms and molecules collective behaviour is also considered to be an important aspect of the observed processes. Furthermore, there are articles on collectively migrating bacteria, insects or birds and additional interesting results are published on phenomena in which groups of various organisms or non-living objects synchronize their signals or motion. This is the natural scientist’s aspect of how many objects behave together. However, if you search for a collective behaviour related item with your web browser most of the texts popping up will be concerned with group activities of humans including riots, fashion or panics. What is common in these seemingly diverse phenomena involving interpretations ranging from social psychology to statistical physics? The answer is that they happen in
systems consisting of many similar units interacting in a relatively well defined manner. These interactions can be simple (attraction/repulsion) or more complex (combinations of simple interactions) and may take place between neighbours in space or on a specific underlying network. Under some conditions, in such systems various kinds of transitions occur; during these transitions the objects (particles, organisms or even robots) adopt a pattern of behaviour which is nearly completely determined by the collective effects due to the presence of all of the other objects in the system. What are the motivations for this sort of research? Mankind has been experiencing a long successful period of technological development. This era has been the result of a deeper understanding of the various physical and chemical processes due to the outstanding advances in the related sciences. After these achievements there is now a growing interest in a better, more exact understanding of the mechanisms underlying the main processes in societies as well. There is a clear need for the kind of firm, reliable results produced by natural sciences in the context of the studies of human behaviour. The revolution in information and transportation technology brings together larger and larger masses of people (either physically or through electronic communication). New kinds of communities are formed, including, among many others, internet chat groups or huge crowds showing up at various performances, transportation terminals or demonstrations. Since they represent relatively simple examples, these groups or communities of people provide a good subject for studying the mechanisms determining the phenomena taking place in societies. Below we discuss new quantitative approaches to collective behaviour based on the exact methods of statistical physics.
It is clear that the methods developed in natural sciences contain a significantly smaller amount of subjectivity than those used for the interpretation of human behaviour. If more exact approaches could be applied to social situations they could provide the desired objectivity, reproducibility and predictability. We demonstrate that in cases when the interactions between the members of a group are relatively well defined (e.g., pedestrian traffic, rhythmic applause, panic, soccer fans in stadiums, etc.) the corresponding numerical models reproduce relevant aspects of the observed phenomena. Simulating models in a computer has the following advantages: i) by changing the parameters different situations can easily be created, ii) the results of an intervention can be predicted, and iii) a more efficient design of the conditions for the optimal outcome can be assisted. In addition to possible applications, our approach is useful in providing a deeper insight into the details of the mechanisms determining collective phenomena occurring in social groups (see Fig. 1 for an observation of a simple kind of collective human behaviour: spontaneous lane formation in crowds of oppositely moving pedestrians).
Figure 1. A simple kind of collective human behaviour: spontaneous lane formation in crowds of oppositely moving pedestrians.
Most of the results I am discussing next are available through the home page http://angel.elte.hu/~vicsek. Many other recent studies have been devoted to the question of how the concepts common in statistical physics (fluctuations, phase transitions, scaling, etc.) can be applied to a group of humans. A particularly entertaining and exhaustive review of these efforts is given in the very recent book by Philip Ball [1], written in a popular science style. There exist additional remarkable works in similar directions by groups working on traffic, evacuation dynamics, econophysics, and on further related topics (see, e.g., Refs 2-4).

2. Methods

Our central statement is that collective behaviour can be very efficiently studied by the methods developed by statistical physicists. The related theoretical and numerical approaches provide reliable, sometimes exact descriptions of the processes taking place in many-particle systems. We assume that under some conditions a large group of humans can be considered as a collection of particles, since there are various situations where the interaction of people is reasonably well defined (e.g., two people heading towards each other in a corridor will avoid each other just as if they had a repulsive physical force acting between them).
For the last two decades perhaps the most fruitful approach to collective phenomena has been the application of computer simulations. In such studies a simple model is
constructed which is supposed to capture the most relevant features of the system to be studied. Then, by letting the algorithm run in the computer while monitoring the parameters of the model, a great variety of collective phenomena can be observed. The true test of a model is a careful comparison of its predictions with the behaviour of the real system.

Examples
The rest of the paper will present examples of group behaviour of people which could be successfully interpreted by computer simulations and the related theoretical concepts. It is hoped that the process of simultaneous investigation of particular examples and the abstraction of their most general features will in time lead to a coherent theoretical description of collective human behaviour.
Collective motion
Here we first address the more general question of whether there are some global, perhaps universal features of collective motion [5]. Such behaviour takes place when many organisms are simultaneously moving and parameters like the level of perturbations or the mean distance between the individuals are changed. The simple and generic model we introduced some time ago to study collective motion assumes two rules: a) Follow the others, or in other words, try to take on the average velocity of your neighbours. b) In addition, an amount of randomness is added to the actual velocity (to account, for example, for the level of excitement of the pedestrians).
Simulations result in a completely disordered motion if the level of perturbations is large (each particle moves back and forth randomly). However, if the noise is smaller than a critical value (just as in the case of the ordering of ferromagnets), groups of particles are spontaneously formed; the groups merge (aggregate) and sooner or later join into a single large group moving in a direction determined in a non-trivial way by the initial conditions (Fig. 2).
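The two rules above can be sketched in a few lines; this is a minimal, illustrative implementation (parameter names and values are ours, not those of Ref. 5).

```python
import numpy as np

# Sketch of the two rules: (a) take on the average direction of the
# neighbours within radius R, (b) add a random perturbation of strength
# eta. All parameter values are illustrative.
rng = np.random.default_rng(1)
N, L, R, v0, eta, steps = 50, 5.0, 1.0, 0.03, 0.1, 400

pos = rng.uniform(0.0, L, (N, 2))
theta = rng.uniform(-np.pi, np.pi, N)
for _ in range(steps):
    dx = pos[:, None, :] - pos[None, :, :]
    dx -= L * np.round(dx / L)                    # periodic boundary conditions
    neigh = np.sum(dx * dx, axis=2) < R * R       # neighbour matrix (includes self)
    s = np.where(neigh, np.sin(theta)[None, :], 0.0).sum(axis=1)
    c = np.where(neigh, np.cos(theta)[None, :], 0.0).sum(axis=1)
    theta = np.arctan2(s, c) + eta * rng.uniform(-np.pi, np.pi, N)  # rules (a) + (b)
    pos = (pos + v0 * np.column_stack([np.cos(theta), np.sin(theta)])) % L

# order parameter: close to 1 for aligned motion, ~1/sqrt(N) for disorder
phi = np.hypot(np.cos(theta).mean(), np.sin(theta).mean())
```

For noise below the critical value the order parameter grows towards 1; for strong noise it stays near zero, which is the ordering transition described in the text.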
Figure 2. The simple model of collective motion leads to a globally ordered motion of particles for intermediate noise levels (a), while it results in flocks moving in random directions if the level of fluctuations is larger (b). The interaction radius is indicated as a horizontal line segment.
3. Applications to situations involving crowds
A) Consider, as a thought experiment, thousands of people standing on a square and trying to look in the same -- however, previously undetermined -- direction, after being asked to do so. A nice example of human collective behaviour would be if all of them managed to face the same direction. Can they do it? Statistical physicists can predict for sure that this cannot be done. They recall a theorem valid for particles with short-ranged ferromagnetic interactions stating that in two dimensions no long-range ordered phase (all magnets pointing in the same direction) can exist in such a system at any finite temperature and zero external field. So what happens? Locally people are looking almost in the same direction, but on a large scale, e.g., seen from a helicopter -- just as the little magnets -- they form vortex-like directional patterns due to the small perturbations caused by human errors. Curiously enough, if the crowd is allowed to choose from a few discrete directions, the ordering can be realized. Perhaps even more interestingly, our models of flocking (based on the follow-the-neighbours rule) predict that if the people are asked to move in the same direction they will be able to do it.
B) In the latter models, if the moving particles are confined to move around in a closed circular area, stable motion can be maintained only by the simultaneous rotation of all of the objects around the centre. Remarkably enough, under some conditions even humans move in groups in a manner predicted by simple models. Indeed, in Mecca each year thousands of people circle around the Kaaba stone as they try to both keep on moving and avoid confronting others. C) Next we focus on a system of oppositely moving pedestrians in a corridor. Here the corridor is wide enough (its width is several times larger than the diameter of a person). Half of the pedestrians are assumed to move from left to right, the rest in the opposite direction. In the associated model it is assumed that the particles tend to take on a
constant speed in their desired direction and avoid each other due to a repulsive force.
Figure 3. Results from a simulation of oppositely moving particles in a strip geometry (yellow to the left, red to the right). A simple repulsive force and motion on a continuous plane (there is no underlying grid) have been assumed. An intermediate number of particles leads to lane formation, while a large density results in jamming and turbulent flow (bottom).
Simulations of this simple model based on the solution of the corresponding Newton's equations of motion reproduce the experimentally observed behaviour surprisingly well. A spontaneous formation of lanes of uniform walking directions in "crowds" of oppositely moving particles can be observed (Fig. 3). It is clear that lane formation will maximize the average velocity in the desired walking direction which is a measure of the "efficiency" or "success" of motion. Note, however, that lane formation is not a trivial effect of this model, but eventually arises only due to the smaller relative velocity and interaction rate that pedestrians with the same walking direction have. Once the pedestrians move in uniform lanes, they will have very rare and weak interactions. 4. Panic
One of the most disastrous forms of collective human behaviour is the kind of crowd stampede induced by panic, often leading to fatalities as people are crushed or trampled. Sometimes this behaviour is triggered in life-threatening situations such as fires in crowded buildings; at other times, stampedes can arise from the rush for seats or seemingly without cause. Although engineers are finding ways to alleviate the scale of such disasters, their frequency seems to be increasing due to ever greater mass events. Next we show that simulations based on a model of pedestrian behaviour can provide valuable insights into the mechanisms of and preconditions for panic and jamming by incoordination [6]. The available observations on escape panic have encouraged us to model this kind of collective phenomenon in the spirit of self-driven many-particle systems. We assume, in addition to the earlier considered socio-psychological forces, the relevance of physical
forces as well, since the latter become very important in the case of a dense crowd with a strong drive to get through a narrow exit. Each pedestrian of mass mi likes to move with a certain desired speed v0(t) into a certain direction e0(t), and therefore tends to correspondingly adapt his or her actual velocity vi(t) within a certain characteristic time τ. Simultaneously, he or she tries to keep a velocity-dependent distance to other pedestrians j and walls W. This can be modelled by “interaction forces” fij and fiW, respectively. In mathematical terms, the change of velocity in time is then given by the acceleration equation

mi dvi/dt = mi (v0(t) e0(t) - vi(t))/τ + Σj fij + ΣW fiW
The fij interaction forces include an exponentially decaying repelling term expressing socio-psychological effects, and two additional “physical” terms corresponding to elastic repulsion and friction forces between the bodies of people [6]. The fiW interaction with the walls is treated analogously. To avoid model artefacts (gridlocks by exactly balanced forces in symmetrical configurations), a small amount of irregularity of almost arbitrary kind is needed. This irregularity was introduced by uniformly distributed pedestrian diameters ri in the interval [0.5 m, 0.7 m], approximating the distribution of shoulder widths of soccer fans. Based on the above model assumptions, it is possible to simulate several important phenomena of escape panic. The simulated outflow from a room is well-coordinated and regular if the desired velocities are normal. However, for desired velocities above 1.5 m/s, i.e., for people in a rush, we find an irregular succession of arch-like blockings of the exit and avalanche-like bunches of leaving pedestrians when the arches break. “Faster-is-slower effect” due to impatience: since clogging is connected with delays, trying to move faster can cause a smaller average speed of leaving if the friction parameter is large. This effect is particularly tragic in the presence of fires, where the fleeing people reduce their own chances of survival. Improved outflows can be reached by columns placed asymmetrically in front of the exits, preventing the build-up of fatal pressures.
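A hedged sketch of such an acceleration term is given below: it combines the relaxation towards the desired velocity with the exponentially decaying socio-psychological repulsion, while the physical elastic and friction contact terms of Ref. [6] are omitted. Masses are set to 1 and all parameter values are illustrative, not the fitted ones.

```python
import numpy as np

# Sketch of the acceleration in a social-force-type model: drive towards
# the desired velocity plus exponentially decaying repulsion between
# pedestrians. Physical contact terms are omitted; m = 1; the parameter
# values (v0, tau, A, B) are illustrative.
def acceleration(pos, vel, goal_dir, radii, v0=1.5, tau=0.5, A=2.0, B=0.08):
    acc = (v0 * goal_dir - vel) / tau                 # relaxation to desired velocity
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            d = pos[i] - pos[j]
            dist = np.linalg.norm(d)
            # socio-psychological repulsion, decaying with surface distance
            f = A * np.exp((radii[i] + radii[j] - dist) / B)
            acc[i] += f * d / dist
    return acc

# two pedestrians heading right, one close behind the other
pos = np.array([[0.0, 0.0], [0.4, 0.0]])
vel = np.zeros((2, 2))
goal = np.array([[1.0, 0.0], [1.0, 0.0]])
acc = acceleration(pos, vel, goal, radii=np.array([0.3, 0.3]))
# the rear pedestrian is held back, the front one pushed forward
```

Because the pedestrians overlap here, the repulsion dominates the drive: the rear particle decelerates while the front one is pushed ahead, which is the microscopic mechanism behind arch formation at exits.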
In fact, as is clear from our visualization of the pressure distribution in a panicking crowd near an exit, the force experienced by the people is quickly changing and highly fluctuating. The situation is quite similar to that observed and calculated for granular flows. The spots corresponding to high stress form bridge-like structures connected in a hierarchical manner. Furthermore, this structure is completely reorganized a short time after a “discharge”, i.e., after the region close to the exit is relaxed for a short time due to the successful exit of a group of people previously temporarily hindered by others from leaving.
More practical versions
After the basic model is given it can be applied to cases of increasing complexity and relevance to practical settings. Thus, we investigated the following additional cases: i) large crowds, ii) complicated “geometry” (parameters of actual rock concerts organized at a square in the downtown of a major city and in a stadium - both in Belgium), iii) effects of impatience/anxiety, and combinations of these. Here I would like to discuss briefly how the effects of the level of anxiety of people involved in escape panic can be taken into account in a setting with several exits of varying size (Fig. 4).
Figure 4. Snapshot of a simulation of people trying to escape from a room with 5 exits. The pressure distribution is colour coded (ranging from bright red corresponding to larger values to darker green denoting smaller pressures). There is an intensive traffic of people changing exits due to impatience.
For this, several new aspects of the dynamics have to be considered. First of all, we have to allow that -- in case someone becomes too anxious about not being able to proceed or, in other words, loses his/her patience -- people could choose an alternative exit if they become unhappy with the one they chose previously. For this, we constantly have to keep track of their level of anxiety. We assume that if a particle cannot proceed quickly enough (the distance it covers in a given time interval is below a previously set anxiety level) it becomes “frustrated or anxious” and makes a decision about choosing a new exit. This decision is based on a number of parameters but, qualitatively, it is inversely proportional to the distance of the exits and to the number of people who would block this particle from leaving through an exit. Thus, most of the time, a particle within a crowd surrounding an exit chooses this given exit; however, some other particles, closer to the edge of such a crowd, are likely to choose an alternative exit. As a result, in a simulation of this sort there is a permanent redistribution of the crowds
around the exits: the larger ones tend to “evaporate” faster and the particles leaving them “condense” at the exits with a smaller number of particles. This process makes the whole system more efficient; in other words, giving the “anxiety factor” an increasing weight in the simulations leads to a faster overall escape rate! Although this result is not too surprising, it is somewhat paradoxical, since impatience or anxiety is usually not thought of as a source of more optimal choices. Finally, we investigate a situation in which pedestrians are trying to leave a smoky room, but first have to find one of the invisible exits. Each pedestrian may either select an individual direction, or follow the average direction of his neighbours within a certain radius, or try a mixture of both. We assume that the two options are weighted with parameters (1-p) and p, respectively, where 0 < p < 1,
ei(t) = E[(1 - p) ei0 + p ⟨ej(t)⟩i]

where E is a function producing a unit vector pointing in the direction of its argument (normalization). As a consequence, we have individualistic behaviour if p is low, but herding behaviour if p is high. Our model suggests that neither pure individualistic nor pure herding behaviour performs well. Pure individualistic behaviour implies that each pedestrian finds an exit only accidentally, while pure herding behaviour implies that the complete crowd eventually moves into the same, and probably blocked, direction, so that available exits are not efficiently used, in agreement with observations. Accordingly, we find optimal chances of survival for a certain mixture of individualistic and herding behaviour, where individualism allows some people to detect the exits and herding guarantees that successful solutions are followed by the others (Fig. 5).
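The direction-choice rule can be sketched directly from the description above; function and variable names are ours.

```python
import numpy as np

def E(v):
    """Normalization: unit vector pointing in the direction of v."""
    return v / np.linalg.norm(v)

def chosen_direction(own_dir, neighbour_dirs, p):
    """Mix individualistic behaviour (weight 1-p) with herding (weight p)."""
    herd = neighbour_dirs.mean(axis=0)        # average direction of the neighbours
    return E((1.0 - p) * own_dir + p * herd)

own = np.array([1.0, 0.0])                    # individually chosen direction
neigh = np.array([[0.0, 1.0], [0.0, 1.0]])    # neighbours all heading "up"
d_ind = chosen_direction(own, neigh, p=0.0)   # pure individualism: own direction
d_herd = chosen_direction(own, neigh, p=1.0)  # pure herding: neighbours' direction
d_mix = chosen_direction(own, neigh, p=0.4)   # mixture, still a unit vector
```

At the extremes p = 0 and p = 1 the rule reduces to the two pure strategies; intermediate p produces the mixed behaviour for which the model predicts the best chances of survival.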
Figure 5. Snapshot of a simulation with particles which, in addition to the repulsive socio-psychological and physical forces, attempt to follow each other. In this case the particles simulate people trying to escape from a smoky room having two exits. There is an optimal degree of "herding" or cooperation, when medium-sized groups are formed which collectively leave the room.
5. Rhythmic applause
An audience expresses appreciation for a good performance by the strength and nature
of its applause. The initial thunder often turns into synchronized clapping, an event
familiar to many who frequent concert halls. Synchronized clapping has a well defined scenario: the initial strong incoherent clapping is followed by a relatively sudden synchronization process, after which everybody claps simultaneously and periodically. This synchronization can disappear and reappear several times during the applause. The phenomenon is a delightful expression of social self-organization that provides a human-scale example of the synchronization processes observed in numerous systems in nature [7]. The above scenario can be recorded and the recordings analysed using techniques common in physics [8]. The analysis reveals various interesting features, including a spontaneous period doubling (as compared to the natural period of a single spectator) when the synchronization takes place. In other words, after an initial asynchronous phase characterized by high frequency clapping (Mode I), the individuals synchronize by eliminating every second beat, suddenly shifting to a clapping mode with double period (Mode II) where the dispersion (the relative difference in the clapping frequencies) is smaller. Statistical theories developed for a group of globally coupled periodically behaving objects can be used to demonstrate that the necessary condition for synchronization is that the dispersion has to be smaller than a critical value. Consequently, period doubling emerges as a condition of synchronization, since it leads to slower clapping modes during which a significantly smaller dispersion can be maintained. Thus, the evaluation of the measurements offers a key insight into the mechanism of synchronized clapping: fast clapping synchronization is not possible due to the large dispersion in the clapping frequencies. After period doubling, as Mode II clapping with small dispersion appears, synchronization can be and is achieved.
However, as the audience gradually decreases the period to enhance the average noise intensity, it gradually slips back to the fast clapping mode with larger dispersion, destroying synchronization (Fig. 6).
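The dispersion argument above can be illustrated with a minimal sketch of globally coupled phase oscillators of the Kuramoto type; this is our stand-in for the statistical theories mentioned in the text, and all parameter values are illustrative:

```python
import math, random

def final_order_parameter(dispersion, n=100, coupling=1.0,
                          steps=3000, dt=0.01, seed=0):
    # Globally coupled phase oscillators as a stand-in for the clappers:
    # d(theta_i)/dt = omega_i + K r sin(psi - theta_i), where r e^{i psi}
    # is the mean field. Natural frequencies have a relative spread
    # `dispersion` around a common clapping frequency.
    rng = random.Random(seed)
    omega = [2.0 * math.pi * (1.0 + dispersion * rng.gauss(0.0, 1.0))
             for _ in range(n)]
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        re = sum(math.cos(t) for t in theta) / n
        im = sum(math.sin(t) for t in theta) / n
        r, psi = math.hypot(re, im), math.atan2(im, re)
        theta = [t + dt * (w + coupling * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
    re = sum(math.cos(t) for t in theta) / n
    im = sum(math.sin(t) for t in theta) / n
    return math.hypot(re, im)   # order parameter: 1 = full synchrony
```

With a small relative dispersion of the natural frequencies (the Mode II situation) the final order parameter ends up close to 1 (synchrony); with a large dispersion (Mode I) it stays small.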
Figure 6. Fourier-gram of the clapping sound intensity after a great performance in one of the many theatres in Budapest. Grey level indicates the amplitude of the Fourier spectrum for the given frequency (vertical scale). Time is shown along the horizontal axis. The doubling and the subsequent decrease of the global period can be seen to occur several times.
6. Mexican wave
The Mexican wave, which became widely known during the 1986 World Cup held in Mexico, has since become a favourite paradigm for a variety of systems in which an initial
perturbation propagates in the form of a single "planar" wave. The most important reason for the increasing popularity of this phrase, also known as La Ola (or simply "the wave"), is likely to be its unique origin; it denotes a human wave moving along the stands of stadiums as one section of spectators stands up, arms lifting, then sits as the next section does the same. Using video recordings we have analysed 14 waves in stadiums with above 50,000 people: the wave has a typical velocity in the range of 12 m/s (20 seats/s), a width of about 6-12 m (~15 seats) and more frequently rolls in the clockwise direction [9]. It is generated by the simultaneous standing up of not more than a few dozen people and subsequently expands over the entire tribune, acquiring its stable, close to linear shape (see the page http://angel.elte.hu/wave dedicated to this research, offering further data and interactive simulations). The relative simplicity of the Mexican wave allows us to develop a quantitative treatment of this kind of collective behaviour by building and simulating models accurately reproducing and predicting the details of the associated human wave. It can be shown that the well established approaches to the theoretical interpretation of excitable media - originally created for describing such processes as forest fires or wave propagation in heart tissue - can readily be generalized to include human social behaviour. In analogy with models of excitable media, people are regarded as excitable units: they can be activated by an external stimulus (a distance- and direction-wise weighted concentration of nearby active people exceeding a threshold value). Once activated, each unit follows the same set of internal rules to pass through the active (standing and waving) and refractory (passive) phases before returning to its original, resting (excitable) state. Next, we employed these models to get an insight into the conditions for triggering a wave.
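A minimal version of such an excitable-unit update can be sketched as follows; the ring geometry, weights, threshold and phase durations are illustrative choices, not the parameters of the original model:

```python
def wave_step(state, ta=3, tr=5, threshold=0.3, reach=3):
    # One update of a ring of spectators treated as excitable units.
    # state[i] = 0: resting/excitable; 1..ta: active (standing);
    # ta+1..ta+tr: refractory. A resting unit is activated when the
    # distance- and direction-weighted concentration of nearby active
    # units exceeds `threshold`.
    n = len(state)
    new = []
    for i, s in enumerate(state):
        if s == 0:
            stim, wsum = 0.0, 0.0
            for d in range(1, reach + 1):
                for sign, w in ((-1, 1.0), (1, 0.3)):   # left/right asymmetry
                    wgt = w / d
                    wsum += wgt
                    if 1 <= state[(i + sign * d) % n] <= ta:
                        stim += wgt
            new.append(1 if stim / wsum > threshold else 0)
        elif s < ta + tr:
            new.append(s + 1)   # advance through active, then refractory
        else:
            new.append(0)       # back to the resting, excitable state
    return new
```

Starting from a small block of simultaneously active units, repeated application of this update sends a single wave around the ring, with the left/right weight asymmetry playing the role of the asymmetric reaction of people assumed in the text.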
Our results clearly demonstrate that the dependence of the eventual occurrence of a wave on the number of initiators is a rather sharply changing function, i.e., triggering a Mexican wave requires a critical mass. This approach is expected to have implications for the treatment of situations where influencing the behaviour of a crowd is desirable. In particular, in the context of violent street incidents associated with demonstrations or sport events, it is essential to know under what conditions groups can get control over the crowd and how fast and in which form this perturbation/transition can spread. In the above simulation, which was primarily aimed at interpreting the propagation of the wave, we assumed the presence of some level of asymmetry in the reaction of people from the very beginning. In our most recent work we went beyond this oversimplification of the triggering part of the whole phenomenon and undertook a more detailed study of the symmetry breaking process during the first few tenths of a second of the wave's initiation. This delicate problem can be posed as follows: the observations suggest that almost exclusively there is a single wave in the stadium moving in a well defined direction, which can be either clockwise or anticlockwise (with some overrepresentation of the clockwise direction). On the other hand, we know that the wave is triggered by a relatively small group of people who jump up in a
synchronized but spontaneous manner, clearly not representing a well defined asymmetry; nevertheless, the surrounding people are able to quickly "decide" in which direction the wave should propagate. Obviously, there must be a mechanism, an instability, which results in the breaking of the left-right symmetry and the long term survival of a single propagating wave. We propose that the most likely source of this instability is the tendency of people to follow (or rather, try to guess) the direction of the motion of the wave as a whole. In a way, they "measure" the velocity of the centre of mass of the activity and act accordingly. In particular, they essentially react to the horizontal (x) component of the velocity and tend to become more excited if the wave moves towards them than away from them. Taking into account such an effect is almost straightforward. Stimuli arriving from the neighbours to a given spectator should be corrected by a multiplicative term which is larger if the centre of mass of the wave approaches this spectator and smaller otherwise. In practice this can be done by choosing this weight in such a way that it is equal to 1 if the wave is approaching and decays exponentially (as a function of v) for velocities in the opposite direction (Fig. 7).
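The velocity-dependent correction just described amounts to a one-line weight function; the decay rate is the free parameter of the rule and the value below is only an example:

```python
import math

def stimulus_weight(v_toward, decay=2.0):
    # Multiplicative correction to the stimulus felt by a spectator:
    # full weight 1 when the centre of mass of the wave moves towards
    # the spectator (v_toward >= 0), exponential decay in the receding
    # speed otherwise. The decay rate controls how strongly people
    # discriminate between approaching and receding waves.
    return 1.0 if v_toward >= 0.0 else math.exp(decay * v_toward)
```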
Figure 7. Spontaneous symmetry breaking during the emergence of a Mexican wave. In this simulation it was assumed that the activity of a given spectator depends on the global direction of the initially slightly asymmetric wave.
Thus, the interpretation of the instability is the following: the group of people jumping up at the very beginning is not completely symmetric as far as their level of activity is concerned (there is a distribution - even if over a short time interval - of the moments when they eventually start jumping up). According to the excitable media model, the above small initial asymmetry leads to a slight change in the position of the centre of activity even after a short time. This relocation will be perceived by the spectators, who will react to it and become more excited if they find that the small initial relocation is in their direction. Consequently, a higher level of asymmetry is produced, resulting in a single well developed wave. Of course, all this depends on the degree to which the particles differentiate between approaching and receding waves, which is taken into account by the decay rate of the velocity dependent weight function. For very small differentiation (slow decay) we have symmetric solutions (two waves moving in opposite directions, with zero velocity of the centre of mass); however, beyond a critical value the instability comes into play and, in a manner analogous to a bifurcation, we end up with a single wave, moving either to the left or to the right.
Several relevant comments should be made at this point. i) The position of the centre of mass of a wave is a "global" feature. While previously we considered only local interactions (within a given neighbourhood of a particle), the symmetry breaking is now interpreted as resulting from sensing the wave as a whole as well. Apparently, there are two different, but simultaneously processed, sources of the level of excitement. We could not design a model with appropriate behaviour without including both aspects and therefore, we think, the interplay of local and global effects represents an essential feature of such groups of people. ii) The source of this global interaction is "social": the spectators would not like to belong to an unsuccessful attempt to carry on a wave. They carefully monitor in which direction the wave tends to move, and try to avoid joining if they feel the wave is moving away and they would end up in a funny situation, jumping up in the middle of a wave just about to become extinct. iii) Our refined model still suffers from some undesirable features. For example, in some cases (for a narrow region of the parameters), it suggests that even in the symmetry breaking regime two waves initially move in opposite directions and one of them dies out only after some time (perhaps this occurs in stadiums; there is not enough evidence available to decide this). In very rare cases even spiral type solutions appear, which are common in excitable media models but seem to be unrealistic for Mexican wave situations.
7. Conclusions
The models of collective behaviour of humans can account for a number of specific features of social behaviour under certain conditions. The advantage of the models is that by changing the parameters different situations can easily be created. Models adequately describing group phenomena can be used for predictions. In addition to such more concrete possible applications of the simulations as the design of escape routes or better networks, the models are useful in providing a deeper insight into the mechanisms behind such collective phenomena as synchronization or panic.
Acknowledgements: The investigations reviewed in this contribution have been carried out in cooperation with a number of colleagues to whom I am grateful. In particular, I would like to thank my principal collaborators, I. Farkas, D. Helbing and Z. Néda. Some of the most recent simulations on panic were made by D. De Weerdt. The above research on collective human behaviour has been in part supported by HNSF (OTKA) No. T034995.
References
1. P. Ball, Critical Mass: How One Thing Leads to Another (Random House, London, 2004)
2. D. Helbing, Traffic and related self-driven many-particle systems, Rev. Mod. Phys. 73, 1067-1141 (2001)
3. R. Mantegna and H. E. Stanley, An Introduction to Econophysics (Cambridge University Press, Cambridge, 2001)
4. M. Schreckenberg and S. D. Sharma (eds.), Pedestrian and Evacuation Dynamics (Springer, Berlin, 2001)
5. T. Vicsek (ed.), Fluctuations and Scaling in Biology (Oxford University Press, Oxford, 2001)
6. D. Helbing, I. Farkas and T. Vicsek, Simulating dynamical features of escape panic, Nature 407, 487-490 (2000)
7. A. Pikovsky, M. Rosenblum and J. Kurths, Synchronization (Cambridge University Press, Cambridge, 2001)
8. Z. Néda, E. Ravasz, Y. Brechet, T. Vicsek and A.-L. Barabási, The sound of many hands clapping, Nature 403, 849-851 (2000)
9. I. Farkas, D. Helbing and T. Vicsek, Mexican waves in an excitable medium, Nature 419, 131-132 (2002)
MONTE CARLO SIMULATIONS OF OPINION DYNAMICS
S. FORTUNATO
Fakultät für Physik, Universität Bielefeld, D-33501 Bielefeld, Germany and Dipartimento di Fisica e Astronomia and INFN sezione di Catania, Università di Catania, Catania I-95123, Italy
E-mail: [email protected]
We briefly introduce a promising new field of applications of statistical physics, opinion dynamics, where the systems under study are social groups or communities and the atoms/spins are the individuals (or agents) belonging to such groups. The opinion of each agent is modeled by a number, integer or real, and simple rules determine how the opinions vary as a consequence of discussions between people. Monte Carlo simulations of consensus models lead to patterns of self-organization among the agents which reproduce fairly well the trends observed in real social systems.
1. Introduction
Statistical physics teaches us that, even when it is impossible to foresee what a single particle will do, one can often predict how a sufficiently large number of particles will behave, in spite of the possibly large differences between the variables describing the states of the individual particles. This principle holds, to some extent, for human societies too. It is nearly impossible to predict when one person will die, as death depends on many factors, most of which are hard to control; nevertheless, statistics of the mortality rates of large populations are stable over long times and have been studied for over three centuries. We then come to the crucial question: can one describe social behaviour through statistical physics?
The question is tricky, and bound to trigger hot debates within the physics community. On the one hand, society is made of many individuals who interact mostly locally with each other, as in classical statistical mechanical systems. On the other hand, social interactions are not mechanical and are hardly reproducible. However, we expect that the aspects of collective behaviour and self-organization in a society may be reasonably well described by means of simple statistical mechanical models, and by now several such models have been introduced and analyzed, giving rise to the new field of sociophysics [1,2,3].
In this contribution we shall concentrate on opinion dynamics. The spread and evolution of opinions in a society has always been a central topic in sociology, politics and economics. One is especially interested in understanding the mechanisms which favour (or hinder) the agreement among people of different opinions and/or the diffusion of new ideas. Early mathematical models of opinion dynamics date back to the 50's, but the starting point for quantitative investigations in this direction is marked by the theory of social impact proposed by Bibb Latané [4]. The impact is a measure of the influence exerted on a single individual by those agents which interact with him/her (social neighbours). Models based on social impact [5] were among the first microscopic models of opinion dynamics. They are basically cellular automata, where one starts by assigning, usually at random, a set of numbers to each of the N agents of a community. One of these numbers is the opinion, the others describe specific features of the agents, like persuasiveness, supportiveness, tolerance, etc. Society is modeled as a graph, and each agent interacts with its geometric neighbours, who represent friends or close relatives. The procedure is iterative: at each iteration one takes a set of interacting agents and updates their opinions (or just the opinion of a single agent) according to a simple dynamical rule. After many iterations, the system usually reaches a state of static or dynamic equilibrium, where the distribution of the opinions among the agents does not change shape, even if the agents themselves still change their mind. The dynamics usually favours the agreement of groups of agents on the same opinion, so that one ends up with just a few opinions in the final state. In particular it is possible that all agents share the same opinion (consensus), or that they split into two or more factions.
Most results on opinion dynamics derive from Monte Carlo simulations of the corresponding cellular automata. We shall here briefly present two basic consensus models: the Bounded Confidence Model (BCM) [6,7] and the Sznajd Model (SM) [8]. For a complete exposition of the recent results on these models we refer to [12,13]. Due to lack of space we are forced to omit the discussion of other important classes of opinion dynamics, like the voter models [9], the majority rule models [10] and the Axelrod model [11].
2. The Bounded Confidence Model
The BCM is based on the simple observation that two persons usually discuss a topic with each other only if their opinions on that topic are quite close to each other; otherwise they quarrel or avoid discussing it. This can easily be modeled by introducing a parameter ε, called the confidence bound, and by checking whether the opinions si and sj of two social neighbours i and j differ from each other by less than ε. If this is the case, we say that the opinions of the two agents are compatible and they can start a conversation which may lead to variations of their opinions. Opinions can be integers or real numbers; the opinions are initially distributed at random among the agents. The number of opinion clusters nc in the
final configuration depends on the confidence bound: if ε is small, nc is roughly 1/(2ε); above some threshold εc the system attains consensus. There are two main versions of the BCM, which are characterized by two different dynamical rules of opinion updating: the consensus model of Deffuant et al. [7] (D) and that of Krause-Hegselmann [6] (KH). Here we shall discuss the latter.
2.1. Krause-Hegselmann
Figure 1. Time evolution (in Monte Carlo steps per agent) of the opinion distribution of the KH model for a society where everybody talks to everybody else. The number of agents is 10000, ε = 0.13. The agents form three different factions in the final state.
The iteration of the KH model in the case of real-valued opinions consists of the following three steps:
(1) An agent A is selected, sequentially or at random;
(2) One checks which of the neighbours of A have opinions compatible with that of A;
(3) The new opinion of A is the average of the opinions of its compatible neighbours.
The dynamics of the model is not trivial because the opinion space is bounded (typically [0,1]): in fact, the inhomogeneities at the edges determine density variations in the opinion distribution, which propagate towards the center (Fig. 1). For integer opinions, the update rule is even simpler [14]: agent A takes the opinion of one of its compatible neighbours, chosen at random. This rule recalls those of the voter [9] and Axelrod [11] models. In a society where everybody talks to everybody else, if there are Q possible choices for the agents and the condition of compatibility for two opinions Si and Sj is |Si - Sj| <= 1, the community always reaches consensus
provided Q <= 7.
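The three update steps above can be sketched as follows for a fully connected society with real-valued opinions; the function names and the cluster-counting helper are ours, not from the original:

```python
import random

def kh_sweep(opinions, eps):
    # One synchronous sweep of the Krause-Hegselmann model in a fully
    # connected society: every agent moves to the average opinion of all
    # agents (itself included) within the confidence bound eps.
    new = []
    for si in opinions:
        compatible = [sj for sj in opinions if abs(si - sj) <= eps]
        new.append(sum(compatible) / len(compatible))
    return new

def count_clusters(opinions, tol=1e-3):
    # Count distinct opinion clusters in the final configuration.
    count, last = 0, None
    for v in sorted(opinions):
        if last is None or v - last > tol:
            count += 1
        last = v
    return count
```

Iterating kh_sweep from random initial opinions in [0,1] with ε above the consensus threshold leaves a single cluster; with a small ε several well-separated clusters survive.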
3. The Sznajd Model
The SM is probably the most studied consensus model of recent years. The reasons for its success are the intuitive "convincing rule" and the deep relationship with spin models like the Ising model. One starts with a simple remark: an individual is more easily convinced to change his/her mind if more than just a single person tries to persuade him/her. So, if two or more of our friends share the same view about some issue, it is likely that they will convince us to accept that view, sooner or later.
Figure 2. Histogram of the fraction of candidates receiving a given number of votes for the 1998 election in the state of Minas Gerais (Brazil). A simple election model based on Sznajd opinion dynamics reproduces well the central pattern of the data. The data points are indicated by ×, the results of the election model by + (from Ref. 15).
In the most common implementation of the model, a group of neighbouring agents which happen to share the same opinion imposes this opinion on all their neighbours. The "convincing" pool of friends can be a pair of nearest neighbours on a graph, or groups of three or more neighbours like triads on networks or plaquettes on a lattice. One usually starts from a random distribution of opinions among the agents, with a fraction p of agents sharing the opinion +1 (the rest of the agents having opinion -1). In the absence of perturbing factors like noise, the state of the system always converges towards consensus and a phase transition is observed as a function of the initial concentration p: for p < 1/2 (> 1/2) all agents end up with opinion -1 (+1). Since the original formulation of the model [8], for a one-dimensional chain of agents, countless refinements have been made, which concern the type of graph, the updating rule, the introduction of external factors like a social temperature, advertising and ageing, etc. (for more details see [12,13]).
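The convincing rule and a random-sequential simulation can be sketched as follows (ring geometry; the function names and parameter values are our illustrative choices):

```python
import random

def convince(s, i):
    # The Sznajd convincing rule on a ring: if the neighbouring pair
    # (i, i+1) share the same opinion, they impose it on the two agents
    # flanking the pair.
    n = len(s)
    j = (i + 1) % n
    if s[i] == s[j]:
        s[(i - 1) % n] = s[i]
        s[(j + 1) % n] = s[i]

def evolve(p, n=101, updates=50000, seed=0):
    # Random initial opinions: +1 with probability p, -1 otherwise,
    # followed by random-sequential application of the convincing rule.
    rng = random.Random(seed)
    s = [1 if rng.random() < p else -1 for _ in range(n)]
    for _ in range(updates):
        convince(s, rng.randrange(n))
    return s
```

Long runs drive the system into one of the two consensus states, with the p = 1/2 transition described above determining which one is typically reached.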
The Sznajd dynamics has been used to devise simple election models which reproduce the bulk behaviour of vote distributions of real elections [15,16] (Fig. 2); this is at present the strongest validation of the SM.
4. Conclusions
Sociophysics, and in particular opinion dynamics, is taking its first steps, and there is still a lot to do. Nevertheless, the first results are encouraging, and the hope of explaining in this way the collective behaviour of social systems is strong. For the future it is necessary to gather more data from real systems and to open collaborations with sociologists.
Acknowledgments
I thank D. Stauffer for letting me discover this fascinating field and the Volkswagen Foundation for financial support.
References
1. E. Callen and D. Shapiro, Physics Today, 23 (July 1974).
2. S. Galam, Y. Gefen and Y. Shapir, J. Math. Soc. 9, 1 (1982).
3. W. Weidlich, Sociodynamics: A Systematic Approach to Mathematical Modelling in the Social Sciences, Harwood Academic Publishers, 2000.
4. B. Latané, Am. Psychologist 36, 343 (1981).
5. J. A. Holyst, K. Kacperski and F. Schweitzer, in Annual Reviews of Computational Physics, vol. 9, D. Stauffer ed., World Scientific, 253 (2001).
6. R. Hegselmann and U. Krause, J. Art. Soc. Soc. Sim. 5, issue 3, paper 2 (jasss.soc.surrey.ac.uk) (2002). U. Krause, p. 37 in Modellierung und Simulation von Dynamiken mit vielen interagierenden Akteuren, U. Krause and M. Stöckler eds., Bremen University, Jan. 1997.
7. G. Deffuant, D. Neau, F. Amblard and G. Weisbuch, Adv. Compl. Syst. 3, 87 (2000). G. Deffuant, F. Amblard, G. Weisbuch and T. Faure, J. Art. Soc. Soc. Sim. 5, issue 4, paper 1 (jasss.soc.surrey.ac.uk) (2002). G. Weisbuch, Eur. Phys. J. B 38, 339 (2004). F. Amblard and G. Deffuant, Physica A 343, 725 (2004).
8. K. Sznajd-Weron and J. Sznajd, Int. J. Mod. Phys. C 11, 1157 (2000).
9. P. I. Krapivsky and S. Redner, Phys. Rev. Lett. 90, 238701 (2003). L. R. Fontes, R. H. Schonmann and V. Sidoravicius, Comm. Math. Phys. 228, 495 (2002). F. Wu and B. Huberman, cond-mat/0407252 at www.arXiv.org. C. Castellano, D. Vilone and A. Vespignani, Europhys. Lett. 63, 153 (2003).
10. S. Galam, J. Stat. Phys. 61, 943 (1990). S. Galam, Physica A 336, 56 (2004). P. Chen and S. Redner, cond-mat/0408219 at www.arXiv.org.
11. R. Axelrod, J. Conflict Resolut. 41, 203 (1997).
12. S. Fortunato and D. Stauffer, in Extreme Events in Nature and Society, S. Albeverio, V. Jentsch and H. Kantz eds., Springer Verlag, Berlin-Heidelberg (2005).
13. D. Stauffer, AIP Conference Proceedings 690, 147 (2003).
14. S. Fortunato, Int. J. Mod. Phys. C 15, 1021 (2004).
15. A. T. Bernardes, D. Stauffer and J. Kertész, Eur. Phys. J. B 25, 123 (2002).
16. M. C. González, A. O. Sousa and H. J. Herrmann, Int. J. Mod. Phys. C 15, 45 (2004).
A MERTON-LIKE APPROACH TO PRICING DEBT BASED ON A NON-GAUSSIAN ASSET MODEL
LISA BORLAND, JEREMY EVNINE AND BENOIT POCHART
Evnine-Vaughan Associates, Inc.
456 Montgomery Street, Suite 800 San Francisco, CA 94133, USA E-mail: CsaOeuafunds. com
We propose a generalization of Merton's model for evaluating credit spreads. In his original work, a company's assets were assumed to follow a log-normal process. We introduce fat tails and skew into this model, along the same lines as in the option pricing model of Borland and Bouchaud (2004, Quantitative Finance 4), and illustrate the effects of each component. Preliminary empirical results indicate that this model fits empirically observed credit spreads well, with a parameterization that also matches observed stock return distributions and option prices.
1. Introduction
Recent years have brought with them a vastly growing interest in credit markets and the related credit derivatives markets. The amounts trading on these markets are growing rapidly. As the liquidity has increased, so has the interest in models which can price credit risk. Indeed, one of the most esteemed pricing models was proposed already in 1974 by Merton, although it is only recently that the credit markets have become liquid enough and transparent enough to actually use these models to price the yield on risky corporate bonds, whose spread to Treasuries is also called the credit spread. It is quite interesting to note, as Miller and Modigliani pointed out in 1958, that there is a relationship between a firm's debt and its equity, in that the sum of the two is equal to the total assets of the company. Therefore, a company should be indifferent to how it finances its projects, via debt or equity. On the other hand, the equity markets and bond markets are much more developed than the credit and credit derivatives markets. This discrepancy, together with the intimate relationship between the different markets, implies that there could be arbitrage opportunities if relative mispricings exist between the fair value of the debt relative to the fair value of the equity. Such capital structure arbitrage has been a very popular trading strategy, especially in the early 2000's. In reality, the capital structure of a company is quite complex. For example, corporate debt has different levels of seniority, which defines the order in which the debt holders get paid. Clearly, corporate debt is risky since there is a chance that the company will default before being able to pay back its debt. Equity holders
’,
get paid after all debt has been repaid, but even among stock holders there are different priority levels. However, as Merton pointed out, in the simple case where a company has just one type of debt outstanding, the stock price can be seen as a call option on the total underlying assets, struck at the face value of the debt. Likewise, the corporate bond can be interpreted as a long position in a riskless bond plus a short put on the assets, struck at the face value of the debt. Therefore, using the Black, Scholes and Merton option pricing theory it is possible to price corporate debt with the techniques of option pricing. Merton's structural model, and similar models which derive from Merton's ideas, constitute a widely used framework for pricing debt and credit risk, and for obtaining estimates of the risk-neutral probability of default of a company. The basic strength of Merton's model lies in the ability to price the debt as an option on the underlying asset process, via standard tools of option pricing theory. The approach is complicated by the fact that in reality the asset process is not observable and must be backed out from the stock process. Typically a log-normal distribution is assumed for the asset returns, and the driving noise for both the asset and the stock dynamics follows a standard Brownian process. However, it is quite well-known that the empirical distribution of log returns is highly non-Gaussian [5,6], exhibiting fat tails which are neglected in the standard Black, Scholes and Merton option pricing theory. Indeed, in options markets one observes that the standard theory underestimates the prices of options at strikes below and above the current stock price. This means that one must use a higher volatility parameter in conjunction with the Black-Scholes-Merton theory in order to correctly price these options.
A plot of this volatility versus the strike price generally forms a convex function, rather than the straight line which would be the case if the model were perfect. There have been several attempts in the literature to accommodate this fact, including stochastic volatility models, multifractal models, local volatility models, models that assume fat-tailed random noise such as Levy noise, GARCH-like models and recent multi-timescale models (for an up-to-date review see [7]). However, these models are often quite complex and the simplicity of the Black-Scholes-Merton approach is lost. A unique martingale measure is typically not found, nor are closed-form solutions. An alternative approach which one of us proposed recently [8,9,10] relies on modeling the noise as a statistical feedback process which yields non-Gaussian distributions for the stock returns yet maintains many useful features of the standard theory. In particular, closed-form solutions which generalize the Black-Scholes-Merton model are found. These incorporate both fat tails and skew, two features of real returns which are absent in the standard theory. Since the non-Gaussian theory has shown some success in being able to price options on the underlying equity in a parsimonious fashion which matches well to empirical observations, our goal in this paper is to explore whether the same can be said about pricing credit. The basic notion is to extend Merton's model for
pricing credit risk into the non-Gaussian framework. We shall then explore the effects of introducing fat tails and skew into the model. Finally, we shall report some preliminary empirical results which indeed lend support to our approach.
2. Merton's model
We proceed with a brief review of Merton's model. He developed a structural model relating the equity and debt markets by thinking of both stocks and bonds as contingent claims on the same underlying, namely the assets of the company. His model assumed that the assets A follow a standard log-normal process,
dA = μ A dt + σ_A A dw    (1)
where w is a standard Brownian noise, δ-correlated in time. As follows from the theorem of Miller and Modigliani, the value of the firm is invariant to its capital structure; in other words, A is equal to the sum of its debt D and stock (equity) S, such that the only relevant variable is the leverage L = D/A. Furthermore, one assumes that it is possible to continuously trade the assets. (Note that this is unrealistic: in reality the stock and the bonds are traded, but not the assets themselves.) The interest rate r is assumed constant over time. In Merton's model, the scenario is such that the company has issued bonds of face value D that will become due at a future time T. These bonds are assumed to be zero-coupon, which means that there are no intermediate payments before expiry. In addition, the company has issued equity with no dividend payments. If at time T the value of the assets A is less than the promised debt payment D, then the company defaults. The debt holders receive A < D and the equity holders receive nothing. If instead A > D, the debt holders get D and the stock holders get A - D. The equity of the company can therefore be seen as a European call option on the assets with maturity T and a strike price equal to D. In the standard framework this is just given by the closed-form Black-Scholes-Merton call option pricing formula (cf. [11]), namely

S_0 = A_0 N(d_1) - D e^{-rT} N(d_2)    (2)
where

d1 = [ln(A0/D̃) + σ_A² T/2] / (σ_A √T)   (3)

d2 = d1 - σ_A √T   (4)

and N(d) is the cumulative normal distribution calculated up to the limit d. Note that D̃ = D e^{-rT} is the present value of the promised debt payment, so this expression can be expressed in terms of the leverage L = D̃/A0, namely

S0 = A0 (N(d1) - L N(d2))
(5)
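The two equivalent expressions Eq. (2) and Eq. (5) are easy to check numerically. The following is a minimal sketch (the numerical values of A0, D, r, σ_A and T below are illustrative choices, not taken from the text):

```python
from math import erf, exp, log, sqrt

def N(d):
    """Cumulative standard normal distribution up to the limit d."""
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

def merton_equity(A0, D, r, sigma_A, T):
    """Equity as a Black-Scholes-Merton call on the assets, Eq. (2)."""
    Dtil = D * exp(-r * T)  # present value of the promised debt payment
    d1 = (log(A0 / Dtil) + 0.5 * sigma_A**2 * T) / (sigma_A * sqrt(T))
    d2 = d1 - sigma_A * sqrt(T)
    return A0 * N(d1) - Dtil * N(d2)

def merton_equity_leverage(A0, L, sigma_A, T):
    """Same value written in terms of the leverage L = Dtil/A0, Eq. (5)."""
    d1 = (-log(L) + 0.5 * sigma_A**2 * T) / (sigma_A * sqrt(T))
    d2 = d1 - sigma_A * sqrt(T)
    return A0 * (N(d1) - L * N(d2))
```

Both forms give the same S0, which lies between the intrinsic value A0 - D̃ and A0, as required for a call option.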
This expression depends on A0 and the asset volatility σ_A, both of which are unobservable. However, as shown by Jones et al. (cf. [12]), Ito's lemma can be used to determine the instantaneous asset volatility from the equity volatility, leading to

σ_S S0 = N(d1) σ_A A0   (6)

Equations (2) and (6) allow the asset value A0 and the asset volatility σ_A to be backed out as functions of the quantities S0, σ_S, T and L. Utilizing A = D + S, the present value of the debt can thus be calculated straightforwardly as

D0 = A0 - S0   (7)
where S0 is given by Eq. (2). Now, note that D0 can alternatively be expressed as

D0 = e^{-yT} D   (8)

namely, as the promised debt payment discounted by some yield y. The difference between this yield y and the risk-free rate r defines the credit spread s on the debt, namely

s = y - r   (9)
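Chaining Eqs. (2), (7), (8) and (9) gives the spread directly from the asset-side inputs. A minimal sketch (parameter values are illustrative assumptions, not from the text):

```python
from math import erf, exp, log, sqrt

def N(d):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

def credit_spread(A0, D, r, sigma_A, T):
    """Credit spread s = y - r implied by Eqs. (2), (7), (8) and (9)."""
    Dtil = D * exp(-r * T)
    d1 = (log(A0 / Dtil) + 0.5 * sigma_A**2 * T) / (sigma_A * sqrt(T))
    d2 = d1 - sigma_A * sqrt(T)
    S0 = A0 * N(d1) - Dtil * N(d2)  # equity as a call on the assets, Eq. (2)
    D0 = A0 - S0                    # present value of the debt, Eq. (7)
    y = -log(D0 / D) / T            # implied yield, Eq. (8)
    return y - r                    # Eq. (9)
```

The spread is always positive (since S0 exceeds the intrinsic value, D0 < D̃) and grows with leverage, consistent with the discussion of Figure 1 below.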
This is a very interesting quantity, because it can be shown to be roughly equivalent to the premium on the highly traded credit derivative called a Credit Default Swap (CDS). This is a transaction in which one party (the protection buyer) pays an annual premium to another party (the protection seller), who basically offers insurance on the bond of a company. If a default event (which is carefully defined) occurs, the protection seller delivers par to the protection buyer and receives the defaulted bond in exchange. (There are well-defined criteria associated with this, somewhat similar to the delivery options in a bond futures contract. Cash-settled versions of this contract also exist.)

3. A non-Gaussian approach
The driving noise of the asset process in Merton's model is Gaussian, and the related equity process is also a Gaussian process with a constant volatility σ_S, which can be related to the asset volatility σ_A via Eq. (6). Thus, the process for stock returns is a standard lognormal one, implying that the stock log-returns are normally distributed across all time scales. Although this is a standard assumption in much of mathematical finance, leading to many interesting and useful results, it is surprisingly far from what one observes empirically. Indeed, real stock returns exhibit fat tails and peaked centers, only slowly converging to a Gaussian distribution as the time scale increases [5,6]. One model which generalizes the standard model in a way more consistent with empirical observations has recently been proposed by one of us [8,9,10]. In the spirit of that model, we generalize the asset price dynamics to follow a non-Gaussian statistical feedback process with skew, namely
dA = μA dt + σ_A A^α dΩ   (10)

dΩ = P^{(1-q)/2}(Ω) dw   (11)
Here w is a standard Brownian noise, σ_A is a variance parameter, and Ω is related to the rate of return of the firm. The parameter α introduces an asymmetry into the model. Ω evolves according to a statistical feedback process [13], where the probability distribution P evolves according to a nonlinear Fokker-Planck equation
∂P/∂t = (1/2) ∂²[P^{2-q}]/∂Ω²   (12)

This diffusion equation maximizes the Tsallis entropy of index q [14,15]. Equation (12) can be solved exactly, leading, when the initial condition on P is P(Ω, t = 0) = δ(Ω), to a Tsallis distribution (equivalent to a Student distribution if q = (3 + n)/(1 + n), where n corresponds to the integer degrees of freedom [16]):

P = (1/Z(t)) (1 + (q - 1) β(t) Ω²(t))^{-1/(q-1)}   (13)

with

β(t) = c_q^{(1-q)/(3-q)} ((2 - q)(3 - q) t)^{-2/(3-q)}   (14)

and

Z(t) = ((2 - q)(3 - q) c_q t)^{1/(3-q)}   (15)

where the q-dependent constant c_q = (π/(q - 1)) Γ²(1/(q - 1) - 1/2)/Γ²(1/(q - 1)). Eq. (13) recovers a Gaussian in the limit q → 1, while exhibiting power-law tails for all q > 1. The index q controls the feedback into the system: for q > 1, rare events (small P) give rise to large fluctuations, whereas more common events (larger P) yield more moderate fluctuations. Pricing options based on this type of model was solved in previous work [8,10]. For q = α = 1, the standard Black-Scholes-Merton model is recovered. For q = 1 but general α < 1, the model reduces to the Constant Elasticity of Variance (CEV) model of Cox and Ross (cf. [10]). Closed-form solutions for European call options based on this model were found. Evaluating the equity as a call option on the asset process struck at the debt D, we thus obtain a closed-form expression for S0, Eq. (16), with x_T, d1 and d2 defined in the Appendix. Just as in the standard Merton framework, it is possible to show that the relationship Eq. (6) still holds in the non-Gaussian case. Though slightly more complicated, it is still possible to back out A0 and σ_A from Eq. (16) and Eq. (6). The debt can then be calculated as D0 = A0 - S0, with S0 as in Eq. (16), and the credit spread can also be found as in Eq. (9), namely

s_q = y - r = -(1/T) log(D0/(D e^{-rT}))   (17)

The main difference is that now the spread is parameterized by q and α, because D0 is based on S0 of Eq. (16), allowing us to explore the effects of fat tails (q > 1) and skew (α < 1).
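The correspondence q = (3 + n)/(1 + n) between the Tsallis and Student distributions quoted below Eq. (12) can be verified directly by matching exponents. A minimal sketch (the values of n and of the scale β below are illustrative choices, not from the text):

```python
def tsallis(x, q, beta):
    """Unnormalized Tsallis density of Eq. (13) at fixed time."""
    return (1.0 + (q - 1.0) * beta * x * x) ** (-1.0 / (q - 1.0))

def student(x, n):
    """Unnormalized Student-t density with n degrees of freedom."""
    return (1.0 + x * x / n) ** (-(n + 1.0) / 2.0)

n = 3                          # integer degrees of freedom
q = (3.0 + n) / (1.0 + n)      # the quoted mapping: q = 1.5 for n = 3
beta = 1.0 / ((q - 1.0) * n)   # choose beta so that (q-1)*beta = 1/n
ratios = [student(x, n) / tsallis(x, q, beta) for x in (0.0, 0.7, 2.5, 10.0)]
```

With this choice the two exponents coincide, 1/(q-1) = (n+1)/2, so the ratio of the two densities is constant in x.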
Figure 1. Implied volatility of the credit spread as a function of d = D/A0 for various values of q (the tail) and α (the skew). Curves shown: q = 1, α = 0; q = 1.2, α = 1; q = 1.4, α = 1; q = 1.4, α = 0.5.
4. Results and Discussion
Based on the standard Merton model, many practitioners would calculate the fair price of the credit spread from Eq. (9). In order to obtain a value of s that matched the market value of the traded CDS, they would have to use a particular value of σ_S. This implied σ_S should in theory be the real volatility of the stock. In addition, it should theoretically also be the correct volatility to price options on the stock. But in practice, it is already known that it is very difficult to find a single value of σ_S that can price all stock options, let alone both stock options and credit spreads. In our previous work, we showed that by introducing tails and skew into the underlying stock process, it was in many cases possible to find a single value of σ_S that matched well the underlying stock distribution, as well as observed stock option prices. The question we now ask is whether the values of q and α that fit well to the options market could also well-model observed spreads, or CDS prices. To answer this question we cite preliminary results [17], where the parameters which best fit market spreads across a sample of 54 companies were implied based on a least-squares fit between the theoretical spread values and market values. In that analysis, all the observables which enter the spread calculation were taken from reliable data sources as described in [17]. Certain assumptions were made in order to map the complex capital structure of real companies onto the simplified single-debt scenario discussed here. This was done following the lines of [12], where a weighted average of all outstanding debt was used to create a synthetic bond with a single face value and expiry. Interestingly, the parameter q was found to be in the range q = 1.2 to q = 1.4, and on average the skew parameter was α = 0.3 with little variation. These results are quite encouraging, since typical values that are found to fit option markets are very close to these. An example is given in [10], where stock options on MSFT are found to be well fit by q ≈ 1.4 and α ≈ 0.3, with σ ≈ 0.3 as well. (Note that in that example σ_S was also implied from the option prices, whereas in the current example, σ_S was calculated from historical observations for each entity.)

Finally, we show a figure (Figure 1) by way of which we want to give a feel for the effect of tails (q) and skew (α) on the credit spread values. We used r = 4%, σ_S = 20%, T = 1 year and A0 = 100 (in arbitrary monetary units). We varied the ratio d = D/A0 from 0.5 to 1.5 and calculated the spread according to Eq. (17) for various combinations of q and α, calibrating them by adjusting σ_S so that all curves coincide for d = 1. Then, the volatility which would have been needed for the standard Merton model (q = α = 1) of Eq. (9) to reproduce the same values of the spread was backed out. These implied volatilities were then plotted as a function of d. Similar in spirit to the implied volatility smiles commonly used to depict the deviations that tails and skew induce in option prices relative to the standard Black-Scholes-Merton model, these curves give an intuition as to how the spread values vary relative to the standard log-normal Merton model. The standard model corresponds to q = 1 and α = 1. The effect of increasing the tails a bit can be seen in the curve corresponding to q = 1.2 and α = 1: the standard Merton model undervalues spreads for leverage ratios both greater and less than 1. As q increases (see the curve for q = 1.4), this effect is enhanced. The curve q = 1 and α = 0 depicts the effect of skew, and corresponds to a CEV-type Merton model. The standard Merton model overvalues spreads if the debt-to-asset ratio is greater than 1, and undervalues them otherwise. The effect of both tails and skew can be seen in the other curves. The observed behaviour is consistent with the intuition that fat tails correspond to higher volatilities with respect to a Gaussian model, and increased left skew corresponds to higher volatilities for d < 1, and relatively lower volatilities for d > 1. It is quite easy to visualize where the preliminary empirical results of q ≈ 1.3 and α ≈ 0.3 [17] would lie on this curve. However, further studies must be done to analyze the CDS values systematically as a function of the ratio d. In the initial study, the results reported were averages over all the different companies, each of which had its own capital structure and leverage ratio. However, encouraged by the fact that the parameters which best describe the CDS spreads are in the same ballpark as those which well-fit stock options and empirical return distributions, it seems worthwhile to push further along the path we have proposed, a project we are currently pursuing.
5. Appendix

x_T is a function of Ω_T, given by

x(T) = σ Ω_T - (α σ²/2) [a(T̃) + b(T̃) Ω_T + c(T̃) Ω_T² + d(T̃) Ω_T³]

where T̃ = (e^{2(α-1)rT} - 1)/(2(α-1)r), and P_q is given by Eq. (13) evaluated at t = T. The coefficients a, b, c and d are functions of q, α and T̃, involving the constant c_q; their explicit forms are given in [10]. The payoff condition A_T = D yields a quadratic equation with the roots

d_{1,2} = (N ± √(N² - 4MR))/(2M)

with

R = ((D e^{-rT}/A0)^{1-α} - 1)/(1 - α)

and M and N combinations of the coefficients a, b, c, d, σ and α. These results are as in [10].
References
1. R.C. Merton, On the Pricing of Corporate Debt: The Risk Structure of Interest Rates, Journal of Finance 29, 449-470 (May 1974).
2. F. Modigliani and M.H. Miller, The Cost of Capital, Corporation Finance and the Theory of Investment, American Economic Review 48 (3), 261-297 (1958).
3. F. Black and M. Scholes, The Pricing of Options and Corporate Liabilities, Journal of Political Economy 81, 637-659 (1973).
4. R.C. Merton, Theory of Rational Option Pricing, Bell Journal of Economics and Management Science 4, 143-182 (1973).
5. J.-P. Bouchaud and M. Potters, Theory of Financial Risks and Derivative Pricing (Cambridge: Cambridge University Press), 2nd Edition (2004).
6. P. Gopikrishnan, V. Plerou, L.A. Nunes Amaral, M. Meyer and H.E. Stanley, Scaling of the distribution of fluctuations of financial market indices, Phys. Rev. E 60, 5305 (1999).
7. L. Borland, J.-P. Bouchaud, J.-F. Muzy and G. Zumbach, The dynamics of financial markets: Mandelbrot's cascades and beyond, to appear in Wilmott Magazine, March 2005, cond-mat/0501292 (2005).
8. L. Borland, Option Pricing Formulas based on a non-Gaussian Stock Price Model, Phys. Rev. Lett. 89, 098701 (2002).
9. L. Borland, A Theory of non-Gaussian Option Pricing, Quantitative Finance 2, 415-431 (2002).
10. L. Borland and J.-P. Bouchaud, A Non-Gaussian Option Pricing Model with Skew, Quantitative Finance 4, 499-514 (2004).
11. J.C. Hull, Options, Futures, and other Derivatives, Third Edition, Prentice-Hall (1997).
12. J. Hull, I. Nelken and A. White, Merton's Model, Credit Risk, and Volatility Skews, working paper (2004).
13. L. Borland, Microscopic dynamics of the nonlinear Fokker-Planck equation: a phenomenological model, Phys. Rev. E 57, 6634 (1998).
14. C. Tsallis, J. Stat. Phys. 52, 479 (1988); E.M.F. Curado and C. Tsallis, J. Phys. A 24, L69 (1991); 24, 3187 (1991); 25, 1019 (1992).
15. C. Tsallis and D.J. Bukman, Anomalous diffusion in the presence of external forces: Exact time-dependent solutions and their thermostatistical basis, Phys. Rev. E 54, R2197 (1996).
16. A.M.C. de Souza and C. Tsallis, Student's t- and r-distributions: Unified derivation from an entropic variational principle, Physica A 236, 52-57 (1997).
17. R. Chirayathumadom, R.Z. George, V. Balla, D. Bhagchandka, N. Shah and K. Shah, Capital Structure Arbitrage using a non-Gaussian approach, Investment Practice project report MS&E 444, Stanford University (June 2004).
THE SUBTLE NATURE OF MARKET EFFICIENCY
JEAN-PHILIPPE BOUCHAUD*,†

* Science & Finance, Capital Fund Management, 6-8 Bd Haussmann, 75009 Paris, France
† SPEC, Commissariat à l'Energie Atomique, Orme des Merisiers, 91191 Gif-sur-Yvette CEDEX, France

It is known since Bachelier 1900 that price changes are nearly uncorrelated, leading to a random-walk-like behaviour of prices. We provide evidence for a very subtle compensation mechanism that underlies the 'random' nature of price changes. This compensation drives the market close to a critical point, which may explain the sensitivity of financial markets to small perturbations, and their propensity to enter bubbles and crashes. We argue that the resulting unpredictability of price changes is very far from the neo-classical view that markets are informationally efficient.
1. Introduction

Statistical models that describe price fluctuations have a long history, which dates back to Bachelier's "Brownian walk" model for speculative prices, first published in 1900 [1]. Despite its shortcomings (absence of fat tails and volatility clustering effects), the model of Bachelier gets one important fact right: price changes are to a first approximation uncorrelated, which makes the prediction of stock markets difficult. However, the mechanisms removing (nearly) all predictability from price changes, out of much more predictable human behaviour, were not investigated in detail until the past few years. The availability of high-frequency, trade-by-trade data on the one hand, and the shift of paradigm from efficient markets by fiat to agent-based, bounded-rationality models on the other, have motivated a series of exciting studies that most probably anticipate an important revolution of ideas in economics. The aim of this paper is to provide evidence for a very subtle compensation mechanism that underlies the 'random' nature of price changes. This compensation drives the market close to a critical point, a possibility conjectured in different contexts to explain the presence of power laws and scale invariance in the statistics of financial time series. The proximity of a critical point might also explain the enhanced sensitivity of financial markets to small perturbations, and their propensity to enter bubbles and crashes. We argue that the resulting unpredictability of price changes is very far from the neo-classical view that markets are informationally efficient. Why are price changes nearly uncorrelated, as postulated by Bachelier? According to the Efficient Market Hypothesis (EMH), all available information is included in the price, which emerges at all times from the consensus between fully rational
agents, who would otherwise immediately arbitrage away any deviation from the fair price [2,3]. Price changes can then only be the result of unanticipated news and are by definition totally unpredictable. However, as pointed out by Shiller, the observed volatility of markets is far too high to be compatible with the idea of fully rational pricing [4]. More fundamentally, the assumption of rational, perfectly informed agents seems intuitively much too strong, and has been criticized by many [5,6,7]. There is a model at the other extreme of the spectrum where prices also follow a pure random walk, but for a totally different reason [8,9,10]. Assume that agents, instead of being fully rational, have zero intelligence and take random decisions to buy or to sell, but that their action is interpreted by all the other agents as potentially containing some information. Then, the mere fact of buying (or selling) typically leads to a change of the ask a(t) (or bid b(t)) price, and hence to a change of the midpoint m(t) = [a(t) + b(t)]/2. In the absence of reliable information about the 'true' price, the new midpoint is immediately adopted by all other market participants as the new reference price around which new orders are launched. In this case, the midpoint will also follow a random walk (at least for sufficiently large times), even if trades are not motivated by any rational decision and devoid of meaningful information. Of course, reality should lie somewhere in the middle: clearly, the price cannot wander arbitrarily far from a reasonable value, and trades cannot all be random. Here, we want to argue, based on a series of detailed empirical results obtained on trade-by-trade data, that the random walk nature of prices is in fact highly non-trivial and results from a fine-tuned competition between two populations of traders: liquidity providers ('market-makers') on the one hand, and liquidity takers on the other.
Liquidity providers act such as to create anti-persistence (or mean reversion) in price changes, which would lead to a sub-diffusive behaviour of the price, whereas the liquidity takers' action leads to long-range persistence and super-diffusive behaviour. Both effects very precisely compensate and lead to an overall diffusive behaviour, at least to a first approximation, such that (statistical) arbitrage opportunities are absent, as expected. We argue that in a very precise sense the market is sitting on a critical point [11,12]: the dynamical compensation of two conflicting tendencies is similar to other complex systems such as the heart [13], driven by two antagonist systems (sympathetic and para-sympathetic), or certain human tasks, such as the balancing of a long stick [14]. The latter example illustrates very clearly the idea of dynamical equilibrium, and shows how any small deviation from perfect balance may lead to strong instabilities. This near-instability may well be at the origin of the fat tails and volatility clustering observed in financial data. Note that these two features are indeed present in the 'balancing stick' time series studied in [14].
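The zero-intelligence mechanism described above can be sketched in a few lines: random buy/sell orders shift the midpoint, the new midpoint is adopted as the reference price, and the resulting series is diffusive. This is a toy sketch; the tick size, half-tick shift and starting price are arbitrary choices, not taken from the text:

```python
import random

random.seed(0)
tick = 0.01
m = [100.0]                           # midpoint series, arbitrary starting price
for _ in range(50_000):
    eps = random.choice((-1, 1))      # zero-intelligence: buy or sell at random
    m.append(m[-1] + eps * tick / 2)  # new midpoint adopted as the reference price

def msd(ell):
    """Mean-square displacement after ell trades: ~ ell for a diffusive series."""
    k = len(m) - ell
    return sum((m[i + ell] - m[i]) ** 2 for i in range(k)) / k
```

For independent order signs, msd(ℓ) grows linearly in ℓ, i.e. the midpoint performs a random walk even though no trade carries information.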
2. The market response function and trade correlations
The last quote before a given trade allows one to define the sign of each trade: if the traded price is above the last midpoint m = (a + b)/2, this means that the trade was triggered by a market order to buy, and we will assign to that trade a variable ε = +1. If, on the other hand, the traded price is below the last midpoint, then ε = -1. The simplest quantity to study is the average mean-square fluctuation of the price between (trade) time n and n + ℓ:

D(ℓ) = ⟨(p_{n+ℓ} - p_n)²⟩   (1)
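A quick numerical check of Eq. (1): for a synthetic price series with uncorrelated changes, the estimator below recovers the strictly diffusive behaviour D(ℓ) = Dℓ. The unit step size and series length are illustrative assumptions, not the FT data:

```python
import random

random.seed(7)
# synthetic trade-time price with uncorrelated unit price changes
p = [0.0]
for _ in range(200_000):
    p.append(p[-1] + random.choice((-1.0, 1.0)))

def D(ell):
    """Estimator of D(ell) = <(p_{n+ell} - p_n)^2>, Eq. (1)."""
    k = len(p) - ell
    return sum((p[n + ell] - p[n]) ** 2 for n in range(k)) / k
```

Here D(ℓ)/ℓ is flat in ℓ; any linear correlation between successive changes would bend this curve, which is what the empirical linearity of D(ℓ) rules out.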
As emphasized above, in the absence of any linear correlations between successive price changes, D(ℓ) has a strictly diffusive behaviour, D(ℓ) = Dℓ. On liquid stocks one finds a remarkably linear behaviour for D(ℓ) (see Fig. 1), even for small ℓ. The absence of linear correlations in price changes is compatible with the idea that (statistical) arbitrage opportunities are absent, even for high-frequency trading. In order to better understand the impact of trading on price changes, one can study the following response function R(ℓ), defined as:

R(ℓ) = ⟨(p_{n+ℓ} - p_n) · ε_n⟩   (2)
where ε_n is the sign of the n-th trade. The quantity R(ℓ) measures how much, on average, the price moves up a time ℓ later, conditioned on a buy order at time 0 (or how much a sell order moves the price down). We show in Fig. 1 the temporal structure of R(ℓ) for France Telecom, for different periods. Note that R(ℓ) increases by a factor ~2 between ℓ = 1 and ℓ = ℓ* ≈ 1000, before decreasing back. Including overnights allows one to probe larger values of ℓ and confirms that R(ℓ) decreases, and even becomes negative beyond ℓ ≳ 5000. Similar results have been obtained for many different stocks as well. However, in some cases the maximum is not observed and rather R(ℓ) keeps increasing mildly [11,12]. The model discussed below does in fact allow for monotonic response functions. All the above results are compatible with a 'zero intelligence' picture of financial markets, where each trade is random in sign and shifts the price permanently, because all other participants update their evaluation of the stock price as a function of the last trade. This picture of a totally random stock market is however qualitatively incorrect, for the following reason. Although, as mentioned above, the statistics of price changes reveals very little temporal correlation, the correlation function of the sign ε_n of the trades, surprisingly, reveals very slowly decaying correlations as a function of trade time, as discovered in [11,15,12]. More precisely, one can consider the following correlation function:

C_0(ℓ) = ⟨ε_{n+ℓ} ε_n⟩   (3)

If trades were random, one should observe that C_0(ℓ) decays to zero beyond a few trades. Surprisingly, this is not what happens: on the contrary, C_0(ℓ) is strong and decays very slowly toward zero, as an inverse power-law of ℓ (see [11,15,12]):

C_0(ℓ) ∝ ℓ^{-γ}   (4)
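The contrast between random trading and order splitting is easy to reproduce with the estimator of Eq. (3). A toy sketch (the chunk-length distribution of the splitting model is an arbitrary assumption, chosen only to produce persistent signs, not the power law of Eq. (4)):

```python
import random

random.seed(1)

def C0(signs, ell):
    """Estimator of the sign correlation C0(ell) = <eps_{n+ell} eps_n>, Eq. (3)."""
    k = len(signs) - ell
    return sum(signs[n] * signs[n + ell] for n in range(k)) / k

M = 100_000
iid = [random.choice((-1, 1)) for _ in range(M)]  # memoryless trading

# crude order splitting: each metaorder executed as 1..50 same-sign child trades
split = []
while len(split) < M:
    split.extend([random.choice((-1, 1))] * random.randint(1, 50))
split = split[:M]
```

The i.i.d. series gives C_0(ℓ) ≈ 0 at any lag, while the split series shows strong positive correlations that decay slowly with the lag, a caricature of the empirical behaviour.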
Figure 1. Average response function R(ℓ) for FT, during three different periods (black symbols). We have given error bars for the 2002 data. For the 2001 data, the y-axis has been rescaled to best collapse onto the 2002 data. Using the same rescaling factor, we have also shown the data for [D(ℓ)/ℓ]^{1/2}, which shows that (i) the process is indeed nearly diffusive and (ii) D ∝ R², indicating a sort of Fluctuation-Response relation.
The value of γ seems to be somewhat stock-dependent, but is consistently found to be smaller than unity, leading to a non-integrable correlation function. This in general leads to super-diffusion, and is the main puzzle to elucidate: how can one reconcile the strong, slowly decaying correlations in the sign of the trades with the nearly diffusive nature of the price fluctuations, and the nearly structureless response function?
3. A micro-model of price fluctuations

In order to understand the above results, we will postulate the following trade superposition model, where the price at time n is written as a sum over all past trades of the impact of one given trade, propagated up to time n:

p_n = Σ_{n'<n} G_0(n - n') ε_{n'} f(v_{n'}) + Σ_{n'<n} η_{n'}   (5)
where v_{n'} is the volume of the n'-th trade, f a certain concave function, and G_0(·) is the 'bare' impact function (or propagator) of a single trade [16,17,18,19,11]. The η_n are also random variables, assumed to be independent of the ε_n; they model all sources of price changes not described by the direct impact of the trades: the bid-ask can change as the result of some news, or of some order flow, in the absence of any trades. The bare impact function G_0(ℓ) represents by definition the average impact of a single trade after ℓ trades. In order to understand the temporal structure of G_0(ℓ), note that a single trade first impacts the midpoint by changing the bid (or the ask). But then the subsequent limit order flow due to that particular trade might either center on average around the new midpoint (in which case G_0(ℓ) would be constant), or, as we will argue below, tend to mean-revert toward the previous midpoint (in which case G_0(ℓ) decays with ℓ). If the signs ε_n were independent random variables, both the response function and the diffusion would be very easy to compute. For example, one would have:

R(ℓ) = ⟨f(v)⟩ G_0(ℓ)   (6)
i.e. the observed response function and the bare impact function would be proportional. This case (no correlations between the ε's and a constant bare impact function) corresponds to the simplest possible zero-intelligence market, where agents are memoryless, and the price is obviously a random walk. However, we have seen that in fact the ε's have long-range correlations. In this case, the average response function reads:

R(ℓ) = ⟨f(v)⟩ G_0(ℓ) + Σ_{0<n<ℓ} G_0(ℓ - n) C_1(n) + Σ_{n>0} [G_0(ℓ + n) - G_0(n)] C_1(n)   (7)

where:

C_1(ℓ) = ⟨ε_{n+ℓ} ε_n f(v_n)⟩   (8)
If the impact G_0 is constant and C_1(n) decays as a power-law with an exponent γ < 1, then the average impact R(ℓ) should grow like ℓ^{1-γ}, and therefore be amplified by a very large factor as ℓ increases, at variance with empirical data. Similarly, diffusion should be anomalously enhanced: D(ℓ) ∝ ℓ^{2-γ}, instead of Bachelier's first law D(ℓ) ∝ ℓ. The only way to resolve this paradox is to assume that G_0(ℓ) itself decays with time, in such a way as to offset the amplification effect due to the trade correlations. If we make the ansatz that the bare impact function G_0(ℓ) also decays as a power-law for large ℓ, as ℓ^{-β}, then one can estimate D(ℓ) and R(ℓ) in the large-ℓ limit. When γ < 1, one finds D(ℓ) ∝ ℓ^{2-2β-γ}, provided β < 1. Therefore, the condition that the fluctuations are diffusive at long times imposes a relation between the decay of the sign autocorrelation γ and the decay of the bare impact function β: β = β_c = (1 - γ)/2. For β > β_c, the price is sub-diffusive, which means that price changes show anti-persistence; while for β < β_c, the price is super-diffusive, i.e. price changes are persistent.
Figure 2. Theoretical impact function R(ℓ), from Eq. (7), for different values of β close to β_c = 0.38. The shape of the empirical response function can be quite accurately reproduced using β = 0.42. Note that for β > β_c, the response function actually becomes negative at long times, as indeed observed empirically for ℓ > 5000.
For the response function, one finds for large ℓ:

R(ℓ) ≃ A(β, γ) ℓ^{1-β-γ}   (9)

Therefore, only when β = β_c is the prefactor A(β, γ) exactly zero, leading to the possibility of a nearly constant impact function. Since the dominant term vanishes in the 'critical' case β = β_c, and since we are interested in the whole function R(ℓ) (including the small-ℓ regime), one can compute R(ℓ) numerically, by performing the discrete sum Eq. (7) exactly, and fit it to the empirical response R. The value of β is a fitting parameter: we show in Fig. 2 the response function computed for different values of β in the vicinity of β_c = 0.38. The results are compared with the empirical data for FT, showing that one can indeed satisfactorily reproduce, when β ≈ β_c, a weakly increasing impact function that reaches a maximum and then decays. One also sees, from Fig. 2, that the relation between β and γ must be quite accurately satisfied, otherwise the response function shows a distinct upward trend (for β < β_c) or a downward trend (for β > β_c).
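The discrete evaluation just described can be sketched as follows. The normalizations G_0(n) = n^{-β}, C_1(n) = n^{-γ}, ⟨f(V)⟩ = 1 and the finite cutoff on the infinite sum are simplifying assumptions of this sketch, not choices made in the text:

```python
def R(ell, beta, gamma, N=300_000):
    """Discrete sum of Eq. (7) with G0(n) = n**-beta, C1(n) = n**-gamma, <f(V)> = 1."""
    G0 = lambda n: n ** -beta
    C1 = lambda n: n ** -gamma
    r = G0(ell)                                                   # bare impact term
    r += sum(G0(ell - n) * C1(n) for n in range(1, ell))          # second term of Eq. (7)
    r += sum((G0(ell + n) - G0(n)) * C1(n) for n in range(1, N))  # third term, truncated at N
    return r
```

With γ = 0.24 (so β_c = 0.38), taking β = 0.2 < β_c produces a response that keeps growing with ℓ, while β = 0.6 > β_c produces a decreasing one, reproducing the upward/downward trends of Fig. 2.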
4. Discussion: critical balance of market orders vs. limit orders

Although trading occurs for a large variety of reasons, it is useful to recognize that traders organize into two broad categories:

- One is that of 'liquidity takers', who trigger trades by putting in market orders. The motivation for this category of traders might be to take advantage of some 'information', and make a profit from correctly anticipating future price changes. Information can in fact be of very different nature: fundamental (firm based), macro-economical, political, statistical (based on regularities of price patterns), etc. Unfortunately, information is often hard to interpret correctly - except of course for insiders - and it is probable that many of these 'information'-driven trades are misguided (on this point, see [20,21] and refs. therein). For example, systematic hedge funds which take decisions based on statistical pattern recognition have a typical success rate of only 52%. There is no compelling reason to believe that the intuition of traders in trading rooms fares much better than that. Since market orders allow one to be immediately executed, many impatient investors, who want to liquidate their position, or hedge, etc., might be tempted to place market orders, even at the expense of the bid-ask spread s(t) = a(t) - b(t).
- The other category is that of 'liquidity providers' (or 'market makers', although on electronic markets all participants can act as liquidity providers by putting in limit orders), who offer to buy or to sell but avoid taking any bare position on the market. Their profit comes from the bid-ask spread s: the sell price is always slightly larger than the buy price, so that each round-turn operation leads to a profit equal to the spread s, at least if the midpoint has not changed in the meantime (see below).
This is where the game becomes interesting. Assume that a liquidity taker wants to buy, so that an increased number of buy orders arrive on the market. The liquidity providers are tempted to increase the offer (or ask) price a, because the buyer might be informed and really know that the current price is too low and that it will most probably increase in the near future. Should this happen, the liquidity provider, who has to close his position later, might have to buy back at a much higher price and experience a loss. In order not to trigger a sudden increase of a that would make their trade costly, liquidity takers obviously need to avoid placing orders that are too large. This is the rationale for dividing one's order into small chunks and dispersing these as much as possible over time, so as not to appear on the 'radar screens'. Doing so, liquidity takers necessarily create some temporal correlations in the sign of the trades. Since these traders probably have a somewhat broad spectrum of volumes to trade, and therefore of trading horizons (from a few minutes to several weeks), this can easily explain the slow, power-law decay of the sign correlation function C_0(ℓ) reported above.
Now, if the market orders in fact do not contain useful information but are the result of hedging, noise trading, misguided interpretations, errors, etc., then the price should not move up in the long run, and should eventually mean-revert to its previous value. Liquidity providers are obviously the active force behind this mean reversion, again because closing their position will be costly if the price has moved up too far from the initial price. However, this mean reversion cannot take place too quickly, again because a really informed trader would then be able to buy a large volume at a modest price. Hence, this mean reversion must be slow enough. These are the basic ingredients ruling the game between liquidity providers and liquidity takers. The subtle balance between the positive correlation in the trades (measured by γ) and the liquidity molasses induced by liquidity providers (measured by β) is a self-organized dynamical equilibrium: if the liquidity providers are too slow to revert the price (β < (1 - γ)/2), then the price is super-diffusive and liquidity providers lose money. If they are too fast (β > (1 - γ)/2), the residual anticorrelations can be used by liquidity takers to buy larger quantities of stock at a low price in a given time interval, which is an incentive to speed up the trading and increase γ. A dynamical equilibrium where β ≈ (1 - γ)/2 therefore establishes itself spontaneously, with clear economic forces driving the system back towards this equilibrium (see Fig. 3). Interestingly, fluctuations around this equilibrium lead to fluctuations of the local volatility, since persistent patches correspond to high local volatility and anti-persistent patches to low local volatility.
The extreme crash situations are well known to be liquidity crises, where the liquidity molasses effect disappears temporarily, destabilising the market (on that point, see the detailed recent study of [23,24]). Our finding is quite significant in the context of the long-standing question of the impact of trades in financial markets, which is often decomposed into a transient part and a permanent part (see e.g. [25]). The results presented above suggest that such a distinction may not be justified. The power-law nature of G_0 means that there is no characteristic time scale beyond which the impact can be considered as permanent (at least up to a few days). From a practical point of view, our result means that a trader launching N trades of volume v during a period of time T will on average move the price by an amount Δp given by:
where n0 is the number of trades beyond which Go ceases to be approximately constant (see 11,12) and ρ is the total density of trades per unit time in the market. This result is in qualitative agreement with measurements of the impact of our own trades on stock markets, and is important inasmuch as it directly governs the real cost associated with these trades.
Figure 3. Scatter plot of the exponents β, γ extracted from the fits of Go and C1, see 12 for details. These exponents are seen to lie in the vicinity of the critical line β = (1 − γ)/2 (dotted line), as expected from the nearly diffusive behaviour of prices.
5. Conclusion
The delicate competition between liquidity takers and liquidity providers is at the heart of Bachelier's first law, i.e. that price changes are, to a good approximation, linearly uncorrelated. The resulting absence of linear correlations in price changes, and therefore of arbitrage opportunities, is often postulated a priori in the economics literature, but the details of the mechanism that removes these arbitrage opportunities are rather obscure. The main message of our work is that the random walk nature of price changes on short time scales may not be due to the unpredictable nature of incoming news, but appears as a dynamical consequence of the competition between antagonistic market forces. The mere fact of trading so as to minimize impact for liquidity takers, and to optimize gains for liquidity providers, does lead to a random walk dynamics of the price, even in the absence of any real information. In fact, the role of real (and correctly interpreted) information appears to be rather thin: the fact that the intra-day volatility of a stock is nearly equal to its long time value suggests that the volatility is mostly due to the trading activity itself, which is dominated by noise trades. This result is most probably one of the mechanisms needed
to explain the excess volatility puzzle first raised by Shiller 4, and the anomalous, long ranged dynamics of the volatility discussed in many papers now. The conclusion that price changes are to a large extent induced by the trading activity itself seems to imply that the price random walk will, in the long run, wander arbitrarily far from the fundamental price, which would be absurd. But even if one assumes that the fundamental price is independent of time, a typical 3% noise induced daily volatility would lead to a significant (say a factor 2) difference between the traded price and the fundamental price only after a few years 26. Since the fundamental price of a company is probably difficult to determine to better than a factor two, say (see e.g. 27 and the recent discussion in 28), one only expects fundamental effects to limit the volatility on very long time scales (as indeed suggested by the empirical results of de Bondt and Thaler 29), but these are probably negligible on the short (intra-day) time scales of interest in most statistical analyses of financial markets. From a more general standpoint, the finding that the absence of arbitrage opportunities results from a critical balance between antagonistic effects is quite interesting. It might justify several claims made in the (econo-)physics literature that the anomalies in price statistics (fat tails in returns described by power laws, long range self similar volatility correlations, the long ranged correlations in signs reported here, and many others) are due to the presence of a critical point in the vicinity of which the market operates (see e.g. 30, and in the context of financial markets 31,32). If a fine-tuned balance between two competing effects is needed to ensure the absence of arbitrage opportunities, one should expect that fluctuations are crucial, since a local imbalance between the competing forces can lead to an instability.
In this respect, the analogy with the balancing of a long stick is quite enticing 14. In more financial terms, the breakdown of the conditions for this dynamical equilibrium is, for example, a liquidity crisis: a sudden cooperativity of market orders, which leads to an increase of the trade sign correlation function, can outweigh the liquidity providers' stabilizing (mean-reverting) role, and lead to crashes. This suggests that one should be able to write a mathematical model, inspired by our results, to describe this 'on-off intermittency' scenario, advocated (although in a different context) in 14,33,34.
Acknowledgments
I thank Y. Gefen, J. Kockelkoren, M. Potters, and M. Wyart for many discussions on the content of this paper.

References
1. L. Bachelier, Théorie de la Spéculation, Thesis, Paris, 1900, republished by J. Gabay, Paris (1995).
2. E. F. Fama, Efficient capital markets: A review of theory and empirical work, Journal of Finance, 25, 383 (1970).
3. P. A. Samuelson, Proof that properly anticipated prices fluctuate randomly, Industrial Management Review, 6, 41 (1965).
4. R. J. Shiller, Do stock prices move too much to be justified by subsequent changes in dividends?, American Economic Review, 71, 421 (1981); R. J. Shiller, Irrational Exuberance, Princeton University Press (2000).
5. W. B. Arthur, Complexity in Economic and Financial Markets, Complexity, 1, 1 (1995).
6. A. Shleifer, Inefficient Markets, An Introduction to Behavioral Finance, Oxford University Press (2000).
7. A. Orléan, Le pouvoir de la finance, Odile Jacob, Paris (1999); À quoi servent les marchés financiers ?, in Qu'est-ce que la Culture ?, Odile Jacob, Paris (2001).
8. M. Daniels, J. D. Farmer, G. Iori, E. Smith, Demand storage, market liquidity, and price volatility, SFI working paper 02-01-001. This paper appeared in final, but truncated form as: M. G. Daniels, J. D. Farmer, G. Iori, E. Smith, Quantitative model of price diffusion and market friction based on trading as a mechanistic random process, Phys. Rev. Lett. 90, 108102 (2003).
9. J.-P. Bouchaud, M. Mézard, M. Potters, Statistical properties of stock order books: empirical results and models, Quantitative Finance 2, 251 (2002).
10. E. Smith, J. D. Farmer, L. Gillemot, S. Krishnamurthy, Statistical theory of the continuous double auction, e-print cond-mat/0210475, to appear in Quantitative Finance.
11. J.-P. Bouchaud, Y. Gefen, M. Potters, M. Wyart, Quantitative Finance 4, 176 (2004).
12. J.-P. Bouchaud, J. Kockelkoren, M. Potters, Random walks, liquidity molasses and critical response in financial markets, e-print cond-mat/0406224.
13. C. K. Peng, J. Mietus, J. Hausdorff, S. Havlin, H. E. Stanley, and A. L. Goldberger, Long-Range Anticorrelations and Non-Gaussian Behavior of the Heartbeat, Phys. Rev. Lett. 70, 1343-1346 (1993); P. Bernaola-Galvan, P. Ch. Ivanov, L. A. N. Amaral, and H. E. Stanley, Scale Invariance in the Nonstationarity of Human Heart Rate, Phys. Rev. Lett.
87, 168105 (2001). The qualitative analogy with financial markets was recently discussed in: Z. Struzik, Taking the pulse of the economy, Quantitative Finance 3, 78 (2003).
14. J. L. Cabrera and J. G. Milton, On-Off Intermittency in a Human Balancing Task, Phys. Rev. Lett. 89, 158702 (2002).
15. F. Lillo, J. D. Farmer, The long memory of efficient markets, e-print cond-mat/0311053.
16. J. Hasbrouck, Measuring the information content of stock trades, Journal of Finance, XLVI, 179 (1991).
17. V. Plerou, P. Gopikrishnan, X. Gabaix, H. E. Stanley, Quantifying Stock Price Response to Demand Fluctuations, Phys. Rev. E 66, 027104 (2002).
18. F. Lillo, R. Mantegna, J. D. Farmer, Single Curve Collapse of the Price Impact Function for the New York Stock Exchange, e-print cond-mat/0207428; Master curve for price-impact function, Nature, 421, 129 (2003).
19. F. Lillo, J. D. Farmer, On the origin of power-law tails in price fluctuations, Studies in Nonlinear Dynamics and Econometrics 8, 1 (2004).
20. T. Odean, Do Investors Trade Too Much?, American Economic Review, 89, 1279 (1999).
21. C. Hopman, Are supply and demand driving stock prices?, MIT working paper, Dec. 2002.
22. On this point, see: X. Gabaix, P. Gopikrishnan, V. Plerou, H. E. Stanley, A theory of power-law distributions in financial markets fluctuations, Nature, 423, 267 (2003). We associate the power-law distribution of investors' size to the long range correlations
reported here rather than to the power-law tail of returns, as advocated in the above cited paper. This idea has been investigated in more detail in F. Lillo et al., A theory for long-memory in supply and demand, cond-mat/0412708.
23. J. Doyne Farmer, Laszlo Gillemot, Fabrizio Lillo, Szabolcs Mike, Anindya Sen, What really causes large price changes?, Quantitative Finance 4, C7 (2004).
24. P. Weber, B. Rosenow, Large stock price changes: volume or liquidity?, e-print cond-mat/0401132.
25. R. Almgren, C. Thum, E. Hauptmann, and H. Li, Direct estimation of equity market impact, submitted for publication, December 2004.
26. M. Wyart, J.-P. Bouchaud, Self-referential behaviour, overreaction and conventions in financial markets, e-print cond-mat/0303584, submitted to J. Economic Behaviour & Organisation.
27. F. Black, Noise, J. of Finance, 41, 529 (1986).
28. O. Guedj, J.-P. Bouchaud, Experts' earning forecasts: bias, herding and gossamer information, e-print cond-mat/0410079.
29. W. de Bondt, R. Thaler, Does the stock market overreact?, Journal of Finance, XL, 793 (1985).
30. P. Bak, How Nature Works: The Science of Self-organized Criticality, Copernicus, Springer, New York, 1996.
31. D. Challet, A. Chessa, M. Marsili, Y.-C. Zhang, From Minority Games to real markets, Quantitative Finance, 1, 168 (2001) and refs. therein; D. Challet, M. Marsili, Y.-C. Zhang, The Minority Game, Oxford University Press, 2004.
32. I. Giardina, J.-P. Bouchaud, Bubbles, crashes and intermittency in agent based market models, Eur. Phys. Jour. B 31, 421 (2003); see also the discussion in: J.-P. Bouchaud, Power-laws in economics and finance: some ideas from physics, Quantitative Finance, 1, 105 (2001).
33. T. Lux, M. Marchesi, Volatility Clustering in Financial Markets: A Microsimulation of Interacting Agents, Int. J. Theo. Appl. Fin. 3, 675 (2000).
34. A. Krawiecki, J. A. Hołyst, and D. Helbing, Volatility Clustering and Scaling for Financial Time Series due to Attractor Bubbling, Phys. Rev. Lett. 89, 158701 (2002).
CORRELATION BASED HIERARCHICAL CLUSTERING IN FINANCIAL TIME SERIES
S. MICCICHÈ, F. LILLO AND R. N. MANTEGNA
Dipartimento di Fisica e Tecnologie Relative, Università di Palermo and Istituto Nazionale per la Fisica della Materia, Unità di Palermo, Viale delle Scienze - Edificio 18, I-90128, Palermo, Italy
We review a correlation based clustering procedure applied to a portfolio of assets synchronously traded in a financial market. The portfolio considered consists of a set of 500 highly capitalized stocks traded at the New York Stock Exchange during the time period 1987-1998. We show that meaningful economic information can be extracted from correlation matrices.
1. Introduction
The presence of a high degree of cross-correlation in the synchronous time evolution of a set of equity returns is a well known empirical fact observed in financial markets 1,2,3. For a time horizon of one trading day, correlation coefficients as high as 0.7 can be observed for some pairs of equity returns belonging to the same economic sector. The study of cross-correlation of a set of financial equities also has practical importance, since it can improve the ability to model composed financial entities such as, for example, stock portfolios. There are different approaches to the study of asset cross-correlation. The most common one is the principal component analysis of the correlation matrix of the data 4. An investigation of the properties of the correlation matrix has been performed by physicists using ideas and theoretical results of random matrix theory 5,6. Another approach is correlation based clustering analysis, which allows one to obtain clusters of stocks starting from the time series of price returns. Different algorithms exist to perform cluster analysis in finance 7,8,9,10,11,12,13.
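The random matrix benchmark mentioned above can be made concrete with a short numerical sketch (our own illustration, assuming NumPy, not code from the cited works). For T observations of n independent Gaussian return series, the eigenvalues of the sample correlation matrix fall, for large T, inside the Marchenko-Pastur interval [λ−, λ+] with λ± = (1 ± √(n/T))²; adding a common "market mode" pushes one eigenvalue far above λ+, which is the kind of genuine correlation structure clustering methods aim to extract.

```python
import numpy as np

rng = np.random.default_rng(3)
n_stocks, T = 50, 2000
q = n_stocks / T
lam_minus, lam_plus = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

# (a) independent returns: the spectrum stays inside the Marchenko-Pastur bulk
r_iid = rng.normal(size=(T, n_stocks))
eig_iid = np.linalg.eigvalsh(np.corrcoef(r_iid, rowvar=False))

# (b) returns sharing a common "market mode": one eigenvalue escapes the bulk
market = rng.normal(size=(T, 1))
r_mkt = 0.5 * market + rng.normal(size=(T, n_stocks))
eig_mkt = np.linalg.eigvalsh(np.corrcoef(r_mkt, rowvar=False))
```

Eigenvalues well above λ+ carry information beyond measurement noise; the clustering procedure reviewed below is an alternative, graph-based way of isolating such structure.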
In previous work, some of us have shown that a specific correlation based clustering method gives a meaningful taxonomy for stock return time series 8,14,15, for market index returns of worldwide stock exchanges 16 and for volatility increments of stock return time series 17. Here we review the method proposed in 8 by applying it to a set of 500 highly capitalized stocks traded in the New York Stock Exchange.
2. A correlation-based filtering procedure
In Ref. 8 a correlation based method able to detect economic information present in a correlation coefficient matrix was proposed. This method is a filtering procedure based on the estimation of the subdominant ultrametric 18 associated with a metric distance obtained from the correlation coefficient matrix of a set of n stocks. This procedure, already used in other fields, allows one to extract a minimum spanning tree (MST) and a hierarchical tree from a correlation coefficient matrix by means of a well defined algorithm known as the nearest neighbor single linkage clustering algorithm 19. This reveals topological (through the MST) and taxonomic (through the hierarchical tree) aspects of the correlation present among stocks. The MST is obtained by filtering a relevant part of the information which is present in the correlation coefficient matrix of the original time series of stock returns. This is done (i) by determining the synchronous correlation coefficient of the differences of the logarithm of stock prices computed at a selected time horizon, (ii) by calculating a metric distance between all pairs of stocks and (iii) by selecting the subdominant ultrametric distance associated with the considered metric distance. The subdominant ultrametric is the ultrametric structure closest to the original metric structure 20. The correlation coefficient is defined as

ρ_ij(Δt) = (⟨r_i r_j⟩ − ⟨r_i⟩⟨r_j⟩) / √[(⟨r_i²⟩ − ⟨r_i⟩²)(⟨r_j²⟩ − ⟨r_j⟩²)]   (1)
where i and j are numerical labels of the stocks, r_i = ln P_i(t) − ln P_i(t − Δt), P_i(t) is the value of the price of stock i at the trading time t and Δt is the time horizon which is, in the present work, one trading day. The correlation coefficient for logarithmic price differences (which almost coincide with stock returns) is computed between all the possible pairs of stocks present in the considered portfolio. The empirical statistical average, indicated in this paper with the symbol ⟨·⟩, is a temporal average always performed over the investigated time period. By definition, ρ_ij(Δt) can vary from −1 (completely anti-correlated pair of stocks) to 1 (completely correlated pair of stocks). When ρ_ij(Δt) = 0 the two stocks are uncorrelated. The matrix of correlation coefficients is a symmetric matrix with ρ_ii(Δt) = 1 on the main diagonal. Hence for each value of Δt, n(n − 1)/2 correlation coefficients characterize each correlation coefficient matrix completely. A metric distance between a pair of stocks can be rigorously determined 20 by defining
d_{i,j}(Δt) = √(2 (1 − ρ_ij(Δt))).   (2)

With this choice d_{i,j}(Δt) fulfills the three axioms of a metric: (i) d_{i,j}(Δt) = 0 if and only if i = j; (ii) d_{i,j}(Δt) = d_{j,i}(Δt); and (iii) d_{i,j}(Δt) ≤ d_{i,k}(Δt) + d_{k,j}(Δt).
Figure 1. Minimum spanning tree of 500 highly capitalized stocks traded in the NYSE. The filtering procedure is obtained by considering the correlation coefficient of stock return time series computed at a 1 trading day time horizon. Each circle represents a stock. Clustered areas are observed. With the letters A, B, C and D we indicate clusters of stocks belonging to the sectors “Finance, insurance, and real estate”, “Transportation, communications, electric and sanitary services”, “Manufacturing” and “Mining and construction” respectively. Some of these stocks act as a “hub” of a local cluster. Examples are GE, TY, C, ITW, KO, and DUK. General Electric is the most connected stock, observable at the center of the star-like area at the center of the graph.
The distance matrix D(Δt) is then used to determine the MST connecting the n stocks. The MST, a concept of graph theory 21, is the spanning tree of shortest length. A spanning tree is a graph without loops connecting all the n nodes with n − 1 links. We have seen that the original fully connected graph is metric with distance d_{i,j}. Therefore the MST selects the n − 1 strongest (i.e. shortest) links which span all the nodes. The MST allows one to obtain, in a direct and essentially unique way, the subdominant ultrametric distance matrix D^<(Δt) and the hierarchical organization of the elements (stocks in our case) of the investigated data set. The subdominant ultrametric distance between objects i and j, i.e. the element d^<_{i,j} of the D^<(Δt) matrix, is the maximum value of the metric distance d_{k,l} detected by moving in single steps from i to j through the path connecting i and j in the MST. The method of constructing an MST linking a set of n objects is direct and is known in multivariate analysis as the nearest neighbor single linkage cluster
analysis 19. A pedagogical exposition of the determination of the MST in the context of financial time series is provided in Ref. 22. Subdominant ultrametric spaces 18 have been fruitfully used in the description of frustrated complex systems. The archetype of this kind of system is a spin glass 23. As an example of the results obtained with this method, we briefly discuss here the results obtained by investigating a set of 500 highly capitalized stocks traded in the New York Stock Exchange (NYSE) during the period January 1987 - December 1998.
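The construction just described (first the MST by a greedy algorithm, then the subdominant ultrametric as the largest edge on the MST path) can be sketched as follows. This is our own illustration assuming NumPy, with function names of our choosing; Prim's algorithm is used here, one of several equivalent ways to build the MST:

```python
import numpy as np

def minimum_spanning_tree(d):
    """Prim's algorithm on a full symmetric distance matrix d (numpy, n x n).
    Returns the n - 1 MST edges as (i, j) pairs."""
    n = len(d)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # cheapest edge leaving the current tree
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: d[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

def subdominant_ultrametric(d):
    """d^<_{ij} = largest edge weight met on the unique MST path from i to j."""
    n = len(d)
    adj = {k: [] for k in range(n)}
    for i, j in minimum_spanning_tree(d):
        adj[i].append(j)
        adj[j].append(i)
    du = np.zeros((n, n))
    for s in range(n):
        stack = [(s, -1, 0.0)]        # node, parent, max edge seen so far
        while stack:
            u, parent, mx = stack.pop()
            du[s, u] = mx
            for v in adj[u]:
                if v != parent:
                    stack.append((v, u, max(mx, d[u, v])))
    return du
```

Feeding in the matrix d_{i,j}(Δt) of Eq. (2) yields D^<(Δt); the hierarchical tree is then the single linkage dendrogram implied by these ultrametric distances.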
Figure 2. Minimum spanning tree of 500 highly capitalized stocks traded in the NYSE. In this figure all the stocks belonging to the “Finance, insurance and real estate” sector are indicated with a black circle. Their identity is indicated by their tick symbol. Two major homogeneous areas related to stocks belonging to this sector are observed in the graph. A few isolated stocks are also present. The homogeneous area observed in the top area of the graph is composed of insurance companies (AOC, AET, CI, HSB, PGR, PL, SPC, and TMK). The other homogeneous area of financial stocks located at the bottom mostly comprises depository institutions and security and commodity brokers. Examples are the stocks BAC, BK, C, CMB, JPM, MER, PNC, and RNB.
In Fig. 1 we show the minimal spanning tree obtained in this investigation with a time horizon equal to one trading day. Stocks are identified with circles. From the figure it can be seen that clusters of stocks exist. Some of these clusters are composed of stocks which are rather homogeneous with respect to the economic sector of activity. The classification of the stocks used here is the one of the Standard
Industrial Classification system 24. To support our statement we show in Fig. 2 the MST with all the stocks belonging to the “Finance, insurance and real estate” sector highlighted. Stocks are identified by black circles and labeled with their tick symbol. Information about the identity of most of the stocks can be found on financial web sites. A few stocks are no longer active after 1998 due to mergers with other companies or for other reasons.
Figure 3. Minimum spanning tree of 500 highly capitalized stocks traded in the NYSE. All the stocks belonging to the “Transportation, Communications, Electric, Gas, and Sanitary Services” sector are indicated with a black circle. Their identity is indicated by their tick symbol. Four major homogeneous areas related to stocks belonging to this sector are observed in the graph. The biggest one (bottom right part of the graph) comprises stocks belonging to the electric, gas and sanitary services subsector. The second cluster, observed in the bottom left part of the graph, is composed of AIT, BEL, BLS, GTE, SBC, and USW. All these stocks are telephone communication companies. The third and fourth clusters are observed in the top left part of the graph. Both clusters are related to transportation. Specifically, one comprises stocks dealing with transportation by air (ALK, AMR, DAL, LUV, and U) whereas the second one comprises stocks of railroad transportation (CSX, NSC and UNP).
A wide cluster is observed at the bottom of the graph. Another cluster can be detected in the upper part of the graph. In addition to these two prominent clusters, a few isolated stocks are also present. By analyzing in detail the main activity of the companies investigated, a rational explanation of the observed behavior is
obtained. In fact the cluster observed in the top area of the graph is composed of insurance companies, while the other homogeneous area of financial stocks located at the bottom mostly comprises depository institutions and security and commodity brokers. A second example is provided in Fig. 3 where we show the stocks belonging to the SIC sector “Transportation, Communications, Electric, Gas, and Sanitary Services”. The very title of this sector makes clear that this classification unavoidably comprises stocks characterized by rather different economic activities. This aspect is reflected in the clusters detected in the MST. Four well defined clusters are directly observable. The biggest one is seen at the bottom right of the graph. Most of the stocks located here belong to the SIC subsector of “Electric, gas and sanitary services”. The second cluster is observed in the bottom left part of the graph. It is composed of AIT, BEL, BLS, GTE, SBC, and USW. All these tick symbols identify telephone communication companies. The third and fourth clusters are observed in the top left part of the graph, immediately above the star of stocks surrounding General Electric. Both clusters are related to transportation. One comprises ALK, AMR, DAL, LUV, and U, which are companies working in the subsector of transportation by air, whereas the second one comprises CSX, NSC and UNP, which are railroad transportation companies. The clustering algorithm also allows one to obtain a hierarchical tree. An example is provided in Fig. 4 where we reconsider the clusters observed for the group of stocks classified as “Transportation, Communications, Electric, Gas, and Sanitary Services” by the SIC code. For the sake of clarity we show in the figure only the 250 stocks which are first present in the construction of the MST. The structure of the clusters is now more refined and precisely described. In the figure each vertical line indicates a stock.
Stocks are separated by the ultrametric distance (y-axis) at which lines are observed to merge. All the stocks belonging to the “Transportation, Communications, Electric, Gas, and Sanitary Services” sector are indicated with a thicker solid line. All the other stocks are drawn with a thinner solid line. Specific clusters are observed. From left to right, the first one is labeled numerically by numbers 0 to 4 and comprises AMR, DAL, U, LUV and ALK. The second one is observed from 61 to 63 (NSC, CSX, and UNP). The third one is rather big, extending from 89 to 102 (DUK, ED, AEP, BGE, D, NSP, FPL, PEG, CPL, SO, CSR, AYE, PCG, and SCG). Then the telephone communication cluster is observed from 108 to 113 (BEL, AIT, BLS, SBC, USW, and GTE). Other electric services clusters are observed considering stocks WEC and CIN (133-134), SRE, TXU and NES (from 138 to 140), PE, FPL and EIX (from 154 to 156), and finally OGE, FPC and IPL (from 165 to 167).

3. Conclusions
Correlation based networks can be obtained in financial markets by investigating financial time series. Here we have reviewed the basic method by analyzing the
Figure 4. Hierarchical tree of the first 250 stocks entering in the construction of the MST. Each vertical line indicates a stock. Stocks are separated by the ultrametric distance (y-axis) at which lines are observed to merge. All the stocks belonging to the “Transportation, Communications, Electric, Gas, and Sanitary Services” sector are indicated with a thicker solid line; all the other stocks are drawn with a thinner solid line. Specific clusters are observed. From left to right, the first one is labeled numerically by numbers 0 to 4 and comprises AMR, DAL, U, LUV and ALK. The second one is observed from 61 to 63 (NSC, CSX, and UNP). The third one is rather big, extending from 89 to 102 (DUK, ED, AEP, BGE, D, NSP, FPL, PEG, CPL, SO, CSR, AYE, PCG, and SCG). Then the telephone communication cluster is observed from 108 to 113 (BEL, AIT, BLS, SBC, USW, and GTE). Other electric services clusters are observed considering stocks WEC and CIN (133-134), SRE, TXU and NES (from 138 to 140), PE, FPL and EIX (from 154 to 156), and finally OGE, FPC and IPL (from 165 to 167).
portfolio composed of the 500 most capitalized stocks traded at the New York Stock Exchange. The correlation-based networks are obtained with a well-defined filtering procedure 8, which mainly focuses on the most relevant correlations among stocks. Different filtering procedures have been proposed by different authors 9,10,11,12,13 and provide different aspects of the information stored in the investigated
sets. The robustness over time of the MST characteristics has been investigated in a series of studies 13,17,25,26,27,28. The filtering approach based on the MST can also be used to consider aspects of portfolio optimization 29 and to perform a correlation based classification of relevant economic entities such as banks 30 and hedge funds 31.
The topology of the correlation based networks depends on the investigated set and on the details of the investigation. The observed topology ranges from a star-like one to the complex multi-cluster structure of Fig. 1. In summary, the study of correlation based financial networks is a fruitful method able to filter out economic information from the correlation coefficient matrix of a set of financial time series. The topology of the detected network can be used to validate or falsify simple, although widespread, market models 15.

Acknowledgments
This work has been partially supported by research projects MIUR 449/97 “Dinamica di altissima frequenza nei mercati finanziari” and MIUR-FIRB RBNE01CW3M.

References
1. H. Markowitz, Portfolio Selection: Efficient Diversification of Investment (J. Wiley, New York, 1959).
2. E. J. Elton and M. J. Gruber, Modern Portfolio Theory and Investment Analysis (J. Wiley and Sons, New York, 1995).
3. J. Y. Campbell, A. W. Lo, and A. C. MacKinlay, The Econometrics of Financial Markets (Princeton University Press, Princeton, 1997).
4. E. J. Elton and M. J. Gruber, Journal of Business 44, 432 (1971).
5. L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Phys. Rev. Lett. 83, 1468 (1999).
6. V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, and H. E. Stanley, Phys. Rev. Lett. 83, 1471 (1999).
7. D. B. Panton, V. Parker Lessig, and O. M. Joy, Journal of Financial and Quantitative Analysis 11, 415 (1976).
8. R. N. Mantegna, Eur. Phys. J. B 11, 193 (1999).
9. L. Kullmann, J. Kertesz and R. N. Mantegna, Physica A 287, 412 (2000).
10. M. Bernaschi, L. Grilli, L. Marangio, S. Succi and D. Vergni, cond-mat/0003025.
11. L. Giada and M. Marsili, Phys. Rev. E 63, 061101 (2001).
12. M. Marsili, Quantitative Finance 2, 297 (2002).
13. M. Tumminello, T. Aste, T. Di Matteo, and R. N. Mantegna, A new tool for filtering information in complex systems, cond-mat/0501335.
14. G. Bonanno, F. Lillo and R. N. Mantegna, Quantitative Finance 1, 96 (2001).
15. G. Bonanno, G. Caldarelli, F. Lillo, R. N. Mantegna, Phys. Rev. E 68, 046130 (2003).
16. G. Bonanno, N. Vandewalle and R. N. Mantegna, Phys. Rev. E 62, R7615 (2000).
17. S. Micciché, G. Bonanno, F. Lillo and R. N. Mantegna, Physica A 324, 66 (2003).
18. R. Rammal, G. Toulouse, and M. A. Virasoro, Rev. Mod. Phys. 58, 765 (1986).
19. K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis (Academic Press, San Diego, CA, 1979).
20. J. C. Gower, Biometrika 53, 325 (1966).
21. D. B. West, Introduction to Graph Theory (Prentice-Hall, Englewood Cliffs NJ, 1996).
22. R. N. Mantegna and H. E. Stanley, An Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge Univ. Press, Cambridge UK, 2000).
23. M. Mézard, G. Parisi, and M. A. Virasoro, Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
24. The Standard Industrial Classification system can be found at http://www.osha.gov/oshstats/naics-manual.html
25. L. Kullmann, J. Kertesz and K. Kaski, Phys. Rev. E 66, 026125 (2002).
26. J.-P. Onnela, A. Chakraborti, K. Kaski and J. Kertesz, Eur. Phys. J. B 30, 285 (2002).
27. J.-P. Onnela, A. Chakraborti, K. Kaski and J. Kertesz, Physica A 324, 247 (2003).
28. J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertesz and A. Kanto, Physica Scripta T 106, 48 (2003).
29. J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertesz and A. Kanto, Dynamics of market correlations: Taxonomy and portfolio analysis, Physical Review E 68, 056110 (2003).
30. I. Marsh, C. Hawkesby and I. Stevens, Large complex financial institutions: common influences on asset price behaviour?, Financial Stability Review, Bank of England, 15, 124 (2003).
31. M. A. Miceli and G. Susinno, Risk 16, (11) S11 (2003).
PATH INTEGRALS AND EXOTIC OPTIONS: METHODS AND NUMERICAL RESULTS
G. BORMETTI, G. MONTAGNA, N. MORENI AND O. NICROSINI
Dipartimento di Fisica Nucleare e Teorica, Università di Pavia, Via A. Bassi 6, 27100, Pavia, Italy
Istituto Nazionale di Fisica Nucleare, Sezione di Pavia, Via A. Bassi 6, 27100, Pavia, Italy
CERMICS - ENPC, 6 et 8 avenue B. Pascal, Cité Descartes, Champs sur Marne, 77455, Marne la Vallée, Cedex 2, France
E-mail: [email protected]
In the framework of the Black-Scholes-Merton model of financial derivatives, a path integral approach to option pricing is presented. A general formula to price path dependent options on multidimensional and correlated underlying assets is obtained and implemented by means of various flexible and efficient algorithms. As an example, we detail the case of Asian call options. The numerical results are compared with those obtained with other procedures used in quantitative finance and found to be in good agreement. In particular, when pricing at the money (ATM) and out of the money (OTM) options, the path integral approach exhibits competitive performance.
1. Introduction and motivation
A central problem in quantitative finance is the development of efficient methods for pricing and hedging derivative securities 1,2. Although the classical Black, Scholes and Merton model of financial derivatives provides an elegant framework to price financial derivatives, the level of analytical tractability of the model is limited to plain vanilla options and a few other cases. If we are interested in pricing more sophisticated financial instruments, such as options whose payoff depends on the path followed by the underlying asset, we have to apply appropriate numerical techniques. Although for the price of some of these instruments there exist closed-form solutions or particular procedures 4, the specifications of the contracts that are traded in practice or the dependence on multiple assets require in general flexible and fast numerical algorithms. There is a wide literature on the subject and many approaches have been proposed. The standard numerical procedures adopted in financial engineering involve the use of binomial or trinomial trees, Monte Carlo (MC) simulations and finite difference methods 1,2. Here we extend the path integral approach to option pricing developed for unidimensional assets in Ref. 5. We generalize the original formulation in order to
price a variety of commonly traded exotic options. First, we obtain a pricing formula for path dependent options based on multiple correlated underlying assets; second, we improve the related numerical algorithms. Comparisons with standard MC simulations are presented. Related attempts to price exotic options using the path integral method can be found in Ref. 6.

2. Path integral
Path integral techniques, widely used in quantum mechanics and quantum field theory, can be useful to describe the dynamics of a Markov stochastic process 7. In our case, we are interested in the stochastic process S, corresponding to the price of a set of given underlying assets, which satisfies a stochastic differential equation describing geometric Brownian motion. Given D ∈ N, for all i, j, k = 1, ..., D and under the risk-neutral measure, we have

dS^k / S^k = r dt + σ_k dW^k,   with   <dW^i dW^j> = ρ_ij dt,    (1)

where r is the risk-free interest rate, σ_k are the volatilities and ρ_ij the correlations between the Wiener processes W (ρ_ii = 1), all of them being constant. If we introduce the variance-covariance matrix Σ_ij = σ_i σ_j ρ_ij and we extract its square root C, defined by the relation C C^T = Σ, using Ito's lemma it is possible to show that the stochastic variable z = (log S^1, ..., log S^D) satisfies the following equation

dz = A dt + C dW,    (2)

where the kth entry of A is A_k = r − σ_k²/2 and W is now a D-dimensional vector of independent Wiener processes. In order to set up a path integral framework, we consider the finite time interval [0, T], where T is the option expiry date, and we choose the initial time t_0 equal to 0 without loss of generality. We divide it into n + 1 subintervals of length Δt = T/(n + 1) and we define the discrete path {z_0, ..., z_i, ..., z_{n+1}} = {z(0), ..., z(iΔt), ..., z(T)}. When pricing path-dependent options by no-arbitrage arguments, we are interested in calculating expectations of the form

E[f(z_0, z_1, ..., z_{n+1})] = ∫ dz_0 ∫ dz_{n+1} ∫ Π_{i=1}^{n} dz_i  p(z_0, z_1, ..., z_{n+1}) f(z_0, z_1, ..., z_{n+1}),    (3)
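The discretized dynamics of Eq. (2) can be simulated directly: draw independent Gaussian increments, correlate them through a square root C of the variance-covariance matrix (a Cholesky factor is one possible choice of C), and accumulate the drift A_k = r − σ_k²/2. The following Python sketch is an illustration of ours, not the authors' code; the function name and interface are assumptions:

```python
import numpy as np

def simulate_log_paths(z0, r, sigma, corr, T, n, n_paths, seed=0):
    """Simulate discrete log-price paths z_0, ..., z_{n+1} of Eq. (2):
    dz = A dt + C dW, with A_k = r - sigma_k^2/2 and C C^T = Sigma,
    where Sigma_ij = sigma_i sigma_j rho_ij (constant coefficients)."""
    rng = np.random.default_rng(seed)
    sigma = np.asarray(sigma, dtype=float)
    D = len(sigma)
    dt = T / (n + 1)
    Sigma = np.outer(sigma, sigma) * corr       # variance-covariance matrix
    C = np.linalg.cholesky(Sigma)               # one square root with C C^T = Sigma
    A = r - 0.5 * sigma ** 2                    # risk-neutral drift of the log-price
    # independent Gaussian increments, shape (n_paths, n+1, D)
    eps = rng.standard_normal((n_paths, n + 1, D))
    steps = A * dt + np.sqrt(dt) * eps @ C.T
    z = np.concatenate([np.tile(z0, (n_paths, 1, 1)),
                        z0 + np.cumsum(steps, axis=1)], axis=1)
    return z  # shape (n_paths, n+2, D): the points z_0, ..., z_{n+1}
```

The terminal log-price then has mean z_0 + (r − σ²/2)T and standard deviation σ√T per component, which gives a quick sanity check on the sampler.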
where f is a given payoff function and p is the probability density function (pdf) associated with the discrete path. The path integral approach allows us to rewrite the previous expression in a form more suitable for numerical purposes; in particular, it reduces to standard functional integration over the continuous-time process z when taking the limit Δt → 0. The Markov nature of the log-price dynamics allows us to iterate the Chapman-Kolmogorov relation, with conditional probability density given by
p(z_{i+1}, t_i + Δt | z_i, t_i) = (2πΔt)^{-D/2} |det C|^{-1} exp( −‖C^{-1}(z_{i+1} − z_i − A Δt)‖² / (2Δt) ),    (4)

where ‖·‖ is the standard Euclidean norm, to reduce Eq. (3) as follows:

E[f(z_0, ..., z_{n+1})] = ∫ dz_{n+1} ∫ Π_{i=1}^{n} dε_i  f[z_0, z_1(ε, z_{n+1}, z_0), ..., z_n(ε, z_{n+1}, z_0), z_{n+1}],    (5)

with ε_i independent standard Gaussian variables, ε_i ∼ N(0, 1). The explicit expression for the integrand can be found in Ref. 8. It is quite remarkable that {z_1, ..., z_n} depends only on n × D independent Gaussian variables ε and on the starting and final points, z_0 and z_{n+1}.

3. Algorithms and numerical results
Equation (5) represents the main result of our work. To price path-dependent options, we have to evaluate it numerically. We do this in two different ways. In the first one, we "separate" the internal n × D-dimensional integration and the external D-dimensional one, performing the former via MC and the latter with a deterministic method to be specified. We evaluate Eq. (5) as
with a suitable choice of the integration weights w_i and of the integration points z_{n+1}^{(i)}. Since we do not have an explicit expression for f, we evaluate it by generating m possible paths with fixed ending point z_{n+1}^{(i)}. In this way we get MC estimators, with their associated errors. Numerical investigations performed in Ref. 8 show that the errors due to the finiteness of n_int are negligible. When we consider multidimensional underlying assets, the deterministic approach cannot provide competitive results. As an alternative, we propose a method based on a pure MC integration coupled with the path integral: we approximate Eq. (5) by restricting the external integration to z_{n+1} ∈ Π_{k=1}^{D} [z_m^k, z_M^k]. We read the pricing formula as the expectation of a function of n + 1 independent variables, the first, z_{n+1}, being uniformly distributed over Π_{k=1}^{D} [z_m^k, z_M^k], while the ε_i's have standard Gaussian pdf's. Our algorithm extracts m i.i.d. arrays (z_{n+1}, ε_1, ..., ε_n), so that the related error scales as m^{-0.5}. It is also possible to implement importance sampling with a truncated Cauchy pdf normalized on Π_{k=1}^{D} [z_m^k, z_M^k]. The particular choice of a Cauchy function is suggested by the idea that, in a first rough approximation, the resulting integrand is slightly wider than a Gaussian one. Reasonable values for z_m, z_M, as well as for the mean of the Cauchy pdf, depend on the values of the strike, the spot, the volatility, etc. Therefore, for a given payoff function, we perform a preliminary investigation of the shape of the integrand function f(z_{n+1}) (see Fig. 1 in Ref. 8). In this way we introduce an asymmetry between z_{n+1} and the ε_i's, in the sense that z_{n+1} plays
a crucial role for the variance reduction in the MC simulation. This turns out to be very useful when we price OTM options and a MC random walk approach fails to be efficient. Here, we detail the case of Asian call options. The no-arbitrage price of this option at time t = 0 is
C(0) = e^{-rT} E[(Ā − K)_+], Ā being the arithmetic time average over the monitoring dates of the weighted basket Σ_{i=1}^{D} a_i S^i,

with Σ_{i=1}^{D} a_i = 1 and K the strike price. We choose a_i = 1/D. In Tab. 1 we present our

Table 1. Numerical values for an Asian call option price obtained for the parameters D = 1, e^{z_0} = 100, r = 0.095, σ = 0.2, T = 1 year and n + 1 = 100. Errors are given at one standard deviation.

          K = 60             K = 100            K = 150
          Price     Error    Price    Error     Price     Error
MCRW      40.830    0.025    6.899    0.019     0.0054    0.0005
BBST      40.824    0.018    6.886    0.015     0.0058    0.0001
PITP      40.811    0.019    6.876    0.015     0.0057    0.0001
PICH      40.767    0.040    6.873    0.019     0.0059    0.0001
PIFL      40.758    0.105    6.880    0.026     0.0057    0.0001
results for the unidimensional case, together with the ones considered as benchmarks, obtained with a MC random walk (MCRW) and the Brownian bridge with stratification (BBST). The parameters used in the numerical simulation are z_0 = ln 100, r = 0.095, σ = 0.2, T = 1 and n + 1 = 100. In the path integral with external trapezoidal integration (PITP), as well as in the BBST, we limit the z_{n+1} integration to the interval [z̄ − 4σ√T, z̄ + 4σ√T], where z̄ = z_0 + (r − σ²/2)T when K = 60, 100 and z̄ = ln K when K = 150. We take n_int = 200 and for each point we generate 10³ random paths. In the cases of MCRW and of MC path integral with uniform (PIFL) or Cauchy (PICH) sampling, the total number of paths is 2 × 10⁵, i.e. we consider the same number of calls of the random number generator used previously. We can notice that all path integral prices are in very good agreement with the benchmark ones, while, from the point of view of variance reduction, the PITP algorithm turns out to be the best performing one. Let us stress that the PICH and PIFL algorithms are competitive only OTM. In Tab. 2 we consider a multidimensional case, corresponding to D = 3. The correlation coefficients ρ_ij, for i ≠ j, are equal to 0.6. More details about the parameters can be found in Ref. 8. It is clear that the path integral is still a good choice to price OTM options, the prices being in agreement with the benchmark MCRW and the errors smaller, especially with a Cauchy pdf sampling (PICH). On the other hand, when the dimension increases, the path integral ATM performance worsens. More details on Asian options, as well as on barrier knock-out and reverse cliquet options and Greek letters, are given in Ref. 8.
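For the unidimensional benchmark, a plain Monte Carlo random-walk pricer in the spirit of the MCRW column of Table 1 can be sketched as follows. This is an illustrative reimplementation under our own conventions (the arithmetic average is taken over the monitoring dates after t = 0), not the authors' benchmark code:

```python
import numpy as np

def asian_call_mc(S0=100.0, K=100.0, r=0.095, sigma=0.2, T=1.0,
                  n_steps=100, n_paths=200_000, seed=0):
    """Plain MC price of an arithmetic-average Asian call,
    e^{-rT} E[max(mean(S) - K, 0)], under geometric Brownian motion."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # log-price increments, shape (n_paths, n_steps)
    dz = ((r - 0.5 * sigma**2) * dt
          + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)))
    S = S0 * np.exp(np.cumsum(dz, axis=1))          # monitored prices
    payoff = np.exp(-r * T) * np.maximum(S.mean(axis=1) - K, 0.0)
    price = payoff.mean()
    err = payoff.std(ddof=1) / np.sqrt(n_paths)      # one-standard-deviation error
    return price, err
```

With the Table 1 parameters this reproduces prices compatible with the K = 60 and K = 100 entries to within a few standard errors (small differences in the averaging convention shift the value slightly).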
Table 2. Prices and errors for Asian basket call options for ρ_ij = 0.6 (i ≠ j), e^{z_0} = (100, 90, 105), obtained with 2.16 × 10⁵ total Monte Carlo calls. The z_{n+1}-integration is performed in a cube centered in z̄, with edge length 8σ√T.

         K = 100                                    K = 140
         z̄=(110,100,110)    z̄=(100,100,100)       z̄=(140,140,140)    z̄=(130,130,130)
         Price    Error     Price    Error         Price     Error     Price     Error
MCRW     5.29     0.02      5.29     0.02          0.0049    0.0004    0.0049    0.0004
PITP     5.33     0.04      5.28     0.04          0.0051    0.0003    0.0049    0.0003
PIFL     5.37     0.06      5.41     0.07          0.0048    0.0003    0.0048    0.0002
PICH     5.26     0.03      5.28     0.03          0.0048    0.0001    0.0050    0.0001
4. Conclusions and outlook
We have shown how the path integral approach to stochastic processes can be successfully applied to the problem of pricing exotic derivative contracts. Numerical results for the fair price of Asian call options have been presented and compared with those obtained by means of standard procedures used in quantitative finance. With respect to the original formulation of Ref. 5, the method has been generalized in order to cope with options depending on multiple and correlated underlying assets. By virtue of an appropriate separation of the integrals entering the path integral formula, and of a careful choice of the ending points of the random paths in order to probe the relevant regions of the integrand, it has been shown that the algorithm can provide very precise results. In particular, our approach exhibits its best performance for OTM options. A possible perspective would be to use the present results as a benchmark to train neural networks, along the lines described in Ref. 10. A further important development concerns the application of the method to more realistic models of the financial dynamics, beyond the log-normal assumption 11.
References
1. J. Hull, Options, Futures and Other Derivatives (Prentice Hall, New Jersey, 1997).
2. L. Clewlow and C. Strickland, Implementing Derivative Models (Wiley, 1998).
3. F. Black and M. Scholes, J. Polit. Econ. 81, 637 (1973); R. Merton, J. Econ. Managem. Sci. 4, 141 (1973).
4. C. F. Lo, H. C. Lee and C. H. Hui, Quant. Finance 3, 98 (2003); J. Vecer and M. Xu, Quant. Finance 4, 170 (2004).
5. G. Montagna, O. Nicrosini and N. Moreni, Physica A 310, 450 (2002).
6. M. Rosa-Clot and S. Taddei, Int. J. Theor. Appl. Finance 5, 123 (2002); B. E. Baaquie, C. Corianò and M. Srikant, Physica A 334, 531 (2004).
7. M. Chaichian and A. Demichev, Path Integrals in Physics (Institute of Physics Publishing, Bristol and Philadelphia, 2001).
8. G. Bormetti et al., Pricing Exotic Options in a Path Integral Approach, cond-mat/0407321.
9. B. Lapeyre and D. Talay, Understanding Numerical Analysis for Financial Models (Cambridge University Press, 2004).
10. M. J. Morelli et al., Physica A 338, 160 (2004).
11. L. Borland, Quant. Finance 2, 415 (2002).
AGING OF EVENT-EVENT CORRELATION OF EARTHQUAKE AFTERSHOCKS

SUMIYOSHI ABE
Institute of Physics, University of Tsukuba, Ibaraki 305-8571, Japan

NORIKAZU SUZUKI
College of Science and Technology, Nihon University, Chiba 274-8501, Japan
E-mail: [email protected]

A recent discovery of the aging phenomenon in seismology is reported. From seismic time series, the nonstationary regimes are identified. They are associated with aftershocks characterized by the Omori law, and are referred to as the Omori regimes. Event-event correlation is defined and examined both inside and outside the Omori regimes. It is found that the aging phenomenon is exhibited inside the Omori regimes, whereas it vanishes outside them. It is also found that this aging phenomenon obeys a definite scaling law. Combined with other features, the present result suggests that earthquake aftershocks may be governed by glassy dynamics.
1. Introduction
A large shallow earthquake can be viewed as a quenching process, which induces stress redistribution on faults; accordingly, a complex energy landscape with a number of local minima is reorganized. The resulting fault-fault interaction may be random and long-ranged [1]. Thus, a shallow mainshock tends to be followed by a swarm of aftershocks. The Omori law [2] states that the number of aftershocks, dN(t), occurring in the time interval [t, t + dt], where t is the time elapsed after the mainshock at t = 0, obeys the power-law decay:

dN(t) = A (1 + t/τ)^{-p} dt.    (1)

Here, τ and A are positive constants, and the exponent, p, is generally assumed to range from 0.8 to 1.5 (but careful data analysis indicates a wider range: from 0.5 to 2.5). A regime of seismic time series, in which the relevant events are aftershocks obeying Eq. (1), is referred to as the Omori regime, which forms a highly nonstationary time series. Such a regime is expected to contain information on dynamical features of aftershocks. An important point is that it is inappropriate to suppose the values of magnitude to be small for aftershocks: they can often be very large, sometimes even larger than that of the mainshock. It seems equally inappropriate to put a spatial window to identify aftershocks, since event-event correlation may also be long-ranged. The aftershocks should be identified purely in conformity with the Omori law in Eq. (1). The Omori law is a power law, which implies that the relaxation process in the complex energy landscape is very slow. This suggests that the mechanism underlying the aftershock phenomenon might be governed by glassy dynamics. In this article, we report a recent discovery [3] of the aging-scaling phenomena of event-event correlation of aftershocks, which can be thought of as strong support for the glassy-dynamics picture of aftershocks.

2. Event-event correlation
According to the results of our previous studies, there exist definite statistical properties of the spatial distance [4] and the time interval [5] between two successive earthquakes, indicating strong correlation between successive events. To quantify such correlation, we propose to employ, as the fundamental random variable, the occurrence time of the nth aftershock with an arbitrary value of magnitude, t_n, following the mainshock at t_0. We then define the event-event correlation function as follows:

C(n + n_w, n_w) = (<t_{n+n_w} t_{n_w}> − <t_{n+n_w}> <t_{n_w}>) / (σ_{n+n_w} σ_{n_w}),    (2)

where <·> denotes the average over the analyzed events and the variance is given by

σ_n² = <t_n²> − <t_n>².    (4)

Note that the correlation function in Eq. (2) is labeled not by conventional time but by the numbering of events. Such unconventional time is referred to as natural time [6-8]. If the process {t_n} is stationary in natural time, then C(n + n_w, n_w) is independent of the "natural waiting time" n_w, whereas the n_w-dependence characterizes nonstationarity. Among various nonstationary processes, there is a distinguished case, in which the aging phenomenon is observed. In the next section, we shall see how event-event correlation in the Omori regime exhibits such a phenomenon.
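The estimator of Eq. (2) can be sketched as follows for the case in which the averages are taken over an ensemble of aftershock sequences (as is done for the model systems in the later contributions of this volume); the function name and the data layout are illustrative choices of ours:

```python
import numpy as np

def event_correlation(T, n, n_w):
    """Estimate C(n + n_w, n_w) from an ensemble of occurrence-time
    sequences T[m, k] = t_k of the k-th event in realization m.
    The averages <.> of Eq. (2) are taken over the ensemble axis."""
    a = T[:, n + n_w].astype(float)   # t_{n + n_w}
    b = T[:, n_w].astype(float)       # t_{n_w}
    cov = (a * b).mean() - a.mean() * b.mean()
    sig_a = np.sqrt((a ** 2).mean() - a.mean() ** 2)
    sig_b = np.sqrt((b ** 2).mean() - b.mean() ** 2)
    return cov / (sig_a * sig_b)
```

By construction C(n_w, n_w) = 1 at n = 0, and for a process stationary in natural time the curves for different n_w collapse onto each other.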
3. Aging of aftershocks
We have analyzed the seismic data provided by the Southern California Earthquake Data Center (available at http://www.scecdc.scec.org/catalogs.html). Through extensive data analyses, we have always found the same result. Therefore, we present and discuss two typical examples. The two mainshocks we consider here are M7.3 at 11:57:34.10 on June 28, 1992 (34°12.01' N latitude, 116°26.20' W longitude, and 0.97 km in depth) and M7.1 at 09:46:44.13 on October 16, 1999 (34°35.63' N latitude, 116°16.24' W longitude, and 0.02 km in depth). In the period between 1984 and 2002, the maximum depth is 57.88 km. Therefore, they are very shallow, and in fact have been followed by swarms of aftershocks. We have identified the corresponding Omori regimes as follows. The Omori law in Eq. (1) is integrated to yield

N(t) = [Aτ/(1 − p)] [(1 + t/τ)^{1−p} − (1 + t_0/τ)^{1−p}]   (p ≠ 1),
N(t) = Aτ ln[(t + τ)/(t_0 + τ)]   (p = 1),

where N(t_0) is set equal to zero. N(t) contains three parameters, A, τ, and p, in general. Fix a time interval [t_0, t_0 + T] after the mainshock at t_0. Use the method of least squares for the data and the model to perform the parameter search. Then, change T, do the parameter search each time, and find the value of T* with which the best-fit regression is achieved. In this way, the Omori regime [t_0, t_0 + T*] is identified. No windows are put on the spatial region and magnitude in this procedure, in accordance with nonreductionism for complex systems. The two Omori regimes thus obtained are (a) between 11:57:34.10 on June 28, 1992 and 07:07:44.27 on August 15, 1992, and (b) between 09:46:44.13 on October 16, 1999 and 20:59:00.03 on April 25, 2000. The values of the sets of parameters are (p = 0.80, τ = 1.80 × 10⁶ s, A = 7.58 × 10⁻³ s⁻¹) and (p = 0.55, τ = 1.45 × 10⁵ s, A = 7.42 × 10⁻³ s⁻¹), with the maximum correlation coefficients ρ_max = 0.999923 and ρ_max = 0.999908 for the data in (a) and (b) and the model, respectively. In Fig. 1 (a) and (b), we present the plots of the correlation functions inside the Omori regimes, (a) and (b), with respect to natural time for several values of natural waiting time.
In both regimes, the first 15,000 events are analyzed, that is, M = 15,000. There, one can appreciate clear aging phenomena with respect to natural waiting time. The result implies that the Omori regime is a nonstationary time series of a peculiar kind, not in conventional time but in natural time. A question arising here is why natural time plays a distinguished role in complex systems with catastrophes such as seismicity. This question is, however, still out of reach, and we leave it for further investigation. Closing this section, we briefly look at what happens outside the Omori regimes. In Fig. 2, we present the plots of the correlation functions outside the Omori regime: the period between 04:43:56.70 on June 14, 2000 and 14:21:09.95 on April 10, 2001. The first 10,000 events (i.e., M = 10,000) are analyzed. There, no aging phenomenon is recognized, implying that the time series is stationary in terms of natural time. These results clearly characterize the feature of aftershocks in a novel manner. More specifically, combined with quenched disorder and slow relaxation (i.e., the power-law nature of the Omori law), the aging phenomenon can be seen as a signal of glassy dynamics. A further analogy with glassy dynamics may be supplied by establishing the scaling law, which is the subject of the next section.
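The regime-identification fit described above can be sketched as follows. This is an illustrative reimplementation (a crude grid search over (p, τ), with the prefactor A solved linearly), not the authors' full least-squares scan over T:

```python
import numpy as np

def omori_N(t, A, tau, p, t0=0.0):
    """Integrated Omori law N(t) with N(t0) = 0."""
    if abs(p - 1.0) < 1e-12:
        return A * tau * np.log((t + tau) / (t0 + tau))
    return (A * tau / (1.0 - p)) * ((1.0 + t / tau) ** (1.0 - p)
                                    - (1.0 + t0 / tau) ** (1.0 - p))

def fit_omori(t, N, p_grid, tau_grid):
    """Grid search for (p, tau); for each pair, A enters linearly
    and is fixed by ordinary least squares."""
    best = None
    for p in p_grid:
        for tau in tau_grid:
            g = omori_N(t, 1.0, tau, p)        # model shape with A = 1
            A = (g @ N) / (g @ g)              # linear least-squares for A
            resid = ((A * g - N) ** 2).sum()
            if best is None or resid < best[0]:
                best = (resid, A, tau, p)
    return best[1], best[2], best[3]
```

Repeating the fit while scanning the window length T, and keeping the window with the largest correlation coefficient, reproduces the T* selection described in the text.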
Figure 1. Aging of the event-event correlation functions with respect to natural waiting time inside the Omori regimes, (a) and (b). All quantities are dimensionless.
Figure 2. The event-event correlation function outside the Omori regime. All quantities are dimensionless.
4. Scaling law
Another important feature of event-event correlation in the Omori regime is the existence of a scaling law. For Fig. 1 (a) and (b), it is possible to realize data collapses by rescaling natural time with the help of a certain function of natural waiting time, f(n_w). Such a manipulation, in fact, transforms Fig. 1 (a) and (b) to Fig. 3 (a) and (b), respectively, showing the scaling law

C(n + n_w, n_w) = C̃(n / f(n_w)),    (5)

where C̃ is the scaling function. The functional form of f(n_w) can be found as follows. For each n_w, the value of f(n_w) is determined in such a way that the discrepancy between the correlation functions, quantified by the l1-norm distance, is minimized. In addition, the initial condition f(0) = 1 has to be satisfied. The values thus obtained for the cases (a) and (b) are depicted by the dots in Fig. 4 (a) and (b), respectively. From them, we find that f(n_w), shown by the solid lines in Fig. 4 (a) and (b), can be well described by the following two-parameter function:

f(n_w) = a n_w^γ + 1,    (6)

where a and γ are constants.
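The determination of f(n_w) by l1-norm minimization can be sketched as follows, for a single pair of curves sampled on a common natural-time grid; the function name and the use of linear interpolation are illustrative choices of ours:

```python
import numpy as np

def best_rescale(n, C_ref, C_w, f_grid):
    """Find the scale factor f minimizing the l1 distance between the
    curve C_w(n) and the reference curve evaluated at n/f."""
    best_f, best_d = None, np.inf
    for f in f_grid:
        # rescale natural time and compare point by point
        d = np.abs(C_w - np.interp(n / f, n, C_ref)).sum()
        if d < best_d:
            best_f, best_d = f, d
    return best_f
```

Applying this to each waiting-time curve in turn yields the dots of Fig. 4, which can then be fitted by Eq. (6).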
Figure 3. The data collapses of the event-event correlation functions inside the Omori regimes, (a) and (b), by rescaling of natural time. All quantities are dimensionless.
Figure 4. The forms of f(n_w) for the data collapses in Fig. 3 (a) and (b). The solid lines represent Eq. (6) with a = 1.37 × 10^... and γ = 1.62 for (a), and with a = 1.70 × 10^... and γ = 1.00 for (b). All quantities are dimensionless.
Therefore, we conclude that there certainly exists a scaling law in the event-event correlation of aftershocks.
5. Concluding remarks
In this article, we have reviewed the recent discovery of the aging and scaling phenomena in the event-event correlation of earthquake aftershocks. Taking simultaneously into account the quenching effect of a mainshock, long-range fault-fault interaction, and slow relaxation (the power-law nature of the Omori law), we can imagine that the mechanism governing aftershocks may be glassy dynamics. An important point here is that an essential role is played not by conventional time but by natural time. The concept of natural time may be best represented by the theory of complex networks (i.e., growing random graphs with complex topology). In a series of works [9-11], we have proposed a procedure for mapping the seismic data to a complex network, in which a vertex and an edge represent an event and an event-event correlation, respectively. It has been shown that the result is a scale-free small-world network. In the picture of a growing directed earthquake network, the path length (i.e., the degree of separation) between two vertices is none other than the elapsed natural time itself and is a topologically invariant quantity. These observations naturally lead to the necessity of modeling seismicity (aftershocks, in particular) by discrete glassy dynamics on a growing complex network. This will be part of future investigation.
Acknowledgments
S. A. thanks Andrea Rapisarda for giving him an opportunity to present an invited lecture at the International School of Solid State Physics, 31st Course: "Complexity, Metastability and Nonextensivity" (20-26 July, 2004, Erice, Sicily, Italy). S. A. and N. S. were supported in part by a Grant-in-Aid for Scientific Research of the Japan Society for the Promotion of Science.

References
1. D. W. Steeples and D. D. Steeples, Bull. Seismol. Soc. Am. 86, 921 (1996).
2. F. Omori, J. Coll. Sci. Imper. Univ. Tokyo 7, 111 (1894). See also T. Utsu, Y. Ogata, and R. S. Matsu'ura, J. Phys. Earth 43, 1 (1995).
3. S. Abe and N. Suzuki, Physica A 332, 533 (2004).
4. S. Abe and N. Suzuki, J. Geophys. Res. 108 (B2), 2113 (2003).
5. S. Abe and N. Suzuki, e-print cond-mat/0410123, to appear in Physica A.
6. P. A. Varotsos, N. V. Sarlis, and E. S. Skordas, Phys. Rev. E 66, 011902 (2002); ibid. 67, 021109 (2003).
7. U. Tirnakli and S. Abe, Phys. Rev. E 70, 056120 (2004).
8. S. Abe, N. V. Sarlis, E. S. Skordas, H. Tanaka, and P. A. Varotsos, e-print cond-mat/0412056.
9. S. Abe and N. Suzuki, Europhys. Lett. 65, 581 (2004); Physica A 337, 357 (2004).
10. S. Abe and N. Suzuki, in Lecture Notes in Computer Science, Part III, edited by M. Bubak, G. D. van Albada, P. M. A. Sloot and J. J. Dongarra (Springer-Verlag, Berlin, 2004), p. 1046.
11. S. Abe and N. Suzuki, e-prints cond-mat/0402226, 0411454.
AGING IN EARTHQUAKE MODELS
UGUR TIRNAKLI
Department of Physics, Faculty of Science, Ege University, 35100 Izmir, Turkey
E-mail: [email protected]

Recently, studying the event-event correlation of aftershocks, it has been found that there exists an aging phenomenon, and a simple scaling law associated with it, both for real earthquakes and for model systems such as coherent noise models. In this study, recent findings on this subject are summarized and corroborated by new results obtained from another model, known as the Olami-Feder-Christensen (OFC) model of earthquakes.
PACS Number(s): 89.75.Da, 05.90.+m, 91.30.-f
Since earthquakes can be considered one of the best-suited examples of dynamical systems exhibiting avalanches of activity with a scale-free size distribution, the increasing interest in such dynamical systems has also produced an increase in the attempts to work out the statistical mechanical properties of real earthquakes and earthquake models 1,2,3,4,5,6,7,8,9. One of the interesting features of real earthquakes, observed very recently by Abe and Suzuki 5, is the existence of an aging phenomenon. Using the earthquake catalogs for southern California, they have found that, inside the Omori regime, the correlation between earthquake events exhibits aging. Here, the Omori regime refers to the region in a seismic time series where the Omori law (i.e., the power-law decay in time of the rate of aftershocks) holds for the temporal pattern of aftershocks. They have also determined the scaling of the aging phenomenon and obtained a nice collapse of the data. After such an observation, the question that comes naturally is whether earthquake models in which the Omori law holds exhibit a similar behavior or not. In order to answer this, Tirnakli and Abe have studied the coherent noise model 10 in a very recent work 9. Here, we shall summarize these results and then corroborate them by presenting some new findings from another model, known as the Olami-Feder-Christensen (OFC) model of earthquakes.

Firstly, let us introduce the coherent noise model. The system consists of N agents, and each agent has a threshold value x_i against an external stress η. The thresholds and the external stress are chosen randomly from some probability distributions p_th(x) and p_stress(η), respectively. In our simulations, we always use a uniform distribution (0 ≤ x ≤ 1) for p_th(x) and an exponential distribution p_stress(η) = σ⁻¹ exp(−η/σ) (σ > 0). The dynamics of this simple model is then the following: (i) at each time step, generate a random stress from p_stress(η), eliminate all agents which have x_i ≤ η, and replace them by new agents with new thresholds taken from p_th(x); (ii) select a small fraction f of the N agents at random and assign them new thresholds; (iii) return to step (i) for the next time step. In spite of the fact that the model is just a mean-field model with no geometric configuration space, it can still describe some important properties of real earthquakes (see Refs. 9, 10 for details). To analyze the features of the correlation between earthquake events, one has to employ as the basic random variable the time t_n of the nth aftershock with an arbitrary avalanche size, where n is the natural time. The concept of natural time is crucial in this analysis, since the correlation between earthquakes exhibits aging with respect to natural time, not with respect to conventional time. It has already been applied to complex time series of several physical phenomena 11. Using this definition of time, the two-point correlation function is then given by

C(n + n_w, n_w) = (<t_{n+n_w} t_{n_w}> − <t_{n+n_w}> <t_{n_w}>) / (σ_{n+n_w} σ_{n_w}),
Figure 1. The behavior of the correlation function of aftershocks larger than s1 as a function of natural time n for the coherent noise model, for N = 10000, σ = 0.2, f = 0.01, s1 = 1 and n_w = 250, 500, 1000, 2000, 5000. More than 100000 realizations are used to perform the ensemble averaging. All quantities are dimensionless and logarithmic binning is employed for all data sets for better visualization.
where n_w is the natural waiting time, the ensemble average is over a large number of realizations, and the variances are defined as σ_n² = <t_n²> − <t_n>². In Fig. 1 we present the behavior of C(n + n_w, n_w) as a function of the natural time, where the
Figure 2. Data collapse for the correlation function shown in Fig. 1. The solid line corresponds to e_q(0.7 n/n_w^{1.05}) with q ≈ 2.98.
aging phenomenon is clearly seen. Moreover, we obtain a very nice collapse of all the data using the scaling relation

C(n + n_w, n_w) = C̃(n / n_w^{1.05}),

which is remarkably similar to the one used by Abe and Suzuki for the real earthquake data. Our results are given in Fig. 2. From this figure it is also seen that the form of the scaling function can be well represented by the q-exponential (defined as e_q(x) = [1 + (1 − q)x]^{1/(1−q)}), which is known to play a central role in nonextensive thermostatistics 12,13. In the remainder of this paper, we shall apply the same analysis to another well-known model of earthquakes, the OFC model 14. In this model, each block i in a discrete system of blocks carries a force F_i, and whenever this value exceeds a fixed threshold F_th, the block becomes unstable and relaxes to zero, which in turn results in the transfer of some fraction of this force to its neighbors. This may create an avalanche. The dynamics of the model is then the following:

F_i ≥ F_th  =>  F_nn → F_nn + α F_i,  F_i → 0,

where α is the conservation parameter, which is defined to be α_i = 1/(n_i + k). For free boundary conditions, n_i is the number of neighbors of block i and k denotes
Figure 3. The behavior of the correlation function as a function of natural time n for the OFC model on a 32 × 32 lattice, for n_w = 250, 500, 1000, 2000, 5000. 20000 realizations are used to perform the ensemble averaging. All quantities are dimensionless and logarithmic binning is employed for all data sets for better visualization.

Figure 4. Data collapse for the correlation function shown in Fig. 3. The solid line corresponds to e_q(0.67 n/n_w^{1.05}) with q ≈ 2.9.
the elastic constant. As a result, the model is conservative (nonconservative) for k = 0 (k ≠ 0). In our simulations, we use a 32 × 32 lattice under free boundary conditions with k = 1, which corresponds to α_i = 0.2 in the bulk. We choose this value since it lies inside the parameter region which best mimics real earthquakes 14. In Fig. 3 we present the behavior of the correlation function of the OFC model, where an aging phenomenon very similar to the one we observed for the coherent noise model is evident. For the data collapse, as can easily be seen in Fig. 4, the same scaling function works for the OFC model as well. In conclusion, we have concentrated here on the aging phenomenon, one of the interesting features of real earthquakes, observed very recently by Abe and Suzuki 5. The natural expectation that such a phenomenon exists for earthquake models has been corroborated by studying both the coherent noise model and the OFC model. The new findings related to the OFC model strengthen the recent results of Tirnakli and Abe 9. It is no doubt necessary to perform further analysis of the OFC model to understand this behavior better.
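A single relaxation step of the OFC toppling rule (free boundaries, elastic constant k, so that α_i = 1/(n_i + k)) can be sketched as follows. This is an illustrative implementation of the standard rule, not the code used for the simulations reported here:

```python
import numpy as np

def ofc_avalanche(F, F_th=1.0, k=1):
    """Relax every site with F_i >= F_th: set it to zero and give the
    fraction alpha_i = 1/(n_i + k) of its force to each of its n_i
    neighbors (free boundaries: edge sites have fewer neighbors).
    Modifies F in place and returns the avalanche size."""
    L = F.shape[0]
    size = 0
    unstable = list(zip(*np.where(F >= F_th)))
    while unstable:
        i, j = unstable.pop()
        if F[i, j] < F_th:          # may have been relaxed already
            continue
        nbrs = [(i + di, j + dj)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < L and 0 <= j + dj < L]
        alpha = 1.0 / (len(nbrs) + k)
        f = F[i, j]
        F[i, j] = 0.0
        size += 1
        for a, b in nbrs:
            F[a, b] += alpha * f
            if F[a, b] >= F_th:
                unstable.append((a, b))
    return size
```

Between avalanches the system is driven in the usual slow way, by uniformly increasing all forces until some site reaches F_th; with k = 1 a bulk site (n_i = 4) indeed transfers α_i = 0.2 of its force to each neighbor.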
Acknowledgments
Very fruitful conversations with S. Abe and M. Paczuski on earthquake models are highly appreciated.
References
1. S. Lise and M. Paczuski, Phys. Rev. E 64, 046111 (2001).
2. P. Bak et al., Phys. Rev. Lett. 88, 178501 (2002).
3. S. Hergarten and H. J. Neugebauer, Phys. Rev. Lett. 88, 238501 (2002).
4. A. Helmstetter, S. Hergarten and D. Sornette, Phys. Rev. E 70, 046120 (2004).
5. S. Abe and N. Suzuki, Physica A 332, 533 (2004).
6. O. Sotolongo-Costa and A. Posadas, Phys. Rev. Lett. 92, 048501 (2004).
7. M. Baiesi and M. Paczuski, Phys. Rev. E 69, 066106 (2004).
8. S. Abe and N. Suzuki, Europhys. Lett. 65, 581 (2004).
9. U. Tirnakli and S. Abe, Phys. Rev. E 70, 056120 (2004).
10. M. E. J. Newman, Proc. R. Soc. London B 263, 1605 (1996); M. E. J. Newman and K. Sneppen, Phys. Rev. E 54, 6226 (1996).
11. P. A. Varotsos, N. V. Sarlis and E. S. Skordas, Phys. Rev. E 66, 011902 (2002); 67, 021109 (2003); 68, 031106 (2003).
12. Nonextensive Statistical Mechanics and Its Applications, edited by S. Abe and Y. Okamoto (Springer-Verlag, Heidelberg, 2001).
13. Nonextensive Entropy: Interdisciplinary Applications, edited by M. Gell-Mann and C. Tsallis (Oxford University Press, New York, 2004).
14. Z. Olami, H. J. S. Feder and K. Christensen, Phys. Rev. Lett. 68, 1244 (1992).
THE OLAMI-FEDER-CHRISTENSEN MODEL ON A SMALL-WORLD TOPOLOGY
F. CARUSO
Scuola Superiore di Catania, Via S. Paolo 73, 95123 Catania, Italy

V. LATORA, A. RAPISARDA
Dipartimento di Fisica e Astronomia, Università di Catania, and INFN sezione di Catania, Via S. Sofia 64, 95123 Catania, Italy

B. TADIĆ
Department for Theoretical Physics, Jožef Stefan Institute, P. O. Box 3000, SI-1001 Ljubljana, Slovenia

We study the effects of the topology on the Olami-Feder-Christensen (OFC) model, an earthquake model of self-organized criticality. In particular, we consider a 2D square lattice and a random rewiring procedure with a parameter 0 < p < 1 that allows one to tune the interaction graph, in a continuous way, from the initial local connectivity to a random graph. The main result is that the OFC model on a small-world topology exhibits self-organized criticality deep within the nonconservative regime, contrary to what happens in the nearest-neighbors model. The probability distribution for the avalanche size obeys finite-size scaling, with universal critical exponents, in a wide range of values of the rewiring probability p. The pdf's cutoff can be fitted by a stretched exponential function, with the stretching exponent approaching unity within the small-world region.
1. Introduction
In order to understand the physical mechanisms underlying earthquakes, the study and modelling of the seismic space-time distribution is a fundamental step. In recent years, inspired by statistical regularities such as the Gutenberg-Richter and the Omori laws, and by a desire to quantify the limits of earthquake predictability, a wealth of mechanisms and models have been proposed. In particular, as a possible explanation for the widespread occurrence of long-range correlations in space and time, many authors have modelled the seismogenic crust as a self-organized complex system that spontaneously organizes into a dynamical critical state 1,2,3. Actually, the real applicability of this kind of model to describe earthquake dynamics is still debated 4,5. A model which has played an important role in the context of Self-organized Criticality (SOC) for nonconservative systems is the Olami-Feder-Christensen (OFC) model of earthquakes 6. The most recent numerical investigations have shown that the OFC model on a square lattice with open boundary conditions displays a power law distribution of avalanche sizes with a universal exponent τ ≃ 1.8, which is independent of the dissipation parameter; in the case of periodic boundary conditions, instead, no power law was found. Therefore, boundary conditions, and in general the underlying topology, play a fundamental role for the criticality. For example, in the case of open boundary conditions, the boundary sites update themselves at a different frequency from the bulk sites, and this inhomogeneity, together with a diverging length scale in the thermodynamic limit, induces partial synchronization of the elements of the system, building up long-range spatial correlations and thereby creating a critical state. However, in this last case there is no finite size scaling. On the other hand, the OFC model on a quenched random graph is subcritical and shows no power law distributions. In this context, the purpose of our work is to study the effects of the graph topology on the criticality of the non-conservative OFC model. In particular, we consider a small-world graph, following the method by Watts and Strogatz 13,14, and we analyze some SOC properties. We then investigate the effects of an increasing number of long-range connections on the criticality of the model, by varying the rewiring probability in the range 0 < p < 1. We find that, for a particular region of values of the structural parameter p within the small-world regime, the probability distribution for avalanche size obeys finite size scaling with universal critical exponents (within numerical error bars). Moreover, we fit the pdf's cutoffs by a stretched exponential function and we note that the stretching exponent σ approaches 1 at the small-world threshold.
2. The OFC model on a small-world topology
The Olami-Feder-Christensen (OFC) model is defined on a discrete system of N sites on a square lattice, each carrying a seismogenic force represented by a real variable F_i, which initially takes a random value in the interval (0, F_th). All the forces are increased simultaneously and uniformly (mimicking a uniform tectonic loading), until one of them reaches the threshold value F_th and becomes unstable (F_i ≥ F_th). The driving is then stopped and an "earthquake" (or avalanche) starts:

F_i → 0,    F_nn → F_nn + α F_i,    (1)

where "nn" denotes the set of nearest-neighbor sites of i. The dissipation level of the dynamics is controlled by the parameter α and, in the case of a graph with fixed connectivity q, it takes values between 0 and 1/q (α = 1/q being the conservative case). As regards the model dynamics, all sites that are above threshold at a given time step of the avalanche relax simultaneously according to (1) and, by a chain reaction, they can create new unstable sites, until there are no more unstable sites in the system (F_i < F_th, ∀i): this chain reaction is an earthquake. At the end of this event, the uniform growth starts again and other earthquakes will happen.
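The driving-and-relaxation cycle just described can be condensed into a short simulation sketch (a minimal toy implementation of the nearest-neighbor case, not the authors' code; the lattice size, α and the number of avalanches are arbitrary illustrative choices):

```python
import numpy as np

def ofc_avalanche(F, alpha, F_th=1.0):
    """Drive the lattice uniformly until one site reaches F_th, then relax
    all unstable sites according to rule (1): F_i -> 0 and each nearest
    neighbor gains alpha * F_i.  Open boundaries: flux crossing the edge
    is lost.  Returns the avalanche size (number of topplings)."""
    L = F.shape[0]
    F += F_th - F.max()                     # uniform tectonic loading
    size = 0
    while True:
        unstable = np.argwhere(F >= F_th)   # simultaneous update, as in the model
        if unstable.size == 0:
            break
        for i, j in unstable:
            f = F[i, j]
            F[i, j] = 0.0
            size += 1
            for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if 0 <= ni < L and 0 <= nj < L:
                    F[ni, nj] += alpha * f
    return size

rng = np.random.default_rng(0)
L, alpha = 16, 0.21                         # dissipative regime
F = rng.uniform(0.0, 1.0, size=(L, L))
sizes = [ofc_avalanche(F, alpha) for _ in range(2000)]
```

Histogramming `sizes` over many more avalanches and larger lattices would give the distribution P_N(s) studied below.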
The number of topplings during an earthquake defines its size, s, and we will be interested in the probability distribution P_N(s). In our approach the boundary conditions are "open", i.e. we impose F = 0 on the boundary sites. Let us point out again that recent numerical investigations have shown that the non-conservative OFC model on a square lattice (with "open" boundary conditions) displays scaling behavior up to the lattice sizes presently accessible by computer simulations 7,9. The avalanche size distribution is described by a power law, characterized by a universal exponent τ ≈ 1.8, independent of the dissipation parameter. However, this distribution does not display finite size scaling. In the OFC model, criticality has been ascribed to a mechanism of partial synchronization 10,11,12. In fact, the system has a tendency to order into a periodic state, but it is frustrated by the presence of inhomogeneities such as the boundaries. In particular, these inhomogeneities induce partial synchronization of the elements of the system, building up long-range spatial correlations, and a critical state is obtained: the mechanism of synchronization therefore requires an underlying spatial structure. Indeed, it is known that the OFC model on a quenched random graph displays criticality even in the nonconservative regime, but only after introducing some inhomogeneities 8. In a random graph in which all the sites have exactly the same number of nearest neighbors q (both for q = 4 and q = 6), the dynamics of the non-conservative OFC model organizes into a subcritical state. In order to observe scaling in the avalanche distribution, one has to introduce some inhomogeneities. It has been found that, for the OFC model on a (quenched) random graph, it is enough to consider just two sites in the system with coordination q − 1 8. In fact, when either of these sites topples according to rule (1), an extra amount αF_i is simply lost by the system and a critical behavior appears.
Our work consists of the study of the non-conservative OFC model on a small-world topology, and it is motivated by two main reasons. First of all, we expect that the inclusion of some inhomogeneities in the site degrees is not the unique way to obtain SOC. In fact, as we are going to show, an alternative way is to keep the site degrees fixed and to change the topology of the underlying network, for instance by considering a small-world graph, obtained by randomizing a fraction p of the links of the regular nearest-neighbor lattice. The method to construct networks with intermediate properties between regular and random graphs proposed by Watts and Strogatz 13,14 is perfectly suited to our purpose, since it allows us to consider networks that are in between the two most widely investigated cases, namely the nearest-neighbor lattice and the random graph. The second reason has to do with modelling real space-time seismicity. A small-world topology is expected to be a more accurate description of real systems according to the most recent geophysical observations, which indicate that earthquakes might have long-range correlations both in time and space. In fact, if a main fracture episode occurs, it may induce slow strain redistribution through the earth crust, thus triggering long-range as well as short-range seismic effects 15,16,17,18,19. The presence of a certain percentage of long-range connections in the network takes into account the possibility that an earthquake can trigger earthquakes not only locally but also in distant regions. Following the method proposed by Watts and Strogatz, we start with a two-dimensional square lattice in which each site is connected to its 4 nearest neighbors, and then the links of the lattice are rewired at random with a probability p. The main difference with respect to the original model is that, for any value of p, we want to keep the connectivity of each site fixed, and therefore the connections are rewired in couples.
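Rewiring "in couples" can be sketched as a degree-preserving edge swap (this is our reading of the procedure, not the authors' code; a full implementation would also reject self-loops and duplicate links, bookkeeping omitted here):

```python
import random
from collections import Counter

def rewire_in_couples(edges, p, seed=1):
    """Degree-preserving rewiring sketch: with probability p an edge
    exchanges one endpoint with a second, randomly chosen edge,
    (a, b), (c, d) -> (a, d), (c, b), so every site keeps its degree
    while a fraction ~p of the links become long-range."""
    rng = random.Random(seed)
    edges = [list(e) for e in edges]
    for i in range(len(edges)):
        if rng.random() < p:
            j = rng.randrange(len(edges))
            if i != j:
                edges[i][1], edges[j][1] = edges[j][1], edges[i][1]
    return [tuple(e) for e in edges]

# Build the 4-nearest-neighbor square lattice (periodic wrapping is used
# here only to construct the edge list compactly).
L = 8
node = lambda x, y: (x % L) * L + (y % L)
lattice = [(node(x, y), node(x + 1, y)) for x in range(L) for y in range(L)] + \
          [(node(x, y), node(x, y + 1)) for x in range(L) for y in range(L)]

rewired = rewire_in_couples(lattice, p=0.05)
degree = Counter(n for e in rewired for n in e)   # every site keeps degree 4
```

Because the swap only exchanges endpoints between two existing links, the degree sequence is conserved for any value of p, which is the property the text requires.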
3. Results
In our simulations we have studied a two-dimensional L × L square lattice (with "open" boundary conditions) with three different sizes: L = 32, 64 and 128. The corresponding number of sites is N = L². We have considered up to 10⁹ avalanches to obtain good statistics for the avalanche size distribution P_N(s). All the curves can be fitted by a stretched-exponential function

P_N(s) ∝ s^(−τ) exp[−(s/ξ)^σ],    (2)

where s is the avalanche size, ξ is the characteristic size and τ and σ are two exponents. We notice that, on approaching the small-world limit, the power law is practically lost and the stretching becomes more and more pronounced. This can be better studied by plotting the value of the two exponents τ and σ as a function of p, as done in Fig. 1. In this figure, τ takes the value 1.8 for p = 0.02, at the small-world transition. Moreover, we observe a sudden change in the behavior of the stretching exponent σ at the small-world probability. In general, one can expect stretched-exponential behavior in various cases of stochastic processes where many length scales appear. So, when the system is at the small-world threshold (and beyond), multiple lengths start playing a role, because of the presence of long-range links.
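Once τ is fixed, σ and ξ can be extracted by linearizing the cutoff of the distribution. A sketch on synthetic, noise-free data generated from the stretched-exponential form quoted above (the numbers are illustrative, not the paper's fit values):

```python
import numpy as np

# synthetic avalanche distribution with known exponents
tau, sigma, xi = 1.8, 0.7, 100.0
s = np.logspace(0, 4, 200)
P = s**(-tau) * np.exp(-(s / xi)**sigma)

# remove the power-law prefactor, then take a double logarithm:
# ln(-ln(P * s**tau)) = sigma * ln(s) - sigma * ln(xi)
y = np.log(-np.log(P * s**tau))
slope, intercept = np.polyfit(np.log(s), y, 1)
sigma_hat = slope                       # recovers sigma
xi_hat = np.exp(-intercept / slope)     # recovers xi
```

On real, noisy histograms the same linearization works only in the cutoff region, and one would restrict the fit to s well above the power-law regime.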
Figure 1. The two exponents τ and σ as a function of the rewiring parameter p; these results refer to the case L = 64. In the other cases we get similar results.
Figure 2. Finite size scaling for the dissipative OFC model on a small-world topology for three different values of N, namely N = 32², 64², 128². By fitting the curves we derive the following critical exponents: D = 2 and β ≃ 3.6. (Horizontal axis: log₁₀(s/L^D).)
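The collapse shown in Fig. 2 can be reproduced schematically: for each system size one plots N^β P_N(s) against s/N^D and checks that all curves fall on a single scaling function. A sketch with synthetic data built to obey the finite size scaling ansatz (the scaling function below is an arbitrary stand-in, not the measured one):

```python
import numpy as np

beta, D = 3.6, 2.0                    # exponents quoted in the caption
tau = beta / D                        # = 1.8

def f(x):
    """Stand-in scaling function with the expected power-law part."""
    return x**(-tau) * np.exp(-x)

x = np.logspace(-3, 0, 30)            # common rescaled sizes s / N**D
curves = []
for N in (32**2, 64**2, 128**2):
    s = x * N**D                      # raw avalanche sizes for this N
    P = N**(-beta) * f(s / N**D)      # synthetic P_N(s) obeying the ansatz
    curves.append(N**beta * P)        # rescaled distribution

# all three rescaled curves coincide with f(x): data collapse
spread = max(float(np.max(np.abs(c - curves[0]))) for c in curves[1:])
```

With measured histograms the collapse is of course only approximate, and the quality of the overlap is what constrains β and D.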
In order to characterize the critical behavior of the model, the following finite size scaling (FSS) ansatz is used:

P_N(s) ≃ N^(−β) f(s/N^D),    (3)

where f is a suitable scaling function and β and D are critical exponents describing the scaling of the distribution function. In Fig. 2 we consider α = 0.21 and a rewiring probability p = 0.006, below the small-world transition, and we show the collapse of P_N(s) for three different values of N, namely N = 32², 64², 128². We find that the distribution P_N(s) satisfies the FSS hypothesis reasonably well, with universal critical coefficients, for small rewiring probability. However, on approaching the small-world limit (p = 0.02), the cut-offs are not exponential and thus no FSS is expected. The critical exponents derived from the fit of Fig. 2 are β ≃ 3.6 and D = 2, independently of the dissipation parameter α. The FSS hypothesis implies that, for asymptotically large N, P_N(s) ∼ s^(−τ), with τ = β/D ≃ 1.8. Because of the numerical errors, it is difficult to assert with certainty that τ is a novel exponent, different from the one for the conservative random-neighbor model (τ = 1.5). However, the power law exponent of the distribution is consistent with the exponent of the OFC model with open boundary conditions, i.e. τ ≃ 1.8. On a rewired topology the system behaves as in the compact square lattice, but the occurrence of a small number of long-range links disseminates the avalanches over the network, and the whole avalanche acquires the property of scaling with the system size; in fact the links, of random length, can be as long as the system size. A superposition of
all these local events could be one possible explanation of this global instability (avalanche), and the stretched-exponential cut-offs can then be used to fit the avalanche size distribution.

4. Conclusions
In this work we have studied how the topology can play an important role in the dynamics of the OFC model. Following this idea, a rewiring procedure like that introduced by Watts and Strogatz 13 has been considered. We have found that the model is then critical even in the nonconservative regime. In fact, in contrast with the nearest-neighbor OFC model 7, in which criticality appears only in the conservative case unless some inhomogeneities are introduced, a small-world graph has an underlying spatial structure. Therefore, partial synchronization of the elements of the system can still occur, without being destroyed as in the quenched random graph without inhomogeneities. Finally, we point out that, on approaching the small-world limit, the power law is practically lost, the cutoffs are fitted by a stretched-exponential function, and the sudden change of behavior of the stretching exponent σ appears at the small-world probability. Moreover, we emphasize that this kind of study can easily be extended to other topologies, such as scale-free networks, in order to better understand the role of topology in earthquake models.
References
1. P. Bak, C. Tang, and K. Wiesenfeld, Phys. Rev. Lett. 59, 381 (1987); Phys. Rev. A 38, 364 (1988).
2. P. Bak, How Nature Works: The Science of Self-Organized Criticality (Copernicus, New York, 1996).
3. H. Jensen, Self-Organized Criticality (Cambridge University Press, New York, 1998).
4. X. Yang, S. Du, and J. Ma, Phys. Rev. Lett. 92, 228501 (2004).
5. M. S. Mega, P. Allegrini, P. Grigolini, V. Latora, L. Palatella, A. Rapisarda and S. Vinciguerra, Phys. Rev. Lett. 92, 129802 (2004).
6. Z. Olami, H.J.S. Feder, and K. Christensen, Phys. Rev. Lett. 68, 1244 (1992); K. Christensen and Z. Olami, Phys. Rev. A 46, 1829 (1992).
7. S. Lise and M. Paczuski, Phys. Rev. E 63, 036111 (2001).
8. S. Lise and M. Paczuski, Phys. Rev. Lett. 88, 228301 (2002).
9. S. Lise and M. Paczuski, Phys. Rev. E 64, 046111 (2001).
10. A. A. Middleton and C. Tang, Phys. Rev. Lett. 74, 742 (1995).
11. J.E.S. Socolar, G. Grinstein, and C. Jayaprakash, Phys. Rev. E 47, 2366 (1993).
12. P. Grassberger, Phys. Rev. E 49, 2436 (1994).
13. D.J. Watts and S.H. Strogatz, Nature 393, 440 (1998).
14. D.J. Watts, Small Worlds (Princeton University Press, Princeton, New Jersey, 1999).
15. Y.Y. Kagan and D.D. Jackson, Geophys. J. Int. 104, 117 (1991).
16. D.P. Hill et al., Science 260, 1617 (1993).
17. L. Crescentini, A. Amoruso, and R. Scarpa, Science 286, 2132 (1999).
18. T. Parsons, J. Geophys. Res. 107, 2199 (2001).
19. M. S. Mega, P. Allegrini, P. Grigolini, V. Latora, L. Palatella, A. Rapisarda and S. Vinciguerra, Phys. Rev. Lett. 90, 188501 (2003).
Networks
NETWORKS AS RENORMALIZED MODELS FOR EMERGENT BEHAVIOR IN PHYSICAL SYSTEMS
MAYA PACZUSKI
Perimeter Institute for Theoretical Physics, Waterloo, Canada, N2L 2Y5
and Department of Mathematics, Imperial College London, London, UK SW7 2AZ
E-mail: maya@ic.ac.uk
Networks are paradigms for describing complex biological, social and technological systems. Here I argue that networks provide a coherent framework to construct coarse-grained models for many different physical systems. To elucidate these ideas, I discuss two long-standing problems. The first concerns the structure and dynamics of magnetic fields in the solar corona, as exemplified by the sunspots that startled Galileo almost 400 years ago. We discovered that the magnetic structure of the corona embodies a scale free network, with spots at all scales. A network model representing the three-dimensional geometry of magnetic fields, where links rewire and nodes merge when they collide in space, gives quantitative agreement with available data, and suggests new measurements. Seismicity is addressed in terms of relations between events, without imposing space-time windows. A metric estimates the correlation between any two earthquakes. Linking strongly correlated pairs, and ignoring pairs with weak correlation, organizes the spatio-temporal process into a sparse, directed, weighted network. New scaling laws for seismicity are found. For instance, the aftershock decay rate decreases as ∼1/t in time up to a correlation time, t_omori. An estimate from the data gives t_omori to be about one year for small magnitude 3 earthquakes, about 1400 years for the Landers event, and roughly 26,000 years for the earthquake causing the 2004 Asian tsunami. Our results confirm Kagan's conjecture that aftershocks can rumble on for centuries.
1. Introduction
A fundamental problem in physics, which is not always recognized as being "a fundamental physics problem", is how to mathematically describe emergent phenomena. It seems hopeless, for many reasons, to make a theory of emergence that harmonizes all scales, from the Planck scale to the size of our Universe, and includes life on Earth with its manifest details, such as bacteria, society, or ourselves as individual personalities. That would be a true theory of everything (TToE). (For a discussion see Refs. [1,2].) However, a reasonable aim is to describe how entities or excitations with their own effective dynamics develop from symmetries, conservation laws and nonlinear interactions between elements at a lower level. Some famous examples in statistical physics are critical point fluctuations, avalanches in sand piles,3 vorticity in turbulence,4,5 or the distribution of luminous matter in the Universe.6,7,8 Contemporary work in quantum gravity suggests that both general relativity and quantum mechanics may emerge from coarse graining a low energy approximation to the fundamental causal histories. These histories are sequences of changes in graphs, that may be nonlocal or small-world networks.a Similar sets of questions crop up across the board. How do you get qualitatively new structures and dynamics from underlying laws? An important distinction appears between equilibrium and far from equilibrium systems. Roughly speaking, most equilibrium systems are complex in the same way. They exhibit emergent behavior at critical points with fluctuations governed by symmetry principles, etc. Non-equilibrium systems, however, seem complex in a myriad of different ways. Nevertheless, a variety of indicators point to principles of organization for emergent phenomena far from equilibrium. Various types of scaling behaviors in physical systems (scale invariance,9 scale covariance,10,11 etc.) can be quantitatively predicted using coarse-grained models. After all, the underlying equations typically govern at length and time scales well below those where observations are made. The key is to capture the dynamics of larger scale entities, or "coherent structures",12 and use those as building blocks to model the whole system. Ideally, renormalized models may be derived from the underlying equations, but it is not clear that this is always possible. Even without an explicit derivation, though, once such a model is borne out in a specific system, by subjecting it to falsifiable tests, it may also connect to other physical situations with similar, or even different, underlying laws. Nowadays, computational science tends to emphasize studies of bigger and bigger systems with more and more details. That is unlikely, by itself, to lead to any better understanding of emergence, and can also easily be demonstrated to be fruitless for many interesting problems in physics, like those discussed here.
There are simply too many degrees of freedom coupled over too long times, compared to the microscopic time. That doesn't mean that these problems are unsolvable through computational methods, though. We must use a different starting point. Complex networks have been intensively investigated recently as descriptions of biological, social and technological phenomena.13,14 In fact, a sparse network expresses coarse-graining in a natural way, since the few links present highlight relevant interactions between effective degrees of freedom, with all other nodes and links deleted. Renormalization may then proceed further on the network alone, by grouping tightly coupled nodes or modules together and finding the interactions between those new effective degrees of freedom. Understanding processes of network organization, perhaps through an information theory of complex networks,15 is (arguably) necessary to make progress toward theories of emergence in physical systems. In order to demonstrate the wide applicability of these ideas in diverse contexts, and at different levels in our ability to describe physical phenomena, I present two distinctive examples of networks as empirical descriptions of physical systems.
First, I discuss the coronal magnetic field and show that much of the important physics can be captured with a network where nodes and links interact in space and time with local rules. In this case, the network is an abstraction of the geometry of the magnetic fields. We use insights gained from studying the underlying equations, and a host of observations from the Sun, to determine a minimal model.16,17,18 Second, I discuss a new approach to seismicity based solely on relations between events, and not on any fixed space, time or magnitude scales. Earthquakes are represented as nodes in the network, and strongly correlated pairs are linked. A sparse, directed network of disconnected, highly clustered graphs emerges. The ensemble of variables and their relations on this network reveal new scaling laws for seismicity. Our network model of coronal magnetic fields is minimal in that, if any of its five basic ingredients is deleted, its behavior changes and fails to agree with observations. However, its rules can be changed in many ways, for instance by altering parameters or adding interactions, without modifying most statistical properties. Although the model is not explicitly constructed according to a formalism based on symmetry principles, relevant operators, and the general arguments used for statistical field theories, it appears to have comparable robustness and fixed point properties. Lastly, the model is falsifiable. We have made numerous predictions for observables, as well as suggesting new quantities to be measured. In fact, studying its behavior led us to re-analyze previously published coronal magnetic field data, revealing the scale-free nature of magnetic "concentrations" on the visible surface of the Sun.

2. The Coronal Magnetic Field
The Sun is a magnetic star.19 Like Earth, the matter density at its surface drops abruptly, and a tenuous outer atmosphere, the corona, looms above. The surface, or photosphere, is much cooler than both the interior of the Sun and the corona. For this reason, only magnetic fields at or near the surface have been directly measured. Several mechanisms have been proposed for coronal heating, including nanoflares.20 Like bigger, ordinary flares, these may be caused by sudden releases of magnetic energy from reconnection. Reconnection occurs when magnetic field lines rapidly rearrange themselves. Fast reconnection is a threshold process that occurs when magnetic field gradients become sufficiently steep.20,21,22 In the convective zone below the photosphere, temperature gradients drive instabilities. Moving charges in the plasma create magnetic fields. Rising up, these fields pierce the photosphere and loop out into the corona. The pattern of flux on the photosphere and in the corona is not uniform, though. Flux is tightly bundled into long-lived flux tubes that attach to the photosphere at footpoints. These flux loops survive for hours or more, while the lifetimes of the granules on the photosphere are minutes. Footpoints aggregate into magnetic "concentrations" on the photosphere. Measuring these concentrations provides a quantitative picture that can be compared
Figure 1. Results demonstrating the scale free network of coronal magnetic flux, and comparison with results from numerical simulations of our self-organizing network. For the concentration data, F(Φ) = constant × P(Φ)(ΔΦ) × Φ^γ, where P(Φ)(ΔΦ) is the normalized number of magnetic concentrations in bins of size ΔΦ = 1.55 × 10^17 Mx, obtained by reanalyzing the measurement data originally shown in Figure 5 of Ref. [24]. The model data shown represents the probability distribution, P(k_foot), for the number of loops, k_foot, connected to a footpoint. This has been rescaled so that one loop, k_foot = 1, equals the minimum threshold of flux, 1.55 × 10^17 Mx. The cutoff at large Φ in the model data is a finite size effect that can be shifted to larger or smaller values by changing the size of the system. (Horizontal axis: Φ (×10^18 Mx).)
with theory. The strongest, and physically largest, concentrations are sunspots, which may contain more than 10^22 Mx.23 The intense magnetic fields in these regions cool the plasma, so they appear dark in the visible spectrum. The smallest resolvable concentrations above the current resolution scale of ≈ 10^16 Mx are "fragments". Solar physicists have constructed elaborate theories where at each scale a unique physical process is responsible for the dynamics and generation of magnetic concentrations, e.g. a "large scale dynamo" versus a "surface dynamo", etc. These theories predict an exponential distribution for concentration sizes.

2.1. Coronal Fields Form a Scale Free Network

David Hughes and I re-analyzed17,18 previously published data sets reporting the distribution of concentration sizes.24 As shown in Fig. 1, we discovered that the distribution is scale free over the entire range of measurement. The probability to have a concentration with flux Φ is P(Φ) ∼ Φ^(−γ) with γ ≈ 1.7, as indicated by the flat behavior of F(Φ) in Fig. 1. Similar results were found using other data sets.17

2.2. The Model
Results from numerical simulations of our network model are also shown in Fig. 1. The only calibration used (which is unavoidable) was to set the minimal unit of flux in the model equal to the flux threshold of the measurement. How did we get such good agreement without solving any of the plasma physics equations? Considering the long-lived flux tubes as the important coherent structures, we treated the coronal magnetic field as made up of discrete interacting loops embedded in three dimensional space.16 Each directed loop traces the mid-line of a flux tube, and is anchored to a flat surface at two opposite polarity footpoints. A footpoint locates the center of a magnetic concentration, and is considered to be a point. A collection of these loops and their footpoints gives a distilled representation of the coronal magnetic field structure. Our network model is able to describe the three dimensional geometry of fields that are very complicated or interwoven. The essential ingredients, which must be included to agree with observations, are: injection of small loops, submergence of very small loops, footpoint diffusion, aggregation of footpoints, and reconnection of loops. Observations indicate that all of these physical processes occur in the corona.20,21,24 Loops injected at small length scales are stretched and shrunk as their footpoints diffuse over the surface. Nearby footpoints of the same polarity aggregate to form magnetic fragments, which can themselves aggregate to form ever larger concentrations of flux. Each loop carries a single unit of flux, and the magnetic field strength at a footpoint is given by the number of loops attached to it. The number of loops that share a given pair of footpoints measures the strength of the link. The link strengths also have a scale-free distribution, with a steeper power law than the degree distribution of the nodes, or concentrations.
Also, the number of nodes that a given node is connected to by at least one loop is scale-free. Both of these additional claims could also be tested against observations.16,17 Loops can reconnect when they collide, or cross at a point in three dimensional space above the surface. The flux emerging from the positive footpoint of one of the reconnecting loops is then no longer constrained to end up at the other footpoint of the same loop, but may instead go to the negative footpoint of the other loop. This occurs if the rewiring lowers the combined loop length. The loops between the newly paired footpoints then both relax to a semi-circular shape. Reconnection allows footpoints to exchange partners and reshapes the network, but it maintains the degree of each footpoint. If rewiring occurs, one or both loops may need to cross another loop. A single reconnection between a pair of loops can trigger an avalanche of reconnection. Reconnections occur instantaneously compared to the diffusion of footpoints and injection of loops. It may also happen that, due to reconnection or footpoint diffusion, very small loops are created. These are removed from the system. Thus
Figure 2. The cumulative percentage of footpoint pairs separated by a distance on the photosphere larger than d. The flux tube data corresponds to Figure 6c in Ref. [24]. The model data has been scaled such that one unit of length is equal to 0.5 Mm. (Horizontal axis: footpoint separation (×10³ km).)
the collection of loops is an open system, driven by loop injection with an outflow of very small loops. From any initial condition, the system reaches a steady state where the loops self-organize into a scale free network. As shown in Fig. 1, the number of loops, k_foot, connected to any footpoint is distributed as a power law

P(k_foot) ∼ k_foot^(−γ),  with γ = 1.75 ± 0.1.
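The reconnection criterion — rewire two loops only if it shortens their combined length — can be illustrated with a toy two-dimensional sketch (footpoints as points in a plane, with loop length taken proportional to footpoint separation, as for semi-circular loops; this is an illustration of the rule only, not the authors' three-dimensional implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
pos = rng.uniform(0, 10, size=(n, 2))   # positive-polarity footpoints
neg = rng.uniform(0, 10, size=(n, 2))   # negative footpoints, paired by index

def total_length():
    # semi-circular loops: length proportional to footpoint separation
    return float(np.linalg.norm(pos - neg, axis=1).sum())

def try_reconnect(i, j):
    """Exchange the negative footpoints of loops i and j if and only if
    that lowers their combined length (the rewiring criterion)."""
    before = np.linalg.norm(pos[i] - neg[i]) + np.linalg.norm(pos[j] - neg[j])
    after = np.linalg.norm(pos[i] - neg[j]) + np.linalg.norm(pos[j] - neg[i])
    if after < before:
        neg[[i, j]] = neg[[j, i]]       # footpoints exchange partners
        return True
    return False

initial = total_length()
for _ in range(500):
    i, j = rng.choice(n, size=2, replace=False)
    try_reconnect(i, j)
final = total_length()                   # never exceeds the initial length
```

Because each accepted rewiring strictly decreases the total length, repeated reconnection monotonically relaxes the configuration, while every footpoint keeps its degree, mirroring the properties stated in the text.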
2.2.1. Further predictions of the network model
The distribution of distances, d, between footpoint pairs attached to the same loop can also be calculated and compared with measurement data, as shown in Fig. 2. Indeed, by setting one unit of length in the model equal to 0.5 × 10³ km on the photosphere, good agreement between the model results and observation is obtained up to the supergranule cell size. Deviations above that length scale may be due to several causes: our assumption that the loops are perfectly semi-circular, finite system size effects in the model or observations, or the force free approximation used to calculate the flux tube connectivity from observations of concentrations. Comparing with the observed diffusive behavior of magnetic concentrations25 allows an additional calibration of time. One unit of time in the model is equal to about 300 seconds on the photosphere. From these three calibrations we are able to determine values for the total solar flux and the "flux turnover time", which both agree quantitatively with observations. See Ref. [16] for details. Our model predicts not only nominally universal quantities, like the various critical exponents characterizing the flux network, but also quantities that have typical scales, such as the total solar flux, the distribution of footpoint separations, and the flux turnover time in the corona. In order to represent the geometry of the coronal magnetic fields, a three-dimensional model, as discussed here, is required. Whether similar network models can be used to describe other high Reynolds number astrophysical plasmas remains an open question.

3. Seismicity
Despite many efforts, seismicity remains an obscure phenomenon, shrouded in vague ideas without benchmarks of testability. At present, no dynamical model can capture, simultaneously, the three most robust statistical features of seismicity: (1) the Gutenberg-Richter (GR) law26,27 for the distribution of earthquake magnitudes, and the clustering of activity in (2) space and (3) time. Spatio-temporal correlations include the Omori law28,29 for the decay in the rate of aftershocks (see Eq. 4) and the fractal appearance of earthquake epicenters. Note that stochastic processes like the ETAS process30 require three or more power law distributions to be put in by hand. Since these are the main scaling features (1-3) that we wish to establish a plausible dynamical mechanism for, ETAS models are not regarded by this author as dynamical models of seismicity. To begin with, better methods to characterize seismicity are needed. Here I briefly discuss a network paradigm put forward by Marco Baiesi and myself to this end.31,32
3.1. A Unified Approach to Different Patterns of Seismic Activity

Since seismic rates increase sharply after a large earthquake in a region, events have been classified as aftershocks or main shocks, and the statistics of aftershock sequences have been extensively studied. Usually, aftershocks are collected by counting all events within a predefined space-time window following a main event.33,34,35 These sequences are used, e.g., to describe earthquake triggering36 or to predict earthquakes.37 Obviously, some types of activity, such as swarms, remote triggering,38 etc., cannot fit into this framework. Perhaps a different description is needed for each pattern of seismic activity. On the other hand, it seems worthwhile to look for a unified perspective to study various patterns of seismic activity within a coherent framework.39,40 What if we do not fix a priori the number of main shocks an event can be an aftershock of? Perhaps an event can be an aftershock of more than one predecessor. On the other hand, all events are not equally correlated to each other. Probably the situation is somewhere in between having one (or zero) correlated predecessors
or being strongly correlated to everything that happened before. In fact, a sparse but indefinite property of correlations between events may be ubiquitous to all intermittent spatio-temporal processes with memory. A sparse network (where each node is an event or earthquake) linking strongly correlated pairs of events stands out as a good starting point for describing seismicity in a unified way. In order to pursue this line of reasoning, we treat all events on the same footing, irrespective of their magnitude, local tectonic features, etc. However, unlike other approaches, we do not predefine any set of space or time windows. The sequence of activity itself selects these. Our method is also unrelated to that of Abe and Suzuki.44
3.2. Relations Between Pairs of Events: The Metric
We consider ONLY the relations between earthquakes and NOT the properties of individual events. Only catalogs that are considered complete are examined,43 and no preferred scales are imposed on the phenomenon. Instead, we invoke a metric to estimate the correlation between any two earthquakes, irrespective of how far apart they are in space and/or time.31,32 Consider as a null hypothesis45 that earthquakes are uncorrelated in time. Pairs of events where the null hypothesis is strongly violated are correlated. The metric measures the extent to which the null hypothesis is wrong. The specific null hypothesis that we have investigated so far31,32 is that earthquakes occur with a distribution of magnitudes given by the GR law, with epicenters located on a fractal of dimension df, randomly in time. Setting df = 2 does not change the observed scaling behaviors, nor does varying the GR parameter, b. An earthquake j in the seismic region occurs at time Tj at location Rj. Look backward in time to the appearance of earthquake i of magnitude mi at time Ti, at location Ri. How likely is event i, given that event j occurred where and when it did?
According to the null hypothesis, the number of earthquakes of magnitude within an interval Δm of mi that would be expected to have occurred within the time interval t = Tj - Ti seconds, and within a distance l = |Ri - Rj| meters, is

nij = (const) t l^df 10^(-b mi) Δm .   (1)
Note that the space-time domain (t, l) appearing in Eq. 1 is self-selected by the particular history of seismic activity in the region and not set by any observer. All earthquake pairs are considered on the same basis according to this metric. Consider a pair of earthquakes (i, j) where nij << 1, so that the expected number of earthquakes according to the null hypothesis is very small. Nevertheless, event i actually did occur relative to j, which, according to the metric, is surprising. A small value nij << 1 indicates that the correlation between j and i is very strong, and vice versa. By this argument, the correlation cij between any two earthquakes i and j can be estimated to be inversely proportional to nij, or

cij = 1/nij .   (2)
We measured cij between all pairs of earthquakes greater than magnitude 3 in the catalog for Southern California from January 1, 1984 to December 31, 2003.46 The removal of small events assures that the catalog is complete, but otherwise the cutoff magnitude is not important. The distribution of the correlation variables cij for all pairs i, j was observed to be a power law over fourteen orders of magnitude. Since no characteristic values of c appear in this distribution, it does not make sense to talk about distinctly different classes of relationships between pairs. On the other hand, due to the extremely broad distribution, each earthquake j may have exceptional events in its past with much stronger correlation to it than all the others combined. These strongly correlated pairs of events can be marked as linked nodes, and the collection of linked nodes forms a sparse network of disconnected, highly clustered graphs.
3.3. Directed, Weighted Networks of Correlated Earthquakes
A sparse, directed, weighted network is constructed by only linking pairs whose correlation c exceeds a given threshold, c<. Each link is directed from the past to the future. For each threshold c<, the error made in deleting links with c < c< can be estimated. For instance, throwing out 99.8% of links gives results accurate to within 1%. This leads not only to massive data reduction with controllable error, but also to a renormalized model of seismicity, which extracts the important, correlated degrees of freedom. Each link carries several variables, such as the time between the linked events, the spatial distance between their epicenters, the magnitudes of the earthquakes, and the correlation between the linked pairs. The networks are highly clustered, with a universal clustering coefficient ≈ 0.8 for nodes with small degrees, as well as broad, approximately power law in- and out-degree distributions for the nodes. Consequently, some events have many aftershocks, or outgoing links, while others have one, or zero. Also, some events are aftershocks of many previous events, i.e. they have many incoming links, while others are aftershocks of only one (or zero) events. The data reveal an absence of characteristic values for the number of in- or out-going links to an earthquake. For each event j that has at least one incoming link, we define a link weight to each "parent" earthquake i it is attached to as

wij = (cij)^q / Σk (ckj)^q ,   (3)
where the sum is over all earthquakes k with links going into j. For instance, an event can be partly an aftershock of one event, partly an aftershock of another, and partly an aftershock of a third. Normally, the parameter q = 1, but it can also be varied without changing the scaling properties of the ensemble of network variables.
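A sketch of the thresholding and weight normalization just described, assuming the weights take the form wij = cij^q / Σk ckj^q so that each event's parent weights sum to 1; the correlation values and the threshold below are made up for illustration:

```python
def build_network(correlations, c_thresh):
    """Keep only directed links (i -> j, with i preceding j) whose
    correlation c_ij exceeds the threshold c_thresh."""
    return {(i, j): c for (i, j), c in correlations.items() if c > c_thresh}

def link_weights(links, j, q=1.0):
    """Assumed Eq. 3 form: w_ij = c_ij^q / sum_k c_kj^q, so the weights
    of event j's parents sum to 1."""
    parents = {i: c for (i, jj), c in links.items() if jj == j}
    norm = sum(c ** q for c in parents.values())
    return {i: c ** q / norm for i, c in parents.items()}

# Made-up correlations between four events (0 earliest, 3 latest).
corr = {(0, 1): 50.0, (0, 2): 2.0, (1, 2): 80.0, (0, 3): 30.0, (2, 3): 60.0}
net = build_network(corr, c_thresh=10.0)  # drops the weak (0, 2) link
w3 = link_weights(net, j=3)               # event 3 has two parents, 0 and 2
```

Raising the threshold removes more of the weakest links first, which is why the error incurred by the pruning can be controlled.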
[Figure 3 appears here: log-log plots of aftershock rate versus time t for several magnitude classes; one curve is labelled m = 7.1 (Hector Mine).]
Figure 3. The Omori law for aftershock rates. Rates are measured for aftershocks linked to earthquakes of different magnitudes. For each magnitude, the rate is consistent with the Omori law, Eq. 4. As guides to the eye, dashed lines represent a decay ~ 1/t. The dense curves represent the fits obtained by means of Eq. 5 for m = 3, m = 4, and m = 5.
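The rates plotted in Fig. 3 come from logarithmic binning: weighted events are collected into geometrically growing time intervals, and each bin's total weight is divided by its width. A self-contained sketch with synthetic 1/t-distributed times (all names and parameter values here are ours, for illustration only):

```python
import math
import random

def binned_rate(times, weights, t_min=1.0, factor=2.0, n_bins=20):
    """Rate estimate: total link weight falling in each geometrically
    growing time bin, divided by the bin's width (weight per second)."""
    edges = [t_min * factor ** i for i in range(n_bins + 1)]
    rates = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(wt for t, wt in zip(times, weights) if lo <= t < hi)
        rates.append((math.sqrt(lo * hi), w / (hi - lo)))  # (bin centre, rate)
    return rates

# Synthetic times with density ~ 1/t: if u is uniform on [0, 1), then
# t = t_min * (t_max / t_min)**u is log-uniformly distributed.
random.seed(2)
t_min, t_max = 1.0, 2.0 ** 20
times = [t_min * (t_max / t_min) ** random.random() for _ in range(200000)]
weights = [1.0] * len(times)
rates = binned_rate(times, weights)
# For a 1/t decay, (bin centre) x (rate) is roughly constant across bins.
```

Geometric bins keep the relative statistical error per bin comparable over many decades in time, which is what makes the power-law decay visible.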
3.4. The Omori Law for Earthquakes of All Magnitudes
Fig. 3 shows the rate of aftershocks for the Landers, Hector Mine, and Northridge events. The weights, w, of the links made at time t after one of these events are binned into geometrically increasing time intervals. The total weight in each bin is then divided by the temporal width of the bin to obtain a rate of weighted aftershocks per second. The same procedure is applied to each remaining event, not aftershocks of these three. An average is made for the rate of aftershocks linked to events having a magnitude within an interval Δm of m. Fig. 3 also shows the averaged results for m = 3 (1871 events), m = 4 (175 events), m = 5 (28 events) and m = 5.9 (4 events). Earthquakes of all magnitudes have aftershocks that decay according to an Omori law,28,29

ν(t) ≈ K/(c + t) ,  for t < tOmori ,   (4)

where c and K are constant in time, but depend on the magnitude of the earthquake. We find that the Omori law persists up to a time tOmori that also depends on m. The function

νm(t) ~ t^(-1) e^(-t/tOmori)   (5)
was fitted to the data, excluding short times, where the aftershock rates do not yet scale as 1/t. The short-time deviation from power law behavior is presumably due to saturation of the detection system, which is unable to reliably detect events happening at a fast rate. However, this problem does not occur at later times, where the rates are lower. Some examples of these fits are also shown in Fig. 3 for the intermediate magnitude events. From these fits, a scaling law relating tOmori to the magnitude m was observed for times shorter than the duration of the catalog. It corresponds to tOmori ≈ 11 months for m = 3, and to tOmori ≈ 5 years for m = 4. An extrapolation yields tOmori ≈ 1400 years for an event with m = 7.3 such as the Landers event, and tOmori ≈ 26,000 years for the m = 9.0 Northern Sumatra earthquake causing the 2004 Asian tsunami. These results confirm Kagan's conjecture that aftershocks can rumble on for centuries.47 Indeed, with previous measurement techniques it was not possible to test his hypothesis.
4. Acknowledgments
The author thanks David Hughes, Marco Baiesi, and Jorn Davidsen for enthusiastic discussions and their collaborative efforts which contributed to the work discussed here, as well as Peter Grassberger for critical comments on the manuscript. She also thanks her colleagues at the Perimeter Institute, including Fotini Markopoulou and Lee Smolin, for wide ranging conversations.
References
1. G. 't Hooft, L. Susskind, E. Witten, M. Fukugita, L. Randall, L. Smolin, J. Stachel, C. Rovelli, G. Ellis, S. Weinberg and R. Penrose, Nature 433, 257 (2005).
2. L. Smolin, Phil. Trans. R. Soc. A: Math., Phys. & Eng. Sci. (Nobel Symposium) 361, 1081 (2003).
3. P. Bak, C. Tang and K. Wiesenfeld, Phys. Rev. Lett. 59, 381 (1987).
4. U. Frisch, Turbulence (Cambridge University Press, Cambridge, 1995).
5. F. S. Labini, M. Montuori and L. Pietronero, Phys. Rep. 293, 62 (1998).
6. P. Bak and K. Chen, Phys. Rev. Lett. 86, 4215 (2001).
7. P. Bak and M. Paczuski, Physica A 348, 277 (2005).
8. F. Markopoulou and L. Smolin, Phys. Rev. D 70, 124029 (2004).
9. K. E. Bassler, M. Paczuski and E. Altshuler, Phys. Rev. B 64, 224517 (2001).
10. B. Dubrulle, Phys. Rev. Lett. 73, 959 (1994).
11. K. Chen and P. Bak, Phys. Rev. E 62, 1613 (2000).
12. T. Chang, Phys. Plasmas 6, 4137 (1999).
13. R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
14. M. E. J. Newman, SIAM Rev. 45, 167 (2003).
15. A. Peel, M. Paczuski and P. Grassberger, in preparation.
16. D. Hughes, M. Paczuski, R. O. Dendy, P. Helander and K. G. McClements, Phys. Rev. Lett. 90, 131101 (2003).
17. D. Hughes and M. Paczuski, preprint astro-ph/0309230.
18. M. Paczuski and D. Hughes, Physica A 342, 158 (2004).
19. For a review see J. B. Zirker, Journey from the Center of the Sun (Princeton University Press, Princeton, 2002).
20. E. N. Parker, Astrophys. J. 330, 474 (1988); Sol. Phys. 121, 271 (1989).
21. E. N. Parker, Spontaneous Current Sheets in Magnetic Fields (Oxford University Press, New York, 1994).
22. E. T. Lu and R. J. Hamilton, Astrophys. J. Lett. 380, L89 (1991).
23. One Maxwell (Mx) equals 10^(-8) Weber.
24. R. Close, C. Parnell, D. MacKay and E. Priest, Sol. Phys. 212, 251 (2003).
25. H. J. Hagenaar, C. J. Schrijver, A. M. Title and R. A. Shine, Astrophys. J. 511, 932 (1999).
26. B. Gutenberg and C. F. Richter, Seismicity of the Earth, Geol. Soc. Am. Bull. 34, 1 (1941).
27. In large seismic regions over long periods of time, the distribution of earthquakes with magnitude m is P(m) ~ 10^(-bm), with b ≈ 1.
28. F. Omori, J. Coll. Sci. Imp. Univ. Tokyo 7, 111 (1894).
29. T. Utsu, Y. Ogata and R. S. Matsu'ura, J. Phys. Earth 43, 1 (1995).
30. A. Helmstetter and D. Sornette, Phys. Rev. E 66, 061104 (2002).
31. M. Baiesi and M. Paczuski, Phys. Rev. E 69, 066106 (2004).
32. M. Baiesi and M. Paczuski, Nonlin. Proc. Geophys. 12, 1 (2005).
33. J. Gardner and L. Knopoff, Bull. Seism. Soc. Am. 64, 1363 (1974).
34. V. Keilis-Borok, L. Knopoff and I. Rotwain, Nature 283, 259 (1980).
35. L. Knopoff, Proc. Natl. Acad. Sci. USA 97, 880 (2000).
36. A. Helmstetter, Phys. Rev. Lett. 91, 058501 (2003).
37. Y. Y. Kagan and L. Knopoff, Science 236, 1563 (1987).
38. D. P. Hill et al., Science 260, 1617 (1993).
39. Y. Y. Kagan, Physica D 77, 160 (1994).
40. P. Bak, K. Christensen, L. Danon and T. Scanlon, Phys. Rev. Lett. 88, 178501 (2002).
41. A. Corral, Phys. Rev. E 68, 035102(R) (2003).
42. J. Davidsen and C. Goltz, Geophys. Res. Lett. 31, L21612 (2004).
43. J. Davidsen and M. Paczuski, Phys. Rev. Lett. 94, 048501 (2005).
44. S. Abe and N. Suzuki, Europhys. Lett. 65, 581 (2004).
45. E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, Cambridge, 2003).
46.
The catalog is maintained by the Southern California Earthquake Data Center, and can be downloaded via the Internet at http://www.data.scec.org/ftp/catalogs/SCSN/.
47. Y. Y. Kagan in J. R. Minkel, Sci. Am. 286, 25 (2002).
ENERGY LANDSCAPES, SCALE-FREE NETWORKS AND APOLLONIAN PACKINGS
JONATHAN P. K. DOYE AND CLAIRE P. MASSEN
University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, United Kingdom
We review recent results on the topological properties of two spatial scale-free networks, the inherent structure and Apollonian networks. The similarities between these two types of network suggest an explanation for the scale-free character of the inherent structure networks, namely, that the energy landscape can be viewed as a fractal packing of basins of attraction.
1. Introduction
The potential energy as a function of the coordinates of all the atoms in a system defines a multi-dimensional surface that is commonly known as an energy landscape.1 Characterizing such energy landscapes has become an increasingly popular approach to study the behaviour of complex systems, such as the folding of a protein2 or the properties of supercooled liquids.3,4 The aim is to answer such questions as, what features of the energy landscape differentiate those polypeptides that are able to fold from those that get stuck in the morass of possible conformations, or those liquids that show super-Arrhenius dynamics ('fragile' liquids) from those that are merely Arrhenius ('strong' liquids). Such approaches have to be able to cope with the complexity of the potential energy landscape; for example, the number of minima is typically an exponential function of the number of atoms.5 One such approach is the inherent structure mapping pioneered by Stillinger and coworkers.6 In this mapping each point in configuration space is associated with the minimum obtained by following the steepest-descent pathway from that point. Thus, configuration space is partitioned into a set of basins of attraction surrounding the potential energy minima, as illustrated in Fig. 1. One of the original aims of this approach was to remove the vibrational motion from configurations generated in simulations of liquids to give a clearer picture of the underlying 'inherent structure', hence the common name for the mapping. Of more interest to us is that it breaks the energy landscape down into more manageable chunks, whose properties can be more easily established and understood. As an example of the utility of this approach, the classical partition function can be expressed as an integral over the whole of configuration space, but performing this integral (except numerically through, say, Monte Carlo) is nigh impossible, because
Figure 1. (a) A model two-dimensional potential energy surface, (b) the contour plot of this surface showing the ‘inherent structure’ division of the energy landscape into basins of attraction (the minima and transition states are represented by points and the basin boundaries by the thick lines), and (c) the representation of the landscape as a network.
of the complexity of the potential energy landscape. However, if this integral is divided up into separate integrals over each basin of attraction, analytical approximations to these individual integrals can easily be obtained by assuming that the basins can be modelled as a harmonic well surrounding the minimum at the centre of the basin. Calculation of an approximate partition function then just reduces to a characterization of the properties of the potential energy minima and their associated basins.7 As well as providing insights into the contributions of different regions of the energy landscape to the thermodynamics, quantitative accuracy can be obtained when account is taken of the anharmonicity of the basins.8 Similarly, an energy landscape perspective on the dynamics can be formulated in terms of the transitions between the basins of attraction. Except at sufficiently high temperature, a trajectory of a system can be represented as a series of episodes of vibrational motion within a basin, punctuated by occasional hopping between basins through a transition state valley.9 In a coarse-grained view that ignores the vibrational motion, the dynamics is a walk on a network where the nodes correspond to minima and there are edges between minima that are directly connected by a transition state.a An example of such an 'inherent structure' network is also illustrated in Fig. 1. Although there has been much work characterizing energy landscapes with the aim of gaining insights into particular systems, some of the fundamental properties of such landscapes, particularly those related to their global structure and organization, have received relatively little attention. For example, what is the nature of
a By a transition state we mean a stationary point on the potential energy landscape that has one eigendirection with negative curvature. The steepest-descent pathways from the transition state parallel and anti-parallel to this Hessian eigenvector then provide a unique definition of the two minima connected by this transition state.
Figure 2. The Apollonian packing of a circle, and the corresponding network for the central interstice between the initial disks after three generations of disks have been added.
the division of the energy landscape into basins of attraction, and does the inherent structure network have a universal form? In this chapter, we will be reviewing recent results that address exactly these questions. The system that we will be analysing is a series of small Lennard-Jones (LJ) clusters for which the complete inherent structure network can be found. Another approach to understanding the properties of complex systems that has received much attention recently is through an analysis of the system in terms of networks.10,11 The systems analysed in this way have spanned an impressive range of fields, including astrophysics,12 geophysics,13 information technology,14 biochemistry,15,16 ecology,17 and sociology.18 Initially, the focus was on relatively basic topological properties of these networks, such as the average separation between nodes and the clustering coefficient, to test whether they behaved like the Watts-Strogatz small-world networks,19 or the degree distributionb, to see if they could be classified as scale-free networks.20 To summarize our recent results, we found that the inherent structure networks associated with the LJ clusters behaved as fairly typical scale-free networks. However, the origins of most scale-free networks can be explained in terms of network growth models, where there is preferential attachment to nodes with high degree during network growth.20 By contrast, the inherent structure networks are static. They are determined just by the potential describing the interatomic interactions and the number of atoms in the system. So, why are they scale free? One of the important features of the inherent structure networks is their embedding in configuration space. There have been a number of model spatial scale-free networks proposed,24,25,26,27 but the ones on which we wish to focus are Apollonian networks.28,29 These networks are associated with Apollonian packings, an example of which is given in Fig. 2. To generate such a packing, one starts with a set of touching disks (or hyperspheres if one is interested in higher-dimensional packings), and then to each interstice in the packing, new disks are added that touch each disk surrounding the interstice. At each subsequent generation the same procedure of adding disks to the remaining interstices is applied. The complete space-filling packing is obtained by repeating this process ad infinitum. The Apollonian network is then the contact network between adjacent disks (Fig. 2). One of the reasons that the Apollonian network provides a useful comparison to the inherent structure networks is that spatial regions (the disks) are automatically associated with each node in the network, which is somewhat similar to the association of the basins of attraction with the minima on an energy landscape. Furthermore, in both networks edges are based on contacts between those spatial regions that are adjacent. As a consequence, for two-dimensional examples, both types of network are planar, that is, they can be represented on a plane without any edges crossing.30 This feature contrasts with the other model spatial scale-free networks.24,25,26,27 Therefore, in this chapter we will be comparing the properties of the inherent structure and Apollonian networks.
b In network parlance, the degree k is the number of connections to a node.
Figure 3. The scaling of the average separation between nodes and the clustering coefficient with network size for LJ clusters with 7 to 14 atoms, and two-dimensional Apollonian networks with increasing numbers of generations.
2. Comparing Apollonian and inherent structure networks
We were able to obtain the complete inherent structure networks for all LJ clusters with up to 14 atoms. For the largest cluster, the network had 4196 nodes and 87 219 edges. By contrast, the Apollonian networks have an infinite number of nodes. Therefore, to allow a comparison we consider finite Apollonian networks obtained by only considering the first t generations of disks. The comparison we make is usually between the LJ14 network and a two-dimensional Apollonian network with a similar number of nodes (in fact with t = 7 and 4376 nodes and 13 122 edges). One could argue that it would be more appropriate to compare to an Apollonian network with the same spatial dimension. However, the properties of the Apollonian networks are very similar irrespective of dimension, so we chose to use the two-dimensional example simply because the properties of this case have been most comprehensively worked out.
Figure 4. The cumulative degree distributions for the inherent structure and Apollonian networks.
To study the size dependence of the network properties, as in Fig. 3, we have to make a further choice. For the inherent structure networks we follow clusters with an increasing number of atoms, and hence an increasing dimension of configuration space. Again it could be argued that we should be comparing to an Apollonian network with a fixed number of generations but increasing dimension; however, the useful feature of examining a network of fixed dimension and increasing t instead is that the variable t behaves in a somewhat similar way to the number of atoms. For example, the number of nodes is an exponential function of t, whereas it only increases polynomially with the dimension of the system.29 As already mentioned, the number of minima is an exponential function of the number of atoms. From Fig. 3, one can see that both types of networks have small-world properties. Firstly, for both networks the average separation between nodes scales no more than logarithmically with system size, as for a random graph. The stronger sub-logarithmic behaviour for the inherent structure networks is because the average degree increases with network size (the random graph result is in fact l = log N / log⟨k⟩), whereas it is approximately constant for the Apollonian networks.
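For reference, the two-dimensional Apollonian network can be generated by the recursion described in the introduction: start from three mutually touching disks and, at each generation, place a new disk in every interstice and link it to the three disks around it. The sketch below follows this particular convention; the resulting node and edge counts depend on the construction details and differ from the t = 7 figures quoted above.

```python
def apollonian_network(t):
    """Contact network of a two-dimensional Apollonian packing after t
    generations, starting from three mutually touching disks (nodes 0-2)."""
    edges = {(0, 1), (0, 2), (1, 2)}
    nodes = [0, 1, 2]

    def subdivide(a, b, c, depth):
        """Place a new disk in the interstice of disks (a, b, c), link it
        to all three, then recurse into the three new interstices."""
        if depth == 0:
            return
        n = len(nodes)
        nodes.append(n)
        for x in (a, b, c):
            edges.add((min(x, n), max(x, n)))
        for tri in ((a, b, n), (a, c, n), (b, c, n)):
            subdivide(*tri, depth - 1)

    subdivide(0, 1, 2, t)
    return nodes, edges

nodes, edges = apollonian_network(3)
# With this convention, N(t) = 3 + (3^t - 1)/2 and E(t) = 3 + 3(3^t - 1)/2,
# and the degree of the first interior node doubles each generation (3, 6, 12, ...).
degree_of_first_interior = sum(1 for e in edges if 3 in e)
```

The exponential growth of the node count with t is what makes t play a role analogous to the number of atoms in the cluster comparison.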
The increase in ⟨k⟩ is simply because the ratio of the number of transition states to minima on a potential energy landscape is a linear function of the number of atoms.31 Secondly, the clustering coefficient, one measure of the local ordering within a network, has values that are significantly larger than for a random network. The size dependence of this property depends on how it is defined. If it is defined as the probability that any pair of nodes with a common neighbour are themselves connected (C1), then it decreases quite rapidly with size. The second definition (C2) is as the average of the local clustering coefficient, where the latter is defined as the probability that the neighbours of a particular node are themselves connected. The second definition gives more weight to the low-degree nodes that, as we shall see later, have a higher local clustering coefficient. That C2 tends to a constant value for the Apollonian network, rather than decaying weakly as for the inherent structure networks, reflects the stronger degree dependence of the local clustering coefficient.
Figure 5. The dependence of the degree of a node on the potential energy of the corresponding minimum for LJ14. The data points are for each individual minimum and the solid line is a binned average.
Both networks also have a power-law tail to their degree distribution, and so are scale-free networks. The exponent is slightly larger for the inherent structure networks (2.78 compared to 2.59). This heterogeneous degree distribution is easier to understand for the Apollonian network, and reflects the fractal nature of the packings.32 At each stage in the generation of the network, the degrees of the nodes double, i.e. new nodes preferentially connect to those with higher degree, and so the highest degree nodes correspond to those that are 'oldest' and have larger associated disks. For the inherent structure networks, the high-degree nodes correspond to minima with low potential energy (Fig. 5). Our rationale for this correlation between degree and potential energy is that the lower-energy minima have larger basin areas,33 and hence longer basin boundaries with more transition states on them. The scale-free character of these networks must reflect the hierarchical packing of these basins, with larger basins surrounded by smaller basins, which in turn are surrounded by smaller basins, and so on, in a manner somewhat similar to the Apollonian packing. Thus, the comparison of the inherent structure and Apollonian networks can provide some
Figure 6. The degree dependence of (a) the local clustering coefficient and (b) knn, the average degree of the neighbours of a node, for the inherent structure and Apollonian networks. Both lines represent the average values for a given k.
check of the plausibility of this potential origin of the scale-free behaviour of the inherent structure networks. Figure 6 shows that the two types of networks also behave similarly when we look at more detailed properties of the networks. Both have a local clustering coefficient that decreases strongly with increasing degree. For the Apollonian network, it is actually inversely proportional to the degree,29 a feature that has been previously seen for other deterministic scale-free networks and that has been interpreted in terms of a hierarchical structure to the network, whereas for the inherent structure networks the degree dependence is somewhat reduced at small k. This similar behaviour partly reflects the common spatial character of the networks. The smaller low-degree nodes have a more localized character and so their neighbours are more likely to be connected, whereas the larger high-degree nodes can connect nodes that are spatially distant from each other and so are less likely to be connected. The behaviour of c(k) also partly reflects the correlations evident in Fig. 6(b). Both networks are disassortative, that is, nodes are more likely to be connected to nodes with dissimilar degree. By contrast, for an uncorrelated network, knn(k) would be independent of degree. However, it is well known that disassortativity can arise for networks, as here, in which multiple edges and self-connections are not present.39 Indeed, for the inherent structure networks, knn(k) for a random network with the same degree distribution looks almost identical. An additional source of disassortativity is present in the Apollonian networks, because, except for the initial disks, there are no edges whatsoever between nodes with the same degree; disks created in the same generation all go in separate interstices in the structure and so cannot be connected.
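The network measures used above (the pooled clustering coefficient C1, the average local clustering C2, and the degree correlation knn(k)) can all be computed directly from adjacency sets; here is a minimal sketch on a made-up five-node graph, not a system from the chapter:

```python
from collections import defaultdict
from itertools import combinations

def clustering_coefficients(adj):
    """C1 (transitivity): fraction of all connected-neighbour pairs, pooled
    over the whole graph. C2: mean of the local clustering coefficients,
    averaged here over nodes with at least two neighbours."""
    pairs = closed = 0
    local = []
    for node, nbrs in adj.items():
        p = c = 0
        for u, v in combinations(sorted(nbrs), 2):
            p += 1
            c += v in adj[u]
        if p:
            pairs += p
            closed += c
            local.append(c / p)
    return closed / pairs, sum(local) / len(local)

def knn_of_k(adj):
    """k_nn(k): average, over nodes of degree k, of the mean degree of a
    node's neighbours; a decreasing k_nn(k) signals disassortativity."""
    deg = {n: len(nbrs) for n, nbrs in adj.items()}
    by_k = defaultdict(list)
    for node, nbrs in adj.items():
        if nbrs:
            by_k[deg[node]].append(sum(deg[v] for v in nbrs) / len(nbrs))
    return {k: sum(v) / len(v) for k, v in by_k.items()}

# A triangle with a two-edge path hanging off node 2 (made-up example).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
c1, c2 = clustering_coefficients(adj)
knn = knn_of_k(adj)
```

On this toy graph C2 exceeds C1 because the low-degree triangle nodes are perfectly clustered, the same weighting effect noted above for the two definitions.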
Therefore, that knn(k) for the two types of networks follow each other quite so closely is probably somewhat accidental. The behaviour seen for most of the network properties discussed so far is fairly common for scale-free networks. Therefore, a better test of the applicability of the Apollonian analogy to the energy landscape is to examine the spatial properties of
Figure 7. The degree dependence of the basin areas for the LJ14 energy landscape and disk areas for the Apollonian packing. Both lines represent the average values for a given k.
the two systems directly. For the inherent structure networks, in agreement with the suggestion made earlier, there is a strong correlation between the degree of a node and the hyperarea of the basin of attraction, similar to the degree dependence of the disk area seen for the Apollonian networks (Fig. 7). This result therefore implies that there is also a strong dependence of the basin area on the energy of a minimum, with the low-energy minima having the largest basins. It also provides strong evidence that the scale-free topology of the inherent structure networks reflects the heterogeneous distribution of basin areas. The distribution of disk areas for the Apollonian packing reflects its fractal character.32 It is in fact a power law40 with an exponent that depends upon the fractal dimension of the packing,41 as illustrated in Fig. 8. For high-dimensional packings this exponent tends to -2. Preliminary results suggest that there is a similar power-law distribution for the hyperareas of the basins of attraction on an energy landscape, confirming the deep similarity between these two types of system, and suggesting that configuration space is covered by a fractal packing of the basins of attraction.
3. Conclusion
In this chapter we have looked at some of the fundamental organizing principles of complex multi-dimensional energy landscapes. By viewing the landscapes as a network of minima that are linked by transition states, we have found that the topology of this network is scale-free. Unlike most scale-free networks, the origin of this topology must be static. We believe that it is driven by a very heterogeneous size distribution for the basins of attraction associated with the minima, with the large basins having many connections. In this paper, we have explored whether space-filling packings of disks and hyperspheres, such as the Apollonian packings, and their associated contact networks can provide a good model of how the energy
Figure 8. The cumulative distribution for disks with radius greater than r in a two-dimensional Apollonian packing.
landscape is organized. We have shown that these systems share a deep similarity both in the topological properties of the networks and the spatial properties of the packings. In fact, our results suggest that the energy landscape can be viewed as a fractal packing of basins of attraction. Although this conclusion can provide an explanation for the scale-free topology of the inherent structure network, it itself demands an explanation. Why are the basins of attraction organized in this fractal manner? We will explore this in future work.
References
1. D. J. Wales, Energy Landscapes (Cambridge University Press, Cambridge, 2003).
2. J. D. Bryngelson, J. N. Onuchic, N. D. Socci and P. G. Wolynes, Proteins 21, 167 (1995).
3. F. H. Stillinger, Science 267, 1935 (1995).
4. P. G. Debenedetti and F. H. Stillinger, Nature 410, 259 (2001).
5. F. H. Stillinger, Phys. Rev. E 59, 48 (1999).
6. F. H. Stillinger and T. A. Weber, Science 225, 983 (1984).
7. D. J. Wales, Mol. Phys. 78, 151 (1993).
8. F. Calvo, J. P. K. Doye and D. J. Wales, J. Chem. Phys. 115, 9627 (2001).
9. M. Goldstein, J. Chem. Phys. 51, 3728 (1969).
10. M. E. J. Newman, SIAM Rev. 45, 167 (2003).
11. R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
12. D. Hughes, M. Paczuski, R. O. Dendy, P. Helander and K. G. McClements, Phys. Rev. Lett. 90, 131101 (2003).
13. M. Baiesi and M. Paczuski, Phys. Rev. E 69, 066106 (2004).
14. R. Albert, H. Jeong and A.-L. Barabási, Nature 401, 130 (1999).
15. H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai and A.-L. Barabási, Nature 407, 651 (2000).
16. H. Jeong, S. Mason, A.-L. Barabási and Z. N. Oltvai, Nature 411, 41 (2001).
17. J. A. Dunne, R. J. Williams and N. D. Martinez, Proc. Natl. Acad. Sci. USA 99, 12917 (2002).
384 18. F. Liljeros, C. R. Edling, L. A. N. Amaral, H. E. Stanley and Y. Aberg, Nature 411, 907 (2001). 19. D. J. Watts and S. H. Strogatz, Nature 393,440 (1998). 20. A. L. BarabLi and R. Albert, Science 286,509 (1999). 21. J. P. K. Doye, Phys. Rev. Lett. 88,238701 (2002). 22. J. P. K. Doye and C. P. Massen, J. Chem. Phys. 122, in press (2005); condmat/0411144. 23. C. P. Massen and J. P. K. Doye, cond-mat/0412469. 24. A. F. Rozenfeld, R. Cohen, D. ben Avraham and S. Havlin, Phys. Rev. Lett. 89, 218701 (2002). 25. C. P. Warren, L. M. Sander and I. M. Sokolov, Phys. Rev. E 66,056105 (2002). 26. D. ben Avraham, A. F. Rozenfeld, R. Cohen and S. Havlin, Physica A 330, 107 (2003). 27. C. Herrmann, M. Barthelemy and P. Provero, Phys. Rev. E 68,026128 (2003). 28. J. S. Andrade, H. J. Herrmann, R. F. S. Andrade and L. R. d a Silva, condmat/0406295. 29. J. P. K. Doye and C. P. Massen, Phys. Rev. E 71,in press (2005); cond-mat/0407779. 30. T. Aste, T. Di Matteo and S. T. Hyde, Physica A 346,20 (2005). 31. J. P. K. Doye and D. J. Wales, J. Chem. Phys. 116,3777 (2002). 32. B. B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, New York (1983). 33. J. P. K. Doye, D. J. Wales and M. A. Miller, J. Chem. Phys. 109,8143 (1998). 34. S. N. Dorogovtsev, A. V. Goltsev and J. F. F. Mendes, Phys. Rev. E 65, 066122 (2002). 35. E. Ravasz and A. L. BarabLi, Phys. Rev. E 67,026112 (2003). 36. F. Comellas, G. Fertin and A. Raspaud, Phys. Rev. E 69,037104 (2004). 37. A. L. BarabLi, Nature Reviews Genetics 5, 101 (2004). 38. S. N. Soffer and A. Vbquez, cond-mat/0409686. 39. J. Park and M. E. J. Newman, Phys. Rev. E 68,026112 (2003). 40. Z. A. Melzak, Math. Comput. 16,838 (1966). 41. S. S. Manna and H. J. Herrmann, J. Phys. A 24,L481 (1991).
EPIDEMIC MODELING AND COMPLEX REALITIES
MARC BARTHÉLEMY, ALAIN BARRAT, VITTORIA COLIZZA, ALESSANDRO VESPIGNANI
School of Informatics and Biocomplexity Institute, Indiana University, Bloomington, IN, USA
Laboratoire de Physique Théorique (UMR du CNRS 8687), Bâtiment 210, Université de Paris-Sud, 91405 Orsay, France
Informatics tools have recently made it possible to achieve an unprecedented static and dynamical picture of our society, providing increasing evidence for the presence of complex features and emerging properties at various levels of description. We present here a brief overview of how epidemic modelling is affected by the complexity characterizing the structure and behavior of real world societies, and of the new opportunities offered by a coherent inclusion of complex features in the understanding of disease spreading.
1. Introduction
The mathematical modelling of epidemics is a very active field of research that crosses different disciplines. Epidemiologists, computer scientists and social scientists share a common interest in studying spreading phenomena and rely on very similar models for the description of the diffusion of viruses, knowledge and innovation. In particular, understanding and predicting an epidemic outbreak requires a detailed knowledge of the contact networks defining the interactions of the population at various scales, ranging from individual interactions to traveling patterns. Thanks to the development of new computational capabilities, a variety of large-scale data on social networks have become available and amenable to scientific analysis. This has led to the accumulation of ample evidence for the presence of complex and heterogeneous properties of many evolving networks, some of them being of great interest for the spreading of epidemics. A central result is that some networks are characterized by complex topologies and very heterogeneous structures. A striking example of this situation is provided by scale-free networks, characterized by large fluctuations in the number of connections (degree) k of each vertex. This feature usually finds its signature in a heavy-tailed degree distribution with power-law behavior of the form P(k) ~ k^(-γ), with 2 ≤ γ ≤ 3, that implies a non-vanishing probability of finding vertices with very large degrees. Similar heterogeneities and heavy-tailed distributions are observed also for the intensity of connections and various attributes of the elements of the systems (size, activity, etc.).
Epidemic modelling has developed an impressive array of methods and approaches aimed at describing various spreading phenomena, as well as incorporating many details affecting the spreading of real pathogens. The availability of unprecedented computer power has also led to the development of novel simulation tools, relying on agent-based modelling, that can recreate the dynamics of an entire population at the scale of the single individual on an almost second-by-second basis (e.g., the study of individual movement in Portland, Oregon 9). While extremely powerful, however, these numerical tools are often not transparent, making it extremely difficult, if possible at all, to discriminate the impact of any given modelling assumption or ingredient. In addition, complexity is not the same as the merely complicated elements accounted for in sophisticated epidemic modelling. Complex properties often imply a virtually infinite heterogeneity of the system and large-scale fluctuations extending over several orders of magnitude, generally corresponding to the breakdown of standard theoretical frameworks and models. This is for instance the case of epidemic spreading in scale-free networks, where the lack of any intrinsic epidemic threshold generates a peculiar scenario with implications for immunization and containment policies. It is thus understood that the theoretical framework for epidemic spreading has to be widened with opportune models and methods dealing with the intrinsic complexity encountered in many real situations. While the basic models are expected to remain intact, there is a need to investigate in a systematic way the impact of the various network characteristics on the basic features of epidemic spreading. In this article we present an overview of the various instances in which we face complex features in the analysis of systems relevant to the modeling of epidemic phenomena.
We also aim to show where complex features emerge and how they might be important in the context of epidemic modeling. We will present recent results concerning the topology and structure of several contact networks, ranging from the individual scale to the global transportation flows. Using some specific examples, we show how plugging complex features into epidemic modeling enables one to obtain new interpretative frameworks for the behavior of disease spreading and to provide a quantitative rationalization of general features observed in the global spread of emergent diseases. Finally, we outline some possible strategies for integrating the various complex features observed in real systems in multi-scale compartmental models of disease spreading.

2. The spreading of epidemics
It is first necessary to distinguish etiology, which is the study of the causes or origins of diseases, from epidemiology, which is the discipline studying the incidence, distribution and control of diseases in a population. Epidemiology thus naturally has to deal with the inherent complexity of individual interactions in a large population. International travel and commerce have been clearly identified as among the important factors influencing the emergence and spread of infectious diseases.
Figure 1. Map of the movement of the Black Death in the 14th century. Colored lines represent the time evolution of the epidemic front propagating throughout Europe, from the start of the infection (Dec. 1347, in black) to the time when it reached northern Europe (Dec. 1350, in light gray).
In particular, this can be seen in the way epidemic spreading changed dramatically after the industrial period and the development of modern transportation systems. A very famous example of a pre-industrial outbreak for which it is possible to obtain extensive historical data is provided by the spread of the so-called Black Death, a bubonic plague, which reached a peak during the 14th century. At that time, humans had few means of travel. Long-range traveling was sporadic, and it is then possible to consider that infected individuals diffused smoothly, generating an epidemic front that travels as a continuous wave through geographical regions (see for example 10). Historical studies confirm that the propagation indeed followed such a simple scheme, in which the spatio-temporal pattern of the propagation was dominated by spatial diffusion. In particular, the Black Death spread through Europe from south to north (see Fig. 1), and the invasion front moved at an approximate velocity of 200-400 miles/year 10. In this case modeling approaches rely on the generalization of basic epidemic models in which space variables and diffusive terms are introduced. This strategy leads to good approximations that are still very useful for epizootic waves (animal infections) and epidemic spread in limited regions (for a recent model which takes into account rare long-range jumps, see 11).
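The wave-like, pre-industrial regime can be illustrated with the classic Fisher-KPP reaction-diffusion equation, u_t = D u_xx + λu(1−u), whose fronts travel at the well-known speed 2√(Dλ). The sketch below (all parameter values are arbitrary illustrative choices, not fitted to the Black Death data) integrates the equation with a simple explicit finite-difference scheme and measures the front speed:

```python
# 1D Fisher-KPP front: u_t = D u_xx + lam * u * (1 - u).
D, lam = 1.0, 1.0
dx, dt = 0.5, 0.05            # explicit scheme; stable since D*dt/dx^2 = 0.2 < 0.5
nx = 400
u = [1.0 if i * dx < 5.0 else 0.0 for i in range(nx)]  # infected region on the left

def front_position(u):
    # Rightmost grid point where the infected fraction exceeds one half.
    pos = 0.0
    for i, v in enumerate(u):
        if v > 0.5:
            pos = i * dx
    return pos

positions = {}
for step in range(1, 1201):
    lap = [0.0] * nx
    for i in range(1, nx - 1):
        lap[i] = (u[i - 1] - 2 * u[i] + u[i + 1]) / dx**2
    u = [u[i] + dt * (D * lap[i] + lam * u[i] * (1 - u[i])) for i in range(nx)]
    if step in (600, 1200):            # snapshots at t = 30 and t = 60
        positions[step] = front_position(u)

speed = (positions[1200] - positions[600]) / 30.0
print(round(speed, 2))                 # close to the theoretical 2*sqrt(D*lam) = 2
```

The measured speed approaches 2√(Dλ) from below as the front relaxes, which is the continuous-wave picture that breaks down once long-range travel dominates.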
This picture is turned upside down in our modern societies, where human traveling fluxes are extremely relevant and transportation systems, especially the airline system, make it possible to travel long distances in a very short time. A striking example of the new emerging scenarios is provided by the recent SARS epidemics 12. In Fig. 2 we report the spatio-temporal pattern of the global SARS spread. As can be seen in this figure, while a certain degree of spatial diffusion is present within countries in south-east Asia, we face a very rapid and patchy structure of the main diffusion, with distant outbreaks in Canada and in continental Europe. The large-scale geographical evolution of several diseases is determined mainly by the transportation infrastructures that regulate the traveling of individuals and goods. In particular, the air-transportation network clearly represents a major channel of epidemic diffusion. This network is highly heterogeneous both in its connectivity pattern and in its traffic capacities 13,14, and can naturally give rise to a patchy evolution of epidemics that might have outbreaks in geographical locations very far apart. Any attempt to understand the large-scale geographic insurgence of global epidemics or pandemics should therefore appropriately take into account the role of the airport network in the dynamic evolution. Understanding this type of spatio-temporal pattern and computing the probabilities of disease outbreaks and of their time lags are challenges that modern epidemiology has to face now. Modern epidemiology thus faces a new challenge in integrating the complexity of human social contact patterns and flows into the description of the spread of a contagious disease.
Figure 2. Map of the global movement of SARS. The gray scale corresponds to the time evolution of the epidemic spread, ranging from black for the start of the infection in China in mid-November 2003, to light gray for the last countries which have reported an infected case, as documented by the WHO 12. The map clearly shows that despite the existence of a spatial component in the SARS spread, long-range flights are relevant.
3. Complex networks and epidemiology
Epidemic spreading is inherently heterogeneous, involving transmission processes that span several time and length scales. From the coarse perspective of population flows occurring through various transportation networks to the fine details of population structure (age/gender, households, etc.), it is understood that there is no one-size-fits-all social network that might, even approximately, serve as the prototypical substrate for epidemic modeling. The identification and categorization of the various relevant contact networks, and of the connections among the different scales present in the process of interest, therefore represent a major issue for obtaining a more comprehensive approach to epidemic modeling. In the last decade, thanks to the widespread use of computer-based recording, a variety of large-scale data on social networks have become available and amenable to scientific analysis. This has led to the accumulation of ample evidence for the presence of complex and heterogeneous properties in social networks, transportation infrastructure, commuting patterns and other intra- and inter-city population flows. In this context, network representations operate at different scales, so that different attributes for nodes and links, and different observables depending on the representative granularity, are considered.

(i) The individual scale
In general it is rather difficult to get accurate data on the interactions between individuals. Several studies, however, have focused on networks of individuals, and the role of contact patterns has long been acknowledged as a relevant factor in determining the properties of epidemic spreading phenomena 15,16. Indeed, the nodes with the largest connectivity, the hubs of the network, usually called "superspreaders" 15, have been pointed to by epidemiologists as responsible for the proliferation of infected individuals.
Moreover, recent studies provided evidence for heavy-tailed distributions of individuals' contacts, as in the case of the web of sexual interactions in different countries and populations 5,6. While these results must be complemented with concurrency data and timing patterns, the empirical evidence suggests the emergence of features deviating from the regular Poissonian paradigms well suited for mean-field approaches, i.e. the so-called homogeneous mixing approaches. At the same time, depending on the specific infection mechanism, different social networks have to be considered and their topology must be carefully scrutinized. For instance, airborne viruses like influenza require spatial proximity, and the contact pattern is determined by the visitation of and permanence in public spaces as well as by the household structure.
(ii) The urban scale
The recent explosive trend in urbanization shows that the majority of the world population lives in urban areas 17, and therefore the movement of individuals between city locations is one of the most defining elements of contact networks relevant to
disease spreading 18. This fact, added to the rapid and uncontrolled spread of diseases in highly dense and populated areas, emphasizes the need for a good characterization and modeling of human flows and interactions at this level. A characterization of the human flow at the urban level was recently conducted by the Los Alamos group 9. This study, conducted on the network of locations in the city of Portland, Oregon, displays some interesting features. The nodes are locations in the city, including homes, offices, shops and recreational areas. The links represent the flow of individuals going at a certain time from one location to another. Chowell et al. found that this weighted directed network is scale-free, with a broad distribution of traffic (strength) 9. Strong heterogeneities are thus present in at least two ways at this scale: in the number of possible locations one can reach from a given one, and in the number of individuals actually going from a particular location to another. The network identified by the pattern of commuting people among nearby cities is also very important. For instance, a recent study 19 conducted on the network of cities on the island of Sardinia (Italy) showed that even if the topology of the network is rather homogeneous, the traffic is broadly distributed, demonstrating that strong fluctuations exist also at this scale and have to be taken into account.

(iii) The global scale: the airport network
On a global scale, epidemic spreading has been dramatically changed by modern travel infrastructures. The various transportation networks (road, rail, air) all contribute, with different fluxes and timing, to the spread of an epidemic. Among these networks, the air travel network nowadays has the lion's share in connecting very distant cities on a short time scale with a considerable flow of travellers. Its role in the global spreading of epidemics is therefore easily perceived.
Very recently, several studies have provided extensive analysis of the complete world-wide airport network and the relative traffic flows 14. The air transportation system can be represented as a weighted graph comprising N = 3880 vertices denoting airports and E = 18810 weighted edges whose weight w_ij accounts for the passenger flow between the airports i and j. The obtained network is highly heterogeneous both in the connectivity pattern and the traffic capacities 13,14. The probability distributions that any airport has k connections (degree) to other airports and handles a number T of passengers (traffic) exhibit heavy tails and very large statistical fluctuations. Analogously, the probability that any connection carries a traffic w is skewed and heavy-tailed. These data can also be enriched by the urban area population served by each airport, which has a clear heavy-tailed distribution, as known since the seminal results of Zipf 20. Furthermore, these quantities appear to have non-linear associations among them. This is clearly shown by the behavior relating the traffic T handled by each airport to the corresponding number of connections k, which follows the non-linear form T ~ k^β with β ≈ 1.5 14. Analogously, the city population n and the traffic T handled by the corresponding airport follow the non-linear relation n ~ T^α with α ≈ 0.5. It is clear that any theoretical understanding of the global spread of emergent diseases cannot avoid the full account of these complex properties.

4. Implications of network structure for epidemic dynamics
Accurate mathematical models of epidemic spreading are the basic conceptual tools for understanding the spread of a disease and the potential impact of effective strategies for epidemic control and containment 16. Recently, great progress has been made in the understanding of disease spread through a wide array of networks with complex topological properties, including networks with complex features such as heavy-tailed distributions and small-world properties. In particular, it was shown that networks with highly heterogeneous contact patterns, characterized by virtually unbounded fluctuations of the degree distribution, eventually induce the absence of any epidemic threshold below which the infection cannot initiate a major outbreak 21,22. In other words, the epidemic threshold above which the epidemic can spread is zero in the limit of an infinitely large network. If the network is finite, this threshold λ_c is given by the ratio of the first and second moments of the degree distribution 21, λ_c = ⟨k⟩/⟨k²⟩,
where the moments of the connectivity distribution are given by ⟨k^m⟩ = ∫ k^m P(k) dk. This new scenario is of practical interest in computer virus diffusion and the spreading of diseases in heterogeneous populations. It also raises new questions on how to protect the network and find optimal strategies for the deployment of immunization resources 23,24,25. Analogously, the time evolution of epidemic outbreaks in complex networks with highly heterogeneous connectivity patterns is heavily affected by the contact properties of the underlying network 26. Also in this case, the growth of the number of infected individuals is governed by a time scale τ proportional to the ratio between the first and second moments of the network's degree distribution, τ ∝ ⟨k⟩/⟨k²⟩.
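The vanishing of the finite-size threshold can be checked directly. The sketch below evaluates λ_c = ⟨k⟩/⟨k²⟩ for a normalized power-law degree distribution P(k) ∝ k^(-γ) with γ = 2.5 and an increasing degree cutoff k_max (all parameter values are illustrative choices):

```python
def lambda_c(gamma, kmin, kmax):
    # Finite-size threshold lambda_c = <k>/<k^2> for P(k) ~ k^-gamma, k = kmin..kmax.
    ks = range(kmin, kmax + 1)
    norm = sum(k ** -gamma for k in ks)
    k1 = sum(k ** (1.0 - gamma) for k in ks) / norm   # first moment <k>
    k2 = sum(k ** (2.0 - gamma) for k in ks) / norm   # second moment <k^2>
    return k1 / k2

for kmax in (10**2, 10**4, 10**6):
    print(kmax, lambda_c(2.5, 2, kmax))
```

For 2 < γ < 3 the second moment grows with the cutoff while the first moment stays finite, so λ_c shrinks roughly as k_max^(γ-3) and vanishes in the infinite-network limit.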
This result implies an instantaneous rise of the prevalence in very heterogeneous networks, where ⟨k²⟩ → ∞ in the infinite-size limit. In particular, this result shows that scale-free networks with 2 ≤ γ ≤ 3 exhibit, along with the lack of an intrinsic epidemic threshold, a virtually infinite propagation velocity of the infection. Furthermore, the detailed propagation in time of the infection through the different degree classes in the population can be studied (see Fig. 3). The result is a striking hierarchical dynamics in which the infection propagates via a cascade that progresses from higher to lower degree classes. First the infection takes control of the large-degree vertices in the network. Then it rapidly invades the network via a cascade through progressively smaller degree
Figure 3. Average connectivity of newly infected nodes versus time. In a first phase, the infection reaches the hubs and then experiences a “cascade” towards smaller degrees 26.
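The cascade of Fig. 3 is easy to reproduce in a toy simulation: run discrete-time SI dynamics on a synthetic scale-free graph and record the average degree of the nodes infected at each step. Everything below (graph size, attachment parameter, infection probability) is an illustrative assumption, not the model of Ref. 26:

```python
import random

random.seed(42)

def ba_graph(n, m=3):
    # Minimal Barabasi-Albert preferential-attachment graph as an adjacency dict.
    adj = {i: set() for i in range(n)}
    targets = list(range(m))     # first new node attaches to the initial core
    repeated = []                # node list in which each node appears deg(node) times
    for v in range(m, n):
        for t in set(targets):
            adj[v].add(t)
            adj[t].add(v)
        repeated.extend(adj[v])
        repeated.extend([v] * len(adj[v]))
        targets = [random.choice(repeated) for _ in range(m)]
    return adj

adj = ba_graph(2000)

# Discrete-time SI dynamics: each infected node independently infects each
# susceptible neighbour with probability beta per time step.
beta = 0.05
infected = {0}
mean_deg_new = []                # average degree of the newly infected, per step
while len(infected) < len(adj):
    new = set()
    for v in infected:
        for w in adj[v]:
            if w not in infected and random.random() < beta:
                new.add(w)
    if new:
        mean_deg_new.append(sum(len(adj[w]) for w in new) / len(new))
        infected |= new

print(mean_deg_new[0], mean_deg_new[-1])
```

Because hubs have many routes of exposure, they are reached early; the per-step average degree of newly infected nodes then decays toward the minimum degree, reproducing the hub-to-periphery hierarchy.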
classes. The dynamical structure of the spreading is therefore characterized by a hierarchical cascade from the hubs to intermediate-k and finally to small-k classes. This infection hierarchy, along with the very fast growth rate of epidemic outbreaks, could be of practical importance in the set-up of dynamic control strategies in populations with heterogeneous connectivity patterns. In particular, targeted immunization strategies that evolve with time might be particularly effective in epidemic control. While it is clear that the heterogeneity of the network topology in which the disease spreads may have noticeable effects on the evolution of the epidemic as well as on the effect of immunization strategies, several aspects of the problem have yet to be fully considered. Among others, the inclusion of more refined descriptions, in which the population is compartmentalized both by degree class and by social typology, is at an early stage. Similarly, the large-scale heterogeneity of the weight and timing features of the contact patterns has yet to be fully explored. Finally, as we will see in the next section, models for the coarse-grained description of disease transmission in transportation networks must be reconsidered in the case of very heterogeneous networks and in the presence of traffic-topology correlations.

5. Modelling the global spread of diseases

In the '80s, several models for the global spread of disease were studied using very partial information on the transportation networks. The present availability of extensive data on these networks and their detailed mathematical characterization motivates the reconsideration of global spread models, both in their accuracy and in their basic theoretical properties. The large fluctuations in the topology of the network might affect the behavior of the epidemic spread, so that the large-scale heterogeneities must be taken into account in a coherent description of global disease
spreading. A first step in this direction can be made by generalising models 27,28,29 used to describe how air travel affects the propagation of emerging diseases. For the sake of simplicity, we will consider here the basic compartmentalization of the susceptible-infected (SI) model, in which each individual of the population can exist only in one of the discrete states, susceptible (S) or infected (I). The main idea is to split into two parts the contribution to the increase of the number I_j of infected individuals in each city j, by writing compartmental equations for the evolution of infected individuals that read as

dI_j/dt = K_j + Ω_j.
The first term, K_j, represents the spread of the disease inside city j. The second term, Ω_j, represents the travelling and depends on the probability p_ℓj that an infected individual departs from city ℓ and arrives at city j. The travel term can simply be written as the balance between incoming and outgoing individuals,

Ω_j = Σ_ℓ (p_ℓj I_ℓ − p_jℓ I_j).     (4)
If we are interested in air travel, the probabilities p_jℓ are given by the ratio of the passenger flux w_jℓ to the city population N_j. Different choices can be made for the infection dynamics inside the city. In the simplest homogeneous mixing approximation, one can consider for the SI model the expression

K_j = λ I_j S_j / N_j,
where λ is the disease transmission rate from infected to susceptible individuals (S_j = N_j − I_j, where N_j is the population of city j). Obviously the model may incorporate additional details, such as networks within networks, the spread inside the city using a contact network, and different infection dynamics. The evolution of the disease is obtained by integrating the system of coupled differential equations describing all the cities considered. Present technical and computational capabilities make it possible to scale the model so as to take into account the detailed features of the air transportation network, which consists of almost 4000 cities and 20000 connections, with the relative traffic flows and census data for the population of the corresponding urban areas. In Fig. 4 we report the visualization of the US behavioral pattern of the simple SI model for an epidemic starting in Hong Kong. Even for the very simple case of the SI dynamics, it is possible to recover a very interesting and heterogeneous pattern whose relation with the underlying network heterogeneities is under study. In particular, it is possible to show that the topology of the airport network, the distribution of the number of passengers and the populations of cities are all reflected in the emerging disease pattern 30. The above framework is flexible enough to allow the inclusion of other transportation networks in the description of the disease spreading, so that the infection dynamics can be considered at very different levels of accuracy. It is worth remarking that in some intermediate situations, numerical approaches can be pushed to
Figure 4. Four different stages of the time evolution of a global epidemic spread and its effect in North America. The color code represents the level of infection in each state, ranging from red for a percentage of infected individuals equal to 100% to light yellow for the absence of infection. The complex spatio-temporal pattern is a signature of different complexities at different levels (populations, traffic, etc.).
agent-based modelling approaches that recreate entire populations and their dynamics at the scale of the single individual on an almost second-by-second basis. In general, however, multi-scale models are a more viable solution, still transparent enough to provide analytical understanding and to yield a rationale for the interplay of network complexity and spreading pattern. Along this line, it is then possible to envision that global spreading models that take into account the complete networks and their heterogeneities might result in a powerful and more precise forecasting computational method.
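A minimal numerical sketch of the travel-coupled SI equations above, for a hypothetical three-city system (populations, passenger fluxes, the transmission rate, the 40-day horizon, and the plain Euler integration are all illustrative assumptions, not the full ~4000-city model):

```python
# Hypothetical city populations N_j and symmetric passenger fluxes w[j][l] per day.
N = [8.0e6, 2.0e6, 5.0e5]
w = [[0.0, 2.0e4, 5.0e3],
     [2.0e4, 0.0, 1.0e3],
     [5.0e3, 1.0e3, 0.0]]
lam = 0.4                                   # SI transmission rate (per day)

# Travel probabilities p[j][l] = w[j][l] / N_j.
p = [[w[j][l] / N[j] for l in range(3)] for j in range(3)]

I = [10.0, 0.0, 0.0]                        # outbreak seeded in city 0
dt = 0.01                                   # Euler time step (days)
for _ in range(int(40 / dt)):               # integrate 40 days
    # Local homogeneous-mixing SI term: K_j = lam * I_j * S_j / N_j.
    K = [lam * I[j] * (N[j] - I[j]) / N[j] for j in range(3)]
    # Travel term: Omega_j = sum_l (p[l][j] I_l - p[j][l] I_j).
    Om = [sum(p[l][j] * I[l] - p[j][l] * I[j] for l in range(3))
          for j in range(3)]
    I = [I[j] + dt * (K[j] + Om[j]) for j in range(3)]

print([round(I[j] / N[j], 3) for j in range(3)])  # infected fractions per city
```

The seeded city saturates first while the others lag by roughly the time needed for travel to export the first infected individuals, giving the delayed, patchy invasion pattern discussed in the text.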
6. Outlook

In this brief review we have tried to report on the various instances in which a proper account of the complex properties of the real world appears to be crucial in epidemiological studies. The age and social structure of the population, the contact network among individuals, and meta-population characteristics such as geography are all factors that acquire a particular relevance in a reliable epidemic model. The presence of emergent phenomena and complex properties in the characterization of these factors represents a major challenge that usually defines new conceptual frameworks and the need for a different modelling perspective. We have shown that progress has been made in the last years by combining data analysis and modelling techniques. Many areas, however, are still to be explored.
Complex properties appear in timing patterns and weighted network representations, and non-linear associations between dynamical properties and contact network topology point to non-trivial feedback mechanisms among the various properties. All these factors represent a serious modelling and conceptual challenge in modern epidemiology. The understanding of the role played by different features of complex networks in the evolution of epidemic spreading has, however, a very high payoff. It yields more transparent approaches to disease evolution in which the various approximations and assumptions can be discriminated on a theoretical basis. This will help considerably in building theoretical approaches that connect different levels of description and provide meaningful multi-scale approaches to epidemic modelling.
Acknowledgements
M.B. is on leave of absence from CEA-Centre d'Etudes de Bruyères-le-Châtel, Département de Physique Théorique et Appliquée, France. A.B. and A.V. are partially funded by the European Commission - contract 001907 (DELIS).

References
1. R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2000).
2. S.N. Dorogovtsev and J.F.F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, Oxford, 2003).
3. R. Pastor-Satorras and A. Vespignani, Evolution and Structure of the Internet: A Statistical Physics Approach (Cambridge University Press, Cambridge, 2003).
4. L.A.N. Amaral, A. Scala, M. Barthélemy, and H.E. Stanley, Proc. Natl. Acad. Sci. USA 97, 11149 (2000).
5. F. Liljeros, C.R. Edling, L.A.N. Amaral, H.E. Stanley, and Y. Aberg, Nature 411, 907 (2001).
6. A. Schneeberger, C.H. Mercer, S.A. Gregson, N.M. Ferguson, C.A. Nyamukapa, R.M. Anderson, A.M. Johnson, and G.P. Garnett, Sex. Transm. Dis. 31, 380-387 (2004).
7. N.M. Ferguson et al., Nature 425, 681-685 (2003).
8. M.L. Cohen, Nature 406, 762-767 (2000).
9. G. Chowell, J.M. Hyman, S. Eubank, and M.A. Castillo-Garsow, Phys. Rev. E 68, 066102 (2003).
10. J.D. Murray, Mathematical Biology, 2nd ed. (Springer, New York, 1993).
11. M.J. Keeling et al., Science 294, 813-817 (2001).
12. http://www.who.int/csr/sars/en
13. R. Guimerà and L.A.N. Amaral, Eur. Phys. J. B 38, 381-385 (2004).
14. A. Barrat, M. Barthélemy, R. Pastor-Satorras, and A. Vespignani, Proc. Natl. Acad. Sci. USA 101, 3747 (2004).
15. H.W. Hethcote and J.A. Yorke, Lect. Notes Biomath. 56, 1 (1984).
16. R.M. Anderson and R.M. May, Infectious Diseases in Humans (Oxford University Press, Oxford, 1992).
17. E. Zwingle, Megacities, Natl. Geogr. Mag. 202, 70-99 (2002).
18. S. Eubank, H. Guclu, V.S. Anil Kumar, M.V. Marathe, A. Srinivasan, Z. Toroczkai, and N. Wang, Nature 429, 180-184 (2004).
19. A. de Montis, M. Barthélemy, A. Chessa, and A. Vespignani, submitted to Env. Plan. J. B (2005).
20. G.K. Zipf, Human Behavior and the Principle of Least Effort (Addison-Wesley, 1949).
21. R. Pastor-Satorras and A. Vespignani, Phys. Rev. Lett. 86, 3200 (2001).
22. A.L. Lloyd and R.M. May, Science 292, 1316 (2001).
23. R. Pastor-Satorras and A. Vespignani, Phys. Rev. E 63, 036104 (2002).
24. Z. Dezso and A.-L. Barabási, Phys. Rev. E 65, 055103 (2002).
25. R. Cohen, S. Havlin, and D. ben-Avraham, Phys. Rev. Lett. 91, 247901 (2003).
26. M. Barthélemy, A. Barrat, R. Pastor-Satorras, and A. Vespignani, Phys. Rev. Lett. 92, 178701 (2004).
27. L.A. Rvachev and I.M. Longini, Mathematical Biosciences 75, 3-22 (1985).
28. A. Flahault and A.-J. Valleron, Math. Pop. Studies 3, 1-11 (1991).
29. L. Hufnagel, D. Brockmann, and T. Geisel, Proc. Natl. Acad. Sci. USA 101, 15124-15129 (2004).
30. M. Barthélemy, A. Barrat, V. Colizza, and A. Vespignani, The large scale spreading of infectious diseases in the air-transportation network: a full fledged approach, preprint (2005).
THE IMPORTANCE OF BEING CENTRAL
P. CRUCITTI
Scuola Superiore di Catania, Via S. Paolo 73, 95123 Catania, Italy
E-mail: pacrucitti@ssc.unict.it
V. LATORA
Dipartimento di Fisica e Astronomia, Università di Catania, and INFN Sezione di Catania, Via S. Sofia 64, 95123 Catania, Italy
E-mail: latora@ct.infn.it
Centrality measures are of fundamental importance in network analysis, with the main purpose of identifying the important components of a network and analyzing the distribution of resources. Here we review some of the most commonly used centrality measures available on the market. We focus, in particular, on the information centrality, a recently proposed measure, showing some of the applications that have been investigated so far in social, infrastructure, biological and urban networks.
1. Introduction

Centrality is a basic concept in social network analysis 1,2. The idea of centrality was first applied to human communication in the late 1940's by Bavelas 3,4, who was interested in the characterization of communication in small groups of people, and investigated the relation between structural centrality and influence in group processes. Bavelas was the first to argue that the good performance of certain members of a group, in terms of finding problem solutions, personal satisfaction and perception of leadership, is often related to a strategic location within the network. Or, in a few words, good location means power. Since then, various measures of structural centrality have been proposed over the years to quantify the importance of an individual in a social network, and several authors have compared the performance of the different measures, on real data, on simulated data, or on both. Applications have covered different topics in social networks, such as problem solving or policy making groups, power in organizations, collaboration networks, status in animal networks, and the spread of epidemics in populations. In addition to social systems, the idea of centrality has been applied to the analysis of urban and territorial cases, especially to transportation/land-use planning and economic geography. An early example is the graph theoretic approach to historical geography proposed by Pitts 5: a study of correlations between the differential growth of cities in 12-13th century Russia,
and their positions on the river trade network. Subsequently, a rather consistent application of the network approach to urban design has been developed under the notion of Space Syntax 6,7,8, establishing a significant correlation between the topological accessibility of streets and phenomena as diverse as their popularity (pedestrian and vehicular flows), human way-finding, safety against micro-criminality, micro-economic vitality and social liveability. More recently the issue of centrality has attracted the attention of physicists 9,10, who have extended its applications from the domain of social systems (see Ref. 11 for an application to scientific collaboration networks, and Refs. 12,13,14 for sexual contacts networks) to the realm of biological (see, for instance, Refs. 15,16,17,18,19) and technological networks (see Refs. 20,21,22,23,24,25).
The aim of this paper is twofold. Since a comparative analysis of the standard centrality measures has not received the necessary attention in the physics literature, our first purpose is to review some of the most commonly used measures available on the market. This will be done in Section 2. The second goal is to highlight a new measure of centrality that has been recently proposed in Ref. 26, and to show some of its possible applications that have been investigated so far. This will be done in the second part of the paper, namely in Sections 3 and 4.
2. Traditional Centrality Measures
We represent a social network 27 as a non-directed, non-weighted graph G, consisting of a set of N points (vertices or nodes) and a set of K edges (or lines) connecting pairs of points (the ratio K/N is called the density of the graph). Two points connected by an edge are said to be adjacent. The points of the graph are the individuals, the actors of a social group, and the lines represent the social links. The graph can be described by the so-called adjacency matrix, an N x N matrix whose entry a_ij is 1 if i and j are adjacent, and 0 otherwise. The entries a_ii on the diagonal are undefined, and for convenience are set equal to 0. We assume that the graph is connected. This means that nodes that are not adjacent may nevertheless be reachable from one to the other. A walk from node i to node j is a sequence of adjacent nodes that begins with i and ends with j. A trail is a walk in which no edge (i.e., pair of adjacent nodes) is repeated. A path is a trail in which no node is visited more than once. The length of a walk is defined as the number of edges it contains, and the shortest path between two nodes is known as a geodesic. The length of a geodesic path between two nodes is known as the geodesic or graph theoretic distance between them. We can represent the graph theoretic distances between all pairs of nodes as an N x N matrix D whose entry d_ij gives the length of the shortest path from node i to node j. We will next consider the three standard classes of measures usually used to quantify the centrality of a node within a network. Such measures are based on three different concepts: degree, closeness and betweenness 28.
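The distance matrix D described above can be computed with one breadth-first search per node. The following minimal sketch in plain Python is ours, not from the text; the function name and the nested-list representation of the adjacency matrix are illustrative choices:

```python
from collections import deque

def distance_matrix(adj):
    """All-pairs graph-theoretic distances d_ij of a connected,
    non-directed graph, via one breadth-first search per node.
    `adj` is the N x N adjacency matrix as nested lists of 0/1."""
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for s in range(n):
        seen, queue = {s}, deque([(s, 0)])
        while queue:
            v, d = queue.popleft()
            dist[s][v] = d
            for w in range(n):
                if adj[v][w] and w not in seen:
                    seen.add(w)
                    queue.append((w, d + 1))
    return dist

# A path graph 0-1-2-3: the geodesic distance from 0 to 3 is 3.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
print(distance_matrix(adj)[0])  # [0, 1, 2, 3]
```

For a non-directed graph the resulting matrix is symmetric, as expected from the definition of d_ij.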
2.1. Measures based on degree
The simplest definition of point centrality is based on the idea that important points must be the most active ones, in the sense that they have the largest number of ties to other points in the graph. Thus a centrality index for an actor i is the degree of i, i.e. the number of points adjacent to i. The degree centrality of i can be defined as 28,29:

C^D_i = k_i    (1)
where k_i is the degree of point i. Since a given point i can at most be adjacent to N-1 other points, it is common to introduce the normalized centrality C'^D_i = C^D_i / (N-1), which is independent of the size of the network and such that 0 <= C'^D_i <= 1. The degree centrality focuses on the most visible actors in the network. An actor with a large degree is in direct contact with many other actors and, being very visible, is immediately recognized by the others as a hub, a very active point and a major channel of communication. A degree based measure of node centrality can be extended beyond direct connections to those at various distances, widening in this way the relevant neighbourhood. For instance, a node i can be assessed for its centrality in terms of the number of nodes that can be reached from it within a given cutoff distance n. When n = 1, the measure is identical to C^D_i. The principal problem with such a definition is that, in a large number of graphs, even in those with a very small density K/N, the majority of nodes are at relatively short path distance (the so-called small-world property 9,10). Thus, the measure loses significance already for n = 4 or n = 5, because most, if not all, of the points of the graph can be reached at this distance. Another possible extension of the degree centrality, based on a different idea, and quite well known and used, is the eigenvector centrality. It was originally introduced in the social networks context by Bonacich 30,31, who generalized the influence measure proposed by Katz 32, and is closely related to some centrality measures used, more recently, in web search engines 10,33,34. The eigenvector centrality c^E_i of a node i is defined to be proportional to the sum of the centralities of its neighbours, so that a vertex can acquire high centrality either by having a high degree or by being connected to others that are themselves highly central. The definition reads:

c^E_i = (1/λ) Σ_j a_ij c^E_j    (2)
where λ is some constant. In matrix notation this becomes A c^E = λ c^E, so that c^E is an eigenvector of the adjacency matrix. Usually the eigenvector corresponding to the leading eigenvalue is taken, because its components are all positive real numbers.
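As an illustration, the leading eigenvector can be obtained by power iteration on the adjacency matrix. The sketch below is ours, not from the text; iterating with A + I instead of A leaves the eigenvectors unchanged (it only shifts the eigenvalues by one) and avoids the oscillation that plain power iteration exhibits on bipartite graphs:

```python
def eigenvector_centrality(adj, iters=200):
    """Power iteration for the leading eigenvector of the adjacency
    matrix.  Using A + I (same eigenvectors, eigenvalues shifted by 1)
    guarantees convergence on connected bipartite graphs too."""
    n = len(adj)
    c = [1.0 / n] * n
    for _ in range(iters):
        new = [c[i] + sum(adj[i][j] * c[j] for j in range(n))
               for i in range(n)]
        norm = sum(new)  # components stay positive, so the L1 norm works
        c = [x / norm for x in new]
    return c

# Star graph: the centre 0 is the most central node.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
c = eigenvector_centrality(star)
print(round(c[0], 3))  # 0.366, i.e. sqrt(3)/(3 + sqrt(3))
```

On the star the centre's component exceeds each leaf's by a factor sqrt(3), the leading eigenvalue of the star's adjacency matrix.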
2.2. Measures based on closeness
This class of measures is based on the lengths of the walks a node is involved in. In this case the idea is that an actor is central if it can quickly interact with the others. The simplest notion of closeness is based on shortest path distances. The closeness centrality of node i, defined by Sabidussi in 1966, is 28,36:

C^C_i = 1 / Σ_j d_ij    (3)
where d_ij is the geodesic length from i to j. Usually, it is better to consider the normalized quantity C'^C_i = (N-1) C^C_i = (L_i)^{-1}, with L_i the average distance from i to the other nodes, which takes values in the range [0,1] 28. Such a measure is meaningful for connected graphs only, unless one assumes d_ij equal to a finite value, for instance the maximum possible distance N-1, instead of d_ij = +∞, when there is no path between the two nodes i and j. An alternative possibility is to define the so-called efficiency centrality 37,38:

C^eff_i = (1/(N-1)) Σ_{j≠i} 1/d_ij    (4)
which is perfectly defined for non-connected graphs. In fact, when there is no path between i and j, the efficiency in the communication between the two nodes, 1/d_ij, is equal to zero and does not contribute to the average in formula (4). Various other generalizations, based on the lengths of all possible paths and not only on geodesics, have also been proposed over the years.

2.3. Measures based on betweenness
Interactions between two non-adjacent points might depend on the other actors, especially on those on the paths between the two. Therefore, points in the middle can have a strategic control and influence on the others. The important idea at the base of these centrality measures is that an actor is central if it lies between many of the actors. This concept can be quantified in different ways according to the kind of flow process that is assumed to be most appropriate for the network considered. The simplest possibility is to assume that the communication travels just along the geodesics. If n_jk is the number of geodesics linking the two actors j and k, and n_jk(i) is the number of geodesics linking the two actors j and k that contain point i, the betweenness centrality of actor i, proposed by Freeman in 1977, is defined as 28,39,40:

C^B_i = (1/((N-1)(N-2))) Σ_{j≠i} Σ_{k≠i,j} n_jk(i)/n_jk    (5)
In the formula, j and k in the double summation at the numerator must be different from i, and the normalization factor (N-1)(N-2) is the maximum possible number of geodesics going through a node. C^B_i is normalized in the range [0,1] and reaches its maximum when i falls on all geodesics.
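Formula (5) can be evaluated directly by counting geodesics with breadth-first search. The sketch below is ours, not from the text; it uses the standard identity n_jk(i) = n_ji · n_ik, valid whenever i lies on some geodesic from j to k, i.e. whenever d_ji + d_ik = d_jk:

```python
from collections import deque

def bfs_counts(adj, s):
    """Distances and numbers of geodesics (sigma) from source s."""
    n = len(adj)
    dist, sigma = [None] * n, [0] * n
    dist[s], sigma[s] = 0, 1
    q = deque([s])
    while q:
        v = q.popleft()
        for w in range(n):
            if adj[v][w]:
                if dist[w] is None:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:  # w continues a geodesic via v
                    sigma[w] += sigma[v]
    return dist, sigma

def betweenness(adj):
    """Freeman's betweenness of formula (5), summed over ordered
    pairs j,k different from i and normalized by (N-1)(N-2)."""
    n = len(adj)
    D, S = zip(*[bfs_counts(adj, s) for s in range(n)])
    cb = [0.0] * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) == 3 and D[j][i] + D[i][k] == D[j][k]:
                    cb[i] += S[j][i] * S[i][k] / S[j][k]
        cb[i] /= (n - 1) * (n - 2)
    return cb

# Star graph: the centre lies on every leaf-to-leaf geodesic,
# so its betweenness reaches the maximum value 1.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(betweenness(star)[0])  # 1.0
```

The leaves of the star lie on no geodesic between other nodes, so their betweenness is zero, the other extreme of the [0,1] range.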
In most cases communication, or any other quantity of interest (for instance infections or fashions in social networks, information packets or e-mails in computer networks, various goods in trade networks), does not travel through geodesic paths only, and for this reason a more realistic betweenness measure should include non-geodesic as well as geodesic paths (unweighted, or weighted inversely by their length). Here we mention only two other measures of betweenness that include contributions from non-geodesic paths: the flow betweenness and the random paths betweenness. The flow betweenness was introduced in Ref. 41 and is based on the concept of maximum flow. It is defined by assuming that each edge of the graph is like a pipe that can carry a unitary amount of flow. By considering a generic point j as the source of flow and a generic point k as the target, it is possible to calculate the maximum possible flow from j to k by means of the min-cut, max-flow theorem 42. In general it is expected that more than a single unit of flow is exchanged from j to k by making simultaneous use of the various possible paths. The flow betweenness centrality of point i is defined from the amount of flow m_jk(i) passing through i when the maximum flow m_jk is exchanged from j to k, by the formula:

C^F_i = Σ_{j<k; j,k≠i} m_jk(i) / Σ_{j<k; j,k≠i} m_jk    (6)
The second betweenness was introduced very recently in Ref. 43 and is based on random paths. It is suited to cases in which a message moves from a source point j to a target point k without any global knowledge of the network, and therefore at each step chooses where to go at random among all the possibilities. The random walk betweenness of point i is equal to the number of times that the message passes through i in its journey, averaged over a large number of trials of the random walk 43,44.
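The trial-averaged definition lends itself to a direct Monte Carlo estimate. The sketch below is our illustration, not the authors' method: note that the measure of Ref. 43 is actually computed analytically from net current flows, while this naive version simply counts raw visits, which conveys the idea but is not numerically equivalent:

```python
import random

def random_walk_betweenness_mc(adj, trials=5000, seed=1):
    """Monte Carlo estimate of a random-walk betweenness: average
    number of visits to each node by a message that starts at a random
    source, hops to uniformly random neighbours, and stops at a random
    target.  Source and target are not counted as visits."""
    rng = random.Random(seed)
    n = len(adj)
    nbrs = [[w for w in range(n) if adj[v][w]] for v in range(n)]
    visits = [0] * n
    for _ in range(trials):
        j, k = rng.sample(range(n), 2)  # distinct source and target
        v = j
        while v != k:
            v = rng.choice(nbrs[v])
            if v != k:
                visits[v] += 1
    return [x / trials for x in visits]

# On a star the aimless message keeps returning to the hub,
# which therefore dominates the visit counts.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
b = random_walk_betweenness_mc(star)
print(b[0] > max(b[1:]))  # True
```

Because the walk has no global knowledge, even low-degree nodes that sit between communities accumulate visits, which is exactly the feature that distinguishes this class of measures from the purely geodesic one.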
3. A new idea of centrality
The measure of centrality introduced in Ref. 26 is based on the following simple idea, which is somehow different from all the ideas considered in the previous sections: the importance of an actor is related to the ability of the system to respond to the deactivation of that actor. In fact, the structure and function of a network strongly rely on the existence of paths between pairs of nodes. When a node is deactivated, the typical length of such paths will increase and, eventually, some pairs of nodes will become disconnected. Consequently, the network will lose in performance, and the importance of a node of the network can be measured by considering the drop in the network's performance caused by its deactivation. We first need to define how to measure the performance of the graph. One possibility is to use the efficiency 37,38:

E[G] = (1/(N(N-1))) Σ_{i≠j} 1/d_ij    (7)
which is a quantity perfectly suited to describe cases where the communication takes shortest paths. Of course, various other alternatives are possible, and a better characterization of the performance of a graph can be achieved by taking into account more details about the kind of flow over the network we are interested in. The information centrality of node i is defined as the relative drop in the network efficiency caused by the removal from G of the edges incident in i 26:

C^I_i = ΔE/E = (E[G] - E[G'_i]) / E[G]    (8)

where by G'_i we indicate the network with N points and K - k_i edges obtained by removing from G the edges incident in point i. In practice, the redundancy of an element is checked by calculating the performance of the perturbed network and comparing it with that of the original one. A detailed study comparing the information centrality with the standard measures, on both real and simulated social networks, can be found in Ref. 26. In the next section we will show some of the applications of the information centrality that have been investigated so far. We will discuss examples of social, technological and biological systems.

4. Applications of the Information Centrality

4.1. Social Networks
As an example of a social network, in Fig. 1 we report the graph representing the terrorist network of the September 2001 terrorist attacks on the US. The graph has N = 34 nodes, representing the 19 hijackers and 15 other associates who were reported to have had direct or indirect interactions with the hijackers, and K = 93 links, and has been constructed by Krebs using publicly released information taken from the major newspapers 45. In the network map in the figure the hijackers are color coded by the flight they were on. The associates of the hijackers are represented as dark grey nodes. The grey lines indicate the reported interactions, with thicker lines indicating a stronger tie between two nodes. A study of the node information centrality has been performed in Ref. 21. The results are reported in Table 1, where the nodes are ranked according to C^I; the degree of each node is also reported. The information centrality assigns the highest score to Mohamed Atta, who was on flight AA-11, crashed into the World Trade Center North, and is also the terrorist with the largest number of direct contacts with other terrorists (k = 16). An extremely interesting result is the presence, among the relevant nodes of the network, of individuals with a small degree, e.g. Mamoun Darkazanli, who has only 4 ties but is fourth in the list. Centrality analysis of terrorist organization networks may have important consequences for criminal activity prevention. In fact, if some knowledge is available on a terrorist organization network, the individuation and targeting of the central nodes of the network can greatly help to disrupt the organization. The application of the information centrality to a different example of a social system, namely the interaction network of a group of 20 monkeys, can be found in
Figure 1. Connection network of the hijackers and related terrorists of the September 2001 attacks (figure taken from Ref. 45). The nodes represent the terrorists and the links represent contacts between terrorists. See text for details.
Ref. 26. In that reference the definition of formula (8) is extended in order to measure the centrality of a group of nodes as well as the centrality of single nodes.

Table 1. Centrality analysis of the graph shown in Figure 1. The five nodes with the highest C^I are reported and ranked according to their information centrality score. In the last column we report, as an alternative measure of the importance of a node, the non-normalized degree centrality C^D of formula (1), i.e. the node degree.

rank  node i             C^I_i   C^D_i
1     Mohamed Atta       0.150   16
2     Salem Alhazmi      0.112   8
3     Hani Hanjour       0.098   10
4     Mamoun Darkazanli  0.091   4
5     Marwan Al-Shehhi   0.091   14
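Rankings like the one in Table 1 follow directly from formulas (7) and (8): deactivate a node by cutting its incident edges, recompute the efficiency, and record the relative drop. A self-contained sketch (ours, not the authors' code):

```python
from collections import deque

def efficiency(adj):
    """Global efficiency of formula (7): the average of 1/d_ij over
    all ordered pairs, with 1/d_ij = 0 for disconnected pairs."""
    n = len(adj)
    total = 0.0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:  # breadth-first search from s
            v = q.popleft()
            for w in range(n):
                if adj[v][w] and w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))

def information_centrality(adj, i):
    """Formula (8): relative efficiency drop when the edges incident
    in node i are removed (the node itself stays in the graph)."""
    n = len(adj)
    cut = [[0 if i in (v, w) else adj[v][w] for w in range(n)]
           for v in range(n)]
    e = efficiency(adj)
    return (e - efficiency(cut)) / e

# Star graph: deactivating the hub destroys all communication,
# so its information centrality is the maximum value 1.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(information_centrality(star, 0))  # 1.0
print(round(information_centrality(star, 1), 3))  # 0.444
```

Note that the efficiency of formula (7) remains well defined after the cut even though the deactivated node becomes disconnected, which is exactly why the definition uses efficiency rather than the average path length.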
4.2. Infrastructure Networks
Infrastructure networks, and in particular electric power grids, are attracting a great deal of attention because of their importance in every-day life and their intrinsic criticality 25,46. As an example of a communication network, we discuss the results of the centrality analysis of the Internet backbone of Infonet 2001 47 reported in Ref. 24. In the same reference other examples of communication and transportation networks can be found, with the emphasis on the issues of vulnerability and protection. The network of Infonet has N = 94 nodes and K = 96 cable connections and carries about 10% of the traffic over the US and Europe. It consists of two main parts, the US and the European backbones, with N1 = 66 and N2 = 28 nodes respectively, connected by three overseas cables.
Table 2. Centrality analysis of Infonet 2001 47, as of September 2001. The six nodes and the six links with the highest information centrality are reported and ranked according to their score. In the last column, we report the node degree centrality C^D, and the betweenness b_i-j of the edge i-j.

rank  node i      C^I_i   C^D_i
1     New Jersey  0.573   9
2     NYC         0.530   9
3     Chicago     0.280   15
4     Amsterdam   0.241   9
5     Atlanta     0.227   14
6     Washington  0.203   2

rank  link i-j             C^I_i-j   b_i-j
1     NYC-New Jersey       0.379     2205
2     New Jersey-Chicago   0.229     1185
3     NYC-Washington       0.197     1185
4     Washington-Atlanta   0.183     1120
5     New Jersey-San Jose  0.179     984
6     New Jersey-Dallas    0.122     609
Table 2 shows that New Jersey and NYC are by far the two most important nodes: in fact the damage of either one would disconnect the US from the European backbone, reducing by more than 50% the performance of the network. This result is in agreement with the significant drop in performance experienced by the Internet in the aftermath of the 11 September terrorist attacks, probably due to the damage to Internet routers and cables in the south of NYC 48. The comparison of C^I with the node degree shows that damaging the most connected nodes, the hubs 9, is not always the worst damage. In fact, the damage of Chicago, the node with the highest degree, produces only a drop of 28% in the performance. Infonet is another example in which the central nodes, i.e. the nodes to protect from the
attacks, are not the hubs, but less visible and apparently minor nodes. The definition of formula (8) can be easily extended to measure the centrality of a link 24. In Table 2 we also report the six most central links. The link NYC-New Jersey is the one with the highest C^I. In fact the removal of such a link would break up the network into two disconnected parts of about the same size, with a decrease of 38% in the performance of the network. Notice that the removal of the second link produces only a drop of 23% in the performance. Other important links are those connecting New Jersey with Chicago, with San Jose and with Dallas, and some links on the east coast, such as NYC-Washington and Washington-Atlanta. The links in the table, ordered according to C^I, also have a decreasing betweenness b, another measure of link centrality, defined as the number of times the link appears in the shortest paths connecting pairs of nodes 49. Nevertheless, the correlation between C^I and b is not perfect: for instance the link NYC-Amsterdam, with the second highest betweenness, ranks only 14th according to C^I. An application to electric power grids can be found in Ref. 51, where the links with the highest information centrality are detected in the Spanish 400 kV, the French 400 kV and the Italian 380 kV power transmission grids. For each network it is also suggested how to improve the connectivity 25,50.
4.3. Mediators in the immune system
As a biological system we consider the human immune system (IS), a complex system of different cell types whose communication and coordination is crucial for the proper reaction to a possible danger. The cells of the IS have various ways to communicate with each other: either directly, by establishing bonds between cell surface ligands and receptors, or indirectly, by means of a variety of soluble mediators released and bound by the immune cells. In Ref. 19 the focus is on the communication implemented by soluble mediators. The IS is, in fact, represented by a weighted directed graph G, where the N cell types of the IS are the vertices of the graph and the M soluble mediators (cytokines, chemokines) form its K edges: a directed arc from vertex i to vertex j is defined by the existence of at least one mediator secreted by cell i and affecting cell j. The presence of an arc from vertex i to vertex j does not imply that of the arc from j to i, and thus the adjacency matrix is no longer symmetric. Cell self-stimulation by soluble mediators (autocriny), which is an important peculiarity of the immune cell network, is also taken into account. The value w_ij attached to the arc i-j is assumed to be equal to the number of different mediators connecting cell i to cell j. This study is different from the ones previously considered for two reasons: 1) the network is weighted and directed; 2) the goal is to quantify the importance of the soluble mediators, not of the links or of the nodes of G. Nevertheless, the same approach used in formula (8) can be easily generalized to define a mediator's information centrality. In fact, the centrality of each mediator can be measured by the relative drop in the network efficiency E caused by the removal of the mediator, where the removal
of a mediator means weakening some of the weights w_ij attached to the arcs (see Ref. 19 for the details). The results are reported in Table 3, where the mediators are ranked according to their information centrality.

Table 3. Centrality analysis of the human immune cells network 19. The ten mediators with the highest value of information centrality are reported and ranked according to their score.

rank  mediator i            C^I_i
1     TGF-beta              0.0777
2     MIP-1-alpha/beta      0.0745
3     TNF-alpha             0.0469
4     IL10                  0.0340
5     IL16                  0.0314
6     IL8/CXCL8             0.0298
7     IFN-gamma             0.0276
8     IFN-alpha/beta/omega  0.0261
9     IL7                   0.0223
10    VIP and PACAP         0.0214

The analysis shows that mediators
involved in innate immunity - the most ancestral branch of the immune system - and in inflammation have the most central role in the immune network. For instance, the three most important mediators are inflammatory molecules, and are involved in the communication among a large number of cell types, respectively 216, 224 and 120. The unequal role played by the mediators is shown by the plot of the cumulative centrality distribution P_cum(C^I) = M(C^I)/M (where M(C^I) is the number of mediators with centrality larger than C^I) reported in Fig. 2. For large values of C^I the distribution shows a power law behavior P_cum(C^I) ~ [C^I]^{-(γ-1)}, with an exponent γ = 2.8 ± 0.1. The fat tail in the mediators' centrality distribution indicates that the universal scaling principles discovered in other biological networks seem to be also fundamental ingredients of the human IS architecture.
4.4. Finding Community Structures and other applications
A property common to social and biological networks is community structure, i.e. the division of network nodes into groups within which the network connections are dense, but between which they are sparser. The ability to find such groups can provide invaluable help in understanding the structure and the functioning of networks. A series of algorithms for finding community structures, based on the idea of using centrality measures to define the community boundaries, have been recently proposed 49,44. In particular, in Ref. 52, the authors have developed a hierarchical divisive method based on the information centrality, consisting in finding and removing, progressively, the edges with the largest information centrality until the network breaks up into components. Although the proposed algorithm runs to completion in a time O(N^4) and is slower than other methods, it is very effective, especially when the communities are very mixed and hardly detectable by the other methods.

Figure 2. Distribution of the number of mediators having a network relevance larger than C^I. The relevance of a mediator is calculated as the relative drop in network efficiency caused by its removal from the network. To reduce noise, logarithmic binning was applied. The linear tail in the figure indicates a scale-free power-law behaviour in the relevance of the various mediators of the immune cell network, that can be fitted with a curve P_cum(C^I) ~ [C^I]^{-(γ-1)}, with γ = 2.8 ± 0.1.

Centrality measures have also been applied to urban street networks 6,7,8. In particular, in Ref. 53, the authors have studied a set of different measures to extract the central paths from city maps. The results show that no single-index spatial analysis can offer an exhaustive picture, while, on the other hand, significant achievements for actions in urban planning are likely to be gained by using a set of different centrality indexes which highlight different properties of paths. In fact, after establishing correlations between the centralities of the networks and socio-economic factors (crime rates, retail allocations, pedestrian/vehicular flows and others), one has informed indications on the actions that can be performed in order to increase the desired factors and to hinder the undesired ones.
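The divisive scheme of Ref. 52 can be sketched in a few lines: repeatedly remove the edge whose deletion causes the largest relative drop in efficiency, i.e. the edge with the highest information centrality, until the graph splits. The code below is our illustrative reconstruction (a single splitting round on an unweighted graph), not the authors' implementation:

```python
from collections import deque

def efficiency(adj):
    """Average of 1/d_ij over ordered pairs; 0 for disconnected pairs."""
    n = len(adj)
    total = 0.0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:
            v = q.popleft()
            for w in range(n):
                if adj[v][w] and w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))

def components(adj):
    """Connected components, each as a sorted list of nodes."""
    n, seen, comps = len(adj), set(), []
    for s in range(n):
        if s not in seen:
            comp, q = [s], deque([s])
            seen.add(s)
            while q:
                v = q.popleft()
                for w in range(n):
                    if adj[v][w] and w not in seen:
                        seen.add(w)
                        comp.append(w)
                        q.append(w)
            comps.append(sorted(comp))
    return comps

def split_once(adj):
    """Remove the most central edge (largest efficiency drop) until
    the network first breaks into more than one component."""
    adj = [row[:] for row in adj]
    while len(components(adj)) == 1:
        def eff_without(edge):
            v, w = edge
            adj[v][w] = adj[w][v] = 0
            e = efficiency(adj)
            adj[v][w] = adj[w][v] = 1
            return e
        edges = [(v, w) for v in range(len(adj)) for w in range(v) if adj[v][w]]
        # smallest remaining efficiency == largest relative drop
        v, w = min(edges, key=eff_without)
        adj[v][w] = adj[w][v] = 0
    return components(adj)

# Two triangles joined by a single bridge 2-3: the bridge has the
# highest edge information centrality, so it is removed first.
g = [[0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1], [0, 0, 0, 1, 0, 1], [0, 0, 0, 1, 1, 0]]
print(split_once(g))  # [[0, 1, 2], [3, 4, 5]]
```

Iterating the same procedure on each resulting component yields the full hierarchical dendrogram; each step recomputes all edge centralities, which is where the O(N^4) cost quoted above comes from.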
References
1. S. Wasserman and K. Faust, Social Network Analysis (Cambridge University Press, Cambridge, 1994).
2. J. Scott, Social Network Analysis: A Handbook, 2nd ed. (Sage Publications, London, 2000).
3. A. Bavelas, Human Organization 7, 16 (1948).
4. A. Bavelas, Journal of the Acoustical Society of America 22, 271 (1950).
5. F. R. Pitts, The Professional Geographer 17, 15 (1965); Social Networks 1, 285 (1978).
6. B. Hillier and J. Hanson, The Social Logic of Space (Cambridge University Press, Cambridge, UK, 1984).
7. B. Jiang and C. Claramunt, Env. Plann. B 31, 151 (2004).
8. S. Porta, P. Crucitti and V. Latora, preprint cond-mat/0411241.
9. R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
10. M. E. J. Newman, SIAM Review 45, 167 (2003).
11. M. E. J. Newman, Phys. Rev. E 64, 016131 (2001); Phys. Rev. E 64, 016132 (2001).
12. F. Liljeros, C. R. Edling, L. A. N. Amaral, H. E. Stanley, and Y. Aberg, Nature 411, 907 (2001).
13. R. Pastor-Satorras and A. Vespignani, Phys. Rev. Lett. 86, 3200 (2001).
14. V. Latora, A. Nyamba, and S. Musumeci, Network of Sexual Contacts and Sexually Transmitted Diseases in Burkina Faso, submitted to Preventive Medicine.
15. H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barabási, Nature 407, 651 (2000).
16. S. A. Wagner and D. A. Fell, Proc. R. Soc. London B 268, 1803 (2001).
17. D. A. Fell and S. A. Wagner, Nature Biotech. 18, 1121 (2000).
18. H. Jeong, S. P. Mason, A.-L. Barabási, and Z. N. Oltvai, Nature 411, 41 (2001).
19. P. Tieri, S. Valensin, V. Latora, G. C. Castellani, M. Marchiori, D. Remondini and C. Franceschi, preprint q-bio.MN/0409020, in press in Bioinformatics (2005).
20. R. Pastor-Satorras and A. Vespignani, Evolution and Structure of the Internet (Cambridge University Press, Cambridge, 2004).
21. V. Latora and M. Marchiori, Chaos Solitons and Fractals 20, 77 (2004).
22. R. Guimerà, S. Mossa, A. Turtschi, and L. A. N. Amaral, preprint cond-mat/0312535 (2003); R. Guimerà and L. A. N. Amaral, Eur. Phys. J. B 38, 381 (2004).
23. P. Crucitti, V. Latora and M. Marchiori, Phys. Rev. E 69, 045104(R) (2004).
24. V. Latora and M. Marchiori, Phys. Rev. E 71, 015103(R) (2005).
25. R. Kinney, P. Crucitti, R. Albert and V. Latora, preprint cond-mat/0410318.
26. V. Latora and M. Marchiori, A measure of centrality based on the network efficiency, preprint cond-mat/0402050.
27. Although the discussion of this section is, for historical reasons, in terms of social networks, all the ideas and measures apply as well to a generic complex network.
28. L. C. Freeman, Social Networks 1, 215 (1979).
29. J. Nieminen, Scandinavian Journal of Psychology 15, 322 (1974).
30. P. F. Bonacich, Journal of Mathematical Sociology 2, 113 (1972).
31. P. F. Bonacich, Am. J. Sociol. 92, 1170 (1987).
32. L. Katz, Psychometrika 18, 39 (1953).
33. S. Brin and L. Page, Computer Networks 30, 107 (1998).
34. J. M. Kleinberg, J. ACM 46, 604 (1999).
35. N. E. Friedkin, Am. J. Sociol. 96, 1487 (1991).
36. G. Sabidussi, Psychometrika 31, 581 (1966).
37. V. Latora and M. Marchiori, Phys. Rev. Lett. 87, 198701 (2001).
38. V. Latora and M. Marchiori, Eur. Phys. J. B 32, 249 (2003).
39. J. M. Anthonisse, The rush in a graph (University of Amsterdam Mathematical Centre, Amsterdam, 1971).
40. L. C. Freeman, Sociometry 40, 35 (1977).
41. L. C. Freeman, S. P. Borgatti and D. R. White, Social Networks 13, 141 (1991).
42. L. R. Ford and D. R. Fulkerson, Flows in Networks (Princeton University Press, Princeton, 1962).
43. M. E. J. Newman, preprint cond-mat/0309045.
44. M. E. J. Newman and M. Girvan, Phys. Rev. E 69, 026113 (2004).
45. V. E. Krebs, Connections 24, 43 (2002). See also http://www.orgnet.com/hijackers.html.
46. T. J. Overbye, American Scientist 88, 220, May-June (2000).
47. http://www.infonet.com
48. http://www.cnn.com/2001/TECH/industry/09/12/telecom.operational.idg/
49. M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. USA 99, 7821 (2002).
50. P. Crucitti, V. Latora and M. Marchiori, A topological analysis of the Italian electric power grid, Physica A 338, 92 (2004).
51. P. Crucitti, V. Latora and M. Marchiori, Locating critical lines in high-voltage electrical power grids, preprint.
52. S. Fortunato, V. Latora and M. Marchiori, Phys. Rev. E 70, 056104 (2004).
53. P. Crucitti, V. Latora and S. Porta, Centrality Measures in Urban Networks, to be submitted.