D.H. Lyth, A. Riotto / Physics Reports 314 (1999) 1-146
PARTICLE PHYSICS MODELS OF INFLATION AND THE COSMOLOGICAL DENSITY PERTURBATION
David H. LYTH (Department of Physics, Lancaster University, Lancaster LA1 4YB, UK), Antonio RIOTTO (CERN, Theory Division, CH-1211, Geneva 23, Switzerland)
Received October 1998; editor M.P. Kamionkowski
Contents

1. Introduction
2. Observing the density perturbation (and gravitational waves?)
  2.1. The primordial quantities
  2.2. The observable quantities
3. The slow-roll paradigm
  3.1. The slowly rolling inflaton field
  3.2. The slow-roll predictions
  3.3. Beyond the slow-roll prediction
  3.4. The number of e-folds of slow-roll inflation
  3.5. Gravitational waves
  3.6. Before observable inflation
4. Calculating the curvature perturbation generated by inflation
  4.1. The case of a single-component inflaton
  4.2. The multi-component case
  4.3. The curvature perturbation
  4.4. Calculating the spectrum and the spectral index
  4.5. When will R become constant?
  4.6. Working out the perturbation generated by slow-roll inflation
  4.7. An isocurvature density perturbation?
5. Field theory and the potential
  5.1. Renormalizable versus non-renormalizable theories
  5.2. The Lagrangian
  5.3. Internal symmetry
  5.4. The true vacuum and the inflationary vacuum
  5.5. Supersymmetry
  5.6. Quantum corrections to the potential
  5.7. Non-perturbative effects
  5.8. Flatness requirements on the tree-level inflation potential
  5.9. Satisfying the flatness requirements in a supersymmetric theory
6. Forms for the potential; COBE normalizations and predictions for n
  6.1. Single-field and hybrid inflation models
  6.2. Monomial and exponential potentials
  6.3. The paradigm V = V_0 + ...
  6.4. The inverted quadratic potential
  6.5. Inverted higher-order potentials
  6.6. Another form for the potential
  6.7. Hybrid inflation
  6.8. Hybrid inflation with a quadratic potential
On leave of absence from Theoretical Physics Department, University of Oxford, UK.
0370-1573/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00128-8
  6.9. Masses from soft susy breaking
  6.10. Hybrid thermal inflation
  6.11. Inverted hybrid inflation
  6.12. Hybrid inflation with a cubic or higher potential
  6.13. Mutated hybrid inflation
  6.14. Hybrid inflation from dynamical supersymmetry breaking
  6.15. Hybrid inflation with a loop correction from spontaneous susy breaking
  6.16. Hybrid inflation with a running mass
  6.17. The spectral index as a discriminator
7. Supersymmetry
  7.1. Introduction
  7.2. The motivation for supersymmetry
  7.3. The susy algebra and supermultiplets
  7.4. The Lagrangian of global supersymmetry
  7.5. Spontaneously broken global susy
  7.6. Soft susy breaking
  7.7. Loop corrections and running
  7.8. Supergravity
  7.9. Supergravity from string theory
  7.10. Gravity-mediated soft susy breaking
8. F-term inflation
  8.1. Preserving the flat directions of global susy
  8.2. The generic F-term contribution to the inflaton potential
  8.3. Preserving flat directions in string theory
  8.4. Models with the superpotential linear in the inflaton
  8.5. A model with gauge-mediated susy breaking
  8.6. The running inflaton mass model revisited
  8.7. A variant of the NMSSM
9. D-term inflation
  9.1. Keeping the potential flat
  9.2. The basic model
  9.3. Constructing a workable model from string theory
  9.4. D-term inflation and cosmic strings
  9.5. A GUT model of D-term inflation
10. Conclusion
11. Postscript
Acknowledgements
References
Abstract
This is a review of particle-theory models of inflation, and of their predictions for the primordial density perturbation that is thought to be the origin of structure in the Universe. It contains mini-reviews of the relevant observational cosmology, of elementary field theory and of supersymmetry, that may be of interest in their own right. The spectral index $n(k)$, specifying the scale dependence of the spectrum of the curvature perturbation, will be a powerful discriminator between models, when it is measured by Planck with accuracy $\Delta n \sim 0.01$. The usual formula for $n$ is derived, as well as its less familiar extension to the case of a multicomponent inflaton; in both cases the key ingredient is the separate evolution of causally disconnected regions of the Universe. Primordial gravitational waves will be an even more powerful discriminator if they are observed, since most models of inflation predict that they are completely negligible. We treat in detail the new wave of models, which are firmly rooted in modern particle theory and have supersymmetry as a crucial ingredient. The review is addressed to both astrophysicists and particle physicists, and each section is fairly homogeneous regarding the assumed background knowledge. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 98.80.Cq

Keywords: Cosmology; Particle physics; Gravity
1. Introduction

We do not know the history of the observable Universe before the epoch of nucleosynthesis, but it is widely believed that there was an early era of cosmological inflation [205,180,197,198]. During this era, the Universe was filled with a homogeneous scalar field $\phi$, called the inflaton field, and essentially nothing else. The potential $V(\phi)$ dominated the energy density of the Universe, decreasing slowly with time as $\phi$ rolled slowly down the slope of $V$.

The attraction of this paradigm is that it can set the initial conditions for the subsequent hot big bang, which otherwise have to be imposed by hand. One of these is that there be no unwanted relics (particles or topological defects which survive to the present and contradict observation). Another is that the initial density parameter should have the value $\Omega = 1$ to very high accuracy, to ensure that its present value is at least roughly the same. There is also the requirement that the Universe be homogeneous and isotropic to high accuracy.

All of these virtues of inflation were noted when it was first proposed by Guth in 1981 [129], and very soon a more dramatic one was also noticed [134,282,130]. Starting with a Universe which is absolutely homogeneous and isotropic at the classical level, the inflationary expansion of the Universe will 'freeze in' the vacuum fluctuation of the inflaton field so that it becomes an essentially classical quantity. On each comoving scale, this happens soon after horizon exit. Associated with this vacuum fluctuation is a primordial energy density perturbation, which survives after inflation and may be the origin of all structure in the Universe. In particular, it may be responsible for the observed cosmic microwave background (cmb) anisotropy and for the large-scale distribution of galaxies and dark matter. Inflation also generates primordial gravitational waves as a vacuum fluctuation, which may contribute to the low multipoles of the cmb anisotropy.
When it was first proposed in 1982, this remarkable paradigm received comparatively little attention. For one thing observational tests were weak, and for another the inflationary density perturbation was not the only candidate for the origin of structure. In particular, it seemed as if cosmic strings or other topological defects might do the job instead. This situation changed dramatically in 1992, when COBE measured the cmb anisotropy on large angular scales [280], and another dramatic change is now in progress with the advent of smaller scale measurement. Subject to confirmation of the latter, it seems that the paradigm of slow-roll inflation is the only one not in conflict with observation.

The inflaton field perturbation, except in contrived models, has practically zero mass and negligible interaction. As a result, the primordial density perturbation is gaussian; in other words, its Fourier components $\delta_{\mathbf k}$ are uncorrelated and have random phases. Its spectrum $\mathcal{P}_{\mathcal R}(k)$, defined roughly as the expectation value of $|\delta_{\mathbf k}|^2$ at the epoch of horizon exit, defines all of its stochastic properties. The shape of the spectrum is conveniently specified by the spectral index $n(k)$, defined as

$$n(k) - 1 \equiv \frac{d\ln \mathcal{P}_{\mathcal R}}{d\ln k} . \qquad (1)$$
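As a minimal sketch of how the definition in Eq. (1) can be evaluated in practice, the following Python fragment estimates $n(k)$ from any positive spectrum by a central difference in $\ln k$. The function names, the amplitude and the two toy spectra are illustrative assumptions, not quantities taken from this review.

```python
import math


def spectral_index(spectrum, k, step=1e-4):
    """Estimate n(k) from n - 1 = d ln P / d ln k by a central difference in ln k."""
    ln_p_hi = math.log(spectrum(k * math.exp(step)))
    ln_p_lo = math.log(spectrum(k * math.exp(-step)))
    return 1.0 + (ln_p_hi - ln_p_lo) / (2.0 * step)


def power_law_spectrum(k, amplitude=2.3e-9, n=0.95):
    """Toy spectrum P(k) = A k^(n-1); amplitude and n are illustrative values."""
    return amplitude * k ** (n - 1.0)


def running_spectrum(k, amplitude=2.3e-9):
    """Toy spectrum with a scale-dependent index: ln P = ln A + (-0.05 + 0.01 ln k) ln k."""
    ln_k = math.log(k)
    return amplitude * math.exp((-0.05 + 0.01 * ln_k) * ln_k)


# A pure power law has a constant index; the second spectrum has n(k) = 0.95 + 0.02 ln k.
print(spectral_index(power_law_spectrum, 0.05))
print(spectral_index(running_spectrum, math.e))
```

For the pure power law the estimate is independent of $k$, which is the situation in which the index can be treated as a single number rather than a function.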
Guth's paper gave inflation its name, and for the first time spelled out its virtues in setting initial conditions. Earlier authors had contemplated the possibility of inflation, as reviewed comprehensively in Reference [251].

A comoving scale $a/k$ is said to leave the horizon when $k = aH$, where $a(t)$ is the scale factor of the Universe and $H = \dot a/a$ is the Hubble parameter.

To be precise, $\mathcal{P}_{\mathcal R}$ is the spectrum of a quantity $\mathcal R$ to be defined later, which is a measure of the spatial curvature seen by comoving observers.
Slow-roll inflation predicts a slowly varying spectrum, corresponding to $|n - 1|$ significantly below 1. In some models of inflation, $n(k)$ is practically constant on cosmological scales, leading to the alternative definition

$$\mathcal{P}_{\mathcal R}(k) \propto k^{n-1} . \qquad (2)$$

The gravitational wave amplitude is also predicted to be gaussian, again with a primordial spectrum $\mathcal{P}_{\rm grav}(k)$ which is slowly varying. Taking its spectral index to be constant, $n_{\rm grav}$ is defined by

$$\mathcal{P}_{\rm grav}(k) \propto k^{n_{\rm grav}} . \qquad (3)$$

The spectra $\mathcal{P}_{\mathcal R}(k)$ and $\mathcal{P}_{\rm grav}(k)$ provide the contact between theory and observation. The latter is negligible except in a very special class of inflationary models, and we shall learn a lot if it turns out to be detectable. For the moment, observation gives only the magnitude of $\mathcal{P}_{\mathcal R}(k)$ at the scale $k^{-1} \sim 10^{3}$ Mpc (the COBE normalization $\frac{2}{5}\mathcal{P}_{\mathcal R}^{1/2} = 1.91\times10^{-5}$) plus a bound on its scale dependence corresponding to $n = 1.0 \pm 0.2$.

The observational constraint $\mathcal{P}_{\mathcal R}^{1/2} \sim 10^{-5}$ was already known when inflation was proposed, and was soon seen to rule out an otherwise viable model. Since then, practically all models have been constructed with the constraint in mind, so that its power has not always been recognized; the huge class of models which it rules out have simply never been exhibited.

The situation regarding the spectral index is quite different. The present result $n(k) = 1.0 \pm 0.2$ is only mildly constraining for inflationary models, its most notable consequence being to rule out 'extended' inflation in all except very contrived versions. But this situation is going to improve in the foreseeable future, and after Planck flies in about ten years we shall probably know $n(k)$ to an accuracy $\Delta n \sim 0.01$. As this article demonstrates, such an accurate number will consign to the rubbish bin of history most of the proposed models of inflation.

What do we mean by a model of inflation? Before addressing the question we should be very clear about one thing. Observation, notably the COBE measurement of the cmb anisotropy, tells us that when our Universe leaves the horizon the potential $V(\phi)$ is far below the Planck scale. To be precise, $V^{1/4}$ is no more than a few times $10^{16}$ GeV, and it may be many orders of magnitude smaller. Subsequently, there are at most 60 e-folds of inflation, and only these have a directly observable effect.
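The order of magnitude of the energy scale quoted above can be illustrated with a short calculation. The sketch below assumes two things that are not stated in this part of the text: that the COBE normalization is expressed as $\frac{2}{5}\mathcal{P}_{\mathcal R}^{1/2} = 1.91\times10^{-5}$, and that the standard slow-roll result $\mathcal{P}_{\mathcal R} = V/(24\pi^2 M_{\rm P}^4 \epsilon)$ holds, with $\epsilon$ the usual slow-roll parameter and $M_{\rm P} = 2.4\times10^{18}$ GeV. The function names are hypothetical.

```python
import math

M_PLANCK_GEV = 2.4e18   # reduced Planck mass, (8 pi G)^(-1/2)
COBE_NORM = 1.91e-5     # assumed to equal (2/5) * P_R^(1/2)


def curvature_spectrum(cobe_norm=COBE_NORM):
    """P_R implied by the assumed normalization (2/5) P_R^(1/2) = cobe_norm."""
    return (2.5 * cobe_norm) ** 2


def inflation_energy_scale_gev(epsilon, cobe_norm=COBE_NORM):
    """V^(1/4) in GeV from the assumed slow-roll relation P_R = V / (24 pi^2 M_P^4 epsilon)."""
    p_r = curvature_spectrum(cobe_norm)
    return (24.0 * math.pi ** 2 * epsilon * p_r) ** 0.25 * M_PLANCK_GEV


# Slow roll needs epsilon well below 1, so epsilon = 1 gives a generous upper limit.
for eps in (1.0, 1e-2, 1e-4):
    print(eps, inflation_energy_scale_gev(eps))
```

Because $V^{1/4}$ scales only as $\epsilon^{1/4}$, even a very flat potential lowers the scale slowly; under these assumptions the limit $\epsilon = 1$ gives a value of a few times $10^{16}$ GeV.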
A comoving scale $a/k$ is said to be outside the horizon when it is bigger than $H^{-1}$. Each scale of interest leaves the horizon at some epoch during inflation and enters it afterwards. The density perturbation on a given scale is essentially generated when it leaves the horizon. The comoving scale corresponding to the whole observable Universe ('our Universe') is entering the horizon at roughly the present epoch.

On the other hand, the history of our Universe begins with $V$ presumably at the Planck scale, and to avoid fine tuning inflation should also begin then. This 'primary inflation', which may or may not join smoothly to the last 60 e-folds, cannot be investigated by observation and is of comparatively little interest. It will not be treated in this review. So for us, a 'model of inflation' is a model of inflation that applies after the observable Universe leaves the horizon. It is a model of 'observable', as opposed to 'primary', inflation.

So what is meant by a 'model of inflation'? The phrase is actually used by the community in two rather different ways. At the simplest level a 'model of inflation' is taken to mean a form for the
potential, as a function of the fields giving a significant contribution to it. In single-field models there is just the inflaton field $\phi$ (defined as the one which is varying with time) whereas in hybrid inflation models most of the potential comes from a second field $\psi$ which is fixed until the end of inflation. In both cases, one ends up by knowing $V(\phi)$, and the field value $\phi_{\rm end}$ at the end of inflation. This allows one to calculate the spectrum $\mathcal{P}_{\mathcal R}(k)$, and in particular the spectral index $n(k)$. In some cases the prediction for $n(k)$ depends only on the shape of $V$, as is illustrated in Tables 1 and 2. One can also calculate the spectrum $\mathcal{P}_{\rm grav}$ of gravitational waves, but in most models they are too small ever to be detectable.

At a deeper level, one thinks of a 'model of inflation' as something analogous to the Standard Model of particle interactions. One imagines that Nature has chosen some extension of the Standard Model, and that the relevant scalar fields are part of that model. In this sense a 'model of inflation' is more than merely a specification of the potential of the relevant fields. It will provide answers to at least some of the following questions. Have the relevant fields and interactions already been invoked in some other context, and if so are the parameters required for inflation compatible with what is already known? Do the relevant fields have gauge interactions? If so, are we dealing with the Standard Model interactions, GUT interactions, or interactions in a hidden sector? Is the potential the classical one, or are quantum effects important? In the latter case, are we dealing with perturbative or non-perturbative effects?

Of course, it would have been wonderful if inflation already dropped out of the Standard Model, but sadly that is not the case. Perhaps more significantly, it is not the case either for minimal supersymmetric extensions of the Standard Model.

Taken in either sense, inflation model building has seen a recent renaissance. In this article, we review the present status of the subject, taking seriously present thinking about what is likely to lie beyond the Standard Model. In particular, we take seriously the idea that supersymmetry (susy) is relevant. At the fundamental level, susy is supposed to be local, corresponding to supergravity.
When considering particle interactions in the vacuum, in particular predictions for what is seen at colliders and underground detectors, global susy usually provides a good approximation to supergravity. But, as we shall discuss in detail, global susy is not in general a valid approximation during inflation. This remains true no matter how low the energy scale, and no matter how small the field values, a fact ignored over the years by many authors.

Being only a symmetry, supersymmetry does not completely define the form of the field theory. In fact, in a supergravity theory the number of couplings that need to be specified is in principle infinite (a non-renormalizable theory). For guidance about the form of field theory, one may look to string theory. Taking it to denote the whole class of theories that give field theory as an approximation, string theory comes in many versions, but the two most widely studied are weakly coupled heterotic string theory [124,142,103,66,101,67,13,143,166] and Horava-Witten M-theory [199,141,248,214]. In our present state of knowledge, this gives reasonably detailed information in the regime where all field values are $\ll M_{\rm P}$, but almost nothing about the regime where some field value is $\gg M_{\rm P}$.

By 'minimal' we mean in this context extensions invoking only the supersymmetric partners of fermions and gauge bosons, with a reasonably simple supersymmetric extension of the Higgs sector. The simplest possible extension is called the Minimal Supersymmetric Standard Model (MSSM).

$M_{\rm P}$ is the Planck mass, defined in units $\hbar = c = 1$ by $M_{\rm P} = (8\pi G)^{-1/2} = 2.4\times10^{18}$ GeV.

Accordingly, 'models' of inflation in the sense of forms for the potential can at
present be promoted to particle-physics models only in this regime. Of course, this does not rule out the possibility of promotion at some time in the future.

Let us briefly review the history. In Guth's model of 1981 [129], some field that we shall call $\psi$ is trapped at the origin, in a local minimum of the potential as illustrated in Fig. 1. Inflation ends when it tunnels through the barrier, and descends quickly to the minimum of the potential which represents the vacuum. It was soon noted that this 'old inflation' is not viable because bubbles of the new phase never coalesce.

In 1982, Linde [201] and Albrecht and Steinhardt [6] proposed the first viable model of inflation, which has been the archetype for all subsequent models. Some field $\phi$, which we shall call the inflaton, is slowly rolling down a rather flat potential $V(\phi)$. In the 'new inflation' model proposed in the above references, the potential has a maximum at the origin as in the full line of Fig. 3 (Section 5.3), and inflation takes place near the maximum. It ends when $\phi$ starts to oscillate around the minimum, which again represents the vacuum.

The 'new inflation' model was a model in both senses of the word, specifying both the form of the potential and its possible origin in a grand unified theory (GUT) of particle physics. The minimum of the potential, corresponding to the vacuum expectation value (vev) of the inflaton, was originally taken to be at the GUT scale. Later it was raised to the Planck scale ('primordial inflation'), which weakened the connection with the GUT. Most of the models proposed in this first phase of particle-theory model building were very complicated, and are not usually mentioned nowadays. They were complicated because they worked under two restrictions, which have since been abandoned. First, the inflaton was required to start out in thermal equilibrium (though Linde pointed out at an early stage that this is not mandatory [203]).
Secondly, they worked almost exclusively with the paradigm of single-field inflation, as opposed to the hybrid inflation paradigm that we shall encounter in a moment.

While this phase of complicated model building was getting under way, Linde proposed [202] in 1983 that instead the field might be rolling towards the origin, with a field value much bigger than $M_{\rm P}$. He proposed a monomial potential, say $V \propto \phi^2$ or $\phi^4$, which was supposed to hold right back to the Planck epoch when $V \sim M_{\rm P}^4$. At that epoch, $\phi$ was supposed to be a chaotically varying function of position. Working out the field dynamics, one finds that the observable Universe leaves the horizon when $\phi \sim 10 M_{\rm P}$, and inflation ends when $\phi \sim M_{\rm P}$. Such big field values make it practically impossible to make a connection with particle physics. After some years, the monomial paradigm became the favoured one, and the search for a connection with particle physics was largely abandoned.

The only exception is the case of non-minimal inflation (Section 6.6), where the canonically normalized field of interest in the present context is a few times $M_{\rm P}$ while the more fundamental field is only of order $M_{\rm P}$. The only other cases where one has information in the large-field regime are those of the bulk moduli and dilaton fields predicted by string theory. But in that regime, these fields are either periodic or they do not support inflation.

For this reason, inflation with such a potential is usually called 'chaotic inflation'. We shall use the phrase 'monomial inflation', because the hypothesis of chaotic initial conditions has no necessary connection with the form of the potential during observable inflation. A wide variety of other monotonically increasing functions will also inflate at

The seeds for the present renaissance of model-building were laid around 1990. First, in 1989, La and Steinhardt proposed what they called 'extended inflation' [186]. Its objective was to implement 'old' inflation by providing a mechanism for making the bubbles coalesce at the end of inflation. This mechanism was simply to add a slowly rolling inflaton field to the original old inflation model, which makes the Hubble parameter decrease significantly with time. It invoked the extension of gravity known as Brans-Dicke theory, and for this reason it was called extended inflation. The original version conflicted with present-day tests of General Relativity, but more complicated versions were soon constructed that avoided this problem. This paradigm was practically killed in 1992, by the COBE detection [280] of the cmb anisotropy. There was no sign of the bubbles formed at the end of inflation, yet all except very contrived versions of the paradigm required that there should be.

Fig. 1. The old inflation potential.

Fig. 2. During hybrid inflation, the potential $V(\psi)$ is minimized at $\psi = 0$.

Going back to the historical development, it was known that like many extended gravity theories, extended inflation can be re-formulated as an Einstein gravity theory. Working from the beginning with Einstein gravity, Linde [206] and Adams and Freese [1] proposed in 1991 a crucial change in the idea behind extended inflation; until the end of inflation, tunnelling is completely impossible (not just relatively unlikely) because the trapped field $\psi$ has a coupling to the slowly rolling inflaton field $\phi$. During inflation, this changes the potential of the trapped field so that it becomes like the one shown in Fig. 2. Only at the end of inflation is the final form of Fig. 1 achieved, and only then does tunnelling take place. In this model, the bubbles can coalesce very quickly, and be completely invisible in the microwave background as required by observation.

The logical end is perhaps not hard to guess; in 1991, Linde [207] dispensed with the bubbles altogether, by eliminating the dip of the potential at the origin. At the end of inflation, the field $\psi$ now reverts to its vacuum value without any bubble formation, so that there is a second-order phase transition, instead of the first-order one of the original model.
This final paradigm is known as hybrid inflation. It has led to the renaissance of inflation model building, firmly rooted in the concepts of modern particle theory, which is the focus of the present review.

The actual beginning of the renaissance can be traced to a paper in 1994 [60]. It contained the crucial observation that during hybrid inflation, the inflaton field is typically much less than $M_{\rm P}$. As a result, contact with particle theory again becomes a realistic possibility. At the same time though, the above paper emphasized that a generic supergravity theory will fail to inflate no matter
how small are the field values, because the inflaton mass is too big. Much effort has since been devoted to finding ways around this problem.

In addition to the small field value, hybrid inflation has another good feature. In single-field models the curve $V(\phi)$ must first support inflation, and then cease to support it so that inflation ends. There are only a few simple functions that achieve this, if one excludes field values much bigger than $M_{\rm P}$. In the hybrid case, the job of ending inflation is done by the other field $\psi$, which greatly increases the range of simple possibilities.

We end this introduction with an overview of the present article, and a list of its omissions. The article is addressed to a wide audience, including both cosmologists and particle physicists. To cope with this problem, we have tried to make each section reasonably homogeneous regarding the background knowledge that is taken for granted, while at the same time allowing considerable variation from one section to another.

Section 2 focusses on the cosmological quantities that form a link between a model of inflation and observation. Section 3 gives the basics of the slow-roll paradigm of inflation, showing how the cosmological quantities are calculated. Section 4 is a specialized one, explaining how to derive the usual prediction of slow-roll inflation, and how to generalize it to the case of a multi-component inflaton. Section 5 summarizes some of the basic ideas of modern particle theory, which have been used in inflation model-building. Those with a background in particle theory will skip through it fairly quickly. Using these ideas, Section 6 reviews 'models' of inflation, taken to mean forms for the potential that have the general form suggested by particle theory. Section 7 summarizes those aspects of supersymmetry which are most relevant for inflation model-building. It is addressed mainly to those who already have some understanding of that subject.
As we explain there, the tree-level potential in a supersymmetric theory is the sum of an 'F-term' and a 'D-term'. The terms have very different properties and in all models of inflation so far proposed one or other dominates. Section 8 deals with models of inflation where the F-term dominates, and Section 9 with those where the D-term dominates. We conclude in Section 10.

The above list of topics is formidable, but still not exhaustive. Let us mention the main omissions.

While the paradigm of slow-roll inflation is broadly necessary, in order to account for the near scale independence of the primordial spectrum $\mathcal{P}_{\mathcal R}(k)$, brief interruptions of slow-roll are sometimes contemplated. So are sharp changes in the direction of slow roll, of the kind described in Section 4. In both cases, the effect is to generate a sharp feature in the otherwise smooth primordial spectrum. At the time of writing there is no firm observational evidence for such a feature, and we mention only briefly the models that would predict one.

We shall not discuss the pre-big-bang idea, that a bounce at the Planck scale can do the job of inflation. In contrast with inflation, this paradigm provides no natural explanation of the near scale-independence of the spectrum of the primordial curvature perturbation, encoded by the result $n \simeq 1$. In slow-roll inflation this result is an automatic consequence of the near time-independence of the Hubble parameter, but no analogous quantity appears in the pre-big-bang paradigm.

Globally supersymmetric models using complicated particle physics, in particular a Grand Unified Theory (GUT), are not mentioned much. Like some simpler models that we do mention, these models usually lack any specific mechanism for controlling the supergravity corrections.

Except for a brief mention of monomial potentials, models invoking field values much bigger than $M_{\rm P}$ are not mentioned. (All known models involving non-Einstein gravity are of this type.)
We do not consider the rather special inflationary potentials that can give an open Universe (negative spatial curvature) through bubble formation [119,46,209,212], since little attention has so far been paid to these in the context of particle theory. We are basically focussing on the usual case, that $\Omega$ has been driven to 1 long before our Universe leaves the horizon during inflation, making its present value (including the contribution of any cosmological constant) also 1. However, most of what we do continues to apply if that is not the case, which arguably might happen for any form of the inflationary potential.

With any potential, one can assume that $\Omega$ is fine-tuned to be small at the Planck scale [224], or else that the Universe is created at a finely tuned point in field space [135,210,72,42]. As usual, one can consider eliminating such fine-tuning by the anthropic principle.

We assume that the primordial density perturbation generated by the vacuum fluctuation of the inflaton is solely responsible for large scale structure, except possibly for a gravitational wave signal in the cmb anisotropy. This means that we ignore anything coming from topological defects, as well as the isocurvature density perturbation that could in principle be generated by the vacuum fluctuation of a non-inflaton field like the axion. Subject to confirmation from further observations, it looks as though such things cannot be entirely responsible for large-scale structure, so indeed the simplest thing is to assume that they are entirely absent.

Finally, we are considering only models of inflation, not of the subsequent cosmology. In particular, we are not considering the reheating process by which the scalar field is converted into hot radiation. We are not considering the preheating process that might exchange energy between scalar fields before reheating [174-178,11,181,155-158]. And we are definitely not considering baryogenesis, dark matter, or unwanted relics such as moduli. All of these phenomena are likely to involve fields, and interactions, that play no role during inflation.

We generally set $\hbar = c = 1$, and we define the Planck mass by $M_{\rm P} = (8\pi G)^{-1/2} = 2.4\times10^{18}$ GeV.

2. Observing the density perturbation (and gravitational waves?)

The vacuum fluctuation of the inflaton field generates a primordial energy density perturbation, and the vacuum fluctuation in the transverse traceless part of the metric generates gravitational waves. In this section we explain briefly how the primordial density perturbation, and the gravitational waves, are related to what is actually observed.

Both in the perturbed and unperturbed Universe, a comoving observer is defined as one moving with the flow of energy. Such observers measure zero momentum density at their own positions.

2.1. The primordial quantities

In the unperturbed Universe, the separation of comoving observers is proportional to the scale factor of the Universe $a(t)$, and we normalize it to 1 at the present epoch. The Hubble parameter is $H = \dot a/a$, and its present value is $H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$ with $h$ probably in the range 0.5-0.7. The
corresponding Hubble distance is $cH_0^{-1} = 3000h^{-1}$ Mpc, which is roughly the size of the observable Universe.

Instead of the physical Cartesian coordinates $\mathbf r$ it is more convenient to use coordinates $\mathbf x$ such that $\mathbf r = a(t)\mathbf x$. Then the coordinate position of a comoving observer is time-independent, in the unperturbed Universe. The Fourier expansion of a perturbation $g(\mathbf x, t)$ is made inside a large comoving box, whose coordinate size $L$ should be a few orders of magnitude bigger than that of the observable Universe. (On bigger scales it would not be justified to assume a homogeneous, isotropic universe.) The Fourier expansion is

$$g(\mathbf x, t) = \sum_{\mathbf k} g_{\mathbf k}(t)\, e^{i\mathbf k\cdot\mathbf x} . \qquad (4)$$
For mathematical purposes it is convenient to consider the limit of an infinite box,

$$g(\mathbf x, t) = (2\pi)^{-3/2} \int d^3k\; g(\mathbf k, t)\, e^{i\mathbf k\cdot\mathbf x} , \qquad (5)$$
where $(L/2\pi)^3 g_{\mathbf k} \to (2\pi)^{-3/2} g(\mathbf k)$.

A useful way of specifying the physical wave number $k/a$ is to give its present value $k$. During inflation, $aH$ increases with time, and a comoving scale $a/k$ is said to leave the horizon when $aH/k = 1$. After inflation $aH$ decreases, and the comoving scale is said to enter the horizon when $aH/k = 1$. For cosmologically interesting scales, horizon entry occurs long after nucleosynthesis. We shall occasionally refer to the long era between horizon exit and horizon entry as the primordial era. As we shall see, the evolution of the perturbations during the primordial era is simple, because causal processes cannot operate.

Leaving aside gravitational waves, there is only one independent primordial perturbation, because everything is generated from the vacuum fluctuation of the inflaton field. (We are considering the usual case of a single-component inflaton field.) Instead of the inflaton field perturbation, it is actually more convenient to consider a quantity $\mathcal R(\mathbf k)$, defined by

$$\mathcal R(\mathbf k) = \tfrac{1}{4}\,(a/k)^2\, R^{(3)}(\mathbf k) , \qquad (6)$$
where R is the spatial curvature scalar seen by comoving observers. Unlike the in#aton "eld perturbation, it is time independent during the primeval era, and it continues to be well-de"ned after the in#aton "eld disappears.
In the following we shorten 'observable Universe' to 'Universe'. The unknown regions outside it are referred to as the 'universe' without capitalization.

The quantity we are calling $\mathcal R$ was defined first by Bardeen [22], who called it $\phi_m$. It was called $\mathcal R_m$ by Kodama and Sasaki [167], and we drop the subscript following [224,196,291]. Later it was called $\zeta$ by Mukhanov et al. [240], which is the other commonly used notation at present. It is a factor $k^{-2}$ times the quantity $\delta K$ of Refs. [215,216]. On the scales far outside the horizon where it is constant (the only regime where it is of interest) it coincides with the quantity $\xi/3$ of Ref. [23] and the quantity $\xi$ of Ref. [272]. On these scales, it is also equal to $-C\Phi$, where $\Phi$ is the commonly used 'gauge invariant potential' [22], and $C$ is a factor of order unity which is constant both during radiation domination and during matter domination. During the latter epoch, $C=5/3$.
A gravitational wave corresponds to a spatial metric perturbation $h_{ij}$ which is traceless, $\delta^{ij}h_{ij}=0$, and transverse, $\partial_i h_{ij}=0$. This means that each Fourier component is of the form
$$h_{ij}=h_+e^+_{ij}+h_\times e^\times_{ij}. \qquad (7)$$
In a coordinate system where $\mathbf k$ points along the $z$-axis, the nonzero components of the polarization tensors are defined by $e^+_{xx}=-e^+_{yy}=1$ and $e^\times_{xy}=e^\times_{yx}=1$. The two independent quantities $h_{+,\times}$ are time independent well outside the horizon.

Inflation generates Gaussian fluctuations. This means that for each perturbation $g(\mathbf x)$, at fixed $t$, the Fourier components are uncorrelated except for the expectation values
$$\langle g^*(\mathbf k)\,g(\mathbf k')\rangle=\delta^3(\mathbf k-\mathbf k')\,\frac{2\pi^2}{k^3}\,\mathcal P_g(k). \qquad (8)$$
The quantity $\mathcal P_g(k)$ is called the spectrum of $g(\mathbf x)$, and it determines all of its stochastic properties.

The primordial perturbations consist of the three independent quantities $\mathcal R$, $h_+$ and $h_\times$, and from rotational invariance the last two have the same spectrum,
$$\mathcal P_{h_+}=\mathcal P_{h_\times}\equiv\mathcal P_{\rm grav}/2. \qquad (9)$$
We therefore have two independent spectra $\mathcal P_{\mathcal R}$ and $\mathcal P_{\rm grav}$, determined in a slow-roll model of inflation by the formulas described in the next section. We shall see that they have at most mild scale dependence, and this is consistent with observation. The spectral index $n(k)$ of the curvature perturbation (Eq. (1)) is a crucial point of comparison between theory and observation, and the same will be true of $n_{\rm grav}(k)$ if the gravitational waves are detectable.

2.2. The observable quantities

From these primordial quantities, one can calculate the observable quantities, provided that one knows enough about the nature and evolution of the unperturbed Universe after relevant scales enter the horizon. Since cosmological scales enter the horizon well after nucleosynthesis, one indeed has the necessary information, up to uncertainties in the Hubble parameter, the nature and amount of dark matter, the epoch of reionization and the magnitude of the cosmological constant.
The observed quantities can be taken to be the matter density contrast $\delta\equiv\delta\rho/\rho$ (observed through the distribution and motion of the galaxies), and the cmb anisotropy. The latter consists of the temperature anisotropy $\Delta T/T$, which is already being observed, and two Stokes parameters describing the polarization which will be observed by the MAP [230] and Planck [257] satellites. It is convenient to make multipole expansions so that one is dealing with the temperature anisotropy $a_{lm}$, and the polarization anisotropies $E_{lm}$ and $B_{lm}$.

To be more precise, the gravitational wave amplitude is certainly Gaussian, and so is the curvature perturbation if the inflaton field fluctuation $\delta\phi$ is Gaussian. The latter is true if $\delta\phi$ is a practically free field, which is the case in practically all models of inflation. The Gaussianity is inherited by all of the perturbations as long as they remain small.

In principle, the reionization epoch can be calculated in terms of the other parameters, through the abundance of early rare objects, but present estimates are fairly crude.

The polarization multipoles are defined with respect to spin-weighted spherical harmonics, to ensure the correct transformation of the Stokes parameters under rotation about the line of sight.
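Except for the density perturbation on scales where gravitational collapse has taken place, the observable quantities are related to the primordial ones through linear, rotationally invariant, transfer functions. For the density perturbation,
$$\delta(\mathbf k,t)=\mathcal T(k,t)\,\mathcal R(\mathbf k). \qquad (10)$$
It can be observed both at the present, and (by looking out to large distances) at earlier times. The corresponding spectrum is
$$\mathcal P_\delta(k,t)=\mathcal T^2(k,t)\,\mathcal P_{\mathcal R}(k). \qquad (11)$$
For the cmb anisotropy, ignoring the gravitational waves, one has
$$a_{lm}=\frac{4\pi}{(2\pi)^{3/2}}\int \mathcal T_\Theta(k,l)\,\mathcal R_{lm}(k)\,k\,dk, \qquad (12)$$
$$E_{lm}=\frac{4\pi}{(2\pi)^{3/2}}\int \mathcal T_E(k,l)\,\mathcal R_{lm}(k)\,k\,dk, \qquad (13)$$
$$B_{lm}=0. \qquad (14)$$
Here, the multipoles of $\mathcal R$ are related to its Fourier components by
$$\mathcal R_{lm}(k)=k\,i^l\int \mathcal R(k,\hat{\mathbf k})\,Y_{lm}(\hat{\mathbf k})\,d\Omega_{\mathbf k}, \qquad (15)$$
which is equivalent to the usual spherical expansion. They are uncorrelated except for the expectation values
$$\langle g^*_{lm}(k)\,g_{l'm'}(k')\rangle=\frac{2\pi^2}{k^3}\,\mathcal P_g(k)\,\delta(k-k')\,\delta_{ll'}\delta_{mm'}. \qquad (16)$$
As a result, the multipoles of the cmb anisotropy are uncorrelated, except for the expectation values
$$\langle a^*_{lm}a_{l'm'}\rangle=C(l)\,\delta_{ll'}\delta_{mm'}, \qquad (17)$$
$$\langle a^*_{lm}E_{l'm'}\rangle=C_{\rm cross}(l)\,\delta_{ll'}\delta_{mm'}, \qquad (18)$$
$$\langle E^*_{lm}E_{l'm'}\rangle=C_E(l)\,\delta_{ll'}\delta_{mm'}, \qquad (19)$$
where
$$C(l)=4\pi\int \mathcal T_\Theta^2(k,l)\,\mathcal P_{\mathcal R}(k)\,\frac{dk}{k}, \qquad (20)$$
$$C_{\rm cross}(l)=4\pi\int \mathcal T_\Theta(k,l)\,\mathcal T_E(k,l)\,\mathcal P_{\mathcal R}(k)\,\frac{dk}{k}, \qquad (21)$$
$$C_E(l)=4\pi\int \mathcal T_E^2(k,l)\,\mathcal P_{\mathcal R}(k)\,\frac{dk}{k}. \qquad (22)$$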
The gravitational waves give contributions to the $C$'s which have a similar form, now with a non-zero $C_B$ defined analogously to $C_E$. We shall not give their precise form, but note for future reference that they fall off rapidly above $l\sim100$. The reason is that larger $l$ correspond to scales smaller than the horizon at photon decoupling; on such scales the amplitude of the gravitational waves has been reduced from its primordial value by the redshift.

We should comment on the meaning of the 'expectation value', denoted by $\langle\;\rangle$. At the fundamental level, it denotes the quantum expectation value, in the state that corresponds to the vacuum during inflation. This state does not correspond to a definite perturbation $g(\mathbf x)$ (because it does not correspond to definite $g(\mathbf k)$), so it is a superposition of possible universes. As usual, this Schrödinger's cat paradox does not prevent us from comparing with observation. We simply make the hypothesis that our Universe is a typical one, of the superposition defined by the quantum state. Except for the low multipoles of the cmb anisotropy, this makes observational quantities sharply defined, since they involve a sum over the practically continuous variables $\mathbf k$ and $l$. For the low multipoles the expected difference between the observed $|a_{lm}|^2$ and $\langle|a_{lm}|^2\rangle$ (called cosmic variance) needs to be taken into account, but the hypothesis that we live in a typical universe is still a very powerful one.

For the density perturbation, the comparison of the above prediction with observation has been a major industry for many years. Since 1992 the same has been true of the cmb temperature anisotropy. Perhaps surprisingly, the result of all this effort is easy to summarize.

Observation is consistent with the inflationary prediction that the curvature perturbation is Gaussian, with a smooth spectrum. The spectrum is accurately measured by COBE at the scale $k\simeq7.5H_0$ (more or less the center of the range explored by COBE).
Assuming that gravitational waves are negligible, it is [47]
$$\delta_H\equiv(2/5)\,\mathcal P_{\mathcal R}^{1/2}=1.91\times10^{-5}, \qquad (23)$$
with an estimated 9% uncertainty at the 1-$\sigma$ level. In writing this expression, we introduced the quantity $\delta_H$ which is normally used by observers.

Assuming that the spectral index is roughly constant over cosmological scales, observation constrains it to something like the range [198]
$$n=1.0\pm0.2. \qquad (24)$$
Gravitational waves have not so far been seen in the cmb anisotropy (or anywhere else). Observation is consistent with the hypothesis that they account for a significant fraction (less than 50% or so) of the mean-square cmb multipoles at $l\lesssim100$. In quantifying their effect, it is useful to consider the quantity $r$ defined in the next section. Up to a numerical factor it is $\mathcal P_{\rm grav}/\mathcal P_{\mathcal R}$, and the factor is chosen so that in an analytic approximation due to Starobinsky [284],
$$r=C_{\rm grav}(l)/C_{\mathcal R}(l) \qquad (25)$$
There is no cross term involving $B_{lm}$ because it would be odd under the parity transformation. (The vacuum state is parity invariant, and so is the Thomson scattering process responsible for the polarization.)
for $l$ in the central COBE range. (Here $C_{\mathcal R}$ is the contribution of the curvature perturbation given by Eq. (20), and $C_{\rm grav}$ the contribution of gravitational waves.) We are saying that present observations require $r\lesssim1$ or so. According to an accurate calculation [47], the relative contribution of gravitational waves to the COBE anisotropy is actually $0.75r$, reducing the deduced value of $\delta_H$ by a factor $\simeq(1+0.75r)^{-1/2}$ compared with Eq. (23).

What about the future? The magnificent COBE normalization will perhaps never be improved, but this hardly matters since at present an understanding of even its order of magnitude is a major theoretical challenge. Much more interesting is the situation with the spectral index. The Planck satellite will probably measure $n(k)$ with an accuracy of order $\Delta n\sim0.01$, which as already mentioned will be a powerful discriminator between models of inflation. The same satellite will also either tighten the limit on gravitational waves to $r\lesssim0.1$, or detect them. This last figure is unlikely to be improved by more than an order of magnitude in the foreseeable future. The Planck satellite probes a range $\Delta\ln k\simeq6$, and will measure the scale-dependence $dn/d\ln k$ if it is bigger than a few times $10^{-3}$.

We have emphasized the cmb anisotropy because of the promised high accuracy, but it will never be the whole story. It can directly probe only the scales $10\,{\rm Mpc}\lesssim k^{-1}\lesssim10^4\,{\rm Mpc}$, where the upper limit is the size of the observable Universe, and the lower limit is the thickness of the last-scattering 'surface'. At present it probes only the upper half of this range, $100\,{\rm Mpc}\lesssim k^{-1}\lesssim10^4\,{\rm Mpc}$. Galaxy surveys probe the range $1\,{\rm Mpc}\lesssim k^{-1}\lesssim100\,{\rm Mpc}$, providing a useful overlap in the future. The range $1\,{\rm Mpc}\lesssim k^{-1}\lesssim10^4\,{\rm Mpc}$ is usually taken to be the range of 'cosmological' scales.

If a signal of early reionization is seen in the cmb anisotropy, it will provide an estimate of the spectrum on a significantly smaller scale, $k^{-1}\sim10^{-2}$ Mpc.
Alternatively, the absence of a signal will provide a rough upper limit on this scale. On smaller scales still, information on the spectrum of the primordial density perturbation is sparse, and consists entirely of upper limits. The most useful limit, from the viewpoint of constraining models of inflation, is the one on the smallest relevant scale which is the one leaving the horizon just before the end of inflation. It has been considered in Refs. [49,266,123], and for a scale-independent spectral index corresponds to $n\lesssim1.3$.
3. The slow-roll paradigm

Inflation is defined as an era of repulsive gravity, $\ddot a>0$, which is equivalent to $3P<-\rho$ where $\rho$ is the energy density and $P$ is the pressure. As noted earlier, we are concerned only with the era of 'observable inflation', which begins when the observable Universe leaves the horizon, since memory of any earlier epochs has been wiped out.
A common alternative is to define $r$ by setting $l=2$ in Starobinsky's calculation. This increases $r$ by a factor 1.118 compared with the above definition.

A very limited constraint is provided on much bigger scales through the Grishchuk–Zeldovich [127,111] effect, which we shall not discuss.
During inflation the density parameter $\Omega$ is driven towards 1. Subsequently it moves away from 1, and its present value is equal to its value at the beginning of observable inflation. We are taking that value to be close to 1, which means that $\Omega$ is close to 1 during observable inflation. This gives the energy density $\rho$ in terms of the Hubble parameter,
$$3M_{\rm P}^2H^2=\rho. \qquad (26)$$
During observable inflation, the energy density and pressure are supposed to be dominated by scalar fields. Of the fields that contribute significantly to the potential, the inflaton field $\phi$ is by definition the only one with significant time dependence, leading to
$$\rho=\tfrac12\dot\phi^2+V, \qquad (27)$$
$$P=\tfrac12\dot\phi^2-V. \qquad (28)$$
(We make the usual assumption that $\phi$ has only one component, deferring the general case to Section 4.) The evolution of $\phi$ is given by
$$\ddot\phi+3H\dot\phi=-V', \qquad (29)$$
where an overdot denotes $d/dt$ and a prime denotes $d/d\phi$. This is equivalent to the continuity equation $\dot\rho=-3H(\rho+P)$, which with Eq. (26), is equivalent to
$$\dot H=-\tfrac12\dot\phi^2/M_{\rm P}^2. \qquad (30)$$
3.1. The slowly rolling inflaton field

While cosmological scales are leaving the horizon, the slow-roll paradigm of inflation [180,205,197,198] is practically mandatory in order to account for the near scale invariance of the spectrum of the primordial curvature perturbation. The inflaton field is supposed to be on a region of the potential which satisfies the flatness conditions
$$\epsilon\ll1, \qquad (31)$$
$$|\eta|\ll1, \qquad (32)$$
where
$$\epsilon\equiv\tfrac12M_{\rm P}^2\,(V'/V)^2, \qquad (33)$$
$$\eta\equiv M_{\rm P}^2\,V''/V. \qquad (34)$$
Also, it is supposed that the exact evolution equation Eq. (29) can be replaced by the slow-roll approximation
$$\dot\phi=-V'/3H. \qquad (35)$$
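As an illustration of the flatness parameters of Eqs. (33) and (34), the following minimal sketch evaluates $\epsilon$ and $\eta$ by finite differences for a sample potential. The function names, the step size, the choice of units $M_{\rm P}=1$ and the quadratic test potential are all assumptions made for this example and are not taken from the text.

```python
def slow_roll_params(V, phi, M_P=1.0, h=1e-3):
    """Return (epsilon, eta) of Eqs. (33)-(34) via central finite differences.

    V   : callable giving the potential V(phi) (hypothetical helper argument)
    phi : field value, in the same units as M_P
    h   : finite-difference step (an arbitrary choice for this sketch)
    """
    V0 = V(phi)
    dV = (V(phi + h) - V(phi - h)) / (2.0 * h)          # approximates V'
    d2V = (V(phi + h) - 2.0 * V0 + V(phi - h)) / h**2   # approximates V''
    epsilon = 0.5 * M_P**2 * (dV / V0) ** 2
    eta = M_P**2 * d2V / V0
    return epsilon, eta


def quadratic(phi, m=1e-6):
    """Illustrative potential V = m^2 phi^2 / 2 (the mass value is arbitrary)."""
    return 0.5 * m**2 * phi**2


# For a quadratic potential both parameters equal 2 M_P^2 / phi^2 analytically.
eps, eta = slow_roll_params(quadratic, phi=15.0)
print(f"epsilon = {eps:.6f}, eta = {eta:.6f}, analytic = {2.0 / 15.0**2:.6f}")
```

Both conditions (31) and (32) are comfortably satisfied at this field value, and they fail together as $\phi$ approaches $\sqrt2\,M_{\rm P}$.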
The flatness conditions and the slow-roll approximation are the basic equations, needed to derive the standard prediction for the density perturbation and the spectral index. For potentials satisfying the flatness conditions, the slow-roll approximation is typically valid for a wide range of initial conditions (values of $\phi$ and $\dot\phi$ at an early time).

The first flatness condition $\epsilon\ll1$ ensures that $\rho$ is close to $V$ and is slowly varying. As a result $H$ is slowly varying, which implies that one can write $a\propto e^{Ht}$ at least over a Hubble time or so.

The second flatness condition $|\eta|\ll1$ is actually a consequence of the first flatness condition plus the slow-roll approximation $3H\dot\phi=-V'$. Indeed, differentiating the latter one finds
$$\ddot\phi/(H\dot\phi)=\epsilon-\eta, \qquad (36)$$
and from Eq. (29) the slow-roll approximation is equivalent to $|\ddot\phi|\ll H|\dot\phi|$.

A crucial role is played by the number of Hubble times $N(\phi)$ of inflation, still remaining when $\phi$ has a given value. From some time $t$ to a fixed later time $t_2$, the number of Hubble times is
$$N(t)\equiv\int_t^{t_2}H(t)\,dt. \qquad (37)$$
The small change satisfies
$$dN\equiv-H\,dt\;(=-d\ln a). \qquad (38)$$
During slow-roll inflation,
$$\frac{dN}{d\phi}=-\frac{H}{\dot\phi}=\frac{V}{M_{\rm P}^2V'}\;\left(=\pm(2\epsilon M_{\rm P}^2)^{-1/2}\right). \qquad (39)$$
The number of e-folds of slow-roll inflation, remaining at a given epoch, is
$$N(\phi)=\int_{\phi_{\rm end}}^{\phi}M_{\rm P}^{-2}\,\frac{V}{V'}\,d\phi, \qquad (40)$$
where $\phi_{\rm end}$ marks the end of slow-roll inflation.

3.2. The slow-roll predictions

In this subsection and the next, as well as in Section 4, we discuss predictions for $\mathcal P_{\mathcal R}$, $n$ and $dn/d\ln k$. More material can be found in Refs. [196,197,291,198].

Two basic assumptions are made. One is that the inflaton field perturbation $\delta\phi$ has negligible interaction with other fields. This is equivalent to the validity during inflation of linear cosmological perturbation theory, in other words to the procedure of keeping only terms that are linear in the perturbations [198].
In what follows, we say that a function of time satisfying $|d\ln f/d\ln a|\ll1$ is 'slowly varying'. For a function of wave number $k$, 'slowly varying' will mean the same thing with $a$ replaced by $k$.
The other essential assumption is that well before horizon exit, when the particle concept makes sense, the relevant Fourier modes of $\delta\phi$ have zero occupation number. This vacuum assumption is more or less mandatory, since too many particles would give significant pressure and spoil inflation [196].

As a result of these assumptions, the primordial curvature perturbation is Gaussian, with stochastic properties that are completely defined by its spectrum $\mathcal P_{\mathcal R}(k)$. In this subsection, we make the usual assumption that the slow-roll paradigm is valid.

3.2.1. The spectrum

The perturbation $\delta\phi$ is best defined on spatially flat hypersurfaces. Then, in the slow-roll limit $\dot H\to0$, one can ignore the effect of the metric perturbation [197,198], and $\delta\phi$ satisfies
$$(\delta\phi)^{\cdot\cdot}+3H(\delta\phi)^{\cdot}+[V''+(k/a)^2]\,\delta\phi=0. \qquad (41)$$
The flatness condition Eq. (32) ensures that the mass-squared $V''$ is negligible until at least a few Hubble times after horizon exit. This means that $\delta\phi$ can be treated as a massless free field. A few Hubble times after horizon exit, its vacuum fluctuation can be regarded as a classical quantity, and its spectrum is then
$$\mathcal P_\phi=(H/2\pi)^2. \qquad (42)$$
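The approach to the value in Eq. (42) can be checked numerically in the idealized limit of constant $H$ and $V''=0$. The sketch below uses the standard vacuum mode function of a massless field in de Sitter space, which is not written out in the text and is introduced here as an assumption. It verifies by finite differences that this function solves Eq. (41), then evaluates the spectrum $(k^3/2\pi^2)|\delta\phi_k|^2$ several Hubble times after horizon exit. The units $H=k=1$ and all function names are illustrative choices.

```python
import cmath
import math

H = 1.0  # Hubble parameter, taken constant (de Sitter idealization, an assumption)
k = 1.0  # comoving wave number, in units where a = 1 at t = 0


def scale_factor(t):
    return math.exp(H * t)


def mode(t):
    """Assumed vacuum mode function of a massless field in de Sitter space."""
    tau = -1.0 / (scale_factor(t) * H)  # conformal time for constant H
    return 1j * H / math.sqrt(2.0 * k**3) * (1.0 + 1j * k * tau) * cmath.exp(-1j * k * tau)


def residual(t, h=1e-4):
    """Left-hand side of Eq. (41) with V'' = 0, using central differences in t."""
    f0, fp, fm = mode(t), mode(t + h), mode(t - h)
    first = (fp - fm) / (2.0 * h)
    second = (fp - 2.0 * f0 + fm) / h**2
    return second + 3.0 * H * first + (k / scale_factor(t)) ** 2 * f0


def spectrum(t):
    """Spectrum (k^3 / 2 pi^2) |delta phi_k|^2 at cosmic time t."""
    return k**3 / (2.0 * math.pi**2) * abs(mode(t)) ** 2


target = (H / (2.0 * math.pi)) ** 2
print("residual of Eq. (41) at t = 1:", abs(residual(1.0)))
print("spectrum / (H/2pi)^2 at horizon exit:", spectrum(0.0) / target)
print("spectrum / (H/2pi)^2 five Hubble times later:", spectrum(5.0) / target)
```

In this idealized case the ratio equals $1+(k/aH)^2$, so it is 2 at horizon exit and within a small fraction of a per cent of 1 a few Hubble times later.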
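The corresponding curvature perturbation is given by $\mathcal R=(-H/\dot\phi)\,\delta\phi$ (valid in linear perturbation theory independently of slow-roll). Using Eqs. (35) and (31), this is equivalent to
$$\frac{4}{25}\,\mathcal P_{\mathcal R}(k)\equiv\delta_H^2(k)=\frac{1}{75\pi^2M_{\rm P}^6}\frac{V^3}{V'^2}=\frac{1}{150\pi^2M_{\rm P}^4}\frac{V}{\epsilon}. \qquad (43)$$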
In this expression, the potential and its derivative are evaluated at the epoch of horizon exit for the scale $k$, which is defined by $k=aH$.

This prediction, for the spectrum of $\mathcal R(k)$ a few Hubble times after horizon exit, is of no use as it stands. But one can show that $\mathcal R(k)$ is time-independent between that epoch and the approach of horizon entry long after inflation ends. As we saw in Section 2, this allows one to calculate observable quantities.

Comparing Eq. (43) with the value Eq. (23) deduced from the COBE observation of the cmb anisotropy gives
$$M_{\rm P}^{-3}\,V^{3/2}/V'=5.3\times10^{-4}. \qquad (44)$$
Eq. (43) becomes valid only a few Hubble times after horizon exit, but its right-hand side is slowly varying and we might as well evaluate it actually at horizon exit. The difference this makes is of the same order as the error in Eq. (43).

This relation ignores any gravitational wave contribution, but there is no point in including their effect in the present context. The reason is that the prediction for $\delta_H$ that is being used has an error of at least the same order. If necessary one could include [291,200] the effect of the gravitational waves using the more accurate formula Eq. (78).
This relation provides a useful constraint on the parameters of the potential. It can be written in the equivalent form
$$V^{1/4}/\epsilon^{1/4}=0.027M_{\rm P}=6.7\times10^{16}\ {\rm GeV}. \qquad (45)$$
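The arithmetic connecting Eq. (23) to Eqs. (44) and (45) can be retraced with a few lines of code. This is only a sketch: it takes $\delta_H$ from Eq. (23), inverts Eq. (43), and uses Eq. (33) to eliminate $V'$. The numerical value of the reduced Planck mass and the variable names are assumptions of the example. The results agree with the quoted values to within a few per cent, which is comparable to the rounding of the inputs.

```python
import math

M_P_GEV = 2.436e18  # reduced Planck mass in GeV (value assumed for this sketch)
delta_H = 1.91e-5   # COBE normalization, Eq. (23)

# Eq. (43): delta_H^2 = V^3 / (75 pi^2 M_P^6 V'^2), hence
# M_P^-3 V^(3/2) / V' = sqrt(75) * pi * delta_H, the combination in Eq. (44).
ratio = math.sqrt(75.0) * math.pi * delta_H

# Eq. (33) gives V' = V * sqrt(2 epsilon) / M_P, so that
# V^(1/2) / epsilon^(1/2) = sqrt(2) * ratio * M_P^2, the square of Eq. (45).
scale_in_M_P = math.sqrt(math.sqrt(2.0) * ratio)
scale_in_GeV = scale_in_M_P * M_P_GEV

print(f"M_P^-3 V^(3/2)/V'   = {ratio:.2e}")
print(f"V^(1/4)/eps^(1/4)   = {scale_in_M_P:.4f} M_P = {scale_in_GeV:.2e} GeV")
```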
Since $\epsilon$ is much less than 1, the inflationary energy scale $V^{1/4}$ is at least a couple of orders of magnitude below the Planck scale [215].

The scale leaving the horizon at a given epoch is directly related to the number $N(\phi)$ of e-folds of slow-roll inflation, that occur after the epoch of horizon exit. Indeed, since $H$ is slowly varying we have $d\ln k=d(\ln(aH))\simeq d\ln a=H\,dt$. From the definition Eq. (38) this gives
$$d\ln k=-dN(\phi), \qquad (46)$$
and therefore
$$\ln(k_{\rm end}/k)=N(\phi), \qquad (47)$$
where $k_{\rm end}$ is the scale leaving the horizon at the end of slow-roll inflation. As we shall see, this relation is very useful when working out the prediction for a given form of the potential.

This is a good place to insert a historical footnote, about the origin of the slow-roll prediction for $\mathcal P_{\mathcal R}$. As we noted already, it comes in two parts. One is the formula Eq. (43) for $\mathcal P_{\mathcal R}$ a few Hubble times after horizon exit, and the other is the statement that $\mathcal R$ (hence $\mathcal P_{\mathcal R}$) is time-independent while $k$ is well outside the horizon. Both parts were, in essence, given at about the same time in Refs. [134,282,130,23]. (Related work [237] had been done earlier.) To be precise, these authors gave results which become more or less equivalent after the spectrum has been defined, though that last step was not explicitly made and except for the last work only a particular potential is discussed. Soon afterwards the results were given again, this time with an explicitly defined spectrum [215]. Strictly speaking none of these five derivations is completely satisfactory. The first three make simplifying assumptions. Regarding the constancy of $\mathcal R$, all except the third assume something equivalent to it without adequate proof. We discuss the constancy of $\mathcal R$ in Section 4. Regarding Eq. (43), none of these early derivations properly considers the effect of the inflaton field perturbation on the metric, but as we noted already that turns out to be negligible.

3.2.2. The spectral index

We have an expression for $\mathcal P_{\mathcal R}(k)$ in terms of $V$ and $V'$, and we want to calculate the spectral index defined by Eq. (1). From Eqs. (39) and (46),
$$\frac{d\phi}{d\ln k}=-M_{\rm P}^2\,\frac{V'}{V}, \qquad (48)$$
The last three works give results equivalent to the one we quote, and the first gives a result which is approximately the same.
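where, as always, $k=aH$. We shall need the following expressions:
$$d\epsilon/d\ln k=2\epsilon\eta-4\epsilon^2, \qquad (49)$$
$$d\eta/d\ln k=-2\epsilon\eta+\xi^2, \qquad (50)$$
$$d\xi^2/d\ln k=-2\epsilon\xi^2+\eta\xi^2+\sigma^3, \qquad (51)$$
where
$$\xi^2\equiv M_{\rm P}^4\,\frac{V'\,(d^3V/d\phi^3)}{V^2}, \qquad (52)$$
$$\sigma^3\equiv M_{\rm P}^6\,\frac{V'^2\,(d^4V/d\phi^4)}{V^3}. \qquad (53)$$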
Following for instance [28], we have introduced respectively the square and the cube of a quantity, even though the quantity itself never appears in an equation. As we shall see, this is a convenient device. Also, in the case $V\propto\phi^p$, with $p\neq1$ or 2, one has $|\eta|\sim|\xi|\sim|\sigma|$.

Using Eqs. (49) and (43) one finds [196,64,273]
$$n-1=-6\epsilon+2\eta, \qquad (54)$$
and using Eqs. (49) and (50) [184],
$$\frac{dn}{d\ln k}=-16\epsilon\eta+24\epsilon^2+2\xi^2. \qquad (55)$$
Practically all models proposed so far (Section 6) have $V'\propto\phi^p$ or $V'\propto\phi^p\ln\phi$, and in most cases one also has $\phi\ll M_{\rm P}$. Then $\epsilon\sim(\phi/M_{\rm P})^2|\eta|^2$ is negligible, and one can write
$$n-1=2\eta, \qquad (56)$$
$$dn/d\ln k=2\xi^2. \qquad (57)$$
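To show how Eqs. (40) and (54) combine into a concrete prediction, the sketch below treats a monomial potential $V\propto\phi^p$ in units $M_{\rm P}=1$. This is a large-field case, so it is outside the regime $\phi\ll M_{\rm P}$ of the preceding paragraph and $\epsilon$ is not negligible. The choices $p=2$ and $N=50$, the convention that slow-roll ends when $\epsilon=1$, and the function names are all assumptions made for illustration.

```python
import math


def efolds(phi, phi_end, p, steps=10000):
    """Eq. (40) with M_P = 1 for V proportional to phi^p, where V/V' = phi/p.

    Uses the trapezoidal rule (exact here, since the integrand is linear).
    """
    dphi = (phi - phi_end) / steps
    total = 0.0
    for i in range(steps + 1):
        x = phi_end + i * dphi
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (x / p) * dphi
    return total


p = 2                           # power of the monomial (illustrative choice)
N_target = 50.0                 # e-folds remaining at horizon exit (illustrative)
phi_end = p / math.sqrt(2.0)    # field value where epsilon = 1 (assumed end point)

# Invert the analytic result N = (phi^2 - phi_end^2) / (2 p) for phi.
phi_N = math.sqrt(2.0 * p * N_target + phi_end**2)
N_numeric = efolds(phi_N, phi_end, p)

epsilon = 0.5 * (p / phi_N) ** 2      # Eq. (33)
eta = p * (p - 1) / phi_N**2          # Eq. (34)
n = 1.0 - 6.0 * epsilon + 2.0 * eta   # Eq. (54)

print(f"phi at N = {N_numeric:.1f}: {phi_N:.3f} M_P")
print(f"epsilon = {epsilon:.5f}, eta = {eta:.5f}, n = {n:.4f}")
```

For these inputs the field value is about $14M_{\rm P}$, and both flatness parameters are about 0.01.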
More generally, one can argue that $\epsilon$ is small irrespectively of the form of the potential, provided that $\phi\ll M_{\rm P}$. To see this, take the cosmological range of scales to span four decades, corresponding to $\Delta\ln k\simeq9$. This corresponds to 9 e-folds of inflation. In slow-roll inflation $\epsilon$ has negligible variation over one e-fold and in typical models it has only small variation over the 9 e-folds. Taking that to be the case, and assuming that $\phi\ll M_{\rm P}$, one learns from Eq. (39) that $\epsilon\ll\tfrac12(1/9)^2=6\times10^{-3}$.

3.2.3. Error estimates for the slow-roll predictions

In deriving the prediction for $\mathcal P_{\mathcal R}$ we used the flatness conditions $\epsilon\ll1$ and $|\eta|\ll1$, as well as the slow-roll approximation whose fractional error is $\epsilon-\eta$ (Eq. (36)). As a result one expects $\mathcal P_{\mathcal R}$ to pick up fractional errors of order $\epsilon$ and $\eta$,
$$\Delta\mathcal P_{\mathcal R}/\mathcal P_{\mathcal R}=O(\epsilon,\eta). \qquad (58)$$
Using Eqs. (49)–(51) one therefore expects
$$n-1=2\eta-6\epsilon+O(\xi^2), \qquad (59)$$
$$dn/d\ln k=-16\epsilon\eta+24\epsilon^2+2\xi^2+O(\sigma^3). \qquad (60)$$
In the first expression we ignored errors that are quadratic in $\epsilon$ and $\eta$, because barring cancellations the corresponding fractional errors are small by virtue of the flatness conditions $\epsilon\ll1$ and $|\eta|\ll1$. In the second expression we ignored errors that are cubic in $\epsilon$, $\eta$ and $\xi$. Barring cancellations, the accuracy of the prediction for $n-1$ requires
$$|\xi^2|\ll\max(\epsilon,|\eta|), \qquad (61)$$
and the accuracy of the prediction for its derivative requires in addition
$$|\sigma^3|\ll\max(\epsilon^2,\epsilon|\eta|,|\xi^2|). \qquad (62)$$
3.3. Beyond the slow-roll prediction

The slow-roll predictions given in the last subsection are very convenient, because they involve only $V$ and its low derivatives evaluated at the epoch of horizon exit. The use of slow-roll is not however mandatory; on the contrary, one can obtain [236,276] predictions using essentially no assumptions beyond linear perturbation theory.

In linear perturbation theory, the quantity $u=a\,\delta\phi$ satisfies the following exact equation:
$$\frac{\partial^2u}{\partial\tau^2}+\left(k^2-\frac1z\frac{d^2z}{d\tau^2}\right)u=0. \qquad (63)$$
Here, $\tau$ is conformal time defined by $d\tau=dt/a$, and
$$z\equiv a\dot\phi/H, \qquad (64)$$
$$\frac1z\frac{d^2z}{d\tau^2}=2a^2H^2\left(1+\epsilon_H+\frac32\delta+\frac12\delta^2+\frac12\epsilon_H\delta+\frac1{2H}\frac{d\epsilon_H}{dt}+\frac1{2H}\frac{d\delta}{dt}\right), \qquad (65)$$
where
$$\epsilon_H\equiv\frac{\dot\phi^2}{2M_{\rm P}^2H^2}=-\dot H/H^2, \qquad (66)$$
$$\delta\equiv\ddot\phi/(H\dot\phi), \qquad (67)$$
and an overdot denotes $d/dt$. It is convenient to set $\tau=0$ at the end of slow-roll inflation. In the extreme slow-roll limit $\dot H=0$, this corresponds to
$$\tau=-1/(aH). \qquad (68)$$
One assumes that inflation is near enough slow-roll that $k|\tau|\gg1$ a few Hubble times before horizon exit, and $k|\tau|\ll1$ a few Hubble times after. Then, there is a solution $u=w$ of Eq. (63) which satisfies
$$w=(2k)^{-1/2}e^{-ik\tau}, \qquad (69)$$
a few Hubble times before horizon exit. A few Hubble times after horizon exit this solution has the behaviour
$$w/z\to{\rm constant}. \qquad (70)$$
One can show that the spectrum of $\mathcal R$ is then given by
$$\mathcal P_{\mathcal R}(k)=\frac{k^3}{2\pi^2}\,\frac{|w(k)|^2}{z^2}. \qquad (71)$$
Given an inflationary trajectory defined by $a(\tau)$ and $\dot\phi(\tau)$, this method gives a practically unique, and accurate, result in all reasonable cases. The trajectory in turn follows from the potential practically independently of the initial conditions, if slow-roll becomes very accurate at some early epoch.

We noted earlier that in the regime where the slow-roll predictions for $\mathcal P_{\mathcal R}$, $n-1$ and $dn/d\ln k$ are approximately valid, the four flatness conditions Eqs. (31), (32), (61) and (62) are also valid. In that case, the 'exact' solution yields an improved version of the slow-roll predictions for $\mathcal P_{\mathcal R}$ and $n-1$ [291]. Let us see how this goes.

Eqs. (36), (49) and (50) and the flatness conditions give the approximation
$$\frac1z\frac{d^2z}{d\tau^2}=2a^2H^2\left(1+\epsilon_H+\frac32\delta\right), \qquad (72)$$
with $\epsilon_H$ and $\delta$ slowly varying on the Hubble timescale. This leads to the approximation [291]
$$\mathcal P_{\mathcal R}^{1/2}(k)=\left[1-(2C+1)\epsilon_H-C\delta\right]\frac{H^2}{2\pi|\dot\phi|}, \qquad (73)$$
where $C=-2+\ln2+b\simeq-0.73$, with $b$ the Euler–Mascheroni constant. As always, the right-hand side is evaluated at $k=aH$.

We want an expression involving $V$ and its derivatives. Substituting Eq. (35) into Eq. (27) gives
$$3M_{\rm P}^2H^2/V=1+\tfrac13\epsilon, \qquad (74)$$
and substituting Eq. (36) into Eq. (29) gives
$$-3H\dot\phi/V'=1-\tfrac13\epsilon+\tfrac13\eta. \qquad (75)$$
These are improvements in the slow-roll formulas, valid to linear order in $\epsilon$ and $\eta$. Squaring the last equation gives
$$\epsilon_H/\epsilon=1-\tfrac43\epsilon+\tfrac23\eta, \qquad (76)$$
and Eq. (36) is
$$\delta=\epsilon-\eta. \qquad (77)$$
Inserting these four expressions into Eq. (73) gives
$$\frac25\,\mathcal P_{\mathcal R}^{1/2}(k)=\delta_H=\frac{1}{5\sqrt3\,\pi M_{\rm P}^3}\frac{V^{3/2}}{|V'|}\left[1-\left(2C+\frac16\right)\epsilon+\left(C-\frac13\right)\eta+O(\xi^2)\right]. \qquad (78)$$
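The fractional error in this improved expression for $\mathcal P_{\mathcal R}$ is expected to be of order $O(\xi^2)$, plus terms quadratic in $\epsilon$ and $\eta$ that we did not display. The $\xi^2$ term will be present, because it contributes to the variation per Hubble time of $\eta$ (Eq. (50)) which is being ignored.

Using $k=aH$ with Eqs. (30), (74) and (75) gives the improved formula
$$M_{\rm P}^{-1}\,|d\phi/d\ln k|=\sqrt{2\epsilon}\left(1+\tfrac13\epsilon+\tfrac13\eta\right). \qquad (79)$$
This leads to
$$\tfrac12(n-1)=-3\epsilon+\eta-\left(\tfrac53+12C\right)\epsilon^2+(8C-1)\epsilon\eta+\tfrac13\eta^2-\left(C-\tfrac13\right)\xi^2+O(\sigma^3). \qquad (80)$$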
The fractional error of order $\sigma^3$ comes from differentiating the error of order $\xi^2$ in Eq. (78) ($d\xi^2/d\ln k$ is given by Eq. (51)). Contrary to what is stated in [183], the actual coefficient of $\sigma^3$ cannot be evaluated without going back to the exact equation. There will also be error terms cubic in $\epsilon$, $\eta$ and $\xi$, that we do not display.

The improved solution becomes exact in the case of power-law inflation ($a\propto t^p$) when $\epsilon_H=-\delta$ is constant, and in the case of $V=V_0\pm\tfrac12m^2\phi^2$ in the limit $\phi\to0$ when $\epsilon\to0$ and $\delta$ becomes constant.

In some models, the improvement is big enough to measure with fixed values of the parameters in the potential. But in the cases that have been examined to date, this change can be practically cancelled by varying the parameters. As a result, the improvement is probably going to be useful only if gravitational waves are detected (Section 3.5).

3.4. The number of e-folds of slow-roll inflation

A model of inflation will give us an inflationary potential $V(\phi)$, and a prescription for the value $\phi_{\rm end}$ of the field at the end of slow-roll inflation. This is not enough to work out the prediction for $\mathcal P_{\mathcal R}(k)$, because we need to know the value of $\phi$ when a given cosmological scale $k$ leaves the horizon. Using Eq. (40), we can do this if we know the number $N(\phi)$ of e-folds of slow-roll inflation taking place after that epoch. The model will give $d\ln k/d\phi$ (through Eq. (48)), so we need this information for just one cosmological scale.

For definiteness, let us consider the scale $k^{-1}=H_0^{-1}=3000h^{-1}$ Mpc, which is the biggest cosmological scale of interest. As this is more or less the scale probed by COBE, we denote it by a subscript COBE. The number of e-folds of inflation after this scale leaves the horizon is
$$N_{\rm COBE}=\ln(a_{\rm end}/a_{\rm COBE}). \qquad (81)$$
The absolute limit of direct observation is $2H_0^{-1}$, the distance to the particle horizon in a flat, matter-dominated Universe. Since the prediction is made for a randomly placed observer in a much bigger patch, bigger scales in principle contribute to it, but sensitivity rapidly decreases outside our horizon. Only if the spectrum increases sharply on very large scales [127,111] might there be a significant effect. This Grishchuk–Zeldovich effect is not present in any model of inflation that has been proposed so far.
Since this scale is the one entering the horizon now, $a_{\rm COBE}H_{\rm COBE}=a_0H_0$ where the subscript 0 indicates the present epoch. This leads to
$$N_{\rm COBE}=\ln\frac{a_{\rm end}H_{\rm end}}{a_0H_0}-\ln\frac{H_{\rm end}}{H_{\rm COBE}}. \qquad (82)$$
The second term will be given by the model of slow-roll inflation and is usually $\lesssim1$; for simplicity let us ignore it. The first term depends on the evolution of the scale factor between the end of slow-roll inflation and the present. Assume first that slow-roll inflation gives way promptly to matter domination ($a\propto t^{2/3}$), which is followed by a radiation dominated era ($a\propto t^{1/2}$) lasting until the present matter dominated era begins. Then one has [197,198]
$$N_{\rm COBE}=62-\ln(10^{16}\ {\rm GeV}/V^{1/4})-\tfrac13\ln(V^{1/4}/\rho_{\rm reh}^{1/4}) \qquad (83)$$
($\rho_{\rm reh}^{1/4}$ is the 'reheat' temperature, when radiation domination begins). With $V^{1/4}\sim10^{16}$ GeV and instant reheating this gives $N_{\rm COBE}\simeq62$, the biggest possible value. In fact, $\rho_{\rm reh}^{1/4}$ should probably be no bigger than $10^{10}$ GeV to avoid too many gravitinos [275], and using that value gives $N_{\rm COBE}=58$, perhaps the biggest reasonable value. With $V^{1/4}=10^{10}$ GeV, the lowest scale usually considered, one finds $N_{\rm COBE}=48$ with instant reheating, and $N_{\rm COBE}=39$ if reheating is delayed to just before nucleosynthesis.

The smallest cosmological scale that will be directly probed in the foreseeable future is perhaps six orders of magnitude lower than $H_0^{-1}$, which corresponds to replacing $N_{\rm COBE}$ by $N_{\rm COBE}-6\ln10=N_{\rm COBE}-14$.

The estimates for $N_{\rm COBE}$ are valid only if there is no additional inflation, after slow-roll inflation ends. In fact, there are at least two possibilities for additional inflation. One is that slow-roll gives way smoothly to a significant amount of fast-roll inflation. This does not happen in most models, but it does happen in the rather attractive model described in Sections 6.9 and 8.6. Its effect is to reduce $N_{\rm COBE}$ by some amount $N_{\rm fast}$, which is highly model-dependent. The other possibility is that there is a separate, late era of thermal inflation, as described in footnote 63 of Section 6.10. The minimal assumption of one bout of thermal inflation will reduce $N_{\rm COBE}$ by $N_{\rm thermal}\sim10$.

We want slow-roll inflation to generate structure on all cosmological scales. Taking the smallest one to correspond to $N_{\rm COBE}-15$, and remembering that without thermal inflation $N_{\rm COBE}$ is in the range 40–60, we learn that the amount of additional inflation must certainly satisfy
$$N_{\rm fast}+N_{\rm thermal}<25\ {\rm to}\ 45. \qquad (84)$$
In many models of inflation, $n(k)$ is strongly dependent on $N(\phi)$ at the epoch of horizon exit (see for instance Tables 1 and 2). Then a more stringent upper limit may come from the requirement that $|1-n|<0.2$.

From now on, we shall usually denote $N_{\rm COBE}$ simply by $N$. The more general quantity $N(\phi)$, referring to an arbitrary field value, will always have its argument displayed.

The quantity $N_{\rm fast}$ defined in this way is not identical with the number of e-folds of fast-roll inflation, since $H$ is not constant during such inflation. But the latter provides a rough approximation to $N_{\rm fast}$ if slow-roll is only marginally violated, as in Section 6.9.
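The estimate of Eq. (83) is simple enough to evaluate directly, which makes the dependence on the inflation scale and on the reheat scale explicit. The sketch below assumes the form of Eq. (83) with energy scales of $10^{16}$ GeV and $10^{10}$ GeV for the two inflation cases. The function name and the reheat scale of 10 MeV used to represent 'just before nucleosynthesis' are assumptions of this example.

```python
import math


def n_cobe(v_quarter_gev, rho_reh_quarter_gev):
    """Evaluate Eq. (83): e-folds after the COBE scale leaves the horizon.

    v_quarter_gev       : V^(1/4) at the end of inflation, in GeV
    rho_reh_quarter_gev : rho_reh^(1/4), the reheat scale, in GeV
    """
    return (62.0
            - math.log(1e16 / v_quarter_gev)
            - math.log(v_quarter_gev / rho_reh_quarter_gev) / 3.0)


# High-scale inflation with instant reheating.
print(f"{n_cobe(1e16, 1e16):.1f}")
# Low-scale inflation with instant reheating.
print(f"{n_cobe(1e10, 1e10):.1f}")
# Low-scale inflation with reheating delayed to about 10 MeV (assumed value).
print(f"{n_cobe(1e10, 1e-2):.1f}")
```

Each factor of ten in $V^{1/4}$ at fixed reheating efficiency costs $\ln10\simeq2.3$ e-folds, while each factor of ten of delay in reheating costs only a third of that.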
3.5. Gravitational waves

Inflation also generates gravitational waves, with two independent components $h_{+,\times}$. Perturbing the Einstein action, one finds that each of the quantities $(M_{\rm P}/\sqrt{2})\,h_{+,\times}$ has the same action as a massless scalar field. It follows that $h_{+,\times}$ are independent Gaussian perturbations, whose spectrum on scales far outside the horizon has the time-independent value [283,271]

$$\mathcal{P}_{\rm grav}(k) = \frac{2}{M_{\rm P}^2}\left(\frac{H}{2\pi}\right)^2 . \tag{85}$$

As usual, the right-hand side is evaluated at the epoch of horizon exit $k = aH$. According to the analytic approximation mentioned earlier [284], the relative contribution $C_{\rm grav}(l)/C_{\mathcal R}(l)$ of gravitational waves to the low multipoles is equal to

$$r \equiv 12.4\,\epsilon . \tag{86}$$

We are using $r$ defined by this equation as a convenient measure of the relative importance of the gravitational waves. Using the slow-roll conditions, the spectral index is

$$n_{\rm grav} = -2\epsilon . \tag{87}$$

This is the fourth quantity we calculated from the three quantities $V$, $\epsilon$ and $\eta$, so it will provide a consistency check if gravitational waves are ever detected.

We noted earlier that the primordial gravitational waves will not be detectable by Planck unless $r \gtrsim 0.1$, and are unlikely to be detected in the foreseeable future unless $r \gtrsim 0.01$. Most models of inflation give a much smaller value [219]. To see why, note first that the waves are significant only up to $l \sim 100$, corresponding to the first 4 or so e-folds of inflation after our Universe leaves the horizon. From Eq. (39), this means that the field variation is at least of order the Planck scale,

$$\Delta\phi \simeq 4\sqrt{2\epsilon}\,M_{\rm P} = 0.5\,M_{\rm P}\,(r/0.1)^{1/2} . \tag{88}$$

Afterwards, we have say $\sim 50$ e-folds more inflation, which will increase the total $\Delta\phi$. In models where $\epsilon$ increases with time this gives

$$\Delta\phi \gtrsim M_{\rm P}\,(r/0.1)^{1/2} . \tag{89}$$

Then detectable gravitational waves require $\Delta\phi \gtrsim 2$ to $6\,M_{\rm P}$, placing the inflation model out of theoretical control. In models where $\epsilon$ decreases with time, the extra change in $\phi$ need not be significant, making it possible to generate detectable gravitational waves in models with $\Delta\phi \gtrsim 0.2$ to $0.5\,M_{\rm P}$. Of the models proposed so far in the framework of particle theory, only tree-level hybrid inflation is of the latter type ($V = V_0 + \frac{1}{2}m^2\phi^2$ with the first term dominating, or the same thing with a higher power of $\phi$). But in most versions of hybrid inflation the field is small, the only exception so far being Ref. [222].

Another viewpoint is to look at the COBE normalization Eq. (45). It can be written

$$V^{1/4} = (2.0\times 10^{16}\,{\rm GeV})\,(r/0.1)^{1/4} , \tag{90}$$

so detectable waves require $V^{1/4} \gtrsim 1\times 10^{16}$ GeV. Such a big value is the exception rather than the rule for existing models.
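The scalings with $r$ used in this argument are easy to tabulate. The sketch below assumes Eq. (86) in the form $r = 12.4\,\epsilon$, Eq. (88) in the form $\Delta\phi \simeq 4\sqrt{2\epsilon}\,M_{\rm P}$ for the first 4 e-folds, and Eq. (90) in the form $V^{1/4} = 2.0\times 10^{16}\,{\rm GeV}\,(r/0.1)^{1/4}$; the function names are illustrative.

```python
import math

def epsilon_from_r(r):
    """Invert Eq. (86), r = 12.4 * epsilon."""
    return r / 12.4

def delta_phi_first_efolds(r, n_efolds=4.0):
    """Field variation in units of M_P over the first few e-folds,
    using |d(phi)/dN| = sqrt(2 * epsilon) * M_P with epsilon held fixed."""
    return n_efolds * math.sqrt(2.0 * epsilon_from_r(r))

def v_quarter_gev(r):
    """Inflationary energy scale V^{1/4} in GeV from Eq. (90)."""
    return 2.0e16 * (r / 0.1) ** 0.25

for r in (0.1, 0.01):
    print(f"r = {r}: delta_phi = {delta_phi_first_efolds(r):.2f} M_P, "
          f"V^(1/4) = {v_quarter_gev(r):.2e} GeV")
```

For $r$ between 0.01 and 0.1 this reproduces the range $\Delta\phi \simeq 0.2$ to $0.5\,M_{\rm P}$ and an energy scale of order $10^{16}$ GeV.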
We conclude that a detectable gravitational wave signal is unlikely. If such a signal is present, Eqs. (43), (54), (86) and (87) and more accurate versions of them will allow one to deduce $V(\phi)$ and its low derivatives. This is the 'reconstruction' programme [200]. Note that it will estimate $V(\phi)$ only on the limited portion of the trajectory corresponding to the ten or so e-folds occurring while cosmological scales leave the horizon.

3.6. Before observable inflation

The only era of inflation that is directly relevant for observation is the one beginning when the observable Universe leaves the horizon. This era of 'observable' inflation will undoubtedly be preceded by more inflation, but all memory of earlier inflation is lost apart from the starting values of the fields at the beginning of observable inflation. Nevertheless, one ought to try to understand the earlier era if only to check that the assumed starting values are not ridiculous.

A complete history of the Universe will presumably start when the energy density is at the Planck scale. (Recall that $V^{1/4}$ is at least two orders of magnitude lower during observable inflation.) The usual hypothesis is that the scalar fields at that epoch take on chaotically varying values as one moves around the universe, inflation occurring in patches where conditions are suitable [202,205]. The observable Universe is located in one of these patches, and from now on we consider only it.

One would indeed like to start the descent from the Planck scale with an era of inflation, for at least two reasons. One, which applies only to the case of positive spatial curvature, is to avoid having the Universe collapse in a few Planck times (or fine-tune the initial density parameter $\Omega$). The other, which applies in any case, is to have an event horizon so that the homogeneous patch within which we are supposed to live is not eaten up by its inhomogeneous surroundings.

However, there is no reason to suppose that this initial era of inflation is of the slow-roll variety. The motivation for slow-roll comes from the observed fact that $\delta_H$ is almost scale-independent, which applies only during the relatively brief era when cosmological scales are leaving the horizon. In the context of supergravity, where achieving slow-roll inflation requires rather delicate conditions, it might be quite attractive to suppose that non-slow-roll inflation takes the Universe down from the Planck scale with slow-roll setting in only much later. A well-known potential that can give non-slow-roll inflation is $V \propto \exp(\sqrt{2/p}\,\phi/M_{\rm P})$, which gives $a \propto t^p$ and corresponds to non-slow-roll inflation in the regime where $p$ is bigger than 1 but not much bigger.

Well before observable inflation, it is possible to have an era of 'eternal inflation' during which the motion of the inflaton field is dominated by the quantum fluctuation. The condition for this to occur is that the predicted spectrum $\mathcal{P}_{\mathcal R}$ be formally bigger than 1 [286].

We discount, for the moment, the fascinating possibility that additional space dimensions open up well below the Planck scale. We also do not consider the idea that a complete (open or closed) inflating universe is created by a quantum process, with energy density already far below the Planck scale [135,210,72,42]. A more modest proposal [119] is that our Universe is located within a bubble, which nucleated at a low-energy scale, but the universe within which that bubble originated is still supposed to have begun at the Planck scale.

Eternal inflation taking place at large field values is discussed in detail in Refs. [204,211]. The corresponding phenomenon for inflation near a maximum was noted earlier by a number of authors.
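To see why the exponential potential $V \propto \exp(\sqrt{2/p}\,\phi/M_{\rm P})$ mentioned in Section 3.6 leaves the slow-roll regime when $p$ is not much bigger than 1, one can evaluate the flatness parameter numerically. The sketch below assumes the usual definition $\epsilon = \frac{1}{2}M_{\rm P}^2 (V'/V)^2$ and works in units $M_{\rm P} = 1$; the function names and the finite-difference step are illustrative choices.

```python
import math

M_P = 1.0  # reduced Planck mass; field values are in units of M_P

def potential(phi, p, v0=1.0):
    """Exponential potential V = v0 * exp(sqrt(2/p) * phi / M_P)."""
    return v0 * math.exp(math.sqrt(2.0 / p) * phi / M_P)

def epsilon(phi, p, step=1.0e-5):
    """Flatness parameter epsilon = (M_P^2 / 2) (V'/V)^2,
    with V' estimated by a central finite difference."""
    slope = (potential(phi + step, p) - potential(phi - step, p)) / (2.0 * step)
    return 0.5 * M_P**2 * (slope / potential(phi, p)) ** 2

for p in (1.5, 4.0, 100.0):
    print(f"p = {p}: epsilon = {epsilon(0.3, p):.4f}, 1/p = {1.0 / p:.4f}")
```

The result is $\epsilon = 1/p$ independently of $\phi$, so the flatness condition $\epsilon \ll 1$ holds only for $p \gg 1$, while the expansion $a \propto t^p$ is accelerating for any $p > 1$.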
With all this in mind, let us ask what might precede observable inflation, with a view to seeing what initial conditions for the latter might be reasonable. Going back in time, one might find a smooth inflationary trajectory going all the way back to an era when $V$ is at the Planck scale (or at any rate much bigger than its value during observable inflation). In that case the inflaton field will probably be decreasing during inflation.

Another natural possibility is for the inflaton to find itself near a maximum of the potential before observable inflation starts. Then there may be eternal inflation followed by slow-roll inflation. If the maximum is a fixed point of the symmetries it is quite natural for the field to have been driven there by its interaction with other fields. Otherwise it could arrive there by accident, though this is perhaps only reasonable if the distance from the maximum to the minimum is $\gtrsim M_{\rm P}$ (see, for instance, Ref. [165] for an example). In this latter case, the fact that eternal inflation occurs near the maximum may help to enhance the probability of inflation starting there [203]. If the maximum is a fixed point, the inflaton field might be placed there through a coupling with another field, with that field initially inflating [149]. Alternatively, it may be placed at the origin through thermal corrections to the potential [201,6], but this mechanism is difficult to implement.

In summary, two kinds of initial conditions seem reasonable. One is to have the inflaton moving towards the origin, the idea being that the field value is initially at least of order $M_{\rm P}$. The other is to have the inflaton moving away from a maximum of the potential, preferably located at the origin. We emphasize that these are just speculations; to make a definite statement, one needs a definite model going back to the Planck scale.
4. Calculating the curvature perturbation generated by inflation

This section is somewhat specialized, and may be omitted by the general reader. It concerns the calculation of the spectrum $\mathcal{P}_{\mathcal R}$ of the primordial curvature perturbation $\mathcal{R}$. We first consider the standard case of a single-component inflaton; essentially all of the models considered in the text are of this kind. Then we explain the concept of a multi-component inflaton, and see how to extend the calculation to that case.

In both cases we use an approach that has only recently been developed [274,277,198], though its starting point can already be seen in the first calculations [134,282,130]. This starting point consists of the following assumption. During any era of the early Universe, the evolution of the relevant quantities along each comoving worldline is practically the same as in an unperturbed Universe, after smoothing on a comoving scale that is well outside the horizon (Hubble distance $H^{-1}(t)$).

The assumed condition seems very reasonable. There needs to be some smoothing scale that makes the perturbations negligible or it would not make sense to talk about a Robertson–Walker Universe. The horizon scale will be big enough, unless there is dramatic new physics on a much bigger scale, and the absence of an observed Grishchuk–Zel'dovich effect [127] or tilted Universe effect [70] more or less assures us that there is no such scale.

We shall see how a comparison of the evolution of different comoving regions provides a simple and powerful technique for calculating the density perturbation. As we discuss later, this approach is quite different from the usual one of writing down, and then solving, a closed set of equations for the perturbations in the relevant degrees of freedom (for instance the components of the inflaton field during inflation). Roughly speaking, the present approach replaces the sequence 'perturb then solve' by the far simpler sequence 'solve then perturb', though it is actually more general than the other approach. For the case of a single-component inflaton it gives a very simple, and completely general, proof of the constancy of $\mathcal{R}$ on scales well outside the horizon. For the multi-component case it allows one to follow the evolution of $\mathcal{R}$, knowing only the evolution of the unperturbed universe corresponding to a given value of the initial inflaton field. So far it has been applied to three multi-component models [274,110,112].

The potential will be something like Eq. (147), with $\psi$ the field corresponding to observable inflation. One initially has hybrid inflation, but in contrast with the usual case the destabilized field takes so long to roll down that it becomes the single inflaton field of observable inflation.

'Smoothing' on a scale $R$ means that one replaces (say) the energy density $\rho(\mathbf{x})$ by $\int d^3x'\, W(|\mathbf{x}'-\mathbf{x}|)\,\rho(\mathbf{x}')$ with $W(y) \simeq 1$ for $y \lesssim R$ and $W \simeq 0$ for $y \gtrsim R$. A simple choice is to take $W = 1$ for $y < R$ and $W = 0$ for $y > R$ (top-hat smoothing).

4.1. The case of a single-component inflaton

We begin with a derivation of the usual result for the single-component case. The assumption about the evolution along each comoving worldline is invoked only at the very end, when it is used to establish the constancy of $\mathcal{R}$, which up till now has only been demonstrated for special cases. Otherwise the proof is the standard one [197,198], but it provides a useful starting point for the multi-component case.

A few Hubble times after horizon exit during inflation, when $\mathcal{R}(k,t)$ can first be regarded as a classical quantity, its spectrum can be calculated using the relation [167,197,198]

$$\mathcal{R}(\mathbf{x}) = H\,\Delta\tau(\mathbf{x}) , \tag{91}$$
where $\Delta\tau$ is the separation of the comoving hypersurface (with curvature $\mathcal{R}$) from a spatially flat one coinciding with it on average. The relation is generally true, but we apply it at an epoch a few Hubble times after horizon exit during inflation. On a comoving hypersurface the inflaton field $\phi$ is uniform, because the momentum density $\dot\phi\,\nabla\phi$ vanishes. It follows that

$$\Delta\tau(\mathbf{x}) = -\delta\phi(\mathbf{x})/\dot\phi , \tag{92}$$

where $\delta\phi$ is defined on the flat hypersurface. Note that the comoving hypersurfaces become singular (infinitely distorted) in the slow-roll limit $\dot\phi \to 0$, so that to first order in slow-roll any non-singular choice of hypersurface could actually be used to define $\delta\phi$.

The spectrum of $\delta\phi$ is calculated by assuming that well before horizon exit (when the particle concept makes sense) $\delta\phi$ is a practically massless free field in the vacuum state. Using the flatness and slow-roll conditions one finds, a few Hubble times after horizon exit, the famous result [197,198] $\mathcal{P}_\phi = (H/2\pi)^2$, which leads to the usual formula Eq. (43) for the spectrum.

In [197] there is an incorrect minus sign on the right-hand side.
However, this result refers to $\mathcal{R}$ a few Hubble times after horizon exit, and we need to check that $\mathcal{R}$ remains constant until the radiation-dominated era where we need it. To calculate the rate of change of $\mathcal{R}$ we proceed as follows [221,197,198].

In addition to the energy density $\rho(\mathbf{x},t)$ and the pressure $P(\mathbf{x},t)$, we consider a locally defined Hubble parameter $H(\mathbf{x},t) = \frac{1}{3}D_\mu u^\mu$, where $u^\mu$ is the four-velocity of comoving worldlines and $D_\mu$ is the covariant derivative. (The quantity $3H$ is often denoted by $\theta$ in the literature.) The Universe is sliced into comoving hypersurfaces, and each quantity is split into an average ('background') plus a perturbation,

$$\rho(\mathbf{x},t) = \rho(t) + \delta\rho(\mathbf{x},t) \tag{93}$$

and so on. (We use the same symbol for the local and the background quantity since there is no confusion in practice.) As usual, $\mathbf{x}$ is the Cartesian position-vector of a comoving worldline and $t$ is the time. To first order, perturbations 'live' in unperturbed space–time, since the inclusion of the perturbation in the space–time metric when describing the evolution of a perturbation would be a second-order effect. We ignore the anisotropic stress of the early Universe, since it is unlikely to affect the constancy of $\mathcal{R}$ [198].

The locally defined quantities satisfy [133,215,221,197,198]

$$H^2(\mathbf{x},t) = M_{\rm P}^{-2}\rho(\mathbf{x},t)/3 + \frac{2}{3}\nabla^2\mathcal{R} . \tag{94}$$

The Laplacian acts on comoving hypersurfaces. This is the Friedmann equation except that $K(\mathbf{x},t) \equiv -(2/3)\,a^2\nabla^2\mathcal{R}$ need not be constant. The evolution along each worldline is

$$d\rho(\mathbf{x},t)/d\tau = -3H(\mathbf{x},t)\left(\rho(\mathbf{x},t) + P(\mathbf{x},t)\right) , \tag{95}$$

$$dH(\mathbf{x},t)/d\tau = -H^2(\mathbf{x},t) - \frac{1}{6}M_{\rm P}^{-2}\left(\rho(\mathbf{x},t) + 3P(\mathbf{x},t)\right) - \frac{1}{3}\frac{\nabla^2\delta P}{\rho + P} . \tag{96}$$

Except for the last term these are the same as in an unperturbed universe. If that term vanishes $\mathcal{R}$ is constant, but otherwise one finds

$$\dot{\mathcal{R}} = -H\,\delta P/(\rho + P) . \tag{97}$$

In this equation we have in mind that $\rho$ and $P$ are the unperturbed quantities, depending only on $t$, though as we are working to first order in the perturbations it would make no difference if they were the locally defined quantities.

According to Eq. (97), $\mathcal{R}$ will be constant if $\delta P$ is negligible. We now show that this is so, by first demonstrating that $\delta\rho$ is negligible, and then using the new viewpoint to see that $P$ will be a practically unique function of $\rho$, making $\delta P$ also negligible.

This includes the case that the perturbation being evolved is itself a perturbation in the metric, such as the gravitational wave amplitude or the spatial curvature perturbation $\mathcal{R}$.

If the inflaton field oscillates after inflation, $\mathcal{R}$ becomes singular when $\dot\phi$ passes through zero. But then one can work with $\delta\phi$ to show that $\mathcal{R}$ is still constant away from the singularity.
From now on, we work with Fourier modes, represented by the same symbol, and replace $\nabla^2$ by $-(k/a)^2$. Extracting the perturbations from Eq. (94) gives

$$2\frac{\delta H}{H} = \frac{\delta\rho}{\rho} - \frac{2}{3}\left(\frac{k}{aH}\right)^2\mathcal{R} . \tag{98}$$

This allows one to calculate the evolution of $\delta\rho$ from Eq. (95), but we have to remember that the proper-time separation of the hypersurfaces is position-dependent. Writing $\tau(\mathbf{x},t) = t + \delta\tau(\mathbf{x},t)$ we have [167,221,223]

$$\delta(\dot\tau) = -\delta P/(\rho + P) . \tag{99}$$

Writing $\delta\rho/\rho \equiv (k/aH)^2 Z$ one finds [221]

$$(fZ)' = f\,(1+w)\,\mathcal{R} . \tag{100}$$

Here a prime denotes $d/d(\ln a)$ and $f'/f \equiv (5+3w)/2$, where $w \equiv P/\rho$. With $w$ and $\mathcal{R}$ constant, and dropping a decaying mode, this gives

$$Z = \frac{2+2w}{5+3w}\,\mathcal{R} . \tag{101}$$
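The approach to the constant solution of Eq. (101) can be illustrated numerically. The sketch below assumes Eq. (100) in the form $(fZ)' = f(1+w)\mathcal{R}$ with $f'/f = (5+3w)/2$, which for constant $w$ and $\mathcal{R}$ is equivalent to $Z' = (1+w)\mathcal{R} - \frac{1}{2}(5+3w)Z$, and integrates it in $\ln a$ with a simple Euler step. The function names, step count and initial value are illustrative choices.

```python
def evolve_z(w, curvature, z_initial=0.0, n_efolds=20.0, steps=20000):
    """Integrate Z' = (1 + w) R - (5 + 3w)/2 * Z, where a prime is d/d(ln a),
    using explicit Euler steps over n_efolds e-folds of expansion."""
    dlna = n_efolds / steps
    z = z_initial
    for _ in range(steps):
        z += dlna * ((1.0 + w) * curvature - 0.5 * (5.0 + 3.0 * w) * z)
    return z

def z_constant_mode(w, curvature):
    """Constant solution of Eq. (101): Z = (2 + 2w) / (5 + 3w) * R."""
    return (2.0 + 2.0 * w) / (5.0 + 3.0 * w) * curvature

# Matter domination (w = 0) and radiation domination (w = 1/3),
# starting far from the constant solution.
for w in (0.0, 1.0 / 3.0):
    print(w, evolve_z(w, 1.0, z_initial=5.0), z_constant_mode(w, 1.0))
```

Whatever the starting value, the decaying mode dies away within a few e-folds and $Z$ settles to a number of order $\mathcal{R}$, so that $\delta\rho/\rho = (k/aH)^2 Z$ is suppressed well outside the horizon.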
More generally, integrating Eq. (100) will give $|Z| \sim |\mathcal{R}|$ for any reasonable variation of $w$ and $\mathcal{R}$. Even for a bizarre variation there is no scale dependence in either $w$ (obviously) or in $\mathcal{R}$ (because Eq. (107) gives it in terms of $\delta P$, and we will see that if $\delta P$ is significant it is scale-independent). In all cases, $\delta\rho/\rho$ becomes negligible on scales sufficiently far outside the horizon.

The discussion so far applies to each Fourier mode separately, on the assumption that the corresponding perturbation is small. To make the final step, of showing that $\delta P$ is also negligible, we need to consider the full quantities $\rho(\mathbf{x},t)$ and so on. But we still want to consider only scales that are well outside the horizon, so we suppose that all quantities are smoothed on a comoving scale somewhat shorter than the one of interest. The smoothing removes Fourier modes on scales shorter than the smoothing scale, but has practically no effect on the scale of interest.

Having done this, we invoke the assumption that the evolution of the Universe along each worldline is practically the same as in an unperturbed universe. In the context of slow-roll inflation, this means that the evolution is determined by the inflaton field at the 'initial' epoch a few Hubble times after horizon exit. To high accuracy, $\rho$ and $P$ are well-defined functions of the initial inflaton field, and if it has only one component this means that they are well-defined functions of each other. Therefore $\delta P$ will be very small on comoving hypersurfaces because $\delta\rho$ is.

Finally, we note for future reference that $\delta H$ is also negligible because of Eq. (98).

4.2. The multi-component case

So far we have assumed that the slow-rolling inflaton field is essentially unique. What does 'essentially' mean in this context?
A strictly unique inflaton trajectory would be one lying in a steep-sided valley in field space. This is not very likely in a realistic model. Rather there will be a whole family of possible inflaton trajectories, lying in the space of two or more real fields $\phi_1, \phi_2, \ldots$.

Usually, though, the different trajectories are completely equivalent, so that we still have an 'essentially' unique inflaton field. For instance, in many cases the inflaton field is the modulus of a complex field, with $V$ independent of the phase. Each choice of the phase gives a different but equivalent inflaton trajectory in the space of the complex field. Also, there may be a field (or fields) $a$, unrelated to the inflaton field, which has practically zero mass. Different choices of $a$ lead to different inflaton trajectories in field space, but in the usual case that $a$ has no cosmological effect these trajectories will again be equivalent.

In both of these cases, one can modify things so that the trajectories are inequivalent. In the case of the complex field, it might be that $V$ is a function of both the real and imaginary parts, call them $\phi_1$ and $\phi_2$, with $V$ satisfying the flatness conditions Eqs. (31) and (32) as a function of each field separately. Then there will in general be a family of curved inflaton trajectories, corresponding to the lines of steepest descent, which are inequivalent. In this case, it is useful to think of the inflaton as a two-component object $(\phi_1, \phi_2)$. More generally, there might be a family of curved inflaton trajectories in the space of several fields, so that there is a multi-component inflaton.

In the case of an unrelated massless field $a$, that field might survive and be stable, to become dark matter after it starts to oscillate about its minimum. The inflaton trajectories are now inequivalent, but the inequivalence shows up only when the oscillation starts. The vacuum fluctuation of $a$ during inflation then turns into an isocurvature density perturbation. Extensions of the standard model typically contain a field which can have just these properties, namely the axion.

Postponing until later the discussion of this case, we continue discussion of the multi-component case. Multi-component inflaton models generally have just two components, and are called double inflation models because the trajectory can lie first in the direction of one field, and then in the direction of the other. They were first proposed in the context of non-Einstein gravity [285,171,238,120,270,239,121,68,69,288,109,110]. By redefining the fields and the space–time metric one can recover Einstein gravity, with fields that are not small on the Planck scale and in general non-canonical kinetic terms and a non-polynomial potential. Then models with canonical kinetic terms were proposed [272,258–262,24,172,138,140,105,274], with potentials such as $V = \lambda_1\phi_1^p + \lambda_2\phi_2^q$. These potentials too inflate in the large-field regime where theory provides no guidance about the form of the potential. However there seems to be no bar to having a multi-component model with field values $\ll M_{\rm P}$, and one may yet emerge in a well-motivated particle theory setting. In that case a hybrid model might emerge, though the models proposed so far are all of the non-hybrid type (i.e., the multi-component inflaton is entirely responsible for the potential).

If $k/a$ is the smoothing scale, the assumption that the evolution is the same as in an unperturbed universe with the same initial inflaton field has in general errors of order $(k/aH)^2$. In the single-component case, where $\delta P$ is also of this order, we cannot use the assumption to actually calculate it, but neither is it of any interest.

In this brief survey we have focussed on the era when cosmological scales leave the horizon. In the hybrid inflation model of Refs. [166,112], the 'other' field is responsible for the last several e-folds of inflation, so one is really dealing with a two-component inflaton (in a non-hybrid model). The scales corresponding to the last few e-folds are many orders of magnitude shorter than the cosmological scales, but it turns out that the perturbation on them is big so that black holes can be produced. This phenomenon was investigated in Refs. [266,112]. The second reference also investigated the possible production of topological defects, when the first field is destabilized.
4.3. The curvature perturbation

It is assumed that while cosmological scales are leaving the horizon all components of the inflaton have the slow-roll behaviour

$$3H\dot\phi_a = -V_{,a} . \tag{102}$$

(The subscript $,a$ denotes the derivative with respect to $\phi_a$.) Differentiating this and comparing it with the exact expression $\ddot\phi_a + 3H\dot\phi_a + V_{,a} = 0$ gives consistency provided that

$$M_{\rm P}^2\,(V_{,a}/V)^2 \ll 1 , \tag{103}$$

$$M_{\rm P}^2\,|V_{,ab}/V| \ll 1 . \tag{104}$$

(The second condition could actually be replaced by a weaker one but let us retain it for simplicity.) One expects slow-roll to hold if these flatness conditions are satisfied. Slow-roll plus the first flatness condition imply that $H$ (and therefore $\rho$) is slowly varying, giving quasi-exponential inflation. The second flatness condition ensures that $\dot\phi_a$ is slowly varying.

It is not necessary to assume that all of the fields continue to slow-roll after cosmological scales leave the horizon. For instance, one or more of the fields might start to oscillate, while the others continue to support quasi-exponential inflation, which ends only when slow-roll fails for all of them. Alternatively, the oscillation of some field might briefly interrupt inflation, which resumes when its amplitude becomes small enough. (Of course, these things might happen while cosmological scales leave the horizon too, but that case will not be considered.)

Expression Eq. (91) for $\mathcal{R}$ still holds in the multi-component case. Also, one still has $\Delta\tau = -\delta\phi/\dot\phi$ if $\delta\phi$ denotes the component of the vector $\delta\phi_a$ parallel to the trajectory. (The momentum density seen by an observer orthogonal to an arbitrary hypersurface is $\dot\phi_a\nabla\phi_a$.) A few Hubble times after horizon exit the spectrum of every inflaton field component, in particular the parallel one, is still $(H/2\pi)^2$. If $\mathcal{R}$ had no subsequent variation this would lead to the usual prediction, but we are considering the case where the variation is significant. It is given in terms of $\delta P$ by Eq. (97), and when $\delta P$ is significant it can be calculated from the assumption that the evolution along each worldline is the same as for an unperturbed universe with the same initial inflaton field. This will give

$$\delta P = P_{,a}\,\delta\phi_a , \tag{105}$$

where $\delta\phi_a$ is evaluated at the initial epoch and the function $P(\phi_1, \phi_2, \ldots, t)$ represents the evolution of $P$ in an unperturbed universe. Choosing the basis so that one of the components is the parallel one, and remembering that all components have spectrum $(H/2\pi)^2$, one can calculate the final spectrum of $\mathcal{R}$. The only input is the evolution of $P$ in the unperturbed universe corresponding to a generic initial inflaton field (close to the classical initial field).

In this discussion we started with Eq. (91) for the initial $\mathcal{R}$, and then invoked Eq. (97) to evolve it. The equations can actually be combined to give

$$\mathcal{R} = \delta N , \tag{106}$$

where $N = \int H\,d\tau$ is the number of Hubble times between the initial flat hypersurface and the final comoving one on which $\mathcal{R}$ is evaluated. This remarkable expression was given in Ref. [285] and proved in Refs. [274,277]. The approach we are using is close to the one in the last reference.
The proof that Eqs. (91) and (97) lead to $\mathcal{R} = \delta N$ is very simple. First combine them to give

$$\mathcal{R}(\mathbf{x},t) = H_1\,\Delta\tau_1(\mathbf{x}) - \int_{t_1}^{t} H(t)\,\frac{\delta P}{\rho + P}\,dt , \tag{107}$$

where $t_1$ is a few Hubble times after horizon exit. Then use Eq. (99) to give

$$\mathcal{R}(\mathbf{x},t) = H_1\,\Delta\tau_1(\mathbf{x}) + \int_{t_1}^{t} H(t)\,\delta\dot\tau(\mathbf{x},t)\,dt . \tag{108}$$

As we remarked at the end of Section 4.1, $\delta H$ is negligible. As a result, this can be written

$$\mathcal{R}(\mathbf{x},t) = H_1\,\Delta\tau_1(\mathbf{x}) + \delta\int_{t_1}^{t} H(\mathbf{x},t)\,\dot\tau(\mathbf{x},t)\,dt . \tag{109}$$

Finally, redefine $\tau(\mathbf{x},t)$ so that it vanishes on the initial flat hypersurface, which gives the desired relation $\mathcal{R} = \delta N$.

In Ref. [277] this relation is derived using an arbitrary smooth interpolation of hypersurfaces between the initial and final one, rather than by making the sudden jump to a comoving one. Then $H$ is replaced by the corresponding quantity $\tilde H$ for worldlines orthogonal to the interpolation (incidentally making $\delta\tilde H$ non-negligible). One then finds $\mathcal{R} = \delta\tilde N$. One also finds that the right-hand side is independent of the choice of the interpolation, as it must be for consistency. If the interpolating hypersurfaces are chosen to be comoving except very near the initial one, $\tilde N \simeq N$, which gives the desired formula $\mathcal{R} = \delta N$.

4.4. Calculating the spectrum and the spectral index

Now we derive explicit formulas for the spectrum and the spectral index, following [277]. Since the evolution of $H$ along a comoving worldline will be the same as for a homogeneous universe with the same initial inflaton field, $N$ is a function only of this field and we have

$$\mathcal{R} = N_{,a}\,\delta\phi_a . \tag{110}$$

(Repeated indices are summed over and the subscript $,a$ denotes differentiation with respect to $\phi_a$.) The perturbations $\delta\phi_a$ are Gaussian random fields generated by the vacuum fluctuation, and have a common spectrum $(H/2\pi)^2$. The spectrum $\delta_H^2 \equiv (4/25)\,\mathcal{P}_{\mathcal R}$ is therefore

$$\delta_H^2 = \frac{V}{75\pi^2 M_{\rm P}^2}\,N_{,a}N_{,a} . \tag{111}$$
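As a minimal illustration of the 'solve then perturb' recipe behind Eqs. (110) and (111), the sketch below treats a single-component inflaton with the quadratic potential $V = \frac{1}{2}m^2\phi^2$. It assumes the slow-roll expression $N(\phi) = \int_{\phi_{\rm end}}^{\phi} V/(M_{\rm P}^2 V')\,d\phi$ for the unperturbed universe, differentiates it numerically with respect to the initial field, and inserts the result into $\delta_H^2 = V N_{,a}N_{,a}/(75\pi^2 M_{\rm P}^2)$. The mass, the end point $\phi_{\rm end} = \sqrt{2}\,M_{\rm P}$ and all function names are hypothetical choices made only for this example.

```python
import math

M_P = 1.0      # reduced Planck mass; field values are in units of M_P
MASS = 1.0e-6  # hypothetical inflaton mass in units of M_P

def potential(phi):
    return 0.5 * MASS**2 * phi**2

def potential_prime(phi):
    return MASS**2 * phi

def efolds(phi, phi_end=math.sqrt(2.0), steps=2000):
    """Unperturbed slow-roll N(phi): midpoint-rule integral of
    V / (M_P^2 V') from the assumed end point phi_end up to phi."""
    width = (phi - phi_end) / steps
    total = 0.0
    for i in range(steps):
        mid = phi_end + (i + 0.5) * width
        total += potential(mid) / (M_P**2 * potential_prime(mid))
    return total * width

def efolds_gradient(phi, step=1.0e-4):
    """dN/dphi from two neighbouring unperturbed solutions (central difference)."""
    return (efolds(phi + step) - efolds(phi - step)) / (2.0 * step)

def delta_h_squared(phi):
    """Eq. (111) with a single component."""
    return potential(phi) / (75.0 * math.pi**2 * M_P**2) * efolds_gradient(phi) ** 2

def delta_h_squared_single_field(phi):
    """Usual single-field result V^3 / (75 pi^2 M_P^6 V'^2), for comparison."""
    return potential(phi) ** 3 / (75.0 * math.pi**2 * M_P**6 * potential_prime(phi) ** 2)

phi_initial = 15.0
print(efolds(phi_initial), efolds_gradient(phi_initial))
print(delta_h_squared(phi_initial), delta_h_squared_single_field(phi_initial))
```

The two expressions for $\delta_H^2$ agree, because here $dN/d\phi = M_{\rm P}^{-2}V/V'$. In a multi-component model the same procedure applies with one derivative per field component, each obtained from a separate unperturbed solution.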
The last step is not spelled out in Ref. [277]. The statement that $\tilde N$ is independent of the interpolation is true only on scales well outside the horizon, and its physical interpretation is unclear though it drops out very simply in the explicit calculation.
In the single-component case, $N' = M_{\rm P}^{-2}V/V'$ and we recover the usual expression. In the multi-component case we can always choose the basis fields so that while cosmological scales are leaving the horizon one of them points along the inflaton trajectory, and then its contribution gives the standard result with the orthogonal directions giving an additional contribution. Since the spectrum of gravitational waves is independent of the number of components (being equal to a numerical constant times $V$), the relative contribution $r$ of gravitational waves to the cmb is always smaller in the multi-component case.

The contribution from the orthogonal directions depends on the whole inflationary potential after the relevant scale leaves the horizon, and maybe even on the evolution of the energy density after inflation as well. This is in contrast to the contribution from the parallel direction, which depends only on $V$ and $V'$ evaluated when the relevant scale leaves the horizon. The contribution from the orthogonal directions will be at most of order the one from the parallel direction provided that all $N_{,a}$ are at most of order $M_{\rm P}^{-2}V/V'$. We shall see later that this is a reasonable expectation at least if $\mathcal{R}$ stops varying after the end of slow-roll inflation.

To calculate the spectral index we need the analogue of Eqs. (48) and (39). Using the chain rule and $dN = -H\,dt$ one finds

$$\frac{d}{d\ln k} = -\frac{M_{\rm P}^2}{V}\,V_{,a}\,\frac{\partial}{\partial\phi_a} , \tag{112}$$

$$N_{,a}V_{,a} = M_{\rm P}^{-2}\,V . \tag{113}$$

Differentiating the second expression gives

$$V_{,a}N_{,ab} + N_{,a}V_{,ab} = M_{\rm P}^{-2}\,V_{,b} . \tag{114}$$

Using these results one finds

$$n - 1 = -\frac{M_{\rm P}^2\,V_{,a}V_{,a}}{V^2} - \frac{2}{M_{\rm P}^2\,N_{,a}N_{,a}} + 2\,\frac{M_{\rm P}^2\,N_{,a}N_{,b}V_{,ab}}{V\,N_{,c}N_{,c}} . \tag{115}$$
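The structure of Eq. (115) can be checked against the familiar single-field result. The sketch below assumes Eq. (115) in the form $n - 1 = -M_{\rm P}^2 V_{,a}V_{,a}/V^2 - 2/(M_{\rm P}^2 N_{,a}N_{,a}) + 2M_{\rm P}^2 N_{,a}N_{,b}V_{,ab}/(V N_{,c}N_{,c})$, which is what follows from differentiating Eq. (111) with Eqs. (112) to (114). The function names, the quadratic test potential and its mass are illustrative assumptions.

```python
def spectral_tilt(v, grad_v, hess_v, grad_n, m_p=1.0):
    """n - 1 from the assumed form of Eq. (115).

    v      : potential V at horizon exit
    grad_v : list of first derivatives V_{,a}
    hess_v : nested list of second derivatives V_{,ab}
    grad_n : list of derivatives N_{,a} of the e-fold number
    """
    n_fields = len(grad_v)
    vv = sum(g * g for g in grad_v)
    nn = sum(g * g for g in grad_n)
    nvn = sum(grad_n[a] * hess_v[a][b] * grad_n[b]
              for a in range(n_fields) for b in range(n_fields))
    return (-m_p**2 * vv / v**2
            - 2.0 / (m_p**2 * nn)
            + 2.0 * m_p**2 * nvn / (v * nn))

def quadratic_tilt(phi, mass=1.0e-6):
    """Single-component check with V = m^2 phi^2 / 2 in units M_P = 1,
    for which N' = V / V' = phi / 2."""
    v = 0.5 * mass**2 * phi**2
    return spectral_tilt(v, [mass**2 * phi], [[mass**2]], [phi / 2.0])

print(quadratic_tilt(15.0))  # compare with 2*eta - 6*epsilon = -8 / phi**2
```

For a single component the three terms reduce to $-2\epsilon$, $-4\epsilon$ and $2\eta$, so the formula reproduces $n - 1 = 2\eta - 6\epsilon$; for the quadratic potential this is $-8M_{\rm P}^2/\phi^2$.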
These formulas give the spectrum and spectral index of the density perturbation, if one knows the evolution of the homogeneous universe corresponding both to the classical inflaton trajectory and to nearby trajectories. An important difference in principle from the single-component case is that the classical trajectory is not uniquely specified by the potential, but rather has to be given as a separate piece of information. However, if there are only two components the classical trajectory can be determined from the COBE normalization of the spectrum, and then there is still a prediction for the spectral index.

This treatment can be generalized straightforwardly [277] to the case of non-canonical kinetic terms described by Eq. (135). However, in the regime where all fields are $\ll M_{\rm P}$ one expects the 'curvature' associated with the 'metric' $H_{ab}$ in Eq. (135) to be negligible, and then one can recover the canonical normalization $H_{ab} = \delta_{ab}$ by redefining the fields.

4.5. When will $\mathcal{R}$ become constant?

We need to evaluate $N$ up to the epoch where $\mathcal{R} = \delta N$ has no further time dependence. When will that be?

As long as all fields are slow-rolling, $\mathcal{R}$ is constant if and only if the inflaton trajectory is straight. If it turns through a small angle $\theta$, and the trajectories have not converged appreciably since horizon exit, the fractional change in $\mathcal{R}$ is in fact $2\theta$. Since slow-roll requires that the change in the vector $\dot\phi_a$ during one Hubble time is negligible, the total angle turned is $\ll N$. Hence the relative contribution of the orthogonal directions cannot be orders of magnitude bigger than the one from the parallel direction, if it is generated during slow-roll inflation. (In two dimensions the angle turned cannot exceed $2\pi$ of course, but there could be say a corkscrew motion in more dimensions.)
Later slow-roll may fail for one or more of the "elds, with or without interrupting in#ation, and things become more complicated, but in general there is no reason why R should stop varying before the end of in#ation. Now let us ask what happens after the end of in#ation (or to be more precise, after signi"cant particle production has spoiled the above analysis, which may happen a little before the end). The simplest case is if the relevant trajectories have practically converged to a single trajectory (q), as ? in Ref. [112]. Then R will not vary any more (even after in#ation is over) as soon as the trajectory has been reached. Indeed, setting q"0 at the end of in#ation, this unique trajectory corresponds to a post-in#ationary universe depending only on q. The #uctuation in the initial "eld values causes a #uctuation *q in the arrival time at the end of in#ation, leading to a time-independent R"dN"H *q. What if the trajectory is not unique at the end of in#ation? After the completion of the transition from in#ation to a universe of radiation and matter, Eq. (97) tells us that R will be constant
Thinking in two dimensions and taking the trajectory to be an arc of a circle, a displacement d towards the centre decreases the length of the trajectory by an amount hd , to be compared with the decrease d for the same displacement along the trajectory. (The rms displacements will indeed be the same if the trajectories have not converged.) The speed along the new trajectory is faster in inverse proportion to the length since it is proportional to < and < is "xed at the initial and "nal points on the trajectory. Thus the perpendicular displacement increases N by 2h times the e!ect of a parallel displacement, for h;1.
provided that $\delta P$ and $\delta\rho$ are related in a definite way. This is the case during matter domination ($\delta P=0$) or radiation domination ($\delta P=\delta\rho/3$). Immediately following inflation there might be a quite complicated situation, with 'preheating' [174-178,11,181,155-158] or else the quantum fluctuation of the 'other' field in hybrid models [60] converting most of the inflationary potential energy into marginally relativistic particles in much less than a Hubble time. But after at most a few Hubble times one expects to arrive at a matter-dominated era so that $\mathcal{R}$ is constant. Subsequent events will not cause $\mathcal{R}$ to vary provided that they occur at definite values of the energy density, since again $P$ will have a definite relation with $\rho$. This is indeed the case for the usually considered events, such as the decay of matter into radiation and thermal phase transitions (including thermal inflation).

The conclusion is that it is reasonable to suppose that $\mathcal{R}$ achieves a constant value at most a few Hubble times after inflation. On the other hand, one cannot exclude the possibility that one of the orthogonal components of the inflaton provides a significant additional degree of freedom, allowing $\mathcal{R}$ to have additional variation before we finally arrive at the radiation-dominated era preceding the present matter-dominated era. A commonly considered example is described in Section 4.8.

4.6. Working out the perturbation generated by slow-roll inflation

If $\mathcal{R}$ stops varying by the end of inflation, the final hypersurface can be located just before the end (not necessarily at the very end because that might not correspond to a hypersurface of constant energy density). Then, knowing the potential and the hypersurface in field space that corresponds to the end of inflation, one can work out $N(\phi_1,\phi_2,\ldots)$ using the equations of motion for the fields, and the expression

$$3M_{\rm P}^2H^2=\rho=V+\frac12\,\frac{d\phi_a}{d\tau}\frac{d\phi_a}{d\tau}\,. \qquad (117)$$
To perform such a calculation it is not necessary that all of the fields continue to slow-roll after cosmological scales leave the horizon. In particular, the oscillation of some field might briefly interrupt inflation, which resumes when its amplitude becomes small enough. If that happens it may be necessary to take into account 'preheating' during the interruption.

In general all this is quite complicated, but there is one case that may be extremely simple, at least in a limited regime of parameter space. This is the case

$$V=V_1(\phi_1)+V_2(\phi_2)+\cdots \qquad (118)$$
with each $V_a$ proportional to a power of $\phi_a$. For a single-component inflaton this gives inflation ending at $\phi_{\rm end}\simeq M_{\rm P}$, with cosmological scales leaving the horizon at $\phi\gg\phi_{\rm end}$. If the potentials $V_a$ are identical we recover that case. If they are different, slow-roll may fail in sequence for the
If the 'matter' consists of a nearly homogeneous oscillating scalar field one actually has $\delta P=\delta\rho$ ($=\delta[\frac12(d\phi/d\tau)^2]$), since on comoving hypersurfaces $\phi$, and therefore $V$, is constant. But this still makes $\delta P$ negligible.
different components, but in some regime of parameter space the result for $N$ (at least) might be the same as if it failed simultaneously for all components. If that is the case one can derive simple formulas [259,274], provided that cosmological scales leave the horizon at $\phi_a\gg\phi_a^{\rm end}$ for all components. One has

$$H\,dt=-M_{\rm P}^{-2}\,\frac{V}{V_1'}\,d\phi_1=-M_{\rm P}^{-2}\sum_a\frac{V_a}{V_a'}\,d\phi_a\,. \qquad (119)$$

It follows that

$$N=M_{\rm P}^{-2}\sum_a\int_{\phi_a^{\rm end}}^{\phi_a}\frac{V_a}{V_a'}\,d\phi_a\,. \qquad (120)$$

Since each integral is dominated by the endpoint $\phi_a$, we have $N_{,a}=M_{\rm P}^{-2}V_a/V_a'$ and

$$\delta_H^2=\frac{V}{75\pi^2M_{\rm P}^6}\sum_a\left(\frac{V_a}{V_a'}\right)^2. \qquad (121)$$

The spectral index is given by Eq. (115), which simplifies slightly because $V_{,ab}=\delta_{ab}V_a''$.

The simplest case is $V=\frac12 m_1^2\phi_1^2+\frac12 m_2^2\phi_2^2$. Then $n$ is given by the following formula:

$$1-n=\frac1N\left[\frac{(1+r)(1+\mu^2r)}{(1+\mu r)^2}+1\right], \qquad (122)$$

where $r=\phi_2^2/\phi_1^2$ and $\mu=m_2^2/m_1^2$. If $\mu=1$ this reduces to the single-component formula $1-n=2/N$. Otherwise, it can be much bigger, but note that our assumptions will be valid, if at all, in a restricted region of the $r$-$\mu$ plane.

4.7. An isocurvature density perturbation?

Following the astrophysics usage, we classify a density perturbation as adiabatic or isocurvature with reference to its properties at some epoch during the radiation-dominated era preceding the present matter-dominated era, while it is still far outside the horizon. For an adiabatic density perturbation, the density of each particle species is a unique function of the total energy density. For an isocurvature density perturbation the total density perturbation vanishes, but those of the individual particle species do not. The most general density perturbation is the sum of an adiabatic and an isocurvature perturbation, with $\mathcal{R}$ specifying the adiabatic density perturbation only.

For an isocurvature perturbation to exist the universe has to possess more than the single degree of freedom provided by the total energy density. If the inflaton trajectory is unique, or has become so by the end of inflation, there is only the single degree of freedom corresponding to the fluctuation back and forth along the trajectory and there can be no isocurvature perturbation. Otherwise one of the orthogonal fields can provide the necessary degree of freedom. The simplest way for this to happen is for the orthogonal field to survive, and acquire a potential so that it starts to oscillate and
becomes matter. The start of the oscillation will be determined by the total energy density, but its amplitude will depend on the initial field value so there will be an isocurvature perturbation. It will be compensated, for given energy density, by the perturbations in the other species of matter and radiation which will continue to satisfy the adiabatic condition $\delta\rho_m/\rho_m=\frac34\,\delta\rho_r/\rho_r$.

The classic example of this is the axion field [205,180,218], which is simple because the fluctuation in the direction of the axion field causes no adiabatic density perturbation, at least in the models proposed so far. The more general case, where one of the components of the inflaton may cause both an adiabatic and an isocurvature perturbation, has been looked at in for instance Ref. [261], though not in the context of specific particle physics.

If an isocurvature perturbation in the non-baryonic dark matter density exists, it must not conflict with observation and this imposes strong constraints on, for instance, models of the axion [218,207].

An isocurvature perturbation in the density of a species of matter may be defined by the 'entropy perturbation' [167,223,197,198]

$$S=\frac{\delta\rho_m}{\rho_m}-\frac34\,\frac{\delta\rho_r}{\rho_r}\,, \qquad (123)$$

where $\rho_m$ is the non-baryonic dark matter density. Equivalently, $S=\delta y/y$, where $y=\rho_m/\rho_r^{3/4}$. Since we are dealing with scales far outside the horizon, $\rho_m$ and $\rho_r$ evolve as they would in an unperturbed universe, which means that $y$ is constant and so is $S$. Provided that the field fluctuation is small $S$ will be proportional to it, and so will be a Gaussian random field with a nearly flat spectrum [218,197,198].

For an isocurvature perturbation, $\mathcal{R}$ vanishes during the radiation-dominated era preceding the present matter-dominated era. But on the very large scales entering the horizon well after matter domination, $S$ generates a nonzero $\mathcal{R}$ during matter domination, namely $\mathcal{R}=\frac13 S$. A simple way of seeing this, which has not been noted before, is through the relation Eq. (97).
Since $\delta\rho=0$, one has $S=-(\rho_m^{-1}+\frac34\rho_r^{-1})\,\delta\rho_r$. Then, using $\delta P=\delta\rho_r/3$, $\rho_m/\rho_r\propto a$ and $H\,dt=da/a$ one finds the quoted result by integrating Eq. (97).

As discussed for instance in Refs. [197,198], the large-scale cmb anisotropy coming from an isocurvature perturbation is $\Delta T/T=-(\frac13+\frac1{15})S$, where $S$ is evaluated on the last-scattering surface. The second term is the Sachs-Wolfe effect coming from the curvature perturbation we just calculated, and the first term is the anisotropy $\frac14\,\delta\rho_r/\rho_r$ just after last scattering (on a comoving hypersurface). By contrast the anisotropy from an adiabatic perturbation comes only from the Sachs-Wolfe effect, so for a given large-scale density perturbation the isocurvature perturbation gives an anisotropy six times bigger. As a result an isocurvature perturbation with a flat spectrum cannot be the dominant contribution to the cmb, though one could contemplate a small contribution [295].
If the potential of the 'orthogonal' field already exists during inflation the inflaton trajectory will have a tiny component in its direction, so that it is not strictly orthogonal to the inflaton trajectory. This makes no practical difference. In the axion case the potential is usually supposed to be generated by QCD effects long after inflation.
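The two numerical statements in this subsection can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the densities are given the unperturbed scalings $\rho_m\propto a^{-3}$ and $\rho_r\propto a^{-4}$ in arbitrary units, the coefficients 1/3 and 1/15 are those of the isocurvature anisotropy quoted above, and the function names are illustrative only.

```python
from fractions import Fraction


def y_ratio(a, rho_m0=1.0, rho_r0=1.0):
    """Return y = rho_m / rho_r**(3/4) at scale factor a.

    Assumes unperturbed scalings rho_m ~ a^-3 and rho_r ~ a^-4
    (arbitrary units; the normalizations are illustrative).
    """
    rho_m = rho_m0 * a ** -3
    rho_r = rho_r0 * a ** -4
    return rho_m / rho_r ** 0.75


def anisotropy_ratio():
    """Isocurvature anisotropy divided by the pure Sachs-Wolfe term.

    Isocurvature: (1/3 + 1/15) S; the curvature-induced part alone: (1/15) S.
    """
    intrinsic = Fraction(1, 3)      # radiation anisotropy just after last scattering
    sachs_wolfe = Fraction(1, 15)   # from the curvature perturbation R = S/3
    return (intrinsic + sachs_wolfe) / sachs_wolfe
```

The first function shows that $y$, and hence $S=\delta y/y$, does not evolve while the scale is outside the horizon; the second reproduces the factor of six.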
5. Field theory and the potential

All models of inflation assume the validity of field theory, and in particular the existence of a potential $V$ which is a function of the scalar fields. In this section we discuss, in an elementary way, the form of the scalar field potential that one might expect on the basis of particle theory.

5.1. Renormalizable versus non-renormalizable theories

A given field theory, like the Standard Model or a supersymmetric extension of it, is nowadays regarded as an effective theory. Such a theory is valid when the (biggest) relevant energy scale is less than some 'ultraviolet cutoff', which we shall denote by $\Lambda_{\rm UV}$. In the context of collider physics, the relevant energy scale is usually the collision energy. In the context of inflation, it is usually the value of the inflaton field. It will be helpful to keep these two cases in mind.

In the most optimistic case, $\Lambda_{\rm UV}$ will be the Planck scale $M_{\rm P}$. For field theory in three space dimensions, $\Lambda_{\rm UV}$ presumably cannot be higher than $M_{\rm P}$, since at that scale the theory will be invalidated by effects like the quantum fluctuation of the space-time metric. But it might be lower. In weakly coupled string theory it is suggested (Section 7.9.3) that $\Lambda_{\rm UV}$ is the string scale $M_{\rm str}\simeq g_{\rm str}M_{\rm P}$, where $g_{\rm str}\sim 1$ to $0.1$ is the gauge coupling at the string scale.

At high scales, $n$ compactified space dimensions may become relevant. In that case, the biggest possible value of $\Lambda_{\rm UV}$ is presumably the Planck scale $M_{4+n}$ for gravity with these extra dimensions. It is typically lower than $M_{\rm P}$, as the following argument shows. If $R$ is the size of the compactified dimensions (assumed to be all equal) the Newtonian gravitational force $1/(M_{\rm P}^2r^2)$ is valid only for $r\gtrsim R$. On smaller scales the force is $1/(M_{4+n}^{2+n}r^{2+n})$, where $M_{4+n}$ is the Planck scale for gravity with the $n$ extra dimensions. Matching these expressions at the scale $r\sim R$ one learns that $(M_{4+n}/M_{\rm P})^{2+n}\sim(M_{\rm P}R)^{-n}$. The right-hand side is less than 1, or it would not make sense to talk about the extra dimensions.

At least if one is dealing with a field theory in which the fields are confined to the three space dimensions, $M_{4+n}$ may be a useful estimate of the appropriate renormalization scale. This is what happens in Horava-Witten M-theory [141,199] (there are two sets of fields, each confined to a different three-dimensional space). There is one extra dimension (plus much smaller ones that we do not consider), with $(M_{4+n}/M_{\rm P})\sim 0.1$ and therefore $(M_{\rm P}R)^{-1}\sim 10^{-3}$. Another proposal (Refs. [14,12] and earlier ones cited there) invokes $n=2$, $M_{4+n}\sim 1$ TeV and therefore $R\sim 1$ mm.

If the fields of the inflation model are confined to three space dimensions, extra dimensions per se should make no difference provided that their size is much less than the Hubble distance during inflation. As one easily verifies, this is automatic for $n\geq 2$, given the condition $V^{1/4}<M_{4+n}$ that certainly needs to be imposed. It will also be the case in Horava-Witten M-theory, since the COBE bound, Eq. (45), requires $H\lesssim 10^{-\ldots}M_{\rm P}$.

There is also the proposal that the cutoff is inversely related to the size $L$ of the region that is to be described. For instance, [59] suggests that one needs $\Lambda_{\rm UV}\lesssim\sqrt{M_{\rm P}/L}$; with a box size a few times bigger than the Hubble distance (used when calculating the vacuum fluctuation of the inflaton) this would give $\Lambda_{\rm UV}\sim V^{1/4}$ [59].

All of this refers to the cutoff for a field theory including all of the fields in Nature. In the context of terrestrial and astrophysics, one often considers an effective theory, obtained by integrating
out fields with mass $\gtrsim\Lambda_{\rm UV}$. In the simpler context of inflation model-building though, it is usually supposed that $\Lambda_{\rm UV}$ is associated with the breakdown of field theory itself.

In general, we shall assume that $\Lambda_{\rm UV}\sim M_{\rm P}$, while recognizing occasionally the important possibility of a lower value.

A field theory is specified by the Lagrangian (density) $\mathcal{L}$, such that the action is $S=\int d^4x\,\mathcal{L}$. It has dimension [energy]$^4$, and is a function of the fields and their derivatives with respect to space and time. In a given theory the Lagrangian will contain parameters that define the masses of the particles and their interactions. In a renormalizable theory, the number of parameters is finite, even after quantum effects are included; the Standard Model is such a theory.

Nowadays, a renormalizable theory is regarded as an approximation to a non-renormalizable one. The non-renormalizable theory is supposed to be a complete description of nature on energy scales $\lesssim M_{\rm P}$. The non-renormalizable theory contains an infinite number of parameters, which may be thought of as summarizing the unknown Planck-scale physics, and it can be replaced by the renormalizable theory in any situation where $M_{\rm P}$ can be regarded as infinite.

We are focussing on supersymmetric theories, which can be either renormalizable or non-renormalizable. Supergravity, which is presumed to be the version of supersymmetry chosen by nature, is non-renormalizable. A simpler version, called global supersymmetry, can be renormalizable. Following the usual practice, we shall take the term 'global supersymmetry' to denote a version that is renormalizable, with the possible exception of terms appearing in the superpotential; see Section 7.8. In the usual situations, including most models of inflation, global supersymmetry is supposed to be valid. Global supersymmetry may be broken either explicitly or softly (see below) and both possibilities are considered for inflation models.
An important consideration for inflation model-building is the fact that soft susy breaking coming from the underlying supergravity theory (gravity-mediated susy breaking) has to be weaker than would be expected for a generic theory. Several proposals have been made for achieving this, and at present there is no consensus about which one is correct.

5.2. The Lagrangian

The fields can be classified according to the spin of the corresponding particles; in the Standard Model one has spin 0 (Higgs), spin 1/2 (quarks and leptons) and spin 1 (gauge bosons). Fields with these spins are ubiquitous in extensions of the Standard Model. There is also the graviton with spin 2, and according to supergravity the gravitino with spin 3/2. At the particle physics level, a model of inflation consists of the relevant part of the Lagrangian.
For the present purpose, integrating fields out (of the action) can be taken to mean that the scalar field potential is minimized with respect to them, at fixed values of the fields which are not integrated out. This gives a well-defined field theory, if the motion of the integrated-out fields about this minimum is negligible. That will always be the case if their masses are much bigger than those of the fields that are not integrated out. It is also the case when the coupling between the two sets of fields is of only gravitational strength, even if the integrated-out fields are not particularly heavy; an example is provided by the dilaton and bulk moduli of string theory, which are usually integrated out when considering the other ('matter') fields.
The spin-0 fields are called scalar fields, and they are what we need for inflation. Happily, there are lots of scalar fields in supersymmetric extensions of the Standard Model. This is because every spin 1/2 field is accompanied by either a spin 0 or a spin 1 field, with the first case ubiquitous.

During inflation, only scalar fields exist in the Universe. At the classical level, their evolution is determined by that part of the Lagrangian $\mathcal{L}$ containing only the scalar fields. If only a single, real, scalar field $\phi$ is relevant, the Lagrangian in flat space-time is of the form

$$\mathcal{L}=\frac12\,\partial_\mu\phi\,\partial^\mu\phi-V(\phi)\,. \qquad (124)$$

In this expression, $V(\phi)$ is the potential. The other term is called the kinetic term, and in it $\partial_\mu$ denotes the space-time derivative $\partial/\partial x^\mu$. Up to a field redefinition, this is the only Lorentz-invariant expression containing first derivatives but no higher. The resulting equation of motion is
$$\ddot\phi-\nabla^2\phi+V'(\phi)=0\,, \qquad (125)$$
where the prime denotes $d/d\phi$. For a spatially homogeneous field this becomes
$$\ddot\phi+V'(\phi)=0\,. \qquad (126)$$
This is the same as for a particle moving in one dimension, with position $\phi(t)$ and potential $V(\phi)$.

The assumption of flat space-time corresponds to Special Relativity and negligible gravity. In the expanding Universe we need General Relativity, describing curved space-time. Its effect on the field equation is to introduce an extra term $3H\dot\phi$ on the left-hand side, so that we get Eq. (29). This is analogous to a friction term for particle motion. The extra term is significant only in the context of cosmology.

With a suitable choice of the origin, a non-interacting (free) field has the potential $V=\frac12m^2\phi^2$, where $m$ is the mass of the corresponding particle. The field equation has a time-independent, spatially homogeneous solution $\phi=0$, which represents the vacuum. Plane waves, corresponding to oscillations around the vacuum state, correspond after quantization to non-interacting particles of the species $\phi$, which have mass $m$.

Self-interactions correspond to higher-order terms in $V(\phi)$. In a renormalizable theory, only cubic and quartic terms are allowed. The cubic term is usually forbidden by a symmetry, and dropping it the potential is

$$V=\frac12m^2\phi^2+\frac14\lambda\phi^4\,. \qquad (127)$$

It is assumed that $\lambda\lesssim 1$, because otherwise the interaction would become so strong that $\phi$ would not correspond to a physical particle (the non-perturbative regime). On the other hand, values of $\lambda$ very many orders of magnitude less than 1 are not usually envisaged since they would represent fine-tuning.

The full potential will have an infinite number of terms, and including the cubic one for generality one can write

$$V(\phi)=V_0+\frac12m^2\phi^2+\lambda_3M_{\rm P}\phi^3+\frac14\lambda\phi^4+\sum_{d=5}^{\infty}\lambda_dM_{\rm P}^{4-d}\phi^d+\cdots. \qquad (128)$$

The non-renormalizable ($d>4$) couplings $\lambda_d$ are generically of order 1, though they may be suppressed in a supersymmetric theory as we shall discuss.
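The particle analogy and the friction-like role of the Hubble term can be illustrated numerically. The following Python sketch integrates the homogeneous field equation $\ddot\phi+3H\dot\phi+m^2\phi=0$ for the free potential $V=\frac12m^2\phi^2$. It is only an illustration under simplifying assumptions: $H$ is held constant rather than determined by the energy density, units are arbitrary, and the function name `evolve` and its default step size are choices made here.

```python
def evolve(phi0, m, hubble, dt=1e-3, steps=20000):
    """Integrate phi'' + 3*H*phi' + m^2*phi = 0 from rest at phi0.

    Assumes a constant Hubble parameter (an illustrative simplification).
    Uses a semi-implicit Euler step: velocity first, then position.
    Returns (phi, dphi, energy) with energy = dphi^2/2 + m^2*phi^2/2.
    """
    phi, dphi = phi0, 0.0
    for _ in range(steps):
        acc = -3.0 * hubble * dphi - m * m * phi  # friction plus -V'(phi)
        dphi += acc * dt
        phi += dphi * dt
    energy = 0.5 * dphi ** 2 + 0.5 * m * m * phi ** 2
    return phi, dphi, energy
```

With `hubble=0` the field oscillates like a frictionless particle and the energy stays close to its initial value; with a non-zero `hubble` the oscillation is damped and the energy decays.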
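All this extends to the case of several scalar fields $\phi$, $\psi$, etc. With two fields, the simplest Lagrangian density is

$$\mathcal{L}=\frac12\,\partial_\mu\phi\,\partial^\mu\phi+\frac12\,\partial_\mu\psi\,\partial^\mu\psi-V(\phi,\psi)\,. \qquad (129)$$

The field equations are

$$\ddot\phi-\nabla^2\phi+\partial V(\phi,\psi)/\partial\phi=0\,, \qquad (130)$$

$$\ddot\psi-\nabla^2\psi+\partial V(\phi,\psi)/\partial\psi=0\,. \qquad (131)$$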
The extension to further fields is similar. The potential as a function of all the fields will be a power series. With the origin in field space chosen to be the vacuum, as we are assuming at the moment, the power series for each field will have the form Eq. (128) (no linear term) provided that the other fields are fixed at the origin.

It is often appropriate to combine two real fields $\phi_1$ and $\phi_2$ into a single complex field, defined by convention as

$$\phi=\frac1{\sqrt2}(\phi_1+i\phi_2)\,. \qquad (132)$$

The kinetic term corresponding to Eq. (129) is

$$\mathcal{L}_{\rm kin}=\partial_\mu\phi^*\,\partial^\mu\phi\,. \qquad (133)$$

The use of a complex field is particularly appropriate if the potential depends only on $|\phi|$. Then Eq. (127) is replaced by

$$V(\phi)=V_0+m^2|\phi|^2+\frac14\lambda|\phi|^4\,. \qquad (134)$$

Complex fields are, in any case, part of the language of supersymmetry.

With two or more real fields, it is no longer true that the most general Lorentz-invariant Lagrangian density $\mathcal{L}$ can be reduced to the above form, Eq. (129), by a field redefinition. For several real fields $\phi_n$, the most general kinetic term involving first derivatives is

$$\mathcal{L}_{\rm kin}=\frac12\sum_{mn}H_{mn}\,\partial_\mu\phi_m\,\partial^\mu\phi_n\,, \qquad (135)$$

where $H_{mn}$ is an arbitrary function of the fields. In a supersymmetric theory, all fields are complex and the most general kinetic term has the more restricted form

$$\mathcal{L}_{\rm kin}=\sum_{mn}K_{mn^*}\,\partial_\mu\phi_m\,\partial^\mu\phi_n^*\,, \qquad (136)$$

where $K_{mn^*}\equiv\partial^2K/\partial\phi_m\partial\phi_n^*$ and $K$ is called the Kähler potential.

We recover the canonical expression Eq. (129) only with the canonical choice $K_{mn^*}=\delta_{mn}$. With more than one field it is not in general possible to recover this form by a field redefinition. If it is impossible, the space of the fields is said to be curved. One expects that the curvature scale will be of order $M_{\rm P}$, allowing one to choose $K_{mn^*}=\delta_{mn}$ to high accuracy in the regime $|\phi_n|\ll M_{\rm P}$.
This is the case if the origin $\phi_n=0$ is chosen so that the vacuum values of the fields are $\ll M_{\rm P}$. As we shall see soon, a different choice is more natural for certain fields predicted by string theory.
A non-canonical kinetic term modifies the field equations, so that the slope of the potential no longer has its usual significance. Canonical normalization is assumed when describing slow-roll inflation.

5.3. Internal symmetry

5.3.1. Continuous and discrete symmetries

In addition to Lorentz invariance, the action will usually be invariant under a group of transformations acting exclusively on the fields, with no effect on the space-time indices. This is called an internal symmetry.

Consider first the case of a single real field $\phi$, with $V$ a function of $\phi^2$ as for example in Eq. (127). Then there is invariance under the $Z_2$ group $\phi\to-\phi$. Invariance under a group like this, which has only discrete elements, is called a discrete symmetry.

Now consider the case of a single complex field, with $V$ depending on $|\phi|$ as for example in Eq. (134). Then there is invariance under the $U(1)$ group
$$\phi\to e^{i\chi}\phi\,, \quad \chi\ \mbox{arbitrary}\,, \qquad (137)$$

with $\chi$ an arbitrary real number. This is the case for Eq. (134). Alternatively, there might be invariance under the $Z_N$ group

$$\phi\to e^{i\chi}\phi\,, \quad \chi=2\pi n/N\,, \qquad (138)$$

with $n$ an arbitrary integer. In the limit where the integer $N$ goes to infinity, the $U(1)$ group is recovered. Invariance under a continuous group like $U(1)$ is said to be a continuous symmetry.

A given symmetry group acts on some of the fields, but not on others. The action of a given $Z_N$ or $U(1)$ on the full set of fields may be given by

$$\phi_n\to e^{iq_n\chi}\phi_n\,, \qquad (139)$$

which defines the charge $q_n$ of each field under the given symmetry.

In these expressions, the origin in field space has been taken to be the fixed point of the symmetry group. The gradient of the potential vanishes at the fixed point, which therefore represents a maximum, minimum or saddle point of the potential.

In a supersymmetric theory, it is usual to take all scalar fields to be complex. (Each of them is the partner of a spin-half field that has two components, corresponding to the two possible spin values.) If such a theory emerges from string theory, there are two kinds of field. The most numerous, usually called matter fields, transform under groups built out of $U(1)$'s (continuous symmetries) and $Z_N$'s (discrete symmetries). As in Eqs. (137) and (138) there is a unique fixed point in field space, which is generally chosen as the origin. For a given $U(1)$ (say) the transformation can be brought into the form Eq. (137) with a suitable choice of the directions in field space that define the $\phi_n$, but in general this cannot be done for all of them simultaneously. We then have a non-Abelian group (one whose elements do not commute) such as $SU(N)$.

In addition to the matter fields, there are special fields, namely the dilaton $s$ and certain fields called bulk moduli. In the example we shall discuss in Sections 7.9 and 8.3 there are three of the latter, $t_I$ with $I=1$ to 3. The dilaton and bulk moduli are charged under discrete symmetry groups
that are not built out of Eq. (137), and the most convenient choice of origin for these fields is not the fixed point of these symmetry groups.

A $U(1)$ symmetry is said to be global if $\chi$ in Eq. (137) is independent of space-time position. This is mandatory if no gauge field transforms under the $U(1)$, because then the space-time derivatives in the kinetic term inevitably spoil the symmetry. If gauge fields have a suitable transformation under the $U(1)$, we can allow $\chi$ to depend on position, because the change in the kinetic term is cancelled by a change in the part of the action involving the gauge fields. The symmetry is then said to be a local symmetry, or a gauge symmetry. An example is the electromagnetic gauge field (electromagnetic potential) $A_\mu$. This generalizes to non-Abelian groups. There is an electromagnetic-like interaction associated with each gauge symmetry. The Standard Model is invariant under the gauge symmetry group $SU(3)_C\times SU(2)_L\times U(1)_Y$, the factors corresponding respectively to the colour, left-handed electroweak and hypercharge interactions.

Generalizing from Eq. (139), the fields not affected by a given symmetry group are said to be uncharged under the group, or to be singlets. It is usually supposed that every field is charged under some symmetry, though the opposite possibility of a 'universal singlet' is sometimes considered [250].

5.3.2. Spontaneously broken symmetry and vevs

Any minimum of the potential represents a possible vacuum state, with the scalar fields having the time-independent value corresponding to the minimum. (Such values are indeed solutions of the field equation Eq. (125).) In the examples encountered so far there is a unique minimum, but matters can be more complicated.

As a simple example, consider Eq. (127) with the sign of the mass term reversed,

$$V(\phi)=V_0-\frac12m^2\phi^2+\frac14\lambda\phi^4\,. \qquad (140)$$

It has the same $Z_2$ symmetry as the original potential, corresponding to invariance under $\phi\to-\phi$. But as shown in Fig. 3, the minimum at the origin is replaced by minima at $\phi=\pm m/\sqrt\lambda$. Taking, say, the positive sign, one can define a new field $\tilde\phi=\phi-m/\sqrt\lambda$. Then, if the constant $V_0$ is chosen appropriately, one has near the minimum

$$V=\frac12\tilde m^2\tilde\phi^2+A\tilde\phi^3+B\tilde\phi^4\,, \qquad (141)$$

where $\tilde m=\sqrt2\,m$ and we are not interested in the precise values of $A$ and $B$.

The minima represent possible vacuum expectation values (vevs) of the field. Each of them represents a possible vacuum of the theory, around which are small oscillations corresponding (after quantization) to particles. The oscillations correspond to an almost-free field if the cubic and quartic terms in Eq. (141) are small. (It turns out that the criterion for this is $\lambda\lesssim 1$, which as in the previous case one assumes to be valid.) On the other hand, the original $Z_2$ symmetry will not be evident in this almost-free field theory, and one says that it has been spontaneously broken.
The latter terminology originated with the case of non-Abelian groups, where each field charged under the group is necessarily part of a multiplet of charged fields.
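The statements about the symmetry-breaking potential can be checked numerically. The Python sketch below assumes that Eq. (140) reads $V=V_0-\frac12m^2\phi^2+\frac14\lambda\phi^4$, and verifies that the slope vanishes at $\phi=m/\sqrt\lambda$, that the curvature there corresponds to a mass $\sqrt2\,m$, and that the choice $V_0=m^4/(4\lambda)$ puts the minimum at $V=0$. The function names and the sample parameter values are illustrative assumptions.

```python
import math


def potential(phi, m, lam, v0=0.0):
    """Quartic potential with reversed-sign mass term (assumed form of Eq. 140)."""
    return v0 - 0.5 * m ** 2 * phi ** 2 + 0.25 * lam * phi ** 4


def slope(phi, m, lam):
    """First derivative dV/dphi."""
    return -m ** 2 * phi + lam * phi ** 3


def curvature(phi, m, lam):
    """Second derivative d2V/dphi2, i.e. the squared mass of small oscillations."""
    return -m ** 2 + 3.0 * lam * phi ** 2


def vev(m, lam):
    """Positive minimum of the potential, phi = m / sqrt(lambda)."""
    return m / math.sqrt(lam)


if __name__ == "__main__":
    m, lam = 2.0, 0.5  # arbitrary sample values
    v = vev(m, lam)
    print("vev:", v)
    print("slope at vev:", slope(v, m, lam))
    print("mass at vev / m:", math.sqrt(curvature(v, m, lam)) / m)
```

At the origin the curvature is $-m^2$, so $\phi=0$ is a local maximum, while at either minimum the curvature is $2m^2$.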
Fig. 3. The full line illustrates schematically the potential Eq. (140). The dashed line shows the same potential with the sign of $m^2$ reversed (symmetry restoration). This figure is taken from Ref. [198].
The vev of a field is denoted by angle brackets, so that in the above case one has $\langle\phi\rangle=\pm m/\sqrt\lambda$.

Now consider Eq. (134) with the sign of the mass term reversed,

$$V(\phi)=V_0-m^2|\phi|^2+\frac14\lambda|\phi|^4\,. \qquad (142)$$
The vacuum now consists of the circle $|\phi|=\langle|\phi|\rangle\equiv\sqrt2\,m/\sqrt\lambda$. About any point in the vacuum, there is a 'radial' mode of oscillation corresponding to the one we already considered, plus an 'angular' mode with zero frequency.

For a global symmetry, the particle corresponding to the angular mode is called the Goldstone boson of the symmetry, while the particle corresponding to the radial mode has no particular name. As we discuss in a moment, continuous global symmetries are usually broken, so that their Goldstone bosons acquire mass and become pseudo-Goldstone bosons. Examples are the pion (corresponding to the chiral symmetry of QCD) and the axion (corresponding to the hypothetical Peccei-Quinn symmetry that is proposed to ensure the CP invariance of QCD).

For a gauge symmetry, the particle corresponding to the radial mode is called a Higgs particle, while the would-be Goldstone boson loses its identity to become one of the degrees of freedom of the gauge boson. This case, generalized to the $SU(2)$ group, occurs in the electroweak sector of the Standard Model, and supersymmetric generalizations of it. More Higgs fields occur in GUT models.

The field which spontaneously breaks the symmetry, which we have denoted by $\phi$, need not be one of the elementary fields appearing in the Lagrangian. Instead it can be a product of these fields, called a condensate. The fields can be spin-half, so if all symmetry-breaking scalars were condensates one would have no need of elementary scalars. The pion field is a condensate, and in some models so is the axion. In this case there need be no particle corresponding to the radial mode. Higgs fields are usually taken to be elementary, because this is the simplest possibility. The desire to have elementary scalar fields is one of the most important motivations for supersymmetry.

The above discussion applies to matter fields, but a similar one applies to any internal symmetry and in particular to the dilaton and bulk moduli.
The general criterion for spontaneous symmetry
breaking is that the vacuum (the minimum of the potential) does not correspond to a fixed point of the symmetry; as a result of this there is more than one copy of the vacuum, different copies being related by the symmetry.

5.3.3. Explicitly broken global symmetries

Global (but not gauge) symmetries can be explicitly broken. This means that the action is not precisely invariant under the symmetry group.

Consider first the $Z_2$ symmetry acting on a real field, $\phi\to-\phi$. It is broken if one adds to the potential, Eq. (127) or Eq. (140), an odd term.

Now consider a global $U(1)$ acting on a complex field, according to Eq. (137). It is broken if one adds to the potential, Eq. (134) or Eq. (142), a term that depends on the phase of $\phi$. For instance, there might be a contribution of the form

$$\Delta V=\lambda_dM_{\rm P}^{4-d}\,\frac{\phi^d+\phi^{*d}}{2}\,. \qquad (143)$$
Instead of being generated from the tree-level potential in this way, $\Delta V(\theta)$ can come from a non-perturbative effect (to be precise, an instanton).

With explicit breaking, a Goldstone boson acquires mass, to become a pseudo-Goldstone boson. This case occurs in QCD, where the pion is a pseudo-Goldstone boson. The axion (if it exists) is also a pseudo-Goldstone boson.

If we write $\phi=\langle|\phi|\rangle e^{i\theta}$, the canonically normalized pseudo-Goldstone boson field is $\psi\equiv\sqrt2\,\langle|\phi|\rangle\theta$. Its potential $V(\theta)$ has period $2\pi/N$ where $N$ is some integer. For $N\geq 2$, the original $U(1)$ symmetry, Eq. (137), has been broken down to the residual symmetry $Z_N$, Eq. (138). For $N=1$, there is no residual symmetry. In the above example

$$\Delta V(\theta)=\lambda_dM_{\rm P}^4\left(\frac{\langle|\phi|\rangle}{M_{\rm P}}\right)^d\cos(d\theta)\,. \qquad (144)$$

Defining the zero of $\psi$ to be at a minimum of $V$, one finds

$$m_\psi^2=\frac{d^2}{2}\,\lambda_dM_{\rm P}^2\left(\frac{\langle|\phi|\rangle}{M_{\rm P}}\right)^{d-2}. \qquad (145)$$

Provided that $m_\psi$ is much less than $m_\phi$, the radial part of $\phi$, which has the latter mass, can remain practically at the vev while $\psi$ oscillates. For much bigger values this becomes impossible, and we have completely lost the original symmetry.

It is usually supposed that all continuous global symmetries are approximate. One reason is that this seems to be the case for field theories derived from string theory [56,20,76]. In contrast, they typically contain many discrete symmetries [189].

5.3.4. The restoration of a spontaneously broken internal symmetry

In the early Universe, the scalar fields will be displaced from their vacuum expectation values (vevs). In particular, a field with a non-zero vev, corresponding to a spontaneously broken symmetry, may have zero value in the early Universe. Then the symmetry is restored at early times. This may happen during inflation, and also during the subsequent hot big bang.
A simple example which can illustrate both cases is the following potential, involving real fields $\phi$ and $\psi$:

$$V=V_0-\frac12m_\psi^2\psi^2+\frac14\lambda\psi^4+\frac12m^2\phi^2+\frac12\lambda'\psi^2\phi^2 \qquad (146)$$

$$\phantom{V}=\frac14\lambda(M^2-\psi^2)^2+\frac12m^2\phi^2+\frac12\lambda'\psi^2\phi^2\,. \qquad (147)$$

Comparing the two ways of writing the potential, one sees that the parameters are related by

$$m_\psi^2=\lambda M^2\,, \qquad (148)$$

$$V_0=\frac14\lambda M^4=\frac14M^2m_\psi^2\,. \qquad (149)$$

The minimum of $V$, corresponding to the vacuum, is at $\phi=0$ and $\psi=M$. The latter field has a non-zero vev, spontaneously breaking the discrete symmetry $\psi\to-\psi$. But now suppose that, in the early Universe, $\phi$ has a non-zero value, bigger than a critical value $\phi_c=m_\psi/\sqrt{\lambda'}$. Then the minimum with respect to $\psi$ lies at $\psi=0$, and the symmetry is restored. With the relabelling $\psi\to\phi$, this is illustrated by the dashed line in Fig. 3.

In an appropriate region of parameter space, the fields can be in thermal equilibrium at temperature $T\gtrsim\phi_c$, making $\phi$ typically of order $T$. Then the symmetry $\psi\to-\psi$ is restored for $T\gtrsim\phi_c$, and spontaneously breaks as $T$ falls below that value.

Alternatively, $\phi$ might be the inflaton. Then the symmetry $\psi\to-\psi$ is restored until $\phi$ falls below $\phi_c$, after which it spontaneously breaks. If $V_0$ dominates, this signals the end of inflation and we have hybrid inflation [207]. Even if it does not, the change might correspond to a feature in the spectrum $\mathcal{P}_{\mathcal{R}}$, or topological defects [169,75,170,228,173,272,217,139,242]. (This is an alternative to the familiar Kibble mechanism [159] of defect formation, which applies if the symmetry is restored by thermal effects.)

5.4. The true vacuum and the inflationary vacuum

The different vacua that occur when a symmetry is spontaneously broken are physically equivalent, and are simply referred to as the vacuum.

During inflation, the spatially averaged inflaton field is not at a minimum of the potential, and it varies slowly with time. The spatially averaged non-inflaton fields mostly adjust themselves to be at the instantaneous minimum of the potential with the inflaton at its current value, which may or may not be the same as the vacuum value. Others may have extremely flat potentials, giving negligible motion for the spatial average.
These spatially averaged fields provide a classical background, around which are the quantum fluctuations described by quantum field theory. The classical background may be taken to be constant, because the variation of $V$ is slow on the Hubble timescale. It defines an effective vacuum for a quantum field theory, which we shall call the inflationary vacuum. To emphasize the distinction, we shall often call the actual vacuum, corresponding to the minimum of the potential, the true vacuum. In some applications, such as when calculating the vacuum fluctuation of the inflaton field, it is necessary to formulate quantum field theory in the setting of curved space-time (the expanding Universe). The main difference though, between the inflationary vacuum and the true one, is the value of the vacuum energy density $V$. In the true vacuum it is practically zero ($|V|^{1/4} \lesssim 10^{-3}$ eV, corresponding to the bound on the cosmological constant). During inflation it is big.

From this perspective two separate searches are in progress, for quantum field theories beyond the Standard Model. There is the search for the field theory that applies in the true vacuum, and the search for the field theory that applies during inflation. In some proposals these theories are very different, whereas in others they are almost the same. Roughly speaking, the former proposals predict that the inflationary energy scale is $V^{1/4} \gg 10^{10}$ GeV, and the latter predict that it is in the range $10^{5}\ \mathrm{GeV} \lesssim V^{1/4} \lesssim 10^{10}$ GeV.

For simplicity, we are supposing that the vacuum so-defined is unique. In general, the potential might have another minimum (or set of minima related by a spontaneously broken symmetry) in which $V$ has a different value; or there might be three or more minima with different values of $V$. In these cases, it is not clear whether the vacuum corresponding to our Universe (the one with $V = 0$) must be the global minimum (the one with the lowest value of $V$). If it is not the global minimum, the lifetime for tunneling to the latter should presumably be much bigger than the age of the Universe. Examples of multiple vacua are shown in Figs. 5 and 6.

The averaging is to be done over the comoving box within which quantum field theory is formulated. It should be large compared with the comoving scale presently equal to the size of the observable Universe, but it is neither necessary nor desirable to make it exponentially bigger.

5.5. Supersymmetry

Practically all viable extensions of the Standard Model invoke supersymmetry. The main reason is that they invoke fundamental scalar fields, which look natural only in the context of supersymmetry. Indeed, supersymmetry eliminates the quadratic divergences in the mass $m$ of fundamental light scalar fields, $\delta m^2 \sim \Lambda_{\rm UV}^2$, $\Lambda_{\rm UV}$ being the scale beyond which the low-energy theory no longer applies. In about ten years, the Large Hadron Collider (LHC) at CERN will either discover supersymmetry, if it has not been discovered before then, or practically kill it.
In the latter eventuality the task of understanding whatever is observed at the LHC will take precedence over such relatively trivial matters as inflation model-building, so let us suppose optimistically that supersymmetry is valid. We shall consider supersymmetry in detail in Section 7, but let us note a few important points.

Supersymmetry is an extension of Lorentz invariance, and therefore not an internal symmetry. It relates bosons and fermions. In the 'N = 1' version generally adopted, there are 'chiral' supermultiplets each containing a complex scalar field (spin 0) plus a chiral fermion (spin 1/2) field, and 'gauge' supermultiplets each containing a gauge field (spin 1) and a gaugino (spin 1/2) field. Each Standard Model particle has an undiscovered superpartner; there are squarks and sleptons with spin 0, Higgsinos with spin 1/2 and gauginos with spin 1/2. (It turns out that at least two Higgs fields are required.)

Supersymmetry is expected to be local as opposed to global, and local supersymmetry is called supergravity because it automatically incorporates gravity. In N = 1 supergravity, the graviton (spin 2) is accompanied by the gravitino (spin 3/2). In the true vacuum, global supersymmetry provides a good approximation to supergravity for most purposes, but that is not generally the case during inflation. To decide between different possible forms of field theory, and in particular supergravity, one may look to a hopefully more fundamental theory like weakly coupled string theory or Hořava-Witten M-theory.
Unbroken supersymmetry would require that each particle has the same mass as its partner. This is not observed, so supergravity is spontaneously broken in the true vacuum. (A local symmetry cannot be explicitly broken.) The scale of this breaking is conveniently characterized by a scale $M_S$, related to the gravitino mass $m_{3/2}$ by

$$M_S^2 = \sqrt{3}\, M_P\, m_{3/2} . \tag{150}$$

To have a viable phenomenology, the spontaneous breaking is supposed to occur in a 'hidden sector' of the theory, communicating only weakly with the 'visible' sector containing particles with Standard Model interactions. In the visible sector, one has for most purposes global supersymmetry with explicit breaking of a special kind, called 'soft supersymmetry breaking'. Soft susy breaking must give the squarks and sleptons masses

$$\tilde m \sim 100\ \mathrm{GeV}\ \text{to}\ 1\ \mathrm{TeV} . \tag{151}$$
(Gauginos may also have such masses, or they may be lighter.) This typical &soft' mass mJ is an important parameter for model building. It cannot be much above 1 TeV or susy would not do its job of allowing us to understand the existence of the Standard Model Higgs "eld. Nor can it be much less than 100 GeV, or the squarks and sleptons would have been observed. The relation between M and mJ is model-dependent. In a class of theories known as gravity1 mediated one has (152) M K(mJ M &10 to 10 GeV . 1 . (For de"niteness we usually take 10 GeV in what follows.) Then m &mJ . In another class, called gauge-mediated, M can be anywhere between 10 and 10 GeV, corresponding to 1 keV 1 :m :1 TeV. All this refers to the true vacuum. During in#ation, susy is also necessarily broken. In most models the mechanism of susy breaking during in#ation has nothing to do with the mechanism of susy breaking in the true vacuum (and is much simpler). In an interesting class of models, the mechanism is supposed to be the same. As a rough guide, in#ation models with <<M fall into 1 the "rst class, while models with a lower < fall into the second. 5.6. Quantum corrections to the potential So far we speci"ed the part of the Lagrangian involving only the scalar "elds. When quantum e!ects are included, this is not enough to describe these "elds; we need the rest of the Lagrangian, that describing higher-spin "elds that can couple to scalar "elds. During in#ation, when the scalar "elds are almost independent of position, these e!ects can be summarized by giving an e!ective potential < and (if necessary) an e!ective kinetic function K H, which are to be used in the "eld KL equation Eq. (125) or its non-canonical equivalent. Note that we are using the same symbols for the e!ective objects and the ones that appear in the Lagrangian. Quantum e!ects are determined by the couplings of the "elds (as well as their masses). 
Gauge couplings (couplings to gauge fields) are characterized by a dimensionless constant $g$, or equivalently by $\alpha = g^2/4\pi$. (For electromagnetism, $g$ is the electron charge and $\alpha$ evaluated at low energy is the fine structure constant $\alpha = 1/137$.) Couplings not involving gauge fields, called Yukawa couplings, can again be characterized by dimensionless constants. Complex scalar fields with no gauge couplings are called gauge singlets, and both their radial and angular components (pseudo-Goldstone bosons) are favourite candidates for the inflaton field.

Quantum effects are of two kinds; the perturbative effects represented by Feynman graphs, and the non-perturbative effects represented by things like instanton contributions to the path integral. This separation is meaningful only if the relevant couplings are small, in particular if gauge couplings satisfy $\alpha \lesssim 1$. At large couplings the theory is completely non-perturbative. Gauge couplings are not supposed to be extremely small, and one should take $g \sim 1$ for crude order of magnitude estimates (making $\alpha$ one or two orders of magnitude below 1). For renormalizable Yukawa couplings, values a few orders of magnitude below unity are generally regarded as reasonable, at least for the renormalizable couplings in an effective field theory.

5.6.1. Gauge coupling unification and the Planck scale

With quantum effects included, the masses and couplings to be used in the Lagrangian depend on the relevant energy scale $Q$. The dependence on $Q$ (called 'running') can be calculated through the renormalization group equations (RGEs), and is logarithmic. In the context of collider physics, $Q$ can be taken to be the collision energy, if there are no bigger relevant scales (particle masses). In the context of inflation, $Q$ can be taken to be the value of the inflaton field if, again, there are no bigger relevant scales (particle masses, or values of other relevant fields).

For the Standard Model there are three gauge couplings, $\alpha_i$ where $i = 3, 2, 1$, corresponding respectively to the strong interaction, the left-handed electroweak interaction and electroweak hypercharge. (The electromagnetic gauge coupling is given by $\alpha^{-1} = \alpha_1^{-1} + \alpha_2^{-1}$.) In the one-loop approximation, ignoring the Higgs field, their running is given by

$$\frac{d\alpha_i}{d\ln(Q^2)} = \frac{b_i}{4\pi}\,\alpha_i^2 . \tag{153}$$

The coefficients $b_i$ depend on the number of particles with mass $\ll Q$.
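Eq. (153) integrates in closed form, because $1/\alpha_i$ is linear in $\ln(Q^2)$. The following is a minimal Python sketch of that solution, assuming the supersymmetric coefficients $b_1 = 11$, $b_2 = 1$, $b_3 = -3$ quoted in this section and the factor 5/3 relating $\alpha_1$ to the unified coupling. The input couplings at $Q = M_Z$ are rounded illustrative values, not taken from the text, and the function names are hypothetical.

```python
import math

M_Z = 91.19  # GeV; reference scale for the inputs (assumed, not from the text)

# Inverse couplings at Q = M_Z in the text's normalisation, where alpha_1 is
# the hypercharge coupling. Rounded illustrative inputs (assumed).
INV_ALPHA_MZ = {1: 98.4, 2: 29.6, 3: 8.5}

# One-loop coefficients of Eq. (153) for the supersymmetric particle content.
B = {1: 11.0, 2: 1.0, 3: -3.0}


def inv_alpha(i, q):
    """Closed-form solution of Eq. (153): 1/alpha_i is linear in ln(Q^2)."""
    return INV_ALPHA_MZ[i] - B[i] / (4.0 * math.pi) * math.log(q**2 / M_Z**2)


def inv_alpha_1_rescaled(q):
    """(3/5)/alpha_1, the quantity to compare with 1/alpha_2 and 1/alpha_3."""
    return 0.6 * inv_alpha(1, q)


def crossing_scale():
    """Scale at which (5/3) alpha_1 = alpha_2, solved analytically."""
    # Solve 0.6 * (a1 - b1 * t) = a2 - b2 * t, with t = ln(Q^2 / M_Z^2) / (4 pi).
    t = (0.6 * INV_ALPHA_MZ[1] - INV_ALPHA_MZ[2]) / (0.6 * B[1] - B[2])
    return M_Z * math.exp(2.0 * math.pi * t)


q_unif = crossing_scale()
print(f"crossing scale: {q_unif:.2e} GeV")
print(f"1/alpha_2 there: {inv_alpha(2, q_unif):.1f}")
print(f"1/alpha_3 there: {inv_alpha(3, q_unif):.1f}")
```

With these inputs the first two couplings cross at a scale of order $10^{16}$ GeV, and the third comes out close to the same value there, which is the sense in which the three couplings are said to unify.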
Including all particles in the minimal supersymmetric Standard Model gives $b_1 = 11$, $b_2 = 1$ and $b_3 = -3$. Using the values of $\alpha_i$ measured by collider experiments at a scale $Q \simeq 100$ GeV, one finds that all three couplings become equal at a scale [10,118,93,188,234] $Q = M_{\rm GUT}$, where

$$M_{\rm GUT} \simeq 2 \times 10^{16}\ \mathrm{GeV} . \tag{154}$$

The unified value is

$$\alpha_{\rm GUT} \simeq 1/25 . \tag{155}$$

One explanation of this remarkable experimental result may be that there is a Grand Unified Theory (GUT), involving a higher symmetry with a single gauge coupling, which is unbroken above the scale $M_{\rm GUT}$. Another might be that field theory becomes invalid above the unification scale, to be replaced by something like weakly coupled string theory or Hořava-Witten M-theory, which is the source of unification. At the time of writing there is no consensus about which explanation is correct [71].

To be precise, $\alpha_3 = \alpha_2 = \tfrac53 \alpha_1 = \alpha_{\rm GUT}$, the factor 5/3 arising because the historical definition of $\alpha_1$ is not very sensible.

In passing we note that the unification fails by many standard deviations in the absence of supersymmetry, which may be construed as evidence for supersymmetry and anyhow highlights the remarkable accuracy of the experiments leading to this result.

5.6.2. The one-loop correction

The perturbative part of the effective potential is given by a sum of terms, corresponding to the number of loops in the Feynman graphs. The no-loop term is called the tree-level term (because the Feynman graphs look like trees) and it has the power-series form Eq. (128). In any given situation, one can usually choose the renormalization scale $Q$ so that the loop corrections are small. Then, the 1-loop correction typically dominates, and only it has so far been considered in connection with inflation model-building.

We now discuss the form of the 1-loop correction, initially making the choice $Q = M_P$. In a supersymmetric theory, in the usual case that $\phi$ is much bigger than the masses of the particles in the loop, two cases arise. If the relevant part of the Lagrangian is supersymmetric, corresponding to spontaneous susy breaking, the loop correction is typically of the approximate form
(157)

$$V = V_0 + \tfrac12 m^2(\phi)\,\phi^2 + \cdots , \tag{158}$$

where

$$m^2(\phi) = m^2 + c\,\mu^2 \ln(\phi/M_P) , \tag{159}$$

and the dots represent non-renormalizable terms.

Let us consider a typical case, where the parameter $\mu^2$ is of order $m^2$. Because the loop suppression factor $c$ is $\ll 1$, $V$ will have a maximum or minimum at $\phi \sim e^{-1/c} M_P$. The minimum occurs if the mass-squared is positive at the Planck scale. This case is illustrated in Fig. 4. The maximum occurs if the mass-squared is negative at the Planck scale. In that case there is a minimum at $\phi = 0$, and another at some value $\phi \gtrsim \exp(-1/c) M_P$ which is determined by the non-renormalizable terms of the tree-level potential. Typically, the latter is the global minimum. If $\phi = 0$ is our vacuum, $V$ vanishes there as shown in Fig. 5. In that case, the lifetime for tunneling to the global minimum had better be much longer than the age of the Universe. If $\phi = 0$ is not our vacuum, one can have either of the situations shown in Figs. 6 and 7. We shall see in Section 8.6 how they permit one to construct models of inflation.
Fig. 4. A non-perturbative loop correction generates a minimum in the potential. The minimum corresponds to $\phi \sim \exp(-1/c) M_P$, where $c \ll 1$ is a loop suppression factor.

Fig. 5. A non-perturbative loop correction generates a maximum in the potential, at a value $\phi \sim \exp(-1/c) M_P$ hierarchically smaller than the Planck scale. Non-renormalizable terms generate a minimum, at a bigger $\phi$ which may or may not be of order $M_P$. There is another minimum at $\phi = 0$, which typically corresponds to a bigger value of $V$. In the true vacuum, $V$ vanishes. As shown in the graph, the true vacuum may be at $\phi = 0$.
A notable feature of these expressions is that they generate a scale many orders of magnitude less than $M_P$ without fine-tuning. This occurs because the loop suppression factor, say a couple of orders of magnitude below 1, is exponentiated. This phenomenon is known as dimensional transmutation, and optimistically one may suppose that with its help all mass scales can be generated more or less directly from the Planck scale.

For an accurate calculation of the potential $V(\phi)$ we should abandon the choice $Q = M_P$ for the renormalization scale. With a general scale $Q$, the potential becomes

$$V(Q, \phi) = V_0 + \tfrac12 m^2(Q)\,\phi^2 + \tfrac12 c(Q)\,\mu^2(Q)\,\phi^2 \ln(\phi/Q) . \tag{160}$$
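The location of the extremum can be checked directly from Eqs. (158) and (159). Setting $dV/d\phi = 0$ gives $\ln(\phi/M_P) = -m^2/(c\mu^2) - 1/2$, which is of order $-1/c$ when $\mu^2 \sim m^2$. A minimal Python sketch, assuming constant $c$ and $\mu^2$ (no running), units with $M_P = 1$, and the illustrative values $m^2 = \mu^2 = 1$ and $c = 0.05$ (none of these numbers come from the text; the function names are hypothetical):

```python
import math


def mass_sq(x, m2=1.0, mu2=1.0, c=0.05):
    """Running mass-squared of Eq. (159), with x = phi / M_P."""
    return m2 + c * mu2 * math.log(x)


def delta_v(x, m2=1.0, mu2=1.0, c=0.05):
    """V - V_0 = (1/2) m^2(phi) phi^2 from Eq. (158), in units M_P = 1."""
    return 0.5 * mass_sq(x, m2, mu2, c) * x * x


def stationary_point(m2=1.0, mu2=1.0, c=0.05):
    """Solve dV/dphi = phi * (m^2(phi) + c mu^2 / 2) = 0 for x = phi / M_P."""
    return math.exp(-m2 / (c * mu2) - 0.5)


x_star = stationary_point()

# Finite-difference check that x_star is a stationary point, and of which kind.
h = 1e-4 * x_star
slope = (delta_v(x_star + h) - delta_v(x_star - h)) / (2.0 * h)
curvature = (delta_v(x_star + h) - 2.0 * delta_v(x_star) + delta_v(x_star - h)) / h**2

print(f"phi / M_P at the extremum: {x_star:.3e}")
print(f"second derivative there:   {curvature:.4f}")
```

A loop factor of only 0.05 places the extremum about nine orders of magnitude below the Planck scale, and the positive second derivative confirms that it is a minimum when the mass-squared is positive at $M_P$.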
Fig. 6. Alternatively, the true vacuum may be at the minimum with non-zero $\phi$.

Fig. 7. A third possibility is that neither of the minima corresponds to the true vacuum. Rather, it lies in another field direction, 'out of the paper'.
At a given value of $\phi$, the one-loop correction vanishes if we set $Q \simeq \phi$. The two-loop and higher corrections are then hopefully small, and we obtain Eq. (158) with $m^2(\phi) \equiv m^2(Q \simeq \phi)$ now given by the RGEs instead of by Eq. (159). The RGE for $m^2$ is

$$\frac{d m^2(Q)}{d \ln Q} = c(Q)\,\mu^2(Q) . \tag{161}$$

Those of $c$ and $\mu^2$ will also be first-order differential equations, and $m^2(Q)$ is determined by solving the equations simultaneously as in Section 8.6. If $c$ and $\mu^2$ have negligible running we recover Eq. (159).
Being a physical quantity, <(Q, ) should actually be independent of Q, so that R
(164)
With a view to generating supersymmetry breaking, it is usually supposed that the behaviour exhibited by QCD occurs also for some other gauge interaction. The particles with this interaction should not possess the Standard Model interactions, and correspond to the hidden sector mentioned earlier. One can again have spin-1/2 condensates 1jjM 2&K, where j can be either a chiral fermion "eld as in QCD, or a gaugino "eld. The condensation scale K of the hidden sector may be far bigger than K . /!" 5.7.2. A non-perturbative contribution to the potential Above the condensation scale, the e!ect on the potential is to introduce a term like K>N/ N. In A a #at direction, it can be stabilized by a non-renormalizable tree-level term >K/MK, to generate . a large vev given by 1 2&(K /M )>N>N>KM &e\AM . A . . . (To obtain the "nal equality, we used the generalization of Eq. (163).)
(165)
For Q:100 GeV, the value of b changes as massive particles cease to be e!ective, but it remains negative.
5.8. Flatness requirements on the tree-level inflation potential

So far our discussion of the potential has been quite general. Now we want to specialize to the case where $\phi$ is the inflaton field. We shall formulate conditions on the tree-level potential that ought to be satisfied in any model of inflation, and ask how they can be satisfied in a supersymmetric theory.

During inflation, the tree-level potential with all other fields fixed will be of the form Eq. (128). The mass-squared (and for that matter the coefficients of higher-order terms) can have either sign; equivalently we can make the convention that all coefficients are positive and there is a plus or minus sign in front of them. We adopt the latter convention so that $V = V_0 \pm \tfrac12 m^2 \phi^2 + \cdots$.

In Eq. (128) the origin has been chosen as a point where $V'$ vanishes, and before proceeding we want to comment on this choice. As mentioned earlier (Section 5.3.1) string-derived field theories contain matter fields on the one hand, and the dilaton and bulk moduli on the other. In the space of the matter fields, the origin is usually chosen to be the (unique) fixed point of the internal symmetries. The derivatives of $V$ vanish there. In most models of inflation, the inflaton is supposed to be the radial part of a matter field, with this choice of origin. Then $V'$ vanishes at the origin, provided that any other matter fields coupling to the inflaton vanish during inflation. If there are non-zero matter fields coupling to the inflaton, or if the inflaton is a pseudo-Goldstone boson (corresponding essentially to the real part of a matter field with a displaced origin), we simply define the origin as a point where $V'$ vanishes. Finally, we come to the case that the inflaton is the real or imaginary part of a bulk modulus or the dilaton, with some choice of the origin. For these fields the usual choice of origin is not at all useful, so we again choose the origin as a point where $V'$ vanishes. In this case we expect $\phi$ to be of order $M_P$ during inflation, whereas if $\phi$ is a matter field it is usually much smaller.

Assuming canonical normalization of the fields, inflation requires that the potential satisfies the flatness conditions $\epsilon \ll 1$ and $|\eta| \ll 1$, where $\epsilon \equiv \tfrac12 M_P^2 (V'/V)^2$ and $\eta \equiv M_P^2 V''/V$ (Eqs. (31) and (32)). As mentioned in Section 5.2, canonical normalization is not expected to hold if $\phi \gtrsim M_P$, but should be a good approximation if $\phi$ is significantly below $M_P$. In what follows, we assume at least approximate canonical normalization, and $\phi \lesssim M_P$.

We want to see how the two flatness conditions constrain the tree-level potential, Eq. (128). As we have seen, quantum corrections have to be added to the tree-level expression. They may give a significant or even dominant contribution to the slope of the potential. But it is reasonable to assume that this contribution does not accurately cancel the tree-level contribution, over the whole relevant range of $\phi$ values (the values corresponding to horizon exit for cosmological scales). By the same token, one can assume that there is no accurate cancellation between different terms of the tree-level potential.
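As a worked illustration of how the two conditions act on the leading terms, consider only the constant and mass terms, $V = V_0 \pm \tfrac12 m^2 \phi^2$ with $V_0$ dominating. This is a sketch under that assumption, not a substitute for the conditions on the full series:

```latex
\begin{align}
V' &= \pm m^2 \phi , \qquad V'' = \pm m^2 , \\
\eta &\equiv M_P^2\,\frac{V''}{V} \simeq \pm \frac{m^2 M_P^2}{V_0} , \\
\epsilon &\equiv \tfrac12 M_P^2 \left( \frac{V'}{V} \right)^2
  \simeq \frac{m^4 M_P^2\, \phi^2}{2 V_0^2}
  = \tfrac12\, \eta^2\, \frac{\phi^2}{M_P^2} .
\end{align}
```

For these two terms, $|\eta| \ll 1$ amounts to $m^2 \ll V_0/M_P^2$, and for $\phi \lesssim M_P$ the condition $\epsilon \ll 1$ then follows automatically.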
A different case, where inflation occurs at $\phi \sim M_P$ with the kinetic function becoming singular at slightly higher values, is discussed briefly in Section 6.6.

In this context, we are regarding the use of a running inflaton mass (Section 6.16) as still a tree-level effect. Note also that in mutated hybrid inflation (Section 6.13) there is an additional contribution to the inflaton potential, coming from the implicit dependence in $V(\phi) = V(\phi, \psi(\phi))$. In that case our discussion can be taken to apply to $V(\phi, 0)$.
The assumptions that $\phi \lesssim M_P$ and that there are no cancellations lead to considerable simplification. The flatness conditions require
(169)
(170)
Putting it into Eq. (168) gives
d(d!1)j :2;10\ B
2;10\M B\ . . <
(171)
5.9. Satisfying the flatness requirements in a supersymmetric theory

These constraints are quite strong in the context of received ideas about particle theory. Consider first the constraint, Eq. (166), on the inflaton mass. In a globally supersymmetric theory (or a non-supersymmetric theory) the constraint poses no particular problem since the mass can be set to an arbitrarily small value. Unfortunately, the corrections to global susy coming from a generic supergravity theory are not small during inflation; rather, they give $m^2 \gtrsim V_0/M_P^2$ for every scalar field and in particular for the inflaton [60,289]. Therefore, to construct a model of inflation in the context of supergravity, one must either invoke an accidental cancellation, or a non-generic supergravity theory. We shall have much more to say about the problem of keeping the inflaton mass small in the context of supergravity.

This fact was first recognized in Refs. [252,61,82], but the last two did not consider the case of the inflaton. The first, working actually in the context of minimal supergravity, took the view that a sufficiently small mass will occur through an accidental cancellation.

For the constraints on $\lambda$ and $\lambda_d$, we need to consider separately the case that the inflaton is the radial part of a matter field (the usual case), and the case that it is a bulk modulus or the dilaton (more precisely, the real or imaginary part of one of these with respect to some origin).
5.9.1. The inyaton a matter xeld Consider "rst the constraint, Eq. (170), on j. If is a generic "eld, one does not expect j to be so small. But in a globally supersymmetric theory, the potential is typically independent of some of the "elds, when the others are held at the origin. Such "elds are called at directions' (in "eld space). This makes j"0 in the globally supersymmetric theory. When we go the full supergravity theory, we generically "nd in a #at direction that j is of order < /M. Then, the #atness condition, . Eq. (167), is satis"ed provided that ;M . . Now we consider Eq. (168), omitting the cubic term d"3 since it is usually forbidden by a symmetry. Even in a #at direction, the non-renormalizable couplings j are generically of order 1, B at least for d not too large. In that case, Eq. (171) becomes an upper bound on < . For d"5 it gives <(3;10 GeV, and for d"6 it gives <:3;10 GeV. For dPR it becomes <:2.6;10 GeV which is anyhow more or less demanded by the COBE normalization. (Not a coincidence, as one sees by examining the argument that led to Eq. (171).) For low d these bounds are violated in many models of in#ation. In these cases, at least some of the j must be below the generic value j &1. But provided that is well below M , this will be needed B B . only for the "rst few coe$cients, and it is enough to make these of order < /M. As we shall see in . Section 8.2, that can be achieved in a supersymmetric theory by imposing a discrete symmetry. In some models, notably D-term in#ation, is of order M rather than much less. In that case . [182] all of the j (as well as j) need to be signi"cantly less than < /M. This can be achieved by B . imposing an exact ;(1) (or higher) symmetry, but in the usual case that the in#aton is a gauge singlet the symmetry would have to be global and as we noted in Section 5.3.4 global continuous symmetries do not seem to be present in "eld theories derived from string theory [56,20,76]. 
Accordingly, models of inflation with $\phi \sim M_P$ are at present quite speculative. One possibility [182] is that the coefficients $\lambda_d$ actually fall off rapidly at large $d$, so that only the first few need be suppressed.
The direction is #at only when other "elds coupling to it vanish; for instance if is a #at direction, and there is a term
t in the potential, a non-zero value of t will lift the #atness. At this point, we should make it clear that the #at direction can be a linear combination of the "elds that one would naturally choose; for instance in the above example the natural "elds (with say de"nite charges under a ;(1) symmetry) might be ( $t). For large d, j might be suppressed by a large d-dependent factors. For instance, if supergravity were obtained by B integrating out heavy "elds, from some renormalizable "eld theory valid on scales bigger than M , then one might expect . "j "&1/d! [182]. Such is not the case, but we are reminded that d-dependent factors might be present when supergravity B is matched to say a string theory. As our estimates of the j apply only if d is not too large, and are anyhow very rough, B the factor d(d!1) in Eq. (168) cannot be taken seriously, and we set it equal to 1 in what follows.
According to these estimates, the power series expansion, Eq. (128), ceases to be reliable for $\phi \gg M_P$. In this regime one has in general no idea what form the potential will take.

5.9.2. The inflaton a bulk modulus or the dilaton

If the inflaton is the real or imaginary part of a bulk modulus (with some choice of origin) its potential during inflation will be of the form

$$V = A + B f(x) . \tag{172}$$

Here, $x = \phi/M_P$ and $f(x)$ and its derivatives are generically of order 1 in magnitude. Also, $\phi$ is typically of order $M_P$ during inflation.

The constant term $A$ can be negligible, or can dominate $V$. If it is negligible, it is clear that the flatness conditions $|\eta| \ll 1$ and $\epsilon \ll 1$ are marginally violated. (In terms of the coefficients we have $m^2 \sim V_0/M_P^2$ and $\lambda \sim \lambda_d \sim V_0/M_P^4$.) If the constant term dominates, the flatness conditions are satisfied. The potentials of the real and imaginary parts of the dilaton are very model dependent, but they are often supposed also to be of the above form, with $A$ negligible.
6. Forms for the potential; COBE normalizations and predictions for n

At the lowest level, a 'model of inflation' is simply a specification of the form of the potential relevant during inflation; this will be $V(\phi)$ for a single-field model, or $V(\phi, \psi_1, \psi_2, \cdots)$ for a hybrid inflation model. In this section we give a survey of 'models' in this sense, that have been proposed in the literature. The particle theory background will be mentioned only briefly, pending the full discussion of Sections 8 and 9.

The potential of a given model will contain one, two or more parameters. Discounting particle theory, these are constrained only by observation. The most fundamental constraint is the COBE normalization, Eq. (44). The corresponding upper bound was known (to order of magnitude) long before the cmb anisotropy was actually observed, and was therefore available when inflation was first proposed. It ruled out the first viable models of inflation [201,6] (or to be precise, required that the dimensionless coupling is tiny, Eq. (185)) and has been imposed as a constraint on all models of inflation since then. The COBE normalization typically determines the magnitude of the potential, as opposed to its shape.
The other, irrelevant, fields are supposed to give a negligible contribution to $V$. Most of them will have masses during inflation that are big enough to anchor them at the vacuum values. (The criterion for this is $M_P^2 V''/V \gg 1$ where the prime is the derivative with respect to the relevant field, or equivalently mass
The other important constraint is provided by the spectral index n, given by Eq. (54) or more usually Eq. (56). The spectral index can often be calculated just from the shape of the potential, and is a powerful discriminator between models. One can also calculate the scale dependence of n and the relative contribution r of gravitational waves. The latter is too small ever to observe in most models, but the former may well provide additional discrimination in the future.

Without going into detail, we shall try to give some indication of the extent to which each form for the potential is attractive in the context of current ideas about particle theory. In particular, we shall indicate whether there is a mechanism for keeping the inflaton mass small in the context of supergravity (Section 5.9), or whether an accidental cancellation is invoked.

6.1. Single-field and hybrid inflation models

As we already pointed out, there are two broad classes of 'model'. In single-field models, the slow-rolling inflaton field gives the dominant contribution to the potential, and inflation ends when $\phi$ starts to oscillate about its vacuum value. In hybrid models, the dominant contribution to the potential $V$ comes from some field $\psi$ which is not slow-rolling, but is fixed by its interaction with $\phi$.

There are two, very different, kinds of single-field model. In what are usually called chaotic inflation models, $\phi$ is moving towards the origin, and its magnitude during observable inflation is several times $M_P$. In what are usually called new inflation models, $\phi$ is moving away from the origin, and during observable inflation its magnitude is at most of order $M_P$. In hybrid models, $\phi$ may be moving in either direction, but its magnitude is again at most of order $M_P$.

Both in single-field and hybrid inflation, one will have a potential $V(\phi)$ during inflation, which depends on one or more parameters. One will also know the value $\phi_{\rm end}$ at the end of slow-roll inflation. Given this information, the recipe for obtaining the predictions is simple.

• Calculate the number of e-folds $N(\phi)$ to the end of slow-roll inflation using Eq. (40). In many cases, this integral is insensitive to $\phi_{\rm end}$, in which case the predictions are independent of that quantity.
• The value of $N(\phi)$ when the observable Universe leaves the horizon, denoted simply by $N$ with no argument, depends on the history of the Universe after slow-roll inflation ends. We saw in Section 3.4 that a reasonable estimate is $N \sim 50$, unless there is significant inflation after slow-roll inflation ends. Using this, or a lower, estimate of $N$, calculate the corresponding slow-roll parameters $\eta$ and $\epsilon$.
• Use $\epsilon$ to impose the COBE normalization Eq. (44) on the model.
• See if there are significant gravitational waves. As discussed in Section 3.5, this requires $\epsilon \gtrsim 0.01$, which is hardly ever satisfied. We shall mention gravitational waves only in the rare models where they are significant.
• Check that $\epsilon \ll |\eta|$. If it is, the full expression $n - 1 = 2\eta - 6\epsilon$ may be replaced by $n - 1 = 2\eta$. As discussed after Eq. (56), this is usually the case, and we shall mention the full expression only for those rare models where it is needed. Using one expression or the other, calculate n. As shown in Tables 1 and 2 (Section 6.17), it often depends only on the shape of the potential.
• Check to see if n has significant variation on cosmological scales, corresponding to $N - 10 \lesssim N(\phi) \lesssim N$.

In mutated hybrid inflation, $\psi$ is a function of the inflaton field. Then the potential during inflation is $V(\phi, \psi(\phi))$.

At this point, we should note that the definition of a 'field' is in principle not unique. However, we are supposing that the fields can be taken to be canonically normalized, so that the field-space 'metric' (the kinetic function $K$) is Euclidean. Then, apart from the choice of origin, the choice of fields corresponds to a choice of orthogonal directions in field space. In the context of particle physics there is usually a naturally preferred choice (up to gauge transformations) making the definition of the fields essentially unique in that context. On the other hand, the 'inflaton field' may be a linear combination of the particle physics fields.

We shall generally avoid the terms 'chaotic' and 'new', since they are also used to indicate initial conditions long before observable inflation starts (respectively chaotically varying fields, and fields in thermal equilibrium).

In single-field models it always corresponds to the failure of one of the flatness conditions (Eqs. (31) and (32)). In hybrid inflation, this may be the case, or alternatively it may correspond to arriving at the critical value at which the non-inflaton field is destabilized.
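The recipe above can be sketched numerically for a monomial potential $V \propto \phi^p$. The sketch assumes the standard slow-roll forms $\epsilon = \tfrac12 M_P^2 (V'/V)^2$, $\eta = M_P^2 V''/V$ and $N = M_P^{-2} \int V/V' \, d\phi$ for Eqs. (31), (32) and (40). It takes slow roll to end where $\epsilon = 1$, which is an assumption about the end point, and it works in units with $M_P = 1$. All function names are illustrative.

```python
import math


def epsilon(phi, p):
    """epsilon = (1/2) (V'/V)^2 for V proportional to phi**p, with M_P = 1."""
    return 0.5 * (p / phi) ** 2


def eta(phi, p):
    """eta = V''/V for V proportional to phi**p, with M_P = 1."""
    return p * (p - 1) / phi**2


def efolds(phi, phi_end, p, steps=1000):
    """N(phi): trapezoidal integral of V/V' = phi/p from phi_end to phi."""
    h = (phi - phi_end) / steps
    total = 0.5 * (phi_end / p + phi / p)
    for k in range(1, steps):
        total += (phi_end + k * h) / p
    return total * h


def field_at(n_target, phi_end, p):
    """Bisect for the field value with n_target e-folds of slow roll remaining."""
    lo, hi = phi_end, phi_end + 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if efolds(mid, phi_end, p) < n_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def spectral_index(n_target, p):
    """n = 1 + 2*eta - 6*epsilon, evaluated n_target e-folds before the end."""
    phi_end = p / math.sqrt(2.0)  # where epsilon = 1 (assumed end of slow roll)
    phi = field_at(n_target, phi_end, p)
    return 1.0 + 2.0 * eta(phi, p) - 6.0 * epsilon(phi, p)


n_quadratic = spectral_index(50, 2)
print(f"n for p = 2, N = 50: {n_quadratic:.4f}")
```

For $p = 2$ and $N = 50$ the result lies close to $n - 1 = -(2 + p)/(2N) = -0.04$; the small difference comes from keeping $\phi_{\rm end}$ in the integral rather than neglecting it.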
6.2. Monomial and exponential potentials

Now we begin our survey of models, starting with single-field models and going on to hybrid models. We start with the simplest potential of all. It is

$$V = \tfrac12 m^2\phi^2 . \tag{173}$$

Almost as simple are $V = \tfrac14\lambda\phi^4$, and $V = \lambda M_P^{4-p}\phi^p$ with $p/2 \geq 3$. These monomial potentials were proposed as the simplest realizations of chaotic initial conditions (Section 3.6) at the Planck scale [202]. Inflation ends at $\phi_{\rm end} \simeq pM_P$, after which $\phi$ starts to oscillate about its vev $\phi = 0$. When cosmological scales leave the horizon $\phi = \sqrt{2Np}\,M_P$. Since the inflaton field is then of order $10M_P$, there is no particle physics motivation for a monomial potential.

The model gives $n - 1 = -(2+p)/(2N)$ (using the full expression $n = 1 + 2\eta - 6\epsilon$), and gravitational waves are big enough to be eventually observable with $r = 2.5p/N = 5(1-n) - 5/N$. The COBE normalization Eq. (44) corresponds to $m = 1.8\times10^{13}$ GeV for the quadratic case. For $p = 4, 6, 8$ it gives respectively $\lambda = 2\times10^{-14}$, $\lambda = 8\times10^{-17}$, $\lambda = 6\times10^{-20}$ and so on. The COBE normalization gives $V^{1/4} \sim 10^{16}$ GeV.

The same prediction is obtained for a more complicated potential, provided that it is proportional to $\phi^p$ during cosmological inflation, and in particular $\phi$ could have a nonzero vev $\ll M_P$ [190,193,174].

Inflation at $\phi \gg M_P$ which ends at $\phi \sim M_P$ is the prediction of a wide variety of monotonically increasing potentials [136,137], but they are seldom considered because there is too much freedom and no guidance from particle theory.

The limit of a high power is an exponential potential, of the form $V \propto \exp(\sqrt{2/q}\,\phi/M_P)$. This gives $\epsilon = \eta/2 = 1/q$, which leads to $n - 1 = -2/q$ and $r = 10/q$. This is the case of 'extended inflation', where the basic interaction involves non-Einstein gravity but the exponential potential occurs after transforming to Einstein gravity [179,186]. However, simple versions of this proposal are ruled out by observation, because the end of inflation corresponds to a first order phase transition, and in order for the bubbles not to spoil the cmb isotropy one requires $n \lesssim 0.75$. With the effect of gravitational waves included, this strongly contradicts observation [196,122].
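The monomial predictions quoted in this subsection can be checked with a few lines of Python. This is only a sketch under stated assumptions: the reduced Planck mass is taken as $2.4\times10^{18}$ GeV, Eq. (44) is assumed to read $M_P^{-3}V^{3/2}/V' = 5.3\times10^{-4}$, and the function names are hypothetical.

```python
import math

M_P_GEV = 2.4e18   # reduced Planck mass in GeV (assumed value)
COBE = 5.3e-4      # assumed right-hand side of Eq. (44)


def monomial_predictions(p, N=50.0):
    """Slow-roll predictions for V proportional to phi**p, in units M_P = 1."""
    phi = math.sqrt(2.0 * N * p)        # field value N e-folds before the end
    eps = 0.5 * (p / phi) ** 2          # epsilon = (V'/V)^2 / 2
    eta = p * (p - 1.0) / phi ** 2      # eta = V''/V
    n = 1.0 + 2.0 * eta - 6.0 * eps     # full expression for the spectral index
    return {"phi": phi, "epsilon": eps, "eta": eta, "n": n}


def quadratic_mass_gev(N=50.0):
    """Mass m fixed by V^{3/2}/V' = COBE for V = m^2 phi^2 / 2."""
    phi_squared = 4.0 * N
    # For the quadratic potential V^{3/2}/V' = 2^{-3/2} m phi^2.
    m_planck_units = COBE * 2.0 ** 1.5 / phi_squared
    return m_planck_units * M_P_GEV


quadratic = monomial_predictions(2)
quartic = monomial_predictions(4)
m_gev = quadratic_mass_gev()
```

With $N = 50$ the sketch reproduces $n - 1 = -(2+p)/(2N)$ for both cases and a quadratic mass close to the quoted $1.8\times10^{13}$ GeV.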
6.3. The paradigm $V = V_0 + \cdots$

The models we have just considered are the only ones that have $\phi$ well in excess of $M_P$. In all of the other models that we shall describe it is assumed that $\phi \lesssim M_P$ during observable inflation. As a result of this condition, the potential is always of the form $V = V_0 + \cdots$, with the constant $V_0$ dominating and the other terms satisfying the flatness conditions of Section 5.8. To avoid repetition we shall take all this for granted in what follows.

6.4. The inverted quadratic potential

Another simple potential leading to inflation is [34,203,104,254,2,185,165,161,148,21,162]

$$V = V_0 - \tfrac12 m^2\phi^2 + \cdots , \tag{174}$$

with the constant $V_0$ dominating. We shall call this the 'inverted' quadratic potential, to distinguish it from the same potential with the plus sign which comes from the simplest version of hybrid inflation. The dots indicate the effect of higher powers, that are supposed to come in after cosmological scales leave the horizon. This potential gives $1 - n = 2|\eta| = 2M_P^2m^2/V_0$. If $m$ and $V_0$ are regarded as free parameters, the region of parameter space permitting slow-roll inflation corresponds to $1 - n \ll 1$. Thus $n$ is indistinguishable from 1 except on the edge of parameter space.

However, there are two reasons why the edge might be regarded as favoured. One is the fact that in supergravity $\eta$ generically receives contributions of order 1. Since slow-roll inflation requires $|\eta| \ll 1$, either $\eta$ is somewhat reduced from its natural value by accident, or it is suppressed because the theory has a non-generic form. One might argue that $|\eta|$ should be as big as possible in models that rely on an accident, corresponding to $n$ significantly different from 1.

The other reason for expecting $n$ to be significantly below 1, which is specific to this potential, has to do with the position of the minimum, $\phi_{\rm m}$. If the inverted quadratic form for the potential holds until $V_0$ ceases to dominate, one expects

$$\phi_{\rm m} \sim V_0^{1/2}/m = \left(\frac{2}{1-n}\right)^{1/2} M_P . \tag{175}$$

(This is also an estimate of $\phi_{\rm end}$ in that case.) To have any hope of understanding the potential within the context of particle theory, $\phi_{\rm m}$ should not be more than a few times $M_P$, which requires $n$ to be well below 1.

The second reason for expecting $n$ to be significantly below 1 does not hold if the potential steepens drastically soon after cosmological scales leave the horizon, as in the model at the end of Section 6.5, or if inflation ends through a hybrid mechanism as in Section 6.11. In some of these models the first reason does not hold either, and $n$ is in fact indistinguishable from 1.

The COBE normalization, Eq. (44), for the inverted quadratic potential is

$$\frac{2}{1-n}\,\frac{V_0^{1/2}}{M_P\,\phi} = 5.3\times10^{-4} . \tag{176}$$

The field $\phi$ is evaluated when COBE scales leave the horizon, $N$ $e$-folds before the end of slow-roll inflation at some epoch $\phi_{\rm end}$. It is given by

$$\phi = \phi_{\rm end}\,e^{-x} , \qquad x \equiv N|1-n|/2 < 5 . \tag{177}$$
(The bound comes from $N < 50$ and $|1-n| < 0.2$. At the moment we are dealing with the case $n < 1$ but we shall use the variable $x$ also for the case $n > 1$.) This gives

$$\frac{V_0^{1/2}}{M_P^2} = 5.3\times10^{-4}\,\frac{1-n}{2}\,\frac{\phi_{\rm end}}{M_P}\,e^{-x} . \tag{178}$$

If the inverted quadratic form holds until $V_0$ ceases to dominate, $\phi_{\rm end} \gtrsim M_P$, and $V_0^{1/4} \gtrsim 1\times10^{15}$ GeV. If it fails earlier, as in the two cases mentioned, $V_0$ can be much lower.

Since the field variation is bigger than $M_P$, this type of model is unattractive in the context of particle theory. Let us consider the proposals that have been made.

Modular inflation: If $\phi$ is the (real or imaginary part of the) dilaton or a bulk modulus of string theory, and other fields are not significantly displaced from their vacuum values, its potential will be given by Eq. (172) with $A$ negligible, $V = Bf(\phi/M_P)$, with $f(x)$ and its derivatives roughly of order 1 in the regime $|x| \lesssim 1$. (The case of $A$ dominating would correspond to hybrid inflation, which we are not considering at the moment.) In that regime one expects the flatness parameters $\eta \equiv M_P^2V''/V$ and $\epsilon \equiv M_P^2(V'/V)^2/2$ to be both roughly of order 1, and they might both be significantly below 1 near some value of $\phi$ so that slow-roll inflation can occur there. One favours the case that this value would be a maximum of the potential so that 'eternal' inflation would set the initial condition. Then the potential will be of the inverted quadratic form. So far, investigations using specific models [34,2,229,107] have actually concluded that viable inflation does not occur. (Ref. [32] claims to have been successful, but an analytic calculation of that model reviewed in Ref. [227] finds that it is not viable.)

Radial part of a matter field: Alternatively, one could take $\phi$ to be the radial part of a matter field, but this is problematic in the context of string theory for the reasons discussed in Section 5.9.1.

Angular part of a matter field: Instead of taking $\phi$ to be the radial part of a matter field, one might take it to be a pseudo-Goldstone boson, corresponding to the angular part of a matter field whose radial part is fixed. This was first proposed in Ref. [104], and dubbed 'natural' inflation. It has subsequently been considered by several authors [165,254,2,161,112]. One might think that this proposal avoids the problem mentioned in Section 5.9.1 but this turns out not to be the case.

The potential of the pseudo-Goldstone boson, coming say from instanton effects, is typically of the form

$$V(\phi) = V_0\cos^2(\phi/M) , \tag{179}$$

where $M/\sqrt2$ is the magnitude of the corresponding complex field. Near the top of the potential, inflation takes place and to sufficient accuracy we have an inverted quadratic potential with $m^2 = 2V_0/M^2$, and $1 - n = 4(M_P/M)^2$, and to have viable inflation we need $M$ to be significantly bigger than $M_P$. From Eq. (144), non-renormalizable terms will then give a 'correction' $\Delta V \gg V(\phi)$ unless they are suppressed to all orders. The difficulty of understanding such a suppression is precisely the problem stated in Section 5.9.1.

6.5. Inverted higher-order potentials

If the quadratic term is heavily suppressed or absent, one will have

$$V \simeq V_0(1 - \mu\phi^p + \cdots) \tag{180}$$
with $p \geq 3$. For this potential one expects that the integral Eq. (40) for $N$ is dominated by the limit $\phi$, leading to [162]

$$\phi^{p-2} = \left[p(p-2)\mu NM_P^2\right]^{-1} \tag{181}$$

and

$$n \simeq 1 - 2\,\frac{p-1}{p-2}\,\frac{1}{N} . \tag{182}$$
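The two results above can be cross-checked numerically. The sketch below assumes a potential of the form $V = V_0(1 - \mu\phi^p)$ with $V_0$ dominating, works in units $M_P = 1$, and uses an arbitrary illustrative value of $\mu$; the function names are hypothetical.

```python
def field_at_horizon_exit(p, mu, N=50.0):
    """phi from Eq. (181), units M_P = 1: phi^(p-2) = 1 / (p (p-2) mu N)."""
    return (p * (p - 2.0) * mu * N) ** (-1.0 / (p - 2.0))


def spectral_index(p, N=50.0):
    """n from Eq. (182)."""
    return 1.0 - 2.0 * (p - 1.0) / ((p - 2.0) * N)


def eta_from_potential(p, mu, phi):
    """eta = V''/V for V = V0 (1 - mu phi^p), keeping only the dominant V0 in V."""
    return -p * (p - 1.0) * mu * phi ** (p - 2.0)


mu_example = 1.0  # illustrative value only, in Planck units
phi_4 = field_at_horizon_exit(4, mu_example)
n_direct = 1.0 + 2.0 * eta_from_potential(4, mu_example, phi_4)  # n = 1 + 2 eta
n_formula = spectral_index(4)
```

Evaluating $n = 1 + 2\eta$ at the field value of Eq. (181) agrees with Eq. (182) for any $\mu$, which is why the prediction depends only on $p$ and $N$.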
It is easy to see that the integral is indeed dominated by the $\phi$ limit, if higher terms in the potential Eq. (180) become significant only when $V_0$ ceases to dominate at $\phi^p \sim \mu^{-1}$. Then, in the regime where $V_0$ dominates, $|\eta| = [p(p-1)M_P^2/\phi^2]\,\mu\phi^p$, and if this expression becomes of order 1 in that regime inflation presumably ends soon after. Otherwise inflation ends when $V_0$ ceases to dominate. At the end of inflation one therefore has $M_P^2\mu\phi_{\rm end}^{p-2} \sim 1$ if $\phi_{\rm end} \ll M_P$, otherwise one has $\mu\phi_{\rm end}^p \sim 1$. (We are supposing for simplicity that $p$ is not enormous, and dropping it in these rough estimates.) The integral Eq. (40) is dominated by the $\phi$ limit provided that

$$NM_P^2\mu\phi_{\rm end}^{p-2} \gg 1 . \tag{183}$$

This is always satisfied in the first case, and is satisfied in the second case provided that $\phi_{\rm end} \ll \sqrt{N}M_P$, which we shall assume. If higher order terms come in more quickly than we have supposed, or if inflation ends through a hybrid inflation mechanism, then $\phi_{\rm end}$ will be smaller than these estimates, and one will have to see whether the criterion Eq. (183) is satisfied. If it is satisfied, the COBE normalization, Eq. (44), is [162]

$$5.3\times10^{-4} = (p\mu M_P^p)^{1/(p-2)}\,[N(p-2)]^{(p-1)/(p-2)}\,V_0^{1/2}M_P^{-2} . \tag{184}$$

For $p = 4$, this becomes a condition on the dimensionless coupling $\lambda$ defined by $V = V_0 - \tfrac14\lambda\phi^4 + \cdots$, which is independent of $V_0$;

$$\lambda = 3\times10^{-13}\,(50/N)^3 . \tag{185}$$

Such a tiny number can hardly be a fundamental parameter, but it can be generated if $\lambda$ is a function of some heavy fields which are integrated out as in the example of Section 8.5.

A practically equivalent form for the potential is

$$V = V_0 + \tfrac14\lambda\phi^4\ln(\phi/Q) . \tag{186}$$

The logarithm comes from the loop correction ignoring $\phi$'s supersymmetric partner. This was the first viable model of inflation [201,6] (see also [278]). The constraint $\lambda \sim 10^{-13}$ presumably rules out the model if $\lambda$ is a fundamental parameter though there is a dissenting view [188]. In any case, the model does not survive with supersymmetry, since the fermionic partner of $\phi$ then gives an equal and opposite loop contribution (Section 7.7.1).

A dynamical mechanism for suppressing the mass-squared term has been proposed [3]. The potential is

$$V = V_0(1 + \beta\phi^2\psi - \gamma\phi^3 + \cdots) , \tag{187}$$
where $\psi$ is another field. Then, with $\beta$ and $\gamma$ of order 1 in Planck units, and initial values $\psi \sim M_P$ and $\phi \simeq 0$, one can check that the quadratic term is driven to a negligible value before cosmological inflation begins. For this proposal to work, the mass $m_\psi$ has to have a negligible effect, which requires $m_\psi^2 \ll V_0/M_P^2$. As with the inflaton mass, this is violated in a generic supergravity theory. In Ref. [3] $\psi$ is supposed to be a pseudo-Goldstone boson, but as we noted earlier this is not an attractive mechanism for keeping the mass small in the context of string theory.

The above proposal gives a vev $\langle\phi\rangle$ of order $M_P$. Some particle-physics motivation for a vev $\ll M_P$ is given in Refs. [161,162], though not in the context of supergravity.

One could contemplate models in which more than one power of $\phi$ is significant while cosmological scales leave the horizon, but this requires a delicate balance of coefficients. Models of this kind were also discussed a long time ago [92,251], again with a vev of order $M_P$, but their motivation was in the context of setting the initial value of $\phi$ through thermal equilibrium and has disappeared with the realization that this 'new inflation' mechanism is not needed.

A more recent proposal is described in Section 8.5. It gives

$$V(\phi) \simeq V_0 - \tfrac12 m^2\phi^2 - \tfrac14\lambda\phi^4 . \tag{188}$$

This gives

$$-V' = m^2\phi + \lambda\phi^3 + \cdots \tag{189}$$

and the two terms are equal at $\phi = \phi_* \equiv m/\sqrt{\lambda}$. It is supposed that the first term dominates while cosmological scales are leaving the horizon, but that the second term dominates before the end of inflation. For an estimate of Eq. (40), one can keep only the first term of Eq. (189) when the integration variable is less than $\phi_*$, and only the second term when it is bigger. In the latter case, one can also take the integral to be dominated by its lower limit $\phi_*$. This gives

$$\phi/\phi_* \simeq \sqrt{e}\,\exp(-x) , \tag{190}$$

where $x$ is defined by Eq. (177). The COBE normalization, Eq. (176), then gives

$$\lambda = 3\times10^{-13}\,(50/N)^3\,(2x)^3\exp(1-2x) . \tag{191}$$

Using the constraint

$$\tfrac12 < x < 5 , \tag{192}$$

this becomes

$$4\times10^{-14} < \lambda < 3\times10^{-13} . \tag{193}$$
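The size of the coupling in Eq. (191) can be evaluated directly. The sketch below assumes the form $\lambda = 3\times10^{-13}(50/N)^3(2x)^3\exp(1-2x)$ and evaluates it at $x = 1/2$ and $x = 5$; taking $x = 1/2$ as the lower end is an assumption made here (it is the point at which $\phi = \phi_*$ when cosmological scales leave the horizon). The function name is hypothetical.

```python
import math


def quartic_coupling(x, N=50.0):
    """lambda as a function of x = N|1 - n|/2, following Eq. (191)."""
    return 3.0e-13 * (50.0 / N) ** 3 * (2.0 * x) ** 3 * math.exp(1.0 - 2.0 * x)


lam_low_x = quartic_coupling(0.5)   # value at the assumed lower end of the x range
lam_high_x = quartic_coupling(5.0)  # value at the upper end x = 5
```

For $N = 50$ these endpoint values are $3\times10^{-13}$ and roughly $4\times10^{-14}$.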
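In this model, the tiny value of $\lambda$ occurs because it is suppressed by powers of $M_P$: it has the form of a function $F$ of fields that have been integrated out, divided by a power of $M_P$.

6.6. Another form for the potential

Another potential that has been proposed is

$$V = V_0\left(1 - e^{-q\phi/M_P}\right) , \tag{194}$$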
with $q$ of order 1. This form is supposed to apply in the regime where $V_0$ dominates, which is $\phi \gtrsim M_P$. Inflation ends at $\phi \sim M_P$, and when cosmological scales leave the horizon one has

$$\phi = \frac{1}{q}\ln(q^2N)\,M_P , \tag{195}$$

$$n - 1 = 2\eta = -2/N . \tag{196}$$

This potential is mimicked by $V = V_0(1 - \mu\phi^{-p})$ with $p \to \infty$ (Table 1). Gravitational waves are negligible. The COBE normalization, Eq. (44), is now $V_0^{1/4} \simeq 7\times10^{15}$ GeV.

This potential occurs in what one might call non-minimal inflation [289]. Here, the original potential is not particularly flat, but the kinetic term given by Eq. (136) becomes singular at a field value of order $M_P$, leading to a flat potential after converting to a canonically normalized inflaton field. Suppose, for example, that $K$ is given by Eqs. (349) and (350), and purely for convenience suppose that $t + t^* = M_P$ (it is expected to be of this order). Suppose also that all other fields vanish except for some field $\phi$, and set $M_P = 1$. Then $K = -3\ln(1 - |\phi|^2)$, and assuming that $V$ is independent of the phase of $\phi$ it is easy to show [289] that the potential is given by Eq. (194) with $q = \sqrt2$ and the canonically normalized field
1 2d
"tanh \(2" "! ( < (2
(197)
Another derivation [287,48] modifies Einstein gravity by adding an $R^2$ term with a huge coefficient to the usual $R$ term, and a third [25] uses a variable Planck mass. In both cases, after transforming back to Einstein gravity one obtains the above form with $q = \sqrt{2/3}$. These proposals too invoke large field values, making it difficult to see how $V_0$ can be sufficiently small (and how the kinetic terms can be almost canonical, as is assumed).

6.7. Hybrid inflation

We now turn to hybrid inflation models. In these models, the slowly rolling inflaton field $\phi$ is not the one responsible for most of the energy density. That role is played by another field $\psi$, which is held in place by its interaction with the inflaton field until the latter falls below a critical value $\phi_c$. When that happens $\psi$ is destabilized and inflation ends. This paradigm has proved very fruitful since its introduction by Linde [207] in 1991. Early treatments of it are Ref. [197] (1993), [208,60,235,88,113] (1994), [289,290,192,269] (1995), and [266,89,112,227,35,132] (1996); by now it is the standard paradigm of inflation.

In a related class of models the inflaton field is rolling away from the origin, and inflation ends when it rises above some critical value $\phi_c$. This paradigm, now known as inverted hybrid inflation, is less useful, as we shall discuss in Section 6.11. It was introduced by Ovrut and Steinhardt [253] in 1984, but has received little attention.

Note that the essential feature of hybrid inflation is the dominance of the potential by the field that is held fixed. Potentials of the form proposed by Linde had been considered earlier by several authors, starting with Kofman and Linde [169]. But they presumed the parameters to be such that the other field gives only a small contribution to the potential. As we noted at the end of
Section 5.3.4, such models are interesting because they might produce topological defects, or a feature in the spectrum, but they are not hybrid inflation models.

6.8. Hybrid inflation with a quadratic potential

We begin with the case that the potential during inflation has the simplest possible tree-level form,

$$V = V_0 + \tfrac12 m^2\phi^2 . \tag{198}$$

The first term is supposed to dominate, and inflation occurs provided that the condition

$$m^2 \ll V_0/M_P^2 \tag{199}$$

is at least marginally satisfied (this is the condition $\eta \ll 1$). We shall assume unless otherwise stated that $\phi \ll M_P$, so that $\epsilon \ll \eta$ and

$$n - 1 = 2\eta = 2M_P^2m^2/V_0 . \tag{200}$$

By itself, the above potential has no mechanism for ending inflation, since the flatness parameters $\epsilon$ and $\eta$ remain small as $\phi$ decreases. Inflation is supposed to end through a hybrid inflation mechanism as described in a moment, when $\phi$ falls below some critical value $\phi_c$. When the observable Universe leaves the horizon

$$\phi/\phi_c = e^{x} , \tag{201}$$

where $x$ is given by Eq. (177). At least with the two prescriptions for $\phi_c$ discussed below, Eq. (201) is consistent with the assumption $\phi \ll M_P$.

We emphasize at this point that the loop correction, ignored when one considers this potential, often dominates in reality. Several examples will be given later.

Proceeding with the assumption of a tree-level potential, the COBE normalization Eq. (44) is

$$\frac{2}{n-1}\,\frac{V_0^{1/2}}{M_P\,\phi} = 5.3\times10^{-4} \tag{202}$$

or

$$\frac{V_0^{1/2}}{M_P^2} = 5.3\times10^{-4}\,\frac{n-1}{2}\,\frac{\phi_c}{M_P}\,e^{x} . \tag{203}$$

To work out $\phi_c$, we need to include the non-inflaton field $\psi$ that is responsible for $V_0$. The full potential for the original model [207] is Eq. (147) that we already considered. (The earlier model of [94,95] seems also to be of this kind.)

$$V = V_0 - \tfrac12 m_\psi^2\psi^2 + \tfrac14\lambda\psi^4 + \tfrac12 m^2\phi^2 + \tfrac12\lambda'\phi^2\psi^2 \tag{204}$$

$$\phantom{V} = \tfrac14\lambda(M^2 - \psi^2)^2 + \tfrac12 m^2\phi^2 + \tfrac12\lambda'\phi^2\psi^2 . \tag{205}$$
Comparing the two ways of writing the potential, one sees that the parameters are related by

$$m_\psi^2 = \lambda M^2 , \tag{206}$$

$$V_0 = \tfrac14\lambda M^4 = \tfrac14 M^2m_\psi^2 . \tag{207}$$

This gives

$$\phi_c^2 = m_\psi^2/\lambda' = \lambda M^2/\lambda' . \tag{208}$$

It is useful to define

$$\eta_\psi \equiv m_\psi^2M_P^2/V_0 . \tag{209}$$

To have inflation end promptly when $\phi$ falls below $\phi_c$, as is assumed in this model, one needs $\eta_\psi$ significantly bigger than 1. In terms of $\eta_\psi$, the COBE normalization becomes

$$\lambda' = 2.8\times10^{-7}\,e^{2N\eta}\,\eta^2\,\eta_\psi . \tag{210}$$

A different prescription [266] is to replace the renormalizable coupling $\tfrac12\lambda'\phi^2\psi^2$ by a non-renormalizable coupling

$$\tfrac12\,\phi^4\psi^2/\Lambda_{\rm UV}^2 . \tag{211}$$

The COBE normalization is now

$$\frac{V_0^{1/2}}{M_P\Lambda_{\rm UV}} = 2.8\times10^{-7}\,e^{2N\eta}\,\eta^2\sqrt{\eta_\psi} . \tag{212}$$

If one takes $\Lambda_{\rm UV}$ significantly below $M_P$, the flatness conditions on the potential discussed in Section 5.8 may become more stringent.

With $\phi_c$ given by either of these prescriptions, Eq. (203) implies [60] a limit $n \lesssim 1.3$ (assuming that $V_0$ dominates the potential and that $M \lesssim M_P$). In that case, the present observational limit $|n-1| < 0.2$ is more or less predicted.

Different prescriptions will be considered in Sections 6.13, 8.3.4, 8.7 and 9.2. In the last case, $\phi$ is of order $M_P$ when the observable Universe leaves the horizon.

6.9. Masses from soft susy breaking

When hybrid inflation is implemented in a supersymmetric theory, the slope of the potential is often dominated by a loop correction. But there are cases where a tree-level slope $m^2\phi$ can dominate and we mention one of them now.

The crucial feature of the model [266] is that the parameters $\eta$ and $\eta_\psi$ are both very roughly of order 1. (This is what one might expect if the masses $m$ and $m_\psi$ both vanish in the limit of global
This is obvious if $\psi$ remains homogeneous, but the same result can actually be established [60] even without that assumption.
The scale $\Lambda_{\rm UV}$ is presumably supposed to come from integrating out some sector of the full theory. The non-renormalizable terms relevant for $\phi$ may or may not have the same effective scale $\Lambda_{\rm UV}$.
supersymmetry, and come only from supergravity corrections.) The vev of $\psi$ is therefore roughly $M \sim M_P$. It is presumed that this is achieved by replacing the first term of Eq. (205) by a more complicated function of $\psi$, rather than by making $\lambda$ tiny as would be required by Eq. (207). $\psi$ might be a matter field with non-renormalizable coupling suppressed to high order as a result of a discrete symmetry. In any case, $\psi = 0$ is presumably a fixed point of the relevant symmetries.

A less crucial feature is the assumption that $V_0^{1/4}$ is very roughly of order $10^{10}$ GeV. This is motivated by an assumption that there is a gravity-mediated mechanism of susy breaking in the true vacuum, which operates also during inflation with essentially the same strength.

As we have seen, the observational constraint $|n-1| < 0.2$ actually requires $|\eta| < 0.1$. The reduction of $\eta$ below its natural value of order 1 is supposed to come from an accidental cancellation in this model. To minimize the cancellation required, one prefers $n$ to be significantly above 1.

With the choice $\eta_\psi \sim 1$ some number $N_\psi$ of $e$-folds of inflation occur after $\phi$ reaches $\phi_c$. As discussed in Section 3.4, one has to require that $N_\psi$ is less than the total number of $e$-folds after cosmological scales leave the horizon, since the fluctuation while $\psi$ is rolling does not generate the flat spectrum required in this regime. In fact, it gives a spike in the spectrum [266,112], and one must require that it does not lead to excessive black hole formation. Typically this reduces the already significant upper limit on $n$ that follows from the same requirement in the absence of a spike [49,266,123].

Assuming that $\psi$ remains almost homogeneous after $\phi$ falls below $\phi_c$, one can calculate the number of $e$-folds of inflation that occur while $\psi$ rolls down to its vacuum value $\psi = M$. The result is [294]

$$N_\psi = \frac{1}{2\eta_\psi}\left(1 + \sqrt{1 + \frac{4\eta_\psi}{3}}\right)\ln\frac{M}{\psi_0} . \tag{213}$$

Here $\psi_0 \sim H$ is the initial value of $\psi$, given by its quantum fluctuation. Since $V_0^{1/4}$ is supposed to be of order $10^{10}$ GeV in this model, $\psi_0 \sim 10^{-16}M_P$, leading to [294]

$$N_\psi \sim \frac{37}{2\eta_\psi}\left(1 + \sqrt{1 + 4\eta_\psi/3}\right) . \tag{214}$$

Requiring $N_\psi < 10$ leads to $\eta_\psi > 8$, and requiring $N_\psi < 30$ leads to $\eta_\psi > 1.7$.

In this model, the COBE normalization requires $\lambda'$ in Eq. (210), or $\Lambda_{\rm UV}/M_P$ in Eq. (212), to be a few orders of magnitude below unity. These small couplings are consistent with the assumption that loop corrections are negligible. On the other hand, the inflaton could still have large couplings to other fields, which could give a large loop correction. If that happens, one arrives at the running inflaton mass model of Section 6.16.
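The two bounds just quoted follow from inverting the estimate of Eq. (214). The sketch below assumes the form $N_\psi \simeq (L/2\eta_\psi)\,(1 + \sqrt{1 + 4\eta_\psi/3})$ with $L = \ln(M/\psi_0) \approx 37$; the function names and the closed-form inversion are illustrative additions, not part of the original text.

```python
import math


def efolds_after_destabilization(eta_psi, log_ratio=37.0):
    """N_psi from Eq. (214); log_ratio stands for ln(M / psi_0)."""
    return log_ratio / (2.0 * eta_psi) * (1.0 + math.sqrt(1.0 + 4.0 * eta_psi / 3.0))


def eta_psi_for_efolds(n_psi, log_ratio=37.0):
    """eta_psi at which Eq. (214) gives exactly n_psi e-folds.

    With s = sqrt(1 + 4 eta_psi / 3) one has 2 eta_psi = (3/2)(s^2 - 1),
    so Eq. (214) reduces to n_psi = 2 log_ratio / (3 (s - 1)).
    """
    s = 1.0 + 2.0 * log_ratio / (3.0 * n_psi)
    return 0.75 * (s * s - 1.0)


eta_for_10 = eta_psi_for_efolds(10.0)
eta_for_30 = eta_psi_for_efolds(30.0)
```

Because $N_\psi$ decreases as $\eta_\psi$ grows, these threshold values are lower bounds on $\eta_\psi$; they come out close to the quoted 8 and 1.7.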
Of the six examples displayed in the figures of Ref. [266] only one actually has $n$ significantly bigger than 1, and therefore it should be regarded as a favoured parameter choice.
In the case $\eta_\psi \ll 1$ one has slow-roll inflation, and the homogeneity can be checked by calculating the vacuum fluctuation. It seems reasonable that it will hold to sufficient accuracy also if $\eta_\psi \sim 1$.
6.10. Hybrid thermal inflation

Related to the scheme we just described is a radical proposal [4], which would have a distinctive observational signature. Its basic ingredients are fairly natural, though the particular combination required may be difficult to arrange.

The idea is to have a hot big bang during the era immediately preceding observable inflation, with all relevant fields in thermal equilibrium as was proposed in the early models of inflation. (This primordial hot big bang is presumably preceded by more inflation as described in Section 3.6.)

Let us begin with the simplest version of the proposal. Including the finite temperature $T$, the potential during inflation is something like

$$V(\phi,\psi) = V_0 + T^4 + T^2\psi^2 - \tfrac12 m_\psi^2\psi^2 + T^2\phi^2 - \Delta V(\phi) . \tag{215}$$

As in the previous case, it is supposed that very roughly $m_\psi^2 \sim V_0/M_P^2$, corresponding to a true vacuum value of $\psi$ very roughly of order $M_P$ (but maybe some orders of magnitude less). The last term, which will determine the motion of the inflaton field $\phi$, is not specified in detail.

The temperature falls roughly like $1/a$, and an epoch of what one might call 'hybrid thermal inflation' begins when the potential is dominated by $V_0$ at $T \sim V_0^{1/4}$, and ends when $\psi$ is destabilized at $T \sim m_\psi$. This lasts for $N \sim 10$ $e$-folds. After a further $N_\psi$ $e$-folds, given by Eq. (214), $\psi$ arrives at its true vacuum value and inflation ends. Meanwhile, $\phi$ rolls slowly, and is supposed to be the dominant source of the primordial curvature perturbation. (This last feature would need checking case by case, as the other field $\psi$ may be significant; see Section 4.)

To avoid unacceptable relics of the thermal era, at least a few $e$-folds of inflation have to occur before the observable Universe leaves the horizon [197,70], which will probably use up all of the $e$-folds of thermal inflation. In that case, we just have a hybrid inflation model with the unspecified potential $V = V_0 - \Delta V(\phi)$. (It differs from the usual case, though, in that the other field $\psi$ is already destabilized when the observable Universe leaves the horizon.) However, there could well be several of the other fields $\psi_n$, taking different numbers of $e$-folds to reach their vacuum values. As each one does so, a feature in the spectrum could be generated, because the inflaton mass coming from supergravity may change. As a more complicated variant of the scheme, one may suppose that the destabilization of one field affects the stability of another.

6.11. Inverted hybrid inflation

One can also construct hybrid inflation models where $\phi$ is rolling away from the origin, under the influence of the inverted quadratic potential Eq. (174). A simple potential $V(\phi,\psi)$ which achieves this is [227]

$$V = V_0 - \tfrac12 m^2\phi^2 + \tfrac12 m_\psi^2\psi^2 - \tfrac12\lambda\phi^2\psi^2 + \cdots . \tag{216}$$
The phenomenon of ordinary thermal inflation was noted in Refs. [34,191], and discussed in detail in Refs. [225,226,292,26]. Ordinary thermal inflation is identical with the phenomenon we are describing now, except that the field $\phi$ is not present. Ordinary thermal inflation is supposed to happen long after ordinary inflation is over, with the susy breaking scale the same as in the vacuum. This makes $m_\psi$ a typical soft mass of order 100 GeV, and assuming $\langle\psi\rangle \ll M_P$ it makes $V_0^{1/4} \ll 10^{10}$ GeV.
The dots represent terms which give $V$ a minimum where it vanishes, but which play no role during inflation. At fixed $\phi$ there is a minimum at $\psi = 0$ provided that

$$\phi < \phi_c = m_\psi/\sqrt{\lambda} . \tag{217}$$

A better-motivated potential leading to inverted hybrid inflation will be described in Section 8.7. A more complicated one appears in [253], but the inflaton trajectory turns out to be unstable [251].

Inverted hybrid inflation is characterised by the appearance of a negative coupling $-\phi^n\psi^m$, in contrast with the usual positive coupling $\phi^n\psi^m$. Such a negative coupling, for fields in thermal equilibrium, corresponds to high temperature symmetry restoration [128]. In the context of supersymmetry it is more difficult to arrange than the positive coupling. In any case, one has to ensure that the potential remains bounded from below in its presence.

6.12. Hybrid inflation with a cubic or higher potential

Instead of the quadratic potential, Eq. (198), one might consider a potential

$$V = V_0(1 + c\phi^p) , \tag{218}$$

with $p \geq 3$ (and $c > 0$). This case is similar to the one that we discuss in some detail in Section 6.14. One has

$$\eta = cM_P^2\,p(p-1)\,\phi^{p-2} , \tag{219}$$

and inflation is possible [208] only in the regime $\eta \ll 1$. It is not clear how the inflaton is supposed to get into this regime. The number of $e$-folds to the end of inflation is

$$N(\phi) \simeq \frac{p-1}{p-2}\left(\frac{1}{\eta(\phi_c)} - \frac{1}{\eta(\phi)}\right) . \tag{220}$$

For $\phi \gg \phi_c$, $N(\phi)$ approaches a constant

$$N_{\rm tot} \equiv \frac{p-1}{p-2}\,\frac{1}{\eta(\phi_c)} . \tag{221}$$

The spectral index is given by

$$\frac{n-1}{2} = \frac{p-1}{p-2}\,\frac{1}{N_{\rm tot} - N} . \tag{222}$$
The quartic case has been considered in some detail [269], including the regime $\phi \gg M_P$ that we are ignoring.
We are assuming that
One may also consider the case where (say) quadratic, cubic and quartic terms are all important during observable inflation [113], but that will clearly involve considerable fine-tuning.

6.13. Mutated hybrid inflation

In both ordinary and inverted hybrid inflation, the other field $\psi$ is precisely fixed during inflation. If it varies, an effective potential $V(\phi)$ can be generated even if the original potential contains no piece that depends only on $\phi$. This mechanism was first proposed in Ref. [290], where it was called mutated hybrid inflation. The potential considered was

$$V = V_0(1 - \psi/M) + \tfrac14\lambda\phi^2\psi^2 + \cdots . \tag{223}$$

The dots represent one or more additional terms, which give $V$ a minimum at which it vanishes but play no role during inflation. All of the other terms are significant, with $V_0$ dominating. For suitable choices of the parameters inflation takes place with $\psi$ held at the instantaneous minimum, leading to a potential

$$V = V_0\left(1 - \frac{V_0}{\lambda M^2\phi^2}\right) . \tag{224}$$

This gives

$$n - 1 = -\frac{3}{2N} , \tag{225}$$

and the COBE normalization, Eq. (44), is

$$5.2\times10^{-4} = (2N)^{3/4}\,(\lambda V_0)^{1/4}\,M^{1/2}/M_P^{3/2} . \tag{226}$$

A different version of hybrid inflation [191] was called 'smooth' hybrid inflation, emphasizing that any topological defects associated with $\psi$ will never be produced. In this version, the potential is $V = V_0 - A\psi^4 + B\psi^6\phi^2 + \cdots$. It leads to $V = V_0(1 - \mu\phi^{-4})$.

Retaining the original name, the most general mutated hybrid inflation model with only two significant terms is [227]

$$V = V_0 - \frac{\sigma}{p}M_P^{4-p}\psi^p + \frac{\lambda}{q}M_P^{4-q-r}\psi^q\phi^r + \cdots . \tag{227}$$

In a suitable regime of parameter space, $\psi$ adjusts itself to minimize $V$ at fixed $\phi$, and $\psi \ll \phi$ so that the slight curvature of the inflaton trajectory does not affect the field dynamics. Then, provided that $V_0$ dominates the energy density, the effective potential during inflation is

$$V = V_0(1 - \mu\phi^{-\alpha}) , \tag{228}$$

where

$$\mu = M_P^{4+\alpha}\,\frac{q-p}{pq}\,\frac{\sigma^{q/(q-p)}\,\lambda^{-p/(q-p)}}{V_0} > 0 , \tag{229}$$

$$\alpha = \frac{pr}{q-p} . \tag{230}$$

For $q > p$, the exponent $\alpha$ is positive as in the examples already mentioned, but for $p > q$ it is negative with $\alpha < -1$. In both cases it can be non-integral, though integer values are the most
common for low choices of the integers $p$ and $q$. This potential is supposed to hold until $V_0$ ceases to dominate at

$$\phi_{\rm end} \sim \mu^{1/\alpha} , \tag{231}$$

after which slow-roll inflation ends.

The situation in the regime $-2 < \alpha < -1$ is similar to the one that we discussed already for the case $\alpha = -2$; the prediction for $n$ covers a continuous range below 1 because it depends on the parameters, but to have a model with $\phi \ll M_P$ the potential has to be steepened after cosmological scales leave the horizon. The COBE normalization in this case is [227]

$$5.3\times10^{-4} = \frac{V_0^{1/2}}{|\alpha|\,\mu\,M_P^3}\left[\phi_{\rm end}^{2-|\alpha|} - |\alpha|(2-|\alpha|)\,M_P^2\,\mu N\right]^{-\frac{|\alpha|-1}{2-|\alpha|}} . \tag{232}$$

In the cases $\alpha < -2$ and $\alpha > -1$, the situation is similar to the one that we encountered in Section 6.5 (except for the special cases $\alpha \simeq -2$ and $\alpha \simeq -1$, which we do not consider). In the case $\alpha < -2$, the integral Eq. (40) is dominated by the limit $\phi$ provided that $\phi_{\rm end} \ll \sqrt{N}M_P$, which we assume. In the case $\alpha > -1$ one has $\phi_{\rm end} < \phi$, and assuming $\phi \ll M_P$ while cosmological scales leave the horizon again means that Eq. (40) is dominated by the limit $\phi$. In all of these cases, the COBE normalization, Eq. (184), and the prediction, Eq. (182), are valid, with $p$ replaced by $-\alpha$.

Of the various possibilities regarding $\alpha$, some are preferred over others in the context of supersymmetry. One would prefer [227] $q$ and $r$ to be even if $\alpha > 0$ (corresponding to $q > p$) and $p$ to be even if $\alpha < 0$. Applying this criterion with $p = 1$ or 2 and $q$ and $r$ as low as possible leads [227] to the original mutated hybrid model, along with the cases $\alpha = -2$ and $\alpha = -4$ that we discussed earlier in the context of inverted hybrid and single-field models.

A different example of a mutated hybrid inflation potential is given in Ref. [112], where $\psi$ is a pseudo-Goldstone boson with the potential Eq. (179).

Mutated hybrid inflation with explicit $\phi$ dependence: So far we have assumed that the original potential has no piece that depends only on $\phi$. If there is such a piece it has to be added to the inflationary potential Eq. (228). If it dominates while cosmological scales leave the horizon, the only effect that the $\psi$ variation has on the inflationary prediction is to determine $\phi_{\rm end}$ through Eq. (231).

6.14. Hybrid inflation from dynamical supersymmetry breaking

In Section 5.7.2, we noted that non-perturbative effects, such as those associated with dynamical supersymmetry breaking, could give a potential proportional to $1/\phi^p$ where $p$ is some integer,

$$V(\phi) = V_0 + \frac{\Lambda^{p+4}}{\phi^p} + \cdots , \tag{233}$$

where the dots represent terms that are negligible during inflation. This potential has been proposed [163,164] as a model of inflation. It is convenient to define a dimensionless quantity $\alpha \equiv \Lambda^{p+4}M_P^{-p}V_0^{-1}$, so that

$$V = V_0\left[1 + \alpha\left(\frac{M_P}{\phi}\right)^p + \cdots\right] . \tag{234}$$
This gives

$$\eta = \alpha p(p+1)\left(\frac{M_P}{\phi}\right)^{p+2} . \tag{235}$$

The potential satisfies the flatness conditions in the regime $\eta \ll 1$. Inflation is supposed to end when $\phi$ reaches a critical value $\phi_c$, through some unspecified hybrid inflation mechanism. The number of $e$-folds to the end of inflation is

$$N(\phi) \simeq \frac{p+1}{p+2}\left(\frac{1}{\eta(\phi_c)} - \frac{1}{\eta(\phi)}\right) , \qquad \epsilon \ll \eta . \tag{236}$$

For $\phi \ll \phi_c$, $N(\phi)$ approaches a constant

$$N_{\rm tot} \equiv \frac{p+1}{p+2}\,\frac{1}{\eta(\phi_c)} = \frac{1}{p(p+2)}\,\alpha^{-1}\left(\frac{\phi_c}{M_P}\right)^{p+2} . \tag{237}$$

This is quite an unusual feature. Most models of inflation have no intrinsic upper limit on the total amount of expansion that takes place during the inflationary phase, although only the last 50 or 60 $e$-folds are of direct observational significance. Here the total amount of inflation is bounded from above, although that upper bound can in principle be very large. The COBE normalization, Eq. (44), is
(p#2) 1 < N N>N> N 1! , (238) d K & 3 M
2p N . where N:50 corresponds to the epoch when COBE scales leave the horizon. The spectral index is given by
$$n - 1 \simeq \frac{p+1}{p+2}\,\frac{2}{N_{\rm tot} - N} . \tag{239}$$

The spectrum turns out to be blue ($n > 1$), but for $N_{\rm tot} \gg 50$ the spectrum approaches scale invariance ($n = 1$). If one takes the case of $p = 2$ and $\phi_c \sim V_0^{1/4}$, the COBE constraint Eq. (44) is met for $V_0^{1/4} \simeq 10^{10}$ GeV and $\Lambda \simeq 10^{6}$ GeV. In this class of models, $n$ is indistinguishable from 1 in most of parameter space. A value of $n$ significantly above 1 is however possible for properly tuned values of the parameters. Taking $N = 50$ and $p = 2$, a spectral index of $n > 1.1$ requires $N_{\rm tot}$ given by Eq. (237) to be less than 65. In the context of supergravity, it is more comfortable to be in this regime since an accidental cancellation is being invoked to avoid the generic contributions of order 1 to the quantity $2\eta = n - 1$. Such a small amount of inflation could have observationally important consequences. Also, unlike standard hybrid inflation models, dynamical supersymmetric inflation allows a measurable deviation from a power-law spectrum of fluctuations, with a variation in the scalar spectral index $|dn/d(\ln k)|$ that may be as large as 0.05 [164].
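The statement that $n > 1.1$ requires fewer than 65 total $e$-folds can be verified from Eq. (239). The sketch below assumes the form $n - 1 = \frac{p+1}{p+2}\,\frac{2}{N_{\rm tot} - N}$, where the symbol $N_{\rm tot}$ for the constant of Eq. (237) and the function names are notational assumptions made here.

```python
def spectral_index(n_tot, p=2, N=50.0):
    """n from Eq. (239): n - 1 = ((p+1)/(p+2)) * 2 / (N_tot - N)."""
    if n_tot <= N:
        raise ValueError("N_tot must exceed N")
    return 1.0 + (p + 1.0) / (p + 2.0) * 2.0 / (n_tot - N)


def max_total_efolds(n_min, p=2, N=50.0):
    """Largest N_tot for which the spectral index is at least n_min (> 1)."""
    return N + (p + 1.0) / (p + 2.0) * 2.0 / (n_min - 1.0)


n_tot_limit = max_total_efolds(1.1)   # bound on N_tot for n > 1.1, p = 2, N = 50
n_at_limit = spectral_index(65.0)     # spectral index when N_tot = 65
```

Since $n - 1$ falls as $N_{\rm tot}$ grows, any $N_{\rm tot}$ below the returned limit gives a spectral index above the chosen threshold.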
We are assuming that
It is important to note that this upper limit on the total amount of inflation can potentially lead to difficulties with initial conditions: how does the field end up in the correct region of the potential with a small enough rate of change to initiate slow-roll? While this sort of problem with initial conditions is in fact common to many models of inflation, it is mitigated to a certain degree by the existence of classical solutions which admit a formally infinite amount of inflation. No such solution exists in this case. It is reasonable to expect that the field will initially be at small values, $\phi \ll \langle\phi\rangle$, since the term $\phi^{-p}$ in the potential will generically appear only at scales smaller than $\Lambda$, with a phase transition connecting the high energy and low energy behaviours. However, in the absence of a detailed model for this phase transition, the question of initial conditions remains quite obscure.

6.15. Hybrid inflation with a loop correction from spontaneous susy breaking

The models considered so far work at tree level. This is valid only if the couplings of the inflaton to other fields are strongly suppressed. In particular, the inflaton presumably has to be a gauge singlet (no coupling to gauge fields) since gauge couplings are not supposed to be suppressed. In the absence of supersymmetry, the couplings should indeed be suppressed. The reason is that the loop correction is then $\Delta V \propto \phi^4\ln(\phi/Q)$ which would spoil inflation as in Eq. (186). But with supersymmetry, there is no reason to suppose that the inflaton couplings are suppressed. As we saw in Sections 5.6.2 and 7.7, the 1-loop correction in a supersymmetric theory typically has one of two forms, $\Delta V \propto \ln(\phi/Q)$ or $\Delta V \propto \phi^2\ln(\phi/Q)$. We discuss the first form in this subsection, and the second form in the next one. The first form typically arises if susy is broken spontaneously. Assuming that tree-level terms are negligible during inflation, the potential is of the form
$$V = V_0\left(1 + \frac{C g^2}{8\pi^2}\ln\frac{\phi}{Q}\right).$$ (240)
In this expression, $C$ may be taken to be the number of possible 1-loop diagrams, in other words the number of fields which have significant coupling to the inflaton. The other factor $g$ is a typical coupling of these fields (times a numerical factor of order 1). It may be a gauge coupling (D-term inflation, Section 9) or a Yukawa coupling (Section 8.4). In the former case $C$ might be of order 100, which as we shall see would be bad news. In both cases, this potential occurs as part of a hybrid inflation model. Depending on the parameters, inflation ends when either slow-roll fails ($\eta \sim 1$) or the critical value $\phi_{\rm c}$ is reached, whichever is earlier. However, the precise value of $\phi_{\rm end}$ is irrelevant because the integral, Eq. (40), is dominated by the limit $\phi$. It gives
$$\phi \simeq \sqrt{\frac{N C g^2}{4\pi^2}}\;M_{\rm P}$$ (241)

(Footnote: If slow-roll fails before $\phi_{\rm c}$ is reached, inflation will continue until the amplitude of the oscillation becomes of order $\phi_{\rm c}$. The number of e-folds of this type of inflation is $\Delta N \sim \ln(\phi/\phi_{\rm c})$, with $\phi$ the value at which slow-roll fails, which is typically negligible.)
$$\phantom{\phi} = 11\,\sqrt{\frac{N}{50}}\,\sqrt{\frac{C}{100}}\,\frac{g}{1.0}\,M_{\rm P}$$ (242)
$$\phantom{\phi} = 0.2\,\sqrt{\frac{N}{20}}\,\frac{g}{0.1}\,\sqrt{\frac{C}{[\ldots]}}\,M_{\rm P}.$$ (243)
This makes $\phi$ comparable with the Planck scale, and maybe bigger. As we discussed in Section 5.9 one needs $\phi \lesssim M_{\rm P}$ and preferably $\phi \ll M_{\rm P}$, in order to keep the theory under control and in particular to justify the assumption of canonical normalization for the fields. Let us proceed on the assumption that $\phi$ is not too big. Assuming that the loop dominates the slope, and using Eq. (40), the flatness parameters are
$$\eta = -\frac{1}{2N},$$ (244)
$$\epsilon = \frac{1}{2}\,\frac{C g^2}{8\pi^2}\,|\eta|.$$ (245)
The COBE normalization, Eq. (44), is
$$V^{1/4} = 6.0\left(\frac{50}{N}\right)^{1/4} C^{1/4} g^{1/2} \times 10^{15}\ {\rm GeV}.$$ (246)
The spectral index is given by
$$1 - n = \frac{1}{N}\left(1 + \frac{3 C g^2}{16\pi^2}\right).$$ (247)
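The numbers in Eqs. (241), (242), (244) and (247) can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions rather than anything from the original paper: the function names are hypothetical, the potential is taken to be exactly the loop-dominated form of Eq. (240), and field values are in units of $M_{\rm P}$.

```python
import math


def phi_over_mp(n_efolds: float, n_loops: float, g: float) -> float:
    """Field value N e-folds before the end, Eq. (241): sqrt(N C g^2 / 4 pi^2)."""
    return g * math.sqrt(n_efolds * n_loops) / (2.0 * math.pi)


def eta_from_phi(phi: float, n_loops: float, g: float) -> float:
    """eta = M_P^2 V''/V for V = V_0 (1 + (C g^2 / 8 pi^2) ln(phi/Q)), phi in M_P units."""
    return -n_loops * g**2 / (8.0 * math.pi**2 * phi**2)


def one_minus_n(n_efolds: float, n_loops: float, g: float) -> float:
    """Tilt of Eq. (247): 1 - n = (1/N) (1 + 3 C g^2 / 16 pi^2)."""
    return (1.0 + 3.0 * n_loops * g**2 / (16.0 * math.pi**2)) / n_efolds


if __name__ == "__main__":
    # Illustrative inputs: N = 50, C = 100, g = 1 (the case of Eq. 242).
    phi = phi_over_mp(50, 100, 1.0)
    print(phi, eta_from_phi(phi, 100, 1.0))
    # Small coupling: the bracket in Eq. (247) is close to 1.
    print(one_minus_n(25, 1, 0.1), one_minus_n(50, 1, 0.1))
```

Evaluating `eta_from_phi` at the field value returned by `phi_over_mp` recovers $\eta = -1/(2N)$ independently of $C$ and $g$, which is the content of Eq. (244).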
Taking the bracket to be close to 1, and $N$ to be in the range 25–50, one obtains the distinctive prediction $n = 0.96$ to $0.98$. With $g = 1$ and $C = 100$, $1 - n$ is increased by a factor $\simeq 2$, but it is clear that anyhow $n$ is close to 1. This prediction will eventually be tested.

6.16. Hybrid inflation with a running mass

Now we turn to the case that the loop correction is of the form $\phi^2\ln(\phi/Q)$, which typically arises when susy is softly broken. Models of inflation invoking such a correction have been proposed by Stewart [293,294]. As we noted in Section 5.6.2, this type of loop correction is equivalent to replacing the inflaton mass by a slowly varying (running) mass $m(\phi)$. At $\phi = M_{\rm P}$, the running mass is supposed to have the magnitude $|m^2| \sim V_0/M_{\rm P}^2$, which is the minimum one in a generic supergravity theory. The inflaton is supposed to have couplings (gauge, or maybe Yukawa) that are not too small, and for the most part we assume that $m^2(\phi)$ passes through zero before it stops running. (Footnote: The running associated with a given loop will stop when $\phi$ falls below the mass of the particle in the loop.) Because the couplings are small compared with unity, $V'$ then vanishes at some relatively nearby point, which we denote by $\phi_*$.

6.16.1. General formulas

It is useful to write
$$V(\phi) = V_0\left(1 - \frac{1}{2}\,M_{\rm P}^{-2}\,\mu^2(\phi)\,\phi^2\right),$$ (248)
where
$$\mu^2(\phi) \equiv -\frac{M_{\rm P}^2\,m^2(\phi)}{V_0}.$$ (249)
We are supposing that $V_0$ dominates, since this is necessary for inflation in the regime $\phi \lesssim M_{\rm P}$ where the field theory is under control. Then
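$$\frac{M_{\rm P} V'}{V_0} = -\frac{\phi}{M_{\rm P}}\left(\mu^2 + \frac{1}{2}\frac{d\mu^2}{dt}\right),$$ (250)
$$\eta \equiv \frac{M_{\rm P}^2 V''}{V_0} = -\left(\mu^2 + \frac{3}{2}\frac{d\mu^2}{dt} + \frac{1}{2}\frac{d^2\mu^2}{dt^2}\right),$$ (251)
where $t \equiv \ln(\phi/M_{\rm P})$. We assume that while observable scales are leaving the horizon one can make a linear expansion in $\ln\phi$,
$$\mu^2 \simeq \mu_*^2 + c\,\ln(\phi/\phi_*),$$ (252)
where $|c| \ll 1$ is related to the couplings involved. This gives
$$\frac{M_{\rm P} V'}{V_0} = c\,\frac{\phi}{M_{\rm P}}\,\ln(\phi_*/\phi),$$ (253)
$$\eta \equiv \frac{M_{\rm P}^2 V''}{V_0} = c\left[\ln(\phi_*/\phi) - 1\right].$$ (254)
Note that $\mu_*^2 = -\frac{1}{2}c$, and that $\mu^2 = 0$ at $\ln(\phi_*/\phi) = -\frac{1}{2}$ while $V'' = 0$ at $\ln(\phi_*/\phi) = 1$. The number $N(\phi)$ of e-folds to the end of slow-roll inflation is given by
$$N(\phi) = M_{\rm P}^{-2}\int_{\phi_{\rm end}}^{\phi}\frac{V}{V'}\,d\phi.$$ (255)
Using the linear approximation near $\phi_*$, this gives
$$N(\phi) = -\frac{1}{c}\,\ln\left(\frac{c}{\sigma}\,\ln\frac{\phi_*}{\phi}\right)$$ (256)
or
$$\frac{\sigma}{c}\,e^{-cN} = \ln(\phi_*/\phi).$$ (257)
Knowing the functional form of $m(\phi)$, and the value of $\phi_{\rm end}$, the constant $\sigma$ can be evaluated by taking the limit $\phi \to \phi_*$ in the full expression Eq. (255). We shall see that in most cases one expects
$$|c| \lesssim |\sigma| \lesssim 1.$$ (258)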
(Footnote: This is equivalent to writing $\mu^2 = c\,\ln(\phi/Q)$ as in Tables 1 and 2, the free parameter $Q$ then replacing the free parameter $\phi_*$. In turn, this is equivalent to using a loop correction, with the renormalization scale $Q$ fixed at the point where $m$ vanishes.)
The spectral index $n = 1 + 2\eta$ is given in terms of $c$ and $\sigma$ by
$$\frac{n-1}{2} = \sigma\,e^{-cN} - c.$$ (259)
The COBE normalization is
$$\frac{V_0^{1/2}}{M_{\rm P}^2} = 5.3\times10^{-4}\,\frac{M_{\rm P}\,|V'|}{V_0}.$$ (260)
In our case it is convenient to define a constant $\tau$ by
$$\ln(M_{\rm P}/\phi_*) \equiv \tau/|c|.$$ (261)
Assuming that $|m^2|$ has the typical value $V_0/M_{\rm P}^2$ at the Planck scale, the linear approximation Eq. (252) applied at that scale would give $\tau \simeq 1$. Will the linear approximation apply at that scale? If all relevant masses at the Planck scale are of order $V_0^{1/2}/M_{\rm P}$, one expects on dimensional grounds that the linear approximation will be valid in the regime $|c\,\ln(\phi/\phi_*)| \ll 1$. Then the approximation will be just beginning to fail at the Planck scale. At least in this case, one expects $\tau$ to be very roughly of order 1. Using the definition of $\tau$, Eqs. (253) and (257) give
$$\frac{V_0^{1/2}}{M_{\rm P}^2} = e^{-\tau/|c|}\,\exp\left(-\frac{\sigma}{c}\,e^{-cN_{\rm COBE}}\right)|\sigma|\,e^{-cN_{\rm COBE}} \times 5.3\times10^{-4}.$$ (262)
In these models, the spectral index may be strongly scale-dependent. In fact, using $d\ln k = -dN$ one finds
$$\frac{dn}{d\ln k} = 2c\sigma\,e^{-cN} = 2c\left(\frac{n-1}{2} + c\right).$$ (263)
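The chain from Eq. (256) to Eq. (263) can be verified numerically. The following is a hedged sketch, assuming only the linear approximation of Eq. (252); the function names and the sample values of $c$ and $\sigma$ are illustrative choices, not taken from the original text. It evaluates the spectral index of Eq. (259) and compares the running of Eq. (263) with a finite-difference derivative, using $d\ln k = -dN$.

```python
import math


def log_phi_ratio(n_efolds: float, c: float, sigma: float) -> float:
    """ln(phi_*/phi) from Eq. (257)."""
    return (sigma / c) * math.exp(-c * n_efolds)


def efolds(log_ratio: float, c: float, sigma: float) -> float:
    """Number of e-folds from Eq. (256), given ln(phi_*/phi)."""
    return -math.log((c / sigma) * log_ratio) / c


def spectral_index(n_efolds: float, c: float, sigma: float) -> float:
    """Spectral index from Eq. (259): (n-1)/2 = sigma exp(-cN) - c."""
    return 1.0 + 2.0 * (sigma * math.exp(-c * n_efolds) - c)


def running(n_efolds: float, c: float, sigma: float) -> float:
    """dn/dlnk from Eq. (263)."""
    return 2.0 * c * sigma * math.exp(-c * n_efolds)


if __name__ == "__main__":
    c, sigma, n_cobe = 0.05, 0.1, 50.0  # illustrative values only
    step = 1e-4
    # Central difference for -dn/dN, which equals dn/dlnk.
    numeric = -(spectral_index(n_cobe + step, c, sigma)
                - spectral_index(n_cobe - step, c, sigma)) / (2.0 * step)
    print(spectral_index(n_cobe, c, sigma), running(n_cobe, c, sigma), numeric)
```

The same functions accept negative $c$ or $\sigma$, so the four sign combinations of the two constants can be explored by changing the inputs.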
For it to be eventually observable we need $|dn/d\ln k| \gtrsim 10^{-3}$, and this condition is satisfied in a large part of the parameter space. Let us discuss the regime of validity of Eqs. (259) and (263), using Eqs. (59) and (60). The quantities appearing in these expressions are
$$\xi^2 = c\sigma\,e^{-cN},$$ (264)
$$\sigma_V^3 = -\xi^4/c = -\xi^2\,c\,\ln(\phi_*/\phi).$$ (265)
(We relabelled the quantity $\sigma$ in Eq. (53) as $\sigma_V$.) Eq. (259) will be a good approximation if
$$|\xi^2| \ll |\eta|.$$ (266)
In contrast to the other models we have discussed (where $V' \propto \phi^p$), this condition is not guaranteed. But in this model, $\xi^2$ is slowly varying. As a result Eq. (51) (with $\epsilon$ negligible) implies that the condition will hold except within a few e-folds of a point where $\eta$ changes sign. The error of order $\xi^2$ just represents a small change in the effective value of $\sigma$, which can be cancelled by a small change in the underlying parameters (couplings and masses). The improved slow-roll approximation, Eq. (80), shows that the error actually corresponds to changing $\sigma$ by an amount $1.06c$. In the present state of theory the precise amount is not of interest.
(Footnote: It would become so only if the underlying parameters were predicted by something like string theory.) When cosmological scales leave the horizon, $|\sigma_V^3| \ll |\xi^2|$, so the slow-roll formula for $dn/d\ln k$ will also be valid.

6.16.2. The four models

Four types of inflation models are possible, corresponding to whether $\phi_*$ is a maximum or a minimum, and whether $\phi$ during inflation is smaller or bigger than $\phi_*$.

In the case that $\phi_*$ is a maximum, one expects the potential to have the form shown in Figs. 6 and 7. There is a minimum at $\phi = 0$, and the non-renormalizable terms will ensure that there is a minimum also at some value $\phi > \phi_*$. The latter will generally be lower than the one at the origin, and we assume that this is the case. This lowest minimum represents the true vacuum if $V$ vanishes there as in Fig. 6. If instead $V$ is positive as in Fig. 7, the vacuum lies in some other field direction, 'out of the paper'. In this case, it is supposed that $\phi$ arrives near the maximum by tunneling from the minimum that lies on the opposite side.

In the case that $\phi_*$ is a minimum, the potential will be like the one in Fig. 4. The unique minimum, at $\phi_*$, is the vacuum if $V$ vanishes there (the case shown in Fig. 4). If instead $V$ is positive at the minimum, the vacuum lies in some other field direction.

Model (i); a maximum with $\phi < \phi_*$. This model [294,62] corresponds to $m^2(M_{\rm P}) < 0$, $c > 0$ and $\sigma > 0$, with $\phi$ decreasing during inflation. The spectral index increases as the scale $k^{-1}$ decreases, and can be either bigger or less than 1. For inflation to end, the form Eq. (248) of $V(\phi)$ must be modified when $\phi$ falls below some critical value $\phi_{\rm c}$, presumably through a hybrid inflation mechanism. On the other hand, if the inflaton mass continues to run until $m^2 \simeq V_0/M_{\rm P}^2$, slow-roll inflation will end then. Let us suppose first that this is the case, and define $\phi_{\rm fast}$ by
$$m^2(\phi_{\rm fast}) = V_0/M_{\rm P}^2.$$ (267)
This is equivalent to defining $\eta(\phi_{\rm fast}) = 1$, up to corrections of order $c$ which presumably should not be included in a one-loop calculation. The end of slow-roll inflation corresponds to $\phi_{\rm end} = \phi_{\rm fast}$, and the linear approximation Eq. (252) gives the rough estimate $|\ln(\phi_{\rm fast}/\phi_*)| \sim 1/c$, making $\sigma \sim 1$.

Now consider the case where inflation ends at some value $\phi_{\rm c}$, with $|m^2(\phi_{\rm c})| < V_0/M_{\rm P}^2$. If the mass is still running at that point, the linear estimate Eq. (256) gives $\sigma \sim c\,\ln(\phi_*/\phi_{\rm c}) < 1$. Values $\sigma \ll c$ can be achieved only with $\phi_{\rm c}$ very close to $\phi_*$, which would represent fine-tuning. Therefore we expect in this case $c \lesssim \sigma \lesssim 1$. If the mass stops running before $\phi_{\rm c}$ is reached, then $m^2$ has a constant value in the regime between the point where the running stops and $\phi_{\rm c}$. In this regime, some number $\Delta N$ of e-folds of slow-roll inflation occur.
We are assuming that cosmological scales leave the horizon while the mass is still running, which requires
$$\Delta N < N_{\rm COBE} - 10$$ (268)
$$\phantom{\Delta N} < 38 + \ln\left(V^{1/4}/10^{10}\,{\rm GeV}\right).$$ (269)
Retaining the estimate of the previous paragraph for the e-folds of inflation before the mass stops running, the constant $\sigma$ to be used in Eq. (257) will be in the range
$$c \lesssim \sigma \lesssim e^{c\,\Delta N}.$$ (270)
After imposing observational constraints [62,63], one finds that $e^{c\,\Delta N}$ is no more than one or two orders of magnitude above unity.

Model (ii); a maximum with $\phi > \phi_*$. Like the previous model, this one corresponds to $m^2(M_{\rm P}) < 0$ and $c > 0$, but now $\sigma < 0$ and $\phi$ increases during inflation. The spectral index is less than 1, and decreases as the scale decreases. In contrast to the previous case, inflation can end without any need for a hybrid inflation mechanism, or a change in the form of the potential, Eq. (248), if the minimum at $\phi > \phi_*$ is the true vacuum. If the form Eq. (248) holds until $\phi$ reaches the value $\phi_{\rm fast}$ defined by $\eta(\phi_{\rm fast}) = -1$, slow roll inflation will end there. To leading order in $c$ this corresponds to
$$m^2(\phi_{\rm fast}) = -V_0/M_{\rm P}^2.$$ (271)
Setting $\phi_{\rm end} = \phi_{\rm fast}$, and using the crude linear approximation one finds $\phi_{\rm fast} \sim e^{1/c}\,\phi_* \sim M_{\rm P}$, and $\sigma \sim -1$. On the other hand, slow-roll inflation might end at some earlier point $\phi_{\rm end}$. In the true-vacuum case illustrated in Fig. 6, this may happen through a steepening in the form of $V(\phi)$. Otherwise it may happen through an inverted hybrid inflation mechanism. In both cases, we expect $c \lesssim |\sigma| \lesssim 1$.

In contrast with the previous model, this one also makes sense if $m^2$ stops running (as $\phi$ decreases) before it changes sign; in other words, if it stops running at some point where $m^2 < 0$, but very small. In this case the maximum of the potential is at the origin and $\eta$ is small and constant up to $\phi = 0$. The above treatment remains valid if $m^2$ has started to run before cosmological scales leave the horizon (remember that in this model, $\phi$ increases during inflation). Otherwise, one has a different model that we shall not consider.

Model (iii); a minimum with $\phi < \phi_*$. This corresponds to $m^2(M_{\rm P}) > 0$, $c < 0$ and $\sigma < 0$, and $\phi$ increases during inflation. The spectral index can be either above or below 1, and it increases as the scale decreases. Now $|m^2|$ decreases during inflation, and slow-roll inflation ends only when the potential Eq. (248) ceases to hold at some value $\phi = \phi_{\rm c}$. In a single-field model, corresponding to $V$ vanishing at the minimum, this can occur through a steepening of the form of the tree-level potential, as higher powers of $\phi$ become important. Alternatively, if $V$ is positive at the minimum it can occur through a hybrid inflation mechanism (inverted hybrid inflation). To estimate $\sigma$ in this case, suppose first that (as $\phi$ decreases) the mass continues to run until $m^2 = -V_0/M_{\rm P}^2$, and denote the point where this happens by $\phi_{\rm fast}$. Slow roll inflation can then only occur in the regime $\phi \gtrsim \phi_{\rm fast}$. It follows that
$$\phi_{\rm c} \gtrsim \phi_{\rm fast},$$ (272)
and the linear approximation $\phi_{\rm fast} \sim e^{-1/|c|}\,\phi_*$ then gives $|\sigma| \lesssim 1$. As before $|\sigma| \gtrsim |c|$ is required to avoid the fine-tuning $\ln(\phi_*/\phi_{\rm c}) \ll 1$. This estimate of $\phi_{\rm fast}$ assumes that quartic and higher terms in the tree-level potential are negligible at $\phi_{\rm fast}$. Assuming that only one such term is significant, one easily checks that the estimate is roughly correct, unless the dimension of the term is extremely large. We do not consider that case, or the case where more than one term is significant.

(Footnote: Stewart [293] took the view that models (iii) and (iv) require a fine-tuning of $\phi_{\rm c}$ over the whole range of parameter space. As with all views on fine-tuning, this is a matter of taste.)
If the mass stops running at some point, with $|m^2| \ll V_0/M_{\rm P}^2$ there, inflation can begin at arbitrarily small field values. If cosmological scales start to leave the horizon only after the mass has started to run, Eq. (272) still applies and the estimate for $\sigma$ is unchanged. We do not consider the opposite case.

Model (iv); a minimum with $\phi > \phi_*$. Like the previous case this one corresponds to $m^2(M_{\rm P}) > 0$ and $c < 0$, but now $\sigma > 0$ and $\phi$ decreases during inflation. The spectral index is bigger than 1, and it decreases as the scale decreases. Everything is the same as in the previous case, except that a hybrid inflation mechanism will definitely be needed to end inflation, since higher-order terms in $\phi$ can hardly become more important as $\phi$ decreases. We again expect $|c| \lesssim \sigma \lesssim 1$, with the lower limit needed to avoid the fine-tuning $\ln(\phi_{\rm c}/\phi_*) \ll 1$. Like Model (iii), this one can still make sense if the mass stops running before $\phi_*$ is reached. The above treatment applies if cosmological scales leave the horizon while the mass is still running. We do not consider the opposite case.

6.16.3. Observational constraints

In this model, the spectral index can change very significantly on cosmological scales. The usual constraint $|n - 1| < 0.2$ may therefore not apply, but as a crude procedure [63] one can impose this constraint at both $N_{\rm COBE}$ and $N_{\rm COBE} - 10$. In all four models one finds a viable range of parameter space.

6.17. The spectral index as a discriminator

The point of contact with observation is the spectral index $n(k)$. The Planck satellite will measure it with an accuracy $\Delta n \sim 0.01$ over a range $\Delta\ln k \simeq 6$, and will measure $dn/d\ln k$ if it exceeds a few times $10^{-3}$. Let us summarise the predictions of the various models, and see how well the Planck measurement will discriminate between them.

In most models of inflation, the potential is of the form $V(\phi) = V_0 + \cdots$, with the constant first term dominating and $\phi \lesssim M_{\rm P}$. With certain qualifications stated in the text (notably a requirement $\phi \ll M_{\rm P}$ that needs to be imposed in certain cases) the spectrum of the gravitational waves is too small ever to observe. With similar qualifications, the spectral index for various models is shown in Tables 1 and 2, along with its scale dependence $dn/d\ln k$.

The simplest cases are $V = V_0 \pm \frac{1}{2}m^2\phi^2$, which give a scale-independent spectral index that may or may not be close to 1. Next in simplicity come the cases $V = V_0(1 - c\phi^p)$. Here $p$ can be an integer $\geq 3$, corresponding to self-coupling of the inflaton at tree-level, or it can be in the ranges $2 < p < \infty$ or $-\infty < p < 1$ (not necessarily an integer) corresponding to mutated hybrid inflation. Related to these, as far as the prediction is concerned, are the cases $V = V_0(1 - e^{-q\phi})$ (Section 6.6), which corresponds to $p \to -\infty$, and $V = V_0(1 + c\,\ln(\phi/Q))$ (Section 6.15), which corresponds to $p \to 0$. In all these cases the predictions are
$$\frac{1}{2}(n-1) = -\frac{p-1}{p-2}\,\frac{1}{N},$$ (273)
$$\frac{1}{2}\frac{dn}{d\ln k} = -\frac{p-1}{p-2}\,\frac{1}{N^2}.$$ (274)
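The entries of Table 1 follow directly from Eqs. (273) and (274). As a minimal sketch (the function names are hypothetical, and the limiting cases $p \to 0$ and $p \to \pm\infty$ are handled by taking the limit of $(p-1)/(p-2)$ by hand), the following reproduces the tilt and the running for a given $p$ and $N$, and checks the rewriting of the running in terms of $n - 1$ given in Eq. (275).

```python
import math


def slope_factor(p: float) -> float:
    """The factor (p-1)/(p-2); it tends to 1 as p -> +/- infinity."""
    if math.isinf(p):
        return 1.0
    return (p - 1.0) / (p - 2.0)


def half_n_minus_1(p: float, n_efolds: float) -> float:
    """Eq. (273): (n-1)/2 = -((p-1)/(p-2)) / N."""
    return -slope_factor(p) / n_efolds


def half_running(p: float, n_efolds: float) -> float:
    """Eq. (274): (1/2) dn/dlnk = -((p-1)/(p-2)) / N^2."""
    return -slope_factor(p) / n_efolds**2


if __name__ == "__main__":
    # Row order of Table 1: p -> 0, p = -2, p -> +/- infinity, p = 4, p = 3.
    for p in (0.0, -2.0, math.inf, 4.0, 3.0):
        tilt = -2.0 * half_n_minus_1(p, 50)          # 1 - n at N = 50
        run = -2.0e3 * half_running(p, 50)           # -10^3 dn/dlnk at N = 50
        print(p, round(tilt, 3), round(run, 2))
```

Since both quantities scale with the same factor, the running is fixed once $n - 1$ and $p$ are known, which is why a measurement of both would discriminate between values of $p$.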
Table 1. Predictions for the spectral index $n$ and its variation $dn/d\ln k$ are displayed for some potentials of the form $V_0(1 - c\phi^p)$ that are discussed in the text. The variation will be detectable by Planck if $|dn/d\ln k| \gtrsim 2.0\times10^{-3}$. The case $p \to 0$ corresponds to the potential $V_0(1 - c\ln\phi)$, and the case $p \to -\infty$ corresponds to $V_0(1 - e^{-q\phi})$.

p                  1-n (N=50)    1-n (N=20)    -10^3 dn/dlnk (N=50)    -10^3 dn/dlnk (N=20)
p -> 0             0.02          0.05          (0.4)                   2.6
p = -2             0.03          0.075         (0.6)                   3.8
p -> +/-infinity   0.04          0.10          (0.8)                   5.0
p = 4              0.06          0.15          (1.2)                   5.4
p = 3              0.08          0.20          (1.6)                   10.0
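Table 2. Predictions for the spectral index $n(k)$. The wavenumber $k$ is related to the number of e-folds $N$ by $d\ln k = -dN$. Constants $c$, $q$ and $Q$ are positive while $p$ and $\sigma$ can have either sign. In the first three cases, there is a theoretical constraint $|c| \ll 1$. In the second case, one expects $|\sigma| \gtrsim |c|$.

Comments: Mass term
    $V(\phi)/V_0 = 1 \pm \frac{1}{2}c\,\phi^2/M_{\rm P}^2$
    $\frac{1}{2}(n-1) = \pm c$
    $\frac{1}{2}\,dn/d\ln k = 0$

Comments: Softly broken susy
    $V(\phi)/V_0 = 1 \pm \frac{1}{2}c\,(\phi^2/M_{\rm P}^2)\ln(\phi/Q)$
    $\frac{1}{2}(n-1) = \pm c + \sigma e^{-cN}$
    $\frac{1}{2}\,dn/d\ln k = \mp c\sigma e^{-cN}$

Comments: Spont. broken susy
    $V(\phi)/V_0 = 1 + c\,\ln(\phi/Q)$
    $\frac{1}{2}(n-1) = -\frac{1}{2N}$
    $\frac{1}{2}\,dn/d\ln k = -\frac{1}{2}\frac{1}{N^2}$

Comments: Various models
    $V(\phi)/V_0 = 1 - e^{-q\phi}$
    $\frac{1}{2}(n-1) = -\frac{1}{N}$
    $\frac{1}{2}\,dn/d\ln k = -\frac{1}{N^2}$

Comments: $p$ integer $\leq -1$ (dyn. s. b.) or $\geq 3$ (self-coupling)
    $V(\phi)/V_0 = 1 + c\phi^p$
    $\frac{1}{2}(n-1) = \frac{p-1}{p-2}\,\frac{1}{N_{\rm tot} - N}$
    $\frac{1}{2}\,dn/d\ln k = -\frac{p-2}{p-1}\left(\frac{n-1}{2}\right)^2$

Comments: $p > 2$ or $-\infty < p < 1$ (self-coupling or hybrid)
    $V(\phi)/V_0 = 1 - c\phi^p$
    $\frac{1}{2}(n-1) = -\frac{p-1}{p-2}\,\frac{1}{N}$
    $\frac{1}{2}\,dn/d\ln k = -\frac{p-1}{p-2}\,\frac{1}{N^2}$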
The second expression can be written
$$\frac{1}{2}\frac{dn}{d\ln k} = -\frac{p-2}{p-1}\left(\frac{n-1}{2}\right)^2.$$ (275)
Excluding the cases $p \simeq 1$ and $p \simeq 2$, the factor $(p-1)/(p-2)$ is of order 1. As a result, $(n-1)$ is far enough below zero to be eventually observable. The scale dependence will probably be too small to measure if $N$ is around 50, but should be observable if $N$ is significantly smaller.

Next consider the case $V = V_0(1 + c\phi^p)$ with $p$ an integer $\geq 3$ (tree-level self-coupling) or $\leq -1$ (dynamical symmetry breaking). In these cases there is a maximum possible number of e-folds of inflation, whose value is unknown. If it is not too big, $n - 1$ may be far enough above zero to eventually detect. The scale dependence is given by Eq. (275), and will be observable if $|n - 1|$ is more than a few times 0.01. Note that in these models, it is (more than usually) unclear how the inflaton is supposed to arrive at the inflationary part of the potential.

Finally, we come to the case of a running inflaton mass (Section 6.16). This gives a distinctive prediction for the shape of $n$, and in contrast with the other models the predicted magnitude of $dn/d\ln k$ can be of order $(n-1)$.
7. Supersymmetry

7.1. Introduction

In the last section we looked at some 'models' of inflation, taken to mean forms for the inflationary potential that look reasonable from the viewpoint of particle theory. Now we go deeper, taking on board present ideas about what might lie beyond the Standard Model. The eventual goal is to see whether deeper considerations favour one form of the potential over another. We begin by reviewing supersymmetry, which is the almost universally accepted framework for constructing extensions of the Standard Model. Supersymmetry can be formulated either as a global or a local symmetry. In the latter case it includes gravity, and is therefore called supergravity. Supergravity is presumably the version chosen by Nature.

7.2. The motivation for supersymmetry

It is widely accepted that the standard model of gauge interactions describing the laws of physics at the weak scale is extraordinarily successful. The agreement between theory and experimental data is very good. Yet, we believe that the present structure is incomplete. To mention only a few drawbacks: the theory has too many parameters, and it explains neither the fermion masses nor why the number of generations is three. It contains fundamental scalars, something difficult to reconcile with our current understanding of non-supersymmetric field theory. Finally, it does not incorporate gravity. It is tempting to speculate that a new (but yet undiscovered) symmetry, supersymmetry [247,131,154,19], may provide answers to these fundamental questions. Supersymmetry is the only framework in which we seem to be able to understand light fundamental scalars. It addresses the question of parameters: first, unification of gauge couplings works much better with than without supersymmetry; second, it is easier to attack questions such as fermion masses in supersymmetric theories, in part simply due to the presence of fundamental scalars. Supersymmetry seems to be intimately connected with gravity.
So there are a number of arguments that suggest that nature might be supersymmetric, and that supersymmetry might manifest itself at energies of order the weak interaction scale.

Is supersymmetry expected to play a fundamental role at the early stages of the evolution of the Universe and, more specifically, during inflation? The answer is almost certainly yes. For one thing, the mere fact that we are invoking scalar fields (the inflaton, and at least one other in the case of hybrid inflation) means that supersymmetry is involved. More concretely, the potential needs to be very flat in the direction of the inflaton, and supersymmetry can help here too. We noted earlier that supersymmetric theories typically possess many flat directions, in which the dangerous quartic term of the potential vanishes. It helps in a more general sense too. While the necessity of introducing very small parameters to ensure the extreme flatness of the inflaton potential seems very unnatural and fine-tuned in most non-supersymmetric theories, this technical naturalness may be achieved in supersymmetric models. Indeed, the nonrenormalization theorem guarantees that a fundamental object in supersymmetric theories, the superpotential, is not renormalized to all orders of perturbation theory [126]. In other words, the nonrenormalization theorems in unbroken, renormalizable global supersymmetry guarantee that we can fine-tune any parameter at the tree level and this fine-tuning will not be destabilized by radiative corrections at any order in perturbation theory. Therefore, inflation in the context of supersymmetric theories seems, at least technically speaking, more natural than in the context of non-supersymmetric theories.

7.3. The susy algebra and supermultiplets

We begin with some basics that apply to both global susy and supergravity. In the low-energy regime, phenomenology requires the type of supersymmetry known as $N = 1$ (one generator). This is usually assumed to be the case also in the higher-energy regime relevant during inflation (though see [108]).
In this section, we present some features of $N = 1$ supersymmetric theories that are likely to be relevant for inflation. The reader interested in more details is referred to the excellent introductions by Nilles [247], Bailin and Love [19] and Wess and Bagger [154]. Except where stated, we use the conventions of Wess and Bagger except that some of their symbols are replaced by more modern ones (for instance, the superpotential is denoted by $W$ instead of $P$). The basic supersymmetry algebra is given by
$$\{Q_\alpha, \bar Q_{\dot\beta}\} = 2\,\sigma^\mu_{\alpha\dot\beta}\,P_\mu,$$ (276)
where $Q_\alpha$ and $\bar Q_{\dot\beta}$ are the supersymmetric generators (bars stand for conjugate), $\alpha$ and $\beta$ run from 1 to 2 and denote the two-component Weyl spinors (quantities with dotted indices transform under the $(0, \frac{1}{2})$ representation of the Lorentz group, while those with undotted indices transform under the $(\frac{1}{2}, 0)$ conjugate representation). $\sigma^\mu$ is a matrix four vector, $\sigma^\mu = (-1, \boldsymbol{\sigma})$, and $P_\mu$ is the generator of spacetime displacements (four-momentum).

The chiral and vector superfields are two irreducible representations of the supersymmetry algebra containing fields of spin less than or equal to one. Chiral fields contain a Weyl spinor and a complex scalar; vector fields contain a Weyl spinor and a (massless) vector. In superspace a chiral superfield may be expanded in terms of the Grassmann variable $\theta$ [154]
$$\phi(x,\theta) = \phi(x) + \sqrt{2}\,\theta\psi(x) + \theta^2 F(x).$$ (277)
Here $x$ denotes a point in space–time, $\phi(x)$ is the complex scalar, $\psi$ the fermion, and $F$ is an auxiliary field. As in this expression, we shall generally use the same symbol to represent a superfield and its scalar component. Under a supersymmetry transformation with anticommuting parameter $\zeta$, the component fields transform as
$$\delta\phi = \sqrt{2}\,\zeta\psi,$$ (278)
$$\delta\psi = \sqrt{2}\,\zeta F + \sqrt{2}\,i\,\sigma^\mu\bar\zeta\,\partial_\mu\phi,$$ (279)
$$\delta F = -\sqrt{2}\,i\,\partial_\mu\psi\,\sigma^\mu\bar\zeta.$$ (280)
Here and in the following, for any generic two-component Weyl spinor $\lambda$, $\bar\lambda$ indicates the complex conjugate of $\lambda$.

For a gauge theory one has to introduce vector superfields and the physical content is most transparent in the Wess–Zumino gauge. In this gauge and for the simplest case of an abelian group $U(1)$, the vector superfield may be written as
$$V = -\theta\sigma^\mu\bar\theta\,A_\mu + i\,\theta^2\,\bar\theta\bar\lambda - i\,\bar\theta^2\,\theta\lambda + \frac{1}{2}\,\theta^2\bar\theta^2 D.$$ (281)
Here $A_\mu$ is the gauge field, $\lambda_\alpha$ is the gaugino, and $D$ is an auxiliary field. The analog of the gauge-invariant field strength is a chiral field:
$$W_\alpha = -i\lambda_\alpha + \theta_\alpha D - \frac{i}{2}\,(\sigma^\mu\bar\sigma^\nu\theta)_\alpha F_{\mu\nu} + \theta^2\,\sigma^\mu_{\alpha\dot\beta}\,\partial_\mu\bar\lambda^{\dot\beta},$$ (282)
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, and where $\bar\sigma^\mu = (-1, -\boldsymbol{\sigma})$. Regarding the supersymmetry transformations, let us just note that
$$\delta\lambda = i\,\zeta D + \zeta\,\sigma^\mu\bar\sigma^\nu F_{\mu\nu}.$$ (283)
Global supersymmetry is defined as invariance under these transformations with $\zeta$ independent of space–time position, and local supersymmetry (supergravity) as invariance with $\zeta$ depending on space–time position. In the latter case one has to introduce another supermultiplet containing the graviton and gravitino.

Global supersymmetry need not be renormalizable (Section 7.8). But the usual convention is that 'global supersymmetry' refers to a theory which is renormalizable, except possibly for the superpotential $W$ defined below. For the most part we follow that convention.

As we discuss in Section 7.8, global supersymmetry may be regarded as a limit of supergravity, in which roughly speaking gravity is made negligible by taking $M_{\rm P}$ to infinity. For most purposes it is a good approximation if the vevs of all relevant scalar fields and auxiliary fields are much less than $M_{\rm P}$. (Relevant here means that they have not been integrated out (Section 5.1).) There are however two notable exceptions.

In the true vacuum, global susy (whether renormalizable or not) would predict a large positive value for $V$, instead of the practically zero value observed in our Universe.
According to supergravity, a negative contribution of unknown magnitude should be subtracted from the global susy value. It is assumed that this value makes $V$ practically zero in the true vacuum, though one does not understand the origin of this exact cancellation. (This is called the cosmological constant problem.)

During inflation, the naive limit $M_{\rm P} \to \infty$ makes no sense [294], because as we saw in Section 3, $M_{\rm P}$ plays an essential role. The approximation of global supersymmetry can be justified only in special circumstances, by methods more subtle than simply taking $M_{\rm P}$ to infinity. As we shall see, this is a problem for inflation model-building, because a generic supergravity theory does not give a potential that is sufficiently flat for inflation. By contrast a generic globally supersymmetric theory works perfectly well.

(Footnote: One can also consider the fully non-renormalizable version of global susy, which includes a non-trivial Kähler potential and/or a non-trivial gauge kinetic function.)

(Footnote: At this point, let us make it clear that we are talking about the Kähler potential, and the gauge kinetic function, of the fundamental Lagrangian, giving the tree-level potential.)
7.4. The Lagrangian of global supersymmetry

We focus first on global susy, with the usual restriction that it be renormalizable except for possible non-renormalizable terms in the superpotential. To write down the action for a set of chiral superfields, $\phi_n$, transforming in some representation of a gauge group $G$, one introduces, for each gauge generator, a vector superfield, $V_a$. Defining the matrix $V = T^a V_a$, where $T^a$ are the Hermitian generators of the gauge group $G$ in the representation defined by the scalar fields, and excluding the possible Fayet–Iliopoulos term to be discussed later, the most general renormalizable Lagrangian, written in superspace, is then
$$\mathcal{L} = \sum_n \int d^2\theta\,d^2\bar\theta\;\phi_n^\dagger\,e^{V}\phi_n + \frac{1}{4k}\int d^2\theta\;W_\alpha^2 + \int d^2\theta\;W(\phi_n) + {\rm h.c.},$$ (284)
where in the adjoint representation ${\rm Tr}(T^a T^b) = k\,\delta^{ab}$ and $W(\phi_n(x,\theta))$ is a fundamental object known as the superpotential. The corresponding function of the scalar components $\phi_n(x)$, denoted by the same name and symbol, is a holomorphic function of the $\phi_n$. For simplicity, we shall pretend that there is a single gauge $U(1)$ interaction, with coupling constant $g$. This is adequate since such an interaction is the only one that we consider in detail. (To be precise, we consider a $U(1)$ with a Fayet–Iliopoulos term.) In the case of several $U(1)$'s, there are no cross-terms in the potential from the D-terms, i.e. $V_D$ is simply expressed as $\sum_n (V_D)_n$.

To write this down in terms of component fields, we need the covariant derivative
$$D_\mu = \partial_\mu - i g A_\mu.$$ (285)
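In terms of the component fields, the Lagrangian takes the form
$$\mathcal{L} = \sum_n\left(D_\mu\phi_n^* D^\mu\phi_n + i\,D_\mu\bar\psi_n\bar\sigma^\mu\psi_n + |F_n|^2\right) - \frac{1}{4}F_{\mu\nu}^2 - i\lambda\sigma^\mu\partial_\mu\bar\lambda + \frac{1}{2}D^2 + \frac{g}{2}\,D\sum_n q_n\,\phi_n^*\phi_n$$
$$\qquad - \left[\,i\,\frac{g}{\sqrt{2}}\sum_n \bar\psi_n\bar\lambda\,\phi_n - \frac{1}{2}\sum_{nm}\frac{\partial^2 W}{\partial\phi_n\,\partial\phi_m}\,\psi_n\psi_m + \sum_n F_n\,\frac{\partial W}{\partial\phi_n} + {\rm c.c.}\right].$$ (286)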
In the term linear in $D$, $q_n$ are the $U(1)$ charges of the fields $\phi_n$. The equations of motion for the auxiliary fields $F_n$ and $D$ are the constraints
$$F_n = -\left(\frac{\partial W}{\partial\phi_n}\right)^*,$$ (287)
$$D = -\frac{g}{2}\sum_n q_n\,|\phi_n|^2.$$ (288)
Eq. (286) contains the gauge-invariant kinetic terms for the various fields, which specify their gauge interactions. It also contains, after having made use of Eqs. (287) and (288), the scalar field potential,
$$V = V_F + V_D,$$ (289)
$$V_F \equiv \sum_n |F_n|^2,$$ (290)
$$V_D \equiv \frac{1}{2}D^2.$$ (291)
This separation of the potential into an F term and a D term is crucial for inflation model building, especially when it is generalized to the case of supergravity. The potential specifies the masses of the scalar fields, and their interactions with each other. The first term in the second line specifies the interactions of gaugino and scalar fields, while the second specifies the masses of the chiral fermions and their interactions with the scalars. All of these non-gauge interactions are called Yukawa couplings.

To have a renormalizable theory, $W$ is at most cubic in the fields, corresponding to a potential which is at most quartic. However, one commonly allows $W$ to be of higher order, producing the kind of potentials that were mentioned in Section 5.9.

From the above expressions, in particular Eq. (290), one sees that the overall phase of $W$ is not physically significant. An internal symmetry can either leave $W$ invariant, or alter its phase. The latter case corresponds to what is called an R-symmetry. Because $W$ is holomorphic, the internal symmetries restrict its form much more than is the case for the actual potential $V$. In particular, terms in $W$ of the form $\frac{1}{2}m\phi^2$ or $m\phi_1\phi_2$, which would generate a mass term $m^2|\phi|^2$ in the potential, are usually forbidden. As a result, scalar particles usually acquire masses only from the vevs of scalar fields (i.e., from the spontaneous breaking of an internal symmetry) and from supersymmetry breaking. The same applies to the spin-half partners of scalar fields, with the former contribution the same in both cases.

In the case of a $U(1)$ gauge symmetry, one can add to the above Lagrangian what is called a Fayet–Iliopoulos term [100],
!2m dh < .
An exception is the k term of the MSSM, kH H , which gives mass to the Higgs "elds. 3 "
(292)
This corresponds to adding a contribution $-\xi$ to the $D$ field, so that Eq. (288) becomes
$$D = -\frac{g}{2}\sum_n q_n|\phi_n|^2 - \xi . \tag{293}$$
The $D$ term of the potential therefore becomes
$$V_D = \frac12\left(\frac{g}{2}\sum_n q_n|\phi_n|^2 + \xi\right)^2 . \tag{294}$$
From now on, we shall use a more common notation, where $\xi$ and the charges are redefined so that
$$V_D = \frac12 g^2\left(\sum_n q_n|\phi_n|^2 + \xi\right)^2 . \tag{295}$$
This is equivalent to
$$D = -g\left(\sum_n q_n|\phi_n|^2 + \xi\right) . \tag{296}$$
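As a quick numerical check of the notation in Eqs. (295) and (296), the following minimal Python sketch evaluates $D$ and $V_D$ for a set of charged scalars and confirms that $V_D = D^2/2$. The function names and all sample values (charges, field values, $g$, $\xi$) are illustrative assumptions, not taken from the text.

```python
def charge_sum(xi, charges, fields):
    """Return sum_n q_n |phi_n|^2 + xi, the combination entering Eqs. (295)-(296)."""
    return sum(q * abs(phi) ** 2 for q, phi in zip(charges, fields)) + xi


def d_term(g, xi, charges, fields):
    """D = -g (sum_n q_n |phi_n|^2 + xi), Eq. (296)."""
    return -g * charge_sum(xi, charges, fields)


def v_d(g, xi, charges, fields):
    """V_D = (1/2) g^2 (sum_n q_n |phi_n|^2 + xi)^2, Eq. (295)."""
    return 0.5 * g ** 2 * charge_sum(xi, charges, fields) ** 2


# Hypothetical sample values in arbitrary units (assumptions for illustration only).
g, xi = 0.7, 0.01
charges = [1.0, -1.0]
fields = [0.05 + 0.02j, 0.1]

print("D   =", d_term(g, xi, charges, fields))
print("V_D =", v_d(g, xi, charges, fields))
```

With a single field of charge $-1$ and $|\phi|^2 = \xi$, the sketch returns $V_D = 0$, which is the familiar statement that a negatively charged field can cancel the Fayet–Iliopoulos term.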
A Fayet–Iliopoulos term may be present in the underlying theory from the very beginning, or may appear in the effective theory after some heavy degrees of freedom have been integrated out. It is particularly intriguing that an anomalous $U(1)$ symmetry is usually present in weakly coupled string theories [125]. (Anomalous in this context means that $\sum_n q_n \neq 0$.) In this case [84,17,85]
$$\xi = \frac{g_{\rm str}^2}{192\pi^2}\,{\rm Tr}\,Q\; M_{\rm P}^2 . \tag{297}$$
Here ${\rm Tr}\,Q = \sum_n q_n$, which is typically [103,166] of order 100. One expects the string-scale gauge coupling $g_{\rm str}^2$ (Section 7.9.3) to be of order $1$ to $10^{-1}$, making $\xi \simeq 10^{-1}$ to $10^{-2}\,M_{\rm P}^2$. In the context of the strongly coupled $E_8\otimes E_8$ heterotic string [141], anomalous $U(1)$ symmetries may appear and have a nonperturbative origin, related to the presence, after compactification, of five-branes in the five-dimensional bulk of the theory. There is, at the moment, no general agreement on the relative size of the induced Fayet–Iliopoulos terms on each boundary compared to the value of the universal one induced in the weakly coupled case [231,41].
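To get a feeling for the size of the Fayet–Iliopoulos term, Eq. (297) can be evaluated directly. The short Python sketch below does this in units of $M_{\rm P}$; the function name and the sample values of $g_{\rm str}^2$ are assumptions chosen for illustration, with ${\rm Tr}\,Q = 100$ as the typical value quoted above.

```python
import math


def fi_term_over_mp2(g_str_sq, tr_q):
    """xi / M_P^2 = g_str^2 Tr Q / (192 pi^2), following Eq. (297)."""
    return g_str_sq * tr_q / (192.0 * math.pi ** 2)


# Sample couplings are assumptions; Tr Q = 100 is the typical value quoted in the text.
for g_str_sq in (1.0, 0.1):
    ratio = fi_term_over_mp2(g_str_sq, 100.0)
    print(f"g_str^2 = {g_str_sq}: xi/M_P^2 = {ratio:.4f}, "
          f"sqrt(xi)/M_P = {math.sqrt(ratio):.3f}")
```

Since $\xi$ is linear in both $g_{\rm str}^2$ and ${\rm Tr}\,Q$, any other choice of these inputs simply rescales the result.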
The Fayet–Iliopoulos term is allowed by the gauge symmetry, unless the $U(1)$ is embedded in some non-Abelian group. $\xi = 0$ can be enforced by a charge conjugation symmetry which flips all $U(1)$ charges. Such a symmetry is possible in nonchiral theories.
7.5. Spontaneously broken global susy

7.5.1. The F and D terms

Global supersymmetry breaking may be either spontaneous or explicit. Let us begin with the first case. For spontaneous breaking, the Lagrangian is supersymmetric as given in the last subsection. But the generators $Q_\alpha$ fail to annihilate the vacuum. Instead, they produce a spin-half field, which may be either a chiral field $\psi_\alpha$ or a gauge field $\lambda_\alpha$. The condition for spontaneous susy breaking is therefore to have a non-zero vacuum expectation value for $\{Q_\alpha,\psi_\beta\}$ or $\{Q_\alpha,\lambda_\beta\}$. The former quantity is defined by Eq. (279), and the latter by Eq. (283). The quantities $\partial_\mu\phi$ and $F_{\mu\nu}$ contain derivatives of fields, and are supposed to vanish in the vacuum. It follows that susy is spontaneously broken if, and only if, at least one of the auxiliary fields $F_n$ or $D$ has a non-vanishing vev.
In the true vacuum, one defines the scale $M_S$ of global supersymmetry breaking by
$$M_S^4 = \sum_n |F_n|^2 + \frac12 D^2 , \tag{298}$$
or equivalently
$$M_S^4 = V . \tag{299}$$
(In the simplest case $D$ vanishes and there is just one $F_n$.)
When we go to supergravity, part of $V$ is still generated by the supersymmetry-breaking terms, but there is also a contribution $-3|W|^2/M_{\rm P}^2$. This allows $V$ to vanish in the true vacuum as is (practically) demanded by observation. During inflation, $V$ is positive so the negative term is smaller than the susy-breaking terms. In most models of inflation it is negligible. In any case, $V$ is at least as big as the susy-breaking term, so the search for a model of inflation is also a search for a susy-breaking mechanism in the early Universe.
Spontaneous symmetry breaking can be either tree level (already present in the Lagrangian) or dynamical (generated only by quantum effects like condensation). The spontaneous breaking in general breaks the equality between the scalar and spin-$\frac12$ masses, in each chiral supermultiplet. But at tree level the breaking satisfies a simple relation, which can easily be derived from the Lagrangian Eq. (286).
Ignoring mass mixing for simplicity, one finds in the case of symmetry breaking by an $F$ term,
$$\sum_n\left(m_{n1}^2 + m_{n2}^2 - 2m_{n{\rm f}}^2\right) = 0 . \tag{300}$$
Here $n$ labels the chiral supermultiplets, $m_{n{\rm f}}$ is the fermion mass while $m_{n1}$ and $m_{n2}$ are the scalar masses. In the case of symmetry breaking by a $D$ term, coming from a $U(1)$, the right-hand side of
More generally, if the mass-squared matrix is non-diagonal, the left-hand side of Eq. (300) is the supertrace ${\rm Str}\,M^2$ defined in Eq. (323).
Eq. (300) becomes $D\,{\rm Tr}\,Q$. But in order to cancel gauge anomalies, it is often desirable that ${\rm Tr}\,Q = 0$, which recovers Eq. (300).

7.5.2. Tree-level spontaneous susy breaking with an F term

Models of tree-level spontaneous susy breaking where only $F$ terms have vevs are called O'Raifeartaigh models. We consider them now, postponing until Section 7.6.1 the case of $D$-term susy breaking. The simplest O'Raifeartaigh model involves a single field $X$,
$$W = m^2 X + \cdots , \tag{301}$$
where the dots represent terms independent of $X$. The potential is given by $V = m^4 + \cdots$, and $|F_X| = m^2$; thus supersymmetry is broken for non-vanishing $m$. Some models of inflation invoke such a linear superpotential.
We shall encounter more complicated O'Raifeartaigh models for inflation later. At this point let us give the following example, which is probably of only pedagogical interest. It involves three singlet fields, $X$, $\phi$ and $Y$, with superpotential
$$W = \lambda_1 X(\phi^2 - \mu^2) + \lambda_2 Y\phi . \tag{302}$$
With this superpotential, the equations
$$-F_X^{*} = \frac{\partial W}{\partial X} = \lambda_1(\phi^2 - \mu^2) = 0, \qquad -F_Y^{*} = \frac{\partial W}{\partial Y} = \lambda_2\phi = 0 \tag{303}$$
are incompatible. Note that at this level not all of the fields are fully determined, since the equation
$$\frac{\partial W}{\partial\phi} = 0 \tag{304}$$
can be satisfied provided
$$2\lambda_1 X\phi + \lambda_2 Y = 0 . \tag{305}$$
This vacuum degeneracy is accidental and is lifted by quantum corrections. Since at least one of $\langle F_X\rangle$ and $\langle F_Y\rangle$ is non-vanishing, supersymmetry is broken at the tree level.

7.5.3. Dynamically generated superpotentials

It has been known for a long time that global, renormalizable supersymmetry may be dynamically broken in four dimensions [5,245]. There already exist excellent reviews of this subject and the reader is referred to [77,279,144,297,245,117] for more details. Several mechanisms have been proposed, but only two have so far been invoked for inflation model-building. These are a dynamically generated superpotential and a quantum moduli space, which we look at now, starting with the former.
In some cases, the dynamically generated superpotential occurs in a theory characterized by many classically flat directions. Typically, the potentials generated along these flat directions fall to zero at large values of the fields. These potentials, however, must be stabilized by some mechanism, and so far no compelling model has been proposed.
Alternatively, models are known in which supersymmetry is broken without flat directions [5] and without the need for complicated stabilization mechanisms. In some directions, non-perturbative effects
might raise the potential at small field values, while tree-level terms raise it at large values. If some $F$ term is non-zero in the ground state, supersymmetry is spontaneously broken. To provide an explicit example, let us consider the model discussed in [86] in which the tree-level terms are non-renormalizable. The gauge group is $SU(6)\times U(1)\times U(1)_m$ and the chiral superfields
are $A(15, 1, 0)$, $\bar F^{\pm}(\bar 6, -2, \pm1)$, $S^0(1, 3, 0)$ and $S^{\pm}(1, 3, \pm2)$. $U(1)_m$ is irrelevant for supersymmetry
breaking but may play the role of messenger hypercharge. The gauge symmetries forbid a cubic superpotential in the model. At the level of dimension "ve operators, the unique term allowed ="(1/M)AFM >FM \S, where M may be identi"ed with M . Along the SU(6) and ;(1) D-#at . directions the gauge symmetry is broken down to Sp(4). Gluino condensation at the scale K leads to a nonperturbative superpotential whose form follows uniquely from symmetry considerations: = "K/O, where O"FM >FM \AGHe AIJAKLAMN. Turning on the nonrenormalizable super G H IJKLMN potential lifts the #at directions and the value of the potential at the minimum turns out to be < &K/M and F-terms are of order of KM\ signalling the breaking of supersymmetry. A generic prediction of dynamical supersymmetry breaking models is the appearance of a superpotential =KK>O/ O, leading to a potential <( )"(KN>)/(" N"), where the index p and the scale K depend upon the underlying gauge group. 7.5.4. Quantum moduli spaces Recent developments have also shown that many supersymmetric theories may have other types of non-perturbative dynamics which lead to degenerate quantum moduli spaces of vacuum instead of dynamically generated superpotentials [77,279,144,297,145,147]. The quantum deformation of a classical moduli space constraint may lead to supersymmetry breaking. This happens because the patterns of breakings of global and gauge symmetries on a quantum moduli space may di!er from those on the classical moduli space and the quantum deformed constraint associated with the moduli space is inconsistent with a stationary superpotential. Indeed, moduli generally transform under global symmetries and there is a point on the classical moduli space at which all the "elds have zero vev and global symmetries are unbroken. However, at the quantum level points which are part of the classical moduli space may be removed. 
If tree-level interactions have vanishing potential, and vanishing auxiliary fields, only at points on the classical moduli space which are not part of the quantum deformed moduli space, supersymmetry is broken.
We consider the following simple example. The gauge theory considered is an $SU(2)$ gauge theory with matter consisting of four doublet chiral superfields $Q_I$, $\bar Q^J$, where $I, J = 1, 2$ are flavour indices. The theory also contains a singlet superfield $S$ and the superpotential reads
$$W = gS\left(Q_1\bar Q^1 + Q_2\bar Q^2\right) , \tag{306}$$
where $g$ is a Yukawa coupling constant. At the classical level, in the absence of this superpotential ($g = 0$), the space of vacua ($D$-flat directions) is parameterized by a set of complex fields consisting of $S$ plus the following 6 $SU(2)$ invariants (mesons and baryons)
$$M_I^{\,J} = Q_I\bar Q^J, \qquad B = \epsilon^{IJ}Q_IQ_J, \qquad \bar B = \epsilon_{IJ}\bar Q^I\bar Q^J . \tag{307}$$
The invariants are however subject to the constraint
$$\det M - \bar B B = 0 \tag{308}$$
so that in the end the space of vacua at $g = 0$ has complex dimension 6. In the presence of the superpotential, the classical moduli space has two branches:
(a) $S \neq 0$, with $M_I^{\,J} = B = \bar B = 0$. On this branch the quarks get a mass $\sim gS$ from the superpotential and the gauge symmetry is unbroken.
(b) $S = 0$, with non-zero mesons and baryons satisfying two constraints. One is Eq. (308), while the other is ${\rm Tr}\,M = 0$, which follows from $F_S = 0$. Here the gauge group is broken.
This moduli space is however reduced by quantum effects. In particular, a non-zero vacuum energy is generated along the $S \neq 0$ branch. This is established by considering the effective theory far away along $S \neq 0$. Here the quark fields get masses of order $S$ and decouple. The effective theory consists of the (free) singlet $S$ plus a pure $SU(2)$ gauge sector. The effective scale $\Lambda_L$ of the low-energy $SU(2)$ along this trajectory is given to all orders by the one-loop matching of the gauge couplings at the quarks' mass $gS$ and reads
$$\Lambda_L^3 = gS\Lambda^2 , \tag{309}$$
where $\Lambda$ is the scale of the original theory with massless quarks. In the pure $SU(2)$ gauge theory gauginos condense and an effective superpotential $\sim\Lambda_L^3$ is generated,
$$W_{\rm eff} = gS\Lambda^2 . \tag{310}$$
Thus
$$|F_S| = g\Lambda^2 \tag{311}$$
and supersymmetry is broken, with a vacuum energy density $|F_S|^2$ which is independent of $S$. As we mention later, it has been suggested [74] that $|S|$ is the inflaton.

7.6. Soft susy breaking

In the effective theory, which describes the interactions of the Standard Model particles and their superpartners at energies $\lesssim 1$ TeV, supersymmetry is taken to be broken explicitly. In order to preserve the theoretical motivation for supersymmetry (the absence of quadratic divergences and the naturalness of the theory) only certain 'soft' susy-breaking terms are allowed. These are
• Masses (and mass-mixing terms) for scalars, whose typical value will be denoted by $\tilde m$.
• Masses for gauginos, whose typical value will be denoted by $\tilde m_g$.
• Cubic terms in the scalar field potential, of the form $(A_{ijk}\phi_i\phi_j\phi_k + {\rm c.c.})$. The typical value of the couplings $A_{ijk}$ will be denoted by $A$.
There are no soft chiral fermion masses, nor any soft quartic terms. Both of these have their unbroken susy values; in particular, the quartic term vanishes in a flat direction of unbroken susy.
For susy to do its job one requires that the mass scales $\tilde m_g$, $\tilde m$ and $A$ are all $\lesssim 1$ TeV. The squark and slepton masses come almost entirely from the soft susy breaking (except for the stop), and to have escaped detection they have to be $\gtrsim 100$ GeV. So at least $\tilde m$ should be in the range roughly 100 GeV to 1 TeV.
The effective theory, with explicit soft susy breaking, describes only the 'visible' sector of the theory that consists of the fields possessing the Standard Model gauge interactions. In the full theory, spontaneous susy breaking is supposed to take place, but in a 'hidden' sector, consisting of
fields which do not possess the Standard Model gauge interactions. When the hidden sector is integrated out (footnote 38) one obtains the effective theory in the visible sector.
The spontaneous breaking is usually of the $F$-term type. Models are classified as 'gravity-mediated' if the interaction between the two sectors is only of gravitational strength, or as 'gauge-mediated' if it is stronger (usually involving a gauge interaction). In the gauge-mediated case, the entire theory including the mechanism of spontaneous susy breaking is supposed to be describable in terms of global susy. In the gravity-mediated case, the mechanism of spontaneous susy breaking is usually supposed to involve supergravity in an essential way, since that theory is anyhow needed to describe the interaction between the two sectors. (One is however free to suppose that in this case too, the mechanism of spontaneous susy breaking is describable in terms of global supersymmetry [265].)

7.6.1. Soft susy breaking from a D term

Before dealing with the gauge-mediated case, we look at a proposal [36,264,232,233,246,98,33,38,15,152] that invokes a $D$ term. The $D$ term comes from a $U(1)$ with a Fayet–Iliopoulos term, which is usually considered to have a stringy origin as described in Section 7.4. As we shall see, such a term has also been widely used for building models of inflation, but for now we are concerned with the true vacuum.
The hidden sector consists of two fields $\phi_\pm$. The part of the superpotential depending only on them is $W = m\phi_+\phi_-$. Ignoring the rest of the superpotential for the moment, the potential is
$$V = m^2\left(|\phi_+|^2 + |\phi_-|^2\right) + \frac{g^2}{2}\left(\sum_i q_i|\tilde Q_i|^2 + |\phi_+|^2 - |\phi_-|^2 + \xi\right)^2 . \tag{312}$$
The scalar fields of the visible sector are denoted by $\tilde Q_i$, and we shall see in a moment that they have masses of order $m$. Accordingly, we take $m \simeq (1$–$10)$ TeV, without enquiring into the origin of $m$. Let us consider the part of $V$ obtained by setting $\tilde Q_i = 0$. It is easy to see that its minimum breaks supersymmetry as well as the anomalous $U(1)$ gauge symmetry, with [264]
$$\langle\phi_-\rangle^2 = \xi - \frac{m^2}{g^2}, \qquad \langle\phi_+\rangle = 0 , \tag{313}$$
$$\langle F_{\phi_+}\rangle = m\left(\xi - \frac{m^2}{g^2}\right)^{1/2}, \qquad -g\langle D\rangle = m^2 . \tag{314}$$
If we parameterize m"eM , we have 1 2KeM and 1F >2KemM . In weakly coupled . . \ . ( string theory, e is given by Eq. (297) and is of order 10\ to 10\. Integrating out (footnote 38)
generates soft susy breaking mass terms of order $m$ for the scalar fields charged under $U(1)$,
$$\tilde m^2_{\tilde Q_i} = -g\,q_i\langle D\rangle = q_i m^2 . \tag{315}$$
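The vacuum values quoted around Eqs. (313) and (315) can be checked numerically. The Python sketch below minimises the potential of Eq. (312) along $\phi_+ = 0$, $\tilde Q_i = 0$, using $x = |\phi_-|^2$ as the variable, and compares with $\xi - m^2/g^2$; it then evaluates the visible-sector mass-squared $g^2 q_i(\xi - x)$ that follows from Eq. (312). The function names, the ternary-search minimiser and the sample parameter values are assumptions made for illustration.

```python
def potential(x, m, g, xi):
    """V of Eq. (312) at phi_+ = 0 and Q_i = 0, with x = |phi_-|^2."""
    return m ** 2 * x + 0.5 * g ** 2 * (xi - x) ** 2


def minimise(m, g, xi, steps=200):
    """Ternary search for the minimum of the convex function V(x) on [0, xi]."""
    lo, hi = 0.0, xi
    for _ in range(steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if potential(a, m, g, xi) < potential(b, m, g, xi):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def soft_mass_sq(q, x_min, g, xi):
    """Mass-squared of a visible scalar of charge q: g^2 q (xi - |phi_-|^2)."""
    return g ** 2 * q * (xi - x_min)


# Hypothetical parameters in arbitrary units, chosen so that xi > m^2 / g^2.
m, g, xi = 0.1, 0.7, 1.0
x_min = minimise(m, g, xi)
x_analytic = xi - m ** 2 / g ** 2

print("numerical  <phi_->^2 =", x_min)
print("analytic   <phi_->^2 =", x_analytic)
print("soft mass^2 for q=1  =", soft_mass_sq(1.0, x_min, g, xi), "vs m^2 =", m ** 2)
```

The agreement of the last line with $m^2$ illustrates why the soft masses are of order $m$ regardless of the size of $\xi$, as long as $\xi > m^2/g^2$.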
The charges $q_i$ are required to be positive to avoid color/charge breaking. Invariance under the anomalous $U(1)$ will require, therefore, that terms in the superpotential involving visible-sector fields with nonzero charges are multiplied by appropriate powers of $\phi_-/M_{\rm P}$ [232,233]. The term $\langle F_{\phi_+}\rangle$ will give a gravity-mediated contribution which is smaller by a factor $\epsilon$.
If $m$ is large enough and if the first two generations of squarks are (equally) charged under the $U(1)$, the harmful flavour-changing neutral currents (FCNCs) are suppressed and trilinear soft breaking mass terms are also suppressed by powers of $\epsilon$, so that large supersymmetric CP-violating phases pose no problem [264,232,233].
If the Fayet–Iliopoulos term has a stringy origin, it is directly proportional to $g_{\rm str}^2$, see Eq. (297). As such, it depends on the vacuum expectation value of (the real part of) the dilaton field $s$, $g_{\rm str}^2 = M_{\rm P}/({\rm Re}\,s)$ (see Section 7.9.3 for more details). It has been recently argued [16] that, within some particular mechanisms for stabilizing the dilaton in string theories, the supersymmetry breaking contribution to the soft masses of sfermions coming from the dilaton $F$ term always dominates over the $D$-term supersymmetry breaking contribution from the anomalous $U(1)$. However, other mechanisms for stabilizing the dilaton may not have this effect. For instance, if the dilaton is stabilized by the contributions to the superpotential, the dilaton $F$ term vanishes and the soft supersymmetry breaking mass terms come only from the $D$ term.
Finally, we would like to point out that the class of models with $D$-term supersymmetry breaking may have some problems on the cosmological side, as far as the dark matter abundance is concerned [114].

7.6.2. Gauge-mediated susy breaking

Global susy models involving only the $F$ term are called gauge-mediated models [78–81,9,244,73,18], because communication between the hidden and visible sectors is usually through a gauge interaction. A review of these models is given in Ref. [117].
The minimal gauge-mediated supersymmetry breaking models are defined by three sectors: (i) a hidden sector (often called a secluded sector in this context) that breaks supersymmetry; (ii) a messenger sector that serves to communicate the susy breaking to the standard model; and (iii) the standard model sector.
The minimal messenger sector consists of a single $5 + \bar 5$ of $SU(5)$ (to preserve gauge coupling constant unification), i.e. color triplets, $q$ and $\bar q$, and weak doublets $l$ and $\bar l$, with their interactions determined by the following superpotential:
$$W = \lambda_1 X\bar q q + \lambda_2 X\bar l l . \tag{316}$$
When the field $X$ acquires a vacuum expectation value for both its scalar and auxiliary components, $\langle X\rangle$ and $\langle F_X\rangle$, respectively, the fields $q \pm \bar q^{*}$ acquire squared masses $\lambda_1^2\langle X\rangle^2 \pm \lambda_1\langle F_X\rangle$, and similarly for the fields $l \pm \bar l^{*}$. This supersymmetry breaking in the messenger sector gives gaugino masses at one loop and scalar masses at two loops (with messengers and gauge bosons in the loops). At the scale $\langle X\rangle$, the gaugino masses are given approximately by
$$M_j(\langle X\rangle) = k_j\,\frac{\alpha_j(\langle X\rangle)}{4\pi}\,\Lambda, \qquad j = 1, 2, 3 , \tag{317}$$
where $\Lambda \equiv \langle F_X\rangle/\langle X\rangle$, $k_1 = 5/3$, $k_2 = k_3 = 1$ and $\alpha_j$ are the three standard model gauge couplings in Eq. (153). The scalar masses are given approximately by
$$\tilde m^2(\langle X\rangle) = 2\sum_j C_j k_j\left[\frac{\alpha_j(\langle X\rangle)}{4\pi}\right]^2\Lambda^2 , \tag{318}$$
where C "4/3 for color triplets, C "3/4 for weak doublets (and equal to zero otherwise) and C "> with >"Q!¹ . To have squarks and gaugino masses of order 100 GeV, we need K,1F 2/1X2&10 GeV . (319) 6 Because the scalar masses are functions of only the gauge quantum numbers, the #avourchanging-neutral-current processes are naturally suppressed in agreement with experimental bounds. The reason for this suppression is that the gauge interactions induce #avour-symmetric supersymmetry-breaking terms in the visible sector at the scale 1X2 and, because this scale is usually much smaller than the Planck scale, only a slight asymmetry is introduced by renormalization group extrapolation to low energies. This is in contrast to the supergravity scenarios where one generically needs to invoke additional #avor symmetries to achieve the same goal. Notice that there is no need to have (1F 2&1X2. The only requirement is Eq. (319), and the 6 hierarchy K;(1F 2;1X2 6
(320)
is certainly allowed [265,267]. In fact, (1F 2 can take any value between 10, and 10 to 6 10 GeV. This corresponds to 10:1X2:10 to 10 GeV. If the upper bound is saturated, the gravity-mediated susy breaking that is always present (Section 7.10) becomes of the same order as the gauge-mediated susy breaking; if it were exceeded, gravity-mediated susy breaking would make the soft susy breaking parameters too big (<1 TeV). The upper bound is also required from considerations about nucleosynthesis [115]. To obtain the hierarchy (1F 2;1X2, one can suppose that nonrenormalizable operators are 6 involved, as in Section 5.9, or [265] that X has a soft susy breaking mass which runs, as in Section 5.6.2. In the latter case, the mass may come from supergravity corrections. Alternatively, mJ may 6 receive contribution from one-loop Yukawa interactions. To illustrate this idea, we can consider the following toy model: ="j AWM W#B(WM W#j U>U\#j B) , (321) where A and B are singlets, U! have charge $1 under a messenger ;(1) and WM and W are charged under some gauge group G. We assume that some susy breaking occurs in a hidden sector dynamically and is transmitted directly to U! via the messenger ;(1) resulting in a negative mass squared m for these two states. Minimizing the potential, one can show that there is a #at direction represented by X,j A#B whose VEV is undetermined at the tree-level and that supersymmetry is broken with F "(m/j )1/(2!j /3j ). mJ gets a one-loop contribution proportional to jm 6 6 through the Yukawa interaction ="j BU>U\. Arguably, the cosmological constant problem is worse in the case of gauge-mediated susy breaking, than in the gravity-mediated case. To achieve the (practically) vanishing potential that is required by observation, the global supersymmetry result <" "= " must be cancelled by a term L !3"="/M in the full supergravity theory. But if = is dominated by the sector of the theory . 
responsible for gauge-mediated susy breaking, one will typically have "= "&"="/" " with L L " ";M . The conclusion is that "=" must come from some other sector of the theory, or else be L .
identified with the constant $W_0$ in the expansion of $W$ (Eq. (331)), which might perhaps come from a string theory. In contrast, with gravity-mediated susy breaking the sector of the theory responsible for susy breaking usually gives $|W|^2$ of the right order, because the relevant $\phi_n$ are usually of order $M_{\rm P}$.

7.7. Loop corrections and running

This is a good place to discuss the loop corrections in more detail.
Perhaps the most convincing reason for believing in supersymmetry is its solution to the hierarchy problem [296]. In a theory where the largest interesting energy scale is the Planck mass or unification scale, light fundamental scalars (like a single Higgs doublet) get quadratically divergent contributions to their masses via one-loop diagrams where other heavy scalar or gauge fields are running in the loop. The scalar mass is given by $m_\phi^2 = (m_\phi^2)_{\rm tree} + c\Lambda_{\rm UV}^2$, where $(m_\phi^2)_{\rm tree}$ is the tree-level mass term, $\Lambda_{\rm UV}$ is the ultraviolet cutoff scale of the theory, to be identified with some extremely large scale, and $c$ is a loop suppression factor. The Higgs mass can only be small if there is a delicate fine-tuning between classical and quantum effects. The only known symmetry which can suppress the quadratically divergent corrections is supersymmetry. Indeed, the way supersymmetry works is to cancel the leading $\Lambda_{\rm UV}^2$ contribution by adding extra degrees of freedom into the game. The cancellation works because the number of degrees of freedom is basically doubled in a supersymmetric theory: each spin 0 or 1 field is accompanied by its fermionic partner. This amounts to adding an extra contribution to $m_\phi^2$ which is equal in magnitude, but opposite in sign, to the original one. The cancellation is exact in the limit of exact supersymmetry.

7.7.1. One-loop corrections

Let us address this issue more formally and imagine one is interested in the computation of the one-loop effective potential $V_{\rm 1\text{-}loop}(\phi)$ of a given scalar field $\phi$ of the supersymmetric theory. In the dimensional reduction with modified minimal subtraction ($\overline{\rm DR}$) scheme of renormalization, it reads
$$V_{\rm 1\text{-}loop}(\phi) = \frac{Q^2}{32\pi^2}\,{\rm Str}\,M^2(\phi) + \frac{1}{64\pi^2}\,{\rm Str}\left[M^4(\phi)\left(\ln\frac{M^2(\phi)}{Q^2} - \frac32\right)\right] , \tag{322}$$
where $M^2(\phi)$ is the field-dependent mass-squared matrix for the particles contributing to the loop correction. These particles will in general have spins $j = 0, 1/2$ or $1$, and the supertrace is defined as
$${\rm Str}\,A = \sum_j (-1)^{2j}(1 + 2j)\,{\rm Tr}_j A . \tag{323}$$
Here, $A$ denotes either $M^2$ or the square bracket, and ${\rm Tr}_j A$ is the ordinary trace for particles of spin $j$.
The scale $Q$ is the renormalization scale, at which all the parameters (masses, gauge and Yukawa couplings, etc.) entering the tree-level and the one-loop potential Eq. (322) must be evaluated. In Eq. (322) we have explicitly written the quadratically divergent piece proportional to ${\rm Str}\,M^2$. In non-supersymmetric theories this term is field dependent and is the source of the divergent corrections to the squared mass $m_\phi^2$. On the contrary, in supersymmetric (and anomaly free) theories, this term is independent of the fields and proportional to the soft breaking masses of the
fields contributing to the effective potential. It therefore contributes only to the cosmological constant, and we drop it, giving
$$V_{\rm 1\text{-}loop}(\phi) = \frac{1}{64\pi^2}\,{\rm Str}\left[M^4(\phi)\left(\ln\frac{M^2(\phi)}{Q^2} - \frac32\right)\right] . \tag{324}$$
With unbroken supersymmetry, the loop correction vanishes, and the tree-level scalar potential of the field $\phi$ is not renormalized at all (in particular, there is no one-loop contribution to the squared mass $m_\phi^2$). Notice that, in the case of a global supersymmetric theory, this property is true at any order of perturbation theory as a result of the nonrenormalization theorem.
If supersymmetry is broken, the supertrace as well as the one-loop potential usually no longer vanish. As an example, we consider a simple situation that can give Eq. (156) and Eq. (157). The loop correction comes from a single complex field $\psi$ (with masses $m_1$ and $m_2$ for the real and imaginary parts) and its fermionic partner (with mass $m_f$). The interaction is supposed to be $\frac12\lambda\phi^2|\psi|^2$. When $\phi$
(taken to be real) is much bigger than the masses the total loop correction is
1 1
m# j !2 m# j ln . (325) G D 2 2 Q G The coe$cient of vanishes by virtue of the supersymmetry. Two cases commonly arise for the other terms. The "rst case occurs when there is soft susy breaking in the relevant sector, with zero (or negligible) fermion masses. Then the quadratic term dominates and one has 1 *
*
ln . (327) *
The one-loop correction, for a given particle in the loop, was displayed in Eq. (324). If $\phi$ is much larger than any relevant mass scales, the typical contribution to $M^2$ will be of order $\phi^2$ (the only relevant scale). As a result, the loop correction will vanish for some choice $Q \sim \phi$. The potential is then given just by the tree-level contribution,
$$V(\phi) \simeq V_{\rm tree}(\phi, Q = c\phi) , \tag{329}$$
where the coefficient $c \sim 1$ depends upon the details of the theory.
7.8. Supergravity

So far we have considered global supersymmetry, taken to be renormalizable except possibly for terms in the superpotential. In the usual context of collider physics, particle detectors and astrophysics, this is adequate for most purposes. But during inflation one needs to consider supergravity, which contains within it the most general non-renormalizable version of global susy.
A non-renormalizable field theory is an effective one, valid below some ultra-violet cutoff $\Lambda_{\rm UV}$. With all of the fields and interactions in Nature included, $\Lambda_{\rm UV}$ is generally identified with $M_{\rm P}$ (Section 5.1), and we shall do this in the end. But for clarity of exposition we initially leave $\Lambda_{\rm UV}$ unspecified.

7.8.1. Specifying a supergravity theory

In Section 7.3 we defined the chiral and gauge supermultiplets, and their supersymmetry transformations. These formulas remain valid in supergravity, but the Lagrangian is different.
In addition to the superpotential $W$ one now needs two more functions. These are the Kähler potential $K$, and the gauge kinetic function $f$. Both $W$ and $f$ are holomorphic functions of the complex scalar fields, but the real function $K$ is not holomorphic; it is regarded as a function of the fields and their complex conjugates. Only the combination
$$G \equiv M_{\rm P}^{-2}K + \ln\frac{|W|^2}{M_{\rm P}^6} \tag{330}$$
is physically significant. So we have invariance under the Kähler transformation $M_{\rm P}^{-2}K \to M_{\rm P}^{-2}K - X - \bar X$, $W \to e^{X}W$, where $X$ is any holomorphic function of the fields.
We shall adopt the following conventions [154]. The scalar components $\phi^n$ and auxiliary components $F^n$ of chiral supermultiplets are labelled by a superscript. A subscript $n$ denotes $\partial/\partial\phi^n$, and a subscript $n^*$ denotes $\partial/\partial\phi^{n*}$. (Note that $K_{nm^*} = M_{\rm P}^2 G_{nm^*}$.) Occasionally one lowers components, $\phi_n \equiv K_{nm^*}\phi^{m*}$ and $F_n \equiv K_{nm^*}F^{m*}$; the inverse matrix of $K_{nm^*}$, which raises components, is denoted by $K^{m^*n}$. A summation over repeated indices is implied.
We first consider the expansion of $W$, $K$ and $f$ about a suitable origin in field space. It may be chosen to be the position of the vacuum or, in the case of matter fields, to be the fixed point of the symmetries.

Presumably, there are also functions specifying terms involving second and higher spacetime derivatives. It is reasonable to suppose that such terms are negligible compared with the kinetic term unless the spacetime derivatives are of order 1 in Planck units. But then field theory will break down anyway.
The superpotential: We already considered the superpotential, in the context of global susy. Since it is holomorphic in the fields, it is of the form
$$W = W_0 + \Lambda^2 W_1(\phi^n) + m W_2(\phi^n) + W_3(\phi^n) + \sum_{d\geq4}\Lambda_{\rm UV}^{3-d}\,W_d(\phi^n) . \tag{331}$$
Each quantity $W_d$ is the sum of dimension $d$ terms; in other words, it is a sum of terms, each of which is a product of $d$ fields times a coefficient. For the non-renormalizable terms ($d \geq 4$), the coefficient is expected to be of order 1, unless it is forbidden by internal symmetries.
As we noted in Section 7.4, $W$ is strongly constrained by internal symmetries, because it is holomorphic. For a given field, if one starts with an expression in which the field only occurs at low order, one can forbid additional terms up to a finite order by imposing a discrete $Z_N$ symmetry, and one can forbid additional terms up to all orders by imposing a continuous symmetry. However, in the case of a gauge singlet the continuous symmetry would have to be global, and as we noted in Section 5.3.4 global continuous symmetries do not seem to exist in string theory. Therefore, in the case of a gauge singlet, it may be unreasonable to forbid additional terms to all orders. As we shall see, this is a problem for models of inflation where the inflaton field has a value of order $M_{\rm P}$.
The Kähler potential: The Kähler potential determines the kinetic terms of the scalar fields, according to the formula
$$\mathcal{L}_{\rm kin} = (\partial_\mu\phi^{n})\,K_{nm^*}\,(\partial^\mu\phi^{m*}) . \tag{332}$$
It is a function of the fields and their complex conjugates, and can be chosen to have the expansion
$$K = K_{nm^*}\phi^n\phi^{m*} + \sum_d \Lambda_{\rm UV}^{2-d}\,K_d(\phi^n, \phi^{n*}) , \tag{333}$$
where $K_{nm^*}$ is evaluated at the origin. For simplicity we have assumed that any constant or linear term has been absorbed into the superpotential by a Kähler transformation, which is always possible. One can choose the scalar fields to be canonically normalized at the origin, corresponding to $K_{nm^*} = \delta_{nm}$.
As in the previous expression, each $K_d$ is a sum with each term in the sum a product of $d$ fields, times a coefficient which is expected to be of order 1 unless it is forbidden by a symmetry. As $K$ is not holomorphic, symmetries do not constrain it very strongly. It can, for instance, be an arbitrary function of the $|\phi^n|^2$, and the coefficient of a monomial built out of such terms will generically be of order 1. As we shall see, this is a problem for inflation model-building.
The gauge kinetic function: The gauge kinetic function determines the kinetic terms of the gauge and gaugino fields. One can choose them to be canonically normalized when the scalar fields are at the origin, which corresponds to
$$f = 1 + \sum_d \Lambda_{\rm UV}^{-d}\,f_d(\phi^n) . \tag{334}$$
As is the case with $W$, symmetries powerfully constrain the form of $f$ because it is holomorphic. We need to consider $f$ because it appears in the scalar field potential.
7.8.2. The scalar potential and spontaneously broken supergravity

Supergravity can be broken only spontaneously, not explicitly like global susy. The transformation equations, Eqs. (279) and (283), hold in supergravity, so it remains true that the condition for spontaneous breaking is a non-vanishing vev for one or more of the auxiliary fields $F^n$ and $D$. In contrast with the case of global susy, the vevs of $F^n$ and $D$ can receive contributions from fermion condensates as well as from scalar fields. A favoured possibility for susy breaking (in the vacuum) is gaugino condensation, but as discussed later one can in that case add an effective non-perturbative contribution to $W$ instead of including the condensate explicitly. Assuming that this has been done, the auxiliary fields are given by

$$D = -g\left(q_n K_n \phi^n + \xi\right) , \tag{335}$$

$$F^n = -e^{K/2M_P^2}\, K^{nm^*} \left(W_m + M_P^{-2} W K_m\right)^* . \tag{336}$$

The tree-level potential is given by

$$V = V_D + F^2 - 3 M_P^{-2} e^{K/M_P^2} |W|^2 , \tag{337}$$

where

$$V_D \equiv \tfrac{1}{2} ({\rm Re}\, f)^{-1} g^2 \left(q_n K_n \phi^n + \xi\right)^2 \tag{338}$$

and

$$F^2 \equiv F^n K_{nm^*} F^{m*} = F_m^{*}\, K^{m^*n} F_n \tag{339}$$

$$\quad = e^{K/M_P^2} \left(W_m + M_P^{-2} W K_m\right)^* K^{m^*n} \left(W_n + M_P^{-2} W K_n\right) . \tag{340}$$

In the second line, we defined $F_n \equiv K_{nm^*} F^{m*}$, and $K^{m^*n}$ is the inverse of the matrix $K_{nm^*}$.

As in global supersymmetry, $V_D$ is proportional to $D^2$, while $F^2$ is equal to $|F_n|^2$ if we choose $K_{nm^*} = \delta_{nm}$. The last term in Eq. (337) allows the true vacuum energy to vanish, as is (practically) demanded by observation. It is usual to define

$$V_F \equiv F^2 - 3 e^{K/M_P^2} M_P^{-2} |W|^2 \tag{341}$$

$$\quad = e^{K/M_P^2} \left[ \left(W_n + M_P^{-2} W K_n\right) K^{m^*n} \left(W_m + M_P^{-2} W K_m\right)^* - 3 M_P^{-2} |W|^2 \right] . \tag{342}$$

Then

$$V = V_D + V_F , \tag{343}$$

and one calls $V_F$ the F term even though it does not come only from the auxiliary fields $F_n$. Taking $M_P$ to infinity with $\Lambda_{\rm UV}$ fixed gives

$$V_F = W_n K^{m^*n} (W_m)^* . \tag{344}$$
The Kähler invariance of the first expression is guaranteed by the gauge invariance. Indeed, one can replace $K_n$ by $G_n$, because the gauge invariance requires $q_n W_n \phi^n = 0$.
We then have non-renormalizable global supersymmetry. Renormalizable global supersymmetry is obtained by taking $\Lambda_{\rm UV}$ to infinity as well.

The other possible limit is $\Lambda_{\rm UV} \to \infty$ with $M_P$ fixed. This is minimal supergravity, characterised by canonical kinetic terms. It has no motivation from string theory.

In the usually considered case that $\Lambda_{\rm UV}$ is identified with $M_P$, one simply says that (renormalizable) global supersymmetry is obtained from supergravity in the limit $M_P \to \infty$. From now on, we make this identification except where stated.

The scale of susy breaking in the true vacuum is denoted by $M_S$ and defined by

$$M_S^4 = F^2 + V_D . \tag{345}$$

An equivalent definition is

$$V = M_S^4 - 3 M_P^{-2} e^{K/M_P^2} |W|^2 . \tag{346}$$

Since $V$ (practically) vanishes in the true vacuum, this is equivalent to

$$M_S^4 = 3 M_P^{-2} e^{K/M_P^2} |W|^2 . \tag{347}$$

One can show that the gravitino mass is given by

$$M_S^4 = 3 m_{3/2}^2 M_P^2 . \tag{348}$$
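As a rough numerical illustration of the last relation, read here as $M_S^4 = 3 m_{3/2}^2 M_P^2$, the following minimal Python sketch converts between the susy breaking scale and the gravitino mass. The function names and the numerical value $M_P \simeq 2.4\times 10^{18}$ GeV for the reduced Planck mass are assumptions made for the illustration, not quantities fixed by the text above.

```python
import math

# Reduced Planck mass in GeV. Assumed value for illustration only.
M_P_GEV = 2.4e18


def gravitino_mass(m_s_gev, m_p_gev=M_P_GEV):
    """Gravitino mass m_{3/2} from M_S^4 = 3 m_{3/2}^2 M_P^2 (Eq. (348))."""
    return m_s_gev ** 2 / (math.sqrt(3.0) * m_p_gev)


def susy_breaking_scale(m32_gev, m_p_gev=M_P_GEV):
    """Inverse relation: M_S = (sqrt(3) m_{3/2} M_P)^(1/2)."""
    return math.sqrt(math.sqrt(3.0) * m32_gev * m_p_gev)


if __name__ == "__main__":
    # Hypothetical sample values of M_S, chosen only to show the scaling.
    for m_s in (1.0e10, 1.0e11):
        m32 = gravitino_mass(m_s)
        print(f"M_S = {m_s:.1e} GeV  ->  m_3/2 = {m32:.3g} GeV")
```

Because $m_{3/2}$ scales as $M_S^2$, raising $M_S$ by a factor of 10 raises the gravitino mass by a factor of 100.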
7.9. Supergravity from string theory One hopes that the Lagrangian describing "eld theory, will eventually be derivable from some more fundamental theory. Candidates under consideration at present include weakly coupled (heterotic) string theory [124] and Horava}Witten M-theory [199,141]. In this section we look at the form of supergravity predicted by weakly coupled string theory. Then we brie#y mention the case of M-theory, which has not so far been invoked for in#ation model-building. A crucial role is played by special "elds, namely the dilaton and the bulk moduli. The dilaton, usually denoted by s, speci"es the gauge coupling at the string scale, and the bulk moduli specifying the radii of the compacti"ed dimensions. (Weakly coupled strings live in nine space dimensions, so six of them have to be compacti"ed.) We consider the cases where there is just one bulk modulus t, and where there are three bulk moduli t'. For simplicity, we ignore the Green}Schwarz term needed to cancel the modular anomaly induced by "eld theory loop corrections, and initially we ignore the dilaton as well. In this section, we set M "1 unless otherwise stated. . 7.9.1. A single modulus t The simplest case corresponds to compacti"cation on a six-torus [168]. It should be regarded as a toy model, since it permits only one generation in the Standard Model. In units of the string scale (slightly below M , see Section 7.9.3) the radius of the six-torus is (2x)\ where . x,t#tH! " L" . L
(349)
Here $t$ is a bulk modulus, and $\phi^n$ are a subset of the matter fields, called the untwisted sector. (The other matter fields are said to belong to the twisted sector.) The Kähler potential derived from string theory is

$$K = -3 \ln x . \tag{350}$$
If we ignore the twisted sector, and assume that = is independent of t, Eq. (342) takes the remarkably simple form 3 (351) < " "= " . L $ x L It is assumed that the vacuum of the globally supersymmetric theory (minimum of its potential) is at = "<"0, corresponding to unbroken global supersymmetry. Then, the vacuum of the L supergravity theory is also at <"0, as is required by observation, but supersymmetry is now in general spontaneously broken. At the tree level under consideration here, the scale of supersymmetry breaking given by Eq. (347) is undetermined. (< in the vacuum is independent of x and therefore e) is undetermined.) This corresponds to what is called a no-scale supergravity theory [187]. Although supersymmetry is broken, the scalar masses given by this tree-level expression do not feel the e!ect of the breaking as is clear from the fact that the potential < has the same form as in global susy. The no-scale model is a consequence of the assumptions about =. In general one expects that = will depend on t, and the twisted sector may be important. We shall look at these issues in the next subsection, in the context of the more realistic model that has three bulk moduli. For future reference, we note that if the D term in Eq. (338) involves only the untwisted sector it is of the form 1 (352) < " (Re f )\g x\ q " L"#m , L " 2 L
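The simplification in the no-scale case rests on two algebraic properties of $K = -3\ln x$ with $x = t + t^* - |\phi|^2$: the identity $K_i K^{ij^*} K_{j^*} = 3$, which cancels the $-3|W|^2$ term of Eq. (342), and the vanishing of $K^{\phi j^*} K_{j^*}$, which removes the cross terms when $W$ does not depend on $t$. The sketch below checks both numerically for one modulus and a single untwisted field, in units $M_P = 1$. The function name and the sample field values are illustrative assumptions, and the derivatives of $K$ are written out by hand rather than taken from the text.

```python
def no_scale_check(t, phi):
    """Return (K_i K^{ij*} K_{j*}, K^{phi j*} K_{j*}) for K = -3 ln x.

    x = t + t* - |phi|^2, with fields ordered as (t, phi). Units M_P = 1.
    """
    x = 2.0 * t.real - abs(phi) ** 2
    if x <= 0.0:
        raise ValueError("the Kahler potential requires x > 0")
    pc = phi.conjugate()

    # First derivatives K_i = dK/d(field_i).
    k_i = [-3.0 / x, 3.0 * pc / x]

    # Kahler metric g[i][j] = d^2 K / d(field_i) d(field_j^*).
    g = [
        [3.0 / x ** 2, -3.0 * phi / x ** 2],
        [-3.0 * pc / x ** 2, 3.0 * (x + abs(phi) ** 2) / x ** 2],
    ]

    # Inverse metric K^{i j*}, defined by sum_j K^{i j*} g[k][j] = delta_{ik},
    # i.e. the inverse of the transpose of g (2x2 inverse done by hand).
    a, b = g[0][0], g[1][0]
    c, d = g[0][1], g[1][1]
    det = a * d - b * c
    k_inv = [[d / det, -b / det], [-c / det, a / det]]

    total = sum(
        k_i[i] * k_inv[i][j] * k_i[j].conjugate()
        for i in range(2)
        for j in range(2)
    )
    cross = sum(k_inv[1][j] * k_i[j].conjugate() for j in range(2))
    return total, cross


if __name__ == "__main__":
    # Hypothetical sample point with x > 0.
    total, cross = no_scale_check(1.3 + 0.4j, 0.2 - 0.1j)
    print("K_i K^{ij*} K_{j*} =", total)
    print("K^{phi j*} K_{j*}  =", cross)
```

With both properties in place, the only surviving contribution to Eq. (342) is proportional to $|W_n|^2$, which is the structure of the no-scale potential.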
7.9.2. Three moduli t' Compacti"cation on the six-torus is not phenomenologically viable, because it allows only one generation in the Standard Model. To obtain the three generations that are observed, one can use [142,103,66,101,67,13,143,166] orbifold compacti"cations with three tori. There are now three moduli t' (I"1 to 3). This theory possesses invariance under the modular transformations. Acting on the moduli, these transformations are generated by t'P1/t' and t'Pt'$i. A matter "eld ? transforms like g\O?'(t'), where g is the Dedekind function and ' q? are the weights of the "eld. Modular symmetry has a "xed point (up to modular transformations) ' at which the matter "elds vanish and t'"e p. At this point, the derivative of < with respect to every "eld vanishes. The matter "elds are divided into "elds ', that belong to the untwisted sector, ( having modular weight q("d(, and "elds belonging to the twisted sector, that have weights q'0 ' ' ' (typically less than 1). In units of the string scale, the radius of the Ith torus is (2x )\, where ' x "t'#t'H! " '" . '
(353)
We expect "t'"&1 with all matter "elds ;1, both in the true vacuum and during in#ation. The superpotential has a power series expansion in the matter "elds, of the form (354) =" j ( ?)L?K g(t') ?L?KO?'\ , K ' K ? where n? are positive integers or zero. The t' dependence of each coe$cient is dictated by modular K invariance, which requires that = transforms like g\(t') (up to a modular-invariant holomor' phic function, which we do not consider because it would have singularities). Using this expression one sees that
R=/Rt',= "2m(t') q? ?= != . ' ? ' ? The KaK hler potential is
(355)
(356) K"! ln x # x\O' " "#2 . ' ' ' ' The "rst term comes directly from string theory, and it gives the part of K that is independent of the twisted "elds. The second term comes from an expansion of the S-matrix as a power series in matter "elds. The additional terms are restricted by modular invariance, but they could in general include terms like
" '" x\O' " " . (357) ' t'#t'H ' Such terms would generically have coe$cients of order 1, and as we shall see they could spoil the #atness of the in#ationary potential. They can be eliminated if we assume that K depends on the moduli and untwisted "elds only through the combinations x , as advocated in [107]. ' If the twisted "elds and the = are negligible, the potential Eq. (342) becomes
(358) < "e) x "= # 'H= "#"x = !=" !3"=" . ' ' ' ' ' $ ' In this expression, = ,R=/Rt'. ' If = is a sum of cubic terms, each containing just one "eld from each untwisted sector, then = does not depend on the moduli and we have simply x "= " ' ' ' < " . (359) $ x x x This expression is similar to Eq. (351), that we wrote down earlier. It has all the properties that we described then, and is also called a no-scale model. For future reference, we note that with Eq. (356) the D term in Eq. (338) becomes
1 < " (Re f )\g q x\O?' " ?"#m . " 2 ? ' ? ' Here, a runs over both twisted and untwisted "elds.
(360)
7.9.3. The dilaton

At the string scale, the gauge coupling is related to the dilaton field $s$ by

$$g_{\rm str}^2 = M_P / ({\rm Re}\, s) . \tag{361}$$

This expression takes the real part of the gauge kinetic function to be 1. Equivalently, $g_{\rm str}$ can be absorbed into $f$. Then at the string scale

$$f(s) = s / M_P . \tag{362}$$

Ignoring Green-Schwarz terms, the contribution of the dilaton to the Kähler potential is

$$\Delta K = -\ln(s + s^*) . \tag{363}$$

This gives an extra contribution to the potential

$$\Delta V = \frac{|F^s|^2}{(s + s^*)^2} \tag{364}$$

$$\quad = e^K \left| (s + s^*) W_s - W \right|^2 . \tag{365}$$

(Of course it also contributes an overall factor $(s + s^*)^{-1}$ from the $e^K$ in front of everything in Eq. (342).) In the true vacuum, Eq. (361) requires $s \sim 1$ to $10\, M_P$, and during inflation the order of magnitude of $s$ is presumably not very different, so as to be within the domain of attraction of the true vacuum.

The contribution of $s$ to the superpotential is non-perturbative, and very model-dependent. It is often supposed to be something like

$$W(s) = M_P^3\, e^{-s/(b M_P)} . \tag{366}$$

Since $e^K \propto 1/{\rm Re}\, s$, these expressions make ${\rm Re}\, s$ run away to infinity, at least with a single term in Eq. (366). There is no consensus about what stabilizes the dilaton either in the true vacuum or [45,21,160] during inflation. The simplest possibility is to invoke an additional (non-perturbative) contribution to the Kähler potential.

All this assumes that the dilaton is part of a chiral supermultiplet, like the other scalar fields. An alternative description [37] puts the real part of the dilaton in a linear supermultiplet. The situation then is qualitatively similar to the one that we have described, but different in important details.

The string scale is the one below which, in weakly coupled string theory, field theory will become a valid approximation. At this scale, the gauge couplings in the true vacuum are supposed to have a common value $g_{\rm str}$, presumably of order 1. The scale and coupling are related by $M_{\rm str} \simeq g_{\rm str} M_P$. The value $g_{\rm str}^2 \simeq 0.5$ would correspond to the value $\alpha_{\rm str} \equiv g_{\rm str}^2/4\pi \simeq 1/25$, which with naive running of the couplings is suggested by observation at a scale of order $10^{16}$ GeV.
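The runaway can be made concrete with a small numerical sketch. It evaluates only the dilaton contribution $e^K |(s+s^*)W_s - W|^2$ for real $s$, keeps only the factor $1/(s+s^*)$ of $e^K$, and assumes a single exponential $W \propto e^{-s/b}$ in units $M_P = 1$ with the prefactor set to 1. The function name, the value of $b$, and the sample grid are hypothetical choices for the illustration.

```python
import math


def dilaton_term(re_s, b=1.0):
    """Dilaton contribution e^K |(s+s*) W_s - W|^2 for real s, units M_P = 1.

    Assumptions: e^K is replaced by its dilaton factor 1/(s+s*), and
    W = exp(-s/b) with unit prefactor (a single term of the assumed form).
    """
    if re_s <= 0.0:
        raise ValueError("Re s must be positive")
    two_s = 2.0 * re_s
    w = math.exp(-re_s / b)
    w_s = -w / b  # derivative of W with respect to s
    return (two_s * w_s - w) ** 2 / two_s


if __name__ == "__main__":
    # Scan Re s on a hypothetical grid and check that the term keeps falling.
    samples = [0.5 * k for k in range(1, 41)]
    values = [dilaton_term(s) for s in samples]
    falling = all(v0 > v1 for v0, v1 in zip(values, values[1:]))
    print("monotonically decreasing on the grid:", falling)
```

Under these assumptions the term equals $e^{-2s/b}(1 + 2s/b)^2/(2s)$, whose logarithmic derivative is negative for every positive $s$ and $b$, so there is no minimum at finite ${\rm Re}\, s$.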
7.9.4. Horava}=itten M-theory In Horava}Witten M-theory [199,141], K receives an extra contribution [214,195]. For the untwisted "elds, this is " '" 1 1 a (t'#t'H) 8 . (367) *K" ' t'#t'H 3s#sH ' '
The parameters $\alpha_I$ are expected to be roughly of order 1. The gauge kinetic function in the visible sector (at the string scale) becomes

$$f = s + \sum_I \alpha_I t^I . \tag{368}$$

The 'string' scale at which this expression is valid will be lower than in weakly coupled string theory.

7.10. Gravity-mediated soft susy breaking

This is a good place to give a brief account of gravity-mediated soft susy breaking.

7.10.1. General features

The basic features are the same as for gauge-mediated susy breaking (Section 7.6.2). The softly broken global susy, that describes the visible sector, is supposed to be only an effective theory. In the full theory, supersymmetry is spontaneously broken. The spontaneous breaking takes place in a hidden sector, whose fields do not possess the Standard Model gauge interactions. The spontaneous breaking mechanism is supposed to involve an F-term.

In contrast with the gauge-mediated case, the mechanism of spontaneous susy breaking in the hidden sector is usually supposed to involve supergravity in an essential way. The defining difference, though, is that the mechanism of transmission of susy breaking to the visible sector comes only from interactions of gravitational strength. In other words, each interaction term is multiplied by a power of $M_P^{-1}$. Some interaction terms of this type will be present as non-renormalizable terms in the expansions Eqs. (331) and (333); for instance, no symmetry can prevent the appearance of a term in $K$ like

$$K = \ldots + \lambda M_P^{-2} |\phi|^2 |y|^2 + \ldots , \tag{369}$$

where $\phi$ belongs to the hidden sector and $y$ belongs to the visible sector, and the coupling $\lambda$ of such a term will generically be of order 1. Additional interaction terms will arise in the potential because of the form of the supergravity expression Eq. (342).

Given the values of the auxiliary fields that spontaneously break susy, and those of the fields themselves, one can calculate the soft susy masses-squared $m_n^2$ (or more generally the soft mass matrix) and the $A_{nm\ell}$ parameters that define the soft trilinear terms. One finds generically $m_n \sim A_{nm\ell} \sim M_S^2/M_P\ (= \sqrt{3}\, m_{3/2})$, where $M_S$ is the susy breaking scale defined by Eqs. (345) and (346) or Eq. (347), and $m_{3/2}$ is the gravitino mass defined by Eq. (348). One can see this by making rough estimates, as in the similar analysis of Section 8.2.1. A classic explicit calculation, with some specific assumptions, is given in Section 7.10.3.
The gaugino masses are given by Rf FL m " , (370) R L 2 Re f L where f is the gauge kinetic function for the visible sector. The simplest example of gravity-mediated susy breaking was given in an unpublished paper by Polonyi [263]. The superpotential is split into the sum of two functions ="=( )#=(y?) ,
(371)
where y? denotes the visible sector "elds and denotes a hidden sector "eld which is a gauge singlet. Its superpotential is taken to be =( )"M( #b) . 1 If gravitational e!ects are ignored, =( ) leads to a #at potential independent of ,
(372)
<"M , (373) 1 susy is broken, but the vev of is undermined. Once gravity is turned on, the presence of the negative terms produces a minimum of the potential at 1 2&M , (374) . F "R=/R &M . (375) ( 1 The constant b"(2!(3)M is chosen to make the cosmological constant vanishing in the true . vacuum, <(1 2)"0. 7.10.2. Gravity-mediated susy breaking from string theory The nonvanishing auxiliary "elds of the hidden sector are usually taken to be those of the dilaton and/or the bulk moduli. Also, the bulk moduli t' and their auxiliary "elds FR' are usually set equal to common values, t and FR. Finally, the weakly coupled string theory expression Eq. (356) is assumed. Then the scalar masses are [153,44] m"m [(3#q cos h)C!2] , (376) L L where q " qL and tan h"(K H/K H)"FQ/FR". The constant C is given by C!1"< / L ' ' QQ RR (3Mm ), and it is equal to 1 in the true vacuum case that we are dealing with at the moment. As . usual m "e)+."="/M. . At a deeper level, the vevs of the auxiliary "elds are usually supposed to mimic some dynamical e!ect, often originating in string theory with extra space dimensions. A favoured mechanism is gaugino condensation, which is supposed to generate a superpotential =(s) looking something like Eq. (366). (With several hidden sectors there is a sum of such terms.) The value of b has to be such that =&K&(10 GeV) . (377) This gives the right soft susy breaking scale, M&K/M &(10 GeV). With this mechanism 1 . FQ vanishes, since once s is stabilized the perfect square, Eq. (365), is driven to zero. In weakly coupled string theory, Eqs. (362) and (370) then make the gaugino masses vanish at the string scale.
This is probably forbidden by observation, but it is avoided in Horava}Witten M-theory where Eq. (362) is replaced by Eq. (368). A particular version of gravity-mediated susy breaking is the no-scale theory, corresponding to Eq. (359). In this case, the masses of untwisted "elds vanish at tree level, though running them from the string scale can still give masses of order 100 GeV at the electroweak scale. In the context of weakly coupled string theory, no-scale gravity corresponds to the assumption that the superpotential in the relevant sector of the theory is independent of the bulk moduli. Because of the modular invariance encapsulated in Eq. (354), this may be di$cult to arrange in the true vacuum under consideration at present, since = is necessarily nonzero. (During in#ation the no-scale form is easier to achieve as we discuss later, provided that = is negligible.) In Horava}Witten M-theory, no-scale gravity will presumably be a valid approximation only if some of the a in Eq. (367) are signi"cantly below 1. ' 7.10.3. Formalism for gravity-mediated supersymmetry breaking This subsection is more technical, and can be skipped by the general reader. It gives a formalism for calculating the soft scalar terms explicitly, with some assumptions, and an example of how gaugino condensation can generate an e!ective contribution to =. For the formalism, we follow the original notation [281,116], in which the complex conjugate of a "eld is labelled by a subscript. The visible sector "elds are y? (collectively y) and the hidden sector "elds are G (collectively ). It is supposed that
After taking the limit M PR [116], we obtain for the visible sector a renormalizable global . susy theory, with explicit soft breaking terms. The scalar potential is of the form
RgL RgL #m y?S@yR# mR y?R@ # (A !3)gL # (B !2)k g #h.c. ? @ ?Ry@ L L K K K Ry? L L #D-terms . (381)
<"
The "rst term is the unbroken susy result, the second term is the soft mass matrix for the complex "elds, and the term in square brackets contains soft trilinear terms, as well as bilinear terms that complete the speci"cation of the mass matrix of the real "elds. We have imposed the constraint <"0 appropriate for the vacuum, and the gravitino mass is the modulus of m ,1e)=2 . The soft parameters are determined by the following formulas:
RK@RKA RK@ ?! ? o S@"d@# oRG ? ? RmR RmG RmRRmG H H H where
,
(382)
RK@ R@"d@!M oRG ? ? ? . RmG
,
RK \ R o, (ln =#K) . H RmGRmR RmH H Here gL is the superpotential for the light "elds de"ned by
(383)
(384)
gL (y)" gL (y)# k g(y) , L K K L K
(385)
with gL (y)"1e)2c (1m2)g(y) , L L L
k "m K
1!o
R c (m,mR) GRmR K G
.
(386)
Also,
R A " oRG [K#ln c (m)] L L RmG
,
c (m,mR) R R R K !o !oRGo . (387) G H RmG RmR RmGRmR (1!Mo (R/RmR)c (m,mR) G G K H H Identifying the ultra-violet cuto! K in Eqs. (331) and (333) with M , one will have generically 34 . "S@"&"R@"&"A "&"B "&1, making the soft susy breaking mass matrix-squared of order m and ? ? L L the trilinear terms of order m . Next we see how gaugino condensation can give an e!ective superpotential. We consider an extension of susy-QCD based on the gauge group SU(N ) in the hidden sector with N 4N #avors A D A B " K
2# oRG
of `quarksa QG in the fundamental representation and `antiquarksa QI M in the antifundamental G representation of SU(N ) [5]. The gauge kinetic function may be chosen to be f"ks, where s is the A dilaton super"eld and k is the Kac-Moody level of the hidden gauge group. Because of the gauge structure, the gauge group SU(N ) enters the strong-coupling regime at the A scale (388) K "M e\IQ@ , . where b "(3N !N )/(16p) is the one-loop beta function for the hidden sector gauge group. A D Below the scale K the appropriate degrees of freedom for N (N are the mesons MGM "QGQI M . D A G G The e!ective superpotential is "xed uniquely by the global symmetries as follows [5]: ="(N !N )1jj2 , A D where the gaugino condensation scale is
1jj2"
K,A\,D ,A\,D . Det M
(389)
(390)
8. F-term inflation

8.1. Preserving the flat directions of global susy

Let us recall the discussion of Section 5.9. We saw there that in any model of inflation, the quartic term of the potential $V(\phi)$ should be small. One can ensure this by choosing the inflaton to be a flat direction of global supersymmetry, but one still has to ensure that the mass term and non-renormalizable terms are sufficiently small. At least for the mass term, this does not happen in a generic supergravity theory. The following strategies have been proposed to get around this problem.

1. The potential is dominated by the F term, but the inflaton mass is suppressed because $K$ and $W$ have special forms.
2. The potential is dominated by the F term, whose form is generic. However, the inflaton mass is suppressed because of an accidental cancellation between different terms.
3. The potential is dominated by the F term, whose form at the Planck scale is generic. However, the inflaton mass is suppressed in the regime where inflation takes place, because it runs strongly with scale.
4. The potential is dominated by a Fayet-Iliopoulos D term.
5. The potential is dominated by the F term, whose form is generic. However, the kinetic term of the inflaton field becomes singular near the region where inflation takes place, so that after going to a canonically normalized field the potential becomes flat even though it was not originally.

We mentioned the last possibility in Section 6.6 and it will not be considered further. We consider in this section the three F-term possibilities, and then go on to the D-term.
8.2. The generic F-term contribution to the in-aton potential In this section we show that in a generic model of F-term in#ation, the #atness parameter g,M</< of the would-be in#aton potential is at least of order 1, in contrast with the . requirement "g":0.1. We are continuing the discussion of Section 5.9, and supposing that the in#aton is the radial part of a matter "eld. The full potential is given by Eq. (342), and as it contains more than one complex "eld we cannot adopt the assumption of Section 5.9 of exact canonical normalization; this would correspond to the condition K H"d H which will be impossible to arrange for all "eld values. However, we assume LK LK this condition at the origin for the in#aton "eld (n"m"i), in order to calculate the in#aton mass-squared. We also assume that it provides at least a rough approximation for all of the "elds, and that in addition "K ":M and e)+.&1. These assumptions are valid in the string theory L . examples that are usually considered. By analogy with Eq. (346), we de"ne the scale M of susy breaking during in#ation by (391)
(392)
8.2.1. The inyaton mass We are mainly concerned with the contribution of the quadratic term in Eq. (128) to g, which is g"mM/<. Purely for simplicity, we suppose that < depends only on " " so that m"< H . G GG evaluated at the origin. To get o! the ground, we "rst assume that all "eld values are ;M , with K H"d at the LK . LK origin. Then we "nd from Eq. (342), assuming that the in#aton is a #at direction corresponding to = "0, LG (393) m"M\< !M\"= "# KKHHL=H= . GG L K . . G LK The right-hand side is evaluated with all "elds at the origin. The contribution of the "rst term to g is precisely 1. For the other terms, take "rst the case < &M . Then the (negative) contribution of the second term to g is at most of order 1. For the third term, we use Eq. (333), and set K "M . 34 . Then KLN KM will be of order M\, and the contribution of the third term to g is also of order 1 (with GG . either sign). Generically, there is no reason to expect an accurate cancellation of the contribution #1 coming from the "rst term. The case that one or more "elds have values of order M is more model dependent, but the . generic contributions to g are still at least of order < /M. In particular, one gets a contribution to . m analogous to the third one of Eq. (393), KKHHLFHF , that is generically of this order. LK GG L K If we abandon the assumption < KM , the estimate becomes bigger, m&M /M&(< /M)(M /< ) . (394) . .
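The statement that the first contribution to $\eta$ is precisely 1 can be checked with a short numerical sketch. It assumes that this contribution comes from the overall factor $e^{K/M_P^2}$ with canonical $K = |\Phi|^2$ multiplying an otherwise constant $V_0$, and that the canonically normalized real inflaton $\phi$ satisfies $|\Phi|^2 = \phi^2/2$. The function names, the finite-difference step, and the choice of units $M_P = V_0 = 1$ are illustrative assumptions.

```python
import math


def eta(potential, phi, m_p=1.0, h=1.0e-3):
    """Flatness parameter eta = M_P^2 V''/V, with V'' from a central difference."""
    v_pp = (potential(phi + h) - 2.0 * potential(phi) + potential(phi - h)) / h ** 2
    return m_p ** 2 * v_pp / potential(phi)


def sugra_factor_potential(phi, v0=1.0, m_p=1.0):
    """V = V0 exp(|Phi|^2 / M_P^2), with |Phi|^2 = phi^2 / 2 for canonical real phi.

    Only the exponential prefactor is kept; every other field dependence of
    the F term is dropped in this sketch.
    """
    return v0 * math.exp(phi ** 2 / (2.0 * m_p ** 2))


if __name__ == "__main__":
    print("eta at the origin:", eta(sugra_factor_potential, 0.0))
```

The result is of order 1 and therefore far from the requirement $|\eta| \lesssim 0.1$, unless the remaining terms cancel it.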
8.2.2. The quartic coupling and non-renormalizable terms The expansion Eq. (331) of = will generically give coe$cients j&j &1 in Eq. (128). According B to Section 5.8, j(10\. To achieve this, the in#aton is chosen to be a #at direction, so that the relevant renormalizable terms of Eq. (331) vanish. Repeating the above discussion one then "nds j&< /M. . At least the "rst few j should also be suppressed, by eliminating the relevant non-renormalizB able terms in Eq. (331). These coe$cients are then also of order &< /M. . As before, these estimates assume < KM and more generally we have j&j &M /M&(< /M)(M /< ) . (395) B . . 8.3. Preserving yat directions in string theory 8.3.1. A recipe for preserving yat directions A strategy for keeping the F-term #at was given by Stewart [289] (see also [60]). The basic idea is to ensure that the potential has almost the same form as in global susy. This is done by imposing some simple conditions on = and the "elds, and choosing a rather special form for K. The required form occurs in weakly coupled string theory, though apparently not in Horava}Witten M-theory. The "elds are divided into three classes, which we shall label , t and s. During in#ation, it is required that the following relations are satis"ed to su$cient accuracy: ="= "= "s"0 , ( R = O0 . (396) Q The in#aton is going to be one of the "elds, which means that the others are constant during in#ation; as a result the requirement s"0 can always be imposed by a choice of origin, though it may not be a natural one. With these assumptions, the potential, Eq. (342), becomes, during in#ation, < "e) = KKHL(= )H , K $ L LK where the sum goes only over the s "elds. The required form for K is
(397)
(398) K"!ln f ( , H)! sHC (t,tH)s #KI (t,tH)#O(s,sH) , L LK K LK where f and KI are arbitrary functions, and C is a matrix which might be the unit matrix. Then the potential during in#ation is <( )"e)I = (C\) (= )H . (399) L LK K L We see that the dependence of < on the "elds comes only from the = . For such "elds, #at L directions of global susy are preserved, provided that they are not spoiled by "elds that are displaced from the origin. We can have viable in#ation by choosing the in#aton to be one of these
#at directions. Note, though, that the t "elds have to be stabilized in the presence of the KI and C factors. LK One can quantify [289] the required accuracy of the assumptions by looking at Eq. (342). A slight violation of the conditions on = typically gives "g"&M\"=/= ", "= /= " or . Q ( Q "= /= ". A small contribution dK( ,s) to K gives, assuming s"0, R Q e&M\dK , .
(400)
"g!2e"&dK ,
(401)
where the prime denotes a typical partial derivative of dK. 8.3.2. Preserving the yatness in weakly coupled string theory In weakly coupled string theory, ignoring Green}Schwarz terms, K given by Eqs. (356) and (363) is of the required form if the and s "elds constitute a single untwisted sector (with the modulus a "eld), and the twisted "elds vanish to su$cient accuracy. Accordingly we can require the following conditions, to su$cient accuracy during in#ation [60,107]. 1. All derivatives of = with respect to matter "elds vanish, except for the one corresponding to a single untwisted "eld, say = . (One could allow more untwisted "elds from the I"3 sector ! without changing anything, and of course the choice I"3 is arbitrary.) 2. ="= "= " "0. (The easiest way of ensuring = "0 is to suppose that every term in Q ' ! ' the expansion Eq. (354) of = vanishes.) 3. The twisted "elds vanish. From Eqs. (400) and (401) it is actually enough to have the twisted "elds "xed at values ;M . . Also, condition 2 is accurate enough if "="/M , "= " and "= " are all ;"= ". These conditions are . Q ' ! straightforward to achieve if one ignores the dilaton, which is reasonable for models with <<10 GeV, provided that the dilaton contribution =(s) is the same during in#ation as it is in the true vacuum. The present scheme may not work for models with <:10 GeV. With these conditions in place, Eq. (342) gives <""= "/x x . (402) ! Flat directions in the untwisted I"3 sector are preserved, if their #atness is not spoiled by coupling to "elds with non-zero values, and one of them can be the in#aton. It could also be t , or a combination. Note that the analogous procedure in the case of a single modulus would not work, because of the factor 3 in front of Eq. (350). The above strategy preserves the #atness of the globally supersymmetric potential at all values of the in#aton "eld. 
This is possible because the in#aton is supposed to belong to an untwisted sector, and string theory gives the part of the KaK hler potential depending only on the sector for all "eld values. If the in#aton "eld is small it may be enough to keep the in#aton mass small, and this can be achieved provided that one knows the relevant part of the KaK hler potential up to quartic terms. For the twisted "elds, Eq. (356) gives the required information if we assume that K depends on the untwisted "elds and bulk moduli only through the x . In general, Eq. (356) gives the usual result '
m9< /M, but an exception has been noted [53,54]; if Eq. (376) applies, with FQ"0 and . m ,e)"=" negligible, then m vanishes provided that the in#aton "eld has weight q "3. L All this is in weakly coupled string theory. In Horava}Witten M-theory, K receives an extra contribution Eq. (367). If the a are of order 1 this contribution will presumably give us back the ' generic result m9< /M. . 8.3.3. Case of a linear superpotential Returning to Eq. (402), we have to ensure the stability of t and t . This is achieved [60] if = comes from a term K , with K independent of the matter "elds. Then, modular invariance ! ! requires KJg\(t )g\(t ), and <J["g(t )g(t )"x x ]\ . (403) To discuss the stability of the moduli, we can set the matter "elds equal to zero so that x "t #tM . ' ' ' As shown in [60], < is stabilized at t "t "e p up to modular transformations. The masses squared of the canonically normalized t and t turn out to be precisely
and acquire vevs when the D term is driven to (practically) zero. From Eq. (352), one sees that the vevs " " and " " will be proportional to respectively x and x , making < given by Eq. (402) independent of these quantities.
The other "xed point in the fundamental domain, namely t "1, is a saddle point of potential (403); see e.g. [102]. ' (To be more precise, t "1 is a "xed point if in addition to modular invariance there is symmetry under Im t P!Im t , ' ' ' which is the case in the present model.) This was considered in [60,289], but the factors K in the D term in Eq. (338) were not considered whereas they are in L fact crucial. The ratio " "/" " can be "xed, for instance, by gauging a non-anomalous ;(1) symmetry under which and
have opposite and equal charges.
Flat directions are now preserved in all of the untwisted sectors, provided that they are not spoiled by the displacement of "elds from the origin. Any of them is a candidate for the in#aton, and so are each of the moduli t . ' The same thing actually works [107] in the toy model with a single modulus t; taking all three "elds to belong to the (single) untwisted sector one eliminates the x dependence appearing in Eq. (351). Other authors using Eq. (351) for in#ation suppose that x is "xed, either by an ad hoc functional form for K(x) [205,251,241,31], or by a loop correction [106]. The "rst option seems unsatisfactory, and in the second option the status of the loop correction during in#ation is not clear. We have not yet considered the stability of the dilaton, either in the D-term model or in the one with a linear superpotential. This has been investigated [107] with the (real part of the) dilaton in a linear multiplet, using a model [37] which stabilizes the dilaton in the true vacuum. The dilaton is stabilized in the model with a linear superpotential, but not in the D-term model in the simple form given above. However, the vevs induced by the D term can then induce additional vevs through the F term. It was shown [107] that this can stabilize the dilaton, while preserving the #atness in one or more of the untwisted directions. (By &stability' we mean existence of a minimum in the potential, with all "elds except the dilaton "xed. Starting from a wide variety of initial conditions, the dilaton will typically settle down to the minimum [27].) To have a complete model, one also needs to end in#ation, and because of the form we are imposing on = this will probably require hybrid as opposed to single-"eld in#ation. No complete example has yet been given for the particular superstring-derived theory that we are considering, but one can presumably be constructed along the lines of the following model [289]. 
The model works with a superpotential that has the general properties listed at the beginning of the last subsection. The KaK hler potential is assumed to be of the form Eq. (398), with for simplicity C"e)I "1 so that the potential is the same as in global susy, but its detailed form is not considered. Also, K " H is used when calculating the D term. The model contains one s "eld and L L three t "elds. Working with units M "1, the superpotential is . ="j t t #j tL s (404) with n52. The D-term is taken to be < "g(m!t!t#t#ns) , " and it is assumed that
(405)
j mL\;g . (406) It is assumed that during in#ation t#t(m. Then s and t will be driven to zero, and so will the derivatives of = with respect to , t and t . The potential then becomes (407) <"j t#jtL#g(m!t!t) .
We use the same symbol for the square and the modulus-squared of a field since it is obvious from the context which is intended.
During inflation it is assumed that
'(g/j)(m!t) . (408) Then t is driven to zero, along with the derivative of = with respect to t . From Eq. (406) t has a constant value given approximately by m!tKnjmL\/g;m . (409) Restoring M , we conclude that during in#ation, there is an exactly #at potential with magnitude . given by m L\ <K(j (m , (410) M . in the regime
(411)
' ,(j /j )(n(m/M)L\(m . . A This scheme is similar to the scheme of D-term in#ation that we consider later, but di!ers from it in two crucial respects. One is that the loop correction is much smaller, because the D-term is much smaller. As a result, there is no need for the in#aton "eld to have the dangerous value &M . The . other is that the COBE normalization <:10\M can be achieved without supposing that . (m is so small. 8.3.5. Simple global susy models of inyation In these examples we took seriously the requirement of modular invariance. We end by considering a couple of models that ignore this requirement, while using a superpotential of the required form Eq. (396). It would not be di$cult to generalize them so that modular invariance is satis"ed, though the stability of the moduli and dilaton may require care. The mutated hybrid in#ation model Eq. (223) is generated by [227] (412) ="Ks (1!t /M)#(j t s . The s "elds are driven to zero, giving Eq. (223) with < "K. The COBE normalization, Eq. (226), with M&M , corresponds to K&10}10 Gev. It was suggested [290] that K could be . identi"ed with gaugino condensation scale, though it is not clear how that might be achieved. To obtain inverted hybrid in#ation one can use [227] ="K(1!j t/K)s .
(413)
This drives $s$ to zero, and after adding suitable mass terms one obtains Eq. (216). The mass terms can come from the supergravity corrections. (A more complicated superpotential was given much earlier [253], but the inflaton trajectory turned out to be unstable [251].)

8.4. Models with the superpotential linear in the inflaton

We next turn to models where the superpotential during inflation is linear in the inflaton field [60,88,89,194,213,74], or linear except for small corrections [185,148,149]. The first case gives
hybrid inflation with a potential whose slope is dominated by a loop correction. The second case gives single-field inflation with an inverted quadratic, or higher-order, potential.

This paradigm has been widely regarded as a way of keeping supergravity corrections under control. Unfortunately, the analysis leading to that viewpoint is likely to be incorrect, since it assumes that all of the fields in Nature have values $\ll M_{\rm P}$ during inflation. To see what is going on, first suppose that this assumption is correct. Then the inflaton mass-squared is given by Eq. (393), and one can see [60] that indeed the first two terms cancel if the superpotential is linear in the inflaton field. So to achieve a sufficiently small mass one need only tune down the coefficient of the relevant quartic term in $K$, below its natural value $\sim M_{\rm P}^{-2}$. Arguably, this is preferable to arranging an accurate cancellation. But now suppose that there are fields $\phi_n$ with values of order $M_{\rm P}$. One sees from Eq. (342) that with the minimal form $K=\sum_n|\phi_n|^2$, each such field contributes $\eta = M_{\rm P}^{-2}|\phi_n|^2 \simeq 1$. There is no reason to suppose that the non-minimal form presumably holding in reality will give a much smaller contribution. So one is back with a cancellation and the paradigm has no special virtue. Specific examples of fields with values of order $M_{\rm P}$ are the dilaton and bulk moduli that emerge from string theory.

We focus on the hybrid inflation model, where the superpotential during inflation is exactly linear in the inflaton field. The field whose radial part will be the inflaton is a gauge singlet, denoted by $S$. The original version of the model [60] used the superpotential

$$W = S(\kappa\psi^2 - \mu^2) , \tag{414}$$
where $\kappa$ is a dimensionless coupling. This form does not allow $\psi$ to be charged under any symmetry, but one can change it to [88]

$$W = S(\kappa\bar\psi\psi - \mu^2) . \tag{415}$$
Here, $\psi$ and $\bar\psi$ are oppositely charged under all symmetries so that their product is invariant. The absence of additional terms involving $S$ is enforced to all orders if $S$ is charged under a global $U(1)$ R-symmetry, and up to a finite order if it is charged under a discrete ($Z_N$) symmetry. As we noted before, only the latter seems to be allowed in the context of string theory. Instead of putting in the scale $\mu$ by hand, one may derive it [74] from dynamical supersymmetry breaking by a quantum moduli space (Section 7.5.4).

The canonically normalized inflaton field is $\phi \equiv \sqrt2\,|S|$, and the global susy potential is

$$V = \tfrac12\kappa^2\phi^2\left(|\psi|^2 + |\bar\psi|^2\right) + |\kappa\bar\psi\psi - \mu^2|^2 . \tag{416}$$

This has the same general form as the original tree-level hybrid inflation potential, Eq. (205), with zero inflaton mass. The interaction with $\phi$ then gives $\psi$ and $\bar\psi$ identical $2\times2$ mass matrices for their real and imaginary parts, and after diagonalizing one finds masses

$$m_\pm^2 = \tfrac12\kappa^2\phi^2 \pm \kappa\mu^2 . \tag{417}$$

The critical value is therefore given by $\phi_{\rm c}^2 = 2\mu^2/\kappa$. For $\phi > \phi_{\rm c}$, $|\psi| = |\bar\psi| = 0$ and we have slow-roll inflation. The potential is exactly flat at tree level, but the loop correction gives a significant slope [88]. Indeed, using Eq. (327) and remembering that there are two chiral multiplets, one finds the
potential we wrote down in Eq. (240), with $\Lambda = \mu$ and $Cg^2 = \kappa^2$. We already worked out the prediction of this potential for $n$, and its COBE normalization.

Some authors [255,213] have considered the possibility that quadratic and quartic tree-level terms are significant, with the former assumed to come only from the quartic term of the Kähler potential and the latter assumed to come only from the factor $e^{K/M_{\rm P}^2}$. According to the analysis of Section 8.2, neither of these assumptions is very reasonable.

8.5. A model with gauge-mediated susy breaking

Now we consider a global susy model [87,267] in which $W$ does not have the form Eq. (396). Our discussion somewhat extends the original one. In this model, the supergravity corrections are presumed to be small because of an accident. As we shall see, a very severe cancellation is required.

The model assumes that there is gauge-mediated susy breaking in the true vacuum, which also operates during inflation. It uses a particle physics model [83] which replaces the $\mu$ parameter of the MSSM by a term $\lambda_\mu M_{\rm P}^{-n} S^{n+1}$. The field
,(2Re S becomes the in#aton. In this model, gravitations do not pose a cosmological problem, while the moduli problem is ameliorated. The superpotential is supposed to be bXS>N SK> ="! # #j M\LSL>H H #2 . (418) I . 3 " MN MK . . This structure can be enforced by discrete symmetries. The dots represent the contributions to = that do not involve S. They generate among other things the vev F , which we take to be real 6 and positive, and close to the vacuum value discussed in Section 7.6.2. The third term generates the k term, but plays no role during in#ation. The case p"m"2 is considered. Adding a negative mass-squared term that is supposed to come from supergravity, the potential along the real component of S (denoted by the same symbol) is
(419)
with j"8bM\F . (420) . 6 The constant term < is given by < "F !3M\e)+."=" , (421) 6 . and as is usual in gauge-mediated models the origin of the last term is not unspeci"ed. One can determine the vacuum value of S by minimizing this potential, and using the vacuum value XKF /K with K&10 Gev. Assuming X;S, one "nds S&bMF and F :MK or 6 . 6 . (F :10 Gev. By setting <"0 in the vacuum, one "nds that in this case < &bF. In the 6 opposite case X<S, one "nds S&MK/F and (F 910 Gev. Then, . 6 6 < &(KM/bX)F . (422) . 6
One can check that X is negligible during in#ation (assuming that like F it has almost the same 6 value as in the true vacuum). The potential then becomes the one that we analyzed in Section 6.5. We found that the COBE normalization requires (bF &10 Gev, marginally consistent with 6 the upper limit for gauge-mediated susy breaking if b is close to 1. This corresponds to X&10 Gev. Using Eq. (422) one "nds < &10\F . 6
(423)
The generic supergravity contributions to m are of order F /M&10< /M. In contrast with 6 . . the usual situation, the generic contributions have to be suppressed to at one part in 10, even if n is signi"cantly di!erent from 1 (n!1"mM/< K0.1). . 8.6. The running inyaton mass model revisited Now we look in more detail at the theory behind the running mass model of Section 6.16. 8.6.1. The basic scenario The fundamental assumption of the model is that the sector of the theory occupied by the in#aton is hidden from the sector where supersymmetry is spontaneously broken, and communicates with it only through interactions of gravitational strength. We shall call the former the in-aton sector, and the latter the in-ationary SSB sector. The in#aton sector is supposed to be described by a renormalizable global susy theory, with soft susy breaking terms. In other words, there is supposed to be gravity-mediated supersymmetry breaking, in the in#aton sector during in#ation. It is not necessary, for the viability of the model, to assume anything about the in#ationary SSB sector. But the simplest thing is to identify the in#ationary SSB sector with the one that generates susy breaking in the true vacuum, which we call the vacuum SSB sector. Also, one might suppose that the susy-breaking scales are the same, M &M . In that case, we shall have M &10 Gev 1 if there is gravity-mediated susy breaking in the visible sector, and 10:M :10 Gev if there is gauge-mediated susy breaking in the visible sector. (Presumably, the in#aton sector is di!erent from the visible sector, though they might conceivably be identical if there is gravity-mediated susy breaking in the visible sector.) Even if the in#ationary and vacuum SSB sectors are identical, it is not inevitable that the susy-breaking scales are the same. Take for instance the case of gaugino condensation, where that scale is determined by =(s). 
Even if $W(s)$ has the same functional form in the two cases, it will not have the same value because $s$ will be different. But $W(s)$ might be a different function during inflation. For instance, gaugino condensation might occur only after inflation. If it does occur, the value of $b$ might be different, because during inflation some of the fields which contribute to the running of the gauge coupling, and are light in the true vacuum, become heavy and no longer contribute to the renormalization group equation of the gauge coupling [160].
At the Planck scale, the in#aton mass-squared m( ) (along with other soft susy breaking parameters in the in#aton sector) is supposed to have its generic magnitude given by Eq. (394) with M K< , "m(M )"&< /M . (424) . . The mass-squared is supposed to run strongly with scale, so that it becomes small which allows in#ation to occur. 8.6.2. Directions for model building Although a complete model is far from being written down, one can see some basic features that will be needed. The complete potential might look roughly like Eq. (205). If the mass m is also generated by soft R susy breaking, then as we noted in Section 6.9 t would have a vev of order M ; it might be a matter . "eld with non-renormalizable terms suppressed to high order by a discrete symmetry. On the other hand, m might be bigger and come from some other mechanism, in which case t could be a more R ordinary "eld. The quartic coupling of Eq. (205) could come from a term (jS t in the superpotential, with S some "eld that vanishes during in#ation. The alternative coupling in Eq. (211), plus an identical term with Pt that we did not consider for simplicity, could come [266] from a term t/K in 34 the superpotential. One will have to avoid the strong cancellation between the terms of Eq. (391), that is present in the true vacuum. In the case of gauge-mediated susy breaking in the true vacuum, this might require an understanding of the origin of the sector that generates the magnitude of = in the true vacuum, which is so far something of a puzzle. In the opposite case, the situation maybe under better control, since one could use an explicit model (such as the one of Ref. [37]) which already speci"es all of the relevant quantities in the true vacuum. 8.6.3. Running with a gauge coupling Following [294,62], we calculate the running in#aton mass, on the assumption that the in#aton is charged under a gauge group and that its Yukawa couplings have a negligible e!ect. 
The RGEs have the same form as the well-known ones that describe the running of the squark masses with only QCD included,

$$\frac{d\alpha}{dt} = \frac{b}{2\pi}\,\alpha^2 , \tag{425}$$

$$\frac{d}{dt}\left(\frac{\tilde m}{\alpha}\right) = 0 , \tag{426}$$

$$\frac{dm^2}{dt} = -\frac{2c}{\pi}\,\alpha\,\tilde m^2 . \tag{427}$$

Here $\alpha$ is related to the gauge coupling by $\alpha = g^2/4\pi$, $\tilde m$ is the gaugino mass, and $t \equiv \ln(\phi/M_{\rm P}) < 0$. The numbers $b$ and $c$ depend on the group; $c$ is the quadratic Casimir invariant of the inflaton representation under the gauge group, for instance $c = (N^2-1)/2N$ for any fundamental representation of $SU(N)$, and $b = -3N + N_f$ for a supersymmetric $SU(N)$ with $N_f$ pairs of fermions in the fundamental/antifundamental representation.

To be more correct, the relevant coefficients in the expansion (333) are supposed to be of order 1 at the Planck scale, in Planck units. They run, which is equivalent to running the susy-breaking parameters even though the latter may really be defined only below a lower scale where supersymmetry breaks.

The renormalization group equations can easily be solved. The result is
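$$m^2(\phi) = m_0^2 + \frac{2c}{b}\,\tilde m_0^2\left[1 - \frac{1}{\left[1 - (b\alpha_0/2\pi)\ln(\phi/M_{\rm P})\right]^2}\right] . \tag{428}$$

Here $m_0$ is the inflaton mass, $\tilde m_0$ is the gaugino mass, and $\alpha_0$ is the gauge coupling, all evaluated at the Planck scale.

We want the magnitude of $m^2$ to decrease as one goes down from the Planck scale. This requires $m_0^2 < 0$, corresponding to models (i) or (ii) of Section 6.16.

We evaluate $c$, $\sigma$ and $\tau$ to leading order in $\alpha$, which is presumably all that is justified in a one-loop calculation. It is convenient to use the following definitions:

$$\mu^2 \equiv -\frac{m_0^2 M_{\rm P}^2}{V_0} , \tag{429}$$

$$A \equiv -\frac{2c}{b}\,\frac{\tilde m_0^2 M_{\rm P}^2}{V_0} , \tag{430}$$

$$\tilde\alpha \equiv -\frac{b\alpha_0}{2\pi} , \tag{431}$$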
y,[1#aJ ln( /M )]\ , . y ,(1#k/A . HH Applying the linear approximation one "nds [63]
(432) (433)
c"2y A aJ , HH q"2A y (y !1) . HH HH If m continues to run until the end of slow-roll in#ation, p is given by
(434) (435)
4A y "y !y " HH HH , (436) y #y HH where y "(y $A\), with the plus sign for model (i) and the minus sign for model (ii). HH Using this result, one can calculate the COBE normalization, and the spectral index n(N). There are four cases to consider, corresponding to asymptotic freedom or not, and models (i) or (ii). Except for the case of model (ii) and no asymptotic freedom, there is [63] a region of parameter space that is allowed by the observational constraints described in Section 6.16, and includes the theoretically favoured values a &10\ to 10\, "k "&"A "&1 and 10 Gev:<:10 Gev. ln p"2y (y\!y\)#ln HH HH
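As a numerical cross-check of the solution (428), the following Python sketch integrates the renormalization group equations (425)-(427) with a fixed-step fourth-order Runge-Kutta scheme and compares the result with the closed form. The choice of group ($SU(3)$ with no matter, so $b=-9$ and $c=4/3$), the Planck-scale values of the coupling and masses, and all function names are illustrative assumptions rather than values taken from the text; masses are in arbitrary units.

```python
import math


def closed_form_m2(t, m0_sq, mg0, alpha0, b, c):
    """Eq. (428): inflaton mass-squared at t = ln(phi/M_P) <= 0."""
    y = 1.0 / (1.0 - (b * alpha0 / (2.0 * math.pi)) * t)
    return m0_sq + (2.0 * c / b) * mg0 ** 2 * (1.0 - y ** 2)


def rge_rhs(state, b, c):
    """Right-hand sides of Eqs. (425)-(427) for (alpha, gaugino mass, m^2)."""
    alpha, mg, _m_sq = state
    d_alpha = (b / (2.0 * math.pi)) * alpha ** 2        # Eq. (425)
    d_mg = mg * d_alpha / alpha                         # Eq. (426): mg/alpha is constant
    d_m_sq = -(2.0 * c / math.pi) * alpha * mg ** 2     # Eq. (427)
    return (d_alpha, d_mg, d_m_sq)


def integrate_rge(t_end, state, b, c, steps=4000):
    """Classical RK4 from t = 0 (Planck scale) down to t_end < 0."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rge_rhs(state, b, c)
        k2 = rge_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), b, c)
        k3 = rge_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), b, c)
        k4 = rge_rhs(tuple(s + h * k for s, k in zip(state, k3)), b, c)
        state = tuple(
            s + (h / 6.0) * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
            for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)
        )
    return state


# Illustrative (assumed) inputs: SU(3), no matter, negative m^2 at the Planck scale.
N_COLORS, N_FLAVORS = 3, 0
B = -3 * N_COLORS + N_FLAVORS
C = (N_COLORS ** 2 - 1) / (2 * N_COLORS)
ALPHA0, MG0, M0_SQ = 0.04, 1.0, -1.0
T_END = -10.0

alpha_end, mg_end, m_sq_numeric = integrate_rge(T_END, (ALPHA0, MG0, M0_SQ), B, C)
m_sq_exact = closed_form_m2(T_END, M0_SQ, MG0, ALPHA0, B, C)
print(m_sq_numeric, m_sq_exact)
```

With these inputs the mass-squared starts negative at the Planck scale and has crossed zero by the lower scale, which is the kind of behaviour the running-mass scenario relies on.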
8.7. A variant of the NMSSM

The model we just considered supposes that there is soft susy breaking in the inflaton sector, and that the relevant soft susy-breaking parameters all have their natural values at the Planck scale. In particular, the inflaton mass is supposed to satisfy $|m^2| \sim V_0/M_{\rm P}^2$ there. We now consider a model [29-31] which also assumes soft susy breaking in the inflaton sector, with all relevant parameters
except the in-aton mass at natural values (and actually negligible running). But the in#aton mass is supposed to vanish at the Planck scale, presumably because it occupies a special subsector in which no-scale supergravity holds. This last feature may be di$cult to arrange, since the model requires an accurate cancellation in Eq. (391) and therefore a nonzero value for = (see the remarks in Section 7.10.2). (The speci"c proposal [31] invokes the weakly coupled string theory expressions of Section 7.9, but it requires a "eld with vanishing modular weight whereas one expects nonzero weights.) In this model, the in#aton sector is actually (part of) the visible sector, and it is assumed that gravity-mediated susy-breaking holds with M KM . 1 The model [29}31] works with a variant of the next-to-minimal Standard Model [99,249,65]. The relevant part of the superpotential is ="jNH H !k N , (437) 3 " where H and H are the usual Higgs "elds and N and are two standard model gauge singlet "elds. 3 " The actual next-to-minimal Standard Model is recovered if the last term of Eq. (437) becomes !kN, which leads to a Z symmetry and possible cosmological problems with domain walls. In the variant, the Z becomes a global ;(1), which is in fact the Peccei-Quinn symmetry commonly invoked to ensure the CP invariance of the strong interaction. This symmetry is spontaneously broken in the true vacuum because and N acquire vevs. The axion is the pseudo-goldstone boson of this symmetry, and axion physics requires 1 2&1N2&10 to 10 GeV or so. (Higher values are allowed in some models, but not this one.) The latter value is adopted to make the in#ation model work. The axion is practically massless, and by a choice of the axion "eld one can make real. It is going to be the in#aton, and during in#ation H H is negligible. Writing (2N"N #iN , and 3 " including a soft susy-breaking trilinear term 2A k N#c.c. 
(with A taken to be real) as well as I I soft susy-breaking mass terms, the potential is 1 1 <"< #k"N"# m( )N# m , G G 2 ( 2 G where
(438)
m( )"m!2kA #4k , (439) I m( )"m#2kA #4k . (440) I The parameters m , A and m are supposed to be generated by a gravity-mediated mechanism, G I ( which is the same as in the true vacuum, and it is supposed that the susy breaking scale is also the same. This is supposed to give generic values m &A &1 TeV for all of these parameters except m . G I ( The latter is supposed to vanish at the Planck scale, being generated by radiative corrections as described in a moment. The constant term < comes from some other sector of the theory, and it is supposed to dominate the potential. We take this real to be canonically normalized, which means that the original complex is (2 times the canonically normalized object.
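The field-dependent masses (439) and (440) are quadratic in the inflaton field, so the range of field values in which one of them turns negative follows from the ordinary quadratic formula. The Python sketch below illustrates this. It reads the soft mass in those equations as a single parameter (called `m_soft` here, an assumed name), takes the component with the minus sign in front of the trilinear term to be $N_1$, and uses made-up parameter values in units where the trilinear coefficient is 1.

```python
import math


def m1_sq(phi, k, a_k, m_soft):
    """Eq. (439): mass-squared of N_1 as a function of the inflaton value."""
    return m_soft ** 2 - 2.0 * k * a_k * phi + 4.0 * k ** 2 * phi ** 2


def m2_sq(phi, k, a_k, m_soft):
    """Eq. (440): mass-squared of N_2 as a function of the inflaton value."""
    return m_soft ** 2 + 2.0 * k * a_k * phi + 4.0 * k ** 2 * phi ** 2


def critical_values(k, a_k, m_soft):
    """Roots of m1_sq(phi) = 0; they are real only if 4 m_soft^2 <= a_k^2."""
    disc = 1.0 - 4.0 * m_soft ** 2 / a_k ** 2
    if disc < 0.0:
        raise ValueError("no real roots: N_1 is never destabilized")
    root = math.sqrt(disc)
    scale = a_k / (4.0 * k)
    return scale * (1.0 - root), scale * (1.0 + root)


# Illustrative (assumed) values, not the ones required by the model.
K, A_K, M_SOFT = 1.0e-3, 1.0, 0.3

phi_low, phi_high = critical_values(K, A_K, M_SOFT)
phi_mid = 0.5 * (phi_low + phi_high)
print(phi_low, phi_high, m1_sq(phi_mid, K, A_K, M_SOFT))
```

Between the two roots `m1_sq` is negative, so $N_1$ is unstable there, while `m2_sq` stays positive for positive field values.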
The true vacuum corresponds to 1 2"A /4k , (441) I (442) 1N 2"(A /2(2k)(1!4m/A , I I 1N 2"0 . (443) We ignored the tiny e!ect of m in working out the non-zero vevs. It is assumed that 4m is ( somewhat below A, so that I A &k1N 2&k1 2&1 TeV . (444) I To have the vevs at the axion scale, say 10 GeV, we require k&10\. Also, j should have a similar value, since j1N 2 will be the k parameter of the MSSM. The tiny couplings k and j are supposed to be products of several terms like (t/M ) where t is . the vev of a "eld that is integrated out. The structure of such terms may be enforced through discrete symmetries derived from string theory. The same terms can ensure Peccei}Quinn symmetry to su$cient accuracy, without actually invoking a global symmetry. In the example given [29], is charged under a Z as well as the Z already encountered, which forbids terms B up to d"15 in the superpotential. One must in any case forbid them up to dK8, to satisfy the constraint Eq. (168). During in#ation, the "elds N are trapped at the origin, and G (445) <"< #m . ( The "eld N is destabilized if lies between the values (446)
!"(A /4k)(1$(1!4m /A)&A /k . I I , I If m is positive the model gives ordinary hybrid in#ation ending at >, but if it is negative it ( gives inverted hybrid in#ation ending at \. We shall see that the radiative corrections actually give the latter case. The height of the potential is <&A /(k&10 GeV. The COBE normaliz I ation, Eq. (178) or Eq. (203), is therefore A "2.5;10\"n!1"e!VM , (447) I . where x,"n!1"N. This requires n to be completely indistinguishable from 1, "n!1"&10\. The corresponding in#aton mass, given by "n!1""2m< /M is ( . m &100(A /M ) M &eV . (448) ( I . . The loop correction generating m comes from the N , and their fermionic partner which has ( G mass-squared 4k . In the regime
Strictly speaking the derivation of this result holds only for Q
& !&A /k, and in this regime we should take Q&A /k to minimize the loop correction. I I Thus we are somewhat below the regime where the result is valid, but it hopefully gives a rough approximation. If so we are dealing with an inverted hybrid in#ation model. Somewhat remarkably, the magnitude m &kA agrees with the COBE normalization m & eV, within the uncer( I ( tainties of A and k. I Finally, we note that because the running of the in#aton mass is weak, its use is optional; instead of using it, we could set m "0, and generate the slope of the in#aton potential from the loop ( correction, Eq. (449), with Q"M . . 9. D-term in6ation D-term in#ation can preserve the #at directions of global susy (and in particular keep the in#aton potential #at) provided that one of the contributions to < contains a Fayet-Iliopoulos " term as in Eq. (294), and that all "elds charged under the Fayet-Iliopoulos ;(1) are driven to negligible values so that <"(g/2)m. This was "rst pointed out by Stewart [289], who exhibited a hybrid in#ation model which uses the F term to drive all of the charged "elds to zero. He considered only the tree-level potential without any de"nite proposal for its slope. Signi"cant progress came when BineH truy and Dvali [35] and Halyo [132] pointed out, in the context of a somewhat simpler tree-level potential, that the loop correction gives a well-de"ned slope. This lead to an explosion of interest in D-term in#ation [150,91,52,222,220,268,182,96,40]. 9.1. Keeping the potential yat We initially make the usual assumption that the "elds charged under the ;(1) vanish exactly. Then (452) < "(Re f )\gm . " In this tree-level potential, the only dependence of < on the "elds comes from the gauge kinetic " function f. It has non-renormalizable terms, and so does = that appears in the supposedly negligible F-term. If " ";M , only low-dimensional terms are dangerous, and they can be . eliminated using a suitable discrete symmetry [220,182]. 
Unfortunately, we shall "nd that " " is of The argument of the log in the loop correction is actually 2k /Q, so one might argue that the appropriate scale is Q&k &A which is much lower. But the e!ect of including k here is the same order of magnitude as the e!ect of I including the two-loop correction, and is presumably negligible. This is because making k small also makes the running slower. A slightly di!erent version of the model, which actually was the main focus of his paper, gives the F-term in#ation model mentioned in Section 8.3.4. A single-"eld model of in#ation with a Fayet}Iliopoulos D-term, and the in#aton charged under the relevant ;(1), had been considered earlier [50,51]. It gives the inverted quadratic potential considered in Section 6.4, and is viable only under the unlikely assumption that the in#aton charge is ;1. In any case it does not preserve the #at directions of global susy. We are here taking g to be a constant, and f"1 at the origin as in Eq. (334). Later we adopt the convention that (Re f )\ is absorbed into g, making the latter a function of the "elds.
order M , and may be bigger. This makes the non-renormalizable terms di$cult to control [182], . as well as those of K. (As well as directly a!ecting the potential, the latter can give a non-trivial kinetic term, which alters the potential after going to a canonically normalized "eld). We proceed on the assumption that non-renormalizable terms of = and f turn out to be negligible; in particular we assume f"1. With = being under control, and the "elds charged under the ;(1) exactly zero, the KaK hler potential K can have no e!ect, and the coe$cients j and j can be much smaller than < /M. This B . will be crucial, in view of the fact that the in#aton "eld is of order M . . In some versions of D term in#ation the charged "elds are not driven to zero. If their contribution to < is a signi"cant fraction of the total, the terms K in Eq. (338) (and similar ones " L from D terms involving other gauge groups under which they are charged) will generically spoil in#ation [40]. The conclusion seems to be that the charged "elds should be driven to su$ciently small values, even if they do not vanish. Finally, let us mention that, if the Fayet-Iliopoulos term is to come from string theory, see Eq. (297), the corresponding D-term scales like g J(Re s)\. The problem here is that, assuming that the D-term dominates over any other F-term, the potential during in#ation appears to prefer Re sPR and therefore < P0. This is the D-term in#ation equivalent of the dilaton runaway " problem that appears in string theories in the true vacuum. However, it has been argued [160] that the physics of gaugino condensation in 10-dimensional E E superstring theories is likely to be modi"ed during the in#ationary phase in such a way as to enhance the gaugino condensation scale. This may allow the dilaton to be stabilized by the F term [160], though one has to check that the latter does not generate dangerous supergravity corrections to the in#aton potential. 9.2. 
The basic model

At least one of the charged fields should have negative charge $q_n$, so that the D term is driven to zero in the vacuum (or at least to a value much smaller than $(g^2/2)\xi^2$, as in Section 7.6.1). One has to give such negatively charged fields couplings which drive them to small values during D-term inflation. The proposal of Refs. [289,35,132] is that every negatively charged field has a partner with all charges opposite. It can then couple to the inflaton in the F term, and acquire a positive mass-squared during inflation.

Suppose for simplicity that there is just one pair, $\phi_\pm$. The inflaton is supposed to be the radial part of an uncharged field $S$ ($\phi = \sqrt2\,|S|$). There is a term in the superpotential

$$W = \lambda S\phi_+\phi_- . \tag{453}$$

Since $\phi_\pm$ are going to be driven to zero, it will be enough to use the global susy expression for the D-term, giving

$$V = \tfrac12\lambda^2\phi^2\left(|\phi_-|^2 + |\phi_+|^2\right) + \lambda^2|\phi_+\phi_-|^2 + \tfrac12 g^2\left(|\phi_+|^2 - |\phi_-|^2 + \xi\right)^2 . \tag{454}$$

The global minimum is supersymmetry conserving, but the gauge group $U(1)$ is spontaneously broken,

$$\langle\phi\rangle = \langle\phi_+\rangle = 0 , \tag{455}$$

$$\langle\phi_-\rangle = \sqrt{\xi} . \tag{456}$$
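The structure of the potential (454) can be checked directly. The Python sketch below evaluates it for real field values, confirms that it vanishes at the vacuum (455)-(456) and equals $g^2\xi^2/2$ when the charged fields vanish, and estimates by finite differences the coefficient of $|\phi_-|^2$ around $\phi_\pm = 0$. Expanding (454) gives $\tfrac12\lambda^2\phi^2 - g^2\xi$ for that coefficient, so it changes sign as $\phi$ decreases. The parameter values (units with $\xi = 1$) and function names are illustrative assumptions.

```python
import math


def v_tree(phi, phi_plus, phi_minus, lam, g, xi):
    """Tree-level potential of Eq. (454) for real field values."""
    f_term = (0.5 * lam ** 2 * phi ** 2 * (phi_minus ** 2 + phi_plus ** 2)
              + lam ** 2 * (phi_plus * phi_minus) ** 2)
    d_term = 0.5 * g ** 2 * (phi_plus ** 2 - phi_minus ** 2 + xi) ** 2
    return f_term + d_term


def phi_minus_coefficient(phi, lam, g, xi, step=1.0e-3):
    """Finite-difference estimate of the |phi_-|^2 coefficient at phi_+ = phi_- = 0."""
    v_origin = v_tree(phi, 0.0, 0.0, lam, g, xi)
    v_shifted = v_tree(phi, 0.0, step, lam, g, xi)
    return (v_shifted - v_origin) / step ** 2


# Illustrative (assumed) couplings, in units where xi = 1.
LAM, G, XI = 1.0, 0.5, 1.0

v_vacuum = v_tree(0.0, 0.0, math.sqrt(XI), LAM, G, XI)
v_valley = v_tree(2.0, 0.0, 0.0, LAM, G, XI)
coeff_large_phi = phi_minus_coefficient(2.0, LAM, G, XI)
coeff_small_phi = phi_minus_coefficient(0.3, LAM, G, XI)
print(v_vacuum, v_valley, coeff_large_phi, coeff_small_phi)
```

At large $\phi$ the charged fields are held at the origin and the potential is flat in the $\phi$ direction; at small $\phi$ the negative coefficient signals the instability that ends inflation.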
However, if we minimize the potential, for fixed values of $\phi$, with respect to the other fields, we find that for $\phi$ bigger than

$$\phi_{\rm c} \equiv \frac{g}{\lambda}\sqrt{2\xi} \tag{457}$$

the minimum is at $\phi_+ = \phi_- = 0$. Thus, for $\phi > \phi_{\rm c}$ and $\phi_+ = \phi_- = 0$ the tree-level potential has a vanishing curvature in the $\phi$ direction and large positive curvature in the remaining two directions,

$$m_\pm^2 = \tfrac12\lambda^2\phi^2 \pm g^2\xi . \tag{458}$$

For $\phi > \phi_{\rm c}$, the tree-level value of the potential has the constant value $V = g^2\xi^2/2$.

This is a hybrid inflation model. At tree level, the potential $V(\phi)$ is perfectly flat, and its $\phi$ dependence comes from the loop correction. Supersymmetry is spontaneously broken, and inserting Eq. (458) into Eq. (327) gives

$$V = V_0\left(1 + \frac{g^2}{16\pi^2}\ln\frac{\lambda^2\phi^2}{2\mu^2}\right) , \qquad V_0 \equiv \frac{g^2}{2}\,\xi^2 , \tag{459}$$

where $\mu$ is the renormalization scale.

We can generalize the model by including more than one pair of fields $\phi_{n\pm}$, with charges $q_n$ and superpotential couplings $\lambda_n$. Then the one-loop potential becomes

$$V = V_0\left(1 + C\,\frac{g^2}{16\pi^2}\ln\frac{\lambda^2\phi^2}{2\mu^2}\right) , \qquad V_0 \equiv \frac{g^2}{2}\,\xi^2 , \tag{460}$$

where

$$C = \sum_n q_n^2 . \tag{461}$$

(In the log we took all $\lambda_n$ to have a common value $\lambda$, but this is not essential since $\lambda$ does not affect the slope of the potential.)

Since the $U(1)$ generated by string theory is anomalous, corresponding to $\sum_n q_n \neq 0$, there have to be some unpaired charges. If they are positive, they will be driven to zero and be irrelevant, and that is assumed in the paradigm under consideration. However, in weakly coupled string theory one actually expects unpaired negative charges which might ruin this paradigm [96]. That case is discussed in Section 9.3.

Inflation with this potential was discussed in Section 6.15. As noted there, slow-roll inflation will end when $\phi_{\rm c}$ is reached, or when it gives way to fast roll, whichever is sooner. In the latter case, fast roll begins when $\eta \sim 1$, at

$$\phi = \sqrt{\frac{Cg^2}{8\pi^2}}\,M_{\rm P} . \tag{462}$$

This is about the same as $\phi_{\rm c}$ given by Eq. (457), so it depends on the parameters which happens first. If fast roll begins first, inflation will continue for an e-fold or so, ending when the oscillation amplitude falls below $\phi_{\rm c}$.

According to Eq. (243), the inflaton field value is comparable with the Planck scale, and may be bigger. If we increase the slope of $V$, by assuming that a tree-level contribution dominates the loop correction [222], this will increase it (see Eq. (40)). The only hope of reducing it would be a cancellation between
the loop and a tree-level contribution, which seems unlikely over a range of $\phi$. As mentioned earlier, the large value of $\phi$ means that non-renormalizable terms in the potential and the kinetic function are not under good control, but we proceed on the assumption that they turn out to be harmless.

The COBE normalization is

$$\sqrt{\xi} = 8.5\times10^{15}\,{\rm GeV}\left(\frac{50\,C}{N}\right)^{1/4} . \tag{463}$$
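The statements about fast roll and about the size of the inflaton field can be illustrated with a short Python sketch. It keeps only the leading loop term of the potential (460), so that $V'/V \simeq a/\phi$ with $a = Cg^2/8\pi^2$ in units $M_{\rm P}=1$, and then evaluates $\eta$, the fast-roll point (462), and the field value a given number of e-folds before the end of inflation. The coupling $g = 0.7$, the choice $C = 1$, the use of the fast-roll point as the end of inflation, and the function names are assumptions made for illustration only.

```python
import math


def loop_coefficient(c_charge, g):
    """a = C g^2 / (8 pi^2), so that V'/V ~= a / phi for Eq. (460) with M_P = 1."""
    return c_charge * g ** 2 / (8.0 * math.pi ** 2)


def eta(phi, c_charge, g):
    """Slow-roll parameter eta = V''/V (M_P = 1), leading order in the loop term."""
    return -loop_coefficient(c_charge, g) / phi ** 2


def phi_fast_roll(c_charge, g):
    """Field value where |eta| = 1, as in Eq. (462)."""
    return math.sqrt(loop_coefficient(c_charge, g))


def efolds(phi, phi_end, c_charge, g):
    """N = integral of V/V' from phi_end to phi = (phi^2 - phi_end^2) / (2a)."""
    return (phi ** 2 - phi_end ** 2) / (2.0 * loop_coefficient(c_charge, g))


def phi_at_efolds(n_efolds, phi_end, c_charge, g):
    """Invert efolds() for the field value n_efolds before the end of inflation."""
    return math.sqrt(phi_end ** 2 + 2.0 * loop_coefficient(c_charge, g) * n_efolds)


# Illustrative (assumed) inputs.
C_CHARGE, G = 1.0, 0.7

phi_end = phi_fast_roll(C_CHARGE, G)
phi_50 = phi_at_efolds(50.0, phi_end, C_CHARGE, G)
print(phi_end, phi_50, eta(phi_50, C_CHARGE, G))
```

For a gauge coupling of order one, the field value 50 e-folds before the end comes out as a sizeable fraction of $M_{\rm P}$, which is the origin of the concern about non-renormalizable terms.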
The scale impose by COBE is clearly lower than the prediction Eq. (297) of weakly coupled string theory, which is a second worry for the model. Indeed, Eq. (297) requires
192p 50C 5.9;10 g " :10\ . Tr Q N 2.4;10
(464)
Such a value is unreasonable, since the dilaton during in#ation would presumably have to be far away from the true vacuum value, placing it outside the domain of attraction of that value. How can we get around this problem? The most obvious possibility is that weakly coupled string theory is replaced by something else, such as Horava}Witten M-theory, which might give a lower value for m. At the time of writing, it is not clear whether this is an open possibility or not [231,41]. Another possibility is to make m lower by decoupling its origin from string theories. But to avoid putting it in by hand, one should generate it in some low-energy e!ective theory after some degrees of freedom have been integrated out. But to do this, one has presumably to break supersymmetry by some F-terms present in the sector which the heavy "elds belong to and to generate the D-term by loop corrections. As a result, it turns out that 1D2;1F2, unless some "ne-tuning is called for, and large supergravity corrections to g appear again. Let us give an example. Consider the following superpotential where a ;(1) symmetry has been imposed [267]: ="jX(UM U !m)#M UM U #M UM U . (465) For jm;M,M, the vacuum of this model is such that 1 2"1 M 2"0 (i"1,2), where M and G G G
are the scalar components of the super"elds UM and U , respectively. Supersymmetry is broken G G G and F "!jm. This means that in the potential a term like <"(F M #h.c.) will appear. It 6 6 is easy to show that, integrating out the and M scalar "elds, induces a nonvanishing FayetG G Iliopoulos D-term
M F , 6 ln (466) mK 16p(M!M) M which is, however, smaller than F and in#ation, if any, is presumably dominated by the F-term. 6 Staying with the high value of m, one might consider increasing the COBE normalization by supposing that the slope of the potential is bigger than the loop contribution. For instance, it might come from a term m , generated by the F term [222]. In general, one has 100 < g "1.9 . (467) Tr Q M .
But even with the maximum allowed value of $V$, $g^2$ is still unreasonably low. This is a second problem for D-term inflation, though unlike the large-field problem it depends on details of the underlying string theory.

9.3. Constructing a workable model from string theory

The presence of the Fayet-Iliopoulos D-term Eq. (297) in weakly coupled string theory leads to the breaking of supersymmetry at one-loop order at a very high scale, the string scale. This option is generically unwelcome from the phenomenological point of view because it induces soft supersymmetry-breaking masses that are too large via gravity effects, $\tilde m \sim \xi/M_{\rm P}$. The standard solution to this puzzle is to give a nonvanishing vev to some of the scalar fields which are present in the string model and are negatively charged under the anomalous $U(1)$. In this way, the Fayet-Iliopoulos D-term is cancelled and supersymmetry is preserved. In the context of string theory, this procedure is called "vacuum shifting" since it amounts to moving to a point where the string ground state is stable. While maintaining the D- and F-flatness of the effective field theory, such vacuum shifting may have important consequences for the phenomenology of the string theory. Indeed, the vacuum shifting not only breaks the anomalous $U(1)$, but may also break some other gauge symmetries under which the fields which acquire a vev are charged. This is because the anomalous $U(1)$ is usually accompanied by a plethora of nonanomalous $U(1)$'s. In the true vacuum, the vacuum shifting can generate effective superpotential mass terms for vector-like states that would otherwise remain massless, or may even be responsible for the soft mass terms of squarks and sleptons at the TeV scale. In string theories the protection of supersymmetry against the effects of the anomalous $U(1)$ is extremely efficient.
If we now apply a sort of "minimal principle" [87,91] requiring that a successful scenario of D-term inflation should arise from "realistic" string models leading to the $SU(3)_C \otimes SU(2)_L \otimes U(1)_Y$ gauge structure at low energies, the cancellation of the Fayet-Iliopoulos D-term by the vacuum shifting mechanism may represent (and usually does) a serious problem. Indeed, one has to make sure that during inflation the Fayet-Iliopoulos D-term is not cancelled by one of the many scalar fields which are negatively charged under the anomalous $U(1)$ and are not coupled to the inflaton. This usually leads to the conclusion that a successful D-term inflationary scenario in string theory requires many inflatons to render the vacuum shifting mechanism inoperative, and it is clear that only a systematic analysis of flat directions in any specific model may answer these and similar questions. This requires the identification of possible inflatons and D- and F-flat directions for a large class of perturbative string vacua. This classification is a prerequisite to address systematically the issue of inflation in string theories as well as the phenomenological issues at low energy [57,146]. As an illustrative example of the possible complications one has to face in building up a successful model of D-term inflation in the framework of 4D string models [96], one may consider
Notice, however, that if the vacuum shifting in the true vacuum is not complete, because of the presence of some nonvanishing F-terms, it may give rise to interesting phenomenological implications. This is what happens for the model described in Section 7.6.1. A vector-like set of fields is one having zero total charge.
the massless spectrum of a compacti"cation on a Calabi}Yau manifold with Hodge numbers h , h , etc. The four-dimensional gauge group is SO(26);;(1). There are then h left-handed chiral supermultiplets transforming as (26, ()(1, !2() and h supermultiplets transforming as (26, !()(1, 2(). In this case the ;(1) is anomalous because h and h are not equal. Indeed, suppose that h !h '0. In such a case, out of the total h #h chiral supermultip lets, there are only 2 h left-handed chiral supermultiplets which may form h vector-like pairs under the ;(1) and give a vanishing contribution to Tr Q. The remaining h #h !2 h " h !h "elds will give a non-vanishing contribution to the Fayet-Iliopoulos D-term. Taking into account the multiplicity of the "elds, the one-loop D-term Eq. (297) is therefore given by g M 24 (h !h ) . m" . 2 192p (3
(468)
We suppose the model has a gauge singlet "eld S which will play the role of the in#aton. Further we assume that there is a discrete R-symmetry that ensures S-#atness. These assumptions are quite ad hoc and in a realistic model we would have to demonstrate the existence of such a "eld, but we use this simple example to illustrate another problem that must be overcome if one is to obtain a realistic string model of D-term in#ation. With this "eld one may try to construct an in#ationary potential. Gauge symmetries and the fact that h !h '0 impose that one can generate masses only for the h vectorlike combinations of the SO(26) singlet and non-singlet "elds via the couplings in the superpotential of the form ="jS[(26, (1/3) ) (26,!(1/3)#(1, !2(1/3) ) (1, 2(1/3)] .
(469)
Therefore only 2 h "elds get a mass j1S2 and become very massive during in#ation. This means that they decouple from the theory and do not contribute anymore to the Fayet-Iliopoulos term Eq. (468). On the other hand, the remaining (h !h ) "elds transforming as (26, () (1,!2() remain light because they cannot couple to the in#aton and give a contribution to Eq. (468), which remains, therefore, unchanged. The (h !h ) SO(26) singlet "elds with ;(1) charge !2(1/3, let us denote them by , are now available to cancel the anomalous D-term G because Q " "(0, as is expected if supersymmetry is not to be broken by the Fayet-Iliopoulos G G G D-term. However, this prevents one from implementing D-term in#ation because the scalar potential dependence on the "elds arises only through the anomalous D-term. The vacuum G expectation values of the "elds will rapidly #ow to cancel the D-term preventing in#ation from G occurring. This example illustrates the problem in implementing D-term in#ation in a string theory. It arises because the minimum of the potential does not generically break supersymmetry through the anomalous D-term and so there must be light "elds (here the ) with the appropriate ;(1) charge G
In this section the charges are represented by uppercase letters.
to cancel it. To implement D-term inflation these fields must acquire a mass for large values of $S$, but this was not possible in this example because the $\phi_i$ were protected by chirality from acquiring mass by coupling to the $S$ field. Thus, we conclude that it is crucial to consider all fields with non-trivial $U(1)$ quantum numbers when discussing the possible inflationary potential in the framework of string theories. We will now consider further examples to capture other possible aspects of D-term inflation in string theories [96]. For illustrative purposes, we will use the specific string models discussed in [55,97], whose space of flat directions was recently analysed in [57]. The emphasis will be on exploring the different possibilities that may be realized rather than proposing a working model of inflation. In so doing we will often restrict the analysis to some subset of the fields present in the model and ignore the rest. In view of what we concluded above, this is not consistent, but the examples that follow should be considered only as toy models attempting to capture some of the stringy characteristics one should expect when trying to construct a fully realistic model of D-term inflation in string-inspired scenarios. The presence of several (non-anomalous) additional $U(1)$ factors is a generic property of string models. For the discussion of D-term inflation, the relevant objects are thus no longer single elementary fields but rather multiple-field directions in field space along which the D-term potential of the non-anomalous $U(1)$'s vanishes [58]. These directions would be truly flat if an anomalous $U(1)$ (or some F-terms) were not present. To study whether a given direction remains flat in the presence of the anomalous $U(1)_A$, the important quantity is the anomalous charge $Q^A$ along the direction.
If the sign of this charge is opposite to that of the Fayet-Iliopoulos term, VEVs along the flat direction will adjust themselves to cancel the Fayet-Iliopoulos D-term and give a zero potential. If the charge has the same sign as the Fayet-Iliopoulos D-term, the potential along that direction rises steeply with increasing values of the field. The interesting case corresponds to zero anomalous charge, in which case the potential along the given direction is flat and equals, at tree level, $g_A^2\xi^2/2$. In that case, the direction can be the inflaton. The condition $Q^A = 0$, ensuring tree-level flatness of the inflaton potential, is not by itself sufficient. We must also require that the direction is stable for large values of the inflaton, that is, all non-inflaton masses-squared deep in the inflaton direction must be positive (or zero). However, the Fayet-Iliopoulos D-term in the scalar potential will give a negative contribution to the masses-squared of those fields which have a negative anomalous charge:
$$\delta m^2_{\phi_i} = g_A^2 Q_i^A \xi . \qquad (470)$$
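As a brief sketch of where Eq. (470) comes from, one can expand the anomalous D-term around vanishing field values. The block below keeps only the anomalous $U(1)_A$ contribution and assumes canonically normalized complex scalars $\phi_i$ with charges $Q_i^A$; these normalizations are assumptions made for illustration.

```latex
% Sketch only: anomalous D-term expanded to quadratic order in the fields,
% assuming canonical kinetic terms for the complex scalars \phi_i.
\begin{align}
V_D &\supset \frac{1}{2} g_A^2 \Big( \sum_i Q_i^A |\phi_i|^2 + \xi \Big)^2
     = \frac{1}{2} g_A^2 \xi^2
       + g_A^2 \xi \sum_i Q_i^A |\phi_i|^2
       + \mathcal{O}(|\phi|^4) , \\
\delta m^2_{\phi_i}
    &= \left. \frac{\partial^2 V_D}{\partial \phi_i \, \partial \phi_i^*} \right|_{\phi = 0}
     = g_A^2 Q_i^A \xi .
\end{align}
```

For $\xi > 0$ the shift is negative precisely for the fields with $Q_i^A < 0$, which is the destabilizing effect discussed in the text.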
To ensure that the masses-squared are positive in the end, one can use F-term contributions (to balance the negative FI-induced masses) coming from superpotential terms of the generic form
$$\delta W = \lambda I \Phi_+ \Phi_- , \qquad (471)$$
where $I$ stands for some product of fields that enter the inflaton direction, while the $\Phi_\pm$ do not. Fields of type $\Phi_+$ and $\Phi_-$ which couple to the inflaton direction in the superpotential terms get a large F-term mass, $\lambda\langle I\rangle$. Consider the simplest example, a toy model with two chiral fields $S_1$ and $S_2$ of opposite $U(1)$ charges, so that the direction $|S| \equiv |S_1| = |S_2|$ can play the role of the inflaton. Assume that deep in
this direction ($S \gg \sqrt{\xi}$) the masses of all other fields are positive (or zero) and thus no other VEVs are triggered. Then we can minimize the D-term scalar potential
$$V_D = \frac{1}{2} g_A^2 \Big[ Q_1^A \big(|S_1|^2 - |S_2|^2\big) + \sum_i Q_i^A |\phi_i|^2 + \xi \Big]^2 + \frac{1}{2} \sum_a g_a^2 \Big[ Q_1^a \big(|S_1|^2 - |S_2|^2\big) + \sum_i Q_i^a |\phi_i|^2 \Big]^2 \qquad (472)$$
[where $a = 1, 2, \ldots, n$ counts the additional D-term contributions of the non-anomalous $U(1)$'s] for $S_1$ and $S_2$ only. If $\xi = 0$, $|S_1| = |S_2|$ is flat and necessarily stable, as $V = 0$. For $\xi > 0$, however, the flat direction is slightly displaced and lies at
$$\delta S^2 \equiv |S_1|^2 - |S_2|^2 = -(g_A^2/G_{11})\, Q_1^A \xi , \qquad (473)$$
where $G_{ij} = g_A^2 Q_i^A Q_j^A + \sum_a g_a^2 Q_i^a Q_j^a$. This displacement is the result of the destabilization effect of $\xi$ referred to above and occurs when the fields in the inflaton direction carry anomalous charge: as the inflaton direction must have zero anomalous charge, the fields forming it have anomalous charges of opposite signs and one of them will get a negative mass-squared of the form Eq. (470). Taking into account this displacement, the value of the potential along the inflaton direction is, at tree level,
$$V_0 = \frac{1}{2} g_A^2 \xi^2\, \frac{1}{G_{11}} \sum_a g_a^2 (Q_1^a)^2 \equiv \frac{1}{2} g_A^2 \hat\xi^2 \le \frac{1}{2} g_A^2 \xi^2 . \qquad (474)$$
As noted in Section 9.1, $\hat\xi$ should be very close to $\xi$ in order not to spoil inflation. For a viable inflationary model we should ensure that the one-loop potential is appropriate to give a slow roll along the inflaton direction. Thus, we must consider the one-loop corrections proportional to the Yukawa couplings introduced in the terms of Eq. (471). The field-dependent masses-squared for the scalar components of the chiral fields $\Phi_\pm$ along the inflaton direction are
$$m_\pm^2 = \lambda^2\langle I\rangle^2 + g_A^2 Q_\pm^A \big(Q_1^A \delta S^2 + \xi\big) + \sum_a g_a^2 Q_\pm^a Q_1^a\, \delta S^2 = \lambda^2\langle I\rangle^2 + G_{1\pm}\,\delta S^2 + g_A^2 Q_\pm^A \xi \equiv \lambda^2\langle I\rangle^2 + g_A^2 a_\pm \xi , \qquad (475)$$
while the fermionic partners have masses-squared equal to $\lambda^2\langle I\rangle^2$. For large values of the field $\langle I\rangle$, the one-loop potential takes the form
$$32\pi^2\, \delta V = 2 g_A^2 (a_+ + a_-)\, \lambda^2\langle I\rangle^2\, \xi \left[ \log\frac{\lambda^2\langle I\rangle^2}{Q^2} - 1 \right] + g_A^4 (a_+^2 + a_-^2)\, \xi^2 \log\frac{\lambda^2\langle I\rangle^2}{Q^2} . \qquad (476)$$
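The large-field form of Eq. (476) can be checked numerically against the full one-loop sum. The sketch below assumes the standard Coleman-Weinberg expression with the $-3/2$ subtraction constant, and treats each chiral multiplet $\Phi_\pm$ as two real scalars of mass-squared $\lambda^2\langle I\rangle^2 + g_A^2 a_\pm \xi$ plus one Weyl fermion of mass-squared $\lambda^2\langle I\rangle^2$. The function names and the numerical inputs are illustrative assumptions, not taken from the text.

```python
import math


def cw_multiplet(M2, delta, Q2):
    """Exact one-loop contribution of one chiral multiplet (illustrative).

    Two real scalars with mass^2 = M2 + delta and one Weyl fermion
    (two degrees of freedom) with mass^2 = M2, renormalization scale^2 = Q2.
    Assumes V_1 = (1/64 pi^2) STr m^4 [ln(m^2/Q^2) - 3/2].
    """
    def f(m2):
        return m2 * m2 * (math.log(m2 / Q2) - 1.5)

    return 2.0 * (f(M2 + delta) - f(M2)) / (64.0 * math.pi ** 2)


def one_loop_exact(M2, a_plus, a_minus, gA2, xi, Q2):
    """Sum of the exact contributions of Phi_+ and Phi_-."""
    return sum(cw_multiplet(M2, gA2 * a * xi, Q2) for a in (a_plus, a_minus))


def one_loop_large_field(M2, a_plus, a_minus, gA2, xi, Q2):
    """Right-hand side of Eq. (476) divided by 32 pi^2, with M2 = lambda^2 <I>^2."""
    log = math.log(M2 / Q2)
    quadratic = 2.0 * gA2 * (a_plus + a_minus) * M2 * xi * (log - 1.0)
    logarithmic = gA2 ** 2 * (a_plus ** 2 + a_minus ** 2) * xi ** 2 * log
    return (quadratic + logarithmic) / (32.0 * math.pi ** 2)


if __name__ == "__main__":
    # Hypothetical inputs in units where xi = 1 and Q^2 = 1.
    args = (100.0, 0.3, -0.3, 0.5, 1.0, 1.0)
    print(one_loop_exact(*args), one_loop_large_field(*args))
```

With $a_+ + a_- = 0$ the term quadratic in $\lambda\langle I\rangle$ cancels between the two multiplets and only the logarithmic piece survives, which is the situation the surrounding discussion relies on.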
In writing this potential we are assuming for simplicity that kinetic mixing of different $U(1)$'s is absent. For this to be a consistent assumption the vanishing of ${\rm Tr}(Q^A Q^a)$ and ${\rm Tr}(Q^a Q^b)$ is a necessary condition.
In this more complicated model the scalar direction transverse to the inflaton gains a very large mass deep in the inflaton direction. In addition, the gauge boson corresponding to the broken $U(1)$ symmetry and one neutralino also become massive. These fields arrange themselves in a massive vector supermultiplet, degenerate even if $\xi \neq 0$, and their contributions to the one-loop potential along the inflaton direction cancel exactly. The potential of Eq. (476) can also be rewritten as an RG-improved tree-level potential with gauge couplings evaluated at the scale $\lambda\langle I\rangle$. The term quadratic in $\lambda\langle I\rangle$ would spoil the slow-roll condition necessary for successful inflation, but it drops out because
$$g_A^2 (a_+ + a_-)\,\xi = (G_{1+} + G_{1-})\,\delta S^2 + g_A^2 (Q_+^A + Q_-^A)\,\xi = -G_{1I}\,\delta S^2 - g_A^2 Q_I^A \xi \propto G_{11}\,\delta S^2 + g_A^2 Q_1^A \xi = 0 , \qquad (477)$$
where we have made use of the $U(1)$ invariance of $I\Phi_+\Phi_-$ to write the third expression, which vanishes by Eq. (473). The results just described for the simplest inflaton direction containing more than one field generalize to more complicated inflatons. One could have inflatons containing more than two elementary fields while still having only a one-dimensional flat direction. Another possibility is that the flat direction has more than one free VEV (multidimensional inflatons). It is straightforward to verify that the results obtained above for two mirror fields are generic, provided the inflaton does not contain some subdirection capable of compensating the Fayet-Iliopoulos D-term. As the next step in complexity one can examine the case in which, besides the inflaton VEVs $|S_1|$ and $|S_2|$, some other field $\phi_i$ is forced to take a VEV (this can be triggered by $\xi$ in the anomalous D-term of the potential or by $\delta S^2$ in any D-term). In general, the new VEV can induce further VEVs too. For simplicity, we assume that this chain of destabilizations ends with $\langle\phi_i\rangle$.
By minimizing the D-term potential, all VEVs are determined to be
$$\delta S^2 = |S_1|^2 - |S_2|^2 = -(g_A^2/\det G)\,(G_{ii} Q_1^A - G_{1i} Q_i^A)\,\xi , \qquad (478)$$
$$\langle|\phi_i|^2\rangle = -(g_A^2/\det G)\,(-G_{1i} Q_1^A + G_{11} Q_i^A)\,\xi , \qquad (479)$$
with $\det G = G_{11} G_{ii} - G_{1i}^2$. The tree-level potential along this direction is
$$V_0 = \frac{1}{2} g_A^2\, \frac{\xi^2}{\det G} \sum_{a,b} g_a^2 g_b^2\, Q_1^a Q_i^b \big(Q_1^a Q_i^b - Q_1^b Q_i^a\big) \le \frac{1}{2} g_A^2 \xi^2 . \qquad (480)$$
In this background, the masses-squared of the scalar components of $\Phi_\pm$ appearing in the superpotential Eq. (471) are
$$m_\pm^2 = \lambda^2\langle I\rangle^2 + g_A^2 Q_\pm^A \langle D_A\rangle + \sum_a g_a^2 Q_\pm^a \langle D_a\rangle \equiv \lambda^2\langle I\rangle^2 + g_A^2 a_\pm \xi , \qquad (481)$$
and again, one finds $a_+ + a_- = 0$.
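The minimization leading to Eqs. (473), (474) and (478)-(480) is a linear problem in the squared-VEV combinations, so it can be sketched in a few lines. The code below assumes that $V_D$ of Eq. (472) is quadratic in the variables $x_j$ (here $\delta S^2$ and $\langle|\phi_i|^2\rangle$), builds the matrix $G_{ij}$ defined below Eq. (473), and solves the stationarity condition $\sum_j G_{ij} x_j = -g_A^2 Q_i^A \xi$. The function name, argument layout and numerical charges are hypothetical, and the sketch does not check that the resulting $x_j$ have the sign required of a physical VEV.

```python
import numpy as np


def minimize_d_terms(g_A, g, Q_A, Q_a, xi):
    """Extremize the D-term potential over the combinations x_j (illustrative).

    g_A : anomalous U(1) gauge coupling.
    g   : length-m sequence of non-anomalous gauge couplings g_a.
    Q_A : length-n sequence of anomalous charges Q_j^A of the n directions.
    Q_a : (m, n) array of non-anomalous charges Q_j^a.
    xi  : Fayet-Iliopoulos term.
    Returns (G, x, V_min).
    """
    Q_A = np.asarray(Q_A, dtype=float)
    Q_a = np.atleast_2d(np.asarray(Q_a, dtype=float))
    g2 = np.asarray(g, dtype=float) ** 2
    # G_ij = g_A^2 Q_i^A Q_j^A + sum_a g_a^2 Q_i^a Q_j^a
    G = g_A ** 2 * np.outer(Q_A, Q_A) + Q_a.T @ (g2[:, None] * Q_a)
    # Stationarity condition: G x = -g_A^2 xi Q^A
    x = np.linalg.solve(G, -g_A ** 2 * xi * Q_A)
    D_A = Q_A @ x + xi          # anomalous D-term at the extremum
    D_a = Q_a @ x               # non-anomalous D-terms at the extremum
    V_min = 0.5 * g_A ** 2 * D_A ** 2 + 0.5 * np.sum(g2 * D_a ** 2)
    return G, x, V_min


if __name__ == "__main__":
    # Hypothetical charges: one inflaton direction plus one extra field,
    # two non-anomalous U(1) factors.
    G, x, V = minimize_d_terms(0.7, [0.5, 0.6], [1.0, -1.0],
                               [[1.0, 0.5], [-2.0, 1.0]], xi=1.0)
    print(x, V, 0.5 * 0.7 ** 2)
```

For a single direction ($n = 1$) the output reduces to Eqs. (473) and (474); for two directions it reproduces the $2\times2$ inversion behind Eqs. (478)-(480), and in both cases the minimum lies below $g_A^2\xi^2/2$.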
In doing so, a careful treatment of the possibility of kinetic mixing of different $U(1)$'s is required. The details of our analysis are modified in the presence of such mixing but the generic results are not changed.
Table 3
List of non-Abelian singlet fields with their charges under the U(1) gauge groups. The charges of these fields under U(1) are zero and not listed.

Field     Q     Q     Q     Q     Q     Q
S        -1     1     0     0    -2     2
S        -1     1     0     1     1     2
S        -1     1     0    -1     1     2
S        -1    -1     0     0    -2     2
S        -1    -1     0     1     1     2
S        -1    -1     0    -1     1     2
S         0     2     0     0     0     0
S         0     1    -3     0     0     0
S         0     1     3     0     0     0
To illustrate the above discussion, consider the following example of a string model [97] that satis"es the conditions required for D-term in#ation, at least when we restrict the analysis to a subset of the "elds. The ;(1) charges of these "elds are listed in the Table 3 (we follow the notation of Ref. [57] with charges rescaled). For every listed "eld S , a `mirrora "eld S exists with G G opposite charges. At trilinear order the superpotential is ="S (S S #S S #S #S S )#S (S S #S S #S S #S S ) .
(482)
The role of the in#aton direction can be played by 1S S 2, formed by "elds with zero anomalous charge. However, for this to be viable there should be no higher-order terms in the superpotential involving just the in#aton directions "elds (or terms involving just a single nonin#aton direction "eld) for these will spoil the F-#atness of the in#aton direction. Given that slow-roll is expected to end at values of the in#aton "eld not much smaller than M , see Eq. (243), . only very high dimensional terms will be acceptable in the superpotential. 1S S 2 must be invariant under continuous gauge symmetries and so the only symmetry capable of ensuring such F-#atness is a discrete R-symmetry. Unfortunately, we do not know whether the models considered have such a discrete R-symmetry and thus they may allow the dangerous terms. Henceforth, we will ignore this problem and assume the dangerous terms are absent. The rest of the "elds acquire large positive masses deep in the in#aton direction due to the Yukawa couplings in Eq. (482), guaranteeing the stability of the in#aton direction S"S "S . One-loop corrections to the in#aton potential proportional to S are absent and only the &m log S dependence remains, providing the slow-roll condition. However, the end of in#ation poses a problem for the present example: no set of VEVs for the selected "elds can give zero potential. As is well known, a #at direction (<"0) is always associated with an holomorphic, gauge invariant monomial built of the chiral "elds. To compensate the FI-term and give <"0, this monomial should have negative anomalous charge. However, in the considered subset Q "Q /2 and all holomorphic, gauge invariant monomials must have then Q "0. To circumvent this
(S )"(Q ;Q )" problem we enlarge the "eld subset by adding an extra "eld, S with Q ? (!4;0,1,0,0,!2). It is easy to see that, for example, the #at direction 11,5,6,10,132 can cancel the FI-term and give <"0. Other #at directions exist, but clearly all of them involve S . However, the superpotential Eq. (482) does not provide a large mass for S when we are deep in the #at direction. Unless higher-order terms in Eq. (482) provide a positive mass for S , the FI-term induces a destabilization of the in#aton direction and S is forced to take a VEV: 1S2"(!g /G )Qm , (483) where we use the de"nition G "g QQ# gQ?Q?. This is not a problem in itself because the GH G H ? ? G H rest of the "elds are forced to have zero VEVs and so the potential cannot relax to zero. The presence of additional ;(1) factors prevents the vacuum shift that was problematic for the example of Section 4. The value of the potential in the presence of a VEV for S is (484) <" g m , with g(Q? ) m " ? ? m . (485) G The masses of the rest of the "elds are also a!ected and read: m"j1I2#(g /G )(QG !QG )m , (486) ( G G G G where j are some of the Yukawa couplings in Eq. (482). G In general, when all the "elds in the model are included, the presence of the Fayet}Iliopoulos D-term will induce VEVs for the "elds with negative anomalous charge which are not forced to have zero VEV by F-term contributions. These non-zero VEVs will in turn induce, through other D-terms, non-zero VEVs for other "elds, even if they have positive anomalous charge. Finding all the VEVs requires the minimization of a complicated multi"eld potential that includes both F and D contributions. In many cases the "eld VEVs adjust themselves to give <"0 and no D-term in#ation is possible. In other cases, however, especially in the presence of additional ;(1) factors, there is a limited number of "elds that must necessarily take a VEV to cancel the Fayet-Iliopoulos D-term. 
If the inflaton direction provides a large F-term mass for them, cancellation of the FI-term is prevented. Even if many other fields are forced to take VEVs, no configuration exists giving $V = 0$ and D-term inflation can take place in principle. To determine if that is the case, one should minimize the effective potential for large values of the inflaton field and determine all the additional VEVs triggered by the FI-term. These VEVs, of order $\sqrt{\xi}$, will affect the details of the potential along the inflaton direction, both at tree level (offering the possibility of reducing the effective value of $\xi$) and at one loop, via their influence on the field-dependent masses of other fields.

9.4. D-term inflation and cosmic strings

Let us go back now to the basic model discussed in Section 9.2. The point we would like to comment on is the following [222]: when the field $\phi_-$ rolls down to its present-day value $\langle\phi_-\rangle = \sqrt{\xi}$ to terminate inflation, cosmic strings may be formed since the anomalous gauge group
$U(1)$ is broken to unity [150]. As is well known, stable cosmic strings arise when the manifold $M$ of degenerate vacua has a non-trivial first homotopy group, $\Pi_1(M) \neq 1$. The fact that cosmic strings may form at the end of hybrid inflationary models was already noticed in Ref. [151] in the context of global supersymmetric theories and in Ref. [213] in the context of supergravity theories. It has recently been shown [39] that (at least some of) the strings formed at the breaking of the anomalous $U(1)$ are local, in the sense that their energy per unit length can be localized in a finite region surrounding the string's core, even though this energy is formally logarithmically infinite. This happens because the axion field configuration may be made to wind around the strings so that any divergence must come from the region near the core instead of asymptotically. Moreover, as we have seen in the previous section, in realistic four-dimensional string models there are extra local $U(1)$ symmetries that can also be spontaneously broken by the D-term. This happens necessarily if there are no singlet fields charged under the anomalous $U(1)$ only. In such a case, there may arise local cosmic strings associated with extra $U(1)$ factors. In D-term inflation the string energy per unit length is given by $\mu = 2\pi\xi$. Cosmic strings forming at the end of D-term inflation are very heavy, and temperature anisotropies may arise both from the inflationary dynamics and from the presence of cosmic strings. From recent numerical simulations of the cosmic microwave background anisotropies induced by cosmic strings [7,8,256] it is possible to infer that this mixed-perturbation scenario [213] leads to the COBE-normalized value $\sqrt{\xi} = 4.7\times10^{15}$ GeV [150], which is of course smaller than the value obtained in the absence of cosmic strings.
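To give a feeling for how heavy these strings are, the tension $\mu = 2\pi\xi$ can be turned into the dimensionless combination $G\mu$. The sketch below is illustrative only: it assumes a reduced Planck mass $M_{\rm P} = 2.4\times10^{18}$ GeV with $G = 1/(8\pi M_{\rm P}^2)$, and it takes $\sqrt{\xi} = 4.7\times10^{15}$ GeV as the input scale. Both numbers are assumptions of the example, and the function names are hypothetical.

```python
import math

# Assumed reduced Planck mass in GeV, so that G = 1 / (8 pi M_P^2).
M_P_GEV = 2.4e18


def string_tension(sqrt_xi_gev):
    """Energy per unit length mu = 2 pi xi, in GeV^2."""
    return 2.0 * math.pi * sqrt_xi_gev ** 2


def g_mu(sqrt_xi_gev, m_planck_gev=M_P_GEV):
    """Dimensionless string tension G mu = mu / (8 pi M_P^2) = xi / (4 M_P^2)."""
    return string_tension(sqrt_xi_gev) / (8.0 * math.pi * m_planck_gev ** 2)


if __name__ == "__main__":
    # Assumed input: sqrt(xi) = 4.7e15 GeV.
    print(f"G mu = {g_mu(4.7e15):.2e}")
```

With these inputs $G\mu$ comes out just below $10^{-6}$, the range in which strings contribute appreciably to the microwave background anisotropy.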
Moreover, cosmic strings contribute to the angular spectrum an amount of order 75% in the simplest version of D-term inflation [150], which might render the angular spectrum, when both cosmic-string and inflation contributions are summed up, too smooth to be in agreement with present-day observations [7,8]. Thus, even though cosmic strings produced at the end of D-term inflation may play a fundamental role in the production of the baryon asymmetry [43], all the previous considerations and, above all, the fact that the value of $\sqrt{\xi}$ is further reduced with respect to the case in which cosmic strings are not present, would appear to exacerbate the problem of reconciling the value of $\sqrt{\xi}$ suggested by COBE with the value inspired by weakly coupled string theory. One has to remember, however, that the condition to produce cosmic strings is $\Pi_1(M) \neq 1$, and one must therefore consider the structure of the whole potential, i.e. all the F-terms and all the D-terms. When this is done, it turns out that, depending on the specific model, some or all of the (global and local) cosmic strings may disappear. In general, there can be models with anomalous $U(1)$ that have just global cosmic strings, just local cosmic strings, both global and local strings or, more importantly, no cosmic strings at all [50,51]. The latter case is certainly the most preferable, since the presence of cosmic strings makes the problem of reconciling the low COBE-normalized value of $\xi$ with the one suggested by string theory even worse. In the case in which the Fayet-Iliopoulos D-term is present in the theory from the very beginning because of an anomaly-free $U(1)$ symmetry, and not due to some underlying string theory, the COBE-normalized value of $\sqrt{\xi}$ is very natural and is not in conflict with the presence of cosmic strings. The only shortcoming seems to be a too smooth angular spectrum, because cosmic strings may provide most of the contribution to the angular spectrum.
If this problem is taken seriously and one wants to avoid the presence of cosmic strings, a natural solution is to assume that the $U(1)$ gauge group is broken before the onset of inflation, so that no cosmic strings will be produced when $\phi_-$ rolls down to its
ground state. This may be easily achieved by introducing a pair of vector-like (under ;(1)) "elds W and WM and two gauge singlets X and p with a superpotential of the form ="X(iWM W!M)#bpWM U #jSU U , (487) > > \ where M is some high-energy scale, presumably the grand uni"ed scale. It is easy to show that the scalar components of the two-vector super"elds acquire vacuum expectation values 1t2"1tM 2"M, and 1X2"1p2"0) which leave supersymmetry unbroken and D-term in#ation una!ected. In this example, cosmic strings are produced prior to the onset of in#ation and subsequently diluted. 9.5. A GUT model of D-term in-ation A D-term in#ationary scenario may be constructed within the framework of concrete supersymmetric grand uni"ed theories (GUTs) where realistic fermion masses are predicted and the doublet-triplet splitting problem is naturally solved by the pseudo-Goldstone boson mechanism in SU(6) [91]. The presence of the D-term is essential in order to generate vacuum expectation values and therefore simplify the structure of the superpotential. As a by-product, the model has a built-in in#ationary trajectory in the "eld space along which all F-terms are vanishing and only the associated ;(1) D-term is nonzero. In this case, the COBE-normalized scale (m&10 GeV appears more natural to accept since the the GUT scale is of the same order of magnitude, even though it must be put in by hand along with two similar mass scales M and M. This model gives a four-component in#aton (Section 4), instead of the usual one-component in#aton. Its predictions depend on the initial conditions as well as on the potential, but for a signi"cant range of initial conditions they will be the same as for the other D-term in#ation models. A problem is that the "eld values while cosmological scales leave the horizon are of order M , making it questionable if the "eld theory is really under control. . 
The model is based on the SU(6) supersymmetric GUT with one adjoint Higgs R and a number of fundamental Higgses H , HM , H , HM Y. Each of these fundamentals transforms as a doublet of a certain custodial SU(2) symmetry that is required to solve the hierarchy problem. The index A A"1,2 is the SU(2) -index. The H , HM carry unit charges opposite to m and are the ones that A compensate ;(1) D-term in the present Universe. The superpotential reads ="cTr R#(aR#aX#M)H HM Y#(aR#aX#M)H HM . (488) Minimizing both the D- and the F-terms we get the following supersymmetric vacuum which leaves the Standard Model S;(3) S;(2) ;(1) as unbroken gauge symmetry: ! *
m , H "HM Y"0 , H "HM G"d d G G 2 aM!aM aM!aM R" diag(1, 1, 1,!1,!1,!1), X"! . aa!aa aa!aa
(489)
Here $i, k = 1, 2, \ldots, 6$ are $SU(6)$ indices. The role of the $\Sigma$ vacuum expectation value is crucial since it leaves the $SU(3)_C \otimes SU(3)_L \otimes U(1)_Y$ symmetry unbroken; consequently it can cancel the masses of all
upper three or lower three components of the fundamentals. The fundamental vevs are SU(5) symmetric, so that the intersection gives the unbroken standard model symmetry group. In this vacuum the electroweak Higgs doublets from H , HM , H , HM are massless. This is an e!ect of custodial SU(2) symmetry. Indeed, since H and HM break one of the SU(3) subgroups to A SU(2) , their electroweak doublet components become eaten up Goldstone multiplets and cannot * get masses from the superpotential due to the Goldstone theorem. This forces the vevs of R and X to exactly cancel their mass terms and those of H , HM , H , HM due to the custodial symmetry. This solves the doublet-triplet splitting problem in a natural way [90]. Quarks and leptons of each generation are placed in a minimal anomaly free set of S;(6) group: 15-plet plus two 6 -plets per family. We assume that 6 form a doublet under SU(2) so that A A"1,2 is identi"ed as SU(2) index. The fermion masses are then generated through the couplings A (SU(6) and family indices are suppressed) HM ) 15 ) 6 #e (H ) H /M )15 ) 15, where M has to be K K understood as the mass of order (m of integrated-out heavy states. When the large vevs of H and HM are inserted, the additional, vectorlike under SU(5)-subgroup, states: 5-s from 15-s and 5 -s from 6 , become heavy and decouple. Low energy couplings are just the usual SU(5)-invariant Yukawa interactions of the light doublets from H and HM with the usual quarks and leptons. The relevant branch for in#ation in the "eld space is represented by the SU(6) D- and F-#at trajectory parameterized by the invariant Tr R. This corresponds to an arbitrary expectation value along the component R"diag(1, 1, 1,!1, !1,!1)S/(6 .
(490)
The key point here is that the above component has no self-interaction (i.e. ${\rm Tr}\,\Sigma^3 = 0$) and appears in the superpotential linearly. At a generic point of this moduli space the gauge $SU(6)$ symmetry is broken to $SU(3)\otimes SU(3)\otimes U(1)$. All gauge non-singlet Higgs fields get masses $O(S)$ and therefore, for large values of $S$, $S \gg \sqrt{\xi}$, they decouple. Part of them get eaten up by the massive gauge superfields. These are the components of $\Sigma$ transforming as $(3,\bar 3)$ and $(\bar 3,3)$ under the unbroken subgroup. All other Higgs fields get large masses from the superpotential. The massless degrees of freedom along the branch are therefore: two singlets $S$ and $X$, the massless $SU(3)\otimes SU(3)\otimes U(1)$ super-Yang-Mills multiplet and the massless matter superfields. By integrating out the heavy superfields, we can write down an effective low-energy superpotential by simply using holomorphy and symmetry arguments. This superpotential, as well as all gauge $SU(6)$ D-terms, is vanishing. Were it not for the $U(1)$ gauge symmetry, the branch parameterized by $S$ would simply correspond to a SUSY-preserving flat vacuum direction remaining flat to all orders in perturbation theory. The D-term, however, lifts this flat direction, taking an asymptotically constant value for arbitrarily large $S$ at tree level. This is because all Higgs fields with charges opposite to $\xi$ gain large masses and decouple, and $\xi$ cannot be compensated any more (notice that heavy fields decouple in pairs with opposite charges and therefore ${\rm Tr}\,Q$ over the remaining low-energy fields is not changed). As a result, the branch of interest is represented by two massless degrees of freedom $X$ and $S$, whose vevs set the mass scale for the heavy particles, and a constant tree-level vacuum energy density $V_0 = \frac{g^2}{2}\langle D\rangle^2 = \frac{g^2}{2}\xi^2$ which is responsible for inflation. This result can easily be rederived by explicit solution of the equations of motion along the inflationary branch. To do this, we can explicitly minimize all D- and F-terms subject to large
values of $S$ and $X$. The relevant part of the potential is

$$V = |F_{H'}|^2 + |F_{\bar H'}|^2 + \frac{g^2}{2} D^2 , \qquad (491)$$

since the remaining F- and D-terms are automatically vanishing as long as all other gauge-nonsinglet Higgses are zero. We would need to include them only if the minima of the potential Eq. (491) (subject to $S, X \gg \sqrt{\xi}$) were incompatible with such an assumption. However, for the branch of interest this turns out not to be the case. As with the simpler models that we considered earlier, the negatively charged fields that might drive $V$ to zero can acquire positive masses-squared from the F term. These fields come purely from the $H, \bar H, H', \bar H'$ superfields. These are the fragments $(1,3)$, $(1,\bar 3)$ and $(3,1)$, $(\bar 3,1)$ of the $H, \bar H'$ with masses-squared

$$M_+^2 \pm g^2\xi \qquad (492)$$

and

$$M_-^2 \pm g^2\xi , \qquad (493)$$

where

$$M_\pm \equiv \pm\frac{\alpha S}{\sqrt{6}} + aX + M , \qquad (494)$$

and the analogous fragments of the $H', \bar H$ with masses-squared

$$M_+'^2 \pm g^2\xi \qquad (495)$$

and

$$M_-'^2 \pm g^2\xi , \qquad (496)$$

where

$$M'_\pm \equiv \pm\frac{\alpha' S}{\sqrt{6}} + a'X + M' . \qquad (497)$$

For each of these four cases there are eight pairs of charged fields. When $M_\pm^2$ and $M_\pm'^2$ are both bigger than $g^2\xi$, there is inflation. Including the loop correction the potential is

$$V = \frac{g^2}{2}\xi^2\left[1 + \frac{3g^2}{16\pi^2}\ln\left(|M_+|\,|M_-|\,|M'_+|\,|M'_-|\right)\right] . \qquad (498)$$
(To obtain this expression, we added the four contributions given by Eq. (460), with $C = 8$ for each of them.) This potential is a function of four real fields, namely the real and imaginary parts of $S$ and $X$. As discussed in Section 4, there will in general be a family of non-equivalent inflationary trajectories. We are dealing with a four-component inflaton, and the predictions depend in general on the initial conditions. However, for a significant range of initial conditions, the inflaton trajectory after the observable Universe leaves the horizon will be roughly a straight line pointing towards the origin, in the space of the fields. If $\phi$ is the canonically-normalized field along the trajectory, the inflaton potential is then given by Eq. (460) with $C = 96$ (except for an insignificant change in $V_0$ coming from the constant ratio of $\phi$ and the argument of the log). From the estimate Eq. (243), one sees that in this case, when the observable Universe leaves the horizon, $\phi$ is at least of order $M_{\rm P}$ and maybe of order $10 M_{\rm P}$. One needs the former case to have any chance of keeping the field theory under control.

All other states either have vanishing charge (these are $X$, $\Sigma$ and the gauge fields), or have no inflaton-dependent mass but positive charge (these are matter fields).

Notice that in the usual hybrid inflationary scenarios inflation is terminated by the rolling down of a Higgs field coupled to the inflaton and the consequent phase transition with symmetry breaking. Whenever the vacuum manifold has a non-trivial homotopy, topological defects will form in much the same way as in the conventional thermal phase transition. Thus, the straightforward generalization of the hybrid scenario in the GUT context would result in the post-inflationary formation of unwanted magnetic monopoles. In the model proposed in [91] this disaster never happens, since the inflaton field is the GUT Higgs itself. The GUT symmetry is broken both during and after inflation, and the monopoles (even if present at the early stages) inevitably get inflated away. The unbroken symmetry group along the inflationary branch is $G_{\rm inf} = SU(3)\otimes SU(3)\otimes U(1) \supset SU(2)\otimes U(1)$, which gets broken to $G_{\rm post} = SU(3)\otimes SU(2)\otimes U(1)\otimes U(1)$ modulo the electroweak phase transition (the extra $U(1)$ factor is global). Since $\pi_2(G_{\rm inf}/G_{\rm post}) = 0$, no monopoles are formed.

The model described above demonstrates that D-term inflation may satisfy a sort of 'minimal principle' [87], which requires that any successful inflationary scenario should naturally arise from models which are entirely motivated by particle physics considerations and should not involve (usually complicated and ad hoc) sectors on top of the existing structures.
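The condition for remaining on the inflationary branch can be checked numerically. The following is a minimal sketch, not the authors' calculation: it assumes real field values, mass parameters of the form $M_\pm = \pm\alpha S/\sqrt{6} + aX + M$ (and the primed analogue), charged-scalar masses-squared $M_\pm^2 \pm g^2\xi$, and a tree-level energy density $V_0 = g^2\xi^2/2$. All function names and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def mass_parameters(S, X, alpha, a, M):
    """Return (M_plus, M_minus) with M_pm = +-alpha*S/sqrt(6) + a*X + M."""
    shift = alpha * S / math.sqrt(6.0)
    base = a * X + M
    return base + shift, base - shift


def charged_masses_squared(S, X, g, xi, couplings):
    """Masses-squared M_pm^2 +- g^2*xi for one set of charged fragments."""
    alpha, a, M = couplings
    result = []
    for m in mass_parameters(S, X, alpha, a, M):
        result.append(m * m + g * g * xi)
        result.append(m * m - g * g * xi)
    return result


def on_inflationary_branch(S, X, g, xi, unprimed, primed):
    """True when every charged scalar has a positive mass-squared."""
    spectrum = (charged_masses_squared(S, X, g, xi, unprimed)
                + charged_masses_squared(S, X, g, xi, primed))
    return min(spectrum) > 0.0


def tree_level_energy(g, xi):
    """Constant tree-level vacuum energy density V_0 = g^2 * xi^2 / 2."""
    return 0.5 * g * g * xi * xi


if __name__ == "__main__":
    # Hypothetical values, in units where the Planck mass is 1.
    g, xi = 0.5, 1.0e-5
    unprimed = (1.0, 1.0, 0.01)  # (alpha, a, M), illustrative only
    primed = (0.8, 1.2, 0.02)    # (alpha', a', M'), illustrative only
    print(on_inflationary_branch(1.0, 0.1, g, xi, unprimed, primed))
    print(tree_level_energy(g, xi))
```

For large $S$ every mass-squared is positive and the charged fields stay at zero. Near the origin the negative branch $M^2 - g^2\xi$ turns tachyonic, which is the usual way a hybrid-type model leaves the inflationary valley.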
If the gauge $U(1)$ is a stringy anomalous $U(1)$, it will be broken by the dilaton even if all other charged fields vanish. In this case the unbroken symmetry has to be understood as a global one.
Other ways of solving the monopole problem exist in previous papers [290,192,226,213].

10. Conclusion

In the face of increasingly accurate observations of the cosmic microwave background anisotropy and of the galaxy distribution, slow-roll inflation seems to provide the only known origin for structure in the Universe. In this review we have seen how to build models of inflation, and test them against observation. What is the point of such an exercise?

To address this question, one needs to understand what is meant by a model of inflation. One can think of a model as something analogous to a building. It has an outer shell, which is visible to the casual observer, but hopefully also something inside. The shell is a specification of the form of the inflationary potential. In a single-field model the potential depends only on the inflaton field, while in a hybrid model it depends on one or more additional fields. Observation, notably through the spectral index of the density perturbation, can discriminate sharply between different shells. Most, and perhaps all, of the present zoo of shells will be rejected by observations in the next 10-15 years, culminating with the Planck satellite that will give an essentially complete measurement of the cmb anisotropy. One can imagine that eventually just one basic form for the shell is singled out by the community, which by virtue of its intrinsic beauty and its accurate description of the observations is likely to be the one chosen by Nature. Then, in a sense, there will be a consensus about the origin of all structure in the Universe. One will have arrived at the rather boring conclusion that it probably comes from a certain scalar field potential!

Things are very different when we come to consider the interior of the shell. Here, one recognizes that the inflationary potential is part of an extension of the Standard Model that is supposed to describe the fundamental interactions at the level of field theory. The field theory description is, hopefully, an approximation to some more fundamental theory like weakly coupled string theory or Horava-Witten M-theory. Although different interiors generally have different shells, that is not inevitable, as we have seen in more than one example.

At this point, inflation model-building becomes part of the enterprise that has occupied the particle physics community for more than two decades: to find the extension of the Standard Model that has been chosen by Nature. Because there is so little guidance from observation, this enterprise has been driven by theoretical considerations to an extent that is unprecedented in the history of science. In particular, the rich structure of supersymmetry is almost always assumed because it seems to be the only way of avoiding a certain type of extreme fine-tuning. In the foreseeable future we shall find out whether supersymmetry and other theoretical structures have been chosen by Nature, and therefore whether pure thought has successfully pulled so far ahead of observation. Whether positive or negative, this resolution will surely be a permanent landmark in the history of the human intellect.
Assuming that current ideas are basically correct, one still has to ask to what extent it will ever be possible to discriminate between different fundamental theories. Observation by itself provides, so far, only a few numbers relevant to this purpose, together with some upper and lower limits. Among them are the parameters of the Standard Model and, if one accepts the increasingly strong evidence, one or two numbers relating to the neutrino masses. There is also strong evidence for non-baryonic dark matter, which probably has to be in the form of one or more as-yet undiscovered particle species. And finally, coming to the concern of this review, there is the magnitude of the spectrum of the primeval density perturbation, measured on the scales explored by COBE. Among the quantities with crucial upper or lower limits one might mention, on the particle physics side, the Higgs masses, neutrino masses and mixing angles, the proton lifetime and the electric dipole moment of the neutron. As we have seen, one should add to these the limit on the departure from scale invariance represented by the result $|1 - n| < 0.2$, and the upper limit of order 50% on the relative contribution of gravitational waves to the spectrum of the cmb anisotropy. These lists are incomplete, but they serve to explain the role of inflation. It will add to the precious collection of numbers and limits that guide us in the search for what lies beyond the Standard Model. Possibly there will even be a non-trivial function, $n(k)$, that requires explanation.
This is the usual viewpoint but one can vary it. Maybe there is only one mathematically consistent theory that gives anything resembling physical reality, in which case we have in principle little need of observation. Maybe the usual assumption that there are many possible theories is correct, but many or all of them have been realized by Nature in different parts of the universe, that may or may not be connected with the homogeneous Universe around us. These variations make no difference for the present purpose.
Analogously with the situation concerning the outer shell of a model of inflation, the hope is that the community will eventually be able to agree that some model of the fundamental interactions is likely to be the one that Nature has chosen, by virtue of its intrinsic beauty and accurate agreement with the few numbers provided by observation. Because the numbers are few, this would hardly be possible at the level of a field theory, but it might be possible at the level of something like string theory, where there are essentially no free parameters and everything is dictated by group theoretic and topological considerations.

With this perspective, let us look at some of the models of inflation that are presently under consideration. As we have discussed at length, supersymmetry is both a blessing and a curse for inflation model-building. It is a blessing, primarily because it allows one to understand the existence of scalar fields. As a bonus, it can practically eliminate the quartic term in the inflaton potential, which would normally spoil inflation. It is a curse, because in a generic supergravity theory all scalar fields have masses that are too big to support inflation. Let us recall ways of handling this problem.

According to supergravity, the potential is the sum of an F-term and a D-term. In most models the F-term dominates and we consider them first. With an F-term of generic form, the inflaton mass is too big. One can suppose that it is suppressed by an accidental cancellation, but one can instead invoke a non-generic form, which guarantees the suppression. Such a form can emerge from weakly coupled heterotic string theory, though probably not from Horava-Witten M-theory. Alternatively, one can suppose that while the inflaton mass is indeed unsuppressed at the Planck scale, quantum corrections drive it to a small value at lower scales so as to permit inflation after all. At the present time this 'running mass' model looks quite attractive.
A different strategy is to suppose that a Fayet-Iliopoulos D-term dominates, with the charged fields driven to zero. These models have received a lot of attention because at least in the simplest versions they have two remarkable features. One is that supergravity corrections to the inflaton mass are absent. The other is that there is an accurate prediction for the spectral index, $n = 0.96$ to $0.98$, which will eventually be testable. Further investigation, though, has revealed a serious problem. In contrast with the F-term models, the inflaton field value has to be at least of order $M_{\rm P}$. As a result, one has gained control of the inflaton mass, only to be in danger of losing it for the quartic and higher terms of the potential. In string theory there are two additional problems. One is the existence of fields which are liable to drive the D-term to zero. The other is that the predicted magnitude of the cmb anisotropy is far higher than the COBE measurement. It is fair to say that D-term inflation is under considerable pressure at the moment.

The predictions of different models for the spectral index $n$, and for its scale dependence, are summarized in Tables 1 and 2. Remarkably, the eventual accuracy $\Delta n \sim 0.01$ offered by the Planck satellite is just what one might have specified in order to distinguish between various models, or at least between their various shells. At the most extravagant, one might have asked for $\Delta n \sim 10^{-3}$.

In summary, observation will discriminate strongly between models of inflation during the next 10 or 15 years. By the end of that period, there may be a consensus about the form of the inflationary potential, and at a deeper level we may have learned something valuable about the nature of the fundamental interactions beyond the Standard Model. We shall also have confirmed, or practically rejected, the remarkable hypothesis that inflation is responsible for structure in the Universe.
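The size of the spectral index expected from a loop-dominated, logarithmic potential can be illustrated with a short calculation. The sketch below assumes a potential $V = V_0[1 + \beta\ln(\phi/Q)]$ with $\beta \ll 1$, the standard slow-roll expressions $\epsilon = \frac{1}{2}M_{\rm P}^2 (V'/V)^2$, $\eta = M_{\rm P}^2 V''/V$, $n - 1 = 2\eta - 6\epsilon$, and a negligible field value at the end of inflation. The coefficient `beta`, the function name and the choice of $N$ are illustrative assumptions, not values taken from the text.

```python
def log_potential_slow_roll(beta, N):
    """Leading-order slow-roll quantities for V = V_0 * (1 + beta*ln(phi/Q)).

    Assumes beta << 1 (so V is replaced by V_0 in denominators) and neglects
    the field value at the end of inflation. phi is in Planck units.
    """
    phi_sq = 2.0 * beta * N              # from N = phi^2 / (2*beta)
    epsilon = 0.5 * beta ** 2 / phi_sq   # (1/2) * (V'/V)^2 with V'/V = beta/phi
    eta = -beta / phi_sq                 # V''/V = -beta/phi^2
    n = 1.0 + 2.0 * eta - 6.0 * epsilon  # spectral index
    return {"phi": phi_sq ** 0.5, "epsilon": epsilon, "eta": eta, "n": n}


if __name__ == "__main__":
    # Hypothetical inputs: a small loop coefficient and two e-fold numbers.
    for N in (25, 50):
        result = log_potential_slow_roll(beta=0.01, N=N)
        print(N, round(result["n"], 4), round(result["phi"], 3))
```

In this approximation $n \simeq 1 - 1/N$ up to a correction of order $\beta/N$, so values of $N$ between about 25 and 50 give a spectral index close to the quoted range. The same relation $\phi^2 \simeq 2\beta N M_{\rm P}^2$ shows why a large loop coefficient pushes the field value towards the Planck scale.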
11. Postscript

At the final proof-reading, observation is beginning to pin down the cosmological parameters, and therefore the spectral index. A preliminary estimate [298] is $|n - 1| < 0.05$. Looking at Table 1, this would rule out a potential of the form $V = V_0(1 - c\phi^3)$, and almost rule out one of the form $V = V_0(1 - c\phi^4)$. The latter case is practically equivalent to the form chosen for the first viable model of inflation, Eq. (186), so that form is almost ruled out as well. For a number of other forms of the potential (Table 2) the preliminary result for $n$ places a non-trivial lower limit on $N$, the number of e-folds of inflation occurring after cosmological scales leave the horizon. It seems that we are already entering the promised land, the golden age of cosmology!
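The comparison made in the postscript can be reproduced with the small-field slow-roll result for an inverted power-law potential. The sketch below assumes potentials of the form $V = V_0(1 - c\phi^p)$ with $p > 2$, for which $n - 1 \simeq -\frac{2}{N}\frac{p-1}{p-2}$ when the field value at horizon exit is much smaller than at the end of inflation. The choice $N = 50$ and the function name are assumptions made for illustration.

```python
def spectral_index_inverted(p, N):
    """Spectral index for V = V_0*(1 - c*phi**p), p > 2, small-field limit."""
    if p <= 2:
        raise ValueError("the formula holds only for p > 2")
    return 1.0 - 2.0 * (p - 1) / ((p - 2) * N)


if __name__ == "__main__":
    bound = 0.05  # preliminary limit |n - 1| < 0.05 quoted in the text
    for p in (3, 4):
        n = spectral_index_inverted(p, N=50)  # N = 50 is an assumed value
        print(p, round(n, 3), abs(n - 1.0) < bound)
```

With $N = 50$ the cubic case misses the bound by a clear margin while the quartic case misses it only narrowly, which matches the wording "rule out" and "almost rule out". A somewhat larger $N$ would bring the quartic case back inside the limit.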
Acknowledgements

DHL is grateful to CfPA and LBL, Berkeley, for the provision of financial support and a stimulating working environment when this work was started. He is indebted to Ewan Stewart and Andrew Liddle for long-standing collaborations, and to David Wands for many useful conversations. He has also received valuable input from Mar Bastero-Gil, Laura Covi, Mary Gaillard, Andrei Linde, Hitoshi Murayama, Hans-Peter Nilles, Burt Ovrut, Graham Ross and Subir Sarkar. AR is grateful to the Theoretical Astrophysics group at Fermilab, where this work was initiated, for the incomparably stimulating atmosphere. In particular, he is indebted to Scott Dodelson, Will Kinney and Rocky Kolb for many stimulating conversations and for continuously spurring his efforts. He is also grateful to Michael Dine, Gia Dvali, Jose Ramon Espinosa, Steve King, Andrei Linde and Graham Ross for enjoyable collaborations. DHL acknowledges support from PPARC and NATO grants, and from the European Commission under the Human Capital and Mobility programme, contract No. CHRX-CT94-0423.
References

[1] F.C. Adams, K. Freese, Phys. Rev. D 43 (1991) 353.
[2] F.C. Adams et al., Phys. Rev. D 47 (1993) 426.
[3] J.A. Adams, G.G. Ross, S. Sarkar, Phys. Lett. B 391 (1997) 271.
[4] J.A. Adams, G.G. Ross, S. Sarkar, Nucl. Phys. B 503 (1997) 405.
[5] I. Affleck, M. Dine, N. Seiberg, Nucl. Phys. B 256 (1985) 557.
[6] A. Albrecht, P.J. Steinhardt, Phys. Rev. Lett. 48 (1982) 1220.
[7] B. Allen, R.R. Caldwell, E.P.S. Shellard, A. Stebbins, S. Veeraraghavan, FERMILAB-PUB-97-334-A preprint.
[8] B. Allen, R.R. Caldwell, S. Dodelson, L. Knox, E.P.S. Shellard, A. Stebbins, Phys. Rev. Lett. 79 (1997) 2624.
[9] L. Alvarez-Gaumé, M. Claudson, M. Wise, Nucl. Phys. B 207 (1982) 96.
[10] U. Amaldi, W. de Boer, H. Furstenau, Phys. Lett. B 260 (1991) 447.
[11] G.W. Anderson, A. Linde, A. Riotto, Phys. Rev. Lett. 77 (1996) 3716.
[12] I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos, G. Dvali, Phys. Lett. B 436 (1998) 257.
[13] I. Antoniadis, K.S. Narain, T.R. Taylor, Phys. Lett. B 267 (1991) 37.
[14] N. Arkani-Hamed, S. Dimopoulos, G. Dvali, Phys. Lett. B 429 (1998) 263.
[15] N. Arkani-Hamed, H. Murayama, Phys. Rev. D 56 (1997) 673.
[16] N. Arkani-Hamed, M. Dine, S.P. Martin, Phys. Lett. B 431 (1998) 329.
[17] J. Atick, L. Dixon, A. Sen, Nucl. Phys. B 292 (1987) 109.
[18] J. Bagger et al., Phys. Rev. D 55 (1997) 3188 for a review of the phenomenological signals of gauge mediation.
[19] D. Bailin, A. Love, Supersymmetric Gauge Field Theory and String Theory, IOP, Bristol, 1994.
[20] T. Banks, L. Dixon, D. Friedan, E. Martinec, Nucl. Phys. B 299 (1988) 613.
[21] T. Banks et al., Phys. Rev. D 52 (1995) 3548.
[22] J.M. Bardeen, Phys. Rev. D 22 (1980) 1882.
[23] J.M. Bardeen, P.S. Steinhardt, M.S. Turner, Phys. Rev. D 28 (1983) 679.
[24] J.M. Bardeen, J.R. Bond, D. Salopek, in: A. Coley, C. Dyer, B. Tupper (Eds.), Proc. 2nd Canadian Conf. on General Relativity and Relativistic Astrophysics, Toronto, Canada, World Scientific, Singapore, 1988.
[25] J.M. Bardeen, J.R. Bond, G. Efstathiou, Astrophys. J. 321 (1990) 28.
[26] T. Barreiro et al., hep-ph/9602263.
[27] T. Barreiro, B. de Carlos, E.J. Copeland, Phys. Rev. D 58 (1998) 083513.
[28] J.D. Barrow, A.R. Liddle, P. Parsons, Phys. Rev. D 50 (1994) 7222.
[29] M. Bastero-Gil, S.F. King, Phys. Lett. B 423 (1998) 27.
[30] M. Bastero-Gil, S.F. King, hep-ph/9801451.
[31] M. Bastero-Gil, S.F. King, hep-ph/9806477.
[32] M.C. Bento, O. Bertolami, Phys. Lett. B 384 (1996) 98.
[33] Z. Berezhiani, Z. Tavartkiladze, Phys. Lett. B 409 (1997) 220.
[34] P. Binétruy, M.K. Gaillard, Phys. Rev. D 34 (1986) 3069.
[35] P. Binétruy, G. Dvali, Phys. Lett. B 388 (1996) 241.
[36] P. Binétruy, E. Dudas, Phys. Lett. B 389 (1996) 503.
[37] P. Binétruy, M.K. Gaillard, Y. Wu, Nucl. Phys. B 493 (1997) 27; Phys. Lett. 412 (1997) 228.
[38] P. Binétruy et al., Phys. Lett. B 403 (1997) 38.
[39] P. Binétruy, C. Deffayet, P. Peter, Phys. Lett. B 441 (1998) 52.
[40] P. Binétruy, G. Dvali, A. Riotto, in preparation.
[41] P. Binétruy, C. Deffayet, E. Dudas, P. Ramond, Phys. Lett. B 441 (1998) 163.
[42] R. Bousso, A. Linde, Phys. Rev. D 58 (1998) 083503.
[43] R. Brandenberger, A. Riotto, Phys. Lett. B 445 (1998) 323.
[44] A. Brignole, L.E. Ibanez, C. Munoz, Nucl. Phys. B 422 (1994) 125.
[45] R. Brustein, P.J. Steinhardt, Phys. Lett. B 302 (1993) 196.
[46] M. Bucher, A.S. Goldhaber, N. Turok, Phys. Rev. D 52 (1995) 3314.
[47] E.F. Bunn, A.R. Liddle, M. White, Phys. Rev. D 54 (1996) 5917R.
[48] G.L. Cardoso, B.A. Ovrut, Phys. Lett. B 298 (1993) 292.
[49] B.J. Carr, J.H. Gilbert, J.E. Lidsey, Phys. Rev. D 50 (1994) 4853.
[50] J.A. Casas, C. Munoz, Phys. Lett. B 216 (1989) 37.
[51] J.A. Casas, J.M. Moreno, C. Munoz, M. Quiros, Nucl. Phys. B 328 (1989) 272.
[52] J.A. Casas, G.B. Gelmini, Phys. Lett. B 410 (1997) 36.
[53] J.A. Casas, G.B. Gelmini, Phys. Lett. B 410 (1997) 36.
[54] J.A. Casas, G.B. Gelmini, A. Riotto, to appear.
[55] S. Chaudhuri, G. Hockney, J. Lykken, Nucl. Phys. B 469 (1996) 357.
[56] K. Choi, J.E. Kim, Phys. Lett. B 154 (1985) 393.
[57] G. Cleaver, M. Cvetič, J.R. Espinosa, L. Everett, P. Langacker, Nucl. Phys. B 525 (1998) 3.
[58] G. Cleaver, M. Cvetič, J.R. Espinosa, L. Everett, P. Langacker, to appear.
[59] A.G. Cohen, D. Kaplan, A.E. Nelson, hep-th/9803132.
[60] E.J. Copeland, A.R. Liddle, D.H. Lyth, E.D. Stewart, D. Wands, Phys. Rev. D 49 (1994) 6410.
[61] G.D. Coughlan, R. Holman, P. Ramond, G.G. Ross, Phys. Lett. B 140 (1984) 44.
[62] L. Covi, D.H. Lyth, L. Roszkowski, hep-ph/9809310.
[63] L. Covi, D.H. Lyth, Phys. Rev. D 59 (1999) 063515.
[64] R.L. Davis et al., Phys. Rev. Lett. 69 (1992) 1856; ibid. 70 (1993) 1733.
[65] J.-P. Derendinger, C.A. Savoy, Nucl. Phys. B 237 (1984) 307.
[66] L. Dixon, V. Kaplunovsky, J. Louis, Nucl. Phys. B 329 (1990) 27.
[67] L. Dixon, V. Kaplunovsky, J. Louis, Nucl. Phys. B 355 (1991) 649.
[68] N. Deruelle, K. Gundlach, D. Langlois, Phys. Rev. D 45 (1992) 3301.
[69] N. Deruelle, K. Gundlach, D. Langlois, Phys. Rev. D 46 (1992) 5337.
[70] M.S. Turner, Phys. Rev. D 44 (1991) 3737.
[71] K.R. Dienes, Phys. Rep. 287 (1997) 447.
[72] A. Vilenkin, Phys. Rev. D 57 (1998) 7069.
[73] S. Dimopoulos, S. Raby, Nucl. Phys. B 219 (1983) 479.
[74] S. Dimopoulos, G. Dvali, R. Rattazzi, Phys. Lett. B 410 (1997).
[75] E.T. Vishniac, K.A. Olive, D. Seckel, Nucl. Phys. B 289 (1987) 717.
[76] A useful review of these questions is given by M. Dine, hep-th/9207045.
[77] M. Dine, Supersymmetry phenomenology, hep-ph/9612389 and refs. therein.
[78] M. Dine, W. Fischler, M. Srednicki, Nucl. Phys. B 189 (1981) 575.
[79] M. Dine, W. Fischler, Phys. Lett. B 110 (1982) 227.
[80] M. Dine, M. Srednicki, Nucl. Phys. B 202 (1982) 238.
[81] M. Dine, W. Fischler, Nucl. Phys. B 204 (1982) 346.
[82] M. Dine, W. Fischler, D. Nemeschansky, Phys. Lett. B 136 (1984) 169.
[83] M. Dine, A.E. Nelson, Y. Nir, Y. Shirman, Phys. Rev. D 53 (1996) 2658.
[84] M. Dine, N. Seiberg, E. Witten, Nucl. Phys. B 289 (1987) 585.
[85] M. Dine, I. Ichinose, N. Seiberg, Nucl. Phys. B 293 (1987) 253.
[86] M. Dine, A.E. Nelson, Y. Nir, Y. Shirman, Phys. Rev. D 53 (1996) 2658.
[87] M. Dine, A. Riotto, Phys. Rev. Lett. 79 (1997) 2632.
[88] G. Dvali, Q. Shafi, R. Schaefer, Phys. Rev. Lett. 73 (1994) 1886.
[89] G. Dvali, hep-ph/9605445.
[90] G. Dvali, S. Pokorski, Phys. Rev. Lett. 78 (1997) 807.
[91] G. Dvali, A. Riotto, Phys. Lett. B 417 (1998) 20.
[92] J. Ellis, D.V. Nanopoulos, K.A. Olive, K. Tamvakis, Phys. Lett. B 127 (1983) 331.
[93] J. Ellis, S. Kelley, D.V. Nanopoulos, Phys. Lett. B 249 (1990) 441; B 260 (1991) 131.
[94] K. Enqvist, D.V. Nanopoulos, Nucl. Phys. B 252 (1985) 508.
[95] K. Enqvist, D.V. Nanopoulos, M. Quiros, C. Kounnas, Nucl. Phys. B 262 (1985) 538.
[96] J.R. Espinosa, A. Riotto, G.G. Ross, Nucl. Phys. B 531 (1998) 461.
[97] A. Faraggi, Phys. Lett. B 278 (1992) 131.
[98] A.E. Faraggi, J.C. Pati, Nucl. Phys. B 526 (1998) 21.
[99] P. Fayet, Nucl. Phys. B 90 (1975) 104.
[100] P. Fayet, J. Iliopoulos, Phys. Lett. B 51 (1974) 461.
[101] S. Ferrara, D. Lüst, S. Theisen, Phys. Lett. B 233 (1989) 147.
[102] A. Font, L. Ibáñez, D. Lust, F. Quevedo, Phys. Lett. B 245 (1990) 401.
[103] A. Font, L. Ibáñez, F. Quevedo, A. Sierra, Nucl. Phys. B 331 (1990) 421.
[104] K. Freese, J. Frieman, A.V. Olinto, Phys. Rev. Lett. 65 (1990) 3233.
[105] J.N. Fry, Y. Wang, Phys. Rev. D 46 (1992) 3318.
[106] M.K. Gaillard, H. Murayama, K.A. Olive, Phys. Lett. B 355 (1995) 71.
[107] M.K. Gaillard, D.H. Lyth, H. Murayama, Phys. Rev. D 58 (1998) 123505.
[108] J. Garcia-Bellido, Phys. Lett. B 418 (1998) 252.
[109] J. Garcia-Bellido, D. Wands, Phys. Rev. D 52 (1995) 6739.
[110] J. Garcia-Bellido, D. Wands, Phys. Rev. D 53 (1996) 5437.
[111] J. Garcia-Bellido, A.R. Liddle, D.H. Lyth, D. Wands, Phys. Rev. D 52 (1995) 6750.
[112] J. Garcia-Bellido, A. Linde, D. Wands, Phys. Rev. D 54 (1996) 6040.
[113] Y. Wang, Phys. Rev. D 50 (1994) 6135.
[114] T. Gherghetta, A. Riotto, L. Roszkowski, Phys. Lett. B 440 (1998) 287.
[115] T. Gherghetta, G.F. Giudice, A. Riotto, Phys. Lett. B 446 (1998) 28.
[116] G.F. Giudice, A. Masiero, Phys. Lett. 206 (1988) 480.
[117] G.F. Giudice, R. Rattazzi, hep-ph/9801271.
[118] C. Giunti, C.W. Kim, U.W. Lee, Mod. Phys. Lett. 16 (1991) 1745.
[119] J.R. Gott, Nature 295 (1982) 304.
[120] S. Gottlober, V. Muller, A.A. Starobinsky, Phys. Rev. D 43 (1991) 2510.
[121] S. Gottlober, J.P. Mucket, A.A. Starobinskii, Astrophys. J. 434 (1994) 417.
[122] A.M. Green, A.R. Liddle, Phys. Rev. D 54 (1996) 2557.
[123] A.M. Green, A.R. Liddle, A. Riotto, Phys. Rev. D 56 (1997) 7559.
[124] See, for instance, the two volumes of the book Superstring Theory, M.B. Green, J.H. Schwarz, E. Witten, Cambridge Univ. Press, Cambridge, 1987.
[125] M.B. Green, J. Schwarz, Phys. Lett. B 149 (1984) 117.
[126] M. Grisaru, W. Siegel, M. Rocek, Nucl. Phys. B 159 (1979) 429.
[127] L.P. Grishchuk, Ya.B. Zel'dovich, Sov. Astron. 22 (1978) 125.
[128] S. Weinberg, Phys. Rev. D 9 (1974) 3357.
[129] A.H. Guth, Phys. Rev. D 23 (1981) 347.
[130] A.H. Guth, S.-Y. Pi, Phys. Rev. Lett. 49 (1982) 1110.
[131] H.E. Haber, G.L. Kane, Phys. Rep. 117 (1985) 75.
[132] E. Halyo, Phys. Lett. B 387 (1996) 43.
[133] S.W. Hawking, G.F.R. Ellis, The Large-Scale Structure of Space-Time, Cambridge University Press, Cambridge, 1973.
[134] S.W. Hawking, Phys. Lett. B 115 (1982) 295.
[135] S.W. Hawking, N. Turok, Phys. Lett. B 425 (1998) 25; gr-qc/9802062; Phys. Lett. B 432 (1998) 271.
[136] H.M. Hodges, G.R. Blumenthal, L.A. Kofman, J.R. Primack, Nucl. Phys. B 335 (1990) 197.
[137] H.M. Hodges, G.R. Blumenthal, Phys. Rev. D 42 (1990) 3329.
[138] H.M. Hodges, Phys. Rev. Lett. 64 (1990) 1080.
[139] H.M. Hodges, J.R. Primack, Phys. Rev. D 43 (1991) 3155.
[140] R. Holman et al., Phys. Rev. D 43 (1991) 3833.
[141] P. Horava, E. Witten, Nucl. Phys. B 475 (1996) 94; ibid. B 460 (1996) 506.
[142] L.E. Ibáñez, H.-P. Nilles, F. Quevedo, Phys. Lett. B 187 (1987) 25.
[143] L.E. Ibáñez, D. Lust, Nucl. Phys. B 382 (1992) 305.
[144] K. Intriligator, N. Seiberg, hep-th/9509066.
[145] K. Intriligator, S. Thomas, Nucl. Phys. B 473 (1996) 121.
[146] N. Irges, S. Lavignac, Phys. Lett. B 424 (1998) 293.
[147] K. Izawa, T. Yanagida, Prog. Theor. Phys. 95 (1996) 829.
[148] K.I. Izawa, T. Yanagida, Phys. Lett. B 393 (1997) 331.
[149] K.I. Izawa, M. Kawasaki, T. Yanagida, Phys. Lett. B 411 (1997) 249.
[150] R. Jeannerot, Phys. Rev. D 56 (1997) 6205.
[151] R. Jeannerot, Phys. Rev. D 53 (1996) 5426.
[152] D.E. Kaplan, F. Lepeintre, A. Masiero, A.E. Nelson, A. Riotto, hep-ph/9806430.
[153] V. Kaplunovsky, J. Louis, Phys. Lett. B 306 (1993) 269.
[154] J. Wess, J. Bagger, Supersymmetry and Supergravity, Princeton University Press, Princeton, 1983.
[155] S.Yu. Khlebnikov, I.I. Tkachev, Phys. Rev. Lett. 77 (1996) 219.
[156] S. Khlebnikov, I. Tkachev, Phys. Lett. B 390 (1997) 80.
[157] S. Khlebnikov, I. Tkachev, Phys. Rev. Lett. 79 (1997) 1607.
[158] S. Khlebnikov, I. Tkachev, Phys. Rev. D 56 (1997) 653.
[159] T.W.B. Kibble, J. Phys. A 9 (1976) 1387.
[160] S.F. King, A. Riotto, Phys. Lett. B 442 (1998) 68.
[161] W.H. Kinney, K.T. Mahanthappa, Phys. Rev. D 52 (1995) 5529.
[162] W.H. Kinney, K.T. Mahanthappa, Phys. Rev. D 53 (1996) 5455.
[163] W.H. Kinney, A. Riotto, hep-ph/9704388.
[164] W.H. Kinney, A. Riotto, Phys. Lett. B 435 (1998) 272.
[165] L. Knox, A. Olinto, Phys. Rev. D 48 (1993) 946.
[166] T. Kobayashi, H. Nakano, Nucl. Phys. B 496 (1997) 103.
[167] H. Kodama, M. Sasaki, Prog. Theor. Phys. Suppl. 78 (1984) 1.
[168] E. Witten, Phys. Lett. B 155 (1985) 151.
[169] L.A. Kofman, A.D. Linde, Nucl. Phys. B 282 (1987) 555.
[170] L.A. Kofman, D.Yu. Pogosyan, Phys. Lett. B 214 (1988) 508.
[171] L.A. Kofman, A.D. Linde, A.A. Starobinsky, Phys. Lett. B 157 (1985) 361.
[172] L. Kofman, D. Pogosyan, Phys. Lett. B 214 (1988) 508.
[173] L. Kofman, G.R. Blumenthal, H. Hodges, J.R. Primack, in: D.W. Latham, L.N. da Costa (Eds.), Proc. Workshop on LS Structure, Rio, 1989.
[174] L. Kofman, A.D. Linde, A.A. Starobinsky, Phys. Rev. Lett. 73 (1994) 3195.
[175] L. Kofman, A.D. Linde, A.A. Starobinsky, Phys. Rev. Lett. 76 (1996) 101.
[176] L. Kofman, The origin of matter in the Universe: reheating after inflation, astro-ph/9605155, UH-IFA-96-28 preprint, 16pp.
[177] L. Kofman, in: B. Jones, D. Markovic (Eds.), Relativistic Astrophysics: A Conference in Honor of Igor Novikov's 60th Birthday, for a more recent review and a collection of refs.
[178] L. Kofman, A.D. Linde, A.A. Starobinsky, Phys. Rev. D 56 (1997) 3258.
[179] E.W. Kolb, Phys. Scr. T 36 (1991) 199.
[180] E.W. Kolb, M.S. Turner, The Early Universe, Addison-Wesley, Reading, MA, 1990.
[181] E.W. Kolb, A.D. Linde, A. Riotto, Phys. Rev. Lett. 77 (1996) 4290.
[182] C. Kolda, J. March-Russell, hep-ph/9802358.
[183] E.W. Kolb, S.L. Vadas, Phys. Rev. D 50 (1994) 2479.
[184] A. Kosowsky, M.S. Turner, Phys. Rev. D 52 (1995) 1739.
[185] K. Kumekawa, T. Moroi, T. Yanagida, Prog. Theor. Phys. 92 (1994) 437.
[186] D. La, P.J. Steinhardt, Phys. Rev. D 62 (1989) 376.
[187] A.B. Lahanas, D.V. Nanopoulos, Phys. Rep. 145 (1987) 1.
[188] P. Langacker, M.-X. Luo, Phys. Rev. D 44 (1991) 817.
[189] E. Witten, Nucl. Phys. B 258 (1985) 75.
[190] G. Lazarides, Q. Shafi, Phys. Lett. B 308 (1993) 17.
[191] G. Lazarides, Q. Shafi, Nucl. Phys. B 392 (1993) 61.
[192] G. Lazarides, C. Panagiotakopoulos, Phys. Rev. D 52 (1995) 559.
[193] G. Lazarides, Q. Shafi, Phys. Lett. B 372 (1996) 20.
[194] G. Lazarides, R.K. Schaefer, Q. Shafi, Phys. Rev. D 56 (1997) 1324.
[195] T. Li, hep-th/9801123.
[196] A.R. Liddle, D.H. Lyth, Phys. Lett. B 291 (1992) 391.
[197] A.R. Liddle, D.H. Lyth, Phys. Rep. 231 (1993) 1.
[198] A.R. Liddle, D.H. Lyth, Cosmological Inflation and Large-Scale Structure, to be published.
[199] E. Witten, Nucl. Phys. B 471 (1996) 135.
[200] J.E. Lidsey et al., Rev. Mod. Phys. 69 (1997) 373.
[201] A.D. Linde, Phys. Lett. B 108 (1982).
[202] A.D. Linde, Phys. Lett. B 129 (1983) 177.
[203] A.D. Linde, Phys. Lett. B 132 (1983) 317.
[204] A.D. Linde, Phys. Lett. B 175 (1986) 395.
[205] A.D. Linde, Particle Physics and Inflationary Cosmology, Harwood Academic, Switzerland, 1990.
[206] A. Linde, Phys. Lett. B 249 (1990) 18.
[207] A.D. Linde, Phys. Lett. B 259 (1991) 38.
[208] A.D. Linde, Phys. Rev. D 49 (1994) 748.
[209] A. Linde, Phys. Lett. B 351 (1995) 99.
[210] A.D. Linde, Phys. Rev. D 58 (1998) 083514.
[211] A. Linde, D. Linde, A. Mezhlumian, Phys. Rev. D 49 (1994) 1783.
[212] A. Linde, A. Mezhlumian, Phys. Rev. D 52 (1995) 6789.
[213] A.D. Linde, A. Riotto, Phys. Rev. D 56 (1997) 1841.
[214] A. Lukas, B. Ovrut, D. Waldram, Nucl. Phys. B 532 (1998) 43.
[215] D.H. Lyth, Phys. Lett. B 147 (1984) 403, erratum Phys. Lett. B 150 (1985) 465.
[216] D.H. Lyth, Phys. Rev. D 31 (1985) 1792.
[217] D.H. Lyth, Phys. Lett. B 246 (1990) 359.
[218] D.H. Lyth, Phys. Rev. D 45 (1992) 3394.
D.H. Lyth, A. Riotto / Physics Reports 314 (1999) 1}146 [219] [220] [221] [222] [223] [224] [225] [226] [227] [228] [229] [230] [231] [232] [233] [234] [235] [236] [237] [238] [239] [240] [241] [242] [243] [244] [245] [246] [247] [248] [249] [250] [251] [252] [253] [254] [255] [256] [257] [258] [259] [260] [261] [262] [263] [264] [265] [266] [267] [268] [269] [270]
D.H. Lyth, Phys. Rev. Lett. 78 (1997) 1861. D.H. Lyth, Phys. Lett. B 419 (1998) 57. D.H. Lyth, M. Mukherjee, Phys. Rev. D 38 (1988) 485. D.H. Lyth, A. Riotto, Phys. Lett. B 412 (1997) 28. D.H. Lyth, E.D. Stewart, Astrophys. J. 361 (1990) 343. D.H. Lyth, E.D. Stewart, Phys. Lett. B 252 (1990) 336. D.H. Lyth, E.D. Stewart, Phys. Rev. Lett. 75 (1995) 201. D.H. Lyth, E.D. Stewart, Phys. Rev. D 53 (1996) 1784. D.H. Lyth, E.D. Stewart, Phys. Rev. D 54 (1996) 7186. J. Yokoyama, Phys. Lett. B 212 (1988) 273; Phys. Rev. Lett. 63 (1989) 712. A. De la Macorra, S. Lola, Phys. Lett. B 373 (1996) 299. home page at http://map.gsfc.nasa.gov/. J. March-Russell, Phys. Lett. B 437 (1998) 318 R.N. Mohapatra, A. Riotto, Phys. Rev. D 55 (1997) 1138. R.N. Mohapatra, A. Riotto, Phys. Rev. D 55 (1997) 4262. R.N. Mohapatra, hep-ph/9801235 and refs. therein. S. Mollerach, S. Matarrese, F. Lucchin, Phys. Rev. D 50 (1994) 4835. V.F. Mukhanov, JETP Lett. 41 (1985) 493. V.F. Mukhanov, G.V. Chibisov, JETP Lett. 33 (1981) 532; Sov. Phys. JETP 56 (1981) 258. L.V. Mukhanov, L.A. Kofman, D.Yu. Pogosyan, Phys. Lett. 193 (1987) 427. V.F. Mukhanov, M.I. Zelnikov, Phys. Lett. B 263 (1991) 169. V.F. Mukhanov, H.A. Feldman, R.H. Brandenberger, Phys. Rep. 215 (1992) 203. H. Murayama et al., Phys. Rev. D 50 (1994) 2356. M. Nagasawa, J. Yokoyama, Nucl. Phys. B 370 (1992) 472. T.T. Nakamura, E.D. Stewart, Phys. Lett. B 381 (1996) 413. C.R. Nappi, B.A. Ovrut, Phys. Lett. B 113 (1982) 175. A.E. Nelson, Nucl. Phys. B Proc. Suppl. 62 (1998) 261. A.E. Nelson, D. Wright, Phys. Rev. D 56 (1997) 1598. H.P. Nilles, Phys. Rep. 110 (1984) 1. H.P. Nilles, M. Olechowski, M. Yamaguchi, Phys. Lett. B 415 (1997) 24. H.P. Nilles, M. Srednicki, D. Wyler, Phys. Lett. B 120 (1983) 346. H.P. Nilles, N. Polonsky, Phys. Lett. B 412 (1997) 69. K. Olive, Phys. Rep. 190 (1990) 307. B.A. Ovrut, P.J. Steinhardt, Phys. Lett. B 133 (1983) 161. B.A. Ovrut, P.J. Steinhardt, Phys. Rev. Lett. 53 (1984) 732. B.A. Ovrut, S. Thomas, Phys. Lett. 
B (1991) 267; ibid B 277 (1992) 53. C. Panagiotakopoulos, Phys. Lett. B 402 (1997) 257. Ue-Li Pen, U. Seljak, N. Turok, Phys. Rev. Lett. 79 (1997) 1611. home page at http://astro.estec.esa.nl/Planck. P. Peter, D. Polarski, A.A. Starobinsky, Phys. Rev. D 50 (1994) 4827. D. Polarski, A.A. Starobinsky, Nucl. Phys. B 385 (1992) 623. D. Polarski, Phys. Rev. D 49 (1994) 6319. D. Polarski, A.A. Starobinsky, Phys. Rev. D 50 (1994) 6123. D. Polarski, A.A. Starobinsky, Phys. Lett. B 356 (1995) 196. J. Polonyi, preprint no. KFKI-77-93, 1997. G. Dvali, A. Pomarol, Phys. Rev. Lett. 77 (1996) 3728. S. Raby, Phys. Rev. D 56 (1997) 2852. L. Randall, M. Soljacic, A.H. Guth, Nucl. Phys. B 472 (1996) 408. A. Riotto, Nucl. Phys. B 515 (1998) 413. A. Riotto, hep-ph/9710329. D. Roberts, A.R. Liddle, D.H. Lyth, Phys. Rev. D 51 (1995) 4122. M.I. Zelnikov, V.F. Mukhanov, JETP Lett. 54 (1991) 197.
[271] V.A. Rubakov, M.V. Sazhin, A.V. Veryaskin, Phys. Lett. B 115 (1982) 189.
[272] D.S. Salopek, J.R. Bond, J.M. Bardeen, Phys. Rev. D 40 (1989) 1753.
[273] D.S. Salopek, Phys. Rev. Lett. 69 (1992) 3602.
[274] D.S. Salopek, Phys. Rev. D 52 (1995) 5563.
[275] S. Sarkar, Rep. Prog. Phys. 59 (1996) 1493.
[276] M. Sasaki, Prog. Theor. Phys. 76 (1986) 1036.
[277] M. Sasaki, E.D. Stewart, Prog. Theor. Phys. 95 (1996) 71. A more accurate calculation, analogous to the one in Ref. [291] for a single-component inflaton, is provided by Ref. [243].
[278] Q. Shafi, A. Vilenkin, Phys. Rev. Lett. 52 (1984) 691.
[279] M. Shifman, Prog. Part. Nucl. Phys. 39 (1997) 1.
[280] G.F. Smoot et al., Astrophys. J. 396 (1992) L1.
[281] S. Soni, A. Weldon, Phys. Lett. B 126 (1983) 215.
[282] A.A. Starobinsky, Phys. Lett. B 117 (1982) 175.
[283] A.A. Starobinsky, Quantum Gravity, Proc. 2nd Seminar Quantum Theory of Gravitation [in Russian], Inst. Nucl. Res. USSR Acad. Sci., Moscow, 1982, p. 58.
[284] A.A. Starobinsky, Sov. Astron. Lett. 11 (1985) 133.
[285] A.A. Starobinsky, JETP Lett. 42 (1985) 152.
[286] A.A. Starobinsky, in: H.J. de Vega, N. Sanchez (Eds.), Lecture Notes in Physics, vol. 242, Springer, Berlin, 1986.
[287] A.A. Starobinsky, Phys. Lett. B 91 (1990) 99.
[288] A.A. Starobinsky, J. Yokoyama, gr-qc/9502002 (1995).
[289] E.D. Stewart, Phys. Rev. D 51 (1995) 6847.
[290] E.D. Stewart, Phys. Lett. B 345 (1995) 414.
[291] E.D. Stewart, D.H. Lyth, Phys. Lett. B 302 (1993) 171.
[292] E.D. Stewart, M. Kawasaki, T. Yanagida, Phys. Rev. D 54 (1996) 6032.
[293] E.D. Stewart, Phys. Lett. B 391 (1997) 34.
[294] E.D. Stewart, Phys. Rev. D 56 (1997) 2019.
[295] R. Stompor et al., astro-ph/9511087.
[296] L. Susskind, Phys. Rev. D 20 (1979) 2619.
[297] S. Thomas, hep-th/9801007.
[298] R. Bond, Pritzker Symposium, http://www-astro-theory.fnal.gov/Personal/psw/talks/bond/bond.03.gif.
R. Lai, A.J. Sievers / Physics Reports 314 (1999) 147–236
NONLINEAR NANOSCALE LOCALIZATION OF MAGNETIC EXCITATIONS IN ATOMIC LATTICES
R. LAI, A.J. SIEVERS Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, 14853-2501, USA
AMSTERDAM - LAUSANNE - NEW YORK - OXFORD - SHANNON - TOKYO

Physics Reports 314 (1999) 147–236

Nonlinear nanoscale localization of magnetic excitations in atomic lattices

R. Lai, A.J. Sievers*

Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, 14853-2501, USA

Received September 1998; editor: D.L. Mills

* Corresponding author. Tel.: +1 607 2556422; fax: +1 607 2556428; e-mail: [email protected]

0370-1573/98/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00090-8

Contents

1. Introduction
1.1. Historical background
1.2. Magnetic lattices
1.3. Overview and organization
2. Ferromagnetic chain with nearest-neighbor exchange interaction and easy-plane on-site anisotropy
2.1. The nearest-neighbor 1-D model and equations of motion
2.2. Stationary intrinsic localized spin wave modes (ILSMs)
2.3. Travelling ILSMs
2.4. Interaction of ILSMs with magnetic defects
3. Isotropic ferromagnetic chain with nearest- and next-nearest-neighbor exchange interactions
3.1. The 1-D model Hamiltonian
3.2. Stationary intrinsic localized spin wave resonances (ILSRs)
3.3. Conditions for the occurrence of ILSRs
3.4. Translating ILSRs
4. Antiferromagnetic chain with on-site easy-axis anisotropy
4.1. Equations of motion
4.2. Stationary intrinsic localized spin wave gap modes (ILSGs)
4.3. Moving gap modes
4.4. Weak nonlinearity limit
4.5. Stability of stationary ILSGs
5. Antiferromagnetic chain with on-site easy-plane or biaxial anisotropy
5.1. The model Hamiltonian
5.2. Stationary intrinsic localized spin wave modes
5.3. Existence conditions
6. Modulational instability of an extended nonlinear spin wave in an easy-axis antiferromagnet
6.1. Travelling nonlinear extended waves
6.2. Modulational instability of extended waves
6.3. Comparison between numerical simulations and analytical results
6.4. Modulational instability recurrence
7. Production of intrinsic localized spin wave modes and the CW driving of antiferromagnetic instabilities
7.1. Creation of intrinsic localization
7.2. Uniaxial FeF$_2$
7.3. Uniaxial FeCl$_2$
7.4. Biaxial (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$
8. Conclusions
8.1. Summary
8.2. Other systems and future prospects
Acknowledgements
References
Abstract

Reviewed here is the nonlinear intrinsic localization expected for large amplitude spin waves in a variety of magnetically ordered lattices. Both static and dynamic properties of intrinsic localized spin wave gap modes and resonant modes are surveyed in detail. The modulational instability of extended nonlinear spin waves is discussed as a mechanism for dynamical localization of spin waves in homogeneous magnetic lattices. The interest in this particular nonlinear dynamics area stems from the realization that some localized vibrations in perfectly periodic but nonintegrable lattices can be stabilized by lattice discreteness. However, in this rapidly growing area in nonlinear condensed matter research the experimental identification of intrinsic localized modes is yet to be demonstrated. To this end the study of spin lattice models has definite advantages over those previously presented for vibrational models both because of the importance of intrasite and intersite nonlinear interaction terms and because the dissipation of spin waves in magnetic materials is weak compared to that of lattice vibrations in crystals. Thus, both from the theoretical and the experimental points of view, nonlinear magnetic systems may provide more tractable candidates for the investigation of intrinsic localized modes which display nanoscale dimensions as well as for the future exploration of the quantum properties of such excitations. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 46.10.+z; 63.20.Pw; 63.20.Ry; 75.10.Hk; 75.30.Ds; 75.30.-m

Keywords: Intrinsic localized spin wave modes; Antiferromagnets; Nonlinear dynamics; Discrete lattices; Nanoscale
R.L. would like to dedicate this review to his wife, Linda Huang, for her support and A.J.S. to Paul Camenzind for showing him the way.
1. Introduction

1.1. Historical background

Both nonlinearity and lattice discreteness have played important roles in many branches of condensed matter physics, including crystal [1,2] and spin dynamics [3–5]. The traditional approach in solid state physics has been to treat them separately. Lattice discreteness is usually manifested by the existence of an upper bound on the plane wave spectrum whereas nonlinear terms are typically assumed to be small and treated as a perturbation to the harmonic approximation, leading to both damping and a frequency shift of plane-wave excitations in an otherwise linear lattice [6,7]. This approach has successfully explained most of the phenomena in condensed matter physics involving weak nonlinearity. The nonlinearity cannot, however, be treated as a perturbation in all cases, as evidenced by the appearance of domain walls, kinks and solitons [8].

An important advance in dealing with nonlinearity in condensed matter physics has been the introduction of the soliton as a new type of elementary excitation. It has been suggested [9] that solitons, which had been extensively studied [10] in fluids, plasmas and optics, may be present as thermal excitations in quasi-one-dimensional materials as well, and should be treated as a new type of elementary excitation in addition to spatially extended plane wave-like modes. Since then nonlinear excitations have attracted wide interest in many branches of condensed matter physics, for example, in lattice dynamics [11], electronic polymers [12], molecular crystals [13] and magnetic systems [14,15]. Classically, these nonlinear excitations are solutions of integrable nonlinear partial differential equations [16,17] which can be used to describe some realistic physical systems within the continuum approximation.

The paradigm of such nonlinear excitations has provided a rather useful framework for investigating a large number of phenomena in condensed matter physics, especially the thermodynamic and transport properties of low dimensional materials [18,19]. In particular, solitary excitations in one-dimensional magnetic systems have been extensively investigated. In general, there are no exact solutions to the equations of motion derived from Heisenberg ferro- and antiferromagnetic Hamiltonians. The classical continuum limit approximation of 1-D easy-plane magnets demonstrates the existence of sine-Gordon kink excitations in addition to spin waves [20,21]. Another type of continuous nonlinear excitation that can be supported by a magnetic chain is a breather, which can be visualized as a magnon bound state [22,23]. Excellent reviews of the theoretical and experimental investigation of magnetic solitary excitations have been given [14,15].

Perhaps because these nonlinear excitations have infinite lifetime in integrable systems but are found to be unstable in non-integrable continuous systems, historically, most attention in nonlinear dynamics was devoted to integrable continuous models. Among these integrable models are the (1+1)-dimensional sine-Gordon equation, the KdV equation, and the nonlinear Schrödinger equation, to name a few of the best-known examples [16,17,24]. In strongly nonlinear discrete systems, the topic of this review, the spatial size of a nonlinear excitation can become comparable to the lattice spacing; hence, the discreteness of the underlying physical systems is expected to have
a significant effect on the properties of nonlinear excitations in condensed matter physics. However, the study of nonlinear excitations in discrete lattices has been relatively rare since, except for the Toda lattice [11] and the Ablowitz–Ladik lattice [25], few discrete lattices are integrable and even these discrete lattice models appear to be integrable by construction rather than motivated by realistic physical systems.

A major advance in the theory of nonlinear excitations in discrete lattices in the late 1980s and early 1990s was the discovery that some localized vibrations in perfectly periodic but non-integrable lattices can be stabilized by lattice discreteness [26–29]. This realization has led to extensive studies of the features associated with intrinsic localization in various nonlinear nonintegrable lattices, and it has proven to be a conceptual and practical breakthrough [30–34]. In the literature these localized excitations are called either "intrinsic localized modes" (ILMs), with the emphasis on the fact that their formation involves no disorder and that they extend over a nano-length scale, or "discrete breathers", with the emphasis on their similarity to exact breather soliton solutions in nonlinear continuum theories. Although it is well known that no bound state or localized mode exists in a continuum 3D space for a scalar field [35], ILMs in discrete lattices are not confined to certain lattice dimensions [29,36–39]. These unusual modes can occur at any site and may be stationary or move slowly through the lattice [30]. One key element for realistic lattices is the existence of gapped linear dispersion relations. Depending on the nature of the interparticle forces, a variety of interesting ILMs can exist, with spatial mode patterns ranging in type from alternating (zone center) to staggered (zone boundary).

The earlier work of Sievers and Takeno [27], and Page [28] has recently been formalized in terms of a number of useful existence and stability criteria [32], and many physically exciting contexts are currently emerging: in nonlinear crystal dynamics [30,34], magnetic systems [40,41], electron–phonon systems [42], molecular biophysics [43], friction [44], etc. The potential for these self-localized oscillatory excitations in equilibrium and nonequilibrium classical and quantum discrete lattices is now extensive and this thrust is becoming a major activity in nonlinear condensed matter research. The challenge at this writing is that these excitations are yet to be unequivocally identified in experiment.

1.2. Magnetic lattices

In magnetic systems, both exchange interactions between spins and spin anisotropy (either single-ion or dipole–dipole) are intrinsically nonlinear. Since the strength of the internal effective field acting on a spin always decreases with increasing spin deviation from its equilibrium direction, the nonlinearity in magnetic systems is generally soft. Fig. 1 provides a qualitative illustration of how the nonlinearity can introduce new effects in the spin wave excitation spectrum for a discrete lattice. The dispersion curve associated with small amplitude linear excitations for a two sublattice easy axis antiferromagnet is illustrated in the center of the figure. The eigenvector for one of the doubly degenerate uniform precession magnetic dipole active plane wave modes is shown in the top panel. An essential point of this review is to demonstrate that if sufficient transverse amplitude can be given to a particular spin then a stationary localized spin wave excitation may appear such as is illustrated in the bottom panel of the figure. As we shall see, this excitation is stabilized by the combination of the nonlinearity and the discreteness of the lattice. Its eigenvector may appear familiar since it extends over the same nanoscale lengths as those associated with the well studied defect-induced localized spin wave modes [45–48].
Fig. 1. Schematic picture of possible excitations in a 1-D lattice of antiferromagnetically coupled spins with uniaxial easy-axis spin anisotropy. The center panel shows the doubly degenerate dispersion relation for small amplitude excitations of the spins. One of the magnetic dipole active eigenvectors associated with the uniform mode is shown in the top frame. The bottom panel shows a particular eigenvector that can occur when a large transverse amplitude appears at an individual spin site. The larger the amplitude of the transverse excitation of this spin, the lower the mode frequency below the bottom of the plane wave spectrum and the more localized the spatial extent of the excitation. One reason for the interest in these nonlinear excitations is that the localization length of such intrinsic modes can be comparable to the lattice spacing itself.
It has been known since the 1960s that the destruction of the translational symmetry of a magnetic crystal may produce both localized vibrational and localized magnetic excitations [47]. The properties of these linear localized magnetic excitations in insulating doped antiferromagnetic systems were first studied with a number of experimental techniques, including far infrared [45], Raman scattering [49], optical fluorescence [50], and neutron scattering [51]. Of the magnetic defect systems in which far infrared active localized modes have been observed, perhaps MnF$_2$:Fe$^{2+}$ has been examined in the greatest detail. One reason is that both single and localized magnetic pair modes occur at frequencies where there is almost no interference from the host phonon absorption spectrum of the tetragonal MnF$_2$ host. The far infrared spectra for two different crystal geometries are shown in Fig. 2. The plane wave far IR-active modes of the host crystal, consisting of the antiferromagnetic resonance (AFMR) at the lowest frequency and the electric-dipole active two magnon absorption associated with a pair of magnons of equal wavevector excited on opposite sublattices, give absorption peaks at 106 cm$^{-1}$ for $E \parallel c$ and at 100 cm$^{-1}$ for $E \perp c$. A localized phonon mode occurs at 113 cm$^{-1}$. The impurity-induced magnetic excitations consist of a localized magnetic mode at 94.9 cm$^{-1}$ and a pair excitation at 144.9 cm$^{-1}$ which
Fig. 2. Spectra showing intrinsic and impurity-induced absorption lines in MnF$_2$:Fe$^{2+}$. The sample temperature is 1.2 K. The relative strengths of the different modes are drawn qualitatively to scale for a 0.2 mol% FeF$_2$ doping. The identification of the different excitations is given in the figure. Two spectra are required to characterize the far-IR absorption in this tetragonal crystal (after Ref. [52]).
involves a simultaneous excitation of the spin deviations associated with the Fe$^{2+}$ impurity spin and a shell mode centered on the neighboring ions [52]. The result is a fairly complex array of localized modes, even for a simple point defect substitution.

A fundamental difference between intrinsic localized spin wave modes and the defect-induced localized spin modes shown in the spectra presented in Fig. 2 is that impurity modes are trapped at the defect site whereas, according to Bloch's theorem, intrinsic localized modes must be able to move through the lattice. Another difference is that impurity modes can appear at frequencies above or below the linear spin wave spectrum while the intrinsic localization result presented in Fig. 1 favors low-frequency modes because of the intrinsically soft nonlinearity. This frequency red shift result is somewhat similar to that found for nonlinear vibrational excitations in crystal lattices which contain realistic intersite potentials. Note that because of the large size of the intersite cubic anharmonicity, the effective vibrational potential is always soft [31,53–56].

1.3. Overview and organization

In the search for ways to study intrinsic localized modes and produce nano-scale localized excitations experimentally, spin lattice models have definite advantages over vibrational models. Relatively simple spin lattice models can yield more complex linear dispersion curves because of the importance of both intrasite and intersite interaction terms. The combination of these dispersion
curves with the soft nonlinearity in magnetic chains can generate both nonlinear gap modes [40] and resonant modes [57,58] in the spin wave spectrum. Another important difference between crystal models and spin models is that realistic two-body interactions between atoms involve cubic nonlinear potential terms [54] which produce a static distortion centered at the localized excitation whereas nonlinear interactions in simple two sublattice spin lattice models exclude odd nonlinear terms, again making the nonlinear dynamical models somewhat simpler to implement. Furthermore, the dissipation of spin waves in magnetic materials is usually weak compared to that of lattice vibrations in crystals and such weak dissipation should make it easier experimentally to drive the uniform spin wave mode into a highly nonlinear regime. Hence, there is definite value in first exploring experimentally nonlinear localized excitations in magnetic systems with small damping before turning to consider other types of crystal systems which involve large damping. Thus, both from the theoretical and experimental points of view, magnetic systems may provide more tractable candidates for the study of the formation of intrinsic localized modes which display nano-scale lengths and also for the exploration of the quantum properties of such excitations.

In the first paper focusing explicitly on intrinsic localized spin wave modes in discrete lattices Takeno and Kawasaki [37] argued that coherent quantum states should be the appropriate starting point for large amplitude collective modes like those associated with intrinsic localization. They noted that one should employ the coherent-state ansatz for the eigenfunction $\Psi(t)$ of the Hamiltonian $H$ so that

$$ |\Psi(t)\rangle = \prod_n \exp\left[-\tfrac{1}{2}\left(|\alpha_n|^2 + |\beta_n|^2\right)\right] \exp\left(\alpha_n a_n^{+} + \beta_n b_n^{+}\right) |0\rangle , \qquad (1.1) $$

where $|0\rangle$ is the vacuum state of the spin boson system using a standard notation. In principle, the time dependent variational principle

$$ \delta \int dt\, \langle \Psi(t) |\, i\hbar\, (\partial/\partial t) - H \,| \Psi(t) \rangle = 0 \qquad (1.2) $$

gives the nonlinear quantum mechanical equations of motion. However, since a solution to such equations has not yet been demonstrated, these authors and subsequent workers in the field have been forced to work with the classical spin equations of motion for discrete lattices which contain the corresponding nonlinearities.

This review details such classical studies, focusing on the theoretical properties of intrinsic localized spin wave modes (ILSMs) in ferro- and antiferromagnetic lattices. The existence condition for ILSMs in these non-integrable models is obtained from the nonlinear Schrödinger-type equations that describe the spin wave dynamics in the continuum limits and from the linear stability analysis of the corresponding extended nonlinear plane waves. For many examples described here the stability and mobility of ILSMs have been investigated using both analytical techniques and molecular dynamics (MD) simulations.

In general, spin lattices cannot support localized modes above the linear spin wave spectrum but there are exceptions which can be obtained with the application of a dc external field such as for the special configuration of a ferromagnetic chain where spins are forced to align along the hard on-site anisotropy axis by a strong external field [41]. The next section explores the properties of this particular arrangement. When both the external field and the anisotropy are turned off, Section 3
shows that the simplest ferromagnetic system that can support ILSMs is a one-dimensional ferromagnetic chain with both nearest-neighbor (NN) and next-nearest-neighbor (NNN) exchange interactions. Long-lived in-band intrinsic localized spin wave resonances (ILSRs) may exist at frequencies near the Brillouin zone boundary when the strength of the NNN exchange interaction relative to the NN exchange interaction exceeds a specific threshold. However, as such ILSRs in isotropic ferromagnetic chains possess a zero net ac magnetic moment, they are not magnetic dipole active. Although a gap may appear in the linear spin wave spectrum of an isotropic ferromagnetic chain when an external dc magnetic field is applied, no intrinsic localized spin wave gap modes (ILSGs) can exist in such a gap since the external field only contributes a linear term in the equations of motion, producing a frequency shift of the entire spin wave spectrum.

For this reason, we consider only spin chains in the absence of external dc fields in Section 4, where it is shown that the existence of ILSGs in anisotropic antiferromagnets is a natural consequence of the intrinsic softness of the exchange interaction and the on-site anisotropy. Since a large number of ordered magnetic spin systems appear as anisotropic antiferromagnets and the antiferromagnetic resonance (AFMR) mode is often IR-active, these ILSGs are of particular interest. In antiferromagnetic chains with on-site easy-axis anisotropy, both single- and double-peaked ILSGs are found without invoking the usual rotating wave approximation (RWA) necessary in analytical work on vibrational problems [27]. As the maximum spin deviation at the mode center or the relative strength of the anisotropy increases, the degree of localization of the ILSGs increases and the frequency drops further into the spin wave gap. Both linear stability analysis and MD simulations demonstrate that only single-peaked ILSGs are stable against perturbations whereas a perturbed double-peaked ILSG evolves into a single-peaked one. Moving ILSGs are also discussed. It is demonstrated that the mobility of an ILSG decreases as the degree of localization increases so that strongly localized modes can become pinned at low temperatures.

Section 5 examines the nonlinear properties of the more complex but more general easy-plane anisotropy and biaxial anisotropy antiferromagnets. For antiferromagnetic chains with on-site easy-plane anisotropy, the spin wave dispersion curve is separated into two distinct branches, one of which extends to zero frequency. In this hard uniaxial case, ILSRs can exist with frequencies within the linear spin wave spectrum of the lower branch when the upper branch of the dispersion curve has positive curvature at the center of the Brillouin zone. The key feature in this nonlinear dynamics problem is the polarization difference between the two plane wave branches. The smaller the frequency of the $q = 0$ mode in the upper branch, the less strongly coupled the resulting ILSR is to the other branch of the plane wave spectrum. Numerical simulation studies demonstrate that the ILSR excitations are long-lived with regard to a random noise perturbation. When the rotational symmetry in the easy plane is broken and the anisotropy becomes biaxial, ILSGs, in addition to ILSRs, can also exist in the gap below the lower branch of the linear spin wave spectrum.

The question as to the best way of experimentally creating such an atomic scale large-amplitude excitation in a homogeneous but discrete lattice is still open although, as we shall demonstrate, the existence, localization and stability of ILMs in a variety of discrete lattices have been extensively investigated numerically. To find a method for generating ILSMs, the modulational instability mechanism for extended nonlinear spin waves in easy-axis antiferromagnetic chains is reviewed in Section 6, both analytically within the framework of linear stability analysis and
numerically by means of MD simulations. The exploration of the influence of damping on these modes in Section 7 is sufficiently encouraging that the remainder of the section is devoted to exploring the possible generation of ILSMs for particular sublattice antiferromagnets via MD simulations. ILSM-like excitations are generated with realistic parameters. Since ILSMs in antiferromagnets have frequencies below the extended spin waves and are magnetic dipole-active, it is anticipated that their signature can be directly probed by homodyne detection methods. Finally, the main conclusions and future prospects are given in Section 8.
2. Ferromagnetic chain with nearest-neighbor exchange interaction and easy-plane on-site anisotropy

Intrinsic localized spin wave modes (ILSMs) are expected to exist in perfect but nonintegrable discrete magnetic chains because of the intrinsic nonlinearity in the exchange and anisotropy interactions. Here we consider the simplest magnetic system that can possibly support ILSMs, namely, ferromagnetic chains. Although no ILSM can occur in isotropic ferromagnetic chains with only nearest-neighbor exchange interaction because the nonlinearity in the Heisenberg exchange interaction is intrinsically soft, both even-parity and odd-parity ILSMs appear in Heisenberg ferromagnetic chains with easy plane anisotropy [41,59] when a strong magnetic field is applied perpendicular to this plane. Like their vibrational counterparts [30–34], these highly localized ILSMs involve only a few lattice sites and have amplitude-dependent frequencies which lie outside the harmonic plane-wave bands. The existence of such ILSMs for a ferromagnetic chain with nearest-neighbor interactions requires that the strength of the single-ion anisotropy and the external magnetic field exceed certain critical values so that the resulting ILSM frequencies can appear above the linear spin wave band. The production of ILSMs by the application of an external magnetic field makes use of an experimental parameter not available with crystal lattice systems and we review some of the findings below.

2.1. The nearest-neighbor 1-D model and equations of motion

The Hamiltonian for the ferromagnetic chain with on-site easy-plane anisotropy [41,48,59] can be written as

$$ H = -2J \sum_n \mathbf{S}_n \cdot \mathbf{S}_{n+1} + D \sum_n (S_n^z)^2 - H_0 \sum_n S_n^z - \sum_n h_n \left( S_n^x \cos\omega t - S_n^y \sin\omega t \right) , \qquad (2.1) $$

where $J > 0$ is the exchange interaction constant, $D$ is the uniaxial anisotropy constant, and $H_0$ is the magnitude of the external field applied along the $\hat{z}$-axis. The last term in this equation describes a circularly polarized field of strength $h_n$ and frequency $\omega$ applied in the $xy$-plane. Positive values of $D$ correspond to the case of easy-plane anisotropy. To arrive at the interesting nonlinear dynamical regime the external field $H_0$ is taken to be sufficiently large so that in the ground state all spins are ordered along the external field direction, that is, along the hard anisotropy axis. This spin arrangement requires that $H_0 > DS$ in Eq. (2.1). This one-dimensional model has been demonstrated to describe successfully the spin dynamics of the linear chain compound CsFeCl$_3$ [60].
Some magnetic superlattice structures [61] can also be described by the energy functional presented in Eq. (2.1). (Note that for CsFeCl$_3$, a rather large applied magnetic field, in the range of 40 T, would be required to align the spins parallel to the hard axis.) To investigate spin deviations from the ground state, rotational symmetry is incorporated in the definition of the appropriate spin variables in the usual way so that

$$ s_n^{\pm} = (S_n^x \pm i S_n^y)/S \quad \text{and} \quad s_n^z = S_n^z/S , \qquad (2.2) $$

where $S$ is the magnitude of spin. In these circularly polarized coordinates, the equations of motion become

$$ i \frac{ds_n^{+}}{dt} = H_0 s_n^{+} + 2JS\left[ s_n^{+}\left(s_{n+1}^z + s_{n-1}^z\right) - s_n^z\left(s_{n+1}^{+} + s_{n-1}^{+}\right) \right] - 2DS\, s_n^z s_n^{+} - s_n^z h_n e^{-i\omega t} , \qquad (2.3) $$

where for classical spins $s_n^z = \sqrt{1 - |s_n^{+}|^2}$ and for convenience both $\hbar$ and the gyromagnetic ratio are set to 1.

2.2. Stationary intrinsic localized spin wave modes (ILSMs)

Eq. (2.3) admits time-periodic stationary solutions of the form $s_n^{+} = s_n e^{-i\omega t}$ with real time-independent coefficients $s_n$. In these stationary modes all spins are engaging in circular precession on a cone making an angle of $\theta_n = \sin^{-1}(s_n)$ with respect to the $z$-axis. From Eq. (2.3) the system of coupled nonlinear time independent equations for the amplitudes $s_n$ becomes

$$ \Omega s_n = s_n\left( \sqrt{1 - s_{n+1}^2} + \sqrt{1 - s_{n-1}^2} \right) - \left(s_{n+1} + s_{n-1}\right)\sqrt{1 - s_n^2} - 2A s_n \sqrt{1 - s_n^2} - \gamma_n \sqrt{1 - s_n^2} , \qquad (2.4) $$

where the various parameters are defined by the following equations:

$$ \Omega = (\omega - H_0)/2JS , \qquad (2.5) $$

$$ A = D/2J \qquad (2.6) $$

and

$$ \gamma_n = h_n/2JS . \qquad (2.7) $$

Stationary ILSMs in the absence of a driving field ($\gamma_n = 0$) have been studied by Wallis et al. [41], who found localized modes with both even and odd parity. There is a critical value of $A$, the ratio of the anisotropy to the exchange, below which ILSMs do not exist. Fig. 3 shows the linear spin wave band for two different values of $A$ (solid curves) and the corresponding frequency positions of nonlinear plane wave modes at the zone boundary (dashed lines). For the case of the isotropic ferromagnetic chain ($A = 0$), the effect of the nonlinearity in the exchange interaction is to force the frequency of the large amplitude zone boundary mode down into the linear continuum, hence no localized modes can exist. The influence of the nonlinearity on the frequency of the large amplitude zone-boundary mode is, however, quite different when $A > 2.0$. In this case the nonlinear zone-boundary mode frequency increases with increasing amplitude, leaving room for a localized mode to exist below it but above the linear plane wave spectrum. For the case of $A = 10$, Fig. 3 illustrates where a nonlinear localized mode (dot-dashed line) appears above the plane wave spectrum.

The critical value of $A = 2.0$ has a straightforward explanation which is consistent with the intrinsic nonlinear softness of the exchange and anisotropy interactions. The total field acting on
Fig. 3. Frequency $\Omega$ versus wave-vector-lattice-constant product $qa$ for the linear spin-wave band (solid curve) for $A = 0$ and 10. Frequencies are shown for the nonlinear bulk mode at the zone boundary (dashed line) and the nonlinear localized mode (dot-dashed line) (after Ref. [41]).
each spin consists of three parts: the applied field, the exchange field and the anisotropy field; only the latter two are nonlinear. In the configuration discussed here the anisotropy field is antiparallel to the applied field and the exchange field. When A > 2.0, the anisotropy field is stronger than the exchange field. Thus, the resultant nonlinear field is antiparallel to the applied field, and its magnitude decreases with increasing spin wave amplitude, resulting in nonlinear spin wave modes with frequencies above the linear spin wave spectrum. The conclusion that A > 2.0 is required for the localized mode to exist is reinforced by the numerical work illustrated in Fig. 4. The frequency difference between the localized mode (solid curve) and the nonlinear zone boundary mode (dashed curve) extrapolates to zero at A = 2.0, indicating that no local mode exists for A < 2.0. When the transverse spin amplitude becomes sufficiently small, the intrinsic localized spin wave mode acquires a large spatial extent. In this limit the continuum approximation can be invoked. Introducing the new variable $\Psi_n = (-1)^n s_n$, which varies slowly in space, one obtains in the continuum limit
$d^2\Psi/dx^2 - \alpha\Psi + \beta\Psi^3 = 0$ , (2.8)
where $\alpha = (\Omega + 2A - 4)/a^2$ and $\beta = (A - 2)/a^2$, with a the lattice constant. If $\beta > 0$, that is, A > 2, Eq. (2.8) has a localized solution given by
$\Psi(x) = \Psi_m\,\mathrm{sech}\left[\frac{\sqrt{2A - 4}\,\Psi_m}{2a}(x - x_0)\right]$ , (2.9)
where the frequency $\Omega = 4 - 2A + (A - 2)\Psi_m^2/2$ increases with the amplitude.
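As a quick numerical check of the continuum result, the following minimal sketch evaluates the sech envelope of Eq. (2.9) and confirms by finite differences that it satisfies Eq. (2.8). The function names, the grid, and the step size are illustrative assumptions, not part of the original work; the check is only meaningful for A > 2 and small amplitudes, where the continuum limit applies.

```python
import numpy as np

def ilsm_envelope(x, psi_m, A, a=1.0, x0=0.0):
    """Continuum ILSM envelope of Eq. (2.9); requires A > 2."""
    kappa = np.sqrt(2.0 * A - 4.0) * psi_m / (2.0 * a)  # inverse envelope width
    return psi_m / np.cosh(kappa * (x - x0))

def ilsm_frequency(psi_m, A):
    """Dimensionless frequency that accompanies Eq. (2.9)."""
    return 4.0 - 2.0 * A + (A - 2.0) * psi_m ** 2 / 2.0

def continuum_residual(psi_m, A, a=1.0, h=1e-2):
    """Largest |Psi'' - alpha*Psi + beta*Psi^3| on a grid (hypothetical helper).

    The second derivative is approximated by a central difference of step h.
    """
    x = np.linspace(-100.0, 100.0, 2001) * a
    alpha = (ilsm_frequency(psi_m, A) + 2.0 * A - 4.0) / a ** 2
    beta = (A - 2.0) / a ** 2
    psi = ilsm_envelope(x, psi_m, A, a)
    d2 = (ilsm_envelope(x + h, psi_m, A, a) - 2.0 * psi
          + ilsm_envelope(x - h, psi_m, A, a)) / h ** 2
    return np.max(np.abs(d2 - alpha * psi + beta * psi ** 3))

if __name__ == "__main__":
    print("Omega    =", ilsm_frequency(0.1, 4.0))
    print("residual =", continuum_residual(0.1, 4.0))
```

For A = 4 the zone-boundary frequency is 4 - 2A = -4, so any nonzero amplitude places the mode frequency above the top of the linear band, as the text states.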
Fig. 4. Frequency relative to the top of the linear spin-wave band, $\Delta\Omega$, versus anisotropy parameter A for the nonlinear bulk mode at the zone boundary (dashed line) and the nonlinear localized mode (solid curve) (after Ref. [41]).
Rakhmanova and Mills [59] have investigated multi-soliton states in ferromagnetic chains described by Eq. (2.1), as well as the nature of these nonlinear excitations in an applied ac magnetic field. For any frequency above the maximum linear spin wave frequency, one can obtain a hierarchy of solutions with slowly varying envelopes that have the appearance of one-soliton, two-soliton, three-soliton, ... states. Fig. 5 shows an example of this hierarchy for zero driving field. Here $f_n = (-1)^n s_n$. Computer simulations have demonstrated that these states are quite stable against small amplitude perturbations. When a spatially uniform driving field ($\gamma_n = \gamma \neq 0$) is applied with frequency $\omega$ above the linear spin wave band, Eq. (2.4) has time-periodic solutions with an ILSM-like feature near the center of the chain, while a uniform background appears in both wings far from the center. Fig. 6 displays such a solution for $\gamma = 0.04$. One can see from Fig. 6b that the application of an external driving field introduces an oscillatory modulation in the envelope $f_n$. These oscillations increase in amplitude as the strength of the driving field increases. The presence of the external driving field has a significant effect on the stability of these localized spin wave modes. Although ILSMs in zero driving field are stable over a long period of time, ILSMs in the presence of a driving field appear to have a finite lifetime which decreases with increasing field strength, as demonstrated by the computer simulations shown in Fig. 7. (The simulation time corresponds to roughly 100 precessional periods.) The solid line connects values of $\mathrm{Re}\{s_n(t)\}$, and the dotted lines identify $\mathrm{Im}\{s_n(t)\}$, at $t = 250/JS$. The localized structure is quite stable in Fig. 7a for a weak driving field ($\gamma = 0.01$), while no evidence of the central feature remains at $t = 250/JS$ when the strength of the driving field is increased to $\gamma = 0.2$.
These simulations indicate that when an ac driving field is used to excite such ferromagnetic localized spin wave modes, its amplitude cannot be too large, though at the same time it should be large enough to drive the system into the regime where the nonlinearity is significant.
Fig. 5. For the frequency $\Omega = -3.95$ and A = 4, the envelope function $f_n$ is shown for (a) a two soliton state, (b) a three soliton state, and (c) a four soliton state. For (a) we have s "s "3.365;10\, for (b) s "s "2.177;10\, and for , , (c) s "s "5.000;10\ (after Ref. [59]). ,
Fig. 6. The functions (a) $s_n$ and (b) $f_n = (-1)^n s_n$ are shown for a nonlinear spin excitation in the presence of an rf field. Here the dimensionless field strength is $\gamma = 0.04$ and the remaining parameters are the same as in Fig. 5 (after Ref. [59]).
2.3. Traveling ILSMs
To examine traveling ILSMs, one may assume that $s_n^+(t) = s_n(t)\,e^{i(qna - \omega t)}$, where $s_n(t)$ is in general complex since the traveling ILSMs are elliptically polarized and the ellipticity increases with increasing q. Inserting this ansatz into Eq. (2.3) gives the equation of motion [61]
$\frac{ds_n}{d\tau} = i\left\{\Omega s_n - s_n\left(\sqrt{1 - |s_{n-1}|^2} + \sqrt{1 - |s_{n+1}|^2}\right) + \left[2A s_n + (s_{n+1} + s_{n-1})\cos qa\right]\sqrt{1 - |s_n|^2}\right\} - (s_{n+1} - s_{n-1})\sin qa\,\sqrt{1 - |s_n|^2}$ , (2.10)
where $\tau = 2JSt$. To integrate Eq. (2.10) forward in time, an initial set of spin configurations is required. The authors assume that at $\tau = 0$ the quantity in curly brackets on the right-hand side of Eq. (2.10) vanishes, that is
$\Omega s_n(0) = s_n(0)\left(\sqrt{1 - |s_{n+1}(0)|^2} + \sqrt{1 - |s_{n-1}(0)|^2}\right) - \left[2A s_n(0) + (s_{n+1}(0) + s_{n-1}(0))\cos qa\right]\sqrt{1 - |s_n(0)|^2}$ . (2.11)
It should be emphasized that this approximation is good only when the envelope varies slowly in space and the wavevector q is small [58,62]. With the application of free end boundary conditions, initial spin configurations have been obtained from Eq. (2.11) for the investigation of single traveling ILSMs as well as their collisions [61]. The approximate initial spin configuration for traveling ILSMs works quite well for
Fig. 7. An illustration of the stability of the one soliton state in an external rf field. The structure, perturbed slightly, has been followed up to a time $t = 500/(2JS)$. For strong external fields, the lifetime of the state shortens. Each figure displays the response at this time. In (a), $\gamma = 0.01$; in (b), $\gamma = 0.04$; in (c), $\gamma = 0.15$; and in (d), $\gamma = 0.20$. In all cases, $\Omega = -3.95$ and A = 4 (after Ref. [59]).
frequencies $\Omega$ close to the top of the linear spin wave band and for small wavevectors. If both $\Omega$ and q are large, the traveling ILSM will be scattered by the lattice and energy will be radiated away in the form of small amplitude extended spin waves before a stable ILSM can be achieved. This observation is similar to the results reported for easy-axis antiferromagnetic chains [62] and for isotropic ferromagnetic chains with both nearest- and next-nearest-neighbor exchange interactions [58]. Fig. 8 shows the collision between two solitons (two small amplitude ILSMs with slowly varying envelope functions). The initial spin configuration is the two-soliton state obtained from Eq. (2.11). These two excitations have the same frequency and opposite wavevectors. It is evident from the figure that they remain unchanged in shape after the collision. Simulations with other initial spin configurations confirm that traveling ILSMs with small q are soliton-like objects, since they pass through each other upon collision.
Fig. 8. Collision of two ILSM solitons. Each has $\Omega = -0.385$ and qa = 0.1, due to the fact that they were obtained by taking as the initial configuration a two-soliton solution of Eq. (2.11) (after Ref. [61]).
2.4. Interaction of ILSMs with magnetic defects
Lattice defects can give rise to localized vibrational modes [7,63], and some spin defects can produce localized spin wave defect modes in linear theory by breaking the translational invariance of the underlying lattice [46,47]. An important question is therefore how these new ILSMs interact with localized spin wave defect modes. The 1-D ferromagnetic chain lends itself naturally to the study of the interaction between nonlinear intrinsic localized modes and magnetic defects, and the results have been described in Ref. [61]. In the authors' model a defect spin is placed in the middle of a ferromagnetic chain and it distinguishes itself from other spins only by its different anisotropy constant $A' = A - \Delta A$, where A is the intrinsic anisotropy constant. In the theory of linearized excitations, there always exists a localized defect spin wave mode with frequency above the top of the linear spin wave band when $\Delta A > 0$, since for this model with the appropriate strength of the magnetic field the anisotropy field is antiparallel to the applied dc field. The frequency of the linear spin wave defect mode may be written as
$\Omega_d = \Omega_M + \Delta\Omega$ , (2.12)
where
$\Delta\Omega = 2\left(\sqrt{1 + (\Delta A)^2} - 1\right)$ , (2.13)
Fig. 9. Propagation of an ILSM soliton with $\Omega = -0.385$ through the spin chain with a defect. The defect spin is at site 251, and has the values: (a) $\Delta A/A = 0.05$; (b) $\Delta A/A = 0.07$ (after Ref. [61]).
and $\Omega_M$ is the frequency of the zone boundary spin wave. From simulations it is found that stationary nonlinear localized defect modes can occur when the internal frequency $\Omega$ of the ILSM exceeds $\Omega_d$, the frequency of the linear defect mode. A nonlinear localized defect mode may have a spatial profile very similar to that of an ILSM in a perfect periodic chain, but it is more localized than an
ILSM with the same amplitude. For a fixed frequency, the amplitude of the nonlinear intrinsic localized defect mode decreases with increasing $\Delta A/A$. Computer simulations have demonstrated that the interaction between traveling ILSMs and magnetic defects can have a diverse and rich character, depending on the relation between the internal frequency of the ILSM and the frequency of the linear defect mode. Fig. 9 displays the propagation of an ILSM with frequency $\Omega = -3.85$ through a chain of 501 spins with a defect placed at site 251. The intrinsic anisotropy parameter is A = 4.0 so that $\Omega_M = -4.0$. In Fig. 9a and Fig. 9b, the values of $\Delta A/A$ are 0.05 and 0.07, respectively. Hence in both cases $\Omega > \Omega_d$ according to Eqs. (2.12) and (2.13), and part of the energy in the initial ILSM is trapped at the defect site to form a nonlinear localized defect mode while another part is reflected to form a well-defined traveling ILSM. The rest of the energy is radiated in the form of small amplitude extended spin waves. Note that in Fig. 9b the reflected ILSM is confined between the end of the chain and the defect, and that it bounces back and forth. This result suggests that a traveling ILSM might be trapped between two spatially separated defects with suitable impurity anisotropy constants in a 1-D chain. When the value of $\Delta A/A$ increases so that $\Omega_d$ exceeds the frequency of the initial ILSM, the defect can no longer support a nonlinear defect mode of frequency $\Omega$. In this regime, the incident ILSM is elastically reflected from the defect site, as illustrated by the computer simulation results shown in Fig. 10a. Since it is expected that the effect of defects on the propagation of ILSMs would disappear as $\Delta A/A$ approaches zero, one natural question is whether this limit represents a trivial case. Computer simulations (not shown here) demonstrate that the ILSM passes over the defect with its velocity decreased appreciably when $\Delta A/A$ is small ($\approx 0.01$) but positive.
Some small amplitude extended spin waves are also excited after such a collision. The decrease in velocity and the excitation of extended spin waves become less significant as $\Delta A/A$ decreases. A rather different picture occurs when $\Delta A/A$ approaches zero from the negative side. According to linear spin wave theory no spin wave mode localized at the defect site can exist in this parameter region. The propagating ILSM is fully reflected by the defect even when $\Delta A/A$ is as small as $-0.0075$, as demonstrated in Fig. 10b. It should be noted that related dynamical results have been found for the case of a one-dimensional nonlinear vibrational diatomic lattice which incorporates realistic nearest-neighbor Born-Mayer-Coulomb potentials. When the interaction of intrinsic gap modes with stationary anharmonic mass defect impurity modes is examined in numerical simulation studies, a variety of scattering results are found, depending on the mass defect magnitude and the site in the diatomic chain. Two important features of the trajectories are that the gap mode is trapped at the mass defect when the vibrational frequencies of the moving mode and the anharmonic defect mode are near resonance, while the scattering is elastic when the frequencies are far apart [64].
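The frequency comparison that organizes this section can be made concrete with a short calculation. The sketch below evaluates the linear defect-mode frequency from Eqs. (2.12) and (2.13), taking the zone-boundary frequency as 4 - 2A (the zero-amplitude limit of the frequency given with Eq. (2.9)), and checks whether it lies below the ILSM frequency for the three positive values of ΔA/A quoted in the text. The function names are hypothetical, and the code only reproduces the frequency criterion; it does not simulate the scattering.

```python
import math

def zone_boundary_frequency(A):
    """Top of the linear band in dimensionless units (assumed to be 4 - 2A)."""
    return 4.0 - 2.0 * A

def defect_mode_frequency(A, rel_dA):
    """Linear defect-mode frequency, Eqs. (2.12)-(2.13); rel_dA is Delta A / A > 0."""
    if rel_dA <= 0.0:
        raise ValueError("a linear defect mode exists only for Delta A > 0")
    dA = rel_dA * A
    return zone_boundary_frequency(A) + 2.0 * (math.sqrt(1.0 + dA ** 2) - 1.0)

def can_support_nonlinear_defect_mode(omega_ilsm, A, rel_dA):
    """True when the ILSM frequency exceeds the linear defect-mode frequency."""
    return omega_ilsm > defect_mode_frequency(A, rel_dA)

if __name__ == "__main__":
    A, omega_ilsm = 4.0, -3.85
    for rel_dA in (0.05, 0.07, 0.35):
        omega_d = defect_mode_frequency(A, rel_dA)
        print(rel_dA, round(omega_d, 4),
              can_support_nonlinear_defect_mode(omega_ilsm, A, rel_dA))
```

With A = 4.0 and an ILSM frequency of -3.85, the criterion is met for ΔA/A = 0.05 and 0.07 and fails for 0.35, in line with the partial trapping of Fig. 9 and the elastic reflection of Fig. 10a.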
3. Isotropic ferromagnetic chain with nearest- and next-nearest-neighbor exchange interactions
Although intrinsic localized spin wave modes cannot exist in isotropic ferromagnetic chains with only nearest-neighbor (NN) exchange interactions [41], long-lived intrinsic localized spin wave resonances (ILSRs) can appear below the Brillouin zone boundary frequency when next-nearest-neighbor (NNN) ferromagnetic coupling of sufficient strength is included. The inclusion of
Fig. 10. Propagation of an ILSM soliton with $\Omega = -0.385$ through the spin chain with a defect. The defect spin is at site 251, and has the values: (a) $\Delta A/A = 0.35$; (b) $\Delta A/A = -0.0075$ (after Ref. [61]).
NNN exchange is an important step in identifying the properties of nano-scale localization stemming from long range interactions. The key feature of the dynamics of an in-band ILSR is the coexistence, at the same frequency, of the quasilocalized ILSR and the linear spin-wave spectrum. Recently, intrinsic localized resonant modes have also been identified in lattice vibration
simulations [65], although from analytical work vibrational resonant modes had already been proposed to be a natural consequence of intrinsic localization [29].
3.1. The 1-D model Hamiltonian
For a one-dimensional ferromagnetic chain of N spins coupled through both nearest-neighbor (NN) and next-nearest-neighbor (NNN) isotropic exchange interactions, in the absence of anisotropy and applied dc field, the Hamiltonian becomes
$H = -2J_1\sum_n \mathbf{S}_n \cdot \mathbf{S}_{n+1} - 2J_2\sum_n \mathbf{S}_n \cdot \mathbf{S}_{n+2}$ , (3.1)
where both the NN coupling constant $J_1$ and the NNN coupling constant $J_2$ are positive. All spins are taken to be aligned along the z-axis in the ground state since the chain is isotropic. For a non-dissipative chain of classical spins, the equation of motion for the spin at the nth site is then
$d\mathbf{S}_n/dt = \mathbf{S}_n \times \mathbf{H}_n^{\mathrm{eff}}$ , (3.2)
where the effective field $\mathbf{H}_n^{\mathrm{eff}}$ acting on the spin can be obtained from
$\mathbf{H}_n^{\mathrm{eff}}(t) = -\nabla_{\mathbf{S}_n} H = 2J_1(\mathbf{S}_{n-1} + \mathbf{S}_{n+1}) + 2J_2(\mathbf{S}_{n-2} + \mathbf{S}_{n+2})$ . (3.3)
For the circular variables defined in Eq. (2.2), the equations of motion become
$\frac{i}{2J_1S}\frac{ds_n^+}{dt} = s_n^+\left(s_{n-1}^z + s_{n+1}^z + \rho s_{n-2}^z + \rho s_{n+2}^z\right) - s_n^z\left(s_{n-1}^+ + s_{n+1}^+ + \rho s_{n-2}^+ + \rho s_{n+2}^+\right)$ , (3.4)
where the dimensionless parameter $\rho = J_2/J_1$ measures the strength of NNN coupling relative to NN coupling. Once again $s_n^z = \sqrt{1 - |s_n^+|^2}$, so that Eq. (3.4) is intrinsically nonlinear in $s_n^+$. The linear spin wave dispersion curve is obtained by linearizing Eq. (3.4) (approximating $s_n^z$ by 1). Introducing $s_n^+ = s_0 e^{i(qna - \omega t)}$ in the usual manner, the dispersion curve becomes
$\omega_0(q) = 8J_1S\left[\sin^2(qa/2) + \rho\sin^2(qa)\right]$ , (3.5)
where $q = (2\pi/Na)\,n$ ($n = 0, \pm 1, \ldots, \pm N/2$), and a is the lattice spacing between adjacent spins. The dispersion curves for different NNN coupling strengths are shown in Fig. 11. Although the frequency of the Brillouin zone boundary mode is independent of the NNN coupling, because a precessing spin is in phase with its NNNs in this mode, the NNN coupling tends to raise the dispersion curve at intermediate wave numbers, so that its maximum frequency $\omega_{\max}$ may appear at a wave number $q_{\max}$ other than $\pi/a$, i.e.
$q_{\max} = \frac{\pi}{a} - \frac{1}{a}\cos^{-1}(1/4\rho)$ and $\omega_{\max} = 2J_1S(1 + 4\rho)\left(1 + 1/4\rho\right)$ , (3.6)
when $\rho$ is greater than 1/4. In this case the dispersion curve has a local minimum frequency at the Brillouin zone boundary, which opens up a new possibility, namely, that an ILSR may drop from the band edge.
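The location of the band maximum can be checked numerically. The following sketch, which reads Eq. (3.5) as 8 J1 S [sin^2(qa/2) + rho sin^2(qa)] and uses hypothetical function names with J1 S set to 1 by default, compares the closed-form maximum of Eq. (3.6) with a brute-force search over a fine grid of wave numbers.

```python
import numpy as np

def dispersion(qa, rho, J1S=1.0):
    """Linear spin-wave frequency of Eq. (3.5); qa is wave number times lattice spacing."""
    return 8.0 * J1S * (np.sin(qa / 2.0) ** 2 + rho * np.sin(qa) ** 2)

def band_maximum(rho, J1S=1.0):
    """Position (as qa) and value of the band maximum, following Eq. (3.6)."""
    if rho <= 0.25:
        # Without sufficiently strong NNN coupling the maximum stays at the zone boundary.
        return np.pi, 8.0 * J1S
    qa_max = np.pi - np.arccos(1.0 / (4.0 * rho))
    omega_max = 2.0 * J1S * (1.0 + 4.0 * rho) * (1.0 + 1.0 / (4.0 * rho))
    return qa_max, omega_max

if __name__ == "__main__":
    qa = np.linspace(0.0, np.pi, 200001)
    for rho in (0.0, 0.4, 1.0):  # the three coupling strengths plotted in Fig. 11
        omega = dispersion(qa, rho)
        print(rho, qa[np.argmax(omega)], omega.max(), band_maximum(rho))
```

For rho above 1/4 the grid maximum moves inside the zone while the zone-boundary value stays at 8 J1 S, which is the local minimum referred to in the text.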
Fig. 11. Spectra of linear spin waves in isotropic ferromagnetic chains with both nearest-neighbor (NN) and next-nearest-neighbor (NNN) exchange interactions. From top to bottom, the relative strengths of the NNN coupling $\rho = J_2/J_1$ are 1.0, 0.4 and 0, respectively. When $\rho$ is greater than 1/4 the maximum frequency appears at $q_{\max}$ and the Brillouin zone boundary becomes a local minimum. The ILSR arrow identifies the frequency of the intrinsic localized spin wave resonance described in the text (after Ref. [58]).
3.2. Stationary intrinsic localized spin wave resonances (ILSRs)
To find the eigenvector of a stationary ILSR below the Brillouin zone boundary value, all nonlinear terms must be included. Inserting the ansatz
$s_n^+ = s_n e^{-i\omega t}$ , $s_n^* = s_n$ , $s_n^z = \sqrt{1 - s_n^2}$ (3.7)
into Eq. (3.4) gives the time-independent set of nonlinear equations
$\frac{\omega}{2J_1S}s_n = s_n\left[\sqrt{1 - s_{n-1}^2} + \sqrt{1 - s_{n+1}^2} + \rho\left(\sqrt{1 - s_{n-2}^2} + \sqrt{1 - s_{n+2}^2}\right)\right] - \sqrt{1 - s_n^2}\left[s_{n-1} + s_{n+1} + \rho(s_{n-2} + s_{n+2})\right]$ , (3.8)
where $\omega$ is the frequency of the ILSR. The eigenvectors can be classified in terms of their parities, since both an odd-parity mode and an even-parity mode can occur for a range of parameter values $\rho$. Given the maximum spin deviation at the mode center, $s_0$, Eq. (3.8) can be solved numerically to obtain the eigenvector shape and the frequency of the ILSR [58].
3.2.1. ILSR eigenvector shapes
To illustrate the different shapes that can be found, the eigenvectors of an odd-parity ILSR and an even-parity ILSR for the set of parameters $\rho = 1.0$ and $s_0 = 0.7$ are presented in Fig. 12a and Fig. 12b. The lattice site with maximum spin deviation is the symmetry center of the odd-parity mode, while the symmetry center of the even-parity mode is located between two adjacent sites,
Fig. 12. Eigenvectors of stationary ILSRs of odd and even parities. The ferromagnetic chain contains 256 spins with the parameter $\rho = 1.0$. Both modes have the same maximum spin deviation $s_0 = 0.7$: (a) Odd-parity mode, the symmetry center is on the lattice site with maximum spin deviation; (b) Even-parity mode, the symmetry center (the X) is between two adjacent sites. In both cases the left side shows a factor 40 expansion of the ordinate to display the resonant mode wave character in the wings (after Ref. [58]).
each with the same maximum spin deviation. The common feature of both ILSRs is that, although the large amplitude region extends over only a few lattice sites, the spin deviations do not disappear with increasing distance from the center but instead evolve into a weak plane wave pattern. The ILSRs in the magnetic chains reviewed here are fundamentally different from the resonant breathers identified in the continuous model [66] and in discrete nonlinear lattices with substrate potentials [67]. In such resonant breathers, the localized center oscillates at a fundamental frequency outside the linear spectrum while the extended plane-wave tail oscillates at the higher harmonics of the fundamental frequency. It is these higher harmonics that are in the linear spectrum. In contrast, the fundamental frequency of the ILSRs described here is coincident with the linear spectrum itself, since the ILSRs are monochromatic. By examining the Fourier transform of the eigenvector using
$|s(q)| = \left|\sum_n s_n \exp(iqna)\right|$ (3.9)
the wave numbers associated with the weak plane wave pattern can be identified. Besides Fourier components centered at $q = \pi/a$, there is a sharp peak located at $q_l$ with a strength that grows with increasing maximum spin deviation. When this $q_l$ is substituted into the linear spin wave dispersion relation, Eq. (3.5), the appropriate resonant frequency $\omega$ is obtained. The strength of the coupling between localized resonant modes and extended spin waves depends on two factors: the strength of the NNN interaction and the maximum spin deviation.
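The Fourier analysis of Eq. (3.9) is easy to reproduce on a model eigenvector. The sketch below is only illustrative: instead of a true solution of Eq. (3.8) it uses a synthetic profile (a staggered sech core plus a weak cosine tail, both assumptions made here), evaluates |s(q)| on the allowed wave numbers of a 256-spin periodic chain, and locates the sharp tail peak away from the zone boundary. The function names, the core width, the tail amplitude and the exclusion window are all hypothetical choices.

```python
import numpy as np

def spectrum(s):
    """|s(q)| of Eq. (3.9) on the allowed values qa = 2*pi*k/N, k = 0..N/2."""
    N = len(s)
    n = np.arange(N)
    qa = 2.0 * np.pi * np.arange(N // 2 + 1) / N
    amplitude = np.abs(np.exp(1j * np.outer(qa, n)) @ s)
    return qa, amplitude

def tail_wavenumber(s, exclude=0.8):
    """qa of the strongest component outside a window around the zone boundary."""
    qa, amplitude = spectrum(s)
    mask = np.abs(qa - np.pi) > exclude
    return qa[mask][np.argmax(amplitude[mask])]

# Synthetic stand-in for an ILSR eigenvector (not a solution of Eq. (3.8)).
N = 256
n = np.arange(N)
qa_tail = 2.0 * np.pi * 74 / N                      # assumed tail wave number
core = (-1.0) ** n * 0.7 / np.cosh((n - N // 2) / 4.0)
s_model = core + 0.01 * np.cos(qa_tail * n)

if __name__ == "__main__":
    print("tail peak at qa =", tail_wavenumber(s_model))
```

Substituting the recovered wave number into the linear dispersion relation would then give the resonant frequency, which is the consistency check described in the text.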
With either decreasing relative strength of the NNN interaction or increasing maximum spin deviation, both the odd-parity mode and the even-parity mode become more localized and the amplitude of the off-center plane wave component increases [58]. Quasi-localized solutions are not found for $\rho < 1/4$.
3.2.2. Amplitude dependence of the mode frequency
The dependence of the frequency of the ILSR on its maximum spin deviation is shown in Fig. 13. Here the NNN/NN parameter is $\rho = 1.0$. The open circles and crosses denote the frequencies of the odd-parity ILSRs and the frequencies of the even-parity ILSRs, respectively. The frequencies are found by numerically solving Eq. (3.8) with periodic boundary conditions. For small spin deviations, the frequency of an ILSR lies close to the Brillouin zone boundary frequency of the linear spin wave band. With increasing spin deviation the mode becomes more localized and its frequency drops further into the linear spin wave band. Note that for fixed maximum spin deviation, the frequency of the even-parity ILSR is lower than that of the odd-parity ILSR. Although there is no apparent distinction between odd-parity modes and even-parity modes with small maximum spin deviations, the difference between them increases as $s_0$ increases. The dot-dashed line is the continuum approximation frequency obtained later in Section 3.3.2, which is in good agreement with the discrete results up to $s_0 = 0.5$.
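For orientation, the continuum estimate can be evaluated directly. The sketch below assumes that Eq. (3.27) for a stationary mode (K = 0) reads omega = 8 J1 S - 2 J1 S s0^2, so that the frequency normalized by the zone-boundary value 8 J1 S is 1 - s0^2/4; for comparison it also evaluates the extended nonlinear band-edge mode, whose frequency is taken as the zone-boundary value times sqrt(1 - s0^2). Both function names are hypothetical, and the numbers are not a substitute for the discrete results plotted in Fig. 13.

```python
import math

def ilsr_frequency_ratio(s0):
    """Continuum ILSR frequency over the zone-boundary frequency (K = 0 assumed)."""
    return 1.0 - s0 ** 2 / 4.0

def extended_mode_frequency_ratio(s0):
    """Extended nonlinear band-edge mode over the linear zone-boundary frequency."""
    return math.sqrt(1.0 - s0 ** 2)

if __name__ == "__main__":
    for s0 in (0.1, 0.3, 0.5):
        print(s0, ilsr_frequency_ratio(s0), extended_mode_frequency_ratio(s0))
```

In this estimate the ILSR frequency falls below the linear zone-boundary frequency as the amplitude grows, but by less than the uniform band-edge mode of the same amplitude.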
3.2.3. ILSR lifetime
It is expected that an ILSR becomes unstable and delocalizes after sufficient time because the localized excitation is in resonance with the plane-wave spectrum. The lifetime of an ILSR has been investigated by means of molecular dynamics (MD) simulations by using the eigenvectors obtained
Fig. 13. Dependence of the frequency of the stationary ILSR on its maximum spin deviation. The frequency is normalized by $\omega_Z$, the frequency of the linear spin wave at the Brillouin zone boundary. The ferromagnetic chain contains 256 spins with the parameter $\rho = 1.0$. Open circles are the frequencies of odd-parity ILSRs, and crosses are the frequencies of even-parity ILSRs. The dot-dashed line is the continuum approximation given by Eq. (3.27) (after Ref. [58]).
from numerically solving Eq. (3.8) to provide the initial conditions. Integrating the discrete equations numerically with the fourth-order Runge-Kutta method and a time step of $T_Z/200$, where $T_Z = \pi/4J_1S$ is the period of the linear spin-wave mode at the Brillouin zone boundary, demonstrates that the lifetime of an ILSR depends on three factors: (1) its parity, (2) the maximum spin deviation, and (3) the relative strength of the NNN interaction $\rho$. At sufficiently small amplitudes ILSRs of both parities are observed to last many hundreds of periods. With increasing amplitude the emission of plane waves from the ILSRs becomes important, so that the larger the amplitude the faster the decay. Symmetry is important, since it is always observed that the even-parity mode is more unstable than the odd-parity one. The time evolution of the energy density distribution for an odd-parity ILSR with modest spin deviation ($s_0 = 0.7$) is shown in Fig. 14, where the energy density is measured from the ground state value. This odd-parity mode preserves its initial shape over the complete simulation time of $1500\,T_Z$. Contrast this result with that shown in Fig. 15, which presents the corresponding time evolution for an even-parity ILSR with the same spin deviation, where the even-parity mode starts to move and decay after about $750\,T_Z$. The strength of the NNN interaction also plays an important role in the decay of an ILSR. As the NNN interaction increases, the wave number of the spin wave that is in resonance with the localized excitation moves away from the band edge (see Fig. 11), so that the coupling between the ILSR and the spin wave becomes weaker, producing an increased lifetime.
3.3. Conditions for the occurrence of ILSRs
The interrelation between the modulational instability of the extended band edge plane waves and the existence of spatially localized excitations has been established for a number of nonlinear vibrational lattices [68-74]. Such stability analysis of extended plane waves provides a useful way to predict under what conditions nonlinear localized excitations can occur. Here the existence
Fig. 14. Time evolution of the energy density distribution of the odd parity ILSR shown in Fig. 12a. The energy density shown here is measured from the ground state energy density and is in units of $2J_1S$. Time is measured in units of $T_Z$.
Fig. 15. Time evolution of the energy density distribution of the even parity ILSR shown in Fig. 12b. Again the energy density is measured from the ground state energy density and is in units of $2J_1S$.
conditions for an ILSR, as determined both from linear stability analysis of the extended nonlinear zone boundary mode and from the continuum approximation, are reviewed.
3.3.1. Modulational instability of extended nonlinear spin wave modes
Owing to the translational invariance of the underlying lattice, for any finite spin deviation Eq. (3.4) also has time-periodic solutions in the form of spatially extended nonlinear spin-wave modes with frequency $\omega(q) = \omega_0(q)\sqrt{1 - s_0^2}$, where the deviation $s_0$ is not negligible. The nonlinear terms tend to decrease the frequency. It is these extended spin-wave modes that are modulationally unstable under certain conditions, and linear stability analysis can be used to determine the parameter space in which they are unstable. Assume an extended nonlinear spin-wave mode $s_n^+(t) = s_0 e^{i(qna - \omega t)}$ is perturbed so that
$s_n^+(t) \rightarrow s_n^+(t) = (s_0 + b_n + i\psi_n)\,e^{i(qna - \omega t)}$ , (3.10)
where $\omega$ is the frequency of the nonlinear extended spin wave and the perturbations $b_n$ and $\psi_n$ are real and much smaller than $s_0$ in magnitude. To obtain the growth rate of modulation waves let
$b_n = b\,e^{i(Qna - \Omega t)} + \mathrm{c.c.}$ , (3.11)
and
$\psi_n = \psi\,e^{i(Qna - \Omega t)} + \mathrm{c.c.}$ , (3.12)
where Q and $\Omega$ are the wave number and frequency of the modulation wave. Substituting Eq. (3.10) into Eq. (3.4) and linearizing in $b_n$ and $\psi_n$ gives two coupled linear equations
$\begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}\begin{pmatrix} b \\ \psi \end{pmatrix} = 0$ , (3.13)
where the matrix elements of M are given by
$M_{11} = M_{22} = \frac{\Omega}{2J_1S} - 2\sqrt{1 - s_0^2}\,(\sin Qa\,\sin qa + \rho\,\sin 2Qa\,\sin 2qa)$ , (3.14)
$M_{12} = -i\,2\sqrt{1 - s_0^2}\,\left[(1 - \cos Qa)\cos qa + \rho(1 - \cos 2Qa)\cos 2qa\right]$ , (3.15)
and
$M_{21} = -M_{12} - i\,\frac{2s_0^2}{\sqrt{1 - s_0^2}}\left[(\cos Qa - \cos qa) + \rho(\cos 2Qa - \cos 2qa)\right]$ . (3.16)
The dispersion curve of the modulation wave, $\Omega(q, Q)$, is determined by the condition that the determinant of the matrix M is zero, so that Eq. (3.13) has nontrivial solutions. For arbitrary q and Q, with the help of Eq. (3.5), one finds
$\left[\Omega - \sqrt{1 - s_0^2}\,\frac{\omega_0(q+Q) - \omega_0(q-Q)}{2}\right]^2 = \frac{1}{4}\left[\omega_0(q+Q) + \omega_0(q-Q) - 2\omega_0(q)\right]\left\{\left[\omega_0(q+Q) + \omega_0(q-Q) - 2\omega_0(q)\right] - s_0^2\left[\omega_0(q+Q) + \omega_0(q-Q) - 2\omega_0(Q)\right]\right\}$ . (3.17)
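For long wavelength modulation, $Qa \ll 1$, Eq. (3.17) can be further simplified to
$\left[\Omega - Q\,v_g(q)\sqrt{1 - s_0^2}\right]^2 = \frac{1}{4}\frac{d^2\omega_0}{dq^2}Q^2\left[\frac{d^2\omega_0}{dq^2}Q^2 - 2s_0^2\,\omega_0(q)\right]$ , (3.18)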
where $v_g(q) = d\omega_0/dq$ is the group velocity of linear spin waves. It can be seen from Eq. (3.18) that instability can occur only in the region where the curvature of the linear dispersion curve is positive. It is clear that $\mathrm{Im}\{\Omega(q, Q)\} > 0$ if and only if
$0 < (d^2\omega_0/dq^2)\,Q^2 < 2s_0^2\,\omega_0(q)$ . (3.19)
Since the ILSR mode, if it exists, should bifurcate from the band edge mode, attention should be focused on the band edge spin wave. Setting $qa = \pi$ in Eq. (3.5) gives
$\omega_0(\pi/a) = 8J_1S$ and $d^2\omega_0/dq^2 = 4J_1Sa^2(4\rho - 1)$ . (3.20)
Since the right-hand side (RHS) of the second equation of Eq. (3.20) is never positive for $\rho \le \rho_c$, with $\rho_c = 0.25$, regardless of the magnitude of the spin deviation $s_0$, the extended nonlinear band edge mode is stable for this parameter range. However, as the relative strength of NNN coupling gets stronger so that $\rho > \rho_c$, the extended band edge mode becomes modulationally unstable to long wavelength perturbations when the spin deviation exceeds the threshold
$s_c^2 = \frac{1}{2\omega_0(\pi/a)}\frac{d^2\omega_0}{dq^2}Q_{\min}^2 = (4\rho - 1)\left(\frac{\pi}{N}\right)^2$ . (3.21)
Here use has been made of the fact that in a periodic ferromagnetic lattice of N spins the smallest wavevector is $Q_{\min} = 2\pi/Na$. In a real system $s_c$ is essentially zero since N is very large. This instability region is also the region in which ILSRs can occur. The critical value of $\rho$ agrees with the numerical findings. The energy density distribution as determined by MD simulation for a randomly perturbed extended nonlinear band edge spin wave mode, for two values of the NNN strength at different times, is shown in Fig. 16. The dot-dashed lines represent the energy density distribution of the ground state. In Fig. 16a, the relative strength of the NNN interaction $\rho = 0.2$ is below the critical value $\rho_c = 0.25$, while in Fig. 16b $\rho = 0.6$ is above the critical value. The initial spin deviations are the same for both chains, and are given by $s_n = (-1)^n\,0.1 + \delta s_n$, where the magnitude of the random perturbation $|\delta s_n|$ is less than 0.005. The behavior of the extended band edge mode is qualitatively different for the two cases. In a chain with $\rho$ less than $\rho_c$, the energy density distribution remains spatially extended throughout the simulation period, as shown in Fig. 16a. In a chain with $\rho$ greater than $\rho_c$, Fig. 16b illustrates that the band edge spin wave is unstable and evolves into temporal ILSR-like localized excitations. At $t = 360\,T_Z$, the initially uniformly distributed energy has been concentrated into five ILSR-like excitations. This MD simulation demonstrates that the modulational instability of the extended band edge mode is a possible mechanism for the creation of ILSR excitations from extended modes.
3.3.2. Envelope solitons in the continuum approximation
When the maximum spin deviation of an ILSR is small, so that it extends over a large number of lattice sites, the magnitude of the spin deviation $s_n$ will vary slowly with site index n. Since the
Fig. 16. Decay of extended band edge spin waves into ILSRs via modulational instability. The time is measured in units of $T_Z$. The snapshots of energy density distributions are determined by MD simulations for randomly perturbed extended band edge spin waves in chains containing 256 spins with periodic boundary conditions. One ordinate unit is 0.05. The dot-dashed lines are the energy density distribution of the ground state: (a) The NNN interaction parameter $\rho = 0.2$ is less than the critical value $\rho_c$; (b) The NNN interaction parameter $\rho = 0.6$ is greater than the critical value $\rho_c$ (after Ref. [58]).
spatial symmetry of a small amplitude ILSR is close to that of the corresponding plane wave, the staggered variable
$\phi_n = (-1)^n s_n^+$ (3.22)
can be introduced, where $\phi_n$ is now complex. This continuum approach can be used to determine the existence condition of ILSRs and to provide some properties of small amplitude ILSRs which are expected to be qualitatively correct even for ILSRs in the discrete limit. The analysis used here is not limited to the stationary modes but only assumes that the wavenumber associated with the ILSR is close to the band edge, so that both the phase and the magnitude of $\phi_n$ vary slowly in space. After substituting Eq. (3.22) in Eq. (3.4) and neglecting nonlinear terms higher than cubic, a nonlinear Schrödinger (NLS) type equation is obtained for $\phi(x, t)$, namely,
$i\,\partial\phi/\partial t = 2J_1S(1 - 4\rho)a^2\,(\partial^2\phi/\partial x^2) + 8J_1S\,\phi - 4J_1S\,|\phi|^2\phi$ . (3.23)
The NLS equation is integrable, and the condition for Eq. (3.23) to have a localized solution (one-soliton solution) is that the coefficient of the second-order spatial derivative and the coefficient of the cubic nonlinear term be of the same sign, that is, $\rho > 0.25$. (The linear term can be removed by a gauge transformation.) Note that this critical value of $\rho$ agrees with the value of $\rho_c$ already found from linear stability analysis. When $\rho > \rho_c$, Eq. (3.23) has both stationary and moving localized solutions which are given by
$\phi(x, t) = \phi_0\,\mathrm{sech}\left(\frac{x - vt}{l}\right)e^{-iKx - i\omega t + i\alpha}$ , (3.24)
R. Lai, A.J. Sievers / Physics Reports 314 (1999) 147}236
where $\pi/a - K$ is the wave number of the carrier wave associated with the ILSR, $\phi_0$ the maximum spin deviation and $\alpha$ a constant phase factor. A stationary mode can be obtained by setting $K = 0$. The envelope velocity $v$, the envelope width $l$, and the frequency $\omega$ of the traveling ILSR are given by

$$v = -4J_1Sa^2(4\rho-1)K , \qquad (3.25)$$

$$l = \frac{\sqrt{4\rho-1}}{\phi_0}\,a \qquad (3.26)$$

and

$$\omega = 8J_1S + 2J_1S(4\rho-1)(Ka)^2 - 2J_1S\,\phi_0^2 . \qquad (3.27)$$

The traveling velocity $v$ of the envelope is just the group velocity of the linear spin wave of the same wavevector, which is

$$v_g = \frac{d\omega(q)}{dq} = 4J_1Sa\,(\sin qa + 2\rho\sin 2qa) , \qquad (3.28)$$

for $q = \pi/a - K$ with $K$ small. In the continuum approximation for a given maximum spin deviation, the traveling ILSR and stationary ILSR have the same envelope shape, with a width that is inversely proportional to the maximum spin deviation and that increases with the NNN coupling strength. The first term on the right-hand side of Eq. (3.27) is just the frequency of the linear band-edge spin wave, and the third term is the anharmonic frequency shift. Since the frequency shift is negative, $\omega$ is in the linear spin wave band. The independence of the mode frequency on the strength of the NNN coupling is consistent with the fact that at the Brillouin zone boundary the spins precess in-phase with their NNNs. However, the NNN coupling does tend to broaden the envelope of the localized mode. In the discrete limit where the ILSR is highly localized so that the continuum approximation breaks down, one should expect these observations still to be qualitatively correct. It may seem surprising that the essential feature of an ILSR, the non-decaying plane-wave tail, does not appear in the solution although the continuum approximation describes the shape of the center of an ILSR when the maximum spin deviation is small and gives the correct threshold of $\rho$. The linear dispersion curve of the continuum model described by Eq. (3.23) can be obtained by setting $\phi_0 = 0$ in Eq. (3.27), which has a parabolic shape with a gap at $K = 0$. Since this dispersion curve is just a local approximation of the real dispersion curve Eq. (3.5) in the neighborhood of the zone boundary, in the continuum model the solution Eq. (3.24) describes a gap soliton.

3.4. Translating ILSRs

So far the large amplitude stationary ILSR excitation is assumed to be described by an elementary excitation with a circular precession frequency $\omega$. If an ILSR is traveling along the chain then the circular precession may be viewed as an internal degree of freedom and the translational motion as an external one. The separation of these degrees of freedom is a good approximation only when the wave number of the carrier wave is close to the Brillouin zone boundary and hence the traveling velocity of the ILSR is small compared to the phase velocity of the carrier wave. To describe traveling ILSRs, solutions of the form

$$s_n^+(t) = s_n(t)\,e^{i(qna - \omega t)} = (-1)^n\phi_n(t)\,e^{-i(Kna + \omega t)} , \qquad (3.29)$$
need to be examined, where $\phi_n(t)$ is real and $qa = \pi - Ka$ with $Ka \ll 1$. The substitution of Eq. (3.29) into Eq. (3.4) gives

$$\frac{1}{2J_1S}\frac{d\phi_n}{dt} = \sqrt{1-\phi_n^2}\,\big[(\phi_{n-1}-\phi_{n+1})\sin Ka - \rho\,(\phi_{n-2}-\phi_{n+2})\sin 2Ka\big] , \qquad (3.30)$$

and

$$\frac{\omega}{2J_1S}\,\phi_n = \phi_n\Big[\sqrt{1-\phi_{n+1}^2} + \sqrt{1-\phi_{n-1}^2} + \rho\Big(\sqrt{1-\phi_{n+2}^2} + \sqrt{1-\phi_{n-2}^2}\Big)\Big] + \sqrt{1-\phi_n^2}\,\big[(\phi_{n-1}+\phi_{n+1})\cos Ka - \rho\,(\phi_{n-2}+\phi_{n+2})\cos 2Ka\big] . \qquad (3.31)$$

The translational velocity of the ILSR is determined by Eq. (3.30) while its envelope is determined by Eq. (3.31).

3.4.1. Properties of a single traveling ILSR

The envelope of a traveling ILSR for the discrete limit can be found by numerically solving Eq. (3.31). Once the initial envelope shape of an ILSR is obtained, MD simulations can be used to investigate its motion. Fig. 17 shows the time evolution of an ILSR with a maximum spin deviation of 0.3, $Ka = \pi/32$ and $\rho = 1.0$. Since the maximum spin deviation is fairly small, the initial envelope shape can be obtained from the continuum approximation, Eq. (3.24). In this case the characteristic off-central plane wave pattern of a resonant mode, which would occur if Eq. (3.31) is solved, is negligible. It is observed that this small amplitude ILSR can travel freely through the lattice with the velocity given by Eq. (3.25), and that there is no decay or slowing down within numerical error. This simulation result confirms the validity of the continuum approximation where small amplitude ILSRs are treated as gap solitons. As the maximum spin deviation increases this continuum approximation breaks down and one has to solve Eq. (3.31) numerically to obtain the initial spin deviations. Like a stationary ILSR, the moving ILSR has a weak plane wave tail in the off-central region. Fig. 18 shows the time evolution of an ILSR with a maximum spin deviation of 0.7, $Ka = \pi/32$ and $\rho = 1.0$. Since the maximum spin deviation is not too large the ILSR can still travel through the lattice, but by the time $800\,T_{ZB}$ has passed (not shown) about 5% of the energy has decayed into the plane-wave modes. The larger the amplitude and the faster the velocity of an ILSR, the larger the emission of plane spin wave modes.

3.4.2. Collision between two ILSRs

A fundamental property of solitons is that they pass through each other as non-interacting particles. Recent studies have shown that this is not the case for intrinsic localized modes in discrete lattices, and that both energy transfer between intrinsic localized modes and collision-induced decay into plane-wave modes are observed in computer simulations [74,75]. Here we examine interactions of both the small amplitude and large amplitude ILSRs in these FM chains. To launch two small amplitude ILSRs moving toward each other, two small amplitude ILSRs obtained from Eq. (3.24) are placed 256 sites apart in a chain of 512 spins with periodic boundary conditions. The NNN interaction parameter is $\rho = 1.0$. The parameters for the two ILSRs are: left, maximum spin deviation 0.2, $Ka = -\pi/25.6$; right, maximum spin deviation 0.3, $Ka = \pi/25.6$. The collision is illustrated in Fig. 19.
Before and after the collision the two modes move with uniform velocity and maintain their
Fig. 17. Traveling ILSR with small maximum spin deviation in a 256 spin chain. The parameters are $\rho = 1.0$, maximum spin deviation 0.3 and $Ka = \pi/32$. The energy density shown here is measured from the ground state energy density and is in units of $2J_1S$. Time is measured in units of $T_{ZB}$.
Fig. 18. Traveling ILSR with intermediate maximum spin deviation. The parameters are $\rho = 1.0$, maximum spin deviation 0.7 and $Ka = \pi/32$. For clarity, only part of the 256 spin chain is shown. The energy density shown here is measured from the ground state energy density and is in units of $2J_1S$. Time is measured in units of $T_{ZB}$.
original shapes. More accurate integration shows that the energy transfer between them after one collision is less than 0.5% of the total energy. The interaction between ILSRs becomes more violent as the mode amplitude increases. As an example the collision between a large amplitude stationary ILSR and a small amplitude traveling ILSR for a chain of 256 spins is shown in Fig. 20. Each mode maintains its own shape before the collision, they interact strongly when they meet, and both become unstable after the collision. Only a fraction of the small amplitude traveling ILSR can pass through the stationary ILSR, and it decays quickly into plane-wave modes after the collision. Meanwhile the stationary ILSR also shakes away energy from its central region after the collision, although it still remains localized over the simulation interval examined. When both ILSRs have large amplitudes neither survives the collision. The observation reported in Ref. [75] that the collision between intrinsic localized vibrational modes tends to favor the growth of the larger excitation at the expense of smaller ones is not seen for the ferromagnetic chain with both NN and NNN interactions. The most likely explanation is that the NNN interaction produces a long-range coupling between spins.
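The continuum relations of Section 3.3 and the collision set-up of Section 3.4.2 lend themselves to a quick numerical cross-check. The sketch below is only an illustration under stated assumptions: frequencies are expressed in units of the NN exchange scale (called `J1S` here, an assumed name), lengths in units of the lattice constant `a`, the band frequency is taken as the integral of Eq. (3.28) with zero frequency at q = 0, and the zone-boundary period is assumed to be $2\pi/(8J_1S)$. All function names are hypothetical.

```python
import math

def band_frequency(qa, rho, J1S=1.0):
    """Linear spin-wave frequency, assumed here to be the integral of Eq. (3.28)."""
    return 4.0 * J1S * ((1.0 - math.cos(qa)) + rho * (1.0 - math.cos(2.0 * qa)))

def group_velocity(qa, rho, J1S=1.0, a=1.0):
    """Eq. (3.28): group velocity of the linear spin wave."""
    return 4.0 * J1S * a * (math.sin(qa) + 2.0 * rho * math.sin(2.0 * qa))

def envelope_velocity(Ka, rho, J1S=1.0, a=1.0):
    """Eq. (3.25), written with a^2 K = a (Ka)."""
    return -4.0 * J1S * a * (4.0 * rho - 1.0) * Ka

def soliton_parameters(Ka, rho, phi0, J1S=1.0, a=1.0):
    """Velocity, width and frequency of a small-amplitude ILSR, Eqs. (3.25)-(3.27)."""
    if rho <= 0.25:
        raise ValueError("Eq. (3.23) has localized solutions only for rho > 0.25")
    v = envelope_velocity(Ka, rho, J1S, a)
    width = math.sqrt(4.0 * rho - 1.0) * a / phi0
    omega = (8.0 * J1S + 2.0 * J1S * (4.0 * rho - 1.0) * Ka ** 2
             - 2.0 * J1S * phi0 ** 2)
    return v, width, omega

def zone_boundary_period(J1S=1.0):
    """Assumed definition: period of the linear zone-boundary spin wave."""
    return 2.0 * math.pi / (8.0 * J1S)

def time_to_collision(separation_sites, Ka_left, Ka_right, rho, J1S=1.0, a=1.0):
    """Continuum estimate of the time for two envelopes to meet."""
    closing_speed = (envelope_velocity(Ka_left, rho, J1S, a)
                     - envelope_velocity(Ka_right, rho, J1S, a))
    if closing_speed <= 0.0:
        raise ValueError("the two envelopes are not approaching each other")
    return separation_sites * a / closing_speed

if __name__ == "__main__":
    v, width, omega = soliton_parameters(math.pi / 32, 1.0, 0.3)
    print("envelope velocity:", v)
    print("group velocity at q = pi/a - K:", group_velocity(math.pi - math.pi / 32, 1.0))
    print("width (units of a):", width, " frequency:", omega)
    t = time_to_collision(256, -math.pi / 25.6, math.pi / 25.6, 1.0)
    print("estimated collision time in zone-boundary periods:", t / zone_boundary_period())
```

With the parameters quoted for Fig. 19, the two envelopes have opposite velocities of equal magnitude, so the estimate depends only on the initial separation and on $\rho$; it ignores any discreteness corrections.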
4. Antiferromagnetic chain with on-site easy-axis anisotropy

Since both intrinsic localized spin wave modes (ILSMs) in easy-plane ferromagnetic chains [41,59] and intrinsic localized spin wave resonances (ILSRs) in isotropic ferromagnetic chains [58] stem from the Brillouin zone boundary mode, they are not magnetic dipole-active. In order to generate or detect such ILMs it would seem that the direct interaction with electromagnetic radiation would be a desirable property. Since the intrinsic localized gap modes in diatomic lattices have been found to be IR-active we review here the available information on two sublattice spin systems. There is the added practical advantage that a large number of insulating magnetic materials are anisotropic antiferromagnets and many of the uniform antiferromagnetic resonance (AFMR) modes are IR-active. Thus, the possibility of the existence of ILSMs in the gap below the AFMR frequency combined with an associated magnetic dipole activity is of particular interest from the observational point of view. Takeno and Kawasaki [40,76] first proposed that the soft nonlinearity in a NN anisotropic exchange interaction of a one-dimensional Heisenberg antiferromagnetic chain could produce both symmetric and antisymmetric ILSMs in the gap below the uniform mode AFMR frequency. Later, magnetic gap solitons with finite but small spin deviation were investigated in 1-D Heisenberg antiferromagnets [77,78]. These solitons have zero group velocity. In all of these works, the coherent state ansatz [79] was employed. As shown in the previous sections the classical torque equation provides a straightforward and transparent approach for exploring the nonlinear classical equations of motion and we continue to present this method. In this section, the nonlinear dynamics of classical antiferromagnetically ordered spins interacting via NN isotropic exchange and on-site anisotropy is examined with attention directed at the experimentally relevant dynamical properties of stationary and moving IR-active ILSMs. The introduction of on-site anisotropy rather than anisotropic exchange between neighbors enables one to make contact with the measured parameters for a variety of known magnetic insulators.

4.1. Equations of motion

A perfect one-dimensional antiferromagnetic chain of $N$ classical spins in which each spin interacts with its nearest neighbors via the Heisenberg exchange interaction and each spin feels an on-site anisotropy field is described by the Hamiltonian

$$H = 2J\sum_n \mathbf{S}_n\cdot\mathbf{S}_{n+1} - D\sum_n (S_n^z)^2 , \qquad (4.1)$$

where both the exchange constant $J$ and the single-ion anisotropy constant $D$ are positive. In the ground state adjacent spins point in opposite directions along the z-axis. The usual periodic boundary condition is imposed so that $\mathbf{S}_n = \mathbf{S}_{n+N}$. The anisotropy that is used in Eq. (4.1) is an effective anisotropy which may arise, for example, from the crystalline field interaction of the magnetic moments with their neighboring ions and/or from the long range dipolar interaction between the magnetic moments [80].
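To make the Hamiltonian of Eq. (4.1) concrete, the following sketch evaluates it for classical spin configurations on a periodic chain and compares the two-sublattice (Néel) ground state with a configuration in which one spin is tilted. It is an illustration only: the function names, the tuple representation of a spin and the parameter values are assumptions, not part of the original treatment.

```python
import math

def chain_energy(spins, J=1.0, D=1.0):
    """Eq. (4.1) on a periodic chain; spins is a list of (Sx, Sy, Sz) tuples."""
    N = len(spins)
    exchange = 0.0
    anisotropy = 0.0
    for n in range(N):
        sx, sy, sz = spins[n]
        tx, ty, tz = spins[(n + 1) % N]   # periodic boundary: S_n = S_{n+N}
        exchange += sx * tx + sy * ty + sz * tz
        anisotropy += sz * sz
    return 2.0 * J * exchange - D * anisotropy

def neel_state(N, S=1.0):
    """Adjacent spins antiparallel along the z-axis (N must be even)."""
    return [(0.0, 0.0, S if n % 2 == 0 else -S) for n in range(N)]

def tilt_spin(spins, site, angle):
    """Rotate one spin by `angle` about the y-axis, keeping its length."""
    sx, sy, sz = spins[site]
    tilted = list(spins)
    tilted[site] = (sx * math.cos(angle) + sz * math.sin(angle),
                    sy,
                    -sx * math.sin(angle) + sz * math.cos(angle))
    return tilted

if __name__ == "__main__":
    ground = neel_state(8)
    print("Neel energy:", chain_energy(ground))            # N * (-2 J S^2 - D S^2)
    print("one spin tilted:", chain_energy(tilt_spin(ground, 0, 0.3)))
```

For positive $J$ and $D$ the tilted configuration costs both exchange and anisotropy energy, which is the classical origin of the gap in the linear spectrum discussed next.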
Fig. 19. Collision between two traveling small amplitude ILSRs in a chain of 512 spins with $\rho = 1.0$. The parameters for the two ILSRs are: left, maximum spin deviation 0.2, $Ka = -\pi/25.6$; right, maximum spin deviation 0.3, $Ka = \pi/25.6$. The two ILSRs pass through each other as non-interacting particles as expected for soliton-like excitations. The energy transfer between them is less than 0.5% of the total energy after one collision.
Fig. 20. Collision between a traveling small amplitude ILSR and a stationary large amplitude ILSR in a chain of 256 spins with $\rho = 1.0$. The small amplitude ILSR is characterized by a maximum spin deviation of 0.2 and $Ka = \pi/32$, and the stationary ILSR by a maximum spin deviation of 0.7 and $Ka = 0$. Both modes are unstable after the collision.
For the Hamiltonian above, the classical equations of motion for a spin on the $n$th site can be obtained from Eq. (3.2), where now the effective field $\mathbf{H}_n^{\mathrm{eff}}$ acting on the spin is

$$\mathbf{H}_n^{\mathrm{eff}}(t) = -\nabla_{\mathbf{S}_n} H = -2J(\mathbf{S}_{n-1} + \mathbf{S}_{n+1}) + 2D S_n^z\,\mathbf{e}_z , \qquad (4.2)$$

with $\mathbf{e}_z$ the unit vector along the positive z-axis. From Eqs. (3.2) and (4.2), one obtains the equation of motion for $s_n^+$, namely,

$$i\,\frac{ds_n^+}{dt} = -2JS\big[(s_{n-1}^z + s_{n+1}^z)\,s_n^+ - (s_{n-1}^+ + s_{n+1}^+)\,s_n^z\big] + 2DS\,s_n^z s_n^+ . \qquad (4.3)$$

In the small spin deviation limit, i.e., $s_n^z = (-1)^n$, the linear dispersion curve for spin waves is obtained from Eq. (4.3) as

$$\Omega(q) = \pm\sqrt{(A+2)^2 - 4\cos^2 qa} , \qquad (4.4)$$

where $\Omega(q) = \omega(q)/2JS$, $A = D/J$, and $q$ and $a$ are the wave vector and the distance between adjacent spins, respectively. The two branches ($\pm$) correspond to the two different directions of precession. A gap occurs below the $q = 0$ antiferromagnetic resonance frequency $\Omega_{AFMR} = \Omega(0) = \sqrt{A(A+4)}$. Since the nonlinearity in both the exchange interaction and anisotropy field is soft, ILSMs cannot appear above the top of the plane-wave spectrum in either isotropic or anisotropic antiferromagnetic chains. However, as the amplitude increases intrinsic localized spin wave gap modes (ILSGs) with frequencies below $\Omega_{AFMR}$ can split off from the bottom of the linear dispersion curve as shown in Fig. 21. The desired large amplitude localized excitation is assumed to be described by an elementary excitation with a circular precession frequency $\omega$. If this mode is traveling through the chain then the circular precession can be treated as an internal degree of freedom and the translational motion as an external one. Thus circularly polarized solutions of the form

$$s_n^+(t) = s_n(t)\exp[i(qna - \omega t)] \qquad (4.5)$$
Fig. 21. Frequency of an intrinsic localized spin wave mode in an antiferromagnetic chain with on-site easy-axis anisotropy, relative to the linear spin wave dispersion curve. Only gap modes are allowed owing to the softness of the nonlinearity in both the exchange and the anisotropy fields.
are to be found, where the envelope $s_n(t) = [s_n(t)]^*$ is real and time-dependent. The assumption is exact for stationary modes ($q = 0$), and is a good approximation in cases where the localized mode moves slowly (with small $q$) and the magnitude of the spin deviation does not change rapidly from site to site. Hence we shall restrict $q$ to the region near the center of the Brillouin zone. Inserting Eq. (4.5) into Eq. (4.3) results in the following system of nonlinear equations for the transverse spin deviations:

$$\Omega\,s_n = (-1)^n\Big\{\Big[\sqrt{1-s_{n+1}^2} + \sqrt{1-s_{n-1}^2}\Big]s_n + \big[(s_{n+1} + s_{n-1})\cos qa + A s_n\big]\sqrt{1-s_n^2}\Big\} , \qquad (4.6)$$

and

$$\frac{1}{2JS}\frac{ds_n}{dt} = (-1)^n (s_{n+1} - s_{n-1})\sqrt{1-s_n^2}\,\sin qa . \qquad (4.7)$$

Here the time variable $t$ has been explicitly dropped from both equations. Eq. (4.6) determines the envelope of the mode while Eq. (4.7) determines the envelope velocity. The system of coupled nonlinear equations represented by Eq. (4.6) can support localized solutions. Note that no further approximation, such as the rotating wave approximation (RWA) necessary in nonlinear lattice vibration, is made in deriving Eq. (4.6). It should be emphasized again that Eqs. (4.6) and (4.7) are exact for stationary modes and they are a good approximation for slowly moving modes with broad envelopes.

4.2. Stationary intrinsic localized spin wave gap modes (ILSGs)

Setting $q = 0$ in Eq. (4.7) yields a time-independent envelope, hence a stationary ILSG. Eq. (4.6) can be solved numerically for both the ILSG frequency and eigenvector with an appropriate symmetry imposed on the mode shape. Owing to the symmetry in Eq. (4.6) spin wave modes of the shapes proposed in Ref. [40] do not exist. However, it has been shown by others that both single-peaked and double-peaked intrinsic localized spin wave modes can be found with frequencies in the gap below $\Omega_{AFMR}$.

4.2.1. Eigenvector of a single-peaked ILSG

Since a stationary ILSG, if it exists, bifurcates from the spatially uniform $q = 0$ mode, both eigenvectors are expected to have the same pattern of sign alternation of spin deviations. In a single-peaked ILSG the maximum spin deviation is at the center of the mode. Once the appropriate symmetry is imposed and the maximum spin deviation is fixed, both the eigenvector and eigenfrequency of an ILSG can be obtained numerically. As an illustration, the spin deviations of an ILSG versus site index in a chain of 128 spins are plotted in Fig. 22a for a maximum spin deviation of 0.7 and with anisotropy parameter $A = 1$. The sign of the spin deviation alternates from one spin to the next. Some qualitative statements can be made for ILSGs with different maximum spin deviations and anisotropy parameters. For a small maximum spin deviation, the envelope of the ILSG spreads over a large region of the lattice. As the maximum spin deviation increases, the degree of localization increases. For a fixed maximum spin deviation, the degree of localization also increases with increasing anisotropy strength $A$. It is observed that the frequency of an ILSG drops faster for a chain of larger anisotropy as the maximum spin deviation increases. The frequency of a stationary single-peaked ILSG as a function of the maximum spin deviation is plotted in Fig. 23 as open circles for $A = 1.0$.
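As a hedged illustration of how Eq. (4.6) can be used in practice, the sketch below evaluates its residual (left-hand side minus right-hand side) for a trial envelope on a periodic chain. It does not reproduce the numerical procedure used for Figs. 22 and 23; it only checks the small-amplitude limit, where the spatially uniform $q = 0$ mode with alternating signs should satisfy Eq. (4.6) up to terms cubic in the amplitude. The function names are hypothetical, and the sublattice amplitude ratio used for the trial mode is the one that follows from linearizing Eq. (4.6).

```python
import math

def afmr_frequency(A):
    """Reduced AFMR frequency, Omega(0) = sqrt(A (A + 4))."""
    return math.sqrt(A * (A + 4.0))

def envelope_residual(s, Omega, A, qa=0.0):
    """Residual of Eq. (4.6) at every site of a periodic chain of even length."""
    N = len(s)
    residual = []
    for n in range(N):
        s_minus, s_here, s_plus = s[(n - 1) % N], s[n], s[(n + 1) % N]
        rhs = ((math.sqrt(1.0 - s_plus ** 2) + math.sqrt(1.0 - s_minus ** 2)) * s_here
               + ((s_plus + s_minus) * math.cos(qa) + A * s_here)
               * math.sqrt(1.0 - s_here ** 2))
        sign = 1.0 if n % 2 == 0 else -1.0          # the (-1)^n prefactor
        residual.append(Omega * s_here - sign * rhs)
    return residual

def uniform_linear_mode(N, A, amplitude):
    """Trial q = 0 mode: up-spin sites carry `amplitude`, down-spin sites a
    smaller deviation of opposite sign (ratio from the linearized Eq. (4.6))."""
    Omega = afmr_frequency(A)
    ratio = -2.0 / (Omega + A + 2.0)
    s = [amplitude if n % 2 == 0 else amplitude * ratio for n in range(N)]
    return s, Omega

if __name__ == "__main__":
    for amplitude in (1e-3, 0.1, 0.3):
        s, Omega = uniform_linear_mode(16, 1.0, amplitude)
        worst = max(abs(r) for r in envelope_residual(s, Omega, 1.0))
        print("amplitude", amplitude, "largest residual", worst)
```

The residual grows roughly as the cube of the amplitude, which is one way to see why the frequency of a finite-amplitude mode must move away from $\Omega_{AFMR}$.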
Fig. 22. Eigenvector shapes of stationary single-peaked and double-peaked ILSGs. Both modes are in a chain of 128 spins with anisotropy parameter $A = 1$, and have the same maximum spin deviation of 0.7: (a) single-peaked ILSG; (b) double-peaked ILSG. The filled circles identify up spins while the open circles identify down spins (after Ref. [116]).
Fig. 23. Frequency of stationary ILSGs as a function of maximum spin deviation. The antiferromagnetic chain consists of 128 spins with anisotropy strength $A = 1$. The open circles are frequencies of single-peaked ILSGs, and the triangles are those of double-peaked ILSGs. The solid curve is obtained in the continuum limit.
4.2.2. Eigenvector of a double-peaked ILSG

Another kind of symmetrical localized mode that can exist in an easy-axis antiferromagnetic chain is a double-peaked mode, which is plotted in Fig. 22b for an ILSG with $A = 1$ for a chain of 128 spins. This mode is different from the single-peaked mode in that the maximum spin deviation is on the two nearest neighbors of the mode center rather than on the mode center itself. The double-peaked mode has the same static properties as those of the single-peaked mode except that the frequency of a double-peaked mode decreases faster with increasing maximum spin deviation than that of a single-peaked mode. For comparison, the frequency of the double-peaked mode as a function of maximum spin deviation is plotted as triangles in Fig. 23. The solid curve in Fig. 23 is obtained from the continuum limit approximation to be discussed in Section 4.4. Although the two kinds of localized spin wave modes have similar static properties, we show that they have quite different dynamical properties.

4.2.3. Power spectra of stationary ILSGs

Once an ILSG eigenvector is obtained by numerically solving Eq. (4.6), it can be used as the initial condition in a molecular dynamics (MD) simulation to test the mode's stability. A log power spectrum of the total transverse magnetic moment $m^+(t) = \sum_n s_n^+(t)$ of a single-peaked stationary ILSG is shown in Fig. 24a. The results show that there is a very strong peak at $\Omega/\Omega_{AFMR} = 0.9073$, in the gap below the bottom of the plane-wave spectrum. The resulting stationary ILSG is
Fig. 24. Power spectra of the transverse magnetic moment of stationary ILSGs in an easy-axis antiferromagnetic chain of 128 spins with $A = 1$. Both modes have the same maximum spin deviation of 0.7: (a) single-peaked ILSG; (b) double-peaked ILSG.
essentially monochromatic so the frequency obtained from the MD power spectrum can be compared to the frequency calculated from Eq. (4.6). The MD simulation frequencies found in this manner are plotted as open circles in Fig. 25 for single-peaked ILSGs, resulting in excellent agreement over the entire range of maximum spin deviation. MD simulations of double-peaked stationary ILSGs demonstrate that these modes can also last for thousands of periods without any apparent decay in the absence of noise perturbation. The power spectrum of such a double-peaked stationary ILSG is plotted in Fig. 24b, which shows a single clean peak in the gap. The above MD simulations demonstrate that both single-peaked and double-peaked localized solutions found by the numerical procedure are true periodic orbits with a single frequency. A different picture develops when the modes are randomly perturbed. The single-peaked mode is stable against noise perturbation whereas the double-peaked one is unstable. A detailed discussion of the stability results is deferred until Section 4.5.

4.3. Moving gap modes

Now let us turn to the case of traveling ILSGs. For certain nonlinear lattices it has been shown that traveling intrinsic localized vibrational modes can be excited, and the motion of such intrinsic localized excitations in discrete lattices can be very different from the motion of soliton-like excitations in a continuum since discreteness breaks the continuous translational invariance [55,75,81-83]. Since the shape of an ILSG in a discrete lattice interchanges between single-peaked and double-peaked ones while it travels through the lattice, the symmetry classification used for
Fig. 25. Comparison between simulated and analytical frequencies of a stationary single-peaked ILSG in chains with different anisotropy parameters over a wide range of maximum spin deviation. The dot-dashed curves are frequencies obtained by solving Eq. (4.6) with $q = 0$. The open circles are MD simulation results. Top curve: $A = 0.8$; lower curve: $A = 2$ (after Ref. [62]).
stationary ILSGs loses its significance for traveling ILSGs. To launch a moving ILSG, the initial envelope is assumed to be single-peaked and can be found from Eq. (4.6) with non-zero $q$. The periodic boundary condition requires $q$ to be of the form $q = 2\pi m/Na$ where $m$ is an integer. For a fixed maximum spin deviation the degree of localization increases as $q$ increases. Once the initial envelope of the moving ILSG is obtained, numerical simulations can be used to investigate its motion. Fig. 26 shows the time evolution of four different trajectories of moving ILSGs. Each curve in Fig. 26 presents the energy density distribution averaged over one $T_{AFMR}$, and is separated from the next by the time period of $20\,T_{AFMR}$. When both the relative strength of the anisotropy and the wave vector are small, as shown in the upper left panel ($A = 0.8$, $qa = 2\pi/75$) and the lower left panel ($A = 0.8$, $qa = \pi/15$), the ILSG remains localized while moving, but detailed numerical studies show that a small spin wave tail is produced while the mode travels with uniform velocity. However, no noticeable slowdown is observed during the entire time interval shown in the upper and lower left panels. As $q$ grows, the traveling ILSG becomes increasingly unstable and quickly collapses into plane spin waves when $q$ is sufficiently large (not shown). As either the strength of anisotropy $A$ or the maximum spin deviation grows, it becomes increasingly difficult for an ILSG to travel through the lattice. The upper right panel shows a case where the ILSG is pinned ($A = 2$, maximum spin deviation 0.5 and $qa = 2\pi/75$). In order to keep the ILSG ($A = 2$) moving, one must either increase the wave vector ($qa = \pi/15$) as shown in the lower right panel or reduce the maximum spin deviation. Note that this traveling ILSG gradually slows down while its amplitude decreases. It is finally pinned as illustrated in the lower right panel. Fig. 27a and Fig. 27b present the power spectra of the transverse magnetic moment of two traveling ILSGs corresponding to the trajectories shown in the lower panels of Fig. 26. The peak
Fig. 26. Comparison of traveling ILSGs with different sets of parameters. The energy density is measured from the ground state: (a) upper left panel, $A = 0.8$, maximum spin deviation 0.7 and $qa = 2\pi/75$; (b) lower left panel, $A = 0.8$, maximum spin deviation 0.7 and $qa = \pi/15$; (c) upper right panel, $A = 2$, maximum spin deviation 0.5 and $qa = 2\pi/75$; (d) lower right panel, $A = 2$, maximum spin deviation 0.5 and $qa = \pi/15$. Each curve is separated from the next by the time period of $20\,T_{AFMR}$, and shows the mean energy density averaged over one $T_{AFMR}$. The lowest curve in each frame identifies the starting time. The lattice contains 150 spins and the cyclic boundary condition is applied (after Ref. [62]).
Fig. 27. Power spectra of the transverse magnetic moment of traveling ILSGs with various sets of parameters: (a) $A = 0.8$, maximum spin deviation 0.7 and $qa = \pi/15$; (b) $A = 2$, maximum spin deviation 0.5 and $qa = \pi/15$. In both cases the peak frequency in the power spectrum suffers a red shift from the internal frequency due to the motion of the ILSG (after Ref. [62]).
frequency of the power spectrum, $\omega_{\mathrm{peak}}$, is shifted from the internal frequency $\omega$ by an amount $\Delta\omega = -qv$, i.e.,

$$\omega_{\mathrm{peak}} = \omega - qv , \qquad (4.8)$$

since the ILSG is traveling at the velocity $v$. As expected the power spectra of traveling ILSGs are much broader than those of stationary ILSGs because the ILSGs are scattered by the discrete lattice due to the absence of continuous translational invariance. In both Fig. 27a and Fig. 27b a secondary peak at the plane wave frequency $\omega_{AFMR}$ can be seen, which is more pronounced in Fig. 27b, consistent with the observation that the traveling state in the lower right panel of Fig. 26 emits more spin waves per unit time. The above picture of a traveling localized mode is similar to that observed for the discrete nonlinear Schrödinger lattice where the motion of the localized state can be described in terms of a Peierls-Nabarro (PN) barrier generated by the lattice discreteness [83,84]. For a traveling ILSG, the height of the PN energy barrier is the energy difference between the single-peaked ILSG and the double-peaked ILSG with the same frequency. This energy difference serves as a barrier against the motion of an ILSG. This barrier height increases as either the maximum spin deviation or the
strength of anisotropy increases. Therefore an ILSG can be trapped by discreteness with increasing anisotropy strength, as observed in the right panels of Fig. 26.

4.4. Weak nonlinearity limit

In previous sections numerical solutions to Eq. (4.6) have illustrated both stationary and traveling ILSG eigenvectors. Since, in the small spin deviation limit, the spatial size of such an ILSG becomes much larger than the lattice spacing, the continuum approximation can be invoked to obtain approximate analytical solutions of ILSGs. These analytical solutions, though derived for ILSGs in the limit of weak anharmonicity, provide a qualitative understanding of how the anisotropy parameter and the maximum spin deviation change the properties of an ILSG with large spin deviation.

4.4.1. The reductive perturbation method

Consider the two sublattices, A and B. Sublattice A contains all spins on even sites (up spins), while sublattice B contains all spins on odd sites (down spins). The appropriate new variables are

$$\phi_l = s_{2l}^+ \quad\text{and}\quad \psi_l = s_{2l+1}^+ . \qquad (4.9)$$

As demonstrated by numerical simulations, both stationary and traveling ILSGs can be excited in easy-axis antiferromagnetic chains in the small amplitude limit although ILSGs with large amplitude would be trapped by the discreteness of the lattice. To represent the continuum approximation for ILSGs for a range of wavevectors, the reductive perturbation method [68,71] can be used in which the phase of the carrier wave is treated exactly and the envelope functions are treated in the continuum approximation. The solutions of Eq. (4.3) are expected to have the form

$$\phi_l(t) = \sum_{j,k}\varepsilon^j F_{jk}(l,t)\,e^{ik\theta_l} , \qquad \psi_l(t) = \sum_{j,k}\varepsilon^j G_{jk}(l,t)\,e^{ik\theta_l} , \qquad j,k = 1,2,3,\ldots , \qquad (4.10)$$

where $\varepsilon$ is a small parameter measuring the magnitude of spin deviations, and

$$\theta_l = 2qla - \omega t . \qquad (4.11)$$

By construction the envelope functions $F_{jk}$ and $G_{jk}$ vary slowly with both position and time. Substituting Eq. (4.10) into Eq. (4.3), equating powers of $\varepsilon$ and considering each harmonic separately, one obtains, to $O(\varepsilon)$,

$$\sum_k \big\{(k\Omega - A - 2)F_{1k} - (1 + e^{-2ikqa})G_{1k}\big\}\,e^{ik\theta_l} = 0 , \qquad (4.12)$$

$$\sum_k \big\{(1 + e^{2ikqa})F_{1k} + (k\Omega + A + 2)G_{1k}\big\}\,e^{ik\theta_l} = 0 . \qquad (4.13)$$
Hence only linear terms contribute to this order, and Eqs. (4.12) and (4.13) can be satisfied only if the coefficients of each harmonic are zero, which requires either

$$(k\Omega)^2 = (A+2)^2 - 4\cos^2 kqa \quad\text{and}\quad \frac{F_{1k}}{G_{1k}} = \frac{2\cos kqa}{k\Omega - A - 2}\,e^{-ikqa} \qquad (4.14)$$

or

$$F_{1k} = G_{1k} = 0 \qquad (4.15)$$

for any $k$. Since the amplitudes of the fundamental harmonic ($k = 1$) are non-zero, Eq. (4.14) must hold for the case of $k = 1$. Hence, the frequency of the carrier wave is the same as that of the plane spin wave of the same wavevector. One can show that the first equation in Eq. (4.14) cannot be satisfied for any $k > 1$. Eq. (4.15) therefore must hold for any $k > 1$, that is, to $O(\varepsilon)$ no higher harmonics are generated. Collecting terms of $O(\varepsilon^2)$ in Eq. (4.3) and setting the coefficient equal to zero gives

$$\sum_k \Big\{\frac{i}{2JS}\frac{\partial F_{1k}}{\partial t_1} + 2a\,\frac{\partial G_{1k}}{\partial x_1}\,e^{-2ikqa} + (k\Omega - A - 2)F_{2k} - (1 + e^{-2ikqa})G_{2k}\Big\}\,e^{ik\theta_l} = 0 , \qquad (4.16)$$

$$\sum_k \Big\{\frac{i}{2JS}\frac{\partial G_{1k}}{\partial t_1} + 2a\,\frac{\partial F_{1k}}{\partial x_1}\,e^{2ikqa} + (1 + e^{2ikqa})F_{2k} + (k\Omega + A + 2)G_{2k}\Big\}\,e^{ik\theta_l} = 0 , \qquad (4.17)$$

where $t_1 = \varepsilon t$ and $x_1 = \varepsilon x$ with $2la \to x$. Because Eq. (4.15) holds for any $k > 1$, the coefficients for the $k$th harmonics involve only $F_{2k}$ and $G_{2k}$. As the first equation in Eq. (4.14) is not satisfied for any $k > 1$, one can therefore conclude from Eqs. (4.16) and (4.17) that

$$F_{2k} = G_{2k} = 0 \quad (k > 1) . \qquad (4.18)$$

For $k = 1$, we assume that $F_{11}$ and $G_{11}$ depend on $x_1$ and $t_1$ as

$$\xi_1 = x_1 - c\,t_1 , \qquad (4.19)$$

where $c$ can be interpreted as the envelope velocity to first order and is to be determined by the solvability condition. Substituting Eq. (4.19) into Eqs. (4.16) and (4.17) and with the help of the second equation in Eq. (4.14), one can obtain two coupled linear equations for the three variables, i.e., $\partial F_{11}/\partial\xi_1$, $F_{21}$ and $G_{21}$. Since the number of variables is larger than the number of equations, the condition for a nontrivial solution is that the two equations are identical; hence, one finds that

$$c = v_g = 2JS\,\frac{d\Omega}{dq} = \frac{4JSa}{\Omega}\,\sin 2qa , \qquad (4.20)$$

where $v_g$ is the group velocity of the linear spin wave with wavenumber $q$, and

$$G_{21} = \frac{\Omega - A - 2}{1 + e^{-2iqa}}\,F_{21} - \frac{i v_g/2JS + [(4a\cos qa)/(\Omega + A + 2)]\,e^{-iqa}}{1 + e^{-2iqa}}\,\frac{\partial F_{11}}{\partial\xi_1} , \qquad (4.21)$$

where $\partial F_{11}/\partial\xi_1$ is an unknown function to be determined, and the envelope function $F_{21}$ can be set to zero for simplicity.
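Similarly, collecting terms of $O(\varepsilon^3)$ in Eq. (4.3) and setting the coefficient to zero gives

$$\sum_k \Big\{\frac{i}{2JS}\Big(\frac{\partial F_{2k}}{\partial t_1} + \frac{\partial F_{1k}}{\partial t_2}\Big) + 2a\Big(\frac{\partial G_{2k}}{\partial x_1} + \frac{\partial G_{1k}}{\partial x_2}\Big)e^{-2ikqa} - 2a^2\,\frac{\partial^2 G_{1k}}{\partial x_1^2}\,e^{-2ikqa} + F_{1k}\sum_{k',k''} G_{1k'}G^*_{1k''}\,e^{i(k'-k'')\theta_l} + \frac{1}{2}\big[G_{1k}(1 + e^{-2ikqa}) + A F_{1k}\big]\sum_{k',k''} F_{1k'}F^*_{1k''}\,e^{i(k'-k'')\theta_l} + (k\Omega - A - 2)F_{3k} - (1 + e^{-2ikqa})G_{3k}\Big\}\,e^{ik\theta_l} = 0 , \qquad (4.22)$$

$$\sum_k \Big\{\frac{i}{2JS}\Big(\frac{\partial G_{2k}}{\partial t_1} + \frac{\partial G_{1k}}{\partial t_2}\Big) + 2a\Big(\frac{\partial F_{2k}}{\partial x_1} + \frac{\partial F_{1k}}{\partial x_2}\Big)e^{2ikqa} + 2a^2\,\frac{\partial^2 F_{1k}}{\partial x_1^2}\,e^{2ikqa} - G_{1k}\sum_{k',k''} F_{1k'}F^*_{1k''}\,e^{i(k'-k'')\theta_l} - \frac{1}{2}\big[F_{1k}(1 + e^{2ikqa}) + A G_{1k}\big]\sum_{k',k''} G_{1k'}G^*_{1k''}\,e^{i(k'-k'')\theta_l} + (1 + e^{2ikqa})F_{3k} + (k\Omega + A + 2)G_{3k}\Big\}\,e^{ik\theta_l} = 0 , \qquad (4.23)$$

where $t_2 = \varepsilon^2 t$ and $x_2 = \varepsilon^2 x$. By the same argument leading to Eq. (4.18), one can determine from Eqs. (4.22) and (4.23) that

$$F_{3k} = G_{3k} = 0 \quad (k > 1) . \qquad (4.24)$$

Eqs. (4.18) and (4.24) indicate that only fundamental harmonics appear up to the order of $O(\varepsilon^3)$. By induction one can show that no higher harmonics can be generated up to any order, which is consistent with the fact that no rotating wave approximation is needed in order to obtain numerically the eigenvector of an ILSG. The remaining part, i.e., the coefficients of $e^{i\theta_l}$, yields two coupled nonlinear differential equations. These two differential equations can be simplified by setting

$$\xi_1 = x_1 - v_g t_1 \quad\text{and}\quad \tau = t_2 \qquad (4.25)$$

and seeking solutions of the form $F_{11}(\xi_1, \xi_2, \tau)$ and $G_{11}(\xi_1, \xi_2, \tau)$. With the help of the second equation in Eqs. (4.14) and (4.21), one can combine the two nonlinear differential equations to obtain the following nonlinear Schrödinger equation (NLS) for the envelope function $F_{11}$:

$$i\,\frac{\Omega}{JS}\,\frac{\partial F_{11}}{\partial\tau} + C(q)\,\frac{\partial^2 F_{11}}{\partial\xi_1^2} + P(q)\,|F_{11}|^2 F_{11} = 0 , \qquad (4.26)$$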
where

$$C(q) = \Omega\,\frac{d^2\Omega}{dq^2} = 4a^2\cos 2qa - \Big(\frac{v_g}{2JS}\Big)^2 , \qquad (4.27)$$

$$P(q) = \frac{2A\Omega^2}{\Omega + A + 2} . \qquad (4.28)$$
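The coefficients of the envelope equation can be checked numerically against the dispersion relation they come from. The sketch below is an illustration under stated assumptions: it works in reduced units ($\Omega = \omega/2JS$, lattice constant `a = 1` by default), takes $C(q)$ as $4a^2\cos 2qa - (v_g/2JS)^2$ and $P(q)$ as $2A\Omega^2/(\Omega + A + 2)$, and compares $C(q)$ with $\Omega\,d^2\Omega/dq^2$ evaluated by a central finite difference. Function names and the step size are hypothetical choices.

```python
import math

def omega_reduced(qa, A):
    """Eq. (4.4), positive branch: Omega(q) = sqrt((A+2)^2 - 4 cos^2 qa)."""
    return math.sqrt((A + 2.0) ** 2 - 4.0 * math.cos(qa) ** 2)

def group_velocity_reduced(qa, A, a=1.0):
    """v_g / (2JS) = dOmega/dq = (2a / Omega) sin 2qa, from Eq. (4.20)."""
    return 2.0 * a * math.sin(2.0 * qa) / omega_reduced(qa, A)

def dispersion_coefficient(qa, A, a=1.0):
    """C(q) of Eq. (4.27)."""
    return 4.0 * a * a * math.cos(2.0 * qa) - group_velocity_reduced(qa, A, a) ** 2

def nonlinear_coefficient(qa, A):
    """P(q) of Eq. (4.28)."""
    Om = omega_reduced(qa, A)
    return 2.0 * A * Om * Om / (Om + A + 2.0)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def supports_envelope_soliton(qa, A, a=1.0):
    """Localized solutions of Eq. (4.26) require C(q) P(q) > 0."""
    return dispersion_coefficient(qa, A, a) * nonlinear_coefficient(qa, A) > 0.0

if __name__ == "__main__":
    A, qa = 1.0, 0.2
    numeric = omega_reduced(qa, A) * second_derivative(lambda x: omega_reduced(x, A), qa)
    print("C(q) from Eq. (4.27):", dispersion_coefficient(qa, A))
    print("Omega * d2Omega/dq2 (finite difference):", numeric)
    print("P(q):", nonlinear_coefficient(qa, A))
    print("soliton allowed:", supports_envelope_soliton(qa, A))
```

Since $P(q)$ is positive for positive $A$ on this branch, the existence condition reduces to $C(q) > 0$, which holds near the zone center and fails once $\cos 2qa$ becomes too small.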
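4.4.2. Envelope solitons
Eq. (4.26) has localized solutions (envelope solitons) when $C(q)P(q) > 0$, and the solutions are given by
$$F(\xi_1, \tau) = F_0\,\mathrm{sech}\!\left[\sqrt{\frac{P(q)}{2C(q)}}\,F_0\,(\xi_1 - \xi_0)\right] \exp\!\left[\frac{i\,JS\,P(q)F_0^2}{2\Omega}\,\tau\right], \tag{4.29}$$
where both $F_0$ and $\xi_0$ are constants. Therefore, to $O(\epsilon)$ the spin deviation on even sites, $\phi(x, t)$, can be obtained in terms of $x$ and $t$ as
$$\phi(x, t) = \phi_m\,\mathrm{sech}\!\left(\frac{x - v_g t - x_0}{L}\right) e^{i[qx - (\omega + \Delta\omega)t]}, \tag{4.30}$$
where $\phi_m$ is the maximum spin deviation, $L$ the width parameter of the envelope given by
$$L = \sqrt{\frac{2C(q)}{P(q)}}\;\frac{1}{\phi_m} \tag{4.31}$$
and
$$\Delta\omega = -\frac{JS\,P(q)}{2\Omega}\,\phi_m^2 \tag{4.32}$$
is the nonlinear frequency shift. The spin deviation on odd sites, $\psi(x, t)$, can then be found from Eq. (4.14) to be
$$\psi(x, t) = -\frac{2\cos qa}{\Omega + A + 2}\,e^{iqa}\,\phi(x, t). \tag{4.33}$$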
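As a consistency check on the envelope soliton, the following sketch substitutes the sech profile of Eq. (4.29) into the NLS equation (4.26) and evaluates the residual by finite differences. It assumes the NLS form $i(\Omega/JS)\,\partial F/\partial\tau + C\,\partial^2 F/\partial\xi^2 + P\,F|F|^2 = 0$; the function names and the sample parameter values are hypothetical.

```python
import cmath
import math

def soliton(xi, tau, F0, C, P, Omega, JS=1.0, xi0=0.0):
    # Envelope of Eq. (4.29): sech profile times a slowly rotating phase.
    kappa = math.sqrt(P / (2.0 * C)) * F0            # inverse width
    phase = JS * P * F0 ** 2 * tau / (2.0 * Omega)   # nonlinear phase
    return F0 / math.cosh(kappa * (xi - xi0)) * cmath.exp(1j * phase)

def nls_residual(xi, tau, F0, C, P, Omega, JS=1.0, h=1e-3):
    # Left-hand side of Eq. (4.26) using central finite differences.
    F = lambda x, t: soliton(x, t, F0, C, P, Omega, JS)
    dF_dtau = (F(xi, tau + h) - F(xi, tau - h)) / (2.0 * h)
    d2F_dxi2 = (F(xi + h, tau) - 2.0 * F(xi, tau) + F(xi - h, tau)) / h ** 2
    centre = F(xi, tau)
    return 1j * (Omega / JS) * dF_dtau + C * d2F_dxi2 + P * centre * abs(centre) ** 2
```

A residual close to zero for arbitrary `xi` and `tau` indicates that the width $\sqrt{2C/P}/F_0$ and the phase rate $JS\,P F_0^2/2\Omega$ are mutually consistent, which is the origin of Eqs. (4.31) and (4.32).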
The localized solution given by Eqs. (4.30) and (4.33) is similar to the localized precession soliton in the collinear phase of an antiferromagnet, which was obtained from a continuum phenomenological description [14]. Although the functional forms of these solutions are slightly different, they both exist in the gap of the linear spin wave spectrum. Envelope solitons with nonzero q still couple to electromagnetic radiation because they have a nonzero transverse magnetic moment, while linear spin waves with nonzero q do not. The density of the transverse magnetic moment can be defined as
$$m_q^+(x, t) = \frac{\phi(x, t) + \psi(x, t)}{2a} = \frac{\phi(x, t)}{2a}\left[1 - \frac{2\cos qa}{\Omega + A + 2}\,e^{iqa}\right]. \tag{4.34}$$
Therefore the net transverse magnetic moment can be obtained by integrating $m_q^+(x, t)$ over the lattice, i.e.,
$$M_q^+(t) = \int_{-\infty}^{\infty} m_q^+(x, t)\,dx = \frac{\pi}{2a}\sqrt{\frac{2C(q)}{P(q)}}\;\mathrm{sech}\!\left(\frac{\pi q L}{2}\right)\left[1 - \frac{2\cos qa}{\Omega + A + 2}\,e^{iqa}\right] e^{i(qx_0 - \omega^* t)}, \tag{4.35}$$
where the frequency of the net magnetic moment $M_q^+(t)$ rotating in the x-y plane is given by
$$\omega^* = \omega - \frac{(JS)^2 P(q)}{\omega}\,\phi_m^2 - q v_g. \tag{4.36}$$
Note that the frequency of a small-amplitude ILSG decreases quadratically with its amplitude. The frequency of stationary ILSGs given by Eq. (4.36) with q = 0 is plotted as the solid curve in Fig. 23 for comparison. Good agreement is obtained for ILSGs with maximum spin deviation less than 0.3, beyond which an apparent discrepancy occurs and the ILSG frequency drops faster than a quadratic law predicts. It is clear from this figure that single-peaked and double-peaked ILSGs converge in the small-amplitude limit. Note that according to Eq. (4.35) the small-amplitude stationary ILSGs (q = 0) have a net magnetic moment independent of the maximum spin deviation.
4.5. Stability of stationary ILSGs
In Section 4.2 the eigenvectors of stationary single-peaked and double-peaked ILSGs were presented. MD simulations in the absence of noise perturbation demonstrate that these localized spin wave modes are true periodic orbits with a single frequency. In realistic systems, however, noise perturbations are always present. It is therefore of experimental interest to investigate the dynamical behavior of ILSGs in the presence of such perturbations. We begin with a linear stability analysis of ILSGs of both types. The eigenvector of a stationary ILSG satisfies the system of time-independent equations given by Eq. (4.6), which can be written formally in matrix form as
$$\hat{L}\hat{s} = 0, \tag{4.37}$$
where the eigenvector
$$\hat{s} = (s_0, s_1, \ldots, s_{N-1})^T, \tag{4.38}$$
and $\hat{L}$ is a tridiagonal matrix with the nonzero elements given by
$$L_{nn} = \Omega + (-1)^{n+1}\left(\sqrt{1 - s_{n+1}^2} + \sqrt{1 - s_{n-1}^2} + A\sqrt{1 - s_n^2}\right), \tag{4.39}$$
$$L_{n,n-1} = L_{n,n+1} = (-1)^{n+1}\sqrt{1 - s_n^2}. \tag{4.40}$$
The time evolution of a perturbed ILSG can be written in the form
$$s_n^+(t) = [s_n + u_n(t)]\,e^{-i\omega t}, \tag{4.41}$$
where the complex perturbation $u_n(t)$ is separated from the stationary eigenvector. Substituting Eq. (4.41) into the equation of motion given by Eq. (4.3) gives in matrix form
$$\frac{1}{2JS}\frac{d}{dt}\begin{pmatrix}\hat{u}^R \\ \hat{u}^I\end{pmatrix} = \begin{pmatrix}0 & -\hat{L} \\ \hat{K} & 0\end{pmatrix}\begin{pmatrix}\hat{u}^R \\ \hat{u}^I\end{pmatrix}. \tag{4.42}$$
Here the superscripts R and I represent the real and imaginary parts, respectively, and
$$\hat{K} = \hat{L} + \hat{M}. \tag{4.43}$$
The matrix $\hat{M}$ is also a tridiagonal matrix with nonzero elements given by
$$M_{nn} = (-1)^n\,\frac{s_n}{\sqrt{1 - s_n^2}}\,(s_{n+1} + s_{n-1} + A s_n), \tag{4.44}$$
$$M_{n,n-1} = (-1)^n\,\frac{s_{n-1}s_n}{\sqrt{1 - s_{n-1}^2}} \tag{4.45}$$
and
$$M_{n,n+1} = (-1)^n\,\frac{s_{n+1}s_n}{\sqrt{1 - s_{n+1}^2}}. \tag{4.46}$$
The stability of an ILSG is determined by the eigenvalues $\lambda$ of the system of equations given by Eq. (4.42). The perturbation grows exponentially if $\mathrm{Re}\{\lambda\} > 0$ for one or more $\lambda$, where $\lambda = 2JS\Lambda$, and for any given ILSG eigenvector $\Lambda$ can be obtained by solving the following equation:
$$\det[\Lambda^2 + \hat{K}\hat{L}] = 0. \tag{4.47}$$
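The construction of the stability matrices can be sketched as follows. This is a minimal illustration rather than the procedure used for the results reported here: it assumes sites indexed from n = 0, a periodic chain with an even number N >= 4 of spins (so the tridiagonal matrices acquire corner elements), and a hypothetical function name.

```python
import math

def stability_matrices(s, A, Omega):
    """Return (L, M, K) as nested lists for a stationary eigenvector s.

    Assumes a periodic chain with an even number N >= 4 of spins.
    """
    N = len(s)
    c = [math.sqrt(1.0 - x * x) for x in s]            # sqrt(1 - s_n^2)
    L = [[0.0] * N for _ in range(N)]
    M = [[0.0] * N for _ in range(N)]
    for n in range(N):
        m, p = (n - 1) % N, (n + 1) % N                # periodic neighbours
        sign = -1.0 if n % 2 else 1.0                  # (-1)^n
        L[n][n] = Omega - sign * (c[p] + c[m] + A * c[n])          # Eq. (4.39)
        L[n][m] = -sign * c[n]                                     # Eq. (4.40)
        L[n][p] = -sign * c[n]
        M[n][n] = sign * s[n] / c[n] * (s[p] + s[m] + A * s[n])    # Eq. (4.44)
        M[n][m] = sign * s[m] * s[n] / c[m]                        # Eq. (4.45)
        M[n][p] = sign * s[p] * s[n] / c[p]                        # Eq. (4.46)
    K = [[L[i][j] + M[i][j] for j in range(N)] for i in range(N)]  # Eq. (4.43)
    return L, M, K
```

With these matrices, Eq. (4.47) states that the values of $\Lambda^2$ are the eigenvalues of $-\hat{K}\hat{L}$, so any standard eigenvalue routine applied to that product yields the growth rates.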
When Eq. (4.47) is solved for single-peaked stationary ILSGs in a chain of 128 spins with periodic boundary conditions, the real parts of the $\lambda$'s are essentially zero within numerical accuracy. The small positive values of $\mathrm{Re}\{\lambda\}/\omega_{\mathrm{AFMR}}$ found for some $\lambda$'s of single-peaked ILSGs are explained by the inaccuracy of the ILSG eigenvectors and of the numerical procedure used to solve Eq. (4.47). The double-peaked ILSG yields a different picture. The maximum real part of $\lambda$ is plotted in Fig. 28 for double-peaked ILSGs as a function of the maximum spin deviation of the mode. The filled circles are obtained for ILSGs in an antiferromagnetic chain with anisotropy parameter A = 2, while the filled squares are for the case of A = 1. It appears that the double-peaked ILSG
Fig. 28. The maximum growth rate of the perturbation as a function of the maximum spin deviation for a double-peaked ILSG in a uniaxial easy-axis antiferromagnetic chain of 128 spins. The anisotropy parameters are A = 2.0 for the filled circles and A = 1.0 for the filled squares.
becomes more unstable as either the maximum spin deviation or the anisotropy strength is increased. As the amplitude decreases, the maximum growth rate of the perturbation in a double-peaked ILSG approaches zero, consistent with the fact that both single-peaked and double-peaked modes are governed by Eq. (4.26) in the continuum limit. The linear stability analysis thus shows that single-peaked ILSGs are stable against noise perturbations while double-peaked ILSGs are not. The question of how an unstable double-peaked ILSG evolves remains to be answered. To this end, let us consider the stability of stationary ILSGs in the presence of noise perturbation via MD simulations. Fig. 29 shows the time evolution of the two types of mode when the initial conditions of the eigenvectors include a random noise perturbation with maximum magnitude 0.5% of the largest spin deviation in the mode. Each curve in Fig. 29 presents the energy density distribution at a specific time. Fig. 29a demonstrates that the single-peaked ILSG is stable, since the shape of the mode is unchanged throughout the time interval, while Fig. 29b shows that the double-peaked ILSG is unstable and quickly evolves into a stable single-peaked mode. Fig. 30 shows the power spectra of the total transverse magnetic moment of both modes for different initial conditions. In Fig. 30a, the power spectrum of the perturbed single-peaked ILSG
Fig. 29. Parity dependence of the stability of stationary ILSGs. The chain contains 150 spins and the anisotropy parameter A = 2.0. Both ILSGs are initially randomly perturbed: (a) Time evolution of the energy density distribution of a single-peaked ILSG with maximum spin deviation of 0.7; (b) Time evolution of the energy density distribution of a double-peaked ILSG with maximum spin deviation of 0.58. The time t is measured in units of $2\pi/\omega_{\mathrm{AFMR}}$ (after Ref. [116]).
Fig. 30. Power spectrum of the transverse magnetic moment of a stationary ILSG with different symmetry under different conditions. The antiferromagnetic chain consists of 150 spins with anisotropy parameter A = 2.0: (a) Power spectra for a single-peaked ILSG with maximum spin deviation of 0.7; (b) Power spectra for a double-peaked ILSG with maximum spin deviation of 0.58. Dot-dashed curves: unperturbed. Solid curves: perturbed with random noise. In both (a) and (b), the solid curve is shifted up by 4 decades from the dot-dashed curve for clarity (after Ref. [116]).
(solid curve) contains a strong peak below the bottom of the plane wave spectrum corresponding to the local mode, as well as a much smaller peak at the q"0 AFMR frequency excited by the random noise perturbation. The peak of the solid curve matches the peak resulting from the exact eigenvector without perturbation (dot-dashed curve), indicating again that the mode with the single peak is stable. In Fig. 30b, the results for a double-peaked ILSG are shown. When a noise perturbation is included the clean power spectrum for a double-peaked mode (dot-dashed curve) is replaced by a new complex power spectrum (solid curve) peaked at a lower frequency corresponding to a new stable single-peaked ILSG since the perturbed unstable double-peaked ILSG evolves into a stable single-peaked one.
5. Antiferromagnetic chain with on-site easy-plane or biaxial anisotropy
A large number of antiferromagnets are characterized by biaxial anisotropy, while some have nearly gapless linear spin wave spectra that are often approximated by an easy-plane anisotropy [15,85]. In this section we examine stationary ILSMs in chains of classical spins coupled antiferromagnetically through nearest-neighbor exchange interactions with on-site biaxial anisotropy. The uniaxial easy-plane antiferromagnetic chain [57] then appears as a special case.
5.1. The model Hamiltonian
We consider a one-dimensional antiferromagnetic chain of N classical spins which is described by the Hamiltonian
$$H = 2J\sum_n \mathbf{S}_n \cdot \mathbf{S}_{n+1} - D_x\sum_n (S_n^x)^2 - D_y\sum_n (S_n^y)^2, \tag{5.1}$$
where the nearest-neighbor exchange constant, J, and the on-site anisotropy constants $D_x$ and $D_y$ are all positive, with $D_x \ge D_y$. The z-axis is a hard axis and N is even. This chain is magnetically ordered along the x-axis at low temperatures, with spins pointing alternately parallel or antiparallel to this axis. (The uniaxial easy-plane antiferromagnetic chain is recovered by setting $D_x = D_y$.) The effective magnetic field acting on the nth spin is given by
$$\mathbf{H}_n^{\mathrm{eff}}(t) = -\nabla_{\mathbf{S}_n} H = -2J(\mathbf{S}_{n-1} + \mathbf{S}_{n+1}) + 2D_x S_n^x\,\mathbf{e}_x + 2D_y S_n^y\,\mathbf{e}_y, \tag{5.2}$$
where $\mathbf{e}_{x,y}$ are unit vectors along the positive x and y axes, respectively. Since spin waves in the model given by Eq. (5.1) are in general elliptically polarized, the equations of motion for the x, y and z components of the classical spin vectors,
$$\frac{1}{2JS}\frac{ds_n^x}{dt} = [(s_{n-1}^y + s_{n+1}^y)s_n^z - (s_{n-1}^z + s_{n+1}^z)s_n^y] - A_y s_n^y s_n^z, \tag{5.3a}$$
$$\frac{1}{2JS}\frac{ds_n^y}{dt} = [(s_{n-1}^z + s_{n+1}^z)s_n^x - (s_{n-1}^x + s_{n+1}^x)s_n^z] + A_x s_n^x s_n^z \tag{5.3b}$$
and
$$\frac{1}{2JS}\frac{ds_n^z}{dt} = (s_{n-1}^x + s_{n+1}^x)s_n^y - (s_{n-1}^y + s_{n+1}^y)s_n^x - (A_x - A_y)s_n^x s_n^y, \tag{5.3c}$$
should be considered. The dimensionless variables are $A_{x,y} = D_{x,y}/J$ and $\mathbf{s}_n = \mathbf{S}_n/S$, where S is the magnitude of the spin. With cyclic boundary conditions applied, the eigenfrequencies of the linear spin waves are found from Eqs. (5.3) to be
$$\Omega_\pm^2(q) = (A_x - A_y)(A_x + 4) + 4\sin^2 qa + 2A_y(1 \pm \cos qa), \tag{5.4}$$
where a is the lattice spacing between two adjacent spins and the dimensionless frequency $\Omega_\pm(q) = \omega_\pm(q)/2JS$. Can an intrinsic localized spin wave mode exist above the top of the plane wave spectrum at the zone boundary? A necessary condition for an ILSM is that the substitution of $q = \pi/2a + i\kappa$ into Eq. (5.4) gives a real localized mode frequency; a complex frequency is found, so this possibility can be excluded. The corresponding eigenvectors are given by
$$\{s_n^y, s_n^z, s_{n+1}^y, s_{n+1}^z\} \propto \left\{\frac{A_x + 2(1 \mp \cos qa)}{\Omega_\pm(q)},\; -i,\; \pm\frac{A_x + 2(1 \mp \cos qa)}{\Omega_\pm(q)},\; \pm i\right\}, \tag{5.5}$$
where the upper (lower) sign is for the upper (lower) branch. Note that the polarization of spin waves for one branch is orthogonal to that of the other branch. A typical linear spin wave spectrum is plotted in Fig. 31 for the case of $A_x \ne A_y$. The two branches are degenerate at the Brillouin zone boundary. In this biaxial case a gap appears below the lower branch as a consequence of the breaking of rotational symmetry in the x-y plane. It is
Fig. 31. Linear spin wave spectrum for an antiferromagnetic chain with biaxial anisotropy. The anisotropy parameters are $A_x = 1.5$ and $A_y = 1.0$. The ILSR arrow identifies the frequency of the intrinsic localized spin wave resonance, while the ILSG arrow identifies that of the intrinsic localized spin wave gap mode.
clear from Eq. (5.4) that the gap $\Omega_-(0) = \sqrt{(A_x - A_y)(A_x + 4)}$ in the lower branch would disappear for the hard-axis uniaxial case where $A_x = A_y$. Since the small-amplitude spin waves are elliptically polarized, one may anticipate that the nonlinear resonant spin wave excitation which drops from the bottom of a branch will also be elliptically polarized. To find the eigenvector of a stationary ILSM one may use the ansatz
$$s_n^y(t) = s_n^y\cos\omega t, \quad s_n^z(t) = s_n^z\sin\omega t$$
and
$$s_n^x(t) = (-1)^n\{1 - (s_n^y)^2\cos^2\omega t - (s_n^z)^2\sin^2\omega t\}^{1/2}. \tag{5.6}$$
Here the squared terms in $s_n^y$ and $s_n^z$ cannot be neglected. Substituting Eq. (5.6) into Eqs. (5.3b) and (5.3c) one finds, in the rotating wave approximation (RWA) where harmonics higher than the fundamental are ignored [30,34], the following coupled time-independent nonlinear equations:
$$(-1)^{n+1}\Omega s_n^y = f(s_n^z, s_n^y)[s_{n-1}^z + s_{n+1}^z + A_x s_n^z] + [f(s_{n-1}^z, s_{n-1}^y) + f(s_{n+1}^z, s_{n+1}^y)]s_n^z, \tag{5.7a}$$
and
$$(-1)^{n+1}\Omega s_n^z = [f(s_{n-1}^y, s_{n-1}^z) + f(s_{n+1}^y, s_{n+1}^z)]s_n^y + f(s_n^y, s_n^z)[s_{n-1}^y + s_{n+1}^y + (A_x - A_y)s_n^y], \tag{5.7b}$$
where
$$f(a, b) = (1 - a^2)^{1/2}\,F\!\left(-\frac{1}{2}, \frac{1}{2}, 2, \frac{b^2 - a^2}{1 - a^2}\right), \tag{5.8}$$
and F(a, b, c, z) is the hypergeometric function [86]. With appropriate symmetry imposed, Eqs. (5.7a) and (5.7b) can be solved for the eigenvector and eigenfrequency of a localized nonlinear spin wave mode. Below we consider only the single-peaked modes, since double-peaked modes are unstable with respect to noise.
5.2. Stationary intrinsic localized spin wave modes
5.2.1. Uniaxial easy-plane anisotropy
For the uniaxial case where $A_x = A_y = A$ the lower branch of the spin wave spectrum is gapless, and the only possible intrinsic localized spin wave modes are resonant modes (ILSRs) oscillating at frequencies below $\Omega_+(0) = 2\sqrt{A}$. Such in-band resonances occur because the upper and lower branches have different polarizations. A stationary localized resonance, if it exists, should bifurcate from the spatially uniform q = 0 mode of the upper branch in Fig. 31. According to Eq. (5.5) the uniform mode eigenvector is
$$\{s_n^y, s_n^z, s_{n+1}^y, s_{n+1}^z\} \propto \{\sqrt{A}/2,\; -1,\; \sqrt{A}/2,\; 1\}. \tag{5.9}$$
For the ILSR one seeks the symmetric single-peaked localized solution with the same pattern of sign alternation near the center of the resonance. Since the ILSR has a frequency within the lower branch of the linear spin wave spectrum, far away from the mode center the localized solution would be
mixed with the spatially uniform plane wave solution from the lower branch. Such mixing is demonstrated by the numerical solution of the set of coupled nonlinear equations given by Eqs. (5.7) for a chain of 64 spins with periodic boundary conditions. The intrinsic localized resonance does exist for a range of anisotropy parameters. As an illustration of the eigenvector, the spin deviation versus site index of an ILSR is represented in Fig. 32 by the filled circles for the parameters $s^z_{\max} = -0.65$ and A = 1.0. The frequency of this ILSR is found to be $\Omega = 0.9301\,\Omega_+(0)$ within the rotating wave approximation (RWA). As expected, near the center of the resonance the sign of the z component of the spin deviation alternates from one spin to the next while the sign of the y component does not change. Hence the time-periodic and spatially localized ILSR has an oscillating net magnetic moment in the y direction. Unlike the intrinsic localized gap modes of antiferromagnetic chains with easy-axis anisotropy described in Section 4, the spin deviations do not disappear with increasing distance from the mode center. Instead the localized excitation evolves into a weak plane wave pattern, as expected for a resonance. The pattern at the mode center has the eigenvector character of the q = 0 mode from the upper branch. The seemingly irregular off-center region of $s_n^y$ far from the center exhibits a smooth plane wave pattern under the transformation $s_n^y \to (-1)^n s_n^y$. This sign alternation of $s_n^y$ is a characteristic feature of the lower branch. The wavenumber q associated with the small-amplitude off-center plane wave is obtained from the Fourier transform of $s_n^z$ in q space to be $q = 0.6(\pi/2a)$, corresponding to a frequency of $0.9277\,\Omega_+(0)$ in the lower branch, which is in good agreement with the ILSR frequency given that the small size of the lattice limits the accuracy with which the wavenumber can be specified.
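Solving Eqs. (5.7) numerically requires the function f(a, b) of Eq. (5.8). The sketch below evaluates it in two ways: from the hypergeometric series, and from a direct time average over one period. The time-average form, $f(a,b) = \pi^{-1}\int_0^{2\pi}\sqrt{1 - a^2\sin^2\theta - b^2\cos^2\theta}\,\sin^2\theta\,d\theta$, is an assumption about how the RWA projection onto the fundamental is defined, and all function names are hypothetical. The series used here converges only for $|b^2 - a^2| < 1 - a^2$.

```python
import math

def hyp2f1_series(a, b, c, z, terms=400):
    # Gauss hypergeometric series F(a, b; c; z), valid for |z| < 1.
    total, term = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
        total += term
    return total

def f_rwa(a, b):
    # Closed form of Eq. (5.8).
    z = (b * b - a * a) / (1.0 - a * a)
    return math.sqrt(1.0 - a * a) * hyp2f1_series(-0.5, 0.5, 2.0, z)

def f_time_average(a, b, n=4000):
    # Assumed definition: projection of the longitudinal spin component
    # onto the fundamental harmonic, by the midpoint rule over one period.
    acc = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * (j + 0.5) / n
        s2 = math.sin(theta) ** 2
        acc += math.sqrt(1.0 - a * a * s2 - b * b * (1.0 - s2)) * s2
    return 2.0 * acc / n
```

For small arguments both routines approach $1 - \tfrac{3}{8}a^2 - \tfrac{1}{8}b^2$, the lowest-order expansion of f(a, b).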
Fig. 32. Shape of a stationary intrinsic localized spin wave resonance with the maximum spin deviation $s^z_{\max} = -0.65$ and the anisotropy parameters $A_x = A_y = 1.0$: (a) The spin deviation $s_n^z$ versus lattice site index n. The left side shows a factor 5 expansion of the ordinate to display the plane wave character in the wings; (b) The spin deviation $s_n^y$ versus site index n. The left side shows the same factor 5 expansion and a sign alternation to illustrate the resonant mode plane wave character. The sign alternation is a characteristic feature of the lower branch (after Ref. [57]).
Fig. 33. Comparison between the rotating wave approximation (RWA) frequency and the MD simulation frequency for a stationary ILSR versus spin deviation. The anisotropy parameters are $A_x = A_y = 1.0$. The dot-dashed curve is obtained using the RWA, and the open circles are results calculated from the first 820 $T_+(0)$ MD simulation points of the net magnetic moment $M^y(t)$ (after Ref. [57]).
Fig. 34. Stability of the unperturbed ILSR shown in Fig. 32. The energy density shown here is measured from the ground state energy and averaged over one period. The time is measured in units of $T_+(0)$ (after Ref. [57]).
The dot-dashed curve in Fig. 33 shows the ILSR frequency found in the RWA as a function of the maximum spin deviation $s^z_{\max}$. The frequency drops further into the lower spin wave band as $s^z_{\max}$ increases. These RWA frequencies can be compared to molecular dynamics (MD) simulation frequencies. Since the RWA has been used to obtain the ILSR eigenvector, the mode stability needs to be checked. In MD simulations the numerically determined eigenvector is used as the initial condition, i.e., $\mathbf{s}_n = ((-1)^n\sqrt{1 - (s_n^z)^2},\, 0,\, s_n^z)$, and the discrete equations of motion for the x, y and z spin components are integrated numerically using the fourth-order Runge-Kutta method with a time step of $T_+(0)/200$, where $T_+(0) = \pi/JS\,\Omega_+(0)$.
These molecular dynamics simulations show that an ILSR with modest spin deviation can last many hundreds of periods without apparent decay. For example, the time evolution of the ILSR energy density averaged over one period is plotted in Fig. 34. The parameters are the same as those in Fig. 32. No decay can be seen after 800 $T_+(0)$. When a noise perturbation (< 0.1%) is added, the ILSR in Fig. 34 moves after about 800 $T_+(0)$, while it remains localized as the perturbation develops further. As the maximum spin deviation is increased, the amplitude of the plane wave component in both wings of this excitation increases, leading to instability and delocalization after sufficient time, as might be expected for a localized excitation which is nearly degenerate with some modes in the plane wave spectrum.
Since the ILSR is a collective excitation, a calculation of the power spectrum of the total magnetic moment, $\mathbf{M}(t) = \sum_n \mathbf{s}_n(t)$, is a useful method with which to identify the relative strength of the different frequency components of the excitation as well as to check the accuracy of the RWA.
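The integration scheme described above can be sketched as follows. This is an illustrative fourth-order Runge-Kutta step, not the simulation code behind the figures: time is measured in the dimensionless unit 2JSt, the chain is periodic, Eq. (5.3a) is taken in the cyclic form that conserves the spin length, and the function names are hypothetical.

```python
import math

def rhs(s, Ax, Ay):
    # Right-hand side of Eqs. (5.3) in dimensionless time 2JS*t, periodic chain.
    N = len(s)
    out = []
    for n in range(N):
        x, y, z = s[n]
        xm, ym, zm = s[n - 1]              # s[-1] wraps to the last site
        xp, yp, zp = s[(n + 1) % N]
        X, Y, Z = xm + xp, ym + yp, zm + zp    # nearest-neighbour sums
        out.append((Y * z - Z * y - Ay * y * z,
                    Z * x - X * z + Ax * x * z,
                    X * y - Y * x - (Ax - Ay) * x * y))
    return out

def rk4_step(s, dt, Ax, Ay):
    # One classical fourth-order Runge-Kutta step for the whole chain.
    def shifted(base, k, h):
        return [tuple(b[i] + h * kk[i] for i in range(3)) for b, kk in zip(base, k)]
    k1 = rhs(s, Ax, Ay)
    k2 = rhs(shifted(s, k1, dt / 2.0), Ax, Ay)
    k3 = rhs(shifted(s, k2, dt / 2.0), Ax, Ay)
    k4 = rhs(shifted(s, k3, dt), Ax, Ay)
    return [tuple(s[n][i] + dt / 6.0 * (k1[n][i] + 2.0 * k2[n][i] + 2.0 * k3[n][i] + k4[n][i])
                  for i in range(3)) for n in range(len(s))]

def initial_state(sz):
    # s_n = ((-1)^n sqrt(1 - (s_n^z)^2), 0, s_n^z), as used for the MD runs.
    return [((-1) ** n * math.sqrt(1.0 - v * v), 0.0, v) for n, v in enumerate(sz)]
```

In this time unit one period of the uniform upper-branch mode is $2\pi/\Omega_+(0)$, so the step quoted in the text corresponds to `dt = 2 * math.pi / (200 * Omega_plus_0)`. Monitoring the length of each spin during a run gives a simple check on the integration accuracy.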
Fig. 35. Power spectrum of the net magnetic moment $M^y(t)$ of the stationary ILSR shown in Fig. 32. Besides the strong peak at $0.9265\,\omega_+(0)$ corresponding to the ILSR, much weaker peaks appear in the power spectrum at $\omega_+(0)$ and at the third harmonic of the ILSR frequency, indicating that linear spin waves and the third harmonic are also excited due to the inaccuracy in the eigenvector resulting from the rotating wave approximation.
In the uniaxial case $M^z$ commutes with the Hamiltonian, and is therefore a constant of the motion. Because of the sign alternation in $s_n^z$, $M^z$ is indeed zero. The total magnetic moment is thus linearly polarized. Fig. 35 shows the log power spectrum of $M^y(t)$ for the ILSR plotted in Fig. 32. This power spectrum is calculated from the first 820 $T_+(0)$ MD data values. A strong peak appears at $\Omega = 0.9265\,\Omega_+(0)$, which should be compared to the value $0.9301\,\Omega_+(0)$ found in the RWA. Since the eigenvector is not an exact eigenvector due to the RWA, linear spin waves are also excited. However, the power spectrum peak at $\Omega_+(0)$ is more than 3 orders of magnitude weaker than the resonance peak. Peaks at the third and fifth harmonics are also present in the power spectrum, but their strengths are at least four orders of magnitude weaker than the peak corresponding to the fundamental ILSR frequency, indicating that the RWA is a good approximation for this nonlinear system. The MD simulation frequency versus spin deviation is plotted in Fig. 33 as open circles, and these values compare well with the RWA frequencies represented by the dot-dashed curve. The MD simulation frequency is slightly lower than the corresponding RWA frequency, and the difference between the two becomes larger as the maximum spin deviation increases. The difference grows because more and more energy goes into higher harmonics, but even so the overall agreement is satisfactory over the entire range.
5.2.2. Biaxial anisotropy
When the rotational symmetry in the easy plane is broken by setting $A_x \ne A_y$, a gap appears below the lower branch of the spin wave spectrum. Now both an intrinsic localized gap mode and a resonance may appear. The eigenvectors and frequencies of the ILSGs and ILSRs can be obtained by solving Eq. (5.7), using initial guesses with the appropriate pattern of sign alternation as given by Eq. (5.5) for q = 0.
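The branch structure referred to here can be checked with a few lines of code. The sketch assumes Eq. (5.4) in the form $\Omega_\pm^2(q) = (A_x - A_y)(A_x + 4) + 4\sin^2 qa + 2A_y(1 \pm \cos qa)$; the function name and the string labels for the branches are hypothetical.

```python
import math

def omega_pm(qa, Ax, Ay, branch):
    # Dimensionless linear spin wave frequency; branch is "+" (upper) or "-" (lower).
    sign = 1.0 if branch == "+" else -1.0
    w2 = ((Ax - Ay) * (Ax + 4.0)
          + 4.0 * math.sin(qa) ** 2
          + 2.0 * Ay * (1.0 + sign * math.cos(qa)))
    return math.sqrt(w2)
```

For the uniaxial parameters of Fig. 32, `omega_pm(0.3 * math.pi, 1.0, 1.0, "-") / omega_pm(0.0, 1.0, 1.0, "+")` gives the lower-branch frequency at q = 0.6(π/2a) relative to $\Omega_+(0)$, which can be compared with the value 0.9277 quoted in Section 5.2.1. For $A_x \ne A_y$, `omega_pm(0.0, Ax, Ay, "-")` returns the gap below the lower branch.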
Fig. 36. Shape of a stationary intrinsic localized spin wave gap mode with the maximum spin deviation $s^y_{\max} = 0.7$ in a chain of 128 spins characterized by $A_x = 1.5$ and $A_y = 1.0$: (a) The spin deviation $s_n^y$ versus lattice site index n; (b) The spin deviation $s_n^z$ versus site index n.
Fig. 37. Dependence of the plane-wave wing amplitude on the center amplitude of ILSRs. The parameters for the antiferromagnetic chain are the same as in Fig. 36: (a) $s^z_{\max} = 0.335$; (b) $s^z_{\max} = 0.480$; (c) $s^z_{\max} = 0.581$.
Fig. 36 shows the stationary ILSG eigenvector obtained by solving Eq. (5.7) numerically for a 128-spin antiferromagnetic chain characterized by $A_x = 1.5$ and $A_y = 1.0$. Both the y and z components of the spin deviation vanish with increasing distance from the mode center, as expected for a localized mode outside the plane-wave spectrum. The ILSG in a biaxial easy-plane antiferromagnetic chain is elliptically polarized, and has a nonzero total magnetic moment oscillating along the hard axis. Fig. 37 presents the shapes of three stationary ILSRs in the same chain to illustrate the dependence of the wing amplitude on the center amplitude. Here only the z components are plotted. Although the ILSRs have no net magnetic moment in the z direction, they do have a nonzero net magnetic moment oscillating along the y-axis, orthogonal to that associated with the ILSG. The wing amplitude increases rapidly with the center amplitude. It has been shown in the continuous model that the amplitude of the "far-field" radiation is exponentially small in the breather amplitude, and the decay is negligible for small-amplitude breathers [66,87]. The numerical calculations described here suggest a similar dependence of the wing amplitude on the mode center amplitude for an ILSR, although no analytical solution is available due to the complexity of the discrete system. MD simulations demonstrate that both ILSG and ILSR modes are long-lived. Since the frequency of an ILSG is outside the extended spin wave spectrum, its instability is caused entirely by the use of the rotating wave approximation. The resulting eigenvector becomes more inaccurate as the mode amplitude increases. On the other hand, since an ILSR is an in-band resonance, in
addition to the rotating wave approximation, the coupling between the ILSR and extended spin waves of the lower branch also contributes to its instability. The strength of the coupling is measured by the wing amplitude, which increases with the mode center amplitude and with the density of states of the lower-branch spin waves that are at the same frequency as the ILSR. Thus, the lifetime of an ILSR decreases much faster than linearly with increasing mode amplitude. While ILSGs exist for the entire parameter range, no ILSR can occur for large anisotropy parameters when the frequency of the q = 0 extended spin wave mode of the upper branch is in the neighborhood of the frequency of the zone boundary spin wave mode of the lower branch. This result may be understood qualitatively as follows. Since the polarization of an ILSR at the zone center is similar to that of the extended zone boundary spin wave modes of the lower branch, the ILSR is expected to interact more strongly with the zone boundary spin waves. Furthermore, the lower branch has a large density of states at the zone boundary because of the flatness of the spectrum, so there are more states with which to interact. The precise parameter range for the existence of ILSRs will be discussed in Section 5.3. Contact can be made between the anharmonic gap mode and resonant mode, which have orthogonal net magnetic moments, and the ILSG described in Section 4. Setting $D_y = 0$ in Eq. (5.1) reduces it to the Hamiltonian of an easy-axis antiferromagnetic chain with the x-axis as the easy axis. As $D_y \to 0$, the two dispersion curves shown in Fig. 31 converge into one, and the two elliptically polarized orthogonal ILSR and ILSG result in a circularly polarized ILSG, which does not require the rotating wave approximation due to the uniaxial symmetry.
5.3. Existence conditions
The numerical calculations described in Section 5.2 indicate that ILSGs can exist for any anisotropy parameters whereas ILSRs can only exist in a certain parameter regime. In this section the existence conditions are presented for small-amplitude ILSGs and ILSRs in the continuum limit. The condition for the existence of small-amplitude ILSGs also applies to ILSGs with large amplitude beyond the continuum approximation; however, it should be emphasized that the condition for the existence of small-amplitude ILSRs does not guarantee the existence of large-amplitude ILSRs.
5.3.1. Gap modes
Since an anharmonic gap mode has the same pattern of sign alternation as the zone center extended spin waves of the lower branch, new variables can be introduced:
$$\psi_n = (-1)^n s_n^y, \quad \phi_n = s_n^z. \tag{5.10}$$
Since Eq. (5.8) reduces to
$$f(a, b) \approx 1 - \tfrac{3}{8}a^2 - \tfrac{1}{8}b^2 \tag{5.11}$$
within the continuum approximation up to the lowest nonlinear terms, that is, cubic terms, a nonlinear Schrödinger equation for the envelope function $\psi(x)$ can be obtained from Eqs. (5.7a) and (5.7b), namely,
$$a^2\frac{d^2\psi}{dx^2} - \alpha\psi + \beta\psi^3 = 0, \tag{5.12}$$
where a is the spacing between two adjacent spins,
$$\alpha = \frac{[\Omega_-^2(0) - \Omega^2](A_x + 4)}{(A_x + 4)^2 - \Omega^2} \tag{5.13}$$
and
$$\beta = \frac{\Omega^2[3\Omega^2 + \Omega_-^2(0)] + (A_x + 4)^2[\Omega^2 + 3\Omega_-^2(0)]}{8[(A_x + 4)^2 - \Omega^2](A_x + 4)}. \tag{5.14}$$
In this small-amplitude limit, the envelope function $\phi(x)$ is given by the linear relation
$$\phi(x) = -\frac{\Omega}{A_x + 4}\,\psi(x). \tag{5.15}$$
Since Eq. (5.12) has a localized solution if and only if $\alpha > 0$ and $\beta > 0$, one can obtain the existence condition for the gap mode from Eqs. (5.13) and (5.14), that is, $\Omega < \Omega_-(0)$. Thus, nonlinear gap modes always exist as long as there is a gap in the spin wave spectrum. The localized solution of Eq. (5.12) centered at $x_0$ is
$$\psi(x) = \sqrt{\frac{2\alpha}{\beta}}\;\mathrm{sech}\!\left[\sqrt{\alpha}\,\frac{x - x_0}{a}\right]. \tag{5.16}$$
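The gap-mode envelope can be verified directly. The sketch below computes $\alpha$ and $\beta$ from Eqs. (5.13) and (5.14), with $\Omega_-^2(0) = (A_x - A_y)(A_x + 4)$, and evaluates the residual of Eq. (5.12) for the sech profile of Eq. (5.16) by finite differences. Function names and the sample parameters in the usage note are hypothetical.

```python
import math

def gap_mode_coefficients(Omega, Ax, Ay):
    # alpha and beta of Eqs. (5.13) and (5.14).
    w0sq = (Ax - Ay) * (Ax + 4.0)      # Omega_-(0)^2, the squared gap frequency
    B = Ax + 4.0
    alpha = (w0sq - Omega ** 2) * B / (B ** 2 - Omega ** 2)
    beta = ((Omega ** 2 * (3.0 * Omega ** 2 + w0sq) + B ** 2 * (Omega ** 2 + 3.0 * w0sq))
            / (8.0 * (B ** 2 - Omega ** 2) * B))
    return alpha, beta

def envelope(x, alpha, beta, a=1.0, x0=0.0):
    # Localized solution of Eq. (5.16); requires alpha > 0 and beta > 0.
    return math.sqrt(2.0 * alpha / beta) / math.cosh(math.sqrt(alpha) * (x - x0) / a)

def residual(x, alpha, beta, a=1.0, h=1e-3):
    # Left-hand side of Eq. (5.12) using a central second difference.
    psi = lambda y: envelope(y, alpha, beta, a)
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h ** 2
    return a ** 2 * d2 - alpha * psi(x) + beta * psi(x) ** 3
```

For example, with $A_x = 1.5$ and $A_y = 1.0$ the gap frequency is $\sqrt{2.75} \approx 1.658$, so `gap_mode_coefficients(1.6, 1.5, 1.0)` returns positive values for both coefficients, while a frequency above the gap makes `alpha` negative and the envelope ceases to exist.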
Since the frequency of a small-amplitude ILSG is just below $\Omega_-(0)$, one can determine from Eq. (5.13) the width of the localized mode, $l \propto 1/\sqrt{\Delta\Omega}$, and the central amplitude, $\psi_m \propto \sqrt{\Delta\Omega}$, where $\Delta\Omega$ is the frequency shift of the ILSG from $\Omega_-(0)$. An ILSG therefore becomes delocalized, approaching the extended spin wave, as its central amplitude decreases.
5.3.2. Resonant modes
Since ILSRs, if they exist, have the same sign alternation as the extended zone center spin wave modes of the upper branch, the appropriate new variables now are
$$\psi_n = s_n^y, \quad \phi_n = (-1)^n s_n^z. \tag{5.17}$$
In a similar fashion as described in the previous section, the envelope function $\psi(x)$ also satisfies the nonlinear Schrödinger equation given by Eq. (5.12), with the parameters $\alpha$ and $\beta$ replaced by
$$\alpha = \frac{[\Omega_+^2(0) - \Omega^2]A_x}{\Omega^2 - A_x^2} \tag{5.18}$$
and
$$\beta = \frac{\Omega^2(3\Omega^2 + A_x^2) + \Omega_+^2(0)(\Omega^2 + 3A_x^2)}{8A_x(\Omega^2 - A_x^2)}. \tag{5.19}$$
The condition for the existence of a localized solution is that both $\alpha$ and $\beta$ are positive, which is equivalent to
$$A_x < \Omega < \Omega_+(0). \tag{5.20}$$
From Eq. (5.4), $\Omega_+(0) = \sqrt{A_x(A_x - A_y + 4)}$. For $A_x \ge A_y$, Eq. (5.20) can therefore be satisfied if and only if
$$A_y < 4. \tag{5.21}$$
The envelope function of an ILSR is given by Eq. (5.16) with the corresponding parameters $\alpha$ and $\beta$ given by Eqs. (5.18) and (5.19). The plane-wave wings are neglected in this small-amplitude limit. It can easily be seen from Eq. (5.4) that Eq. (5.21) is equivalent to the condition that $\Omega_+(q)$ have positive curvature at q = 0. It should be emphasized again that Eq. (5.21) is the condition for the existence of small-amplitude ILSRs. Since the frequency shift of an ILSR from $\Omega_+(0)$ increases with its amplitude as $\Delta\Omega \propto \psi_m^2$, and, according to Eq. (5.20), the frequency of an ILSR is allowed to lie only in a narrow range just below $\Omega_+(0)$ as $A_x$ approaches 4, no large-amplitude ILSRs can exist in this case. The existence condition of ILSRs given by Eq. (5.21) is consistent with observations in numerical searches for ILSRs.
Like the ILSR in ferromagnetic chains examined in Section 3, the ILSR in easy-plane antiferromagnetic chains is strikingly different from nonlinear resonant modes in other models [66,67] in that its fundamental frequency, instead of higher harmonics, is in the plane-wave spectrum. In addition, an intrinsic localized spin wave gap mode can exist in the gap below the lower branch of the spin wave spectrum when the uniaxial symmetry is broken. Since the ILSR and ILSG modes in an antiferromagnetic chain are elliptically polarized and have nonzero total magnetic moments orthogonal to each other, unlike the non-IR-active ILSR in isotropic ferromagnetic chains reviewed in Section 3, they can couple to far-IR radiation. The long-lived nonlinear excitations explored here approach continuously, without threshold, the corresponding spin waves of linear theory as their amplitudes decrease. They are thus strikingly different from the topological sine-Gordon kink excitations found in 1D easy-plane magnets [20,21,88]. The key feature in the nonlinear dynamics problem of the ILSR is the polarization difference between the two plane-wave branches. In numerical simulation studies of ILSRs, the smaller the frequency of the q = 0 mode in the upper branch, the less strongly coupled the resulting ILSR is to the other branch of the plane-wave spectrum.
6. Modulational instability of an extended nonlinear spin wave in an easy-axis antiferromagnet
So far we have illustrated that both analytical studies and numerical simulations predict the existence of intrinsic localized spin wave modes in various magnetic chains; however, there remains the fundamental question of how best to excite such atomic-scale, large-amplitude excitations in homogeneous discrete lattices. Modulational instability (MI), which refers to the exponential growth of certain modulation sidebands of nonlinear plane waves propagating in a dispersive medium as a result of the interplay between nonlinearity and dispersion effects, has been studied in a variety of fields [89-92]. In most of these cases, MI appears in continuous media where the propagation of nonlinear waves is usually governed by nonlinear Schrödinger-type partial differential equations. Computer simulations and experiments [73,93,94] have demonstrated that one of the main effects of the modulational instability is the generation of localized pulses. For example, subpicosecond soliton-like optical pulses have been experimentally generated from a weakly
R. Lai, A.J. Sievers / Physics Reports 314 (1999) 147}236
modulated input via an induced modulation instability in single-mode optical fibers [93]. Given this observation, the modulational instability mechanism has recently been proposed and examined as a possible way to produce energy localization in discrete lattices [69,70,72–74,94–96]. Although many aspects of MI in discrete systems are the same as those in continuous media, the discreteness can drastically modify the modulational instability parameter space as deduced from a continuum or even semi-discrete approximation [69]. The advantage of using modulational instability to create localized excitations in discrete lattices is that, because of the lack of continuous translational symmetry, the localized pulse generated by the nonlinear instability can be trapped by discreteness to form strongly localized long-lived excitations. In this section the focus is on the modulational instability of extended nonlinear spin waves in antiferromagnetic chains with easy-axis anisotropy.

6.1. Traveling nonlinear extended waves

The one-dimensional antiferromagnetic chain to be investigated is described by the Hamiltonian given by Eq. (4.1). Because of the translational symmetry of the underlying lattice, the equation of motion, Eq. (4.4), can support nonlinear extended spin wave modes as well as stable intrinsic localized spin-wave modes in the gap below the standard antiferromagnetic resonance frequency. The traveling extended spin wave mode with wavevector $q$ and frequency $\omega$ is found by substituting into Eq. (4.3) the following circularly polarized trial solution:

$$\begin{aligned}
s_{2n}^{+}(t) &= f\,e^{i(2nqa-\omega t+\theta)}, & s_{2n}^{z} &= \sqrt{1-f^{2}},\\
s_{2n+1}^{+}(t) &= g\,e^{i[(2n+1)qa-\omega t+\theta]}, & s_{2n+1}^{z} &= -\sqrt{1-g^{2}}.
\end{aligned} \tag{6.1}$$
Here $\theta$ is a constant phase, and both spin deviations $f$ and $g$ are real and non-negligible. Defining the parameter $r=2/(A+2)$ and inserting Eq. (6.1) into Eq. (4.3) gives for the ratio of the two amplitudes

$$\frac{g}{f} = -\frac{r\cos qa}{1\pm\sqrt{(1-r^{2}\cos^{2}qa)(1-f^{2})}}, \tag{6.2}$$

where the $\pm$ signs designate two degenerate branches. Owing to the symmetry between the up-spins and down-spins, the solution with the positive sign in the denominator is chosen so that $|g/f|<1$. Given the spin wave amplitude $f$, the frequency as a function of wavevector is

$$\Omega(q,f) = 2\sqrt{1-\alpha^{2}f^{2}} + (A-2\alpha\cos qa)\sqrt{1-f^{2}}, \tag{6.3}$$

where $\alpha=-g/f$; hence, the frequency of an extended nonlinear spin wave depends on both its wavevector and its amplitude. For the small amplitude case $f\ll1$, one finds

$$\Omega(q,f) \approx \Omega_{0}(q) - \frac{A\,\Omega_{0}(q)}{\Omega_{0}(q)+\Omega_{0}(\pi/2a)}\,f^{2}, \tag{6.4}$$
where $\Omega_{0}(q)=\sqrt{(A+2)^{2}-4\cos^{2}qa}$ is the linear spin wave frequency. Eq. (6.4) indicates that the nonlinear spin wave frequency decreases quadratically with increasing amplitude.

6.2. Modulational instability of extended waves

6.2.1. Linear stability analysis

To study the modulational stability of the extended nonlinear spin waves, a perturbed nonlinear spin wave of the form

$$\begin{aligned}
s_{2n}^{+}(t) &= (f+b_{2n}+i\psi_{2n})\,e^{i(2nqa-\omega t+\theta)},\\
s_{2n+1}^{+}(t) &= (g+b_{2n+1}+i\psi_{2n+1})\,e^{i[(2n+1)qa-\omega t+\theta]},
\end{aligned} \tag{6.5}$$

is introduced, where $f$, $g$ and $\omega$ are related by Eqs. (6.2) and (6.3), and the perturbations $\{b_{n}(t)\}$ and $\{\psi_{n}(t)\}$ are real and are assumed to be small in comparison with the parameters of the carrier wave. (Note that in this form the perturbation is added in a frame rotating with the exact periodic solution.) The advantage of using Eq. (6.5) is that it ensures that the resulting linearized equations of the perturbation have constant coefficients instead of the time-dependent coefficients that would be obtained in the usual stability analysis of periodic solutions. Since the perturbations $\{b_{n}(t)\}$ and $\{\psi_{n}(t)\}$ are arbitrary, this does not involve any approximation. Inserting Eq. (6.5) into Eq. (4.3) and separating the real and imaginary parts, one obtains, up to terms linear in $\{b_{n}(t)\}$ and $\{\psi_{n}(t)\}$, a system of coupled differential equations for $\{b_{n}(t)\}$ and $\{\psi_{n}(t)\}$, which can be solved by expanding the perturbation in terms of Fourier components as

$$\begin{pmatrix} b_{2n}\\ \psi_{2n}\end{pmatrix} = \sum_{Q}\begin{pmatrix} b_{0}(Q)\\ \psi_{0}(Q)\end{pmatrix} e^{i2nQa}, \tag{6.6}$$

$$\begin{pmatrix} b_{2n+1}\\ \psi_{2n+1}\end{pmatrix} = \sum_{Q}\begin{pmatrix} b_{1}(Q)\\ \psi_{1}(Q)\end{pmatrix} e^{i(2n+1)Qa}. \tag{6.7}$$

This decomposition allows one to identify the time evolution of each individual component. Since $\{b_{n}(t)\}$ and $\{\psi_{n}(t)\}$ are real,

$$b_{i}^{*}(Q)=b_{i}(-Q)\quad\text{and}\quad\psi_{i}^{*}(Q)=\psi_{i}(-Q),\quad\text{where } i=0,1. \tag{6.8}$$

Comparing the coefficients for the same Fourier component gives

$$\frac{d}{dt}\begin{pmatrix} b_{0}(Q)\\ b_{1}(Q)\\ \psi_{0}(Q)\\ \psi_{1}(Q)\end{pmatrix} = 2JS\begin{pmatrix} M_{11} & M_{12}\\ M_{21} & M_{22}\end{pmatrix}\begin{pmatrix} b_{0}(Q)\\ b_{1}(Q)\\ \psi_{0}(Q)\\ \psi_{1}(Q)\end{pmatrix}, \tag{6.9}$$

where the $M_{ij}$'s are 2 by 2 matrices given by

$$M_{11}=\begin{pmatrix} 0 & 2i\sqrt{1-f^{2}}\,\sin Qa\,\sin qa\\ -2i\sqrt{1-g^{2}}\,\sin Qa\,\sin qa & 0\end{pmatrix}, \tag{6.10}$$
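$$M_{22}=M_{11}, \tag{6.11}$$

$$M_{12}=\begin{pmatrix} 2\alpha\sqrt{1-f^{2}}\,\cos qa & 2\sqrt{1-f^{2}}\,\cos Qa\,\cos qa\\ -2\sqrt{1-g^{2}}\,\cos Qa\,\cos qa & -\dfrac{2}{\alpha}\sqrt{1-g^{2}}\,\cos qa\end{pmatrix}, \tag{6.12}$$

$$M_{21}=-M_{12}+\begin{pmatrix} \dfrac{(A-2\alpha\cos qa)\,f^{2}}{\sqrt{1-f^{2}}} & \dfrac{2gf\cos Qa}{\sqrt{1-g^{2}}}\\[2ex] -\dfrac{2gf\cos Qa}{\sqrt{1-f^{2}}} & -\dfrac{\left(A-(2/\alpha)\cos qa\right)g^{2}}{\sqrt{1-g^{2}}}\end{pmatrix}. \tag{6.13}$$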
The general solution of Eq. (6.9) is a superposition of terms having the time dependence $e^{-i\omega_{m}t}$, where the $\omega_{m}$'s are the frequencies of the modulation wave relative to the extended nonlinear spin wave and the $-i\omega_{m}$'s are the eigenvalues of the 4 by 4 matrix $2JS\,M$. The stability of the extended nonlinear spin wave mode is determined by the imaginary part of $\omega_{m}$, i.e., the extended nonlinear spin wave is unstable when $\mathrm{Im}\{\omega_{m}\}>0$; otherwise it is stable. Defining the dimensionless frequency $\lambda=\omega_{m}/2JS$, the $\lambda$'s are obtained from

$$\det\left|\lambda(q,Q)\,I-iM\right|=0. \tag{6.14}$$

Eq. (6.14) determines the condition for the stability of an extended nonlinear spin wave with wavevector $q$ with respect to the modulation with wavevector $Q$. Since $M_{11}=M_{22}=0$ when $Q=0$, $\lambda(q,0)=0$ is always one of the eigenvalues of matrix $M$. Note that since the trace of the matrix $M$ is zero regardless of the values of $q$ and $Q$, the condition for stability is that all eigenvalues of $M$ are imaginary. Otherwise there must be at least one eigenvalue having a positive real part. Furthermore, the symmetric modulation sidebands at $q\pm Q$ have the same growth rate since $\lambda(q,-Q)=-\lambda^{*}(q,Q)$. Since there is no simple analytical form for the dispersion relation $\lambda(q,Q)$ for arbitrary $q$, Eq. (6.14) has to be solved numerically to determine the domains of instability in the $(q,Q)$ plane. There are two important cases that can be solved analytically: the zone center and zone boundary spin waves. These two cases will be considered first before examining spin waves with arbitrary wavevectors.

6.2.2. The uniform mode

The case of the $q=0$ extended nonlinear spin wave is particularly important since the condition for an instability also tells one when stationary ILSGs can exist [62]. Since $M_{11}=M_{22}=0$ when $q=0$, the case of the zone center spin wave mode becomes particularly simple, as Eq. (6.14) becomes

$$\det\left|\lambda^{2}(0,Q)\,I+M_{12}M_{21}\right|=0, \tag{6.15}$$

which yields the following dispersion relation for the modulation wave:

$$\lambda_{\pm}^{2}(0,Q)=\left(\alpha C+\frac{B}{\alpha}\right)-4E\sin^{2}Qa\pm\sqrt{\left(\alpha C+\frac{B}{\alpha}\right)^{2}-4BC\sin^{2}Qa}, \tag{6.16}$$
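where

$$B=\frac{2}{\alpha}-(A+2)\,g^{2}-2\alpha\sqrt{(1-g^{2})(1-f^{2})}, \tag{6.17}$$

$$C=2\alpha-(A+2)\,f^{2}-\frac{2}{\alpha}\sqrt{(1-g^{2})(1-f^{2})}, \tag{6.18}$$

$$E=gf-\sqrt{(1-g^{2})(1-f^{2})}. \tag{6.19}$$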
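As a cross-check on the uniform-mode result, the closed form of Eqs. (6.16)–(6.19) can be compared with a direct numerical diagonalization of the matrix $M$ of Eq. (6.9). The following is a minimal sketch, not the authors' code: the function names are illustrative, the blocks are typed in from Eqs. (6.10)–(6.13), and the relation $\lambda=i\mu$ between $\lambda$ and an eigenvalue $\mu$ of $M$ is the one implied by Eq. (6.14).

```python
import numpy as np

def amplitude_ratio(qa, f, A):
    """alpha = -g/f from Eq. (6.2), '+' branch, so that |g/f| < 1."""
    r = 2.0 / (A + 2.0)
    c = np.cos(qa)
    return r * c / (1.0 + np.sqrt((1.0 - (r * c) ** 2) * (1.0 - f ** 2)))

def stability_matrix(qa, Qa, f, A):
    """4x4 matrix M of Eq. (6.9) in the basis (b0, b1, psi0, psi1).

    Assembled from Eqs. (6.10)-(6.13); needs cos(qa) != 0 so that alpha != 0.
    """
    alpha = amplitude_ratio(qa, f, A)
    g = -alpha * f
    cf, cg = np.sqrt(1.0 - f ** 2), np.sqrt(1.0 - g ** 2)
    cq, sq, cQ, sQ = np.cos(qa), np.sin(qa), np.cos(Qa), np.sin(Qa)
    M11 = np.array([[0.0, 2j * cf * sQ * sq],
                    [-2j * cg * sQ * sq, 0.0]])
    M12 = np.array([[2 * alpha * cf * cq, 2 * cf * cQ * cq],
                    [-2 * cg * cQ * cq, -(2 / alpha) * cg * cq]], dtype=complex)
    extra = np.array([[(A - 2 * alpha * cq) * f ** 2 / cf, 2 * g * f * cQ / cg],
                      [-2 * g * f * cQ / cf, -(A - (2 / alpha) * cq) * g ** 2 / cg]],
                     dtype=complex)
    M21 = -M12 + extra
    return np.block([[M11, M12], [M21, M11]])  # M22 = M11, Eq. (6.11)

def lambda_sq_numeric(Qa, f, A):
    """(lambda_-^2, lambda_+^2) at q = 0 from the eigenvalues mu of M (lambda = i*mu)."""
    mu = np.linalg.eigvals(stability_matrix(0.0, Qa, f, A))
    lam_sq = np.sort((-(mu ** 2)).real)  # each value appears twice (+/- pairs)
    return lam_sq[0], lam_sq[-1]

def lambda_sq_closed(Qa, f, A):
    """(lambda_-^2, lambda_+^2) at q = 0 from Eqs. (6.16)-(6.19)."""
    alpha = amplitude_ratio(0.0, f, A)
    g = -alpha * f
    root = np.sqrt((1.0 - g ** 2) * (1.0 - f ** 2))
    B = 2 / alpha - (A + 2) * g ** 2 - 2 * alpha * root
    C = 2 * alpha - (A + 2) * f ** 2 - (2 / alpha) * root
    E = g * f - root
    T = alpha * C + B / alpha
    s2 = np.sin(Qa) ** 2
    disc = np.sqrt(max(T ** 2 - 4 * B * C * s2, 0.0))  # guard against round-off
    return T - 4 * E * s2 - disc, T - 4 * E * s2 + disc
```

For example, `lambda_sq_closed(0.1, 0.2, 1.0)[0]` is negative (long-wavelength modulation, unstable), whereas the same call with `Qa = 0.5` gives a positive value (stable). For arbitrary $q$, the eigenvalues of `stability_matrix(qa, Qa, f, A)` can be scanned over the $(Q,q)$ plane in the same way.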
In the case of an isotropic chain ($A=0$), $B=C=0$ and $E=-1$, so that Eq. (6.16) reduces to

$$\lambda_{\pm}^{2}(0,Q)=4\sin^{2}Qa, \tag{6.20}$$

which is the linear spin wave dispersion relation and is positive for any $Q$. This is to be expected since the $q=0$ mode in an isotropic chain is simply a rotation of the whole lattice in spin space by an arbitrary amount, and any small amplitude perturbation to this state is a superposition of linear spin waves. An instability can, however, occur for the zone center spin wave mode in an anisotropic antiferromagnetic chain. Since $\lambda_{+}^{2}(0,Q)\ge0$ holds for any $Q$, this branch is stable. To determine the instability condition one need only focus on the $\lambda_{-}(0,Q)$ branch. It is clear from Eq. (6.16) that $\lambda_{-}^{2}(0,0)=0$ and the instability occurs if and only if $\lambda_{-}^{2}(0,Q)$ becomes negative for some non-zero $Q$. This condition requires that

$$\sin^{2}Qa<\frac{2E\,(\alpha C+B/\alpha)-BC}{4E^{2}}. \tag{6.21}$$

The RHS of Eq. (6.21) is always positive for any spin wave amplitude $f$, and is proportional to $f^{2}$ in the small $f$ limit. In a finite periodic lattice of size $N$ the smallest wavevector is $Q=2\pi/Na$; thus there exists an amplitude threshold of order $1/N$, so that only zone center spin waves with amplitude larger than this threshold are unstable. However, in a 1-D solid $N$ is so large that the amplitude threshold becomes negligibly small. Thus, the zone center spin wave is always unstable to long wavelength modulation. It can also be shown that the RHS of Eq. (6.21) can become greater than 1 for sufficiently large $A$ and $f$, so that the $q=0$ extended nonlinear spin wave is unstable to any perturbation. When the RHS of Eq. (6.21) is less than 1, the critical wavevector is given by

$$Q_{c}a=\arcsin\sqrt{\frac{2E\,(\alpha C+B/\alpha)-BC}{4E^{2}}}, \tag{6.22}$$

and the extended nonlinear spin wave mode is unstable to modulation with $|Q|<Q_{c}$. Fig. 38 presents an example of the stability region of the $q=0$ extended nonlinear spin wave in the $(Q,f)$ plane for two different anisotropy parameters. The anisotropy parameter is $A=1.0$ in Fig. 38a, demonstrating that the $q=0$ spin wave is stable to modulation waves with large wavevector. Fig. 38b shows the results for the more anisotropic case $A=2.0$, where the spin wave with large amplitude can become unstable with respect to any modulation wavevector.
Fig. 38. Regions of modulational instability in the $(Q,f)$ plane for the $q=0$ extended spin wave: (a) The anisotropy parameter $A=1.0$. The large $Q$ region is always stable regardless of the spin wave amplitude; (b) The anisotropy parameter $A=2.0$. As the strength of the anisotropy increases, the stable region shrinks. The spin wave with large amplitude becomes unstable to perturbation of any wavevector (after Ref. [102]).
In the small amplitude limit, Eqs. (6.16) and (6.22) can be simplified to yield more transparent results. Keeping only the lowest-order terms in $f$, Eq. (6.22) gives

$$Q_{c}a=\frac{1}{1+\Delta\Omega/2\Omega_{0}(0)}\sqrt{\frac{A}{\Delta\Omega}}\;f, \tag{6.23}$$

where $\Delta\Omega$ is the bandwidth of the linear spin wave band. In this small amplitude limit the critical wavevector of the modulation wave is linear in the spin wave amplitude. Since the $q=0$ spin wave with a small amplitude is unstable only to long wavelength ($Qa\sim f$) modulation, Eq. (6.16) can be simplified in lowest order, i.e., $f^{4}$, to yield

$$\lambda_{-}^{2}(0,Q)=-\frac{4(1-r)}{1+\sqrt{1-r^{2}}}\,f^{2}\sin^{2}Qa+\frac{r^{2}}{1-r^{2}}\sin^{4}Qa. \tag{6.24}$$

The RHS of Eq. (6.24) is 0 at $Q=0$ and negative for $0<|Q|<Q_{c}$; hence $\lambda_{-}(0,Q)$ is purely imaginary in the small $Q$ region. The maximum growth rate is found from Eq. (6.24) to be

$$\mathrm{Im}\{\lambda_{-}(0,Q)\}_{\max}=\frac{1}{2}\,\frac{A\,f^{2}}{1+\Delta\Omega/2\Omega_{0}(0)} \tag{6.25}$$

for the modulation wavevector

$$Q_{\max}=Q_{c}/\sqrt{2}. \tag{6.26}$$
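The small-amplitude expressions can be checked against the exact uniform-mode formulas. The sketch below is illustrative only (the function names are not from the source); it assumes that the bandwidth is $\Delta\Omega=\Omega_{0}(\pi/2a)-\Omega_{0}(0)$ with $\Omega_{0}(q)=\sqrt{(A+2)^{2}-4\cos^{2}qa}$, and compares Eq. (6.22) with Eq. (6.23) and the growth rate from Eq. (6.16) at $Q_{\max}$ with Eq. (6.25).

```python
import numpy as np

def uniform_mode_coefficients(f, A):
    """alpha, B, C, E of Eqs. (6.2) and (6.17)-(6.19) for the q = 0 carrier."""
    r = 2.0 / (A + 2.0)
    alpha = r / (1.0 + np.sqrt((1.0 - r ** 2) * (1.0 - f ** 2)))
    g = -alpha * f
    root = np.sqrt((1.0 - g ** 2) * (1.0 - f ** 2))
    B = 2 / alpha - (A + 2) * g ** 2 - 2 * alpha * root
    C = 2 * alpha - (A + 2) * f ** 2 - (2 / alpha) * root
    E = g * f - root
    return alpha, B, C, E

def qc_exact(f, A):
    """Q_c * a from Eq. (6.22); saturates at pi/2 if the RHS of Eq. (6.21) exceeds 1."""
    alpha, B, C, E = uniform_mode_coefficients(f, A)
    rhs = (2 * E * (alpha * C + B / alpha) - B * C) / (4 * E ** 2)
    return np.arcsin(np.sqrt(np.clip(rhs, 0.0, 1.0)))

def linear_band(A):
    """Omega_0(0) and the assumed bandwidth Omega_0(pi/2a) - Omega_0(0)."""
    omega0 = np.sqrt(A * (A + 4.0))
    return omega0, (A + 2.0) - omega0

def qc_small(f, A):
    """Q_c * a in the small-amplitude limit, Eq. (6.23)."""
    omega0, bandwidth = linear_band(A)
    return np.sqrt(A / bandwidth) * f / (1.0 + bandwidth / (2.0 * omega0))

def growth_max_small(f, A):
    """Maximum of Im(lambda_-) in the small-amplitude limit, Eq. (6.25)."""
    omega0, bandwidth = linear_band(A)
    return 0.5 * A * f ** 2 / (1.0 + bandwidth / (2.0 * omega0))

def lambda_minus_sq(Qa, f, A):
    """lambda_-^2(0, Q) from the exact expression, Eq. (6.16)."""
    alpha, B, C, E = uniform_mode_coefficients(f, A)
    T = alpha * C + B / alpha
    s2 = np.sin(Qa) ** 2
    return T - 4 * E * s2 - np.sqrt(max(T ** 2 - 4 * B * C * s2, 0.0))
```

The two estimates of $Q_{c}a$ should agree up to corrections of relative order $f^{2}$, so the comparison is only meaningful for $f\ll1$.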
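6.2.3. The zone boundary mode

At the Brillouin zone boundary $q=\pi/2a$ and the spin deviation at odd sites $g=0$, hence $\alpha=0$. However, the ratio $\cos qa/\alpha$ is well defined, i.e.,

$$\frac{\cos qa}{\alpha}\to\frac{1+\sqrt{1-f^{2}}}{r}\quad\text{as}\quad q\to\frac{\pi}{2a}. \tag{6.27}$$

Inserting Eq. (6.27) into Eq. (6.14) gives an equation for $\lambda(\pi/2a,Q)$, namely,

$$\lambda^{4}\!\left(\frac{\pi}{2a},Q\right)+a_{1}\,\lambda^{2}\!\left(\frac{\pi}{2a},Q\right)+a_{0}=0, \tag{6.28}$$

where the coefficients $a_{0}$ and $a_{1}$ are

$$a_{0}=16(1-f^{2})\sin^{4}Qa+\frac{8A}{r}\,f^{2}\left(1+\sqrt{1-f^{2}}\right)\sin^{2}Qa \tag{6.29}$$

and

$$a_{1}=-\frac{4}{r^{2}}\left(1+\sqrt{1-f^{2}}\right)^{2}+8\sqrt{1-f^{2}}\,\sin^{2}Qa, \tag{6.30}$$

respectively. The dispersion relation for the perturbation wave $\lambda(\pi/2a,Q)$ is described by

$$\lambda_{\pm}^{2}\!\left(\frac{\pi}{2a},Q\right)=\frac{-a_{1}\pm\sqrt{a_{1}^{2}-4a_{0}}}{2}. \tag{6.31}$$

Since $a_{1}<0$ and $a_{0}\ge0$, the stability condition becomes $a_{1}^{2}-4a_{0}\ge0$, which, using $A=2(1-r)/r$, leads to

$$\frac{1}{r^{3}}\left(1+\sqrt{1-f^{2}}\right)^{3}-4\left[\frac{1}{r}\left(1+\sqrt{1-f^{2}}\right)-f^{2}\right]\sin^{2}Qa\ge0. \tag{6.32}$$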
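The zone-boundary stability condition can be verified numerically on a grid of modulation wavevectors, amplitudes and anisotropies. This is a minimal sketch with illustrative function names; the coefficient labels follow Eqs. (6.28)–(6.30), and the factored discriminant is the left-hand side of Eq. (6.32) multiplied by the positive factor $16\kappa$ with $\kappa=(1+\sqrt{1-f^{2}})/r$.

```python
import numpy as np

def zone_boundary_coefficients(Qa, f, A):
    """Coefficients a0 and a1 of the quadratic in lambda^2, Eqs. (6.29)-(6.30)."""
    r = 2.0 / (A + 2.0)
    cf = np.sqrt(1.0 - f ** 2)
    s2 = np.sin(Qa) ** 2
    a0 = 16.0 * (1.0 - f ** 2) * s2 ** 2 + (8.0 * A / r) * f ** 2 * (1.0 + cf) * s2
    a1 = -(4.0 / r ** 2) * (1.0 + cf) ** 2 + 8.0 * cf * s2
    return a0, a1

def discriminant_factored(Qa, f, A):
    """a1^2 - 4*a0 written as 16*kappa*(kappa^3 - 4*(kappa - f^2)*sin^2(Qa))."""
    r = 2.0 / (A + 2.0)
    kappa = (1.0 + np.sqrt(1.0 - f ** 2)) / r
    s2 = np.sin(Qa) ** 2
    return 16.0 * kappa * (kappa ** 3 - 4.0 * (kappa - f ** 2) * s2)

def zone_boundary_lambda_sq(Qa, f, A):
    """Both roots lambda_+-^2 of Eq. (6.31); real and non-negative when stable."""
    a0, a1 = zone_boundary_coefficients(Qa, f, A)
    disc = np.sqrt(np.maximum(a1 ** 2 - 4.0 * a0, 0.0))
    return (-a1 - disc) / 2.0, (-a1 + disc) / 2.0

# Scan modulation wavevector, amplitude and anisotropy on a grid.
Qa_grid = np.linspace(0.0, np.pi / 2, 91)[:, None, None]
f_grid = np.linspace(0.0, 0.99, 100)[None, :, None]
A_grid = np.array([0.0, 0.5, 1.0, 2.0, 5.0])[None, None, :]
a0_grid, a1_grid = zone_boundary_coefficients(Qa_grid, f_grid, A_grid)
disc_grid = a1_grid ** 2 - 4.0 * a0_grid
```

On this grid the discriminant stays non-negative and both roots $\lambda_{\pm}^{2}$ are non-negative, consistent with the statement that the zone boundary mode is linearly stable.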
The extended nonlinear spin waves at the Brillouin zone boundary are stable to perturbations of any wavevector, since the inequality given by Eq. (6.32) holds for any $Q$ and $f$ because $\sin^{2}Qa\le1$.

6.2.4. Instability region for spin waves of arbitrary wavevector

Although for nonlinear spin waves of arbitrary wavevector $q$ the dispersion relation $\lambda(q,Q)$ has to be obtained by numerically solving Eq. (6.14), it is still of value to consider some qualitative properties of the eigenvalues. Since

$$\det\left|\lambda I-iM\right|=\det\left|(\lambda I-iM_{11})^{2}+(\lambda I-iM_{11})\,M_{12}\,(\lambda I-iM_{11})^{-1}M_{21}\right|, \tag{6.33}$$

and $iM_{11}$, $M_{12}$ and $M_{21}$ are real 2 by 2 matrices, Eq. (6.14) is a fourth order polynomial in $\lambda$ with real coefficients. The $\lambda(q,Q)$'s are therefore either real or form complex conjugate pairs. In regions where the nonlinear extended spin waves are stable, $\mathrm{Im}\{\lambda(q,Q)\}=0$ for all four $\lambda(q,Q)$'s, while at least one of the four $\lambda(q,Q)$'s has a positive imaginary part in an unstable region. Fig. 39a shows a typical plot of the regions of modulational instability in the $(Q,q)$ plane, which are determined by the values of $\mathrm{Im}\{\lambda(q,Q)\}$, for an anisotropic antiferromagnetic chain. The dot-dashed lines separate the regions of stability (I and III) and the region of instability (II). For a given
Fig. 39. (a) Diagram of regions of modulational instability in the $(Q,q)$ plane for extended spin waves of amplitude $f=0.2$ in an easy-axis antiferromagnetic chain. The anisotropy parameter $A=2.0$. Regions I and III are stable regions, and region II is unstable. Spin waves with wavevector larger than $\pi/4a$ are stable to a perturbation of any wavevector. (b) The real and imaginary parts of the two relevant $\lambda(q,Q)$'s along the long-dashed line CD shown in (a). Solid lines: real parts; dot-dashed line: imaginary parts. The two $\lambda(q,Q)$'s converge at the instability boundaries.
spin wave amplitude $f$, the area of the unstable region grows with increasing anisotropy parameter $A$. The lower boundary moves towards larger $Q$, while the upper boundary approaches $q=\pi/4a$. However, spin waves with $q>\pi/4a$ are stable against any perturbation, independent of the values of $Q$ and $f$. On the other hand, as the chain becomes more and more isotropic with $A$ approaching zero, the area of the unstable region shrinks until it disappears completely for the isotropic chain. Numerical solutions of Eq. (6.14) demonstrate that among the four $\lambda(q,Q)$'s, two are always real in the $(Q,q)$ plane and are therefore irrelevant to the spin wave instability. To see how the relevant $\lambda(q,Q)$'s evolve from a stable region to an unstable one, the real and imaginary parts of the two $\lambda(q,Q)$'s are plotted in Fig. 39b for fixed $Q$ along the long-dashed line CD shown in Fig. 39a. The two $\lambda(q,Q)$'s converge at the instability boundaries to form degenerate double roots, while they form a complex conjugate pair in the unstable region (II). Thus the instability boundaries are determined by the condition that Eq. (6.14) have double roots. That the existence of an intrinsic localized mode is always accompanied by an instability of the corresponding extended nonlinear waves has been shown in a number of studies for various lattice dynamical models [69,72,74,95]. The fact that ILSMs can occur only in the gap below the standard antiferromagnetic resonance frequency at $q=0$, while no ILSM exists at the Brillouin zone boundary [62], is in agreement with Fig. 39, where only extended spin waves with small $q$ are
unstable to a long wavelength modulation while the zone boundary spin waves are stable to modulation of any wavelength.

6.3. Comparison between numerical simulations and analytical results

In an easy-axis antiferromagnetic chain, the linear stability of an extended nonlinear spin wave with wavevector $q$ modulated by a small-amplitude wave of wavevector $Q$ is determined by the dispersion relation $\lambda(q,Q)$, which can be obtained from Eq. (6.14). Although such a linear stability analysis can determine the important domain in parameter space and predict quantitatively how the amplitude of a modulational sideband evolves at the onset of the instability, it is based on linearization around the unperturbed carrier wave. The linear approximation must fail at large time scales, both because the amplitude of the unstable sideband grows exponentially and because it neglects additional combination waves generated through wave-mixing processes, which can become significant at large time scales if their wavevectors fall inside an instability domain. Such a linear stability analysis therefore cannot determine the long-time evolution of a modulated extended nonlinear spin wave. This requires the application of molecular dynamics (MD) simulations. For such numerical simulations of easy-axis chains with various anisotropy parameters, the initial conditions involve coherently modulated extended nonlinear spin waves of the form

$$\begin{aligned}
s_{2n}^{+}(0) &= \left\{ f+\frac{b}{2}\left[b_{0}(Q)e^{i2nQa}+\mathrm{c.c.}+i\left(\psi_{0}(Q)e^{i2nQa}+\mathrm{c.c.}\right)\right]\right\} e^{i2nqa},\\
s_{2n+1}^{+}(0) &= \left\{ g+\frac{b}{2}\left[b_{1}(Q)e^{i(2n+1)Qa}+\mathrm{c.c.}+i\left(\psi_{1}(Q)e^{i(2n+1)Qa}+\mathrm{c.c.}\right)\right]\right\} e^{i(2n+1)qa},
\end{aligned} \tag{6.34}$$

where c.c. denotes the complex conjugate, $(b_{0}(Q),b_{1}(Q),\psi_{0}(Q),\psi_{1}(Q))$ is a normalized eigenvector of the $M$ matrix, and $b$ is a small parameter measuring the strength of the modulation wave relative to the carrier wave, typically $\sim0.01$. The amplitudes $f$ and $g$ are related by Eq. (6.2). Since $|b+i\psi|\ne|b^{*}+i\psi^{*}|$, the two satellites at $q\pm Q$ have different strengths except when $q=0$. Given $s_{n}^{+}(0)$, the z-components of the spins can be obtained from

$$s_{n}^{z}(0)=(-1)^{n}\sqrt{1-|s_{n}^{+}(0)|^{2}}. \tag{6.35}$$

Once an initial condition is given, the time evolution of a modulated spin wave can be investigated with MD simulations. In order to monitor the time evolution of individual Fourier components, one requires the complete spatial Fourier transform of the spin deviations

$$\xi(p,t)=\sum_{n=0}^{N-1}s_{n}^{+}(t)\,e^{-i2\pi np/N},\qquad -\frac{N}{4}<p\le\frac{N}{4}. \tag{6.36}$$

The growth rate of each individual Fourier component can be obtained by least-squares fitting of $|\xi(p,t)|^{2}$ over the first few periods, during which time it is expected to grow at the rate of $2\,\mathrm{Im}\{\lambda(q,Q)\}$. A chain of 128 spins with periodic boundary conditions has been used as a specific example. The anisotropy parameter is taken to be $A=1.0$, and the spin wave amplitude $f=0.2$. Figs. 40 and 41 show the long time evolution of the carrier wave with wavevector $q=15\pi/64a$ modulated by
Fig. 40. Time evolution of the main Fourier components of an extended spin wave with $q=15\pi/64a$ and $f=0.2$ modulated by a small amplitude wave with $Q=17\pi/64a$. The components are at $q$ (solid curve), $q+Q$ (dot-dashed curve) and $q-Q$ (long-dashed curve). The anisotropy parameter is $A=1.0$, and time is measured in units of $T_{\mathrm{AFMR}}$, the period of the uniform mode.

Fig. 41. Time evolution of the complete Fourier spectrum of the extended spin wave described in Fig. 40. After a sufficiently long time, combination modes appear due to wave mixing processes.
small amplitude waves with wavevectors $Q=\pm17\pi/64a$, which fall in the unstable region. The exponential growth of the $q\pm Q$ satellite sidebands at the initial stage of instability is obvious, as can be seen in the log-linear plot of Fig. 40. Fig. 41 shows the time evolution of the complete Fourier spectrum, where additional combination waves generated from wave-mixing processes can be seen after about 300 $T_{\mathrm{AFMR}}$ as the instability develops further. The growth rates as a function of the modulation wavevector for the running carrier waves with various wavevectors are plotted in Fig. 42. The solid curves represent analytical results obtained from diagonalizing the matrix $M$, while the filled circles are MD simulation results. The excellent agreement between these two sets of results demonstrates that the linear stability analysis does give a quantitatively correct description of the instability onset. Fig. 42 shows that while the carrier waves with small $q$ are unstable to long wavelength modulation (small $Q$), a carrier wave of large $q$ ($q=15\pi/64a$) is stable to long wavelength modulations but unstable to some short wavelength modulations (large $Q$). This finding should be contrasted with those found in Ref. [69] for a monatomic Klein–Gordon chain, where the small $Q$ region is always the unstable region as long as an instability occurs for the corresponding carrier wave. Note that because combination waves are neglected in the linear analysis, the prediction of stability does not necessarily rule out the occurrence of instability in the long time evolution. This point is illustrated by the long time evolution of a perturbed carrier wave with wavevector $q=15\pi/64a$ plotted in Figs. 43 and 44, where the modulation wavevectors $Q=\pm\pi/8a$ lie in the stable region as shown in Fig. 42a. The Fourier component corresponding to the carrier wave
Fig. 42. Comparison of analytical and MD simulation results of the growth rate of modulational waves for carrier waves with various wavevectors. The parameters are $A=1.0$ and $f=0.2$. $\Omega_{\mathrm{AFMR}}$ is the standard antiferromagnetic resonance frequency. The wavevectors of the carrier waves are: (a) $15\pi/64a$; (b) $7\pi/32a$; (c) $\pi/8a$, and (d) 0. The solid curves are analytical results while the filled circles are MD simulation results (after Ref. [102]).

Fig. 43. Long time instability induced evolution of specific Fourier components by combination excitations. The components are at $q$ (solid curve), $q+Q$ (dot-dashed curve), $q-Q$ (long-dashed curve), $q+2Q$ (dotted curve) and $q-2Q$ (short-dashed curve). The wavevector and amplitude of the carrier wave are $q=15\pi/64a$ and $f=0.2$, respectively. The modulation wavevector $Q=\pi/8a$ lies in the stable region, as can be seen in Fig. 42a. Time is measured in units of $T_{\mathrm{AFMR}}$. Clearly, the $q\pm Q$ components are initially stable while the combination modes at $q\pm2Q$ are unstable.

Fig. 44. Time evolution of the complete Fourier spectrum for the MD simulation described in Fig. 43. At long times combination modes appear due to wave mixing processes.
remains the same for a period of approximately 180 $T_{\mathrm{AFMR}}$ before the instability occurs. From the time evolution of the spatial Fourier components at wavevectors $q$, $q\pm Q$ and $q\pm2Q$ plotted in Fig. 43, it can be seen that the $q\pm Q$ components do not grow until after $t=180\,T_{\mathrm{AFMR}}$, just as predicted in the linear stability analysis; however, the $q\pm2Q$ components, which are neglected in the linear stability analysis, grow significantly after 180 $T_{\mathrm{AFMR}}$ so that the carrier wave becomes unstable. The time evolution of the complete Fourier spectrum in $q$ space is plotted in Fig. 44. The generation of combination modes becomes evident after sufficient time. This simulation study demonstrates that the combination modes at $q\pm2Q$, $q\pm3Q$ generated by the nonlinearity, though their magnitudes are smaller than that of the $q\pm Q$ modes by at least a factor $b$ at $t=0$, may fall in the instability region and play an important role at sufficiently large time scales. Hence, the condition for stability at large time scales is that the main satellite modulation and also all combination modes must not lie in the regions of instability. Note that unlike other models, such as the Klein–Gordon lattice and the Fermi–Pasta–Ulam lattice, the nonlinearity in the uniaxial easy-axis antiferromagnetic chains does not generate combination waves at $\pm2q$, $\pm3q$, etc. The stability condition is given by

$$\mathrm{mod}\!\left(q\pm nQ,\ \frac{\pi}{a}\right)\notin\text{unstable regions},\qquad n=1,2,\ldots, \tag{6.37}$$

which is quite restrictive, so that only carrier waves with wavevector $q>\pi/4a$ are stable at long times. As the anisotropy parameter increases, the antiferromagnetic chain effectively appears more discrete, and according to the analytical results the area of the instability region in the $(Q,q)$ plane also grows, so that the upper boundary of the instability region in Fig. 39 approaches $q=\pi/4a$. Consider a chain with a larger anisotropy parameter $A=2.0$ but with the amplitude of the extended nonlinear carrier waves still $f=0.2$. The growth rates of the amplitude of modulation waves for carrier waves with a wide range of wavevectors are plotted in Fig. 45. The MD simulation results (filled circles) are in excellent agreement with the analytical results. The instability region steadily grows with increasing carrier wave wavevector, and the carrier wave with $q=15\pi/64a$ is unstable to modulation by any wavevector until the carrier wave wavevector increases beyond $q=\pi/4a$, where it becomes stable.

6.4. Modulational instability recurrence

The MD simulation examples have demonstrated that the linear stability analysis correctly describes the initial stage of instability and that the numerical simulation results are in excellent agreement with the analytical ones. On the other hand, since the linear stability analysis is based on the linearization around the initial extended nonlinear spin wave modes, one should not expect the linear analysis to be valid when the instability is fully developed, as demonstrated by the numerical simulations. It is generally believed that an initial state dominated by a single mode would eventually evolve into a nearly chaotic state after a sufficiently long time, since the direction of energy flow should favor equipartition among the numerous modes available so that it occupies maximum volume in phase space. In the intermediate stage, however, the time evolution of the unstable mode can exhibit both regular and irregular behavior. Studies of a number of monatomic lattices and
Fig. 45. Growth rate of modulational waves as a function of modulation wavevector in an antiferromagnetic chain with large anisotropy. The parameters are $A=2.0$ and $f=0.2$, and $\Omega_{\mathrm{AFMR}}$ is the standard antiferromagnetic resonance frequency. Solid curves are analytical results while filled circles are MD simulation results. The wavevectors of the carrier waves are: (a) $15\pi/64a$; (b) $7\pi/32a$; (c) $3\pi/16a$; (d) $\pi/8a$; (e) $\pi/16a$, and (f) 0. Note that the carrier wave with $q=15\pi/64a$ is unstable to perturbation of any wavevector (after Ref. [102]).

continuum models [94,96,97] have shown that under appropriate but rather restrictive conditions the linearly unstable carrier wave can behave quite regularly, showing an interesting recurrence phenomenon (MI recurrence) over long time scales. Since the condition for MI recurrence is so restrictive even in monatomic lattices, it is not clear whether MI recurrence can be observed in more complex systems, such as the antiferromagnetic chains considered here. To illustrate the MI recurrence phenomenon in easy-axis antiferromagnetic chains, an antiferromagnetic chain of 128 spins with anisotropy parameter $A=1.0$ has been examined. The $q=0$ extended spin wave of amplitude $f=0.2$ is modulated at wavevector $Q=\pm\pi/32a$ with $b=0.01$. Figs. 46 and 47 show the time evolution of the Fourier components of this modulated nonlinear plane spin wave. In Fig. 46 the unstable carrier wave (solid curve) and the modulation waves (dot-dashed curve) both show a quasi-periodic behavior with a period of approximately 180 $T_{\mathrm{AFMR}}$. Fig. 47 shows the time evolution of the complete Fourier spectrum of the modulated wave. Besides the $q$ and $q\pm Q$ components, the combination modes $q\pm2Q$, $q\pm3Q$, etc., are also visible in the spectrum, although their magnitudes are much smaller than that of the main satellites. The underlying physics of MI recurrence is the mode coupling between the unstable carrier wave and its satellite modes generated from the modulation. The satellite modes grow exponentially and the energy is transferred from the carrier wave to the satellite modes until the growth is saturated; then the energy is returned to the carrier wave. Such a process can repeat itself over a long time scale under appropriate conditions. Although the MI recurrence phenomenon is ubiquitous
Fig. 46. Time evolution of the main Fourier components illustrating modulational instability recurrence. The components are at $q$ (solid curve) and $q\pm Q$ (dot-dashed curve). The parameters are $A=1.0$, $q=0$, $f=0.2$ and $Q=\pi/32a$. Time is measured in units of $T_{\mathrm{AFMR}}$. Note that the two sidebands are symmetric when $q=0$.

Fig. 47. Time evolution of the complete Fourier spectrum for the MD simulation described in Fig. 46. At long times combination modes can be clearly seen.

since it appears in many different models, the condition for the quasi-periodic recurrence phenomenon to occur is quite restrictive, especially in the discrete lattices of interest here. Numerical simulations both for monatomic chains [96] and for "diatomic" antiferromagnetic chains demonstrate that MI recurrence has a strong dependence on the wavevectors of the carrier and modulation waves. Extensive numerical simulations have shown that the quasi-periodic behavior of the system can be easily destroyed as the wavevector of either the carrier wave or the modulational wave is changed even by the smallest amount allowed in the periodic lattice. This is because the quasi-periodic behavior requires that the energy be confined between the carrier wave and a small number of modulation satellites; however, in general, the energy can leak into additional combination modes and this process tends to be irreversible.
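The kind of MD run described in Sections 6.3 and 6.4 can be sketched in a few lines. The block below is an illustration under stated assumptions, not the authors' simulation code: Eq. (4.3) is not reproduced in this section, so the equation of motion is assumed to be $d\mathbf{s}_{n}/d\tau=\mathbf{s}_{n}\times\mathbf{h}_{n}$ with $\mathbf{h}_{n}=-(\mathbf{s}_{n-1}+\mathbf{s}_{n+1})+A\,s_{n}^{z}\hat{z}$ and $\tau=2JS\,t$, a form chosen because it reproduces Eqs. (6.2) and (6.3). The integrator (fixed-step RK4) and all function names are likewise assumptions.

```python
import numpy as np

def spin_rhs(s, A):
    """ds/dtau = s x h, with h_n = -(s_{n-1} + s_{n+1}) + A s_n^z z_hat (assumed form)."""
    h = -(np.roll(s, 1, axis=0) + np.roll(s, -1, axis=0))  # periodic chain
    h[:, 2] += A * s[:, 2]
    return np.cross(s, h)

def rk4_step(s, dt, A):
    """One classical fourth-order Runge-Kutta step in dimensionless time tau."""
    k1 = spin_rhs(s, A)
    k2 = spin_rhs(s + 0.5 * dt * k1, A)
    k3 = spin_rhs(s + 0.5 * dt * k2, A)
    k4 = spin_rhs(s + dt * k3, A)
    return s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def extended_wave(N, p, f, A):
    """Spins (N, 3) of the extended wave of Eq. (6.1) with q*a = 2*pi*p/N and theta = 0.

    Also returns the dimensionless frequency Omega of Eq. (6.3). N must be even.
    """
    qa = 2 * np.pi * p / N
    r = 2.0 / (A + 2.0)
    c = np.cos(qa)
    alpha = r * c / (1.0 + np.sqrt((1.0 - (r * c) ** 2) * (1.0 - f ** 2)))
    g = -alpha * f
    n = np.arange(N)
    s_plus = np.where(n % 2 == 0, f, g) * np.exp(1j * n * qa)
    s_z = np.where(n % 2 == 0, 1.0, -1.0) * np.sqrt(1.0 - np.abs(s_plus) ** 2)
    omega = 2 * np.sqrt(1 - g ** 2) + (A - 2 * alpha * c) * np.sqrt(1 - f ** 2)
    return np.column_stack([s_plus.real, s_plus.imag, s_z]), omega

def fourier_spectrum(s):
    """xi(p) of Eq. (6.36) for p = 0..N-1 (indices above N/2 correspond to negative p)."""
    return np.fft.fft(s[:, 0] + 1j * s[:, 1])

def evolve(s, dt, n_steps, A):
    """Integrate for n_steps steps and return the final spin configuration."""
    for _ in range(n_steps):
        s = rk4_step(s, dt, A)
    return s
```

To mimic the modulated runs, one would add a small perturbation of the form of Eq. (6.34) to the output of `extended_wave`, store `np.abs(fourier_spectrum(s))**2` at regular intervals during `evolve`, and fit the logarithm of the sideband intensity against $\tau$ to estimate $2\,\mathrm{Im}\{\lambda\}$.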
7. Production of intrinsic localized spin wave modes and the CW driving of antiferromagnetic instabilities

Although the ILMs in a variety of discrete nonlinear lattices are reminiscent of impurity modes in linear lattices which can be probed by conventional radiation sources, their generation and detection demand different approaches because of the homogeneity of the underlying lattices. A number of approaches have been proposed to generate and detect ILMs. For instance, it has been suggested that ILMs can be thermally excited in molecular crystals. The signature of their occurrence would be the transition of thermal relaxation from an exponential law to a non-exponential law with increasing temperature [98]. Another approach for generating ILM excitations is to use
an optimal control scheme with a pre-designed sequence of laser pulses to indirectly excite vibrational ILM in crystals [99]. However, the best method for the generation and detection of these large-amplitude localized excitations in crystals is still an open question. Furthermore, realistic damping has been ignored, which could severely a!ect the practical feasibility of the proposed schemes. This section is concerned with reviewing the possibility of using the modulation instability for the generation and detection of ILSMs in antiferromagnetic systems in general and then applying these methods to particular materials with realistic parameters. The dissipation of spin waves in magnetic materials is usually weak compared to that of lattice vibrations in crystals. For example, the ratio of the linewidth to the antiferromagnetic resonance frequency C/u&10\ in bulk MnF [100] and FeF [101] is quite a bit smaller than the corresponding linewidth to TO mode frequency C/u&10\ for lattice vibrations. With this in mind some simulation studies are described for antiferromagnetic materials where intrinsic localized spin wave modes (ILSMs) [40,57,62,76] are created via modulational instability [69,102] when the uniform mode is driven with a large amplitude CW ac "eld. One class of systems, namely layered antiferromagnets, stands out as particularly interesting since some of these can be represented by a one dimensional system with reasonable accuracy. Within this class the lowest-lying uniform spin wave mode of the layered antiferromagnet (C H NH ) CuCl has been found to be unstable and it appears that this spin system can be driven su$ciently hard with a laboratory CW microwave "eld so that intrinsic localized spin wave modes would be produced. 7.1. Creation of intrinsic localization 7.1.1. 
Redistribution of energy for a lossless system

Numerical simulations have demonstrated that the energy initially concentrated in one unstable plane wave mode will finally flow to all available modes in Fourier space, i.e., the energy is delocalized in Fourier space. Since a delocalized state in Fourier space can be either a localized state or a delocalized state in the corresponding real space, depending on the relative phases between Fourier components, the time evolution in Fourier space alone does not tell one the complete process of energy redistribution. After a sufficiently long time the system will finally reach equipartition of energy, since entropy should grow during the system's time evolution so that it approaches a state where the energy is evenly distributed not only among modes in Fourier space but also on lattice sites in real space. This final arrangement does not exclude the possibility of energy localization at intermediate stages, since one of the main effects of modulational instability is the creation of localized excitations from spatially extended excitations [93]. This modulational-instability-induced energy localization has been proposed to be a useful mechanism for the formation of intrinsic localization [69,93–95,99]. First we review how the energy initially concentrated in one mode is redistributed in an antiferromagnetic chain in the absence of dissipation. The time evolution of a large amplitude zone center mode perturbed by random noise in both Fourier space and real space is plotted in Fig. 48. The chain consists of 128 spins with anisotropy parameter $A=2.0$ and amplitude of the zone center spin wave $f=0.2$, with the amplitude of the noise perturbation small compared to that of the carrier wave. In Fig. 48a the time evolution of the complete Fourier spectrum shows that the $q=0$ mode remains stable for a short period of time (about 80 $T_{\mathrm{AFMR}}$) and then quickly decays into other
R. Lai, A.J. Sievers / Physics Reports 314 (1999) 147-236
Fig. 48. Decay of an extended nonlinear spin wave into intrinsic localized spin wave excitations on a lossless chain via its modulational instability. The anisotropy parameter is $A = 2.0$. Initially the $q = 0$ extended spin wave with amplitude $f = 0.2$ is perturbed by random noise. Time is measured in units of $T_{AFMR}$, and energy is measured from the ground state in units of $2JS$: (a) time evolution of the perturbed spin wave in Fourier space; (b) time evolution of the energy density distribution in real space (after Ref. [102]).
Fourier components so that the energy becomes delocalized in Fourier space. In Fig. 48b, the time evolution of the energy density distribution in real space shows a different picture. The initially uniform energy distribution becomes localized as the instability develops, so that a number of localized excitations are created, and they appear to be trapped by the discreteness of the lattice. Numerical experiments with different anisotropy parameters and carrier wave amplitudes demonstrate that, although localized excitations can be created in this way, their lifetimes depend strongly on the anisotropy parameter of the lattice and the amplitude of the initial carrier wave. Since the anisotropy here is on-site, it is not only a measure of the nonlinearity but also an effective measure of the discreteness of the lattice. As the anisotropy parameter $A$ or the carrier wave amplitude decreases, the lifetime of the localized excitations decreases. An energy-energy correlation function can be used to obtain a more quantitative characterization [94,103], namely,

$$C_E(n,t) = N\,\frac{\left\langle \sum_m e(m,t)\,e(m+n,t) \right\rangle}{\left\langle \left[\sum_m e(m,t)\right]^2 \right\rangle}, \qquad (7.1)$$
where $\langle\cdots\rangle$ indicates the average over initial conditions. For a uniform energy distribution, such as our initial conditions, $C_E(n)$ is uniformly distributed, while when localized excitations appear $C_E(n)$ should reduce to a central spike. Since the total energy is a conserved quantity in the absence of dissipation, the degree of localization can be measured by the height (or the width) of the central spike.
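Eq. (7.1) is straightforward to evaluate numerically. The following is a minimal sketch, not the code used in the cited simulations: it assumes a periodic chain, so that the site index $m+n$ wraps around, and it represents each initial condition by a plain list of site energies. The function name `energy_correlation` and the two 8-site test profiles are hypothetical.

```python
def energy_correlation(samples):
    """Energy-energy correlation C_E(n) of Eq. (7.1) at one instant.

    samples: list of energy-density profiles e(m), one per initial condition.
    Assumption: periodic boundary conditions (index m + n wraps around).
    """
    n_sites = len(samples[0])
    n_samples = len(samples)
    # Denominator: ensemble average of the squared total energy.
    denom = sum(sum(e) ** 2 for e in samples) / n_samples
    corr = []
    for n in range(n_sites):
        # Numerator: ensemble average of sum_m e(m) e(m + n).
        num = sum(
            sum(e[m] * e[(m + n) % n_sites] for m in range(n_sites))
            for e in samples
        ) / n_samples
        corr.append(n_sites * num / denom)
    return corr


# Hypothetical test profiles on an 8-site chain with equal total energy.
uniform = [[1.0] * 8]
localized = [[0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0]]

c_uniform = energy_correlation(uniform)      # flat profile
c_localized = energy_correlation(localized)  # single spike at n = 0
print(c_uniform)
print(c_localized)
```

With this normalization a uniform distribution gives $C_E(n) = 1$ for every $n$, while energy concentrated on a single site gives a spike of height $N$ at $n = 0$ and zero elsewhere.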
Fig. 49. The height of the central peak of the energy-energy correlation function as a function of time. Initially the $q = 0$ extended spin wave with amplitude $f = 0.2$ is perturbed by random noise. Solid curve: $A = 2.0$. Dot-dashed curve: $A = 1.0$. Each curve is averaged over 20 initial conditions. The oscillation in the dot-dashed curve indicates that ILSGs generated via MI in a chain with a small anisotropy parameter are short-lived (after Ref. [102]).
The height of the central spike of the energy-energy correlation function as a function of time is plotted in Fig. 49 for two antiferromagnetic chains with anisotropy parameters $A = 1.0$ and 2.0, respectively. In both cases the carrier waves have the same amplitude and wave vector, i.e., $q = 0$ and $f = 0.2$, and each curve is averaged over 20 initial conditions. Note that the solid curve ($A = 2.0$) is qualitatively different from the dot-dashed curve ($A = 1.0$). In the case of the larger anisotropy parameter, the height of the central spike in the energy-energy correlation function increases with time during the simulation period, which indicates that localized excitations are generated and grow with time. Although localized excitations are also generated in a lattice with a smaller anisotropy parameter, such as for MnF$_2$, they are short-lived and the energy appears to be readily exchanged back and forth between localized excitations and extended spin waves, demonstrating once again that both discreteness and strong anharmonicity appear to be essential for the creation of long-lived localized excitations.

7.1.2. Redistribution for a dissipative uniaxial antiferromagnet

Next we consider the influence of weak dissipation, where the equations of motion become

$$\frac{d\mathbf{S}_n}{dt} = \mathbf{S}_n \times \mathbf{H}_n^{eff} - \epsilon\,\mathbf{S}_n \times \left(\mathbf{S}_n \times \mathbf{H}_n^{eff}\right). \qquad (7.2)$$
The new second term on the RHS represents Landau-Gilbert damping [85], which preserves the spin length. Here $\epsilon$ is a small parameter measuring the damping strength. For the case of weak dissipation the amplitude decay rate $\Gamma$ of plane spin waves is, from Eq. (7.2),

$$\Gamma(q) = 2JS(A+2)\,\epsilon + \ldots, \qquad (7.3)$$

where the omitted terms are of higher order in $\epsilon$ and $f$.
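The statement that the damping term in Eq. (7.2) preserves the spin length can be checked on the simplest possible case. The sketch below is only an illustration under stated assumptions: a single classical spin in a fixed unit field along $z$ (in the chain the effective field depends on the neighboring spins), with hypothetical values $\epsilon = 0.05$ and step size 0.01, integrated with a hand-written fourth-order Runge-Kutta step.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rhs(s, h, eps):
    """Right-hand side of Eq. (7.2) for one spin s in a fixed field h."""
    sxh = cross(s, h)
    damping = cross(s, sxh)
    return tuple(p - eps * d for p, d in zip(sxh, damping))


def rk4_step(s, h, eps, dt):
    """One fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return tuple(x + c * y for x, y in zip(s, k))
    k1 = rhs(s, h, eps)
    k2 = rhs(shifted(k1, dt / 2), h, eps)
    k3 = rhs(shifted(k2, dt / 2), h, eps)
    k4 = rhs(shifted(k3, dt), h, eps)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))


# Hypothetical parameters: unit field along z, spin tilted by 0.5 rad.
h = (0.0, 0.0, 1.0)
eps, dt, n_steps = 0.05, 0.01, 2000
s0 = (math.sin(0.5), 0.0, math.cos(0.5))

s = s0
for _ in range(n_steps):
    s = rk4_step(s, h, eps, dt)

norm = math.sqrt(sum(x * x for x in s))
print(norm, s0[2], s[2])
```

For this special case Eq. (7.2) gives $dS_z/dt = \epsilon(1 - S_z^2)$, so the spin precesses about the field while relaxing toward it, and $|\mathbf{S}|$ stays at unity up to integration error.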
The condition that the maximum MI growth rate be greater than the damping rate gives a dissipation-imposed amplitude threshold $f_{th}$. From Eqs. (6.25) and (7.3), one obtains

$$f_{th}^2 = \frac{2(A+2)}{A}\left[1 + \frac{\Delta\Omega}{2\Omega(0)}\right]\epsilon . \qquad (7.4)$$

It should, however, be pointed out that Eq. (7.4) does not guarantee the formation of ILSMs from the MI, since the formation of ILSMs is a dynamical process in which the competing effects of nonlinearity and dispersion reach a delicate balance. The characteristic time scale of this nonlinear process can be obtained from the nonlinear frequency shift given in Eq. (6.4), namely,

$$T_{NL} = \frac{2\pi}{|\Delta\omega|} = \frac{\Omega(q) + \Omega(\pi/2a)}{A f^2}\,T_{AFMR} . \qquad (7.5)$$
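As a numerical check, Eq. (7.5) can be evaluated for the simulation parameters $A = 2.0$, $f = 0.2$, $q = 0$. The sketch below rests on an assumption not stated in this section: that the linear dispersion of the uniaxial chain, in units of $2JS$, is $\Omega(q) = \sqrt{(A+2)^2 - 4\cos^2 qa}$. The damping ratios $10^{-4}$ and $10^{-3}$ are likewise inferred, being the values implied by the products $\Gamma T_{NL} \approx 0.058$ and $0.58$ quoted for the two dissipative runs. All function names are hypothetical.

```python
import math


def omega(qa, A):
    """Assumed linear dispersion in units of 2JS (not given in this section)."""
    return math.sqrt((A + 2.0) ** 2 - 4.0 * math.cos(qa) ** 2)


def t_nl_over_t_afmr(A, f, qa=0.0):
    """Eq. (7.5): T_NL / T_AFMR for carrier wave vector q and amplitude f."""
    return (omega(qa, A) + omega(math.pi / 2.0, A)) / (A * f ** 2)


def gamma_t_nl(gamma_over_omega, A, f):
    """Product Gamma * T_NL, using T_AFMR = 2 pi / omega_AFMR."""
    return gamma_over_omega * 2.0 * math.pi * t_nl_over_t_afmr(A, f)


ratio = t_nl_over_t_afmr(A=2.0, f=0.2)   # T_NL in units of T_AFMR
weak = gamma_t_nl(1e-4, 2.0, 0.2)        # weaker-dissipation case
strong = gamma_t_nl(1e-3, 2.0, 0.2)      # stronger-dissipation case
print(ratio, weak, strong)
```

Under these assumptions the time scale comes out close to $93\,T_{AFMR}$ and the two products close to 0.06 and 0.6, consistent with the values quoted in the text to within rounding.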
With the parameters $A = 2.0$ and $f = 0.2$ used in the numerical simulations, one finds $T_{NL} \approx 93\,T_{AFMR}$. The conclusion is that ILSMs can be created from the MI only when the effects of nonlinearity and dispersion are much stronger than the dissipation effect [104]; hence the necessary condition becomes

$$\Gamma(q)\,T_{NL} \ll 1 . \qquad (7.6)$$

MD simulations with the perturbed $q = 0$ large-amplitude extended spin wave with amplitude $f = 0.2$ as the initial condition have been carried out, and the time evolution of the energy distribution has been examined for two different dissipation values. The energy density displayed in Fig. 50a for $\Gamma/\omega(0) = 10^{-4}$ and in Fig. 50b for $\Gamma/\omega(0) = 10^{-3}$ is multiplied by $e^{\Gamma t}$ for ease in viewing. Interesting
Fig. 50. The influence of weak dissipation on the formation of ILSMs from modulational instability. The energy density is multiplied by $e^{\Gamma t}$ for ease in viewing. Initially the $q = 0$ extended spin wave with amplitude $f = 0.2$ is perturbed in a lattice with $A = 2.0$: (a) $\Gamma/\omega(0) = 10^{-4}$; (b) $\Gamma/\omega(0) = 10^{-3}$ (after Ref. [102]).
general differences are found between the two cases. The ILSMs in the weaker dissipation case (Fig. 50a) are much more localized and appear to be pinned less strongly than for the case of no dissipation previously shown in Fig. 48b. On the other hand, the ILSMs in the stronger dissipation case (Fig. 50b) are somewhat delocalized and hence more mobile. This difference results from the competition between the MI-induced energy localization and the delocalizing dissipation effect. In the weaker dissipation case ($\Gamma T_{NL} \approx 0.058$) the MI process can take place before dissipation becomes significant, while in the stronger dissipation case ($\Gamma T_{NL} \approx 0.58$) the dissipation effect prevents energy from being strongly localized by decreasing the amplitude and hence reducing the effective strength of the nonlinearity.

7.2. Uniaxial FeF$_2$
Studies reviewed in Section 4 for an antiferromagnetic chain with single-ion easy-axis anisotropy have demonstrated that ILSMs can exist in the gap below the AFMR uniform mode frequency for any $D > 0$, and that the ratio of the anisotropy field to the exchange field, $H_A/H_E$, is a crucial parameter which determines the localization properties of these new modes [40,62,102,105]. For a specific maximum spin deviation, the larger the ratio $H_A/H_E$, the more strongly localized is the spin wave mode, with the mode frequency moving further into the gap. For the two standard easy-axis antiferromagnets MnF$_2$ and FeF$_2$, very different localization properties are to be expected. Because of its relatively weak anisotropy field, $H_A/H_E = 10^{-2}$ [85], broad intrinsic localized spin wave gap modes (ILSGs) created on a short time scale should be generated in MnF$_2$, which would be difficult to identify in experiment due to the coexistence of extended nonlinear spin waves. On the other hand, because FeF$_2$ has a much larger anisotropy value, $H_A/H_E = 0.345$ [101], strong localization should be produced and should persist on a much longer time scale; because of the relatively large frequency shifts, it should also be more easily separated from the extended spin waves associated with the AFMR.

7.2.1. CW driver

The parameters for the chain are chosen to match those of FeF$_2$, which are given in Table 1. Since only the ratios between parameters matter in these computer simulations, $D/J = 0.69$ is used in Eq. (4.1). For a dissipative chain of classical spins, the equation of motion is given by Eq. (7.2), but now the gyromagnetic ratio $\gamma$ is explicitly included and multiplies the terms on the RHS of that
Table 1. Model parameters for FeF$_2$ [101], FeCl$_2$ [107] and (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ [109]. $H_E$ is the exchange field, $H_A$ is the anisotropy field tensor, $H^{s(d)} = \sum_n F(n)$ (even (odd) $n$) is the dipolar field tensor arising from the same (different) sublattice, and $\omega_\pm(0)/\gamma$ are the AFMR fields for resonance. Both $H_A$ and $H^{s(d)}$ have only diagonal elements, which are listed here. All parameters are in units of Oersteds.

Material                          $H_E$               $H_A$               $H^s$                   $H^d$               $\omega_\pm(0)/\gamma$
FeF$_2$                           $5.55\times10^5$    $1.91\times10^5$    Neglected               Neglected           $5.01\times10^5$
FeCl$_2$                          $1.41\times10^4$    $1.57\times10^5$    Neglected               Neglected           $1.71\times10^5$
(C$_2$H$_5$NH$_3$)$_2$CuCl$_4$    829                 {69, 974, 0}        {-247, 500, -253}       {75, -150, 75}      1915, 494
equation in order to make contact with experimental measurements. For FeF$_2$, $\epsilon$ in Eq. (7.2) is chosen such that $\Gamma/\omega_{AFMR} = 4\times10^{-5}$, to agree with the intrinsic linewidth of the AFMR mode, $\Delta H = 20$ Oe at 4.2 K [101]. The effective field acting on the $n$th spin is given by

$$\mathbf{H}_n^{eff}(t) = -\nabla_{\mathbf{S}_n}\mathcal{H} + \mathbf{H}_1(t), \qquad (7.7)$$

where $\mathcal{H}$ is the Hamiltonian in the absence of the driving field, and $H_1$ is the field strength of the circularly polarized driving field in the $x$-$y$ plane. Although one could imagine modulating the driving field in various ways, because the extended nonlinear spin wave frequency red shifts with increased ac field strength, the MD simulations described here correspond to the simplest possible case: a CW source with a fixed driving frequency. The appropriate perturbation is

$$\mathbf{H}_1(t) = H_1\left(\cos\omega_1 t\;\hat{\mathbf{e}}_x - \sin\omega_1 t\;\hat{\mathbf{e}}_y\right). \qquad (7.8)$$
7.2.2. Molecular dynamics simulations

Molecular dynamics (MD) simulations have been used to estimate the magnitude of the ac field required to create ILSGs via the modulational instability of large-amplitude extended spin waves for an easy-axis antiferromagnetic chain of 256 spins with periodic boundary conditions [106]. In the MD simulations the discrete equations of motion for the $x$, $y$ and $z$ spin components, given by Eq. (7.2), are integrated numerically using the fourth-order Runge-Kutta method. For a particular ac field strength, the driving frequency is set to a value slightly below the AFMR frequency of the uniform mode, $\omega_{AFMR}$. The optimal driving frequency should maximize the total energy fed into the antiferromagnetic chain so that the system attains the maximum possible nonlinear contribution.

The time evolution of the energy per spin in a chain driven by a CW ac field with fixed strength $H_1 = 4.0\times10^{-4}\,\omega_{AFMR}/\gamma$ at three different frequencies is shown in Fig. 51. In each case the chain has the same initial configuration, that is, the spins are randomly tilted from their ground state configuration with an average spin deviation $\langle\delta S_n\rangle = 0.005$. The driving frequencies $\omega_1/\omega_{AFMR}$ are 0.994 (dot-dashed line), 0.995 (solid line), and 0.996 (dashed line), respectively. The time is measured in units of $T_{AFMR} = 2\pi/\omega_{AFMR}$. These three MD simulations demonstrate that the driving frequency is a crucial parameter and that $\omega_1 = 0.995\,\omega_{AFMR}$ is the optimal driving frequency for this particular ac field strength. For this optimal case, the energy in the chain increases smoothly with time during the first $240\,T_{AFMR}$ and then saturates at longer times, with the deviations produced by irregular fluctuations. Although the details of the MD simulation results depend on the initial spin configuration, the evolution of the energy per spin shows no qualitative difference between the run shown here and runs with other random initial configurations.
The time evolution of the energy density distribution in the analog 1-D FeF$_2$ system is plotted in Fig. 52 for the optimal case of $H_1 = 4.0\times10^{-4}\,\omega_{AFMR}/\gamma$ and $\omega_1 = 0.995\,\omega_{AFMR}$. After the driving field is turned on, the energy density distribution increases smoothly with time and remains plane-wave-like until the instability triggered by the random initial condition begins to manifest itself after about $240\,T_{AFMR}$, when the energy tends to build up in the system and the extended spin waves become unstable. With continued development of the instability the extended $q = 0$ spin wave decays into a few ILSG excitations which slowly move around the lattice. Since the ILSG excitations have lower frequencies than the uniform mode, their coupling to the ac driving field is weak and the energy in the chain reaches the steady state shown as the solid line in Fig. 51. (The fine wiggles shown here result from the fact that up spins for ILSGs in uniaxial easy-axis
Fig. 51. Time evolution of the energy per spin in a periodic uniaxial easy-axis antiferromagnetic chain of 256 spins with parameters corresponding to those of FeF$_2$, driven by a circularly polarized ac CW field. The strength of the circularly polarized ac field is $H_1 = 4.0\times10^{-4}\,\omega_{AFMR}/\gamma$, and the frequencies of the ac CW fields, $\omega_1/\omega_{AFMR}$, are as follows: 0.994 (dot-dashed line), 0.995 (solid line), and 0.996 (dashed line).

Fig. 52. Energy distribution versus time showing ILSMs generated in the analog 1-D FeF$_2$ model. The driving frequency is $\omega_1 = 0.995\,\omega_{AFMR}$ and $H_1 = 4.0\times10^{-4}\,\omega_{AFMR}/\gamma$. The energy density is in arbitrary units, and the time is measured in units of $T_{AFMR}$.
antiferromagnets have a larger deviation than adjacent down spins and hence have higher anisotropy energy [62].)

Since both the ILSG and the uniform-mode AFMR have net transverse magnetic dipole moments, it is instructive to look at the resulting power spectrum of $M^+(t)$ to differentiate between the two signatures. This spectrum, which is calculated from the data during the time interval between 300 and $1938\,T_{AFMR}$, is shown in Fig. 53. Since the driving frequency is 0.995 times the AFMR frequency, almost all of the complex structure seen here is associated with the generation of ILSGs from the uniform mode. The power spectra calculated from MD simulations with different random initial configurations show qualitatively similar results.

Although these numerical simulations demonstrate that ILSMs can be generated in a chain with the FeF$_2$ parameters via the modulational instability mechanism by driving the unstable extended spin waves with a CW ac source, the optimum ac field required to produce this effect corresponds to $H_1 = 200$ Oe at a driving frequency of $\omega_1 = 52.34$ cm$^{-1}$. Such intense CW sources do not yet exist in this frequency region. It should also be noted that this field strength represents a lower estimate, since it is known that localization is more difficult to produce in a 3-D system than in a 1-D one [39].

7.3. Uniaxial FeCl$_2$
This layered antiferromagnet has the spins oriented in ferromagnetic sheets along the hexagonal $c$-axis with successive sheets antiparallel to each other [107]. The field parameters for this system
Fig. 53. Power spectrum of the transverse magnetic moment $M^+(t)$ showing the ILSMs generated in the analog 1-D FeF$_2$ model. The parameters are the same as for Fig. 52. The power spectrum is calculated from the simulation data during the time period between 300 and $1938\,T_{AFMR}$.
are given in Table 1. The nonlinear dynamical properties of the spins in FeCl$_2$ as driven by an ac field could be expected to be quite different from those just found for FeF$_2$, since now the ratio $H_A/H_E = 11$, that is, $A = 22$, and the effective exchange field is only 2.5% of the FeF$_2$ value. However, as we have already seen in Section 7.1, damping adds another constraint to the production of localization upon driving the uniform mode to large amplitudes. Since there is no reason for the dissipation of the spins in FeCl$_2$ to be any smaller than the value previously found in FeF$_2$, one can assume that FeCl$_2$ and FeF$_2$ have the same ratio $\Gamma/\omega_{AFMR}$. Furthermore, it is to be expected that the uniform mode in FeCl$_2$ must be driven to a nonlinear regime such that the product $\Gamma T_{NL} \ll 1$ and that this has the same value as for FeF$_2$. From Eq. (7.5) one then obtains for the threshold spin wave amplitude $f_{FC} \approx 0.58\,f_{FF}$, where the subscripts FC and FF refer to FeCl$_2$ and FeF$_2$, respectively. Next, to estimate the required strength of the CW ac field, use can be made of the expression $f \propto H_1/\Gamma$ to obtain $H_1 \approx 40$ Oe, which is only a factor of 5 smaller than that required for the previous system. The strength of this CW source is still too large to be viable in the submillimeter wave region.

7.4. Biaxial (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$

As the localization strength of an ILSM is really determined by the ratio $H_A/H_E$, it should be possible to apply the same method to excite ILSMs with $H_A$ comparable to $H_E$ but for antiferromagnets with the AFMR frequency in the GHz region, where powerful sources are available. Since the exchange field $H_E$ is scaled down by a factor of 100, one would expect the required strength of the driving field to decrease by roughly the same factor. The necessary "large" ac field is now only a few Oersteds. Because of such practical constraints a well-known layered antiferromagnet, (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ [108,109], becomes a reasonable candidate [106]. The structure of this compound is face-centered orthorhombic.
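The FeCl$_2$ estimates of Section 7.3 can be retraced numerically before turning to the details of the copper compound. The sketch below is a hedged reconstruction of the scaling argument, not the authors' calculation. It assumes the dispersion $\Omega(q) = \sqrt{(A+2)^2 - 4\cos^2 qa}$ in units of $2JS$ (not given in this section), takes $A = 0.69$ for FeF$_2$ from $D/J$, and uses AFMR fields of $5.01\times10^5$ and $1.71\times10^5$ Oe, whose powers of ten are inferred from the field ratios quoted in the text. Function and variable names are hypothetical.

```python
import math


def omega(qa, A):
    """Assumed linear dispersion in units of 2JS (not given in this section)."""
    return math.sqrt((A + 2.0) ** 2 - 4.0 * math.cos(qa) ** 2)


def shape_factor(A):
    """(Omega(0) + Omega(pi/2a)) / A, so that T_NL / T_AFMR = shape_factor / f^2."""
    return (omega(0.0, A) + omega(math.pi / 2.0, A)) / A


A_FEF2, A_FECL2 = 0.69, 22.0             # anisotropy parameters quoted in the text
AFMR_FEF2, AFMR_FECL2 = 5.01e5, 1.71e5   # AFMR fields in Oe (exponents inferred)
H1_FEF2 = 200.0                          # driving field found for FeF2, in Oe

# Equal Gamma*T_NL and equal Gamma/omega_AFMR imply equal T_NL/T_AFMR,
# so f^2 scales with the shape factor of Eq. (7.5).
f_ratio = math.sqrt(shape_factor(A_FECL2) / shape_factor(A_FEF2))

# f ~ H1/Gamma with Gamma ~ omega_AFMR, so H1 scales as f * omega_AFMR.
h1_fecl2 = H1_FEF2 * f_ratio * AFMR_FECL2 / AFMR_FEF2
print(f_ratio, h1_fecl2)
```

Under these assumptions the amplitude ratio comes out near 0.58 and the required field near 40 Oe, in line with the values quoted above.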
The interactions between the spin-1/2 copper ions within the $ab$-plane are strongly ferromagnetic, while there exists a very weak antiferromagnetic interaction between neighboring copper ions in adjacent layers. At $T = 1.4$ K the interlayer antiferromagnetic exchange field is $H_E = 829$ Oe and the intralayer ferromagnetic exchange field is $H_E' = 5.5\times10^5$ Oe. Thus $H_E/H_E' = 1.51\times10^{-3} \ll 1$, and below the Néel temperature ($T_N = 10.2$ K) the low-frequency
spin dynamics can be described quite accurately by a 1-D two-sublattice antiferromagnet with spins in the same layer pointing in the same direction. Since a given layer contains many similarly oriented Cu spins, it can be modeled by one classical spin.

7.4.1. Nonlinear dipolar anisotropy

To describe the low-frequency spin dynamics of this two-sublattice antiferromagnet, an effective one-dimensional Hamiltonian of the form

$$\mathcal{H} = 2J_{AF}\sum_n \mathbf{S}_n\cdot\mathbf{S}_{n+1} + \sum_n \mathbf{S}_n\cdot\mathbf{D}\cdot\mathbf{S}_n + \frac{1}{2}\sum_{n,n'} \mathbf{S}_n\cdot\mathbf{F}(n-n')\cdot\mathbf{S}_{n'} \qquad (7.9)$$

is used, where $\mathbf{S}_n$ is the effective spin of the $n$th layer and is treated as a classical vector of unit length. The antiferromagnetic exchange constant $J_{AF} > 0$, and the anisotropy tensor $\mathbf{D}$ arises from the anisotropic ferromagnetic exchange interaction between spins belonging to the same layer. The third term describes the magnetic dipole-dipole interactions. Since the frequencies of the spin waves in (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ lie in the GHz region, this dipolar term should play a significant role. The effective dipolar interaction tensors $\mathbf{F}(n-n')$ can be obtained by summing contributions from spins belonging to the $n'$th layer, that is,

$$\mathbf{F}(n-n') = \mathbf{F}(|n-n'|) = (g\mu_B)^2 \sum_{j\in n'} \frac{1}{r_j^3}\left(\mathbf{I} - \frac{3\,\mathbf{r}_j\mathbf{r}_j}{r_j^2}\right), \qquad (7.10)$$

where $\mathbf{r}_j$ represents the vector pointing from a spin denoted by 0 in the $n$th layer to a spin denoted by $j$ in the $n'$th layer, and the summation runs over all spins in the $n'$th layer. (Note that $j = 0$ should be excluded from the summation when $n = n'$.) The largest component, $\mathbf{F}(0)$, represents the dipolar interaction between spins belonging to the same layer. Owing to the symmetry of the lattice [19], both the anisotropy tensor $\mathbf{D}$ and the effective dipolar interaction tensors $\mathbf{F}(n-n')$ have only non-zero diagonal elements. From Eqs. (7.7) and (7.9) the effective magnetic field acting on the $n$th spin is

$$\mathbf{H}_n^{eff}(t) = -2J_{AF}\left(\mathbf{S}_{n-1} + \mathbf{S}_{n+1}\right) - 2\,\mathbf{D}\cdot\mathbf{S}_n - \sum_{n'} \mathbf{F}(n-n')\cdot\mathbf{S}_{n'} + \mathbf{H}_1(t) . \qquad (7.11)$$

The model parameters are set to $J_{AF} = 207$ Oe and $\mathbf{D} = \mathrm{diag}\{34.5, 487, 0\}$ Oe so that the $z$-axis is the easy axis, and the antiferromagnetic exchange field and the anisotropy field match those measured for (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$. These are listed in Table 1. The effective dipolar interaction tensor $\mathbf{F}(n-n')$ has been obtained by summing contributions from spins belonging to the $n'$th layer [109]. Since the Hamiltonian does not possess uniaxial symmetry, there are two antiferromagnetic spin wave branches, $\omega_\pm(q)$, which are given by

$$\omega_\pm^2(q) = \left[4J_{AF}(1 \pm \cos qa) + 2(D_x - D_z) + F_x^s(q) \pm F_x^d(q) - F_z^s(0) + F_z^d(0)\right] \times \left[4J_{AF}(1 \mp \cos qa) + 2(D_y - D_z) + F_y^s(q) \mp F_y^d(q) - F_z^s(0) + F_z^d(0)\right], \qquad (7.12)$$

where $a$ is the distance between adjacent layers, and

$$\mathbf{F}^s(q) = \sum_l \mathbf{F}(2l)\,e^{i2lqa} \quad\text{and}\quad \mathbf{F}^d(q) = \sum_l \mathbf{F}(2l+1)\,e^{i(2l+1)qa} . \qquad (7.13)$$
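At $q = 0$, Eq. (7.12) can be evaluated directly from the tabulated fields. The sketch below is a consistency check under stated assumptions, not part of the original analysis: it takes the same-sublattice and different-sublattice dipolar entries of Table 1 to be $\mathbf{F}^s(0)$ and $\mathbf{F}^d(0)$, works in field units (Oe) so that the result is $\omega_\pm(0)/\gamma$, and uses hypothetical names throughout.

```python
import math

J_AF = 207.0                                   # exchange constant, Oe
D = {"x": 34.5, "y": 487.0, "z": 0.0}          # anisotropy tensor diagonal, Oe
FS0 = {"x": -247.0, "y": 500.0, "z": -253.0}   # assumed F^s(0): same sublattice
FD0 = {"x": 75.0, "y": -150.0, "z": 75.0}      # assumed F^d(0): different sublattice


def afmr_field(sign):
    """omega_sign(0)/gamma in Oe from Eq. (7.12) at q = 0; sign is +1 or -1."""
    common = -FS0["z"] + FD0["z"]
    first = (4.0 * J_AF * (1 + sign) + 2.0 * (D["x"] - D["z"])
             + FS0["x"] + sign * FD0["x"] + common)
    second = (4.0 * J_AF * (1 - sign) + 2.0 * (D["y"] - D["z"])
              + FS0["y"] - sign * FD0["y"] + common)
    return math.sqrt(first * second)


w_plus = afmr_field(+1)    # upper branch at the zone center
w_minus = afmr_field(-1)   # lower branch at the zone center
print(w_plus, w_minus)
```

With these inputs both zone-center values land within about one percent of the measured AFMR fields of 1915 Oe and 494 Oe; the small residual presumably reflects rounding of the tabulated parameters.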
Fig. 54. Spectrum of the antiferromagnetic spin waves in (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ as calculated from Eq. (7.12) with the field parameters given in Table 1. The positive curvature of the lower branch at the center of the Brillouin zone permits intrinsic localized spin wave modes in the gap.
The model parameters have been chosen to yield the experimentally measured antiferromagnetic resonance frequencies, that is, $\omega_+(0)/\gamma = 1915$ Oe and $\omega_-(0)/\gamma = 494$ Oe. At the zone boundary, the two branches are degenerate with $\omega_\pm(\pi/2a)/\gamma = 1584$ Oe. The dispersion curves for the two low-lying antiferromagnetic branches are plotted in Fig. 54. Now consider the instability properties of these antiferromagnetic spin wave branches. Because of the negative curvature of the upper dispersion curve branch at $q = 0$, the existence of intrinsic localized spin wave resonances (ILSRs) can be ruled out [57]; however, the positive curvature of the lower branch at the zone center permits ILSGs to occur in the gap below $\omega_-(0)$.

7.4.2. MD simulation results

To study with MD simulations the creation of ILSGs in this system, where magnetic dipole-dipole anisotropy is important, a CW microwave field given by Eq. (7.8) has been applied to a chain of 256 spins with periodic boundary conditions [106]. Since the AFMR frequency lies in a low-frequency region where the phonon density of states is small, the coupling between the Cu ion and the lattice is expected to be weak. The damping parameter $\epsilon$ is set so that the ratio $\Gamma/\omega$ is of the same order as that in FeF$_2$. At time 0 the spin associated with each layer is randomly tilted from the easy axis by a small amount, with $\langle\delta S_n\rangle = 0.005$.

The smallest driving field strength that will induce significant localization can be searched for in the same way as described in Section 7.2. For a given driving field strength, the driving frequency should be chosen to maximize the energy going into the system. After a few trials, the
frequency of the driving field is set to $\omega_1/\gamma = 487.5$ Oe for the ac field of strength $H_1 = 0.987$ Oe. The MD simulation results showing the evolution of the energy density distribution for this optimal case are plotted in Fig. 55. The amplitude of the extended nonlinear spin wave grows smoothly for about $70\,T_{AFMR}$ (with $T_{AFMR} = 2\pi/\omega_-(0)$), while for longer times the extended mode decays into slowly moving localized excitations. MD simulations with different random initial configurations show qualitatively similar results.

In Fig. 55, the ILSGs extend over roughly 20-30 lattice sites. To characterize the spatial size of ILSGs in a more quantitative way, one can calculate the energy-energy correlation function as defined in Eq. (7.1). For an extended uniform spin wave, $C_E(n) = 1$. As the system is driven into the strongly nonlinear region, the extended spin wave becomes unstable and a central spike grows in the energy-energy correlation function. The average size of ILSGs can be defined as the FWHM of the central peak. For the run shown in Fig. 55, the FWHM of the central peak averaged between 100 and $200\,T_{AFMR}$ is equal to $20.8a$.

The ILSGs for this anisotropic chain have a non-zero net magnetic moment, $M_y(t) = \sum_n S_n^y(t)$, polarized in the $y$ direction. Fig. 56 displays the power spectrum of $M_y(t)$ for the MD run shown in Fig. 55. The power spectrum is calculated from the data during the time period from 90 to $512\,T_{AFMR}$ to exclude the contribution from the initial extended spin waves. The dot-dashed curve identifies the frequency of the driving field. The lower frequency components shown here are generated by the production of ILSMs which are produced by the decay of unstable extended spin
Fig. 55. Energy density distribution versus time showing ILSMs generated in the analog 1-D (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ model. The strength of the applied circularly polarized ac CW field is $H_1 = 0.987$ Oe, and the frequency is the corresponding optimal frequency $\omega_1/\gamma = 487.5$ Oe. The energy density is in arbitrary units, and time is measured in units of $T_{AFMR}$ ($2\pi/\omega_-(0)$).

Fig. 56. Power spectrum of the net magnetic moment $M_y(t)$ showing the ILSMs generated in the analog 1-D (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ model. The parameters are the same as for Fig. 55. The power spectrum is calculated from the simulation data during the time period from 90 to $512\,T_{AFMR}$, and is normalized so that the integrated strength is unity. The dot-dashed line identifies the driving frequency.
Fig. 57. Energy density distribution versus time showing ILSMs generated in the analog 1-D (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ model. All parameters are the same as those in Fig. 55 except that the frequency of the ac CW field is $\omega_1/\gamma = 485$ Oe, which is the optimal frequency corresponding to the new applied ac field strength $H_1 = 1.4$ Oe. The energy density is in arbitrary units, and time is measured in units of $T_{AFMR}$.

Fig. 58. Power spectrum of the net magnetic moment $M_y(t)$ showing the ILSMs generated in the analog 1-D (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ model. The parameters are the same as for Fig. 57. The power spectrum is calculated from the simulation data during the time period from 90 to $512\,T_{AFMR}$, and is normalized so that the integrated strength is unity. The dot-dashed line identifies the driving frequency.
waves. The center of gravity of the power spectrum is calculated to be $451 \pm 4.8$ Oe for 10 MD runs with different random initial configurations.

To demonstrate the effect of the ac field strength on the ILSG spectrum, the power of the CW microwave source is doubled so that the ac field strength is increased by a factor of $\sqrt{2}$ to $H_1 = 1.4$ Oe. As a consequence of the red shift resulting from the increased driving power, the optimal driving frequency moves down further to $\omega_1/\gamma = 485$ Oe. Figs. 57 and 58 show the MD simulation results. As demonstrated in Fig. 57, the effect of the instability begins to show up earlier owing to the increase in the driving field strength, and the ILSGs are more localized, with the FWHM of the central peak of the energy-energy correlation function [102] averaged between 100 and $200\,T_{AFMR}$ equal to $17.8a$, compared to $20.8a$ for the run of Fig. 55. The power spectrum of $M_y(t)$ for the corresponding run is displayed in Fig. 58. Compared to Fig. 56, where the shift of the center of gravity of the frequency spectrum from the driving frequency is 9%, in Fig. 58 there are more low-frequency components, producing a corresponding shift of 10%, since on average the ILSGs generated in this case have larger amplitudes and hence lower mode frequencies.

Although spin wave resonances in the paramagnetic phase of (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$ have been extensively studied, measurements of the AFMR in the antiferromagnetic phase are relatively rare and the exact intrinsic linewidth of the AFMR is not available. In the formation process of ILSGs there are two competing factors, namely nonlinear instability and damping. The nonlinear instability results in the decay of the extended nonlinear wave into localized excitations, while the
Fig. 59. Power spectra of the net magnetic moment $M_y(t)$. All parameters are the same as those in Fig. 55 except for the damping coefficient $\epsilon$ and the optimal driving frequency: (a) the smaller value of $\epsilon$, with $\omega_1/\gamma = 487.5$ Oe; (b) the larger value of $\epsilon$, with $\omega_1/\gamma = 490$ Oe. The power spectrum is calculated from the simulation data during the time period from 90 to $512\,T_{AFMR}$, and is normalized so that the integrated strength is unity. The dot-dashed lines identify the driving frequencies.
damping prevents the formation of localized excitations by reducing the strength of the nonlinearity. To investigate how damping might affect the formation of ILSGs in (C$_2$H$_5$NH$_3$)$_2$CuCl$_4$, computer simulations have been carried out with different values of $\epsilon$. Plotted in Fig. 59 are the power spectra of $M_y(t)$ for two such values. At the smaller value, the effects of nonlinearity and dispersion are still much stronger than the dissipation effect. In this case, the extended large-amplitude spin wave breaks into localized modes, producing an ILSG band in the power spectrum, as shown in Fig. 59a. As the damping coefficient is increased further in Fig. 59b, the power spectrum exhibits a dominant peak at the driving frequency. In this case, the characteristic time scale for the formation of ILSGs from the extended large-amplitude spin wave is longer than the damping time, preventing the extended spin wave from breaking into ILSGs.

The study via MD simulations appears to demonstrate that ILSGs can be created via modulational instability by driving unstable extended nonlinear spin waves in a realistic antiferromagnetic material with a conventional microwave source. Since the ILSMs have frequencies below the extended spin waves and are magnetic dipole-active, it is anticipated that their signature can be directly probed by microwave homodyne detection methods.
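The "center of gravity" used in this section to summarize the power spectra is not defined explicitly. One plausible reading, assumed here, is the power-weighted mean frequency of the spectrum. The sketch below illustrates that reading on a hypothetical test signal with two equal-amplitude components at bins 5 and 10 of a 64-sample record; it uses a naive discrete Fourier transform and is not the procedure of the cited simulations.

```python
import cmath
import math


def power_spectrum(signal):
    """Naive DFT power |X_k|^2 for the positive-frequency bins k = 1 .. N/2 - 1."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(signal))
        spectrum.append((k, abs(x_k) ** 2))
    return spectrum


def center_of_gravity(spectrum):
    """Power-weighted mean frequency (in bin units); assumed definition."""
    total = sum(p for _, p in spectrum)
    return sum(k * p for k, p in spectrum) / total


# Hypothetical test signal: equal-amplitude components at bins 5 and 10.
n = 64
signal = [math.cos(2 * math.pi * 5 * t / n) + math.cos(2 * math.pi * 10 * t / n)
          for t in range(n)]

centroid = center_of_gravity(power_spectrum(signal))
print(centroid)
```

For two components of equal power the centroid falls midway between them, so added weight at lower frequencies, such as an ILSG band below the driving line, pulls the center of gravity downward.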
8. Conclusions

8.1. Summary

A variety of nonlinear features associated with intrinsic localization in simple 1-D periodic ferromagnets and antiferromagnets have been presented in this review. Since the discrete nature of
Table 2. Qualitative comparison of intrinsic localized mode properties for 1-D crystal and magnetic lattices

Ions                                        Spins                                      Comparison
Monatomic                                   Ferromagnetic
  Intersite interactions                      On-site + intersite interactions         Different
  One degree of freedom                       Two degrees of freedom                   Different
nn two-body potential                       nn exch. + on-site anisotropy
  Soft → no intrinsic localization            Soft → intrinsic localization            Different
Diatomic                                    Two-sublattice antiferromagnet
  Two ILGM parities (odd stable)              Two ILSM parities (odd stable)           Similar
Diatomic + nn interactions                  Antiferromagnet + nn interactions
  Gap mode                                    Uniaxial anisotropy → gap mode           Similar
                                              Biaxial anisotropy → gap mode + reson.   Different
Diatomic ILGM                               Antiferromagnetic ILSG
  Electric dipole active                      Magnetic dipole active                   Similar
  q = 0 optic mode stable                     q = 0 optic mode unstable                Different
  BZ mode unstable                            BZ mode stable                           Different
Odd potential terms → local dc distortion   Only even terms; no local dc distortion    Different
the lattice plays a crucial role in defining the properties of these excitations, only two subsections of this review (Sections 4.4 and 5.3) have dealt with the continuum limit.

There is some value in comparing and contrasting the nano-scale behavior of the magnetic excitations described here with those already identified in simulation studies of the excitations in nonlinear crystal lattices. For the vibration of a monatomic 1-D crystal lattice only intersite interactions are relevant, while even for a simple ferromagnet there are on-site interactions to consider because of the local anisotropy field. When NN two-body potentials are introduced into the monatomic vibrational problem, the resulting anharmonicity is soft, so intrinsic localized modes cannot form. For the 1-D magnetic case the nonlinearities produced by both the NN exchange interaction and the anisotropy field are also soft, but now the on-site anisotropy can produce a gap at the bottom of the spin wave spectrum, and localized modes may appear in this gap. For such modes to be strongly localized the anisotropy energy and the exchange energy would need to be comparable. The differences between these two kinds of nonlinear systems are summarized in the first two rows of Table 2.

The application of a dc magnetic field provides another difference between the two cases. As described in Section 2 for the special case of a 1-D ferromagnetic chain with NN interactions and easy-plane anisotropy, with a large enough magnetic field applied perpendicular to that plane, ILSMs can appear above the top of the linear spin wave spectrum. Numerical methods have been used to study the collisions between these objects as well as the collisions of these objects with magnetic defects. The results described here can also be expected to apply to some magnetic superlattice systems.
A multilayer stack consisting of alternating ferromagnetic films and non-magnetic films has weak ferromagnetic coupling between the ferromagnetic layers as well as a small in-plane anisotropy. The application of a sufficiently strong external magnetic field perpendicular to the stack will then reorient the spin alignment to the field direction, with the added feature that spin precession will not produce an intra-film demagnetizing field [61].
The simplest way to make intrinsic localization possible in a 1-D ferromagnetic chain without anisotropy or a magnetic field is to include both NN and NNN isotropic exchange coupling. This possibility was reviewed in Section 3. It was shown that when the strength of the NNN exchange interaction relative to the NN interaction exceeds a specific threshold value, intrinsic localized spin wave resonances (ILSRs) of both odd and even parity may appear coincident with the frequencies of the linear spin wave spectrum. Numerical studies demonstrate that the lifetime of an ILSR depends on the mode parity, the maximum spin deviation, and the relative strength of the NNN interaction to the NN one. In the small amplitude continuum approximation the traveling ILSR and stationary ILSR have the same envelope shape: its width is inversely proportional to the maximum spin deviation and increases with increasing NNN coupling strength. The properties of a translating ILSR depend on the size of the spin deviation. If the maximum spin deviation is modest the ILSR can travel through the lattice, but it is scattered by the discreteness of the lattice and decays into plane spin waves over sufficient distances. The larger its amplitude, or the larger its velocity, the larger is the emission of plane-wave modes. For colliding ILSRs soliton-like behavior is found for small spin deviations, in that the ILSRs preserve their shapes after collision and the energy transfer between them is negligible, but for large amplitudes neither ILSR can survive the collision.
The 1-D diatomic lattice with realistic NN two-body potentials has been used to identify where and how intrinsic localization might appear in the vibrational spectrum. It was found that because of the soft anharmonicity intrinsic localized gap modes (ILGMs) may drop out of the optic branch and appear in the gap between the optic and acoustic plane-wave branches. The two-sublattice 1-D antiferromagnet has somewhat similar features, as outlined in rows 3 and 4 of Table 2. The case of NN exchange interactions and on-site easy-axis anisotropy has been described in some detail in Section 4. Here truly localized modes can be produced with frequencies in the gap below the standard AFMR frequency. The amplitude of such an ILSG is either single or double peaked, and for both cases the ILSG frequency decreases as its amplitude grows. The degree of localization increases as either the maximum spin deviation or the ratio of the anisotropy constant to the exchange coupling constant increases. In the small spin deviation limit both types of ILSGs become identical envelope solitons. Although single- and double-peaked ILSGs are observed to have similar static properties, both analytical study and MD simulations reveal that only the single-peaked ILSGs are stable in the presence of a noise perturbation, whereas a randomly perturbed double-peaked ILSG evolves into a single-peaked one. The instability of a double-peaked ILSG increases with its amplitude and also with the relative strength of the anisotropy field. Although ILSMs can exist at any site owing to the homogeneity of the lattice, the lack of continuous translational invariance in a discrete lattice prevents ILSMs of large amplitudes from moving from site to site, and they become more easily pinned as the anisotropy to exchange field ratio increases. The pinning of ILSGs can be understood in terms of a Peierls–Nabarro (PN) barrier created by the lattice discreteness.
The height of the PN energy barrier is the energy difference between the single-peaked ILSG and the double-peaked ILSG at the same frequency and this barrier increases
with the mode amplitude and the on-site anisotropy field. Unlike the ILSR in a NN and NNN exchange coupled ferromagnetic chain, an ILSG in an easy-axis antiferromagnetic chain can interact with far infrared radiation since the excitation has a net transverse magnetic moment; thus these gap modes may be more relevant from the experimental point of view. Another magnetic case, which is similar to that for localized vibrational modes in the 1-D diatomic lattice, occurs for an antiferromagnetic chain with uniaxial on-site easy-plane anisotropy, which was reviewed in Section 5. For this system one linear spin wave mode goes to ω = 0 at q = 0 while the other mode remains at finite frequency, so there is no spin wave gap and only an intrinsic localized spin wave resonance (ILSR) is possible. Although ILSGs can always exist in gapped antiferromagnetic chains, an ILSR can exist only when the upper branch dispersion curve has positive curvature at the center of the Brillouin zone, with frequency lying in a constrained range. As its amplitude decreases such a nonlinear excitation approaches continuously, without threshold, the corresponding spin wave mode of linear theory and is strikingly different from the topological sine-Gordon kink excitations found in 1-D easy-plane magnets [20,21,88]. The key feature in the nonlinear dynamics of an ILSR is the polarization difference between the two plane wave branches. The smaller the frequency of the q = 0 mode in the upper plane wave branch, the less strongly coupled the resulting ILSR is to the other branch of the plane wave spectrum. When biaxial symmetry replaces the uniaxial one, ILSMs can exist in the gap below the lower branch spin wave spectrum with properties similar to those described in Section 4. This overlap between the spin wave and vibrational problems is summarized in rows 3 and 4 of Table 2.
Since the ILSR and ILSG modes in an antiferromagnetic chain are elliptically polarized and have nonzero transverse magnetic moments orthogonal to each other, unlike the non-ir-active ILSR in isotropic ferromagnetic chains studied in Section 3, they can couple to far-ir radiation, as can ILGMs in diatomic lattices (see row 5 of Table 2). To determine how the intrinsic localized modes manifest themselves in physical systems, the modulational instability of extended nonlinear spin waves in easy-axis antiferromagnetic chains has been reviewed in Section 6, both analytically in the frame of linear stability analysis and numerically by means of molecular dynamics simulations. The analysis is equivalent to that for a "diatomic" lattice with two degrees of freedom per site, but an important difference with the vibrational problem is that it involves both on-site and intersite nonlinearity. Because of this difference the simplest case of NN interactions produces different instability criteria for the crystal and spin systems, as outlined in row 5 of Table 2. Stability analysis shows that the instability of an ILSM is determined by its symmetry. In contrast to the monatomic Klein–Gordon chain [69], where plane waves with wavevectors in the lower half of the Brillouin zone are always unstable to long wavelength perturbations, in easy-axis antiferromagnetic chains spin waves with wavevectors close to the zone center are stable to both long wavelength and short wavelength perturbations but unstable to perturbations of moderate wavelengths. However, since the amplitude threshold for the instability of long wavelength spin waves is inversely proportional to the lattice size, it tends to zero for macroscopic systems. Numerical simulations reveal that combination waves generated via wave-mixing processes can have a significant effect on the spin wave stability at large time scales.
Section 7 focuses on one of the main effects of the modulational instability, which is the creation of localized pulses. Weak dissipation imposes a finite amplitude threshold even for infinite chains and, in addition, ILSMs become mobile during formation because of the reduced strength of the nonlinearity. This section develops the intimate connection between modulational instability and
the dynamical localization of spin waves for real systems and demonstrates, with a particular application to layered antiferromagnets, that magnetic dipole-active ILSMs can be created via modulational instability by driving unstable extended nonlinear spin waves with an appropriate microwave radiation source. Since the ILSMs have frequencies lower than the extended spin waves, they produce an ILSM band in the absorption spectrum below the driving frequency. This signature of the ILSMs may be directly probed by homodyne detection methods. The opportunity of experimentally identifying ILMs appears at hand since the same experimental arrangement can be used both for their generation and their detection. For ILSMs which are not magnetic dipole-active, such as ILSRs in isotropic ferromagnetic chains, other techniques, such as neutron scattering, may be required. Perhaps the biggest difference between ILMs in the crystal and spin problems is that identified in row 6 of Table 2. It comes about because the odd potential terms present in the two-body interaction for vibrational dynamics are absent for the NN exchange and on-site anisotropy interactions in spin dynamics. Associated with such anharmonic potential terms is the production of a local dc distortion concomitant with the excitation of an ILGM. Since this dc strain field is long range, some care is required in MD simulations, especially for higher dimensional cases. Thus an MD simulation for the ILM magnetic lattice is much simpler than that for a vibrational lattice with realistic potentials.
8.2. Other systems and future prospects
Although ferrimagnets have not yet been considered they represent perhaps the closest magnetic example of the vibrational diatomic lattice system.
Based on the antiferromagnetic and diatomic vibrational studies of ILMs in 1-D lattices which have been reported to date, the properties of magnetic ILMs in a ferrimagnet with NN antiferromagnetic exchange interactions between neighboring spins of unequal magnitude are easy to visualize. Since the linear dynamics is now represented by an acoustic ferrimagnetic branch and an optical exchange resonance branch, for the case where a frequency gap may exist between these branches an ILSGM would be possible here. The introduction of anisotropy and an applied dc external field would increase the richness of these systems. The recently analyzed 1-D coupled rotor lattice [110–112] may at first appear closer to the spin models treated here than the dynamics of other 1-D discrete lattice models. But one feature that makes the rotor model stand out is the coexistence of rotational and librational motions, so that in the extreme case an intrinsic localized rotor mode can consist of only one rotational center plus librational wings, which can never occur for the spin model that we have reviewed here. In addition, since the equations of motion for the rotor model involve both first and second derivatives, it is not possible to make direct contact between spin and rotor dynamics. Nevertheless, just as in all other ILM-bearing models, the existence of an ILM in the rotor model is closely related to an instability of some extended plane wave states. Because the coupled rotor lattice can be mapped onto a Josephson junction array [113] there is the opportunity to carry out experimental tests exploiting this analogy, but the lattice scale will necessarily be much coarser than for the atomic lattice considered in this review. The quantization of ILSMs represents a difficult question yet to be answered. Recent vibrational studies have shown that energy focusing (ILMs) is prevalent not only in classical but also in
quantum discrete nonlinear lattices. However, these studies are either based on quantum models that are constructed to conserve the phonon number operator [114] or based on exact numerical diagonalization of Hamiltonian matrices in truncated phonon spaces [115]. In the first case the model is analytically tractable but can hardly be expected to describe any realistic physical system. In the second case, the number of particles in the chain (N) is limited to be less than 8 owing to the rapid increase of matrix dimension with N even for a truncated phonon space, and as yet no study of size dependence has been carried out. On the other hand, it is known that the amplitude threshold for modulational instability is proportional to 1/N. This result suggests that the occurrence of nonlinear localization in small chains might have a strong size dependence, so that the results obtained for a particular small chain may not be reliable for comparison with experiments. Magnetic lattices may be expected to provide an important alternative for the study of quantum ILMs since the number of states (2S+1) of each spin is usually much smaller than the number of allowed Einstein phonon states at each site (~17 in Ref. [115]). The current computational power of parallel computers should allow magnetic chains with a large range of sizes to be studied through the exact diagonalization approach without truncation. The simplest models suitable for the study of quantum ILMs might be Heisenberg ferromagnets, either isotropic ferromagnetic chains with both nearest- and next-nearest-neighbor exchange interactions or ferromagnetic chains with nearest-neighbor exchange interaction and easy-axis on-site anisotropy.
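The size argument in the preceding paragraph can be made concrete with a little arithmetic. The sketch below compares the dimension of the full product state space for N spins, (2S+1)^N, with that of N sites each truncated to about 17 phonon states, the figure quoted from Ref. [115]. The function names, the spin values S = 1/2 and S = 1, and the chain lengths are illustrative assumptions, not choices made in the review.

```python
from fractions import Fraction

def spin_states(S) -> int:
    """Number of states 2S+1 of a spin of magnitude S (integer or half-integer)."""
    two_s = Fraction(S) * 2
    if two_s.denominator != 1 or two_s < 0:
        raise ValueError("S must be a non-negative integer or half-integer")
    return int(two_s) + 1

def hilbert_dim(states_per_site: int, n_sites: int) -> int:
    """Dimension of the product state space for n_sites identical sites."""
    return states_per_site ** n_sites

# Approximate number of truncated phonon states per site quoted from Ref. [115].
PHONON_STATES = 17

# Chain lengths chosen only for illustration.
for n in (4, 8, 12):
    print(
        f"N={n:2d}  spin-1/2: {hilbert_dim(spin_states(Fraction(1, 2)), n):>6d}"
        f"  spin-1: {hilbert_dim(spin_states(1), n):>7d}"
        f"  phonons: {hilbert_dim(PHONON_STATES, n):>16d}"
    )
```

Even without exploiting any symmetry, the untruncated spin spaces stay many orders of magnitude smaller than the truncated phonon space at the same N, which is the reason given above for expecting a wider range of chain sizes to be accessible.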
Acknowledgements
We thank J.P. Sethna and R.H. Silsbee for helpful conversations and N.I. Agladze for the production of some of the figures. This work is supported in part by NSF-DMR-9631298, ARO-DAAH04-96-1-0029 and NSF ECS-9612255. Some of this research was conducted using the resources of the Cornell Theory Center, which receives major funding from the National Science Foundation and New York State.
References
[1] G. Leibfried, W. Ludwig, in: F. Seitz, D. Turnbull (Eds.), Solid State Physics, vol. 12, Academic Press, New York, 1961.
[2] A.A. Maradudin, E.W. Montroll, G.H. Weiss, I.P. Ipatova, Theory of Lattice Dynamics in the Harmonic Approximation, vol. 3, 2nd ed., Academic Press, New York, 1971.
[3] M. Sparks, Ferromagnetic Relaxation Theory, McGraw-Hill, New York, 1964, p. 182.
[4] F. Keffer, in: H.P.J. Wijn (Ed.), Handbuch der Physik, vol. XVIII/2, Springer, Berlin, 1966.
[5] R.M. White, Quantum Theory of Magnetism, Springer, Berlin, 1987.
[6] N.W. Ashcroft, N.D. Mermin, Solid State Physics, Saunders College, Philadelphia, 1976.
[7] H. Bilz, D. Strauch, R.K. Wehner, Vibrational infrared and Raman spectra of non-metals, in: L. Genzel (Ed.), Handbuch der Physik, vol. XXV, Pt. 2d, Springer, Berlin, 1984.
[8] P.M. Chaikin, T.C. Lubensky, Principles of Condensed Matter Physics, Ch. 10, Cambridge University Press, Cambridge, 1995.
[9] J.A. Krumhansl, J.R. Schrieffer, Phys. Rev. B 11 (1975) 3535.
[10] G.L. Lamb, Elements of Soliton Theory, Wiley, New York, 1980.
[11] M. Toda, Theory of Nonlinear Lattices, Springer, New York, 1989.
[12] W.P. Su, J.R. Schrieffer, A.J. Heeger, Phys. Rev. Lett. 42 (1979) 1698.
[13] A.S. Davydov, Solitons in Molecular Systems, Reidel, Boston, 1985.
[14] A.M. Kosevich, B.A. Ivanov, A.S. Kovalev, Phys. Rep. 194 (1990) 117.
[15] H.-J. Mikeska, M. Steiner, Adv. Phys. 40 (1991) 191.
[16] G.B. Whitham, Linear and Nonlinear Waves, Wiley, New York, 1974.
[17] A. Newell, Solitons in Mathematics and Physics, SIAM, Philadelphia, 1985.
[18] A.R. Bishop, T. Schneider, in: M. Cardona, P. Fulde, H.-J. Queisser (Eds.), Springer Series in Solid-State Sciences, Springer, Berlin, 1978.
[19] A.R. Bishop, D.K. Campbell, P. Kumar, S.E. Trullinger, in: M. Cardona, P. Fulde, K.v. Klitzing, H.-J. Queisser (Eds.), Springer Series in Solid-State Sciences, vol. 69, Springer, Berlin, 1987.
[20] H.J. Mikeska, J. Phys. C 11 (1978) L29.
[21] J.K. Kjems, M. Steiner, Phys. Rev. Lett. 41 (1978) 1137.
[22] T. Oguchi, T. Ishikawa, J. Phys. Soc. Jpn. 34 (1973) 1486.
[23] I.G. Gochev, Sov. Phys. JETP 34 (1972) 392.
[24] E. Infeld, G. Rowlands, Nonlinear Waves Solitons and Chaos, Cambridge University Press, Cambridge, 1990.
[25] M.J. Ablowitz, J.F. Ladik, J. Math. Phys. 17 (1976) 1011.
[26] A.S. Dolgov, Sov. Phys. Solid State 28 (1986) 907.
[27] A.J. Sievers, S. Takeno, Phys. Rev. Lett. 61 (1988) 970.
[28] J.B. Page, Phys. Rev. B 41 (1990) 7835.
[29] S. Takeno, A.J. Sievers, Solid State Commun. 67 (1988) 1023.
[30] S.A. Kiselev, S.R. Bickham, A.J. Sievers, Comments Cond. Mater. Phys. 17 (1995) 135.
[31] A.J. Sievers, J.B. Page, in: G.K. Horton, A.A. Maradudin (Eds.), Dynamical Properties of Solids, vol. 7, North-Holland, Amsterdam, 1995, p. 137.
[32] S. Aubry, Physica D 103 (1997) 201.
[33] S.R. Bickham, S.A. Kiselev, A.J. Sievers, in: B.D. Bartolo (Ed.), Spectroscopy and Dynamics of Collective Excitations in Solids, vol. 356, Plenum Press, New York, 1997, p. 247.
[34] S. Flach, C.R. Willis, Phys. Rep. 295 (1998) 181.
[35] R. Rajaraman, Solitons and Instantons, North-Holland, Amsterdam, 1982, p. 47.
[36] V.M. Burlakov, S.A. Kiselev, V.N. Pyrkov, Phys. Rev. B 42 (1990) 4921.
[37] D. Bonart, A.P. Mayer, U. Schroder, Phys. Rev. Lett. 75 (1995) 870.
[38] S.A. Kiselev, A.J. Sievers, Phys. Rev. B 55 (1997) 5755.
[39] S. Flach, K. Kladko, R.S. MacKay, Phys. Rev. Lett. 78 (1997) 1207.
[40] S. Takeno, K. Kawasaki, Phys. Rev. B 45 (1992) R5083.
[41] R.F. Wallis, D.L. Mills, A.D. Boardman, Phys. Rev. B 52 (1995) R3828.
[42] S. Flach, K. Kladko, Phys. Rev. B 53 (1996) 11531.
[43] J.L. Ting, M. Peyrard, Phys. Rev. E 53 (1996) 1011.
[44] O.M. Braun, T. Dauxois, M. Peyrard, Phys. Rev. B 56 (1997) 4987.
[45] R. Weber, Phys. Rev. Lett. 21 (1968) 1260.
[46] R. Weber, Z. Phys. 223 (1969) 299.
[47] R.A. Cowley, W.J.L. Buyers, Rev. Mod. Phys. 44 (1972) 406.
[48] D. Donnelly, Phys. Rev. B 52 (1995) 1042.
[49] A. Oseroff, P.S. Pershan, Phys. Rev. Lett. 21 (1968) 1593.
[50] L.F. Johnson, R.E. Dietz, H.J. Guggenheim, Phys. Rev. B 17 (1966) 13.
[51] W.J.L. Buyers, R.A. Cowley, P.M. Holden, R.W.H. Stevenson, J. Appl. Phys. 39 (1968) 1118.
[52] K.C. Johnson, Ph.D. Thesis, Cornell University, 1972.
[53] S.R. Bickham, S.A. Kiselev, A.J. Sievers, Phys. Rev. B 47 (1993) 14206.
[54] S.A. Kiselev, S.R. Bickham, A.J. Sievers, Phys. Rev. B 48 (1993) 13508.
[55] S.A. Kiselev, S.R. Bickham, A.J. Sievers, Phys. Rev. B 50 (1994) 9135.
[56] S.A. Kiselev, A.J. Sievers, Phys. Rev. B 55 (1997) 5755.
[57] R. Lai, A.J. Sievers, Phys. Rev. B 55 (1997) R11937.
[58] R. Lai, S.A. Kiselev, A.J. Sievers, Phys. Rev. B 56 (1997) 5345.
[59] S. Rakhmanova, D.L. Mills, Phys. Rev. B 54 (1996) 9225.
[60] B. Schmid, B. Dorner, D. Petitgrand, L.P. Regnault, M. Steiner, Z. Phys. B 95 (1994) 13.
[61] S. Rakhmanova, D.L. Mills, Phys. Rev. B 58 (1998) 11458.
[62] R. Lai, S.A. Kiselev, A.J. Sievers, Phys. Rev. B 54 (1996) R12665.
[63] A.S. Barker, A.J. Sievers, Rev. Mod. Phys. 47 (1975) S1.
[64] S.A. Kiselev, S.R. Bickham, A.J. Sievers, Phys. Rev. B 50 (1994) 9135.
[65] S.A. Kiselev, R. Lai, A.J. Sievers, Phys. Rev. B 57 (1998) 3402.
[66] J.P. Boyd, Nonlinearity 3 (1990) 177.
[67] J.L. Martin, S. Aubry, Nonlinearity 9 (1996) 1501.
[68] A. Tsurui, Prog. Theoret. Phys. 48 (1972) 1196.
[69] Y.S. Kivshar, M. Peyrard, Phys. Rev. A 46 (1992) 3198.
[70] Y.S. Kivshar, M. Salerno, Phys. Rev. E 49 (1994) 3543.
[71] N. Flytzanis, S. Pnevmatikos, M. Remoissenet, J. Phys. C 18 (1985) 4603.
[72] K.W. Sandusky, J.B. Page, Phys. Rev. B 50 (1994) 866.
[73] J.M. Bilbault, P. Marquie, Phys. Rev. B 53 (1996) 5403.
[74] V.M. Burlakov, S.A. Kiselev, Sov. Phys. JETP 72 (1991) 854.
[75] T. Dauxois, M. Peyrard, Phys. Rev. Lett. 70 (1993) 3935.
[76] S. Takeno, K. Kawasaki, J. Phys. Soc. Jpn. 60 (1991) 1881.
[77] G. Huang, Z. Xu, W. Xu, J. Phys. Soc. Jpn. 62 (1993) 3231.
[78] O.A. Chubykalo, Phys. Lett. A 189 (1994) 403.
[79] W.M. Zhang, D.H. Feng, R. Gilmore, Rev. Mod. Phys. 62 (1991) 867.
[80] K. Yosida, Theory of Magnetism, Springer, New York, 1996, p. 34.
[81] K.W. Sandusky, J.B. Page, K.E. Schmidt, Phys. Rev. B 46 (1992) 6161.
[82] S.R. Bickham, A.J. Sievers, S. Takeno, Phys. Rev. B 45 (1992) 10344.
[83] D. Cai, A.R. Bishop, N. Gronbech-Jensen, Phys. Rev. Lett. 72 (1994) 591.
[84] Y.S. Kivshar, D.K. Campbell, Phys. Rev. E 48 (1993) 3077.
[85] A.H. Morrish, The Physical Principles of Magnetism, Wiley, New York, 1965, p. 623.
[86] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 1980, p. 1039.
[87] H. Segur, M.D. Kruskal, Phys. Rev. Lett. 58 (1987) 747.
[88] K.M. Leung, D.W. Hone, D.L. Mills, P.S. Riseborough, S.E. Trullinger, Phys. Rev. B 21 (1980) 4017.
[89] T.B. Benjamin, J.E. Feir, J. Fluid Mech. 27 (1967) 417.
[90] T. Taniuti, H. Washimi, Phys. Rev. Lett. 21 (1968) 209.
[91] A. Hasegawa, Optical Solitons in Fibers, Springer, New York, 1989.
[92] P. Marquie, J.M. Bilbault, M. Remoissenet, Phys. Rev. E 49 (1994) 829.
[93] K. Tai, A. Tomita, J.L. Jewell, A. Hasegawa, Appl. Phys. Lett. 49 (1986) 236.
[94] I. Daumont, T. Dauxois, M. Peyrard, Nonlinearity 10 (1997) 617.
[95] Y.S. Kivshar, Phys. Rev. E 48 (1993) 4132.
[96] V.M. Burlakov, S.A. Darmanyan, V.N. Pyrkov, Phys. Rev. B 54 (1996) 3257.
[97] E. Infeld, Phys. Rev. Lett. 47 (1981) 717.
[98] G.P. Tsironis, S. Aubry, Phys. Rev. Lett. 77 (1996) 5225.
[99] T. Rössler, J.B. Page, Phys. Rev. Lett. 78 (1997) 1287.
[100] J.P. Kotthaus, V. Jaccarino, Phys. Rev. Lett. 28 (1972) 1649.
[101] R.W. Sanders, R.M. Belanger, M. Motokawa, V. Jaccarino, S.M. Rezende, Phys. Rev. B 23 (1981) 1190.
[102] R. Lai, A.J. Sievers, Phys. Rev. B 57 (1998) 3433.
[103] D.W. Brown, L.J. Bernstein, K. Lindenberg, Phys. Rev. E 54 (1996) 3352.
[104] A.N. Slavin, B.A. Kalinikos, N.G. Kovshikov, in: P.E. Wigen (Ed.), Nonlinear phenomena and chaos in magnetic materials, World Scientific, Singapore, 1994, p. 209.
[105] J. Ohishi, M. Kubota, K. Kawasaki, S. Takeno, Phys. Rev. B 55 (1997) 8812.
[106] R. Lai, A.J. Sievers, Phys. Rev. Lett. 81 (1998) 1937.
[107] R.J. Birgeneau, W.B. Yelon, E. Cohen, J. Makovsky, Phys. Rev. B 5 (1972) 2607.
[108] J. Shi, H. Yamazaki, M. Mino, J. Phys. Soc. Jpn. 57 (1988) 3580.
[109] M. Chikamatsu, M. Tanaka, H. Yamazaki, J. Phys. Soc. Jpn. 50 (1981) 2876.
[110] S. Takeno, M. Peyrard, Physica D 92 (1996) 140.
[111] S. Takeno, M. Peyrard, Phys. Rev. E 55 (1997) 1922.
[112] D. Bonart, J.B. Page, Phys. Rev. Lett. (1998), submitted.
[113] L.M. Floria, J.L. Marin, P.J. Martinez, F. Falo, S. Aubry, Europhys. Lett. 36 (1996) 539.
[114] A.C. Scott, Physica D 78 (1994) 194.
[115] W.Z. Wang, J.T. Gammel, A.R. Bishop, M.I. Salkola, Phys. Rev. Lett. 76 (1996) 3598.
[116] R. Lai, A.J. Sievers, J. Appl. Phys. 81 (1997) 3972.
Physics Reports 314 (1999) 237–574
Simplified models for turbulent diffusion: Theory, numerical modelling, and physical phenomena
Andrew J. Majda*, Peter R. Kramer
New York University, Courant Institute, 251 Mercer Street, New York, NY 10012, USA
Received August 1998; editor: I. Procaccia
* Corresponding author. Tel.: (212) 998-3324; fax: (212) 995-4121; e-mail: [email protected].
0370-1573/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00083-0
Contents
1. Introduction
2. Enhanced diffusion with periodic or short-range correlated velocity fields
2.1. Homogenization theory for spatiotemporal periodic flows
2.2. Effective diffusivity in various periodic flow geometries
2.3. Tracer transport in periodic flows at finite times
2.4. Random flow fields with short-range correlations
3. Anomalous diffusion and renormalization for simple shear models
3.1. Connection between anomalous diffusion and Lagrangian correlations
3.2. Tracer transport in steady, random shear flow with transverse sweep
3.3. Tracer transport in shear flow with random spatio-temporal fluctuations and transverse sweep
3.4. Large-scale effective equations for mean statistics and departures from standard eddy diffusivity theory
3.5. Pair-distance function and fractal dimension of scalar interfaces
4. Passive scalar statistics for turbulent diffusion in rapidly decorrelating velocity field models
4.1. Definition of the rapid decorrelation in time (RDT) model and governing equations
4.2. Evolution of the passive scalar correlation function through an inertial range of scales
4.3. Scaling regimes in spectrum of fluctuations of driven passive scalar field
4.4. Higher-order small-scale statistics of passive scalar field
5. Elementary models for scalar intermittency
5.1. Empirical observations
5.2. An exactly solvable model displaying scalar intermittency
5.3. An example with qualitative finite-time corrections to the homogenized limit
5.4. Other theoretical work concerning scalar intermittency
6. Monte Carlo methods for turbulent diffusion
6.1. General accuracy considerations in Monte Carlo simulations
6.2. Nonhierarchical Monte Carlo methods
6.3. Hierarchical Monte Carlo methods for fractal random fields
6.4. Multidimensional simulations
6.5. Simulation of pair dispersion in the inertial range
7. Approximate closure theories and exactly solvable models
Acknowledgements
References
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237–574
Abstract
Several simple mathematical models for the turbulent diffusion of a passive scalar field are developed here with an emphasis on the symbiotic interaction between rigorous mathematical theory (including exact solutions), physical intuition, and numerical simulations. The homogenization theory for periodic velocity fields and random velocity fields with short-range correlations is presented and utilized to examine subtle ways in which the flow geometry can influence the large-scale effective scalar diffusivity. Various forms of anomalous diffusion are then illustrated in some exactly solvable random velocity field models with long-range correlations similar to those present in fully developed turbulence. Here both random shear layer models with special geometry but general correlation structure as well as isotropic rapidly decorrelating models are emphasized. Some of the issues studied in detail in these models are superdiffusive and subdiffusive transport, pair dispersion, fractal dimensions of scalar interfaces, spectral scaling regimes, small-scale and large-scale scalar intermittency, and qualitative behavior over finite time intervals. Finally, it is demonstrated how exactly solvable models can be applied to test and design numerical simulation strategies and theoretical closure approximations for turbulent diffusion. © 1999 Elsevier Science B.V. All rights reserved.
PACS: 47.27.Qb; 05.40.+j; 47.27.-i; 05.60.+w; 47.27.Eq; 02.70.Lq
1. Introduction
In this review, we consider the problem of describing and understanding the transport of some physical entity, such as heat or particulate matter, which is immersed in a fluid flow. Most of our attention will be on situations in which the fluid is undergoing some disordered or turbulent motion. If the transported quantity does not significantly influence the fluid motion, it is said to be passive, and its concentration density is termed a passive scalar field. Weak heat fluctuations in a fluid, dyes utilized in visualizing turbulent flow patterns, and chemical pollutants dispersing in the environment may all be reasonably modelled as passive scalar systems in which the immersed quantity is transported in two ways: ordinary molecular diffusion and passive advection by its fluid environment.
The general problem of describing turbulent diffusion of a passive quantity may be stated mathematically as follows: Let $\mathbf{v}(\mathbf{x}, t)$ be the velocity field of the fluid prescribed as a function of spatial coordinates $\mathbf{x}$ and time $t$, which we will always take to be incompressible ($\nabla \cdot \mathbf{v}(\mathbf{x}, t) = 0$). Also let $f(\mathbf{x}, t)$ be a prescribed pumping (source and sink) field, and $T_0(\mathbf{x})$ be the passive scalar field prescribed at some initial time $t = 0$. Each may have a mixture of deterministic and random components, the latter modelling noisy fluctuations. In addition, molecular diffusion may be relevant, and is represented by a diffusivity coefficient $\kappa$. The passive scalar field then evolves according to the advection–diffusion equation
$$\frac{\partial T(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) = \kappa \Delta T(\mathbf{x}, t) + f(\mathbf{x}, t), \qquad T(\mathbf{x}, t = 0) = T_0(\mathbf{x}). \tag{1}$$
The central aim is to describe some desired statistics of the passive scalar field $T(\mathbf{x}, t)$ at times $t > 0$. For example, a typical goal is to obtain effective equations of motion for the mean passive scalar density, denoted $\langle T(\mathbf{x}, t) \rangle$. While the PDE in Eq. (1) is linear, the relation between the passive scalar field $T(\mathbf{x}, t)$ and the velocity field $\mathbf{v}(\mathbf{x}, t)$ is nonlinear.
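The distinction drawn in the last sentence, linear in the scalar but nonlinear in the velocity, can be checked numerically on a small example. The following is a minimal sketch, not a method taken from this review: it assumes a doubly periodic square, a steady shear flow with velocity (0, U sin x) (divergence-free by construction), no pumping (f = 0), and a plain forward-Euler, centered-difference discretization. The grid size, time step, diffusivity and initial data are arbitrary illustrative choices.

```python
import numpy as np

def advect_diffuse(T0, U, kappa, dt, n_steps):
    """Forward-Euler, centered-difference solve of
    dT/dt + v . grad T = kappa * Laplacian(T) on a periodic [0, 2*pi)^2 grid,
    for the steady shear flow v = (0, U*sin(x)) and no source term.
    Axis 0 of the array is x, axis 1 is y."""
    n = T0.shape[0]
    h = 2.0 * np.pi / n
    x = np.arange(n) * h
    vy = U * np.sin(x)[:, None]  # y-velocity depends on x only, so div v = 0
    T = T0.copy()
    for _ in range(n_steps):
        dTdy = (np.roll(T, -1, axis=1) - np.roll(T, 1, axis=1)) / (2.0 * h)
        lap = (np.roll(T, -1, axis=0) + np.roll(T, 1, axis=0)
               + np.roll(T, -1, axis=1) + np.roll(T, 1, axis=1) - 4.0 * T) / h**2
        T = T + dt * (-vy * dTdy + kappa * lap)
    return T

# Illustrative (assumed) grid, parameters and initial data.
n = 64
x = np.arange(n) * 2.0 * np.pi / n
X, Y = np.meshgrid(x, x, indexing="ij")
T0_a = np.cos(Y)
T0_b = np.sin(X) * np.cos(2.0 * Y)
params = dict(kappa=0.1, dt=0.005, n_steps=200)

# Linear in the scalar: superposition of initial data holds to round-off.
sum_of_solutions = (advect_diffuse(T0_a, U=1.0, **params)
                    + advect_diffuse(T0_b, U=1.0, **params))
solution_of_sum = advect_diffuse(T0_a + T0_b, U=1.0, **params)
superposition_error = np.max(np.abs(solution_of_sum - sum_of_solutions))

# Not linear in the velocity: doubling U does not double the change in T.
T_rest = advect_diffuse(T0_a, U=0.0, **params)
T_U1 = advect_diffuse(T0_a, U=1.0, **params)
T_U2 = advect_diffuse(T0_a, U=2.0, **params)
velocity_nonlinearity = np.max(np.abs((T_U2 - T_rest) - 2.0 * (T_U1 - T_rest)))

print("superposition error:", superposition_error)
print("departure from linearity in U:", velocity_nonlinearity)
```

The first printed number should sit at round-off level, while the second is of order one. The explicit scheme is used only for brevity; it is stable for the parameters above but would need a smaller time step on a finer grid or with a smaller diffusivity.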
The influence of the statistics of the random velocity field on the passive scalar field is subtle and very difficult to analyze in general. For example, a closed equation for $\langle T(\mathbf{x}, t) \rangle$ typically cannot be obtained by simply averaging the equation in Eq. (1), because $\langle \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) \rangle$ cannot be simply related to an explicit functional of $\langle T(\mathbf{x}, t) \rangle$ in general. This is a manifestation of the "turbulence moment closure problem" [227]. In applications such as the predicting of temperature profiles in high Reynolds number turbulence [196,227,247,248], the tracking of pollutants in the atmosphere [78], and the estimating of the transport of groundwater through a heterogeneous porous medium [79], the problem is further complicated by the presence of a wide range of excited space and time scales in the velocity field, extending all the way up to the scale of observational interest. It is precisely for these kinds of problems, however, that a simplified effective description of the evolution of statistical quantities such as the mean passive scalar density $\langle T(\mathbf{x}, t) \rangle$ is extremely desirable, because the range of active scales of velocity fields which can be resolved is strongly limited even on supercomputers [154]. For some purposes, one may be interested in following the progress of a specially marked particle as it is carried by a flow. Often this particle is light and small enough so that its presence
only negligibly disrupts the existing flow pattern, and we will generally refer to such a particle as a (passive) tracer, reflecting the terminology of experimental science in which fluid motion is visualized through the motion of injected, passively advected particles (often optically active dyes) [227]. The problem of describing the statistical transport of tracers may be formulated as follows: Let $\mathbf{v}(\mathbf{x}, t)$ be a prescribed, incompressible velocity field of the fluid, with possibly both a mean component and a random component with prescribed statistics modelling turbulent or other disordered fluctuations. We seek to describe some desired statistics of the trajectory $\mathbf{X}(t)$ of a tracer particle released initially from some point $\mathbf{x}_0$ and subsequently transported jointly by the flow $\mathbf{v}(\mathbf{x}, t)$ and molecular diffusivity $\kappa$. The equation of motion for the trajectory is a (vector-valued) stochastic differential equation [112,257]
$$d\mathbf{X}(t) = \mathbf{v}(\mathbf{X}(t), t)\, dt + \sqrt{2\kappa}\, d\mathbf{W}(t), \tag{2a}$$
$$\mathbf{X}(t = 0) = \mathbf{x}_0. \tag{2b}$$
The second term in Eq. (2a) is a random increment due to Brownian motion [112,257]. Basic statistical functions of interest are the mean trajectory, $\langle \mathbf{X}(t) \rangle$, and the mean-square displacement of a tracer from its initial location, $\langle |\mathbf{X}(t) - \mathbf{x}_0|^2 \rangle$. It is often of interest to track multiple particles simultaneously; these will each individually obey the trajectory equations in Eqs. (2a) and (2b) with the same realization of the velocity field $\mathbf{v}$ but independent Brownian motions. The advection–diffusion PDE in Eq. (1) and the tracer trajectory equations in Eqs. (2a) and (2b) are related to each other by the theory of Ito diffusion processes [107,257], which is just a generalization of the method of characteristics [150] to handle second-order derivatives via a random noise term in the characteristic equations. We will work with both of these equations in this review. In principle, the turbulent velocity field $\mathbf{v}(\mathbf{x}, t)$ which advects the passive scalar field should be a solution to the Navier–Stokes equations
$$\frac{\partial \mathbf{v}(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla \mathbf{v}(\mathbf{x}, t) = -\nabla p(\mathbf{x}, t) + \nu \Delta \mathbf{v}(\mathbf{x}, t) + \mathbf{F}(\mathbf{x}, t), \qquad \nabla \cdot \mathbf{v}(\mathbf{x}, t) = 0, \tag{3}$$
where $p$ is the pressure field, $\nu$ is viscosity, and $\mathbf{F}(\mathbf{x}, t)$ is some external stirring which maintains the fluid in a turbulent state. But analytical representations of such solutions for complex, especially turbulent, flows are typically unwieldy or unknown. We shall therefore instead utilize simplified velocity field models which exhibit some empirical features of turbulent or other flows, though these models may not be actual solutions to the Navier–Stokes equations. Incompressibility, $\nabla \cdot \mathbf{v}(\mathbf{x}, t) = 0$, is, however, enforced in all of our velocity field models.

Our primary aim in working with simplified models is to obtain mathematically explicit and unambiguous results which can be used as a sound basis for the scientific investigation of more complex turbulent diffusion problems arising in applications for which no analytical solution is available. We therefore emphasize the aspects of the model results which illustrate general physical mechanisms and themes which can be expected to be manifest in wide classes of turbulent flows. We will also show how simplified models can be used to strengthen and refine the
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237–574
arsenal of numerical methods designed for quantitative physical exploration in natural and practical applications. First of all, simplified models offer themselves as a pool of test problems to assess the variety of numerical simulation schemes proposed for turbulent diffusion [109,180,190,219,291,335]. Moreover, we shall explicitly describe in Section 6 how mathematical (harmonic) analysis of simplified models can be used as a basis to design new numerical simulation algorithms with superior performance [82,84–86]. Accurate and reliable numerical simulations in turn enrich various mathematical asymptotic theories by furnishing explicit data concerning the quality of the asymptotic approximation and the significance of corrections at finite values of the small or large parameter, and can reveal new physical phenomena in strongly nonlinear situations unamenable to a purely theoretical treatment. Physical intuition, for its part, suggests fruitful mathematical model problems for investigation, guides their analyses, and informs the development of numerical strategies. We will repeatedly appeal to this symbiotic interaction between simplified mathematical models, asymptotic theory, physical understanding, and numerical simulation.

Though we do not dwell on this aspect in this review, we wish to mention the more distant goal of using simplified velocity field models in turbulent diffusion to gain some understanding in the theoretical analysis and practical treatment of the Navier–Stokes equations in Eq. (3) in situations where strong driving gives rise to complicated turbulent motion [196,227]. The advection–diffusion equation in Eq. (1) has some essential features in common with the Navier–Stokes equations: they are both transport equations in which the advection term gives rise to a nonlinearity of the statistics of the solution.
At the same time, the advection–diffusion equation is more manageable since it is a scalar, linear PDE without an auxiliary constraint analogous to incompressibility. The advection–diffusion equation, in conjunction with a velocity field model with turbulent characteristics, therefore serves as a simplified prototype problem for developing theories for turbulence itself.

Our study of passive scalar advection–diffusion begins in Section 2 with velocity fields which have either a periodic cell structure or random fluctuations with only mild short-range spatial correlations. We explain the general homogenization theory [12,32,148] which describes the behavior of the passive scalar field at large scales and long times in these flows via an enhanced "homogenized" diffusivity matrix. Through mathematical theory, exact results from simplified models, and numerical simulations, we examine how the homogenized diffusion coefficient depends on the flow structure, and investigate how well the observation of the passive scalar system at large but finite space–time scales agrees with the homogenized description.

In Section 3, we use simple random shear flow models [10,14] with a flexible statistical spatio-temporal structure to demonstrate explicitly a number of anomalies of turbulent diffusion when the velocity field has sufficiently strong long-range correlations. These simple shear flow models are also used to explore turbulent diffusion in situations where the velocity field has a wide inertial range of spatio-temporal scales excited in a statistically self-similar manner, as in a high Reynolds number turbulent flow. We also describe some universal small-scale features of the passive scalar field which may be derived in an exact and rigorous fashion in such flows. Other aspects of small-scale passive scalar fluctuations are similarly addressed in Section 4 using a complementary velocity field model [152,179] with a statistically isotropic geometry but very rapid decorrelations in time.
In Section 5, we present a special family of exactly solvable shear flow models [207,233] which explicitly demonstrates the phenomenon of large-scale intermittency in the statistics of the passive scalar field, by which we mean the occurrence of a broader-than-Gaussian distribution for the value of the passive scalar field $T(\mathbf{x}, t)$ recorded at a single location in a turbulent flow [155,127,146,147,191].

Next, in Section 6, we focus on the challenge of developing efficient and accurate numerical "Monte Carlo" methods for simulating the motion of tracers in turbulent flows. Using the simple shear models from Section 3 and other mathematical analysis [83,87,140], we illustrate explicitly some subtle and significant pitfalls of some conventional numerical approaches. We then discuss the theoretical basis and demonstrate the exceptional practical performance of a recent wavelet-based Monte Carlo algorithm [82,84–86] which is designed to handle an extremely wide inertial range of self-similar scales in the velocity field.

We conclude in Section 7 with a brief discussion of the application of exactly solvable models to assess approximate closure theories [177,182,196,200,227,285,286,344] which have been formulated to describe the evolution of the mean passive scalar density in a high Reynolds number turbulent flow [13,17]. Detailed introductions to all these topics are presented at the beginning of the respective sections.
2. Enhanced diffusion with periodic or short-range correlated velocity fields

In the introduction, we mentioned the moment closure problem for obtaining statistics of the passive scalar field immersed in a turbulent fluid. To make this issue concrete, consider the challenge of deriving an equation for the mean passive scalar density $\langle T(\mathbf{x}, t) \rangle$ advected by a velocity field which is a superposition of a mean flow pattern $\mathbf{V}(\mathbf{x}, t)$ and random, turbulent fluctuations $\mathbf{v}(\mathbf{x}, t)$ with mean zero. Angle brackets will denote an ensemble average of the included quantity over the statistics of the random velocity field. Since the advection–diffusion equation is linear, one might naturally seek an equation for the mean passive scalar density by simply averaging it:

$$\partial \langle T(\mathbf{x}, t) \rangle/\partial t + \mathbf{V}(\mathbf{x}, t) \cdot \nabla \langle T(\mathbf{x}, t) \rangle + \langle \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) \rangle = \kappa \Delta \langle T(\mathbf{x}, t) \rangle + \langle f(\mathbf{x}, t) \rangle,$$

$$\langle T(\mathbf{x}, t = 0) \rangle = \langle T_0(\mathbf{x}) \rangle. \tag{4}$$

Eq. (4) is not a closed equation for $\langle T(\mathbf{x}, t) \rangle$ because the average of the advective term, $\langle \mathbf{v} \cdot \nabla T \rangle$, cannot generally be simply related to a functional of $\langle T(\mathbf{x}, t) \rangle$. An early idea for circumventing this obstacle was to represent the effect of the random advection by a diffusion term:

$$\langle \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) \rangle = -\nabla \cdot (\bar{K}_T \cdot \nabla \langle T(\mathbf{x}, t) \rangle), \tag{5}$$

where $\bar{K}_T$ is some constant "eddy diffusivity" matrix (usually a scalar multiple of the identity matrix $I$) which is to be estimated in some manner, such as mixing-length theory ([320], Section 2.4). From assumption (5) follows a simple effective advection–diffusion equation for the mean passive scalar density

$$\partial \langle T(\mathbf{x}, t) \rangle/\partial t + \mathbf{V}(\mathbf{x}, t) \cdot \nabla \langle T(\mathbf{x}, t) \rangle = \nabla \cdot ((\kappa I + \bar{K}_T) \cdot \nabla \langle T(\mathbf{x}, t) \rangle) + \langle f(\mathbf{x}, t) \rangle,$$

$$\langle T(\mathbf{x}, t = 0) \rangle = \langle T_0(\mathbf{x}) \rangle,$$
where the diffusivity matrix $(\kappa I + \bar{K}_T)$ is (presumably) enhanced over its bare molecular value by the turbulent eddy diffusivity $\bar{K}_T$ coming from the fluctuations of the velocity. The closure hypothesis (5) is the "Reynolds analogy" of a suggestion first made by Prandtl in the context of the Navier–Stokes equations (see [227], Section 13.1). It may be viewed as an extension of kinetic theory, where microscopic particle motion produces ordinary diffusive effects on the macroscale.

There are, however, some serious deficiencies of the Prandtl eddy diffusivity hypothesis, both in terms of theoretical justification and of practical application to general turbulent flows (see [227], Section 13.1; [320], Ch. 2). First of all, kinetic theory requires a strong separation between the microscale and macroscale, but the turbulent fluctuations typically extend up to the scale at which the mean passive scalar density is varying. Moreover, the recipes for computing the eddy diffusivity $\bar{K}_T$ are rather vague, and are generally only defined up to some unknown numerical constant "of order unity". More sophisticated schemes for computing eddy viscosities based on renormalization group ideas have been proposed in more recent years [243,300,344], but these involve other ad hoc assumptions of questionable validity.

In Section 2, we will discuss some contexts in which rigorous sense can be made of the eddy diffusivity hypothesis (5), and an exact formula provided for the enhanced diffusivity. All involve the fundamental assumption that, in some sense, the fluctuations of the velocity field occur on much smaller scales than those of the mean passive scalar field. These rigorous theories therefore are not applicable to strongly turbulent flows, but they provide a solid, instructive, and relatively simple framework for examining a number of subtle aspects of passive scalar advection–diffusion in unambiguous detail.
Moreover, they can be useful in practice for certain types of laboratory or natural flows at moderate or low Reynolds numbers [301,302].

Overview of Section 2: We begin in Section 2.1 with a study of advection–diffusion by velocity fields that are deterministic and periodic in space and time. Generally, we will be considering passive scalar fields which are varying on scales much larger than those of the periodic velocity field in which they are immersed. Though the velocity field is deterministic, one may formally view the periodic fluctuations as an extremely simplified model for small-scale turbulent fluctuations. Averaging over the fluctuations may be represented by spatial averaging over a period cell. After a convenient nondimensionalization in Section 2.1.1, we formulate in Sections 2.1.2 and 2.1.3 the homogenization theory [32,149] which provides an asymptotically exact representation of the effects of the small-scale periodic velocity field on the large-scale passive scalar field in terms of a homogenized, effective diffusivity matrix $K^*$ which is enhanced above bare molecular diffusion. Various alternative ways of computing this effective diffusivity matrix are presented in Section 2.1.4. We remark that, in contrast to usual eddy diffusivity models, the enhanced diffusivity in the rigorous homogenization theory has a highly nontrivial dependence on molecular diffusivity. We will express this dependence in terms of the Péclet number, which is a measure of the strength of advection by the velocity field relative to diffusion by molecular processes (see Section 2.1.1). The physically important limit of high Péclet number will be of central interest throughout Section 2.

In Section 2.2, we apply the homogenization theory to evaluate the tracer transport in a variety of periodic flows.
We demonstrate the symbiotic interplay between the rigorous asymptotic theories and numerical computations in these investigations, and how they can reveal some important and subtle physical transport mechanisms. We first examine periodic shear flows with various types of cross sweeps (Sections 2.2.1 and 2.2.2), where exact analytical formulas can be derived. Next we turn to flows with a cellular structure and their perturbations (Section 2.2.3), and the subtle effects which the addition of a mean sweep can produce (Section 2.2.4). We discuss how other types of periodic flows can be profitably examined through the joint use of analytical and numerical means in Section 2.2.5.

An important practical issue is the accuracy with which the effective diffusivity from homogenization theory describes the evolution of the passive scalar field at finite times. We examine this question in Section 2.3 by computing the mean-square displacement of a tracer over a finite interval of time. For shear flows with cross sweeps, an exact analytical expression can be obtained (Section 2.3.1). The finite time behavior of tracers in more general periodic flows may be estimated numerically through Monte Carlo simulations (Section 2.3.2). In all examples considered, the rate of change of the mean-square tracer displacement is well described by (twice) the homogenized diffusivity after a transient time interval which is not longer than the time it would take molecular diffusion to spread over a few spatial period cells [230,231].

In Section 2.4, we begin our discussion of advection–diffusion by homogeneous random velocity fields. We identify two different large-scale, long-time asymptotic limits in which a closed effective diffusion equation can be derived for the mean passive scalar density $\langle T(\mathbf{x}, t) \rangle$. First is the "Kubo theory" [160,188,313], where the time scale of the velocity field varies much more rapidly than that of the passive scalar field, but the length scales of the two fields are comparable (Section 2.4.1). The "Kubo diffusivity" appearing in the effective equation is simply related to the correlation function of the velocity field. Next we concentrate on steady random velocity fields which have only short-range spatial correlations, so that there can be a meaningfully strong separation of scales between the passive scalar field and the velocity field.
A homogenization theorem applies in such cases [12,98,256], and rigorously describes the effect of the small-scale random velocity field on the large-scale mean passive scalar field through a homogenized, effective diffusivity matrix (Section 2.4.2). Homogenization for the steady periodic flow fields described in the earlier Sections 2.1, 2.2 and 2.3 is a special case of this more general theory for random fields. We present various formulas for the homogenized diffusivity in Section 2.4.3, and discuss its parametric behavior in some example random vortex flows in Section 2.4.4.

We emphasize again that high Reynolds number turbulent flows have strong long-range correlations which do not fall under the purview of the homogenization theory discussed in Section 2. The ramifications of these long-range correlations will be one of the main foci in the remaining sections of this review.

2.1. Homogenization theory for spatio-temporal periodic flows

Here we present the rigorous homogenization theory which provides a formula for the effective diffusion of a passive scalar field at large scales and long times due to the combined effects of molecular diffusion and advection by a periodic velocity field. We first prepare for our discussion with some definitions and a useful nondimensionalization in Section 2.1.1. Next, in Section 2.1.2, we state the formula prescribed by homogenization theory for the effective diffusivity of the passive scalar field on large scales and long times, and show formally how to derive it through a multiple scale asymptotic analysis [32,205]. We indicate in Section 2.1.3 how to generalize the homogenization theory to include large-scale mean flows superposed upon the periodic flow structure [38,230]. In Section 2.1.4, we describe some alternative formulas for the effective diffusivity, involving Stieltjes measures [9,12,20] and variational principles [12,97]. These representations can
be exploited to bound and estimate the effective diffusivity in various examples and classes of periodic flows [40,97,210], as we shall illustrate in Section 2.2.

2.1.1. Nondimensionalization

We begin our discussion of convection-enhanced diffusivity with smooth periodic velocity fields $\mathbf{v}(\mathbf{x}, t)$ defined on $\mathbb{R}^d$ which have temporal period $t_v$ and a common spatial period $L_v$ along each of the coordinate axes:

$$\mathbf{v}(\mathbf{x}, t + t_v) = \mathbf{v}(\mathbf{x}, t),$$

$$\mathbf{v}(\mathbf{x} + L_v \hat{\mathbf{e}}_j, t) = \mathbf{v}(\mathbf{x}, t),$$

where $\{\hat{\mathbf{e}}_j\}_{j=1}^{d}$ denotes a unit vector in the $j$th coordinate direction. More general periodic velocity fields can be treated similarly; the resulting formulas would simply have some additional notational complexity. We also demand for the moment that the velocity field have "mean zero", in that its average over space and time vanishes:

$$L_v^{-d}\, t_v^{-1} \int_0^{t_v} \int_{[0, L_v]^d} \mathbf{v}(\mathbf{x}, t)\, d\mathbf{x}\, dt = 0.$$
In Section 2.1.3, we will extend our discussion to include the possibility of a large-scale mean flow superposed upon the periodic velocity field just described.

It will be useful to nondimensionalize space and time so that the dependence of the effective diffusivity on the various physical parameters of the problem can be most concisely described. The spatial period $L_v$ provides a natural reference length unit. To illuminate the extent to which the periodic velocity field enhances the diffusivity of the passive scalar field above the bare molecular value $\kappa$, we choose as a basic time unit the cell-diffusion time $t_\kappa = L_v^2/\kappa$, which describes the time scale over which a finely concentrated spot of the passive scalar field will spread over a spatial period cell. This will render the molecular diffusivity to be exactly 1 in nondimensional units. The velocity field is naturally nondimensionalized as follows:

$$\mathbf{v}(\mathbf{x}, t) = v_0\, \mathbf{v}^{\circ}(\mathbf{x}/L_v, t/t_v),$$

where $\mathbf{v}^{\circ}$ is a nondimensional function with period 1 in time and in each spatial coordinate direction, and $v_0$ is some constant with dimension of velocity which measures the magnitude of the velocity field. The precise definition of $v_0$ is not important; it may be chosen as the maximum of $|\mathbf{v}(\mathbf{x}, t)|$ over a space–time period, for example.

The initial passive scalar density $T_0(\mathbf{x})$ will be assumed to be characterized by some total "mass"

$$M_0 = \int_{\mathbb{R}^d} T_0(\mathbf{x})\, d\mathbf{x}$$

and length scale $L_T$:

$$T_0(\mathbf{x}) = \frac{M_0}{L_T^d}\, T_0^{\circ}(\mathbf{x}/L_T).$$
We choose $M_0$ as a reference unit for the dimension characterizing the passive scalar quantity (which may, for example, be heat or mass of some contaminant), and we nondimensionalize accordingly the passive scalar density at all times:

$$T(\mathbf{x}, t) = \frac{M_0}{L_v^d}\, T^{\circ}(\mathbf{x}/L_v, t/t_\kappa).$$

Passing now to nondimensional units $\mathbf{x}^{\circ} = \mathbf{x}/L_v$, $t^{\circ} = t/t_\kappa$ in the advection–diffusion equation, and subsequently dropping the superscripts $\circ$ on all nondimensional functions, we obtain the following advection–diffusion equation:

$$\frac{\partial T(\mathbf{x}, t)}{\partial t} + \frac{v_0 L_v}{\kappa}\, \mathbf{v}(\mathbf{x}, t\,(L_v^2/\kappa t_v)) \cdot \nabla T(\mathbf{x}, t) = \Delta T(\mathbf{x}, t),$$

$$T(\mathbf{x}, t = 0) = (L_v/L_T)^d\, T_0(\mathbf{x}\,(L_v/L_T)). \tag{6}$$

We now identify several key nondimensional parameters which appear in this equation. The first is the Péclet number

$$\mathrm{Pe} \equiv v_0 L_v/\kappa, \tag{7}$$

which formally describes the ratio between the magnitudes of the advection and diffusion terms [325]. It plays a role for the passive scalar advection–diffusion equation similar to the Reynolds number for the Navier–Stokes equations. Next, we have the parameter

$$\tau_v = \kappa t_v/L_v^2,$$

which is the ratio of the temporal period of the velocity field to the cell-diffusion time. Thirdly, we have the ratio of the length scale of the velocity field to the length scale of the initial data, which we simply denote

$$\delta \equiv L_v/L_T. \tag{8}$$

Rewriting Eq. (6) in terms of these newly defined nondimensional parameters, we obtain the final nondimensionalized form of the advection–diffusion equation which we will use throughout Section 2:

$$\partial T(\mathbf{x}, t)/\partial t + \mathrm{Pe}\, \mathbf{v}(\mathbf{x}, t/\tau_v) \cdot \nabla T(\mathbf{x}, t) = \Delta T(\mathbf{x}, t),$$

$$T(\mathbf{x}, t = 0) = \delta^d\, T_0(\delta \mathbf{x}). \tag{9}$$

Notice especially how the Péclet number describes, formally, the extent to which the advection–diffusion equation differs from a pure diffusion equation.

We note that the nondimensional velocity field $\mathbf{v}(\mathbf{x}, t/\tau_v)$ has period 1 in each spatial coordinate direction and temporal period $\tau_v$. It will be convenient in what follows to define a concise notation for averaging a function $g$ over a spatio-temporal period:

$$\langle g \rangle_p \equiv \tau_v^{-1} \int_0^{\tau_v} \int_{[0,1]^d} g(\mathbf{x}, t)\, d\mathbf{x}\, dt.$$
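As a concrete illustration of the three nondimensional groups defined in this subsection, the following minimal Python sketch evaluates the Péclet number of Eq. (7), the period ratio $\tau_v$, and the scale ratio $\delta$ of Eq. (8) from dimensional inputs. The function name, the container class, and the numerical values are illustrative assumptions and do not come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NondimParams:
    peclet: float  # Pe = v0 * L_v / kappa, Eq. (7)
    tau_v: float   # tau_v = kappa * t_v / L_v**2 (period / cell-diffusion time)
    delta: float   # delta = L_v / L_T, Eq. (8)


def nondimensionalize(v0, L_v, t_v, L_T, kappa):
    """Return Pe, tau_v and delta for dimensional inputs in consistent units."""
    if min(v0, L_v, t_v, L_T, kappa) <= 0:
        raise ValueError("all inputs must be positive")
    t_kappa = L_v**2 / kappa  # cell-diffusion time, the basic time unit
    return NondimParams(
        peclet=v0 * L_v / kappa,
        tau_v=t_v / t_kappa,
        delta=L_v / L_T,
    )


# Hypothetical values: 1 cm cells, 1 mm/s flow speed, 10 s flow period,
# 1 m initial patch, molecular diffusivity 1e-9 m^2/s.
params = nondimensionalize(v0=1e-3, L_v=1e-2, t_v=10.0, L_T=1.0, kappa=1e-9)
print(params)
```

With these made-up inputs the Péclet number is large while $\tau_v$ and $\delta$ are small, which is the regime of strong advection and strong scale separation considered in the remainder of Section 2.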
2.1.2. Homogenization theory for periodic flows with zero mean

We now seek to describe the evolution of the passive scalar field on length scales and time scales large compared to those of the periodic velocity field. It is natural in this regard to take the initial length scale ratio $\delta$ between the velocity and passive scalar length scales to be very small. From experience with kinetic theory in which microscopic collision processes give rise to ordinary diffusive transport on macroscales, we can expect that the joint action of the mean zero velocity field and molecular diffusion will give rise to a net diffusion on the large scales. We therefore rescale time with space according to the standard diffusive relation $\mathbf{x} \to \mathbf{x}/\delta$, $t \to t/\delta^2$, and the passive scalar density according to

$$T^{(\delta)}(\mathbf{x}, t) \equiv \delta^{-d}\, T(\mathbf{x}/\delta, t/\delta^2). \tag{10}$$
The amplitude rescaling preserves the total mass of the passive scalar quantity. We note that the choice of diffusive rescaling is appropriate here only because of the strong separation of scales between the velocity field and the passive scalar field; when this scale separation fails to hold, other "anomalous" space–time scaling laws may be required (see Section 3.4). The rescaled form of the advection–diffusion equation (9) reads

$$\partial T^{(\delta)}(\mathbf{x}, t)/\partial t + \delta^{-1}\, \mathrm{Pe}\, \mathbf{v}\!\left(\frac{\mathbf{x}}{\delta}, \frac{t}{\delta^2 \tau_v}\right) \cdot \nabla T^{(\delta)}(\mathbf{x}, t) = \Delta T^{(\delta)}(\mathbf{x}, t),$$

$$T^{(\delta)}(\mathbf{x}, t = 0) = T_0(\mathbf{x}). \tag{11}$$

On these large space–time scales ($\delta \ll 1$), the advection by the velocity field has a large magnitude ($O(\delta^{-1})$) and is rapidly oscillating in space and/or time. Because the velocity field has mean zero, the strong and rapidly fluctuating advection term has a finite diffusive influence on $T^{(\delta)}(\mathbf{x}, t)$ in the $\delta \to 0$ limit, i.e. on large scales and long times. This is the content of the homogenization theory for advection–diffusion in a periodic flow, which we now state [205].

2.1.2.1. Homogenized effective diffusion equation for periodic velocity fields.

In the long time, large-scale limit, the rescaled passive scalar field converges to a finite limit

$$\lim_{\delta \to 0} T^{(\delta)}(\mathbf{x}, t) = \bar{T}(\mathbf{x}, t), \tag{12}$$

which satisfies an effective diffusion equation

$$\partial \bar{T}(\mathbf{x}, t)/\partial t = \nabla \cdot (K^* \nabla \bar{T}(\mathbf{x}, t)), \tag{13a}$$

$$\bar{T}(\mathbf{x}, t = 0) = T_0(\mathbf{x}), \tag{13b}$$

with constant, positive definite, symmetric effective diffusivity matrix $K^*$. This effective diffusivity matrix can be expressed as

$$K^* = I + \bar{K},$$

where $I$ is the identity matrix (representing the nondimensionalized molecular diffusion) and $\bar{K}$ is a nonnegative-definite enhanced diffusivity matrix which represents the additional diffusivity due to
the periodic flow. The enhanced diffusivity matrix $\bar{K}$ can be computed as follows. Let $\boldsymbol{\chi}(\mathbf{x}, t)$ be the (unique) mean zero, periodic solution to the following auxiliary parabolic cell problem (in the unscaled nondimensional space–time coordinates):

$$\partial \boldsymbol{\chi}(\mathbf{x}, t)/\partial t + \mathrm{Pe}\, \mathbf{v}(\mathbf{x}, t/\tau_v) \cdot \nabla \boldsymbol{\chi}(\mathbf{x}, t) - \Delta \boldsymbol{\chi}(\mathbf{x}, t) = -\mathrm{Pe}\, \mathbf{v}(\mathbf{x}, t/\tau_v). \tag{14}$$

Then the components of the enhanced diffusivity matrix may be expressed as

$$\bar{K}_{ij} = \langle \nabla \chi_i \cdot \nabla \chi_j \rangle_p. \tag{15}$$

For the special case of a steady, periodic velocity field, the cell problem (14) becomes elliptic, again with a unique mean zero, periodic solution:

$$\Delta \chi_j(\mathbf{x}) - \mathrm{Pe}\, \mathbf{v}(\mathbf{x}) \cdot \nabla \chi_j(\mathbf{x}) = \mathrm{Pe}\, v_j(\mathbf{x}). \tag{16}$$

The convergence (12) of the passive scalar field rescaled on large scales and long times to the solution of the effective diffusion equation (13) can be rigorously established in the following sense:

$$\lim_{\delta \to 0}\ \sup_{0 \le t \le t_0}\ \sup_{\mathbf{x} \in \mathbb{R}^d} |T^{(\delta)}(\mathbf{x}, t) - \bar{T}(\mathbf{x}, t)| = 0$$

for every finite $t_0 > 0$, provided that $T_0$ and $\mathbf{v}$ obey some mild smoothness and boundedness conditions [205].

We will sketch the derivation of the above results in a moment, but first we make a few remarks on the nature of the equation and the effective diffusivity matrix. The effective large-scale, long-time equation (13) is often called a "homogenized" equation because the effects of the advection by the relatively small-scale (heterogeneous) velocity field fluctuations (along with molecular diffusion) have been replaced by an overall effective diffusivity matrix $K^*$ which is a constant "bulk" property of the fluid medium. Note that this homogenized diffusivity need not simply be a scalar multiple of the identity; anisotropies in the periodic flow can definitely influence the large scales. The homogenization procedure was first developed for problems such as heat conduction in a medium with periodic, fine-scale spatial fluctuations in conductivity (see for example [32]), and was adapted to advection–diffusion problems in [229,263].

We emphasize that the effective diffusivity is truly enhanced over the (nondimensionalized) bare molecular diffusion because $\bar{K}$ is evidently a nonnegative-definite, symmetric matrix. The enhanced diffusivity matrix $\bar{K}$ is always nontrivial when the flow has nonvanishing spatial gradients, and it depends, in our nondimensional units, on both the Péclet number and the temporal period $\tau_v$.
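To make the cell problem concrete, consider a two-dimensional steady shear flow $\mathbf{v} = (0, w(x_1))$, where $w$ has period 1 and mean zero; this particular flow and the names in the code are assumptions chosen for illustration. Because $\chi_2$ then depends on $x_1$ only, the advective term in Eq. (16) drops out and the cell problem reduces to the ODE $\chi_2'' = \mathrm{Pe}\, w$, while $\chi_1 = 0$. The sketch below integrates this ODE once with the trapezoidal rule and evaluates $\bar{K}_{22} = \langle (\chi_2')^2 \rangle_p$ from Eq. (15). For $w(x_1) = \sin(2\pi x_1)$, integrating by hand gives $\bar{K}_{22} = \mathrm{Pe}^2/(8\pi^2)$, which serves as a check.

```python
import math


def enhanced_diffusivity_shear(w, peclet, n=2000):
    """K_bar_22 = <(d chi_2/dx)^2> for the steady shear flow v = (0, w(x)).

    Assumes w has period 1 and mean zero, so that the steady cell problem
    reduces to chi_2''(x) = Pe * w(x) with periodic chi_2.
    """
    h = 1.0 / n
    samples = [w(i * h) for i in range(n + 1)]
    # g(x) = Pe * integral_0^x w(s) ds, by the cumulative trapezoidal rule
    g = [0.0]
    for i in range(n):
        g.append(g[-1] + 0.5 * h * peclet * (samples[i] + samples[i + 1]))
    # chi_2' = g - mean(g): periodicity of chi_2 fixes the integration constant
    mean_g = sum(g[:-1]) * h
    return sum((gi - mean_g) ** 2 for gi in g[:-1]) * h


peclet = 10.0
numeric = enhanced_diffusivity_shear(lambda x: math.sin(2 * math.pi * x), peclet)
exact = peclet**2 / (8 * math.pi**2)  # hand-computed value for w = sin(2 pi x)
print(numeric, exact)
```

In this example the enhancement grows like $\mathrm{Pe}^2$, a simple instance of the nontrivial dependence of the homogenized diffusivity on molecular diffusion; other flows generally require a genuinely multidimensional solve of Eq. (16).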
Of particular interest is its behavior at large Péclet number, and we develop some precise results along these lines in Paragraph 2.1.4.1 and in Section 2.2.

We finally remark that the homogenized effective equation (13a) also describes the long-time asymptotic evolution of the passive scalar density evolving from small-scale or even concentrated initial data [149]. The point is that even a delta-concentrated source will, on time scales $O(\delta^{-2})$, spread over a large spatial scale $O(\delta^{-1})$ due to molecular diffusion. Since the probability distribution function (PDF) of the position $\mathbf{X}(t)$ of a single tracer initially located at $\mathbf{x}_0$ obeys the advection–diffusion equation with initial data $T_0(\mathbf{x}) = \delta(\mathbf{x} - \mathbf{x}_0)$, where $\delta(\cdot)$ here denotes the Dirac delta function, it follows that the PDF for the tracer's location becomes Gaussian in the long-time limit, with mean $\mathbf{x}_0$ and covariance matrix growing at an enhanced diffusive rate:

$$\lim_{t \to \infty} \langle (\mathbf{X}(t) - \mathbf{x}_0) \otimes (\mathbf{X}(t) - \mathbf{x}_0) \rangle \sim 2 K^* t.$$
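The long-time growth law for the tracer covariance can be checked by direct simulation of the tracer equations (2a) and (2b). The sketch below is a minimal Euler–Maruyama Monte Carlo estimate in the nondimensional units of Eq. (9), for the illustrative steady shear flow $\mathbf{v} = (0, \mathrm{Pe} \sin(2\pi x_1))$. The flow, step size, ensemble size and function name are assumptions made for this example. For this flow the steady cell problem (16) can be solved by hand, giving $K^*_{22} = 1 + \mathrm{Pe}^2/(8\pi^2)$.

```python
import math
import random


def simulate_shear_msd(peclet, t_final=1.0, dt=2e-3, n_tracers=2000, seed=0):
    """Estimate <Y(t_final)^2> for tracers in v = (0, Pe*sin(2 pi x)), Y(0) = 0.

    Nondimensional units: molecular diffusivity is 1, so each Brownian
    increment has variance 2*dt per coordinate.
    """
    rng = random.Random(seed)
    n_steps = round(t_final / dt)
    sigma = math.sqrt(2.0 * dt)
    total = 0.0
    for _ in range(n_tracers):
        x = rng.random()  # start uniformly within a period cell
        y = 0.0
        for _ in range(n_steps):
            # Euler-Maruyama step: the drift uses the position at step start
            y += peclet * math.sin(2.0 * math.pi * x) * dt + sigma * rng.gauss(0.0, 1.0)
            x += sigma * rng.gauss(0.0, 1.0)
        total += y * y
    return total / n_tracers


peclet, t_final = 10.0, 1.0
msd = simulate_shear_msd(peclet, t_final)
k_eff = 1.0 + peclet**2 / (8.0 * math.pi**2)  # homogenized K*_22 for this flow
print(msd / (2.0 * t_final), k_eff)
```

The ratio of the mean-square displacement to $2t$ should lie close to $K^*_{22}$, up to sampling error and a small finite-time transient. A plain Python loop is used only to keep the example free of dependencies; a vectorized implementation would be preferable for larger ensembles.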
Note in particular that the asymptotic behavior of the tracer is independent of its initial position; the reason is that molecular diffusion will in time smear out the memory of the initial position.

We next sketch, following [32,149,205,263], how the homogenized effective equation for the rescaled passive scalar density $T^{(\delta)}(\mathbf{x}, t)$ arises from a multiple scale asymptotic analysis. Subsequently, we will offer some physical interpretations for the homogenization formulas (14) and (15) for the effective diffusivity matrix.

2.1.2.2. Derivation of homogenized equation.

We seek an asymptotic approximation to $T^{(\delta)}(\mathbf{x}, t)$ of the following form in the $\delta \to 0$ limit:

$$T^{(\delta)}(\mathbf{x}, t) = T^{(0)}\!\left(\mathbf{x}, \frac{\mathbf{x}}{\delta}, t, \frac{t}{\delta^2}\right) + \delta\, T^{(1)}\!\left(\mathbf{x}, \frac{\mathbf{x}}{\delta}, t, \frac{t}{\delta^2}\right) + \delta^2\, T^{(2)}\!\left(\mathbf{x}, \frac{\mathbf{x}}{\delta}, t, \frac{t}{\delta^2}\right) + \cdots. \tag{17}$$
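In accordance with the usual prescription for multiple scale analysis [158], we have explicitly accounted for the fact that the terms in the asymptotic expansion may suffer rapid oscillations in the $\delta \to 0$ limit due to the rapid oscillations in the coefficient of the advection term in the rescaled advection–diffusion equation (11). We label the arguments corresponding to the rapid oscillations as $\boldsymbol{\xi} = \mathbf{x}/\delta$ and $\tau = t/\delta^2$. In the functions appearing in the multiple scale asymptotic expansion (17), the variables $(\mathbf{x}, \boldsymbol{\xi}, t, \tau)$ may be treated as varying independently of one another, provided we replace space and time derivatives as follows:

$$\frac{\partial}{\partial t} \to \frac{\partial}{\partial t} + \delta^{-2} \frac{\partial}{\partial \tau}, \qquad \nabla \to \nabla_{\mathbf{x}} + \delta^{-1} \nabla_{\boldsymbol{\xi}}.$$

Substituting now Eq. (17) into the rescaled advection–diffusion equation (11), and separately equating terms of the three leading orders results in the following PDEs:

$$O(\delta^{-2}):\quad Q T^{(0)} = 0, \tag{18a}$$

$$O(\delta^{-1}):\quad Q T^{(1)} = -\mathrm{Pe}\, \mathbf{v} \cdot \nabla_{\mathbf{x}} T^{(0)} + 2 \nabla_{\mathbf{x}} \cdot \nabla_{\boldsymbol{\xi}} T^{(0)}, \tag{18b}$$

$$O(\delta^{0}):\quad Q T^{(2)} = -\frac{\partial T^{(0)}}{\partial t} - \mathrm{Pe}\, \mathbf{v} \cdot \nabla_{\mathbf{x}} T^{(1)} + 2 \nabla_{\mathbf{x}} \cdot \nabla_{\boldsymbol{\xi}} T^{(1)} + \Delta_{\mathbf{x}} T^{(0)}, \tag{18c}$$

where the differential operator $Q$ is defined:

$$Q \equiv \partial/\partial \tau + \mathrm{Pe}\, \mathbf{v}(\boldsymbol{\xi}, \tau/\tau_v) \cdot \nabla_{\boldsymbol{\xi}} - \Delta_{\boldsymbol{\xi}}.$$

Note that it involves only the variables $\boldsymbol{\xi}$ and $\tau$, and that we may view $Q$ as operating on functions with spatial period 1 in $\boldsymbol{\xi}$ and temporal period $\tau_v$ in $\tau$. From the uniform parabolicity of this operator and the incompressibility of the velocity field, it follows from classical linear PDE theory ([105], Ch. 7) that we have the following solvability condition for $Q$: Given any smooth space–time periodic function $f(\boldsymbol{\xi}, \tau)$, the equation

$$Q g(\boldsymbol{\xi}, \tau) = f(\boldsymbol{\xi}, \tau) \tag{19}$$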
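has a smooth periodic solution $g(\boldsymbol{\xi}, \tau)$ if and only if $f(\boldsymbol{\xi}, \tau)$ has mean zero. This solution is moreover unique up to an arbitrary additive constant.

It follows in particular that the only functions of $\boldsymbol{\xi}$ and $\tau$ annihilated by $Q$ are constants, so Eq. (18a) implies that $T^{(0)}$ in fact only depends on the large-scale variables $\mathbf{x}$ and $t$:

$$T^{(0)}(\mathbf{x}, \boldsymbol{\xi}, t, \tau) = \bar{T}(\mathbf{x}, t). \tag{20}$$

Eq. (18b) therefore satisfies the solvability condition, since the right-hand side may be written as

$$-\mathrm{Pe}\, \mathbf{v}(\boldsymbol{\xi}, \tau) \cdot \nabla_{\mathbf{x}} \bar{T}(\mathbf{x}, t),$$

and $\mathbf{v}$ has mean zero. We can consequently express $T^{(1)}$ as

$$T^{(1)}(\mathbf{x}, \boldsymbol{\xi}, t, \tau) = \boldsymbol{\chi}(\boldsymbol{\xi}, \tau) \cdot \nabla_{\mathbf{x}} \bar{T}(\mathbf{x}, t) + C, \tag{21}$$

where $C$ is some constant and $\boldsymbol{\chi}(\boldsymbol{\xi}, \tau)$ is the unique, periodic, mean zero solution to

$$Q \boldsymbol{\chi}(\boldsymbol{\xi}, \tau) = -\mathrm{Pe}\, \mathbf{v}(\boldsymbol{\xi}, \tau). \tag{22}$$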
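Next, applying the solvability condition to Eq. (18c), we find that a necessary condition for the solution $T^{(2)}(\mathbf{x}, \boldsymbol{\xi}, t, \tau)$ to exist is that

$$\left\langle -\frac{\partial T^{(0)}}{\partial t} - \mathrm{Pe}\, \mathbf{v} \cdot \nabla_{\mathbf{x}} T^{(1)} + 2 \nabla_{\boldsymbol{\xi}} \cdot \nabla_{\mathbf{x}} T^{(1)} + \Delta_{\mathbf{x}} T^{(0)} \right\rangle_p = 0. \tag{23}$$

The third term, which is the average of a divergence with respect to the variable $\boldsymbol{\xi}$, vanishes by the divergence theorem. Substituting Eqs. (20) and (21) into this solvability relation, we have

$$-\frac{\partial \bar{T}(\mathbf{x}, t)}{\partial t} - \mathrm{Pe} \sum_{i,j=1}^{d} \langle \chi_i(\boldsymbol{\xi}, \tau)\, v_j(\boldsymbol{\xi}, \tau) \rangle_p \frac{\partial^2 \bar{T}(\mathbf{x}, t)}{\partial x_i\, \partial x_j} + \Delta_{\mathbf{x}} \bar{T}(\mathbf{x}, t) = 0.$$

Symmetrizing the coefficient of the Hessian of $\bar{T}$ in the second term, we can rewrite this as

$$\partial \bar{T}(\mathbf{x}, t)/\partial t = \nabla \cdot (K^* \nabla \bar{T}(\mathbf{x}, t)), \tag{24}$$

where the effective diffusivity matrix is expressed

$$K^* = I + \bar{K},$$

$$\bar{K}_{ij} = -\tfrac{1}{2} \mathrm{Pe} \left( \langle \chi_i(\boldsymbol{\xi}, \tau)\, v_j(\boldsymbol{\xi}, \tau) \rangle_p + \langle \chi_j(\boldsymbol{\xi}, \tau)\, v_i(\boldsymbol{\xi}, \tau) \rangle_p \right). \tag{25}$$

This is the content of the homogenization theorem, except that the formula for the enhanced diffusivity $\bar{K}$ must still be massaged a bit more to bring it in the form stated in Eq. (15). Note that the effective diffusion equation for $\bar{T}(\mathbf{x}, t)$ arises from a solvability condition for a higher-order ($O(\delta^2)$), rapidly fluctuating term; this reflects the fact that the effective diffusivity is determined by how the small-scale passive scalar fluctuations equilibrate under the influence of the small-scale periodic variations in the velocity field (see below).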
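To show that Eq. (25) is equivalent to Eq. (15), use Eq. (22) to express $v_j$ in terms of $\chi_j$:

$$-\tfrac{1}{2} \mathrm{Pe}\, \langle \chi_i v_j + \chi_j v_i \rangle_p = \tfrac{1}{2} \langle \chi_i Q \chi_j + \chi_j Q \chi_i \rangle_p = \tfrac{1}{2} \langle Q(\chi_i \chi_j) + 2 \nabla_{\boldsymbol{\xi}} \chi_i \cdot \nabla_{\boldsymbol{\xi}} \chi_j \rangle_p = \langle \nabla_{\boldsymbol{\xi}} \chi_i \cdot \nabla_{\boldsymbol{\xi}} \chi_j \rangle_p.$$

The first term in the average in the penultimate equality vanishes because $\langle Q g \rangle_p = 0$ for any function $g$ (see the discussion near Eq. (19)).

The derivation we have presented shows that, at least formally, there exist functions $T^{(1)}$ and $T^{(2)}$ so that

$$\left[ \frac{\partial}{\partial t} + \delta^{-1}\, \mathrm{Pe}\, \mathbf{v}\!\left(\frac{\mathbf{x}}{\delta}, \frac{t}{\delta^2 \tau_v}\right) \cdot \nabla - \Delta \right] \left[ T^{(\delta)}(\mathbf{x}, t) - T^{(\delta)}_{\mathrm{approx}}(\mathbf{x}, t) \right] = O(\delta),$$

where

$$T^{(\delta)}_{\mathrm{approx}}(\mathbf{x}, t) \equiv \bar{T}(\mathbf{x}, t) + \delta\, T^{(1)}\!\left(\mathbf{x}, \frac{\mathbf{x}}{\delta}, t, \frac{t}{\delta^2}\right) + \delta^2\, T^{(2)}\!\left(\mathbf{x}, \frac{\mathbf{x}}{\delta}, t, \frac{t}{\delta^2}\right)$$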
and $\bar{T}(\mathbf{x}, t)$ solves Eq. (24). Using energy estimates and the maximum principle, it can be rigorously shown from this development that $T^{(1)}$ and $T^{(2)}$ are bounded and that the difference between $T^{(\delta)}(\mathbf{x}, t)$ and the three-term approximation just defined tends to zero as $\delta \to 0$, with both the boundedness and convergence uniform over all of space and over finite time intervals [149,205]. It follows from this that $T^{(\delta)}(\mathbf{x}, t)$ converges to $\bar{T}(\mathbf{x}, t)$ in maximum norm as $\delta \to 0$. The gradient of $T^{(\delta)}(\mathbf{x}, t)$, however, does not converge (strongly) to the gradient of $\bar{T}(\mathbf{x}, t)$ because of rapid oscillations [264]; note from Eq. (21) that $\delta \nabla T^{(1)}$ does not vanish in the $\delta \to 0$ limit.

2.1.2.3. Physical meaning of homogenization formulas and relation to eddy diffusivity modelling.

We pause to remark upon the physical meaning of the cell problem (14) and the formula (15) for the homogenized diffusivity matrix which arose rather mechanically through self-consistent solvability conditions in the asymptotic expansion just presented. Note first that the passive scalar field will evolve much more rapidly on the small scales than the large, so the small-scale fluctuations of the passive scalar field will quickly reach a quasi-equilibrium state which depends on the local large-scale behavior of the passive scalar field. (This quasi-equilibrium state will be periodic in time, rather than steady, when the velocity field has periodic temporal fluctuations.) According to Eq. (21), the quasi-equilibrium behavior of the small-scale fluctuations is determined to leading order by the local gradient $\nabla T(\mathbf{x}, t)$ of the large-scale variations of the passive scalar field. This is formally obvious from the advection–diffusion Eq. (11) rescaled to large space and time scales. From Eqs. (21) and (22), we see that $\chi_j(\mathbf{x}, t)$ is exactly the response of the small-scale passive scalar fluctuations to a large-scale gradient of $T(\mathbf{x}, t)$ directed along $\hat{\mathbf{e}}_j$. Further discussion of this point may be found in [97,264].
We now show how the effective diffusivity formula (15) can be understood from a direct consideration of the advection-diffusion equation along with the multiple scale representation of the passive scalar field. When we view the passive scalar field on large scales, we are effectively taking a coarse-grained average over small scales. As the small-scale fluctuations are periodic, this coarse-graining is equivalent to (local) averaging over a spatio-temporal period cell. The
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
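coarse-grained and rescaled advection-diffusion equation therefore reads

$$\frac{\partial\langle T^{\delta}(x,t)\rangle}{\partial t} + \delta^{-1}\mathrm{Pe}\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right)\cdot\nabla T^{\delta}(x,t)\right\rangle = \Delta\langle T^{\delta}(x,t)\rangle , \qquad \langle T^{\delta}(x,t=0)\rangle = T_0(x) . \tag{26}$$

According to both formal intuition and the multiple scale analysis, the coarse-grained passive scalar field $\langle T^{\delta}(x,t)\rangle$ is, in the limit of strong scale separation ($\delta\to 0$), well approximated by a function $\bar{T}(x,t)$ varying only on the large scales and independent of $\delta$. The main challenge is to represent the coarse-grained average of the advective term in terms of $\bar{T}(x,t)$. This differs from the simple factorization into averages over $\mathbf{v}$ and $T^{\delta}$ because of the coupling between the small-scale fluctuations of the velocity field and the small-scale fluctuations they induce in the passive scalar field. Though the small-scale fluctuations of the passive scalar field are $O(\delta)$ weak in amplitude relative to the main large-scale variation, they are relevant in determining the large-scale transport because they are integrated over large space and time scales.

We mentioned at the beginning of Section 2 an ad hoc approach to estimate the coarse-grained advective term as an eddy diffusivity. For the present case, in which the velocity field has periodic spatio-temporal variations on scales strongly separated from those characterizing the leading-order passive scalar field, the closure hypothesis (5) is in fact precise and may be constructed from the multiple scale representation of the passive scalar field which was obtained in the derivation of the homogenization theorem:

$$T^{\delta}(x,t) = \bar{T}(x,t) + \delta\, T^{(1)}\!\left(x,\frac{x}{\delta},t,\frac{t}{\delta^{2}}\right) + O(\delta^{2}) , \qquad T^{(1)}(x,\xi,t,\tau) = \boldsymbol{\chi}(\xi,\tau)\cdot\nabla_x\bar{T}(x,t) + C . \tag{27}$$

Using the incompressibility of the velocity field to re-express the average of the advective term in Eq. (26), and substituting the asymptotic expansion (27) into it, we obtain

$$\delta^{-1}\mathrm{Pe}\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right)\cdot\nabla T^{\delta}(x,t)\right\rangle = \delta^{-1}\mathrm{Pe}\,\nabla\cdot\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right) T^{\delta}(x,t)\right\rangle$$
$$= \delta^{-1}\mathrm{Pe}\,\nabla\cdot\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right)\bar{T}(x,t)\right\rangle + \mathrm{Pe}\,\nabla\cdot\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right) T^{(1)}\!\left(x,\frac{x}{\delta},t,\frac{t}{\delta^{2}}\right)\right\rangle + O(\delta) . \tag{28}$$

The remainder term is indeed $O(\delta)$, notwithstanding the divergence acting on the expectation $\langle\cdot\rangle$, because the averaging over the period cell removes the rapid oscillations. The first term appearing after the last equality in Eq. (28) vanishes because $\mathbf{v}$ is the only rapidly oscillating factor in the argument, and has zero average over the period cell. Therefore, we are left with an expression which takes the form of an enhanced diffusion term involving the coupling of the small-scale fluctuations of the velocity field with the small-amplitude, small-scale fluctuations induced in the passive scalar field:

$$\delta^{-1}\mathrm{Pe}\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right)\cdot\nabla T^{\delta}(x,t)\right\rangle = \mathrm{Pe}\,\nabla\cdot\left\langle \mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right) T^{(1)}\!\left(x,\frac{x}{\delta},t,\frac{t}{\delta^{2}}\right)\right\rangle$$
$$= \mathrm{Pe}\sum_{i,j=1}^{d}\frac{\partial}{\partial x_i}\left(\langle v_i\chi_j\rangle\frac{\partial\bar{T}(x,t)}{\partial x_j}\right) = -\sum_{i,j=1}^{d}\bar{\mathcal{K}}_{ij}\frac{\partial^{2}\bar{T}(x,t)}{\partial x_i\,\partial x_j} ,$$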
with the enhanced diffusivity

$$\bar{\mathcal{K}}_{ij} = -\tfrac{1}{2}\mathrm{Pe}\left(\langle v_i\chi_j\rangle + \langle v_j\chi_i\rangle\right) .$$

This agrees with expression (25), which was subsequently shown to be equivalent to formula (15).

2.1.3. Generalization of homogenization theory to include large-scale flows

We now show how the homogenization for periodic flows described above can be extended to allow for the presence of certain kinds of large-scale mean flow components in the velocity field. We treat in turn the cases of a steady periodic flow with a constant mean drift, and then a superposition of a weak, large-scale mean flow with small-scale, periodic spatio-temporal fluctuations.

2.1.3.1. Constant mean flow.

In several applications, fluid is driven along a specific direction by a large-scale pressure gradient, and the resulting flow pattern consists of some mean constant motion and fluctuations induced either by flow instability or by variations in the properties of the medium through which the fluid is drawn [223]. A simple but instructive idealization of such flows is a superposition of a constant, uniform velocity $V$ with a mean zero, steady periodic flow $\mathbf{v}(x)$ representing the fluctuations. This can serve as a prototype model for hydrological flows through porous media [130]. We will often refer to a spatially constant mean flow such as $V$ as a mean sweep.

Now we show how the homogenization theory can be generalized to incorporate the mean sweep $V$. The nondimensionalized form of the advection-diffusion equation (9) is modified to

$$\frac{\partial T(x,t)}{\partial t} + \mathrm{Pe}\,(V + \mathbf{v}(x))\cdot\nabla T(x,t) = \Delta T(x,t) , \qquad T(x,t=0) = \delta^{d}\, T_0(\delta x) . \tag{29}$$

An immediate large-scale, long-time rescaling (10) of this equation would produce a term $\delta^{-1}\mathrm{Pe}\,V\cdot\nabla T^{\delta}(x,t)$. This term is singular in the $\delta\to 0$ limit, and would create difficulties at the $O(\delta^{-1})$ level in the multiple scale analysis of Paragraph 2.1.2.2 because $V$ does not have zero average over a period cell. A preliminary Galilean transformation to a frame comoving with the mean flow, $\tilde{T}(x,t) \equiv T(x+Vt,t)$, however, averts this obstacle. The advection-diffusion equation for $\tilde{T}(x,t)$ reads

$$\frac{\partial\tilde{T}(x,t)}{\partial t} + \mathrm{Pe}\,\mathbf{v}(x-Vt)\cdot\nabla\tilde{T}(x,t) = \Delta\tilde{T}(x,t) , \qquad \tilde{T}(x,t=0) = \delta^{d}\, T_0(\delta x) .$$

Now, if each component of $V$ is an integer multiple of a common real number $\lambda$, then $\mathbf{v}(x-Vt)$ would be mean zero with spatial period 1 in each coordinate direction and temporal period $\lambda^{-1}$. The homogenization theory of Section 2.1.2 can then be directly applied, yielding the following statement.

Homogenized effective diffusion equation for steady, periodic velocity fields with constant mean flow: The large-scale, long-time limit of the passive scalar field,

$$\bar{T}(x,t) \equiv \lim_{\delta\to 0} T^{\delta}(x,t) , \qquad T^{\delta}(x,t) \equiv \delta^{-d}\,\tilde{T}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}}\right) ,$$
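obeys an effective diffusion equation

$$\frac{\partial\bar{T}(x,t)}{\partial t} = \nabla\cdot\left(\mathcal{K}^{*}\nabla\bar{T}(x,t)\right) , \qquad \bar{T}(x,t=0) = T_0(x) . \tag{30}$$

The effective diffusivity matrix $\mathcal{K}^{*}$ in this equation can be expressed as

$$\mathcal{K}^{*} = I + \bar{\mathcal{K}} \tag{31}$$

with the enhanced diffusivity $\bar{\mathcal{K}}$ given by

$$\bar{\mathcal{K}}_{ij} = \langle\nabla\chi_i\cdot\nabla\chi_j\rangle , \tag{32}$$

where $\boldsymbol{\chi}(x,t)$ is the (unique) mean zero, periodic solution to the following parabolic cell problem:

$$\frac{\partial\boldsymbol{\chi}(x,t)}{\partial t} + \mathrm{Pe}\,\mathbf{v}(x-Vt)\cdot\nabla\boldsymbol{\chi}(x,t) - \Delta\boldsymbol{\chi}(x,t) = -\mathrm{Pe}\,\mathbf{v}(x-Vt) .$$

It is helpful to note that the period cell average in Eq. (32) is unchanged if $\boldsymbol{\chi}(x,t)$ is replaced by $\boldsymbol{\chi}(x-Vt,t)$, so the cell problem can be replaced by the purely spatial, elliptic PDE [210,230]:

$$\mathrm{Pe}\,(V + \mathbf{v}(x))\cdot\nabla\boldsymbol{\chi}(x) - \Delta\boldsymbol{\chi}(x) = -\mathrm{Pe}\,\mathbf{v}(x) . \tag{33}$$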
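As a minimal illustration of how the cell problem (33) and formula (32) fit together, the sketch below treats one special case that is an assumption of this example rather than part of the statement above: a steady shear flow $\mathbf{v} = (0, v(x))$ with zero mean sweep, $V = 0$. The component $\chi_y$ then depends on $x$ alone, the advective term drops out, and Eq. (33) reduces to $\chi_y'' = \mathrm{Pe}\,v(x)$, which can be solved mode by mode in Fourier space. Parseval's identity gives $\bar{\mathcal{K}}_{yy} = \langle|\chi_y'|^2\rangle = \mathrm{Pe}^2\sum_{k\neq 0}|\hat{v}_k|^2/(4\pi^2k^2)$. The function names and the test profile are hypothetical.

```python
import cmath
import math


def fourier_coefficients(v, n_grid=64):
    """Approximate Fourier coefficients of a 1-periodic function v by a plain DFT.

    Returns a dict mapping integer wavenumber k to the coefficient v_hat[k].
    """
    samples = [v(j / n_grid) for j in range(n_grid)]
    coeffs = {}
    for k in range(-n_grid // 2 + 1, n_grid // 2):
        total = sum(
            val * cmath.exp(-2j * math.pi * k * j / n_grid)
            for j, val in enumerate(samples)
        )
        coeffs[k] = total / n_grid
    return coeffs


def shear_enhanced_diffusivity(v, pe, n_grid=64):
    """Enhanced diffusivity along the shear for v = (0, v(x)) and V = 0 (assumed case).

    The reduced cell problem chi'' = Pe * v has Fourier solution
    chi_hat[k] = -Pe * v_hat[k] / (4 pi^2 k^2), so the average of |chi'|^2
    equals the sum of Pe^2 |v_hat[k]|^2 / (4 pi^2 k^2) over k != 0.
    """
    coeffs = fourier_coefficients(v, n_grid)
    return sum(
        (pe * abs(c)) ** 2 / (4 * math.pi ** 2 * k ** 2)
        for k, c in coeffs.items()
        if k != 0
    )


def sine_shear(x):
    """Hypothetical mean-zero shear profile used only for this example."""
    return math.sin(2 * math.pi * x)


k_bar = shear_enhanced_diffusivity(sine_shear, pe=10.0)
print(k_bar, 10.0 ** 2 / (8 * math.pi ** 2))
```

For the sine profile only the modes $k = \pm 1$ contribute, each with $|\hat{v}_k|^2 = 1/4$, so the sum can be checked by hand against $\mathrm{Pe}^2/(8\pi^2)$.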
When the components of $V$ cannot be expressed as integer multiples of a common real number, the velocity field $\mathbf{v}(x+Vt)$ is quasiperiodic rather than periodic. It can still be argued through more sophisticated means [38], however, that the homogenization formulas presented above carry over for general $V$ without change.

2.1.3.2. Weak large-scale mean flow.

It would be very interesting to describe the large-scale, long-time evolution of the passive scalar field in the more general situation in which the mean flow varies on large spatial and slow time scales. Such a velocity field could be a heuristically useful (but greatly simplified) idealization of an inhomogeneous turbulent flow in which some mean large-scale flow profile is disturbed by turbulent fluctuations represented as small-scale periodic fluctuations. Unfortunately, there does not appear to be a homogenization theory which generally describes the net large-scale transport properties arising from the interaction between the large-scale mean flow, the periodic fluctuations, and molecular diffusion. The goals of such a program, however, can be concretely illustrated by consideration of large-scale mean flows which are weak in a sense which we now describe.

For simplicity, we shall assume that the length scale of the large-scale velocity field coincides with that of the initial passive scalar field, $L_V = L_T$, and that the time scale of the large-scale velocity field is given by $\delta^{-2}L_v^{2}/\kappa$, which is $O(\delta^{-2})$ slow relative to the natural molecular diffusion time scale. We do not assume that the large-scale velocity field is periodic. As important special cases, we allow the large-scale velocity field to be steady and/or spatially uniform. The large-scale mean flow will further be assumed weak in that its amplitude is $O(\delta)$ relative to the amplitude of the small-scale periodic velocity field.

In units nondimensionalized according to the prescription in Section 2.1.1, the total velocity field (mean flow with periodic fluctuations) has the form

$$\mathrm{Pe}\left[\delta\, V(\delta x, \delta^{2}t/\tau_v) + \mathbf{v}(x, t/\tau_v)\right] .$$
The advection-diffusion equation for the passive scalar field $T^{\delta}(x,t)$ (10) rescaled to large scales and long times then becomes (cf. (11))

$$\frac{\partial T^{\delta}(x,t)}{\partial t} + \mathrm{Pe}\left[V(x,t/\tau_v) + \delta^{-1}\mathbf{v}\!\left(\frac{x}{\delta},\frac{t}{\delta^{2}\tau_v}\right)\right]\cdot\nabla T^{\delta}(x,t) = \Delta T^{\delta}(x,t) , \qquad T^{\delta}(x,t=0) = T_0(x) .$$

Because the mean flow was assumed to be $O(\delta)$ weak, it produces a regular, order unity advection term in the rescaled coordinates. The multiple scale analysis of Paragraph 2.1.2.2 can now be directly generalized to include the effects of the weak mean flow, which only modifies the $O(\delta^{0})$ equation in Eq. (18c). If $V(x,t/\tau_v)$ is smooth and bounded, the homogenization theorem for purely periodic velocity fields can be rigorously extended [209] to state that in the present case, $T^{\delta}(x,t)$ converges as $\delta\to 0$ to a nontrivial limit $\bar{T}(x,t)$ which satisfies the following large-scale, effective 'homogenized' advection-diffusion equation:

$$\frac{\partial\bar{T}(x,t)}{\partial t} + V(x,t/\tau_v)\cdot\nabla\bar{T}(x,t) = \nabla\cdot\left(\mathcal{K}^{*}\nabla\bar{T}(x,t)\right) , \tag{34}$$
$$\bar{T}(x,t=0) = T_0(x) . \tag{35}$$

The homogenized diffusivity $\mathcal{K}^{*}$ is determined through the same formula and cell problem (14) as in the case of no mean flow. In other words, $\mathcal{K}^{*}$ is completely independent of $V(x,t/\tau_v)$.

The homogenized equation (35) is a rigorous realization of the goal of large-scale modelling of passive scalar transport by a velocity field with a macroscopic mean flow component and small-scale fluctuations. The small-scale periodic fluctuations affect the large-scale passive scalar dynamics purely through an enhancement of diffusivity, while the mean flow appears straightforwardly in the advection term. We stress that this simple picture relies crucially on the assumptions that the mean flow is weak and that there is a strong separation between the scales of the fluctuating and mean components of the velocity field. Neither of these assumptions is generally valid in realistic turbulent flows, and the effective description of the large-scale passive scalar dynamics can be expected to be considerably more complicated [182,286]. Moreover, homogenization theory is only valid on sufficiently large ($O(\delta^{-2})$) time scales; we explore the practical relevance of this condition in Section 2.3.

Nonetheless, since no precise theories analogous to homogenization theory have yet been developed for realistic turbulent flows, there is much we can learn about passive scalar transport by careful study of small-scale periodic velocity fields, for which we can obtain certain results rigorously.

McLaughlin and Forest [232] have recently investigated the effects of another kind of large-scale variation on the transport of a passive scalar field in a periodic velocity field. In this work, the velocity field is chosen as a large-scale, compressible modulation of a periodic, incompressible, small-scale flow. The weak compressibility of the flow models the response to a large-scale stratification of the density of the fluid (as in the atmosphere) through the anelastic equations. A homogenized equation for the evolution of the passive scalar field on large scales and long times is derived through a modification of the multiple scale analysis described in Paragraph 2.1.2.2. This homogenized equation has variable coefficients reflecting the large-scale variation in the fluid density, and its solutions can exhibit focusing and the formation of nontrivial spatial structures. Several numerical simulations in [232] compare the evolution of these solutions to those of the
standard diffusion equations resulting from the homogenization of purely incompressible, periodic velocity fields.

We proceed next to develop some tools for characterizing the effective diffusivity arising from homogenization theory, which we will apply in Section 2.2 to several instructive classes of flows. We will in particular underscore the subtle influence which a constant mean flow can have on the effective passive scalar diffusivity [210]. Some aspects of passive scalar transport at finite (nonasymptotic) time scales will be illustrated explicitly in Section 2.3.

2.1.4. Alternative representations and bounds for effective diffusivity

Homogenization theory rigorously reduces the description of the large-scale, long-time dynamics of the passive scalar field to the determination of a constant effective diffusivity matrix $\mathcal{K}^{*}$, which however still requires the solution of a nontrivial cell problem (14). This cell problem can be solved explicitly for some special flows (see Sections 2.2.1 and 2.2.2), but must in general be treated by some approximate analytical or numerical methods. We present here some alternative analytical representations of the effective diffusivity which are useful for obtaining rigorous, computable estimates, particularly concerning its asymptotic dependence on large Péclet number. We will discuss the numerical solution of cell problems for some specific flows in Sections 2.2.3, 2.2.4 and 2.2.5.

2.1.4.1. Stieltjes integral representation.

One way to attempt to analyze the cell problem in general is to treat Pe as a small parameter, and to construct a perturbative solution for $\boldsymbol{\chi}(x,t)$ as an ascending power series in Pe [181,224]. This is not difficult to construct, since the zeroth-order equation is just the ordinary heat equation in periodic geometry.

The drawback to this approach is that the resulting series has a very limited radius of convergence [9,12,181], which limits its use in typical applications where the Péclet number is substantial or very large. Some formal diagrammatic resummation techniques have been proposed in the context of turbulence and field theory to attempt to extract meaningful information from a formal power series at parameter values (i.e. high Péclet number) where it diverges [181]. The validity of these methods is open to question, however, since they typically neglect a wide class of terms in the power series, without clean justification.

Fortunately, an exact and rigorous diagrammatic resummation is possible for the homogenized effective diffusivity matrix $\mathcal{K}^{*}$ of a periodic velocity field, and gives rise to a Stieltjes measure representation which is valid for arbitrary Péclet number [9,11,12]. Here we will formally sketch a more direct way [9,12,39,210] of achieving the Stieltjes measure representation formula, focusing on the case of a steady periodic velocity field with a constant (possibly zero) mean sweep $V$. The cell problem for each component $\chi_j(x)$ in this case may be expressed as follows (cf. Eq. (33)):

$$\Delta\chi_j(x) - \mathrm{Pe}\,(V + \mathbf{v}(x))\cdot\nabla\chi_j(x) = \mathrm{Pe}\,v_j(x) .$$

This equation can be rewritten as an abstract integral equation for $\nabla\chi_j(x)$ by application of the operator $\nabla\Delta^{-1}$ to both sides. We then obtain

$$(I - \mathrm{Pe}\,A^{V})\cdot\nabla\chi_j = \mathrm{Pe}\,A\hat{e}_j , \tag{36}$$
with $I$ the identity matrix. The other operators are defined on the Hilbert space $L^{2}(\mathbb{T}^{d})$ of periodic, square-integrable functions as follows:

$$A\mathbf{u} = \nabla\Delta^{-1}(\mathbf{v}(x)\cdot\mathbf{u}) , \tag{37a}$$
$$A^{V}\mathbf{u} = \nabla\Delta^{-1}((V + \mathbf{v}(x))\cdot\mathbf{u}) . \tag{37b}$$

A key property of these operators, which follows from incompressibility of the velocity field, is that they are compact [281] and skew symmetric when restricted to the subspace $L^{2}_{\nabla}(\mathbb{T}^{d})$ of square-integrable (generalized) gradients of periodic functions,

$$L^{2}_{\nabla}(\mathbb{T}^{d}) = \{\mathbf{u} = \nabla f : \langle|f|^{2}\rangle + \langle|\mathbf{u}|^{2}\rangle < \infty\} . \tag{38}$$

These properties are more apparent when the operators are reformulated in terms of the stream function (or stream matrix); see [12]. The spectral theory of compact, skew-symmetric operators [281] guarantees the existence of an orthonormal basis of functions in $L^{2}_{\nabla}(\mathbb{T}^{d})$ which are eigenfunctions of $A^{V}$ with purely imaginary eigenvalues. Moreover, the eigenvalues and eigenfunctions come in complex conjugate pairs, with the magnitude of the eigenvalues clustering asymptotically near zero. We may therefore index the eigenvalues by $\{\pm i\mu^{(n)}\}_{n=1}^{\infty}$, where $\mu^{(n)}$ is a real, positive sequence decreasing toward zero; there may also possibly be a zero eigenvalue of $A^{V}$.

The cell problem (36) may now be solved by expanding $\nabla\chi_j(x)$ and $A\hat{e}_j$ (which is in $L^{2}_{\nabla}(\mathbb{T}^{d})$) in terms of the eigenfunctions of the operator $A^{V}$. Substituting the result into the effective diffusivity formula (32), we thereby achieve the Stieltjes Integral Representation Formula for the enhanced diffusivity along any given direction $\hat{e}$ in a steady periodic velocity field with a possible constant mean sweep:

$$\hat{e}\cdot\bar{\mathcal{K}}\cdot\hat{e} = \mathrm{Pe}^{2}\,\|\mathbf{v}\cdot\hat{e}\|_{-1}^{2}\left(a^{(0)} + 2\sum_{n=1}^{\infty}\frac{a^{(n)}}{1 + \mathrm{Pe}^{2}(\mu^{(n)})^{2}}\right) . \tag{39}$$

The parts of this formula which remain to be explained are:

• An order unity prefactor measuring the magnitude of the nondimensionalized velocity field in a certain (Sobolev) norm ([105], Ch. 6),

$$\|\mathbf{v}\cdot\hat{e}\|_{-1}^{2} \equiv \langle|A\hat{e}|^{2}\rangle = \sum_{k\in\mathbb{Z}^{d},\,k\neq 0}\frac{|\hat{\mathbf{v}}_{k}\cdot\hat{e}|^{2}}{4\pi^{2}|k|^{2}} , \tag{40}$$

where $\hat{\mathbf{v}}_{k}$ are the Fourier coefficients of $\mathbf{v}(x)$.

• The mean square $a^{(0)} = \langle|g^{(0)}|^{2}\rangle$ of the projection $g^{(0)}$ of the normalized function $A\hat{e}/\langle|A\hat{e}|^{2}\rangle^{1/2}$ onto the null space of $A^{V}$ in $L^{2}_{\nabla}(\mathbb{T}^{d})$.

• The mean square $a^{(n)} = \langle|g^{(n)}|^{2}\rangle$ of the projection $g^{(n)}$ of the normalized function $A\hat{e}/\langle|A\hat{e}|^{2}\rangle^{1/2}$ onto the eigenspace of $A^{V}$ in $L^{2}_{\nabla}(\mathbb{T}^{d})$ corresponding to eigenvalue $i\mu^{(n)}$ (or equivalently to $-i\mu^{(n)}$).

The normalization by the factor $\langle|A\hat{e}|^{2}\rangle^{1/2}$ implies that

$$\sum_{n=-\infty}^{\infty} a^{(n)} = a^{(0)} + 2\sum_{n=1}^{\infty} a^{(n)} = 1 ,$$
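so the $\{a^{(n)}\}$ may be interpreted as the weights of a normalized discrete measure,

$$d\rho_{\mathbf{v},V} = \left[a^{(0)}\delta(\mu) + \sum_{n=1}^{\infty} a^{(n)}\left(\delta(\mu - \mu^{(n)}) + \delta(\mu + \mu^{(n)})\right)\right] d\mu .$$

The summation appearing in Eq. (39) therefore has the form of a Stieltjes integral against this discrete measure:

$$\hat{e}\cdot\bar{\mathcal{K}}\cdot\hat{e} = \mathrm{Pe}^{2}\,\|\mathbf{v}\cdot\hat{e}\|_{-1}^{2}\int_{-\infty}^{\infty}\frac{d\rho_{\mathbf{v},V}}{1 + \mathrm{Pe}^{2}\mu^{2}} . \tag{41}$$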
The Stieltjes integral representation for the effective diffusivity in a periodic velocity field was derived by Avellaneda and the first author [9,12] and in a slightly different form by Bhattacharya et al. [39]. Similar, but more notationally complex, formulas for off-diagonal elements of $\bar{\mathcal{K}}$ may be found in [39]. A similar Stieltjes integral representation was derived by Avellaneda and Vergassola [20] for spatio-temporal periodic velocity fields with no mean sweep. The only difference is that the definition (37b) of the operator $A^{V}$ is to be replaced by

$$A^{V}\mathbf{u} = \nabla\Delta^{-1}(\mathbf{v}(x,t/\tau_v)\cdot\mathbf{u}) + (\partial/\partial t)\Delta^{-1}\mathbf{u} , \tag{42}$$

which is still real, compact and skew-symmetric on the subspace of square-integrable gradients of spatio-temporal periodic functions, $L^{2}_{\nabla}(\mathbb{T}^{d}\times[0,\tau_v])$.

Note that the formal expansion of the summands in Eq. (39) in powers of Pe will recover a formal power series which converges only for $|\mathrm{Pe}| < (\mu^{(1)})^{-1}$ [9,12]. The Stieltjes integral representation may be interpreted as a rigorous resummation of this series which is valid for all Pe; this is demonstrated explicitly in [11]. The Stieltjes integral is admittedly too difficult to evaluate directly in general because the full spectral information of the operator $A^{V}$ is required. Nonetheless, as we shall now describe, much practically useful information can be deduced from the Stieltjes integral representation.

Rigorous bounds through Padé approximants: The Stieltjes integral representation (41) first of all permits the construction of rigorous upper and lower bounds on the effective diffusivity for all Péclet number. By noting that $d\rho_{\mathbf{v},V}$ is a nonnegative measure with total integral equal to unity, we can immediately deduce the following elementary lower and upper bounds on the effective diffusivity [12]:

$$1 \leq \hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e} \leq 1 + \mathrm{Pe}^{2}\,\|\mathbf{v}\cdot\hat{e}\|_{-1}^{2} . \tag{43}$$
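The following sketch evaluates the discrete sum in Eq. (39) for a made-up spectrum and checks it against the elementary bounds (43). It also compares the closed form with a truncated power series in Pe, to illustrate that the series is only usable for $|\mathrm{Pe}| < (\mu^{(1)})^{-1}$. The weights, eigenvalue magnitudes, and norm below are hypothetical numbers chosen to satisfy $a^{(0)} + 2\sum_n a^{(n)} = 1$; they do not come from any particular flow, and the function names are assumptions of this example.

```python
def enhanced_diffusivity(pe, norm_sq, a0, weights, mus):
    """Evaluate Eq. (39): Pe^2 * norm_sq * (a0 + 2 * sum_n a_n / (1 + Pe^2 mu_n^2))."""
    spectral_sum = a0 + 2.0 * sum(
        a / (1.0 + (pe * mu) ** 2) for a, mu in zip(weights, mus)
    )
    return pe ** 2 * norm_sq * spectral_sum


def truncated_series(pe, norm_sq, a0, weights, mus, order):
    """Same quantity with each summand 1/(1 + Pe^2 mu^2) replaced by its
    geometric series in (Pe * mu)^2, truncated after `order` + 1 terms."""
    spectral_sum = a0
    for a, mu in zip(weights, mus):
        spectral_sum += 2.0 * a * sum(
            (-((pe * mu) ** 2)) ** m for m in range(order + 1)
        )
    return pe ** 2 * norm_sq * spectral_sum


# Hypothetical spectral data (not from the source): a0 + 2 * sum(WEIGHTS) = 1.
NORM_SQ = 0.05
A0 = 0.2
WEIGHTS = [0.25, 0.10, 0.05]
MUS = [0.5, 0.2, 0.1]  # largest eigenvalue magnitude 0.5, so the series needs |Pe| < 2

for pe in (0.1, 1.0, 10.0, 100.0):
    k_star = 1.0 + enhanced_diffusivity(pe, NORM_SQ, A0, WEIGHTS, MUS)
    upper = 1.0 + pe ** 2 * NORM_SQ
    print(f"Pe={pe:6.1f}  1 <= {k_star:.6g} <= {upper:.6g}")

# Inside the radius of convergence the truncated series matches the closed form;
# outside it the partial sums blow up while the closed form stays bounded by (43).
print(truncated_series(1.0, NORM_SQ, A0, WEIGHTS, MUS, 40))
print(truncated_series(4.0, NORM_SQ, A0, WEIGHTS, MUS, 40))
```

The closed form is cheap to evaluate once the spectral data are known; the practical difficulty noted in the text is obtaining those data, not summing them.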
The Stieltjes integral representation also makes it possible to construct sharper bounds on the effective diffusivity using information from a finite number of terms in a small Péclet number expansion, which can be determined by a straightforward formal perturbation procedure [12,181]. Suppose one has obtained in this way a small Péclet number asymptotic expansion of $\hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e}$:

$$\hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e} = 1 + \sum_{m=1}^{M}\mathrm{Pe}^{2m}\,b_m + O(\mathrm{Pe}^{2M+2}) , \tag{44}$$
where $b_m$ are some constants involving explicit integrals which can be evaluated or at least estimated numerically [12,40]. Comparing with a formal small Péclet number expansion of the Stieltjes integral representation (41), we find that each $b_m$ is proportional to the moment of order $2(m-1)$ of the measure $d\rho_{\mathbf{v},V}$. The knowledge of these moments implies rigorous restrictions on the values which $\hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e}$, given by the Stieltjes integral representation (41), may attain for arbitrary values of Péclet number. More precisely, it has been shown [12,337] that $\hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e}$ is rigorously bounded above and below, for all Péclet number, by certain Padé approximants, which are rational functions of Pe explicitly constructed from the coefficients of the perturbation series (44) (see for example [30]). Padé approximants were applied to construct rigorous bounds for the effective diffusivity in certain periodic flows in [40]; some of this work will be briefly discussed in Section 2.2.5. The Padé approximant bounds may also be used to rigorously extrapolate the value of $\mathcal{K}^{*}$ over a range of Pe, given its measured value at a finite set of Pe, and to check the validity of Monte Carlo simulations for the effective diffusivity [12,40].

Maximal and minimal enhanced diffusivity: While the Padé approximants can produce sharp estimates of the effective diffusivity for small and moderate values of the Péclet number, they eventually deteriorate at sufficiently large Pe [40]. One finds only that the effective diffusivity in the asymptotic regime of large Péclet number must exceed some constant independent of Pe, but cannot grow more quickly than $\mathrm{Pe}^{2}$, which is indicated already by the simplest bounds (43). The high Péclet number asymptotics of the effective diffusivity are however of considerable practical interest, since the Péclet number can be quite large in a number of natural and experimental situations.

One important question is how rapidly the effective diffusivity grows with Péclet number. Following the work of McLaughlin and the first author in [210], we classify two extreme situations. We say that flows produce

• maximally enhanced diffusion in a certain direction $\hat{e}$ when the diffusivity along this direction grows quadratically with Pe as $\mathrm{Pe}\to\infty$. This is the most rapid growth possible, according to Eq. (43).

• minimally enhanced diffusion in the direction $\hat{e}$ if the effective diffusivity remains uniformly bounded in this direction for arbitrarily large Pe.

Explicit shear flow examples will be presented in Section 2.2.1 which demonstrate the realizability of both of these extreme behaviors. Other large Pe number behavior can be realized by various flows (see Section 2.2.3); the classes of flow which are maximally or minimally diffusive in a given direction are not exhaustive.

The Stieltjes integral representation provides some simple general criteria for determining whether a given flow will be maximally or minimally diffusive in a given direction $\hat{e}$. It is evident from Eq. (39) that maximally enhanced diffusion is equivalent to $a^{(0)}\neq 0$. This is rigorously verified in [39,210], where it is moreover demonstrated that maximal diffusivity along $\hat{e}$ is equivalent to the existence of a complex periodic function $h(x)$ which is constant along streamlines,

$$(V + \mathbf{v}(x))\cdot\nabla h(x) = 0 , \tag{45}$$

has a nontrivial projection against $\mathbf{v}(x)\cdot\hat{e}$,

$$\langle h\,\mathbf{v}\cdot\hat{e}\rangle \neq 0 ,$$
and is sufficiently smooth that it belongs to the Sobolev space $H^{1}(\mathbb{T}^{d})$ of complex periodic, square-integrable functions with square-integrable (generalized) derivatives ([105], Ch. 6). These conditions for maximal diffusivity along a direction $\hat{e}$ have been interpreted in a rigorous, geometric manner by Mezić et al. [239] as indicating a lack of ergodicity of $\mathbf{v}\cdot\hat{e}$; that is, the average value of $\mathbf{v}\cdot\hat{e}$ along streamlines is not everywhere zero. The reason why this situation gives rise to maximally enhanced diffusion is that, in the absence of molecular diffusion, particles on streamlines with a nonzero average value of $\mathbf{v}\cdot\hat{e}$ would proceed in the direction $\hat{e}$ at a ballistic rate (distance linearly proportional to time) [168]. Such streamlines are often manifested as open channels [210], as we shall see concretely in Section 2.2. Molecular diffusion acts as an impediment to the rapid transport along these open channels by knocking tracers into other streamline channels with average values of $\mathbf{v}\cdot\hat{e}$ of the opposite sign. (Such compensatory channels must exist since $\mathbf{v}\cdot\hat{e}$ has mean zero.) The net result at long time is a diffusive motion along $\hat{e}$ (on top of any constant mean drift $V\cdot\hat{e}$), with the effective diffusivity constant growing rapidly with Péclet number, since a high Péclet number permits particles to travel a long way along open channels before getting knocked away from them by molecular diffusion.

If $a^{(0)} = 0$, then the Stieltjes integral representation (39) implies that the tracer motion is not maximally diffusive, but does not necessarily imply that it is minimally diffusive. Even though the contribution from each term in the sum from $n = 1$ to $\infty$ individually approaches a finite constant in the $\mathrm{Pe}\to\infty$ limit, the full sum can still diverge in the $\mathrm{Pe}\to\infty$ limit depending on how rapidly the eigenvalues $\mu^{(n)}$ approach zero. More information is needed to determine whether a flow produces minimally enhanced diffusion or not.
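The three possibilities just described can be seen numerically by feeding different hypothetical spectra into the sum of Eq. (39) and measuring how the enhanced diffusivity changes when Pe is doubled. A ratio near 4 indicates quadratic growth (maximal enhancement), a ratio near 1 indicates saturation (minimal enhancement), and anything in between signals an intermediate regime. All spectra below are invented for illustration. In particular, the "clustering" case uses weights $\propto n^{-2}$ and eigenvalue magnitudes $1/n$, truncated at a large but finite number of modes, so its intermediate growth is only meaningful for Pe well below the truncation index.

```python
def enhanced_diffusivity(pe, norm_sq, a0, weights, mus):
    """Evaluate Eq. (39): Pe^2 * norm_sq * (a0 + 2 * sum_n a_n / (1 + Pe^2 mu_n^2))."""
    spectral_sum = a0 + 2.0 * sum(
        a / (1.0 + (pe * mu) ** 2) for a, mu in zip(weights, mus)
    )
    return pe ** 2 * norm_sq * spectral_sum


def doubling_ratio(pe, a0, weights, mus, norm_sq=1.0):
    """Ratio K(2 Pe) / K(Pe); equals 4 for Pe^2 growth and 1 for a bounded K."""
    return (
        enhanced_diffusivity(2.0 * pe, norm_sq, a0, weights, mus)
        / enhanced_diffusivity(pe, norm_sq, a0, weights, mus)
    )


# Case 1 (hypothetical): nonzero weight a0 on the null space of A^V.
maximal = doubling_ratio(1000.0, a0=0.2, weights=[0.4], mus=[0.5])

# Case 2 (hypothetical): a0 = 0 and a single eigenvalue pair bounded away from zero.
bounded = doubling_ratio(1000.0, a0=0.0, weights=[0.5], mus=[0.5])

# Case 3 (hypothetical): a0 = 0 with eigenvalues clustering at zero, mu_n = 1/n,
# and weights proportional to 1/n^2, normalized so that 2 * sum(weights) = 1.
N_MODES = 20000
raw = [1.0 / n ** 2 for n in range(1, N_MODES + 1)]
scale = 0.5 / sum(raw)
cluster_weights = [scale * r for r in raw]
cluster_mus = [1.0 / n for n in range(1, N_MODES + 1)]
clustering = doubling_ratio(100.0, a0=0.0, weights=cluster_weights, mus=cluster_mus)

print(f"a0 > 0:              ratio = {maximal:.4f}")
print(f"a0 = 0, isolated mu: ratio = {bounded:.4f}")
print(f"a0 = 0, clustering:  ratio = {clustering:.4f}")
```

For the clustering spectrum each summand reduces to a constant times $\mathrm{Pe}^{2}/(n^{2} + \mathrm{Pe}^{2})$, so the sum grows roughly linearly in Pe over the range tested: unbounded, yet far slower than $\mathrm{Pe}^{2}$.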
One sufficient condition for minimally enhanced diffusion along a direction $\hat{e}$ established in [39,210] is the existence of a periodic function $u\in H^{1}(\mathbb{T}^{d})$ which satisfies the equation

$$(V + \mathbf{v}(x))\cdot\nabla u(x) = -\mathbf{v}(x)\cdot\hat{e} . \tag{46}$$

Whereas maximally enhanced diffusion along a direction $\hat{e}$ is associated with open channels, minimally enhanced diffusion along $\hat{e}$ appears to be related to the presence of a layer of streamlines which block flow along the $\hat{e}$ direction [210], as we shall illustrate in Sections 2.2.3 and 2.2.4. The effective diffusivity along blocked directions $\hat{e}$ remains bounded in proportion to the molecular diffusivity, regardless of how large Pe becomes, because the transport rate is always limited by the need for the tracer to cross the layer of blocked streamlines, which only molecular diffusion can accomplish. Indeed, in the limit of no molecular diffusivity, the motion of the tracer along a blocked direction $\hat{e}$ would remain forever trapped.

We caution the reader that our care in stating the function spaces to which solutions of Eqs. (45) and (46) must belong is quite essential. If one were to naively treat these equations in the same way as finite-dimensional linear algebra problems, one would wrongly conclude that any flow produces either maximally or minimally enhanced diffusion. Such a supposition is falsified by the example of steady cellular flows, which are neither maximally nor minimally diffusive, as we shall discuss in Section 2.2.3. In particular, even though the nice streamline structure of this flow (Fig. 2) permits a formal construction of a function $h$ constant along streamlines and thereby satisfying Eq. (45), it turns out that any such function is not smooth enough at the corners of the period cell to be in $H^{1}(\mathbb{T}^{d})$. Therefore, the condition for maximally enhanced diffusion is not satisfied by the steady cellular flow, but one could be misled if the smoothness considerations are not taken into account.
The above rigorous criteria for maximal and minimal diffusivity have been gainfully applied by McLaughlin and the first author [210] to categorize the effects of a nonzero constant mean flow on effective transport, and we shall describe some of these results in Section 2.2.4. Some other applications of the criteria for maximal and minimal diffusivity to some special classes of flows, particularly involving special kinds of streamline blocking, may be found in [39].

2.1.4.2. Variational principles.

Another useful representation of the homogenized diffusivity is through a variational principle. Avellaneda and the first author [12] introduced the first such variational principle for steady, periodic velocity fields $\mathbf{v}(x)$ with no mean sweep: For all vectors $\hat{e}\in\mathbb{R}^{d}$, the effective diffusivity along direction $\hat{e}$ may be expressed as the following minimization problem:

$$\hat{e}\cdot\mathcal{K}^{*}\cdot\hat{e} = \min_{\mathbf{u}-\hat{e}\,\in\,L^{2}_{\nabla}(\mathbb{T}^{d})}\left\langle|\mathbf{u}|^{2} + \mathrm{Pe}^{2}\,\mathbf{u}\cdot\mathbf{K}\cdot\mathbf{u}\right\rangle , \tag{47}$$

where the nonnegative, self-adjoint operator $\mathbf{K}$ is defined by

$$\mathbf{K} = (A)^{\dagger}A$$

and $L^{2}_{\nabla}(\mathbb{T}^{d})$ is the Hilbert space of square-integrable gradients defined in (38).

This variational principle allows us to generate rigorous upper bounds on the effective diffusivity by substituting arbitrary functions $\mathbf{u}$ with $\mathbf{u}-\hat{e}\in L^{2}_{\nabla}(\mathbb{T}^{d})$ into the functional on the right-hand side of Eq. (47). Note that the functional to be minimized involves the nonlocal operator $\mathbf{K}$. Fortunately, in certain cases, the calculation can be greatly simplified by a suitable choice of trial fields $\mathbf{u}$.

A related dual (nonlocal) maximal variational principle was later derived by Fannjiang and Papanicolaou [97]. By carefully using the minimal and maximal variational principles in tandem, the effective diffusivity can be estimated in a fairly sharp manner for certain tractable classes of flows. These authors also formulate some local minimax variational principles as well as variational principles for the effective diffusivity of time-dependent periodic velocity fields.

We mention in passing that another, philosophically different, variational approach to deriving rigorous upper bounds for the effective diffusivity of a passive scalar field over finite regions has been developed by Krommes and coworkers [161,187]. Also, a rigorous bound on the effective diffusivity depending on the maximum of the stream function (or stream matrix) has been obtained by Tatarinova et al. [314] for arbitrary velocity fields which are confined to finite regions. This result is a different, weaker interpretation of the upper bound in Eq. (43).

2.2. Effective diffusivity in various periodic flow geometries

We now demonstrate the utility of the rigorous formulas for the effective diffusivity of a tracer over long times by applying them to various specific classes of periodic flows. Explicit formulas for the effective diffusivity can be derived for shear flows with spatially uniform cross sweeps, as we will show in Sections 2.2.1 and 2.2.2; in other cases one can turn to a numerical solution of the cell problem [40,165,210]. We will for the most part, however, be concerned with the asymptotic behavior of the effective diffusivity in the case of large Péclet number Pe, which arises in many
practical situations. We will show in the rest of Section 2.2 how the variational and the Stieltjes measure representation for the e!ective di!usivity can be utilized to rigorously determine its exact scaling behavior with respect to large Pe, even when the cell problem (14) cannot be analytically solved. By such means, we shall study in Section 2.2.3 the tracer transport in a special oneparameter family of two-dimensional, steady, periodic #ows which interpolate between a cellular #ow and a shear #ow, and we shall describe in Section 2.2.4 the subtle e!ects which arise upon the addition of a constant mean sweep V. The e!ective di!usivity scales as #KH#&Pe in the pure cellular #ow [67], but the presence of a mean sweep can produce either maximally enhanced di!usion (eL ' KH ' eL &Pe) along most directions eL or minimally enhanced di!usion in all directions (#KH#&Pe), depending on such sensitive criteria as whether the components of V are rationally related, whether V is transverse to a mean shear #ow pattern, and whether the total #ow has stagnation points [210]. Numerical evaluations of the e!ective di!usivity [210] con"rm these mathematically derived asymptotics, and reveal a variety of interesting crossover behavior at large but "nite PeH clet number which demonstrate the practical relevance of the criteria for maximally and minimally enhanced di!usion just listed. The numerical and mathematical analysis of the long-time e!ective di!usivity of a tracer in some other periodic #ows using the formulas from Section 2.1.4 will be discussed brie#y in Section 2.2.5. We stress that the results to be presented throughout Section 2.2 all deal with the asymptotic long-time behavior of the passive scalar "eld. Some issues concerning the observation of the tracer motion and passive scalar "eld evolution at "nite times will be discussed in Section 2.3. 
We shall endeavor throughout Section 2.2 to supplement the rigorous homogenization theory results with intuitive physical explanations for the large Péclet number behavior of the effective diffusivity through consideration of the streamline geometry. A common qualitative theme which will emerge is that, in steady flows at large Péclet number, open channels are associated with greatly enhanced diffusion and blocked streamlines with only moderately enhanced diffusion. This notion will become clearer through discussion and pictures of streamlines for the specific examples we shall discuss. Another way of intuitively understanding the behavior of the effective diffusivity is through an informal consideration of Taylor's formula [317] for the mean-square tracer displacement in terms of the correlation function of the tracer (Lagrangian) velocity. We will emphasize the geometric perspective here, and elaborate upon the heuristic use of Taylor's formula in Section 3, where we examine tracer diffusion in random shear flows.

2.2.1. Periodic shear flows with constant (or zero) cross sweep

Shear flows are a very useful class of examples for the examination and illustration of general theories for turbulent diffusion, as we shall see now and in much greater depth in a random context in Section 3. They arise naturally in various physical applications, and they are quite tractable analytically due to their simple structure. A two-dimensional spatio-temporal shear velocity field aligned along the y-axis has the general form
$$\boldsymbol{v}(\boldsymbol{x}, t) = \boldsymbol{v}(x, y, t) = \begin{bmatrix} 0 \\ v(x, t) \end{bmatrix}.$$
In particular, it is completely described by a scalar function $v(x, t)$ which depends on only one spatial variable $x$ in addition to time for nonsteady flows. In this sense, shear flows play the role of
a one-dimensional model for incompressible flows. This feature often permits explicit solution and analysis for various passive scalar and tracer statistics with quite general $v(x, t)$, as we shall show at greater length in Sections 2.3.1 and 3 (see also other mathematical developments for shear flows in [10,206]). A particularly useful extension of the shear flow model which preserves much of its exact solvability is the inclusion of a purely time-dependent cross sweep $w(t)$:
$$\boldsymbol{v}(\boldsymbol{x}, t) = \boldsymbol{v}(x, y, t) = \begin{bmatrix} w(t) \\ v(x, t) \end{bmatrix}. \tag{48}$$
One could also allow a purely time-dependent sweeping component along the shear flow, but this is less interesting because the resulting tracer motion would simply be the sum of its motion due to Eq. (48) and due to this additional shear-parallel sweep. On the other hand, a cross sweep $w(t)$, as appears in Eq. (48), interacts nonlinearly with the shear flow convection by dragging the tracer across its spatial variations. In our present discussion, we will be able to write down explicit formulas for the effective diffusivity of a tracer in a periodic, mean zero, spatio-temporal shear flow $v(x, t)$ with periodic, constant, or vanishing cross sweep $w(t)$, and thereby identify the influence of the various parameters. We will moreover be able to explicitly relate these formulas to their abstract Stieltjes measure representation. To fix the main ideas, we concentrate in Section 2.2.1 on the case of constant or zero cross sweep. We treat in turn a steady periodic shear flow with no cross sweep ($v = v(x)$, $w(t) = 0$), a steady periodic shear flow with a nonzero constant cross sweep ($v = v(x)$, $w(t) = \bar{w} \neq 0$), a spatio-temporal periodic shear flow with no cross sweep ($v = v(x, t)$, $w(t) = 0$), and a spatio-temporal periodic shear flow with nonzero constant cross sweep ($v = v(x, t)$, $w(t) = \bar{w} \neq 0$). The interesting features created by a periodically fluctuating $w(t)$ will be elaborated upon in Section 2.2.2.

2.2.1.1. Steady shear flow with no cross sweep. A steady, mean zero, periodic shear flow (48) with $v = v(x)$ and $w(t) = 0$ has been used as a simple model for flow in a stratified porous medium [130]. The cell problem (16) reads

$$\begin{aligned} -\Delta \chi_x(x, y) + \mathrm{Pe}\, v(x) \frac{\partial \chi_x(x, y)}{\partial y} &= 0, \\ -\Delta \chi_y(x, y) + \mathrm{Pe}\, v(x) \frac{\partial \chi_y(x, y)}{\partial y} &= -\mathrm{Pe}\, v(x). \end{aligned} \tag{49}$$

Clearly $\chi_x(x, y) = 0$, and we can seek a solution for $\chi_y(x, y)$ which is independent of $y$ in terms of a Fourier series expansion:

$$v(x) = \sum_{k \neq 0} \hat{v}_k\, e^{2\pi i k x}, \qquad \chi_y(x) = -\mathrm{Pe} \sum_{k \neq 0} \frac{\hat{v}_k}{4\pi^2 k^2}\, e^{2\pi i k x}.$$
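As a quick numerical sanity check of this Fourier solution, the following sketch builds $\chi_y$ from sampled values of a mean-zero shear profile of period one and compares it with the closed form for a single mode. It is only an illustration under stated assumptions: the function name `chi_y_from_shear`, the grid size, and the test profile $v(x) = \sin 2\pi x$ are our own choices, not part of the original analysis.

```python
import numpy as np

def chi_y_from_shear(v_samples, pe):
    """Return chi_y on the sampling grid for a mean-zero, 1-periodic shear v(x).

    Implements chi_hat_k = -Pe * v_hat_k / (4 pi^2 k^2) for k != 0, i.e. the
    y-independent solution of -chi'' = -Pe v quoted in the text.
    """
    n = v_samples.size
    v_hat = np.fft.fft(v_samples) / n            # Fourier coefficients v_hat_k
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    chi_hat = np.zeros_like(v_hat)
    nonzero = k != 0                             # the k = 0 (mean) mode is dropped
    chi_hat[nonzero] = -pe * v_hat[nonzero] / (4.0 * np.pi**2 * k[nonzero]**2)
    return np.real(np.fft.ifft(chi_hat * n))

# Illustrative (assumed) test case: v(x) = sin(2 pi x), Pe = 10.
n, pe = 256, 10.0
x = np.arange(n) / n
v = np.sin(2.0 * np.pi * x)
chi = chi_y_from_shear(v, pe)

# Closed form for this single-mode profile.
exact = -pe * np.sin(2.0 * np.pi * x) / (4.0 * np.pi**2)

# Independent check: centered finite differences for -chi'' + Pe v, which should be small.
h = 1.0 / n
residual = -(np.roll(chi, -1) - 2.0 * chi + np.roll(chi, 1)) / h**2 + pe * v
```

The finite-difference residual is not exactly zero because of the second-order truncation error of the stencil; it shrinks as the grid is refined.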
Substituting the functions $\chi_x$ and $\chi_y$ into Eq. (15), we find the following expression for the effective diffusivity matrix:

$$K^{*} = \begin{bmatrix} 1 & 0 \\ 0 & 1 + \bar{K}_{yy} \end{bmatrix}. \tag{50}$$

It differs from the molecular diffusivity matrix only through the enhancement

$$\bar{K}_{yy} = \mathrm{Pe}^{2} \sum_{k \neq 0} \frac{|\hat{v}_k|^{2}}{4\pi^{2} k^{2}} = \mathrm{Pe}^{2} \sum_{k=1}^{\infty} \frac{|\hat{v}_k|^{2}}{2\pi^{2} k^{2}} \tag{51}$$

along the shearing direction. This formula was first derived by Zeldovich [348] through a direct computation, and later by Gupta and Bhattacharya [130] through the homogenization approach put forth here. We see explicitly that diffusion is maximally enhanced along all directions $\hat{e}$ which are not transverse to the shear flow. This can be easily understood from the streamline structure, which in this case corresponds to straight lines parallel to the y-axis. In the absence of molecular diffusion, tracers would move along the streamlines at a ballistic rate (meaning that the distance travelled grows linearly in time). The addition of molecular diffusion knocks the tracer off of its original streamline and eventually onto streamlines with velocity in the opposite direction, destroying the ballistic motion and producing a diffusive transport behavior instead. Since molecular diffusion is therefore an impediment to transport in a steady shear flow with no cross sweep, the effective diffusivity grows very rapidly as Pe (which is inversely proportional to the molecular diffusivity) becomes large. Another physical interpretation for the effective diffusivity formula (51) can be found in Section 3.2.1 in the context of a steady random shear flow, for which a closely related formula applies when the statistical correlations are sufficiently short-ranged.

2.2.1.2. Steady shear flow with constant cross sweep. We now add a constant cross sweep $w(t) = \bar{w} \neq 0$ to the shear flow. In the context of porous media, this cross sweep can model a mean flow through a stratified aquifer due to gravity (where $x$ is taken as the vertical direction) [130,223]. The cell problem (49) is then altered only by the addition of the term $\mathrm{Pe}\, \bar{w}\, (\partial \chi_{x,y}(x, y)/\partial x)$ on the left-hand side of each equation. The solution method proceeds as before, yielding the effective diffusivity matrix (50) with enhanced diffusivity along the shear now given by [130]

$$\bar{K}_{yy} = \bar{K}^{\bar{w}}_{yy} = 2\,\mathrm{Pe}^{2} \sum_{k=1}^{\infty} \frac{|\hat{v}_k|^{2}}{4\pi^{2} k^{2} + \mathrm{Pe}^{2} \bar{w}^{2}}. \tag{52}$$

The cross sweep $\bar{w}$ causes $\bar{K}^{\bar{w}}_{yy}$ to remain uniformly bounded in Pe, corresponding to minimally enhanced diffusion. The reason for this drastic change from maximally enhanced diffusion is that the cross sweep blocks streamlines along the shearing direction [210]; see Fig. 1. Without molecular diffusion, the tracers would simply be swept along the x direction at a constant rate and oscillate within a bounded interval along the y direction. Molecular diffusion is thus necessary for effective transport along the shearing direction, as is generally the case for situations of minimally enhanced
Fig. 1. Streamlines for $v(x) = \sin 2\pi x$ and $w(t) = \bar{w} = 1$ (from [210]).
diffusion. The molecular diffusion along the y direction of course induces a standard tracer diffusion along the y direction, but the enhancement $\bar{K}^{\bar{w}}_{yy}$ due to the convection comes only from its interaction with the x component of the molecular diffusion. Indeed, the net tracer displacement along the shear over any time interval $1/\bar{w}$ would be exactly zero without molecular diffusion, but becomes a nonzero random number when molecular diffusion is active. Part of this randomness comes directly from the molecular diffusion along the y direction. The more interesting component of the random displacement along the shear results from the randomness induced by the molecular diffusion in the cross-shear tracer motion, which breaks up the exact periodicity of its motion along the shear. It is these extra random displacements which produce the shear-assisted diffusion enhancement $\bar{K}^{\bar{w}}_{yy}$. A similar formula for the enhanced diffusivity applies for the case of a random steady shear flow with sufficiently short-ranged correlations and a constant cross sweep; see Paragraph 3.2.5.2. Another perspective on the qualitative dependence of $\bar{K}^{\bar{w}}_{yy}$ with respect to the various parameters is provided there.
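The contrast between formulas (51) and (52) is easy to tabulate. The sketch below evaluates both for the single-mode profile of Fig. 1, $v(x) = \sin 2\pi x$, for which $\hat{v}_{\pm 1} = \mp i/2$. The function name `shear_enhancement` and the chosen Péclet numbers are illustrative assumptions; the shear is assumed real, so that the modes $k$ and $-k$ contribute equally.

```python
import numpy as np

def shear_enhancement(v_hat_pos, pe, w_bar=0.0):
    """Enhancement K_yy for a steady periodic shear with constant cross sweep w_bar.

    v_hat_pos[j] is the Fourier coefficient v_hat_k for k = j + 1. The shear is
    assumed real, so modes k and -k contribute equally (hence the factor 2).
    With w_bar = 0 this reduces to Eq. (51); otherwise it is Eq. (52).
    """
    v_hat_pos = np.asarray(v_hat_pos, dtype=complex)
    k = np.arange(1, v_hat_pos.size + 1)
    denom = 4.0 * np.pi**2 * k**2 + (pe * w_bar)**2
    return 2.0 * pe**2 * np.sum(np.abs(v_hat_pos)**2 / denom)

# v(x) = sin(2 pi x): only k = 1 is present, with v_hat_1 = -i/2.
v_hat = [-0.5j]
for pe in (1e1, 1e2, 1e3):
    no_sweep = shear_enhancement(v_hat, pe)               # grows like Pe^2
    with_sweep = shear_enhancement(v_hat, pe, w_bar=1.0)  # stays bounded in Pe
    print(pe, no_sweep, with_sweep)
```

For this profile, the no-sweep value is $\mathrm{Pe}^2/(8\pi^2)$, while the value with $\bar{w} = 1$ is $\mathrm{Pe}^2/\bigl(2(4\pi^2 + \mathrm{Pe}^2)\bigr)$, which approaches $1/2$ from below as Pe grows.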
We pause to remark that Eq. (52) provides a concrete example of the Stieltjes measure representation for the effective diffusivity. This is better seen by rewriting Eq. (52) slightly:

$$\bar{K}_{yy} = \bar{K}^{\bar{w}}_{yy} = \mathrm{Pe}^{2} \sum_{k \neq 0} \frac{|\hat{v}_k/(2\pi k)|^{2}}{1 + \mathrm{Pe}^{2} (\bar{w}/(2\pi k))^{2}}. \tag{53}$$

The eigenvalues $\pm i\mu^{n}$ and eigenfunctions $\boldsymbol{\varphi}^{n}$ of the operator $A^{V}$ (Eq. (37b)), after some rearranging, can be shown to be determined by the equations

$$\boldsymbol{\varphi}^{n} = \nabla \psi^{n}, \qquad \bar{w}\, \frac{\partial \psi^{n}(x, y)}{\partial x} + v(x)\, \frac{\partial \psi^{n}(x, y)}{\partial y} = i \mu^{n} \Delta \psi^{n}(x, y),$$

where $\psi^{n} \in H^{1}(\mathbb{T}^{2})$. It is readily seen that a subset of eigenvalues and eigenfunctions is given by

$$\mu^{n_k} = \bar{w}/2\pi k, \qquad \psi^{n_k}(x, y) = e^{-2\pi i k x}.$$

The squares of these eigenvalues appear in the denominator of Eq. (53), and the square modulus of the projection of $A\hat{y} = \nabla \Delta^{-1} v(x)$ against these eigenfunctions appears in the numerator. By factoring out $\|A\hat{y}\|^{2}$, we obtain exactly the Stieltjes measure formula (39) for the enhanced diffusivity along the shear direction $\hat{y}$. Of course, we have not found all the eigenfunctions and eigenvalues of $A^{V}$, but since $A\hat{y}$ depends only on $x$ and the eigenfunctions we have found provide a complete (Fourier) basis for such functions in the Hilbert space $L^{2}(\mathbb{T}^{d})$ with mean zero, the projection of $A\hat{y}$ against the remaining eigenfunctions of $A^{V}$ must be zero. As can be seen, it is more cumbersome in the present case to use the Stieltjes measure formula to obtain explicit formulas than it is to solve the cell problem. The Stieltjes measure formula, however, becomes extremely useful in analyzing the high Péclet number behavior of tracer transport in situations where explicit solutions are not available, as we shall show in Section 2.2.4.

2.2.1.3. Spatio-temporal periodic shear flow with no cross sweep. We now briefly consider how the tracer transport is influenced by temporal oscillations in a periodic shear flow $v = v(x, t)$, first with no cross sweep $w(t) = \bar{w} = 0$. The cell problem (49) now becomes parabolic with the addition of the term $\partial \chi_{x,y}(x, y, t)/\partial t$ on the left-hand sides, but can still be solved in terms of the space–time Fourier series of the shear velocity field:

$$v(x, t) = \sum_{(k,m) \neq (0,0)} \hat{v}_{k,m}\, e^{2\pi i (k x + m t/\tau_v)}. \tag{54}$$

Again, the effective diffusivity matrix assumes the form (50), with the enhanced diffusivity along the shear given by

$$\bar{K}_{yy} = \bar{K}^{t}_{yy} = \mathrm{Pe}^{2} \sum_{(k,m) \neq (0,0)} \frac{k^{2} |\hat{v}_{k,m}|^{2}}{4\pi^{2} k^{4} + m^{2} \tau_v^{-2}}. \tag{55}$$

This formula was first computed by direct calculation (with no appeal to homogenization theory) by Zeldovich [348].
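Formula (55) can be evaluated mode by mode. The sketch below compares a steady profile, $\sin 2\pi x$, with a standing oscillation, $\sin(2\pi x)\cos(2\pi t/\tau_v)$, whose four space–time modes each have magnitude $1/4$. The function name, the dictionary layout `(k, m) -> v_hat_{k,m}`, and the sample parameters are assumptions made purely for illustration.

```python
import numpy as np

def oscillating_shear_enhancement(modes, pe, tau_v):
    """Evaluate Eq. (55) for a dictionary mapping (k, m) to v_hat_{k,m}."""
    total = 0.0
    for (k, m), v_hat in modes.items():
        if k == 0:
            continue  # the factor k^2 in the numerator removes spatially uniform modes
        total += k**2 * abs(v_hat)**2 / (4.0 * np.pi**2 * k**4 + (m / tau_v)**2)
    return pe**2 * total

# Steady profile sin(2 pi x): modes (1, 0) and (-1, 0).
steady = {(1, 0): -0.5j, (-1, 0): 0.5j}

# Standing oscillation sin(2 pi x) cos(2 pi t / tau_v): four modes of magnitude 1/4.
standing = {(1, 1): -0.25j, (1, -1): -0.25j, (-1, 1): 0.25j, (-1, -1): 0.25j}

pe = 10.0
print(oscillating_shear_enhancement(steady, pe, tau_v=1.0))
for tau_v in (1.0, 0.1, 0.01):
    print(tau_v, oscillating_shear_enhancement(standing, pe, tau_v))
```

Shortening the period $\tau_v$ increases the $m^2\tau_v^{-2}$ term in the denominator, so the contribution of the oscillating modes shrinks, while the steady modes are unaffected by $\tau_v$.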
We note that, as for the steady shear flow, each Fourier mode of the shear contributes additively to the effective diffusivity, and therefore we may discuss them on an individual basis. The spatio-temporal Fourier modes with $m = 0$ are steady, and their presence produces maximally enhanced diffusion for the reasons presented in our discussion of the steady shear flow. The diffusivity contributed by the temporally oscillating Fourier components ($m \neq 0$) is depleted, however, particularly when the (nondimensional) temporal period $\tau_v$ is small. The inefficiency of transport by the temporally oscillating shear modes relative to the steady shear modes is related to the fact that, in the absence of molecular diffusion, the temporally oscillating Fourier shear modes induce bounded tracer oscillations rather than a unidirectional ballistic drift. Molecular diffusion is therefore necessary for effective tracer transport, and interacts with the periodically oscillating shear flow to produce enhanced diffusivity along the shear by the same type of phase-randomizing mechanism as we discussed above for the case of a constant cross sweep in a steady shear flow.

2.2.1.4. Spatio-temporal periodic shear flow with constant cross sweep. The inclusion of a constant cross sweep $w(t) = \bar{w} \neq 0$ in a spatio-temporal shear flow $v(x, t)$ is straightforward, either by direct calculation (as in [348] or in Section 3) or by homogenization. The enhanced diffusivity along the shear has the form

$$\bar{K}_{yy} = \bar{K}^{w,t}_{yy} = \mathrm{Pe}^{2} \sum_{(k,m) \neq (0,0)} \frac{k^{2} |\hat{v}_{k,m}|^{2}}{4\pi^{2} k^{4} + (\mathrm{Pe}\, \bar{w} k + m \tau_v^{-1})^{2}}.$$

The new phenomenon here is the possibility of resonances between the cross sweep and the temporal oscillations of the flow. When $\mathrm{Pe}\, \bar{w} k + m \tau_v^{-1}$ is near zero, the contribution from the temporally oscillating mode $\hat{v}_{k,m}$ (with $m \neq 0$) can be boosted far above what it would be without the mean sweep.

2.2.2. Steady periodic shear flow with periodic cross sweep

We now consider how tracer transport is influenced when a cross sweep $w(t/\tau_v)$ acting across the shear flow (see Eq. (48)) has periodic variations in time (with period $\tau_v$). We focus on the case of a steady periodic shear flow $v = v(x)$, in which the main features are all manifest. The cell problem involved in the homogenization theory can be solved exactly in a manner similar to that described in Section 2.2.1, but as the computations are a bit lengthier, we defer their presentation until after we discuss the results. The effective diffusivity matrix will in general have the form (50), with diffusivity enhanced along the shear above its bare molecular value (1) by an amount $\bar{K}_{yy}$. There is no enhanced diffusivity across the shear; the cross-shear motion is simply a sum of a drift due to the mean $\bar{w} = \langle w(t/\tau_v) \rangle$, a periodic motion due to the periodic temporal variations of $w(t/\tau_v)$, and a Brownian motion due to molecular diffusion. There is no interaction between these cross-shear advection and diffusion processes because they are all independent of spatial location. We saw in Section 2.2.1 that the enhanced diffusivity along the shear is maximal ($\lim_{\mathrm{Pe} \to \infty} \bar{K}_{yy} \sim \mathrm{Pe}^{2}$) when the cross sweep vanishes ($w(t/\tau_v) \equiv 0$) and the flow streamlines are unbounded along the y direction, and that the enhanced diffusivity is minimal ($\lim_{\mathrm{Pe} \to \infty} \bar{K}_{yy} \sim \mathrm{Pe}^{0}$) when a nonzero, constant cross sweep $w(t/\tau_v) = \bar{w} \neq 0$ is active which blocks streamlines along the y direction.
In our current study, where $w(t/\tau_v)$ is periodic, we can expect the shear-enhanced diffusivity to behave in a way intermediate to the $w \equiv 0$ and $w \equiv \bar{w} \neq 0$ cases. If we take instantaneous snapshots of the streamlines of the flow, we will see that they are almost always blocked along the y direction. But as $w(t/\tau_v)$ oscillates, there may be times when it passes through zero. Near these times, the amplitude of the streamline oscillations in the y direction will grow unboundedly until the instant $t_{*}$ at which $w(t_{*}/\tau_v) = 0$, when the streamlines will form parallel lines along the y direction. Thus, the flow will sometimes act on the tracer like a shear flow blocked by a cross sweep (where enhanced diffusivity is bounded in Péclet number), and other times like a shear flow without cross sweep (where enhanced diffusivity grows quadratically with Péclet number).

The rigorous homogenization theory bears out this intuition. Let us assume that $w(t/\tau_v)$ vanishes at most to finite order at a finite number of times in each period. Then the enhancement of the diffusivity along the shear flow has the high Péclet number asymptotics

$$\lim_{\mathrm{Pe} \to \infty} \bar{K}_{yy} \sim C_{\bar{K}}(\mathrm{Pe}, \tau_v)\, \mathrm{Pe}^{2N/(N+1)}, \tag{56}$$

where $N$ is the maximal order among the zeros of $w(t)$, and $C_{\bar{K}}(\mathrm{Pe}, \tau_v)$ is a positive function bounded strictly away from zero and infinity when the temporal oscillation period $\tau_v$ is held fixed. The order $n$ of a zero $t_{*}$ of $w(t/\tau_v)$ is defined to be the unique positive integer $n$ for which

$$w(t_{*}/\tau_v) = w'(t_{*}/\tau_v) = \cdots = w^{(n-1)}(t_{*}/\tau_v) = 0, \qquad w^{(n)}(t_{*}/\tau_v) \neq 0,$$

where $w^{(j)}$ denotes the $j$th derivative of $w$. If the cross sweep never vanishes, then one should take $N = 0$.

We see that indeed the enhanced diffusivity is some sort of compromise between the $\mathrm{Pe}^{2}$ scaling associated to the zero cross sweep case and the $\mathrm{Pe}^{0}$ scaling associated to a constant cross sweep. Note that the asymptotic scaling exponent $\alpha = 2N/(N+1)$ of $\bar{K}_{yy}$ as a function of Péclet number increases from 0 toward 2 as the maximal order of vanishing, $N$, increases. This can be understood by noting that a function with a high-order zero at $t_{*}$ is flatter and "stays closer to zero" in the vicinity of $t_{*}$ than a function with a simple zero would. Thus, a higher-order zero of $w(t)$ should permit the shear flow to contribute more strongly toward tracer diffusion because the cross sweep is more nearly vanishing over a broader time interval. We can formally think of the case of an identically vanishing cross sweep $w(t/\tau_v) = 0$ as an $N \to \infty$ limit.

Note that the scaling exponent is set only by the behavior of $w(t)$ near times (if any) at which it vanishes. This is in accordance with the intuition that, at high Péclet number, diffusivity along a shear is much more effective when there is no cross sweep, so most of the transport will occur in narrow time intervals about those moments at which $w(t/\tau_v)$ vanishes and the streamline blocking is released. These time intervals of efficient diffusion are responsible for the main contribution to the effective diffusivity $\bar{K}_{yy}$. We remark that this relation relies crucially on the fact that the occasions of rapid transport occur periodically without fail. There is an important distinction here from the case of an isotropic material with random, inhomogeneous diffusivity in which the regions of low diffusivity percolate. In this case, the passive scalar entity will get "trapped" for long times by the regions of slow motion, and the effective long-time diffusivity is likely to feel a strong effect from the presence of regions of ineffective transport. There is no such feedback in the present situation.
2.2.2.1. Computation of effective diffusivity. We now indicate how homogenization theory produces the results described above. The parabolic cell problem reads

$$\frac{\partial \chi_x(x, y, t)}{\partial t} + \mathrm{Pe}\, w(t/\tau_v) \frac{\partial \chi_x(x, y, t)}{\partial x} + \mathrm{Pe}\, v(x) \frac{\partial \chi_x(x, y, t)}{\partial y} - \Delta \chi_x(x, y, t) = -\mathrm{Pe}\, w_f(t/\tau_v), \tag{57a}$$

$$\frac{\partial \chi_y(x, y, t)}{\partial t} + \mathrm{Pe}\, w(t/\tau_v) \frac{\partial \chi_y(x, y, t)}{\partial x} + \mathrm{Pe}\, v(x) \frac{\partial \chi_y(x, y, t)}{\partial y} - \Delta \chi_y(x, y, t) = -\mathrm{Pe}\, v(x), \tag{57b}$$

where $w_f(t/\tau_v) = w(t/\tau_v) - \bar{w}$ is the periodically fluctuating part of the cross sweep. Since the right-hand side of Eq. (57a) is purely time-dependent, we can readily find a mean zero periodic solution for $\chi_x$:

$$\chi_x = \chi_x(t) = -\mathrm{Pe} \int_{0}^{t} w_f(s/\tau_v)\, ds. \tag{58}$$

The solution of Eq. (57b) requires a little more work, but is still straightforward. As the inhomogeneity depends only on $x$ and $t$, we are led to seek a solution $\chi_y = \chi_y(x, t)$. Taking a partial Fourier transform with respect to $x$,
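$$\chi_y(x, t) = \sum_{k=-\infty}^{\infty} \hat{\chi}_{y,k}(t)\, e^{2\pi i k x}, \qquad v(x, t) = \sum_{k=-\infty}^{\infty} \hat{v}_k(t)\, e^{2\pi i k x},$$

we reduce this part of the cell problem to a system of decoupled ODEs:

$$\frac{d \hat{\chi}_{y,k}(t)}{dt} + 2\pi i\, \mathrm{Pe}\, k\, w(t/\tau_v)\, \hat{\chi}_{y,k}(t) + 4\pi^{2} k^{2} \hat{\chi}_{y,k}(t) = -\mathrm{Pe}\, \hat{v}_k(t).$$

These linear ODEs may be solved directly to produce the following time-periodic solutions:

$$\hat{\chi}_{y,k}(t) = -\mathrm{Pe} \int_{-\infty}^{t} \hat{v}_k(t')\, e^{-2\pi i\, \mathrm{Pe}\, k \int_{t'}^{t} w(s/\tau_v)\, ds}\, e^{-4\pi^{2} k^{2} (t - t')}\, dt' \quad \text{for } k \neq 0, \tag{59a}$$

$$\hat{\chi}_{y,0}(t) = -\mathrm{Pe} \int_{0}^{t} \hat{v}_0(t')\, dt'. \tag{59b}$$

We used the usual trick that the periodic solution of a periodically forced, dissipative equation may be represented by using Duhamel's formula with the initial data formally taken at $t = -\infty$ (so that all nonperiodic transients have died out by any finite time). Substituting the solutions (58) and (59) of the cell problem into formula (15) for effective diffusivity, we find an enhancement only for the component $K^{*}_{yy} = 1 + \bar{K}_{yy}$, which may be expressed as

$$\bar{K}_{yy} = \bar{K}^{w}_{yy} = 8\pi^{2} \sum_{k=1}^{\infty} k^{2}\, \tau_v^{-1} \int_{0}^{\tau_v} |\hat{\chi}_{y,k}(t)|^{2}\, dt. \tag{60}$$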
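Formulas (59a) and (60) can also be evaluated by direct quadrature, which gives a rough numerical check on the constant-sweep result (52) and on the intermediate scaling for a sweep with simple zeros. The sketch below is a minimal illustration under several assumptions of our own: a single steady shear mode ($k = 1$, $\hat{v}_1 = -i/2$, i.e. $v(x) = \sin 2\pi x$), period $\tau_v = 1$, and a Duhamel integral started at $t = 0$ rather than $t = -\infty$, with the first period discarded as spin-up. The function name and the grid size are hypothetical choices.

```python
import numpy as np

def sweep_mode_enhancement(sweep_integral, pe, v_hat_k=-0.5j, k=1,
                           tau_v=1.0, n=200_001):
    """Term k of Eq. (60) for a steady shear mode and a periodic cross sweep.

    sweep_integral(t) must return the integral of w(s / tau_v) ds from 0 to t.
    chi_hat follows the Duhamel integral (59a), but started at t = 0; the
    spin-up transient decays like exp(-4 pi^2 k^2 t), so the first period is
    discarded and only the second one is averaged. (The factor
    exp(4 pi^2 k^2 t) limits this simple version to small k and tau_v.)
    """
    t = np.linspace(0.0, 2.0 * tau_v, n)       # n odd: two periods of equal length
    dt = t[1] - t[0]
    decay = 4.0 * np.pi**2 * k**2
    # Integrating factor exp(2 pi i Pe k W(t') + 4 pi^2 k^2 t').
    g = np.exp(2j * np.pi * pe * k * sweep_integral(t) + decay * t)
    # Cumulative trapezoid rule: integral of g from 0 up to each grid time.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dt)))
    chi_hat = -pe * v_hat_k * cum / g
    half = (n - 1) // 2
    y = np.abs(chi_hat[half:])**2              # second period only
    mean_sq = np.sum(0.5 * (y[1:] + y[:-1])) * dt / tau_v
    return 8.0 * np.pi**2 * k**2 * mean_sq

def constant_sweep(t):
    """Integral of w = 1 from 0 to t."""
    return 1.0 * t

def cosine_sweep(t):
    """Integral of w(s) = cos(2 pi s) from 0 to t (simple zeros, so N = 1)."""
    return np.sin(2.0 * np.pi * t) / (2.0 * np.pi)

k_const = sweep_mode_enhancement(constant_sweep, pe=50.0)
k_400 = sweep_mode_enhancement(cosine_sweep, pe=400.0)
k_1600 = sweep_mode_enhancement(cosine_sweep, pe=1600.0)
local_exponent = np.log(k_1600 / k_400) / np.log(4.0)
print(k_const, k_400, k_1600, local_exponent)
```

For the constant sweep the result can be compared with Eq. (52), which for this mode gives $\mathrm{Pe}^2/\bigl(2(4\pi^2 + \mathrm{Pe}^2\bar{w}^2)\bigr)$. For the cosine sweep, whose zeros are simple, the local exponent between the two Péclet numbers should lie near the value $2N/(N+1) = 1$ predicted by Eq. (56), although at finite Pe only approximate agreement can be expected.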
In the high Péclet number limit, the integrand in Eq. (59a) becomes a rapidly oscillating function of time. The asymptotic behavior of the integral may be rigorously evaluated through the method of stationary phase (see [250] or [312]). The dominant contribution to the integral comes from the vicinity of those points at which the phase $\mathrm{Pe} \int_{t'}^{t} w(s/\tau_v)\, ds$ has zero derivative, and these points are precisely the zeroes of $w(t)$. The contribution to $\hat{\chi}_{y,k}(t)$ from integration near an $n$th-order zero $t_{*}$ of $w(t/\tau_v)$ is given by [250]

$$-2\, \mathrm{Pe}^{n/(n+1)}\, \tau_v^{n/(n+1)}\, k^{-1/(n+1)} \left[ \frac{(n+1)!}{2\pi} \right]^{1/(n+1)} \frac{\Gamma(1/(n+1))}{n+1}\, \frac{\sigma_n\, \hat{v}_k(t_{*})}{|w^{(n)}(t_{*})|^{1/(n+1)}}\, e^{-2\pi i\, \mathrm{Pe}\, k \int_{t_{*}}^{t} w(s/\tau_v)\, ds}\, e^{-4\pi^{2} k^{2} (t - t_{*})}, \tag{61}$$

where

$$\sigma_n = \begin{cases} \exp\!\left( \dfrac{i\pi}{2(n+1)}\, \operatorname{sgn} w^{(n)}(t_{*}) \right) & \text{for } n \geq 1 \text{ and odd}, \\[2ex] \cos \dfrac{\pi}{2(n+1)} & \text{for } n \geq 2 \text{ and even}. \end{cases}$$

Note that the magnitude of this contribution is an increasing function of the order $n$ of the zero. The high Péclet number asymptotics of $\hat{\chi}_{y,k}$ are obtained by summing the contributions (61) from each zero $t_{*}$ of $w(s/\tau_v)$ on the interval $(-\infty, t]$, and the high Péclet number asymptotics (56) of $\bar{K}^{w}_{yy}$ then follow from Eq. (60).

The effective diffusivity can be computed in a similar manner when the shear velocity field has temporal fluctuations. We do not provide details; it suffices to say that the high Péclet number asymptotics can be understood by putting together the principles which we described separately above for the case of a spatio-temporal periodic shear flow with constant cross sweep and for a steady shear flow with periodic cross sweep.

2.2.3. Steady cellular and related flows

A natural periodic flow which has received much attention is the steady two-dimensional cellular flow, which is defined through a stream function $\psi(x, y)$ as follows:
$$\boldsymbol{v}(x, y) = \begin{bmatrix} \dfrac{\partial \psi(x, y)}{\partial y} \\[2ex] -\dfrac{\partial \psi(x, y)}{\partial x} \end{bmatrix}, \tag{62a}$$

$$\psi(x, y) = \sin(2\pi x) \sin(2\pi y). \tag{62b}$$

The streamlines of this flow are plotted in Fig. 2. No exact solution is known to the homogenization cell problem (16). A number of authors [260,287,294,303], starting with Childress [67], have instead endeavored to compute the effective diffusivity of the passive scalar field by various matched asymptotic expansion techniques applied directly to the advection–diffusion equation. The homogenized cell problem is not used in these works. The basic idea is that at high Péclet number, the tracer is rapidly transported across any given cell, but is then trapped to remain in that cell for a while until molecular diffusion allows it to leak across the boundary to an adjacent cell.
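The trapping described here can be seen in a minimal numerical sketch of pure advection in the cellular flow (62), with molecular diffusion switched off. The code below integrates one tracer path with a classical fourth-order Runge–Kutta scheme and records it; the step size, the number of steps, the starting point, and all function names are illustrative assumptions, and time is measured in advective units.

```python
import numpy as np

def psi_cell(x, y):
    """Stream function (62b) of the cellular flow."""
    return np.sin(2.0 * np.pi * x) * np.sin(2.0 * np.pi * y)

def velocity_cell(p):
    """Velocity (62a): v_x = d(psi)/dy, v_y = -d(psi)/dx."""
    x, y = p
    vx = 2.0 * np.pi * np.sin(2.0 * np.pi * x) * np.cos(2.0 * np.pi * y)
    vy = -2.0 * np.pi * np.cos(2.0 * np.pi * x) * np.sin(2.0 * np.pi * y)
    return np.array([vx, vy])

def advect(p0, dt, n_steps):
    """Integrate dX/dt = v(X) with RK4 (no molecular diffusion)."""
    path = np.empty((n_steps + 1, 2))
    path[0] = p0
    for i in range(n_steps):
        p = path[i]
        k1 = velocity_cell(p)
        k2 = velocity_cell(p + 0.5 * dt * k1)
        k3 = velocity_cell(p + 0.5 * dt * k2)
        k4 = velocity_cell(p + dt * k3)
        path[i + 1] = p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return path

start = np.array([0.1, 0.2])          # a point inside the cell (0, 1/2) x (0, 1/2)
path = advect(start, dt=1e-3, n_steps=2000)
psi_drift = np.max(np.abs(psi_cell(path[:, 0], path[:, 1]) - psi_cell(*start)))
```

Up to integration error, the tracer stays on its initial streamline and never leaves its cell; only the addition of molecular diffusion (not modelled in this sketch) lets it cross the separatrices.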
Fig. 2. Contour plot of stream function $\psi(x, y)$ (62) of cellular flow (from [210]).
The communication of the passive scalar field between cells is the rate-limiting process determining the effective diffusivity, and this flux is determined by the sharp gradient of the passive scalar field formed in a thin layer (with width $\sim \mathrm{Pe}^{-1/2}$) near the separatrices between the cells. The common conclusion of all the matched asymptotic expansions [67,260,287,294,303] is that, in the limit of large Péclet number, the effective diffusivity matrix is isotropic with diagonal entries scaling as $\mathrm{Pe}^{1/2}$, though there is some slight disagreement as to the precise value of the prefactor of the scaling law [260,287,294,303]. The predicted asymptotic $\mathrm{Pe}^{1/2}$ scaling of the effective diffusivity was later rigorously confirmed by Fannjiang and Papanicolaou [97,99] through the use of variational principles within the framework of homogenization theory (see Paragraph 2.1.4.2). This implies in particular that steady cellular flows are neither maximally diffusive nor minimally diffusive. The reason they are not maximally diffusive is quite apparent; there are no open channels, and molecular diffusion is crucially relevant for the tracer to hop from one cell to the next. But the
streamline barrier between the cells is not of the same blocking character as in Fig. 1 for a shear flow with a transverse cross sweep. Only a single streamline separates the two cells in a cellular flow, and the tracer must only move an infinitesimal amount across it before it can be carried a great distance by the convection of the neighboring cell. By contrast, a tracer moving infinitesimal amounts across the streamlines in Fig. 1 gets no help from the flow itself in making further headway. This helps explain why the effective diffusion in the cellular flow is more than minimally enhanced. But molecular diffusion is clearly still a facilitator, rather than a disruptor, of transport in a steady cellular flow, and this is reflected in the fact that the dimensionalized effective diffusivity is directly proportional to the square root of the molecular diffusion coefficient.

McCarty and Horsthemke [226] investigated the effective diffusivity of the passive scalar field in the cellular flow (62) at finite Péclet numbers to ascertain the range of validity of the asymptotic $\mathrm{Pe}^{1/2}$ scaling law and the corrections thereto. These authors computed the homogenized effective diffusivity through a finite mode Fourier truncation of the cell problem (16). They find that the $\mathrm{Pe}^{1/2}$ scaling is well satisfied for $\mathrm{Pe} \gtrsim 500$, with a leading order correction of magnitude $O(\mathrm{Pe}^{\ldots})$, in accordance with an earlier theoretical estimate [287]. Similar conclusions were reached by Biferale et al. [40] through other numerical approximations for the homogenized diffusivity; we discuss these further in Section 2.2.5 below. Tracer transport in the steady cellular flow (62) at finite Péclet number was numerically studied in a quite different way by Rosenbluth et al. [287] and Biferale et al. [40] using direct Monte Carlo simulations; see Paragraph 2.3.2.1 below.
Solomon and Gollub [302] experimentally measured the transport rates of dyes and latex spheres in a laboratory Rayleigh–Bénard steady convection cell with a flow pattern similar to Eqs. (62a) and (62b). They found effective diffusivities in general accord with the high Péclet number theoretical predictions for $10^{\ldots} \lesssim \mathrm{Pe} \lesssim 10^{\ldots}$.

In addition to the studies of the steady cellular flow (62), there are also a number of theoretical and numerical investigations of various periodic modifications of it which elucidate other types of transport mechanisms. We discuss briefly three such lines of inquiry.

2.2.3.1. Childress–Soward flows. A useful one-parameter family of steady two-dimensional flows which interpolate between a shear flow and a cellular flow was introduced and analyzed by Childress and Soward [68]. The stream function $\psi^{\varepsilon}_{CS}(x, y)$ of this "Childress–Soward" family of flows is defined as

$$\psi^{\varepsilon}_{CS}(x, y) = \sin(2\pi x) \sin(2\pi y) + \varepsilon \cos(2\pi x) \cos(2\pi y), \tag{63}$$
with the parameter $\varepsilon$ chosen from the range $0 \leq \varepsilon \leq 1$. For $\varepsilon = 0$, we recover the steady cellular flow (62), while for $\varepsilon = 1$, the Childress–Soward flow is a shear flow directed parallel to the vector $(1,1)^{t}$. For $0 < \varepsilon < 1$, the flow takes the form of a mixture of "cat's eye" vortices and open channels parallel to $(1,1)^{t}$ on the whole (Fig. 3). As $\varepsilon$ increases, the width of the channels increases at the expense of the cat's eyes. Inspection of the streamline structure of the Childress–Soward flows with $0 < \varepsilon < 1$ suggests that tracer motion should be easy along open channels in the $(1,1)^{t}$ direction, but must work against streamline blocking in the $(1,-1)^{t}$ direction. Based upon our discussion in Paragraph 2.1.4.1, we may therefore expect that diffusion is maximally enhanced along any direction with a nonzero component along the open channels, but only minimally enhanced in the
Fig. 3. Contour plot of stream function $\psi^{\varepsilon}_{CS}(x, y)$ of Childress–Soward flow (63) with $\varepsilon = 0.5$ (from [231]).
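orthogonal, blocked direction. We may therefore hypothesize for $0 < \varepsilon < 1$ the high Péclet number scalings

$$\begin{aligned} \lim_{\mathrm{Pe} \to \infty} \hat{e} \cdot \bar{K} \cdot \hat{e} &\sim O(\mathrm{Pe}^{2}) \quad \text{for } \hat{e} \neq \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\ \lim_{\mathrm{Pe} \to \infty} \hat{e} \cdot \bar{K} \cdot \hat{e} &\sim O(1) \quad \text{for } \hat{e} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \end{aligned} \tag{64}$$

An elaborate boundary layer analysis by Childress and Soward [68] produces the same predictions with specific values for the scaling prefactors, and Fannjiang and Papanicolaou [97] recover the general scaling relations (64) through variational methods.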
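The structure of the Childress–Soward family (63) can be checked with a few lines of code. The sketch below evaluates the stream function and the associated velocity, and verifies two elementary facts at randomly chosen sample points: that $\varepsilon = 1$ gives a shear flow along $(1,1)^{t}$, and that, by the product-to-sum identities, the stream function splits into one shear component varying with $x - y$ of amplitude $(1+\varepsilon)/2$ and one varying with $x + y$ of amplitude $(1-\varepsilon)/2$. The function names and sample points are illustrative assumptions.

```python
import numpy as np

def psi_cs(x, y, eps):
    """Childress-Soward stream function, Eq. (63)."""
    return (np.sin(2.0 * np.pi * x) * np.sin(2.0 * np.pi * y)
            + eps * np.cos(2.0 * np.pi * x) * np.cos(2.0 * np.pi * y))

def velocity_cs(x, y, eps):
    """Velocity v_x = d(psi)/dy, v_y = -d(psi)/dx for the stream function (63)."""
    sx, cx = np.sin(2.0 * np.pi * x), np.cos(2.0 * np.pi * x)
    sy, cy = np.sin(2.0 * np.pi * y), np.cos(2.0 * np.pi * y)
    vx = 2.0 * np.pi * (sx * cy - eps * cx * sy)
    vy = -2.0 * np.pi * (cx * sy - eps * sx * cy)
    return vx, vy

rng = np.random.default_rng(0)
x, y = rng.random(100), rng.random(100)

# eps = 1: the two velocity components coincide, i.e. a shear flow along (1, 1).
vx1, vy1 = velocity_cs(x, y, 1.0)

# General eps: decomposition into two crossed shear components.
eps = 0.5
decomposed = (0.5 * (1.0 + eps) * np.cos(2.0 * np.pi * (x - y))
              - 0.5 * (1.0 - eps) * np.cos(2.0 * np.pi * (x + y)))
```

The decomposition makes the interpolation explicit: at $\varepsilon = 0$ the two shear components have equal amplitude and produce closed cells, while at $\varepsilon = 1$ only the component flowing along $(1,1)^{t}$ survives.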
Fig. 4. Log–log plot of enhanced diffusion coefficient $\bar{K}_{xx}$ versus Pe for Childress–Soward flows (63) (from [210]). Upper curve: $\varepsilon = 0.9$, middle curve: $\varepsilon = 0.5$, lower curve: $\varepsilon = 0$. Also shown is a line with slope 2 for reference.
Numerical computations [210] of the homogenized effective diffusivity matrix confirm the maximally enhanced diffusion along the x direction when $0 < \varepsilon < 1$. The effective diffusivities for Childress–Soward flows with $\varepsilon = 0$, 0.2, and 0.5 are plotted in Fig. 4; the contrast of the maximally enhanced diffusion ($\bar{K}_{xx} \sim O(\mathrm{Pe}^{2})$) for $\varepsilon \neq 0$ with the $\bar{K}_{xx} \sim O(\mathrm{Pe}^{1/2})$ scaling for the cellular flow $\varepsilon = 0$ is evident. Crisanti et al. [77] investigated a family of flows with the same essential features as the Childress–Soward flows (63), and predicted the same kind of high Péclet number behavior of the effective diffusivity through intuitive scaling arguments. These predictions were supported by Monte Carlo numerical simulations [77]. We will use the Childress–Soward flows in Section 2.2.4 to illustrate some of the dramatic effects which the addition of a constant mean sweep to a periodic velocity field can cause.

2.2.3.2. Checkerboard flows. Fannjiang and Papanicolaou [97] considered a "checkerboard flow" variation of the steady cellular flow (62) in which the flow is active only in every other cell, so that transport is dominated by passage through the corners of diagonally adjacent cells. Through the use of variational principles, they rigorously show that the effective diffusivity can have a high Péclet number asymptotic form $\lim_{\mathrm{Pe} \to \infty} \|\bar{K}\| \sim O(\mathrm{Pe}^{\alpha})$, with any $\tfrac{1}{2} \leq \alpha \leq 1$, by further modifying the checkerboard flow so that the corners are suitably widened.

2.2.3.3. Temporally fluctuating cellular flows. A few interesting studies of the effective diffusivity have been undertaken concerning cellular flows with periodic temporal fluctuations. Knobloch and Merryfield [165] numerically investigated the effective diffusion of a passive scalar field in a standing wave obtained by multiplying the cellular flow (62) by a periodic time factor $\cos \omega t$. The
effective diffusivity was found to decrease as a function of the temporal frequency $\omega$. This phenomenon was also manifested in the analytical formula (55) for the effective diffusivity in a shear flow with periodic temporal variations, and is due to the diminishing persistence of tracer motion as the temporal oscillations become more rapid. Knobloch and Merryfield [165] also examined the effective diffusivity in a travelling wave with stream function

$$\psi_{TW}(x, y, t) = \sin(2\pi (x - U t)) \sin(2\pi y),$$

where $U$ is the phase speed. Note that the velocity field averaged over a temporal period still vanishes everywhere. Through some numerical experiments, it was found in [165] that tracer transport is faster in a travelling wave than in a standing wave. This was attributed to the action of the Stokes drift of the travelling wave, and to the trapping and dragging of particles within the cores of the moving cells. Note, however, that there is no long-term net drift of the tracer along any direction [165].

Another interesting question is how tracer transport is modified when a periodic time-dependent perturbation is added to the steady cellular flow (62). Solomon and Gollub [301] measured how the effective diffusivity of the dyes and latex spheres in their laboratory convection cells changes when the Rayleigh number is increased to a point at which the steady cellular flow becomes unstable via a Hopf bifurcation to a temporally periodic oscillation. They found the effective diffusivity to be enhanced by several orders of magnitude over that for a steady cellular flow, and attributed this increase to the chaotic transport of tracers across the cellular boundaries due to the temporal fluctuations. Such a conclusion was supported by various numerical computations of the homogenized diffusivity in a simple model by Biferale et al. [40]. They found that the (dimensionalized) effective diffusivity is independent of molecular diffusion at large Péclet number, consistent with the notion that chaotic advection is the dominant transport mechanism.

2.2.4. Effects of constant mean sweep on transport in steady periodic flow

We next consider the dramatic and subtle effects which a constant mean flow can have on the effective diffusivity of a periodic flow, with particular attention to the special class of Childress–Soward flows (63) discussed in Paragraph 2.2.3.1. Our discussion draws primarily from the results and ideas in the original study by McLaughlin and the first author [210]. We first formulate some general mathematically rigorous criteria to decide when the addition of a mean flow to a two-dimensional, steady, periodic flow will give rise to maximal or minimal diffusivity. Roughly speaking for the moment, a mean flow $\boldsymbol{V}$ with rationally related components will "generically" give rise to maximally enhanced diffusion along all directions other than those perpendicular to $\boldsymbol{V}$, whereas a mean flow with components forming an irrational ratio will create minimally enhanced diffusion along all directions $\hat{e}$ if there are no stagnation points in the flow [210]. Through numerical evaluation of the effective diffusivity, we next demonstrate that this sensitive dependence of the large Péclet number asymptotics of the effective diffusivity on the ratio of the components of the mean flow and other features manifests itself clearly at accessibly large but finite Péclet numbers. The numerical experiments moreover reveal some striking crossover phenomena for the behavior of the effective diffusivity as a function of Pe which reflect the competing influences of various flow qualities suggested by the mathematical asymptotic theory [210]. All these phenomena indicate subtle, complex, and mathematically rigorous behavior for eddy diffusivity modelling in such flow fields.
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237-574
2.2.4.1. Conditions for maximal or minimal enhanced diffusion in presence of constant mean sweep. In Paragraph 2.1.4.1, we stated general conditions for maximally enhanced and minimally enhanced diffusion which can be deduced from the Stieltjes integral representation of the effective diffusivity [39]. In [210], some rigorous corollaries of these conditions were derived which provide some further general insights into how the large Péclet number behavior of the effective diffusivity of a two-dimensional, steady, periodic flow

$$\mathbf{v}(\mathbf{x}) = \mathbf{v}(x, y) = \begin{pmatrix} v_x(x, y) \\ v_y(x, y) \end{pmatrix}$$

is affected by the presence of a constant mean flow (sweep)

$$\mathbf{V} = \begin{pmatrix} V_x \\ V_y \end{pmatrix}.$$

These results were later rederived in slightly different form in [97], using variational principles mentioned in Paragraph 2.1.4.2. We consider separately the cases where the ratio of the components of the mean sweep $V_x/V_y$ (or its inverse) is rational and irrational.

Effect of a mean sweep with rationally related coefficients. If the constant mean flow has rationally related components, then the effective diffusivity is maximally enhanced in all directions $\hat{\mathbf{e}}$ with $\mathbf{V}\cdot\hat{\mathbf{e}} \neq 0$, provided that there exists a real number $\lambda$ and a positive integer $p$ so that:

• $\lambda V_x$ and $\lambda V_y$ are each integers, and

• $$\int_{[0,1]^2} e^{-2\pi i \lambda (V_y x - V_x y)}\, \psi^p(x, y)\, dx\, dy \neq 0, \tag{65}$$

where $\psi(x, y)$ is the stream function corresponding to the velocity field, i.e. $v_x = \partial\psi/\partial y$ and $v_y = -\partial\psi/\partial x$ [210].

The effective diffusion is always minimally enhanced in the direction $\hat{\mathbf{e}}$ perpendicular to the mean flow $\mathbf{V}$ when it has rationally related components [97]. The technical condition (65) is clearly satisfied by "most" flows, with the significant exception of shear flows aligned perpendicularly to $\mathbf{V}$. In the generic case, therefore, a mean sweep with rationally related coefficients will induce maximally enhanced diffusion in all directions other than the one perpendicular to itself, along which diffusion will instead be minimally enhanced. Shear flows directed perpendicularly to the mean sweep fail to adhere to this paradigm quite simply because they have no velocity component parallel to the mean sweep, and therefore cannot induce any additional diffusion in this direction. We will return below to discuss crossover effects in the effective diffusivity produced by flows which are small perturbations of shear flows.
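As an illustration of how condition (65) can be checked in practice, the following is a minimal sketch in Python, not taken from [210]. It assumes that the Childress-Soward stream function (63) has the form $\psi_\varepsilon(x, y) = \sin(2\pi x)\sin(2\pi y) + \varepsilon\cos(2\pi x)\cos(2\pi y)$ on the unit period cell; the function names, the choices $\lambda = 1/2$ and $p = 1$, and the grid size are illustrative assumptions. Under that assumed form, a short calculation gives the value $(\varepsilon - 1)/4$ for the integral with $\mathbf{V} = (-2, 2)$, which is nonzero unless $\varepsilon = 1$.

```python
import cmath
import math


def stream_function(x, y, eps):
    """Assumed form of the Childress-Soward stream function (63)."""
    return (math.sin(2 * math.pi * x) * math.sin(2 * math.pi * y)
            + eps * math.cos(2 * math.pi * x) * math.cos(2 * math.pi * y))


def condition_65_integral(V, lam, p, eps, n=64):
    """Approximate the integral in (65) over the unit period cell.

    Uses the rectangle rule on an n-by-n periodic grid, which is exact
    (up to rounding) for trigonometric polynomials of low degree.
    V = (Vx, Vy) is the mean sweep; lam and p are the real number and
    positive integer appearing in the condition.
    """
    Vx, Vy = V
    total = 0j
    for i in range(n):
        x = i / n
        for j in range(n):
            y = j / n
            phase = cmath.exp(-2j * math.pi * lam * (Vy * x - Vx * y))
            total += phase * stream_function(x, y, eps) ** p
    return total / n ** 2


if __name__ == "__main__":
    # lam = 1/2 makes lam*Vx and lam*Vy integers for both sweeps below.
    for eps in (0.0, 0.9, 1.0):
        transverse = condition_65_integral((-2.0, 2.0), 0.5, 1, eps)
        parallel = condition_65_integral((2.0, 2.0), 0.5, 1, eps)
        print(eps, abs(transverse), abs(parallel))
```

In this sketch the integral for the transverse sweep vanishes only at $\varepsilon = 1$, the shear flow perpendicular to the sweep, which matches the exception noted above; a vanishing value for one choice of $\lambda$ and $p$ does not by itself show that the condition fails for all choices.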
It is shown in [210] that the condition (65) is satisfied for the family of Childress-Soward flows (63) described in Paragraph 2.2.3.1, provided that $0 \le \varepsilon < 1$. Recall that for $\varepsilon = 1$, the Childress-Soward flow degenerates to a shear flow; the effect of the addition of a constant mean sweep for this case was discussed in Paragraph 2.2.1.2. We now discuss the effects of the addition of a mean sweep with rationally related components to Childress-Soward flows in the parameter range $0 \le \varepsilon < 1$, for which the theorem stated above can be applied.

The special value $\varepsilon = 0$ corresponds to a cellular flow which gave rise to an effective diffusivity scaling at large Péclet number as $\mathrm{Pe}^{1/2}$ in all directions, when no mean flow is present. The addition of a mean sweep with rationally related coefficients, however, dramatically changes the large Péclet number asymptotics: the effective diffusivity is now maximal, scaling as $\mathrm{Pe}^2$, in all directions $\hat{\mathbf{e}}$ other than the one perpendicular to $\mathbf{V}$, along which it is bounded in Pe. The intuitive reason for this change is that the mean flow has opened up channels which facilitate transport parallel to itself, but which block transport in the perpendicular direction [304]. The rational relation between the components of the mean sweep $\mathbf{V}$ is, however, critical to this conclusion, as we shall explain below when we consider mean sweeps $\mathbf{V}$ with irrationally related components.

For $0 < \varepsilon < 1$, the Childress-Soward flow without a mean sweep has a mixture of open channels and cat's-eye trapping regions, and asymptotic analysis indicates maximally enhanced diffusion along all directions except orthogonally to the channel direction, along which the diffusion is minimally enhanced. The addition of a mean sweep maintains the generally maximally diffusive character of the flow, but shifts the direction of minimal diffusivity to be orthogonal to the mean sweep $\mathbf{V}$, rather than orthogonal to the direction of the original channels (in the absence of the mean sweep).
The change in the streamline structure for various Childress-Soward flows under the addition of a mean sweep is graphically illustrated in [210].

Effect of a mean sweep with irrationally related coefficients. By contrast to the generally maximally enhanced diffusion promoted by the presence of a mean sweep with rationally related components, we have the following result for the case of a mean sweep with irrationally related components [210]:

If the ratio $V_x/V_y$ (or its inverse) is irrational, and the total flow has no stagnation points (that is, $|\mathbf{V} + \mathbf{v}(x, y)|$ vanishes nowhere), then no direction $\hat{\mathbf{e}}$ is maximally diffusive. Moreover, if the irrational ratio $V_x/V_y$ (or its inverse) can be normally approximated by rationals, then the effective diffusion is only minimally enhanced in all directions.

The proof of this statement is given in [210], and relies on Kolmogorov's theorem for dynamical systems on the torus which permits a convenient global change of coordinates ([297], Ch. 11). The number theoretic property of an irrational number being "normally approximated" by rationals is discussed in ([297], p. 95). It suffices for our present purposes merely to mention that the set of irrational numbers having this property has full Lebesgue measure, i.e. almost every irrational number has the normal approximation property. It is readily checked that the above statement can be applied to Childress-Soward flows, provided that the mean sweep with irrationally related components is strong enough to preclude stagnation points.

We therefore have for these flows (and in fact, quite generically) a very sensitive dependence of the effective diffusivity on the ratio of the components of the mean sweep. A rational
ratio implies maximally enhanced diffusion in almost all directions, whereas an irrational ratio usually implies minimally enhanced diffusion in all directions (provided there are no stagnation points). In particular, while the effective diffusivity for the cellular flow described in Section 2.2.3 scales at high Péclet number as $\mathrm{Pe}^{1/2}$, the addition of a mean flow with rationally related coefficients will further enhance diffusion with $\mathrm{Pe}^2$ scaling in almost all directions, whereas the addition of a sufficiently strong mean flow with irrationally related coefficients will usually interfere with the enhanced transport mechanism of the cellular flow, and limit the effective diffusivity to a finite constant no matter how large Pe becomes. Similar results were obtained by Koch et al. [168] and Mauri [224] in asymptotic computations for the special case of a flow drawn by a large-scale pressure gradient through a periodic array of small spheres, and by Soward and Childress [304] for the case of a weak mean sweep past a steady cellular flow (62). Golden, Goldstein, and Lebowitz [125] found an analogous phenomenon in the diffusion of a particle through an oscillatory potential with two characteristic wavelengths; the effective diffusivity takes different values depending on whether the ratio of the wavelengths is rational or irrational.

An intuitive reason why a mean sweep with rational ratios generally produces much more effective transport than mean sweeps with irrational ratios is that the former can set up resonant open channels of finite width extending forever periodically in a given direction. The addition of a mean flow with irrationally related components, by contrast, gives rise to an aperiodic total stream function with fine structures which will not support these clean open channels [304].
This explains, in a heuristic way, why one should not generally expect maximally enhanced diffusion when the mean sweep has irrationally related components, but falls short of explaining why the diffusion should be so inhibited as to be only minimally enhanced. Some mechanism other than streamline blocking must be playing an active role in this regard. It may possibly be related to a dense sampling of the period cell by every streamline which leads to a rapid averaging over the mean zero velocity field [168]. It would be interesting to concretely identify and clarify the relevant mechanism producing minimally enhanced diffusion in this situation.

2.2.4.2. Numerical evaluations of effective diffusivity at large but finite Péclet number. The mathematical theorems presented above concerning the addition of a mean sweep to a periodic flow describe the asymptotic scaling of the effective diffusivity in the limit of large Péclet number. They neither provide a numerical value for the prefactor in the scaling law, nor do they designate how large the Péclet number must actually be for these asymptotic scalings to be observed. Such questions generally require specific computation through either numerical solution of the cell problem (33) or Monte Carlo simulations of the tracer motion (see Section 2.3.2). We describe here some numerical computations of the enhanced diffusivity for certain Childress-Soward flows with mean sweep which are suggested by the asymptotic mathematical theory to display potentially interesting "crossover" behavior at finite Péclet number. These computations, reported in [210], solve the cell problem (33) using a finite Fourier mode truncation scheme similar to the one used in [226].
Perturbed shear flow with transverse cross sweep: We recall that the rigorous mathematical theory proceeding from the Stieltjes integral representation stated that the addition of a mean flow with rationally related components to a Childress-Soward flow (63) with $0 \le \varepsilon < 1$ produces maximally enhanced diffusion in all directions with $\hat{\mathbf{e}}\cdot\mathbf{V} \neq 0$. On the other hand, for $\varepsilon = 1$, the
Childress-Soward flow reduces to a shear flow parallel to the vector $(1, 1)^{T}$, and we know from Paragraph 2.2.1.2 that the addition of a mean sweep in any other direction will produce minimally enhanced diffusion in all directions. Therefore, in particular, the addition of a mean flow parallel to $(-1, 1)^{T}$ will produce maximally enhanced diffusion for Childress-Soward flows with $0 \le \varepsilon < 1$, but only minimally enhanced diffusion for the limiting shear flow case $\varepsilon = 1$. This motivates a study of how the enhanced diffusivity varies at finite Péclet number when $\varepsilon$ is slightly below 1.

We therefore choose a Childress-Soward flow with $\varepsilon = 0.9$, which, in the absence of any mean flow, has the structure of a perturbed shear flow, with long and narrow cat's eyes interspersed between wide channels (see Fig. 5). For comparison and contrast, we present in Fig. 6 the numerical computations for the enhanced diffusivity along the $x$ direction, $\bar{K}_{xx}$, which results when the mean flow $\mathbf{V} = (2, 2)^{T}$ or $\mathbf{V} = (-2, 2)^{T}$ is added. In both situations, the asymptotic theory predicts maximally enhanced diffusion ($\lim_{\mathrm{Pe}\to\infty} \bar{K}_{xx} \sim O(\mathrm{Pe}^2)$), and this is confirmed by the numerical computations, but a substantial difference between the two cases is manifest at finite Péclet number.
Fig. 5. Contour plot of stream function $\psi_{\varepsilon}(x, y)$ of Childress-Soward flow (63) with $\varepsilon = 0.9$ (from [210]).
Fig. 6. Log-log plot of enhanced diffusion coefficient $\bar{K}_{xx}$ versus Pe for Childress-Soward flow (63) with $\varepsilon = 0.9$ (from [210]). Upper curve: mean flow $\mathbf{V} = (2, 2)^{T}$, lower curve: $\mathbf{V} = (-2, 2)^{T}$. Also shown is a line with slope 2 for reference.
In the case $\mathbf{V} = (2, 2)^{T}$, the mean flow is aligned parallel to the overall shearing direction, and has almost no effect on the enhanced diffusivity of the flow. (Compare with the effective diffusivity computed for this flow in the absence of any mean flow in Fig. 4.) The second case, in which the mean sweep is directed orthogonally to the shearing direction, exhibits instead two distinct scaling crossovers in the plot of $\ln \bar{K}_{xx}$ vs. $\ln \mathrm{Pe}$. For small Pe, the curve has slope 2, a general consequence of the Stieltjes integral representation (39) for arbitrary periodic flows [12]. As Pe increases through the range 10 to 10, the enhanced diffusivity temporarily plateaus to a constant level, then finally turns back to a quadratic rate of growth. The intermediate regime of constancy of $\bar{K}_{xx}$ is clearly attributable to the presence of the mean sweep transverse to the shearing direction. In fact, one may infer that for Pe $\lesssim$ 10, the tracer diffusion behaves in the same way as if the velocity field were a superposition of a shear flow ($\varepsilon = 1$) with a transverse cross sweep. The enhanced diffusivity temporarily saturates at a finite level in accordance with the minimal enhancement of diffusion in such a flow. The reason why the perturbation to the shear flow (i.e. the difference of $\varepsilon$ from 1) may not yet be noticeable is that the finiteness of the Péclet number implies a sufficient amount of molecular diffusion, which can blur out the sensitivity of the tracer to the rather small-amplitude deviations from the shear flow. For Pe $\gtrsim$ 10, the departure of $\mathbf{v}(\mathbf{x})$ from a parallel shear flow begins to be felt, and the enhanced diffusivity correspondingly turns over to a maximally diffusive character. The association of the intermediate plateau regime 10 $\lesssim$ Pe $\lesssim$ 10 with a regime in which the shear flow plus transverse sweep dominates the effects of the $\varepsilon \neq 1$ perturbations is also borne out by a computation of the eigenvectors of the enhanced diffusivity matrix [210].
Based on this numerical evidence and the rigorous asymptotic theory, we can conjecture that as $\varepsilon \to 1$, the intermediate plateau regime in the plot of $\ln \bar{K}_{xx}$ vs. $\ln \mathrm{Pe}$ extends ever further to the right, pushing the transition to the ultimate $\bar{K}_{xx} \sim O(\mathrm{Pe}^2)$ growth stage to ever higher Péclet number.
The present study of the effective diffusivity in a perturbed shear flow with transverse cross sweep exemplifies the symbiotic interaction between mathematical theory, physical intuition, and numerical computations. The mathematical theory furnished some general asymptotic statements, which left open some details concerning the behavior of the effective diffusivity at finite Péclet number, but brought out some of the essential physical features of the flow which determine how effectively diffusion is enhanced. These considerations suggested some illustrative problems and questions to address numerically, and we thereby uncovered some interesting finite Péclet number crossover behavior which is beyond the reach of the asymptotic theory. On the other hand, the asymptotic theory plays a crucial role in checking the inferences we make from numerical simulations. If we had only computed the enhanced diffusivity up to Pe = 100, we would have missed the second transition in the curve in Fig. 6 corresponding to $\mathbf{V} = (-2, 2)^{T}$. The empirical evidence alone might misleadingly suggest that the enhanced diffusivity had reached a permanent finite limit, but the rigorous asymptotic theory would inform us that this could not possibly be the case, and that we would find another transition if we pushed our computations to higher Péclet number.

Contrast between mean sweeps with rational and irrational ratio of components: One particularly intriguing conclusion from the rigorous asymptotic theory is that the effective diffusivity can display diametrically opposite behavior, i.e. maximally or minimally enhanced diffusion in most directions, depending on whether a mean flow with rationally or irrationally related components is superposed.
This is, taken at face value, a statement applicable only at enormously large Péclet number, since it is clear that the effective diffusivity at any finite but large Péclet number cannot be much affected by an infinitesimal shift of the mean flow between rational and irrational ratios. Nonetheless, we understood above that this asymptotic statement had some apparent physical content, in that mean sweeps with rationally related components can be expected to open more efficient channels of rapid transport than mean flows with irrationally related components. Clearly, the practical issue suggested by both the asymptotic theory and this physical intuition is whether there can be a strong difference at finite Péclet number between the effective diffusivities when the ratio of the components of a mean flow pointing in a general direction is well approximated or is not well approximated by a low-order rational number (one with small integers in the numerator and denominator) [304].

This question was strikingly answered in the affirmative by [210] using a Childress-Soward flow (63) with $\varepsilon = 0.9$, and comparing the effects of two different mean sweeps, both generally transverse to the overall shearing direction. The first mean sweep is defined as $\mathbf{V} = (7.1, -7.1)^{T}$, which clearly has a low-order rational ratio ($-1$) of components. The tracer transport in this case is guaranteed by the rigorous asymptotic theory to be maximally diffusive at large Péclet numbers, and the numerical simulation results presented as the upper curve in Fig. 7 confirm this behavior. The enhanced diffusivity in this case exhibits a crossover at intermediate Péclet numbers 1 $\lesssim$ Pe $\lesssim$ 10 in just the same way as the same Childress-Soward flow with rationally related mean sweep $\mathbf{V} = (-2, 2)^{T}$ which we discussed following Fig. 6. The reason for this crossover is that the perturbations to the shear flow play a subdominant role to the gross shear plus cross sweep structure until the Péclet number is sufficiently large.
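The notion of a ratio being "well approximated by a low-order rational" can be made concrete with a short sketch in Python. It is an illustration only, not part of the computations in [210]: it takes the component magnitudes 7.1 and 7.1 of the first sweep, together with a slightly perturbed first component 7.14142 (the value used in the comparison that follows), and asks how closely the ratio can be matched by a fraction with a bounded denominator. The helper name `best_low_order_rational` and the denominator bounds are assumptions made for this example; signs are dropped, since only the magnitude of the ratio matters here.

```python
from fractions import Fraction


def best_low_order_rational(ratio, max_denominator):
    """Return (approximation, absolute error) for the closest fraction
    to `ratio` whose denominator does not exceed `max_denominator`."""
    approx = ratio.limit_denominator(max_denominator)
    return approx, abs(ratio - approx)


# Magnitudes of the component ratios, kept exact by parsing decimal strings.
rational_ratio = Fraction("7.1") / Fraction("7.1")       # exactly 1
perturbed_ratio = Fraction("7.14142") / Fraction("7.1")  # close to, but not, 1

if __name__ == "__main__":
    print("exact perturbed ratio:", perturbed_ratio)
    for bound in (1, 10, 100, 1000):
        approx, err = best_low_order_rational(perturbed_ratio, bound)
        print(bound, approx, float(err))
```

The first ratio is the low-order rational 1 itself, whereas the perturbed ratio stays a finite distance from every fraction with a small denominator; how that distance compares with the smoothing effect of molecular diffusion at a given Péclet number is what the numerical experiments described here probe.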
The enhanced diffusivity was next computed for a mean sweep $\mathbf{V} = (7.14142, -7.1)^{T}$, which has the same general direction as the first mean sweep, but is clearly not well approximated by any low-order rational. We therefore expect the enhanced diffusivity to behave for accessibly large
Fig. 7. Log-log plot of enhanced diffusion coefficient $\bar{K}_{xx}$ versus Pe for Childress-Soward flow (63) with $\varepsilon = 0.9$ (from [210]). Upper curve: mean flow $\mathbf{V} = (7.1, -7.1)^{T}$, lower curve: $\mathbf{V} = (7.14142, -7.1)^{T}$. Also shown is a line with slope 2 for reference.
values of Pe as if the mean flow had irrationally related components. Since the flow is readily shown to have no stagnation points, the mathematical theory predicts that the diffusivity should be only minimally enhanced for $\mathbf{V} = (7.14142, -7.1)^{T}$, in stark contrast to the maximally enhanced diffusion for $\mathbf{V} = (7.1, -7.1)^{T}$. This distinction is clearly manifested by the numerical computations in Fig. 7 at finite but large values of the Péclet number. The enhanced diffusivity behaves identically for the two mean flows up to Pe $\sim$ 10, at which point the flow with irrationally related mean sweep exhibits a second crossover to a minimally diffusive regime in which the enhanced diffusivity remains constant at least up to Pe $\sim$ 10. This crossover may clearly be interpreted as a fairly sudden onset of sensitivity of the tracer transport to the fine structure of the flow created by the departure of the mean sweep $\mathbf{V} = (7.14142, -7.1)^{T}$ from a low-order rational. For smaller values of Péclet number, the molecular diffusion is sufficiently strong to coarse-grain the discrepancy between the two mean sweeps, but at larger values of Péclet number, the differences in the streamline structure are acutely felt by the tracer [304]. Similar crossover behavior is also exhibited in approximate analytical formulas for the enhanced diffusivity in flow drawn past a cubic array of small spheres [168].

Role of stagnation points: Recall that the theorem guaranteeing that mean flows with irrationally related components produce minimally enhanced diffusion required the absence of stagnation points of the total flow. This condition was needed in the proof of [210] to employ a theorem of Kolmogorov to effect a helpful change of coordinates. To examine whether this condition might be truly necessary or is just an artifact of the method of proof, those authors conducted some numerical studies of the effective diffusivity of a tracer in a flow with an irrationally related mean
sweep and stagnation points. They found that the presence of stagnation points did affect the large Péclet number behavior of the tracer transport so that it no longer had a minimally diffusive character [210]. Some discussion on the role of stagnation points in tracer transport also may be found in [97].

General implications: The examples presented above alert us to be cautious in inferring asymptotic scaling behavior of the effective diffusivity from computations at finite Péclet number because multiple crossovers can and do occur. They also explicitly demonstrate how the effective diffusivity can truly be exquisitely sensitive to fine details of the flow, particularly at high Péclet number. For instance, the two mean sweeps just considered differ by less than one percent, yet the enhanced diffusivity which results differs by orders of magnitude for moderately large values of Pe (see Fig. 7). We will see in Paragraph 2.3.2.1 that these differences are moreover manifested on practical, finite time scales [231]. These examples raise important concerns for the modelling of the effective diffusivity of a flow by nonrigorous, approximate, or ad hoc arguments, and provide simple, natural, and instructive test problems which can help determine whether these approximate theories are rich enough to capture the substantial variations in effective diffusivity produced by subtle changes in the flow.

2.2.5. Other spatio-temporal periodic flows

We have focused our above discussion of homogenized diffusivity on flows of shear and cellular type, but quite general periodic flows can be studied through a concerted use of numerical methods with the mathematical tools described in Paragraph 2.1.4.1. Such an approach is exemplified in the work of Biferale et al. [40], in which they compute the effective diffusivity of tracers in various interesting kinds of flows through three different numerical approaches.
One is the numerical solution of the cell problem (14) through a conjugate gradient algorithm (rather than through the finite mode Fourier truncation method adopted in [210,226]). The second is through the construction of suitable Padé approximants from the numerical computation of a finite number of terms of a low Péclet number expansion of the effective diffusivity. As described in [9,12] and Paragraph 2.1.4.1, these Padé approximants can be used to bound the effective diffusivity rigorously and tightly over finite ranges of Péclet number, provided the coefficients of the low Péclet number expansion are computed with sufficient precision [40]. Finally, the effective diffusivity is computed through direct Monte Carlo simulations of the motion of a large number of tracers, which we will discuss at further length in Section 2.3.2. For all flows considered, the effective diffusivities computed by the various methods agreed well over several decades of Péclet number.

The computation by Padé approximants has some peculiar advantages and disadvantages. On the one hand, only a finite number of quantities must be computed to obtain rigorous bounds for all Péclet number, whereas the other approaches can compute the effective diffusivity for only one Péclet number at a time. Unfortunately, it is difficult to obtain good numerical precision with Padé approximants at high Péclet number [40], and this practically restricts the method to moderate and low Péclet number (Pe $\lesssim$ O(10)). The numerical solution of the homogenization cell problem (14) was found to be the most efficient means of computing the effective diffusivity at large Péclet number. One important consequence of the good agreement between this computation and those of the Monte Carlo simulations is that the asymptotic predictions of homogenization theory are realized on practical, finite time scales. We return to this point in Section 2.3.
Besides the steady cellular flow and its time-dependent perturbation mentioned in Section 2.2.3, Biferale et al. [40] considered a quite different, steady, three-dimensional "ABC flow":
$$\mathbf{v}(\mathbf{x}) = \mathbf{v}(x, y, z) = \begin{pmatrix} v_x(y, z) \\ v_y(x, z) \\ v_z(x, y) \end{pmatrix},$$

$$v_x(y, z) = \sin z + \cos y, \qquad v_y(x, z) = \sin x + \cos z, \qquad v_z(x, y) = \sin y + \cos x.$$

This ABC flow is an exact steady solution of Euler's equations. The streamlines form regular open tubes surrounded by chaotic regions. The transport is expected to be dominated by the open tubes, producing maximally enhanced diffusion, and this is verified numerically [40].

2.3. Tracer transport in periodic flows at finite times

The homogenization theory presented in Sections 2.1 and 2.2 for the effective diffusion of a passive scalar field by a periodic velocity field is an asymptotic theory guaranteed to be valid only at sufficiently large space and long time scales. In practical applications, it is important to know the time scale on which this asymptotic effective diffusive behavior is attained and the nature of the corrections to the diffusive behavior over finite intervals of time. We now address these questions by computing the statistical behavior of a single tracer in several classes of periodic flows at finite times.

First, we return to the periodic shear flows with constant or zero cross sweep, which we introduced in Section 2.2.1. Due to the special geometry of these flows, the equations of motion for tracers can be exactly integrated, and exact formulas for the moments of the tracer displacement can be derived for arbitrary time. From these, we can directly read off the rate of relaxation to the homogenized, long-time diffusive behavior as well as the character of the finite-time corrections. We will find that the homogenized description is accurate after a fixed time of order unity (nondimensionalized with respect to molecular diffusion scales as in Section 2.1.1), irrespective of Péclet number [230]. For general periodic flows, the tracer equations are too difficult to integrate exactly.
The passive scalar evolution over pre-homogenized time scales for some special flows other than shear flows has been addressed through various approximate analytical techniques. As examples, we refer the reader in this regard to the finite time analysis by Young et al. [346] (and also [47]) for a high Péclet number steady cellular flow, and to Camassa and Wiggins' [52] treatment of tracer advection in a temporally oscillating cell flow by dynamical systems techniques which neglect molecular diffusion. Methods such as these rely upon a sufficiently simple geometry of the streamlines, as well as other asymptotic or ad hoc assumptions. An alternative and effective way to study the motion of a tracer with quantitative accuracy over finite time intervals in a general periodic flow with complex geometry is by careful Monte Carlo numerical simulations. A Monte Carlo simulation is simply an integration of the particle trajectory
equations for a large number of particles undergoing independent random molecular diffusion in addition to advection by the flow. Averages computed from this finite sample are then used to estimate statistics of the full ensemble, such as the mean-square tracer displacement. We report on several Monte Carlo simulations [40,231] which show very good quantitative agreement with the predictions of homogenization theory after an initial transient stage extending at most over a (nondimensionalized) time interval of order unity. In particular, the subtle crossover behaviors predicted by homogenization theory for the class of Childress-Soward flows with a mean sweep in Section 2.2.4 are explicitly manifested over finite time intervals. This underscores the care which is required in formulating effective diffusivity models for practical applications, even in the relatively simple case of a periodic velocity field varying on spatial scales well separated from the macroscale. On the positive side, the good agreement between homogenization theory and the Monte Carlo simulations indicates that the effective diffusivity of tracers in a periodic flow on practical time scales can be computed through the numerical solution of a single cell problem (such as Eq. (33)), rather than through the generally more expensive simulation of the motion of a large number of tracers [40].

2.3.1. Periodic sinusoidal shear flow

We mentioned in Section 2.2.1 that many nontrivial aspects of passive scalar transport can be illuminated through explicit formulas within the class of shear flows. Here, we present exact formulas for the evolution at all times of the first and second spatial moments of a passive scalar field immersed in a steady, periodic shear flow with constant (possibly zero) cross sweep,
$$\mathbf{v}(\mathbf{x}) = \mathbf{v}(x, y) = \begin{pmatrix} \bar{w} \\ v(x) \end{pmatrix}.$$
These formulas for the behavior of the passive scalar field moments at finite times will be compared to the predictions of homogenization theory worked out in Section 2.2.1. We shall adopt a tracer-centered perspective to complement the field-centered perspective emphasized so far in Section 2. These are related in that the probability distribution function (PDF) for a single tracer in an incompressible velocity field obeys the advection-diffusion equation with initial data delta-concentrated at the initial tracer location; see the discussion in Section 1.

The location $(X(t), Y(t))$ at time $t$ of a single tracer (2) originating from $(x_0, y_0)$ in the presence of a periodic shear flow with constant cross sweep obeys the following nondimensionalized stochastic equations of motion:

$$dX(t) = \mathrm{Pe}\,\bar{w}\,dt + \sqrt{2}\,dW_x(t), \qquad X(t=0) = x_0,$$
$$dY(t) = \mathrm{Pe}\,v(X(t))\,dt + \sqrt{2}\,dW_y(t), \qquad Y(t=0) = y_0. \tag{66}$$

The random increments $dW_x(t)$ and $dW_y(t)$ are differentials of independent Brownian motions [112,257] arising from molecular diffusion. Brownian motion has the following formal properties (for $W(t) = W_x(t)$ or $W(t) = W_y(t)$):

• $W(t)$ is a continuous random function.
• $W(t) - W(s)$ is a Gaussian random variable with mean zero and variance $|t - s|$.
• Increments of $W(t)$ over disjoint time intervals are independent of one another.
• $W(t=0) = 0$.
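The stochastic equations (66) and the Brownian motion properties listed above translate directly into a particle simulation. The following Python sketch uses the Euler-Maruyama time-stepping scheme, which is a standard choice but is an assumption here rather than the method of [40,231]; the function name, the default shear profile $v(x) = \sin 2\pi x$, the step count, and the sample size are all illustrative.

```python
import math
import random


def simulate_tracers(pe, w_bar, x0, y0, t_final, n_steps, n_tracers,
                     shear=lambda x: math.sin(2.0 * math.pi * x), seed=0):
    """Euler-Maruyama integration of the tracer equations (66).

    Each Brownian increment over a step dt is drawn as an independent
    Gaussian with mean zero and variance dt, then scaled by sqrt(2).
    Returns the list of final positions (X, Y), one per tracer.
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    noise_scale = math.sqrt(2.0 * dt)
    final_positions = []
    for _ in range(n_tracers):
        x, y = x0, y0
        for _ in range(n_steps):
            # Advance y with the shear evaluated at the start-of-step x.
            y += pe * shear(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
            x += pe * w_bar * dt + noise_scale * rng.gauss(0.0, 1.0)
        final_positions.append((x, y))
    return final_positions


def sample_mean_and_variance(values):
    """Sample mean and (biased) sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


if __name__ == "__main__":
    tracers = simulate_tracers(pe=1.0, w_bar=0.5, x0=0.1, y0=0.0,
                               t_final=0.5, n_steps=50, n_tracers=2000)
    print(sample_mean_and_variance([x for x, _ in tracers]))
    print(sample_mean_and_variance([y for _, y in tracers]))
```

Because the cross-sweep drift is constant, the sampled $X$ values should have mean close to $x_0 + \mathrm{Pe}\,\bar{w}\,t$ and variance close to $2t$, up to sampling error; the statistics of $Y$ additionally carry a time-discretization error that shrinks as the step count grows.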
Due to the spatially decoupled dynamics induced by a shear flow, the stochastic trajectory equations (66) can be integrated successively by quadrature:

$$X(t) = x_0 + \mathrm{Pe}\,\bar{w}\,t + \sqrt{2}\,W_x(t), \tag{67a}$$
$$Y(t) = y_0 + \mathrm{Pe}\int_0^t v(X(s))\,ds + \sqrt{2}\,W_y(t). \tag{67b}$$

Note that the tracer position is random due to the Brownian motions arising from molecular diffusion. Two statistics of fundamental interest are the mean displacement of the tracer,

$$\mu_X(t|x_0, y_0) \equiv \langle X(t) - x_0 \rangle_W, \qquad \mu_Y(t|x_0, y_0) \equiv \langle Y(t) - y_0 \rangle_W, \tag{68}$$

and the variance of its location,

$$\sigma_X^2(t|x_0, y_0) \equiv \langle (X(t) - x_0 - \mu_X(t|x_0, y_0))^2 \rangle_W, \qquad \sigma_Y^2(t|x_0, y_0) \equiv \langle (Y(t) - y_0 - \mu_Y(t|x_0, y_0))^2 \rangle_W. \tag{69}$$

The brackets $\langle \cdot \rangle_W$ denote an averaging over the Brownian motion statistics. Because the PDF of the tracer displacement is identical to the solution of the advection-diffusion equation with initial data

$$T_0(x, y) = \delta(x - x_0)\,\delta(y - y_0),$$

the spatial moments of the passive scalar field evolving from such initial data are directly related to the tracer statistics (68) and (69) as follows:

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - x_0)\,T(x, y, t)\,dx\,dy = \mu_X(t|x_0, y_0),$$
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (y - y_0)\,T(x, y, t)\,dx\,dy = \mu_Y(t|x_0, y_0),$$
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - x_0)^2\,T(x, y, t)\,dx\,dy = \langle (X(t) - x_0)^2 \rangle_W = \sigma_X^2(t|x_0, y_0) + (\mu_X(t|x_0, y_0))^2,$$
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (y - y_0)^2\,T(x, y, t)\,dx\,dy = \langle (Y(t) - y_0)^2 \rangle_W = \sigma_Y^2(t|x_0, y_0) + (\mu_Y(t|x_0, y_0))^2. \tag{70}$$

In particular, $\mu_X(t|x_0, y_0)$ is the mean displacement of the center of mass and $\sigma_X^2(t|x_0, y_0)$ is the mean-square radius of a cloud of passive scalar particles initially released at $(x_0, y_0)$.

The tracer motion across the shear flow is very simple to describe: $X(t)$ is a Gaussian random variable with mean

$$x_0 + \mu_X(t|x_0, y_0) = x_0 + \mathrm{Pe}\,\bar{w}\,t$$
and variance $\sigma_X^2(t|x_0,y_0)=2t$, as may be checked from the fundamental properties of the Brownian motion $W_x(t)$.

The statistics of tracer motion along the shear direction is naturally much richer. The main features for a periodic shear flow can be illustrated in the simple context of a single-mode sinusoidal shear flow:

$$v(x)=\sin 2\pi x .$$

The shear-parallel tracer displacement in a more general periodic shear velocity field can be computed in a similar manner by decomposing the shear flow into a sum of Fourier modes; see [230].

2.3.1.1. Mean tracer displacement along sinusoidal shear. For the single-mode case under current consideration, the mean displacement of the tracer along the shear is given by

$$\begin{aligned}
\mu_Y(t|x_0,y_0) &= \mathrm{Pe}\int_0^t \langle \sin(2\pi X(s))\rangle_W\,ds + \sqrt{2}\,\langle W_y(t)\rangle_W\\
&= \mathrm{Pe}\int_0^t \langle \sin(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,s+\sqrt{2}\,W_x(s)))\rangle_W\,ds .
\end{aligned} \tag{71}$$

The expectation in the integrand may be computed by expressing it as a complex exponential:

$$\langle \sin(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,s+\sqrt{2}\,W_x(s)))\rangle_W = \operatorname{Im}\,\langle \exp(2\pi i(x_0+\mathrm{Pe}\,\bar{w}\,s+\sqrt{2}\,W_x(s)))\rangle_W , \tag{72}$$

where $\operatorname{Im}$ denotes the imaginary part of the subsequent expression. The right-hand side now involves the expectation of the exponential of a Gaussian random variable $2\pi i Z$, which can be explicitly computed according to the following general formula [257]:

$$\langle e^{2\pi i Z}\rangle = e^{2\pi i\langle Z\rangle}\,e^{-2\pi^2\langle (Z-\langle Z\rangle)^2\rangle} \quad \text{for Gaussian } Z . \tag{73}$$
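As a worked illustration of how formula (73) applies here, take $Z = x_0+\mathrm{Pe}\,\bar{w}\,s+\sqrt{2}\,W_x(s)$, the argument of the exponential in Eq. (72) with $X(s)$ as in Eq. (67a). The steps below are a sketch that uses only the fact that $W_x(s)$ is Gaussian with mean zero and variance $s$:

```latex
\begin{align*}
\langle Z\rangle_W &= x_0+\mathrm{Pe}\,\bar{w}\,s, \qquad
\langle (Z-\langle Z\rangle_W)^2\rangle_W = 2s,\\
\langle e^{2\pi i Z}\rangle_W &= e^{2\pi i(x_0+\mathrm{Pe}\,\bar{w}\,s)}\,e^{-4\pi^2 s},\\
\langle \sin(2\pi X(s))\rangle_W &= e^{-4\pi^2 s}\,\sin\bigl(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,s)\bigr),\\
\mu_Y(t|x_0,y_0) &= \mathrm{Pe}\int_0^t e^{-4\pi^2 s}\,\sin\bigl(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,s)\bigr)\,ds .
\end{align*}
```

The last line is an elementary integral of a decaying sinusoid; the decay rate $4\pi^2$ comes entirely from molecular diffusion across the shear.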
Evaluating Eq. (72) in this way, using the fact that $W_x(s)$ is a Gaussian random variable with mean zero and variance $s$, substituting the result into Eq. (71), integrating the resulting complex exponential, and taking the imaginary part of the resulting expression, we obtain the following exact formula for the mean displacement of the tracer along the shear:

$$\begin{aligned}
\mu_Y(t|x_0,y_0) ={}& \mathrm{Pe}^2\,\frac{\bar{w}\,[\cos(2\pi x_0)-e^{-4\pi^2 t}\cos(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,t))]}{2\pi(4\pi^2+\mathrm{Pe}^2\bar{w}^2)}\\
&+ \mathrm{Pe}\,\frac{[\sin(2\pi x_0)-e^{-4\pi^2 t}\sin(2\pi(x_0+\mathrm{Pe}\,\bar{w}\,t))]}{4\pi^2+\mathrm{Pe}^2\bar{w}^2} .
\end{aligned} \tag{74}$$
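The closed form (74) can be checked numerically against direct sampling of Eqs. (67a) and (67b). The following is a minimal sketch, not the procedure of any cited reference: the function names, the parameter values, and the trapezoidal quadrature for the time integral are illustrative assumptions. Because $X(t)$ is a Brownian motion with drift, its increments are sampled exactly, so the only discretization error is in the quadrature.

```python
import numpy as np


def exact_mean_displacement(t, x0, pe, w_bar):
    """Closed-form mean shear-parallel displacement, Eq. (74), for v(x) = sin(2*pi*x)."""
    denom = 4.0 * np.pi**2 + (pe * w_bar) ** 2
    decay = np.exp(-4.0 * np.pi**2 * t)
    phase_0 = 2.0 * np.pi * x0
    phase_t = 2.0 * np.pi * (x0 + pe * w_bar * t)
    cos_term = pe**2 * w_bar * (np.cos(phase_0) - decay * np.cos(phase_t)) / (2.0 * np.pi * denom)
    sin_term = pe * (np.sin(phase_0) - decay * np.sin(phase_t)) / denom
    return cos_term + sin_term


def monte_carlo_mean_displacement(t, x0, pe, w_bar, n_paths=20000, n_steps=200, seed=0):
    """Sample Eqs. (67a)-(67b); return (sample mean, standard error) of Y(t) - y0."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    x = np.full(n_paths, float(x0))
    integral = np.zeros(n_paths)
    f_prev = np.sin(2.0 * np.pi * x)
    for _ in range(n_steps):
        # Exact increment of X: deterministic cross sweep plus Brownian kick, Eq. (67a).
        x = x + pe * w_bar * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n_paths)
        f_next = np.sin(2.0 * np.pi * x)
        # Trapezoidal rule for the time integral of v(X(s)) in Eq. (67b).
        integral += 0.5 * dt * (f_prev + f_next)
        f_prev = f_next
    # Independent Brownian motion along the shear direction, sqrt(2) * W_y(t).
    displacement = pe * integral + np.sqrt(2.0 * t) * rng.standard_normal(n_paths)
    std_err = displacement.std(ddof=1) / np.sqrt(n_paths)
    return displacement.mean(), std_err


if __name__ == "__main__":
    params = dict(t=0.05, x0=0.1, pe=10.0, w_bar=0.5)
    mc_mean, mc_err = monte_carlo_mean_displacement(**params)
    print("exact (74):  ", exact_mean_displacement(**params))
    print("Monte Carlo: ", mc_mean, "+/-", mc_err)
```

The sample mean should agree with (74) to within a few standard errors; the statistical error shrinks like $N^{-1/2}$ in the number of paths, which is usually the dominant cost in this kind of check.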
The mean displacement is essentially characterized by exponentially decaying sinusoidal fluctuations. Clearly, the oscillations are induced by the sweeping of the tracer across the sinusoidal shear flow. Indeed, in the presence of a cross sweep $\bar{w}\neq 0$ and in the absence of molecular diffusion, tracers would simply follow the deterministic streamlines

$$y = y_0 + \frac{1}{2\pi\bar{w}}\,(\cos(2\pi x_0)-\cos(2\pi x))$$
and forever oscillate in a regular manner. Molecular diffusion breaks up the phase-coherent motion of the tracer by pushing it randomly across different streamlines, and thereby causes the periodic component of the tracer's motion to decay exponentially. If we view $\mu_Y(t|x_0,y_0)$ as the mean center of mass (along the $y$ direction) of a cloud initially released from $(x_0,y_0)$, then we can say that molecular diffusion causes the cloud to spread out and eventually sample many period cells. The cloud's center of mass motion along the shear will therefore decay to zero by the law of large numbers, since the mean velocity along the shear is zero. Similar considerations apply when $\bar{w}=0$, except that the coherent motion in the absence of molecular diffusion would be ballistic motion along straight streamlines aligned parallel to the $y$ direction.

We note that in the long-time limit, the mean tracer displacement along the shear settles down to a constant:

$$\lim_{t\to\infty}\mu_Y(t|x_0,y_0) = \frac{2\pi\,\mathrm{Pe}\,\sin(2\pi x_0)+\mathrm{Pe}^2\,\bar{w}\,\cos(2\pi x_0)}{2\pi(4\pi^2+\mathrm{Pe}^2\bar{w}^2)} .$$

This is consistent with the above reasoning that the mean tracer velocity along the shear should eventually vanish due to the averaging effects of molecular diffusion. The finite net displacement is determined by the accumulated transient momentum from early times where the tracer motion is still largely coherent. Indeed, the long-time net displacement is manifestly sensitive to the initial location $x_0$ of the tracer.

2.3.1.2. Variance of tracer displacement along shear. An exact formula can also be derived for

$$\sigma_Y^2(t|x_0,y_0) \equiv \langle (Y(t)-\langle Y(t)\rangle_W)^2\rangle_W = \langle (Y(t)-y_0)^2\rangle_W - (\langle Y(t)\rangle_W-y_0)^2 ,$$

the variance of the tracer displacement along the shearing direction. The second term in the rightmost expression is just the square of $\mu_Y(t|x_0,y_0)$, which was evaluated above (see Eq. (74)). The first term may be evaluated similarly, starting from Eq. (67b) and using the standard properties of the independent Brownian motions $W_x(t)$ and $W_y(t)$:

$$\begin{aligned}
\langle (Y(t)-y_0)^2\rangle_W &= 2t + \mathrm{Pe}^2\int_0^t\!\!\int_0^t \langle \sin(2\pi X(s))\,\sin(2\pi X(s'))\rangle_W\,ds\,ds'\\
&= 2t + \tfrac{1}{2}\,\mathrm{Pe}^2\int_0^t\!\!\int_0^t \langle \cos(2\pi(X(s)-X(s')))-\cos(2\pi(X(s)+X(s')))\rangle_W\,ds\,ds'\\
&= 2t + \tfrac{1}{2}\,\mathrm{Pe}^2\int_0^t\!\!\int_0^t \langle \cos(2\pi(\mathrm{Pe}\,\bar{w}\,(s-s')+\sqrt{2}\,(W_x(s)-W_x(s'))))\\
&\qquad\qquad - \cos(2\pi(2x_0+\mathrm{Pe}\,\bar{w}\,(s+s')+\sqrt{2}\,(W_x(s)+W_x(s'))))\rangle_W\,ds\,ds' .
\end{aligned}$$

The integrand in the last expression is now in the form of a difference of the real parts of complex exponentials of Gaussian random variables, and may be evaluated using Eq. (73). The resulting integrand is again a complex exponential, which may be integrated over time in a straightforward albeit tedious manner. We finally find that the variance of the tracer displacement along the shear
may be expressed as follows: p(t)"2(1#KM )t#PeB (Pe wN )(1!e\pRcos(2pPe wN t)) 7 WW #Pe B (Pe wN ) e\pR sin(2pPe wN t) #Pe B (Pe wN ) e\pR[cos(4px )!cos(2p(2x #Pe wN t))] #Pe B (Pe wN ) e\p R[sin(4px )!sin(2p(2x #Pe wN t))] #Pe B (Pe wN ) e\p R[cos(4px )!cos(4p(x #Pe wN t))] #Pe B (Pe wN ) e\p R[sin(4px )!sin(4p(x #Pe wN t))] #Pe B (Pe wN )[sin(2px )!e\p R sin(2p(x #Pe wN t))] #Pe B (Pe wN )[cos(2px )!e\p R cos(2p(x #Pe wN t))] , where Pe KM " WW 2(4pk#PewN )
(75)
(76)
is the enhanced diffusivity predicted by homogenization theory (cf. (52)), and each $B_j$ satisfies $B_j(z)\le C_j/(1+z)$ for some numerical constant $C_j$ independent of Péclet number. Precise formulas for these constants may be found in [231]. We note that the variance of the tracer displacement consists of a sum of a linear, diffusive growth, a constant, and some decaying, oscillating terms. Their presence may be explained in a similar way to analogous terms in the mean tracer displacement. Another method of derivation of the mean-square displacement of a tracer in a shear flow with cross sweep, and some numerical plots of its behavior, may be found in [131].

2.3.1.3. Relaxation to asymptotic homogenization regime. Homogenization theory predicts that at sufficiently long times, the PDF for the tracer displacement along the shear direction $y$ will obey an effective diffusion equation with effective diffusivity $1+\bar{K}_{yy}$ given by Eq. (76). In conjunction with Eq. (70), this implies that the mean of the tracer displacement along the shear, $\mu_Y(t|x_0,y_0)$, should settle down to a constant (since there is no advective term in the homogenized diffusion equation), and that its variance $\sigma_Y^2(t|x_0,y_0)$ should grow at an asymptotically linear rate

$$\sigma_Y^2(t|x_0,y_0) \sim 2(1+\bar{K}_{yy})\,t \quad \text{as } t\to\infty .$$

The exact finite-time calculations (74) and (75) are in agreement with these homogenization asymptotic predictions. The tracer displacement departs at finite times from the effective diffusion description through some constant terms and transient, exponentially decaying, oscillatory terms. These finite-time corrections to the homogenized behavior do depend sensitively on initial data, in contrast to the effective diffusivity $\bar{K}_{yy}$.

An important question for applications is how much time must pass before the tracer displacement is accurately described by the homogenized diffusion equation. It is readily seen from
Eqs. (74) and (75) that for a steady, periodic shear flow with constant or zero cross sweep, the mean and variance of the tracer displacement relax to their homogenized limits on an order unity time scale which is independent of Pe. Recalling our reference units for nondimensionalization (Section 2.1.1), we conclude that for a periodic shear flow with constant or zero cross sweep, the homogenization theory becomes valid on a time scale comparable to that over which molecular diffusion, acting alone, would cause an initially concentrated cloud of tracers to disperse over several period cells. That this should be the governing time scale for the validity of the homogenized equations may be understood from the fact that homogenization theory appeals to an averaging over the periodic fluctuations of the velocity field. Without molecular diffusion, tracers would forever move along neatly ordered, periodic streamlines. One must therefore wait for molecular diffusion to buffet the tracer across all the streamlines in a period cell before the tracer has effectively sampled the velocity field over several period cells. We shall discuss in Paragraph 2.3.2.1 some types of periodic velocity fields for which homogenization is achieved on faster time scales.

2.3.2. Monte Carlo simulations over finite times

We now discuss the statistical behavior of tracers over finite-time intervals in more general flows. The general stochastic equations of motion for a tracer are, in nondimensionalized units (see Eqs. (2a) and (2b)):

$$d\boldsymbol{X}(t) = \mathrm{Pe}\,(\boldsymbol{V}+\boldsymbol{v}(\boldsymbol{X}(t),t))\,dt + \sqrt{2}\,d\boldsymbol{W}(t), \tag{77a}$$

$$\boldsymbol{X}(t=0) = \boldsymbol{x}_0 , \tag{77b}$$

where $\boldsymbol{W}(t)$ is a vector-valued random process with each component an independent Brownian motion. We showed in Section 2.3.1 how to integrate these equations in an exact closed form for the case in which $\boldsymbol{v}(\boldsymbol{x},t)$ is a shear flow, but this is not generally possible. Instead, Eq. (77a) can be quantitatively studied over finite-time intervals through Monte Carlo numerical simulations. By this we simply mean a numerical discretization of these equations of motion, along with an artificial random number generator to simulate the discretized influence of the random Brownian motion term $\sqrt{2}\,d\boldsymbol{W}(t)$ in (77a). In this way, one can numerically integrate the equations of motion (77a) and (77b) to produce a simulation of a single random realization of a tracer trajectory. To compute statistical quantities associated with the tracer motion, one simply performs a large number $N$ of simulations of the tracer trajectory $\{\boldsymbol{X}^{(j)}(t)\}_{j=1}^{N}$, using independent simulations of the Brownian motion for each case, and then averages over the sample. For example, one can numerically simulate the evolution of the mean-square displacement

$$\sigma_{\boldsymbol{X}}^2(t) \equiv \langle |\boldsymbol{X}(t)-\boldsymbol{x}_0|^2\rangle$$

over a finite-time interval by computing the average

$$\sigma_{\boldsymbol{X}}^2(t) \approx \frac{1}{N}\sum_{j=1}^{N} |\boldsymbol{X}^{(j)}(t)-\boldsymbol{x}_0|^2$$

over the squared displacements of the $N$ independent runs.

The two numerical concerns in Monte Carlo simulations are the accurate discretization of the equations of motion (77a), (77b) and the choice of a sufficiently large sample size $N$ to obtain accurate statistics reflecting the influence of the Brownian motion. Standard generalizations of the Euler and Runge–Kutta schemes for the stochastic equations (77a) may be found in [163]; one
must take care that a sufficiently small step size is chosen [231]. The random number generator must also be of sufficient quality to avoid spurious artifacts in the simulation of the Brownian motion [230]. Other substantial numerical challenges must be faced when conducting Monte Carlo simulations of the advection–diffusion of a tracer by a random velocity field with long-range correlations; we address this problem at length in Section 6.

2.3.2.1. Realization of homogenized behavior at finite time. Monte Carlo simulations have been utilized by McLaughlin [231] and Biferale et al. [40] to examine the extent to which the effective diffusion behavior predicted by homogenization theory describes the evolution of the mean-square tracer displacement over finite time intervals in some interesting classes of flows. In both of these works, it was found that, after some transient period, the homogenized diffusivity does accurately describe (half) the rate of growth of the mean-square tracer displacement computed from the Monte Carlo simulations. In particular, McLaughlin [231] showed in this way that the strong sensitivity of the effective diffusivity to the rationality or irrationality of the ratio of the components of a mean sweep across a two-dimensional steady periodic flow (see Section 2.2.4) is a relevant effect at finite times. He considered a Childress–Soward flow (63) with $\epsilon=0.5$, with two alternative mean sweeps, $\boldsymbol{V}=(-15,15)$ and $\boldsymbol{V}=(-15.5,15)$. The former clearly has a lower order rational ratio of coefficients than the latter, so greater enhanced diffusion is expected for the former in almost all directions. In Fig. 8, the enhanced diffusivity (along the $x$ direction) computed by numerical solution of the homogenization cell problem (33) and by the Monte Carlo simulations are shown to agree to excellent accuracy. The actual definition used in [231] for the enhanced diffusivity as simulated by the Monte Carlo method over finite time is:

$$\bar{K}_{xx}^{\mathrm{MC}} = \frac{1}{5}\int_{5}^{10}\left(\frac{N^{-1}\sum_{j=1}^{N}(X^{(j)}(t)-x_0)^2}{2t}-1\right)dt ,$$

where $N=1000$ independent realizations of the tracer paths were simulated. The initial time interval $0\le t\le 5$ was excluded from the average to reduce contamination by transient effects. It is evident from Fig. 8 that the slight difference between the mean sweeps $\boldsymbol{V}=(-15,15)$ and $\boldsymbol{V}=(-15.5,15)$ creates an order of magnitude difference between the transport rates for finite Péclet number $\mathrm{Pe}\sim 10$ after a finite interval of time $t\sim 10$ (in units nondimensionalized with respect to molecular diffusion). Examination of the simulated mean-square displacement as a function of time further revealed that the transient period of adjustment to the homogenized behavior was very rapid, of the order of a negative power of ten in these units, and decreasing further as the Péclet number is increased [231]. This may be compared with the order unity time of adjustment found for the shear flow with cross sweep discussed in Section 2.3.1. The difference is apparently due to the better mixing properties of the Childress–Soward flow with mean sweep; the velocity field cooperates with the molecular diffusion to accelerate the rate at which a tracer fully samples a period cell and thereby attains its ultimately homogenized behavior [231].

Monte Carlo simulations have also been utilized by Crisanti et al. [77] for flows closely akin to Childress–Soward flows without mean sweep (see Paragraph 2.2.3.1). The simulated diffusivities are found to behave in a manner consistent with maximally enhanced diffusion along the direction of the open channels and minimally enhanced diffusion in the transverse direction where transport
Fig. 8. Log–log plot of enhanced diffusion coefficient $\bar{K}_{xx}$ versus Pe (from [231]). Solid curves: numerical solutions of cell problem from homogenization theory; discrete markers: Monte Carlo simulations. Upper curve and crosses: mean sweep $(-15, 15)^{\mathrm{T}}$; lower curve and circles: mean sweep $(-15.5, 15)^{\mathrm{T}}$.
is blocked by streamlines. Finally, we note that Rosenbluth et al. [287] employed Monte Carlo simulations to check and extend their analytical high Péclet number predictions for the effective diffusivity of a passive scalar field in a cellular flow (62), which they obtained through matched asymptotic expansions rather than through homogenization theory.

2.4. Random flow fields with short-range correlations

The periodic flows we have been considering so far have a precisely ordered structure. Many flows in nature and in the laboratory, however, are at sufficiently high Reynolds number that
turbulent fluctuations are strongly excited, and these impart a disordered and chaotic character to the flow pattern. A convenient way to approximate the complex spatio-temporal structure of such turbulent flows is by modelling the velocity field as a random function. The typical realizations (particular random choices) of velocity fields in such random models display a disordered structure which is quite difficult to achieve through deterministic models. The stochastic nature of random velocity field models is also appropriate because of the unpredictability of the precise microstructure which a turbulent flow will develop in systems where only large-scale information can be observed or specified, as is always the case in practice. We will exclusively consider random incompressible flows ($\nabla\cdot\boldsymbol{v}(\boldsymbol{x},t)=0$), but will not otherwise insist that the random velocity fields are actual statistical solutions of the Navier–Stokes or Euler hydrodynamic equations. For reference, we provide an appendix in Section 2.4.5 discussing some fundamental definitions, notations, and facts about random functions which we will need throughout this report.

We begin our exploration of the advection of a passive scalar field by a random velocity field by considering two large-scale, long-time rescalings which rigorously lead, in certain asymptotic limits, to effective diffusion equations for the mean passive scalar field (averaged over the statistics of the velocity field). One of these, which we discuss in Section 2.4.1, corresponds to a limit in which the spatial scale of the velocity field is scaled up in proportion to the spatial scale of the passive scalar field. The effective diffusivity in this limit is given by the relatively simple Kubo formula [188]. Next, in Section 2.4.2, we consider the same sort of large-scale, long-time limit of a passive scalar field advected by a steady random velocity field as we did in the context of periodic velocity fields in Section 2.1.2. A similar homogenized description results, provided that the velocity field has sufficiently short-ranged correlations so that a strong separation can exist between the observed macroscale and the spatial scale of the velocity field fluctuations [12]. Throughout Section 3, we will explore examples of random shear flows whose strong long-range correlations violate the conditions for the applicability of the homogenization theory, and the tracer motion is explicitly shown in such examples to proceed superdiffusively at long times. Stieltjes measure formulas and variational principles for the effective diffusivity, analogous to those presented for periodic flows in Section 2.1.4, will be developed for steady, homogeneous random fields in Section 2.4.3. In Section 2.4.4, we discuss the application of the homogenization theory to some example random flows.

2.4.1. Kubo theory

Before presenting the homogenization theory for random velocity fields, we consider another type of large-scale, long-time asymptotic rescaling which also leads to an effective diffusion equation, but with a much simpler formula for the effective diffusivity. We will express this rescaling in terms of dimensional functions and variables.

We consider a given homogeneous, stationary, mean zero, incompressible random velocity field $\boldsymbol{v}(\boldsymbol{x},t)$. We rescale the passive scalar field to large scales and long times

$$T^{\tilde\delta}(\boldsymbol{x},t) \equiv \tilde\delta^{-d}\,T(\boldsymbol{x}/\tilde\delta,\ t/\tilde\delta^{2}) ,$$

with $\tilde\delta\to 0$, and simultaneously rescale the length scale of the random velocity field to remain on the same order as that of the passive scalar field:

$$\boldsymbol{v}^{\tilde\delta}(\boldsymbol{x},t) = \boldsymbol{v}(\tilde\delta\,\boldsymbol{x},t) .$$
This rescaling of the length scale of the velocity field distinguishes the present asymptotic rescaling from that leading to homogenization theory; see Section 2.4.2 below. In particular, $\tilde\delta$ is not the ratio between the spatial scales of the velocity and passive scalar fields; it may rather be thought of as the ratio between some fixed reference scale and the common scale of the velocity and passive scalar field. The rescaled advection–diffusion equation reads

$$\begin{aligned}
\partial T^{\tilde\delta}(\boldsymbol{x},t)/\partial t + \tilde\delta^{-1}\,\boldsymbol{v}(\boldsymbol{x},\tilde\delta^{-2}t)\cdot\nabla T^{\tilde\delta}(\boldsymbol{x},t) &= \kappa\,\Delta T^{\tilde\delta}(\boldsymbol{x},t) ,\\
T^{\tilde\delta}(\boldsymbol{x},t=0) &= T_0(\boldsymbol{x}) .
\end{aligned} \tag{78}$$
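The form of the rescaled equation (78) can be checked with the chain rule. The short derivation below is a sketch under two stated assumptions: that the rescaling reads $T^{\tilde\delta}(\boldsymbol{x},t)=\tilde\delta^{-d}\,T(\boldsymbol{x}/\tilde\delta,\,t/\tilde\delta^{2})$, and that the unrescaled field $T$ is advected by $\boldsymbol{v}^{\tilde\delta}(\boldsymbol{x},t)=\boldsymbol{v}(\tilde\delta\boldsymbol{x},t)$. The shorthand $(\boldsymbol{x}',t')=(\boldsymbol{x}/\tilde\delta,\,t/\tilde\delta^{2})$ is introduced here only for the illustration.

```latex
\begin{align*}
\partial_t T^{\tilde\delta}(\boldsymbol{x},t) &= \tilde\delta^{-d-2}\,(\partial_{t'}T)(\boldsymbol{x}',t'), &
\nabla_{\boldsymbol{x}} T^{\tilde\delta}(\boldsymbol{x},t) &= \tilde\delta^{-d-1}\,(\nabla_{\boldsymbol{x}'}T)(\boldsymbol{x}',t'),\\
\Delta_{\boldsymbol{x}} T^{\tilde\delta}(\boldsymbol{x},t) &= \tilde\delta^{-d-2}\,(\Delta_{\boldsymbol{x}'}T)(\boldsymbol{x}',t'), &
\boldsymbol{v}^{\tilde\delta}(\boldsymbol{x}',t') &= \boldsymbol{v}(\tilde\delta\boldsymbol{x}',t') = \boldsymbol{v}(\boldsymbol{x},\tilde\delta^{-2}t).
\end{align*}
% Multiply  \partial_{t'}T + v^{\tilde\delta}\cdot\nabla_{x'}T = \kappa\Delta_{x'}T  by \tilde\delta^{-d-2}:
\begin{equation*}
\partial_t T^{\tilde\delta} + \tilde\delta^{-1}\,\boldsymbol{v}(\boldsymbol{x},\tilde\delta^{-2}t)\cdot\nabla T^{\tilde\delta}
 = \kappa\,\Delta T^{\tilde\delta}.
\end{equation*}
```

The advection term picks up one fewer power of $\tilde\delta$ than the time derivative and the Laplacian, which is the origin of the large prefactor $\tilde\delta^{-1}$ multiplying a velocity field that fluctuates on the fast time scale $\tilde\delta^{2}$.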
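We now seek a simplified description for the mean statistics $\langle T^{\tilde\delta}(\boldsymbol{x},t)\rangle$ in the limit that $\tilde\delta\to 0$; angle brackets denote a statistical average over all randomness. The general obstacle to obtaining an effective equation for the mean passive scalar density is the difficulty in evaluating the average of the nonlinear term $\langle\boldsymbol{v}\cdot\nabla T\rangle$ in terms of $\langle T\rangle$ or other simple statistical objects (see Section 1). As $\tilde\delta\to 0$, however, one can hope to approximate $\langle\tilde\delta^{-1}\,\boldsymbol{v}(\boldsymbol{x},\tilde\delta^{-2}t)\cdot\nabla T^{\tilde\delta}(\boldsymbol{x},t)\rangle$ accurately by accounting for the velocity field in some averaged way, since there is a strong separation between the time scales characterizing the rate of change of the velocity field and the passive scalar field. To have any hope of achieving this program, the original velocity field must have some sufficiently strong decorrelation in time. In particular, $\boldsymbol{v}(\boldsymbol{x},t)$ cannot be steady or have very long-term memory. In mathematical terminology, the velocity field $\boldsymbol{v}(\boldsymbol{x},t)$ must have certain mixing properties in time [143].

Provided that the unscaled velocity field $\boldsymbol{v}(\boldsymbol{x},t)$ does obey certain mixing and other technical regularity conditions, it can be shown [44,159,262] that for bounded and sufficiently smooth initial data $T_0(\boldsymbol{x})$, the passive scalar field $T^{\tilde\delta}(\boldsymbol{x},t)$ converges uniformly over finite intervals of (rescaled) time to a nontrivial limit

$$\lim_{\tilde\delta\to 0} T^{\tilde\delta}(\boldsymbol{x},t) = \bar{T}(\boldsymbol{x},t)$$

which obeys an effective diffusion equation

$$\begin{aligned}
\partial\bar{T}(\boldsymbol{x},t)/\partial t &= \nabla\cdot(\mathcal{K}^{*}\,\nabla\bar{T}(\boldsymbol{x},t)) ,\\
\bar{T}(\boldsymbol{x},t=0) &= T_0(\boldsymbol{x}) ,
\end{aligned} \tag{79}$$

with effective diffusion tensor

$$\mathcal{K}^{*} = \kappa\,\mathcal{I} + \mathcal{K}_{K}$$

given by the Kubo formula [188]

$$\mathcal{K}_{K} = \int_0^{\infty}\tilde{R}(\boldsymbol{0},t)\,dt , \qquad \tilde{R}(\boldsymbol{x},t) = \langle \boldsymbol{v}(\boldsymbol{x}'+\boldsymbol{x},t'+t)\otimes\boldsymbol{v}(\boldsymbol{x}',t')\rangle . \tag{80}$$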
The matrix $\mathcal{K}_{K}$ is guaranteed, by general properties of correlation functions ([341], Section 4), to be non-negative definite, and therefore represents an enhanced diffusivity. Note that the Kubo
diffusivity $\mathcal{K}_{K}$ depends only on the amplitude of the velocity field and its integrated temporal structure; there is no dependence on the spatial structure of the random field $\boldsymbol{v}(\boldsymbol{x},t)$.

Kubo formally argued this result for the case of a purely time-dependent random velocity field $\boldsymbol{v}=\boldsymbol{v}(t)$ and no molecular diffusion in his pioneering paper [188]. In this case the spatial rescaling of the velocity field is trivial, and one concludes from the above that the long-time asymptotics of the mean passive scalar field in a spatially uniform velocity field with short-ranged temporal correlations is, under appropriate conditions, governed by some positive effective diffusivity, even in the absence of molecular diffusion. Stratonovich [313] formulated the more general result stated above for the case of random velocity fields with spatio-temporal fluctuations, and Khas'minskii [159] was the first to provide a rigorous derivation of these asymptotics, though under somewhat restrictive conditions on the random velocity field model. Later work widened the applicability of Khas'minskii's theorem to a broader class of random velocity fields (see [44,262], and the references in [134]). We note these theorems are generally stated without accounting for molecular diffusion, but their proofs can be easily extended to include it.

Note that the Kubo theory requires no fundamental restriction (other than regularity) on the spatial structure of the velocity field, and thus can be applied under certain circumstances to velocity fields with long-range spatial correlations. It has indeed been shown [15] that a slightly generalized form of Kubo theory describes the large-scale, long-time behavior of the mean passive scalar field in a certain natural class of random velocity field models with qualitative features similar to those of fully developed turbulence. We discuss this point briefly in Paragraph 3.4.3.3.
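The purely time-dependent case without molecular diffusion can be illustrated numerically. The sketch below is an assumption-laden example rather than anything taken from the cited references: it models a one-dimensional velocity $v(t)$ as a stationary Ornstein–Uhlenbeck process with variance `sigma_v**2` and correlation time `tau` (so that $R(t)=\sigma_v^2 e^{-|t|/\tau}$), for which the Kubo integral (80) gives $\sigma_v^2\tau$. All function and parameter names are hypothetical.

```python
import numpy as np


def kubo_diffusivity(sigma_v, tau):
    """Kubo integral of R(t) = sigma_v**2 * exp(-|t| / tau) over t >= 0."""
    return sigma_v**2 * tau


def exact_variance(t, sigma_v, tau):
    """Variance of X(t) = int_0^t v(s) ds for the exponential correlation above."""
    return 2.0 * sigma_v**2 * tau * (t - tau * (1.0 - np.exp(-t / tau)))


def simulated_variance(t, sigma_v, tau, n_paths=20000, n_steps=1000, seed=0):
    """Monte Carlo estimate of Var X(t) with no molecular diffusion."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    decay = np.exp(-dt / tau)
    kick = sigma_v * np.sqrt(1.0 - decay**2)
    v = sigma_v * rng.standard_normal(n_paths)  # start in the stationary state
    x = np.zeros(n_paths)
    for _ in range(n_steps):
        # Exact one-step update of the Ornstein-Uhlenbeck velocity.
        v_next = decay * v + kick * rng.standard_normal(n_paths)
        # Trapezoidal rule for the tracer position X(t) = int v ds.
        x += 0.5 * dt * (v + v_next)
        v = v_next
    return x.var(ddof=1)


if __name__ == "__main__":
    sigma_v, tau, t = 1.0, 0.1, 2.0
    var_mc = simulated_variance(t, sigma_v, tau)
    print("Kubo diffusivity:           ", kubo_diffusivity(sigma_v, tau))
    print("finite-time Var X / (2 t):  ", var_mc / (2.0 * t))
    print("exact Var X / (2 t):        ", exact_variance(t, sigma_v, tau) / (2.0 * t))
```

At a finite time of twenty correlation times the apparent diffusivity $\mathrm{Var}\,X(t)/(2t)$ is still a few percent below the Kubo value, a small reminder that the effective diffusion description is a long-time asymptotic statement.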
As with homogenization theory, we must remember that Kubo theory is a long-time asymptotic theory, and the description of the evolution of the passive scalar field by an effective diffusion is only valid at sufficiently long times. Over finite time intervals, the passive scalar field may behave in a radically different way. For example, very simple models can be formulated for which the passive scalar field has an effectively negative diffusion over finite time intervals (see Section 3.4.4 and [18,131,141]). Also, Kubo theory crucially requires that the velocity field not have long-range memory. Superdiffusion can result from random, purely time-dependent velocity fields with long-range temporal correlations [18], as we will demonstrate in Section 3.1.2.

We finally remark that the asymptotically rescaled advection–diffusion equation (78) can also be interpreted as describing a limit in which the correlation time of the velocity field is very fast compared to the advection time scale, without any large-scale, long-time rescaling. We discuss this perspective in Paragraph 4.1.3.1.

2.4.2. Homogenization theory for random flows

We now revisit the large-scale, long-time rescaling of the passive scalar field introduced in the context of periodic velocity fields in Section 2.1.2 and apply it to the case of advection by a steady, random, incompressible, homogeneous velocity field. In this rescaling, the velocity field is held fixed while the length scale of the initial passive scalar field is made large relative to the length scales characterizing the random velocity field. A simplified, homogenized description of the passive scalar is sought in the asymptotic limit, with the idea that over large scales, the random velocity field ought to have some averaged bulk effect.

There is, however, an important distinction between periodic and random velocity fields that comes into play here. Periodic velocity fields have a well-defined, single length scale which can be definitely separated by a factor $\delta$ from the large length scale characterizing the passive scalar field. Random velocity fields, on the other hand, can
in general have a continuum of actively excited scales which can moreover extend to arbitrarily large scales. Homogenization requires the notion that the velocity field fluctuates on much smaller scales than the passive scalar field, and this may never happen in a meaningful sense under a formal large-scale, long-time rescaling (10) of the passive scalar field if the velocity field has strong long-range correlations.

A simple and general criterion for the applicability of a homogenized diffusive behavior for the passive scalar field on large scales and long times was formulated in [12] in terms of the finiteness of the following Péclet number defined for random velocity fields:

$$\mathrm{Pe} \equiv \kappa^{-1}\left(\int_{\mathbb{R}^d}\frac{\operatorname{Tr}\hat{R}(\boldsymbol{k})}{4\pi^2|\boldsymbol{k}|^2}\,d\boldsymbol{k}\right)^{1/2} = \kappa^{-1}\left(\frac{1}{2\pi^2}\int_0^{\infty}E(k)\,k^{-2}\,dk\right)^{1/2} , \tag{81}$$

where $\hat{R}$ is the spectral density of the velocity field:

$$\hat{R}(\boldsymbol{k}) = \int_{\mathbb{R}^d} e^{-2\pi i\,\boldsymbol{k}\cdot\boldsymbol{x}}\,R(\boldsymbol{x})\,d\boldsymbol{x} , \qquad R(\boldsymbol{x}) = \langle \boldsymbol{v}(\boldsymbol{x}'+\boldsymbol{x})\otimes\boldsymbol{v}(\boldsymbol{x}')\rangle ,$$

and

$$E(k) = \frac{1}{2}\,k^{d-1}\int_{S^{d-1}}\operatorname{Tr}\hat{R}(k\hat{\boldsymbol{\omega}})\,d\hat{\boldsymbol{\omega}}$$

is the energy spectrum (integrated over spherical shells of constant wavenumber). It is readily checked that if the random velocity field has fluctuations sharply concentrated near a single length scale $L_v$, then the Péclet number (81) is proportional to $L_v\langle|\boldsymbol{v}|^2\rangle^{1/2}/\kappa$, which is comparable to the definition (7) of Péclet number for a periodic velocity field. A finite value of the Péclet number is equivalent to a sufficiently weak distribution of energy at low wavenumbers (large scales), which implies that the random velocity field's spatial correlations are sufficiently short ranged. In particular, a characteristic length scale

$$L_v = \kappa\,\mathrm{Pe}/\langle|\boldsymbol{v}|^2\rangle^{1/2} \tag{82}$$

can be associated to any random velocity field with finite Péclet number; $L_v$ formally describes the largest length scale of the velocity field with substantial energy. An infinite value of the Péclet number may be viewed as a manifestation of strong long-range correlations; examples of such flows will be studied in Section 3.

It was shown in [12] that a rigorous homogenized large-scale, long-time description of the passive scalar field advected by a random velocity field is possible whenever the Péclet number is finite. We shall present this homogenized theory in terms of variables and functions nondimensionalized in a similar way as in Section 2.1.1, using now the length scale (82) and time scale $L_v^2/\kappa$. The nondimensionalized advection–diffusion equation reads

$$\begin{aligned}
\partial T(\boldsymbol{x},t)/\partial t + \mathrm{Pe}\,\boldsymbol{v}(\boldsymbol{x})\cdot\nabla T(\boldsymbol{x},t) &= \Delta T(\boldsymbol{x},t) ,\\
T(\boldsymbol{x},t=0) &= \delta^{d}\,T_0(\delta\boldsymbol{x}) ,
\end{aligned}$$
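where $\delta$ denotes a ratio between the length scale $L_v$ of the random velocity field and the length scale $L_T$ of the initial passive scalar field.

Defining the rescaled passive scalar field as before (10),

$$T^{\delta}(\boldsymbol{x},t) \equiv \delta^{-d}\,T(\boldsymbol{x}/\delta,\ t/\delta^{2}) ,$$

we obtain the following advection–diffusion equation, rescaled to large spatial scales and long times:

$$\begin{aligned}
\partial T^{\delta}(\boldsymbol{x},t)/\partial t + \delta^{-1}\,\mathrm{Pe}\,\boldsymbol{v}\!\left(\frac{\boldsymbol{x}}{\delta}\right)\cdot\nabla T^{\delta}(\boldsymbol{x},t) &= \Delta T^{\delta}(\boldsymbol{x},t) ,\\
T^{\delta}(\boldsymbol{x},t=0) &= T_0(\boldsymbol{x}) .
\end{aligned} \tag{83}$$

For incompressible, random, homogeneous, ergodic velocity fields with finite values of the Péclet number (81), the following homogenization theorem can be established.

2.4.2.1. Homogenized effective diffusion equation for random velocity fields with short-range correlations. In the long-time, large-scale limit, the rescaled passive scalar field converges to a finite limit

$$\lim_{\delta\to 0} T^{\delta}(\boldsymbol{x},t) = \bar{T}(\boldsymbol{x},t) , \tag{84}$$

which satisfies an effective diffusion equation

$$\frac{\partial\bar{T}(\boldsymbol{x},t)}{\partial t} = \nabla\cdot(\mathcal{K}^{*}\,\nabla\bar{T}(\boldsymbol{x},t)) , \tag{85a}$$

$$\bar{T}(\boldsymbol{x},t=0) = T_0(\boldsymbol{x}) , \tag{85b}$$

with constant, positive-definite, symmetric diffusivity matrix $\mathcal{K}^{*}$. This effective diffusivity matrix can be expressed as

$$\mathcal{K}^{*} = \mathcal{I} + \bar{\mathcal{K}} ,$$

where $\mathcal{I}$ is the identity matrix (representing the nondimensionalized molecular diffusion) and $\bar{\mathcal{K}}$ is a nonnegative-definite enhanced diffusivity matrix which represents the additional diffusivity due to the random flow. The enhanced diffusivity matrix $\bar{\mathcal{K}}$ can be computed as follows. Let $\boldsymbol{\chi}(\boldsymbol{x})$ be the (unique) random field with the following properties:

• $\boldsymbol{\chi}(\boldsymbol{0}) = \boldsymbol{0}$,
• $\nabla\boldsymbol{\chi}(\boldsymbol{x})$ is a homogeneous random tensor field with $\langle\|\nabla\boldsymbol{\chi}(\boldsymbol{x})\|^2\rangle < \infty$,
• $\boldsymbol{\chi}$ solves the following random elliptic equation on $\mathbb{R}^d$ (in unscaled coordinates):

$$\mathrm{Pe}\,\boldsymbol{v}(\boldsymbol{x})\cdot\nabla\boldsymbol{\chi}(\boldsymbol{x}) - \Delta\boldsymbol{\chi}(\boldsymbol{x}) = -\mathrm{Pe}\,\boldsymbol{v}(\boldsymbol{x}) . \tag{86}$$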
Then the components of the enhanced diffusivity matrix may be expressed as

$$\bar{\mathcal{K}}_{ij} = \langle\nabla\chi_i\cdot\nabla\chi_j\rangle , \tag{87}$$

where the angle brackets denote an ensemble average over the statistics of the velocity field.

A homogenization theorem of this form was established by Avellaneda and the first author [12], using the framework of homogenization of equations with random coefficients developed by Papanicolaou and Varadhan [264], and by Oelschläger [256] using a somewhat different probabilistic approach. Some additional technical conditions required in these proofs were later removed by Fannjiang and Papanicolaou [98]. The finite Péclet number condition for the homogenization theorem is essential. Explicit shear flow examples [10] rigorously demonstrate that when the Péclet number (81) is infinite, the mean passive scalar field will generally not be described by an effective diffusion equation at large scales and long times; see Section 3. The finite Péclet number condition for the applicability of the homogenization theorem for random velocity fields cannot therefore be weakened unless some explicit reference is made to the flow geometry.

The homogenization theorem for random fields is very similar to that which was stated for periodic velocity fields in Section 2.1.2. Indeed, modulo some technicalities, "periodicity" has simply been converted to "statistical homogeneity", and averages over the period cell have been replaced by ensemble averages over the velocity field. Indeed, the homogenization theorem for steady, periodic velocity fields can be essentially embedded into the random homogenization theorem by defining a homogeneous random field as a periodic velocity field with the origin of the period lattice uniformly distributed over a fixed period cell.

A homogenization theorem for spatio-temporal random velocity fields may be found in [245,244].

2.4.2.2. Comparison between Kubo theory and homogenization theory for random velocity fields. Before moving on to develop the homogenization theory for random velocity fields, we pause to compare it to the Kubo theory presented in Section 2.4.1. The principal differences in the asymptotic setup are that

• the Kubo theory rescales the spatial scale of the velocity field in tandem with the spatial scale of the passive scalar field, while the homogenization scaling leaves the velocity field fixed, and leads to a strong formal separation of the spatial scales of the velocity and passive scalar field;
• the homogenization theory is formulated for steady velocity fields, whereas the Kubo theory relies crucially on sufficiently rapid decorrelation of the velocity field in time.

The Kubo and homogenization theories may therefore be viewed as complementary coarse-grained averaging theories. The effective averaging in homogenization theory of steady random velocity fields with short-range correlations occurs because a large spatial volume contains many regions between which the velocity fluctuations are essentially independent. Therefore, any coarse volume will sample and average over the full statistics of the velocity field, assuming it is ergodic with short-ranged correlations. Under the Kubo rescaling, on the other hand, the spatial scale of the velocity field is scaled up along with the coarse scale of the passive scalar field, so there is no averaging over space. Instead, the effective averaging occurs because a coarse-grained time interval can be broken up into many subintervals over which the velocity field fluctuations are essentially
independent, provided that the random velocity field has sufficiently strong mixing (forgetting) properties as a function of time. We can therefore associate the homogenization theory of steady random velocity fields with spatial averaging, and the Kubo theory with temporal averaging. This explains why the homogenization theory requires short-range correlations in space, but no temporal decorrelation, whereas the Kubo theory requires short-range correlations in time (through the mixing assumption), but not necessarily any spatial decorrelation.

One interesting question is why the Kubo effective diffusivity formula (80) is so much simpler than the homogenized effective diffusivity formula (87). The answer is that spatial averaging by a tracer is much more complex than temporal averaging. Indeed, the temporal coordinate of a tracer marches in a trivial linear manner as time proceeds, whereas the variation of its spatial coordinate depends intricately on the properties of the velocity field. Under Kubo rescaling, the velocity field varies slowly in space but rapidly in time, so the fluctuations in a tracer's velocity come primarily from the intrinsic (Eulerian) temporal decorrelation of the velocity field, rather than from the tracer's motion across the velocity field. Indeed, the tracer effectively feels a velocity field which is (locally) constant in space but fluctuating in time. Therefore, the averaged motion of the tracer may, in the asymptotically rescaled Kubo limit, be simply expressed in terms of time averages of the statistics of the velocity field (80). A tracer moving in a steady flow, on the other hand, samples the velocity field in a nonuniform way; it will, for example, spend a disproportionate amount of time near stagnation points. Hence, the homogenized effect of the steady velocity field felt by the tracer cannot be expressed in terms of a straightforward volume average of the velocity field.
The effective bulk averaging of the tracer transport in homogenization theory is instead intricately dependent on the velocity microstructure and the effects of molecular diffusion, as expressed through the cell problem (86).

2.4.3. Alternative representations for effective diffusivity in random flows

A Stieltjes integral representation and some variational principles can be formulated for the effective diffusivity of a passive scalar field in a random velocity field with short-range correlations, in analogy to the periodic velocity field case discussed in Paragraph 2.1.4.1. The formulas have a similar appearance in both the periodic and random settings, though their derivation is more technically involved in the random case.

2.4.3.1. Stieltjes integral representation. The effective diffusivity $\mathcal{K}^*$ of the passive scalar field in a steady random velocity field with $\mathrm{Pe} < \infty$ was shown by Avellaneda and the first author [9,12] to be expressible as a Stieltjes integral,

$$\hat{e} \cdot \bar{\mathcal{K}} \cdot \hat{e} = \mathrm{Pe}^2 \, \| \mathbf{v} \cdot \hat{e} \|_{-1}^2 \int_{-\infty}^{\infty} \frac{d\rho^{\mathbf{v}}}{1 + \mathrm{Pe}^2 \lambda^2} \tag{88}$$

with respect to a certain measure $d\rho^{\mathbf{v}}$ which is nonnegative and normalized to have total integral equal to one. The Stieltjes integral formula (88) is formally identical to that for the periodic case (41); only the definitions of its components are modified. For example, the definition of the Sobolev norm $\| \cdot \|_{-1}$ in the prefactor now involves an average over the random ensemble of velocity fields rather than over a period cell (40). Most importantly, the operator $A^{\mathbf{V}}$ defined by

$$A^{\mathbf{V}} \mathbf{u} = \nabla \Delta^{-1} \big( (\mathbf{V} + \mathbf{v}(\mathbf{x})) \cdot \mathbf{u} \big)$$
which arises in the formal derivation of the Stieltjes integral representation is now a random operator acting on homogenous random fields with finite variance defined over $\mathbb{R}^d$. The measure $d\rho^{\mathbf{v}}$ which appears in the Stieltjes integral representation is consequently no longer a discrete measure concentrated on a countable number of points, but is instead generally a continuous distribution (with possibly some discrete components). Therefore, the Stieltjes integral representation will not generally be reducible to a discrete sum (39) as in the periodic case. Moreover, the support of $d\rho^{\mathbf{v}}$ may be unbounded, in which case Eq. (88) implies that a small Péclet number expansion of the effective diffusivity will be divergent for all Péclet numbers. The work of Kraichnan [181] suggests that this does indeed occur for velocity fields with Gaussian statistics. The Stieltjes integral representation therefore represents in these cases a rigorous resummation of a formal perturbative power series with zero radius of convergence [9,11]. We stress, however, that this rigorous resummation is valid only when the notion of an effective, homogenized diffusivity is itself meaningful, i.e. when there is a strong separation of scales between the velocity and passive scalar field. Passive scalar transport in fully developed turbulence, in particular, cannot be handled in this manner (see Section 3.4.3).

As in the periodic case, the existence of the Stieltjes integral representation implies that certain Padé approximants can be built from the coefficients of a formal (but divergent) small Péclet number series which rigorously bound the effective diffusivity at all Péclet numbers. Criteria for maximal and minimal diffusivity are more difficult to formulate for the case of random velocity fields due to the loss of discreteness of the measure $d\rho^{\mathbf{v}}$ appearing in the Stieltjes integral representation formula (88).
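The coexistence of a divergent small Péclet number series with a perfectly finite Stieltjes integral can be illustrated numerically. The following is a minimal sketch, assuming purely for illustration that the measure is a standard Gaussian in the integration variable (written λ here); this choice of measure, the function names, and the quadrature grid are hypothetical and are not taken from the theory above. Expanding $1/(1+\mathrm{Pe}^2\lambda^2)$ in powers of $\mathrm{Pe}^2$ and integrating term by term gives coefficients proportional to the even moments $(2n-1)!!$ of the Gaussian, which grow factorially.

```python
import numpy as np

def stieltjes_integral(pe, half_width=12.0, n=200001):
    """F(Pe) = integral of d_rho(lam) / (1 + Pe^2 lam^2).

    ASSUMPTION: d_rho is a standard Gaussian probability measure,
    chosen only as an example of a measure with unbounded support.
    """
    lam = np.linspace(-half_width, half_width, n)
    dlam = lam[1] - lam[0]
    density = np.exp(-0.5 * lam**2) / np.sqrt(2.0 * np.pi)
    # Uniform-grid sum; the integrand is negligible at the endpoints.
    return float(np.sum(density / (1.0 + (pe * lam) ** 2)) * dlam)

def formal_series_terms(pe, n_terms):
    """Terms (-1)^n Pe^(2n) m_2n of the formal expansion in Pe^2,
    where m_2n = (2n-1)!! are the even moments of the Gaussian."""
    terms = []
    moment = 1.0  # m_0 = 1
    for n in range(n_terms):
        terms.append((-1) ** n * pe ** (2 * n) * moment)
        moment *= 2 * n + 1  # m_(2n+2) = (2n+1) * m_2n
    return terms

if __name__ == "__main__":
    pe = 0.5
    print("integral:", stieltjes_integral(pe))
    print("partial sums:", np.cumsum(formal_series_terms(pe, 12)))
```

The printed partial sums first bracket the integral and then oscillate with growing amplitude, since the ratio of successive terms, $\mathrm{Pe}^2(2n+1)$, eventually exceeds one for any nonzero Pe.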
Generalizations of the Stieltjes integral representation to time-dependent random velocity fields with $\mathrm{Pe} < \infty$ were derived by Avellaneda and Vergassola [20]; the modification is parallel to that of the periodic case discussed in Paragraph 2.1.4.1 (see in particular Eq. (42)).

2.4.3.2. Variational principles. The effective diffusivity in a homogenous random field can be represented through a variational principle (47) just as in the periodic case. The only differences are that:
• the trial functions $\mathbf{u}$ are now homogenous random fields defined on $\mathbb{R}^d$, with $\mathbf{u} - \hat{e}$ a generalized gradient of a homogenous random field with finite variance, and
• the average of the functional $|\mathbf{u}|^2 + \mathrm{Pe}^2 \, \mathbf{u} \cdot \mathcal{K} \cdot \mathbf{u}$ over the period cell is replaced by an average over the statistical ensemble of the random velocity field.
The original variational formulation for homogenous random velocity fields was derived by Avellaneda and the first author in [12]. Other types of variational principles for random velocity fields, particularly with a cellular structure, were later put forth by Fannjiang and Papanicolaou [99]. These variational principles can be used to obtain rigorous bounds and estimates for the effective diffusivity in random velocity fields [12,99] in a similar way to periodic velocity fields, but they are somewhat more difficult to implement in practice since the fundamental spatial domain is infinite space rather than a compact period cell [99].

2.4.4. Examples of effective diffusivity for random flows

Most of the studies of tracer transport in random velocity fields with short-range correlations appear to focus on flows with vortex or cellular structures, and we mention a few of these here.
Avellaneda and the first author [12] derived an exact formula for the effective diffusivity in a flow consisting of a random tiling of two-dimensional space by a family of circular vortices of various sizes but common shape and direction of rotation. The special structure of this random flow permitted the reduction of the cell problem (86) to an exactly solvable, radially symmetric PDE on a single disk. It was found in this way that the effective diffusivity scales as $\mathrm{Pe}^{1/2}$ at high Péclet number, just as in the case discussed in Section 2.2.3 where the vortices are arranged as periodic cells.

Avellaneda et al. [19] studied two other kinds of random vortex models, in which vortices of fixed shape and size were thrown down onto the plane according to a random Poisson process, either with a common or randomly chosen orientation. The effective diffusivity in these flows increased as a function of the vortex density, but was only enhanced by a factor of 2.5 at the highest density simulated, with $\mathrm{Pe} = 100$. An approximate Lagrangian analysis (see Section 3.1.3) of these random vortex flows is also offered in [19].

Isichenko et al. [144] formulated an interesting conjecture, based on some scaling laws from percolation theory, that the effective diffusivity in a generic (or "common-position") random, homogenous flow with short-range correlations should scale at high Péclet number as $\mathrm{Pe}^{10/13}$. Using variational principles along with some scaling hypotheses from percolation theory, Fannjiang and Papanicolaou [99] verify this result for certain flows obtained by a random perturbation of the canonical steady cellular flow (62). These authors [99] also study tracer transport in randomized checkerboard flows (in which the flow in each cell is randomly turned on or off).

2.4.5. Appendix: Random velocity fields

An extensive and accessible treatment for the rigorous formulation of random functions may be found in [341,342].
We restrict ourselves here to formally setting forth some elementary notions which we will use throughout our discussion of random velocity fields. For concreteness, we will introduce the definitions of various aspects of random functions in the context of random velocity fields, but they clearly apply to other random functions, such as the passive scalar field.

A random function may be described quite simply as a random variable taking values in some given function space. In practice, the probability measure on the function space is described implicitly through a sufficiently precise specification of the statistical properties of the random function. A generic way of describing a random function, such as the random velocity field $\mathbf{v}(\mathbf{x}, t)$, is to present all its finite-dimensional distributions and to declare $\mathbf{v}$ to be separable. A finite-dimensional distribution of $\mathbf{v}(\mathbf{x}, t)$ is just the (joint) probability law for the values of $\mathbf{v}$ at a given finite collection of points. Knowing all the finite-dimensional distributions is equivalent to a rule for computing

$$\mathrm{Prob}\{ \mathbf{v}(\mathbf{x}^{(j)}, t^{(j)}) \in A_j , \; j = 1, \ldots, N \}$$

for every finite collection of space-time points $\{ (\mathbf{x}^{(j)}, t^{(j)}) \}_{j=1}^{N}$ and Borel subsets $A_j$ of the range space of $\mathbf{v}$. If one wishes to prescribe these, the rules must obey some simple consistency conditions ([41], Section 36). The separability condition is technical ([41], Section 38), but in practice means that we want the realizations of $\mathbf{v}$ to be as nice as possible, given the finite-dimensional distributions. Random functions defined over a multidimensional spatial domain (and possibly time) are often referred to as random fields, and we will use this terminology hereafter.
2.4.5.1. Homogenous and stationary random fields. Many of the random fields arising in our models have statistical space-time symmetries, and we now define the most basic of these. A more detailed discussion may be found in ([341], Ch. 4). A random field $\mathbf{v}(\mathbf{x}, t)$ is called:
• stationary if all finite-dimensional distributions are invariant under time translation:
$$\mathrm{Prob}\{ \mathbf{v}(\mathbf{x}^{(j)}, t^{(j)}) \in A_j , \; j = 1, \ldots, N \} = \mathrm{Prob}\{ \mathbf{v}(\mathbf{x}^{(j)}, t^{(j)} + t) \in A_j , \; j = 1, \ldots, N \}$$
for any real $t$.
• homogenous if all finite-dimensional distributions are invariant under rigid translations in space:
$$\mathrm{Prob}\{ \mathbf{v}(\mathbf{x}^{(j)}, t^{(j)}) \in A_j , \; j = 1, \ldots, N \} = \mathrm{Prob}\{ \mathbf{v}(\mathbf{x}^{(j)} + \mathbf{x}, t^{(j)}) \in A_j , \; j = 1, \ldots, N \}$$
for any $\mathbf{x} \in \mathbb{R}^d$.
Colloquially, a homogenous random field looks statistically the same at every point of space, and a stationary random field looks statistically the same at every moment of time. Another important statistical symmetry which we will discuss in Section 4 is isotropy, which is statistical equivariance under arbitrary rotations and reflections. We refer to ([341], Section 22) for a detailed discussion of the meaning and implications of statistical isotropy for a random field.

2.4.5.2. Mean and two-point correlation function. Two fundamental statistical functions associated with a random field are the mean

$$\boldsymbol{\mu}(\mathbf{x}, t) = \langle \mathbf{v}(\mathbf{x}, t) \rangle$$

and the two-point correlation tensor

$$\tilde{\mathcal{R}}(\mathbf{x}, \mathbf{x}', t, t') = \langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x}', t') \rangle ,$$

where angle brackets denote statistical averages. So long as there is no danger of ambiguity with reference to higher-order correlation functions, we will often refer to the "two-point correlation tensor" as simply the "correlation tensor" (or "correlation function" for scalar random fields). The arguments of the correlation tensor (function) will be called observation points (or sites, locations). When $\mathbf{v}(\mathbf{x}, t)$ is a homogenous, stationary random field, all statistical descriptors, including the mean $\langle \mathbf{v}(\mathbf{x}, t) \rangle$ and the two-point correlation function $\langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x}', t') \rangle$, must be invariant under space and time translation.
The mean must therefore be constant, and the two-point correlation function must depend only on the differences between the space and time coordinates of the observation points:

$$\langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x}', t') \rangle = \tilde{\mathcal{R}}(\mathbf{x} - \mathbf{x}', t - t') .$$

2.4.5.3. Spectral density. The structure of a homogenous, stationary random field is in some ways more directly expressed in terms of the Fourier transform of the correlation tensor

$$\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega) = \int_{\mathbb{R}^d} \int_{-\infty}^{\infty} e^{-2\pi i (\mathbf{k} \cdot \mathbf{x} + \omega t)} \, \tilde{\mathcal{R}}(\mathbf{x}, t) \, dt \, d\mathbf{x} \tag{89}$$

than in the correlation tensor itself. We shall call $\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega)$ the spectral density tensor of the random field. For a scalar random field, the spectral density tensor is just a scalar function.
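As a small check of the Fourier convention in (89), with the factor $2\pi$ in the exponent and no prefactor, the spectral density of a scalar field in one space dimension can be evaluated by direct quadrature. This is only a sketch: the function name, the grid parameters, and the Gaussian test correlation $R(x) = e^{-\pi x^2}$ are illustrative assumptions. Under this convention that correlation function is its own transform, $\hat{R}(k) = e^{-\pi k^2}$.

```python
import numpy as np

def spectral_density_1d(corr, k, half_width=20.0, n=400001):
    """Approximate R_hat(k) = integral of exp(-2*pi*i*k*x) R(x) dx
    on a uniform grid (hypothetical helper; assumes R decays well
    inside [-half_width, half_width])."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    integrand = np.exp(-2j * np.pi * k * x) * corr(x)
    return complex(np.sum(integrand) * dx)

if __name__ == "__main__":
    gaussian_corr = lambda x: np.exp(-np.pi * x**2)  # assumed example
    for k in (0.0, 0.5, 1.0):
        print(k, spectral_density_1d(gaussian_corr, k).real, np.exp(-np.pi * k**2))
```

For a real, even correlation function the imaginary part vanishes up to rounding, so only the real part is printed.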
An important theorem due to Khinchine states that the class of correlation tensors of homogenous, stationary random fields coincides with the class of tensor functions for which the Fourier transform is everywhere a non-negative definite, Hermitian matrix ([341], Section 9). This means that for any homogenous, stationary random field, the spectral density tensor $\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega)$ is guaranteed to be a non-negative definite, Hermitian matrix for each $\mathbf{k}$ and $\omega$.

It is often convenient to condense the spectral density tensor of the velocity field $\mathbf{v}(\mathbf{x}, t)$ by integrating over shells of constant wavenumber $|\mathbf{k}|$; this produces the spatio-temporal energy spectrum:

$$\tilde{E}(k, \omega) = \frac{1}{2} \int_{S^{d-1}} \mathrm{Tr} \, \hat{\tilde{\mathcal{R}}}(k \hat{u}, \omega) \, d\hat{u} ,$$

where $S^{d-1}$ is the $(d-1)$-dimensional sphere of unit radius. This function specifies the density of energy $E = \frac{1}{2} \langle |\mathbf{v}|^2 \rangle$ in wavenumber-frequency space: the amount of energy contained in the band $k \pm \Delta k$, $\omega \pm \Delta\omega$ is $\int_{k - \Delta k}^{k + \Delta k} \int_{\omega - \Delta\omega}^{\omega + \Delta\omega} \tilde{E}(k', \omega') \, d\omega' \, dk'$. The usual "energy spectrum" $E(k)$ is the integral of the spatio-temporal energy spectrum over frequency space,

$$E(k) = \int_{-\infty}^{\infty} \tilde{E}(k, \omega) \, d\omega ,$$

and describes the statistical spatial structure of the velocity field at any given moment of time. The full spectral density tensor $\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega)$ provides a more detailed resolution of the random field $\mathbf{v}(\mathbf{x}, t)$ into fluctuations of various wavenumbers $\mathbf{k}$ and frequencies $\omega$, including correlations between various components of the velocity field ([341], Sections 9, 22.1 and 22.2).

2.4.5.4. Gaussian random fields. Gaussian random fields are the extension of Gaussian random variables to the random function setting. By definition, all finite-dimensional distributions of a Gaussian random field are Gaussian. Gaussian random fields are therefore completely described by their mean and correlation tensor [341,342]. Homogenous, stationary, Gaussian random fields $\mathbf{v}(\mathbf{x}, t)$ may also be defined through their spectra, using the Khinchine theorem ([341], Section 9). One simply chooses an arbitrary constant mean and an arbitrary spectral density tensor $\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega)$ which is everywhere a non-negative definite, Hermitian matrix such that the entries of $\hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega)$ and $\hat{\tilde{\mathcal{R}}}(-\mathbf{k}, -\omega)$ are complex conjugates of one another. Then, by the Khinchine theorem, there exists a well-defined Gaussian random field with the specified constant mean and correlation tensor

$$\tilde{\mathcal{R}}(\mathbf{x}, t) = \int_{\mathbb{R}^d} \int_{-\infty}^{\infty} e^{2\pi i (\mathbf{k} \cdot \mathbf{x} + \omega t)} \, \hat{\tilde{\mathcal{R}}}(\mathbf{k}, \omega) \, d\omega \, d\mathbf{k} .$$
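The construction of a Gaussian random field from its spectrum can be sketched in the simplest setting of a steady, scalar, mean-zero field in one space dimension. The code below is an illustrative approximation rather than a method taken from the text: it replaces the spectral integral by a midpoint sum over a finite number of wavenumbers, and the function names, the truncation `k_max`, and the example spectrum $E(k) = e^{-\pi k^2}$ are all assumptions. Each mode receives independent standard normal amplitudes, so that the ensemble correlation of the sum is $\sum_j 2E(k_j)\,\Delta k \cos(2\pi k_j r)$, a discretization of $\int_{-\infty}^{\infty} E(|k|)\, e^{2\pi i k r}\, dk$.

```python
import numpy as np

def mode_grid(spectrum, k_max, n_modes):
    """Midpoint wavenumbers on (0, k_max) and mode variances 2*E(k)*dk."""
    dk = k_max / n_modes
    k = (np.arange(n_modes) + 0.5) * dk
    return k, 2.0 * spectrum(k) * dk

def sample_gaussian_field(spectrum, k_max, n_modes, x, rng):
    """One realization v(x) = sum_j sigma_j (a_j cos + b_j sin)(2*pi*k_j*x),
    with a_j, b_j independent standard normals."""
    k, var = mode_grid(spectrum, k_max, n_modes)
    sigma = np.sqrt(var)
    a = rng.standard_normal(n_modes)
    b = rng.standard_normal(n_modes)
    phase = 2.0 * np.pi * np.outer(np.atleast_1d(x), k)
    return np.cos(phase) @ (sigma * a) + np.sin(phase) @ (sigma * b)

def discrete_correlation(spectrum, k_max, n_modes, r):
    """Exact ensemble correlation <v(x) v(x + r)> of the discretized field."""
    k, var = mode_grid(spectrum, k_max, n_modes)
    return float(np.sum(var * np.cos(2.0 * np.pi * k * r)))

if __name__ == "__main__":
    spectrum = lambda k: np.exp(-np.pi * k**2)  # assumed example spectrum
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 11)
    print(sample_gaussian_field(spectrum, 6.0, 600, x, rng))
    print(discrete_correlation(spectrum, 6.0, 600, 0.5), np.exp(-np.pi * 0.25))
```

A finite mode sum of this kind is periodic or nearly periodic at large separations, so it should only be trusted on length scales well below $1/\Delta k$.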
3. Anomalous diffusion and renormalization for simple shear models

In Section 2, we have discussed some general mathematical theories for the computation of the effective diffusion of a passive scalar field at large scales and long times. The underlying theory relies on the velocity field being periodic or having short-range correlations so that a strong scale separation between velocity and passive scalar scales can exist. While such theories can work well for moderately turbulent flows with sufficiently simple spatial structures, they may not furnish
a good description for tracer diffusion in a fully developed turbulent flow at high Reynolds number. As we shall discuss in more detail in Section 3.4.3, fully developed turbulent flows are characterized by strong spatio-temporal correlations extending over a wide range of scales, all the way up to the scale of the external forcing. In many applications, particularly in atmospheric science [78,196], this forcing scale is on the same order of magnitude as the macroscales on which scientists wish to explicitly describe the flow. Consequently, there is no clean separation between the active scales of the velocity field and the scales of observational interest. Homogenization theory may therefore not be adequate to describe the macroscale passive scalar dynamics in these situations.

The development of simplified effective equations for the large-scale passive scalar statistics is, however, of particular practical importance in the numerical simulation of transport processes in highly turbulent environments. It is often not possible, even on the largest contemporary supercomputers, to explicitly resolve all the active scales of turbulence [154]. A scheme is therefore needed for assessing the effects of the continuum of energetic but unresolved turbulent small scales of motion on the resolved scales, without having to compute them in full detail. The simplest modelling strategies account for the unresolved small scales by replacing the molecular diffusivity with a larger "eddy diffusivity". The value of this eddy diffusivity may be estimated by some ad hoc procedure such as mixing length arguments or through approximate, analytical theories based on perturbation expansions in a small parameter [182], ideas from renormalization group (RNG) theory [227,243,300,344], or simplifying assumptions concerning higher-order correlations between the velocity and scalar field [31,196].
It has been pointed out by several authors [166,182,286], however, that such an eddy diffusivity model may not be sufficient to capture the effects of the unresolved scales of the velocity field when they are not well separated from the cutoff scale. More complex effective large-scale equations for the passive scalar field have been proposed based on the renormalization group [286] or perturbation expansions [166], often resummed according to various renormalized perturbation theories from field theory [177,227,285]. These equations are typically nonlocal in space and time, reflecting large-scale interactions mediated by the unresolved small scales [286] and/or memory effects coming from the significant spatio-temporal correlations of the unresolved scales [166,177,285]. The investigation of the relative merits and shortcomings of these various approximate closure theories for turbulent diffusion is still very much an active area of research [227].

In response to the above issues, Avellaneda and the first author [10,14] have developed a mathematically rigorous theory for turbulent transport in a class of Simple Shear Models in which the velocity field has a stratified geometry:
$$\mathbf{v}(\mathbf{x}, t) = \mathbf{v}(x, y, t) = \begin{bmatrix} w(t) \\ v(x, t) \end{bmatrix} .$$
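A brief sketch may help show why this stratified geometry is tractable. In the absence of molecular diffusion, the trajectory equations $dX/dt = w(t)$, $dY/dt = v(X(t), t)$ decouple: the cross-shear coordinate depends only on $w$, and the shear-parallel coordinate then follows by a single quadrature, $X(t) = x_0 + \int_0^t w(s)\,ds$ and $Y(t) = y_0 + \int_0^t v(X(s), s)\,ds$. The code below evaluates these two integrals with the trapezoidal rule for user-supplied functions; the function name, the neglect of molecular diffusion, and the deterministic test flow in the usage example are illustrative assumptions.

```python
import numpy as np

def shear_trajectory(w, v, x0, y0, t):
    """Tracer path in the flow (w(t), v(x, t)) with zero molecular diffusion.

    w : callable, w(t) -> array of cross-sweep velocities
    v : callable, v(x, t) -> array of shear velocities
    t : increasing array of times starting at the initial time
    Returns arrays X(t), Y(t) computed by cumulative trapezoidal quadrature.
    """
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    w_vals = np.asarray(w(t), dtype=float)
    # X(t) = x0 + integral of w(s) ds
    x = x0 + np.concatenate(([0.0], np.cumsum(0.5 * (w_vals[1:] + w_vals[:-1]) * dt)))
    # Y(t) = y0 + integral of v(X(s), s) ds, using the already known X(s)
    v_vals = np.asarray(v(x, t), dtype=float)
    y = y0 + np.concatenate(([0.0], np.cumsum(0.5 * (v_vals[1:] + v_vals[:-1]) * dt)))
    return x, y

if __name__ == "__main__":
    # Assumed deterministic example: constant sweep, sinusoidal steady shear.
    t = np.linspace(0.0, 2.0, 2001)
    x, y = shear_trajectory(lambda s: np.ones_like(s), lambda x, s: np.sin(x), 0.3, 0.0, t)
    print(x[-1], y[-1], np.cos(0.3) - np.cos(2.3))
```

For random $w$ and $v$, the same two integrals express the tracer position in terms of explicit random variables, one realization at a time.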
The shearing component $v(x, t)$ is taken as a homogenous and stationary, mean zero random field, and the spatially uniform transverse sweeping component $w(t)$ is taken as a stationary random process with possibly nonzero mean. The case of no cross sweep $w(t) = 0$ was analyzed in [10], and the case of a constant mean cross sweep $w(t) \equiv \bar{w}$ was studied in [14,141]. The inclusion of randomness in $w(t)$ in the present work is new.

The virtue of the Simple Shear Model is that the tracer trajectory equations may be exactly integrated for all times, as we saw in Section 2.3.1 for the case of deterministic, periodic velocity fields. In the present case of random velocity fields, these formulas still provide a representation of the tracer position in terms of explicit random variables which are open to mathematical analysis from a variety of angles [10,14,18,141,207,233]. The random tracer trajectories maintain a very rich and subtle statistical behavior [10,14,141], as will be demonstrated throughout Section 3. The Simple Shear Model therefore provides a means of studying various nontrivial aspects of turbulent diffusion in a precise manner. Versions of shear flow models have also been utilized to study horizontal mixing in the ocean [347].

While the mathematical analysis of the Simple Shear Model relies heavily on the special geometric structure, it can include several statistical features of a realistic turbulent flow, such as spatial correlations extending over a wide range of scales along with a reasonable temporal structure. Moreover, many flow fields in geophysical applications do have an underlying shear structure. We will discuss another turbulent diffusion model in Section 4 which instead drastically simplifies the temporal correlation structure of the random velocity field, but permits a more general geometry [152,179]. Our purpose in investigating these simplified models is that through unambiguous computations and analysis, we can obtain insight concerning subtle physical features of turbulent transport which might be missed by crude reasoning on a "realistic" turbulent velocity field model.

Throughout Section 3 we will examine a rich variety of Simple Shear flows with an emphasis on describing flows which generate anomalous diffusion of tracers, and on identifying the physical mechanisms responsible. By anomalous diffusion, we mean statistical tracer motion which departs from the standard situation in which its coarse-scale, long-time behavior resembles an ordinary Brownian motion.
We will also at times mention some finite-time anomalies in which the tracer acts as if it had a temporarily negative diffusion coefficient (Sections 3.4.4 and 3.5.1).

Exactly solvable models also provide excellent test problems [13,17,300] for assessing the strengths and weaknesses of the approximate closure theories mentioned above which seek to furnish simplified descriptions of turbulent diffusion on the macroscale. We will briefly discuss this use of simplified models in Section 7.

Overview of Section 3: We will approach the study of the Simple Shear Model through submodels of increasing complexity. First we consider in Section 3.1 the Random Sweeping Model in which $v(x, t) = 0$, and the tracer is advected only by the time-dependent random field $w(t)$. Even in this extremely simplified model, the tracer motion can behave anomalously, depending on the long-range (low-frequency) statistical characterization of $w(t)$. We classify the various forms of anomalous diffusion, and discuss their origin on a heuristic level. We then show from a Lagrangian viewpoint how this intuition gleaned from the Random Sweeping Model can be applied to understand qualitatively the circumstances under which anomalous tracer diffusion may arise in general flows. This theme will be invoked repeatedly in later subsections as we interpret the mathematically derived turbulent diffusion formulas in the Simple Shear Model.

Next, in Section 3.2, we develop the Random Steady Shear (RSS) Model, in which the shear field is taken as a random steady flow $v(x, t) = v(x)$ with Gaussian statistics and correlation function
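$$\langle v(x') v(x + x') \rangle \equiv R(x) = \int_{-\infty}^{\infty} E(|k|) \, e^{2\pi i k x} \, dk , \quad \text{with } E(k) \sim A_E |k|^{1 - \varepsilon} \tag{90}$$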
at low wavenumbers $k$. Such a velocity field could model flow through a stratified porous medium. The exponent $\varepsilon < 2$ measures the strength of the long-range correlations of the velocity field. As $\varepsilon$ increases towards $\varepsilon = 2$, the long-range spatial correlations become more pronounced. We concentrate on how the transport of tracers along the shearing direction $y$ depends on the long-range spatial structure of $v(x)$, the presence of molecular diffusion, and various types of cross sweeps $w(t)$. Following [10,141], we derive exact formulas for the mean-square tracer displacement along the shear and analyze their physical content.

The effects of temporal fluctuations in the shear flow $v(x, t)$ are studied within the Random Spatio-Temporal Shear (RSTS) Model in Section 3.3. We take $v(x, t)$ to be statistically stationary in time, with the idea of qualitatively mimicking turbulent flows which have reached a quasi-equilibrium state in response to some statistically stationary external driving and internal viscous dissipation. Motivated by the inertial-range theory of turbulence (see, for example, [320]), we associate a wavenumber-dependent correlation time $\tau(k)$ to shear flow fluctuations with spatial wavenumber $k$, where $\lim_{k \to 0} \tau(k) \sim A_\tau |k|^{-z}$. The exponent $z \geq 0$ describes how slowly the large scales (low wavenumbers) of the shear flow vary in time; $z \to \infty$ corresponds to a steady limit in which the large scales are frozen, whereas $z = 0$ describes the opposite limit in which arbitrarily large scales decorrelate at a uniformly rapid rate. We explore how the temporal structure of the large scales, as described by $z$, influences the shear-parallel transport of tracers.

We next consider the full probability distribution function (PDF) for the shear-parallel motion of a single tracer, which is equivalent to the description of the evolution of the mean $\langle T(x, y, t) \rangle$ of the passive scalar field density.
For velocity fields with sufficiently localized spatial correlations, the homogenization theory discussed in Section 2 shows that the PDF for the position of a single tracer approaches a Gaussian distribution at long times. Equivalently, the mean statistics are governed by an ordinary diffusion equation with an enhanced diffusion coefficient. In Section 3.4, we demonstrate through explicit examples how the PDF for a tracer can forever deviate strongly from a Gaussian distribution in Simple Shear Models with sufficiently strong long-range correlations [10]. A related feature is that the large-scale, long-time evolution of the mean statistics is not described by a standard diffusion PDE, but rather by a nonlocal diffusion equation. This anomalous large-scale long-time behavior actually manifests itself in a broad range of models with a nearly stratified structure [16].

In Section 3.4.3 we develop a modification of the Simple Shear Model which permits the study of velocity fields with a statistically self-similar inertial range of scales $L_K \ll r \ll L_0$, an important feature manifest in real fully developed turbulence [10,14]. According to Kolmogorov's theory [169], the energy spectrum within the inertial range of wavenumbers $L_0^{-1} \ll k \ll L_K^{-1}$ scales as $k^{-5/3}$. This is largely confirmed by experimental data [307], though some question remains whether there are "intermittency corrections" [34,309] which alter the exponent slightly from 5/3. The $k^{-5/3}$ scaling of the energy spectrum formally corresponds to a value of $\varepsilon = 8/3$ in the Simple Shear Model (see Eq. (90)), but the cutoff of this scaling on the low wavenumber (infrared) end at $L_0^{-1}$ is essential for the total energy to be finite and the flow to be well-defined in the standard sense. Kolmogorov theory also predicts that the decorrelation times for the turbulent fluctuations in the inertial range scale as $\tau(k) \sim k^{-2/3}$, corresponding to a value $z = 2/3$ in the RSTS Model. A competing point of view [319] suggests instead the value $z = 1$.
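The correspondence between the Kolmogorov scalings and the model exponents can be written out explicitly. The short derivation below is a sketch: the symbols $k_{\min}$ and $k_1$ for a lower integration limit and a fixed reference wavenumber are introduced here only for illustration, and the low-wavenumber power laws of the model are assumed to hold throughout the range of integration.

```latex
% Spatial exponent: match the model spectrum to the Kolmogorov spectrum.
E(k) \sim A_E |k|^{1-\varepsilon}, \qquad
E(k) \sim k^{-5/3}
\;\Longrightarrow\; 1 - \varepsilon = -\tfrac{5}{3}
\;\Longrightarrow\; \varepsilon = \tfrac{8}{3}.

% Temporal exponent: match the model correlation time to the inertial-range one.
\tau(k) \sim A_\tau |k|^{-z}, \qquad
\tau(k) \sim k^{-2/3}
\;\Longrightarrow\; z = \tfrac{2}{3}.

% Infrared behaviour of the energy contained in k_{\min} < k < k_1:
\int_{k_{\min}}^{k_1} k^{1-\varepsilon}\, dk
  = \frac{k_1^{\,2-\varepsilon} - k_{\min}^{\,2-\varepsilon}}{2-\varepsilon}
  \qquad (\varepsilon \neq 2),
% which stays bounded as k_{\min} \to 0 only if \varepsilon < 2.
% For \varepsilon = 8/3 it grows like k_{\min}^{-2/3}, so the power law
% must be cut off at low wavenumbers for the total energy to be finite.
```

This is why the value $\varepsilon = 8/3$ lies outside the range $\varepsilon < 2$ admitted without an infrared cutoff.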
We modify the RSTS Model to permit the study of velocity fields with scalings corresponding to such values of $\varepsilon$ and $z$ by introducing an explicit infrared cutoff [10,14]. We then formulate the problem of computing
effective large-scale equations for the mean passive scalar density in the presence of such a turbulent shear flow, and concretely indicate some of the inherent difficulties. Large-scale effective equations for the mean passive scalar density in the Simple Shear Model have in fact been rigorously derived by Avellaneda and the first author [10,14], and they display a rich variety of qualitative structure depending on the value of the scaling exponents $\varepsilon$ and $z$. We briefly mention some of the findings of these rigorous renormalization computations.

Up to this point in the paper, we have for the most part concentrated on describing the statistical motion of a single tracer and the intrinsically related evolution of the mean passive scalar density $\langle T(\mathbf{x}, t) \rangle$. These dynamics are strongly determined by the large-scale features of the turbulent system, simply because these are the most energetic in most realistic circumstances. In Sections 3.5 and 4, we shift our focus to the small-scale fluctuations of the passive scalar field. These are intimately related to the dynamics of the separation between pairs of tracers. While the large-scale properties of the passive scalar field are of more obvious observational and applied physical interest, the small-scale features are crucial in estimating the size of clouds of pollutants [78,132] and the progress of mixing processes and combustion in turbulent environments [43,339]. Moreover, the small-scale statistics of turbulent systems are of particular theoretical interest because they are thought to exhibit a variety of universal features which are independent of the particular large-scale configuration (see [309]).

We explicitly examine in Section 3.5 various aspects of the small-scale passive scalar statistics in the Simple Shear Model in the absence of molecular diffusion ($\kappa = 0$).
We lay the foundations of our study with the development of an exact equation for the pair-distance function introduced by Richardson [284], which describes the PDF for the separation of a pair of tracers. From the pair-distance function, we can deduce some important properties about the evolution of interfaces between regions in which the passive scalar field is present or absent. The way in which these interfaces wrinkle plays a crucial role in turbulent combustion and mixing [43,310,305,339]. We characterize the roughness of interfaces through their fractal dimension [215], which may be explicitly computed within the Simple Shear Model. We relate the exact results to experimental measurements and other theoretical work, and state some open problems concerning the inclusion of the effects of molecular diffusion. Our study of the small-scale features of the passive scalar field in a turbulent flow continues in Section 4 with the analysis of a different simplified turbulent diffusion model.
3.1. Connection between anomalous diffusion and Lagrangian correlations

Before analyzing the statistical tracer motion in the Simple Shear Model, we set up a general framework for the discussion of anomalous tracer diffusion. We define the terminology we shall use in Section 3.1.1. Next, we provide an extremely simple example of anomalous diffusion in the Random Sweeping Model, which consists of a spatially uniform random flow fluctuating in time (Section 3.1.2). Finally, we emphasize in Section 3.1.3 that tracer motion is determined by the statistics of its Lagrangian velocity, and show how it is connected to but different from the velocity field defined in the Eulerian (laboratory) frame of reference [317]. The Lagrangian perspective provides a basis for the intuitive understanding of the exact mathematical formulas which we will develop for the simple shear models in Sections 3.2 and 3.3.
3.1.1. Qualitative classes of anomalous diffusion

To fix vocabulary, we say that the tracer motion (usually along a certain direction) is:
• Ballistic if the mean-square displacement is growing quadratically in time, corresponding to coherent unidirectional motion.
• Diffusive if the mean-square displacement is growing linearly in time, corresponding to "normal" behavior of a tracer which executes a mean zero, random back and forth motion, with sufficiently rapid decorrelation of its velocity in time. (This is of course the category in which Brownian motion, due to molecular diffusion alone, falls.)
• Trapped if the mean-square displacement remains bounded for all time.
• Superdiffusive if the mean-square displacement is growing faster than linearly in time.
• Subdiffusive if the mean-square displacement is growing at a sublinear rate.
Note that according to these definitions, "ballistic" is a subcategory of superdiffusive, and "trapped" is a subcategory of subdiffusive. We will endeavor, however, to be explicit when ballistic or trapped behavior actually occurs.

Other forms of anomalous diffusion will be described through rigorous examples in Section 3.4. One is the decrease of the mean-square tracer displacement over a certain time interval, so that the effective diffusivity is (temporarily) negative (Section 3.4.4). Therefore, even if an effective equation for the mean statistics can be derived in such flows, it will be ill-posed over some time interval. This phenomenon creates difficulties for conventional numerical Monte Carlo schemes (see Section 6) for turbulent diffusion. Another interesting way in which anomalous diffusion manifests itself is through the higher order statistics. We discuss in Sections 3.4.1 and 3.4.2 an explicit, nontrivial class of examples in which the tracer displacement is neither Gaussian nor self-averaging at large times, in strong contrast to tracers in flows for which the homogenization theory of Section 2 is valid.

3.1.2. Anomalous diffusion in the random sweeping model

To provide a concrete illustration of the variety of anomalous diffusion behavior possible for a tracer, we define the extremely simple yet fundamental Random Sweeping Model. The velocity field in this model is defined to be a spatially uniform flow which fluctuates randomly in time according to a stationary, Gaussian random process. For the sake of simplicity of notation and coherence with the turbulent diffusion models to be discussed in Sections 3.2 and 3.3, we shall restrict attention to the case in which the velocity field always points along a single direction, and the spatial domain is two-dimensional. Then the random velocity field in the Random Sweeping Model is defined:
*(x, t)"*(x, y, t)"
w (t) D , 0
where w (t) is a stationary, Gaussian random process with mean zero and correlation function: D R (t),1w (t)w (t#t)2 . (91) U D D Such a model was considered by Kubo [188] as an example of how randomly #uctuating velocity "elds can act as e!ective di!usion processes when viewed on large scales and long times. A more
general Random Sweeping Model for arbitrary dimensions, without the constraint that the velocity field point along a single direction, was analyzed in great detail in [18]. We will present some pertinent results here.

We will find it convenient to express the correlation function in terms of the power spectrum $E_w(\omega)$ of the velocity field:
$$R_w(t) = \int_{-\infty}^{\infty} e^{2\pi i \omega t} E_w(|\omega|)\, d\omega = 2 \int_0^{\infty} \cos(2\pi\omega t)\, E_w(\omega)\, d\omega . \qquad (92)$$
The power spectrum $E_w(\omega)$ describes precisely the spectral density of the energy $\tfrac{1}{2}\langle |w_f(t)|^2 \rangle$ resolved with respect to frequency. That is, $\int_{\omega-\Delta\omega}^{\omega+\Delta\omega} E_w(\omega')\, d\omega'$ is the amount of energy residing within the frequency band $\omega \pm \Delta\omega$. The power spectrum is a manifestly non-negative function, which in our applications we assume quite reasonably to be smooth (see Paragraph 2.4.5.3).

Consider the location $(X(t), Y(t))$ of a tracer particle advected by the Random Sweeping Model flow, first without molecular diffusion ($\kappa = 0$). Starting from position $(x_0, y_0)$, the (random) tracer position at later times is given by
$$X(t) = x_0 + \int_0^t w_f(s)\, ds , \qquad Y(t) = y_0 .$$
The displacement $X(t) - x_0$ along the sweeping direction is a mean zero Gaussian random variable, since it is a linear functional of the mean zero Gaussian random process $w_f(t)$. Consequently, it is completely described by its variance, which is nothing but the mean-square tracer displacement along the $x$ direction:
$$\sigma_X^2(t) \equiv \langle (X(t) - x_0)^2 \rangle = \int_0^t\!\!\int_0^t \langle w_f(s)\, w_f(s') \rangle\, ds\, ds' = \int_0^t\!\!\int_0^t R_w(s - s')\, ds\, ds' = 2 \int_0^t (t - s)\, R_w(s)\, ds , \qquad (93)$$
where the last equality used a change of variables and the fact that, by its very definition (91), $R_w(t)$ is an even function. Eq. (93) manifestly relates the turbulent diffusion of the tracer to the statistical correlations of the sweeping velocity field $w_f(t)$. Strong and persistent correlations are clearly associated with rapid diffusion.

It is quite convenient mathematically to express the mean-square displacement in terms of the power spectrum of the sweeping velocity field. Substituting Eq. (92) into Eq. (93) and performing the integration over $s$, we obtain
$$\sigma_X^2(t) = 4 \int_0^{\infty} E_w(\omega)\, \frac{1 - \cos 2\pi\omega t}{4\pi^2\omega^2}\, d\omega . \qquad (94)$$
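As a minimal numerical sketch of Eq. (93), the following evaluates the mean-square displacement by quadrature for a hypothetical correlation function $R_w(s) = e^{-|s|}$ (an illustrative choice, not one made in the text) and compares it with the closed form obtained by integrating Eq. (93) by hand. The function names are assumptions introduced only for this example.

```python
import math


def msd_from_correlation(corr, t, n=20000):
    """Midpoint-rule estimate of sigma_X^2(t) = 2 * int_0^t (t - s) R_w(s) ds, Eq. (93)."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (t - s) * corr(s)
    return 2.0 * h * total


def exponential_correlation(s):
    """Hypothetical example correlation R_w(s) = exp(-|s|); not taken from the text."""
    return math.exp(-abs(s))


def msd_exact(t):
    """Closed form of Eq. (93) for R_w(s) = exp(-|s|): 2 (t - 1 + exp(-t))."""
    return 2.0 * (t - 1.0 + math.exp(-t))
```

For this example the integrated correlation is $\int_0^\infty R_w(s)\,ds = 1$, so the closed form behaves like $t^2$ at short times and like $2t$ at long times, which is the ballistic-to-diffusive crossover expected for a finite, nonzero integrated correlation.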
The qualitative long-time behavior of $\sigma_X^2(t)$ depends sensitively on the nature of the low frequency component of $w_f(t)$, as we now describe. Suppose that the power spectrum is smooth for $\omega > 0$, absolutely integrable, and has power law behavior near $\omega = 0$:
$$E_w(\omega) = A_{E,w}\, \omega^{-\beta}\, \psi_w(\omega) . \qquad (95)$$
Here, $A_{E,w}$ is a positive constant and $\psi_w(\cdot)$ is a smooth function on the positive real axis with $\psi_w(0) = 1$ and $|\psi_w'(0)| < \infty$. To ensure integrability, the exponent $\beta$ must be a real number less than 1. We will see that the long-time behavior of the tracer can then be categorized as diffusive, superdiffusive, or subdiffusive according to the value of this exponent. Note that all power spectra with finite low-frequency limits are included within the class $\beta = 0$.

3.1.2.1. Diffusive sweep. Consider first the case $\beta = 0$, corresponding to the generic case of a finite, nonzero distribution of energy at the lowest frequencies. By changing integration variables $\omega \to \omega t$ in Eq. (94) and passing to the $t \to \infty$ limit via Lebesgue's dominated convergence theorem ([288], Section 4.4), we find that
$$\lim_{t\to\infty} \sigma_X^2(t) \sim 2 K_x^{*}\, t ,$$
with the positive constant
$$K_x^{*} = \frac{1}{2} A_{E,w} = \int_0^{\infty} R_w(s)\, ds . \qquad (96)$$
This formula shows that, on long time scales, the random velocity fluctuations serve to induce an effective diffusion of the tracer with the ordinary linear growth of the mean-square displacement. The effective large-scale diffusivity is given by the constant $K_x^{*}$, which is just the integral of the correlation function of the velocity field. Because the tracer displacement is always Gaussian distributed, all its higher moments scale in the same way as those of ordinary diffusion processes. The PDF for the tracer position, or equivalently, the mean statistics, therefore exactly satisfy a diffusion equation with diffusivity constant $K_x^{*}$ on large space-time scales [18].

The $\beta = 0$ class of Random Sweeping Model velocity fields comprises the simplest (and original) example of Kubo's theory [188] for how transport by a randomly fluctuating, mean zero velocity field induces an effective diffusion on large scales (see Section 2.4.1). We emphasize that the standard diffusion of Kubo type arises in the Random Sweeping Model when the velocity field has the "generic" property that its integrated correlation function is nonzero and finite,
$$0 < \int_0^{\infty} R_w(s)\, ds < \infty , \qquad (97)$$
which is exactly equivalent to $\beta = 0$ in Eq. (95). For other values of $\beta$, the condition (97) fails, and the tracer will diffuse anomalously on long time scales, as we now show.
3.1.2.2. Superdiffusive sweep. For the exponent values $0 < \beta < 1$, the power spectrum diverges at low frequencies, corresponding to an infinite value of $\int_0^{\infty} R_w(s)\, ds$. This means that the velocity field exhibits a very long-term memory. Changing variables $\omega \to \omega t$ in Eq. (94) and using the dominated convergence theorem to evaluate the long-time limit results in the following asymptotic formula:
$$\lim_{t\to\infty} \sigma_X^2(t) \sim \frac{2}{1+\beta}\, K_x^{\#}\, t^{1+\beta} , \qquad (98)$$
where the positive constant in the prefactor may be expressed:
$$K_x^{\#} = \frac{1}{2} A_{E,w}\, \pi^{\beta - 1/2}\, \frac{\Gamma((1-\beta)/2)}{\Gamma((2+\beta)/2)} , \qquad (99)$$
and $\Gamma$ is the standard Gamma function [195]. The tracer motion is superdiffusive because $\sigma_X^2(t)$ grows faster than linearly at long times. Because the tracer motion is Gaussian in the Random Sweeping Model, it can be shown [18] that the mean passive scalar density satisfies a time-dependent diffusion equation:
$$\frac{\partial \langle T(x, y, t) \rangle}{\partial t} = D_w(t)\, \frac{\partial^2 \langle T(x, y, t) \rangle}{\partial x^2} , \qquad \langle T(x, y, t=0) \rangle = \langle T_0(x, y) \rangle , \qquad (100)$$
where the effective diffusion coefficient
$$D_w(t) \equiv \frac{1}{2} \frac{d\sigma_X^2(t)}{dt}$$
diverges in time as
$$D_w(t) \sim K_x^{\#}\, t^{\beta} . \qquad (101)$$
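The prefactor formula (99) and the scaling laws (98) and (101) can be evaluated directly. The sketch below uses only the standard library; the function names and the sample amplitude are assumptions made for illustration. As a consistency check, at $\beta = 0$ the prefactor reduces to $\tfrac{1}{2}A_{E,w}$, in agreement with Eq. (96).

```python
import math


def anomalous_prefactor(a_ew, beta):
    """K_x^# from Eq. (99); used here only for -1 < beta < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("Eq. (99) is applied only for -1 < beta < 1")
    ratio = math.gamma((1.0 - beta) / 2.0) / math.gamma((2.0 + beta) / 2.0)
    return 0.5 * a_ew * math.pi ** (beta - 0.5) * ratio


def msd_asymptote(a_ew, beta, t):
    """Long-time law (98): sigma_X^2(t) ~ (2 / (1 + beta)) K_x^# t^(1 + beta)."""
    return 2.0 / (1.0 + beta) * anomalous_prefactor(a_ew, beta) * t ** (1.0 + beta)


def effective_diffusivity(a_ew, beta, t):
    """Long-time law (101): D_w(t) ~ K_x^# t^beta."""
    return anomalous_prefactor(a_ew, beta) * t ** beta
```

Differentiating the law (98) in time and halving the result reproduces the law (101), which is the relation between the two helper functions above.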
3.1.2.3. Subdiffusive sweep. For exponent values $\beta < 0$, the power spectrum $E_w(\omega)$ vanishes at the origin. We consider in this paragraph the case in which $-1 < \beta < 0$. A direct asymptotic calculation as in the previous cases produces
$$\lim_{t\to\infty} \sigma_X^2(t) \sim \frac{2}{1+\beta}\, K_x^{\#}\, t^{1+\beta} ,$$
where $K_x^{\#}$ is given by Eq. (99). But now the scaling exponent $1 + \beta$ is between 0 and 1, corresponding to subdiffusive motion of the tracer. The mean passive scalar density again satisfies a time-dependent diffusion PDE (100) with effective diffusion coefficient $D_w(t)$ obeying the law (101), which now decays in time. The value $\beta = -1$ leads to a logarithmic growth of $\sigma_X^2(t)$.

3.1.2.4. Trapping sweep. Finally, for $\beta < -1$, we can take the $t \to \infty$ limit in (94) directly, without rescaling the integration variable. The oscillatory term vanishes in the $t \to \infty$ limit due to the integrability of $E_w(\omega)\omega^{-2}$ and the Riemann-Lebesgue lemma [172]. There results:
$$\lim_{t\to\infty} \sigma_X^2(t) = K_x^{\circ} ,$$
where
$$K_x^{\circ} = \frac{1}{\pi^2} \int_0^{\infty} E_w(\omega)\, \omega^{-2}\, d\omega . \qquad (102)$$
The mean-square tracer displacement never exceeds a finite limit, and the tracer is statistically trapped. There is no effective long-range transport.
Table 1
Long-time asymptotics of mean-square tracer displacement in Random Sweeping Model, with $\kappa = 0$. Scaling coefficients are given by Eqs. (96), (99) and (102)

Parameter regime      Asymptotic mean-square displacement $\lim_{t\to\infty} \sigma_X^2(t)$      Qualitative behavior
$\beta < -1$          $K_x^{\circ}$                                                Trapping
$-1 < \beta < 0$      $\frac{2}{1+\beta} K_x^{\#}\, t^{1+\beta}$                   Subdiffusive
$\beta = 0$           $2 K_x^{*}\, t$                                              Diffusive
$0 < \beta < 1$       $\frac{2}{1+\beta} K_x^{\#}\, t^{1+\beta}$                   Superdiffusive
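The regimes collected in Table 1 amount to a simple lookup on the spectral exponent $\beta$. A minimal sketch follows; the function name and the returned labels are assumptions for illustration, and the borderline value $\beta = -1$ is reported separately because the text states that it gives logarithmic growth.

```python
def classify_sweep(beta):
    """Return (regime, exponent) for sigma_X^2(t) ~ t**exponent at long times, kappa = 0."""
    if beta >= 1.0:
        raise ValueError("the power spectrum (95) is integrable only for beta < 1")
    if beta > 0.0:
        return ("superdiffusive", 1.0 + beta)
    if beta == 0.0:
        return ("diffusive", 1.0)
    if beta > -1.0:
        return ("subdiffusive", 1.0 + beta)
    if beta == -1.0:
        # Borderline case: logarithmic growth, so no positive power of t.
        return ("logarithmic growth", 0.0)
    return ("trapping", 0.0)
```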
3.1.2.5. Summary. We collect the anomalous diffusion results stated above for the Random Sweeping Model in Table 1.

3.1.2.6. Effects of molecular diffusion. We now briefly consider how the above tracer behavior is modified under the addition of molecular diffusion $\kappa > 0$. The equations of motion now become stochastic:
$$dX(t) = v(t)\, dt + \sqrt{2\kappa}\, dW_x(t) ,$$
$$dY(t) = \sqrt{2\kappa}\, dW_y(t) ,$$
where $v(t) = w_f(t)$ is the sweeping velocity and $(W_x(t), W_y(t))$ is a two-dimensional Brownian motion. The interaction between the molecular diffusion and a spatially uniform velocity field is completely linear, and the integrated trajectory equations are
$$X(t) = x_0 + \int_0^t v(s)\, ds + \sqrt{2\kappa}\, W_x(t) ,$$
$$Y(t) = y_0 + \sqrt{2\kappa}\, W_y(t) .$$
The mean-square tracer displacement in each direction is modified from Eq. (93) only by the addition of $2\kappa t$ due to the molecular effects:
$$\sigma_X^2(t) = 2\kappa t + 2 \int_0^t (t - s)\, R_w(s)\, ds ,$$
$$\sigma_Y^2(t) = 2\kappa t .$$
Molecular diffusion will therefore produce an ordinary diffusive character for all $\beta \le 0$. The random sweep will have relatively negligible effects at long times when $\beta < 0$, corresponding to the regimes of subdiffusive or trapping behavior due to random sweeping alone. On the other hand, molecular diffusion plays a negligible role in the superdiffusive regime $0 < \beta < 1$ at long times.

The simple additivity of the contributions from turbulent diffusivity and molecular diffusivity is a consequence of the absence of spatial structure in the Random Sweeping Model. We will see that
molecular diffusivity can influence the tracer motion in much more subtle ways when the velocity field has spatial variations, even in some very simple models (see Section 3.2).

3.1.3. Eulerian vs. Lagrangian statistics

The Random Sweeping Model is quite simplistic, yet already gives us some qualitative understanding of what circumstances can lead to anomalous diffusion in general statistically homogeneous turbulent diffusion models. Specifically, suppose we are given an arbitrary two-dimensional, incompressible, mean zero, spatially homogeneous, statistically stationary random velocity field (which need not be Gaussian)
$$\mathbf{v}(x, y, t) = \begin{pmatrix} v_x(x, y, t) \\ v_y(x, y, t) \end{pmatrix} .$$
The equations of motion for the tracer are then
$$dX(t) = v_x(X(t), Y(t), t)\, dt + \sqrt{2\kappa}\, dW_x(t) ,$$
$$dY(t) = v_y(X(t), Y(t), t)\, dt + \sqrt{2\kappa}\, dW_y(t) .$$
Now, we define the Lagrangian velocity of the tracer
$$\mathbf{v}^L(t) = \begin{pmatrix} v_x^L(t) \\ v_y^L(t) \end{pmatrix} = \begin{pmatrix} v_x(X(t), Y(t), t) \\ v_y(X(t), Y(t), t) \end{pmatrix} ; \qquad (103)$$
this is nothing but the random velocity field evaluated at the current location of the tracer. In terms of this Lagrangian velocity, the equations of motion take an exceptionally simple form
$$dX(t) = v_x^L(t)\, dt + \sqrt{2\kappa}\, dW_x(t) ,$$
$$dY(t) = v_y^L(t)\, dt + \sqrt{2\kappa}\, dW_y(t) . \qquad (104)$$
In fact, these equations are identical to those of the Random Sweeping Model, without the restriction of the velocity pointing in a single direction, and with the Lagrangian velocity appearing as the advective term instead of the externally prescribed sweep velocity $w_f(t)$.

The Lagrangian velocity can be shown to be a mean zero, statistically stationary random process; see ([247], Section 9.5) for a formal argument and [270,350] for rigorous derivations. This is not a simple consequence of $\mathbf{v}(x, y, t)$ being a mean zero, stationary random process, but relies crucially on incompressibility and statistical homogeneity. By repeating the computations in Eq. (93), which did not require $w_f(t)$ to be Gaussian, we obtain the following general formula for the mean-square displacement of a tracer along the coordinate axes:
$$\sigma_X^2(t) \equiv \langle (X(t) - x_0)^2 \rangle = 2 \int_0^t (t - s)\, R_x^L(s)\, ds ,$$
$$\sigma_Y^2(t) \equiv \langle (Y(t) - y_0)^2 \rangle = 2 \int_0^t (t - s)\, R_y^L(s)\, ds . \qquad (105)$$
The functions appearing in the integrand are the Lagrangian correlation functions:
$$R_x^L(t) \equiv \langle v_x^L(t')\, v_x^L(t'+t) \rangle , \qquad R_y^L(t) \equiv \langle v_y^L(t')\, v_y^L(t'+t) \rangle . \qquad (106)$$
The general formula (105) was originally derived by Taylor [317].

By direct analogy with the analysis presented in the Random Sweeping Model, we can in principle categorize whether the tracer motion is, at long times, diffusive, superdiffusive, subdiffusive, or trapped. The main criterion is the low frequency behavior of the power spectra associated with the Lagrangian correlation functions (106). A trivial secondary criterion is the presence of molecular diffusion, which precludes subdiffusive or trapping behavior.

The practical obstacle to a quantitative treatment along these lines, however, is the computation of the Lagrangian correlation function. What is explicitly specified are the so-called Eulerian statistics of the velocity field, which are just the probability distributions of the velocity field evaluated at fixed points $(x, y, t)$ in the laboratory frame. The Lagrangian velocity, on the other hand, requires the evaluation of the velocity field at the random tracer locations $(X(t), Y(t), t)$. It is, in general, extremely difficult to describe quantitative statistical features of the Lagrangian velocity field, such as its correlation function [227,350]. In some sense, it requires us to have already solved the problem of describing the statistics of the tracer position! There are special exceptions of course; the Lagrangian velocity $v_x^L(t)$ in the Random Sweeping Model is exactly the specified sweeping velocity $w_f(t)$. But in general, we cannot gainfully use Taylor's formula (105) to compute the mean-square displacement quantitatively.

3.1.3.1. Lagrangian intuition. The Lagrangian perspective does provide intuition for the qualitative behavior of a tracer. Based on the results of the Random Sweeping Model in Section 3.1.2 and Taylor's formula (105), we can deduce the following connections between anomalous tracer diffusion (say in the $x$ direction) and properties of the Lagrangian velocity:

• Ordinary diffusion occurs when the correlations of the Lagrangian velocity field are of finite range, in that $\int_0^{\infty} R_x^L(s)\, ds$ is nonzero and finite.
• Superdiffusion is associated with long-range correlations of the Lagrangian velocity field, so that $\int_0^{\infty} R_x^L(s)\, ds$ is a divergent integral.
• Subdiffusion and trapping are associated with oscillations in the Lagrangian velocity, which cause the Lagrangian correlation function to have a substantial negative region (so that $\int_0^{\infty} R_x^L(s)\, ds = 0$).

Though we cannot in general compute $\int_0^{\infty} R_x^L(s)\, ds$, we can often infer, based on the spatio-temporal flow structure, whether the Lagrangian velocity ought to have long-range correlations or oscillations, so that superdiffusion or subdiffusion may be expected. For example, long-range correlations of the Lagrangian velocity of a tracer can generally be associated with flows with long-range spatio-temporal correlations. The connections between long-range spatio-temporal correlations and superdiffusion have been explored in the context of Lévy walks by Zumofen et al. [352] and through renormalization group analysis [45] and various forms of probabilistic analysis [47] by Bouchaud and others. We aim here to concentrate on continuum, incompressible, random fluid flow models which can be analyzed unambiguously and which reveal a number of subtle features
concerning anomalous diffusion. In interpreting the exact mathematical results which we present in Sections 3.2 and 3.3, we will refer repeatedly to the qualitative paradigm outlined above.

3.1.3.2. Lagrangian description of standard tracer diffusion. It is helpful in this connection to describe a standard scenario in which the random velocity field gives rise to ordinary diffusive behavior at long times. Suppose the Lagrangian correlation function (of the $x$ component of the tracer motion) can be expressed as
$$R_x^L(t) = V^2 \rho(t/\tau_L) ,$$
where $\rho(\cdot)$ is some smooth, rapidly decaying numerical function with $\rho(0) = 1$ and $\int_0^{\infty} \rho(t)\, dt = 1$. This corresponds to a model in which the Lagrangian velocity has mean-square velocity $V^2$, a single, finite Lagrangian correlation time scale $\tau_L$, and no strong oscillatory behavior. We readily find formulas for the mean-square tracer displacement in two asymptotic limits, using Taylor's formula (105):
$$\sigma_X^2(t) \sim \begin{cases} V^2 t^2 & \text{for } 0 \le t \ll \tau_L , \\ 2 V^2 \tau_L\, t & \text{for } t \gg \tau_L . \end{cases} \qquad (107)$$
$$\mathbf{v}(\mathbf{x}, t) = \mathbf{v}(x, y, t) = \begin{pmatrix} w(t) \\ v(x) \end{pmatrix} .$$
The steady shear flow $v(x)$ is taken as a mean zero, Gaussian, homogeneous, random field. Its correlation function $R(x)$ and energy spectrum $E(k)$ are related by a Fourier transformation
$$\langle v(x')\, v(x'+x) \rangle \equiv R(x) = \int_{-\infty}^{\infty} E(|k|)\, e^{2\pi i k x}\, dk = 2 \int_0^{\infty} E(k) \cos(2\pi k x)\, dk . \qquad (108)$$
Note that the energy spectrum here is resolved with respect to spatial wavenumber $k$ rather than with respect to frequency $\omega$; otherwise, it has the same physical meaning as the power spectrum defined in the Random Sweeping Model (Section 3.1.2). We assume the energy spectrum to be smooth for $k > 0$, absolutely integrable, and to behave like a power law at low wavenumbers
$$E(k) = A_E\, k^{1-\epsilon}\, \psi(k) , \qquad (109)$$
where $\psi(k)$ is a smooth function on $k > 0$ with $\psi(0) = 1$. We will sometimes call $\epsilon$ the infrared scaling exponent of the energy spectrum since it describes the low wavenumber properties of $E(k)$. Integrability at $k = 0$ requires $\epsilon < 2$. Note that energy spectra with finite $k \to 0$ limits correspond to $\epsilon = 1$. As $\epsilon$ increases, more energy is concentrated at low wavenumbers, corresponding to an increase in the strength of long-range correlations. By considering flows with energy spectra which diverge at low wavenumber ($1 < \epsilon < 2$), we attempt to gain an understanding of some of the effects of long-range spatial correlations on tracer transport. This is of particular relevance to realistic turbulent diffusion, though we hasten to add that the long-range correlations in a fully developed turbulent flow manifest themselves much more strongly; see Section 3.4.3.

The cross sweep
$$w(t) = \bar{w} + w_f(t)$$
is taken as a superposition of a deterministic constant $\bar{w}$ and a mean zero, Gaussian, stationary random field $w_f(t)$ with correlation function
$$R_w(t) \equiv \langle w_f(t')\, w_f(t'+t) \rangle .$$
Note that the fluctuating component of the cross sweep is of exactly the same form as in the Random Sweeping Model of Section 3.1.2. The shear field $v(x)$ and sweep field $w(t)$ are statistically independent of each other.

The stochastic equations of motion for the location of a tracer particle $(X(t), Y(t))$ in an RSS Model flow are
$$dX(t) = w(t)\, dt + \sqrt{2\kappa}\, dW_x(t) , \qquad X(t=0) = x_0 ,$$
$$dY(t) = v(X(t))\, dt + \sqrt{2\kappa}\, dW_y(t) , \qquad Y(t=0) = y_0 ,$$
where $\{W_x(t), W_y(t)\}$ is a pair of independent Brownian motions. This system of equations can be successively integrated by quadrature:
$$X(t) = x_0 + \int_0^t w(s)\, ds + \sqrt{2\kappa}\, W_x(t) , \qquad (110a)$$
$$Y(t) = y_0 + \int_0^t v(X(s))\, ds + \sqrt{2\kappa}\, W_y(t) . \qquad (110b)$$
Up to the addition of a constant mean sweep, the statistics of the shear-transverse position $X(t)$ are just those of a tracer in the Random Sweeping Model, which was discussed in Section 3.1.2 and in [18]. We have that $X(t)$ is a Gaussian random process with mean displacement
$$\langle X(t) \rangle = x_0 + \bar{w} t$$
and cross-shear displacement variance
$$\sigma_X^2(t) \equiv \langle (X(t) - \langle X(t) \rangle)^2 \rangle = 2\kappa t + 2 \int_0^t (t - s)\, R_w(s)\, ds . \qquad (111)$$
We concentrate therefore on the shear-parallel motion $Y(t)$ of the tracer. While it can be explicitly represented in the fairly simple form (110b), its statistics are highly nontrivial owing to the nonlinear interaction between the shear velocity field $v(x)$ and the shear-transverse tracer position $X(t)$. For example, $Y(t)$ is not in general a Gaussian random process, even though all random fields ($v(x)$, $X(t)$, and $W_y(t)$) appearing on the right-hand side of Eq. (110b) are Gaussian. We will return to this issue in Section 3.4.

Within Section 3.2, we will focus on the (absolute) mean-square shear-parallel displacement
$$\sigma_Y^2(t) = \langle (Y(t) - y_0)^2 \rangle .$$
The mean displacement $\langle Y(t) - y_0 \rangle$ vanishes identically because of the mean zero assumption on $v(x)$. As we will demonstrate in Paragraph 3.2.6.1, the mean-square displacement along the shear is given by the following formula:

Mean-square shear-parallel tracer displacement for RSS model:
$$\sigma_Y^2(t) = 2\kappa t + 2 \int_0^{\infty} E(k)\, R(k, t)\, dk , \qquad (112a)$$
$$R(k, t) = 2 \int_0^t (t - s) \cos(2\pi k \bar{w} s)\, e^{-2\pi^2 k^2 \sigma_X^2(s)}\, ds . \qquad (112b)$$
This formula remains valid when $v(x)$ is non-Gaussian.

Expression (112) explicitly resolves the mean-square displacement along the shear into contributions from fluctuations of various wavenumbers. The function $R(k, t)$, which we will call the shear-displacement kernel, contains the effects of cross-shear transport due to the deterministic mean flow $\bar{w}$, the fluctuating velocity field $w_f(t)$, and molecular diffusion $\kappa$. The random cross-shear motion is accounted for by $\sigma_X^2(t)$; see Eq. (111). The shear-displacement kernel $R(k, t)$ may be interpreted as the response of $\sigma_Y^2(t)$ to the presence of a component $A \cos(2\pi k x) + B \sin(2\pi k x)$ of the
random shear velocity field along with any cross-shear transport processes, where $A$ and $B$ are a pair of standard, independent Gaussian random variables (zero mean, unit variance). The actual random shear flow may be thought of as a superposition of independent random fluctuations of various wavenumbers $k$, with amplitude given by $\sqrt{2E(k)}$. Since these fluctuations are independent and do not act across the gradient of the velocity field, the shear-parallel tracer motion $Y(t)$ may be expressed as a superposition of independent contributions coming from each wavenumber $k$ of the shear velocity field $v(x)$. This will be a useful heuristic perspective in interpreting the exact mathematical results.

Some general observations are already apparent. For $w(t) = 0$ and $\kappa = 0$, we have no cross-shear motion ($\bar{w} = \sigma_X^2(t) = 0$), so the shear-displacement kernel is simply
$$R(k, t) \equiv t^2 ,$$
and the mean-square displacement along the shear grows quadratically in time,
$$\sigma_Y^2(t) = 2 t^2 \int_0^{\infty} E(k)\, dk = \langle v^2 \rangle t^2 ,$$
signalling ballistic motion in the shear-parallel direction. This is readily understood: with no cross-shear motion, tracers stay on the original striated streamlines of the shear flow and proceed at the constant (but random) velocity $v(x_0)$ of the streamline on which they were originally situated.

Much richer behavior arises when tracers do move across the shear flow streamlines due to molecular diffusion $\kappa$ and/or the sweeping velocity $w(t)$, thereby breaking up the monotonic motion along the streamlines of the shear. This is reflected in the suppression of the shear-displacement kernel $R(k, t)$ (112b) through an oscillatory term due to the mean sweep and an exponential damping factor due to random cross-shear motion. We note moreover that the suppression effects become weaker at low wavenumber $k$. Hence, if there is sufficient energy in the low-wavenumber modes, we may expect that the shear-parallel transport is dominated by them, particularly at long times.

In what follows, we will develop these ideas in detail, through exact formulas and accompanying explanations of the relevant physical mechanisms. We will find that the shear-parallel tracer motion can have a wide variety of long-time behaviors, depending on the nature of the shear-transverse tracer motion $X(t)$ and on the strength of the long-range correlations of the random velocity field $v(x)$.

We shall first consider in turn the individual effects of each cross-shear motion on tracer transport along the shear: molecular diffusion in Section 3.2.1, a constant deterministic cross sweep $w(t) = \bar{w}$ in Section 3.2.2, and a purely random cross sweep $w(t) = w_f(t)$ in Section 3.2.3. They will be compared in Section 3.2.4, and collective effects will be discussed in Section 3.2.5. After presenting these asymptotic results and some heuristic discussion, we indicate the means of their derivation in Section 3.2.6.
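The suppression of the shear-displacement kernel by cross-shear motion can be seen in a small quadrature sketch of Eq. (112b). The code below assumes, purely for illustration, that there is no random sweep, so that $\sigma_X^2(s) = 2\kappa s$; the function name and parameter values are hypothetical.

```python
import math


def shear_kernel(k, t, w_bar=0.0, kappa=0.0, n=20000):
    """Midpoint-rule estimate of R(k, t) in Eq. (112b).

    Assumption for this sketch: no random sweep, so sigma_X^2(s) = 2 * kappa * s.
    """
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        sigma_x2 = 2.0 * kappa * s
        damping = math.exp(-2.0 * math.pi ** 2 * k ** 2 * sigma_x2)
        total += (t - s) * math.cos(2.0 * math.pi * k * w_bar * s) * damping
    return 2.0 * h * total
```

With `w_bar = 0` and `kappa = 0` the estimate returns $t^2$, the ballistic kernel. Switching on either a mean sweep or molecular diffusion lowers the value, and the reduction is weaker for smaller `k`, as the discussion above indicates.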
3.2.1. Effects of molecular diffusion

We first consider tracer motion along a random steady shear when molecular diffusion is active ($\kappa > 0$) and there is no cross sweep velocity ($w(t) = 0$). This problem has been considered in the context of the flow of groundwater through a stratified porous medium [119,223]; in these
applications $\kappa$ represents local dispersion coefficients of the medium rather than microscopic diffusion. The general results presented here were derived in [10].

The mean-square cross-shear displacement is just
$$\sigma_X^2(t) = 2\kappa t ,$$
and the general formula (112) can be simplified to the following expression:
$$\sigma_Y^2(t) = 2\kappa t + 2 \int_0^{\infty} E(k)\, R_\kappa(k, t)\, dk ,$$
$$R_\kappa(k, t) = t^2 F_1(t/\tau_\kappa(k)) .$$
We use here a specially defined universal function:
$$F_1(u) \equiv 2(e^{-u} + u - 1)/u^2 ,$$
and a natural wavenumber-dependent time scale
$$\tau_\kappa(k) \equiv (4\pi^2 \kappa k^2)^{-1} , \qquad (113)$$
which we will call the $\kappa$-persistence time scale. The important properties of $F_1(u)$ for our purposes are that it approaches a constant value 1 at small $u$, and that it decays for large positive $u$ like $2u^{-1}$. Matheron and de Marsily [223] obtain an alternative formula for $\sigma_Y^2(t)$ in terms of the Laplace transform of the correlation function $R(x)$, and present some numerical plots indicating the growth of $\sigma_Y^2(t)$ over finite time intervals. We focus now on the long time asymptotics of $\sigma_Y^2(t)$.

Recalling that the shear-displacement kernel $R_\kappa(k, t)$ represents the influence of fluctuations of wavenumber $k$ on the shear-parallel tracer transport, we see that the contribution of wavenumber $k$ is ballistic for $t \ll \tau_\kappa(k)$ and diffusive for $t \gg \tau_\kappa(k)$, with long-time single-mode diffusivity
$$K_\kappa(k) \equiv \lim_{t\to\infty} R_\kappa(k, t)/2t = 1/(4\pi^2 \kappa k^2) . \qquad (114)$$
This can be understood physically through the Lagrangian perspective developed in Section 3.1.3, where $\tau_\kappa(k)$ plays the role of a Lagrangian correlation time for the shear-parallel transport contribution from random shear fluctuations of wavenumber $k$. The persistent motion of the tracer toward a given direction along a steady shear flow is broken up over time by the molecular diffusion, which allows the tracer to hop across streamlines. Since the shear flow has zero mean, the tracer will be stochastically buffeted from streamlines carrying it one way to streamlines carrying it the other. The shear-parallel Lagrangian velocity of the tracer will consequently decorrelate on a time scale on the order of the time it takes for molecular diffusion to move the tracer across one wavelength $\sim k^{-1}$ of the fluctuation. This is readily computed to have the parametric scaling $\sim (k^{-1})^2/\kappa$, which is identical to $\tau_\kappa(k)$ up to a numerical constant. This justifies the interpretation of
$\tau_\kappa(k)$ as the Lagrangian correlation time due to the molecular diffusion across the streamlines. And, consistently with the general heuristic picture outlined in Paragraph 3.1.3.2, the tracer displacement due to the fluctuation of wavenumber $k$ is ballistic for shorter times and diffusive for longer times. Moreover, the asymptotic diffusivity $K_\kappa(k)$ is proportional (in fact equal) to the Lagrangian correlation time $\tau_\kappa(k)$. Therefore, the tracer motion induced by a single wavenumber $k$ of the steady shear flow completely follows the paradigm of standard diffusion.

Summing up the contributions from all wavenumbers in the shear flow is quite subtle, however, because the Lagrangian correlation time $\tau_\kappa(k)$ diverges as $k \to 0$. This is clearly physical: it takes much longer for molecular diffusion to break up the coherent shear-parallel tracer transport of large wavelength fluctuations. Because of the slow decorrelation of the low wavenumber modes, it is not necessarily true that the tracer transport is asymptotically diffusive at long times, even though the contribution from each individual wavenumber $k > 0$ does eventually become diffusive.

The long-time limiting behavior of the tracer motion along the shear depends crucially on the low wavenumber properties of the random shear spectrum, or equivalently, the strength of the long-range correlations in the shear flow. A careful asymptotic calculation, indicated in Section 3.2.6 and originally carried out in [10], reveals the results reported in Table 2. The preconstants of the scaling laws are
$$K_\kappa^{*} = \kappa + 2 \int_0^{\infty} E(k)\, K_\kappa(k)\, dk ,$$
$$K_\kappa^{\#} = -\Gamma\!\left(-\frac{\epsilon}{2}\right) A_E\, (4\pi^2\kappa)^{-1+\epsilon/2} . \qquad (115)$$
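The single-mode picture above is easy to check numerically. The following sketch codes the universal function $F_1$, the $\kappa$-persistence time (113), and the kernel $R_\kappa(k,t)$, then confirms the ballistic and diffusive limits. The names `f1`, `tau_kappa` and `kernel_kappa` are assumptions for this example, and the small-argument series is added only to avoid cancellation error.

```python
import math


def f1(u):
    """F_1(u) = 2 (exp(-u) + u - 1) / u^2, with a Taylor series near u = 0."""
    if u < 1e-4:
        return 1.0 - u / 3.0 + u * u / 12.0
    # expm1(-u) + u equals exp(-u) + u - 1 without subtracting nearly equal numbers.
    return 2.0 * (math.expm1(-u) + u) / (u * u)


def tau_kappa(k, kappa):
    """kappa-persistence time scale, Eq. (113)."""
    return 1.0 / (4.0 * math.pi ** 2 * kappa * k ** 2)


def kernel_kappa(k, t, kappa):
    """Shear-displacement kernel R_kappa(k, t) = t^2 F_1(t / tau_kappa(k))."""
    return t * t * f1(t / tau_kappa(k, kappa))
```

For times much shorter than `tau_kappa(k, kappa)` the kernel is close to $t^2$, and for much longer times `kernel_kappa(k, t, kappa) / (2 * t)` approaches `tau_kappa(k, kappa)`, which is the single-mode diffusivity of Eq. (114).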
As we see, for $\epsilon < 0$, the steady shear flow has sufficiently weak long-range correlations so that at long times, the shear-parallel tracer motion behaves diffusively with finite effective diffusion constant $K_\kappa^{*}$ obtained by simply integrating the long time effective single-mode diffusivities (114)
$$K_\kappa(k) \equiv \lim_{t\to\infty} R_\kappa(k, t)/2t = \tau_\kappa(k) = 1/(4\pi^2 \kappa k^2)$$
against twice the energy spectral density. Results of this type with a similar formula for enhanced diffusivity were pioneered by Taylor [318]. The particular result noted here was originally derived by Gelhar et al. [119], and can be seen to be a natural extension of the effective diffusivity of
Table 2
Long-time asymptotics of mean-square tracer displacement along the shear in Random Steady Shear Model, with $\kappa > 0$, $\bar{w} = w_f(t) = 0$. Scaling coefficients are given by Eq. (115)

Parameter regime      Asymptotic mean-square displacement $\lim_{t\to\infty} \sigma_Y^2(t)$      Qualitative behavior
$\epsilon < 0$        $2 K_\kappa^{*}\, t$                                              Diffusive
$0 < \epsilon < 2$    $\frac{4}{2+\epsilon} K_\kappa^{\#}\, t^{1+\epsilon/2}$           Superdiffusive
a periodic shear flow (51). An important class of shear velocity fields which fall in the class $\epsilon < 0$ are those with wavenumbers excited only above some threshold $k_0 > 0$, i.e., $E(k) = 0$ for $0 \le k < k_0$. Velocity fields given by a finite superposition of random sinusoidal oscillations are a particular example.

For $0 < \epsilon < 2$, the long-range correlations in the shear flow are sufficiently strong to cause this simple picture to break down. The tracer's motion is dominated for arbitrarily large times $t$ by fluctuations of very large scales with $\tau_\kappa(k) \gtrsim t$, which act coherently and persistently on the tracer. These produce a long-term memory in the Lagrangian velocity of a tracer particle and give rise to a superdiffusive transport for exactly the same reason as we discussed in Section 3.1.3. Put another way, for $0 < \epsilon < 2$, the cross-shear transport induced by Brownian motion is not fast enough to completely break up the coherent advection by the strong, large-scale shear fluctuations. This is reflected in the fact that the effective diffusion constant $K_\kappa^{*}$ for $\epsilon < 0$ would diverge if used for $\epsilon > 0$. Note moreover that this division between diffusive and superdiffusive behavior is in exact accord with the rigorous homogenization criterion described in Section 2.4.2; the Péclet number (81) of the random shear flow with energy spectrum (109) is finite precisely when $\epsilon < 0$.

It is interesting to note that superdiffusion arises for quite typical shear velocity fields, namely those with finite $k \to 0$ limits of the energy density $E(k)$ (which correspond to $\epsilon = 1$). These are precisely those random velocity fields for which the integral of the correlation function is finite and positive
$$0 < \int_0^{\infty} R(x)\, dx < \infty . \qquad (116)$$
That superdiffusion could occur in steady shear flows of this type was suggested by Gelhar et al. [119] and precisely demonstrated by Matheron and de Marsily [223]. A special example of a steady random shear flow with $\epsilon = 1$ which has been considered by Mazo and Van den Broeck [225] and others [46,279,280,353,354] consists of an infinite array of parallel channels of common width, with a constant random velocity along each channel chosen independently to be $\pm v$ with equal probability. (The criterion (116) is manifestly satisfied by this and any other random shear flow with a finite-range, purely nonnegative correlation structure.) An explicit formula for $\sigma_Y^2(t)$ valid for all finite times is derived in [225], from which an asymptotic $t^{3/2}$ scaling may be read off. The authors of [46,279] furnished the following physical-space heuristic explanation for the $\sigma_Y^2(t) \sim t^{3/2}$ superdiffusive scaling law: The molecular diffusion across the channels causes a tracer to sample $\sim t^{1/2}$ channels over a time interval $t$, so the tracer spends a typical time $\sim t^{1/2}$ in each channel. Consequently, the total mean-square displacement over a time interval $t$ is the number of channels explored, $\sim t^{1/2}$, multiplied by the mean-square displacement produced by each channel, $\sim (t^{1/2})^2 = t$. Phrased more generally, the reason for the anomalous diffusion is that the number of different channels sampled grows sublinearly in time, so there is a long-term memory effect due to preferential resampling of channels previously visited.

Viewing the scaling exponent $\nu$ in $\sigma_Y^2(t) \sim t^{\nu}$ as an order parameter, we can say that there is a formal phase transition [126] in the long-time tracer behavior with respect to the infrared scaling exponent $\epsilon$ of the energy spectrum. The phase transition occurs at $\epsilon = 0$; below this value $\nu = 1$, and above this value, $\nu$ bifurcates continuously to the curve $\nu = (\epsilon + 2)/2$. We indicate this graphically in Fig. 9. At the phase transition value $\epsilon = 0$ itself, there are logarithmic corrections.
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
Fig. 9. Scaling exponent $\nu$ in $\sigma_Y^2(t) \sim t^{\nu}$ as a function of $\varepsilon$ in Random Steady Shear Model with $\kappa > 0$, $\bar w = w_f(t) = 0$.
Finally, we note that the long-time tracer behavior is singular in the limit of zero molecular diffusion; the scaling coefficients (115) diverge as $\kappa \to 0$. The reason for the drastic difference between $\kappa = 0$ and $\kappa$ small is that tracers are permanently stuck on their original streamline in the former case, but can over time move arbitrarily far across the shear in the latter case, no matter how small $\kappa$ is.

3.2.2. Effects of constant cross sweep

We now consider the effects of a deterministic sweeping of tracers across streamlines, as manifested by a constant (in space and time) mean drift $w(t) = \bar w \neq 0$ across the random, steady shear $v(x)$:
$$\mathbf{v}(x, y, t) = \begin{bmatrix} \bar w \\ v(x) \end{bmatrix} . \tag{117}$$
The behavior of a tracer in such a flow was studied in detail by Horntrop and the first author [141]. We consider for now the case without molecular diffusion. Then $\sigma_X^2(t) = 0$, and the formula for the mean-square tracer displacement along the shear may be written
$$\sigma_Y^2(t) = 2 \int_0^{\infty} E(k)\, \mathcal{R}_{\bar w}(k, t)\, dk , \qquad \mathcal{R}_{\bar w}(k, t) = t^2 F\big(t/\tau_{\bar w}(k)\big) .$$

We have defined another universal function

$$F(u) \equiv 2(1 - \cos u)/u^2 ,$$

and a wavenumber-dependent time scale

$$\tau_{\bar w}(k) \equiv (2\pi k \bar w)^{-1} , \tag{118}$$

which we call the $\bar w$-persistence time scale. It plays a role similar to $\tau_\kappa(k)$ in Section 3.2.1, with some important distinctions due to the lack of randomness of the sweep $\bar w$. As we shall see, $\tau_{\bar w}(k)$ has some but not all of the standard properties of a (wavenumber-dependent) Lagrangian correlation time.
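As a quick numerical illustration of Eq. (118), the sketch below evaluates the closed-form kernel $t^2 F(t/\tau_{\bar w}(k))$ and compares it with a direct midpoint-rule quadrature of $2\int_0^t (t-s)\cos(2\pi k \bar w s)\,ds$, the $\kappa = 0$ form of the kernel. The function names and the parameter values used to exercise them are hypothetical choices for this example only.

```python
import math


def tau_sweep(k, w_bar):
    """w-bar persistence time (2 pi k w_bar)^(-1) for wavenumber k."""
    return 1.0 / (2.0 * math.pi * k * w_bar)


def universal_F(u):
    """F(u) = 2 (1 - cos u) / u^2, using its Taylor limit near u = 0."""
    if abs(u) < 1e-4:
        return 1.0 - u * u / 12.0
    return 2.0 * (1.0 - math.cos(u)) / (u * u)


def kernel_sweep(k, t, w_bar):
    """Closed-form shear-displacement kernel t^2 F(t / tau)."""
    return t * t * universal_F(t / tau_sweep(k, w_bar))


def kernel_sweep_quadrature(k, t, w_bar, n=20000):
    """Midpoint rule for 2 * integral_0^t (t - s) cos(2 pi k w_bar s) ds."""
    h = t / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += (t - s) * math.cos(2.0 * math.pi * k * w_bar * s)
    return 2.0 * h * total
```

Because $1 - \cos u \le 2$, the kernel never exceeds $4\tau_{\bar w}^2(k)$, while for $t \ll \tau_{\bar w}(k)$ it reduces to the ballistic value $t^2$; both properties can be checked directly with `kernel_sweep`.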
The time scale $\tau_{\bar w}(k)$ does have the same property as $\tau_\kappa(k)$ in measuring persistence of the tracer motion, in that $\tau_{\bar w}(k)$ describes the time scale on which the cross sweep $\bar w$ disrupts the ballistic motion due to a random oscillatory mode of wavenumber $k$ in the shear velocity field. Indeed, over times $t \ll (k \bar w)^{-1}$, the tracer motion is predominantly unidirectional. Over longer times $t \gtrsim (k \bar w)^{-1}$, the tracer is dragged by the mean sweep across several wavelengths of the oscillating shear flow, so the tracer motion assumes an oscillatory character along the shear rather than a ballistic one. The key point of departure of the present model from the standard diffusion picture enunciated in Paragraph 3.1.3.2 is that each fluctuation of the shear flow with wavenumber $k$ contributes an oscillatory, rather than a diffusive, component to the shear-parallel tracer motion at times long compared to the natural $\bar w$-persistence time scale $\tau_{\bar w}(k)$. This is reflected in the fact that the shear-displacement kernel $\mathcal{R}_{\bar w}(k, t)$ is, for each $k$, a bounded oscillatory function of time. We therefore call $\tau_{\bar w}(k)$ a Lagrangian persistence time scale, which is a more generally applicable notion than a Lagrangian correlation time scale. In Section 3.2.4, we will compare and contrast the above described behavior of the kernel associated to a constant mean sweep, $\mathcal{R}_{\bar w}(k, t)$, with those arising from other cross-transport mechanisms.

We note now that the mean-square displacement in a statistically homogeneous, random steady shear flow with constant cross sweep is given by the same formula as in the Random Sweeping Model (94), except that $\omega$ is replaced by $k \bar w$ in the kernel, and the energy spectrum resolved with respect to wavenumber appears instead of the energy spectrum resolved with respect to frequency. This is readily understood from the formula (110b) for the shear-parallel tracer motion, which for $\kappa = 0$ and $w(t) = \bar w$ reads
$$Y(t) = y_0 + \int_0^t v(x_0 + \bar w s)\, ds .$$

The Lagrangian velocity of the tracer is therefore exactly $v(x_0 + \bar w t)$, which is manifestly a mean zero, statistically stationary random process, just as the sweeping field $w_f(t)$ in the Random Sweeping Model. We can therefore immediately deduce the long-time asymptotics of the mean-square displacement along the shear from the results of the Random Sweeping Model (see Table 1 and [141]); these are displayed in Table 3. The scaling coefficients are

$$K^*_{\bar w} = \tfrac{1}{2} A_E\, \bar w^{-1} , \tag{119a}$$

$$K_{\bar w} = \tfrac{1}{2} A_E\, \pi^{\varepsilon - 3/2}\, \bar w^{\varepsilon - 2}\, \Gamma((2 - \varepsilon)/2) / \Gamma((1 + \varepsilon)/2) , \tag{119b}$$

$$K^{\circ}_{\bar w} = \frac{1}{\pi^2 \bar w^2} \int_0^{\infty} E(k)\, k^{-2}\, dk . \tag{119c}$$

The long-time tracer behavior is a smooth function of $\varepsilon$ as it varies over $0 < \varepsilon < 2$, even though it falls into different qualitative categories. There is, however, a phase transition at $\varepsilon = 0$ between trapping behavior and subdiffusive behavior (Fig. 10). This is manifested both in the sharp change in the scaling exponent, and in the fact that the long-time tracer motion depends on the whole energy spectrum for $\varepsilon < 0$ through $K^{\circ}_{\bar w}$ (see Eq. (119c)), whereas only the infrared parameters $A_E$ and $\varepsilon$ appear in the preconstants for $\varepsilon > 0$.

The simplicity of the Random Steady Shear model with constant cross sweep and the wide variety of resulting tracer behavior make this model an excellent candidate for testing numerical
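Table 3. Long-time asymptotics of mean-square tracer displacement along the shear in Random Steady Shear Model, with $\kappa = 0$, $w(t) = \bar w \neq 0$ (from [141]). Scaling coefficients are given by (119).

Parameter regime | Asymptotic mean-square displacement, $\lim_{t \to \infty} \sigma_Y^2(t)$ | Qualitative behavior
$\varepsilon < 0$ | $K^{\circ}_{\bar w}$ | Trapping
$0 < \varepsilon < 1$ | $\frac{2}{\varepsilon} K_{\bar w}\, t^{\varepsilon}$ | Sub-diffusive
$\varepsilon = 1$ | $2 K^*_{\bar w}\, t$ | Diffusive
$1 < \varepsilon < 2$ | $\frac{2}{\varepsilon} K_{\bar w}\, t^{\varepsilon}$ | Super-diffusive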
Fig. 10. Scaling exponent $\nu$ in $\sigma_Y^2(t) \sim t^{\nu}$ as a function of $\varepsilon$ in Random Steady Shear Model with $\bar w \neq 0$, $\kappa = w_f(t) = 0$.
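The regimes of Table 3 and Fig. 10 can be summarized as a small lookup. This is a minimal Python sketch with a hypothetical function name; the value $\varepsilon = 0$ is the phase transition point, which Table 3 does not tabulate, so it is returned with an explicit label rather than assigned to either neighbor.

```python
def constant_sweep_regime(eps):
    """Return (nu, label) for sigma_Y^2(t) ~ t**nu with kappa = 0, w_bar != 0."""
    if eps >= 2.0:
        raise ValueError("the infrared exponent must satisfy eps < 2")
    if eps < 0.0:
        return 0.0, "trapping"          # bounded mean-square displacement
    if eps == 0.0:
        return 0.0, "transition (not tabulated)"
    if eps < 1.0:
        return eps, "subdiffusive"
    if eps == 1.0:
        return 1.0, "diffusive"
    return eps, "superdiffusive"
```

For example, `constant_sweep_regime(0.5)` gives the subdiffusive exponent $\nu = \varepsilon = 0.5$, while any $\varepsilon < 0$ gives trapping.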
methods for simulating turbulent diffusion. It has been applied toward this end in [83], and we present an extensive discussion in Section 6.2.

3.2.2.1. Streamline analysis. Consideration of the streamlines of the steady velocity field gives a pictorial way to understand how the cross sweep $\bar w$ impedes diffusion along the shear flow (see Section 2.2). Without a cross flow, the streamlines of the shear are of course straight lines parallel to the $y$ axis. The effect of the velocity field alone is to transport any tracer along its own streamline forever along a single direction. When a nonzero cross sweep $\bar w \neq 0$ is added, the streamlines display a more interesting behavior. We define the stream function $\Psi(x, y)$ of the resulting incompressible steady flow in the standard way:

$$\mathbf{v}(x, y) = \begin{bmatrix} \bar w \\ v(x) \end{bmatrix} = \begin{bmatrix} -\,\partial \Psi(x, y)/\partial y \\ \partial \Psi(x, y)/\partial x \end{bmatrix} .$$
An obvious solution is

$$\Psi(x, y) = -\bar w\, y + \tilde\Psi(x) , \tag{120}$$
where

$$d\tilde\Psi(x)/dx = v(x) . \tag{121}$$
As $v(x)$ is a homogeneous, mean zero, Gaussian random process, it may be expressed as a stochastic Fourier integral ([341], Section 9)

$$v(x) = \int_{-\infty}^{\infty} e^{2\pi i k x} \sqrt{E(|k|)}\; d\tilde W(k) , \tag{122}$$

where the integration measure is complex white noise, which is a Gaussian random quantity with the formal properties
$$d\tilde W(-k) = \overline{d\tilde W(k)} , \qquad \langle d\tilde W(k) \rangle = 0 , \qquad \langle d\tilde W(k)\, d\tilde W(k') \rangle = \delta(k + k')\, dk\, dk' .$$

(An overbar denotes complex conjugation.) The stochastic Fourier integral expresses $v(x)$ as a continuum limit of a sum of independent random Fourier modes, with amplitudes weighted by the square root of their energy density. It is readily checked that $v(x)$ defined by Eq. (122) is a mean zero, Gaussian random process with correlation function in agreement with Eq. (108). Now, it is tempting to define $\tilde\Psi(x)$ as
$$\tilde\Psi(x) = \int_{-\infty}^{\infty} e^{2\pi i k x}\, \frac{\sqrt{E(|k|)}}{2\pi i k}\; d\tilde W(k) = \frac{\sqrt{A_E}}{2\pi i} \int_{-\infty}^{\infty} e^{2\pi i k x}\, \mathrm{sgn}(k)\, |k|^{-(1+\varepsilon)/2}\, \psi^{1/2}(|k|)\; d\tilde W(k) , \tag{123}$$

so that Eq. (121) is formally satisfied. For the stochastic Fourier integral (123) to be well-defined, however, the integrand must be square-integrable [341]. The division by $k$ introduces a possible singularity at $k = 0$, which is square-integrable only for $\varepsilon < 0$. For this range of infrared scaling exponents, we have successfully defined a good stream function $\Psi(x, y)$ by Eqs. (120) and (123). In particular, when $\varepsilon < 0$, $\tilde\Psi(x)$ is a real, homogeneous, Gaussian random field with finite variance:
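$$\langle \tilde\Psi^2(x) \rangle = \int_{-\infty}^{\infty} \left| \sqrt{A_E}\, \mathrm{sgn}(k)\, (2\pi i)^{-1} |k|^{-(1+\varepsilon)/2}\, \psi^{1/2}(|k|) \right|^2 dk = \frac{A_E}{4\pi^2} \int_{-\infty}^{\infty} |k|^{-1-\varepsilon}\, \psi(|k|)\, dk < \infty .$$

Now, the streamlines are given by level sets of the stream function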
$$\Psi(x, y) = -\bar w\, y + \tilde\Psi(x) = C , \tag{124}$$

or equivalently

$$y = \tilde\Psi(x)/\bar w - (C/\bar w) . \tag{125}$$
Since $\tilde\Psi(x)$ is a homogeneous random field with finite variance, the shear-parallel displacement of the streamline, $y$, is randomly distributed, but smoothly and with finite variance. There are no large excursions off to infinity (almost surely), and the streamlines are blocked in a statistical sense along
the shearing direction. Because we are not including molecular diffusion at the moment, tracers follow the streamlines, and will consequently remain trapped along the shearing direction for $\varepsilon < 0$. When $0 < \varepsilon < 2$, these arguments no longer hold (due to the singularity of the integral (123) at $k = 0$); it is not possible to define a statistically homogeneous stream function with bounded variance. The streamlines instead wander off to infinity, yielding subdiffusive, diffusive, or superdiffusive behavior on large scales and long times.

3.2.3. Effects of temporally fluctuating cross sweep

The third and final mechanism of cross-shear transport which we will consider is that of a randomly fluctuating, spatially uniform, mean zero velocity field $w_f(t)$:

$$\mathbf{v}(x, y, t) = \begin{bmatrix} w_f(t) \\ v(x) \end{bmatrix} .$$
The shear-transverse component of the velocity field $w_f(t)$ is taken to be a mean zero, Gaussian, stationary, random process with correlation function:

$$R_w(t) \equiv \langle w_f(t')\, w_f(t' + t) \rangle = 2 \int_0^{\infty} \cos(2\pi \omega t)\, E_w(|\omega|)\, d\omega .$$
We assume the power spectrum of the random sweeping is smooth for $\omega > 0$, absolutely integrable, and has the same form as that assumed in the Random Sweeping Model discussed in Section 3.1.2:

$$E_w(\omega) = A_{E,w}\, |\omega|^{-\beta}\, \psi_w(|\omega|) . \tag{126}$$

Here, $A_{E,w} > 0$, $\beta < 1$, and $\psi_w(\cdot)$ is a smooth function on the non-negative real axis with $\psi_w(0) = 1$ and $|\psi_w'(0)| < \infty$.

The mean-square displacement of a tracer across the shear due to the random sweeping is given by Eq. (93)
$$\sigma_X^2(t) \equiv \langle (X(t) - x_0)^2 \rangle = 2 \int_0^t (t - s)\, R_w(s)\, ds .$$

We saw in Section 3.1.2 that the exponent $\beta$ determines the long-time behavior of the tracer motion in the $x$ direction:
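$$\lim_{t \to \infty} \sigma_X^2(t) \sim \begin{cases} \dfrac{2}{1+\beta}\, K_x\, t^{1+\beta} & \text{for } -1 < \beta < 1 , \\[1ex] K^{\circ}_x & \text{for } \beta < -1 , \end{cases} \tag{127}$$

where

$$K_x = \tfrac{1}{2} A_{E,w}\, \pi^{\beta - 1/2}\, \Gamma((1 - \beta)/2) / \Gamma((2 + \beta)/2) , \qquad K^{\circ}_x = \frac{1}{\pi^2} \int_0^{\infty} E_w(\omega)\, \omega^{-2}\, d\omega .$$

To avoid needless complications, we will not treat the phase transition value $\beta = -1$, in which logarithms arise.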
The mean-square displacement of a tracer along the shear is obtained by setting $\kappa = \bar w = 0$ in Eqs. (112a) and (112b)

$$\sigma_Y^2(t) = 2 \int_0^{\infty} E(k)\, \mathcal{R}_{w_f}(k, t)\, dk , \qquad \mathcal{R}_{w_f}(k, t) = 2 \int_0^t (t - s)\, e^{-2\pi^2 k^2 \sigma_X^2(s)}\, ds . \tag{128}$$

We see that the random cross sweeping decorrelates the tracer motion along the shear through the exponentially decaying factor. It acts in a manner quite similar to molecular diffusion in buffeting tracers randomly onto different streamlines of the pure shear flow; indeed, the case of pure molecular diffusion is recovered by simply setting $\sigma_X^2(t) = 2\kappa t$. The random cross sweeping, however, can have a variety of long-time scaling behavior, depending on its low-frequency scaling exponent $\beta$. The shear-parallel transport depends sensitively on the effectiveness of the random cross sweep, as illustrated in Table 4, which displays the long-time asymptotics of $\sigma_Y^2(t)$ for various parameter values. The scaling coefficients are
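$$K^*_{w_f} = 2 \int_0^{\infty} E(k)\, K_{w_f}(k)\, dk ,$$

$$K_{w_f} = \frac{\Gamma((2 - \varepsilon)/2)\, A_E}{(\varepsilon(1 + \beta)/2) - \beta} \left( \frac{1 + \beta}{4\pi^2 K_x} \right)^{(2 - \varepsilon)/2} , \tag{129}$$

$$K^{\bullet}_{w_f} = 2 \int_0^{\infty} E(k)\, e^{-2\pi^2 k^2 K^{\circ}_x}\, dk ,$$

where

$$K_{w_f}(k) \equiv \lim_{t \to \infty} \frac{\mathcal{R}_{w_f}(k, t)}{2t} = \int_0^{\infty} e^{-2\pi^2 k^2 \sigma_X^2(s)}\, ds \tag{130}$$

is the asymptotic shear-parallel diffusivity contribution from a normalized mode of wavenumber $k$. These results are derived in Paragraph 3.2.6.3.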
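To make the single-mode diffusivity (130) concrete, the following Python sketch evaluates it under a simplifying assumption that is not made in the text: the cross-shear variance is taken to be the pure power law $\sigma_X^2(s) = \frac{2}{1+\beta} K_x s^{1+\beta}$ for all $s$, rather than only asymptotically. Under that assumption the integral has the closed form $\Gamma(1 + 1/p)\, a^{-1/p}$ with $p = 1 + \beta$ and $a = 4\pi^2 k^2 K_x / p$, which the code compares against a midpoint-rule quadrature. All function names and parameter values are hypothetical.

```python
import math


def sigma_x2_power_law(s, K_x, beta):
    """Assumed cross-shear variance: (2 / (1 + beta)) * K_x * s**(1 + beta)."""
    return 2.0 / (1.0 + beta) * K_x * s ** (1.0 + beta)


def single_mode_diffusivity_closed_form(k, K_x, beta):
    """Gamma(1 + 1/p) * a**(-1/p) for the power-law variance above."""
    p = 1.0 + beta
    a = 4.0 * math.pi ** 2 * k ** 2 * K_x / p
    return math.gamma(1.0 + 1.0 / p) * a ** (-1.0 / p)


def single_mode_diffusivity_quadrature(k, K_x, beta, n=20000):
    """Midpoint rule for integral_0^inf exp(-2 pi^2 k^2 sigma_X^2(s)) ds."""
    p = 1.0 + beta
    a = 4.0 * math.pi ** 2 * k ** 2 * K_x / p
    s_max = (40.0 / a) ** (1.0 / p)  # integrand is below exp(-40) past s_max
    h = s_max / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += math.exp(-2.0 * math.pi ** 2 * k ** 2 * sigma_x2_power_law(s, K_x, beta))
    return h * total
```

For $\beta = 0$ the closed form reduces to $(4\pi^2 k^2 K_x)^{-1}$, the molecular-diffusion result with $K_x$ playing the role of $\kappa$, and in general the diffusivity scales as $k^{-2/(1+\beta)}$, so a more effective cross sweep (larger $\beta$) gives a weaker divergence at low wavenumber.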
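Table 4. Long-time asymptotics of mean-square tracer displacement along the shear in Random Steady Shear Model, with $\kappa = 0$, $w(t) = w_f(t) \neq 0$. Scaling coefficients are given by (129).

Parameter regime | Asymptotic mean-square displacement, $\lim_{t \to \infty} \sigma_Y^2(t)$ | Qualitative behavior
$-1 < \beta < 1$, $\varepsilon < \frac{2\beta}{1+\beta}$ | $2 K^*_{w_f}\, t$ | Diffusive
$-1 < \beta < 1$, $\frac{2\beta}{1+\beta} < \varepsilon < 2$ | $\frac{4}{2 + \varepsilon - \beta(2 - \varepsilon)}\, K_{w_f}\, t^{1 + \varepsilon/2 - \beta(2 - \varepsilon)/2}$ | Superdiffusive
$\beta < -1$, $\varepsilon < 2$ | $K^{\bullet}_{w_f}\, t^2$ | Ballistic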
Consider first the case in which the mean-square displacement across the shear grows unboundedly with time ($-1 < \beta < 1$). The value $\beta = 0$ corresponds to a diffusive cross sweep, and the behavior of $\sigma_Y^2(t)$ is much the same at long times as it is for the case of molecular diffusion; see Section 3.2.1. In particular, there is a phase transition at $\varepsilon = 0$, below which the tracer displacement along the shear is diffusive, and above which it grows superdiffusively as $\sigma_Y^2(t) \sim K_{w_f}\, t^{1 + \varepsilon/2}$. As $\beta$ is varied within the range $-1 < \beta < 1$, the situation is qualitatively the same. The key changes are that

• The phase transition value between superdiffusive and diffusive shear-parallel transport is shifted to $\varepsilon = 2\beta/(1 + \beta)$.
• The superdiffusive scaling exponent is modified to $1 + \varepsilon/2 - \beta(2 - \varepsilon)/2$.

These are indicated graphically in Fig. 11. We note that increasing $\beta$ increases the strength of the low-frequency, long-range temporal correlations in the sweep, and thus increases the rate at which the tracer is swept across the streamlines. We see that an increase of $\beta$ is associated with an increase in the range of values of $\varepsilon$ for which the shear-parallel transport is diffusive rather than superdiffusive, and is also associated with a depression of the scaling exponent in the superdiffusive regime. This explicitly demonstrates that an increase in the strength of cross-shear transport decreases the rate of shear-parallel transport. The reason is simply that the Lagrangian correlation time for the tracer motion along the shear will be reduced if it is swept across the shear at a faster rate; see Section 3.1.3.

One can associate the Lagrangian correlation time in the present model with a natural $w_f$-persistence time $\tau_{w_f}(k)$ for each wavenumber $k$ in the shear by the implicit relation

$$\sigma_X^2(\tau_{w_f}(k)) = (2\pi k)^{-2} .$$

This is suggested mathematically by the exponential damping term in Eq. (128), and physically (up to numerical factor) by the typical time taken to cross one wavelength $\sim k^{-1}$ of the shear flow.
Fig. 11. Scaling exponent $\nu$ in $\sigma_Y^2(t) \sim t^{\nu}$ as a function of $\varepsilon$ in Random Steady Shear Model with $w_f(t) \neq 0$, $\kappa = \bar w = 0$. The value $\beta = 1/2$, corresponding to a random cross sweep which induces superdiffusive cross-shear transport, is used in this drawing. Varying $\beta$ slides the transition value horizontally along $\nu = 1$.

For low wavenumbers $k$, which have the longest Lagrangian decorrelation times and therefore often dominate the long-time asymptotics of tracer motion, we can use the
asymptotic formula (127) to obtain

$$\lim_{k \to 0} \tau_{w_f}(k) \sim C_{w_f}\, (K_x k^2)^{-1/(1+\beta)} , \tag{131}$$

where $C_{w_f}$ is an unimportant numerical constant. The rate of divergence of the Lagrangian correlation time $\tau_{w_f}(k)$ is shallower when $\beta$ increases and the cross-sweep transport is more effective, as we claimed. It can be checked that the asymptotic single-mode diffusivity $K_{w_f}(k)$ is on the order of $\tau_{w_f}(k)$, as would be expected by the standard Lagrangian analysis (Paragraph 3.1.3.2).

The case $\beta < -1$ has a distinctive character. The low-frequency component of the random sweep is so weak in this regime that the tracer motion across the shear is trapped. Thus, at long time scales, the tracer remains statistically localized about its original streamline, and therefore exhibits permanent memory effects. In fact, the Lagrangian correlation time associated to each wavenumber is effectively infinite. Consequently, the shear-parallel transport is ballistic, in accordance with the cartoon Lagrangian description in Section 3.1.3.

It is interesting to note that the scaling coefficient $K^{\bullet}_{w_f}$ for the ballistic regime depends on the energy spectrum at all wavenumbers, a feature in common with the diffusive regime and distinct from the superdiffusive regime. The reason is that each mode of the shear flow has an infinite Lagrangian correlation time associated to it due to the trapped cross-shear transport. Hence, the contribution of even large wavenumbers remains ballistic at arbitrarily long times, and never becomes subdominant to the total ballistic scaling.

3.2.4. Comparison of effects of various cross-shear transport processes

We now make some general observations regarding the competing effects of cross-shear transport due to molecular diffusion $\kappa$ (Section 3.2.1), a constant cross sweep $\bar w$ (Section 3.2.2), or a randomly fluctuating cross sweep $w_f(t)$ (Section 3.2.3).
First of all, we have noticed that shear velocity fields with long-range correlations ($\varepsilon$ relatively large) can exhibit superdiffusive shear-parallel transport for all cross-shear transport mechanisms considered. The reason is that the mode-by-mode contribution to the shear-parallel tracer motion has an effective Lagrangian persistence time scale diverging at low wavenumber. When there is enough energy in the low wavenumber modes, the total motion of the tracer also has an infinite Lagrangian persistence time, so the standard diffusive situation of Paragraph 3.1.3.2 does not apply.

Another trend which can be observed is the fact that shear-parallel transport is generally diminished as cross-shear transport becomes more efficient. This was demonstrated explicitly in Section 3.2.3 by consideration of how the long-time asymptotics of $\sigma_Y^2(t)$ depend on the exponent $\beta$ characterizing the low-frequency behavior of the randomly fluctuating cross sweep $w_f(t)$. The sense in which this qualitative trend is supported by all cases considered is that the superdiffusive regime of exponent values $\varepsilon$ shrinks as the cross-shear transport becomes more rapid. Indeed, the superdiffusive regime is minimal ($1 < \varepsilon < 2$) for the case of a constant (ballistic) cross sweep, and expands smoothly for a randomly fluctuating cross sweep as $\beta$ is decreased so that the cross-shear transport is superdiffusive, then diffusive, then subdiffusive, then trapped. Moreover, the case of a constant cross sweep can be thought of as a limit with $\beta \nearrow 1$ of the randomly fluctuating cross sweep insofar as superdiffusive tracer motion is concerned. Specifically, as $\beta \nearrow 1$, the boundary
value between diffusive and superdiffusive behavior, $2\beta/(1 + \beta)$, and the superdiffusive scaling exponent $1 + \varepsilon/2 - \beta(2 - \varepsilon)/2$ for the randomly fluctuating cross sweep tend to the values $1$ and $\varepsilon$, respectively, which are those corresponding to a constant cross sweep (cf. Figs. 10 and 11).

There is, however, a strong distinction between the case of a constant cross sweep and the case of a random cross-shear transport (whether by molecular diffusion or by a randomly fluctuating cross sweep) insofar as subdiffusive tracer behavior is concerned. There is neither a subdiffusive nor a trapping regime for random cross-shear transport, whereas these do exist with a constant cross sweep when the energy spectrum of the shear flow vanishes at the origin. The reason for the distinction may be understood by a consideration of the contribution $\mathcal{R}(k, t)$ of a single mode of wavenumber $k$ of the random shear flow to the shear-parallel transport. A constant cross sweep will coherently move the tracer across the sinusoidal modulation, and the net transport over every spatial wavelength travelled will be exactly zero. Consequently, each wavenumber by itself makes a bounded contribution to the mean-square shear-parallel displacement. If the tracer is instead randomly buffeted across a sinusoidal shear component, its motion along the shear will be a mean zero but nontrivial random variable as it crosses a wavelength, since its back-and-forth motion along the shear does not generally cancel out. (The tracer can linger near the positive maximum of the sinusoid longer than near the negative minimum, for example.) Consequently, after a time in which the tracer crosses many spatial wavelengths (which means $t$ is much larger than the associated wavenumber-dependent Lagrangian persistence time), the shear-parallel tracer motion contributed by the sinusoid will behave like a sum of a large number of mean zero, identically distributed, independent random pushes, and hence be diffusive.
As the mode-by-mode contributions of the shear-parallel transport are therefore diffusive at large times and ballistic at small times (see Section 3.1.3), the total tracer motion along the shear must be at least diffusive when the cross-transport mechanism is noisy. Only coherent transport processes across the shear (such as by a constant cross sweep) which preserve phase information can produce the oscillations in the Lagrangian velocity needed for subdiffusive or trapped behavior.

3.2.5. Superposition of cross-shear transport mechanisms

We now consider what happens to the long-time asymptotics of the shear-parallel transport when multiple cross-shear transport processes are active.

3.2.5.1. Superposition of random cross-shear transport processes. The superposition of a randomly fluctuating cross sweep $w_f(t)$ with molecular diffusion $\kappa$ is readily understood: the scaling exponent of the long-time tracer motion along the shear is determined by the more rapid of the two cross-shear transport processes. That is, when the fluctuating cross sweep $w_f(t)$ is superdiffusive ($0 < \beta < 1$), the value of $\varepsilon$ demarcating the phase transition to superdiffusive shear-parallel transport behavior, as well as the scaling exponents in the superdiffusive regime of $\varepsilon$, are identical to those reported in Table 4 for the case of a superdiffusive fluctuating cross sweep acting alone across the shear. Conversely, when the fluctuating cross sweep is subdiffusive or trapped ($\beta < 0$), then the long-time scaling properties of $\sigma_Y^2(t)$ are determined by molecular diffusion alone (Table 2). When the cross sweep is diffusive ($\beta = 0$), the long-time asymptotics of $\sigma_Y^2(t)$ have the same scaling properties as either $\kappa$ or $w_f(t)$ acting alone; they are identical. The reason for the simple rules for superposition described above is that the Lagrangian correlation (or persistence) time of a tracer being decorrelated by two independent random mechanisms will naturally be equal to the lesser of
the two. Therefore, the more rapid cross-shear transport process will determine the character of the long-time asymptotics of the shear-parallel transport. Note that the above superposition laws indicate that the scaling exponent for $\sigma_Y^2(t)$ is always the smaller of the exponents associated to each individual, active random cross-shear transport mechanism operating in isolation.

When the tracer motion is superdiffusive, the scaling coefficient in the asymptotics for $\sigma_Y^2(t)$ will depend only on the dominant cross-shear mechanism. But when the tracer motion is diffusive, the effective diffusion coefficient will generally depend on all cross-shear transport processes involved.

It must always be kept in mind that long-time asymptotic results are only valid when $t$ is larger than all relevant time scales, and that there may in practice be important finite time corrections. For example, even though the effects of molecular diffusion will eventually dominate those of a subdiffusive cross sweep $w_f(t)$, there may well be a long intermediate interval of time over which the cross-shear transport is dominated by the effects of the fluctuating cross sweep, because $\kappa$ is typically small relative to the macroscopic units of the system. Therefore, the shear-parallel transport may, in such a case, exhibit the scaling associated to a subdiffusive cross sweep for a while (Table 4), then ultimately cross over to the asymptotic scaling associated to the effects of molecular diffusion (Table 2).

3.2.5.2. Superposition of constant cross sweep with random cross-shear transport process. The situation is more subtle when a constant cross sweep $\bar w$ is superposed with a random cross-shear transport process ($\kappa$ or $w_f(t)$). On the one hand, it is true that the fastest cross-shear transport mechanism (in the present case, $\bar w$) determines whether the tracer motion is superdiffusive, and if so, how its mean-square displacement scales in time.
Therefore, the superdiffusive regime is $1 < \varepsilon < 2$ and has the same scaling law as for a constant cross sweep with no other random mechanisms (Table 3). But for $\varepsilon < 1$, we must take into account the fact that the random cross-shear transport will break up the coherence present when the constant cross sweep is acting alone. Therefore, when the constant cross sweep $\bar w$ is superposed with molecular diffusion $\kappa$ and/or random cross-shear fluctuations $w_f(t)$ with $-1 < \beta < 1$, the subdiffusive and trapping regimes ($\varepsilon < 1$) become diffusive.

We now explicitly illustrate some of these subtle effects of superposition by reporting some basic formulas, worked out in detail in [141], for the mean-square shear-parallel transport for the case in which a constant cross sweep $\bar w \neq 0$ and molecular diffusion $\kappa \neq 0$ are both present. We take $w_f(t) = 0$ for this discussion.

The mean-square displacement of the tracer along the shear is given by the following formula [141], specialized from Eq. (112)
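$$\sigma_Y^2(t) = 2\kappa t + 2 \int_0^{\infty} E(k)\, \mathcal{R}_{\kappa \bar w}(k, t)\, dk , \tag{132a}$$

$$\mathcal{R}_{\kappa \bar w}(k, t) = t^2 F\!\left( \frac{t}{\tau_\kappa(k)} , \frac{t}{\tau_{\bar w}(k)} \right) . \tag{132b}$$

Here, $F(u, u')$ is the following universal function

$$F(u, u') = 2 \left[ \frac{u}{u^2 + u'^2} - \frac{(1 - e^{-u} \cos u')(u^2 - u'^2) + 2 u u' e^{-u} \sin u'}{(u^2 + u'^2)^2} \right] , \tag{133}$$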
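Table 5. Long-time asymptotics of mean-square tracer displacement along the shear in Random Steady Shear Model, with $\kappa > 0$, $\bar w \neq 0$, $w_f(t) = 0$. Scaling coefficients are given by Eq. (134).

Parameter regime | Asymptotic mean-square displacement, $\lim_{t \to \infty} \sigma_Y^2(t)$ | Qualitative behavior
$\varepsilon < 1$ | $2 K^*_{\kappa \bar w}\, t$ | Diffusive
$\varepsilon = 1$ | $2 K^{**}_{\kappa \bar w}\, t$ | Diffusive
$1 < \varepsilon < 2$ | $\frac{2}{\varepsilon} K_{\kappa \bar w}\, t^{\varepsilon}$ | Superdiffusive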
and

$$\tau_\kappa(k) = (4\pi^2 \kappa k^2)^{-1} , \qquad \tau_{\bar w}(k) = (2\pi \bar w k)^{-1}$$

are the Lagrangian persistence times associated to the molecular diffusion and to the constant cross sweep, respectively. A more compact formula for $\sigma_Y^2(t)$ in terms of the Laplace transform of the velocity correlation function $R(x)$ was derived by Matheron and de Marsily [223]; the spectral representation has been developed here because of its flexibility in handling randomly fluctuating components $w_f(t)$ in the cross sweep. Some numerical plots of $\sigma_Y^2(t)$ over finite intervals of time for special choices of correlation functions $R(x)$ can be found in [223].

The long-time asymptotics of $\sigma_Y^2(t)$ are worked out rigorously in [141], and the results displayed in Table 5. The scaling coefficients are
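The universal function in Eq. (133) can be checked numerically against the time integral it is meant to represent. The Python sketch below assumes that, with $\sigma_X^2(s) = 2\kappa s$, the kernel is $2\int_0^t (t-s)\cos(s/\tau_{\bar w})\, e^{-s/\tau_\kappa}\, ds$, which follows from the general kernel (112b) as specialized in the derivation of Section 3.2.6.1. Function names and parameter values are hypothetical.

```python
import math


def universal_F2(u, up):
    """F(u, u') of Eq. (133); u = t / tau_kappa, up = t / tau_wbar."""
    d = u * u + up * up
    decay = math.exp(-u)
    num = (1.0 - decay * math.cos(up)) * (u * u - up * up) \
        + 2.0 * u * up * decay * math.sin(up)
    return 2.0 * (u / d - num / (d * d))


def kernel_kappa_sweep(k, t, kappa, w_bar):
    """Closed-form kernel t^2 F(t / tau_kappa, t / tau_wbar), Eq. (132b)."""
    inv_tau_kappa = 4.0 * math.pi ** 2 * kappa * k ** 2
    inv_tau_wbar = 2.0 * math.pi * w_bar * k
    return t * t * universal_F2(t * inv_tau_kappa, t * inv_tau_wbar)


def kernel_kappa_sweep_quadrature(k, t, kappa, w_bar, n=20000):
    """Midpoint rule for 2 * int_0^t (t-s) cos(s/tau_w) exp(-s/tau_kappa) ds."""
    a = 4.0 * math.pi ** 2 * kappa * k ** 2
    b = 2.0 * math.pi * w_bar * k
    h = t / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += (t - s) * math.cos(b * s) * math.exp(-a * s)
    return 2.0 * h * total
```

At long times the ratio `kernel_kappa_sweep(k, t, kappa, w_bar) / (2 * t)` settles to $\tau_\kappa^{-1}/(\tau_\kappa^{-2} + \tau_{\bar w}^{-2})$, a bounded function of $k$, which is the behavior one expects when random cross-shear motion breaks the coherent oscillation produced by the constant sweep alone.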
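$$K^*_{\kappa \bar w} = \kappa + 2 \int_0^{\infty} E(k)\, K_{\kappa \bar w}(k)\, dk ,$$

$$K^{**}_{\kappa \bar w} = \kappa + K_{\kappa \bar w} + 2 \int_0^{\infty} E(k)\, K_{\kappa \bar w}(k)\, dk , \tag{134}$$

$$K_{\kappa \bar w} = \frac{A_E\, \pi^{\varepsilon - 3/2}\, \bar w^{\varepsilon - 2}\, \Gamma((2 - \varepsilon)/2)}{2\, \Gamma((1 + \varepsilon)/2)} ,$$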
where

$$K_{\kappa \bar w}(k) \equiv \lim_{t \to \infty} \frac{1}{2t}\, \mathcal{R}_{\kappa \bar w}(k, t) = \frac{\tau_\kappa^{-1}(k)}{\tau_\kappa^{-2}(k) + \tau_{\bar w}^{-2}(k)} = \frac{\kappa}{4\pi^2 \kappa^2 k^2 + \bar w^2} \tag{135}$$

is the asymptotic diffusivity contributed by a single normalized Fourier mode of the shear flow.

These results have some interesting relations to the asymptotics for a random steady shear flow with either molecular diffusion $\kappa$ or a constant cross sweep $\bar w$ acting individually (see Sections 3.2.1 and 3.2.2, respectively). First, we note that the superdiffusive regime and scalings are identical to those in which $\bar w \neq 0$ but $\kappa = 0$. This is in accordance with the general principle described above that the fastest cross-shear transport mechanism determines whether the tracer behaves superdiffusively, and if so, how. Recall that if the random shear flow has the generic property
$0 < \int_{-\infty}^{\infty} R(x)\, dx < \infty$ (which corresponds to $\varepsilon = 1$), then the mean-square tracer displacement along the shear grows superdiffusively in time in the presence of molecular diffusion but with no cross sweep: $\sigma_Y^2(t) \sim t^{3/2}$. The addition of a constant cross sweep $\bar w$ restores a diffusive character to the shear-parallel transport, as was noted by Matheron and de Marsily [223] in a hydrological context. One way to understand this for the case of the independent channel models of [46,279,353] is that a mean sweep across the channels destroys the oversampling of old channels. Over a time interval $t$, a tracer samples only $O(\sqrt{t})$ different channels if it only moves diffusively across channels, but will sample a full $O(t)$ different channels when a constant cross-drift is added [47].

A second outcome of the combination of molecular diffusion with the constant cross sweep is the elimination of the subdiffusive and trapping regimes which were present when $\bar w \neq 0$ but $\kappa = 0$. This is not simply because molecular diffusion contributes an independent shear-parallel diffusive transport process. Indeed, even if molecular diffusion were only to act across the shear, the only change to the asymptotics stated above is that the additive $\kappa$ term in the scaling constants $K^*_{\kappa \bar w}$ and $K^{**}_{\kappa \bar w}$ would disappear. The fundamental reason the subdiffusive and trapping behavior disappears is that the addition of molecular diffusion randomizes the cross-shear transport, and disrupts the coherent cross-shear motion which produces the persistent oscillations of the Lagrangian velocity.

Interestingly, the boundary between the diffusive and superdiffusive regimes is located at $\varepsilon = 1$, as for the case of pure constant cross sweep, but manifests a sharp phase transition as for the case of pure molecular diffusion (Fig. 12). (Recall that the diffusive-superdiffusive transition is smooth for the case of a pure constant cross sweep.)

We finally remark upon the need for mathematical care which the current example demonstrates.
Note that the asymptotic mode-by-mode diffusivity $K_{\kappa \bar w}(k)$ (135) is a bounded function. Consequently, if one were to blindly compute $\lim_{t \to \infty} \sigma_Y^2(t)/2t$ by moving the $t \to \infty$ limit under the integral over $k$ in Eq. (132a), one would arrive at a finite answer for all $\varepsilon < 2$, which would coincide with a naive extrapolation of formula (52) for the effective diffusivity in a periodic shear flow to the case of a random shear flow. It would be incorrect to conclude, however, that this result described the asymptotic shear-parallel diffusivity for $1 \le \varepsilon < 2$. The commutation of the $t \to \infty$ limit with the integral over $k$ is patently invalid for this range of parameters, and leads to a misleading and incorrect result. The reason is a subtle nonuniform convergence of the integrand to its asymptotic limit at small wavenumbers; a formally transient contribution to the integral is in fact significant or
Fig. 12. Scaling exponent $\nu$ in $\sigma_Y^2(t) \sim t^{\nu}$ as a function of $\varepsilon$ in Random Steady Shear Model with $\kappa > 0$, $\bar w \neq 0$, $w_f(t) = 0$.
dominant for $1 \le \varepsilon < 2$. Mathematically correct computations of the asymptotic diffusivity can be ensured by proper application of the dominated convergence theorem to interchange limits and integration and by explicit accounting of the troublesome nonuniformities at small wavenumber. We illustrate this procedure in Paragraph 3.2.6.3 below; see [141] for the rigorous computation of the asymptotic results just discussed.

3.2.6. Derivation

We now indicate how the exact asymptotic formulas for the mean-square tracer displacement in the Random Steady Shear flow are mathematically obtained. First, we derive the fundamental formula (112) for the mean-square tracer displacement at all finite times. Next, we indicate in general terms how the long-time asymptotics may be deduced both heuristically and rigorously from this formula. We provide details for the case of a randomly fluctuating cross sweep $w_f(t)$; the other cases have been thoroughly addressed in previous publications [10,141].

3.2.6.1. Derivation of general formula. The mean-square displacement of a tracer along the shear may be written in the following primitive form using the integrated equations for the tracer trajectories (110)

$$\sigma_Y^2(t) \equiv \langle (Y(t) - y_0)^2 \rangle , \tag{136a}$$

$$Y(t) - y_0 = \int_0^t v(X(s))\, ds + \sqrt{2\kappa}\, W_y(t) , \tag{136b}$$

$$X(t) = x_0 + \int_0^t w(s)\, ds + \sqrt{2\kappa}\, W_x(t) . \tag{136c}$$

Using the mutual independence of $v(x)$ and $W_y(t)$ and the fact they each have mean zero, we may write
$$\sigma_Y^2(t) = 2\kappa \langle (W_y(t))^2 \rangle + \int_0^t \!\! \int_0^t \langle v(X(s))\, v(X(s')) \rangle\, ds\, ds' = 2\kappa t + \int_0^t \!\! \int_0^t \langle R(X(s) - X(s')) \rangle\, ds\, ds' . \tag{137}$$

The remaining averaging is over the statistics of $W_x(t)$ and $w(t)$, which determine $X(t)$. Invoking the spectral representation (108) for the correlation function $R(x)$ of the shear velocity field, we write
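$$\sigma_Y^2(t) = 2\kappa t + \int_0^t \!\! \int_0^t \!\! \int_{-\infty}^{\infty} E(|k|)\, \langle e^{2\pi i k (X(s) - X(s'))} \rangle\, dk\, ds\, ds' . \tag{138}$$

Now, $X(s) - X(s')$ is the random tracer displacement between times $s'$ and $s$. By statistical homogeneity and stationarity of the advection processes, this must have the same statistics as the tracer displacement over a general time interval $s - s'$, which we denote $\delta X(s - s')$.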
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
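Furthermore, as can be checked from Eq. (110a), this is an integral of a Gaussian random process and therefore a Gaussian random variable. Its characteristic function is therefore readily evaluated:
\[ \langle e^{2\pi i k (X(s) - X(s'))} \rangle = \langle e^{2\pi i k\,\delta X(s - s')} \rangle = e^{2\pi i k \langle \delta X(s - s') \rangle - 2\pi^2 k^2 \sigma_X^2(|s - s'|)} = e^{2\pi i k \bar{w} (s - s') - 2\pi^2 k^2 \sigma_X^2(|s - s'|)} , \]
where
\[ \sigma_X^2(t) \equiv \langle (X(t) - \langle X(t) \rangle)^2 \rangle = 2\kappa t + 2\int_0^t (t - s)\,R_w(s)\,ds . \]
Substituting this equation into Eq. (138), we complete the derivation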
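\[ \sigma_Y^2(t) = 2\kappa t + \int_0^t\!\!\int_0^t\!\!\int_{-\infty}^{\infty} E(|k|)\,e^{2\pi i k \bar{w}(s - s') - 2\pi^2 k^2 \sigma_X^2(|s - s'|)}\,dk\,ds\,ds' \]
\[ \phantom{\sigma_Y^2(t)} = 2\kappa t + 2\int_0^t\!\!\int_0^t\!\!\int_0^{\infty} E(k) \cos(2\pi k \bar{w}(s - s'))\,e^{-2\pi^2 k^2 \sigma_X^2(|s - s'|)}\,dk\,ds\,ds' \]
\[ \phantom{\sigma_Y^2(t)} = 2\kappa t + 4\int_0^{\infty}\!\!\int_0^t (t - s)\,E(k) \cos(2\pi k \bar{w} s)\,e^{-2\pi^2 k^2 \sigma_X^2(s)}\,ds\,dk . \]
This may be rewritten in the form stated in Eqs. (112a) and (112b):
\[ \sigma_Y^2(t) = 2\kappa t + 2\int_0^{\infty} E(k)\,R(k,t)\,dk , \tag{139a} \]
\[ R(k,t) = 2\int_0^t (t - s) \cos(2\pi k \bar{w} s)\,e^{-2\pi^2 k^2 \sigma_X^2(s)}\,ds . \tag{139b} \]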
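As a quick numerical sanity check of Eq. (139b), the sketch below evaluates the shear-displacement kernel in the simplest setting, assuming that molecular diffusion is the only cross-shear mechanism, so that $\bar{w} = 0$ and $\sigma_X^2(s) = 2\kappa s$. Under these assumptions the integral has the elementary closed form $R(k,t) = 2[t/a - (1 - e^{-at})/a^2]$ with $a = 4\pi^2\kappa k^2$. The function names and parameter values are illustrative choices, not taken from the text.

```python
import math

def kernel_closed_form(k, t, kappa):
    """R(k,t) of Eq. (139b), assuming w_bar = 0 and sigma_X^2(s) = 2*kappa*s."""
    a = 4.0 * math.pi**2 * kappa * k**2      # decay rate of the exponential factor
    return 2.0 * (t / a - (1.0 - math.exp(-a * t)) / a**2)

def kernel_quadrature(k, t, kappa, n=2000):
    """Same kernel by composite Simpson quadrature (n must be even)."""
    a = 4.0 * math.pi**2 * kappa * k**2
    h = t / n
    f = lambda s: 2.0 * (t - s) * math.exp(-a * s)
    total = f(0.0) + f(t)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(j * h)
    return total * h / 3.0

k, kappa = 1.0, 0.01                          # illustrative values
tau = 1.0 / (4.0 * math.pi**2 * kappa * k**2)  # persistence time 1/a
for t in (0.01 * tau, tau, 100.0 * tau):
    print(t / tau, kernel_closed_form(k, t, kappa), kernel_quadrature(k, t, kappa))
```

At times much shorter than $1/a$ the printed values approach $t^2$ (ballistic), while at times much longer than $1/a$ they approach $2t/a$ (diffusive), in line with the heuristic picture of a single-mode contribution that switches behavior at the persistence time.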
3.2.6.2. General asymptotic considerations. The long-time limit of Eqs. (139a) and (139b) may be computed through consideration of the behavior of the shear-displacement kernel $R(k,t)$. We first discuss the asymptotics on a heuristic level, then indicate how to make these arguments rigorous. Subsequently, we will provide a detailed computation of the asymptotic shear-parallel transport rate when a randomly fluctuating cross sweep $w_f(t)$ is active.

For a given superposition of cross-shear transport processes, there is a naturally defined Lagrangian persistence time $\tau_L(k)$ associated to transport by the shear mode with wavenumber $k$. Expressions for $\tau_L(k)$ for various cross-shear transport mechanisms were given in Eqs. (113), (118) and (131). When multiple cross-shear transport processes are present, $\tau_L(k)$ is taken as the smallest of the corresponding persistence time scales. For the moment, we will discuss only random cross-shear transport processes ($\kappa$ and $w_f(t)$); the presence of a constant cross sweep $\bar{w} \neq 0$ requires some special consideration, and we return to it later. When $t \ll \tau_L(k)$, the contribution of the shear mode of wavenumber $k$ is ballistic, and
\[ R(k,t) \approx t^2 \quad \text{for } 0 \le t \ll \tau_L(k) . \]
The shear-parallel transport due to shear fluctuations of wavenumber $k$ becomes diffusive for $t \gg \tau_L(k)$.
There is also an intermediate case in which the ballistic modes $0 \le k \lesssim k_0(t)$ contribute a linear function of $t$, so that the shear-parallel transport is diffusive, but with an extra term in the formula for the enhanced diffusivity (140b). An explicit example of this special case, which we discuss no further here, arises at the phase transition value $\varepsilon = 1$ for a random shear flow with constant cross sweep $\bar{w} \neq 0$ and molecular diffusion $\kappa > 0$; see Table 5.

Rigorous approach. To decide mathematically whether the total shear-parallel transport is diffusive or superdiffusive, one first derives a bound of the following form on the shear-displacement kernel:
\[ |R(k,t)| \le C\,\tau_L(k)\,t , \tag{141} \]
where $C$ is some numerical constant. This can be done through integration by parts in Eq. (139b) or by direct consideration of the damped exponential. Then one can use this inequality along with the
dominated convergence theorem ([288], Section 4.4) to prove that if
\[ \int_0^{\infty} E(k)\,\tau_L(k)\,dk < \infty , \tag{142} \]
then $\sigma_Y^2(t)$ grows linearly in time with effective diffusion coefficient given by Eqs. (140a) and (140b). When Eq. (142) is violated, this indicates that the ballistic behavior of the modes $k \lesssim k_0(t)$ is producing a nonuniformity near $k = 0$. To compute the contribution from this region, one rescales the integration wavenumber by $k_0(t)$. We illustrate this procedure explicitly below. A mathematically equivalent approach [141] to computing the asymptotics of $\sigma_Y^2(t)$ is to posit general space-time rescalings $\tilde{Y}(\tilde{t}) = \alpha\,Y(\tilde{t}/\rho(\alpha))$ and to choose $\rho(\alpha)$ in such a way that $\tilde{Y}(\tilde{t})$ approaches a finite nontrivial limit as $\alpha \to 0$.

We emphasize that our continual appeal to the dominated convergence theorem to justify our computations in what follows is not simply an exercise in mathematical pedantry. As we pointed out through a counterexample at the end of Paragraph 3.2.5.2, it is patently wrong to say that ordinary diffusion occurs with diffusion coefficient $K^*$ whenever the expression (140b) for this quantity is finite. Discrepancies between this formal quantity and the true asymptotics of $\sigma_Y^2(t)$ can and do arise from a formally transient contribution to the integral which is, however, quantitatively significant or dominant at long times. By establishing bounds for which the dominated convergence theorem applies, we ensure that the transient terms do not contribute significantly. Transient contributions which are important can, in the present context, be isolated and evaluated by an appropriate rescaling of the integration variables. This will be explicitly illustrated in our detailed presentation of the long-time asymptotics of $\sigma_Y^2(t)$ for the case of a randomly fluctuating cross sweep $w_f(t)$ below.

Subtle features of constant cross sweep. We now mention how the above discussion is modified in the presence of a constant cross sweep $\bar{w} \neq 0$.
If the only cross-shear transport mechanism is the constant cross sweep ($\bar{w} \neq 0$, $\kappa = 0$, $w_f(t) = 0$), then the single-mode contribution is trapped, rather than diffusive, in the long-time limit. That is, for each $k > 0$, $R(k,t)$ is a bounded function of $t$. Therefore, the long-time limit of $\sigma_Y^2(t)$ is trapping when the energy spectrum is sufficiently weak at low wavenumber ($\varepsilon < 0$) so that dominated convergence applies. The shear-displacement kernel for the case of a purely constant cross sweep, $R(k,t) = R_{\bar{w}}(k,t)$, can be written in terms of elementary functions, thereby permitting a completely explicit and direct analysis (see Section 3.2.2 and [141]).

When the constant cross sweep is superposed upon another random transport mechanism, the Lagrangian persistence time as we have defined it is $\tau_L(k) = \tau_{\bar{w}}(k) = (2\pi \bar{w} k)^{-1}$, but there is an additional important time scale which we call the randomization time. This is the time scale over which the phase of the oscillations of the Lagrangian velocity is forgotten due to the presence of the random component of the cross sweep. It may be equated to the shortest Lagrangian persistence time scale of the random processes included, which is just the time taken for the randomness to give the cross-shear tracer location an uncertainty equal to the wavelength $k^{-1}$ of the shearing mode under consideration. It is readily checked that for all random cross-shear transport mechanisms considered, $\tau_L(k) = \tau_{\bar{w}}(k)$ is much shorter than the randomization time at small $k$. The contribution $R(k,t)$ to the shear-parallel transport is ballistic for $t \ll \tau_L(k)$, trapped subsequently for the intermediate interval between $\tau_L(k)$ and the randomization time due to oscillations in the Lagrangian velocity, then diffusive at times long compared with the randomization time.
\[ \sigma_Y^2(t) = 2\int_0^{\infty} E(k)\,R_{w_f}(k,t)\,dk , \tag{143a} \]
\[ R_{w_f}(k,t) = 2\int_0^t (t - s)\,e^{-2\pi^2 k^2 \sigma_X^2(s)}\,ds , \tag{143b} \]
where the cross-shear displacement variance is expressed in terms of the correlation function $R_w(t)$ of $w_f(t)$ as
\[ \sigma_X^2(t) \equiv \langle (X(t) - x_0)^2 \rangle = 2\int_0^t (t - s)\,R_w(s)\,ds . \]
Recall that the energy spectrum $E(k)$ of the shear flow and the power spectrum $E_w(\omega)$ of the random cross sweep are assumed to be smooth on the positive real axis, absolutely integrable, and of the following forms:
\[ E(k) = A_E\,|k|^{1-\varepsilon}\,\psi(|k|) , \qquad E_w(\omega) = A_{E,w}\,|\omega|^{-\beta}\,\psi_w(|\omega|) , \]
where $\beta < 1$, $\varepsilon < 2$, $A_E > 0$, $A_{E,w} > 0$, and $\psi_w(\cdot)$ and $\psi(\cdot)$ are each smooth functions on the positive real axis with $\psi_w(0) = \psi(0) = 1$. We know from Section 3.1.2 that
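\[ \lim_{t\to\infty} \sigma_X^2(t) \sim \begin{cases} \dfrac{2}{1+\beta}\,K_x\,t^{1+\beta} & \text{for } -1 < \beta < 1 , \\[1ex] K_x^{\circ} & \text{for } \beta < -1 , \end{cases} \tag{144} \]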
where
\[ K_x = \frac{1}{2}\,A_{E,w}\,\pi^{\beta - 1/2}\,\frac{\Gamma((1-\beta)/2)}{\Gamma((2+\beta)/2)} , \qquad K_x^{\circ} = \frac{1}{\pi^2} \int_0^{\infty} E_w(\omega)\,\omega^{-2}\,d\omega . \]

Weak sweeping regime. We dispense first with the case $\beta < -1$. Here the cross-shear fluctuations are trapped, and the Lagrangian persistence time $\tau_L(k) = \tau_{w_f}(k)$ is effectively infinite for each wavenumber. Consequently, every mode contributes forever ballistically:
\[ \lim_{t\to\infty} t^{-2} R_{w_f}(k,t) = \lim_{t\to\infty} 2\int_0^1 (1-u)\,e^{-2\pi^2 k^2 \sigma_X^2(tu)}\,du = 2\int_0^1 (1-u)\,e^{-2\pi^2 k^2 K_x^{\circ}}\,du = e^{-2\pi^2 k^2 K_x^{\circ}} . \]
Moreover, $t^{-2} R_{w_f}(k,t)$ is uniformly bounded by unity, and we may therefore take the $t \to \infty$ limit under the integral over wavenumber by the dominated convergence theorem and conclude
\[ \lim_{t\to\infty} \sigma_Y^2(t) \sim 2t^2 \int_0^{\infty} E(k) \lim_{t\to\infty} t^{-2} R_{w_f}(k,t)\,dk \sim 2t^2 \int_0^{\infty} E(k)\,e^{-2\pi^2 k^2 K_x^{\circ}}\,dk , \]
as indicated in Table 4.

Diffusive regime. We now turn to the more interesting cases in which $-1 < \beta < 1$ and the tracer travels ever farther across the shear streamlines. The single-mode normalized contribution $R_{w_f}(k,t)$ is then asymptotically diffusive:
\[ \lim_{t\to\infty} R_{w_f}(k,t) \sim 2 K_{w_f}(k)\,t , \qquad K_{w_f}(k) = \int_0^{\infty} e^{-2\pi^2 k^2 \sigma_X^2(s)}\,ds . \]
We are therefore led to consider situations in which the total shear-parallel transport $\sigma_Y^2(t)$ is diffusive. By noting the asymptotics of $\sigma_X^2(t)$ in Eq. (144), we can bound $R_{w_f}(k,t)$ as follows:
\[ 0 \le R_{w_f}(k,t) \le C_1\,\tau_{w_f}(k)\,t , \tag{145} \]
where $\tau_{w_f}(k)$ is some positive, decreasing function with low-wavenumber asymptotics
\[ \tau_{w_f}(k) \sim C_{w_f}\,(K_x k^2)^{-1/(1+\beta)} , \tag{146} \]
where $C_1$ and $C_{w_f}$ are some positive numerical constants depending on $\beta$ but not on $k$ nor $t$. The dominating function $\tau_{w_f}(k)$ obeys the same properties as the Lagrangian $w_f$-persistence time, and we identify it symbolically as such. The dominated convergence theorem allows us to conclude that provided
\[ \int_0^{\infty} E(k)\,\tau_{w_f}(k)\,dk < \infty , \tag{147} \]
the long-time shear-parallel transport is diffusive:
\[ \lim_{t\to\infty} \sigma_Y^2(t) \sim 2 K^*_{w_f}\,t , \qquad K^*_{w_f} = 2\int_0^{\infty} E(k)\,K_{w_f}(k)\,dk . \]
The condition (147) is satisfied whenever $\varepsilon < 2\beta/(1+\beta)$, and we thereby obtain the results for the diffusive regime stated in Table 4.

Superdiffusive regime. We next consider the regime $2\beta/(1+\beta) < \varepsilon < 2$, where the above argument fails due to the nonuniform convergence of $E(k) R_{w_f}(k,t)$ to $2 K_{w_f}(k) E(k)\,t$ near $k = 0$. (Here we do not treat the transition value $\varepsilon = 2\beta/(1+\beta)$, which involves logarithmic corrections.) According to the general considerations discussed above, we should expect that $\sigma_Y^2(t)$ grows superlinearly and is dominated at large time by low wavenumbers $0 \le k \lesssim k_0(t)$, where
\[ k_0(t) = \left( \frac{4\pi^2 K_x}{1+\beta} \right)^{-1/2} t^{-(1+\beta)/2} \]
is the inverse function to Eq. (146) (up to an unimportant numerical factor). We will therefore zoom in on this region by rescaling the integration variable in Eqs. (143a) and (143b) to
\[ q = k/k_0(t) . \tag{148} \]
First, for technical reasons, we separate the formula (143a) for $\sigma_Y^2(t)$ as follows:
\[ \sigma_Y^2(t) = \bar{\sigma}_Y^2(t) + \tilde{\sigma}_Y^2(t) , \qquad \bar{\sigma}_Y^2(t) = 2\int_0^{k_1} E(k)\,R_{w_f}(k,t)\,dk , \qquad \tilde{\sigma}_Y^2(t) = 2\int_{k_1}^{\infty} E(k)\,R_{w_f}(k,t)\,dk , \]
where $k_1$ is an arbitrary positive cutoff wavenumber. The contribution $\tilde{\sigma}_Y^2(t)$ is clearly (at most) diffusive since the range of active $k$ is bounded below by $k_1 > 0$. We can therefore neglect $\tilde{\sigma}_Y^2(t)$ if $\bar{\sigma}_Y^2(t)$ is superdiffusive, and we now show this is indeed the case. Rescaling $\bar{\sigma}_Y^2(t)$ according to Eq. (148), we obtain
\[ \bar{\sigma}_Y^2(t) = 2 k_0(t) \int_0^{k_1/k_0(t)} E(q k_0(t))\,R_{w_f}(q k_0(t), t)\,dq . \tag{149} \]
Now, it is readily shown that
\[ \lim_{t\to\infty} t^{-2} R_{w_f}(q k_0(t), t) = \lim_{t\to\infty} 2\int_0^1 (1-u)\,e^{-2\pi^2 q^2 k_0^2(t)\,\sigma_X^2(tu)}\,du = 2\int_0^1 (1-u)\,e^{-q^2 u^{1+\beta}}\,du . \]
The finiteness of this limit reflects the fact that we have zoomed in on the low wavenumbers $k \lesssim q k_0(t)$ for which the mode-by-mode contribution $R_{w_f}(k,t)$ to the shear-parallel tracer motion is
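ballistic. The long-time limit of the other factor in the integrand is described by
\[ \lim_{t\to\infty} (k_0(t))^{\varepsilon-1} E(k_0(t) q) = A_E\,q^{1-\varepsilon} , \]
since $\lim_{t\to\infty} k_0(t) = 0$.

Now, we would like to evaluate the $t \to \infty$ limit of $\bar{\sigma}_Y^2(t)$ in Eq. (149) by taking the $t \to \infty$ limits of the integrands and the upper cutoff. To do this, we first establish the uniform bounds
\[ 0 \le t^{-2} R_{w_f}(q k_0(t), t)\,H(k_1/k_0(t) - q) \le \frac{C_2}{1 + q^{2/(1+\beta)}} , \tag{150a} \]
\[ 0 \le k_0^{\varepsilon-1}(t)\,E(k_0(t) q)\,H(k_1/k_0(t) - q) \le C_3\,q^{1-\varepsilon} , \tag{150b} \]
where $C_2$ and $C_3$ are constants independent of $k$ and $t$ (but possibly depending on the fixed cutoff $k_1$), and $H(\cdot)$ is the Heaviside function
\[ H(q) = \begin{cases} 1 & \text{if } q > 0 , \\ 0 & \text{if } q < 0 , \\ \tfrac{1}{2} & \text{if } q = 0 . \end{cases} \tag{151} \]
Inequality (150a) follows from Eq. (145) and the easy bound $0 \le R_{w_f}(k,t) \le t^2$; inequality (150b) is a consequence of the low-wavenumber asymptotics of $E(k)$. Using Eqs. (150a) and (150b) and the fact that $q^{1-\varepsilon} (1 + q^{2/(1+\beta)})^{-1}$ is an absolutely integrable function on $q \in [0, \infty)$ for $2\beta/(1+\beta) < \varepsilon < 2$, we can apply the dominated convergence theorem to conclude that
\[ \lim_{t\to\infty} \bar{\sigma}_Y^2(t) \sim 2 k_0^{2-\varepsilon}(t)\,t^2 \lim_{t\to\infty} \int_0^{k_1/k_0(t)} \left( k_0^{\varepsilon-1}(t)\,E(q k_0(t)) \right) \left( t^{-2} R_{w_f}(q k_0(t), t) \right) dq \]
\[ \phantom{\lim_{t\to\infty} \bar{\sigma}_Y^2(t)} = 2 k_0^{2-\varepsilon}(t)\,t^2 A_E \int_0^{\infty} q^{1-\varepsilon}\,2\int_0^1 (1-u)\,e^{-q^2 u^{1+\beta}}\,du\,dq \]
\[ \phantom{\lim_{t\to\infty} \bar{\sigma}_Y^2(t)} = \frac{4}{2 + \varepsilon - \beta(2-\varepsilon)}\,K_{w_f}\,t^{1 + \varepsilon/2 - \beta(1 - \varepsilon/2)} , \tag{152} \]
where
\[ K_{w_f} = (2 + \varepsilon - \beta(2-\varepsilon))\,A_E \left( \frac{4\pi^2 K_x}{1+\beta} \right)^{\varepsilon/2 - 1} \int_0^1\!\!\int_0^{\infty} (1-u)\,q^{1-\varepsilon}\,e^{-q^2 u^{1+\beta}}\,dq\,du . \]
By a change of variables $p = q u^{(1+\beta)/2}$, the integral may be evaluated exactly in terms of Gamma functions [195], leading to the expression for $\sigma_Y^2(t)$ in the superdiffusive regime displayed in Table 4.

3.3. Tracer transport in shear flow with random spatio-temporal fluctuations and transverse sweep

We next explore the effects of introducing temporal fluctuations into the random shear flow
\[ \mathbf{v}(x, y, t) = \begin{pmatrix} w(t) \\ v(x,t) \end{pmatrix} . \]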
This model will be referred to as the Random Spatio-Temporal Shear (RSTS) Model. The cross-shear sweeping flow $w(t)$ is defined with the same properties as in the Random Steady Shear Model of Section 3.2. The random spatio-temporal shear flow $v(x,t)$ is assumed to be a Gaussian, homogeneous, stationary, mean zero random field. We now need to describe $v$ by a correlation function depending on space and time:
\[ \tilde{R}(x,t) \equiv \langle v(x', t')\,v(x + x', t + t') \rangle . \]
As in the steady case, we define the velocity correlation function through its Fourier transform,
\[ \tilde{R}(x,t) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{2\pi i (kx + \omega t)}\,\tilde{E}(k,\omega)\,d\omega\,dk . \tag{153} \]
The spatio-temporal energy spectrum $\tilde{E}(k,\omega)$ appearing here resolves the energy of the fluid simultaneously into spatial wavenumber and temporal frequency;
\[ \int_{k - \Delta k}^{k + \Delta k}\!\!\int_{\omega - \Delta\omega}^{\omega + \Delta\omega} \tilde{E}(k', \omega')\,d\omega'\,dk' \]
is the amount of energy associated to fluctuations with spatial wavenumbers $k \pm \Delta k$ and temporal frequencies $\omega \pm \Delta\omega$. The spatio-temporal energy spectrum $\tilde{E}(k,\omega)$ is necessarily a nonnegative function with $\tilde{E}(k,\omega) = \tilde{E}(-k,-\omega)$. (See Paragraph 2.4.5.3.) For the sake of simplifying some formulas, we shall restrict attention to random flows which are statistically invariant under time reversal, so that $\tilde{R}(x,t) = \tilde{R}(x,-t)$ and consequently $\tilde{E}(k,\omega) = \tilde{E}(k,-\omega)$. This requirement excludes random travelling wave motion. With this extra condition, $\tilde{E}$ is even in $k$ and in $\omega$ separately, and we can express the velocity correlation function entirely in terms of the energy at nonnegative wavenumbers and frequencies:
\[ \tilde{R}(x,t) = 4\int_0^{\infty}\!\!\int_0^{\infty} \cos(2\pi k x) \cos(2\pi\omega t)\,\tilde{E}(k,\omega)\,d\omega\,dk . \]
By definition, the energy spectrum $E(k)$ (resolved with respect to spatial wavenumber) is just the integral over frequencies of the spatio-temporal energy spectrum:
\[ E(|k|) = \int_{-\infty}^{\infty} \tilde{E}(k,\omega)\,d\omega = 2\int_0^{\infty} \tilde{E}(k,\omega)\,d\omega . \]
We shall make the same assumptions on the energy spectrum $E(k)$ as we did for a steady random shear flow in Section 3.2. Namely, $E(k)$ is assumed to be smooth on $k > 0$ and absolutely integrable, with
\[ E(k) = A_E\,k^{1-\varepsilon}\,\psi(k) , \tag{154} \]
where $\varepsilon < 2$, $A_E > 0$ and $\psi(k)$ is a smooth function on the positive real axis with $\psi(0) = 1$.

We are left to describe the behavior of $\tilde{E}(k,\omega)$ with respect to $\omega$. To this end, we shall assume that the fluctuations of any given wavenumber $k$ decay on a single wavenumber-dependent time scale $\tau(k)$. Mathematically, this implies that
\[ \tilde{E}(k,\omega) = E(|k|)\,\phi_{|k|}(\omega\tau(|k|))\,\tau(|k|) , \]
where $\{\phi_k(\cdot)\}$, $0 < k < \infty$, is a family of nondimensional even functions with $\int_{-\infty}^{\infty} \phi_k(\varpi)\,d\varpi = 1$.
The important point is that $\omega$ only appears in the combination $\omega\tau(k)$. We shall call $\tau(k)$ the Eulerian correlation time scale since it describes the rate of decay of (a single wavenumber component of) the velocity field as observed at a fixed point, as in the Eulerian perspective. (See Section 3.1.3.) It is to be contrasted with the Lagrangian persistence time $\tau_L(k)$, introduced in Section 3.2, for the rate of decay of the Lagrangian velocity (due to fluctuations of wavenumber $k$) observed by a moving tracer in the flow. The Lagrangian and Eulerian correlation times coincide when the tracer moves purely along the streamlines of the shear flow, but the Lagrangian persistence time will be shorter in the presence of a cross sweep $w(t)$ or molecular diffusion which transports the tracer across streamlines. We will assume that

• $\tau(k)$ is a decreasing, smooth function of $k$, with
\[ \lim_{k\to 0} \tau(k) \sim A_\tau\,k^{-z} , \tag{155} \]
with $A_\tau > 0$ and $0 \le z < \infty$.

• $\phi_k(\varpi)$ are even, nonnegative, smooth functions of both $\varpi$ and $k$, and obey a mild, uniform bound
\[ 0 \le \phi_k(\varpi) \le \frac{C_\phi}{1 + |\varpi|^{\gamma}} , \tag{156} \]
with $0 < C_\phi < \infty$ and $\gamma > 1$. Moreover, we assume that $\phi_0(0) > 0$, meaning that the spatio-temporal random shear has some nontrivial zero frequency component at low wavenumbers.

It is quite satisfactory to consider the special case with $\tau(k) \equiv A_\tau k^{-z}$ and $\phi_k(\cdot) \equiv \phi(\cdot)$. Our purpose in stating the assumptions in greater generality is solely to emphasize that it is only the low-wavenumber properties of $\tilde{E}(k,\omega)$ which determine whether the tracer motion along the shear has an anomalous diffusion law, and if so, what that law is. The reason we demand that $\tau(k)$ decrease with $k$ is that fluctuations at higher wavenumbers (smaller spatial scales) naturally have faster dynamics than those of lower wavenumbers. Note that the Eulerian correlation time scale diverges more severely in the low wavenumber limit as $z$ increases.
Since the low wavenumbers often play a central role in determining the long-time behavior of a tracer, we can expect that the $z \to \infty$ limit should exhibit features in common with a random steady shear flow. This will be borne out in what follows. In the other extreme $z = 0$, the low wavenumbers decorrelate at a uniformly finite time. This may be viewed as a rapid decorrelation limit in a weak sense; we will discuss turbulent diffusion models with rapid decorrelations in a strong sense in Section 4.

A decreasing, power law dependence for the Eulerian decorrelation time scale $\tau(k)$ is natural for describing the self-similar inertial range of scales in fully developed turbulence at a high Reynolds number. As we shall discuss in Section 3.4.3, the standard Kolmogorov theory for the inertial range statistics of a turbulent velocity field corresponds formally to $\varepsilon = 8/3$ and $z = 2/3$ in the Random Spatio-Temporal Shear (RSTS) Model defined above. The value $\varepsilon = 8/3$ is actually outside the admissible domain $\varepsilon < 2$ of infrared scaling exponents allowed in the present model because it would result in an infinite amount of energy residing at small wavenumbers. We will discuss in Sections 3.4.3 and 3.5 how the RSTS Model can be extended to incorporate an inertial range of scales.
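To make the factorized form $\tilde{E}(k,\omega) = E(|k|)\,\phi(\omega\tau(|k|))\,\tau(|k|)$ concrete, the following sketch uses one hypothetical choice that meets the stated assumptions: a pure power law $\tau(k) = A_\tau k^{-z}$ and a wavenumber-independent Lorentzian shape $\phi(\varpi) = 1/(\pi(1 + \varpi^2))$, which is even, nonnegative, has unit integral, satisfies the bound (156) with $\gamma = 2$, and is positive at zero frequency. The Gaussian cutoff used for $\psi(k)$, the parameter values, and all function names are illustrative assumptions, not taken from the text. The check confirms numerically that integrating $\tilde{E}(k,\omega)$ over all frequencies recovers $E(k)$.

```python
import math

A_E, EPS = 1.0, 1.0      # hypothetical amplitude and infrared exponent (eps < 2)
A_TAU, Z = 1.0, 0.5      # hypothetical correlation-time amplitude and exponent (z >= 0)

def E(k):
    """Spatial spectrum A_E * k^(1-eps) * psi(k), with psi(k) = exp(-k^2) assumed."""
    return A_E * k ** (1.0 - EPS) * math.exp(-k * k)

def tau(k):
    """Eulerian correlation time, taken as the pure power law A_tau * k^(-z)."""
    return A_TAU * k ** (-Z)

def phi(x):
    """Assumed Lorentzian shape function; integrates to 1 over the real line."""
    return 1.0 / (math.pi * (1.0 + x * x))

def E_tilde(k, omega):
    """Factorized spatio-temporal spectrum E(|k|) * phi(omega*tau(|k|)) * tau(|k|)."""
    k = abs(k)
    return E(k) * phi(omega * tau(k)) * tau(k)

def frequency_integral(k, n=2000):
    """Approximate 2 * int_0^inf E_tilde(k, w) dw.

    The half-line is mapped to theta in (0, pi/2) by w = tan(theta)/tau(k) and
    integrated with the midpoint rule, which never samples the endpoint pi/2.
    """
    t_k = tau(abs(k))
    h = 0.5 * math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        w = math.tan(theta) / t_k
        total += E_tilde(k, w) / (math.cos(theta) ** 2 * t_k)   # includes dw/dtheta
    return 2.0 * h * total

for k in (0.3, 1.0, 2.5):
    print(k, E(k), frequency_integral(k))
```

Any other shape function with unit integral and sufficiently fast decay could be substituted for the Lorentzian; only its value at zero frequency and its low-wavenumber behavior enter the long-time scaling laws discussed in this section.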
At present, we will restrict attention to the mean-square displacement $\sigma_Y^2(t) \equiv \langle (Y(t) - y_0)^2 \rangle$ of a tracer along the shear in the RSTS Model with $\varepsilon < 2$ and $z \ge 0$. An exact formula for this quantity is obtained in Paragraph 3.3.6.1:

Mean-Square Shear-Parallel Tracer Displacement for RSTS Model
\[ \sigma_Y^2(t) = 2\kappa t + 4\int_0^{\infty}\!\!\int_0^{\infty} \tilde{E}(k,\omega)\,\tilde{R}(k,\omega,t)\,d\omega\,dk , \tag{157a} \]
\[ \tilde{R}(k,\omega,t) = 2\int_0^t (t - s) \cos(2\pi k \bar{w} s) \cos(2\pi\omega s)\,e^{-2\pi^2 k^2 \sigma_X^2(s)}\,ds . \tag{157b} \]
This formula remains valid when $v(x,t)$ is non-Gaussian. Note that it properly reduces to the formula (112) when the spatio-temporal energy spectrum has the form $\tilde{E}(k,\omega) = E(k)\,\delta(\omega)$ associated to a steady random shear flow.

The shear-displacement kernel $\tilde{R}(k,\omega,t)$ represents the response of $\sigma_Y^2(t)$ to the presence of a component
\[ A \cos(2\pi k x) \cos(2\pi\omega t) + B \cos(2\pi k x) \sin(2\pi\omega t) + C \sin(2\pi k x) \cos(2\pi\omega t) + D \sin(2\pi k x) \sin(2\pi\omega t) \]
in the random shear flow, where $A$, $B$, $C$, and $D$ are independent, standard Gaussian random variables. The formula for $\tilde{R}(k,\omega,t)$ differs from that of its counterpart $R(k,t)$ (112b) for the steady shear flow only in the presence of an oscillatory term $\cos(2\pi\omega s)$ in the integrand, naturally manifesting the temporal fluctuations of the shear flow.

An alternate formula for $\sigma_Y^2(t)$, which is at first perhaps easier to understand, involves the spectral temporal correlation function $\check{E}(k,t)$, which provides an intermediate representation between the full physical space spatio-temporal correlation function $\tilde{R}(x,t)$ and the spatio-temporal spectrum $\tilde{E}(k,\omega)$:
\[ \tilde{R}(x,t) = \int_{-\infty}^{\infty} e^{2\pi i k x}\,\check{E}(k,t)\,dk = 2\int_0^{\infty} \cos(2\pi k x)\,\check{E}(k,t)\,dk , \]
\[ \check{E}(k,t) = \int_{-\infty}^{\infty} e^{2\pi i \omega t}\,\tilde{E}(k,\omega)\,d\omega = 2\int_0^{\infty} \cos(2\pi\omega t)\,\tilde{E}(k,\omega)\,d\omega . \tag{158} \]
As such, $\check{E}(k,t)$ describes the temporal correlations of turbulent shear modes of wavenumber $k$ in terms of the physical time variable; $\check{E}(k, t = 0)$ is simply equal to $E(k)$. The mean-square shear-parallel displacement can be expressed in terms of $\check{E}(k,t)$ by a formula similar to Eqs. (112a) and (112b) in the RSS Model:
\[ \sigma_Y^2(t) = 2\kappa t + 4\int_0^{\infty}\!\!\int_0^t (t - s) \cos(2\pi k \bar{w} s)\,e^{-2\pi^2 k^2 \sigma_X^2(s)}\,\check{E}(k,s)\,ds\,dk . \tag{159} \]
For the special case of a steady shear flow, $\check{E}(k,t) \equiv E(k)$. We will work here with the representation (157), since it cleanly separates the influence of the particular shear flow structure $\tilde{E}(k,\omega)$ from a kernel $\tilde{R}(k,\omega,t)$ which depends only on the active cross-shear transport mechanisms. In Section 3.5, we will make use of a semi-spectral representation similar to Eq. (159).
We will consider the long-time behavior of $\sigma_Y^2(t)$ in the RSTS Model in a manner parallel to our study of the Random Steady Shear Model in Section 3.2. First, we consider in Section 3.3.1 the motion of a tracer in the randomly fluctuating shear flow with no molecular diffusion and no cross sweep. Next, we consider separately the effects when molecular diffusion $\kappa$ (Section 3.3.2), a constant cross sweep $\bar{w}$ (Section 3.3.3), or a randomly fluctuating cross sweep $w_f(t)$ (Section 3.3.4) are added to the fluctuating shear flow. The qualitative scaling behavior for $\sigma_Y^2(t)$ in each case may be classified into three categories determined by the exponents $\varepsilon$ and $z$ characterizing the low-wavenumber (infrared) scaling behavior of the energy spectrum $E(k)$ and the Eulerian correlation time $\tau(k)$. These will be graphically described by phase diagrams, indicating the regions of the $(\varepsilon, z)$ diagram associated to each type of qualitative scaling behavior. In Section 3.3.5, we discuss how the shear-parallel transport of a tracer behaves under superpositions of the various cross-shear transport mechanisms. It turns out that the superposition rules are completely straightforward here; the subtleties discussed in Section 3.2.5 for the Random Steady Shear Model are absent. The method of derivation for all the results in the RSTS Model is indicated in Section 3.3.6.

3.3.1. Tracer behavior in absence of cross-shear transport

We begin by considering the rather simple case of a tracer in a random shear flow with spatio-temporal fluctuations with no molecular diffusion ($\kappa = 0$) or cross sweep ($w(t) = 0$). A tracer then forever stays on its original streamline, and its Lagrangian velocity is the same as the Eulerian velocity observed at any given point on that streamline. Consequently, the tracer behaves exactly as in the Random Sweeping Model; the spatial structure of the shear flow is irrelevant.
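When there is no cross-shear transport, the exponential and constant-sweep factors in the kernel (157b) drop out and the remaining integral $2\int_0^t (t - s)\cos(2\pi\omega s)\,ds$ is elementary; it equals $t^2$ times the dimensionless function $2(1 - \cos\varpi)/\varpi^2$ evaluated at $\varpi = 2\pi\omega t$, which is the closed form quoted in Eq. (160). The sketch below, whose function names and sample values are illustrative assumptions, compares this closed form with direct quadrature and exhibits the short-time and long-time behavior of a single spatio-temporal Fourier mode.

```python
import math

def F(x):
    """2*(1 - cos x)/x^2, with a two-term series near x = 0 to avoid cancellation."""
    if abs(x) < 1e-4:
        return 1.0 - x * x / 12.0
    return 2.0 * (1.0 - math.cos(x)) / (x * x)

def kernel_no_cross_shear(omega, t):
    """Closed form of the kernel (157b) when kappa = 0 and w(t) = 0."""
    return t * t * F(2.0 * math.pi * omega * t)

def kernel_by_quadrature(omega, t, n=20000):
    """Midpoint-rule evaluation of 2 * int_0^t (t - s) cos(2*pi*omega*s) ds."""
    h = t / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += (t - s) * math.cos(2.0 * math.pi * omega * s)
    return 2.0 * h * total

omega = 1.0   # illustrative frequency
for t in (0.01, 1.3, 50.3):
    print(t, kernel_no_cross_shear(omega, t), kernel_by_quadrature(omega, t))
```

For $t$ much smaller than $1/\omega$ the kernel is close to $t^2$, while for large $t$ it oscillates without growing and never exceeds $1/(\pi^2\omega^2)$, so a mode with nonzero frequency contributes a bounded displacement.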
The long-time asymptotics of the shear-parallel tracer displacement could be worked out as in Section 3.1.2 by considering the low-frequency behavior of the energy spectrum of the velocity resolved with respect to temporal frequency, which can be expressed as $\int_{-\infty}^{\infty} \tilde{E}(k,\omega)\,dk$. We will, however, proceed in a fashion which better maintains continuity with our analysis of the Random Steady Shear Model in Section 3.2 and our later discussion of the RSTS Model where the spatial structure of the shear plays a crucial role. For $\kappa = w(t) = 0$, the shear-displacement kernel $\tilde{R}(k,\omega,t)$ in Eq. (157b) takes the simple, wavenumber-independent form
\[ \tilde{R}(k,\omega,t) = t^2 F_1(2\pi\omega t) , \qquad F_1(\varpi) \equiv 2(1 - \cos\varpi)/\varpi^2 . \tag{160} \]
It is readily seen that the tracer displacement due to a single random spatio-temporal Fourier mode of the shear with wavenumber $k$ and frequency $\omega$ is ballistic for short times $t \ll \omega^{-1}$ and trapped for long times $t \gg \omega^{-1}$.
The analysis is more direct if we break up the shear-parallel transport into contributions from a shear mode of a single wavenumber $k$, with temporal fluctuations coming from all frequencies, rather than disintegrating the motion into modes of single wavenumber and single frequency. We accordingly rewrite Eqs. (157a) and (157b) as an integral of the energy spectrum $E(k)$ against a kernel $R_\phi(k,t)$ which has already accounted for the temporal structure of the fluctuating shear flow:
\[ \sigma_Y^2(t) = 2\int_0^{\infty} E(k)\,R_\phi(k,t)\,dk , \tag{161a} \]
\[ R_\phi(k,t) \equiv 2\int_0^{\infty} \tilde{R}(k,\omega,t)\,\phi_k(\omega\tau(k))\,\tau(k)\,d\omega . \tag{161b} \]
With Eq. (160), it is clear that $R_\phi(k,t)$ may be expressed as
\[ R_\phi(k,t) = t^2 F_\phi(t/\tau(k)) , \qquad F_\phi(u) \equiv 2\int_0^{\infty} F_1(2\pi u \varpi)\,\phi_k(\varpi)\,d\varpi . \tag{162} \]
This representation reflects the fact that the temporal fluctuations of a single wavenumber mode $k$ of the velocity field have a characteristic time scale $\tau(k)$. Since the Lagrangian velocity coincides with the Eulerian velocity in the present instance, the Lagrangian persistence time $\tau_L(k)$ of the tracer is readily identified with $\tau(k)$. According to the standard intuition of Paragraph 3.1.3.2, we therefore expect that the shear-parallel tracer displacement due to a single wavenumber $k$ of the shear is ballistic for $t \ll \tau(k)$ and diffusive for $t \gg \tau(k)$. The scaling coefficients appearing in Table 6 are given by
\[ \tilde{K}^* = \int_0^{\infty} \tilde{E}(k,0)\,dk = \int_0^{\infty} E(k)\,\tau(k)\,\phi_k(0)\,dk , \tag{163a} \]
\[ \tilde{K} = z^{-1}\,\pi^{(2\varepsilon + z - 4)/2z}\,A_E\,A_\tau^{(2-\varepsilon)/z}\,\frac{\Gamma((2-\varepsilon)/2z)}{\Gamma((\varepsilon + 3z - 2)/2z)} \int_0^{\infty} \varpi^{(\varepsilon-2)/z}\,\phi_0(\varpi)\,d\varpi . \tag{163b} \]
In contrast to the Random Sweeping Model, there is no subdiffusive or trapping behavior here because $\tilde{E}(k, \omega = 0)$ is by assumption positive for small $k$, so the velocity of a streamline always has a nonvanishing zero frequency limit.

We observe that the diffusion coefficient $\tilde{K}^*$ in the diffusive regime only depends on the zero frequency modes of the shear flow. We stress that this does not imply that the tracer behavior is therefore the same as in the Random Steady Shear Model. In a steady shear flow, the spatio-temporal energy spectrum is singularly concentrated at $\omega = 0$,
Table 6. Long-time asymptotics of mean-square tracer displacement along the shear in Random Spatio-Temporal Shear Model, with $\kappa = \bar{w} = w_f(t) = 0$. Scaling coefficients are given by Eq. (163).

Parameter regime | Asymptotic mean-square displacement $\lim_{t\to\infty} \sigma_Y^2(t)$ | Qualitative behavior
$\varepsilon < 2 - z$, $z \ge 0$ | $2\tilde{K}^* t$ | Diffusive (D)
$2 - z < \varepsilon < 2$, $z \ge 0$ | $\frac{2z}{2z + \varepsilon - 2}\,\tilde{K}\,t^{(2z + \varepsilon - 2)/z}$ | Superdiffusive (SD-$\omega$)
namely $\tilde{E}(k,\omega) = E(k)\,\delta(\omega)$, and the shear-parallel tracer motion is ballistic in the absence of cross-shear transport. In the Random Spatio-Temporal Shear Model, the diffusive shear-parallel tracer transport (for $\varepsilon + z < 2$) comes from the continuous distribution of energy at low frequencies.

To understand the criterion $\varepsilon + z < 2$ for diffusive behavior, we formally sum up the shear-parallel transport contributions from each wavenumber $k$. A straightforward argument generalized from the discussion of the RSS Model in Paragraph 3.2.6.2 suggests that the statistical tracer motion should, at long times, be diffusive and characterized by a diffusion coefficient on the order of the integral of the diffusivities contributed by each mode, provided that
\[ \int_0^{\infty} E(k)\,\tau_L(k)\,dk \tag{164} \]
is finite. For the present case of no cross-shear transport mechanisms, $\tau_L(k) = \tau(k)$, the Eulerian correlation time for each spatial Fourier mode of the random shear. The finiteness of expression (164) is determined by the behavior of the integrand near $k = 0$; integrability at high wavenumbers is ensured by the physical assumption that $\tau(k)$ decreases with $k$. At small $k$, the Eulerian correlation time $\tau(k) \sim A_\tau k^{-z}$. Recalling that $E(k) \sim A_E k^{1-\varepsilon}$ for small $k$, we see that the criterion that Eq. (164) be finite is just $\varepsilon + z < 2$, which is exactly the diffusive regime described in Table 6. Note moreover that since $\phi_k(0)$ is an order unity dimensionless constant (with a possible but unimportant mild variation with $k$), the formula for the shear-parallel diffusivity (163a) states that the asymptotic diffusivity contributed by a normalized mode of wavenumber $k$ (with prescribed temporal fluctuations) is proportional to the Lagrangian correlation time $\tau_L(k)$. This is in agreement with the general relation which we have seen to hold in standard diffusive situations (see Paragraph 3.1.3.2 and Section 3.2). For $\varepsilon + z > 2$, the shear-parallel tracer motion is superdiffusive because the product $E(k)\,\tau_L(k)$ diverges too strongly at low wavenumber.

The two regimes of long-time behavior of $\sigma_Y^2(t)$ are indicated on a phase diagram in Fig. 13. Such phase diagrams are two-dimensional versions of the pictures of phase transitions with respect to the single parameter $\varepsilon$ which we presented for the RSS Model in Section 3.2. Each regime of qualitatively different long-time behavior for $\sigma_Y^2(t)$ may be viewed as a "phase" with a distinct algebraic law for the long-time scaling exponent of $\sigma_Y^2(t)$. The boundaries between these phases correspond to phase transitions, and are associated with more complicated formulas for the long-time behavior of $\sigma_Y^2(t)$. We shall not present the special formulas characterizing the phase
Fig. 13. Phase diagram for long-time asymptotics of $\sigma_Y^2(t)$ in Random Spatio-Temporal Shear Model with $\kappa = \bar{w} = w_f(t) = 0$.
boundaries, as these would be a distraction from our main endeavor of developing physical insight from the simplified model.

We see from the phase diagram that when $z = 0$, corresponding to a finite correlation time of the low wavenumber modes, the shear-parallel tracer motion is diffusive for any energy spectrum $E(k)$. To facilitate later discussion, we label the diffusive regime with the symbol D and the other regime with the symbol SD-$\omega$, indicating a superdiffusive tracer behavior determined by temporal fluctuations of the shear flow.

3.3.2. Effects of molecular diffusion

We now consider the effects of positive molecular diffusion $\kappa > 0$ on the shear-parallel transport of a tracer. We continue for the moment to assume no cross sweep ($w(t) = 0$). The shear-displacement kernel (157b) may then be expressed in the closed form
\[ \sigma_Y^2(t) = 2\kappa t + 4\int_0^{\infty}\!\!\int_0^{\infty} \tilde{E}(k,\omega)\,\tilde{R}_\kappa(k,\omega,t)\,d\omega\,dk , \tag{165a} \]
\[ \tilde{R}_\kappa(k,\omega,t) = t^2 F_2(t/\tau_\kappa(k), 2\pi\omega t) , \tag{165b} \]
where $\tau_\kappa(k) = (4\pi^2 \kappa k^2)^{-1}$ is the Lagrangian persistence time associated to molecular diffusion, and the universal, dimensionless function $F_2$ is defined in Eq. (133). Upon integration over the frequency variables, we discover that the shear-parallel tracer motion due to each spatial wavenumber $k$ is naturally determined by the time scales $\tau_\kappa(k)$ and $\tau(k)$, corresponding to the decorrelation mechanisms due to molecular diffusion across the shear and temporal fluctuations of the shear flow, respectively.

The long-time behavior of the mean-square shear-parallel displacement, obtained from a rigorous asymptotic computation of Eqs. (165a) and (165b), is classified according to the parameter
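Table 7. Long-time asymptotics of mean-square tracer displacement along the shear in Random Spatio-Temporal Shear Model, with $\kappa > 0$, $\bar{w} = w_f(t) = 0$. Scaling coefficients are given by Eq. (166).

Parameter regime | Asymptotic mean-square displacement $\lim_{t\to\infty} \sigma_Y^2(t)$ | Qualitative behavior
$0 \le z \le 2$, $\varepsilon < 2 - z$ | $2\tilde{K}^*_\kappa t$ | Diffusive (D)
$z \ge 2$, $\varepsilon < 0$ | $2\tilde{K}^*_\kappa t$ | Diffusive (D)
$0 \le z < 2$, $2 - z < \varepsilon < 2$ | $\frac{2z}{2z + \varepsilon - 2}\,\tilde{K}\,t^{(2z + \varepsilon - 2)/z}$ | Superdiffusive (SD-$\omega$)
$z > 2$, $0 < \varepsilon < 2$ | $\frac{4}{2 + \varepsilon}\,K_\kappa\,t^{1 + \varepsilon/2}$ | Superdiffusive (SD-$\kappa$)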
values $\varepsilon$ and $z$ in Table 7. The coefficients appearing in the scaling laws are as follows:
\[ \tilde{K}^*_\kappa = \kappa + 4\int_0^{\infty}\!\!\int_0^{\infty} \tilde{E}(k,\omega)\,\tilde{K}_\kappa(k,\omega)\,d\omega\,dk , \tag{166a} \]
\[ \tilde{K} = z^{-1}\,\pi^{(2\varepsilon + z - 4)/2z}\,A_E\,A_\tau^{(2-\varepsilon)/z}\,\frac{\Gamma((2-\varepsilon)/2z)}{\Gamma((\varepsilon + 3z - 2)/2z)} \int_0^{\infty} \varpi^{(\varepsilon-2)/z}\,\phi_0(\varpi)\,d\varpi , \tag{166b} \]
\[ K_\kappa = -\Gamma\!\left( -\frac{\varepsilon}{2} \right) A_E\,(4\pi^2\kappa)^{-(2-\varepsilon)/2} , \tag{166c} \]
where
\[ \tilde{K}_\kappa(k,\omega) = \lim_{t\to\infty} \frac{\tilde{R}_\kappa(k,\omega,t)}{2t} = \frac{\tau_\kappa^{-1}(k)}{\tau_\kappa^{-2}(k) + (2\pi\omega)^2} = \frac{\kappa k^2}{4\pi^2\kappa^2 k^4 + \omega^2} \tag{167} \]
is the asymptotic diffusivity due to a single spatio-temporal Fourier mode of the shear with wavenumber $k$ and frequency $\omega$. The three regimes of long-time behavior of $\sigma_Y^2(t)$ are also indicated on a phase diagram in Fig. 14.

We shall now explain on a heuristic level why the phase diagram takes the form it does. There are two natural time scales associated to the tracer motion due to each spatial wavenumber of the shear flow: an Eulerian temporal decorrelation time $\tau(k)$ due to temporal fluctuations of the shear flow and a decorrelation time $\tau_\kappa(k)$ due to molecular diffusion buffeting the tracer across the shear flow. One therefore naturally associates a Lagrangian persistence time of the tracer motion $\tau_L(k)$ with the smaller of these two time scales. The same standard argument as was given in Section 3.3.1 indicates that diffusive or superdiffusive behavior should result according to whether $\tau_L(k)\,E(k)$ is integrable or nonintegrable at $k = 0$. (See the discussion around Eq. (164).)

The Lagrangian persistence time $\tau_L(k)$ is set by the smaller of $\tau(k)$ and $\tau_\kappa(k)$. As $\tau(k) \sim k^{-z}$ for small $k$ and $\tau_\kappa(k) \sim k^{-2}$, we see that for $z < 2$, the Lagrangian persistence time for the low wavenumber modes is determined by the Eulerian correlation time scale $\tau(k)$, and molecular diffusion has a negligible effect. Consequently, the criterion for diffusive behavior for $z < 2$ should be expected to
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
351
Fig. 14. Phase diagram for long-time asymptotics of p(t) in Random Spatio-Temporal Shear Model with i'0 and 7 wN "w (t)"0. D
be the same as that for a random shear #ow with spatio-temporal #uctuations and no molecular di!usion: e#z(2 (see Section 3.3.1). This is con"rmed in the phase diagram (Fig. 14). Moreover, when this di!usivity condition is violated for z(2, the resulting asymptotic superdi!usive behavior is exactly the same as for the case of a shear #ow with spatio-temporal #uctuations and no molecular di!usion. The reason is that superdi!usion is determined solely by the low wavenumber modes of the shear #ow, and we have noted that molecular di!usion is asymptotically irrelevant for such modes relative to the intrinsic temporal decorrelation of the shear #ow. We have therefore identi"ed the region 04z(2, e#z'2 as the same SD-u regime introduced in Section 3.3.1. A complementary situation arises for the portion of the phase diagram corresponding to z'2, for which molecular di!usion plays the dominant role at low wavenumber. The criterion for di!usivity, e(0, is identical to that for transport in a steady random shear #ow with molecular di!usion. We also note from Table 7 that the long-time tracer behavior for z'2, e'0 is identical to the superdi!usive regime e'0 for a random steady shear #ow with molecular di!usion (cf. Section 3.2.1). The temporal #uctuations of the shear #ow are not manifested in any way in this regime, because they are asymptotically irrelevant for the low wavenumbers driving the superdi!usive tracer motion. We call the phase region z'2, e'0 the SD-i regime, indicating superdi!usive tracer behavior with molecular di!usion as the dominant Lagrangian decorrelation mechanism. All told, we may understand the phase diagram (Fig. 14) for shear-parallel tracer transport in an RSTS #ow with i'0 as a gluing together of a portion of the phase diagram (Fig. 13) for an RSTS #ow with i"0 and a portion of the phase diagram for an RSS #ow with i'0. 
The former applies for 04z(2, and the latter for z'2; the dividing line z"2 is determined by the balance at low wavenumbers between the decorrelation time scale q(k) due to temporal #uctuations of the shear #ow and the decorrelation time scale q (k) set by di!usion across streamlines due to molecular G di!usion.
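As a hedged numerical illustration of Eq. (167), the sketch below compares the closed form $\kappa k^2/(4\pi^2\kappa^2k^4+\omega^2)$ with a direct quadrature of the finite-time kernel divided by $2t$. It assumes the kernel (170b) with $\bar{w} = 0$ and $\sigma_X^2(s) = 2\kappa s$ for pure molecular diffusion; the function names and the midpoint rule are illustrative choices, not part of the original analysis.

```python
import math


def single_mode_diffusivity(k, omega, kappa):
    """Closed form of Eq. (167): kappa k^2 / (4 pi^2 kappa^2 k^4 + omega^2)."""
    return kappa * k**2 / (4.0 * math.pi**2 * kappa**2 * k**4 + omega**2)


def kernel_over_2t(k, omega, kappa, t, n=200_000):
    """R(k, omega, t) / (2 t), where
    R = 2 * int_0^t (t - s) cos(2 pi omega s) exp(-4 pi^2 kappa k^2 s) ds
    (assumed form: kernel (170b) with no mean sweep and sigma_X^2(s) = 2 kappa s).
    The integral is evaluated with the composite midpoint rule."""
    decay = 4.0 * math.pi**2 * kappa * k**2
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (t - s) * math.cos(2.0 * math.pi * omega * s) * math.exp(-decay * s)
    return 2.0 * total * h / (2.0 * t)
```

The finite-time value approaches the closed form with an $O(1/t)$ correction, so the two agree only approximately at any finite $t$.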
There is a similar subtlety in the formula for the diffusivity constant in terms of the single spatio-temporal mode diffusivities (167) as in the RSS Model with molecular diffusion and a constant cross sweep. Namely, the single-mode diffusivity does not obey the standard relation of being proportional to the Lagrangian persistence time of a single spatio-temporal shear mode, which here would be identified as $\min(\tau_\kappa(k), \omega^{-1})$. The reason for this aberrant behavior is that the retardation of the tracer's net motion due to temporal fluctuations of a shear mode with a single frequency is periodic and coherent. Integration over the frequency variables, however, produces a formula for the diffusivity in terms of the energy spectrum which does conform to standard Lagrangian intuition. Specifically,

$$\tilde{K}^{*}_{\kappa} = \kappa + 2\int_0^\infty E(k)\,K_{\phi,\kappa}(k)\,dk ,$$

$$K_{\phi,\kappa}(k) = 2\int_0^\infty \tilde{K}_{\kappa}(k,\omega)\,\phi(\omega\tau(k))\,\tau(k)\,d\omega = \tau(k)\int_0^\infty \frac{2\kappa k^2}{4\pi^2\kappa^2k^4+\omega^2}\,\phi(\omega\tau(k))\,d\omega ,$$

and the diffusivity $K_{\phi,\kappa}(k)$ due to a single wavenumber $k$ (with the full spectrum of temporal fluctuations) is approximately proportional to $\tau_L(k) = \min(\tau_\kappa(k), \tau(k))$. A related fact is that the asymptotic diffusivity $\tilde{K}_\kappa(k,\omega)$ of a single spatio-temporal Fourier mode behaves singularly in the $\kappa \to 0$ limit, reflecting the change from diffusive to ballistic ($\omega = 0$) or trapped ($\omega > 0$) behavior. Integration over frequencies regularizes the limit, however: the $\kappa \to 0$ limit of $K_{\phi,\kappa}(k)$ converges to $\tfrac{1}{2}\tau(k)\phi(0)$. Therefore, as $\kappa \to 0$, the diffusion constant $\tilde{K}^{*}_{\kappa}$ smoothly approaches the limiting value $\tilde{K}^{*}$ (163a) characterizing the case of no molecular diffusion.

3.3.3. Effects of constant cross sweep

We next consider the effects of a constant cross sweep $w(t) = \bar{w} \ne 0$ on the shear-parallel transport of a tracer in an RSTS flow, with no molecular diffusion. The shear-displacement kernel can again be expressed in closed form:

$$\tilde{R}(k,\omega,t) = \tfrac{1}{2}t^2\left(F(2\pi(\omega+\bar{w}k)t) + F(2\pi(\omega-\bar{w}k)t)\right) , \qquad F(u) \equiv 2(1-\cos u)/u^2 .$$

It is interesting to note that, for a fixed wavenumber $k$, a strong contribution to the shear-parallel tracer motion comes from frequencies $\omega \approx \bar{w}k$. This is in contrast to the general situation without a mean cross sweep $\bar{w}$, for which the shear-displacement kernel typically decays with $\omega$ away from $\omega = 0$. The presence of the mean sweep $\bar{w}$ creates a type of "resonance" with spatio-temporal shear modes for which $\omega - \bar{w}k = 0$. These resonant modes have a component which appears steady from the point of view of a tracer swept across the shear at speed $\bar{w}$, and consequently contribute ballistically to the shear-parallel tracer motion for all times. The off-resonance modes $\omega - \bar{w}k \ne 0$, on the other hand, each contribute an oscillatory, trapped motion in the long-time limit. It is thus natural to expect that the long-time behavior of $\sigma_Y^2(t)$ should be dominated by the modes along the resonance line $\omega = \bar{w}k$.
Note that since energy is distributed continuously in wavenumber-frequency space, the ballistic contribution of single spatio-temporal Fourier modes along the resonance line does not imply ballistic transport by the total random shear flow.
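The closed form of the constant-sweep kernel can be checked against the time integral it comes from. The sketch below assumes the integral representation (170b) with no random cross-shear motion, $\tilde{R}(k,\omega,t) = 2\int_0^t (t-s)\cos(2\pi k\bar{w}s)\cos(2\pi\omega s)\,ds$, and the normalization $\tfrac{1}{2}t^2(F + F)$ that follows from it; function names are hypothetical.

```python
import math


def F(u):
    """F(u) = 2 (1 - cos u) / u^2, using a Taylor expansion near u = 0 where F -> 1."""
    if abs(u) < 1e-4:
        return 1.0 - u * u / 12.0
    return 2.0 * (1.0 - math.cos(u)) / (u * u)


def kernel_closed_form(k, omega, t, w_bar):
    """(t^2 / 2) [F(2 pi (omega + w_bar k) t) + F(2 pi (omega - w_bar k) t)]."""
    plus = 2.0 * math.pi * (omega + w_bar * k) * t
    minus = 2.0 * math.pi * (omega - w_bar * k) * t
    return 0.5 * t * t * (F(plus) + F(minus))


def kernel_quadrature(k, omega, t, w_bar, n=100_000):
    """Midpoint-rule value of 2 int_0^t (t - s) cos(2 pi k w_bar s) cos(2 pi omega s) ds."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (t - s) * math.cos(2.0 * math.pi * k * w_bar * s) * math.cos(2.0 * math.pi * omega * s)
    return 2.0 * total * h
```

On the resonance line $\omega = \bar{w}k$ the second term equals $t^2/2$ for every $t$, which is the ballistic single-mode contribution described above; off resonance both terms stay bounded as $t$ grows.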
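The long-time asymptotic behavior of $\sigma_Y^2(t)$ is described in Table 8 and the phase diagram in Fig. 15. The scaling coefficients are given by

$$\tilde{K}^{*}_{\bar{w}} = \int_0^\infty \tilde{E}(k,\bar{w}k)\,dk = \int_0^\infty E(k)\,\tau(k)\,\phi(\bar{w}k\tau(k))\,dk , \tag{168a}$$

$$\tilde{K} = z^{-1}\pi^{(2\epsilon+z-4)/2z}\,\frac{\Gamma((2-\epsilon)/2z)}{\Gamma((\epsilon+3z-2)/2z)}\,A_E A_\tau^{(2-\epsilon)/z}\int_0^\infty \varpi^{(\epsilon-2)/z}\phi(\varpi)\,d\varpi , \tag{168b}$$

$$K_{\bar{w}} = \tfrac{1}{2}A_E\,\pi^{\epsilon-3/2}\,\bar{w}^{\epsilon-2}\,\Gamma((2-\epsilon)/2)/\Gamma((1+\epsilon)/2) . \tag{168c}$$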
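Table 8
Long-time asymptotics of mean-square tracer displacement along the shear in the Random Spatio-Temporal Shear Model, with $\bar{w} \ne 0$, $\kappa = w_f(t) = 0$. Scaling coefficients are given by Eq. (168).

Parameter regime | Asymptotic mean-square displacement $\lim_{t\to\infty}\sigma_Y^2(t)$ | Qualitative behavior
$0 \le z \le 1$, $\epsilon < 2 - z$; or $z \ge 1$, $\epsilon < 1$ | $2\tilde{K}^{*}_{\bar{w}}\,t$ | Diffusive (D)
$0 \le z < 1$, $2 - z < \epsilon < 2$ | $\frac{2z}{2z+\epsilon-2}\,\tilde{K}\,t^{2+(\epsilon-2)/z}$ | Superdiffusive (SD-$\omega$)
$z > 1$, $1 < \epsilon < 2$ | $\frac{2}{\epsilon}\,K_{\bar{w}}\,t^{\epsilon}$ | Superdiffusive (SD-$\bar{w}$)

Fig. 15. Phase diagram for long-time asymptotics of $\sigma_Y^2(t)$ in the Random Spatio-Temporal Shear Model with $\bar{w} \ne 0$ and $\kappa = w_f(t) = 0$.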
We note that, in the diffusive regime, the diffusion coefficient is indeed determined by the energy spectrum along the resonance line $\omega = \bar{w}k$, as suggested by our previous discussion.

The phase boundaries may again be explained by simple consideration of the behavior of the Lagrangian persistence time $\tau_L(k)$. In the present case, it is the minimum of the Eulerian correlation time $\tau(k)$ of the shear flow and the sweeping persistence time $\tau_{\bar{w}}(k) = (2\pi\bar{w}k)^{-1}$. For $0 \le z < 1$, the low-wavenumber modes of the shear-parallel transport are dominated by the temporal decorrelation of the shear flow, and the effects of the cross sweep $\bar{w}$ are asymptotically negligible. Consequently, the phase diagram and behavior of the superdiffusive regime for $0 \le z < 1$ are identical to the case of an RSTS flow with no cross sweep (Section 3.3.1). On the other hand, for $z > 1$, the low-wavenumber contribution to the tracer motion is limited primarily by the sweeping across streamlines. Upon comparison with the results of Section 3.2.2, we see that for $z > 1$, the boundary $\epsilon = 1$ marking the onset of the superdiffusive regime, as well as the tracer behavior within the superdiffusive regime, are the same as those for the case of a constant cross sweep in a steady random shear flow. We label the regime $\epsilon > 1$, $z > 1$ the SD-$\bar{w}$ regime, within which $\sigma_Y^2(t)$ grows superdiffusively according to a law depending only on $\bar{w}$ and the low-wavenumber behavior of $E(k)$, but not on the temporal fluctuations of the shear.

One important distinction between the long-time asymptotic shear-parallel transport with a constant cross sweep for $z > 1$ in the RSTS Model and that in the RSS Model is that the tracer motion is never subdiffusive or trapping. The reason is that the spatio-temporal fluctuations of the shear flow break up the phase coherence of the shear-parallel transport from individual spatial Fourier modes $k$, so the long-time contribution from each wavenumber $k$ becomes diffusive rather than trapped. We saw a similar phenomenon when molecular diffusion was superposed on a constant cross sweep in the RSS Model; see Paragraph 3.2.5.2. Crucial to the obliteration of the subdiffusive and trapping regimes is the assumption that the spatio-temporal energy spectrum $\tilde{E}(k,\omega)$ is nonzero for $k > 0$ and small. Random shear flows violating these conditions can give rise to subdiffusive shear-parallel tracer transport. Other types of anomalous behavior can also arise if energy is singularly concentrated at certain wavenumbers and frequencies, but we shall not explore these issues in any further detail here.

One unusual feature of the present situation is that the contribution from wavenumber $k$ to the diffusion constant $\tilde{K}^{*}_{\bar{w}}$ in the D regime is not proportional to $\tau_L(k) \sim \tau_{\bar{w}}(k)$ for wavenumbers $k$ for which sweeping effects dominate ($\tau_{\bar{w}}(k) \ll \tau(k)$). The reason can be traced to the fact that the interaction of a pure, constant cross sweep with a shear mode of a single wavenumber $k$ induces a shear-parallel trapping motion at long times. Therefore, the asymptotic diffusivity must rely somehow on the phase-randomizing effects of the random temporal fluctuations of the shear flow, even though the sweeping acts faster to break up the persistent motion of the tracer. The single-mode diffusivity in the RSS Model with $\kappa > 0$, $\bar{w} \ne 0$, and $w_f(t) = 0$ suffers from a similar anomaly, as we discussed briefly in Paragraph 3.2.5.2.
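3.3.4. Effects of temporally fluctuating cross sweep

Just as for the Random Steady Shear Model, a mean zero, randomly fluctuating cross sweep $w(t) = w_f(t)$ influences the shear-parallel transport in the RSTS Model in a manner similar to molecular diffusion, but with a wide range of behavior depending on the exponent $b$ describing the low-frequency scaling of its power spectrum:

$$R_w(t) \equiv \langle w_f(t')\,w_f(t'+t)\rangle = 2\int_0^\infty \cos(2\pi\omega t)\,E_w(|\omega|)\,d\omega ,$$

$$E_w(\omega) = A_{E,w}\,\omega^{-b}\,\psi_w(\omega) .$$

The long-time behavior of the mean-square shear-parallel tracer displacement $\sigma_Y^2(t)$ in an RSTS flow with randomly fluctuating cross sweep $w_f(t)$ is detailed in Table 9 and in the phase diagram in Fig. 16. The preconstants appearing in the asymptotic scaling laws have the following expressions:

$$\tilde{K}^{*}_{w_f} = 4\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\,\tilde{K}_{w_f}(k,\omega)\,dk\,d\omega \quad (\text{for } -1 < b < 1) ,$$

$$\tilde{K} = z^{-1}\pi^{(2\epsilon+z-4)/2z}\,\frac{\Gamma((2-\epsilon)/2z)}{\Gamma((\epsilon+3z-2)/2z)}\,A_E A_\tau^{(2-\epsilon)/z}\int_0^\infty \varpi^{(\epsilon-2)/z}\phi(\varpi)\,d\varpi ,$$

$$K_{w_f} = \frac{\Gamma((2-\epsilon)/2)\,A_E}{\epsilon(1+b)/2-b}\left(\frac{1+b}{4\pi^2K_x}\right)^{(2-\epsilon)/2} , \tag{169}$$

$$\tilde{K}^{*}_{w_f} = \int_0^\infty \tilde{E}(k,0)\,e^{-2\pi^2k^2K_x^{\circ}}\,dk = \int_0^\infty E(k)\,\tau(k)\,\phi(0)\,e^{-2\pi^2k^2K_x^{\circ}}\,dk \quad (\text{for } b < -1) .$$

Table 9
Long-time asymptotics of mean-square tracer displacement along the shear in the Random Spatio-Temporal Shear Model, with $w_f(t) \ne 0$, $\kappa = \bar{w} = 0$. Scaling coefficients are given by Eq. (169).

Parameter regime | Asymptotic mean-square displacement $\lim_{t\to\infty}\sigma_Y^2(t)$ | Qualitative behavior
For $-1 < b < 1$: | |
$0 \le z \le \frac{2}{1+b}$, $\epsilon < 2 - z$; or $z \ge \frac{2}{1+b}$, $\epsilon < \frac{2b}{1+b}$ | $2\tilde{K}^{*}_{w_f}\,t$ | D
$0 \le z < \frac{2}{1+b}$, $2 - z < \epsilon < 2$ | $\frac{2z}{2z+\epsilon-2}\,\tilde{K}\,t^{2+(\epsilon-2)/z}$ | SD-$\omega$
$z > \frac{2}{1+b}$, $\frac{2b}{1+b} < \epsilon < 2$ | $\frac{4}{2+\epsilon-b(2-\epsilon)}\,K_{w_f}\,t^{1+\epsilon/2-b(2-\epsilon)/2}$ | SD-$w_f$
For $b < -1$: | |
$z \ge 0$, $\epsilon < 2 - z$ | $2\tilde{K}^{*}_{w_f}\,t$ | D
$z \ge 0$, $2 - z < \epsilon < 2$ | $\frac{2z}{2z+\epsilon-2}\,\tilde{K}\,t^{2+(\epsilon-2)/z}$ | SD-$\omega$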
Fig. 16. Phase diagram for long-time asymptotics of $\sigma_Y^2(t)$ in the Random Spatio-Temporal Shear Model with $w_f(t) \ne 0$ and $\kappa = \bar{w} = 0$. The value $b = 1/2$, corresponding to a random cross sweep which induces superdiffusive cross-shear transport, is used in this drawing. Varying $b$ simply slides the "triple point" of intersection of the phase boundaries along the $\epsilon + z = 2$ line.
In the D regime for $-1 < b < 1$, the single wavenumber-frequency mode contribution to the diffusivity constant is

$$\tilde{K}_{w_f}(k,\omega) \equiv \lim_{t\to\infty}\frac{\tilde{R}_{w_f}(k,\omega,t)}{2t} = \int_0^\infty \cos(2\pi\omega s)\,e^{-2\pi^2k^2\sigma_X^2(s)}\,ds .$$

A rigorous derivation of these results is presented in detail in Paragraph 3.3.6.2.

We see the same qualitative structure of the phase diagram here in Fig. 16, for $-1 < b < 1$, as for the case of molecular diffusion or a constant cross sweep acting in concert with a randomly fluctuating spatio-temporal shear flow (cf. Figs. 14 and 15). The phase diagram may be analyzed in exactly the same manner through consideration of the Lagrangian persistence time $\tau_L(k)$ of the low-wavenumber modes. Here it is the minimum of the Eulerian correlation time $\tau(k) \sim k^{-z}$ and the $w_f$-persistence time $\tau_{w_f}(k) \sim k^{-2/(1+b)}$; see Eq. (131). For $0 \le z < 2/(1+b)$, the intrinsic decorrelation of the random shear determines the shear-parallel tracer motion due to low-wavenumber shear modulations, and the phase diagram and superdiffusive scaling laws are indifferent to the presence of $w_f(t) \ne 0$. On the other hand, for $z > 2/(1+b)$, the random sweeping $w_f(t)$ sets the decorrelation rate of the shear-parallel tracer motion, and the superdiffusive regime along with its boundary take the same form as in the Random Steady Shear Model (see Section 3.2.3). We label the superdiffusive regime within this portion of the phase diagram the SD-$w_f$ regime. Varying $b$ from 0 to 1 increases the cross-shear transport from diffusive to superdiffusive to almost ballistic, and the phase diagrams and superdiffusive scaling exponents for $\sigma_Y^2(t)$ correspondingly interpolate between those associated with molecular diffusion $\kappa$ acting across the shear (Section 3.3.2) and those associated with a constant sweep $\bar{w}$ across the shear (Section 3.3.3). The diffusivity constant $\tilde{K}^{*}_{w_f}$ in the D regime may be heuristically understood in terms of the contributions from modes of various wavenumbers and frequencies in a similar fashion to the case in which molecular diffusion is the cross-shear transport mechanism (Section 3.3.2).

For $b < -1$, the cross-shear motion is trapped, and consequently never competes with the temporal fluctuations of the shear in determining the tracer dynamics due to low-wavenumber components of the shear. The phase diagram and superdiffusive scaling laws are consequently identical for all $z$ to those for the case in which there is no cross-shear transport (see Section 3.3.1). The trapping cross sweep does have a mild influence on the diffusion constant within the diffusive regime.

3.3.5. Superposition of cross-shear transport mechanisms

For the RSTS Model, the behavior of shear-parallel transport under a combination of cross-shear transport processes is simple to describe. The phase diagram, along with the scaling laws for the indicated superdiffusive regimes, are exactly those corresponding to the mechanism which moves the tracer across the shear most rapidly (at long times). Therefore, any time that a mean cross sweep $\bar{w} \ne 0$ is active, the phase diagram appears as in Fig. 15, and the asymptotic behavior of $\sigma_Y^2(t)$ in superdiffusive regimes is insensitive to any other cross-shear transport mechanisms which may be present. Similarly, if $\bar{w} = 0$ but a superdiffusive ($0 < b < 1$) random cross sweep $w_f(t)$ is active, then the phase diagram and superdiffusive scaling laws are just as described in Section 3.3.4, whether or not molecular diffusion is present. Molecular diffusion, in like manner, dominates randomly fluctuating cross sweeps $w_f(t)$ with subdiffusive or trapping behavior. It should be noted that in any case, the diffusion coefficient in the diffusive regime D will depend on all cross-shear transport mechanisms present.
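The superposition rule can be sketched as a small lookup. The code below is an illustrative reading of Tables 7 to 9, not the authors' procedure: it assumes each active mechanism has a low-wavenumber persistence time scaling as $k^{-\zeta}$, with $\zeta = z$ for the temporal decorrelation of the shear, $\zeta = 2$ for molecular diffusion, $\zeta = 1$ for a mean sweep, and $\zeta = 2/(1+b)$ for a random sweep with $-1 < b < 1$. The smallest $\zeta$ gives the shortest persistence time as $k \to 0$. The growth exponent $2 + (\epsilon - 2)/\zeta$ is an assumption that reproduces each superdiffusive row of the three tables; the name `classify_regime` is hypothetical.

```python
def classify_regime(eps, z, kappa=False, mean_sweep=False, b=None):
    """Return (regime label, exponent of t in sigma_Y^2) away from phase boundaries.

    eps, z     : spectral exponents of the shear, with eps < 2 and z >= 0
    kappa      : True if molecular diffusion is active
    mean_sweep : True if a constant cross sweep is active
    b          : exponent of a random cross sweep, or None if absent;
                 trapping sweeps (b <= -1) never dominate and are ignored here
    """
    if eps >= 2 or z < 0:
        raise ValueError("expected eps < 2 and z >= 0")
    # Low-wavenumber scaling exponents zeta of each active persistence time ~ k^(-zeta).
    exponents = {"omega": float(z)}
    if kappa:
        exponents["kappa"] = 2.0
    if mean_sweep:
        exponents["w_bar"] = 1.0
    if b is not None and -1.0 < b < 1.0:
        exponents["w_f"] = 2.0 / (1.0 + b)
    label, zeta = min(exponents.items(), key=lambda item: item[1])
    # Diffusive when E(k) tau_L(k) ~ k^(1 - eps - zeta) is integrable at k = 0.
    if eps + zeta < 2.0:
        return ("D", 1.0)
    return ("SD-" + label, 2.0 + (eps - 2.0) / zeta)
```

For example, `classify_regime(1.5, 3.0, kappa=True)` returns the SD-$\kappa$ law with exponent $1 + \epsilon/2 = 1.75$. Points lying exactly on a phase boundary are outside the open regimes listed in the tables, and the sketch makes no claim about them.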
Another way to summarize the above results is that the criterion for superdiffusive shear-parallel tracer motion, and the asymptotic behavior of $\sigma_Y^2(t)$ in the superdiffusive regime, depend only on the low-wavenumber behavior of the energy spectrum $E(k) \sim A_E|k|^{1-\epsilon}$ and the low-wavenumber behavior of the Lagrangian persistence time $\tau_L(k)$. The Lagrangian persistence time is in turn determined by the shortest of the individual persistence times ($\tau(k)$, $\tau_\kappa(k)$, $\tau_{\bar{w}}(k)$, and $\tau_{w_f}(k)$) which correspond to active processes. Diffusive behavior results when $E(k)\tau_L(k)$ is integrable at low wavenumber. The region of superdiffusive behavior may from this argument be deduced to be the intersection of the superdiffusive regions associated to each cross-shear transport process acting separately.

With the assumptions of the RSTS Model, there is never any subdiffusive or trapped shear-parallel tracer motion, and the associated subtlety in the RSS Model concerning superposition of cross-shear transport mechanisms is not an issue (see Paragraph 3.2.5.2). We emphasize that this is a consequence of the assumption that energy is distributed continuously in wavenumber-frequency space, with some nontrivial contribution at low frequencies and wavenumbers. This is illustrated by the behavior of the shear-displacement kernel $\tilde{R}(k,\omega,t)$ when $\bar{w} \ne 0$ in the limit in which the random components of the cross-shear motion vanish, i.e. $\kappa \to 0$ and $w_f(t) \to 0$. For any fixed $k \ne 0$ and $\omega$, the long-time asymptotics of $\tilde{R}(k,\omega,t)$ behave singularly in this limit, just as in the RSS Model. The integration against the spatio-temporal spectrum $\tilde{E}(k,\omega)$ satisfying the assumptions of the RSTS Model, however, completely regularizes the limit in which the random cross-shear components vanish. Irregular limiting behavior of the type described in the Random Steady Shear Model can result for random shear flows with spatio-temporal fluctuations which do not satisfy the basic assumptions of the RSTS Model.

3.3.6. Derivations

We indicate here how to establish the results concerning the statistics of shear-parallel tracer motion in the RSTS Model which were stated throughout Section 3.3. First, we derive the basic formula (157) for $\sigma_Y^2(t)$ at finite times. Next, we illustrate in detail how the long-time asymptotics of $\sigma_Y^2(t)$ can be rigorously computed for the case in which only a randomly fluctuating cross sweep is present (as in Section 3.3.4). We finally sketch without details how to compute the long-time asymptotics of $\sigma_Y^2(t)$ with general cross-shear transport mechanisms.

3.3.6.1. Derivation of general formula. The derivation of formula (157) proceeds in exactly the same way as in the Random Steady Shear Model (see Paragraph 3.2.6.1), except that the time dependence of the shear velocity field must be accounted for. We start with the following modification of formula (137):
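$$\sigma_Y^2(t) = 2\kappa\langle (W_y(t))^2\rangle + \int_0^t\!\!\int_0^t \langle v(X(s),s)\,v(X(s'),s')\rangle\,ds\,ds' = 2\kappa t + \int_0^t\!\!\int_0^t \langle \tilde{R}(X(s)-X(s'),\,s-s')\rangle\,ds\,ds' ,$$

which is obtained by simply replacing $v(X(s))$ in the trajectory equation (136b) for $Y(t)$ with $v(X(s),s)$. Substituting the spectral representation (153) for $\tilde{R}(x,t)$ into this last expression, we have

$$\sigma_Y^2(t) = 2\kappa t + \int_0^t\!\!\int_0^t\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \tilde{E}(k,\omega)\,\langle e^{2\pi ik(X(s)-X(s'))}\rangle\,e^{2\pi i\omega(s-s')}\,d\omega\,dk\,ds\,ds' .$$

The remaining average may now be computed in the same way as in Paragraph 3.2.6.1:

$$\sigma_Y^2(t) = 2\kappa t + \int_0^t\!\!\int_0^t\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \tilde{E}(k,\omega)\,e^{2\pi ik\bar{w}(s-s') - 2\pi^2k^2\sigma_X^2(|s-s'|)}\,e^{2\pi i\omega(s-s')}\,d\omega\,dk\,ds\,ds' .$$

Using finally the four-way symmetry $\tilde{E}(k,\omega) = \tilde{E}(k,-\omega) = \tilde{E}(-k,\omega) = \tilde{E}(-k,-\omega)$, which is a consequence of the assumed time reversal symmetry of the shear flow, we can condense the wavenumber-frequency integration to the first quadrant:

$$\sigma_Y^2(t) = 2\kappa t + 4\int_0^t\!\!\int_0^t\!\!\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\cos(2\pi k\bar{w}(s-s'))\cos(2\pi\omega(s-s'))\,e^{-2\pi^2k^2\sigma_X^2(|s-s'|)}\,d\omega\,dk\,ds\,ds'$$

$$\phantom{\sigma_Y^2(t)} = 2\kappa t + 8\int_0^t\!\!\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\,(t-s)\cos(2\pi k\bar{w}s)\cos(2\pi\omega s)\,e^{-2\pi^2k^2\sigma_X^2(s)}\,d\omega\,dk\,ds .$$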
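The last equality collapses the double time integral to a single one. A short worked step, stated here as a supplementary sketch rather than as part of the original derivation, is the following identity for any integrable function $f$ that depends only on $|s-s'|$:

```latex
\begin{align*}
\int_0^t\!\!\int_0^t f(|s-s'|)\,ds\,ds'
  &= 2\int_0^t\!\!\int_0^{s} f(s-s')\,ds'\,ds
     && \text{(symmetry under } s \leftrightarrow s'\text{)} \\
  &= 2\int_0^t\!\!\int_0^{s} f(r)\,dr\,ds
     && (r = s-s') \\
  &= 2\int_0^t (t-r)\,f(r)\,dr
     && \text{(exchange the order of integration)} .
\end{align*}
```

Applied with $f(r) = \cos(2\pi k\bar{w}r)\cos(2\pi\omega r)\,e^{-2\pi^2k^2\sigma_X^2(r)}$, this turns the prefactor 4 into 8 and produces the weight $(t-s)$.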
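Reversing the order of integration and decomposing the formula into an explicit integration against the shear-displacement kernel $\tilde{R}(k,\omega,t)$, we arrive at the desired formula (157):

$$\sigma_Y^2(t) = 2\kappa t + 4\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\,\tilde{R}(k,\omega,t)\,d\omega\,dk , \tag{170a}$$

$$\tilde{R}(k,\omega,t) = 2\int_0^t (t-s)\cos(2\pi k\bar{w}s)\cos(2\pi\omega s)\,e^{-2\pi^2k^2\sigma_X^2(s)}\,ds . \tag{170b}$$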
3.3.6.2. Derivation of asymptotics for case of randomly fluctuating cross sweep. We present here a rigorous computation of the long-time asymptotics for $\sigma_Y^2(t)$ for the case considered in Section 3.3.4 in which $w_f(t) \ne 0$, $\kappa = 0$, and $\bar{w} = 0$. The general finite-time formula (170) specializes to

$$\sigma_Y^2(t) = 4\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\,\tilde{R}_{w_f}(k,\omega,t)\,d\omega\,dk , \tag{171a}$$

$$\tilde{R}_{w_f}(k,\omega,t) = 2\int_0^t (t-s)\cos(2\pi\omega s)\,e^{-2\pi^2k^2\sigma_X^2(s)}\,ds , \tag{171b}$$

where

$$\sigma_X^2(t) \equiv \langle (X(t)-x_0)^2\rangle = 2\int_0^t (t-s)\,R_w(s)\,ds ,$$

$$R_w(t) \equiv \langle w_f(t')\,w_f(t'+t)\rangle = 2\int_0^\infty \cos(2\pi\omega t)\,E_w(|\omega|)\,d\omega ,$$

$$E_w(\omega) = A_{E,w}\,\omega^{-b}\,\psi_w(\omega) .$$

This is the case for which we also presented a detailed derivation for the Random Steady Shear Model in Paragraph 3.2.6.2. The general procedure is similar in spirit, though the extra integration over the frequency variable creates the need for some extra work. We shall first deal with the range of exponent values $-1 < b < 1$ for which the random cross sweep produces unbounded motion across the shear. The effects of a trapping cross sweep ($b < -1$) are handled separately at the end.

Diffusive regime. For $-1 < b < 1$, the long-time asymptotics of the shear-displacement kernel are diffusive:

$$\lim_{t\to\infty}\tilde{R}_{w_f}(k,\omega,t) \sim 2\tilde{K}_{w_f}(k,\omega)\,t , \qquad \tilde{K}_{w_f}(k,\omega) = \int_0^\infty \cos(2\pi\omega s)\,e^{-2\pi^2k^2\sigma_X^2(s)}\,ds . \tag{172}$$

We now show that for the range of parameters defining the D regime in Fig. 16, the shear-parallel tracer motion is described by a finite diffusion coefficient given by the integral of the single-mode diffusivity against the spatio-temporal energy spectrum:

$$\lim_{t\to\infty}\sigma_Y^2(t) \sim 2\tilde{K}^{*}_{w_f}\,t , \qquad \tilde{K}^{*}_{w_f} = 4\int_0^\infty\!\!\int_0^\infty \tilde{E}(k,\omega)\,\tilde{K}_{w_f}(k,\omega)\,d\omega\,dk . \tag{173}$$

To do this, we need only show that $t^{-1}\tilde{R}_{w_f}(k,\omega,t)$ is bounded uniformly in time by a function which is integrable against $\tilde{E}(k,\omega)$, and then apply the dominated convergence theorem. Noting that $|\tilde{R}_{w_f}(k,\omega,t)| \le \tilde{R}_{w_f}(k,0,t) = R_{w_f}(k,t)$, we can utilize the bound (145) obtained in our previous analysis in the RSS Model:

$$|\tilde{R}_{w_f}(k,\omega,t)| \le C_1\,\tau_{w_f}(k)\,t .$$

Here $\tau_{w_f}(k)$ is some positive, decreasing function with the low-wavenumber asymptotics (146)

$$\tau_{w_f}(k) \sim C_{w_f}\,(K_xk^2)^{-1/(1+b)} , \tag{174}$$

and may be thought of as the Lagrangian $w_f$-persistence time. The constant $K_x$ is defined in Eq. (129); the $\{C_j\}$ and $C_{w_f}$ are positive numerical constants which do not depend on $k$, $\omega$, or $t$.

Next we establish a second bound

$$|\tilde{R}_{w_f}(k,\omega,t)| \le C_2\,\omega^{-1}t , \tag{175}$$

with positive numerical constant $C_2$ depending only on $b$. We first integrate $\tilde{R}_{w_f}(k,\omega,t)$ by parts:

$$\tilde{R}_{w_f}(k,\omega,t) = 2\omega^{-1}t\int_0^t e^{-2\pi^2k^2\sigma_X^2(s)}\sin(2\pi\omega s)\left[(2\pi t)^{-1} + \pi k^2\,\frac{d\sigma_X^2(s)}{ds}\left(1-\frac{s}{t}\right)\right]ds . \tag{176}$$

As stated in Eq. (144), $\lim_{s\to\infty}\sigma_X^2(s) \sim \frac{2K_x}{1+b}\,s^{1+b}$. It may be verified through an integration by parts of the formula

$$\frac{d\sigma_X^2(s)}{ds} = 2\int_0^s R_w(s')\,ds' = 2\int_{-\infty}^{\infty} A_{E,w}\,|\omega|^{-b}\,\psi_w(|\omega|)\,\frac{\sin 2\pi\omega s}{2\pi\omega}\,d\omega$$

that the derivative has the naturally expected bound

$$|d\sigma_X^2(s)/ds| \le C_3\,K_x\,(1+s^{b})$$

for $-1 < b < 1$ and some positive numerical constant $C_3$ depending only on $b$. Using these facts about $\sigma_X^2(s)$ and its derivative, the desired bound (175) follows from Eq. (176).

We therefore have that

$$|\tilde{R}_{w_f}(k,\omega,t)| \le C_4\,\min(\tau_{w_f}(k),\,\omega^{-1})\,t . \tag{177}$$

This sensibly generalizes the natural bounds (141) which one obtains in the RSS Model, since $\min(\tau_{w_f}(k), \omega^{-1})$ acts as a Lagrangian persistence time for the shear-parallel tracer motion associated to a fluctuation of a single wavenumber and frequency.
We make even stronger contact with the analysis of the RSS Model when we perform the integration of Eqs. (171a) and (171b) over frequency. Recalling that the spatio-temporal spectrum in the RSTS Model has the form

$$\tilde{E}(k,\omega) = E(k)\,\phi(\omega\tau(k))\,\tau(k) ,$$

we can write

$$\sigma_Y^2(t) = 2\int_0^\infty E(k)\,R_{\phi,w_f}(k,t)\,dk , \tag{178a}$$

$$R_{\phi,w_f}(k,t) = 2\int_0^\infty \tilde{R}_{w_f}(k,\omega,t)\,\phi(\omega\tau(k))\,\tau(k)\,d\omega . \tag{178b}$$

The kernel $R_{\phi,w_f}(k,t)$ may be interpreted as the mean-square shear-parallel displacement of a tracer due to a single normalized random Fourier mode of wavenumber $k$ (with temporal fluctuations at all frequencies); it directly generalizes the RSS shear-displacement kernel $R_{w_f}(k,t)$. Using the assumed bound (156) on $\phi(\cdot)$ along with Eq. (177), we infer the following bound:

$$|R_{\phi,w_f}(k,t)| \le C_5\,\min(\tau_{w_f}(k),\,\tau(k))\,t . \tag{179}$$

Realizing that $\min(\tau_{w_f}(k), \tau(k))$ represents the Lagrangian persistence time $\tau_L(k)$ of spatial wavenumber $k$, we see that this directly generalizes the physically motivated bounds (141) we obtained for the RSS shear-displacement kernel. With Eq. (179) and the dominated convergence theorem, we have demonstrated a rigorous version of the criterion

$$\int_0^\infty \tau_L(k)\,E(k)\,dk < \infty \tag{180}$$

for ordinary diffusive growth of $\sigma_Y^2(t)$ at long times. We have only proven it here for the special case $\kappa = \bar{w} = 0$, but as we shall discuss below, it holds rigorously for $\kappa \ne 0$ and $\bar{w} \ne 0$ as well, provided that $\tau_L(k)$ is interpreted in the appropriate fashion. We have already indicated in Section 3.3.4 how Eq. (180) defines the boundaries of the diffusive regime D in the phase diagram in Fig. 16.

Superdiffusive regimes. For values of $\epsilon$ and $z$ outside the closure of the D regime, the integral in Eq. (180) diverges at low wavenumber. We therefore expect superdiffusion which is driven by the low-wavenumber modes of the shear flow, and zoom in on this region with a strategy similar to that developed for the RSS Model in Section 3.2.6. We introduce a wavenumber cutoff $k_c$ and frequency cutoff $\omega_c$, and separate the formula (171a) into a low wavenumber-frequency piece $\bar{\sigma}_Y^2(t)$ and the remainder $\tilde{\sigma}_Y^2(t)$:

$$\sigma_Y^2(t) = \bar{\sigma}_Y^2(t) + \tilde{\sigma}_Y^2(t) ,$$

$$\bar{\sigma}_Y^2(t) = 4\int_0^{k_c}\!\!\int_0^{\omega_c} \tilde{E}(k,\omega)\,\tilde{R}_{w_f}(k,\omega,t)\,d\omega\,dk ,$$

$$\tilde{\sigma}_Y^2(t) = 4\iint_{\{k>k_c\}\cup\{\omega>\omega_c\}} \tilde{E}(k,\omega)\,\tilde{R}_{w_f}(k,\omega,t)\,d\omega\,dk .$$
The contribution $\tilde{\sigma}_Y^2(t)$ is clearly at most diffusive by Eq. (177), since it has no contribution from modes with both small $\omega$ and small $k$. We next rescale $\bar{\sigma}_Y^2(t)$ by

$$q = k/k_0(t) ,$$

where $k_0(t)$ scales with the inverse function to the Lagrangian persistence time $\tau_L(k)$ at small $k$. (For the motivation, see our discussion in Paragraph 3.2.6.2 in the context of the RSS Model.) Noting that

$$\lim_{k\to 0}\tau_L(k) = \lim_{k\to 0}\min(\tau_{w_f}(k),\,\tau(k)) \sim \begin{cases} \tau_{w_f}(k) \sim C_{w_f}\,(K_xk^2)^{-1/(1+b)} & \text{for } z > 2/(1+b) , \\ \tau(k) \sim A_\tau k^{-z} & \text{for } 0 \le z < 2/(1+b) , \end{cases}$$

we choose

$$k_0(t) = \begin{cases} \left(\dfrac{4\pi^2K_x}{1+b}\right)^{-1/2} t^{-(1+b)/2} & \text{for } z > 2/(1+b) , \\ A_\tau^{1/z}\,t^{-1/z} & \text{for } 0 \le z < 2/(1+b) . \end{cases}$$

(Numerical prefactors in $k_0(t)$ have been chosen for convenience.) Because the frequency variable $\omega$ appears in Eqs. (178a) and (178b) in conjunction with the single time scale $\tau(k)$, with $\tau(k) \sim A_\tau k^{-z}$ at small wavenumbers, it is natural (and actually quite necessary) to rescale the frequency variable along with the wavenumber variable according to
$$\varpi = \omega/\omega_0(t) , \tag{181a}$$

$$\omega_0(t) \equiv A_\tau^{-1}\,(k_0(t))^{z} . \tag{181b}$$
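A minimal sketch of the two rescaling scales may help fix ideas. It assumes the choice of $k_0(t)$ stated in the text for the two superdiffusive regimes and the definition (181b) of $\omega_0(t)$; the function name and argument names are hypothetical, and the code requires $z > 0$ and $-1 < b < 1$.

```python
import math


def rescaling_scales(t, z, b, A_tau, K_x):
    """Return (k0, omega0) for the wavenumber and frequency rescalings.

    For z > 2/(1+b) the random sweep sets the persistence time, and k0 solves
    (4 pi^2 K_x / (1+b)) k0^2 t^(1+b) = 1. Otherwise the Eulerian time
    tau(k) ~ A_tau k^(-z) sets it, and k0 = (A_tau / t)^(1/z).
    In both cases omega0 = k0^z / A_tau, as in Eq. (181b).
    """
    if z <= 0 or not (-1.0 < b < 1.0):
        raise ValueError("expected z > 0 and -1 < b < 1")
    if z > 2.0 / (1.0 + b):
        k0 = (4.0 * math.pi**2 * K_x / (1.0 + b)) ** -0.5 * t ** (-(1.0 + b) / 2.0)
    else:
        k0 = A_tau ** (1.0 / z) * t ** (-1.0 / z)
    omega0 = k0**z / A_tau
    return k0, omega0
```

In the regime controlled by the shear's own decorrelation the product $\omega_0(t)\,t$ equals 1, while in the sweep-controlled regime it tends to zero as $t$ grows, which is why the cosine factor drops out of the limiting kernel there.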
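Note that $\omega_0(t) = t^{-1}$ for $z < 2/(1+b)$, where the decorrelation of the shear flow fluctuations itself is dominant at low wavenumbers. Performing the rescalings indicated above, we obtain

$$\bar{\sigma}_Y^2(t) = 4k_0(t)\,\omega_0(t)\int_0^{k_c/k_0(t)}\!\!\int_0^{\omega_c/\omega_0(t)} \tilde{E}(qk_0(t),\,\varpi\omega_0(t))\,\tilde{R}_{w_f}(qk_0(t),\,\varpi\omega_0(t),\,t)\,d\varpi\,dq . \tag{182}$$

The rescaled shear-displacement kernel in the integrand converges, for each fixed $q$ and $\varpi$, to a finite multiple of $t^2$ in the long-time limit, reflecting the fact that the modes in the band $k \lesssim k_0(t)$, $\omega \lesssim \omega_0(t)$ are still contributing ballistically:

$$\lim_{t\to\infty} t^{-2}\,\tilde{R}_{w_f}(qk_0(t),\,\varpi\omega_0(t),\,t) = \begin{cases} 2\displaystyle\int_0^1 (1-u)\,e^{-q^2u^{1+b}}\,du & \text{for } z > 2/(1+b) , \\ (1-\cos 2\pi\varpi)/(2\pi^2\varpi^2) & \text{for } 0 \le z < 2/(1+b) . \end{cases}$$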
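The long-time limit of the rescaled spatio-temporal energy spectrum in the integrand may be expressed through the low-wavenumber asymptotics of $\tilde{E}$:

$$\lim_{t\to\infty}\tilde{E}(qk_0(t),\,\varpi\omega_0(t)) = \lim_{t\to\infty} E(qk_0(t))\,\phi(\varpi\omega_0(t)\,\tau(qk_0(t)))\,\tau(qk_0(t))$$
$$\sim A_E\,(qk_0(t))^{1-\epsilon}\,\phi\!\left(\varpi\omega_0(t)\,A_\tau\,(qk_0(t))^{-z}\right) A_\tau\,(qk_0(t))^{-z}$$
$$= (k_0(t))^{1-\epsilon-z}\,A_E A_\tau\,q^{1-\epsilon-z}\,\phi(\varpi q^{-z}) . \tag{183}$$

To deduce the limit of the integral (182) as the integral of the limit of the integrand, we establish uniform integrable bounds on the integrand. From the inequality (177) and $|\tilde{R}_{w_f}(k,\omega,t)| \le 2t^2$, we can deduce that

$$\left|t^{-2}\,\tilde{R}_{w_f}(qk_0(t),\,\varpi\omega_0(t),\,t)\,H(k_c/k_0(t)-q)\,H(\omega_c/\omega_0(t)-\varpi)\right| \le \begin{cases} \dfrac{C_6}{1+q^{2/(1+b)}} & \text{for } z > 2/(1+b) , \\ \dfrac{C_6}{1+\varpi} & \text{for } 0 \le z < 2/(1+b) , \end{cases} \tag{184}$$

where $H$ is the Heaviside function (151). The low-wavenumber asymptotics $E(k) \sim A_Ek^{1-\epsilon}$ and $\tau(k) \sim A_\tau k^{-z}$, along with the uniform bound (156) on $\phi(\cdot)$, imply

$$(k_0(t))^{-(1-\epsilon-z)}\,\tilde{E}(qk_0(t),\,\varpi\omega_0(t))\,H(k_c/k_0(t)-q)\,H(\omega_c/\omega_0(t)-\varpi) \le C_7\,q^{1-\epsilon-z}/(1+|\varpi q^{-z}|^{\gamma}) , \tag{185}$$

where $\gamma > 1$. The bounds provided by the product of the right-hand sides of Eqs. (184) and (185) are indeed absolutely integrable over $0 \le q < \infty$, $0 \le \varpi < \infty$, provided that $(\epsilon, z)$ fall within the interior of either of the superdiffusive regimes indicated in Fig. 16. The dominated convergence theorem thus guarantees that we may safely evaluate the long-time limit of Eq. (182) by replacing the integrand by its long-time limit.

Within the SD-$w_f$ regime, defined by the inequalities

$$z > 2/(1+b) , \qquad 2b/(1+b) < \epsilon < 2 ,$$

we obtain

$$\lim_{t\to\infty}\sigma_Y^2(t) \sim \lim_{t\to\infty}\bar{\sigma}_Y^2(t) \sim 4k_0(t)\,\omega_0(t)\,t^2\,k_0^{1-\epsilon-z}(t)\int_0^\infty\!\!\int_0^\infty A_E A_\tau\,q^{1-\epsilon-z}\,\phi(\varpi q^{-z})\left[2\int_0^1 (1-u)\,e^{-q^2u^{1+b}}\,du\right]d\varpi\,dq$$
$$= 4k_0^{2-\epsilon}(t)\,t^2\int_0^\infty A_E\,q^{1-\epsilon}\left[2\int_0^1 (1-u)\,e^{-q^2u^{1+b}}\,du\right]\int_0^\infty \phi(\varpi)\,d\varpi\,dq$$
$$= 2k_0^{2-\epsilon}(t)\,t^2\int_0^\infty A_E\,q^{1-\epsilon}\left[2\int_0^1 (1-u)\,e^{-q^2u^{1+b}}\,du\right]dq .$$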
We used $\int_0^\infty \phi(\varpi)\,d\varpi = \tfrac{1}{2}\int_{-\infty}^{\infty}\phi(\varpi)\,d\varpi = \tfrac{1}{2}$ in the last equality. We thereby arrive at the same long-time asymptotic expression for $\bar{\sigma}_Y^2(t)$ as in the RSS Model; see Eq. (152). This rigorously proves that the long-time asymptotics of $\sigma_Y^2(t)$ in the SD-$w_f$ regime are unaffected by the presence of temporal fluctuations in the shear flow.

For the SD-$\omega$ regime in Fig. 16, defined by

$$0 < z \le 2/(1+b) , \qquad 2-z < \epsilon < 2 ,$$

we compute instead

$$\lim_{t\to\infty}\sigma_Y^2(t) \sim \lim_{t\to\infty}\bar{\sigma}_Y^2(t) \sim 4k_0(t)\,\omega_0(t)\,t^2\,k_0^{1-\epsilon-z}(t)\int_0^\infty\!\!\int_0^\infty A_E A_\tau\,q^{1-\epsilon-z}\,\phi(\varpi q^{-z})\,\frac{1-\cos 2\pi\varpi}{2\pi^2\varpi^2}\,d\varpi\,dq = \frac{2z}{2z+\epsilon-2}\,\tilde{K}\,t^{2+(\epsilon-2)/z} ,$$

where

$$\tilde{K} = \frac{2z+\epsilon-2}{z}\,\pi^{-2}A_E A_\tau^{(2-\epsilon)/z}\int_0^\infty\!\!\int_0^\infty q^{1-\epsilon-z}\,(1-\cos 2\pi\varpi)\,\varpi^{-2}\,\phi(\varpi q^{-z})\,d\varpi\,dq$$
$$= \frac{2z+\epsilon-2}{z}\,\pi^{-2}A_E A_\tau^{(2-\epsilon)/z}\,z^{-1}\int_0^\infty\!\!\int_0^\infty p^{(\epsilon-2)/z}\,(1-\cos 2\pi\varpi)\,\varpi^{(2-\epsilon-3z)/z}\,\phi(p)\,d\varpi\,dp$$
$$= z^{-1}\pi^{(2\epsilon+z-4)/2z}\,\frac{\Gamma((2-\epsilon)/2z)}{\Gamma((\epsilon+3z-2)/2z)}\,A_E A_\tau^{(2-\epsilon)/z}\int_0^\infty p^{(\epsilon-2)/z}\,\phi(p)\,dp .$$

This is exactly what is stated in Table 9 and Eq. (169).

Weak sweeping regime. We finally treat the case in which the randomly fluctuating cross sweep $w_f(t)$ produces only trapped motion across the shear ($b < -1$). The following calculation will also be valid for the special case of no cross sweep $w_f(t) = 0$ (and no other cross-shear transport mechanisms), thereby yielding the results stated in Section 3.3.1 as a by-product.

To understand how to proceed, we notice that for no cross sweep ($\sigma_X^2(t) = 0$), the shear-displacement kernel in Eq. (171b) has the explicit form

$$\tilde{R}_{w_f}(k,\omega,t) = \tilde{R}(k,\omega,t) = t^2F(2\pi\omega t) , \qquad F(\varpi) = \frac{2(1-\cos\varpi)}{\varpi^2} . \tag{186}$$

In particular, the shear-parallel tracer displacement generated by a single wavenumber-frequency mode with $\omega \ne 0$ is oscillatory and trapped. This suggests that most of the tracer action at long times will come from low-frequency modes, and the rescaling $\varpi = \omega t$ is suggested to zoom in on them. We follow this strategy also if a trapping cross sweep $w_f(t)$ is present, since we can expect that this should only produce a weak perturbation of the case of no cross-shear transport.
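The last step in the evaluation of $\tilde{K}$ for the SD-$\omega$ regime appears to rest on the integral $\int_0^\infty (1-\cos 2\pi\varpi)\,\varpi^{-s}\,d\varpi$ with $s = (\epsilon+3z-2)/z$. The sketch below checks numerically a closed form for it, $\pi^{s-1/2}\,\Gamma((3-s)/2)/((s-1)\,\Gamma(s/2))$ for $1 < s < 3$. This closed form is worked out here as an assumption consistent with the Gamma-function ratio in Eq. (169), not quoted from the source. The splitting into a power series near the origin, Simpson's rule in the middle, and an analytic tail is purely an illustrative numerical device.

```python
import math


def moment_closed_form(s):
    """pi^(s - 1/2) Gamma((3 - s)/2) / ((s - 1) Gamma(s/2)), assumed valid for 1 < s < 3."""
    return math.pi ** (s - 0.5) * math.gamma((3.0 - s) / 2.0) / ((s - 1.0) * math.gamma(s / 2.0))


def moment_numeric(s, delta=0.5, upper=200, h=0.005):
    """Numerical value of int_0^inf (1 - cos(2 pi x)) x^(-s) dx for 1 < s < 3."""
    # [0, delta]: integrate the Taylor series of 1 - cos(2 pi x) term by term,
    # which handles the integrable singularity at x = 0 when s > 2.
    head = 0.0
    for n in range(1, 40):
        head += ((-1) ** (n + 1) * (2.0 * math.pi) ** (2 * n) * delta ** (2 * n + 1 - s)
                 / (math.factorial(2 * n) * (2 * n + 1 - s)))

    # [delta, upper]: composite Simpson rule with an even number of panels.
    def f(x):
        return (1.0 - math.cos(2.0 * math.pi * x)) * x ** (-s)

    m = 2 * round((upper - delta) / (2.0 * h))
    step = (upper - delta) / m
    acc = f(delta) + f(upper)
    for i in range(1, m):
        acc += (4.0 if i % 2 else 2.0) * f(delta + i * step)
    body = acc * step / 3.0
    # [upper, inf): keep only the non-oscillatory part x^(-s); with an integer
    # upper limit the neglected cosine part is of order upper^(-s - 1).
    tail = upper ** (1.0 - s) / (s - 1.0)
    return head + body + tail
```

With $\epsilon = 1.5$ and $z = 1$ one has $s = 2.5$, and multiplying the closed form by the prefactor $\frac{2z+\epsilon-2}{z}\,\pi^{-2}z^{-1}$ gives $z^{-1}\pi^{(2\epsilon+z-4)/2z}\,\Gamma(1/4)/\Gamma(5/4) = 4$, matching the Gamma-function ratio in the final expression.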
We thereby arrive at the following expression for p(t): 7 p(t)"4t\ EI (k, -t\)RI D(k, - t\, t) d- dk . 7 U The long-time limit of the rescaled shear-displacement kernel has a clean form when the cross sweep w (t) is trapped: D lim t\RI D(k, -t\, t)"2 lim (1!u)cos(2p- u)e\pIN6RS du U R R "2 (1!u)cos(2p- u)e\pI)3V du"e\pI)3VF (2p-) . We used the fact (144) that p (t) has the "nite long-time limit K3 for b(!1. The rescaled 6 V spatio-temporal energy spectrum clearly converges to its zero-frequency limit:
lim EI (k, -t\)"EI (k, 0) . R With "RI D(k, -t\, t)"4t and EI (k, -t\)4C E(k)q(k)& C A A k\C\X (due to the uniform U I # O bound (156) on ), we can apply the dominated convergence theorem when e#z(2 to deduce I EI (k, 0)e\pI)3VF (2p-) d- dk"2t e\pI)3VEI (k, 0) dk , lim p(t)"4t 7 R where we have used the integral formula F (2p-) d-". This covers the di!usive regime D. For e#z'2, there is a nonintegrable divergence of the limiting integrand at k"0, so we rescale the wavenumber along with the frequency variable
q"k/k (t) . Since the Lagrangian persistence time q (k) is clearly just the Eulerian correlation time q(k) of the * shear #ow, and q(k)&A k\X for low wavenumbers, an appropriate choice of the rescaling is O k (t)"AXt\X . O Note that the indicated wavenumber and frequency rescaling is the same as that in the SD-u regime described above in our discussion of the case of random sweeps with !1(b(1. Under this rescaling, the random cross sweeping there became asymptotically irrelevant in the long time limit; the same clearly must be true when the cross sweep is trapping. Therefore, the long-time asymptotics of p(t) obeys the SD-u scaling law for e#z'2 when the random cross sweep motion 7 is trapping. 3.3.6.3. Sketch of general derivation of asymptotics. We have just computed in detail the asymptotic results presented in Section 3.3.4 for the shear-parallel transport in the presence of a randomly #uctuating cross sweep w (t) and no molecular di!usion i"0 or mean cross sweep wN "0. We D covered, as a special case of a trapping cross sweep, the derivation of the results of Section 3.3.1
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237–574
where no cross-shear transport process is active. The approach developed above extends easily to handle $\kappa\neq 0$; indeed, molecular diffusion may be treated in the above analysis in exactly the same way as a diffusive cross sweep ($\beta=0$). Therefore, only the case of a nonzero mean cross sweep remains to be discussed. We shall simply sketch how to modify the above approach to compute the asymptotics for $\sigma_Y^2(t)$ when $\bar{w}\neq 0$.

First of all, if no other cross-shear transport process is active ($\kappa = w_f(t) = 0$), as in our discussion in Section 3.3.3, then the shear-displacement kernel takes the explicit form

$$\tilde{R}_{\bar{w}}(k,\omega,t) = \tfrac12 t^2\left[F(2\pi(\omega-\bar{w}k)t) + F(2\pi(\omega+\bar{w}k)t)\right], \qquad F(\varpi) = 2(1-\cos\varpi)/\varpi^2 .$$

This is essentially just a sum of Doppler shifts of the kernel $\tilde{R}(k,\omega,t)$ in Eq. (186) for the case of no cross-shear transport. The single-mode wavenumber-frequency contributions away from the resonance line $\omega=\bar{w}k$ are trapped, so the main contribution to the total shear-parallel transport at long times comes only from modes near this resonance line. The asymptotics of $\sigma_Y^2(t)$ are therefore computed by rescaling the frequency variable in Eqs. (170a) and (170b) as $\omega = \bar{w}k + \varpi t^{-1}$, and then proceeding as in our discussion of a trapping cross sweep above. This handles the D and SD-$\omega$ regimes in Fig. 15.

Within the SD-$\bar{w}$ regime, only the low wavenumbers are relevant at long times, and the effects of the cross sweep dominate those of the temporal fluctuations in the shear flow. The wavenumber-frequency rescaling must therefore be chosen in accordance with the fact that the effective Lagrangian persistence time is $\tau_L(k) = \tau_{\bar{w}}(k) = (2\pi\bar{w}k)^{-1}$, namely

$$k_0(t) = (2\pi\bar{w}t)^{-1}, \qquad \omega_0(t) = A_\tau^{-1}\,(k_0(t))^{z} .$$

When a mean cross sweep is superposed with molecular diffusion $\kappa$ or a randomly fluctuating cross sweep $w_f(t)$, the only substantial change is that the frequency variable should not be rescaled in the computation within the D regime.
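The two elementary properties of the kernel $F$ used in this derivation can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the closed form $F(x)=2(1-\cos x)/x^2$ quoted above, and the function names, grid sizes and truncation of the half-line integral are illustrative choices.

```python
import numpy as np

def F(x):
    """Kernel F(x) = 2(1 - cos x)/x^2, using a series near the removable singularity at 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-4
    safe = np.where(small, 1.0, x)           # avoid dividing by zero in the unused branch
    return np.where(small, 1.0 - x**2 / 12.0, 2.0 * (1.0 - np.cos(safe)) / safe**2)

def trapezoid(values, grid):
    """Composite trapezoid rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def kernel_integral(a, n=20001):
    """Numerical value of 2 * int_0^1 (1 - u) cos(a u) du, which should equal F(a)."""
    u = np.linspace(0.0, 1.0, n)
    return trapezoid(2.0 * (1.0 - u) * np.cos(a * u), u)

def half_line_mass(upper=500.0, n=500001):
    """Truncated value of int_0^upper F(2 pi w) dw; the full half-line integral is 1/2."""
    w = np.linspace(0.0, upper, n)
    return trapezoid(F(2.0 * np.pi * w), w)

if __name__ == "__main__":
    print(kernel_integral(3.0), float(F(3.0)))
    print(half_line_mass())
```

The truncation at a finite upper limit leaves out a tail of order $1/(2\pi^2\,\texttt{upper})$, so the printed half-line value sits slightly below one half.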
The reason is that the random component of the cross-shear transport will render $\tilde{R}(k,\omega,t)$ diffusive at all nonzero wavenumbers and frequencies, so the whole wavenumber-frequency domain contributes to the effective diffusion constant.

For the benefit of the reader interested in some of the mathematical details of the computation, we remark that the rigorous asymptotic computation of $\sigma_Y^2(t)$ with $\bar{w}\neq 0$ is more arduous than in the $\bar{w}=0$ case presented in detail above. There are transient contributions in the integral representation (170) for $\sigma_Y^2(t)$ which cannot be controlled by a uniform, time-independent bounding function. Their contribution must instead be separately estimated and shown to be subdominant. This can be accomplished with sufficient care, but we do not present these details here because they would require too much space.

3.4. Large-scale effective equations for mean statistics and departures from standard eddy diffusivity theory

We now shift our focus from the description of the mean-square displacement of a single tracer to the mean concentration of tracers (or other passive scalar quantity). These points of view are related in that the mean passive scalar density $\langle T(\mathbf{x},t)\rangle$ evolving from a concentrated source at $\mathbf{x}_0$ is exactly equal to the full probability distribution function (PDF) for the location $\mathbf{X}(t)$ of a single tracer starting from $\mathbf{x}_0$ [164]. The mean passive scalar density $\langle T(\mathbf{x},t)\rangle$ is therefore determined not
only by the mean-square displacement of a single tracer, but by the higher order statistics of its motion as well. We recall that it is not possible in general to obtain a closed PDE for $\langle T(\mathbf{x},t)\rangle$ by naively averaging the advection-diffusion equation

$$\partial T(\mathbf{x},t)/\partial t + \mathbf{v}(\mathbf{x},t)\cdot\nabla T(\mathbf{x},t) = \kappa\,\Delta T(\mathbf{x},t) ,$$

because of the statistically nonlinear coupling between $\mathbf{v}(\mathbf{x},t)$ and $T(\mathbf{x},t)$. We did show through homogenization theory in Section 2, however, that for certain periodic velocity fields and random velocity fields with short-range correlations, the mean passive scalar density does obey an effective diffusion equation at large scales and long times. Namely, if the initial data is rescaled to vary on large scales, $T_0^\delta(\mathbf{x}) \equiv T_0(\delta\mathbf{x})$ (with $\delta>0$ small), then the mean of the resulting passive scalar field $T^\delta(\mathbf{x},t)$ converges under a diffusive rescaling,

$$\bar{T}(\mathbf{x},t) = \lim_{\delta\to 0}\langle T^\delta(\mathbf{x}/\delta,\ t/\delta^2)\rangle . \tag{187}$$

This large-scale, long-time limit is in these cases the unique solution of a constant-coefficient diffusion equation

$$\partial\bar{T}(\mathbf{x},t)/\partial t = \nabla\cdot(K^*\nabla\bar{T}(\mathbf{x},t)), \qquad \bar{T}(\mathbf{x},t=0) = T_0(\mathbf{x}) , \tag{188}$$

where $K^*$ is a constant, symmetric, positive-definite matrix depending on the velocity field $\mathbf{v}(\mathbf{x},t)$ and the molecular diffusivity $\kappa$, and can be obtained from the solution of an associated "cell problem" (86).

It can be shown, using the arguments in the appendices to [10,14], that such a homogenized description holds throughout the diffusive regimes in both the Random Steady Shear Model (Section 3.2) and the Random Spatio-Temporal Shear Model (Section 3.3), at least when $\kappa>0$ and the cross sweep velocity $w(t)$ is a (zero or nonzero) deterministic constant. It is an open problem whether the Simple Shear Models adhere to a homogenized description at large scales and long times within the diffusive regimes when $\kappa=0$ or a randomly fluctuating cross sweep $w_f(t)$ is present.

We explore now some ways in which the effective diffusivity picture is altered when the velocity field has long-range spatial correlations which violate the conditions needed to apply standard homogenization theory. The Simple Shear Models developed in Sections 3.2 and 3.3 will be used for explicit illustrations.

First of all, the diffusively linked rescaling of space and time in (187) clearly will not do for superdiffusive regimes. Instead, the large-scale, long-time behavior is captured by an appropriate choice of rescaling function $\rho(\delta)$ which vanishes as $\delta\to 0$, and for which

$$\bar{T}(\mathbf{x},t) = \lim_{\delta\to 0}\left\langle T^\delta\!\left(\frac{\mathbf{x}}{\delta},\ \frac{t}{\rho(\delta)}\right)\right\rangle$$

has a nontrivial limit. Diffusive rescaling corresponds to $\rho(\delta)=\delta^2$.
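As a short worked illustration of how the rescaling function is selected, consider a tracer whose mean-square displacement follows a pure power law at long times. This is a sketch under that stated assumption only; the exponent $\gamma$ and the prefactor $C$ are illustrative placeholders rather than quantities defined in the text.

```latex
% Assumption: \sigma^2(t) \sim C\,t^{\gamma} at long times (C and \gamma are placeholders).
% Rescaled variables: x' = \delta x, \; t' = \rho(\delta)\,t.
\begin{align*}
\delta^{2}\,\sigma^{2}\!\left(\frac{t'}{\rho(\delta)}\right)
  &\sim C\,\delta^{2}\,\rho(\delta)^{-\gamma}\,t'^{\,\gamma}, \\
\text{finite, nonzero limit as } \delta \to 0
  &\iff \rho(\delta) \propto \delta^{2/\gamma}.
\end{align*}
% \gamma = 1 recovers the diffusive rescaling \rho(\delta) = \delta^{2};
% \gamma > 1 (superdiffusion) gives an exponent 2/\gamma smaller than 2.
```

In words: the rescaled displacement variance stays of order one only if time is stretched at exactly the rate dictated by the growth law of the mean-square displacement.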
As we shall see in Sections 3.4.1 and 3.4.2, the rescaled (or "renormalized") limit $\bar{T}(\mathbf{x},t)$ of the mean passive scalar field cannot in general be expressed as the solution of a (local) PDE [10,16]. Therefore, we sometimes utilize the more general framework of Green's functions. Because of the linearity of the advection-diffusion equation and the prevailing assumption that the initial passive scalar data is independent of the statistics of the velocity field, we can always express the mean of passive scalar statistics as an integral of the mean initial data against some kernel, or Green's function, $P_t$:

$$\langle T^\delta(\mathbf{x},t)\rangle = \int_{\mathbb{R}^d} P_t(\mathbf{x},\mathbf{x}')\,\langle T_0^\delta(\mathbf{x}')\rangle\,d\mathbf{x}' . \tag{189}$$

When the velocity field is statistically homogeneous, as we shall assume here, the Green's function only depends on the difference between the spatial locations of the "source" and the "target", and we can express $P_t(\mathbf{x},\mathbf{x}')$ more simply as $P_t(\mathbf{x}-\mathbf{x}')$. The reason we write the Green's function in this way is that $P_t(\mathbf{x})$ is exactly the probability distribution function (PDF) of the displacement $\mathbf{X}(t)-\mathbf{x}_0$ of a single tracer. The renormalized mean statistics are described by a renormalized Green's function

$$\bar{T}(\mathbf{x},t) = \int_{\mathbb{R}^d} \bar{P}_t(\mathbf{x}-\mathbf{x}')\,\langle T_0(\mathbf{x}')\rangle\,d\mathbf{x}' ,$$

and this renormalized Green's function characterizes the long-time asymptotic PDF for the displacement of a single tracer:

$$\bar{P}_t(\mathbf{x}) = \lim_{\delta\to 0}\delta^{-d}\,P_{t/\rho(\delta)}\!\left(\frac{\mathbf{x}}{\delta}\right) . \tag{190}$$

The renormalized Green's function associated with the homogenized PDE (188) is the standard Gaussian heat kernel:

$$\bar{P}_t(\mathbf{x}) = \frac{e^{-\mathbf{x}\cdot (K^*)^{-1}\cdot\mathbf{x}/4t}}{(4\pi t)^{d/2}(\mathrm{Det}\,K^*)^{1/2}} ,$$

reflecting the fact that the tracer displacement relaxes to a Gaussian distribution at long times.

In Section 3.4.1, we present the long-time, large-scale limit of a passive scalar field evolving under the Random Steady Shear (RSS) model with positive molecular diffusion $\kappa>0$. For the parameter range $0<\varepsilon<2$, the tracer motion is superdiffusive and an anomalous scaling $\rho(\delta)=\delta^{4/(2+\varepsilon)}$, which differs from the diffusive scaling $\delta^2$, is needed to obtain a finite limit. The associated large-scale, long-time Green's function may be represented as the average of an explicit functional of Brownian motion, but is not the solution to any (local) PDE [10]. Moreover, as we shall discuss in Section 3.4.2, this anomalous Green's function behavior persists when these steady shear flows are perturbed by a general class of short-ranged two-dimensional random flows [16]. In Section 3.4.3 we modify the Random Spatio-Temporal Shear (RSTS) model to include an important feature of real-world turbulence at high Reynolds number: a self-similar inertial range of scales. We discuss some issues pertaining to the computation of effective large-scale passive scalar behavior in high Reynolds number turbulent flows in the context of these shear flow models, and report on the results of some rigorous work along these lines [10,14]. Finally, in Section 3.4.4 we
provide some explicit examples in which the Green's function for the mean passive scalar density solves a time-dependent diffusion equation in which the diffusion coefficient is negative over finite time intervals.

3.4.1. Large-scale, long-time Green's functions in steady random shear flow

We return to the Random Steady Shear Flow Model from Section 3.2, with no cross sweeping flow:

$$\mathbf{v}(x,y,t) = \begin{pmatrix} 0 \\ v(x) \end{pmatrix} ,$$

where the random shear flow $v(x)$ has energy spectrum (109), in which the low wavenumber asymptotics of $E(k)$ are given by $\lim_{k\to 0}E(k)\sim A_E|k|^{1-\varepsilon}$ with $\varepsilon<2$. Molecular diffusion $\kappa>0$ is assumed to be active.

We computed in Section 3.2.1 that the shear-parallel tracer displacement $\sigma_Y^2(t)\equiv\langle(Y(t)-y_0)^2\rangle$ is diffusive for $\varepsilon<0$ and superdiffusive for $0<\varepsilon<2$; see Table 2. We now consider aspects of the full probability distribution function (PDF) $P_t(x,y)$ of the tracer displacement at long time, as reflected in the renormalized Green's function (190).

3.4.1.1. Diffusive regime.

For the diffusive regime $\varepsilon<0$, the condition

$$\int_0^\infty \frac{E(k)}{k^2}\,dk < \infty$$

is satisfied, and the homogenization theory for incompressible velocity fields with short-range correlations applies (see Section 2.4.2). Consequently, the renormalized Green's function (with diffusive rescaling $\rho(\delta)=\delta^2$) is a Gaussian:

$$\bar{P}_t(x,y) = \frac{\exp\!\left(-(x^2/4\kappa t)-(y^2/4K_\varepsilon^* t)\right)}{4\pi\sqrt{\kappa K_\varepsilon^*}\,t} , \tag{191}$$

where the diffusivity in the $x$ direction is the bare molecular value $\kappa$, whereas the diffusivity in the $y$ direction is the turbulence-enhanced value $K_\varepsilon^*$ stated in Eq. (115). The mean passive scalar density therefore rigorously satisfies an ordinary, constant-coefficient diffusion equation when rescaled to large scales and long times as in Eq. (187):

$$\frac{\partial\bar{T}(x,y,t)}{\partial t} = \kappa\,\frac{\partial^2\bar{T}(x,y,t)}{\partial x^2} + K_\varepsilon^*\,\frac{\partial^2\bar{T}(x,y,t)}{\partial y^2} , \qquad \bar{T}(x,y,t=0) = T_0(x,y) .$$

Moreover, the behavior of a single tracer is self-averaging for $\varepsilon<0$, in that the large-scale, long-time PDF of a tracer, conditioned on a single realization of the velocity field, is (almost) always identical to the asymptotic Gaussian distribution (191) characterizing the ensemble-averaged behavior [16]. In other words, the tracer's motion in (almost) any given environment effectively samples the statistics of the entire ensemble after some finite time.
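The Gaussian form (191) can be sanity-checked numerically. The sketch below is an illustration rather than anything from the original text: it evaluates the anisotropic heat kernel on a grid and confirms that it integrates to one with second moments $2\kappa t$ and $2K_\varepsilon^* t$. The parameter values, grid resolution and function names are arbitrary assumptions.

```python
import numpy as np

def gaussian_green(x, y, t, kappa, K_star):
    """Anisotropic Gaussian kernel of the form (191): diffusivity kappa in x, K_star in y."""
    norm = 4.0 * np.pi * np.sqrt(kappa * K_star) * t
    return np.exp(-x**2 / (4.0 * kappa * t) - y**2 / (4.0 * K_star * t)) / norm

def grid_moments(t, kappa, K_star, half_width=12.0, n=601):
    """Return (total mass, <x^2>, <y^2>) of the kernel from a Riemann sum on a tensor grid."""
    sx = np.sqrt(2.0 * kappa * t)            # standard deviation along x
    sy = np.sqrt(2.0 * K_star * t)           # standard deviation along y
    x = np.linspace(-half_width * sx, half_width * sx, n)
    y = np.linspace(-half_width * sy, half_width * sy, n)
    X, Y = np.meshgrid(x, y, indexing="ij")
    P = gaussian_green(X, Y, t, kappa, K_star)
    dA = (x[1] - x[0]) * (y[1] - y[0])
    return P.sum() * dA, (X**2 * P).sum() * dA, (Y**2 * P).sum() * dA

if __name__ == "__main__":
    mass, mx2, my2 = grid_moments(t=3.0, kappa=0.1, K_star=2.5)
    print(mass, mx2, my2)                    # compare with 1, 2*kappa*t, 2*K_star*t
```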
3.4.1.2. Superdiffusive regime.

The behavior of a tracer in a random shear flow with $0<\varepsilon<2$ has many anomalies which contrast sharply with this simple homogenized picture. Recall that the "typical" class of random shear flows with finite energy at low wavenumbers corresponds to $\varepsilon=1$, and belongs to this anomalous class.

First of all, we have shown in Section 3.2.1 that the shear-parallel tracer motion is superdiffusive for $0<\varepsilon<2$:

$$\lim_{t\to\infty}\sigma_Y^2(t) \sim \frac{4}{2+\varepsilon}\,K_\varepsilon\,t^{1+\varepsilon/2} ,$$

where the constant $K_\varepsilon$ is described in Eq. (115). The appropriate rescaling of time to capture the large-scale behavior of the tracer displacement in Eq. (190) is therefore $\rho(\delta)=\delta^{4/(2+\varepsilon)}$. Under this space-time rescaling, the diffusive tracer motion along the $x$ direction is negligible, and it factors out of the renormalized Green's function as a delta function:

$$\bar{P}_t(x,y) = \delta(x)\,\bar{P}_t^Y(y) .$$

One could of course retain the cross-shear dynamics by choosing an anisotropic scaling of space, but our interest here is only in the PDF for the tracer displacement along the shear, which at long times is described by $\bar{P}_t^Y(y)$.

An explicit formula for this renormalized Green's function was derived rigorously in [10]:

$$\bar{P}_t^Y(y) = \left\langle \frac{\exp\!\left(-\dfrac{y^2}{4(2/(2+\varepsilon))K_\varepsilon\,\alpha\,t^{1+\varepsilon/2}}\right)}{\sqrt{4\pi(2/(2+\varepsilon))K_\varepsilon\,\alpha\,t^{1+\varepsilon/2}}} \right\rangle_{\!\alpha} . \tag{192}$$

The outer brackets denote an expectation over the random variable
2C\ a, F (=(s)!=(s)) ds ds , C C(!(e#2)/2) where =(t) is a standard Brownian motion. The function F (y) appearing in the integral is given by C sgn(y) C for 0(e(1 or 1(e(2 , F (y)" e OW"q"\C dq" C "y"\C C C d(y) for e"1 , \ C where 2 (1!e)p ! sin C(2!e) for 0(e(1 or 1(e(2 , p 2 C" 1 C for e"1 . (2p
Without the ensemble average over $\alpha$, the renormalized Green's function $\bar{P}_t^Y(y)$ would be a Gaussian distribution. But the averaging over $\alpha$ implies that $\bar{P}_t^Y(y)$ is in fact the probability distribution for a mixture of (mean zero) Gaussian random variables, and is therefore necessarily a broader-than-Gaussian distribution (see Paragraph 5.2.2.1). That is, the tracer is more likely to make large
excursions along the shear, relative to its root-mean-square displacement, than a Gaussian random process with the same mean-square displacement law would. Through analytical considerations and numerical simulations, Zumofen et al. [353,354] identify the source of non-Gaussianity more specifically as coming from the variability in the range of the Brownian motion across the shear.

The renormalized Green's function given by Eq. (192) cannot be represented as the solution to a PDE of diffusion type or even any local PDE of the form

$$\partial\bar{P}_t^Y(y)/\partial t = Q(\partial/\partial y)\,\bar{P}_t^Y(y) ,$$

where $Q(\cdot)$ is a polynomial. Therefore, we say that the renormalized mean statistics $\bar{T}(x,y,t)$ obey a nonlocal equation within the regime $0<\varepsilon<2$. One appealing formulation of this nonlocal equation for $\bar{T}(x,y,t)$ uses the notion of a random diffusivity. The conditional Green's function

$$\bar{P}_t^Y(y|\alpha) \equiv \frac{\exp\!\left(-\dfrac{y^2}{4(2/(2+\varepsilon))K_\varepsilon\,\alpha\,t^{1+\varepsilon/2}}\right)}{\sqrt{4\pi(2/(2+\varepsilon))K_\varepsilon\,\alpha\,t^{1+\varepsilon/2}}} \tag{193a}$$

is the fundamental solution of the spatially constant-coefficient diffusion equation

$$\partial\bar{P}_t^Y(y|\alpha)/\partial t = K_\varepsilon\,\alpha\,t^{\varepsilon/2}\,\partial^2\bar{P}_t^Y(y|\alpha)/\partial y^2 , \tag{193b}$$

$$\bar{P}_{t=0}^Y(y|\alpha) = \delta(y) \tag{193c}$$

with time-dependent diffusion coefficient $K_\varepsilon\alpha t^{\varepsilon/2}$. Therefore, the renormalized Green's function may be viewed as an average over solutions to such diffusion equations with a random factor $\alpha$ appearing in the diffusivity. The behavior of the PDF of $\alpha$ as a function of $\varepsilon$ is discussed in [10].

While Eq. (192) is an explicit representation of the renormalized Green's function, it would be interesting to identify some properties of $\bar{P}_t^Y(y)$, such as the form of the tail region $y^2\gg K_\varepsilon t^{1+\varepsilon/2}$, in a more transparent manner. Some numerical simulations and formal large deviation arguments [46,280,353], as well as some partial exact results from a quantum mechanical analogy [194], suggest that, for $\varepsilon=1$, the tails of the renormalized Green's function have a stretched exponential form

$$\lim_{y^2/(K_\varepsilon t^{1+\varepsilon/2})\to\infty}\bar{P}_t^Y(y) \sim t^{-(2+\varepsilon)/4}\exp\!\left(-\left(|y|/t^{(2+\varepsilon)/4}\right)^{\delta}\right) ,$$

with $\delta=4/3$ and possibly some power-law prefactor in the large parameter. It would be interesting to rigorously derive such a result using the method of large deviations [331], and to describe the shape of the tails of $\bar{P}_t^Y(y)$ as a function of $\varepsilon$. We note that explicit quantitative expressions for several moments of the renormalized Green's function for $\varepsilon=1$ may be extracted from the exact finite-time formulas in [353,354] for the moments of the shear-parallel displacement in a limiting version of the independent channel model with infinitely thin channels.

Departure from self-averaging. This representation of the renormalized Green's function suggests, but does not prove, that at long times, the squared tracer displacement along the shear in a single realization may grow as $t^{1+\varepsilon/2}$, but with a realization-dependent prefactor. In other words, the motion of a single tracer may not be self-averaging, in that the behavior of a typical single
realization does not resemble the ensemble-averaged behavior. This issue has been explored in more depth in [46,280] for a random shear flow which takes the form of an array of independent channels, as discussed in Section 3.2.1. These models have energy spectra which are finite at the origin and so belong to the class $\varepsilon=1$. Bouchaud et al. [46,280] demonstrated that the motion of a single tracer is not self-averaging for the independent stratified channel model by computing the quantity

$$\lim_{t\to\infty}\left\langle\langle Y(t)-y_0\rangle_W^2\right\rangle_v \sim (\sqrt{2}-1)\,\frac{4A_E}{3\sqrt{\pi\kappa}}\,t^{3/2} . \tag{194}$$

This is to be compared with the formula for the mean-square displacement for $\varepsilon=1$:

$$\lim_{t\to\infty}\sigma_Y^2(t) \equiv \lim_{t\to\infty}\langle(Y(t)-y_0)^2\rangle \sim \frac{4A_E}{3\sqrt{\pi\kappa}}\,t^{3/2} . \tag{195}$$

To interpret Eq. (194), note first that $M_Y(t|v,y_0) = \langle Y(t)-y_0\rangle_W$ is the mean displacement of a tracer, averaged over Brownian motion but in a fixed realization of the random shear environment and fixed starting location $y_0$. This random variable $M_Y(t|v,y_0)$ describes exactly the displacement of the center of mass of a cloud of tracers initially concentrated at the specified point $y_0$, then subsequently moving in a common random shear environment but with independent Brownian motions. Clearly, $M_Y(t|v,y_0)$ has mean zero when averaged over all random shear configurations $v(x)$. The important point of the above results is that the displacement of the center of mass of the cloud, in a given realization of the random shear environment, will be of the same order of magnitude as the typical displacement of a tracer. On the other hand, the mean tracer concentration density, averaged over the ensemble of random shears, will have its center of mass fixed at $y_0$ and spread symmetrically about it. Therefore, the evolution of an initially concentrated cloud observed in a given environment does not resemble the ensemble-averaged behavior, and we say that the evolution of a concentrated cloud (and also of a single tracer) is not self-averaging.

This conclusion is also reached from a different direction for any value of $\varepsilon$ with $0<\varepsilon<2$ in [10], in which it is demonstrated that the renormalized $N$th order moments of the rescaled passive scalar field $T^\delta(x/\delta, y/\delta, t/\rho(\delta))$ do not coincide with $\bar{T}^N(x,y,t)$. The loss of self-averaging may be attributed to the long-range correlations of the Lagrangian tracer velocity which arise due to sufficiently strong long-range spatio-temporal correlations in the random shear flow.
This violation of self-averaging for an initially concentrated cloud lends some support to the idea that the tracer may indeed behave at long times in a given realization as if it had a time-dependent diffusivity $K_\varepsilon\alpha t^{\varepsilon/2}$ with a random factor $\alpha$, as suggested by the representation (193). It would be interesting to determine whether this random diffusivity picture is literally valid.

3.4.2. Persistence of non-Gaussian Green's function for nearly stratified steady flows

We next report on some work which indicates the robustness of the anomalous renormalized Green's function (192) under perturbations of the Random Steady Shear Model. As we now discuss, this renormalized Green's function actually characterizes the large-scale, long-time behavior of the statistics of a tracer in a rather broad class of random steady flows with an approximately stratified structure.
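The claim that an average of Gaussians over a random diffusivity factor is broader than Gaussian can be illustrated with a small Monte Carlo experiment. The sketch below is hypothetical in one important respect: the true law of $\alpha$ is characterized in [10] and is not reproduced here, so a gamma distribution with unit mean is used purely as a stand-in. For any such mixture the kurtosis equals $3\langle\alpha^2\rangle/\langle\alpha\rangle^2\geq 3$; with the stand-in gamma law of shape 2 this is 4.5.

```python
import numpy as np

def sample_mixture(n, K=1.0, t=1.0, eps=1.0, shape=2.0, seed=0):
    """Draw n displacements from a Gaussian scale mixture patterned on (193a).

    alpha ~ Gamma(shape, 1/shape) has mean 1; this choice is an assumption made
    only for illustration, not the distribution derived in the literature.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.gamma(shape, 1.0 / shape, size=n)
    # Conditional variance of the Gaussian (193a): 2 * (2/(2+eps)) * K * alpha * t^(1+eps/2)
    variance = 2.0 * (2.0 / (2.0 + eps)) * K * alpha * t ** (1.0 + eps / 2.0)
    return np.sqrt(variance) * rng.standard_normal(n)

def kurtosis(samples):
    """Fourth central moment divided by the squared variance (equals 3 for a Gaussian)."""
    centered = samples - samples.mean()
    return float(np.mean(centered**4) / np.mean(centered**2) ** 2)

if __name__ == "__main__":
    y = sample_mixture(400_000)
    print("mean square:", float(np.mean(y**2)))   # compare with (4/(2+eps)) * K * t^(1+eps/2)
    print("kurtosis   :", kurtosis(y))            # exceeds the Gaussian value 3
```

Because the stand-in $\alpha$ has unit mean, the mixture reproduces the ensemble mean-square displacement law while still having heavier tails than a single Gaussian with that variance.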
First of all, the assumption of Gaussian statistics of the random shear flow can be considerably relaxed. It has been demonstrated rigorously in [14] that the long-time PDF of a tracer position converges to Eq. (192) for several classes of non-Gaussian shear models with energy spectra as in Eq. (109). Included in these classes of non-Gaussian models are the independent channel model [46,279,353] discussed in Section 3.2.1 and shear flow versions of models with translational disorder as studied in [12,19,167]. In effect, there is a central limit theorem in operation as the tracer explores its random environment.

Secondly, Avellaneda and the first author established [16] that renormalized Green's functions of the form (192) arise generically for a class of steady random velocity fields which are "nearly stratified", in the sense that they have the structure

$$\mathbf{v}(x,y) = \begin{pmatrix} 0 \\ v(x) \end{pmatrix} + \mathbf{u}(x,y)$$

of a Gaussian homogeneous random steady shear flow $v(x)$ with energy spectrum (109) perturbed by a two-dimensional random homogeneous velocity field

$$\mathbf{u}(x,y) = \begin{pmatrix} u_x(x,y) \\ u_y(x,y) \end{pmatrix}$$

of the type with short-range correlations discussed in Section 2. More precisely, the random perturbation field $\mathbf{u}(x,y)$ is assumed to be periodic in $y$ (with period 1) and statistically homogeneous along the $x$ direction, and the Fourier transform of its correlation tensor,

$$\hat{R}_m(k) \equiv \int_{-\infty}^{\infty}\!\int_0^1 e^{-2\pi i(kx+my)}\,R(x,y)\,dy\,dx , \qquad R(x,y) \equiv \langle \mathbf{u}(x',y')\otimes\mathbf{u}(x'+x,\,y'+y)\rangle ,$$

is assumed to satisfy the condition

$$\sum_{m=-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\|\hat{R}_m(k)\|}{k^2+m^2}\,dk < \infty .$$

This last condition guarantees that $\mathbf{u}(x,y)$ obeys the conditions for homogenization theory, so that the statistical motion of a tracer due to the perturbation field $\mathbf{u}(x,y)$ alone would be Gaussian and diffusive at long times. Note that $\mathbf{u}(x,y)$ is allowed to be a deterministic, periodic flow as a special case. The superposition of the random shear with the two-dimensional perturbation $\mathbf{u}(x,y)$ corresponds to flow in a stratified heterogeneous porous medium, with the perturbations $\mathbf{u}(x,y)$ varying only over short wavelengths.

Intuitively, we expect the solution of the equations of motion for the tracer

$$dX(t) = u_x(X(t),Y(t))\,dt + \sqrt{2\kappa}\,dW_x(t) ,$$

$$dY(t) = v(X(t))\,dt + u_y(X(t),Y(t))\,dt + \sqrt{2\kappa}\,dW_y(t)$$
to behave as follows: on a coarse scale, the cross-shear component of the path, $X(t)$, will approach (statistically) a Brownian motion. In this case, the cross-shear transport would be of diffusive type, and from our discussion in Sections 3.2.1 and 3.2.3, the random shear should produce superdiffusive scaling $\sigma_Y^2(t)\sim t^{1+\varepsilon/2}$ for $0<\varepsilon<2$. The contribution from $u_y$ to the shear-parallel transport can be expected to be diffusive, and thus negligible relative to the contribution from the random shear. Therefore, the tracer in this nearly stratified system should behave in much the same way as in the Random Steady Shear Model with molecular diffusion, and the renormalized Green's function, using the same rescaling function $\rho(\delta)=\delta^{4/(2+\varepsilon)}$, should be of the same form (192). One change, however, is that the presence of the perturbation field $u_x(x,y)$ will enhance the cross-shear motion so that the effective shear-transverse diffusion constant at long time will be some value $K_x^*$ greater than the bare molecular value $\kappa$. The factors of $\kappa$ appearing in the formula for the renormalized Green's function stem from the shear-transverse diffusion, so they should self-consistently be replaced by the enhanced shear-transverse diffusivity $K_x^*$.

This picture is rigorously justified in [16]. Moreover, the effective shear-transverse diffusivity $K_x^*$ can be obtained by solving a cell problem in the sense of homogenization theory, as in Section 2. Namely, let $\chi_x(x,y)$ be a suitable solution of

$$\kappa\,\Delta\chi_x(x,y) - u_x(x,y)\,\frac{\partial\chi_x(x,y)}{\partial x} - [v(x)+u_y(x,y)]\,\frac{\partial\chi_x(x,y)}{\partial y} = u_x(x,y) .$$

Then $K_x^*$ is given by

$$K_x^* = \kappa\,(1+\langle|\nabla\chi_x|^2\rangle) .$$

The proof hinges on showing that an effective separation of scales between the diffusive cross-shear motion $X(t)$ and the superdiffusive shear-parallel motion $Y(t)$ exists. Namely, the superposition models should have the property that the $X(t)$ motion achieves its asymptotic statistical behavior on a time scale which is short with respect to the time scale on which superdiffusion occurs.
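The tracer equations of motion discussed above can be explored with a simple Euler–Maruyama integration. The sketch below is a toy illustration under loudly stated assumptions: the perturbation field is set to zero (pure shear), and the shear profile is a hypothetical finite sum of random Fourier modes with uniformly distributed wavenumbers, which is not the spectrum (109) of the text. All names and parameter values are placeholders.

```python
import numpy as np

def make_shear(rng, n_modes=64, k_max=4.0):
    """Build one realization of a toy random shear v(x) from random Fourier modes.

    Hypothetical construction: wavenumbers uniform on [0, k_max], Gaussian amplitudes,
    normalized so that the ensemble variance of v(x) is 1.
    """
    k = rng.uniform(0.0, k_max, n_modes)
    a = rng.normal(size=n_modes)
    b = rng.normal(size=n_modes)
    amp = np.sqrt(1.0 / n_modes)

    def v(x):
        phase = 2.0 * np.pi * np.outer(x, k)          # shape (n_tracers, n_modes)
        return amp * (np.cos(phase) @ a + np.sin(phase) @ b)

    return v

def simulate(v, kappa, dt, n_steps, n_tracers, rng):
    """Euler-Maruyama for dX = sqrt(2 kappa) dW_x, dY = v(X) dt + sqrt(2 kappa) dW_y."""
    X = np.zeros(n_tracers)
    Y = np.zeros(n_tracers)
    noise = np.sqrt(2.0 * kappa * dt)
    for _ in range(n_steps):
        Y += v(X) * dt + noise * rng.normal(size=n_tracers)   # uses X at the start of the step
        X += noise * rng.normal(size=n_tracers)
    return X, Y

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    v = make_shear(rng)
    X, Y = simulate(v, kappa=0.5, dt=0.01, n_steps=200, n_tracers=4000, rng=rng)
    print("var X:", X.var(), "(pure diffusion predicts 2*kappa*t = 2.0)")
    print("mean square Y in this realization:", np.mean(Y**2))
```

All tracers here share one frozen shear realization, so the printed shear-parallel statistic is a single-environment quantity; averaging over many calls to `make_shear` would be needed to approximate an ensemble average.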
Moreover, a crucial ingredient in this self-consistency argument is that sample-to-sample fluctuations in the velocity statistics should not affect the $X(t)$ motion; that is, the cross-shear motion should be completely self-averaging. Some of the above technical assumptions can be relaxed without a change in the conclusion; see [16] for full details. It is also shown in that paper how the superdiffusive space-time scaling of the renormalized Green's function may be calculated for nearly stratified flows through a rigorous diagrammatic resummation of a perturbation expansion of the advection-diffusion equation with respect to the advective term. There is also interesting theoretical and computational work by Avellaneda et al. [8] with non-Gaussian eddy diffusivity equations for special steady random velocity fields defined by the "Manhattan grid".

3.4.3. Renormalized Green's functions for turbulent shear flows with very long-range spatio-temporal correlations

We have so far been considering tracer transport in a variety of random shear velocity fields, and found that the character of the long-range spatial and temporal correlations in the velocity field plays a crucial role in determining the long-time behavior of an immersed tracer. Fully developed turbulent velocity fields at high Reynolds number have very strong long-range spatio-temporal correlations, and we have seen from explicit shear flow examples that tracers can exhibit a variety
of subtle and anomalous behavior in such flows. Long-range correlations are manifested in fully developed turbulence in a distinctive way: a wide range of scales, known as the inertial range, over which the velocity field exhibits statistical self-similarity. We first describe this feature, then seek to mimic it within our general class of Gaussian random shear flows.

3.4.3.1. Inertial range of fully developed turbulent flow.

The inertial range was first predicted theoretically by Kolmogorov [169] in 1941 as a characteristic of a fully developed turbulent system which is maintained in a statistically stationary state (or "quasi-equilibrium") through external forcing at some characteristic length scale $L_0$ and dissipation by viscosity $\nu$. Kolmogorov utilized an intuitive picture of turbulence, introduced by Richardson [284], in which energy cascades from the relatively large scales at which it is injected down to ever smaller scales until viscous dissipation of energy ultimately sets in. The cascade process arises from nonlinear interactions between various turbulent velocity fluctuations, or "eddies", which tear apart large-scale structures into smaller ones. Kolmogorov introduced the key hypothesis that, away from boundaries, the small-scale structure of the flow should be largely independent of the large-scale details, due to the mixing nature of the cascade. More precisely, he hypothesized that the only information communicated from the large scales is the rate of energy injection, which must be equal to the rate of energy dissipation $\bar{\varepsilon}$ in a statistically stationary state. The small-scale statistics ($r\ll L_0$) of turbulence are therefore said to be universal in the Kolmogorov theory, since they do not depend on any of the large-scale details other than the overall energy flux.

From this universality hypothesis he deduced the existence of a distinguished length scale $L_K = (\nu^3/\bar{\varepsilon})^{1/4}$, which we will call the Kolmogorov dissipation length scale. He hypothesized that viscosity $\nu$ plays an important role only on scales smaller than or comparable to $L_K$; on larger scales $r\gg L_K$, the velocity statistics should be independent of $\nu$. The Kolmogorov dissipation length scale $L_K$ scales with Reynolds number Re as $L_K\sim L_0\,\mathrm{Re}^{-3/4}$, so at sufficiently large Reynolds number there exists an inertial range of scales $L_K\ll r\ll L_0$ within which the velocity field statistics should depend only on the scale of interest $r$ and the energy dissipation rate $\bar{\varepsilon}$. From these hypotheses and dimensional analysis follows Kolmogorov's "two-thirds law":

$$\langle|\mathbf{v}(\mathbf{x}+\mathbf{r},t)-\mathbf{v}(\mathbf{x},t)|^2\rangle = \tilde{C}_K\,\bar{\varepsilon}^{2/3}|\mathbf{r}|^{2/3} \quad\text{for } L_K\ll|\mathbf{r}|\ll L_0 , \tag{196}$$

where $\tilde{C}_K$ is a universal numerical constant. The fact that the mean-square velocity difference over a distance $r$ vanishes at a slower rate than $r^2$ implies that the velocity fluctuations are not smooth when viewed with a resolution within the inertial range of scales. The velocity field is in fact a fractal field [215,305] on scales within the inertial range. We return to this fractal aspect of turbulence in Sections 3.5 and 6.

It is often convenient to formulate the Kolmogorov theory in terms of the spectral density of energy $E(k)$ in wavenumber space. This energy spectrum is defined for general multidimensional flows with the same physical meaning as in our models with scalar random fields: $\int_{k-\Delta k}^{k+\Delta k}E(k')\,dk'$ is the energy of the velocity field which would be measured if all fluctuations were filtered out except those with wavevector magnitudes within the band $k\pm\Delta k$ (see Paragraph 2.4.5.3). By analogous arguments, or through Fourier scaling, one arrives at Kolmogorov's "five-thirds law":

$$E(k) = C_K\,\bar{\varepsilon}^{2/3}k^{-5/3} \quad\text{for } L_0^{-1}\ll k\ll L_K^{-1} . \tag{197}$$
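The dimensional relations in this paragraph can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it assumes the Reynolds number is defined as $\mathrm{Re}=(\bar{\varepsilon}L_0)^{1/3}L_0/\nu$, i.e. with the large-scale velocity estimated as $(\bar{\varepsilon}L_0)^{1/3}$ (a common convention, not stated in the text), and it uses the value $C_K\approx 1.5$ as a default. The numerical inputs are arbitrary.

```python
def kolmogorov_length(nu, eps_bar):
    """Kolmogorov dissipation length L_K = (nu^3 / eps_bar)^(1/4)."""
    return (nu**3 / eps_bar) ** 0.25

def reynolds_number(nu, eps_bar, L0):
    """Re = U * L0 / nu with the assumed large-scale velocity U = (eps_bar * L0)^(1/3)."""
    return (eps_bar * L0) ** (1.0 / 3.0) * L0 / nu

def energy_spectrum(k, eps_bar, C_K=1.5):
    """Inertial-range five-thirds law E(k) = C_K * eps_bar^(2/3) * k^(-5/3)."""
    return C_K * eps_bar ** (2.0 / 3.0) * k ** (-5.0 / 3.0)

if __name__ == "__main__":
    nu, eps_bar, L0 = 1.0e-6, 1.0e-3, 10.0       # arbitrary illustrative values
    L_K = kolmogorov_length(nu, eps_bar)
    Re = reynolds_number(nu, eps_bar, L0)
    print("L_K / L0     :", L_K / L0)
    print("Re^(-3/4)    :", Re ** (-0.75))       # agrees with L_K / L0 under the assumed Re
    print("E(2k) / E(k) :", energy_spectrum(2.0, eps_bar) / energy_spectrum(1.0, eps_bar))
```

With this particular definition of Re the relation $L_K = L_0\,\mathrm{Re}^{-3/4}$ holds exactly; other conventions for the large-scale velocity change only the order-one prefactor.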
The universal numerical constant $C_K$ appearing in this law is called the Kolmogorov constant. When the Reynolds number is large, the regime of power law scaling extends over a range of wavenumbers proportional to $\mathrm{Re}^{3/4}$. This means that the velocity fluctuations are active and self-similar over a wide range of scales. The Kolmogorov inertial range scaling predictions (197) and (196) are quite well established experimentally, with a Kolmogorov constant $C_K\approx 1.5$ (see [307] for a review).

Another consequence of Kolmogorov's theory is the association of an eddy turnover time scale

$$\tau_e(k) = \bar{\varepsilon}^{-1/3}k^{-2/3} \tag{198}$$

to each inertial-range wavenumber $k$. This is just the natural advective time scale of an eddy of size $k^{-1}$, with velocity on the order of $\bar{\varepsilon}^{1/3}k^{-1/3}$. It may also be thought of as an estimate for the time it takes for an inertial range eddy to be torn up into smaller eddies. According to the Kolmogorov hypotheses, the eddy turnover time is the only relevant time scale for an inertial range eddy, and so the spatio-temporal energy spectrum must have the following universal inertial-range form:

$$\tilde{E}(k,\omega) = E(k)\,\phi(\omega\tau_e(k))\,\tau_e(k) = C_K\,\bar{\varepsilon}^{1/3}k^{-7/3}\,\phi(\omega\bar{\varepsilon}^{-1/3}k^{-2/3}) \quad\text{for } L_0^{-1}\ll k\ll L_K^{-1} , \tag{199}$$

where $\phi(\cdot)$ is a universal, nonnegative function with $\int_{-\infty}^{\infty}\phi(\varpi)\,d\varpi = 1$. This complete spatio-temporal Kolmogorov inertial-range theory is much more difficult to assess empirically. A competing theory for the temporal dynamics for the inertial range will be mentioned below.

Mathematical modelling of spatio-temporal spectrum of Kolmogorov type. For the moment, we proceed to construct a mathematical model for an energy spectrum which contains an inertial range conforming to Kolmogorov's theory. The inertial range is described by Eq. (199), and we must only specify the form of $\tilde{E}(k,\omega)$ outside the self-similar scaling range. At very large wavenumbers $k\gg L_K^{-1}$, the velocity fluctuations are rapidly dissipated by viscosity, and the energy spectrum decays rapidly. The energy spectrum must vanish at sufficiently small wavenumbers $k\ll L_0^{-1}$ because of the finite size of the system. We are therefore motivated to define the following model for the spatio-temporal energy spectrum corresponding to a turbulent flow with a Kolmogorov inertial range:

$$\tilde{E}(k,\omega) = E(k)\,\tau(k)\,\phi(\omega\tau(k)) , \tag{200a}$$

where

$$E(k) = C_K\,\bar{\varepsilon}^{2/3}k^{-5/3}\,\psi_0(kL_0)\,\psi_\infty(kL_K) , \tag{200b}$$

$$\tau(k) = \tau_e(k) = \bar{\varepsilon}^{-1/3}k^{-2/3} . \tag{200c}$$

The new parameters and auxiliary functions appearing here have the following meaning and properties:

• $\psi_0$ is the infrared (low wavenumber) cutoff, a smooth, nonnegative function on the positive real axis vanishing in a neighborhood of the origin,
• $\psi_\infty$ is the ultraviolet (high wavenumber) cutoff, a smooth, nonnegative function on the positive real axis, decaying faster than any power, with $\psi_\infty(k) = 1+O(k)$ near the origin,
• $\phi(\varpi)$ describes the temporal component, and is assumed to be a nonnegative, bounded, smooth, even function, decaying at least as fast as $|\varpi|^{-\gamma}$ for some $\gamma>1$. Moreover, we demand that $\phi(0)>0$ and $\int_{-\infty}^{\infty}\phi(\omega)\,d\omega = 1$.
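To make the model spectrum (200) concrete, the sketch below evaluates it for one admissible choice of the auxiliary functions. These particular cutoffs and temporal profile are assumptions chosen only because they satisfy the listed requirements (the text leaves them unspecified): $\psi_0(x)=e^{-1/(x-x_0)}$ for $x>x_0$ and zero otherwise, $\psi_\infty(x)=e^{-x}$, and a Lorentzian $\phi(\varpi)=1/(\pi(1+\varpi^2))$, which decays like $|\varpi|^{-2}$ and integrates to one.

```python
import numpy as np

def psi0(x, x0=0.5):
    """Hypothetical infrared cutoff: smooth, zero for x <= x0, tending to 1 as x grows."""
    x = np.asarray(x, dtype=float)
    d = np.where(x > x0, x - x0, 1.0)        # dummy value avoids dividing by zero below x0
    return np.where(x > x0, np.exp(-1.0 / d), 0.0)

def psi_inf(x):
    """Hypothetical ultraviolet cutoff: exp(-x) = 1 + O(x), decaying faster than any power."""
    return np.exp(-np.asarray(x, dtype=float))

def phi(w):
    """Hypothetical temporal profile: Lorentzian, even, bounded, unit integral."""
    w = np.asarray(w, dtype=float)
    return 1.0 / (np.pi * (1.0 + w**2))

def E(k, eps_bar=1.0, L0=1.0, LK=1e-6, C_K=1.5):
    """Spatial spectrum of the form (200b)."""
    k = np.asarray(k, dtype=float)
    return C_K * eps_bar ** (2 / 3) * k ** (-5 / 3) * psi0(k * L0) * psi_inf(k * LK)

def tau(k, eps_bar=1.0):
    """Correlation time of the form (200c)."""
    return eps_bar ** (-1 / 3) * np.asarray(k, dtype=float) ** (-2 / 3)

def E_tilde(k, omega, **kw):
    """Spatio-temporal spectrum of the form (200a)."""
    eps_bar = kw.get("eps_bar", 1.0)
    return E(k, **kw) * tau(k, eps_bar) * phi(omega * tau(k, eps_bar))

if __name__ == "__main__":
    k1, k2 = 1.0e2, 1.0e3                    # wavenumbers well inside the inertial range
    slope = np.log(E(k2) / E(k1)) / np.log(k2 / k1)
    print("inertial-range slope:", float(slope))    # close to -5/3
    print("E below the cutoff  :", float(E(0.25)))  # vanishes identically
```

Integrating `E_tilde` over all frequencies at fixed wavenumber returns `E(k)`, which is the role of the normalization condition on the temporal profile.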
The model assumes that the velocity field statistics continue to be isotropic on the large scales, which certainly is not true; but we are not attempting to account for system-dependent large-scale features of the velocity field here. Also, $\tau(k)$ should cross over to a different dependence on wavenumber outside the inertial range, but there is little point in introducing this extra complexity here. To maintain the parlance of physical discussions of turbulence, we will sometimes refer to $L_0$ as the "integral length scale" of the velocity field, though in our model it differs from the technical definition [320] of this term by a constant of order unity in the high Reynolds number limit.

We can build a simple shear model with a Kolmogorov type inertial range by simply defining a Gaussian random shear flow $v(x,t)$ using the spatio-temporal energy spectrum (200) through the formula (153). It can be checked explicitly that the velocity differences of such a flow will exhibit the Kolmogorov inertial range scaling:

$$\langle(v(x+x',t)-v(x',t))^2\rangle = \tilde{C}_K\,\bar{\varepsilon}^{2/3}|x|^{2/3} \quad\text{for } L_K\ll|x|\ll L_0 ,$$

for some universal constant $\tilde{C}_K$ proportional to $C_K$. Of course, such a shear model differs geometrically from fully developed turbulence in that the latter is usually modelled as statistically isotropic. The shear model has the advantage, however, of being mathematically tractable in the context of turbulent diffusion, and we will see that it sheds light on a number of physically relevant issues regarding the effects of a wide inertial range on turbulent diffusion.

The reader will note the formal similarity between the Kolmogorov model spectrum (200) and the Random Spatio-Temporal Shear (RSTS) Model in Section 3.3 with parameter values $\varepsilon=8/3$ and $z=2/3$. The key difference is that the power law scaling of the Kolmogorov energy spectrum has $\varepsilon>2$, and is cut off at the low wavenumber $L_0^{-1}$. This cutoff is essential for the energy

$$\mathcal{E} = \tfrac12\langle(v(x,t))^2\rangle = \int_0^\infty E(k)\,dk$$

to be finite; the $L_0^{-1}\to 0$ limit is singular.

We will next widen the spatio-temporal energy spectrum model for fully developed turbulence from the strict Kolmogorov picture (200) to include intermittency corrections and alternate theories for the temporal correlations. This will motivate the consideration of modified RSTS models with more general values of $\varepsilon>2$ and $z>0$.

Intermittency corrections. Although the Kolmogorov form for the spatial energy spectrum $E(k)$ mentioned above agrees quite well with experimental data, it is of theoretical interest to consider a more general self-similar inertial-range power law behavior of the form

$$E(k) = C_K\,\bar{\varepsilon}^{2/3}k^{-5/3}(kL_0)^{\alpha_0}(kL_K)^{\alpha_K} \quad\text{for } L_0^{-1}\ll k\ll L_K^{-1} . \tag{201}$$

Here, the asymptotically small parameters $(kL_0)^{-1}$ and $(kL_K)$ are permitted to enter the inertial range asymptotic form via power laws with exponents $\alpha_0$ and $\alpha_K$, which, if nonzero, are referred to as anomalous exponents. In the terminology of Barenblatt [24,25], Kolmogorov's hypothesis (200) is completely self-similar with respect to the inertial range limit while the more general hypothesis (201) is incompletely self-similar. The anomalous exponents will always be assumed to satisfy

$$-\tfrac43 < \alpha_0+\alpha_K < \tfrac23 , \tag{202}$$
which ensures that energy is concentrated at low wavenumber, dissipation is concentrated at high wavenumber, and a physical-space self-similar inertial range scaling regime exists within $L_K \ll r \ll L_0$.

In real-world turbulence, the issue of anomalous scaling is much more important in discussions of higher-order statistics of the velocity field than in the second-order statistics (as reflected in $E(k)$) [309]. There remain, however, suggestions [34,72] that the inertial range form of the energy spectrum may have "intermittency corrections" of the form (201), with $\alpha_K = 0$ and $\alpha_0$ small (on the order of 0.03). A different incomplete self-similarity hypothesis for the inertial range has recently been formulated [26], but we will not consider it here. Our purpose is only to introduce the flavor of anomalous corrections to inertial range scaling into our shear flow models, and examine their effects on turbulent diffusion.

In the presence of intermittency corrections, the natural eddy turnover time $\tau_e(k)$, defined as the ratio of the length scale $k^{-1}$ and velocity scale of the eddy, is altered from the Kolmogorov definition (198). The mean-square velocity of an inertial-range eddy of length scale $k^{-1}$ scales as $E(k)k$, so with Eqs. (200a), (200b) and (200c), we define the eddy turnover time in the presence of intermittency corrections as
\[ \tau_e(k) = \frac{k^{-1}}{\left( C_K \bar\varepsilon^{2/3} k^{-2/3} (kL_0)^{\alpha_0} (kL_K)^{\alpha_K} \right)^{1/2}} = C_K^{-1/2} \bar\varepsilon^{-1/3} k^{-2/3} (kL_0)^{-\alpha_0/2} (kL_K)^{-\alpha_K/2} . \qquad (203) \]

Alternative temporal decorrelation theories. Whether the eddy turnover time $\tau_e(k)$ truly sets the decorrelation rate of inertial range eddies has come into question, starting with some papers of Kraichnan [176] and Tennekes [319]. These authors pointed out that, beyond the inertial breakdown of eddies, the velocity field observed at a given point will see a decorrelating influence from the sweeping of eddies by strong, large-scale velocity components. The influence of an eddy of wavenumber $k$ on a given point in space will, according to this notion, effectively cease once the eddy is carried away through advection by large scale eddies. An estimate for this sweeping time scale is
\[ \tau_s(k) = (v_0 k)^{-1} , \qquad (204) \]
where $v_0$ is the single-point root-mean-square magnitude of the turbulent velocity field. If we extrapolate the Kolmogorov inertial-range scaling of the velocity of eddies, we would expect $v_0$ to typically be on the order of $\bar\varepsilon^{1/3} L_0^{1/3}$, in which case the sweeping time $\tau_s(k)$ would be shorter than the eddy turnover time $\tau_e(k)$ throughout the inertial range, since their ratio is then $(kL_0)^{-1/3} \ll 1$. This would appear to suggest that sweeping is the dominant decorrelation mechanism, so that the eddy correlation time $\tau(k)$ should be set equal to $\tau_s(k)$ instead of $\tau_e(k)$ in the inertial range of wavenumbers. The same conclusion holds true in the presence of intermittency corrections, as will be clear when we nondimensionalize later.

Whether the sweeping effect really has this decorrelating influence has not been conclusively decided; we refer the reader to [60,273,345] for theoretical, numerical, and experimental investigations along this line. We only wish to mention this issue here, and will not dwell on it. To cover both possibilities, we will simply assume that $\tau(k)$, the correlation time of an eddy of wavenumber $k$, has some scaling law:
\[ \tau(k) = A_\tau k^{-z} , \]
where $z > 0$ and $A_\tau$ is some positive constant. The exponent $z = 2/3$ corresponds to the strict Kolmogorov theory, and $z = 1$ corresponds to the decorrelation by large-scale sweeping. Because we have embedded both possibilities into our model, we can draw distinctions concerning how the passive scalar statistics will differ under these two competing hypotheses. Our results are only guaranteed to be true for the model flows considered, but one can hope to obtain some understanding of the physical mechanisms and qualitative effects at work, and consider which of these may apply in more general situations. We note that for a shear flow, there is no direct physical mechanism for decorrelation by large-scale sweeping unless a cross-shear sweep, such as $w_f(t)$, is present.

3.4.3.2. Basic considerations concerning large-scale effective diffusivity.
To illustrate some of the issues involved in computing the effective evolution of a passive scalar field in a turbulent flow, we construct a cartoon model founded on shear flows. We will refer to this model as the Random Spatio-Temporal Shear flow with Inertial Range, or RSTS-I Model for short. The velocity field is taken to be a parallel superposition of a two-dimensional steady, large-scale mean shear flow $U(x)$ and a turbulent small-scale component $v(x,t)$, with a temporally fluctuating cross sweep $w_f(t)$:
\[ \mathbf{v}(x,y,t) = \begin{pmatrix} w_f(t) \\ U(x) + v(x,t) \end{pmatrix} . \qquad (205) \]
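The cross-sweep $w_f(t)$ will be modelled as a mean zero, Gaussian, stationary random process as in the Random Sweeping Model of Section 3.1.2, with
\[ R_w(t) \equiv \langle w_f(t') w_f(t+t') \rangle = 2 \int_0^\infty \cos(2\pi\omega t)\, E_w(|\omega|)\, d\omega , \]
\[ E_w(\omega) = A_{E,w}\, \omega^{-\beta}\, \psi_w(\omega) . \]
The turbulent shear flow $v(x,t)$ is a Gaussian, homogeneous, stationary, mean zero random field with correlation function
\[ \tilde R(x,t) \equiv \langle v(x',t') v(x+x',t+t') \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{2\pi i (kx + \omega t)}\, \tilde E(k,\omega)\, dk\, d\omega \]
and spatio-temporal energy spectrum $\tilde E(k,\omega)$ taken to be of the high Reynolds number form, with intermittency corrections, discussed above:
\[ \tilde E(k,\omega) = E(|k|)\, \tau(|k|)\, \phi(\omega \tau(|k|)) , \qquad (206a) \]
\[ E(k) = C_K \bar\varepsilon^{2/3} k^{-5/3} (kL_0)^{\alpha_0} (kL_K)^{\alpha_K}\, \psi_0(kL_0)\, \psi_\infty(kL_K) , \qquad (206b) \]
\[ \tau(k) = A_\tau k^{-z} . \qquad (206c) \]
Finally, the mean shear flow is taken to be either linear or a sinusoid:
\[ U(x) = \begin{cases} \gamma x , \\ \bar v \sin(2\pi x/\bar L) . \end{cases} \qquad (207) \]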
It will be helpful to think of the linear shear coefficient $\gamma$ as the ratio of a certain velocity $\bar v$ and length scale $\bar L$ characterizing the turbulent system. The shear-parallel tracer motions due to the mean shear and the turbulent shear field are simply additive, since they do not act across each other's gradient. To better assess their relative magnitude, it is helpful to first nondimensionalize with respect to the (large) scale of the turbulent system.

Nondimensionalization. We think of the mean shear flow $U(x)$, the fluctuating cross-sweep $w_f(t)$, and the turbulent shear flow $v(x,t)$ as being produced by forces of comparable scale. It is therefore natural in our model to equate the root-mean-square turbulent velocity $v_0 = \langle v^2(x,t) \rangle^{1/2}$ with the velocity scale $\bar v$ of the mean flow, and to equate the outer length scale $L_0$ of the turbulent flow with the length scale $\bar L$ of the mean flow. The turbulent fluctuations will then be active over a wide range of scales extending from the system scale $L_0 = \bar L$ down to the Kolmogorov dissipation scale $L_K$. We are assuming a high Reynolds number flow so that $L_K \ll L_0$, and a substantial inertial range exists. Moreover, we equate the magnitude of the cross-sweep velocity $w_f(t)$ with $\bar v$ and the time scale of its fluctuations with the natural time macroscale $\bar L/\bar v$. The initial passive scalar density is assumed to vary on the macroscale $\bar L$, and to have total mass $M$.

We then use $L_0 = \bar L$, $\bar v = v_0$, and $M$ as our reference units for nondimensionalization. This implies that the reference time scale is $\tau_0 = L_0/v_0$. Temporarily, we will use a superscript $'$ to denote a nondimensionalized variable and a superscript $\circ$ to denote a nondimensionalized function. To nondimensionalize the turbulent shear flow, we note first that
\[ v_0^2 = 2 \int_0^\infty E(k)\, dk = c_1 C_K \bar\varepsilon^{2/3} L_0^{2/3} (L_K/L_0)^{\alpha_K} , \qquad (208) \]
where symbols $c_j$ will denote dimensionless positive numerical constants depending only weakly on Reynolds number in the sense that they approach finite positive limits as $\mathrm{Re} \to \infty$. The constants $c_j$ also depend on the scaling exponents $\alpha_0$, $\alpha_K$, and $z$, and on the dimensionless auxiliary functions $\psi_0$, $\psi_\infty$, and $\phi$. As a consequence of Eq. (208), the energy spectrum may be written in terms of $L_0$ and $v_0$ as
\[ E(k) = c_1^{-1} v_0^2 L_0^{-2H} k^{-1-2H}\, \psi_0(kL_0)\, \psi_\infty(kL_K) , \qquad (209) \]
where we have defined the Hurst exponent as
\[ H = \frac{1}{3} - \frac{\alpha_0 + \alpha_K}{2} . \]
The reason we call $H$ the Hurst exponent is that $\langle (v(x+x') - v(x'))^2 \rangle$ scales self-similarly as $|x|^{2H}$ over the inertial range of scales $L_K \ll |x| \ll L_0$, and the Hurst exponent is defined in just this way for self-similar (or more properly "self-affine") fractal random fields (see [215,216]). The restriction on the intermittency exponents (202) is equivalent to $0 < H < 1$.

From Eq. (209), it follows that the energy spectrum nondimensionalized to large scales takes the appealingly simple form
\[ E^{\circ}(k') = (v_0^{-2} L_0^{-1})\, E(k' L_0^{-1}) = c_1^{-1} (k')^{-1-2H}\, \psi_0(k')\, \psi_\infty(k' l_K) , \]
where $l_K \equiv L_K/L_0$.
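The relation between the anomalous exponents, the Hurst exponent, and the inertial-range slope can be checked with a few lines of exact arithmetic. The following is only a minimal sketch: the function names are hypothetical, the formula $H = 1/3 - (\alpha_0 + \alpha_K)/2$ is the one read from the text above, and the sample exponent value is purely illustrative.

```python
from fractions import Fraction


def hurst_exponent(alpha_0, alpha_K):
    """H = 1/3 - (alpha_0 + alpha_K)/2, as read from the definition above."""
    return Fraction(1, 3) - (Fraction(alpha_0) + Fraction(alpha_K)) / 2


def spectral_slope(H):
    """Inertial-range slope p of E(k) ~ k**p, with p = -1 - 2H."""
    return -1 - 2 * Fraction(H)


def satisfies_restriction(alpha_0, alpha_K):
    """Restriction (202), checked in its equivalent form 0 < H < 1."""
    return 0 < hurst_exponent(alpha_0, alpha_K) < 1


# Strict Kolmogorov case: no anomalous exponents.
H_kolmogorov = hurst_exponent(0, 0)
print(H_kolmogorov, spectral_slope(H_kolmogorov))  # 1/3 -5/3

# A purely illustrative small correction (the value is an assumption).
print(satisfies_restriction(Fraction(3, 100), 0))  # True
```

Exact fractions are used so that the Kolmogorov values 1/3 and -5/3 come out without rounding.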
It is often desirable to relate this ratio of the Kolmogorov dissipation length scale $L_K$ and the integral length scale $L_0$ to the Reynolds number
\[ \mathrm{Re} = v_0 L_0/\nu \]
characterizing the flow, where $\nu$ is the kinematic viscosity. In the Kolmogorov theory, Re is proportional to $(L_0/L_K)^{4/3}$; this law must however be modified in the presence of intermittency corrections. To compute Re, we need an expression for $\nu$ in terms of the model parameters. This is obtained by using the definition of the mean energy dissipation rate $\bar\varepsilon$:
\[ \bar\varepsilon \equiv \nu \left\langle \left( \frac{\partial v(x,t)}{\partial x} \right)^2 \right\rangle = 4\pi^2 \nu \int_0^\infty k^2 E(k)\, dk . \]
Substituting Eq. (206b) into this expression, we find
\[ \nu = c_2 C_K^{-1} \bar\varepsilon^{1/3} L_K^{4/3} (L_K/L_0)^{\alpha_0} . \]
The Reynolds number is now computed to scale with $l_K \equiv L_K/L_0$ as follows:
\[ \mathrm{Re} = c_3 C_K^{3/2}\, l_K^{-4/3 - \alpha_0 + \alpha_K/2} . \qquad (210) \]

We now only need to nondimensionalize the eddy decorrelation time $\tau(k)$. To estimate how $A_\tau$ relates to the system scale, we consider the two natural cases in which $\tau(k)$ is the eddy turnover time $\tau_e(k)$ (203) or the sweeping time $\tau_s(k)$ (204). Using Eq. (208), we can express these time scales as
\[ \tau_e(k) = c_1^{1/2} L_0^{H} v_0^{-1} k^{-1+H} , \qquad (211a) \]
\[ \tau_s(k) = v_0^{-1} k^{-1} , \qquad (211b) \]
and their large-scale nondimensionalizations are
\[ \tau_e^{\circ}(k') = (v_0/L_0)\, \tau_e(k' L_0^{-1}) = c_1^{1/2} (k')^{-1+H} , \]
\[ \tau_s^{\circ}(k') = (v_0/L_0)\, \tau_s(k' L_0^{-1}) = (k')^{-1} . \]
Since $\tau(k)$ is supposed to generalize the behavior of these natural time scales, we can take its nondimensionalized form as
\[ \tau^{\circ}(k') = (k')^{-z} ; \]
we do not bother with a possible order unity preconstant.

We can finally write the spatio-temporal energy spectrum, nondimensionalized to the system space-time scales $L_0$ and $\tau_0 = L_0/v_0$, dropping the superscripts that mark nondimensionalized quantities, as
\[ \tilde E(k,\omega) = E(|k|)\, \tau(|k|)\, \phi(\omega \tau(|k|)) , \qquad (212a) \]
\[ E(k) = c_1^{-1} k^{-1-2H}\, \psi_0(k)\, \psi_\infty(k l_K) , \qquad (212b) \]
\[ \tau(k) = k^{-z} , \qquad (212c) \]
where $l_K$ may be related to the Reynolds number through Eq. (210).
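To make the nondimensionalized spectrum concrete, here is a small sketch that evaluates a spectrum of the product form $E(|k|)\,\tau(|k|)\,\phi(\omega\tau(|k|))$ and checks numerically that integrating over $\omega$ returns $E(k)$, which is what the normalization of $\phi$ to unit integral guarantees. The specific cutoff functions, the Gaussian choice of $\phi$, the value of $l_K$, and setting $c_1 = 1$ are all illustrative assumptions chosen only to satisfy the stated smoothness and decay requirements; they are not taken from the source.

```python
import math


def psi_0(k, k_cut=0.5):
    """Assumed infrared cutoff: smooth, zero for k <= k_cut, tending to 1 for large k."""
    return 0.0 if k <= k_cut else math.exp(-1.0 / (k - k_cut) ** 2)


def psi_inf(k):
    """Assumed ultraviolet cutoff: equal to 1 at the origin, decaying faster than any power."""
    return math.exp(-k * k)


def phi(w):
    """Assumed temporal profile: even, positive at 0, with unit integral over the real line."""
    return math.exp(-math.pi * w * w)


def E(k, H=1.0 / 3.0, l_K=1e-3, c_1=1.0):
    """Nondimensional spatial spectrum k**(-1-2H) * psi_0(k) * psi_inf(k*l_K) / c_1."""
    return k ** (-1.0 - 2.0 * H) * psi_0(k) * psi_inf(k * l_K) / c_1


def tau(k, z=2.0 / 3.0):
    """Nondimensional correlation time k**(-z), order unity preconstant dropped."""
    return k ** (-z)


def E_tilde(k, omega, H=1.0 / 3.0, z=2.0 / 3.0):
    """Spatio-temporal spectrum E(|k|) * tau(|k|) * phi(omega * tau(|k|))."""
    t = tau(abs(k), z)
    return E(abs(k), H) * t * phi(omega * t)


def integrate_over_omega(k, z=2.0 / 3.0, n=4000, span=8.0):
    """Trapezoidal integral of E_tilde over omega in [-W, W], with W = span / tau(k)."""
    W = span / tau(abs(k), z)
    h = 2.0 * W / n
    total = 0.5 * (E_tilde(k, -W, z=z) + E_tilde(k, W, z=z))
    for i in range(1, n):
        total += E_tilde(k, -W + i * h, z=z)
    return total * h


k = 10.0
print(integrate_over_omega(k), E(k))   # the two numbers should agree closely
print(tau(k, z=1.0) < tau(k, z=2.0 / 3.0))  # sweeping time shorter than turnover time for k > 1
```

The last line reflects the comparison made in the text: in nondimensional units, with order unity constants dropped, $k^{-1} < k^{-2/3}$ for wavenumbers above the system scale.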
The nondimensionalization of the mean shear flow (207) is directly seen to be
\[ U(x) = \begin{cases} x , \\ \sin(2\pi x) . \end{cases} \qquad (213) \]
Since the cross-sweep $w_f(t)$ is assumed to vary also on the same scales, its nondimensionalized correlation function $R_w(t)$ has amplitude and scale of variation both of order unity. The nondimensionalized initial data $T_0(x,y)$ is similarly an order unity function. The molecular diffusion coefficient, nondimensionalized to the macroscale, may be identified as the inverse of the Péclet number,
\[ \mathrm{Pe} = L_0 v_0/\kappa . \]
This definition of Péclet number is consistent with the one used in Section 2.4 (see Eq. (81)), up to a factor which approaches a constant of order unity in the high Reynolds number limit.

Challenges of modelling effective large-scale passive scalar behavior. We can now write down the advection-diffusion equation, nondimensionalized with respect to the system length macroscale $L_0$ and time macroscale $\tau_0$:
\[ \frac{\partial T(x,y,t)}{\partial t} + w_f(t) \frac{\partial T(x,y,t)}{\partial x} + U(x) \frac{\partial T(x,y,t)}{\partial y} + v(x,t) \frac{\partial T(x,y,t)}{\partial y} = \mathrm{Pe}^{-1} \Delta T(x,y,t) , \qquad (214) \]
\[ T(x,y,t=0) = T_0(x,y) . \]
The nondimensionalized mean flow $U(x)$ is given by Eq. (213), the nondimensionalized spatio-temporal energy spectrum of the turbulent shear flow $v(x,t)$ is given by Eqs. (212a), (212b) and (212c), and the nondimensionalized correlation function $R_w(t)$ of the random cross sweep $w_f(t)$ is an order unity function.

A key observation from this nondimensionalization is that the advective processes due to both the mean and turbulent components of the flow are of order unity with respect to the system macroscales. The Reynolds number enters only in the energy spectrum (212b) through the nondimensionalized Kolmogorov dissipation scale $l_K$ and weakly through the nondimensional preconstant $c_1$. On the other hand, the Péclet number Pe can often be large in fully developed flows, making the molecular contribution formally subdominant (away from boundaries) to the turbulent advection in Eq. (214).
We know, however, from our discussions in Sections 2 and 3 that the presence of even a small amount of molecular diffusion can have subtle effects on large-scale passive scalar transport in certain flow configurations.

A key goal of turbulence modelling is obtaining an effective equation for the mean concentration density $\langle T(x,y,t) \rangle$ which is at least approximately valid on the system scale. One wishes to account for the important effects which the random fluctuations have on the system scale, without resolving the fluctuations explicitly in detail. We can make some partial progress toward this end in the presently stated model, because we can exactly compute the effective diffusivity of a single tracer, defined as half the time rate of change of its variance in position (along a given direction). The effective diffusivity in the cross-shear direction is determined by the sweeping (111):
\[ D_x(t) \equiv \frac{1}{2} \frac{d \langle \sigma_X^2(t) \rangle}{dt} = \mathrm{Pe}^{-1} + \int_0^t R_w(s)\, ds . \]
The effective diffusivity along the shearing direction is obtained by simply summing the contributions of molecular diffusion, the mean shear flow, and the random shear flow, taking into account the cross-shear transport:
\[ D_y(t) \equiv \frac{1}{2} \frac{d \langle \sigma_Y^2(t) \rangle}{dt} = \mathrm{Pe}^{-1} + \bar D_y(x_0,t) + \tilde D_y(t) . \]
A linear mean shear flow $U(x) = x$ contributes a shear-parallel diffusivity [31,258]:
\[ \bar D_y(t) = \mathrm{Pe}^{-1} t^2 + t \int_0^t (t-s)\, R_w(s)\, ds , \qquad (215) \]
whereas a sinusoidal mean shear flow $U(x) = \sin 2\pi x$ contributes
\[ \bar D_y(x_0,t) = \frac{1}{2} \int_0^t e^{-2\pi^2 \sigma_X^2(s)}\, ds - \frac{1}{2} \cos(4\pi x_0)\, e^{-2\pi^2 \sigma_X^2(t)} \int_0^t \exp\left[ -2\pi^2 \left( \sigma_X^2(s) + 4 \mathrm{Pe}^{-1} s + 2 \int_0^t \int_0^s R_w(s'-s'')\, ds''\, ds' \right) \right] ds - \sin^2(2\pi x_0)\, e^{-2\pi^2 \sigma_X^2(t)} \int_0^t e^{-2\pi^2 \sigma_X^2(s)}\, ds , \qquad (216) \]
both of which may be computed by a generalization of the method presented in Section 2.3.1. Finally, the effective diffusion due to the turbulent shear fluctuations (157) is
\[ \tilde D_y(t) = \kappa + 4 \int_0^\infty \int_0^\infty \tilde E(k,\omega)\, \tilde K(k,\omega,t)\, d\omega\, dk , \qquad (217) \]
\[ \tilde K(k,\omega,t) = \int_0^t e^{-2\pi^2 k^2 \sigma_X^2(s)} \cos(2\pi\omega s)\, ds . \]
All these effective diffusivities are time-dependent and of order unity in units nondimensionalized to the macroscale. Observe, however, that the diffusivity along a linear mean shear (215) grows unboundedly in time whereas the diffusivity due to a sinusoidal mean shear (216) saturates to a finite positive value for large nondimensional times $t \gg 1$. The turbulent diffusivity (217) also saturates to a finite value for $t \gg 1$ by virtue of the energy spectrum having a low wavenumber cutoff at a length scale of order unity in nondimensionalized units; see Eqs. (212a), (212b) and (212c).

It is also important to note that the mean position of a tracer will respond to the presence of a mean shear. For a linear mean shear flow,
\[ V_Y(t) \equiv d\langle Y(t) \rangle/dt = x_0 , \]
whereas for a sinusoidal mean shear flow,
\[ V_Y(t) \equiv d\langle Y(t) \rangle/dt = \sin(2\pi x_0)\, e^{-2\pi^2 \sigma_X^2(t)} . \]
The mean tracer velocity in this latter case rapidly decays, and the mean tracer displacement saturates to a finite value.
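The contrast between the saturating cross-shear diffusivity and the unbounded growth of the linear-shear contribution can be seen in a short numerical sketch. It assumes the forms $D_x(t) = \mathrm{Pe}^{-1} + \int_0^t R_w(s)\,ds$ and $\bar D_y(t) = \mathrm{Pe}^{-1}t^2 + t\int_0^t (t-s)R_w(s)\,ds$ as read from the text, and it uses an exponential correlation function $R_w(s) = e^{-|s|}$ purely as an illustrative assumption; the helper names and the value of Pe are hypothetical.

```python
import math


def R_w(s):
    """Assumed order-unity cross-sweep correlation function (illustration only)."""
    return math.exp(-abs(s))


def trapz(f, a, b, n=2000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, n)))


def D_x(t, Pe):
    """Cross-shear diffusivity: molecular part plus running integral of R_w."""
    return 1.0 / Pe + trapz(R_w, 0.0, t)


def D_y_linear(t, Pe):
    """Shear-parallel diffusivity contributed by the linear mean shear U(x) = x."""
    return t * t / Pe + t * trapz(lambda s: (t - s) * R_w(s), 0.0, t)


Pe = 100.0
for t in (1.0, 5.0, 25.0):
    print(t, D_x(t, Pe), D_y_linear(t, Pe))
```

For this particular choice of $R_w$ the integrals have the closed forms $1 - e^{-t}$ and $t - 1 + e^{-t}$, so $D_x$ levels off near $\mathrm{Pe}^{-1} + 1$ while the linear-shear term grows like $t^2$ at large times.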
Now, it may be tempting to formulate an advection-diffusion equation for the mean passive scalar field $\langle T(x,y,t) \rangle$ of the form
\[ \frac{\partial \langle T(x,y,t) \rangle}{\partial t} + V_Y(t) \frac{\partial \langle T(x,y,t) \rangle}{\partial y} = D_x(t) \frac{\partial^2 \langle T(x,y,t) \rangle}{\partial x^2} + D_y(x,t) \frac{\partial^2 \langle T(x,y,t) \rangle}{\partial y^2} , \]
\[ \langle T(x,y,t=0) \rangle = \langle T_0(x,y) \rangle . \]
Or perhaps one may explicitly retain the mean flow $U(x)$ in the advection, and simply include the turbulent contributions to $D_x(t)$ and $D_y(x,t)$ in the diffusion coefficients in the PDE. Indeed, we saw in Section 2 that effective equations of this type, with time-independent diffusion coefficients, do rigorously describe the effective behavior of the mean passive scalar field on large scales at long times for random flows with short-range correlations. Also, we saw in Section 3.1.2 that the evolution of the mean statistics was precisely described, even in anomalous diffusion regimes, by a time-dependent diffusion equation in the Random Sweeping Model.

In the present situation, however, the reasoning leading to such simplified effective evolution equations for the mean statistics $\langle T(x,y,t) \rangle$ breaks down. The key difficulty is that the excited scales of turbulent motion extend over a continuous range from the small scale $l_K \sim \mathrm{Re}^{-3/4+\mathrm{IC}}$ (where "IC" denotes intermittency corrections) up to the macroscale of the system. It is on this macroscale that we wish to pose the effective evolution equation, and there is no scale separation between it and the active scales of the turbulence. This is why homogenization theory cannot be directly applied. The failure of separation of scales is particularly acute for turbulence spectra with $\varepsilon > 2$ (which includes the Kolmogorov value $\varepsilon = 8/3$), because these have very strong long range correlations and require the presence of the infrared cutoff $\psi_0(\cdot)$ to keep the energy finite.

Another delicate issue is that the time scale at which the turbulent diffusivity saturates to its asymptotic value is comparable to the natural time scale associated with the macroscale dynamics. This suggests that any effective turbulent diffusivity should generally be expected to be time-dependent, as in the Random Sweeping Model described in Section 3.1.2. Because of the spatial variability of the shear interacting with the fluctuating cross-sweep, however, the tracer displacements cannot be expected to obey an approximately Gaussian distribution after order unity times. A proper effective evolution equation for $\langle T(x,y,t) \rangle$ therefore need not be a standard diffusion equation with time-dependent diffusivities.

In efforts to address the challenge of modelling the large-scale effects of turbulence active over a wide range of scales, a wide variety of approaches have been proposed: renormalized perturbation theories [177,285], which involve partial summation of divergent perturbation series and mimic ideas from field theory, and renormalization group (RNG) theories [300,286,344], which are inspired by critical phenomena. We shall now briefly discuss some rigorous mathematical work concerning effective evolution equations for the mean statistics in a simplified shear model [10,14] which can be used as a basis for testing the various renormalized perturbation theories [13,17,300].

3.4.3.3. Renormalized Green's functions for random shear flows on large scales within inertial range.
Avellaneda and the first author considered the problem of computing the large-scale effective turbulent diffusion for a family of random spatio-temporal shear flows with energy spectra of the type (206), parametrized by scaling exponents $\varepsilon$ and $z$. They found it convenient to nondimensionalize with respect to the Kolmogorov dissipation length scale $L_K$ and dissipation
time scale, rather than with respect to the system macroscales as above. In this formulation, the nondimensionalized value $\delta^{-1}$ of the integral length scale $L_0$ of the turbulent velocity field diverges with Reynolds number: $\delta^{-1} \sim \mathrm{Re}^{3/4+\mathrm{IC}}$. Since the macroscale of the passive scalar field naturally coincides with the macroscale of the velocity field in most applications, the (nondimensionalized) initial data is rescaled along with the integral length of the velocity field:
\[ T_0^{\delta}(x,y) \equiv T_0(\delta x, \delta y) . \qquad (218) \]

The central goal in turbulent diffusion modelling at high Reynolds number is to obtain an effective description of the evolution of $\langle T^{\delta}(x,y,t) \rangle$ on the system macroscale in the $\delta \to 0$ limit, where $T^{\delta}(x,y,t)$ is the passive scalar field evolving from initial data (218) in a turbulent velocity field with integral length scale $\delta^{-1}$. The first issue is finding the proper space-time rescaling. Clearly, the nondimensionalized space variable must be rescaled by $\delta$, the inverse of the macroscale of both the velocity field and the initial data. But unlike standard homogenization theory, the proper time rescaling is not necessarily linked to this spatial rescaling through the usual diffusive relation. Instead, one seeks a more general temporal rescaling function $\rho(\delta)$ for which a finite and nontrivial limit,
\[ \bar T(x,y,t) \equiv \lim_{\delta \to 0} \left\langle T^{\delta}\!\left( \frac{x}{\delta}, \frac{y}{\delta}, \frac{t}{\rho^2(\delta)} \right) \right\rangle , \]
exists. The usual diffusive rescaling corresponds to $\rho(\delta) = \delta$. Note that the renormalization differs here from that discussed generally at the beginning of Section 3.4 in that the turbulent velocity field also depends on $\delta$, the inverse of its nondimensionalized integral length scale.

Once the proper choice of $\rho(\delta)$ is established, one strives to compute the equation satisfied by $\bar T(x,y,t)$. This may be viewed as a generalized "eddy diffusivity" equation for the high Reynolds number limit, since the effect of all the small scales $\delta \lesssim k \lesssim 1$ of the velocity field on the macroscale has been incorporated into an effective equation involving only the large scales. To stress that the equation for $\bar T(x,y,t)$ is not always of the form of a standard diffusion equation, we will refer to it by the general appellation of "eddy-renormalized equation". In any event, the fundamental solution to this eddy-renormalized equation may be identified as the renormalized Green's function.

This program has been rigorously carried out for the spatio-temporal shear flow models (206), with zero cross sweep [10] and a constant cross sweep [14]. The effects of molecular diffusion are also taken into account. A variety of possible effective equations arise, depending on the parameters $-\infty < \varepsilon < 4$ and $z > 0$ of the turbulent spatio-temporal energy spectrum, and these can be described by a phase diagram similar to those displayed in Section 3.3.

The rigorous renormalization theory for the case of no cross sweep ($w(t) = 0$) involves five distinct phase regions in the $(\varepsilon, z)$ plane. To each phase region corresponds a temporal rescaling function $\rho(\delta) = \delta^{\zeta}$ with a distinctive algebraic law for the exponent $\zeta$ in terms of $\varepsilon$ and $z$, and an eddy-renormalized equation for the mean passive scalar density. It is readily checked that the renormalized mean-square tracer displacement
\[ \langle \bar Y^2(t) \rangle = \lim_{\delta \to 0} \delta^{2} \left\langle \left( Y^{\delta}\!\left( \frac{t}{\rho^2(\delta)} \right) \right)^2 \right\rangle \]
scales as $t^{1/\zeta}$. Therefore, as in Section 3.3, the exponent $\zeta$ may be viewed as an order parameter describing "phase transitions" across the boundaries of the phase regions. The effective equation within each phase region has a distinctive form, though the coefficients within each region depend smoothly on the parameters $\varepsilon$ and $z$. We will simply sketch the main qualitative points, and later, in Section 3.5.4, illustrate the renormalization procedure itself in the simpler context of the pair distance function.

One phase region in the phase diagram for the renormalized mean passive scalar density corresponds to a situation in which ordinary homogenization theory applies: $\rho(\delta) = \delta$, the eddy-renormalized diffusivity equation for $\bar T(x,y,t)$ is an ordinary diffusion equation with a constant, turbulence-enhanced diffusion coefficient (166a), and the renormalized Green's function is Gaussian. This is exactly the region D of Fig. 14 in which the long-range correlations of the velocity field are not strong enough to create anomalous behavior. Note in particular that this diffusive regime lies exclusively outside the parameter range $2 < \varepsilon < 4$ corresponding to velocity fields with a physical-space inertial scaling range.

All the other phase regions support superdiffusion of the passive scalar field ($\zeta < 1$). One of the phase regions for the renormalized mean passive scalar field coincides exactly with the SD-$\kappa$ phase region of Fig. 14. The renormalized Green's function for this region is exactly the one discussed for a steady random shear flow with $0 < \varepsilon < 2$ in Paragraph 3.4.1.2. Indeed, the decorrelation of shear-parallel transport is set in the SD-$\kappa$ region by molecular diffusion, and the temporal fluctuations in the shear flow play a negligible role at large space and time scales. It is therefore not surprising that the eddy-renormalized equation coincides over this whole region with that for the steady case. The eddy-renormalized equation corresponding to this non-Gaussian renormalized Green's function is nonlocal. The presence of the infrared cutoff $\psi_0$ plays no role in the renormalization for the SD-$\kappa$ phase region, and does not appear in the eddy-renormalized equation.

The situation is a bit more subtle for the other three superdiffusive phase regions. The eddy-renormalized equations for these regions take the form of diffusion equations with constant or time-dependent diffusion coefficients. The presence of molecular diffusion is irrelevant on the large scales for all three of these regions. At the boundaries between the phase regions, the eddy-renormalized equation takes special forms which cross over between those on each side of the boundary.

It is very interesting that the analogue of the Kolmogorov spectrum in the shear model occurs at the point $\varepsilon = 8/3$, $z = 2/3$, which lies on the boundary between two of the superdiffusive regimes. In one of these regimes, the Eulerian temporal decorrelation of the velocity field is the dominant influence on the large-scale, long-time tracer transport, and the superdiffusively renormalized Green's function satisfies an effective diffusion equation with diffusion coefficient given by a Kubo formula; see Section 2.4.1. The point $(\varepsilon, z) = (8/3, 1)$ corresponding to the alternative sweeping temporal decorrelation hypothesis falls in this Kubo region. In the other superdiffusive regime adjacent to the Kolmogorov point $(\varepsilon, z) = (8/3, 2/3)$, the temporal dynamics of the velocity field are by contrast completely irrelevant in determining the large-scale, long-time motion of the tracer. This is the regime corresponding to an energy spectrum with intermittency corrections reducing the value of $\varepsilon$ (this matches the conventional wisdom for the sign of intermittency corrections, if they exist [34]). More generally, we see that the renormalized mean passive scalar statistics in the RSTS-I Model are very sensitive to intermittency corrections of the Kolmogorov theory. In [15], it was rigorously shown that these features of the phase diagram for the renormalized Green's function near the Kolmogorov point carry over to random, isotropic turbulent flows with spatio-temporal energy spectrum (206).
Note that there is no contradiction in the fact that the renormalized Green's function in the superdiffusive "Kubo" regime is described by an ordinary diffusion equation. The reason for the difference between the superdiffusive space-time scaling relations of the renormalization group and the diffusive space-time scaling of the renormalized equation can be traced to the linking of the infrared cutoff $\psi_0(k/\delta)$ defining the energy spectrum with the observation scale $\delta^{-1}$. This infrared cutoff breaks the symmetry of the renormalized equations with respect to the renormalization group of scalings [10]. The sensitivity of the large-scale behavior of the mean passive scalar density to the infrared cutoff is consistent with the well-known fact that the mean passive scalar statistics are strongly influenced by the large scales of a turbulent velocity field [196]. For an explicit numerical demonstration in a simple context, the reader may consult [141]. Only second order statistical quantities, such as the "pair distance function", which involve relative diffusion of a pair of tracers, can be expected to exhibit universal behavior which is independent of the details of the large scales. We discuss this issue at greater length in Section 3.5.

The presence of the infrared cutoff in Eqs. (206a), (206b) and (206c) also leads to a departure of the renormalized phase diagram from that presented in Fig. 14 for the mean-square tracer displacement in a random shear flow with no infrared cutoff in the spatio-temporal energy spectrum. Namely, the SD-$\omega$ region of Fig. 14 does not renormalize as a single unit with the temporal rescaling $\rho(\delta) \sim \delta^{z/(\varepsilon+2z-2)}$. This choice of temporal rescaling leads to a proper high Reynolds number renormalization only for part of the SD-$\omega$ region; the rest falls within another phase region (corresponding to the Kubo-like regime) with a distinct temporal rescaling law and sensitive dependence of the renormalized equation on the infrared cutoff.
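As a small arithmetic aid, the sketch below evaluates the temporal rescaling exponent $\zeta = z/(\varepsilon + 2z - 2)$ quoted above and the corresponding growth exponent $1/\zeta$ of the renormalized mean-square displacement. It is a formal evaluation only: the function names are hypothetical, the formulas are taken at face value as read from the text, and whether a given pair $(\varepsilon, z)$ actually lies in the part of the phase diagram where this rescaling applies must be checked against the phase diagram itself, which is not reproduced here.

```python
from fractions import Fraction


def zeta_exponent(eps, z):
    """zeta in rho(delta) ~ delta**zeta, using zeta = z / (eps + 2z - 2)."""
    eps, z = Fraction(eps), Fraction(z)
    return z / (eps + 2 * z - 2)


def displacement_exponent(zeta):
    """Exponent 1/zeta in the scaling t**(1/zeta) of the mean-square displacement."""
    return 1 / Fraction(zeta)


def is_superdiffusive(zeta):
    """Superdiffusion corresponds to zeta < 1, i.e. growth faster than linear in t."""
    return Fraction(zeta) < 1


# Formal evaluation at sample parameter values (illustrative choices).
for eps, z in [(Fraction(8, 3), Fraction(2, 3)), (Fraction(5, 2), Fraction(1, 2))]:
    zeta = zeta_exponent(eps, z)
    print(eps, z, zeta, displacement_exponent(zeta), is_superdiffusive(zeta))
```

The value $\zeta = 1$ reproduces the ordinary diffusive relation, with mean-square displacement growing linearly in time.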
The effects of the addition of a constant cross sweep to the random spatio-temporal turbulent shear flow (206) are considered in [14]. The renormalization phase diagram is altered in some ways; in particular the phase region corresponding to a nonlocal eddy-renormalized equation is eliminated. The renormalization is unaffected by the constant cross sweep in the vicinity of the Kolmogorov point ($\varepsilon = 8/3$, $z = 2/3$). The eddy-renormalized equations are also shown in [14] to hold without change for a variety of random shear velocity models with energy spectra (206) but non-Gaussian statistics.

The rigorous renormalization theory for the simple shear model [10,14] has been used by Avellaneda and the first author as an unambiguous means of assessing the performance of a variety of RNG methods and renormalized perturbation theories for predicting eddy diffusivity [13,17,300]. We summarize some of these findings in Section 7.

3.4.4. Comment on the eddy diffusivity modelling at finite times
In our above discussion concerning the mean passive scalar density, we have focused on the behavior at large scales and long times, which is usually of the main practical interest. There is, however, still an important practical concern regarding the transient period of adjustment to long-time asymptotic behavior, both for observations and for numerical simulations. In particular, if the numerical simulation of the tracer dynamics becomes unstable or inaccurate during any phase of the evolution, a substantial error may be incurred which permanently contaminates results at large times, even if the numerics are well behaved in the large-time asymptotic regime. We point out here some simple, explicit examples in which the tracer exhibits an effective negative diffusivity over some finite interval of time, but eventually settles down to a positive diffusivity at long times. As it is quite difficult for numerical Monte Carlo schemes (see Section 6) to simulate
periods of negative diffusion accurately, one must be on the alert for such a contingency. The appearance of an effective negative diffusivity is not a pathological feature of our mathematical models; Taylor [317] recognized such a phenomenon in chimney smoke experiments by Richardson [283] in 1920.

As was pointed out in [18], negative diffusivity can already be observed in the extremely simple Random Sweeping Model described in Section 3.1.2. Recall that the velocity field is defined to be spatially uniform and directed always along a single direction:
\[ \mathbf{v}(x,y,t) = \begin{pmatrix} w_f(t) \\ 0 \end{pmatrix} , \]
with $w_f(t)$ fluctuating in time according to a Gaussian, mean zero, stationary stochastic process. The correlation function of the sweeping field is defined as
\[ R_w(t) \equiv \langle w_f(t') w_f(t+t') \rangle . \]
The tracer displacement is at all times a Gaussian random variable, and it can therefore be shown [18] that the mean tracer concentration density $\langle T(x,y,t) \rangle$ obeys an exact PDE of diffusive type:
\[ \frac{\partial \langle T(x,y,t) \rangle}{\partial t} = \kappa \Delta \langle T(x,y,t) \rangle + D_w(t) \frac{\partial^2 \langle T(x,y,t) \rangle}{\partial x^2} , \qquad (219) \]
\[ \langle T(x,y,t=0) \rangle = \langle T_0(x,y) \rangle . \]
The effect of the random fluctuations is to augment the diffusivity along the sweeping direction $x$ above its bare molecular value $\kappa$ by the time-dependent quantity:

\[ D_w(t) \equiv \int_0^t R_w(s)\, ds . \tag{220} \]

At large times, the enhanced diffusivity $D_w(t)$ must be nonnegative, but there may well be finite intervals of time over which it is negative. Explicit examples may be found by defining the velocity field $w_f(t)$ through statistically homogenous solutions to a damped and stochastically driven harmonic oscillator problem [18]:

\[ d(dw_f(t)/dt) + 2a\, dw_f(t) + \omega_0^2\, w_f(t)\, dt = A\, dW(t) , \tag{221} \]

where $\omega_0 > 0$ describes the stiffness of the oscillator, $a > 0$ describes the damping, and $dW(t)$ is white noise. For the underdamped range of parameters, $\omega_0 > a$, the correlation function has oscillations:

\[ R_w(t) = \frac{A^2}{4 a \omega_0^2}\, e^{-a|t|} \left[ \cos(\omega t) + (a/\omega) \sin(\omega |t|) \right] , \]

where $\omega = (\omega_0^2 - a^2)^{1/2}$. The criterion for $D_w(t)$ (220) to be always nonnegative may be shown [141] to be exactly equivalent to the inequality $\omega_0^2 \le C a^2$, where $C$ is a numerical constant which may be computed to be approximately 27.197. Therefore, for sufficiently strong restoring forces on the
oscillations driving the random sweeping ($\omega_0^2 / a^2 > C$) and sufficiently small $\kappa$, there are finite intervals of time over which

\[ \kappa + D_w(t) < 0 , \]

and the PDE (219) for the mean passive scalar density is ill-posed. This fact is not of great concern within the Random Sweeping Model itself, because its simplicity permits the mean statistics $\langle T(x, y, t) \rangle$ to be unambiguously represented in closed form [18], and there is no need to actually solve (219). But the potential for ill-posed behavior does raise practical concerns for numerical simulations in more complex flows where no exact solution is available. When the effective diffusivity becomes negative, approximate PDEs for the mean statistics cannot be numerically integrated by ordinary means, and the numerical simulation of even the mean-square displacement of a tracer faces severe difficulties [140].

An example with nontrivial spatial variations which exhibits ill-posed behavior for the mean passive scalar density can be constructed for the Random Steady Shear (RSS) Model with constant, nonzero cross sweep:

\[ \boldsymbol{v}(x, y, t) = \begin{bmatrix} \bar{w} \\ v(x) \end{bmatrix} . \]
The mean passive scalar density in such a flow can be shown to obey a time-dependent diffusion equation of the same form as Eq. (219), except that the time-dependent diffusivity enhancement is in the shearing direction $y$, and is expressed as

\[ D(t) = \int_0^t R(\bar{w} s)\, ds , \]

where $R(x)$ is the correlation function of the Gaussian, homogenous, random shear flow $v(x)$. For a random shear flow $v(x)$ generated as the homogenous solution of a stochastically damped and driven harmonic oscillator (221) (but with spatial rather than temporal variations), it has been similarly shown that for sufficiently strong restoring forces $\omega_0^2 / a^2 > C \approx 27.197$ and sufficiently small molecular diffusivity $\kappa$, the PDE for the mean tracer density $\langle T(x, y, t) \rangle$ is ill-posed over some finite interval of time [141]. In particular, negative diffusion over finite time intervals arises in the degenerate limiting case, studied by Güven and Molz [131], in which the random shear flow $v(x)$ is an undamped periodic oscillation with random phase and the molecular diffusion $\kappa$ is sufficiently small.

3.5. Pair-distance function and fractal dimension of scalar interfaces

A common theme in the various shear flow examples which have been presented is that the motion of a single tracer is determined to a great extent by the details of the large-scale features of the velocity field. This is particularly true for random velocity fields with long-range correlations characteristic of fully developed turbulence (such as the $2 < \epsilon < 4$ simple shear models), in which the energy is concentrated at small wavenumbers. In practical situations, this means that the motion of a tracer, and therefore the evolution of the ensemble-averaged mean passive scalar concentration, in a turbulent environment depends strongly on the geometry and the details of the
large-scale stirring. Therefore, there can be no single law describing the behavior of a tracer's position in an arbitrary turbulent system.

The situation can be quite different for the separation between a pair of tracers in a fully developed turbulent flow with a wide inertial range. The rate of separation between a pair of tracers is not governed by the typical velocity of the flow, but by the difference between the velocity at the two tracer locations. Therefore, a turbulent eddy with large size relative to the separation between the two tracers will sweep them both with roughly the same velocity, and contribute to their relative motion only through its velocity gradient (primarily the strain component). Recall that in the Kolmogorov picture, the root mean square velocity of a turbulent eddy of length scale $L$ in the inertial range scales as $L^{1/3}$. Therefore, an inertial range eddy of size $L$ large compared to the separation $l$ of a tracer pair will contribute to the absolute motion of either of them (or their center of mass) in proportion to $L^{1/3}$, but will contribute to their relative motion in proportion to $L^{1/3} (l/L) \sim l L^{-2/3}$. Consequently, large-scale eddies are seen to dominate the absolute motion of a tracer, but to contribute little to the relative motion (separation) of a tracer pair. In other words, the process of separation of a tracer pair can be expected to be insensitive to large-scale details so long as the separation remains small compared to the system size. By noting that inertial-range eddies on scales small compared to the distance between the tracers contribute their full velocity to the relative tracer motion, we arrive at the following heuristic principle:

When the tracer pair separation lies within the inertial range of scales of a fully developed turbulent flow, its evolution should be primarily governed by inertial-range eddies of size comparable to the momentary separation.
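The scaling argument above can be condensed into a short dimensional estimate. What follows is only a heuristic sketch, not a derivation from the advection-diffusion equation: the order-one constant $c$ and the closure $dl/dt \sim u_l$ (separation rate set by eddies of size comparable to $l$) are assumptions introduced here for illustration.

```latex
% Kolmogorov scaling for an inertial-range eddy of size L (c is a hypothetical O(1) constant):
u_L \sim c\, L^{1/3} .
% An eddy of size L >> l sweeps both tracers together and separates them only through its gradient:
\delta u_L(l) \sim u_L\, \frac{l}{L} \sim c\, l\, L^{-2/3} \ll c\, l^{1/3} \qquad (L \gg l) .
% Assuming the separation rate is set by eddies of size comparable to l itself:
\frac{dl}{dt} \sim c\, l^{1/3}
\;\Longrightarrow\;
l(t)^{2/3} - l(0)^{2/3} \sim \tfrac{2}{3}\, c\, t
\;\Longrightarrow\;
l^2(t) \sim t^3 \quad \text{once } l(t) \gg l(0) .
```

The cubic growth obtained this way relies on the current separation $l(t)$ being the only relevant length scale, which is the assumption a shear geometry can break.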
As we have discussed in Section 3.4.3, the statistics of a turbulent velocity field well within the inertial range of scales is thought to have a number of universal properties which are independent of many of the details of the physical system. Combining the previous two statements, we see that the separation $l(t)$ between a pair of tracers may obey universal laws as it evolves through the inertial range of scales. The most famous proposed universal relation of this type is Richardson's law [252,253,284]

\[ \langle l^2(t) \rangle \sim t^3 , \]

which has been observed in a wide variety of laboratory experiments [248,258,261,315] and numerical simulations using synthetic turbulent velocity fields (see Section 6 of this review and [86,109,291,351]).

Like the Kolmogorov theory of turbulence, universal inertial-range theories for turbulent diffusion in multidimensional velocity fields with realistic spatio-temporal structure are based on extra physical assumptions which do not follow directly from the advection-diffusion equation. As we have seen in Sections 3.2, 3.3 and 3.4, many mechanisms and subtleties of turbulent diffusion can be explored in a clear and unambiguous fashion in shear flows. Here, we will mathematically investigate universal inertial-range aspects of turbulent diffusion in shear flows, using the family of models with reasonably realistic spatio-temporal energy spectra introduced in Paragraph 3.4.3.2. For simplicity in exposition, we will not consider any active cross-shear transport processes, so the only decorrelation of the Lagrangian tracer motion is due to the temporal fluctuations of the random shear flow itself. It is consequently a little more direct to work here with the spectral temporal correlation function $\check{E}(k, t)$ defined earlier in Eq. (158) instead of the spatio-temporal energy spectrum $\tilde{E}(k, \omega)$.

The function $\check{E}(k, t)$ describes the temporal correlation structure of the shear modes of wavenumber $k$, and is related to the physical-space correlation function $\tilde{R}(x, t)$ and the spatio-temporal energy spectrum $\tilde{E}(k, \omega)$ through the Fourier transform relations (158).
The representation of the RSTS-I Model (206) in terms of $\check{E}(k, t)$ reads

\[ \tilde{R}(x, t) = \int_{-\infty}^{\infty} \check{E}(|k|, t)\, e^{2\pi i k x}\, dk = 2 \int_0^{\infty} \check{E}(k, t) \cos(2\pi k x)\, dk , \tag{222a} \]
\[ \check{E}(k, t) = E(|k|)\, \hat{\phi}(t / \tau(|k|)) , \tag{222b} \]
\[ E(k) = A_E\, k^{-1-2H}\, \psi_0(k L_0)\, \psi_\infty(k L_K) , \tag{222c} \]
\[ \tau(k) = A_\tau\, k^{-z} , \tag{222d} \]

where $\hat{\phi}(u)$ is just the Fourier transform of $\phi(\cdot)$ in Eqs. (206a), (206b) and (206c). We have collected the various constant prefactors into the single coefficient $A_E$. The cutoff functions $\psi_0$ and $\psi_\infty$ along with $\hat{\phi}$ are smooth functions on the positive real axis with the assumed technical properties listed under (200). The exponent $H$ characterizing the inertial-range scaling of the energy spectrum is formally related to the infrared scaling exponent $\epsilon$ used in the RSTS Model in Section 3.3 by $H = (\epsilon - 2)/2$. The appropriate parameter range for which a universal scaling range arises in real space is $0 < H < 1$, or $2 < \epsilon < 4$. The value $H = 1/3$ (and $z = 2/3$) corresponds to Kolmogorov-type statistics. It is easily checked that the heuristic considerations suggesting universal inertial-range behavior for pair separation formally apply throughout the range $0 < H < 1$.

By "universal inertial-range behavior" in the context of this family of mathematical models for turbulence, we mean statistical behavior which depends only on parameters describing the inertial range ($A_E$, $\epsilon$, and $z$), but not on viscous-scale or large-scale properties ($L_0$, $L_K$, $\psi_0$, $\psi_\infty$). We also by no means imply that these laws are universal with respect to flow geometry; the anisotropic geometry of the shear flow can of course lead to departures in the specific appearance of certain laws from their formulation for statistically isotropic flows. For example, the mean-square pair separation will be found in Paragraph 3.5.2.1 to grow in the inertial range according to a power law with exponent different from that appearing in Richardson's $t^3$ law for isotropic turbulence, even for Kolmogorov values of the exponents $H = 1/3$ and $z = 2/3$.
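The exponent bookkeeping in this paragraph is easy to check with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the inertial-range forms $E(k) \propto k^{-1-2H}$ and $\tau(k) \propto k^{-z}$ read off from Eqs. (222c) and (222d).

```python
from fractions import Fraction


def inertial_exponent_H(epsilon):
    """H = (epsilon - 2) / 2, the relation quoted in the text."""
    return (Fraction(epsilon) - 2) / 2


def energy_spectrum_power(H):
    """Power p in E(k) ~ k**p, assuming E(k) ~ k**(-1 - 2H) in the inertial range."""
    return -1 - 2 * Fraction(H)


def correlation_time_power(z):
    """Power q in tau(k) ~ k**q, assuming tau(k) = A_tau * k**(-z)."""
    return -Fraction(z)


def has_universal_range(H):
    """True when 0 < H < 1, equivalently 2 < epsilon < 4."""
    return 0 < Fraction(H) < 1


if __name__ == "__main__":
    H_kolmogorov = inertial_exponent_H(Fraction(8, 3))
    print(H_kolmogorov)                                # 1/3
    print(energy_spectrum_power(H_kolmogorov))         # -5/3
    print(correlation_time_power(Fraction(2, 3)))      # -2/3
    print(has_universal_range(H_kolmogorov))           # True
```

At the Kolmogorov point $\epsilon = 8/3$ this recovers $H = 1/3$ and the familiar $k^{-5/3}$ spectrum, while the endpoints $\epsilon = 2$ and $\epsilon = 4$ map to the excluded values $H = 0$ and $H = 1$.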
We will return to Richardson's $t^3$ law later when we discuss an exactly solvable statistically isotropic mathematical model (Paragraph 4.2.2.4) and numerical methods for accurately simulating tracer pair dispersion over a wide inertial range in a synthetic statistically isotropic turbulent flow (Section 6.5).

Our present study of pair separation in the simple shear model begins in Section 3.5.1 with a derivation of an exact solution for the PDF (probability distribution function) of the separation between a pair of tracers in a shear flow with arbitrary spatio-temporal statistics, in the absence of molecular diffusion ($\kappa = 0$). This PDF of the separation will be called the pair-distance function, a term introduced by Richardson in his pioneering work on relative diffusion [284]. We show that the pair-distance function satisfies a linear PDE of diffusive type, and indicate an explicit example in which the diffusion coefficient is negative over some finite time interval, leading to ill-posedness [141]. We then specialize in Section 3.5.2 to the Random Spatio-Temporal Shear Models with Inertial Range (RSTS-I) introduced in Section 3.4.3, and show that the pair-distance function has an explicit, universal form on scales well within the inertial range.

We then utilize the formula for the pair-distance function to study the properties of the boundary of a region marked by a passive scalar quantity as it evolves in an RSTS-I flow in the absence of molecular diffusion ($\kappa = 0$). Of particular natural and engineering interest [199,305,339] is the turbulent wrinkling of the interface between the scalar-occupied and the scalar-free region.
The notion of fractal dimension [215] provides a useful measure of the geometric convolution of a surface with scale invariance, such as the scalar interface in the inertial range of scales. By combining the explicit formula for the pair-distance function with a mathematical theorem of Orey [259] concerning the fractal dimension of the graph of a Gaussian random process, we compute in Section 3.5.3 the local fractal dimension of the scalar interface in the RSTS-I model within the inertial range of scales. There is also a distinct global fractal dimension describing the roughness of the passive scalar level set above some time-dependent length scale determined by the rate of temporal decorrelations of the shear flow. We next study fully scale-invariant properties of the pair-distance function which emerge after an isotropic renormalization to large space and time scales within the inertial range. The fractal dimensions which emerge from the isotropically renormalized passive scalar field are in quite good agreement with experimentally observed fractal dimensions of interfaces in a variety of turbulent systems [305]. We conclude in Section 3.5.5 by posing the open problem of including the effects of molecular diffusion $\kappa > 0$ on the pair-distance function and the fractal dimension of scalar interfaces in the RSTS-I Model. Unambiguous results along these lines could provide useful insights concerning general mathematical and physical theories for fractal dimensions of interfaces in turbulent flows [74,75,128,132,310].

3.5.1. Pair-distance function in shear flow

The pair-distance function is defined for our purposes to be the PDF for the separation between a pair of tracers, given a certain initial separation. We will specialize the notation for convenience in our study of pair dispersion in a two-dimensional shear flow with constant cross sweep:

\[ \boldsymbol{v}(x, y, t) = \begin{bmatrix} \bar{w} \\ v(x, t) \end{bmatrix} . \tag{223} \]
Here $v(x, t)$ is a homogenous, Gaussian, mean zero random field with correlation function

\[ \langle v(x', t')\, v(x' + x, t' + t) \rangle = \tilde{R}(x, t) , \]

and $\bar{w}$ is a deterministic constant (which may be zero). Within Section 3.5.1, we do not specify a particular model for $\tilde{R}(x, t)$.

Consider two tracers released at the points $(x_0, 0)$ and $(x_0 + x, 0)$ in the random shear flow (223), with no molecular diffusion ($\kappa = 0$). Let $Y_{x_0}(t)$ and $Y_{x_0 + x}(t)$ denote their $y$ positions at time $t$. Then we define the pair-distance function $Q_t(y|x)$ as the probability density function satisfying

\[ \int_{y_-}^{y_+} Q_t(y|x)\, dy = \mathrm{Prob}\{ y_- \le Y_{x_0 + x}(t) - Y_{x_0}(t) \le y_+ \} \tag{224} \]

for all $y_- < y_+$. In words, then, $Q_t(y|x)$ gives the probability density that two tracers with initial shear-transverse separation $x$ will have developed a relative shear-parallel displacement of $y$ at time $t$.

A few comments on this definition are in order. First, by statistical homogeneity, the statistics of pair separation depend only on the relative position of the tracers, and are therefore independent of $x_0$. Furthermore, the constancy of the cross-shear velocity preserves the shear-transverse separation $x$ of the tracers, so there is no need to separately account for the shear-transverse
separation at later times in the pair-distance function. Since there is no variation of the flow along the $y$ direction, there is no loss in taking both tracers to start at $y = 0$; the PDF for pair separation starting from arbitrary locations $(x_0, y_0)$ and $(x_0 + x, y_0 + y_1)$ is obtained straightforwardly from the PDF $Q_t(y|x)$ defined for $y_0 = y_1 = 0$. Finally, we note that Richardson's original definition [284] of the pair-distance function is more general in a literal sense, but equivalent in essence to the simpler definition used here.

We will now proceed to give an exact formula for the pair-distance function in a general shear flow with constant cross sweep (223). The relative shear-parallel separation is given by the explicit formula (see Eqs. (110a) and (110b))

\[ Y_{x_0 + x}(t) - Y_{x_0}(t) = \int_0^t \left( v(x_0 + x + \bar{w} s, s) - v(x_0 + \bar{w} s, s) \right) ds . \tag{225} \]

This expression is an integral over a Gaussian random field, and thus a Gaussian random variable. Since $Q_t(y|x)$ is the probability density for the random variable (225), it is completely determined by the mean and variance of the relative displacement. Since $v$ has mean zero, it is easy to see that

\[ \langle Y_{x_0 + x}(t) - Y_{x_0}(t) \rangle = 0 . \tag{226} \]

The variance of the shear-parallel separation can be computed as an integral of the velocity correlation function:

\[ \sigma^2_{\Delta Y}(t|x) \equiv \langle (Y_{x_0 + x}(t) - Y_{x_0}(t))^2 \rangle = \int_0^t \!\! \int_0^t \langle (v(x_0 + x + \bar{w} s, s) - v(x_0 + \bar{w} s, s)) \, (v(x_0 + x + \bar{w} s', s') - v(x_0 + \bar{w} s', s')) \rangle \, ds\, ds' \tag{227} \]
\[ \phantom{\sigma^2_{\Delta Y}(t|x)} = \int_0^t \!\! \int_0^t \left[ 2 \tilde{R}(\bar{w}(s - s'), s - s') - \tilde{R}(-x + \bar{w}(s - s'), s - s') - \tilde{R}(x + \bar{w}(s - s'), s - s') \right] ds\, ds' . \tag{228} \]

The pair-distance function can then be expressed as the following Gaussian:

\[ Q_t(y|x) = \frac{\exp\left[ -y^2 / (2 \sigma^2_{\Delta Y}(t|x)) \right]}{\sqrt{2 \pi \sigma^2_{\Delta Y}(t|x)}} . \tag{229} \]

It is instructive to note that $Q_t(y|x)$ satisfies a diffusion PDE:

\[ \frac{\partial Q_t(y|x)}{\partial t} = D^{\Delta}(x, t)\, \frac{\partial^2 Q_t(y|x)}{\partial y^2} , \qquad Q_{t=0}(y|x) = \delta(y) . \tag{230} \]

The initial condition comes from the fact that at time $t = 0$, the tracers have a deterministically zero displacement. The time-dependent diffusion coefficient $D^{\Delta}(x, t)$ is obtained by differentiating Eq. (227):

\[ D^{\Delta}(x, t) \equiv \tfrac{1}{2}\, \partial \sigma^2_{\Delta Y}(t|x) / \partial t . \tag{231} \]
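That a Gaussian pair-distance function with time-dependent variance solves a diffusion equation with coefficient equal to half the rate of change of the variance can be checked numerically by finite differences. The following is a minimal sketch, not part of the original analysis: the variance law `variance_law` is a hypothetical smooth function chosen purely for illustration, and the step sizes are ad hoc.

```python
import math


def pair_distance(y, variance):
    """Gaussian density in y with mean zero and the given variance."""
    return math.exp(-y * y / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def variance_law(t):
    """Hypothetical variance sigma^2(t); any smooth positive function would do."""
    return t ** 1.5


def pde_residual(y, t, dt=1e-4, dy=1e-3):
    """Residual dQ/dt - D * d^2Q/dy^2 with D = (1/2) d(sigma^2)/dt, by central differences."""
    dq_dt = (pair_distance(y, variance_law(t + dt))
             - pair_distance(y, variance_law(t - dt))) / (2.0 * dt)
    var = variance_law(t)
    d2q_dy2 = (pair_distance(y + dy, var) - 2.0 * pair_distance(y, var)
               + pair_distance(y - dy, var)) / dy ** 2
    diffusion = 0.5 * (variance_law(t + dt) - variance_law(t - dt)) / (2.0 * dt)
    return dq_dt - diffusion * d2q_dy2


if __name__ == "__main__":
    for y, t in [(0.3, 1.0), (-1.2, 2.0)]:
        print(y, t, pde_residual(y, t))
```

The residual should be small (limited by the finite-difference truncation error) at any sample point. Nothing in the check requires the variance to be increasing, which is consistent with the possibility of a negative diffusion coefficient.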
A simple way to derive the diffusion equation (230) is to compute the characteristic function of Eq. (225), which is just the Fourier transform of the pair-distance function:

\[ \hat{Q}_t(k|x) \equiv \langle \exp\left( 2\pi i k\, (Y_{x_0 + x}(t) - Y_{x_0}(t)) \right) \rangle = \exp\left( -2 \pi^2 k^2 \sigma^2_{\Delta Y}(t|x) \right) , \tag{232} \]

and to note that it satisfies the Fourier transform of Eq. (230).

3.5.1.1. Possible ill-posedness of pair-distance PDE. While the pair-distance function has been shown to always obey a fairly simple, time-dependent diffusion equation, the PDE (230) is not necessarily well-posed. It is shown in [141] that for the case of a steady shear flow ($v(x, t) = v(x)$ and $\tilde{R}(x, t) = R(x)$), this PDE is well-posed for all time $t > 0$ and initial separations $x$ if and only if one of the following two conditions holds:

• there is no cross sweep, $\bar{w} = 0$, or
• the velocity correlation function $R(x)$ is a nonincreasing function of $x$ on $0 < x < \infty$.

If both of these conditions are violated, then the formal diffusion coefficient $D^{\Delta}(x, t)$ is negative for some values of $x$ and $t$, leading to an ill-posed problem for $Q_t(y|x)$.

Explicit examples of random shear velocity fields giving rise to ill-posedness for the pair-distance function may be found within the class of statistically homogenous solutions to a damped and stochastically driven harmonic oscillator problem [141]:

\[ d(dv(x)/dx) + 2a\, dv(x) + k_0^2\, v(x)\, dx = A\, dW(x) , \]

where $k_0 > 0$ describes the stiffness of the oscillator, $a > 0$ describes the damping, $A > 0$ is the driving amplitude, and $dW(x)$ is white noise. We introduced this stochastic model before in Section 3.4.4. For the underdamped range of parameters, $k_0 > a$, the correlation function has oscillations:

\[ R(x) = \frac{A^2}{4 a k_0^2}\, e^{-a|x|} \left[ \cos(k x) + (a/k) \sin(k |x|) \right] , \]

with $k = (k_0^2 - a^2)^{1/2}$. Thus by the general fact quoted above from [141], the formal diffusion coefficient $D^{\Delta}(x, t)$ is negative over some range of values of $x$ and $t$ whenever the oscillations exceed the damping ($k_0 > a$) and the cross sweep is nonzero. Note that $D^{\Delta}(x, t)$ is just half the rate of change of the mean-square shear-parallel tracer separation; its negativity implies time intervals over which the tracers are actually decreasing their separation on average.
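Both statements about the oscillator model can be probed numerically. The sketch below is an illustration under stated assumptions rather than the method of [141]: it takes the correlation function in the form given above, uses the reduction $D^{\Delta}(x, t) = \int_0^t [2R(\bar{w}u) - R(\bar{w}u - x) - R(\bar{w}u + x)]\, du$ (obtained here by differentiating the steady-shear double integral (227) with the factor of one half from the text), and locates the single-tracer threshold from Section 3.4.4 by bisection on the sign of the running integral of $R$. All function names and parameter values are hypothetical.

```python
import math


def correlation(x, a, k0, amp=1.0):
    """R(x) for the underdamped oscillator model (requires k0 > a)."""
    k = math.sqrt(k0 * k0 - a * a)
    x = abs(x)
    prefactor = amp ** 2 / (4.0 * a * k0 ** 2)
    return prefactor * math.exp(-a * x) * (math.cos(k * x) + (a / k) * math.sin(k * x))


def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def pair_diffusion_coefficient(x, t, a, k0, wbar, amp=1.0):
    """D^Delta(x, t) for a steady shear with constant cross sweep wbar."""
    def integrand(u):
        s = wbar * u
        return (2.0 * correlation(s, a, k0, amp)
                - correlation(s - x, a, k0, amp)
                - correlation(s + x, a, k0, amp))
    return trapezoid(integrand, 0.0, t)


def min_running_integral(ratio_sq, n=20000):
    """Minimum over t of int_0^t R(s) ds, scanned over two oscillation periods (a = 1)."""
    a, k0 = 1.0, math.sqrt(ratio_sq)
    k = math.sqrt(k0 * k0 - a * a)
    h = (4.0 * math.pi / k) / n
    running, lowest = 0.0, 0.0
    previous = correlation(0.0, a, k0)
    for i in range(1, n + 1):
        current = correlation(i * h, a, k0)
        running += 0.5 * h * (previous + current)
        lowest = min(lowest, running)
        previous = current
    return lowest


def critical_ratio(lo=2.0, hi=100.0, iterations=40):
    """Bisect on k0^2 / a^2 for the onset of a negative running integral."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if min_running_integral(mid) < 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, k0, wbar = 0.05, 1.0, 1.0          # weakly damped, illustrative values
    k = math.sqrt(k0 * k0 - a * a)
    print(pair_diffusion_coefficient(math.pi / k, 1.5 * math.pi / k, a, k0, wbar))
    print(critical_ratio())
```

For the weakly damped parameters shown, the pair diffusion coefficient at half a wavelength of separation is negative, and the bisection returns a critical ratio close to the value 27.197 quoted in Section 3.4.4.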
Of course, for the present model we have an exact expression (229) for the pair-distance function, so the ill-posedness of the PDE (230) does not thwart our analytical progress here. The potential for ill-posed behavior of the relative tracer displacement, as explicitly demonstrated in our simple model, does however raise practical concerns for numerical integration of tracer trajectories in more complex flows where no exact solution is available [140].

3.5.2. Inertial range behavior of the pair-distance function in the simple shear model

We will now study the pair-distance function in detail when the random shear flow is an RSTS flow with an inertial range and no cross sweep ($w(t) = 0$).
Substituting the expression (222) for the spatio-temporal correlation function of this velocity field into the general formula for the mean-square shear-parallel separation (227) with $\bar{w} = 0$, we have

\[ \sigma^2_{\Delta Y}(t|x) = \int_0^t \!\! \int_0^t \!\! \int_{-\infty}^{\infty} 2 (1 - \cos(2\pi k x))\, E(|k|)\, \hat{\phi}\!\left( \frac{s - s'}{\tau(|k|)} \right) dk\, ds\, ds' \]
\[ \phantom{\sigma^2_{\Delta Y}(t|x)} = 4 t^2 \int_0^{\infty} (1 - \cos(2\pi k x))\, E(k)\, \Phi(t / \tau(k))\, dk \]
\[ \phantom{\sigma^2_{\Delta Y}(t|x)} = 4 t^2 \int_0^{\infty} (1 - \cos(2\pi k x))\, A_E\, k^{-1-2H}\, \psi_0(k L_0)\, \psi_\infty(k L_K)\, \Phi(t A_\tau^{-1} k^z)\, dk . \tag{233} \]

We have defined

\[ \Phi(u) = \frac{2}{u^2} \int_0^u (u - u')\, \hat{\phi}(u')\, du' \tag{234} \]

and used a change of variables to re-express the double time integral in the second equality above. We collect here some basic properties of $\Phi$ which will be useful in our later analysis of $\sigma^2_{\Delta Y}(t|x)$:

\[ \Phi(0) = 1 , \tag{235a} \]
\[ \lim_{u \to \infty} \Phi(u) \sim \phi(0)\, u^{-1} , \tag{235b} \]
\[ |\Phi(u)| \le C_\Phi / (1 + |u|) , \tag{235c} \]

where $C_\Phi$ is some finite positive constant. These results follow from the Fourier relation between $\hat{\phi}(u)$ and $\phi(\cdot)$, and the assumed properties of $\phi(\cdot)$ stated near Eq. (156).

It is instructive at this point to compare and contrast the formulas we have developed for the mean-square relative displacement $\sigma^2_{\Delta Y}(t|x)$ between a pair of tracers and the mean-square absolute displacement $\sigma^2_Y(t)$ of a single tracer along the shear. We represent the latter using formula (159) for the special case in which there is no cross-shear transport and $\check{E}(k, t)$ is given by Eq. (222b):

\[ \sigma^2_Y(t) = 2 t^2 \int_0^{\infty} A_E\, k^{-1-2H}\, \psi_0(k L_0)\, \psi_\infty(k L_K)\, \Phi(t A_\tau^{-1} k^z)\, dk . \tag{236} \]

The formulas for the absolute and relative tracer displacements differ in only two regards: a multiplicative factor of 2, and the appearance of a factor $(1 - \cos 2\pi k x)$ in the integrand for $\sigma^2_{\Delta Y}(t|x)$. The prefactor of 2 simply reflects the fact that the mean-square relative distance between two tracers undergoing some independent statistical motion will simply be the sum of their individual absolute mean-square displacements. There is, however, a definite coupling between the motion of two tracers in our model due to the spatial correlations of the shear flow. The effects of this coupling are completely contained in the factor $(1 - \cos 2\pi k x)$.

For $k \ll 1/x$, this factor is small ($O((kx)^2)$), reflecting the fact that the motion of two tracers due to a shear mode with wavenumber $k$ is highly correlated when their cross-shear separation $x$ is much less than the wavelength $k^{-1}$. Their relative velocity is on the order of their cross-shear
separation $x$ multiplied by the gradient of the shear mode, which is proportional to $k$ times the root-mean-square velocity of the shear mode. The rate of relative mean-square separation due to a shear mode of wavenumber $k \ll 1/x$ is thus depressed by a factor of $(kx)^2$ from the absolute mean-square displacement of either tracer. The exact formula (233) for $\sigma^2_{\Delta Y}(t|x)$ thereby explicitly manifests the general principle that the low wavenumber modes contribute weakly to the process of tracer separation on small scales (see our discussion at the beginning of Section 3.5).

For shear modes with high wavenumbers $k \gg 1/x$, the spatial correlation factor $1 - \cos 2\pi k x$ is generally of order unity. The effect of the cosine becomes negligible when integrated over a sufficiently wide band of wavenumbers $k$, due to incoherent phase cancellation. Therefore, any wide band of random shear modes with wavenumbers $k \gg 1/x$ contributes essentially independently to the motion of each tracer in a pair with cross-shear separation $x$.

We now proceed to study the inertial-range behavior of $\sigma^2_{\Delta Y}(t|x)$ for the parameter range $0 < H < 1$. By this we mean that the (constant) shear-transverse separation satisfies $L_K \ll x \ll L_0$; the shear-parallel separation has no influence on the tracer dynamics in our shear flow model. We may straightforwardly take the asymptotic inertial-range limit in Eq. (233), and the integral scale and dissipation scale cutoffs disappear:

\[ \lim_{L_K \ll x \ll L_0} \sigma^2_{\Delta Y}(t|x) \equiv \sigma^2_{\Delta Y, I}(t|x) = 4 t^2 \int_0^{\infty} (1 - \cos(2\pi k x))\, A_E\, k^{-1-2H}\, \Phi(t A_\tau^{-1} k^z)\, dk . \tag{237} \]

This is mathematically justified by the dominated convergence theorem after a rescaling of the integration variable $q = kx$, owing to the boundedness of the function $\Phi$ and the integrability of the function $(1 - \cos 2\pi q)\, q^{-1-2H}$ over $q \in [0, \infty)$. Therefore, we have expressed the relative mean-square displacement of tracers on scales within the inertial range purely in terms of inertial range quantities; there is no dependence on either the integral or dissipation length scales. The same is true for the full pair-distance function $Q_t(y|x)$ for inertial-range separations $x$, since it is determined entirely by $\sigma^2_{\Delta Y, I}(t|x)$ (see (229)). The pair-distance function is therefore truly universal in the inertial range of scales in the RSTS-I Model.

In contrast, we remark that the mean-square absolute displacement of a single tracer $\sigma^2_Y(t)$ depends vitally on the factor $\psi_0(k L_0)$. Taking $L_0 \to \infty$ in Eq. (236) would give an infinite value for $\sigma^2_Y(t)$, simply because the total energy of the shear flow would become infinite. Energy spectra with inertial ranges ($0 < H < 1$) are therefore said to manifest an infrared divergence. A direct consequence is that the motion of a single tracer in a turbulent flow is strongly sensitive to the large-scale structure of the flow. The reason that the relative tracer displacement has a finite inertial-range limit, notwithstanding the infrared divergent energy spectrum, is that low wavenumber shear modes contribute quite weakly to the separation process due to their small gradients. Mathematically, this is manifested by the factor $(1 - \cos(2\pi k x))$ in Eq. (233), which contributes a factor proportional to $k^2$ to tame the infrared divergence $k^{-1-2H}$ at small $k$. The ability to remove the Kolmogorov dissipation length cutoff $\psi_\infty(k L_K)$ from Eq. (233) is less subtle; it is simply a consequence of the "ultraviolet convergence" of energy spectra with inertial range.
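The rescaling invoked in the dominated convergence argument can be written out explicitly. The lines below are a sketch of that one step, assuming the inertial-range forms of $E(k)$ and $\tau(k)$ from (222c) and (222d) with the cutoffs already removed, and using only the bound (235c) on $\Phi$.

```latex
% Substitute q = k|x| in (237): dk = dq/|x| and k^{-1-2H} = |x|^{1+2H} q^{-1-2H}.
\sigma^2_{\Delta Y,I}(t|x)
  = 4 A_E\, t^2\, |x|^{2H} \int_0^\infty \bigl(1-\cos(2\pi q)\bigr)\, q^{-1-2H}\,
    \Phi\!\bigl(t A_\tau^{-1} |x|^{-z} q^{z}\bigr)\, dq .
% Uniform bound from (235c), |\Phi(u)| \le C_\Phi, which supplies the dominating function:
\sigma^2_{\Delta Y,I}(t|x)
  \le 4 A_E C_\Phi\, t^2\, |x|^{2H} \int_0^\infty \bigl(1-\cos(2\pi q)\bigr)\, q^{-1-2H}\, dq < \infty
  \qquad (0 < H < 1).
% Integrability: the integrand behaves like 2\pi^2 q^{1-2H} as q \to 0 (needs H < 1)
% and is at most 2 q^{-1-2H} as q \to \infty (needs H > 0).
```

Written this way, the separation enters only through the prefactor $|x|^{2H}$ and the rescaled time $t A_\tau^{-1} |x|^{-z}$, with no reference to $L_0$ or $L_K$.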
The heuristic component of our above discussion concerning absolute and relative tracer motion in a high Reynolds number shear flow can be generalized through similar scaling arguments to isotropic turbulence. The exact mathematical formulas, however, are special to shear flows. These
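Table 10
Long-time asymptotics of mean-square relative tracer displacement along the shear within the inertial range of the Random Spatio-Temporal Shear Model. Scaling coefficients are given by Eq. (238)

Parameter regime                     Asymptotic mean-square displacement                               Phase region
                                     $\lim_{t \to \infty} \sigma^2_{\Delta Y, I}(t|x)$
-----------------------------------  ----------------------------------------------------------------  ------------
$z \ge 0$, $0 < H < (2 - z)/2$       $2 \tilde{K}^{\Delta}_H\, |x|^{2H+z}\, t$                         R-1
$z \ge 0$, $(2 - z)/2 < H < 1$       $\frac{z}{z+H-1}\, \tilde{K}^{\Delta}\, |x|^2\, t^{2(z+H-1)/z}$   R-2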
permit us to examine various aspects pertaining to tracer dynamics within the inertial range of scales in much more detail than is generally possible, as will be shown in what follows.

3.5.2.1. Evolution of relative separation between a fixed pair of tracers. The long-time asymptotics of the mean-square shear-parallel tracer pair separation $\sigma^2_{\Delta Y}(t)$ are described in Table 10, with scaling coefficients

\[ \tilde{K}^{\Delta}_H = 2 \int_0^{\infty} (1 - \cos(2\pi k x))\, \phi(0)\, A_E A_\tau\, k^{-1-2H-z}\, dk = 2 \pi^{2H+z+1/2}\, A_\tau A_E\, \phi(0)\, \frac{-\Gamma(-H - z/2)}{\Gamma(H + (z+1)/2)} , \tag{238a} \]
\[ \tilde{K}^{\Delta} = \frac{8 \pi^2 (z + H - 1)}{z^2}\, A_E\, A_\tau^{(2-2H)/z} \int_0^{\infty} u^{(2-2H-z)/z}\, \Phi(u)\, du . \tag{238b} \]

These results are very similar to those found for the mean-square shear-parallel displacement of a single tracer in an RSTS flow with no cross shear transport (Section 3.3.1), and the method of derivation is very similar. One only needs to account for the extra factor $(1 - \cos 2\pi k x)$, and we remark only on the changes this produces.

First of all, the integrand for the long time relative diffusivity (238a) in the linear growth regime (R-1 in Fig. 17) is peaked at $k \sim x^{-1}$, rather than at the lowest wavenumbers where energy is concentrated. This reflects the idea discussed at the beginning of Section 3.5 that relative diffusion is driven primarily by velocity fluctuations (eddies) with wavelength comparable to the current tracer separation. This notion, however, is violated in the present model when the correlation time diverges sufficiently rapidly at low wavenumber, in which case the long-time relative diffusion is driven at a superlinear rate by these slow, low wavenumber shear modes (regime R-2 in Fig. 17). Care is therefore required in analyzing this regime at long times and large but finite integral length scales $L_0$; the limits $t \to \infty$ and $L_0 \to \infty$ do not commute. In our present discussion, we are always assuming that $L_0$ is extremely large so that the limit $L_0 \to \infty$ is appropriately taken first.

Note that when the temporal correlation times of the shear modes are chosen as the natural eddy turnover times so that $z = 1 - H$ (see Eq. (211a)), the relative diffusion is always linear and driven primarily by modes of wavenumber $k \sim x^{-1}$. This includes the Kolmogorov values
Fig. 17. Phase diagram for long-time asymptotics of $\sigma^2_{\Delta Y}(t)$ in the Random Spatio-Temporal Shear Model with Inertial Range, no cross shear transport. This phase diagram also applies to the local fractal dimension of level sets of $T(x, y, t)$ (see Paragraph 3.5.3.2).
$H = 1/3$, $z = 2/3$. These results for the Random Spatio-Temporal Shear Model contrast strongly with Richardson's law predicting a $t^3$ growth of $\sigma^2_{\Delta Y}(t)$ for relative diffusion through the inertial range of scales in isotropic turbulence. The shear geometry destroys a fundamental premise in the similarity argument [252,253] behind Richardson's law: the initial pair separation $x$ is never forgotten because the shear turbulence does not act along the $x$ direction. Therefore, the most relevant turbulent eddies driving the separation of the tracer pair are those with wavelengths comparable to the initial separation $x$, rather than to the current separation scale.

3.5.3. Fractal dimension of scalar interfaces

In this subsection, we will apply the exact formulas for the pair-distance function which we worked out in Sections 3.5.1 and 3.5.2 to provide a measure of smoothness of level sets of the passive scalar density $T(x, y, t)$. This is of practical importance in determining mixing rates.

Consider first two initially distinguishable fluids, such as a salty fluid and a freshwater fluid, a dyed fluid and an undyed fluid, or a warm fluid and a cold fluid, which are brought together to mix. If effects such as buoyancy and chemical reactions between the fluids can be neglected, then the evolution of the fluid mixture may be expressed in terms of a passive scalar field $T(x, y, t)$ which reflects the local concentration density of one fluid or the other. For example, in the above-mentioned examples, $T(x, y, t)$ could be chosen as salinity, dye concentration, and temperature, respectively. After a suitable rescaling and translation of physical units, one may characterize the initial state of the passive scalar field by $T_0(x, y) \equiv 1$ in the region occupied by the first fluid, $T_0(x, y) \equiv 0$ in the region occupied by the second fluid, and $T_0(x, y) \equiv 1/2$ at the initial interface.
If we neglect molecular diffusion for the moment, then the passive scalar concentration density is unchanged along Lagrangian trajectories, so the fluid is characterized thereafter by mobile regions where $T(x, y, t) \equiv 1$ and $T(x, y, t) \equiv 0$ separated by the level set $T(x, y, t) = 1/2$ demarcating the evolving interface between the two fluids. The action of molecular diffusion will smooth the variation of $T(x, y, t)$ across the fluid interface, but it is readily seen that the level set $T(x, y, t) = 1/2$
still provides a good representation for the location of the interface between the #uids where mixing is occurring, particularly when the PeH clet number is large so that the interface region is thin. One major feature of turbulent di!usion is that it roughens interfaces on small scales both within the inertial range and the viscous range [54,305]. This is in strong contrast to the smoothing nature of molecular di!usion. The source of the di!erence is the interface structure created by spatial correlations in turbulent velocity "elds. Only when viewed on scales large compared to the largest correlation length of the velocity "eld do these undulations blur out, and the mixing action of turbulent di!usion appear to be smoothing. The reason that molecular di!usion is smoothing on all scales is that its correlation length is strictly zero, so all scales are large enough for the mixing to appear smoothing! The turbulent wrinkling of the #uid interface permits the #uids to contact each other over a greater surface area than if it were a #at plane as in laminar #ow conditions. This increases the #ux of temperature, salinity, dye, etc., across the interface, and speeds the process of mixing. Another context in which level sets of a scalar "eld are of great practical interest is in the progress of di!usion #ames in turbulent combustion [305,339]. A di!usion #ame is characterized by the fact that the fuel and the oxidant are supplied separately, and the #ame appears at the boundary of the fuel and oxidant zones where the necessary molecular mixing takes place. When the reaction rate is large compared with the di!usion coe$cients of the chemical species involved, the #ame can be idealized as an in"nitesimally thin surface where all the reaction takes place. 
From the physical equations governing the combustion process one can deduce that at the flame front the temperature $T(x, y, t)$ attains its maximum value, corresponding to the adiabatic stoichiometric temperature $T_A$ (for the algebraic details of the derivation consult [339]). Therefore the location of the diffusion flame can be characterized by the level surface $T(x, y, t) = T_A$, under the approximation that the flame front is infinitesimally thin. If the combustion is occurring in a turbulent environment, the flame surface will be wrinkled into a larger surface area, thereby increasing the speed at which the flame burns through the fuel [339]. In combustion, the temperature field is not a passive scalar field, but one might to a first approximation suppose that the wrinkling of the interface may be governed by a similar process. Indeed, laboratory measurements indicate that diffusion flames have a quantitatively similar small-scale geometry to that of passive scalar interfaces, provided that certain conditions are met [305].

We mention here that the first author, with Souganidis, has studied the enhanced flame speed in premixed turbulent combustion with the effects of nonlinear reaction and diffusion, as well as small-scale turbulence [88–90,211–213]. In particular, [212,213] contain a rigorous analysis of renormalized flame fronts in a steady shear flow with an inertial range of scales.

In the above examples, the extent of mixing and burning enhancements relies sensitively on the degree of wrinkling of the flame front. We now discuss one way of quantifying this wrinkling.

3.5.3.1. Elements of fractal theory. In a high Reynolds number flow, the small-scale turbulent fluctuations have a self-similar statistical structure over a wide inertial range of scales.
It is reasonable to expect that the small-scale contortions of the scalar interface produced by these inertial range eddies will also exhibit a self-similar structure over a subregion of the inertial range of scales where molecular diffusion is formally negligible relative to advection. A useful framework for describing objects with a statistical scaling invariance is fractal geometry [215]. The roughness of such fractal objects can be quantitatively represented through the notion of fractal dimension. For
an extensive discussion of theory and applications of fractal geometry, see [215]. Here we state only the essentials for our purposes.

The intuitive notion of fractal dimension is a measure of how well a set fills up space. There are several ways to define and compute fractal dimensions of an object $A$; these yield identical results under certain caveats [217,218]. A useful operational definition that can be computed efficiently in practice [305] is that of local box-counting dimension $d_L$. One partitions the ambient space $\mathbb{R}^d$ in which the object resides into boxes of size $\delta$, and counts how many boxes $N_\delta$ intersect the object. The local box-counting dimension of $A$ is then defined as

$$d_L \equiv \lim_{\delta \to 0} \frac{\log N_\delta}{\log \delta^{-1}}. \qquad (239)$$

This local box-counting dimension clearly coincides with the ordinary Euclidean dimension of any smooth, rectifiable object. But it also assigns a useful geometric dimension to very rough curves, such as the graph of Brownian motion $W(t)$ versus time, which has $d_L = 1.5$ (see [215]). The fractional value of this dimension indicates that the small-scale wrinkles in the Brownian graph cause it to have space-filling properties intermediate between that of a curve and of a solid region in a plane. Curves of arbitrary local box-counting fractal dimension from 1 to the dimension of the embedding space can be constructed [215]; greater values indicate wilder local fluctuations.

When we speak of the fractal dimension of a physical entity like a front, we do not mean the technical fractal dimension of the front, defined by the behavior on the smallest scales. We are rather interested in the structure of the curve over some finite range of scales with both an upper and lower cutoff, particularly ranges of self-similarity such as the inertial range. For these purposes, the size of the boxes $\delta$ in the definition of the local box-counting dimension must be restricted to lie within the range of desired scales.
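The box-counting definition over a finite range of scales can be illustrated with a small sketch. This is not the procedure of [305]; it is a minimal pure-Python illustration in which the function names, the grid anchored at the origin, and the dyadic box sizes are all assumptions made for the example. It counts occupied boxes for a sampled planar set and takes the least-squares slope of $\log N_\delta$ against $\log \delta^{-1}$.

```python
import math


def box_count(points, delta):
    """Number of grid boxes of side `delta` (grid anchored at the origin)
    that contain at least one of the sampled points."""
    return len({(math.floor(x / delta), math.floor(y / delta)) for x, y in points})


def box_counting_dimension(points, deltas):
    """Least-squares slope of log N_delta versus log(1/delta) over the
    supplied (finite) range of box sizes."""
    xs = [math.log(1.0 / d) for d in deltas]
    ys = [math.log(box_count(points, d)) for d in deltas]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sxy / sxx


# Box sizes restricted to a finite range of scales (hypothetical choice).
deltas = [2.0 ** -k for k in range(1, 7)]

# A smooth curve: the diagonal of the unit square, finely sampled.
diagonal = [(i / 4096, i / 4096) for i in range(4096)]

# A solid region: a 64 x 64 lattice filling the unit square.
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

dim_diagonal = box_counting_dimension(diagonal, deltas)
dim_square = box_counting_dimension(square, deltas)

if __name__ == "__main__":
    print("diagonal:", dim_diagonal)
    print("filled square:", dim_square)
```

For these two smooth test sets the fitted slopes should come out at the Euclidean dimensions 1 and 2; a rough curve sampled finely enough would give an intermediate value over its self-similar range.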
When the front possesses approximate self-similarity over a substantial portion of this range, then a graph of $\log N_\delta$ versus $\log \delta^{-1}$ will produce points lying nearly along a straight line. A more robust and meaningful assessment of the fractal dimension of the front on this range of scales is obtained in empirical measurements [305] through the slope of the best-fit line, rather than through a literal implementation of Eq. (239).

In our mathematical analysis, we have computed the second-order statistics after having taken the limit of zero Kolmogorov dissipation length and no molecular diffusion, so there is no lower limit to the inertial range behavior of the passive scalar interface. We can therefore equate the fractal dimension for the inertial range in our model with that computed from a $\delta \to 0$ mathematical limit. To this end, we will appeal to a theorem of Orey [259].

If $(x, f(x))$ is the graph of a statistically homogeneous Gaussian random field, and

$$c_1 |x|^{2\alpha} \le \langle (f(x + x') - f(x'))^2 \rangle \le c_2 |x|^{2\alpha} \qquad (240a)$$

for $|x| \ll 1$ and positive constants $c_1$, $c_2$, then for almost every realization, this graph has (Hausdorff) fractal dimension

$$d_{L,H} = 2 - \alpha. \qquad (240b)$$

The definition of the Hausdorff fractal dimension referred to in this theorem is somewhat technical, and we refer the reader to [249] for a complete discussion. It suffices for our discussion to note that the Hausdorff fractal dimension coincides with the more intuitive local box-counting dimension in
standard situations [215]. There are caveats and counterexamples to this statement [218,332] which we do not wish to dwell on here, since we see no reason why they should arise in our application to passive scalar interfaces produced in a shear flow. We will therefore simply refer to the Hausdorff dimension as the local fractal dimension $d_L$. The adjective "local" is important, however, since the scalar level set in the RSTS-I Model exhibits a crossover to a different self-similar structure at large scales which is characterized by a different fractal dimension, as we will explain in detail later.

The exponent $\alpha$ appearing in Eq. (240a) is just the Hurst exponent of the graph [215], and acts as a Hölder exponent for almost every realization of the random graph [1]. Smooth random curves have Hurst exponent $\alpha = 1$, and fractal dimension $d_L$ equal to their topological dimension of unity. Rougher random curves have smaller Hurst exponents $\alpha$ and, according to Eq. (240b), larger values of fractal dimension $d_L$.

3.5.3.2. Local fractal dimension of passive scalar level sets. We now show how the local fractal dimension of passive scalar level sets on the inertial range of scales may be exactly computed in the RSTS-I Model through the exact formula for the pair-distance function in the inertial range and Orey's theorem (240). We assume that the level set of $T(x, y, t)$ of interest is initially a flat line lying along $y = 0$. Because there is no molecular diffusion, the level set at any later time is simply given by the image of the initial level set $y = 0$ under the (random) mapping of Lagrangian tracers induced by the motion of the shear flow. (This follows mathematically from the method of characteristics applied to the first order PDE which results from setting $\kappa = 0$ in the advection-diffusion PDE.) Therefore, the scalar level set of interest at time $t$ is described by the graph $(x, Y_x(t))$, where

$$Y_x(t) = \int_0^t v(x, s)\, ds$$

is the $y$ position at time $t$ of the tracer starting from $(x, 0)$ at time zero.

Now, $Y_x(t)$ is a statistically homogeneous Gaussian random field in $x$, since it is just a time integral over the statistically homogeneous Gaussian random field $v(x, t)$. In Section 3.5.2, we computed that, within the inertial range of scales,

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) = 4 t^2 \int_0^\infty (1 - \cos(2\pi k x))\, A_E k^{-1-2H}\, \Psi(t A_\tau^{-1} k^z)\, dk. \qquad (241)$$

Since the passive scalar level set is the graph of a homogeneous Gaussian random field, we can compute its local fractal dimension from the small $x$ asymptotics of this quantity, using Orey's theorem. The results are presented in Table 11. The phase diagram coincides with that for the relative mean-square displacement $\sigma^2_{\Delta Y,I}(t|x)$ (Fig. 17).

The small $x$ asymptotics of the integral appearing in Eq. (241) are computed as follows. Within Region R-2, one can simply replace $(1 - \cos 2\pi k x)$ by its small $x$ limit $2\pi^2 k^2 x^2$, and appeal to the dominated convergence theorem using the fact (235c) that $(1 + u)\Psi(u)$ is a bounded function.
Table 11
Local fractal dimension of scalar level sets within inertial range of scales in Random Spatio-Temporal Shear Model

Parameter regime                      Local fractal dimension $d_L$     Phase region
$z \ge 0$, $0 < H < (2 - z)/2$        $d_L = 2 - H - z/2$               R-1
$z \ge 0$, $(2 - z)/2 < H < 1$        $d_L = 1$                         R-2
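The content of Table 11 can be summarized in a few lines of code. The following is a minimal sketch, with function names chosen here for illustration: the Hurst exponent of the interface graph is taken as $\alpha = \min(H + z/2, 1)$, which is how the two table rows combine, and Orey's formula (240b) is then applied.

```python
def interface_hurst_exponent(H, z):
    """Hurst exponent alpha of the level-set graph implied by Table 11.

    Assumes 0 < H < 1 and z >= 0. The cap at 1 corresponds to Region R-2,
    where the interface is locally smooth.
    """
    if not (0.0 < H < 1.0) or z < 0.0:
        raise ValueError("need 0 < H < 1 and z >= 0")
    return min(H + z / 2.0, 1.0)


def local_fractal_dimension(H, z):
    """Local fractal dimension d_L = 2 - alpha (Orey's theorem, Eq. (240b))."""
    return 2.0 - interface_hurst_exponent(H, z)


def phase_region(H, z):
    """Phase region label of Table 11. The borderline H = (2 - z)/2 is
    assigned to R-2 here purely as a convention; d_L equals 1 on it either way."""
    return "R-1" if H < (2.0 - z) / 2.0 else "R-2"


if __name__ == "__main__":
    for H, z in [(1 / 3, 2 / 3), (0.4, 0.0), (0.9, 1.0)]:
        print(H, z, phase_region(H, z), local_fractal_dimension(H, z))
```

Evaluating at $z = 0$ returns $2 - H$, the steady-shear value, and increasing $z$ at fixed $H$ lowers the dimension until it reaches 1.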
Dominated convergence using this approach would fail in Region R-1, and one must instead rescale integration variables $q = k|x|$, yielding

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) = 4 t^2 |x|^{2H} \int_0^\infty (1 - \cos(2\pi q))\, A_E q^{-1-2H}\, \Psi(t A_\tau^{-1} q^z |x|^{-z})\, dq.$$

If $z = 0$, then this quantity is clearly proportional everywhere to $|x|^{2H}$, yielding a fractal dimension $d_L = 2 - H = 2 - H - z/2$. For $z > 0$, one must use the large-argument asymptotics (235b) of $\Psi$. To establish that the limiting value of the integral is obtained by replacing the $\Psi$ factor in the integrand by its asymptotic limit, we apply the dominated convergence theorem to the integral multiplied by $|x|^{-z}$, using the bound (235c).

Note from Table 11 that an increase of the exponent $z$ at fixed value of $H$ results in a smoother interface; we next explain the physical reason for this. We must first of all note that, for the purposes of local smoothness, it is clearly the dynamics of the small scales (large wavenumbers) which play the dominant role. The temporal decorrelation time of the shear modes scales as $\tau(k) = A_\tau k^{-z}$ in the RSTS-I Model, so that the temporal correlations of high wavenumbers are weaker as $z$ increases. That is, $z \to \infty$ implies very rapid decorrelation of high wavenumbers whereas $z \to 0$ corresponds to a slow decorrelation. This is the opposite of the statement made in Section 3.3 concerning the relation of $z$ to the low wavenumber dynamics. One physical implication of Table 11, therefore, is that the passive scalar level sets become smoother as the temporal correlations of the high wavenumber velocity fluctuations decay more rapidly.

To understand this, consider first the case of a steady shear flow. This can be incorporated in our analysis by setting $K(u) \equiv 1$, which renders the temporal decorrelation factor $\Psi$ in Eq. (241) simply equal to unity. It is then readily seen that

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) = 4 t^2 |x|^{2H} \int_0^\infty (1 - \cos(2\pi k))\, A_E k^{-1-2H}\, dk,$$

so that by Orey's theorem, the scalar level sets would have a local fractal dimension $d_L = 2 - H$. This also coincides with the fractal dimension resulting from slow decorrelation of the high wavenumbers at a constant rate ($z = 0$). We see that the fractal structure of the velocity field is directly impressed upon the scalar interface; the Hurst exponent of the interface is exactly the Hurst
exponent $H$ of the velocity field. Suppose now that $z > 0$, so that the rate of decorrelation increases unboundedly at high wavenumbers. This means that after any finite time $t$, the velocity fluctuations at sufficiently high wavenumber $k \gg (A_\tau/t)^{1/z}$ will have passed through many correlation times. The structure of the scalar interface viewed in the corresponding range of scales $|x| \ll (A_\tau/t)^{-1/z}$ is therefore the result of many roughly independent pushes in time. The temporal fluctuations in the small-scale advection serve to at least partially wash out the influence of the fractal spatial structure of the velocity field on the interface, leading to a smaller interface fractal dimension than in the steady case. The rapid temporal fluctuations of the small-scale velocity components appear to produce a self-averaging, smoothing effect. This is somewhat reminiscent of molecular diffusion, which serves to smooth interfaces and has microscopically small spatio-temporal correlations. For $z > 2 - 2H$, the high wavenumber shear velocity modes fluctuate rapidly enough in time to produce smooth passive scalar level sets.

3.5.3.3. Global fractal dimension of passive scalar level sets. Our above physical description of how temporal correlations influence the roughening of the scalar interface by the turbulent shear flow applies only in the time-dependent range of scales $|x| \ll (A_\tau/t)^{-1/z}$. Here the wrinkling of the scalar interface has an anisotropically self-similar (or self-affine [216]) structure which coincides with the long-time asymptotic reported in Table 10:

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) \sim \begin{cases} 2 \tilde{K}^*_\Delta |x|^{2H+z}\, t & \text{if } z \ge 0,\ 0 < H < \dfrac{2 - z}{2}, \\[2mm] \dfrac{z}{z + H - 1}\, \tilde{K}_\Delta |x|^{2}\, t^{(2z + 2H - 2)/z} & \text{if } z \ge 0,\ \dfrac{2 - z}{2} < H < 1, \end{cases} \qquad \text{for } |x| \ll (A_\tau/t)^{-1/z}.$$

On the other hand, for $|x| \gg (A_\tau/t)^{-1/z}$, it can be shown directly by rescaling the integration variable $q = kx$ in Eq. (241) that

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) \sim K_\Delta\, t^2 |x|^{2H} \quad \text{for } |x| \gg (A_\tau/t)^{-1/z}, \qquad (242a)$$

$$K_\Delta = 2 \pi^{2H + 1/2}\, \frac{-\Gamma(-H)}{\Gamma(H + \tfrac{1}{2})}\, A_E, \qquad (242b)$$

which is identical to what a steady random shear velocity field would produce ($K(u) \equiv 1$). That is, the scalar interface has a distinct anisotropically self-similar structure for $|x| \ll (A_\tau/t)^{-1/z}$ and for $|x| \gg (A_\tau/t)^{-1/z}$, provided $z > 0$. The crossover length scale

$$L(t) \equiv (A_\tau/t)^{-1/z} \qquad (243)$$

describes the spatial scale below which the interface has settled statistically into its long-time asymptotic structure and above which the interface has not yet substantially felt the effects of temporal decorrelation of the shear flow.

The scalar level set can therefore be described as having two fractal dimensions: a local fractal dimension $d_L$ given in Table 11 for the structure on small scales $|x| \ll L(t)$, and a global fractal dimension

$$d_G = 2 - H \qquad (244)$$
(by formal analogy with Eqs. (240a) and (240b)) on sufficiently large scales $|x| \gg L(t)$. The strict computation of Hausdorff fractal dimension by Orey's theorem (240) involving the $x \to 0$ limit will always focus on the small scales and produce the local fractal dimension. For the case where $z = 0$ (wavenumber-independent decorrelation time scale $\tau(k) = A_\tau$), there is no crossover between fractal dimensions; the passive scalar level set is, for all times, exactly (anisotropically) self-similar with a single fractal dimension $d_L = d_G = 2 - H$:

$$\langle (Y_{x+x'}(t) - Y_{x'}(t))^2 \rangle \equiv \sigma^2_{\Delta Y,I}(t|x) = 4 t^2 |x|^{2H} \int_0^\infty (1 - \cos(2\pi k))\, A_E k^{-1-2H}\, \Psi(t/A_\tau)\, dk.$$
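The closed-form constant in Eq. (242b) rests on evaluating the steady-shear integral $\int_0^\infty (1 - \cos 2\pi q)\, q^{-1-2H}\, dq$. The sketch below is a numerical sanity check rather than part of the original derivation. It assumes the closed form $\tfrac{1}{2}\pi^{2H+1/2}\,(-\Gamma(-H))/\Gamma(H + \tfrac{1}{2})$ for this integral (so that multiplying by $4A_E$ gives $K_\Delta$), and compares it against a midpoint-rule quadrature with an analytic tail correction. The function names, cutoff `q_max`, and resolution `n` are illustrative choices.

```python
import math


def steady_integral_closed_form(H):
    """Assumed closed form of int_0^inf (1 - cos(2 pi q)) q^(-1-2H) dq, 0 < H < 1."""
    return 0.5 * math.pi ** (2.0 * H + 0.5) * (-math.gamma(-H)) / math.gamma(H + 0.5)


def steady_integral_numeric(H, q_max=50, n=200_000):
    """Midpoint rule on (0, q_max) plus the non-oscillatory tail
    int_{q_max}^inf q^(-1-2H) dq = q_max^(-2H) / (2H).
    With integer q_max the neglected oscillatory tail is O(q_max^(-2-2H))."""
    h = q_max / n
    total = 0.0
    for i in range(n):
        q = (i + 0.5) * h
        total += (1.0 - math.cos(2.0 * math.pi * q)) * q ** (-1.0 - 2.0 * H)
    return total * h + q_max ** (-2.0 * H) / (2.0 * H)


def k_delta(H, A_E=1.0):
    """Constant of Eq. (242b): four times A_E times the steady integral."""
    return 4.0 * A_E * steady_integral_closed_form(H)


if __name__ == "__main__":
    H = 1.0 / 3.0
    print("closed form:", steady_integral_closed_form(H))
    print("numerical:  ", steady_integral_numeric(H))
```

For $H = 1/2$ the closed form reduces to $\pi^2$, which provides an independent check on the Gamma-function expression.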
3.5.3.4. Comparison of model fractal dimensions with experimental results. We pause now to relate the fractal dimensions computed for the passive scalar level sets in our RSTS-I Model to the local box-counting fractal dimensions measured within the inertial range of scales in laboratory turbulence. At the Kolmogorov point ($H = 1/3$, $z = 2/3$), the local fractal dimension is $d_L = 4/3$ while the global fractal dimension is $d_G = 5/3$. These values agree extremely well with fractal dimensions computed by Sreenivasan and coworkers using box-counting on two-dimensional section images obtained in a variety of dyed turbulent flows using laser-induced fluorescence [272,305]. In particular, they find a common fractal dimension of $1.36 \pm 0.05$ for the interface between the dyed turbulent and un-dyed quiescent region in jets, wakes, mixing layers, and boundary layers [271], and a fractal dimension of $1.67 \pm 0.04$ for level sets of the dye concentration within the interior of the dyed turbulent region [75]. The interface fractal dimension from these experiments is very close to the value 1.35 observed by Lovejoy to govern the boundary of cloud and rain areas over three decades of scales [199].

We hasten to mention that the agreement between the experiments and the model predictions is not direct. In particular, the experiments do not report a crossover of fractal dimensions within the inertial-convective range for a single passive scalar level set; the different values reported correspond to fractal dimensions in different parts of the flow. We only wish to point out that the numerical fractal dimensions computed in the model do appear in the real world. The model may be suggesting, though, that the reason for observing different fractal dimensions for different level sets may have something to do with the temporal dynamics of the turbulent flow; we come back to this point in Paragraph 3.5.5.2.

3.5.4. Fractal dimension of scalar interfaces after isotropic renormalization

We will now consider large-scale, long-time properties of the pair-distance function within the inertial range of scales. We proceed via a renormalization process similar to that used for the mean passive scalar density in Section 3.4.3, with the important distinction that the integral length scale has already been removed to infinity, and need not be linked with the spatial rescaling. Recall that the presence of an integral length scale is necessary for analysis of the mean passive scalar density to prevent an infrared divergence of energy. The pair-distance function, however, filters the effects of large-scale fluctuations so that the low-wavenumber cutoff can be removed from consideration forthwith. We therefore simply need to rescale the spatial arguments $x \to x/\lambda$, $y \to y/\lambda$, and the time variable $t \to t/\rho(\lambda)$ in the pair-distance function in a suitably linked fashion so that a nontrivial
limiting "fixed point"

$$\bar{Q}_t(y|x) = \lim_{\lambda \to 0} \lambda^{-1}\, Q^{I}_{t/\rho(\lambda)}\!\left( \frac{y}{\lambda} \,\Big|\, \frac{x}{\lambda} \right) \qquad (245)$$

is achieved. Since the pair-distance function is a PDF for the spatial location of the tracer along the shear direction $y$, we have also rescaled its amplitude by $\lambda^{-1}$ to preserve the law of total probability

$$\int_{-\infty}^{\infty} Q^{I}_t(y|x)\, dy = 1$$

throughout the renormalization. We note that the inertial-range pair-distance function enjoys an exact scaling invariance

$$Q^{I}_t(y|x) = (\beta(\lambda))^{-1}\, Q^{I}_{t/\rho(\lambda)}\!\left( \frac{y}{\beta(\lambda)} \,\Big|\, \frac{x}{\lambda} \right), \qquad (246a)$$

with

$$\rho(\lambda) = \lambda^{z}, \qquad \beta(\lambda) = \lambda^{z + H}. \qquad (246b)$$

This may be directly checked from the explicit formula for the pair-distance function (see Eq. (229)),

$$Q^{I}_t(y|x) = \frac{\exp\!\left(-y^2/(2\sigma^2_{\Delta Y,I}(t|x))\right)}{\sqrt{2\pi \sigma^2_{\Delta Y,I}(t|x)}},$$

$$\sigma^2_{\Delta Y,I}(t|x) = 4 t^2 \int_0^\infty (1 - \cos(2\pi k x))\, A_E k^{-1-2H}\, \Psi(t A_\tau^{-1} k^z)\, dk.$$

The scaling invariance (246) is anisotropic except when $z + H = 1$, whereas the renormalization (245) involves an isotropic rescaling. In the case where $H + z = 1$, the rescaling (246b) may be used for the renormalization, and the renormalized pair-distance function $\bar{Q}_t(y|x)$ will coincide with $Q^{I}_t(y|x)$. For the other cases, the renormalized pair-distance function will differ from the "bare" pair-distance function $Q^{I}_t(y|x)$.

One key feature of the renormalized pair-distance function is that it will enjoy an isotropic scaling invariance defined by the renormalization group leading to it:

$$\bar{Q}_t(y|x) = \lambda^{-1}\, \bar{Q}_{t/\rho(\lambda)}\!\left( \frac{y}{\lambda} \,\Big|\, \frac{x}{\lambda} \right) \quad \text{for all } \lambda > 0.$$

A crucial fact behind this statement is that the renormalization does not affect any other fundamental length scales; the renormalized mean passive scalar density discussed in Paragraph 3.4.3.3 does not necessarily enjoy a similar scaling invariance because the integral length scale is rescaled along with space [10,14,141].

The present renormalization of the pair-distance function is quite similar in spirit to the renormalization group in critical phenomena [126,204]. In that context, the Hamiltonian for the
Table 12
Properties of pair distance function and fractal dimension of level sets under isotropic renormalization within inertial range of scales in Random Spatio-Temporal Shear Model

Parameter regime               Temporal rescaling function $\rho(\lambda)$   Renormalized mean-square relative displacement $\bar{\sigma}^2_{\Delta Y}(t|x)$   Fractal dimension $\bar{d}$   Phase region
$z \ge 0$, $0 < H < 1 - z$     $\lambda^{2 - 2H - z}$                        $2\tilde{K}^*_\Delta |x|^{2H+z}\, t$                                             $2 - H - z/2$                 R-I1
$z \ge 0$, $1 - z < H < 1$     $\lambda^{1 - H}$                             $K_\Delta |x|^{2H}\, t^2$                                                        $2 - H$                       R-I2
$z \ge 0$, $H = 1 - z$         $\lambda^{1 - H}$                             $\sigma^2_{\Delta Y,I}(t|x)$                                                     $2 - H - z/2$                 Boundary
microscopic physics is coarse-grained in a certain scale-invariant fashion to produce a Hamiltonian purported to give a macroscopic picture of the physical system under consideration. In the coarse-graining procedure, certain physical quantities need to be rescaled by some power law, and one initially leaves the exponents unspecified. One discovers, however, that only for special values of these critical exponents does the renormalization procedure tend toward a nontrivial macroscopic Hamiltonian. For these special values, the limiting Hamiltonian is itself scale-invariant and fixed under successive application of the renormalization group. Such a Hamiltonian is thus naturally called a fixed point (of the renormalization group flow). We can similarly view $\bar{Q}_t(y|x)$ as a fixed point of the renormalization defined by Eqs. (246a) and (246b).

3.5.4.1. Results of renormalization. In Table 12, we report the unique temporal rescaling power law $\rho(\lambda)$ necessary for the isotropic renormalization to converge to a nontrivial fixed point, as well as the properties of the resulting renormalized pair-distance function $\bar{Q}_t(y|x)$. The Gaussian form of the pair-distance function is preserved under the renormalization, so $\bar{Q}_t(y|x)$ is completely characterized in all cases by the renormalized mean-square relative displacement

$$\bar{\sigma}^2_{\Delta Y}(t|x) \equiv \lim_{\lambda \to 0} \lambda^2\, \sigma^2_{\Delta Y,I}\!\left( \frac{t}{\rho(\lambda)} \,\Big|\, \frac{x}{\lambda} \right).$$

One may directly write the pair-distance function as

$$\bar{Q}_t(y|x) = \frac{\exp\!\left(-y^2/(2\bar{\sigma}^2_{\Delta Y}(t|x))\right)}{\sqrt{2\pi \bar{\sigma}^2_{\Delta Y}(t|x)}},$$

or equivalently as the solution of a diffusion equation

$$\frac{\partial \bar{Q}_t(y|x)}{\partial t} = \bar{D}_\Delta(x, t)\, \frac{\partial^2 \bar{Q}_t(y|x)}{\partial y^2}, \qquad \bar{Q}_{t=0}(y|x) = \delta(y), \qquad (247)$$

with renormalized relative diffusion coefficient

$$\bar{D}_\Delta(x, t) \equiv \frac{1}{2}\, \frac{\partial \bar{\sigma}^2_{\Delta Y}(t|x)}{\partial t}.$$
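The equivalence between the Gaussian form and the diffusion equation (247) can be verified in a few lines. The following is a sketch of that check, not a reproduction of the original argument; the shorthand $s$ for the renormalized mean-square relative displacement at fixed $x$ is introduced here only for brevity.

```latex
\begin{align*}
\bar{Q}_t(y|x) &= \frac{1}{\sqrt{2\pi s}} \exp\!\left(-\frac{y^2}{2s}\right),
  \qquad s \equiv \bar{\sigma}^2_{\Delta Y}(t|x), \\
\frac{\partial \bar{Q}_t}{\partial t}
  &= \frac{\partial s}{\partial t}\left(\frac{y^2}{2s^2} - \frac{1}{2s}\right)\bar{Q}_t, \\
\frac{\partial^2 \bar{Q}_t}{\partial y^2}
  &= \left(\frac{y^2}{s^2} - \frac{1}{s}\right)\bar{Q}_t, \\
\Longrightarrow \quad
\frac{\partial \bar{Q}_t}{\partial t}
  &= \frac{1}{2}\frac{\partial s}{\partial t}\,
     \frac{\partial^2 \bar{Q}_t}{\partial y^2}
   = \bar{D}_\Delta(x,t)\,\frac{\partial^2 \bar{Q}_t}{\partial y^2}.
\end{align*}
```

Since both power laws for the renormalized displacement vanish as $t \to 0$, the Gaussian collapses to $\delta(y)$ at the initial time, consistent with the stated initial condition.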
Fig. 18. Phase diagram for renormalized pair distance function in Random Spatio-Temporal Shear Model with Inertial Range, no cross shear transport.
The scaling laws for $\bar{\sigma}^2_{\Delta Y}(t|x)$ have coefficients given by Eqs. (238a) and (242b). In each case, the renormalized pair-distance function self-consistently enjoys an isotropic scaling invariance corresponding to the rescalings in the renormalization which produced it:

$$\bar{Q}_t(y|x) = \lambda^{-1}\, \bar{Q}_{t/\rho(\lambda)}\!\left( \frac{y}{\lambda} \,\Big|\, \frac{x}{\lambda} \right).$$

The renormalized behavior of the pair-distance function has some sharply different features on either side of the phase boundary $H + z = 1$, drawn in Fig. 18. The formulas for the renormalized relative mean-square displacement $\bar{\sigma}^2_{\Delta Y}(t|x)$ and the fractal dimension $\bar{d}$ jump discontinuously across the phase boundary. We now explain why this happens, then derive the results stated in Table 12.

It is readily checked that Region R-I1 has the same properties as the long-time asymptotics of Region R-1 in the phase diagram before renormalization presented in Fig. 17. The reason is that for this range of parameters, the appropriate temporal rescaling function $\rho(\lambda)$ is such that $\lambda L(t/\rho(\lambda)) \to \infty$ as $\lambda \to 0$, where $L(t)$ is the crossover length defined in Eq. (243). Consequently, the isotropic renormalization zooms in on the passive scalar level set structure below the crossover length, where it has settled down to its long-time asymptotic statistics and is governed by the local fractal dimension $d_L$ reported in Table 11. Note that Region R-I1 of Fig. 18 lies entirely within Region R-1 of Fig. 17.

Region R-I2, on the other hand, corresponds to temporal rescalings in which $\lambda L(t/\rho(\lambda)) \to 0$, so that the isotropic renormalization zooms in on the structure of the passive scalar set on scales above the crossover. Here, the effects of temporal fluctuations of the velocity field are not felt; the passive scalar level set is evolving as in a steady shear flow. The fractal dimension characterizing the renormalized fixed point is given in Region R-I2 by the global value $d_G = 2 - H$; see the discussion around Eq. (244).
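The entries of Table 12 and the isotropic scaling invariance can be checked with a short sketch. The code below is illustrative only: the function names, the tolerance used to detect the phase boundary, and the unit coefficient standing in for the constants of Eqs. (238a) and (242b) are assumptions. It selects the temporal rescaling exponent $\zeta$ (with $\rho(\lambda) = \lambda^\zeta$) and then tests the invariance of the Gaussian fixed point numerically.

```python
import math


def classify(H, z, tol=1e-12):
    """Region, rescaling exponent zeta and fractal dimension from Table 12."""
    if not (0.0 < H < 1.0) or z < 0.0:
        raise ValueError("need 0 < H < 1 and z >= 0")
    if abs(H + z - 1.0) <= tol:
        return {"region": "Boundary", "zeta": 1.0 - H, "dimension": 2.0 - H - z / 2.0}
    if H < 1.0 - z:
        # Renormalization zooms in below the crossover length.
        return {"region": "R-I1", "zeta": 2.0 - 2.0 * H - z, "dimension": 2.0 - H - z / 2.0}
    # Renormalization zooms in above the crossover length (steady-shear behavior).
    return {"region": "R-I2", "zeta": 1.0 - H, "dimension": 2.0 - H}


def sigma2_bar(t, x, H, z, coeff=1.0):
    """Pure power-law renormalized mean-square relative displacement.
    `coeff` is a placeholder for the constants of Eqs. (238a)/(242b)."""
    region = classify(H, z)["region"]
    if region == "R-I1":
        return coeff * abs(x) ** (2.0 * H + z) * t
    if region == "R-I2":
        return coeff * abs(x) ** (2.0 * H) * t ** 2
    raise ValueError("no pure power law on the phase boundary")


def q_bar(y, t, x, H, z):
    """Gaussian renormalized pair-distance function."""
    var = sigma2_bar(t, x, H, z)
    return math.exp(-y * y / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def q_bar_rescaled(y, t, x, H, z, lam):
    """Right-hand side of the invariance: lam^-1 * Q_{t/rho(lam)}(y/lam | x/lam)."""
    zeta = classify(H, z)["zeta"]
    return q_bar(y / lam, t / lam ** zeta, x / lam, H, z) / lam


if __name__ == "__main__":
    for H, z in [(0.2, 0.5), (0.7, 0.6), (1 / 3, 2 / 3)]:
        print(H, z, classify(H, z))
```

Perturbing $H$ slightly to either side of the Kolmogorov point $(1/3, 2/3)$ switches the returned region between R-I1 and R-I2, which is the discontinuity in $\bar{d}$ described in the text.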
The discontinuity of the behavior of the renormalized fixed point across the phase boundary $H + z = 1$ is thereby shown to result from a jump between zooming in on the level set structure above and below the crossover length $L(t)$. On the phase boundary $H + z = 1$ itself, the unrenormalized pair-distance function within the inertial range $Q^{I}_t(y|x)$ obeys the exact isotropic scaling symmetry

$$Q^{I}_t(y|x) = \lambda^{-1}\, Q^{I}_{t/\rho(\lambda)}\!\left( \frac{y}{\lambda} \,\Big|\, \frac{x}{\lambda} \right)$$

for $\rho(\lambda) = \lambda^{2 - 2H - z}$, and therefore persists unchanged under the isotropic renormalization: $\bar{Q}_t(y|x) = Q^{I}_t(y|x)$. The renormalized mean-square relative tracer displacement $\bar{\sigma}^2_{\Delta Y}(t|x)$ and the renormalized relative diffusion coefficient $\bar{D}_\Delta(t|x)$ remain equal to their bare values $\sigma^2_{\Delta Y,I}(t|x)$ (237) and $\tfrac{1}{2}\,\partial \sigma^2_{\Delta Y,I}(t|x)/\partial t$, which do not have a simple scaling form in $x$ and $t$.

Interestingly, the phase boundary $H + z = 1$ for the renormalized pair-distance function is exactly the locus of parameters for RSTS-I Models with temporal correlations dictated by the natural eddy turnover time scale (see Eq. (211a)). In particular, the Kolmogorov point ($H = 1/3$, $z = 2/3$) falls on the phase boundary. One interesting implication of this in the model is that the fractal dimension of the level set on large scales within the inertial range is very sensitive to perturbations of the exponents $H$ and $z$ from the exact Kolmogorov exponents, since the fractal dimension $\bar{d}$ jumps discontinuously across the phase boundary. In particular, while the Kolmogorov point itself is associated with a level set fractal dimension of $\bar{d} = 4/3$, a small perturbation of $H$ or $z$ can give rise to a fractal dimension of $\bar{d} = 5/3$.

3.5.4.2. Renormalization computation. We first of all seek the scaling law $\rho(\lambda)$ for which

$$\bar{Q}_t(y|x) \equiv \lim_{\lambda \to 0} \lambda^{-1}\, Q^{I}_{t/\rho(\lambda)}\!\left( \frac{y}{\lambda} \,\Big|\, \frac{x}{\lambda} \right)$$

is nontrivial, i.e., neither zero nor infinity. We will operate under the assumption that $\rho(\lambda)$ must be a power law, $\rho(\lambda) = \lambda^{\zeta}$ with $\zeta > 0$; we leave it to the reader to verify that there is in fact no other suitable scaling for the renormalization. In the course of our computation, it will be seen that the scaling exponent $\zeta$ is uniquely determined by $H$ and $z$.

From the explicit formula (229) for the pair-distance function, we see that our task is equivalent to finding the scaling law $\rho(\lambda)$ so that

$$\bar{\sigma}^2_{\Delta Y}(t|x) \equiv \lim_{\lambda \to 0} \lambda^2\, \sigma^2_{\Delta Y,I}\!\left( \frac{t}{\rho(\lambda)} \,\Big|\, \frac{x}{\lambda} \right)$$

is nontrivial. The renormalized pair distance function will then be simply

$$\bar{Q}_t(y|x) = \frac{\exp\!\left(-y^2/(2\bar{\sigma}^2_{\Delta Y}(t|x))\right)}{\sqrt{2\pi \bar{\sigma}^2_{\Delta Y}(t|x)}}. \qquad (248)$$
Now, by a rescaling of the integration variable in Eq. (237), we have the following expression for the renormalized mean-square relative tracer displacement:

$$\bar{\sigma}^2_{\Delta Y}(t|x) = \lim_{\lambda \to 0} 4 A_E\, \rho^{-2}(\lambda)\, \lambda^{2 - 2H}\, t^2 |x|^{2H}\, \mathcal{J}_\Psi\!\left( \lambda^{z} \rho^{-1}(\lambda)\, \frac{t}{A_\tau |x|^{z}} \right), \qquad (249a)$$

$$\mathcal{J}_\Psi(u) \equiv \int_0^\infty (1 - \cos(2\pi q))\, q^{-1-2H}\, \Psi(u q^{z})\, dq. \qquad (249b)$$

We now try various choices of rescaling $\rho(\lambda) = \lambda^{\zeta}$, and see for which values of $H$ and $z$ a nontrivial limit is self-consistently produced. We will find that only one choice of $\zeta$ will work for each $(H, z)$. It is helpful to separately consider three possible cases: $\zeta = z$, $\zeta > z$, and $0 < \zeta < z$.

$\zeta = z$: When $\zeta = z$, the argument of $\mathcal{J}_\Psi$ in Eq. (249a) remains fixed as $\lambda \to 0$, and therefore a nontrivial renormalized limit requires that the prefactors balance, i.e., $\rho(\lambda) = \lambda^{1 - H}$. Since $\rho(\lambda) = \lambda^{\zeta} = \lambda^{z}$ is assumed, self-consistency requires that $z = 1 - H$, which is exactly the phase boundary in Fig. 18. Here, the renormalization leaves $\sigma^2_{\Delta Y,I}(t|x)$ fixed, which is not surprising because the scaling invariance (246) which this quantity enjoys is isotropic for $z = 1 - H$.

$\zeta > z$: For temporal rescalings with $\zeta > z$, the argument of $\mathcal{J}_\Psi$ in (249a) diverges as $\lambda \to 0$. We therefore require the large $u$ asymptotics of $\mathcal{J}_\Psi(u)$. Writing $C_\Psi$ for the coefficient of $u^{-1}$ in the large-argument asymptotics (235b) of $\Psi$, we have, as $u \to \infty$,

$$\mathcal{J}_\Psi(u) \sim C_\Psi\, u^{-1} \int_0^\infty (1 - \cos(2\pi q))\, q^{-1-2H-z}\, dq = \frac{1}{2}\, \pi^{2H + z + 1/2}\, \frac{-\Gamma(-H - z/2)}{\Gamma(H + (z+1)/2)}\, C_\Psi\, u^{-1}, \qquad (250)$$

which follows formally from the large argument asymptotics (235b) of $\Psi$, provided that $H < 1 - z/2$ so that the integral appearing in the middle expression is finite. The dominated convergence theorem in conjunction with Eq. (235c) guarantees (250) rigorously in this case. Therefore, for $\zeta > z$,

$$\bar{\sigma}^2_{\Delta Y}(t|x) = \lim_{\lambda \to 0} 2 A_E\, \lambda^{2 - 2H - z - \zeta}\, t\, |x|^{2H + z}\, C_\Psi A_\tau\, \pi^{2H + z + 1/2}\, \frac{-\Gamma(-H - z/2)}{\Gamma(H + (z+1)/2)},$$

and the existence of a nontrivial limit requires that we choose $\zeta = 2 - 2H - z$. Self-consistency with the assumption $\zeta > z$ means that this scaling can only work when $H < 1 - z$, which is just Region R-I1 in Fig. 18. The other required condition in the computation, $H < 1 - z/2$, is automatically satisfied within Region R-I1, and therefore the renormalization developed here is indeed valid for that regime of parameters.

$\zeta < z$: When $\zeta < z$, the argument of $\mathcal{J}_\Psi$ in Eq. (249a) converges to zero, and we need instead the small argument asymptotics:

$$\lim_{u \to 0} \mathcal{J}_\Psi(u) = \int_0^\infty (1 - \cos(2\pi q))\, q^{-1-2H}\, dq = \frac{1}{2}\, \pi^{2H + 1/2}\, \frac{-\Gamma(-H)}{\Gamma(H + \tfrac{1}{2})},$$
which follows from Eq. (235a) and the boundedness of $\Psi$. The renormalized limit of the mean-square relative tracer displacement then takes the form

$$\bar{\sigma}^2_{\Delta Y}(t|x) = \lim_{\lambda \to 0} 2 A_E\, \lambda^{2 - 2H - 2\zeta}\, t^2 |x|^{2H}\, \pi^{2H + 1/2}\, \frac{-\Gamma(-H)}{\Gamma(H + \tfrac{1}{2})}.$$

A nontrivial limit requires that we choose $\zeta = 1 - H$, and self-consistency with $\zeta < z$ means that this renormalization is valid precisely for Region R-I2.

Self-similarity and fractal dimension of renormalized passive scalar field. Having computed the appropriate rescaling $\rho(\lambda)$ and renormalized mean-square shear-parallel displacement $\bar{\sigma}^2_{\Delta Y}(t|x)$ for each value of $H$ and $z$, we obtain the renormalized pair distance function from Eq. (248). Its scaling invariance under the isotropic rescaling defining the renormalization group can be directly checked. The renormalization preserves the Gaussianity of the graph describing the level set of the passive scalar field initially lying along $y = 0$, so the fractal dimension of this level set for the renormalized passive scalar field may be obtained from Orey's theorem. The exponent $\alpha$ to be used in Eq. (240) is readily read off from the formulas for $\bar{\sigma}^2_{\Delta Y}(t|x)$, which have pure power law scaling in $x$. This concludes the derivation of the results stated in Table 12.

3.5.5. Open problem: fractal dimension of scalar interfaces with small but nonvanishing molecular diffusivity

The simple methodology used to compute the rich inertial-range behavior of the pair-distance function and the passive scalar level set fractal dimensions in Section 3.5 has relied fundamentally on the neglect of molecular diffusion. Its inclusion significantly complicates matters. Instead of Eq. (225), we would have (for no cross sweep, $\bar{w} = 0$)

$$Y_{x+x'}(t) - Y_{x'}(t) = \int_0^t \left( v(x' + x + \sqrt{2\kappa}\, W_x(s), s) - v(x' + \sqrt{2\kappa}\, W_x(s), s) \right) ds + \sqrt{2\kappa}\, W_y(t),$$

where $(W_x(t), W_y(t))$ is a Brownian motion. The appearance of $W_x(s)$ in the argument of $v(x, s)$ destroys the Gaussianity, and thereby the resulting development of the pair-distance function hinging on this property. Nonetheless, there are more sophisticated tools available for handling the effects of molecular diffusion on a spatio-temporal random shear flow in a mathematically rigorous fashion [10,14]. Furthermore, any moments of the pair distance function with diffusion can be treated explicitly by following the procedure utilized in Section 3.2.

We pose as an open problem the mathematical examination of the properties of the pair-distance function and the structure of passive scalar level sets within the inertial-convective range of scales in the RSTS-I Model when molecular diffusion is present but weak compared to the large scale turbulent advection (high but finite Péclet number). By inertial-convective range of scales, we mean the portion of the inertial range over which the effects of molecular diffusion are formally subdominant to turbulent diffusion, in the sense that $\kappa \ll \tau_L(x) \langle (\delta v(x))^2 \rangle$, where $\langle (\delta v(x))^2 \rangle \sim A_E |x|^{2H}$ denotes the mean-square velocity fluctuation measured over a distance $x$ across the shear and $\tau_L(x) \sim \min(x^2/\kappa, A_\tau x^{z})$ is an approximate Lagrangian correlation time associated to this scale. The inertial-convective range can be equivalently defined as the asymptotic regime $L_K, L_d \ll x \ll L_0$,
where $L_d$ is a length scale above which molecular diffusion is formally subdominant to turbulent advection (see Paragraph 4.3.3.1).

Since we are concerned with scales $x \gg L_d$, the effects of molecular diffusion might naively be assumed to be negligible. The limit of vanishing molecular diffusion, however, has a number of singular features which require it to be studied carefully [74]. For example, without molecular diffusion, a sharp interface between distinct fluids will forever remain sharp, while any molecular diffusion, however small, will smooth the variation of the passive scalar field across the interface. The thickness of the resulting front can be expected to be on the order of $L_d$, which is very small compared to the scales of interest, but the geometric structure along the front at all length scales could be significantly affected by the fact that the passive scalar field has been smoothed across the front. The fractal dimension of passive scalar level sets within the inertial-convective range of scales may therefore well assume different values in the presence of molecular diffusion than those computed in Section 3.5.3 for $\kappa = 0$.

Experimental measurements [271,272] of the fractal dimensions of interfaces generally involve flows such as wakes, jets, and layers, wherein the large-scale shear is directed along the coarse-grained interface. The passive scalar level set emanating from $y = 0$ which we considered in Section 3.5.3, on the other hand, gets hit head-on by the shear flow. To bring the model closer to the experimental setup, it would be preferable to consider the fractal dimension of a passive scalar level set evolving from $x = 0$. For $\kappa = 0$, such a level set remains forever straight, but for $\kappa > 0$, the interaction of molecular diffusion and the shear will wrinkle it. An interesting question is how the fractal dimensions of the level sets evolving from $x = 0$ and $y = 0$ compare to each other when $\kappa > 0$.
The inclusion of the effects of molecular diffusion into the RSTS-I Model would put it in a position to interact with various physical and mathematical theories which have been put forth concerning the fractal dimension of passive scalar level sets over the inertial-convective range of scales. We briefly mention some of these now, along with some questions concerning them raised by the RSTS-I Model as developed here.
3.5.5.1. Theories and bounds for fractal dimensions of passive scalar level sets.
Several theories have been put forward to explain either the observed fractal dimension of $1.36 \pm 0.05 \approx 4/3$ for scalar level sets at turbulent interfaces [271] or the dimension $1.67 \pm 0.04 \approx 5/3$ for scalar level sets immersed well within a turbulent fluid [75]. That the fractal dimension of the interface between dyed and non-dyed regions of a turbulent flow should be 4/3 (modulo intermittency corrections) has been argued in several ways. One, offered independently by Sreenivasan et al. [235,310] and by Gouldin [128], proceeds from the experimental observation that the flux of dye across the interface depends only on the large-scale parameters and not on molecular diffusivity, which demands that the interface must increasingly contort itself to provide the surface area to maintain this constant flux as the Péclet number is increased. Another approach, due to Hentschel and Procaccia [132], analyzes Richardson's pair-distance function in the limit of zero molecular diffusion by positing a diffusion PDE for this quantity with relative diffusivity depending as a power law on current tracer separation and elapsed time. If the exponents for the relative diffusivity are chosen in accordance with the Kolmogorov–Obukhov completely self-similar theory, then the fractal dimension of any level set, as computed from the pair-distance function and the assumption that Eqs. (240a) and (240b) hold also for non-Gaussian graphs, is $d = 4/3$.
More recently, Constantin et al. [75] have predicted an interface fractal dimension of
$$d = 1 + H \quad \text{(interface)} , \tag{251}$$
based on a certain rigorous mathematical bound they derived and some further assumptions concerning the sharpness of this upper bound and the nature of turbulent velocity fluctuations near the interface. The parameter $H$ is, as usual, the Hurst exponent characterizing the inertial-range spatial structure of the velocity field. For Kolmogorov turbulence, $H = 1/3$, and the theory (251) predicts an interface fractal dimension of 4/3. The interface fractal dimension formula (251) also follows from a direct generalization of the phenomenological arguments of [235,310]. There is some interesting experimental data in [75] showing fairly good agreement with the prediction (251) for various moderate Reynolds number experiments in which the inertial range is not fully developed but characterized in some effective sense by a Reynolds-number-dependent Hurst exponent $0 \le H(\mathrm{Re}) < 1/3$.
Mandelbrot [214,215] suggested a simple possible reason why a passive scalar level set in a fully turbulent region should have fractal dimension 5/3: almost every level set of a two-dimensional self-similar (or more properly, self-affine) Gaussian fractal random field with Hurst exponent $\alpha$ has fractal dimension $2 - \alpha$. Passive scalar fluctuations in the inertial-convective range of scales within a statistically homogeneous region of turbulence are thought to have Hurst exponent $H_T = 1/3$, both from similarity theory [76,254] and from experimental measurements [308]; see Paragraph 4.3.4.1 for further discussion. If the passive scalar field were Gaussian, it would then follow that its level sets would have fractal dimension 5/3. The passive scalar fluctuations in the inertial-convective range of scales are, however, known to be highly non-Gaussian [309]. Constantin et al. [75], on the other hand, deduce from a rigorous estimate an upper bound
$$d \le \tfrac{1}{2}(3 + H) \quad \text{(within homogeneous turbulence)} \tag{252}$$
for the fractal dimension of a passive scalar level set immersed in a homogeneous turbulent flow with velocity Hurst exponent $H$.
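The dimension formulas quoted above are simple enough to tabulate. The following is a minimal sketch in exact rational arithmetic; the function names are illustrative and not taken from the cited works, and the upper bound is read as $(3+H)/2$, the form that yields 5/3 at $H = 1/3$.

```python
from fractions import Fraction


def interface_dimension(hurst):
    """Predicted interface dimension d = 1 + H, as in Eq. (251)."""
    return 1 + hurst


def homogeneous_upper_bound(hurst):
    """Upper bound d <= (3 + H) / 2 for level sets in homogeneous turbulence, as in Eq. (252)."""
    return (3 + hurst) / 2


def gaussian_level_set_dimension(scalar_hurst):
    """Level-set dimension 2 - alpha of a 2D self-affine Gaussian field with Hurst exponent alpha."""
    return 2 - scalar_hurst


kolmogorov_hurst = Fraction(1, 3)
print(interface_dimension(kolmogorov_hurst))           # 4/3
print(homogeneous_upper_bound(kolmogorov_hurst))       # 5/3
print(gaussian_level_set_dimension(kolmogorov_hurst))  # 5/3
```

Both (251) and (252) increase with $H$, which is the trend that the later comparison with the $\kappa = 0$ RSTS-I results turns on.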
If this upper bound is assumed to be sharp, then a fractal dimension of 5/3 is predicted for the Kolmogorov value $H = 1/3$. We note that, strictly speaking, the fractal dimension upper bound (252) applies to a short time average of the level set, rather than an instantaneous level set. A rigorous sufficient condition for Eq. (252) to be sharp was obtained in [74]; unfortunately it is difficult to decide in practice whether this technical condition holds. We emphasize that the rigorous mathematical theory of [74,75] includes the effects of small but nonvanishing molecular diffusivity.
3.5.5.2. Questions raised by the RSTS-I Model concerning theories for fractal dimension of scalar level sets.
It would be interesting to compare some of these theories with exact results from the RSTS-I Model, once the effects of molecular diffusion were properly taken into account. As it stands, the exact fractal dimensions deduced from the simplified model with $\kappa = 0$ conflict rather sharply with the conclusions of some of these theories. For example, the model predicts that both local and global fractal dimensions should decrease with the Hurst exponent of the velocity field, $H$, whereas the above theories predict an increase for passive scalar level sets both at turbulent interfaces (251) and within a region of homogeneous turbulence (252). Moreover, one readily checks from Table 11 and Eq. (244) that the mathematical upper bound (252) for passive scalar level sets is violated by
both local and global fractal dimension in the RSTS-I Model for sufficiently small values of $H$. We emphasize that, in and of itself, this is no contradiction because the above mathematical and physical theories all assume the presence of a small amount of molecular diffusion (except for [132]). It does, however, motivate the computation of the effects of molecular diffusion in the RSTS-I Model to address questions such as: Do the inertial-convective range fractal dimensions change to fall in line with the theoretical predictions and inequalities? If not, can the physical mechanism creating the disagreement be identified? Might there be subtle issues concerning the time averaging of level sets used in the derivation of the upper bound (252) in [75]? Is the technical condition from [74] which establishes the sharpness of Eq. (252) satisfied in the RSTS-I Model?
One possible objection to the relevance of the RSTS-I Model to these theories is the strong anisotropy of the model. Indeed, this prevents quantitative comparison with [132], which relies on the isotropy of the turbulence in its analysis. But the other theories are formulated in a quite general way which should apply to anisotropic situations. We have also noted above that experimental measurements [271,272] are performed in settings with strong large-scale anisotropic shear.
We recall finally our discussion in Section 3.5.3 in which we demonstrated the strong relevance of the temporal correlation structure of the RSTS-I flow in determining the fractal dimension of scalar level sets. First of all, the local fractal dimension depends on the value of $z$, with more rapid decay of the correlation time at high wavenumber ($z$ large) leading to smoother scalar level sets. Secondly, the scalar level sets exhibit two regimes of different self-similarity, which are determined by the relative scales which have strongly or barely felt the effects of temporal fluctuations.
Of the above-mentioned theories, however, only [132] makes explicit reference to the temporal correlation structure of the velocity field. The mathematical inequalities of [74,75] may depend implicitly on the velocity temporal structure through their assumptions about the degree of (Hölder) smoothness of the passive scalar field and on their temporal averaging of scalar level sets. It would be interesting to know whether the qualitative effects of temporal structure on the passive scalar sets in the RSTS-I Model persist when $\kappa > 0$. If so, the following questions could be pursued: Do the mathematical and physical theories apply for the RSTS-I Model as both Hurst exponent $H$ and temporal scaling exponent $z$ are varied? If not, can they be modified to explicitly account for the nature of the temporal correlations in the velocity field? Could the difference between the fractal dimension of passive scalar interfaces and level sets in fully turbulent regions be due to a difference in the local temporal structure of the turbulence?
4. Passive scalar statistics for turbulent diffusion in rapidly decorrelating velocity field models
In Section 3, we studied tracer transport in a velocity field model with a simplified shear flow geometry. For such a flow, the nonlinearity of the trajectory equations simplifies to the point that they may be integrated by quadrature, leading to an explicit statistical expression for the location of a tracer particle at any moment of time. We could then proceed directly to analyze a rich variety of behavior for the statistical motion of a tracer particle in response to various properties of the environment. In this section, we will consider a complementary simplification which again permits a tractable analysis of the tracer trajectories. Rather than restricting the geometrical structure of the flow, we
will prescribe a convenient statistical structure. Namely, the velocity field will be taken to be a mean zero, homogeneous, stationary Gaussian random field with no memory,
$$\langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x} + \mathbf{r}, t + \tau) \rangle \equiv R(\mathbf{r})\,\delta(\tau) ,$$
where the tensor $R(\mathbf{r})$ describes the spatial correlation structure, and the Dirac delta function describes the temporal correlation structure. For clarity, in this paper we shall call this the Rapid Decorrelation in Time (RDT) model. It is also known in the literature as the "delta-correlated" model, the "white-noise" model, or the "Kraichnan model" after one of its original proposers [179]. (Kazantsev [152] independently suggested such a model for a magnetohydrodynamic turbulent flow.)
The key simplification afforded by the Rapid Decorrelation in Time model is that each of the tracer particles in such a flow undergoes an effective Brownian motion. Unlike an ordinary Fickian diffusion process, the Brownian motions of a collection of particles moving simultaneously in the flow are coupled to one another. Through the standard relation between the mean of a passive scalar field and the equations of motion for a single tracer particle (see, for example, [185,196], or [244]), it follows that the mean $\langle T(\mathbf{x}, t) \rangle$ obeys a closed diffusion PDE. That is, the moment closure problem [227] is averted in the RDT model. One can go further and write down a closed equation of diffusion type for the equal-time, second-order passive scalar correlation function $\langle T(\mathbf{x}, t)\,T(\mathbf{x}', t) \rangle$, a fundamental statistic reflecting the small-scale spatial structure of the passive scalar field. In fact, as first shown by the first author [206], the equal-time correlation functions of all orders
$$\left\langle \prod_{j=1}^{N} T(\mathbf{x}^{(j)}, t) \right\rangle$$
obey diffusion equations with variable coefficients. Hence, the statistical spatial structure of the passive scalar field in the RDT model is represented precisely through the solutions to deterministic PDEs. There is no need to handle random variables explicitly.
Overview of Section 4
RDT model setup: The diffusion equations for the mean passive scalar density and second-order correlation function are presented in Section 4.1. The mean passive scalar density in the RDT model obeys a standard constant-coefficient diffusion equation (255); the effect of the velocity field is simply to boost the diffusivity constant. Therefore, we shall focus for the most part on the second-order passive scalar correlation function. In the RDT model, it obeys a variable-coefficient diffusion PDE (256) which admits some explicit special solutions showing how the small-scale fluctuations of the passive scalar field respond to the turbulent velocity field. To illuminate the behavior of the passive scalar field fluctuations, we only consider statistically homogeneous systems. We furthermore restrict attention to fluctuations with length scales much smaller than the integral length of the velocity field. The main physical aspects are most clearly seen in asymptotic regimes in which the passive scalar statistics exhibit universal features independent of particular model details, and we shall concentrate our attention on these regimes.
Evolution through the inertial range: In Section 4.2, we examine the evolution of the passive scalar correlation function over time intervals during which the correlation length of the passive scalar fluctuations lies within the inertial range of scales. In this asymptotic regime, the passive scalar
correlation function obeys some universal laws which depend only on the inertial-range scaling law for the turbulent energy spectrum. We first consider, in Section 4.2.1, the rate of decay of the passive scalar variance $\langle (T(\mathbf{x}, t))^2 \rangle$ with time in a turbulent shear flow of the type discussed in Section 3, but with a delta-correlated temporal structure. The passive scalar variance is an analogue of the energy of the velocity field, and is just the value of the second-order correlation function at coinciding points. For a turbulent flow with short-range spatial correlations, the passive scalar variance decays according to a power law with an exponent equal to that describing dissipation by molecular diffusion alone; only the decay rate is enhanced due to the turbulent activity. On the other hand, in a turbulent flow with long-range spatial correlations, the passive scalar variance decays anomalously at long times according to a more rapidly vanishing power law. We obtain an exact expression for the evolution of the passive scalar variance by relating the variable-coefficient diffusion equation satisfied by the passive scalar correlation function to an associated quantum-mechanical Schrödinger equation. The potential function appearing in this Schrödinger equation reflects the spatial correlations of the turbulent shear flow.
Next, in Section 4.2.2, we consider a statistically isotropic turbulent flow with long-range correlations, and develop a self-similar solution which describes the evolution of the passive scalar correlation function through the inertial range of scales. From the form of this solution, we will deduce the anomalous relative diffusion of a pair of tracers while they are separated by a distance lying within the inertial range of scales.
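The anomalous relative diffusion referred to here can be anticipated by a heuristic scaling argument. The sketch below assumes, purely for illustration, that the relative diffusivity of a tracer pair scales as $D(r) \sim D_0 r^{\xi}$ with $0 < \xi < 2$ over the inertial range; the symbols $D_0$, $\xi$ and $\ell$ are hypothetical and not taken from the text, and the argument is dimensional rather than rigorous.

```latex
\frac{d}{dt}\langle r^2 \rangle \sim D(r) \sim D_0\, r^{\xi},
\qquad \ell(t) \equiv \langle r^2 \rangle^{1/2}
\;\Longrightarrow\;
\ell^{\,1-\xi}\,\frac{d\ell}{dt} \sim D_0
\;\Longrightarrow\;
\langle r^2 \rangle \sim (D_0\, t)^{2/(2-\xi)} .
```

Since $2/(2-\xi) > 1$ for $0 < \xi < 2$, the separation grows faster than linearly in time. The classical Richardson value $\xi = 4/3$ gives $\langle r^2 \rangle \sim t^3$.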
The mean-square distance between the particles grows according to a superlinear power law, rigorously manifesting a version of Richardson's law [284] for the RDT model which reflects the acceleration of diffusion as the particles separate. We illustrate this dispersion of a pair of tracer particles over several decades through a Monte Carlo numerical simulation. The explicit self-similar solution indicates another interesting contrast between the decay of passive scalar fluctuations under the influence of a turbulent flow and under ordinary molecular diffusion. Whereas rough fluctuations are smoothed out by molecular diffusion, the self-similar spatial correlations in a turbulent flow actually introduce a rough fractal structure which spreads out in space even as the passive scalar field is decaying in amplitude.
Statistically stationary state of driven passive scalar: In Section 4.3, we turn to the statistics of a passive scalar field which is advected by an RDT turbulent flow, dissipated by molecular diffusion, and directly driven by a "pumping" field $f(\mathbf{x}, t)$ representing external agitation:
$$\frac{\partial T(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) = \kappa \Delta T(\mathbf{x}, t) + f(\mathbf{x}, t) , \qquad T(\mathbf{x}, t = 0) = T_0(\mathbf{x}) .$$
In the RDT model, the pumping is represented as a mean zero, Gaussian, homogeneous random field with a delta-correlated temporal structure, and a spatial structure characterized by a single large length scale $L_f$:
$$\langle f(\mathbf{x}, t)\, f(\mathbf{x} + \mathbf{r}, t + \tau) \rangle = \Phi(\mathbf{r})\,\delta(\tau) . \tag{253}$$
With pumping, the advection–diffusion equation has both driving and damping. The passive scalar field may then be expected to settle down after a suitable period of time to a statistically stationary state in which production of new fluctuations by the pumping is balanced by destruction via
molecular dissipation. Sometimes we will refer to such a statistically stationary state as a "quasi-equilibrium", with the "quasi" prefix distinguishing the strongly damped and driven equilibrium described here from ordinary thermal equilibrium. For pumping of the form (253), the mean and correlation functions of the passive scalar field still obey closed equations of diffusion type (with an inhomogeneity arising from the pumping) (Section 4.3.1). The statistically stationary state is characterized by the steady solutions to these equations. The mean of this state is zero by symmetry, but the correlation function of the passive scalar field in quasi-equilibrium,
$$P_2^{*}(\mathbf{r}) \equiv \langle T(\mathbf{x}, t)\, T(\mathbf{x} + \mathbf{r}, t) \rangle_{*} ,$$
will reflect the spatial structure of the passive scalar field set up by the input of fluctuations at large scales and dissipation at small scales. The asterisks decorating the above expression indicate that the statistics of the passive scalar field are to be taken as those of the quasi-equilibrium state, which inherits statistical homogeneity from its environment. For the case in which the velocity and pumping fields are statistically isotropic, Kraichnan [179] showed how the correlation function of the passive scalar field could be represented via quadrature in terms of the functions $R(\mathbf{x})$ and $\Phi(\mathbf{x})$ characterizing the spatial correlations of the velocity and pumping fields (Section 4.3.2).
We shall be particularly interested in the small-scale spatial structure of the quasi-equilibrium passive scalar field, which one might expect to display universal features independent of the particulars of the large-scale pumping or the large-scale structure of the velocity field. The second-order statistics of the small scales do indeed exhibit universal behavior in the RDT model, but we hasten to mention that there does appear to be some sensitivity to the large-scale velocity field in real turbulent flows [306,309].
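As a sketch of the kind of closed equation with pumping referred to above (its derivation belongs to Section 4.3.1, and the form below is an assumption based on Eqs. (253) and (256) rather than a quotation), delta-correlated pumping would be expected to enter the second-order correlation equation as an additive source $\Phi(\mathbf{r})$. Here $P_2$ is the second-order correlation function, $D(\mathbf{r})$ is the turbulent diffusivity tensor appearing in Eq. (256), and $P_2^{*}$ is the quasi-equilibrium correlation function.

```latex
\frac{\partial P_2(\mathbf{r},t)}{\partial t}
  = \nabla \cdot \Big( \big( 2\kappa I + D(\mathbf{r}) \big) \nabla P_2(\mathbf{r},t) \Big) + \Phi(\mathbf{r}),
\qquad
\nabla \cdot \Big( \big( 2\kappa I + D(\mathbf{r}) \big) \nabla P_2^{*}(\mathbf{r}) \Big) = -\,\Phi(\mathbf{r}) .
```

A quick consistency check on the source term: with advection and diffusion switched off and zero initial data, $T(\mathbf{x},t) = \int_0^t f(\mathbf{x},s)\,ds$, so Eq. (253) gives $\langle T(\mathbf{x},t)\,T(\mathbf{x}+\mathbf{r},t) \rangle = \Phi(\mathbf{r})\,t$, which grows at the constant rate $\Phi(\mathbf{r})$.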
Some striking features of the small-scale statistics are brought out in the passive scalar spectrum $E_T(k)$, which resolves the strength of the passive scalar fluctuations as a function of their wavenumber $k$ and parallels the energy spectrum for the velocity field. Perhaps the most interesting feature of $E_T(k)$ is that it can support a variety of universal self-similar scaling regimes at high wavenumbers. These are analogous to the Kolmogorov $k^{-5/3}$ inertial-range scaling of the energy spectrum of a turbulent velocity field at high Reynolds number (see Paragraph 3.4.3.1). Theoretical predictions for three different scaling regimes in the passive scalar spectrum $E_T(k)$ have been proposed from a number of directions [28,29,76,120,254]. Like the Kolmogorov prediction, each of these theories predicts a universal scaling law for $E_T(k)$ within some range of wavenumbers, provided the physical conditions are such that the delimiting wavenumbers are sufficiently widely separated. Though each theory has some experimental data which purportedly support it, none are unequivocally confirmed and some are under active controversy. Moreover, in one range of scales (the inertial-diffusive regime), two competing theories predict different scaling laws, and experiments have not decided the issue definitively.
We will utilize the RDT model to investigate the nature of scaling regimes of the passive scalar spectrum, with a view to making contact with the real-world issues when we can. First, in Section 4.3.3, we report the rigorous existence of three universal scaling regimes in the passive scalar spectrum in the RDT model, corresponding to those predicted for the real world. Two of these scaling laws were computed by Kraichnan [179,183]. Then, in Section 4.3.4, we examine whether straightforward adaptations of the approximate real-world theories to the RDT model make the correct predictions.
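In practice, a scaling regime in a spectrum is usually identified by fitting a straight line to $\log E_T(k)$ against $\log k$ over the candidate range of wavenumbers. The sketch below shows such a fit on a synthetic spectrum; the exponent $-5/3$, the prefactor 0.7 and the function name are illustrative assumptions, not data or notation from the text.

```python
import math


def fit_power_law_exponent(wavenumbers, spectrum):
    """Least-squares slope of log(spectrum) versus log(wavenumber)."""
    xs = [math.log(k) for k in wavenumbers]
    ys = [math.log(e) for e in spectrum]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic test spectrum: an exact power law over roughly three decades.
ks = [2.0 ** j for j in range(1, 11)]
synthetic_spectrum = [0.7 * k ** (-5.0 / 3.0) for k in ks]
slope = fit_power_law_exponent(ks, synthetic_spectrum)
print(round(slope, 6))  # -1.666667
```

With measured or simulated spectra, the fitted slope depends on the chosen wavenumber window, which is one reason why competing scaling laws can be hard to distinguish experimentally.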
The theories based on Kolmogorov-type dimensional analysis produce the correct scaling predictions for two of the regimes. In the inertial-diffusive regime,
where naive dimensional analysis is insufficient to produce a unique prediction, we find that the theory of Batchelor, Howells, and Townsend [29] carries over successfully to the RDT model, but the theory of Gibson [121] does not.
Higher-order statistics: Finally, in Section 4.4, we survey some recent work concerning the higher-order quasi-equilibrium statistics of the passive scalar field in the RDT model. We focus particular attention on the passive scalar structure functions
$$S_N^{*}(r) \equiv \langle (T(\mathbf{x} + \mathbf{r}) - T(\mathbf{x}))^{N} \rangle_{*} ,$$
and the issue of whether they display anomalous scaling properties in the inertial-convective range. By anomalous scaling is meant that the structure functions have a power law form
$$S_N^{*}(r) \propto r^{\zeta_N}$$
over a common interval of scales $r$ much larger than the dissipation scales and much smaller than the macroscopic system scale, but that the exponents have a nontrivial relation, i.e. $\zeta_N \neq N \zeta_1$. This phenomenon is often called an instance of "small-scale intermittency" or "inertial range intermittency", as it reflects a tendency for the passive scalar fluctuations on these scales to burst to large values from time to time with a much greater frequency than would arise from a Gaussian random field. We shall state some theories for the values of the exponents $\zeta_N$ in the RDT model [64,116,183,343], often with conflicting predictions. The predictions are compared to the results of some recent numerical simulations in some simplified settings where accurate resolution is possible and unambiguous anomalous scaling can be demonstrated [33,334]. In Section 5, we will utilize a variant of the RDT model to study large-scale intermittency of the passive scalar field.
4.1. Definition of the rapid decorrelation in time (RDT) model and governing equations
First we restate in Section 4.1.1 the definition of the RDT model and indicate a sound mathematical interpretation of the Gaussian, delta-correlated velocity field.
Then, in Section 4.1.2 we develop the diffusion PDEs for the mean and second-order correlation function of the passive scalar field, and remark on their structure. The equation for the second-order correlation function will be studied through applications in Sections 4.2 and 4.3. We discuss in Section 4.1.3 the sense in which the RDT model describes the limiting behavior of the passive scalar field in a velocity field with short but finite correlation time relative to the time scales characterizing advection. While the RDT model equations do in fact describe a particular short correlation time limit of a broad class of velocity fields, we illustrate through a cautionary example that other limiting nontrivial passive scalar dynamics can be realized. Finally, in Section 4.1.4, we define the particular form of the velocity spatial correlation function we will use in applications.
4.1.1. Definition of model
The velocity field $\mathbf{v}(\mathbf{x}, t)$ is formally defined in the Rapid Decorrelation in Time (RDT) model as an incompressible, Gaussian, homogeneous, stationary, random field with mean zero and
second-order correlation function
$$\langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x} + \mathbf{r}, t + \tau) \rangle = R(\mathbf{r})\,\delta(\tau) . \tag{254}$$
The tensor-valued function $R(\mathbf{r})$ describes the spatial structure of the velocity field; incompressibility implies $\nabla \cdot R(\mathbf{r}) \equiv 0$ (see [341], Section 22.4). One readily observes that the velocity field so defined is not an ordinary random field, since the variance of such a field is infinite. This divergence of the mean-square velocity is necessary for a nontrivial advection in a model with zero correlation time. Random fields with delta correlations arise naturally in a variety of simplified physical models, and they can be given a clear mathematical meaning as generalized, or distribution-valued, random fields (see [118] or [341], Ch. 24, 25). The complication in the current context is the appearance of such a random field as a coefficient in the advection–diffusion PDE
$$\frac{\partial T(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla T(\mathbf{x}, t) = \kappa \Delta T(\mathbf{x}, t) , \qquad T(\mathbf{x}, t = 0) = T_0(\mathbf{x}) .$$
We can avoid the difficulty of making sense of the solution of a PDE with random, distribution-valued coefficients by reformulating the mathematical problem in terms of the flow of the fluid. That is, we can specify the mapping of the location of fluid elements from one time to another, and not make explicit reference to the velocity field. The RDT velocity field induces a random Brownian flow of the fluid, in which every fluid element undergoes a Brownian motion, but the motions of different fluid elements are correlated with one another according to the spatial structure tensor $R(\mathbf{r})$ [189]. A Brownian flow makes rigorous mathematical sense of a fluid flow with a Gaussian, delta-correlated velocity field just as mathematical Brownian motion makes sense out of the evolution of a particle with a Gaussian, delta-correlated velocity. Through the mathematical framework of a Brownian flow, one can make rigorous sense of the advection of a passive scalar field in the RDT velocity field and derive all the results in this section without having to deal directly with the distribution-valued velocity field itself.
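The notion of correlated Brownian motions can be illustrated with a toy Monte Carlo sketch. It assumes a shear geometry with $\kappa = 0$, so that the cross-shear separation $r$ of two tracers stays frozen and only their along-shear displacements $Y_1, Y_2$ are random, and it assumes a Gaussian-shaped scalar correlation $R(r) = R_0 \exp(-r^2/2\ell^2)$. The correlation shape, the parameters $R_0$ and $\ell$, and all function names are hypothetical choices for illustration only.

```python
import math
import random


def sample_shear_pair(r, t, R0=1.0, ell=1.0, n_samples=20000, seed=0):
    """Sample along-shear displacements (Y1, Y2) of two tracers a fixed distance r apart.

    Because r is frozen, the displacements at time t are jointly Gaussian with
    covariance t * [[R0, R(r)], [R(r), R0]], so they can be sampled directly.
    """
    rng = random.Random(seed)
    rho = math.exp(-r * r / (2.0 * ell * ell))  # R(r) / R(0) for the assumed correlation
    scale = math.sqrt(R0 * t)                   # each tracer alone has <Y^2> = R0 * t
    samples = []
    for _ in range(n_samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        y1 = scale * g1
        y2 = scale * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        samples.append((y1, y2))
    return samples


def mean_square(values):
    return sum(v * v for v in values) / len(values)


r, t, R0, ell = 0.5, 2.0, 1.0, 1.0
pairs = sample_shear_pair(r, t, R0, ell)
single = mean_square([y1 for y1, _ in pairs])
relative = mean_square([y2 - y1 for y1, y2 in pairs])
expected_relative = 2.0 * R0 * (1.0 - math.exp(-r * r / (2.0 * ell * ell))) * t
print(single, R0 * t)               # sampled vs. exact single-tracer mean square
print(relative, expected_relative)  # sampled vs. exact relative mean square
```

Each tracer on its own is an ordinary Brownian motion, but nearby tracers separate much more slowly than independent ones would, because their increments are strongly correlated.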
We only wish to indicate that a rigorous formalism is possible, though we will not dwell on it here. The reader may consult [185] or [349] for the technical implementation of these ideas.
4.1.2. Closed equations for mean and correlation function of passive scalar field
Though the RDT velocity field model is limited in simulating physical reality through its lack of inertia or memory effects, it has the virtue of having closed equations for the mean and correlation functions of the passive scalar field. The usual turbulence moment closure problem [227] does not arise in the RDT model. Thus, one has the opportunity for much more precise mathematical analysis of the passive scalar field statistics than is generally possible. In particular, one can study the advection of a passive scalar field in a velocity field with long-range spatial correlations and an inertial range of self-similar behavior.
We can think of the Simple Shear Model and RDT Models as complementary simplified models which permit a mathematical study of various aspects of turbulent diffusion in velocity fields. The Simple Shear Model restricts the geometry of the flow, but permits an arbitrary specification of the spatio-temporal statistics. The RDT Model assumes a special (and unphysical) temporal structure, but permits a multi-dimensional flow geometry with an arbitrary specification of the spatial statistics.
We will present the PDEs for the mean and second-order correlation function of the passive scalar field in turn. The fact that closed equations could be written down for these statistical functions was first pointed out by Kraichnan [179] and Kazantsev [152] through formal arguments. Since then, these equations have been derived in several contexts by various techniques, some mathematical and rigorous [185,206,208,244,246,349] and some formal [95,117,164].
4.1.2.1. Mean passive scalar density.
In the RDT model, the mean passive scalar field $\langle T(\mathbf{x}, t) \rangle$ obeys an ordinary diffusion equation with enhanced coefficient:
$$\frac{\partial \langle T(\mathbf{x}, t) \rangle}{\partial t} = \nabla \cdot \left( \left( \kappa I + \tfrac{1}{2} R(\mathbf{0}) \right) \nabla \langle T(\mathbf{x}, t) \rangle \right) , \qquad \langle T(\mathbf{x}, t = 0) \rangle = \langle T_0(\mathbf{x}) \rangle . \tag{255}$$
Here $I$ denotes the identity matrix, and $R(\mathbf{r})$ is the spatial structure tensor of the velocity field; see Eq. (254). $R(\mathbf{0})$ is a nonnegative-definite tensor since $\langle \mathbf{v}(\mathbf{x}, t) \otimes \mathbf{v}(\mathbf{x}, t + \tau) \rangle = R(\mathbf{0})\,\delta(\tau)$. For a statistically isotropic velocity field, $R(\mathbf{0})$ will simply be a multiple of the identity, say $R(\mathbf{0}) = R_0 I$, with $R_0 > 0$, and one can simply say the scalar diffusion constant for the mean passive scalar density is enhanced from $\kappa$ to $\kappa + \tfrac{1}{2} R_0$. The turbulent enhancement $\tfrac{1}{2} R_0$ is simply a measure of the strength of the velocity field, which may be formally thought of as the product of the (very small) velocity correlation time and the (very large) single-point velocity variance. (Smooth the delta function in the expression (254) over a very small time interval to see this.) This is in agreement with what Taylor's formula for the turbulent diffusivity would produce in the zero correlation time limit (Section 3.1.3).
The reason that the mean passive scalar density obeys a simple diffusion equation is that the trajectory of a single tracer particle in the RDT model is given as a sum of two independent Brownian motions: one with diffusivity $\kappa$ from molecular diffusion, and one with diffusivity $\tfrac{1}{2} R_0$ from advection by the RDT velocity field.
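The parenthetical suggestion to smooth the delta function can be made concrete with a small deterministic calculation. The sketch below assumes one particular smoothing, chosen only for illustration: along one coordinate the velocity is held constant over intervals of length `tau_c`, is independent from one interval to the next, and has variance `R0 / tau_c`, so that its time-integrated correlation equals `R0` as in Eq. (254) with $R(\mathbf{0}) = R_0 I$. The function and parameter names are hypothetical.

```python
def mean_square_displacement(t, R0, tau_c, kappa):
    """Mean-square displacement along one coordinate after time t.

    Advection: n independent steps v_i * tau_c, each with variance
    (R0 / tau_c) * tau_c**2.  Molecular diffusion adds 2 * kappa * t.
    """
    n_intervals = round(t / tau_c)
    velocity_variance = R0 / tau_c
    advective = n_intervals * velocity_variance * tau_c ** 2
    molecular = 2.0 * kappa * t
    return advective + molecular


R0, kappa, t = 1.0, 0.1, 1.0
for tau_c in (0.1, 0.01, 0.001):
    msd = mean_square_displacement(t, R0, tau_c, kappa)
    # Columns: correlation time, single-point velocity variance, effective diffusivity.
    print(tau_c, R0 / tau_c, msd / (2.0 * t))
```

As `tau_c` shrinks, the single-point velocity variance diverges while the effective diffusivity `msd / (2 t)` stays fixed at `kappa + R0 / 2`, the enhanced constant in Eq. (255).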
The superposition of these independent motions produces an effective Brownian motion with diffusivity $\kappa + \tfrac{1}{2} R_0$. The equation for the mean passive scalar density then follows from its relation to the statistics of a single tracer trajectory (see Section 3.4). None of these comments should lead the reader to conclude that the white noise velocity field leads to trivial behavior for the passive scalar. It is only the mean passive scalar density (or equivalently, the statistics of the motion of a single tracer) which has this simple effective behavior. When we turn to statistics of the fluctuations, the situation is much more interesting.
4.1.2.2. Second-order passive scalar correlation function.
Until now, we have focused for the most part on the behavior of the mean concentration of the passive scalar, or equivalently, the motion of a single tracer particle. (An exception was in our discussion of the pair-distance function and fractal interfaces in Section 3.5.) We wish in this section to focus on the nature of the fluctuations of the passive scalar field. This is particularly relevant when considering the transport of dangerous quantities (pollutants or hazardous chemicals). In these cases one is more interested in the probability of the passive scalar density exceeding some certain safety threshold, rather than simply the (presumably) very low mean value. (See for example the environmental science books [78,265].) Knowledge of the statistics of the fluctuations is important in general geophysical and engineering applications, as they give information about how rapidly a passive scalar becomes "well-mixed"
in a turbulent fluid and about the rate of growth of a cloud of passive scalar released into a turbulent environment, as we saw in Section 3.5.
From a fundamental physical viewpoint, the statistics of the fluctuations of the passive scalar field reveal much more about the small-scale structure of the velocity field than the mean does. This is particularly interesting in fully developed turbulence, which we shall be considering, in which the turbulent velocity field has its energy distributed over a wide range of scales. The mean passive scalar density is most sensitive to the large-scale fluctuations of the velocity field, in which most of the energy resides (see Section 3.5). The effects of the small-scale velocity fluctuations are manifested primarily in the small-scale passive scalar fluctuations. This is particularly interesting in physical turbulence theory, where the statistics of the velocity fluctuations on small scales are at least approximately universal. That is, while the large-scale structure of a turbulent velocity field depends strongly on the system configuration (pipe flow, jet flow, shear flow, boundary layer flow), the small scales have several common features in these different settings. (See [309] for a recent review.) One might expect the passive scalar fluctuations on small scales to be universal as well. It is thus of interest to discover to what extent this is true, and, furthermore, to describe these universal statistics of the passive scalar field.
A fundamental statistic of the passive scalar field from which one can determine some basic properties about the spatial structure of the fluctuations is the (equal-time) second-order passive scalar (PS) correlation function:
$$\langle T(\mathbf{x}, t)\, T(\mathbf{x}', t) \rangle .$$
For simplicity, we shall assume that the passive scalar field is a statistically homogeneous random field with mean zero, so that in particular, the second-order PS correlation function depends only on time and the relative displacement of the observation points $x - x'$:
$$P_2(r,t) \equiv \langle T(x,t)\,T(x+r,t)\rangle .$$
We will discuss the evolution of the passive scalar variance, $P_2(0,t) = \langle (T(x,t))^2\rangle$, which gives the simplest measure of the amplitude of the passive scalar fluctuations, in an RDT model with shear flow geometry in Section 4.2.1. Subsequently, we elaborate upon the spatial structure of the passive scalar field revealed by the full second-order PS correlation function in an isotropic RDT model. These analyses are possible due to the fact that in the RDT model, the PS correlation function obeys a closed PDE, which we now present. We shall assume that the passive scalar field is at all times a mean zero, statistically homogeneous random field; the self-consistency of this assumption is easily checked. Then the second-order passive scalar correlation function $P_2(r,t) \equiv \langle T(x+r,t)\,T(x,t)\rangle$ obeys the following variable-coefficient diffusion PDE:
Evolution of second-order passive scalar correlation function
$$\frac{\partial P_2(r,t)}{\partial t} = \nabla\cdot\big((2\kappa I + D(r))\,\nabla P_2(r,t)\big), \qquad P_2(r,t=0) = P_2^0(r). \tag{256}$$
The function $P_2^0(r)$ is just the correlation function of the initial data: $P_2^0(r) \equiv \langle T_0(x)\,T_0(x+r)\rangle$. The turbulent contribution to the diffusivity of $P_2(r,t)$ is given by the tensor
$$D(r) = R(0) - \tfrac{1}{2}\big(R(r) + R^{\dagger}(r)\big). \tag{257}$$
We call this tensor $D(r)$ the velocity structure tensor. It is just (half) the correlation tensor of the velocity differences at locations with relative spatial separation $r$:
$$\tfrac{1}{2}\big\langle \big(\mathbf{v}(x+r,t+\tau) - \mathbf{v}(x,t+\tau)\big)\otimes\big(\mathbf{v}(x+r,t) - \mathbf{v}(x,t)\big)\big\rangle = D(r)\,\delta(\tau), \tag{258}$$
as may be checked by expansion of the binomial product and the definition (254) for the velocity correlation tensor $R(r)$. It is readily seen that $D(r)$ is a nonnegative definite tensor for each $r$.
Connection with relative diffusivity. The appearance of the velocity structure tensor as the enhanced turbulent diffusivity in Eq. (256) may be understood through the well-known connection between the evolution of the second-order correlation function (of a statistically homogeneous random field) and the relative diffusion of a pair of tracer particles. This connection parallels that between the mean passive scalar density and the absolute diffusion of a single tracer trajectory; further discussion may be found, for example, in [164]. Here it suffices to observe that if the passive scalar correlation function obeys a diffusion equation, as it does in the RDT model, then the diffusion coefficient of the PDE corresponds exactly to the relative diffusivity of a pair of tracer particles. That is, in the RDT model, if $X(t)$ and $X'(t)$ denote the random tracer trajectories for particles starting at $x$ and $x'$ at time 0, then under the current conditions of statistical homogeneity:
$$\frac{1}{2}\frac{d}{dt}\big\langle (X(t) - X'(t))\otimes(X(t) - X'(t))\big\rangle\Big|_{t=0} = 2\kappa I + D(x - x') .$$
In particular, the rate of growth of the mean-square distance between a pair of tracers momentarily separated at time 0 by displacement vector $r = x - x'$ is
$$\frac{1}{2}\frac{d}{dt}\big\langle |X(t) - X'(t)|^2\big\rangle\Big|_{t=0} = 2d\kappa + \operatorname{Tr} D(r). \tag{259}$$
The first term arises from the independent Brownian motions which the tracers undergo due to molecular diffusion. The second contribution to the relative diffusivity is proportional to the mean-square velocity difference at two points displaced by $r$. Due to spatial correlations in the velocity field, this mean-square velocity difference will start at zero for zero displacement, generally grow as the displacement is increased, and saturate at some constant value when the displacement greatly exceeds the correlation length of the velocity field. Turbulent diffusion is thus more effective at separating tracers which are farther apart than those which are closer together. The RDT model explicitly reflects Richardson's hypothesis that the relative diffusivity should depend only on the current separation of the particles [284]. Richardson deduced that this feature can give rise to a strongly superlinear rate of growth of the mean-square separation of a pair of tracers in a turbulent flow. We shall discuss this issue at more length in Section 4.2.2. Batchelor
[27] suggests that the relative diffusivity should actually depend on the time since the tracer release in real-world turbulent diffusion. As the RDT model velocity field has no memory, this consideration does not arise here. The reader may refer to [132] for a nice unifying discussion of Richardson's and Batchelor's ideas. We have thus far indicated why the enhanced diffusion of the PS correlation function $P_2(r,t)$ is given by $D(r)$, provided that $P_2(r,t)$ obeys a diffusion equation. The fact that $P_2(r,t)$ obeys a (variable-coefficient) diffusion equation in the first place relies mathematically on the fact that a pair of tracer particles in the RDT model undergo a coupled Brownian motion. The coupling is due to spatial correlations in the flow, and is completely described by the tensor $D(r)$.
4.1.2.3. Dependence of passive scalar mean and correlation function on velocity field spatial structure.
We mentioned above the general fact that the mean passive scalar density is sensitive primarily to the large-scale features of the velocity field, and that the small-scale structure of the velocity field is much more strongly reflected in the fluctuations of the passive scalar field. The RDT model brings this out quite clearly. The mean passive scalar density $\langle T(x,t)\rangle$ obeys an ordinary diffusion equation (255), with the diffusivity enhanced by the nonnegative definite tensor $\tfrac{1}{2}R(0)$ associated with the absolute strength (i.e., the single-point variance) of the velocity field. In turbulence, most of the energy is contained in the largest scales of the velocity field, so $R(0)$ primarily depends upon the macroscopic features of the velocity field. On the other hand, in a statistically homogeneous setting, the second-order PS correlation function $P_2(r,t) = \langle T(x,t)\,T(x+r,t)\rangle$ evolves according to a variable-coefficient diffusion PDE. The diffusivity tensor is enhanced by the velocity structure tensor $D(r)$, which describes velocity differences separated by arbitrary distances $r$.
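The contrast between $R(0)$ and $D(r)$ can be made concrete with a small numerical sketch. It uses the scalar (shear-flow) analogue of Eq. (257), $D(x) = R(0) - R(x)$, and assumes a Gaussian-shaped spatial correlation purely for illustration; the function names and the Gaussian form are hypothetical and are not taken from the text.

```python
import math

def velocity_correlation(x, variance=1.0, corr_length=1.0):
    """Hypothetical spatial correlation R(x); the Gaussian shape is an assumption."""
    return variance * math.exp(-0.5 * (x / corr_length) ** 2)

def structure_function(x, variance=1.0, corr_length=1.0):
    """Scalar analogue of Eq. (257): D(x) = R(0) - R(x)."""
    return (velocity_correlation(0.0, variance, corr_length)
            - velocity_correlation(x, variance, corr_length))

# Separations measured in units of the correlation length.
separations = [0.0, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
profile = [structure_function(x) for x in separations]

for x, d in zip(separations, profile):
    print(f"x = {x:5.1f}   D(x) = {d:.6f}")
```

In this sketch the enhancement felt by the mean depends only on the single number $R(0)$, whereas $D(x)$ vanishes at zero separation and saturates at $R(0)$ only beyond the correlation length, so its shape carries the small-scale information.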
Thus, the second-order statistics of a statistically homogeneous passive scalar field will be directly sensitive to the small-scale spatial structure of the velocity field. We will explore in Sections 4.2 and 4.3 how universal small-scale features of the velocity field are transmitted to universal small-scale features of the passive scalar field in the RDT model.
4.1.3. Velocity field model with two distinct short correlation time limits
Before proceeding with the applications of the RDT model, we wish to remark upon the domain of its validity. Certainly, the equations apply for the Gaussian delta-correlated velocity field (or its mathematical interpretation as a Brownian flow). One might further ask, though, in what sense the RDT model equations also furnish an asymptotic description of the statistics of a passive scalar field advected by a velocity field with a finite but short correlation time relative to the advection time scales. While the RDT model does describe a particular limit with short correlation time of a large class of random velocity fields, we will present an example of a velocity field for which two different short correlation time limits can be taken. The mean and correlation function of the passive scalar field converge in one limit to solutions of the RDT model equations, and converge in the other limit to solutions of spatially nonlocal pseudo-differential equations (Eqs. (264) and (265)). Thus, the possibility of writing closed diffusion equations for the passive scalar statistics is not merely a consequence of the correlation time being very short relative to the advection time scales.
4.1.3.1. Central limit rescaling.
We state first a positive way in which the RDT model describes a certain short correlation time limit: If one is given a (generally non-Gaussian) random velocity field $\mathbf{v}(x,t)$ satisfying suitable mixing and regularity conditions, then under the particular rescaling
$$\mathbf{v}^{\varepsilon}(x,t) \equiv \varepsilon^{-1/2}\,\mathbf{v}(x, t/\varepsilon), \tag{260}$$
the statistics of the passive scalar field advected by $\mathbf{v}^{\varepsilon}(x,t)$ converge as $\varepsilon \to 0$ to the solutions of the RDT model equations [53,185]. Note that the particular rescaling in Eq. (260) is of a "central limit" type, and is moreover mathematically equivalent to (but somewhat conceptually distinct from) the Kubo rescaling discussed in Section 2.4.1. Over an order unity time interval, the advection process formally becomes a sum of a large number $N \sim O(\varepsilon^{-1})$ of roughly uncorrelated, mean zero, identically distributed pushes with duration $\delta\tau \sim O(\varepsilon)$ and displacement magnitude $\delta|x| \sim |\mathbf{v}^{\varepsilon}|\,\delta\tau \sim O(\varepsilon^{1/2}) \sim O(N^{-1/2})$. We may say therefore that a "central limit theorem in the environment" holds; that is, as $\varepsilon \to 0$, the non-Gaussian velocity field $\mathbf{v}^{\varepsilon}(x,t)$ acts more and more like a Gaussian, rapidly decorrelating velocity field as far as passive scalar advection is concerned. The result that central limit scaling of a broad class of velocity fields gives rise to passive scalar dynamics described by diffusion equations is in accord with the well-known fact that central limit scaling of discrete random walks generally leads to continuum processes with diffusive behavior, such as Brownian motion ([102], Section 14.6).
4.1.3.2. An explicit alternative short correlation time limit.
The limit theorem stated above may tempt one to conclude that the RDT model universally describes the advection of a passive scalar field by a velocity field with very short correlation time. That is, one might suppose that the specification that the RDT velocity field is Gaussian is gratuitous, since the equations of the RDT model also describe the short correlation time limit of a large class of non-Gaussian models. In fact, we now show by example that there exist limits in which the velocity correlation time scale is much smaller than the advection time scale, but which do not give rise to the RDT model equations for the passive scalar statistics.
For this purpose, we use a simple shear Poisson blob velocity field model, introduced by Avellaneda and the first author in [14]. The velocity field is taken as a random two-dimensional shear flow:
$$\mathbf{v}(x,y,t) = \begin{pmatrix} 0 \\ v(x,t) \end{pmatrix} .$$
The random field $v(x,t)$ is defined as
$$v(x,t) = \sum_{n} W(x - \xi_n,\, t - \tau_n),$$
where $\{(\xi_n, \tau_n)\}$ enumerates a space-time Poisson process of unit intensity [151], and $W$ is some smooth, compactly supported "blob" structure function with zero integral:
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} W(x,t)\,dx\,dt = 0. \tag{261}$$
That is, the velocity field is a superposition of the blob functions distributed according to a space-time Poisson process. It has mean zero and a finite correlation function:
$$\tilde{R}(x,t) \equiv \langle v(x',t')\,v(x'+x,\,t'+t)\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} W(x',t')\,W(x'+x,\,t'+t)\,dx'\,dt' .$$
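As an illustration of this construction, the following sketch samples one realization of a Poisson blob field on a finite space-time window. The particular blob function (a product of standard smooth bump functions, made odd in $x$ so that its integral vanishes as required by Eq. (261)), the window, and all function names are assumptions chosen for illustration; they are not the choices made in [14].

```python
import math
import random

def bump(s):
    """Smooth bump supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def blob(x, t):
    """Hypothetical blob W(x, t): smooth, compactly supported, odd in x,
    so its integral over space-time is zero, cf. Eq. (261)."""
    return x * bump(x) * bump(t)

def poisson_sample(mean, rng):
    """Poisson random integer by Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_blob_centres(x_range, t_range, rng, intensity=1.0):
    """Points (xi_n, tau_n) of a homogeneous space-time Poisson process in a window."""
    area = (x_range[1] - x_range[0]) * (t_range[1] - t_range[0])
    n_points = poisson_sample(intensity * area, rng)
    return [(rng.uniform(*x_range), rng.uniform(*t_range)) for _ in range(n_points)]

def velocity(x, t, centres):
    """v(x, t) = sum_n W(x - xi_n, t - tau_n)."""
    return sum(blob(x - xi, t - tau) for xi, tau in centres)

rng = random.Random(14)
centres = sample_blob_centres((-5.0, 5.0), (-5.0, 5.0), rng)
print(len(centres), velocity(0.0, 0.0, centres))
```

Because the window is finite, values of `velocity` within one blob radius of the window edge miss contributions from centres outside it; a field sampled well inside the window is unaffected.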
We now define two families of Poisson blob velocity field models generated from the given Poisson blob velocity field under two different rescalings, each describing a different limit process in which the correlation time becomes small. The family of velocity fields under central limit rescaling is defined:
$$v^{\varepsilon}_{CL}(x,t) = \sum_{n} \varepsilon^{-1/2}\, W\big(x - \xi^{\varepsilon}_n,\ (t - \tau^{\varepsilon}_n)/\varepsilon\big), \tag{262}$$
where the intensity of the Poisson process $\{(\xi^{\varepsilon}_n, \tau^{\varepsilon}_n)\}$ is taken as $\varepsilon^{-1}$. It can be shown that this definition is statistically equivalent to the rescaling (260). The family of velocity field models generated under the fixed intensity rescaling of the original Poisson blob model is defined:
$$v^{\varepsilon}_{FI}(x,t) = \sum_{n} \varepsilon^{-1}\, W\big(x - \xi_n,\ (t - \tau_n)/\varepsilon\big), \tag{263}$$
with the intensity of the Poisson process $\{(\xi_n, \tau_n)\}$ fixed at unity and the amplitude of the blob functions rescaled from the prototype $W$ more strongly than in the central limit rescaling. At $\varepsilon = 1$, both families coincide with the unscaled Poisson blob velocity field:
$$v(x,t) = \sum_{n} W(x - \xi_n,\ t - \tau_n),$$
where the Poisson process $\{(\xi_n, \tau_n)\}$ has unit intensity. As $\varepsilon \to 0$, the second-order correlation function of each rescaled family approaches the same delta-correlated form:
$$\lim_{\varepsilon\to 0}\langle v^{\varepsilon}_{CL}(x',t')\,v^{\varepsilon}_{CL}(x'+x,\,t'+t)\rangle = \lim_{\varepsilon\to 0}\langle v^{\varepsilon}_{FI}(x',t')\,v^{\varepsilon}_{FI}(x'+x,\,t'+t)\rangle = \tilde{R}_s(x)\,\delta(t), \qquad \tilde{R}_s(x) \equiv \int_{-\infty}^{\infty}\tilde{R}(x,t)\,dt .$$
Moreover, both rescalings give rise to a limit in which the correlation time is $O(\varepsilon)$ and short compared with the advection time scale $\sim 1/\sqrt{\langle (v^{\varepsilon}(x,t))^2\rangle} \sim O(\varepsilon^{1/2})$.
Now we define $T^{\varepsilon}_{CL}(x,y,t)$ and $T^{\varepsilon}_{FI}(x,y,t)$ to be the random passive scalar fields solving the advection-diffusion equation with rescaled velocity fields $v^{\varepsilon}_{CL}(x,t)$ and $v^{\varepsilon}_{FI}(x,t)$, respectively. In accordance with the above discussion, under central limit rescaling, the passive scalar statistics converge to those governed by the RDT model. Thus, as $\varepsilon \to 0$, the mean $\langle T^{\varepsilon}_{CL}(x,y,t)\rangle$ and the correlation function $\langle T^{\varepsilon}_{CL}(x',y',t)\,T^{\varepsilon}_{CL}(x'+x,\,y'+y,\,t)\rangle$ converge to solutions of the diffusion equations associated with an RDT model velocity field $\tilde{v}(x,t)$ with correlation function
$$\langle \tilde{v}(x',t')\,\tilde{v}(x'+x,\,t'+t)\rangle = \tilde{R}_s(x)\,\delta(t).$$
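A rough numerical sketch may help to see why the two rescalings (262) and (263) can behave differently even though they share the same limiting second-order correlation. It follows the shear-direction displacement over unit time of a tracer held at fixed $x$ with molecular diffusion switched off, a simplification that is our assumption and not part of the analysis in [186]. Integrating one rescaled blob in time contributes a jump of size $\varepsilon^{1/2}\bar{W}$ under (262) and $\bar{W}$ under (263), where $\bar{W}$ is the time-integrated blob profile; the odd polynomial profile used below is hypothetical.

```python
import math
import random

def w_bar(s):
    """Hypothetical time-integrated blob profile: odd, supported on [-1, 1]."""
    return s * (1.0 - s * s) if abs(s) < 1.0 else 0.0

def poisson_sample(mean, rng):
    """Poisson random integer by Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def displacement(eps, scaling, rng):
    """Displacement along the shear over unit time for a tracer at fixed x (kappa = 0)."""
    if scaling == "central_limit":      # Eq. (262): intensity 1/eps, amplitude eps**-0.5
        intensity, jump_scale = 1.0 / eps, math.sqrt(eps)
    elif scaling == "fixed_intensity":  # Eq. (263): intensity 1, amplitude 1/eps
        intensity, jump_scale = 1.0, 1.0
    else:
        raise ValueError(scaling)
    # Only blobs centred within distance 1 of the tracer contribute (support width 2).
    n_blobs = poisson_sample(2.0 * intensity, rng)
    return sum(jump_scale * w_bar(rng.uniform(-1.0, 1.0)) for _ in range(n_blobs))

def second_moment_and_kurtosis(samples):
    """Raw second moment and the ratio m4 / m2**2 (equal to 3 for a centred Gaussian)."""
    m2 = sum(y * y for y in samples) / len(samples)
    m4 = sum(y ** 4 for y in samples) / len(samples)
    return m2, m4 / (m2 * m2)

rng = random.Random(1999)
eps = 0.05
results = {}
for scaling in ("central_limit", "fixed_intensity"):
    samples = [displacement(eps, scaling, rng) for _ in range(10000)]
    results[scaling] = second_moment_and_kurtosis(samples)
    print(scaling, results[scaling])
```

In this toy setting both families give the same displacement variance, but only the central limit family has a kurtosis ratio approaching the Gaussian value 3 as $\varepsilon$ decreases; the fixed intensity family remains a sum of a few order-one jumps at every $\varepsilon$.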
On the other hand, one can show [186] that the mean $\langle T^{\varepsilon}_{FI}(x,y,t)\rangle$ and correlation function $\langle T^{\varepsilon}_{FI}(x',y',t)\,T^{\varepsilon}_{FI}(x'+x,\,y'+y,\,t)\rangle$ of the family generated by fixed intensity rescaling converge as $\varepsilon \to 0$ to finite limits $\bar{P}_{FI}(x,y,t)$ and $\bar{P}_{2,FI}(x,y,t)$ which solve pseudo-differential equations:
$$\frac{\partial \bar{P}_{FI}(x,y,t)}{\partial t} = \kappa\,\Delta \bar{P}_{FI}(x,y,t) + U_{FI}\Big(\frac{\partial}{\partial y}\Big)\bar{P}_{FI}(x,y,t), \qquad \bar{P}_{FI}(x,y,t=0) = \langle T_0(x,y)\rangle, \tag{264}$$
and
$$\frac{\partial \bar{P}_{2,FI}(x,y,t)}{\partial t} = 2\kappa\,\Delta \bar{P}_{2,FI}(x,y,t) + U_{2,FI}\Big(x, \frac{\partial}{\partial y}\Big)\bar{P}_{2,FI}(x,y,t), \qquad \bar{P}_{2,FI}(x,y,t=0) = \langle T_0(x',y')\,T_0(x'+x,\,y'+y)\rangle . \tag{265}$$
The pseudo-differential operators are specified by
$$U_{FI}(k) = \int_{-\infty}^{\infty}\big(e^{-2\pi i k \bar{W}(\xi)} - 1\big)\,d\xi, \qquad U_{2,FI}(x,k) = \int_{-\infty}^{\infty}\big(e^{-2\pi i k(\bar{W}(x+\xi) - \bar{W}(\xi))} - 1\big)\,d\xi,$$
with
$$\bar{W}(x) \equiv \int_{-\infty}^{\infty} W(x,t)\,dt .$$
Since $U_{FI}(k)$ and $U_{2,FI}(x,k)$ are, in general, transcendental functions of $k$, the mean and correlation function of the limiting passive scalar field under the fixed intensity rescaling cannot be expressed as solutions of PDEs. Their governing equations are intrinsically nonlocal.
The upshot is that under central limit rescaling, a central limit theorem "in the environment" applies, and the trajectories of tracer particles begin to resemble a coupled Brownian motion as the correlation time becomes short. But in the short correlation limit of the Poisson blob model which produces a distinct limiting behavior, the tracer trajectories retain a Poissonian character and do not converge to Brownian motions. The validity of the RDT model equations specifically requires that the tracers diffuse according to Brownian motion processes with no Poissonian components. Put another way, in the short correlation time limit under central limit rescaling (262), the Poisson blob velocity field acts on the tracer trajectories like a Gaussian random, delta-correlated velocity field, whereas under fixed intensity rescaling, it acts on the tracer trajectories like a non-Gaussian delta-correlated velocity field and the RDT model equations do not apply. These examples are due to the second author and further discussion may be found in [185,186].
4.1.4. Energy spectra with inertial range for RDT model
Thus far, we have discussed the RDT model in a general manner; for applications we must specify the spatial correlation tensor $R(r)$ appearing in the velocity correlation function:
$$\langle \mathbf{v}(x,t)\otimes\mathbf{v}(x+r,\,t+\tau)\rangle = R(r)\,\delta(\tau).$$
As usual, we define the velocity correlation function through its spectrum. For simplicity and continuity with our discussion in Section 3, we consider first the case where the RDT velocity field is a simple two-dimensional shear flow
$$\mathbf{v}(x,y,t) = \begin{pmatrix} 0 \\ v(x,t) \end{pmatrix},$$
so that the velocity correlation function is simply described by a scalar function:
$$\langle v(x',t')\,v(x'+x,\,t'+t)\rangle = R(x)\,\delta(t).$$
Then we write the spatial correlation function in terms of its spectral density:
$$R(x) = \int_{-\infty}^{\infty} e^{2\pi i k x}\,\tilde{E}(|k|)\,dk = 2\int_{0}^{\infty}\cos(2\pi k x)\,\tilde{E}(k)\,dk. \tag{266}$$
Please note that even though $\tilde{E}(k)$ depends only on wavenumber, it is really the spatio-temporal energy spectrum $\tilde{E}(k,\omega)$ discussed in Section 3.3. It is simply independent of the frequency variable $\omega$, as is characteristic of delta-correlated "white noise" processes.
We would like to simulate a fully developed turbulent flow as closely as possible with the RDT model velocity field, but of course we must respect the short correlation time inherent in the model. A natural way to produce such a model, given our discussion in Paragraph 4.1.3.1, is to start with a velocity field model with a reasonable spatio-temporal energy spectrum $\tilde{E}(k,\omega)$, and then pass to a short correlation time limit through the central limit rescaling (260). Thus, we posit a spatio-temporal energy spectrum corresponding to the Kolmogorov similarity hypotheses stated in Paragraph 3.4.3.1, in which the energy spectrum has the inertial range form $E(k) \approx C_K\,\bar{\varepsilon}^{2/3} k^{-5/3}$ and the correlation time of an inertial range velocity fluctuation of wavenumber $k$ scales as $\tau(k) \approx \bar{\varepsilon}^{-1/3} k^{-2/3}$. The spatio-temporal energy spectrum is then constructed from these quantities as follows (see Eq. (199)):
$$\tilde{E}(k,\omega) = E(k)\,\phi(\omega\tau(k))\,\tau(k) = C_K\,\bar{\varepsilon}^{2/3} k^{-5/3}\,\psi_0(kL_0)\,\psi_\infty(kL_K)\,\phi(\omega\bar{\varepsilon}^{-1/3}k^{-2/3})\,(\bar{\varepsilon}^{-1/3}k^{-2/3}) = C_K\,\bar{\varepsilon}^{1/3} k^{-7/3}\,\phi(\omega\bar{\varepsilon}^{-1/3}k^{-2/3})\,\psi_0(kL_0)\,\psi_\infty(kL_K),$$
where $L_0$ is the integral length scale, $L_K$ is the Kolmogorov dissipation length scale, $C_K$ is the Kolmogorov constant, $\psi_0$ and $\psi_\infty$ are infrared and ultraviolet cutoffs, respectively, and $\phi$ is the temporal structure function ($\int_{-\infty}^{\infty}\phi(\omega)\,d\omega = 1$). Performing the central limit rescaling $\mathbf{v}^{\varepsilon}(x,t) \equiv \varepsilon^{-1/2}\mathbf{v}(x,t/\varepsilon)$ induces the following rescaling of the spatio-temporal energy spectrum:
$$\tilde{E}^{\varepsilon}(k,\omega) = \tilde{E}(k,\varepsilon\omega).$$
Passing to the $\varepsilon \to 0$ limit, we see that the RDT model velocity field induced by a central limit rescaling of a Kolmogorov-type turbulent model is specified by the spatio-temporal energy spectrum:
$$\tilde{E}_{RDT}(k) = C_K\,\phi(0)\,\bar{\varepsilon}^{1/3} k^{-7/3}\,\psi_0(kL_0)\,\psi_\infty(kL_K).$$
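The exponent bookkeeping in this limit is easy to get wrong, so here is a minimal check with exact rational arithmetic. It combines the Kolmogorov exponents $5/3$ and $2/3$ quoted above and, anticipating the family of spectra $k^{-1-2H}$ introduced in Eq. (267), solves for the corresponding parameter $H$. The function names are hypothetical.

```python
from fractions import Fraction

def rdt_exponent(energy_exponent, correlation_time_exponent):
    """If E(k) ~ k**(-a) and tau(k) ~ k**(-b), then E(k) * tau(k) ~ k**(-(a + b))."""
    return energy_exponent + correlation_time_exponent

def scaling_parameter(spectral_exponent):
    """Solve spectral_exponent = 1 + 2H for H."""
    return (spectral_exponent - 1) / 2

kolmogorov_energy = Fraction(5, 3)   # E(k) ~ k**(-5/3)
kolmogorov_time = Fraction(2, 3)     # tau(k) ~ k**(-2/3)

rdt = rdt_exponent(kolmogorov_energy, kolmogorov_time)
print(rdt)                                   # 7/3
print(scaling_parameter(rdt))                # 2/3
print(scaling_parameter(kolmogorov_energy))  # 1/3
```

The last line shows the value $1/3$ that the same formula gives when applied directly to the equal-time Kolmogorov spectrum, which is the comparison the text draws between the true Hurst exponent and the value $H = 2/3$ produced by the central limit rescaling.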
We note that in the RDT model, after central limit rescaling, the Kolmogorov spectrum formally corresponds to the exponent 7/3. As we did in Section 3, we embed this particular spectrum in a more general family of models with the same qualitative features:
$$\tilde{E}(k) = A_E\,k^{-1-2H}\,\psi_0(kL_0)\,\psi_\infty(kL_K), \tag{267}$$
and we have dropped the "RDT" suffix. The parameter $H < 1$ describes the inertial-range scaling, with $H = 2/3$ corresponding to the central limit rescaling of the Kolmogorov velocity field, and $A_E$ is some prefactor of appropriate dimensions. We will be most interested in the case $0 < H < 1$, which describes velocity fields with long-range spatial correlations typical of turbulence ($2 < \varepsilon < 4$ in the parameterization of Section 3). If we imagine replacing the delta function in Eq. (266) by a smooth approximation, then the parameter $H$ would be exactly the Hurst exponent ([215], Section 27) characterizing the fractal spatial structure of the velocity field in the inertial range of scales (see Eq. (269) and subsequent discussion). The Hurst exponent of a true Kolmogorov velocity field is 1/3; the value $H = 2/3$ arises from a central limit of this velocity field because fluctuations decorrelate in time at a wavenumber-dependent rate. One can specify similar turbulent spectra for velocity fields with multi-dimensional geometry through the use of tensors, and the general remarks above persist without change. We shall in particular construct a multi-dimensional velocity field with isotropic statistics in Paragraph 4.2.2.1.
4.2. Evolution of the passive scalar correlation function through an inertial range of scales
We now utilize the explicit PDE (256) for the second-order passive scalar correlation function in the RDT model to explore the statistical evolution of passive scalar fluctuations. It is clear from the diffusive form of the equation that, in general, the amplitude of the fluctuations in a freely evolving passive scalar field will be damped, and the typical length scale of the fluctuations will increase due to turbulent spreading. It is instructive to examine a situation in which a universal description is possible. We thus consider a fully developed turbulent flow with well-developed inertial range, and focus on a time interval during which the predominant length scale of the fluctuations lies within the inertial range and is much larger than the length scale of the initial disturbance. In such an asymptotic regime, one might plausibly expect that most of the details about the initial structure of the PS correlation function are irrelevant, and that the dynamics are driven primarily by the inertial-range turbulent fluctuations which have universal, self-similar properties. We will see that this is indeed the case in the RDT model. The asymptotic regime just described is the same as that for which we computed fractal dimensions of scalar interfaces in Section 3.5. Another theme we will emphasize is the distinction between the action of turbulent diffusion and bare molecular diffusion on the passive scalar field, particularly when the turbulent velocity field has very strong long-range correlations ($H > 0$).
We first revisit turbulent diffusion in a shear flow, now with rapid decorrelation in time, and derive formulas for the rate of dissipation of the passive scalar fluctuations as they evolve through the inertial range of scales (Section 4.2.1). When the velocity field has long-range spatial correlations typical of fully developed turbulence, the amplitude of the passive scalar fluctuations will decay anomalously according to a power law with the exponent differing from that corresponding to ordinary diffusion. The methodology involves the relation of solutions of the RDT model
diffusion PDE to an associated quantum mechanics Schrödinger problem with potential related to the spatial structure of the shear flow [206]. This strategy will play a central role in the derivation of results concerning large-scale passive scalar intermittency in a closely related model to be discussed in Section 5. Next, in Section 4.2.2, we investigate more extensively the spreading of passive scalar fluctuations through the inertial range of scales in an isotropic RDT model. A completely self-similar solution for the second-order PS correlation function may be obtained in this asymptotic regime, and from it we may deduce the rate of decay of the passive scalar variance, the rate at which the length scales of the fluctuations grow with time, and the fractalizing (roughening) properties of turbulent diffusion.
4.2.1. Anomalous decay of passive scalar fluctuations
We begin our study of passive scalar fluctuations in the RDT model with a two-dimensional RDT shear flow with statistics described in Section 4.1.4. The RDT model equation (256) for the second-order PS correlation function $P_2(x,y,t) = \langle T(x',y',t)\,T(x'+x,\,y'+y,\,t)\rangle$ can then be expressed in terms of scalar functions:
$$\frac{\partial P_2(x,y,t)}{\partial t} = 2\kappa\,\Delta P_2(x,y,t) + D(x)\,\frac{\partial^2 P_2(x,y,t)}{\partial y^2}, \qquad P_2(x,y,t=0) = P_2^0(x,y), \tag{268}$$
with the turbulent diffusion coefficient given by
$$D(x) = R(0) - R(x).$$
Note that this PDE is very similar to the PDE for the pair-distance function in a turbulent shear flow which we derived in Section 3.5.1. Indeed, it is a general fact that the second-order PS correlation function and pair-distance function satisfy the same PDE (see for example [185] or [196]). The PDE derived in Section 3.5.1 permitted nontrivial temporal correlations, while here we have accounted for the presence of molecular diffusivity. Including both effects simultaneously is a more challenging and interesting task, as we indicated at the end of Section 3.5.
Now, we describe the behavior of the passive scalar correlation function $P_2(x,y,t)$ as it evolves through the inertial range of scales. For such a regime to exist, we must have that $L_K \ll L_0$ and that the length scale $L$ of the passive scalar correlation function satisfies $L_K \ll L \ll L_0$. As we indicated in Section 3.5.1, the $L_0 \to \infty$ limit may be directly taken in Eq. (268) for all $H < 1$. Even though the spatial correlation function of the velocity field $R(x)$ for $0 < H < 1$ approaches infinity in this limit, these divergences cancel in the expression for $D(x)$, and
$$\bar{D}(x) \equiv \lim_{L_0\to\infty} D(x) = 2A_E\int_{0}^{\infty}\big(1 - \cos(2\pi k x)\big)\,k^{-1-2H}\,\psi_\infty(kL_K)\,dk$$
is finite.
Having taken the $L_0 \to \infty$ limit, we then need to restrict our attention to time scales sufficiently large so that the length scale of the PS correlation function is much larger than both $L_K$ and its initial length scale. We do this through a large-space, long-time rescaling chosen so that the asymptotic evolution of the PS correlation function is faithfully and finitely expressed in terms of these
rescaled variables. We call this rescaling an "inertial range renormalization"; in the language of the renormalization group we are looking for rescalings leading to a nontrivial "fixed point". A similar renormalization was carried out for a spatio-temporal shear flow in Section 3.5.4, but we will not enforce isotropy of the renormalization group here. The qualitative character of the renormalization depends on whether $H < 0$ or $H > 0$. We shall consider each case in turn, first deriving the PDE describing the inertial-range evolution of the PS correlation function and then computing the rate of decay of the passive scalar variance $\langle (T(x,y,t))^2\rangle = P_2(0,0,t)$. Further details may be found in the original work [206], and the statements presented here can be verified rigorously through the Feynman-Kac representation developed in that paper.
4.2.1.1. Inertial range renormalization for $H < 0$.
In the parameter regime with $H < 0$, the turbulent velocity field does not have very strong long-range correlations, and the turbulent diffusion coefficient $\bar{D}(x)$ approaches a finite limit as the spatial scale $x$ becomes arbitrarily large:
$$\lim_{x\to\infty}\bar{D}(x) = \bar{R}(0) = 2A_E\int_{0}^{\infty} k^{-1-2H}\,\psi_\infty(kL_K)\,dk .$$
Thus, the inertial-range evolution for the PS correlation function may be described by the standard diffusive renormalization:
$$P_2^{\lambda}(x,y,t) = \lambda^{-2}\,P_2\Big(\frac{x}{\lambda}, \frac{y}{\lambda}, \frac{t}{\lambda^2}\Big).$$
The limiting function $\bar{P}_2(x,y,t) \equiv \lim_{\lambda\to 0} P_2^{\lambda}(x,y,t)$ (which may be viewed as a fixed point of the renormalization) obeys a constant-coefficient diffusion PDE with enhanced diffusion coefficient along the shearing direction:
$$\frac{\partial \bar{P}_2(x,y,t)}{\partial t} = 2\kappa\,\frac{\partial^2 \bar{P}_2(x,y,t)}{\partial x^2} + \big(2\kappa + \bar{R}(0)\big)\frac{\partial^2 \bar{P}_2(x,y,t)}{\partial y^2}, \qquad \bar{P}_2(x,y,t=0) = M\,\delta(x)\,\delta(y),$$
where $M = \int\!\!\int P_2^0(x,y)\,dx\,dy$. The renormalized PS correlation function $\bar{P}_2(x,y,t)$ thus assumes a standard (but anisotropic) Gaussian form. In particular, the variance $\langle (\bar{T}(x,y,t))^2\rangle$ of the renormalized passive scalar field $\bar{T}(x,y,t)$ decays according to a power law with the same exponent as it would under molecular diffusion alone:
$$\langle (\bar{T}(x,y,t))^2\rangle = \bar{P}_2(0,0,t) = M\,(4\pi)^{-1}\big(2\kappa + \bar{R}(0)\big)^{-1/2}(2\kappa)^{-1/2}\,t^{-1} .$$
For the case $H < 0$ just discussed, the only role of the random velocity field in the decay of the passive scalar variance is to decrease the coefficient through the term $\bar{R}(0)$.
4.2.1.2. Inertial range renormalization for $H > 0$.
Turbulence typically has very long-range spatial correlations, however, and corresponds to the class $H > 0$ which we now discuss. The standard diffusive renormalization fails in this case to produce a finite limiting inertial-range behavior, because the turbulent diffusion coefficient $\bar{D}(x)$ diverges as $x \to \infty$:
$$\lim_{x\to\infty}\bar{D}(x) = D_I\,|x|^{2H}. \tag{269}$$
The numerical coefficient in Eq. (269) is given by
$$D_I = 2A_E\int_{0}^{\infty}\big(1 - \cos(2\pi k)\big)\,k^{-1-2H}\,dk = -A_E\,\pi^{2H+1/2}\,\frac{\Gamma(-H)}{\Gamma(H+\tfrac{1}{2})},$$
where $\Gamma(\cdot)$ is the Gamma function [195]. The power law growth of $\bar{D}(x)$ in Eq. (269) manifests the property that, within the inertial range of scales, the turbulent separation of particles becomes stronger and stronger as their relative distance increases. To capture the behavior of the PS correlation function through the inertial range of scales, we must renormalize according to the following prescription [206]:
$$P_2^{\lambda}(x,y,t) \equiv \lambda^{-2-H}\,P_2\Big(\frac{x}{\lambda}, \frac{y}{\lambda^{1+H}}, \frac{t}{\lambda^2}\Big). \tag{270}$$
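The closed form quoted above for the numerical coefficient in Eq. (269) can be checked against a direct quadrature. The sketch below is an informal check, not part of the original derivation: it evaluates the integral by the midpoint rule up to an integer cutoff and adds the tail of the non-oscillatory part, $\int_{k_{\max}}^{\infty} k^{-1-2H}\,dk = k_{\max}^{-2H}/(2H)$, neglecting the small oscillatory remainder. Function names and quadrature parameters are assumptions, and the check is run only for $H = 1/2$, where the integrand is smooth at the origin and the exact value is $2\pi^2 A_E$.

```python
import math

def structure_constant_closed_form(H, A_E=1.0):
    """Closed form quoted in the text, valid for 0 < H < 1."""
    return -A_E * math.pi ** (2.0 * H + 0.5) * math.gamma(-H) / math.gamma(H + 0.5)

def structure_constant_numeric(H, A_E=1.0, k_max=200, steps_per_unit=1000):
    """2 * A_E * integral over (0, inf) of (1 - cos(2 pi k)) * k**(-1 - 2H)."""
    h = 1.0 / steps_per_unit
    total = 0.0
    for i in range(k_max * steps_per_unit):
        k = (i + 0.5) * h                       # midpoint of the i-th cell
        total += (1.0 - math.cos(2.0 * math.pi * k)) * k ** (-1.0 - 2.0 * H)
    total *= h
    tail = k_max ** (-2.0 * H) / (2.0 * H)      # non-oscillatory part beyond k_max
    return 2.0 * A_E * (total + tail)

H = 0.5
closed = structure_constant_closed_form(H)
numeric = structure_constant_numeric(H)
print(closed, numeric, 2.0 * math.pi ** 2)
```

For other values of $H$ in $(0, 1)$ the integrand has an integrable singularity or a sharper feature at $k = 0$, so a plain midpoint rule converges more slowly and a finer grid or a substitution would be needed.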
With Eq. (269), one then readily checks that the renormalized inertial-range limit $\bar{P}_2(x,y,t) \equiv \lim_{\lambda\to 0} P_2^{\lambda}(x,y,t)$ obeys the variable-coefficient diffusion PDE:
$$\frac{\partial \bar{P}_2(x,y,t)}{\partial t} = 2\kappa\,\frac{\partial^2 \bar{P}_2(x,y,t)}{\partial x^2} + D_I\,|x|^{2H}\,\frac{\partial^2 \bar{P}_2(x,y,t)}{\partial y^2}, \qquad \bar{P}_2(x,y,t=0) = M\,\delta(x)\,\delta(y). \tag{271}$$
The inertial-range limit which we have just derived is completely self-similar in the terminology of Barenblatt [25]. All length scales have disappeared in this asymptotic regime, and the only remnant of the initial data is the total spatial integral of the initial PS correlation function, $M$, which is an exactly conserved quantity. A related fact is the invariance of the PDE (271) under the rescalings defining the renormalization (270). Therefore, in the RDT model under discussion, the second-order PS correlation function is universal and scale-invariant within the inertial range of scales, when the random velocity field has very long range correlations ($H > 0$). We will examine the self-similarity properties of the passive scalar correlation function in an isotropic version of the RDT model in Section 4.2.2. Here, we shall simply examine the rate of decay of the variance of the renormalized passive scalar field $\langle (\bar{T}(x,y,t))^2\rangle$, and show that it decays anomalously with a power law $t^{-1-H/2}$. The fact that the turbulent flow is a shear flow along the $y$ direction puts the PDE (271) for $\bar{P}_2(x,y,t)$ in a form well suited for a partial Fourier transform in the $y$ variable [206]:
$$\hat{\bar{P}}_2(x,k,t) = \int_{-\infty}^{\infty} e^{-2\pi i k y}\,\bar{P}_2(x,y,t)\,dy .$$
The resulting PDE for $\hat{\bar{P}}_2(x,k,t)$ takes the form of a quantum mechanical Schrödinger equation (in imaginary time):
$$\frac{\partial \hat{\bar{P}}_2(x,k,t)}{\partial t} = 2\kappa\,\frac{\partial^2 \hat{\bar{P}}_2(x,k,t)}{\partial x^2} - 4\pi^2 D_I\,k^2\,|x|^{2H}\,\hat{\bar{P}}_2(x,k,t), \qquad \hat{\bar{P}}_2(x,k,t=0) = M\,\delta(x). \tag{272}$$
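Before turning to the eigenfunction calculation, it may help to see how the decay exponent already follows from the scale invariance of (271) under (270). The following is a short worked check, written in our own notation ($Q$ is a hypothetical name for the rescaled solution) and assuming that the solution of (271) with the given initial data is unique. Define $Q(x,y,t) \equiv \lambda^{-2-H}\,\bar{P}_2(x/\lambda,\ y/\lambda^{1+H},\ t/\lambda^{2})$. Each term of (271) then picks up the same overall factor:

```latex
\begin{aligned}
\partial_t Q &= \lambda^{-2-H}\,\lambda^{-2}\,(\partial_t \bar{P}_2), \\
2\kappa\,\partial_x^2 Q &= \lambda^{-2-H}\,\lambda^{-2}\,\bigl(2\kappa\,\partial_x^2 \bar{P}_2\bigr), \\
D_I\,|x|^{2H}\,\partial_y^2 Q
  &= \lambda^{-2-H}\,D_I\,\lambda^{2H}\,\Bigl|\frac{x}{\lambda}\Bigr|^{2H}\,\lambda^{-2(1+H)}\,(\partial_y^2 \bar{P}_2)
   = \lambda^{-2-H}\,\lambda^{-2}\,\Bigl(D_I\,\Bigl|\frac{x}{\lambda}\Bigr|^{2H}\,\partial_y^2 \bar{P}_2\Bigr), \\
Q(x,y,0) &= \lambda^{-2-H}\,M\,\delta\!\Bigl(\frac{x}{\lambda}\Bigr)\,\delta\!\Bigl(\frac{y}{\lambda^{1+H}}\Bigr)
   = M\,\delta(x)\,\delta(y),
\end{aligned}
```

where the derivatives of $\bar{P}_2$ on the right are evaluated at the rescaled arguments. Hence $Q$ solves the same problem as $\bar{P}_2$, so $Q = \bar{P}_2$ for every $\lambda > 0$. Setting $x = y = 0$ and $\lambda = t^{1/2}$ gives

```latex
\bar{P}_2(0,0,t) = t^{-(2+H)/2}\,\bar{P}_2(0,0,1),
```

which is the anomalous power law $t^{-1-H/2}$. The eigenfunction expansion that follows is what determines the prefactor $\bar{P}_2(0,0,1)$.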
Since the effective potential function $4\pi^2 D_I\,k^2\,|x|^{2H}$ grows sufficiently rapidly as $|x| \to \infty$, the operator on the right-hand side has pure point spectrum [323], and the solution to Eq. (272) may be represented as a superposition of orthonormal eigenfunctions corresponding to "bound states" of the corresponding quantum mechanical system. This is in contrast with the situation for $H < 0$, in which the effective potential would be a constant function, and the quantum mechanical operator would correspond to a free particle and have purely continuous spectrum.
Define now $\{\psi_j(x)\}_{j=0}^{\infty}$ to be the normalized eigenfunctions of a nondimensionalized Schrödinger operator corresponding to the right-hand side of Eq. (272):
$$-\frac{d^2\psi_j(x)}{dx^2} + |x|^{2H}\,\psi_j(x) = \mu_j\,\psi_j(x), \qquad \int_{-\infty}^{\infty}(\psi_j(x))^2\,dx = 1 .$$
We only need to account for the even real eigenfunctions due to the parity invariance of the potential and the evenness of our initial data. The "energy" eigenvalues are $\mu_0 < \mu_1 < \mu_2 < \cdots$, and $\mu_0 > 0$ is guaranteed by the positivity of the potential. Through the rescaling:
$$\tilde{\psi}_j(x) = (2\pi^2 D_I k^2)^{1/(4+4H)}\,\kappa^{-1/(4+4H)}\,\psi_j\big((2\pi^2 D_I k^2)^{1/(2+2H)}\,\kappa^{-1/(2+2H)}\,x\big), \qquad \tilde{\mu}_j = 2\kappa^{H/(H+1)}\,(2\pi^2 D_I k^2)^{1/(H+1)}\,\mu_j,$$
the eigenfunction expansion of the solution to Eq. (272) reads
$$\hat{\bar{P}}_2(x,k,t) = M\sum_{j} e^{-\tilde{\mu}_j t}\,\tilde{\psi}_j(x)\int_{-\infty}^{\infty}\tilde{\psi}_j(x')\,\delta(x')\,dx' = M\sum_{j} e^{-\tilde{\mu}_j t}\,\tilde{\psi}_j(x)\,\tilde{\psi}_j(0) .$$
Now, the passive scalar variance is determined by this partial Fourier transform as follows:
$$\langle (\bar{T}(x,y,t))^2\rangle = \bar{P}_2(0,0,t) = \int_{-\infty}^{\infty}\hat{\bar{P}}_2(0,k,t)\,dk .$$
We compute this quantity by expressing $\hat{\bar{P}}_2(x,k,t)$ in terms of the parameter-free eigenfunctions and eigenvalues $\psi_j(x)$ and $\mu_j$:
$$\langle (\bar{T}(x,y,t))^2\rangle = \int_{-\infty}^{\infty}\hat{\bar{P}}_2(0,k,t)\,dk = M\int_{-\infty}^{\infty}\sum_{j}(2\pi^2 D_I k^2)^{1/(2+2H)}\,\kappa^{-1/(2+2H)}\,|\psi_j(0)|^2\,\exp\big(-2\kappa^{H/(H+1)}(2\pi^2 D_I k^2)^{1/(H+1)}\mu_j t\big)\,dk = M\,C_H\,\kappa^{-(H+1)/2}\,(2\pi^2 D_I)^{-1/2}\,t^{-(H+2)/2},$$
where the numerical constant is
$$C_H = 2^{-(H+2)/2}\,(H+1)\,\Gamma\big((H+2)/2\big)\sum_{j}\mu_j^{-(H+2)/2}\,|\psi_j(0)|^2 .$$
In particular, we see that the passive scalar variance decays through the inertial range according to an anomalous power law $t^{-(H+2)/2}$ for $H > 0$, whereas for $H < 0$, the passive scalar variance decays
according to the ordinary di!usive power law t\. Note that the passive scalar variance decays more rapidly at long times due to the presence of long-range correlations (H'0), and that the decay rate is self-consistent with the self-similar inertial range scaling (270) of the amplitude and time argument of the PS correlation function. As we will see more clearly in Section 4.2.2, the faster decay is due to the rapid dispersal of the passive scalar #uctuations by the energetic longwavelength #uctuations of the turbulent #ow. 4.2.2. Self-similar spreading of passive scalar -uctuations For the rest of Section 4, we will explore further issues concerning the passive scalar correlation function within the context of an RDT model with statistical isotropy, i.e., no preferred direction. This is a natural geometry for describing `generica fully developed turbulence, particularly since the small-scale #uctuations may be expected in many circumstances to be insensitive to the large scale con"guration. The assumption of statistical isotropy furthermore simpli"es the mathematical analysis by reducing the dimensionality of the di!usion PDE (256) for the PS correlation function. This a!ords us the possibility of describing the second order passive scalar statistics in a rather explicit fashion. Here we give a complete description of the PS correlation function during the same `inertial-rangea phase of evolution just considered for the shear RDT model, during which the length scale of the passive scalar #uctuations lies within the inertial range of scales and is much larger than the length scale of the initial disturbance. We begin by de"ning the correlation function for the isotropic RDT velocity "eld through a spatio-temporal energy spectrum with inertial-range scaling, and conduct an inertial-range renormalization in the same manner as we did for the shear RDT model. The PDE describing the renormalized PS correlation function is then exactly solved, and we read o! 
from this solution some properties concerning the evolution of passive scalar fluctuations and the separation of a pair of particles on length scales within the inertial range.

4.2.2.1. Setup for isotropic RDT model. As the velocity field is now multi-dimensional (with $d = 2$ or 3 denoting the spatial dimension), its correlations must be described by a tensor rather than a scalar function. The general relation between the correlation tensor and the scalar spatio-temporal energy spectrum $\tilde{E}(k)$ describing the strength of the velocity fluctuations at various wavenumbers $k$ is given by (see [341], Sections 9 and 22):

$$\langle \mathbf{v}(\mathbf{x},t)\otimes\mathbf{v}(\mathbf{x}+\mathbf{x}',t+t')\rangle \equiv \mathcal{R}(\mathbf{x}')\,\delta(t'), \qquad \mathcal{R}(\mathbf{x}) = \int_{\mathbb{R}^d} e^{2\pi i\,\mathbf{k}\cdot\mathbf{x}}\,(\mathcal{I}-\hat{\mathbf{k}}\otimes\hat{\mathbf{k}})\,\frac{2\tilde{E}(|\mathbf{k}|)}{(d-1)A_{d-1}|\mathbf{k}|^{d-1}}\,d\mathbf{k}. \tag{273}$$

The tensor factor $\mathcal{I}-\hat{\mathbf{k}}\otimes\hat{\mathbf{k}}$, where $\mathcal{I}$ is the identity matrix and $\hat{\mathbf{k}} = \mathbf{k}/|\mathbf{k}|$, is a projection which enforces incompressibility. The constant $A_{d-1}$ is the area of the $(d-1)$-dimensional sphere. We take the same inertial-range form of the spatio-temporal energy spectrum as that which we have been using for a shear geometry:

$$\tilde{E}(k) = A_E\,k^{-1-2H}\,\psi_0(kL_0)\,\psi_\infty(kL_K).$$
Because of statistical isotropy and incompressibility, the turbulent diffusion tensor $\mathcal{D}(\mathbf{r}) = \mathcal{R}(\mathbf{0})-\mathcal{R}(\mathbf{r})$ appearing in Eq. (256) may be expressed in terms of a scalar function of a single variable ([341], pp. 380-383):

$$\mathcal{D}(\mathbf{r}) = D_\parallel(r)\,\hat{\mathbf{r}}\otimes\hat{\mathbf{r}} + D_\perp(r)\,(\mathcal{I}-\hat{\mathbf{r}}\otimes\hat{\mathbf{r}}), \qquad D_\perp(r) = \frac{\bigl(r^{d-1}D_\parallel(r)\bigr)'}{(d-1)\,r^{d-2}}. \tag{274}$$

$D_\parallel(r)$ is half the mean-square longitudinal velocity difference, and $D_\perp(r)$ is half the mean-square lateral velocity difference observed at two points separated by a distance $r$. If we assume that the passive scalar statistics are initially statistically isotropic, then by symmetry the passive scalar statistics remain statistically isotropic, and the PS correlation function may be expressed for all time as a function of a single space variable $r = |\mathbf{r}|$ and a single time variable:

$$\langle T(\mathbf{x},t)\,T(\mathbf{x}+\mathbf{r},t)\rangle = P_2(|\mathbf{r}|,t), \qquad \langle T_0(\mathbf{x})\,T_0(\mathbf{x}+\mathbf{r})\rangle = P_2^0(|\mathbf{r}|).$$

With Eq. (274), we can then write the PDE (256) for the second-order PS correlation function in the following form:

Isotropic evolution of second-order passive scalar correlation function

$$\frac{\partial P_2(r,t)}{\partial t} = \frac{1}{r^{d-1}}\frac{\partial}{\partial r}\left(r^{d-1}\bigl(2\kappa + D_\parallel(r)\bigr)\frac{\partial P_2(r,t)}{\partial r}\right), \qquad P_2(r,t=0) = P_2^0(r). \tag{275}$$
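The incompressibility relation in Eq. (274) can be sanity-checked numerically. The following is a minimal sketch, not part of the original analysis: the helper names `d_perp` and `power_law` and the sample values are illustrative assumptions. It evaluates the lateral coefficient from a given longitudinal one by a central finite difference and compares the result with the closed form $D_\perp/D_\parallel = 1 + 2H/(d-1)$ that follows from Eq. (274) for a pure power law $D_\parallel(r)\propto r^{2H}$.

```python
def d_perp(d_par, r, d, h=1e-6):
    """Lateral coefficient from Eq. (274):
    (r^(d-1) * D_par(r))' / ((d-1) * r^(d-2)), via a central difference."""
    g = lambda s: s ** (d - 1) * d_par(s)
    deriv = (g(r + h) - g(r - h)) / (2.0 * h)
    return deriv / ((d - 1) * r ** (d - 2))


def power_law(coeff, H):
    """Hypothetical inertial-range form D_par(r) = coeff * r^(2H)."""
    return lambda r: coeff * r ** (2.0 * H)


if __name__ == "__main__":
    d, H, r = 3, 1.0 / 3.0, 0.7          # illustrative values only
    d_par = power_law(1.0, H)
    numeric = d_perp(d_par, r, d) / d_par(r)
    closed_form = 1.0 + 2.0 * H / (d - 1)
    print(numeric, closed_form)
```

For a power law the ratio is independent of $r$, so the lateral and longitudinal coefficients share the same inertial-range scaling exponent.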
4.2.2.2. Inertial-range renormalization of isotropic RDT model. Now we are in a position to analyze the evolution of $P_2(r,t)$ through the inertial range of scales through a renormalization procedure similar to that in our earlier discussion of the shear RDT model. It is readily checked that for $H<0$, the renormalized PS correlation function obeys, as in the shear case, a constant-coefficient diffusion equation with the diffusion coefficient enhanced by the presence of the random velocity field. The PS correlation function thus assumes a standard Gaussian shape and spreads and decays according to ordinary diffusive laws. We shall focus on the case of very long-range correlations $H>0$, which exhibits more interesting behavior and has qualitative similarities to real-world turbulence. By taking $L_0\to\infty$ and renormalizing the PS correlation function according to the law

$$P_2^{(\lambda)}(r,t) = \lambda^{-d}\,P_2\bigl(r/\lambda,\ t/\lambda^{2-2H}\bigr),$$

we find formally that the inertial-range limit $\bar{P}_2(r,t) \equiv \lim_{\lambda\to 0} P_2^{(\lambda)}(r,t)$ obeys the PDE

$$\frac{\partial \bar{P}_2(r,t)}{\partial t} = \frac{1}{r^{d-1}}\frac{\partial}{\partial r}\left(D_L^I\,r^{2H+d-1}\,\frac{\partial \bar{P}_2(r,t)}{\partial r}\right), \qquad \bar{P}_2(r,t=0) = \frac{M\,\delta(r)}{A_{d-1}\,r^{d-1}}. \tag{276}$$
We have used the fact that within the inertial range of scales, $D_\parallel(r)$ grows as a power law:

$$D_\parallel(r) \to D_L^I\,r^{2H} \quad (L_K \ll r \ll L_0), \qquad D_L^I = \frac{2\,\Gamma(-H)\,\Gamma(d/2)}{\Gamma((2+2H+d)/2)}\,\pi^{2H+[\ldots]}\,A_E.$$

The inertial-range limit for the isotropic RDT model is completely self-similar, just as for the shear RDT model in Section 4.2.1. The only memory of the initial data is the spatial integral of the initial PS correlation function

$$M = \int_0^\infty A_{d-1}\,r^{d-1}\,P_2^0(r)\,dr.$$

Rigorous verification of this complete self-similarity and of the convergence of the passive scalar correlation function to the solution of Eq. (276) under the inertial-range renormalization is more subtle than in the shear case; see [208] for some positive mathematical results and a discussion.
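A short scaling check makes the choice of exponents in the renormalization plausible. It is sketched here under the simplifying assumption that the longitudinal diffusivity is exactly the power law $D_L^I r^{2H}$ at all scales, with $D_L^I$ denoting the inertial-range coefficient and $M$ the spatial integral of the initial correlation function. Suppose $P(r,t)$ solves the power-law equation (276) and set $P^{(\lambda)}(r,t) = \lambda^{-d}P(s,\tau)$ with $s = r/\lambda$ and $\tau = t/\lambda^{2-2H}$. Then

$$\frac{\partial P^{(\lambda)}}{\partial t} = \lambda^{-d}\,\lambda^{-(2-2H)}\,\frac{\partial P}{\partial \tau}(s,\tau), \qquad \frac{1}{r^{d-1}}\frac{\partial}{\partial r}\left(D_L^I\,r^{2H+d-1}\frac{\partial P^{(\lambda)}}{\partial r}\right) = \lambda^{-d}\,\lambda^{2H-2}\,\frac{1}{s^{d-1}}\frac{\partial}{\partial s}\left(D_L^I\,s^{2H+d-1}\frac{\partial P}{\partial s}\right)(s,\tau).$$

Both sides carry the same factor $\lambda^{-d-(2-2H)}$, so $P^{(\lambda)}$ solves the same equation. The prefactor $\lambda^{-d}$ is the one that preserves the conserved integral:

$$\int_0^\infty A_{d-1}\,r^{d-1}\,P^{(\lambda)}(r,t)\,dr = \int_0^\infty A_{d-1}\,s^{d-1}\,P(s,\tau)\,ds = M.$$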
4.2.2.3. Exact solution of renormalized PDE. We now proceed with the development of an exact solution for the renormalized PS correlation function $\bar{P}_2(r,t)$, from which we will be able to deduce a number of properties concerning the statistics of passive scalar fluctuations as they evolve through the inertial range of scales. The assumption of statistical isotropy has reduced the complexity of the PDE (276) for $\bar{P}_2(r,t)$ to the point that it can be solved by dimensional analysis [24]. There are three independent dimensions: length, time, and the passive scalar density. Five different parameters and variables appear in the PDE defining $\bar{P}_2(r,t)$, so by general principles of dimensional analysis [24,25], we can re-express the PDE in terms of $5-3=2$ dimensionless variables. One natural way to choose these dimensionless variables is to nondimensionalize the dependent quantity $\bar{P}_2$ and one of the independent variables (say $r$) with respect to the remaining variables and parameters ($t$, $D_L^I$, and $M$). Thus, we define the nondimensional quantities:

$$\xi = \frac{r}{(D_L^I t)^{1/(2-2H)}}, \qquad Q_H = \frac{\bar{P}_2}{M\,(D_L^I t)^{-d/(2-2H)}}.$$

As dimensional analysis guarantees, the PDE (276) is equivalent to an ODE for the nondimensionalized function $Q_H = Q_H(\xi)$:

$$-\frac{d}{2-2H}\,Q_H(\xi) - \frac{1}{2-2H}\,\xi\,\frac{dQ_H(\xi)}{d\xi} = \xi^{1-d}\,\frac{d}{d\xi}\left(\xi^{2H+d-1}\,\frac{dQ_H(\xi)}{d\xi}\right). \tag{277}$$

This ODE is to be solved with the auxiliary condition

$$\int_0^\infty Q_H(\xi)\,A_{d-1}\,\xi^{d-1}\,d\xi = 1, \tag{278}$$
which expresses the fact that the spatial integral of the correlation function $\bar{P}_2(r,t)$ is conserved and equal to its value $M$ at time $t=0$. This ODE problem may be exactly solved by quadrature. After multiplying Eq. (277) through by $\xi^{d-1}$, it becomes a perfect derivative:

$$\frac{d}{d\xi}\left[\xi^{2H+d-1}\,\frac{dQ_H(\xi)}{d\xi} + \frac{\xi^{d}\,Q_H(\xi)}{2-2H}\right] = 0.$$

Integrating once, we have

$$\xi^{2H+d-1}\,\frac{dQ_H(\xi)}{d\xi} + \frac{\xi^{d}\,Q_H(\xi)}{2-2H} = C,$$

and it is readily checked that the integration constant $C$ must vanish for the integral in Eq. (278) to be finite. The remaining first-order ODE is thus separable and easily integrated to give

$$Q_H(\xi) = C_H\exp\left(-\frac{\xi^{2-2H}}{(2-2H)^2}\right). \tag{279}$$
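For completeness, the separation-of-variables step can be written out as follows; this is a routine worked computation added for illustration, with $\xi$ the similarity variable and $Q_H$ the nondimensional profile. With $C=0$, dividing the first-order ODE by $\xi^{2H+d-1}Q_H$ gives

$$\frac{1}{Q_H}\frac{dQ_H}{d\xi} = -\frac{\xi^{1-2H}}{2-2H} \quad\Longrightarrow\quad \ln Q_H(\xi) = -\frac{\xi^{2-2H}}{(2-2H)^2} + \text{const},$$

since $\int \xi^{1-2H}\,d\xi = \xi^{2-2H}/(2-2H)$. As a consistency check, formally setting $H=0$ gives $Q_0(\xi)\propto\exp(-\xi^2/4)$, i.e. $\exp(-r^2/(4Dt))$ in dimensional variables, which is the familiar Gaussian kernel of a constant-coefficient diffusion equation with diffusivity $D$.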
The normalization constant $C_H$ must be chosen so that Eq. (278) holds; this gives

$$C_H = \frac{\Gamma(d/2)}{2\,\pi^{d/2}\,(2-2H)^{(d+H-1)/(1-H)}\,\Gamma\bigl(d/(2-2H)\bigr)}.$$

Re-expressing this result in the original dimensional variables through Eq. (277), we obtain the following.

Exact solution for the renormalized passive scalar correlation function:

$$\bar{P}_2(r,t) = \frac{M}{(D_L^I t)^{d/(2-2H)}}\;Q_H\!\left(\frac{r}{(D_L^I t)^{1/(2-2H)}}\right), \tag{280}$$

with $Q_H$ given by Eq. (279) for $H>0$.
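The normalization of the similarity profile can be verified numerically. The sketch below is illustrative only: the function names (`sphere_area`, `c_h`, `q_h`, `normalization`), the truncation point `xi_max`, and the midpoint rule are assumptions made for this example. It evaluates the profile of Eq. (279) with the quoted constant and checks that the integral in Eq. (278) is close to one.

```python
import math


def sphere_area(d):
    """Area A_{d-1} of the unit sphere in d dimensions."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


def c_h(H, d):
    """Normalization constant C_H quoted after Eq. (279)."""
    b = 2.0 - 2.0 * H
    power = (d + H - 1.0) / (1.0 - H)
    return math.gamma(d / 2.0) / (
        2.0 * math.pi ** (d / 2.0) * b ** power * math.gamma(d / b)
    )


def q_h(xi, H, d):
    """Similarity profile Q_H(xi) of Eq. (279)."""
    b = 2.0 - 2.0 * H
    return c_h(H, d) * math.exp(-(xi ** b) / b ** 2)


def normalization(H, d, xi_max=60.0, n=20000):
    """Midpoint-rule estimate of the integral in Eq. (278),
    truncated at xi_max (adequate only when the tail is negligible)."""
    h = xi_max / n
    area = sphere_area(d)
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += q_h(xi, H, d) * area * xi ** (d - 1)
    return total * h


if __name__ == "__main__":
    for H, d in [(1.0 / 3.0, 3), (2.0 / 3.0, 2)]:
        print(H, d, normalization(H, d))
```

The default truncation is chosen for moderate values of $H$; as $H$ approaches 1 the profile decays very slowly in $\xi$ and a larger `xi_max` would be needed.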
4.2.2.4. Inertial-range properties of passive scalar fluctuations and relative diffusion of particle pairs. With this exact solution, we can read off several features concerning the evolution of the passive scalar fluctuations on length scales which lie within the inertial range and are much larger than the correlation length scale of the initial passive scalar field. First, we see that, as with the shear RDT model in Section 4.2.1, the passive scalar variance decays anomalously for $H>0$:

$$\langle T^2(\mathbf{x},t)\rangle \sim \bar{P}_2(0,t) \sim t^{-d/(2-2H)}.$$

Under ordinary diffusive decay, the scaling exponent would be $-d/2$, corresponding to slower decay of variance. We now proceed to use the explicit form of the shape of the PS correlation function to infer properties of relative tracer diffusion and the roughness of the passive scalar field.

Relative tracer diffusion. Below we will see that the length scale of the passive scalar fluctuations grows with time in proportion to $t^{1/(2-2H)}$. Our discussion of this property will be facilitated by the general relation between the second-order PS correlation function and the probability distribution
for the relative separation of a pair of tracer particles ([196], Section 8.5). In the present case, $\bar{P}_2(|\mathbf{r}|,t)$ is proportional to the probability distribution over $\mathbb{R}^d$ of the relative displacement $\mathbf{r}$ of a pair of tracers at large times $t$ for which the tracer separation distance $\rho(t)$ is statistically concentrated within the inertial range of scales and is much larger than the initial separation. The constant of proportionality is $M$, the integral of the initial PS correlation function. The fact that the length scale of the correlation function scales as $t^{1/(2-2H)}$ means precisely, then, that the typical relative separation between a pair of tracers grows in time proportionally to $t^{1/(2-2H)}$ as they evolve through the inertial range. In particular, the mean-square relative tracer displacement may be computed as

$$\langle\rho^2(t)\rangle = M^{-1}\int_{\mathbb{R}^d}|\mathbf{x}|^2\,\bar{P}_2(|\mathbf{x}|,t)\,d\mathbf{x} = M^{-1}\int_0^\infty r^2\bigl(A_{d-1}\,r^{d-1}\,\bar{P}_2(r,t)\bigr)\,dr = C_R\,(D_L^I t)^{1/(1-H)},$$

with the numerical constant given by

$$C_R = (2-2H)^{2/(1-H)}\;\frac{\Gamma\bigl((d+2)/(2-2H)\bigr)}{\Gamma\bigl(d/(2-2H)\bigr)}.$$
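The prefactor and growth exponent of the mean-square separation can be cross-checked against a direct numerical second moment of the similarity profile $\exp(-\xi^{2-2H}/(2-2H)^2)$. The sketch below is illustrative: the function names and the midpoint quadrature are assumptions for this example, not taken from the original text.

```python
import math


def c_r(H, d):
    """Closed-form prefactor of the mean-square separation."""
    b = 2.0 - 2.0 * H
    return b ** (4.0 / b) * math.gamma((d + 2) / b) / math.gamma(d / b)


def growth_exponent(H):
    """Exponent of t in the mean-square separation, 1/(1-H)."""
    return 1.0 / (1.0 - H)


def second_moment_quadrature(H, d, xi_max=80.0, n=20000):
    """Ratio of the (d+1)-th to the (d-1)-th radial moment of
    exp(-xi^b / b^2), b = 2 - 2H, by the midpoint rule."""
    b = 2.0 - 2.0 * H
    h = xi_max / n
    num = den = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        w = math.exp(-(xi ** b) / b ** 2)
        num += xi ** (d + 1) * w
        den += xi ** (d - 1) * w
    return num / den


if __name__ == "__main__":
    H, d = 1.0 / 3.0, 3
    print(c_r(H, d), second_moment_quadrature(H, d))
    print(growth_exponent(2.0 / 3.0))  # exponent for the Richardson case
```

Formally setting $H=0$ in $d=3$ gives a prefactor of 6, which matches the mean-square relative displacement $2dDt$ of ordinary diffusion with constant diffusivity $D$.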
Under ordinary diffusion, the mean-square displacement grows linearly with time. The presence of the rapidly decorrelating velocity field with very long-range spatial correlations ($H>0$) causes particles to separate within the inertial range at a more rapid rate. In fact, the exponent of the power law of the relative separation becomes arbitrarily large as $H\to 1$. In 1926, Richardson [284] predicted a law of this type for real-world turbulent diffusion as a consequence of the increase of relative diffusivity of a pair of tracers with separation distance. Through a fitting of available data from observation of balloon motion, Richardson found the relative diffusivity to scale as the 4/3 power of the separation distance and postulated a PDE of a form similar to that which we have been discussing, Eq. (276) with $H=2/3$. Solving the equation, Richardson deduced that the mean-square separation of a pair of tracers should grow as $t^3$. Richardson's prediction has found a good amount of numerical and experimental confirmation, as we shall later discuss in Section 6. Here, we can say that Richardson's reasoning is exactly valid for a Gaussian velocity field with rapid decorrelation in time. Furthermore, with the central limit rescaling discussed above (267), the familiar Kolmogorov spectrum corresponds in the RDT model exactly to the value $H=2/3$!

As a concrete illustration of this theoretical result, we present in Fig. 19 the results of a numerical Monte Carlo simulation of the relative dispersion of a pair of tracers in a rapidly decorrelating, isotropic velocity field with Hurst exponent $H=1/3$. Through use of a multiwavelet method [84,85], which will be discussed later in Section 6, the numerically synthesized random velocity field supports a wide inertial range extending from scales $3\times 10^{-[\ldots]}$ to $10^{[\ldots]}$. For all times plotted, the tracer separation lies well within the inertial range of the simulated velocity field.
In the upper plot, we see that the root-mean-square relative tracer displacement $\langle\rho^2(t)\rangle^{1/2}$ settles down to the predicted $t^{3/4}$ power law growth over two decades of spatial scales. We plot the logarithmic derivative of the mean-square relative displacement, $d\ln\langle\rho^2(t)\rangle/d\ln t$, as a function of time in the lower part of the figure as a more stringent test of the apparent power law behavior. The theoretically predicted scaling behavior corresponds to a constant logarithmic derivative of 3/2, and indeed the numerically computed logarithmic derivative hovers quite closely to this value after
Fig. 19. Monte Carlo simulation of tracer pair dispersion through inertial range of isotropic, rapidly decorrelating velocity field with $H=1/3$. Upper graph: log-log plot of root-mean-square tracer separation as a function of time. Lower graph: logarithmic derivative of the mean-square relative separation as a function of time.
an initial transient period. Besides producing a manifest realization of the inertial-range asymptotic theory for relative tracer diffusion in the RDT model, these numerical results also demonstrate the capacity of the underlying Monte Carlo method to generate accurate statistical scaling behavior over several decades. This is not an easy task, as we shall discuss in Section 6. Thus far, we have discussed only the variance of the relative displacement of a pair of tracers, but our exact solution $\bar{P}_2(r,t)$ in fact gives the full probability density function (PDF) for this
random quantity. The relative displacement between two tracers undergoing independent Brownian motion is described by a Gaussian PDF. From Eq. (280), we see that when a pair of tracers are separated within the inertial range of scales of an isotropic RDT velocity field with long-range correlations $H>0$, their relative displacement has a PDF which has a broader-than-Gaussian shape. That is, the PDF decays more slowly at large values than a Gaussian with the same mean and variance does. The tangible manifestation of a random quantity with a broader-than-Gaussian PDF is an unusually high probability for large fluctuations. The reason why the relative tracer displacement should exhibit large fluctuations within the inertial range can be understood from a positive feedback effect arising from the fact that turbulent diffusion is more effective in separating particles the further apart they already are. If a random fluctuation causes a pair of particles momentarily to be separated by a distance greater than average, then they will be predisposed to be separated even more rapidly at later times.

Turbulent roughening of passive scalar field. We draw a final contrast between molecular diffusion and turbulent diffusion from the isotropic RDT model through a consideration of the smoothness of the passive scalar field. One measure of this smoothness, viewed on the inertial range of scales, is given by the behavior of the structure function of the renormalized passive scalar field $\bar{T}(\mathbf{x},t)$:

$$\bar{S}_2(|\mathbf{r}|,t) = \bigl\langle\bigl(\bar{T}(\mathbf{x}+\mathbf{r},t)-\bar{T}(\mathbf{x},t)\bigr)^2\bigr\rangle$$

as $r\to 0$. Noting that this quantity is nothing other than $2\bigl(\bar{P}_2(0,t)-\bar{P}_2(|\mathbf{r}|,t)\bigr)$, we have from the exact solution (280) that

$$\bar{S}_2(r,t) = \frac{2\,C_H\,M}{(D_L^I t)^{d/(2-2H)}}\left[1-\exp\left(-\frac{r^{2-2H}}{(2-2H)^2\,D_L^I t}\right)\right].$$

One now readily checks that there exist positive numerical constants $C_-$, $C_+$, and $r_0$ so that

$$C_-\,\frac{r^{2-2H}}{(D_L^I t)^{(d+2-2H)/(2-2H)}} \le \bigl\langle\bigl(\bar{T}(\mathbf{x}+\mathbf{r},t)-\bar{T}(\mathbf{x},t)\bigr)^2\bigr\rangle \le C_+\,\frac{r^{2-2H}}{(D_L^I t)^{(d+2-2H)/(2-2H)}}$$

over the expanding region $0\le r\le r_0\,(D_L^I t)^{1/(2-2H)}$.

The structure function of a smooth, isotropic random field vanishes as $O(r^2)$ for small $r$, but we see that the structure function of $\bar{T}(\mathbf{x},t)$ vanishes according to a slower power law as $r\to 0$, indicating a rough fractal structure of the passive scalar field in the inertial range of scales. Moreover, this fractal structure spreads to larger length scales as time evolves. The formal fractal Hurst exponent of this passive scalar field structure is $1-H$, but because the passive scalar field is non-Gaussian, we cannot use Orey's theorem (discussed in Paragraph 3.5.3.1) to say this exponent characterizes the fractal structure of individual realizations. Under ordinary molecular diffusion, sharp features are damped out and the passive scalar field would appear smooth on any given length scale $l$ after a sufficiently large time $\sim l^2/\kappa$. On the other hand, turbulent diffusion by an isotropic, rapidly decorrelating velocity field creates a rough fractal structure on the passive scalar field over an increasing band of scales within the inertial range even as the amplitude of the fluctuations decays. As we have discussed in Section 3.5 in the context of scalar interfaces, the reason for this distinction is the long-range correlations of the
turbulent velocity field. Through advection, the fractal inertial-range spatial structure of the velocity field is impressed upon the passive scalar field. We note from our results of Section 3.5 that the rapid decorrelation limit smooths out the passive scalar level sets on the inertial range of scales in a shear flow. We see here, by contrast, that rough fractal structure persists throughout the inertial range of scales of the passive scalar field in an isotropic turbulent flow, even when it decorrelates rapidly in time.

4.3. Scaling regimes in spectrum of fluctuations of driven passive scalar field

In Section 4.2, we have discussed some physical characteristics of the diffusion and free decay of passive scalar fluctuations advected by a turbulent RDT model velocity field. For large times during which the length scale of the fluctuations passed through the inertial range of scales, we were able to describe a number of universal properties of the passive scalar field. Another situation in which one may hope to observe universal features in the passive scalar field is in a damped and driven statistically stationary state analogous to that of a fully developed turbulent velocity field. That is, we envision some external mechanisms stirring the fluid and agitating the passive scalar field at some large length scales. This directly creates large-wavelength (small-wavenumber) fluctuations in the velocity and passive scalar field, which then break up into smaller-scale (higher-wavenumber) fluctuations through nonlinear interactions. Sufficiently small fluctuations are damped out by viscosity and molecular diffusion. One can generally expect that if the driving is applied in a statistically steady fashion, the turbulent system will achieve a statistically stationary state in which the input of energy at large scales is balanced by dissipation at very small scales, and the statistics of the velocity and passive scalar field settle down to a time-independent form.
For conciseness, we will often refer to such a statistically stationary state as "quasi-equilibrium", with the "quasi-" prefix differentiating the present strongly damped and driven statistical equilibrium from a thermal equilibrium system weakly coupled to its surroundings. We recall that for the velocity field, Kolmogorov formulated the well-known hypotheses that (see Paragraph 3.4.3.1 and [169,196]):
1. the statistics of the velocity fluctuations at wavenumbers much greater than those characterizing the driving should be independent of the large scales, and that
2. if the system is driven sufficiently strongly (Reynolds number is high enough) so that there is a wide separation between the scale of the driving $L_0$ and the scale of dissipation $L_K$, then the dynamics of the velocity field well within the intervening inertial range of scales is completely self-similar and independent of both the large scales and viscosity. From this follows the famous $k^{-5/3}$ prediction for the energy spectrum in the inertial range of wavenumbers $L_0^{-1}\ll k\ll L_K^{-1}$.
These hypotheses have been largely confirmed with important provisos; see [309] for a recent review of theoretical and experimental developments. Here we shall only be concerned with these basic concepts, particularly as they apply to the structure of the passive scalar field in the statistically stationary, damped and driven state described above. The passive scalar field responds directly to the turbulent fluctuations in the velocity field. As the statistics of the velocity field are believed to be universal to some degree on scales well below the scale $L_0$ at which the turbulence is driven, it is natural to suppose that the passive scalar fluctuations should also be universal at small scales. Furthermore, one might expect there to exist
self-similar scaling regimes for the spectrum of passive scalar fluctuations just as the energy spectrum of the velocity field exhibits the Kolmogorov $k^{-5/3}$ scaling in the inertial range. Indeed various theories have been formulated predicting a variety of passive scalar spectral scaling regimes over certain asymptotic ranges of wavenumbers [28,29,76,120,254]. None of the theoretical predictions for the scaling regimes of the velocity or passive scalar spectra makes use of the exact PDEs describing the physics. Mathematically understanding the statistics of the velocity field as a solution to the Navier-Stokes equations with random initial data is extremely difficult due to the nonlinearity. Even solving for the passive scalar statistics advected by a random velocity field with finite correlation time is a challenging problem. The RDT model, however, provides us an opportunity to study directly the connection between the exact advection-diffusion PDE and the scaling regimes of the passive scalar spectrum, as first observed by Kraichnan [179,183]. If the random external driving of the passive scalar field is Gaussian and delta-correlated in time, then exact, closed evolution equations can still be written for the mean passive scalar density and correlation function of the passive scalar field (Section 4.3.1). The statistics of the passive scalar field in quasi-equilibrium are given as steady solutions of these equations. In a statistically isotropic environment, the quasi-equilibrium second-order PS correlation function may be represented by an explicit quadrature formula in terms of the spatial structure of the turbulent velocity field and driving (Section 4.3.2). The passive scalar spectrum is then expressed as a Fourier transform of this explicit formula. Through asymptotic analysis of these exact formulas, three different scaling regimes in the passive scalar spectrum can be rigorously shown to exist in the RDT model under suitable conditions (Section 4.3.3).
These scaling regimes correspond qualitatively to those predicted for a realistic turbulent system, and we shall use the exact results of the RDT model to comment upon some of the approximate real-world theories, particularly those under current controversy (Section 4.3.4).

4.3.1. RDT model with driving

4.3.1.1. Model of large-scale driving force. To establish a statistically stationary state of the passive scalar field, we introduce an external driving, or "pumping", field $f(\mathbf{x},t)$ as a source/sink term in the advection-diffusion equation:

$$\frac{\partial T(\mathbf{x},t)}{\partial t} + \mathbf{v}(\mathbf{x},t)\cdot\nabla T(\mathbf{x},t) = \kappa\,\Delta T(\mathbf{x},t) + f(\mathbf{x},t), \qquad T(\mathbf{x},t=0) = T_0(\mathbf{x}).$$

For the RDT model, we shall assume that the pumping field is a mean zero, Gaussian, homogeneous, stationary random field which is delta-correlated in time:

$$\langle f(\mathbf{x},t)\,f(\mathbf{x}+\mathbf{r},t+\tau)\rangle = \Phi(\mathbf{r})\,\delta(\tau), \tag{281}$$

where $\Phi(\mathbf{r})$ is the (scalar) spatial correlation function of the pumping field. This is admittedly quite artificial from a physical perspective. First of all, it is rather difficult to envision a system in which an external agency is directly introducing heat or concentration fluctuations homogeneously throughout the bulk of the fluid. One would more naturally suppose that the external sources and sinks are confined to the boundary or some localized region. The rapidly decorrelating temporal structure is also inappropriate for a macroscale driving; it is
assumed in order to achieve closed equations for the passive scalar statistics. Nonetheless, we have the freedom to choose the spatial structure $\Phi(\mathbf{r})$ to correspond to pumping on a large length scale $L_f$, in which case the RDT model pumping field at least provides a cartoon for the generation of large-scale passive scalar fluctuations due to external driving. Specifically, we define the pumping correlation function through its spectrum $E_f(k)$, which plays the analogous role of the energy spectrum for the random velocity field:

$$\Phi(\mathbf{r}) = \int_{\mathbb{R}^d} e^{2\pi i\,\mathbf{k}\cdot\mathbf{r}}\,\frac{E_f(|\mathbf{k}|)}{A_{d-1}\,|\mathbf{k}|^{d-1}}\,d\mathbf{k}.$$

We choose for the pumping spectrum $E_f(k)$ a smooth form which is maximized at wavenumber $k=L_f^{-1}$, vanishes for $k\le L_f^{-1}$, and decays rapidly for $k\gg L_f^{-1}$. Note that we are assuming isotropic pumping statistics.

4.3.1.2. RDT model equations with pumping. The assumption of a delta-correlated pumping preserves the closure properties of the RDT model for a freely advected passive scalar field. Eq. (255) for the mean passive scalar density $\langle T(\mathbf{x},t)\rangle$ is unchanged because the pumping has mean zero. Thus, we may and do self-consistently assume the passive scalar field has mean zero for all time. The diffusion equation (256) for the second-order PS correlation function is modified only through the addition of an inhomogeneous term $\Phi(\mathbf{r})$ representing the effects of the pumping.

Evolution of second-order correlation function for driven passive scalar

$$\frac{\partial P_2(\mathbf{r},t)}{\partial t} = \nabla\cdot\bigl((2\kappa\,\mathcal{I}+\mathcal{D}(\mathbf{r}))\,\nabla P_2(\mathbf{r},t)\bigr) + \Phi(\mathbf{r}), \qquad P_2(\mathbf{r},t=0) = P_2^0(\mathbf{r}). \tag{282}$$

With the driving included in this way in the RDT model, we can search for solutions corresponding to a quasi-equilibrium state for the passive scalar field, in which the statistics of the passive scalar field are time-independent. In particular, the quasi-equilibrium second-order PS correlation function $P_2^*(\mathbf{r}) \equiv \langle T(\mathbf{x},t)\,T(\mathbf{x}+\mathbf{r},t)\rangle_*$ is a steady solution of Eq. (282). The asterisks decorating ensemble averages and statistical functions indicate that the statistics are those corresponding to quasi-equilibrium. Because of the dissipation provided by molecular diffusivity, all solutions with sufficient spatial decay will approach a unique statistically stationary state at long times. We note here an important relation implied by Eq. (282) for the quasi-equilibrium passive scalar dissipation rate $\bar{\chi} = 2\kappa\,\langle|\nabla T(\mathbf{x},t)|^2\rangle_*$. This is just the rate at which the passive scalar variance $\langle(T(\mathbf{x},t))^2\rangle$ would decay in the absence of external driving, as is readily checked (not just for the RDT model) by multiplying the undriven advection-diffusion equation by $T(\mathbf{x},t)$ and averaging. In a statistically stationary state, the passive scalar dissipation rate is exactly balanced by the rate at which passive scalar fluctuations are introduced into the system by the external driving. For the RDT model, this can be quantified:

$$\bar{\chi} = \Phi(\mathbf{0}),$$

as follows by realizing that $\bar{\chi} = -2\kappa\,\Delta P_2^*(\mathbf{r})|_{\mathbf{r}=\mathbf{0}}$ and evaluating the steady form of Eq. (282) at $\mathbf{r}=\mathbf{0}$.
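The balance between dissipation and pumping can be spelled out in two steps. The sketch below assumes that the quasi-equilibrium correlation function $P_2^*$ is twice differentiable at the origin and that the turbulent diffusion tensor $\mathcal{D}(\mathbf{r}) = \mathcal{R}(\mathbf{0})-\mathcal{R}(\mathbf{r})$ contributes nothing at $\mathbf{r}=\mathbf{0}$; it is an illustration of the argument rather than a rigorous proof. First, by statistical homogeneity,

$$\langle|\nabla T(\mathbf{x},t)|^2\rangle_* = -\Delta P_2^*(\mathbf{r})\big|_{\mathbf{r}=\mathbf{0}}, \qquad\text{so}\qquad \bar{\chi} = -2\kappa\,\Delta P_2^*(\mathbf{0}).$$

Second, the steady form of Eq. (282) reads

$$0 = \nabla\cdot\bigl((2\kappa\,\mathcal{I}+\mathcal{D}(\mathbf{r}))\,\nabla P_2^*(\mathbf{r})\bigr) + \Phi(\mathbf{r}).$$

At $\mathbf{r}=\mathbf{0}$ we have $\mathcal{D}(\mathbf{0})=0$, and $\nabla P_2^*(\mathbf{0})=\mathbf{0}$ by isotropy, so only the molecular term survives:

$$0 = 2\kappa\,\Delta P_2^*(\mathbf{0}) + \Phi(\mathbf{0}) \quad\Longrightarrow\quad \bar{\chi} = \Phi(\mathbf{0}).$$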
4.3.1.3. Alternative driving through linear background passive scalar profile. Before proceeding with our analysis of the model we have just set up, we pause to mention that another way to introduce driving into the advection-diffusion equation is to impose a linear background profile on the passive scalar field through the initial data [156,277]: $T_0(\mathbf{x}) = \mathbf{u}\cdot\mathbf{x}$. This linear profile will persist in time and the turbulence will interact with it to perpetually drive fluctuations of the passive scalar field about this background profile, eventually leading to a nontrivial statistically stationary state of the passive scalar field. A background linear profile with a mean gradient for the passive scalar field is rather natural for stratified fluids in a geophysical setting, and can be readily arranged in the laboratory [127,145]. It is not difficult to show [185] that all the results that we will derive here for the passive scalar spectrum of the statistically stationary state of the passive scalar field driven by external pumping of the form (281) carry over to the case where passive scalar fluctuations are driven instead by turbulent interaction with a background linear passive scalar profile. One need only equate $L_f = L_0$, and the passive scalar dissipation rate $\bar{\chi}$ in the formulas with $gR$, where the constant $R$ is a measure of the strength of the turbulent fluctuations: $R = (1/d)\,\mathrm{Tr}\,\mathcal{R}(\mathbf{0})$.

4.3.2. Exact quadrature solution for quasi-equilibrium passive scalar correlation function

Now, the model we have adopted has isotropic pumping and velocity statistics, so by symmetry and uniqueness of solutions, the quasi-equilibrium passive scalar field must also be statistically isotropic. Hence, the quasi-equilibrium second-order PS correlation function is radial, $P_2^*(\mathbf{r}) = P_2^*(|\mathbf{r}|)$, and Eq. (282) simplifies to a one-dimensional ODE:

$$r^{1-d}\,\frac{d}{dr}\left(\bigl(2\kappa + D_\parallel(r)\bigr)\,r^{d-1}\,\frac{dP_2^*(r)}{dr}\right) = -\Phi(r). \tag{283}$$

The boundary conditions that go with this second-order differential equation on the positive real axis $r\in[0,\infty)$ are:
• $P_2^*(r)$ is continuous and finite near $r=0$, reflecting finite variance of the passive scalar fluctuations,
• $P_2^*(r)$ decays to zero as $r\to\infty$ since the passive scalar field is uncorrelated at large enough distances.
Eq. (283) can be solved exactly by quadrature, with the integration constants determined by the boundary conditions:

Exact solution for quasi-equilibrium passive scalar correlation function

$$P_2^*(r) = \int_r^\infty \frac{r'^{\,1-d}}{2\kappa + D_\parallel(r')}\int_0^{r'} r''^{\,d-1}\,\Phi(r'')\,dr''\,dr'. \tag{284}$$

This formula was derived by Kraichnan [183]. Its importance is that it gives an exact formula for the passive scalar correlation function in terms of the velocity and pumping correlation functions, and that it was deduced in a precise fashion from the advection-diffusion equation. One could proceed to study the properties of this PS correlation function [183,185], but we will concentrate our attention on a related function, the passive scalar spectrum, which reveals the small-scale structure of the passive scalar field in a clearer fashion.
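The quadrature formula (284) is straightforward to evaluate numerically. The following is a minimal sketch under stated assumptions: the function names, the nested Simpson rule, the truncation of the outer integral at a finite `r_max`, and the sample pumping and diffusivity profiles are all illustrative choices, not taken from the original text. Here `phi` stands for the pumping correlation function, `d_par` for the longitudinal turbulent diffusivity, and `kappa` for the molecular diffusivity.

```python
import math


def simpson(f, a, b, step=0.05):
    """Composite Simpson rule on [a, b] with subintervals no wider than step."""
    if b <= a:
        return 0.0
    n = 2 * max(1, math.ceil((b - a) / (2.0 * step)))  # even number of panels
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def steady_correlation(r, phi, d_par, kappa, d, r_max=20.0):
    """Evaluate Eq. (284) at r > 0, truncating the outer integral at r_max."""
    if r <= 0.0:
        raise ValueError("r must be positive in this sketch")

    def inner(s):
        # integral of u^(d-1) * phi(u) over 0 <= u <= s
        return simpson(lambda u: u ** (d - 1) * phi(u), 0.0, s)

    def outer(s):
        return s ** (1 - d) * inner(s) / (2.0 * kappa + d_par(s))

    return simpson(outer, r, r_max)


if __name__ == "__main__":
    phi = lambda u: math.exp(-u * u)          # hypothetical pumping profile
    d_par = lambda s: 0.3 * s ** (2.0 / 3.0)  # hypothetical power law, H = 1/3
    print(steady_correlation(1.0, phi, d_par, kappa=0.05, d=3))
```

Truncating at `r_max` discards the tail of the outer integral, so the returned value equals the exact one minus its value at `r_max`. For $d=3$, vanishing turbulent diffusivity and $\Phi(r)=e^{-r^2}$, the truncated integral has the closed form $(\sqrt{\pi}/(8\kappa))\,(\mathrm{erf}(r)/r-\mathrm{erf}(r_{\max})/r_{\max})$, which provides a convenient check of the quadrature.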
4.3.3. Passive scalar spectral scaling regimes in RDT model

The (radial) passive scalar (PS) spectrum $E_T(k)$ is defined to be the spectral resolution of the variance of the passive scalar field $\langle(T(\mathbf{x},t))^2\rangle_*$ with respect to wavenumber magnitude $k$ ([320], pp. 280-281). That is, $E_T(k)$ measures the strength of the fluctuations of wavenumber $k$, and $\int_0^\infty E_T(k)\,dk = \langle(T(\mathbf{x},t))^2\rangle_*$. Its relation to the passive scalar field is essentially the same as that of the energy spectrum to the velocity field. Not only is the PS spectrum an object with appealing theoretical meaning, but it is closely related to what experimentalists actually measure when observing a signal from a turbulent system. (In most experiments, measurements are taken only along a single line, and thus a "one-dimensional passive scalar spectrum" is recorded. This is closely related to the radial passive scalar spectrum discussed here when the turbulence is statistically isotropic; see [320], Ch. 8 for further discussion.) The radial PS spectrum $E_T(k)$ can be computed from the Fourier transform of the passive scalar correlation function

$$\hat{P}_2^*(\mathbf{k}) = \int_{\mathbb{R}^d} e^{-2\pi i\,\mathbf{k}\cdot\mathbf{x}}\,P_2^*(\mathbf{x})\,d\mathbf{x}. \tag{285}$$

For the statistically isotropic case of interest in this section, $\hat{P}_2^*(\mathbf{k}) = \hat{P}_2^*(|\mathbf{k}|)$ and the PS spectrum can be simply expressed:

$$E_T(k) = A_{d-1}\,k^{d-1}\,\hat{P}_2^*(k). \tag{286}$$

Through Eqs. (284)-(286), we have an exact integral formula for the PS spectrum $E_T(k)$, which we can analyze for our specific choice of velocity and pumping statistics. We will in particular look for universal scaling regimes analogous to the Kolmogorov $k^{-5/3}$ inertial-range law for the energy spectrum of the velocity field. Recall that this self-similar region fell within an asymptotic regime intermediate to the fundamental wavenumbers $L_0^{-1}$, at which energy is fed into the fluid, and $L_K^{-1}$, above which energy is strongly dissipated by viscosity. By general intermediate asymptotic principles [25], we can also anticipate self-similar scaling regimes in the PS spectrum at wavenumbers well separated from the fundamental wavenumbers characterizing the passive scalar field. To proceed, we shall first identify in Paragraph 4.3.3.1 the fundamental length scales (and associated wavenumbers) characterizing the quasi-equilibrium passive scalar field. Next, in Paragraphs 4.3.3.2 and 4.3.3.3, we report three possible PS spectrum scaling regimes which rigorously arise in the RDT model when certain fundamental length scales are sufficiently widely separated. We will compare the exact self-similar scaling forms of the RDT model PS spectrum with the predictions of approximate theories for real-world turbulent systems in Section 4.3.4.

4.3.3.1. Fundamental length scales and wavenumbers. There are four natural length scales which partially characterize the passive scalar field, both in the RDT model and in the real world. Two length scales are inherited from the advecting turbulent velocity field: the integral length scale $L_0$ at which the fluid is driven, and the Kolmogorov dissipation length scale $L_K$ below which velocity fluctuations are strongly damped by viscosity.
Next, we have the correlation length $L_f$ of the pumping field, which sets the (large) length scale at which passive scalar fluctuations are externally introduced. Finally, it is natural to identify a passive scalar dissipation length scale $L_d$, analogous to the Kolmogorov dissipation scale of the velocity field, at which convection and diffusion effects
are in balance. Below such a scale, diffusive effects rapidly damp passive scalar fluctuations, and above $L_d$ the passive scalar dynamics are convection-dominated and molecular diffusion effects are formally negligible. To see how this scale should be determined in the RDT model, recall from Eq. (259) that the relative diffusion rate of two particles separated by distance $|\mathbf{r}|$ is $2\kappa + \mathrm{Tr}\,\mathcal{D}(\mathbf{r})$. The contribution from the turbulent diffusion, $\mathrm{Tr}\,\mathcal{D}(\mathbf{r})$, is proportional to the mean-square fluid velocity difference between the particles, and generally grows as a function of $|\mathbf{r}|$. (By isotropy, $\mathrm{Tr}\,\mathcal{D}(\mathbf{r})$ is independent of the orientation of $\mathbf{r}$.) It thus becomes appropriate to define $L_d$ as the length scale at which relative turbulent diffusion and molecular diffusion are in balance: $2\kappa = \mathrm{Tr}\,\mathcal{D}(L_d \hat{\mathbf{e}})$, where $\hat{\mathbf{e}}$ is a unit vector.

Let us now discuss how the various length scales are typically ordered. In general turbulent systems, $L_0$ and $L_f$ are large length scales characteristic of the macroscopic system size, whereas the Kolmogorov velocity dissipation length $L_K$ and the passive scalar dissipation length $L_d$ are considerably smaller:

$$L_K,\ L_d < L_0,\ L_f.$$

The disparity between the length scales is often several orders of magnitude. Now, we will be considering the passive scalar structure on scales small compared with the scale of the driving, so the ordering of $L_0$ and $L_f$ will not concern us. The relation between the velocity and passive scalar dissipation lengths is more interesting. The relative magnitude of these dissipation length scales is set by the Schmidt number Sc of the fluid, which is the ratio of the kinematic viscosity $\nu$ of the fluid to the molecular diffusivity $\kappa$ of the passive scalar:

$$\mathrm{Sc} = \nu/\kappa. \tag{287}$$
This ratio is called the Prandtl number Pr when the passive scalar field corresponds to weak temperature fluctuations, but we will generally use the term "Schmidt number". Note that the Schmidt number measures the effectiveness of microscopic fluid momentum transport relative to microscopic passive scalar transport.

A common situation for transport of light particles or heat in ordinary fluids like air ($\mathrm{Pr} \approx 0.7$) is for the Schmidt number to be of order unity. In this case, $L_d \sim L_K$ because the efficiencies of microscopic momentum and passive scalar transport are comparable, and the length scales at which the microscopic effects become relevant are about the same. Another fairly prevalent situation is that of high Schmidt number, for which the passive tracer is diffused much less effectively than the momentum of the fluid. The transport of heavy dyes or complex fluorocarbons with large molecular weights, as utilized in contemporary laser-induced fluorescence measurements ($\mathrm{Sc} \gg 1$), provides important practical examples. In these situations, $L_d \ll L_K$ because one must go to much smaller scales than $L_K$ to feel the relatively feeble influence of molecular diffusion. The third possibility of low Schmidt number exists in some exotic cases like electron plasmas ($\mathrm{Pr} \ll 1$) and thermal fluctuations in liquid mercury ($\mathrm{Pr} \approx 0.02$). The regime $L_d \gg L_K$ occurs at low Schmidt number, because the molecular diffusion becomes relevant at a larger length scale than viscosity.

To each of the four physical length scales just discussed, we naturally associate a corresponding wavenumber for the purpose of discussing the PS spectrum $E_T(k)$: $k_0 = L_0^{-1}$, $k_f = L_f^{-1}$, $k_K = L_K^{-1}$, and $k_d = L_d^{-1}$.
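As an illustration of how the Schmidt number fixes the ordering of the two dissipation lengths, the three cases just described can be encoded in a short Python sketch. The function name `schmidt_regime` and the cutoffs `low` and `high` that separate "order unity" from the asymptotic cases are assumptions made only for this illustration; the text itself specifies no numerical thresholds.

```python
def schmidt_regime(nu, kappa, low=0.1, high=10.0):
    """Classify the ordering of L_d and L_K from Sc = nu / kappa.

    nu        : kinematic viscosity of the fluid
    kappa     : molecular diffusivity of the passive scalar
    low, high : hypothetical cutoffs bounding "order unity" (assumptions)
    """
    if nu <= 0 or kappa <= 0:
        raise ValueError("nu and kappa must be positive")
    sc = nu / kappa
    if sc > high:
        ordering = "L_d << L_K"  # high Schmidt number
    elif sc < low:
        ordering = "L_d >> L_K"  # low Schmidt number
    else:
        ordering = "L_d ~ L_K"   # Schmidt number of order unity
    return sc, ordering
```

For instance, a ratio of 0.7 (the Prandtl number quoted for air) falls in the order-unity branch, while 0.02 (liquid mercury) falls in the low Schmidt number branch under these assumed cutoffs.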
4.3.3.2. Passive scalar spectral scaling regimes. Subject to the natural ordering of the length scales discussed above, three natural intermediate asymptotic regimes where one may expect self-similar scaling of the passive scalar spectrum are suggested. These regimes, which we now enumerate, have meaning in both the real world and the RDT model.

• inertial-convective regime:

$$k_0,\ k_f \ll k \ll k_K,\ k_d. \tag{288}$$

This is present in the case of a wide separation between the system macroscale and dissipation microscales (high Reynolds number and high Péclet number). Passive scalar fluctuations on this range are driven by inertial-range turbulent eddies, and molecular diffusion plays a subdominant role.

• viscous-convective (high Schmidt number) regime:

$$k_0,\ k_f,\ k_K \ll k \ll k_d. \tag{289}$$

This is present when $\mathrm{Sc} \gg 1$, and there is no strict need for a high Reynolds number or an extended inertial range. Fluctuations of the passive scalar on these scales are too fine to be driven directly by the active scales of turbulence; they are rather produced by straining by velocity field gradients. Molecular diffusion is ostensibly unimportant in the dynamics of the passive scalar in this range of scales since $k \ll k_d$.

• inertial-diffusive (low Schmidt number) regime:

$$k_0,\ k_f,\ k_d \ll k \ll k_K. \tag{290}$$

This is present when $\mathrm{Sc} \ll 1$ and $\mathrm{Re} \gg 1$. Molecular diffusion now transports the passive scalar more effectively than the turbulence, but the passive scalar field is still suffering deformations from the inertial-range eddies of the turbulent velocity field.

The convention in the above discussion is that a regime is called inertial or viscous according to whether the wavenumber $k$ is on the inertial ($k_0 \ll k \ll k_K$) or viscous $k$
where $\mathbf{X}^{(1)}(t)$ and $\mathbf{X}^{(2)}(t)$ denote the locations of two tracers separated by a distance $|\mathbf{X}^{(1)}(t) - \mathbf{X}^{(2)}(t)| \ll L_K$, and the effects of molecular diffusion are omitted in this computation. In the RDT model, $\gamma$ is related to the energy spectrum through
c\"16dp
EI (k)k dk .
The dimensionless numerical constants appearing in the formulas have the values:

$$C_{IC} = \frac{2H\, B(d/2,\, 1+H)\, B(d/2,\, 1-H)}{\pi d}, \qquad C_{VC} = \frac{d+2}{d}, \qquad C_{ID} = \frac{1}{8 d \pi},$$

where $B(\cdot,\cdot)$ denotes the special beta function [195]. The scaling forms for the inertial-convective and viscous-convective regimes were formally computed by Kraichnan [179,183]. Rigorous derivations for all scaling regimes may be found in [185].

Table 13
Universal scaling regimes for passive scalar spectrum in RDT model

Asymptotic regime      Wavenumber range                       $E_T(k)$
Inertial-convective    $k_0, k_f \ll k \ll k_K, k_d$          $C_{IC}\, \bar{\chi}\, A_E^{-1}\, k^{2H-3}$
Viscous-convective     $k_0, k_f, k_K \ll k \ll k_d$          $C_{VC}\, \bar{\chi}\, \gamma^{-1}\, k^{-1}$
Inertial-diffusive     $k_0, k_f, k_d \ll k \ll k_K$          $C_{ID}\, A_E\, \bar{\chi}\, \kappa^{-2}\, k^{-3-2H}$

4.3.3.3. Complete self-similarity of passive scalar spectral scaling regimes. One immediate observation from the exact scaling laws presented in Table 13 is that they are all completely self-similar in the terminology of Barenblatt [25]. That is, in each asymptotic regime of wavenumbers presented above, the PS spectrum depends only on physical parameters which are obviously relevant for that range of scales. There is no dependence on physical parameters associated to remote length scales, and no "anomalous scaling" of the PS spectrum. (See Section 4.4 for further discussion on this topic.)

In all the regimes, the PS spectrum depends on the passive scalar dissipation rate $\bar{\chi}$. This quantity measures the flux of passive scalar "energy" which is injected at low wavenumbers and travels up to high wavenumbers where it is dissipated; $\bar{\chi}$ simply sets the amplitude of the PS spectrum. In addition to $\bar{\chi}$, the PS spectrum in the inertial-convective regime depends only on the local wavenumber $k$ and the parameter $A_E$ which measures the strength of the inertial-range eddies. This is exactly the set of parameters which one would expect to appear in the inertial-convective asymptotics of the PS spectrum, where the inertial-range eddies play a dominant role and molecular diffusivity is negligible. Similarly, the viscous-convective form of the PS spectrum depends only on $\bar{\chi}$, $k$, and $\gamma$, the strain rate characterizing the advective effects of the velocity field in the viscous range of scales. Finally, the inertial-diffusive regime depends on four parameters $\bar{\chi}$, $k$, $A_E$, and $\kappa$, because the inertial-range eddies are the predominant cause of distortions on this range of scales, and molecular diffusivity is playing a strong role. Thus, the passive scalar spectrum in each of the ranges reported depends only on the parameters which one would naively expect.

It turns out, therefore, that the kind of reasoning which Kolmogorov used to formulate his $k^{-5/3}$ inertial-range scaling prediction works correctly in the RDT model when applied to the inertial-convective and viscous-convective range of scales. One simply hypothesizes which parameters should be naturally relevant in each of these ranges of wavenumbers, applies dimensional analysis, and finds that only a unique scaling combination of these parameters is dimensionally self-consistent. One would thereby arrive at the scaling laws presented in Table 13, except of course that the numerical constants $C_{IC}$ and $C_{VC}$ would not be determined by this approach. Obukhov [254] and Corrsin [76] independently formulated such a similarity theory for the inertial-convective regime in a real-world turbulent system. The inertial-diffusive regime, on the other hand, involves too many parameters which have a priori relevance, so that dimensional analysis alone will not produce a unique scaling prediction.

4.3.4. Connections to theory and experiments concerning real-world scaling regimes

We wish now to use the exact results for the scaling regimes of the passive scalar spectrum in the RDT model as a point of reference for discussing some physical theories formulated for analogous scaling regimes in the real world.
Recall that we could impose a fairly realistic spatial structure on the turbulent RDT model; the primary deficiency in the model is the lack of memory in the velocity field. Thus, we can anticipate qualitative similarities in the PS spectral scaling regimes in the real world and the RDT model, but there will of course be quantitative discrepancies due to the different temporal structures. We will probe the ideas behind the physical theories for the real world to see if they can be successfully adapted to the RDT model.

We now briefly discuss each of the passive scalar scaling regimes in turn. We present the main theoretical predictions for the real-world PS spectrum, and summarize their experimental status. Then we indicate whether these theories can be adapted to the RDT model, and if so, whether they predict the correct scaling law. Details can be found in [185].

4.3.4.1. Inertial-convective regime. The prediction for the inertial-convective regime of the passive scalar spectrum in the real world is based on Kolmogorov-type dimensional analysis, and was formulated independently by Obukhov [254] and Corrsin [76]. It reads

$$E_T(k) \approx C_{OC}\, \bar{\chi}\, \bar{\varepsilon}^{-1/3}\, k^{-5/3} \quad \text{for } k_0,\ k_f \ll k \ll k_K,\ k_d,$$

where $\bar{\varepsilon}$ is the energy dissipation rate and the other parameters have the same physical meaning as in the RDT model. $C_{OC}$ is supposed to be a universal numerical constant, called the Obukhov–Corrsin constant. A number of experiments over the last three decades have reported a decade or two of $k^{-5/3}$ scaling behavior in flows with sufficiently large Reynolds numbers, with fairly consistent values of the reported Obukhov–Corrsin constant near 0.4. A recent review of the data for the
inertial-convective regime from experiments may be found in [308]. That review points out that the Reynolds number (based on the Taylor microscale) required to see a $k^{-5/3}$ scaling region is approximately 50 in isotropic flows, but about 1000 for anisotropic flows. Experiments conducted in anisotropic settings which report departures from $k^{-5/3}$ scaling in the inertial-convective range [81,145,242] may simply have too low a Reynolds number to manifest the universal Obukhov–Corrsin behavior.

There are still some unexplained mysteries, however. The $k^{-5/3}$ law seems to be more robust than it should be. In particular, it sometimes arises in flows for which the inertial-convective scales are not locally isotropic [113,236,306]. Moreover, the $k^{-5/3}$ scaling in the PS spectrum can extend over a range larger than that for which the velocity field exhibits inertial-range $k^{-5/3}$ scaling. Indeed, a $k^{-5/3}$ scaling in the PS spectrum is reported in some cases where the Reynolds number is insufficient for the velocity field to have any $k^{-5/3}$ inertial range at all [145,309]. Although the prediction of the Obukhov–Corrsin similarity theory appears to be well borne out by experiments, the reason behind the $k^{-5/3}$ scaling is not satisfactorily understood.

The Obukhov–Corrsin inertial-convective scaling prediction has a good deal of formal similarity to that of the RDT model:

$$E_T(k) \approx C_{IC}\, \bar{\chi}\, A_E^{-1}\, k^{2H-3} \quad \text{for } k_0,\ k_f \ll k \ll k_K,\ k_d.$$

The strength of the inertial-range eddies is measured by the energy dissipation rate $\bar{\varepsilon}$ in the real world and by $A_E$ in the RDT model (see Section 4.1.4). The dimensional analysis reasoning behind the Obukhov–Corrsin prediction works perfectly in the RDT model. Consequently, a number of other simple heuristic considerations (based on spectral flux considerations, for example) will also automatically predict the correct inertial-convective scaling form in the RDT model, provided they only involve the inertial-range form of the turbulent energy spectrum and the passive scalar dissipation rate.
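The Obukhov–Corrsin dimensional argument can be checked mechanically. The sketch below is a minimal Python illustration rather than part of the original analysis: it assumes that $E_T(k)$ is normalized so that its integral over $k$ gives the scalar variance, writes $E_T \propto \bar{\chi}^{a}\, \bar{\varepsilon}^{b}\, k^{c}$, and solves the three dimensional constraints in turn. The function name `obukhov_corrsin_exponents` is hypothetical.

```python
from fractions import Fraction


def obukhov_corrsin_exponents():
    """Solve E_T ~ chi^a * eps^b * k^c by matching physical dimensions.

    Dimensions, as powers of (scalar unit Theta, length L, time T):
      E_T : Theta^2 L      (scalar variance per unit wavenumber; assumed)
      chi : Theta^2 T^-1   (passive scalar dissipation rate)
      eps : L^2 T^-3       (energy dissipation rate)
      k   : L^-1           (wavenumber)
    """
    # Theta exponents: 2 = 2a
    a = Fraction(2, 2)
    # Time exponents: 0 = -a - 3b
    b = -a / 3
    # Length exponents: 1 = 2b - c
    c = 2 * b - 1
    return a, b, c
```

Calling `obukhov_corrsin_exponents()` returns the exponents $a = 1$, $b = -1/3$, $c = -5/3$. The same bookkeeping fails to close for the inertial-diffusive regime, where the additional parameter $\kappa$ leaves one exponent undetermined.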
4.3.4.2. Viscous-convective (high Schmidt number) regime. The PS spectrum was predicted by Batchelor [28] to have the following viscous-convective scaling form in a real turbulent system:

$$E_T(k) \approx \frac{\bar{\chi}}{\gamma k} \quad \text{for } k_K \ll k \ll k_d. \tag{291}$$

He argued, based on empirical considerations, that it would be sufficient to consider the passive scalar dynamics on this range of scales in a steady uniform strain flow, and deduced Eq. (291), where $\gamma$ is the maximum strain rate. Kraichnan [179] investigated the effect of fluctuations in the velocity field through consideration of his Rapid Decorrelation in Time model, arriving at the same prediction as Batchelor, up to a numerical multiplicative constant (see Table 13). Chertkov et al. [65] generalize the prediction of a $k^{-1}$ law to finite correlation times as well by reducing the problem to that of the stretching of lines by a spatially uniform straining field in the presence of weak molecular diffusion. We see thus that there exists strong theoretical support for a $k^{-1}$ viscous-convective scaling of the PS spectrum. Note that the scaling law (291) could also be predicted (up to a numerical constant) through an accounting of the obviously relevant parameters and simple dimensional analysis, in a fashion entirely parallel to the Obukhov–Corrsin theory for the inertial-convective regime.

Until recently, the $k^{-1}$ scaling prediction also enjoyed experimental confirmation for a number of passive physical quantities in various turbulent systems [123,129,251,271]. Some of these findings [123,129,251] were later criticized [113,242], however, and some recent high Schmidt number experiments [171,242,338] fail to find $k^{-1}$ spectral scaling despite their ability to resolve the viscous-convective scales. The viscous-convective $k^{-1}$ scaling law is thus currently a matter of controversy.

4.3.4.3. Inertial-diffusive (low Schmidt number) regime. As we mentioned before, in the inertial-diffusive regime there are too many "obviously relevant" parameters to predict a unique scaling law by an assumption of complete self-similarity, as was possible in the other regimes previously discussed. There is in fact a controversy over the scaling exponent in the inertial-diffusive range. Batchelor et al. (BHT) [29] argue, through an approximate consideration of the passive scalar dynamics in Fourier space, that the passive scalar spectrum (for a turbulent velocity field with realistic temporal correlations) has the following inertial-diffusive scaling form:

$$E_T^{BHT}(k) = \frac{C_K}{16 d \pi}\, \bar{\chi}\, \bar{\varepsilon}^{2/3}\, \kappa^{-3}\, k^{-17/3} \quad \text{for } k_d \ll k \ll k_K, \tag{292}$$

where $C_K$ is the Kolmogorov constant ($E(k) \approx C_K\, \bar{\varepsilon}^{2/3}\, k^{-5/3}$ for $k_0 \ll k \ll k_K$). A competing theory by Gibson [121], which argues that the small-scale strain rate is important for the inertial-diffusive range dynamics, arrives instead at a prediction of a different scaling in a subrange of the inertial-diffusive regime:

$$E_T^{G}(k) \sim C_G\, \bar{\chi}\, \kappa^{-1}\, k^{-3} \quad \text{for } k_d \ll k \ll k_B,$$

where $C_G$ is a numerical constant undetermined by the theory, and $k_B$ is the "Batchelor wavenumber", defined as the inverse of the length scale $L_B$ at which the molecular diffusion time balances the straining of the velocity field.

Measurements of temperature fluctuations in mercury ($\mathrm{Pr} \sim 0.02$) [73,290] and numerical simulations [56] are consistent with a $k^{-3}$ spectrum for $k_d \ll k \ll k_B$. Furthermore, [56,73] also show evidence for a $k^{-17/3}$ range over $k_B \ll k \ll k_K$. The spectra of both theories decay rapidly and are difficult to measure confidently, however, especially since sufficiently low Schmidt or Prandtl numbers are difficult to achieve experimentally [136,137,196].
Indeed, the reported scaling regimes extend over less than a decade. Moreover, it has been pointed out in [22] that apparent but false $k^{-3}$ scaling regimes can result from noisy physical-space observations when the true spectral scaling is steeper (such as $k^{-17/3}$). Finally, the experimental data are too scant to check the predicted dependence of the coefficient of the inertial-diffusive scaling laws on various physical parameters.

In [185], the arguments of the BHT and Gibson theories are examined by the second author in the context of the RDT model. Both the BHT and Gibson theories have versions which appear to be sensibly applicable to the RDT model insofar as the theoretical arguments are concerned. The BHT arguments lead to the correct inertial-diffusive asymptotics for the RDT model:

$$E_T(k) = C_{ID}\, A_E\, \bar{\chi}\, \kappa^{-2}\, k^{-3-2H} \quad \text{for } k_0,\ k_f,\ k_d \ll k \ll k_K, \qquad C_{ID} = \frac{1}{8 d \pi}.$$

Indeed, the BHT real-world prediction has a strong formal similarity with the exact RDT model result, recalling that $A_E$ in the RDT model plays a similar role to $C_K\, \bar{\varepsilon}^{2/3}$ in the real world. On the other hand, the adapted version of Gibson's theory fails in the RDT model. The dynamics of the
passive scalar in the inertial-diffusive range of scales does not seem to be significantly influenced by straining in the RDT model. This is by no means an invalidation of Gibson's ideas or prediction in a real-world setting. Rather, it gives us an opportunity to illuminate some ingredients which are essential in the Gibson theory. The straining mechanism envisioned by Gibson has at least a crude analogue in the RDT model, but perhaps the fact that the RDT velocity field is Gaussian and delta-correlated makes the strain ineffective in influencing the inertial-diffusive range passive scalar dynamics. Indeed, numerical simulations [122] indicate that sustained compression events and anomalously long-range straining correlations appear to be necessary for the straining to have a strong effect on these scales. The outcome of the RDT model calculation might then be viewed as an analytical complement to these numerical findings.

Even allowing for this source of discrepancy, there is an assumption made in Gibson's theory which is surprising in light of the exact results of the RDT model. From the premise that straining plays an important role in the passive scalar dynamics in the inertial-diffusive range, Gibson formulates a pair of similarity hypotheses which leads to a prediction for the PS spectrum in a subrange of the inertial-diffusive spectrum which involves fewer parameters than the exact scaling form for the RDT model in this range. Given that the RDT model is a simplification of real-world turbulence, this is a puzzling outcome, especially since Gibson's prediction comes not from explicit computation, but from soft self-similarity arguments. Indeed, there seems to be a gap in the logical arguments leading up to the $k^{-3}$ prediction. A full discussion may be found in [185].

4.4. Higher-order small-scale statistics of passive scalar field

It is a remarkable feature of the RDT model that closed linear PDEs can be written not only for the mean passive scalar density and second-order correlation function, but for all higher-order correlation functions

$$P_N(\{\mathbf{x}^{(j)}\}_{j=1}^{N}, t) \equiv \Big\langle \prod_{j=1}^{N} T(\mathbf{x}^{(j)}, t) \Big\rangle$$

as well. That this could be done in principle was pointed out in [244,246]. The first explicit use of these equations for the study of higher-order statistics of passive scalar fluctuations within the inertial range of scales of a turbulent flow was accomplished by the first author [206] for the case of a freely decaying passive scalar field in a turbulent shear flow. The one-point statistics of the decaying passive scalar (without pumping) were shown to be broader than Gaussian in this setting, and the multipoint statistics within the inertial range were demonstrated to be non-Gaussian.

4.4.1. Anomalous scaling of turbulence structure functions

Kraichnan [183] subsequently proposed that the passive scalar structure functions in quasi-equilibrium

$$S_{2N}^{*}(r) \equiv \big\langle (T(\mathbf{x}+\mathbf{r}, t) - T(\mathbf{x}, t))^{2N} \big\rangle \tag{293}$$

should exhibit anomalous scaling within the inertial-convective range of scales in the RDT model. By anomalous scaling in this context is meant that

$$S_{2N}^{*}(r) \propto r^{\zeta_{2N}} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f,$$
where the exponents $\zeta_{2N}$ are not equal to the values which would be predicted by a complete self-similarity hypothesis [24,25]. A manifestation of anomalous scaling which is observed in experimental inertial-range measurements of both temperature and velocity in high Reynolds number flows with fully developed turbulence [5,309] is that $\zeta_{2N} \neq N\zeta_2$.

As we now explain, the violation of the scaling relation $\zeta_{2N} = N\zeta_2$ implies that the small-scale passive scalar fluctuations both manifest strong non-Gaussianity ("intermittency") and depart strongly from Kolmogorov's and Obukhov's original completely self-similar theory. Using real-world parameters for the moment, the only dimensionally consistent inertial-range asymptotic form of $S_{2N}^{*}(r)$ which depends purely on the scalar dissipation rate $\bar{\chi}$, the separation distance $r$, and the energy dissipation rate $\bar{\varepsilon}$ is

$$S_{2N}^{*}(r) \sim C_{2N}\, \bar{\chi}^{N}\, \bar{\varepsilon}^{-N/3}\, r^{2N/3} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f, \tag{294}$$

with dimensionless universal constants $C_{2N}$. The asymptotics (294) correspond to the normal scaling relation $\zeta_{2N} = N\zeta_2$. For normal scaling to be violated, some additional parameter must be involved in the inertial-convective range statistics. This extra parameter is expected to be a physical length scale $L$, which permits the following dimensionally consistent inertial-range anomalous scaling law for the passive scalar structure functions:

$$S_{2N}^{*}(r) \sim C_{2N}\, \bar{\chi}^{N}\, \bar{\varepsilon}^{-N/3}\, r^{2N/3} \left(\frac{r}{L}\right)^{\alpha_{2N}} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f, \tag{295}$$

where $\alpha_{2N} = \zeta_{2N} - 2N/3$ and the $\{C_{2N}\}_{N=1}^{\infty}$ are a sequence of dimensionless, universal constants. Natural candidates for the length scale $L$ entering the asymptotics are the integral length scale $L_0$, the pumping length scale $L_f$, the passive scalar dissipation length scale $L_d$, and the Kolmogorov dissipation length scale $L_K$; more exotic possibilities are mentioned in [64]. The current conventional wisdom, based on experimental observations and numerical simulations at various Reynolds numbers, is that $L = L_f$ (and $L = L_0$ for velocity structure functions, but $L_0$ and $L_f$ are typically of the same order). Hölder inequalities ([288], Section 6.2) imply in this case that $\alpha_{2N} \le 0$, or equivalently, $\zeta_{2N} \le N\zeta_2$. We remark that, at least in principle, there could be several length scales appearing as anomalous scaling factors [201].

The inertial-range form (295) is said to be incompletely self-similar in the terminology of Barenblatt [24,25], in that the length scale parameter $L$ enters into the asymptotics, but only through a power law, even though it is not an "obviously relevant" parameter in the asymptotic regime of interest. Indeed, the inertial-convective range of values of $r$ is, by definition, far removed from any physical length scale ($L_0$, $L_K$, $L_d$, etc.).

One implication of Eq. (295) with $L = L_f$ is that the temperature fluctuations become increasingly intermittent (broader-than-Gaussian) on smaller length scales throughout the inertial range, since the flatness factors

$$\frac{\big\langle (T(\mathbf{x}+\mathbf{r}, t) - T(\mathbf{x}, t))^{2N} \big\rangle}{\big\langle (T(\mathbf{x}+\mathbf{r}, t) - T(\mathbf{x}, t))^{2} \big\rangle^{N}} = \frac{S_{2N}^{*}(r)}{(S_{2}^{*}(r))^{N}} \sim \frac{C_{2N}}{(C_2)^{N}} \left(\frac{r}{L}\right)^{\zeta_{2N} - N\zeta_2} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f$$

diverge as $r/L$ is made small (since $\zeta_{2N} < N\zeta_2$), whereas they assume constant values $(2N)!/(2^{N} N!)$ for Gaussian random fields. The physical mechanism generally believed to underlie this small-scale intermittency in both the velocity and passive scalar fields is a spatially nonuniform transfer of energy or passive scalar variance from large to small scales, producing intricately interlaced regions
of strong and weak turbulent dissipation. A variety of phenomenological cascade models encoding variations of this notion have been proposed to predict the manner in which $\zeta_{2N}$ should vary as a function of the order of the moment $N$; see the references in [309]. None of these models is, however, directly connected to the Navier–Stokes equations or advection–diffusion equation. Indeed, it is still unknown how to derive analytically the inertial-convective scaling laws for the second-order structure function of the passive scalar and velocity fields from the primitive equations, and the higher-order statistics provide an even greater challenge. The fact that the RDT model permits closed PDEs to be written for passive scalar correlation functions to all orders, however, gives hope that perhaps anomalous inertial-range scaling could be mathematically derived from the basic equations in this model.

The second-order structure function can be written as an explicit quadrature using Eq. (284) and the simple relation $S_2^{*}(r) = 2(P_2^{*}(0) - P_2^{*}(r))$ obtained by binomial expansion of the definition of Eq. (293). A direct asymptotic analysis [183] produces the rigorous inertial-convective range scaling formula:

$$S_2^{*}(r) \sim \frac{2}{d(2-2H)}\, \bar{\chi}\, (D_L^{I})^{-1}\, r^{2-2H} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f, \tag{296}$$

which is completely self-similar because it only involves the obviously relevant parameters $\bar{\chi}$, $D_L^{I}$, and $r$ (see Paragraph 4.3.3.3). The challenge raised in [183,206] is to compute the higher-order structure functions from the basic RDT model equations and to unambiguously demonstrate an anomalous scaling formula such as

$$S_{2N}^{*}(r) \sim C_{2N}\, \bar{\chi}^{N} (D_L^{I})^{-N} r^{N(2-2H)} \left(\frac{r}{L}\right)^{\zeta_{2N} - N(2-2H)} = C_{2N}\, \bar{\chi}^{N} (D_L^{I})^{-N} L^{N(2-2H) - \zeta_{2N}}\, r^{\zeta_{2N}} \quad \text{for } L_K,\ L_d \ll r \ll L_0,\ L_f, \tag{297}$$

where the length scale $L$ appearing in the anomalous correction is to be identified and $\{C_{2N}\}$ is a universal sequence of constants. Of course, the truth could be even more complicated than Eq. (297) through, say, the anomalous appearance of multiple length scales, a nonuniversal dependence of $C_{2N}$ on details of the pumping, or a dependence of $S_{2N}^{*}(r)$ on $r/L$ which does not reduce to a power law in the inertial-convective range.

The problem of anomalous scaling in the RDT model has since been attacked through a wide variety of means, some of which we will briefly summarize in Sections 4.4.2 and 4.4.3. There is some controversy regarding these computations, since different groups proceeding from different assumptions predict conflicting values of the anomalous scaling exponents. A clear-cut demonstration of anomalous scaling proceeding directly from the basic governing equations without additional assumptions has thus far only been accomplished in further simplified versions of the RDT model. We briefly discuss this work in Section 4.4.5, after reporting the results of some numerical simulations in Section 4.4.4.

4.4.2. Exact equations for scalar structure functions in RDT model

We now describe three basic mathematically exact representations for the passive scalar structure functions $S_{2N}^{*}(r)$ which have been used as the basis of quantitative investigations of anomalous scaling.
4.4.2.1. Representation through solution of the closed PDE. The first proceeds by expressing $S_{2N}^{*}(r)$ via binomial expansion of Eq. (293) as a sum of $N$th-order quasi-equilibrium PS correlation functions

$$P_N^{*}(\{\mathbf{x}^{(j)}\}_{j=1}^{N}) \equiv \Big\langle \prod_{j=1}^{N} T(\mathbf{x}^{(j)}, t) \Big\rangle$$

with spatial arguments $\{\mathbf{x}^{(j)}\}_{j=1}^{N}$ evaluated at either $\mathbf{x}$ or $\mathbf{x}+\mathbf{r}$. These quasi-equilibrium PS correlation functions in turn obey a recursively solvable, linear system of elliptic PDEs:

$$M_N P_N^{*}(\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(N)}) = -\sum_{1 \le m < n \le N} \Phi\big((\mathbf{x}^{(m)} - \mathbf{x}^{(n)})/L_f\big)\, P_{N-2}^{*}(\{\mathbf{x}^{(j)}\}_{j \neq m,n}), \tag{298}$$

where $P_0^{*} \equiv 1$, $P_{-1}^{*} = 0$, and the elliptic operators $M_N$ have the form

$$M_N \equiv \kappa \sum_{j=1}^{N} \Delta_j - \sum_{1 \le j < j' \le N} \nabla_j \cdot \big(\mathcal{D}(\mathbf{x}^{(j)} - \mathbf{x}^{(j')})\, \nabla_{j'}\big). \tag{299}$$

The subscripts on the differential operators indicate the label of the observation point $\mathbf{x}^{(j)}$ on which they act. The velocity structure tensor $\mathcal{D}(\mathbf{r})$ was defined in Eq. (257), and we assume it to have the same wide inertial-range structure as defined in Paragraph 4.2.2.1. The PDE (298) is accompanied by some large-scale boundary condition or decay condition to render the problem well-posed. Of course, we really want to identify $P_N^{*}$ with the long-time asymptotic solution of the evolution equation obtained by adding a $\partial/\partial t$ operator to the left-hand side of Eq. (298). Various mathematical [185,206,208,244,246,349] and formal [95,117,164,198,295] derivations of Eq. (298) and special cases thereof have been offered.

Though the equations are explicit, this description of the PS structure functions involves a number of subtleties and technical difficulties in any practical computation with $N > 2$. First of all, the fundamental PDE (298) is of very high dimension. The symmetries implied by the statistical homogeneity and isotropy of all the random fields help only somewhat [64]. The structure of the differential operator $M_N$ also poses analytical difficulties; the components of the variable coefficient tensor $\mathcal{D}(\mathbf{x}^{(j)} - \mathbf{x}^{(j')})$ vary from zero when the observation points coincide ($\mathbf{x}^{(j)} = \mathbf{x}^{(j')}$) to large quantities of order $D_L^{I} L_0^{2H}$ when the observation points are well separated ($|\mathbf{x}^{(j)} - \mathbf{x}^{(j')}| \gtrsim L_0$). The expression of the structure function in terms of the correlation function has the unfortunate feature of weights of mixed sign; for example,

$$S_4^{*}(r) = 2P_4^{*}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}) - 8P_4^{*}(\mathbf{x}+\mathbf{r}, \mathbf{x}, \mathbf{x}, \mathbf{x}) + 6P_4^{*}(\mathbf{x}+\mathbf{r}, \mathbf{x}+\mathbf{r}, \mathbf{x}, \mathbf{x}). \tag{300}$$

This means that one must be cautious about assuming that dominant contributions to $P_N^{*}$ are dominant contributions to the structure function $S_{2N}^{*}(r)$; there are cancellations possible. Deducing the inertial-convective asymptotics of $S_{2N}^{*}(r)$ also involves some technical delicacy.
Because we are concerned with $r \gg L_d$, one may wish to remove molecular diffusion from consideration, but note that the structure function involves evaluation of the multipoint correlation functions $P_N^{*}$ in regions where points coalesce and the molecular diffusion operators dominate some of the turbulent diffusion operators. It has been rigorously shown [92,93] that Eq. (298) does have a $\kappa \to 0$ limit which behaves regularly in the space of mean-square integrable functions ($L^2$). Since the structure function involves evaluation of $P_N^{*}$ at special points, however, this regularity result does not rule out
singular or subtle behavior of $S_{2N}^{*}(r)$ in the $\kappa \to 0$ limit. To our knowledge, there are not any other rigorous results concerning the properties of solutions to the PDE (298).

4.4.2.2. Single radial variable representation. Since $S_{2N}^{*}(r)$ is a function of a single variable, it is natural to want to work with equations of one variable rather than the delicate many-dimensional PDEs described above. This is in fact the approach adopted in Kraichnan's paper [183], in which he deduces a relatively simple equation for $S_{2N}^{*}(r)$:

$$\frac{1}{r^{d-1}} \frac{d}{dr}\left( r^{d-1} D(r)\, \frac{dS_{2N}^{*}(r)}{dr} \right) = \kappa J_{2N}(r), \tag{301a}$$

where

$$J_{2N}(r) \equiv 2N \big\langle (\delta T(\mathbf{r}))^{2N-1} H(\delta T(\mathbf{r})) \big\rangle. \tag{301b}$$

In this last expression, $\delta T(\mathbf{r}) \equiv T(\mathbf{x}+\mathbf{r}) - T(\mathbf{x})$ (for any choice of $\mathbf{x}$, by spatial homogeneity), and

$$H(\delta T(\mathbf{r})) \equiv 2 \big\langle \Delta_{\mathbf{r}} (\delta T(\mathbf{r})) \,\big|\, \delta T(\mathbf{r}) \big\rangle \tag{301c}$$

is defined as the expected value of $2\Delta_{\mathbf{r}}(\delta T(\mathbf{r}))$, conditioned upon a given value of the scalar increment $\delta T(\mathbf{r})$. (The function $H(\delta T(\mathbf{r}))$ is akin to, but more complicated than, the conditional dissipation rate [268] which will be discussed in Paragraph 5.4.1.1.) The differential equation (301) for the structure function can be deduced from Eq. (298), but a more direct route proceeds through the Fokker–Planck equation for the probability density function (PDF) for the passive scalar increment $\delta T(\mathbf{r})$ [184]. The main drawback to the one-dimensional representation (301) of the structure functions is that it is not fully closed; $J_{2N}(r)$ cannot be expressed in terms of $S_{2N}^{*}(r)$ in any known exact way.

4.4.2.3. Representation through Lagrangian trajectories. Finally, the passive scalar structure functions $S_{2N}^{*}(r)$ may be expressed in terms of the statistical trajectories of $2N$ tracers initially distributed at two points separated by a distance $r$. Indeed, for any velocity field model, the passive scalar correlation function $P_N(\{\mathbf{x}^{(j)}\}_{j=1}^{N}, t)$ of any order $N$ may be precisely related to the joint statistics of the Lagrangian trajectories of $N$ tracers starting from $\{\mathbf{x}^{(j)}\}_{j=1}^{N}$ and moving simultaneously through the random flow [36,62,115,185]. The simplification afforded by the RDT model is that the tracer trajectories obey (coupled) stochastic differential equations with deterministic coefficients [108,185]. The advantage of this Lagrangian approach is that one must analyze a finite system of stochastic ordinary differential equations rather than a PDE. On the other hand, one must explicitly deal with random processes and study the parametric dependence of the tracer trajectory statistics on the initial separation distance $r$. Some authors [23,62] set up their Lagrangian framework in terms of functional integrals instead of stochastic differential equations.

4.4.3. Calculation of anomalous scaling exponents in RDT model

A large number of theoretical approaches, based on additional assumptions or formal approximations, have been proposed which permit a tractable computation yielding quantitative
predictions for the anomalous scaling exponents of the PS structure functions in the RDT model. Unfortunately, the results produced from some of the approximate methods disagree with one another. We shall now briefly summarize some of the main lines of this theoretical research.
4.4.3.1. Closure by linear ansatz for $H(\delta T(r))$. Kraichnan [183] put forth a simple closure proposal for Eq. (301) which is equivalent [184] to assuming that $H(\delta T(r))$ is a linear function of $\delta T(r)$ (multiplied by a uniquely determined function of $r$). This implies that $J_{2N}(r)$ is proportional to $N S_{2N}^{*}(r)$ times a function of $r$ which is independent of $N$. If $S_{2N}^{*}(r)$ has an inertial-convective scaling law of the form (297), it then follows that the scaling exponents must satisfy the following anomalous law:
$$\zeta_{2N} = \frac{1}{2}\sqrt{4Nd\zeta_2 + (d-\zeta_2)^2} - \frac{1}{2}(d-\zeta_2) . \qquad (302)$$
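As a numerical illustration of the law (302), the following minimal sketch evaluates $\zeta_{2N}$ and compares it with the normal-scaling value $N\zeta_2$. The function names and the sample values $d = 3$, $\zeta_2 = 1$ are assumptions chosen only for illustration; they are not taken from the text.

```python
import math


def linear_ansatz_exponent(n: int, d: float, zeta2: float) -> float:
    """Scaling exponent zeta_{2N} predicted by the law (302); n plays the role of N."""
    return 0.5 * math.sqrt(4.0 * n * d * zeta2 + (d - zeta2) ** 2) - 0.5 * (d - zeta2)


def normal_scaling_exponent(n: int, zeta2: float) -> float:
    """Normal (non-anomalous) scaling: zeta_{2N} = N * zeta_2."""
    return n * zeta2


if __name__ == "__main__":
    d, zeta2 = 3.0, 1.0  # illustrative values only (assumed, not from the text)
    for n in (1, 2, 3, 5, 10, 100):
        anomalous = linear_ansatz_exponent(n, d, zeta2)
        normal = normal_scaling_exponent(n, zeta2)
        # For large N the anomalous exponent approaches sqrt(N * d * zeta2).
        large_n_estimate = math.sqrt(n * d * zeta2)
        print(n, round(anomalous, 4), normal, round(large_n_estimate, 4))
```

For $N = 1$ the formula reduces identically to $\zeta_2$, since $4d\zeta_2 + (d-\zeta_2)^2 = (d+\zeta_2)^2$, so the anomaly first appears at fourth order.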
Note that the exponent $\zeta_{2N}$ is proportional to $\sqrt{N}$ for large $N$, rather than a linear function of $N$ as in the case of normal scaling. Fairhall et al. [95] supported Kraichnan's closure hypothesis through the formal derivation of "fusion rules" [202] for how a scaling exponent characterizing the inertial-convective range scaling properties of the multipoint correlation function $P_N^{*}(\{x^{(j)}\}_{j=1}^{N})$ is related to its local behavior when two or more of the spatial arguments are made to coalesce. These authors also argued that the pumping length scale was the only length scale $L$ which could self-consistently enter the anomalous scaling formula (297). Other analytical support for the anomalous scaling law (302), proceeding from deeper assumptions about nice behavior of certain statistical functions, was given in [184].
4.4.3.2. Perturbative analyses. A series of later works by various groups attacked the computation of the scaling exponents through various perturbation expansions in the PDE (298). Leading order anomalous deviations of the exponents $\zeta_{2N}$ from normal scaling were calculated in the asymptotic limits of small Hurst exponent [35,116,275] $H \to 0$, large dimensionality [63,64] $d \to \infty$, and large Hurst exponent [278] $H \to 1$. We note that the large $d$ expansion is motivated by the representation of the differential operator $M_N$ for large $d$ in terms of the $N(N-1)/2$ distances between its various arguments, which takes the form of a sum of a relatively simple differential operator that dominates at large $d$ and a more complicated differential operator that is subdominant in $d$. A common theme of these perturbative calculations is that they seek "zero-mode" solutions $Z_N(\{x^{(j)}\}_{j=1}^{N})$ of $M_N$, by which is meant that $M_N Z_N$ approximately vanishes when the observation points $\{x^{(j)}\}_{j=1}^{N}$ all have separations within the inertial-convective range of scales, and is some unspecified but regular function otherwise. The idea is that a particular solution with normal scaling can be subtracted from $P_N^{*}$ to cancel (to leading order) the inhomogeneity on the right-hand side of Eq. (298) when the separations between the observation points all fall within the inertial-convective range. Each zero mode for a given order $N$ is then assumed to have inertial-range scaling characterized by a single exponent, which is computed by perturbation about some limit in which the homogeneous solutions of $M_N$ can be explicitly constructed. An argument based on either crude matching or continuity with the unperturbed limit [64,116] is then made to show that a certain zero mode provides the dominant contribution to the structure function $S_{2N}^{*}$ and determines its inertial-convective scaling exponent $\zeta_{2N}$. The general conclusion is that there is anomalous scaling for the passive scalar structure
function of the form (297), with the length scale $L$ equal to the pumping length scale. The Kolmogorov dissipation length is set equal to zero from the start in all these works, and anomalies with respect to the scalar dissipation length scale are thought not to occur, based on rough matching arguments to small scales indicating sufficiently regular behavior of $P_N^{*}$ as the scalar dissipation length is made small relative to the largest separation between the observation points, even when some points coalesce [64]. The results from the perturbation theories are mutually consistent in their domains of overlap, but contradict the law (302) for the scaling exponents deduced from the hypothesis of the linearity of $H(\delta T(r))$. These perturbation theory analyses of zero modes have recently been interpreted in terms of the statistics of Lagrangian tracer trajectories by Bernard et al. [36] as well as Gat and Zeitak [115]. The zero modes are associated with statistical "shape" configurations of a finite number of Lagrangian tracers which relax slowly in time to their asymptotic shapes [36,115]. A large dimension $d \to \infty$ perturbation analysis carried out directly on the stochastic equations of motion for the Lagrangian tracers [115] recovers the same results as the $d \to \infty$ perturbation theory based on zero modes of the PDEs for the PS correlation functions [63,64]. We finally mention a perturbative approach for $H \to 0$ pursued within a renormalization group framework by Adzhemyan et al. [2], which recovers the anomalous scaling predictions from the perturbative zero mode analyses in [35,116,275].
4.4.3.3. Theories predicting constant asymptote of anomalous scaling exponent. A distinct theory, motivated by studies of the randomly driven Burgers equation [267], has been offered by Yakhot [343]. This work puts forth an approximate closure for the Fokker-Planck PDE for the probability density function (PDF) for the scalar increment $\delta T(r)$, and seeks a solution for this PDF with a scale-invariant form over the region in which $|r|$ is much larger than the dissipation length scales and much smaller than the pumping length scale, and $|\delta T(r)| \ll \langle T^2(x) \rangle^{1/2}$. The resulting prediction is that the passive scalar structure functions $S_{2N}^{*}(r)$ exhibit normal scaling ($\zeta_{2N} = N\zeta_2$) up to some critical value of $N$, after which the scaling exponents remain approximately constant, asymptoting to a finite value $\lim_{N \to \infty} \zeta_{2N} = \zeta_{\infty}$. This prediction does not agree with either Eq. (302) or the results of the perturbation theories. The discrepancy with the latter is attributed in [343] to assumptions in that work which are not uniformly valid over all $H$ and $d$ and which require modification in the asymptotic regimes considered by the perturbation theories. The conclusion in [343] that the structure function scaling exponents $\zeta_{2N}$ should asymptote to a constant for large $N$ is supported, for a restricted range of Hurst exponents $H < 1$, by asymptotic "instanton" calculations of Chertkov [62] and Balkovsky and Lebedev [23], though all three papers disagree as to the value of this constant and other quantitative details. The instanton procedure, put forth in a general turbulence context in [96], seeks to describe the tails of the PDF (and thereby the high-order moments) of statistical quantities such as $\delta T(r)$ through a functional path-integral formalism inspired by the mathematical theory of quantum mechanics. A similar quantum mechanical analogy had previously been exploited by the first author in [206] to analyze higher-order passive scalar correlation functions in a random shear flow. The high-order moments in the instanton formalism are governed by a sort of semi-classical limit in which the dominant contributions to the functional integral are determined by saddle points in function space of a certain action functional $I(\delta T)$. These saddle points correspond in the quantum mechanical analogy to classical trajectories and are sometimes referred to as instantons; in turbulence applications, the instantons represent
certain motions of Lagrangian fluid elements. The identification of the instanton formally indicates the physical process responsible for the shape of the tails of the PDF of $\delta T(r)$ and the intermittency in structure functions of very large order. Completely different physical processes, however, may play the dominant role in producing intermittency of passive scalar structure functions of accessibly low order [61]. The saddle-point equations are generally much too difficult to solve exactly, and one typically proceeds by constructing special approximate instanton solutions, arguing by some auxiliary means which of these should be the dominant contribution, and assessing whether fluctuations about the instanton solutions are relevant [62,96]. Another approach is to seek to solve the instanton equations in some perturbative limit, such as large dimension $d \to \infty$ [23].
4.4.3.4. Discussion of approximate theoretical approaches. All of the above analytical arguments for anomalous scaling of the passive scalar structure functions involve a number of assumptions of varying degrees of plausibility which are difficult to verify with full confidence. Anomalous scaling is an inherently subtle subject, and may well involve the violation of certain "reasonable" beliefs. Indeed, some of the above theories produce conflicting predictions for the anomalous scaling exponents, and it is still unclear which of the theories' plausible assumptions fail and why. Some possibilities are suggested in [114]. The situation would clearly be clarified by some unambiguous results involving no assumptions subject to dispute.
4.4.4. Empirical assessment of theoretical predictions concerning anomalous scaling
The usual means of testing physical theories are difficult to apply to the issue of anomalous scaling in the RDT model. As we have indicated in Paragraph 4.4.2.1, the fundamental equations (298) for the high-order passive scalar correlation functions in the RDT model are very difficult to analyze quantitatively in a mathematically rigorous fashion, and we are not aware of any such work which bears directly on the anomalous scaling of the scalar structure functions $S_{2N}^{*}(r)$. Comparison with experiments is not really feasible, since the RDT model velocity field has unphysical temporal correlations, though we note that Ching et al. [69,70] found some support for the hypothesis [95,183,184] that $H(\delta T(r))$ is a linear function of $\delta T(r)$ in a real turbulent wake. An accurate numerical solution of the closed PDEs (298) for $N > 2$ is too expensive for modern machines, due both to the high dimensionality and to the wide range of scales which must be resolved. (Gat et al. [114] and Pumir [276] solve numerically for the scaling exponent characterizing the zero modes of Eq. (298) for $N = 3$, and find good agreement with the $H \to 0$ and $H \to 1$ perturbation theories. While this work is instructive, it still involves the introduction of additional assumptions and does not directly compute the PS structure functions.) Direct numerical simulation (DNS) of the passive scalar advection-diffusion equation with a rapidly decorrelating velocity field is difficult even in $d = 2$ dimensions due to the need to handle the rapid temporal fluctuations, to resolve a wide range of spatial scales, and to collect statistics of sufficient quality to compute the higher order scalar structure functions $S_{2N}^{*}(r)$ accurately as $r$ varies throughout the wide inertial-convective range [61,340,334]. Some DNS studies have been conducted by Kraichnan et al. [61,184] and Fairhall et al. [94], in which a two-dimensional velocity field is constructed by rapidly sweeping a superposition of two steady random velocity fields past each other. The spatial structure of the steady component fields is generated by a hierarchical version of the Fourier method which is discussed in Section 6.2.2. The DNS studies [61,94,184] show
qualitative agreement with the predictions of Kraichnan's linear closure ansatz for structure functions up through order 10 and Hurst exponents ranging from 0.3 to 0.75, but there are some statistically significant quantitative discrepancies. The function $H(\delta T(r))$ obtained from this data is indeed approximately linear for $|\delta T(r)| \gtrsim \sqrt{S_2^{*}(r)}$, but manifests a noticeable bump for $|\delta T(r)| \lesssim \sqrt{S_2^{*}(r)}$ [184]. This particular finding is in qualitative agreement with one prediction of Yakhot's anomalous scaling theory [343], and could indicate a possible departure from the more general fusion rules formulated for the RDT model [95,202] as well as Kraichnan's linear closure hypothesis. The scaling exponents, even of the fourth-order structure function, are shown in [61] to be very sensitive to slight changes in $H(\delta T(r))$ from a linear form. Unfortunately, some of the asymptotic regimes studied by perturbation theories are not amenable to DNS studies: the $d \to \infty$ and $N \to \infty$ limits are inaccessible for obvious reasons [61], and the quality of scaling within the inertial-convective range is seriously degraded as $H \to 0$ because the diffusion length invades the inertial range [33,340].
A more efficient means of numerically computing the passive scalar structure functions in the RDT model, realized and developed by Frisch et al. [108] (and independently proposed in [115]), is the Monte Carlo numerical simulation of the tracer trajectories (see Section 6). The passive scalar structure functions $S_{2N}^{*}(r)$ can be numerically computed at any value $r$ by a Monte Carlo simulation of $2N$ particles, starting from $N+1$ different initial clusterings at two locations separated by a distance $r$. The computational advantage of this trajectory-based approach is that one need only track a finite system of stochastic differential equations with deterministic coefficients rather than resolve a PDE on a full spatial grid. One ostensible drawback is the need to repeat the simulations to compute $S_{2N}^{*}(r)$ for each new value of $r$. Frisch et al. [108] however circumvent this expense by noting that the inertial-convective scaling exponent $\zeta_{2N}$ of the structure function (297) can be obtained by observing the scaling with respect to the length scale $L$ breaking complete self-similarity, which is widely believed to be the pumping length scale. The other main concern, in common with all Monte Carlo simulations, is the need to simulate a large number of independent realizations so that good statistics can be obtained [115]. The fourth-order structure function was computed in [108] by simulating millions of tracer trajectories in $d = 3$ dimensions, with the Hurst exponent of the velocity field ranging from $H = 0.1$ up to values very near $H = 1$. The method was validated by comparison of the simulated second-order structure function against the exactly known result (see Section 4.3.2). The numerically computed scaling exponents of the fourth order structure function depart strongly from the prediction (302) of Kraichnan's linear closure ansatz. (The disagreement is not as bad for values $H \approx 1$ which were previously computed by direct numerical simulations of the advection-diffusion equation [94,184].) The numerical result from Monte Carlo simulations for the fourth-order structure function with $H = 0.1$ is roughly consistent with the prediction of the $H \to 0$ theory, but this limit could not be resolved numerically.
4.4.5. Anomalous scaling in further simplified versions of RDT model
In an effort to obtain clear and unambiguous results on several controversial issues regarding anomalous scaling in the RDT model, some researchers have studied related models which are simpler to analyze. Vergassola and Mazzino [334] realized that a one-dimensional version of the RDT model could be sensibly formulated, provided that the incompressibility condition on the velocity field is removed. The velocity field is chosen to be delta-correlated in time with an inertial-scaling range just like the RDT shear flow in Section 4.1.4, except that the velocity
is directed along the $x$-axis rather than transverse to it. One may worry that the removal of incompressibility may fundamentally change the character of the model, but the rapid fluctuations of the velocity field help to mitigate compressibility effects. Indeed, it may be shown by a quadrature computation that the second-order passive scalar structure function $S_2^{*}(r)$ scales in the 1-D RDT model as a power law $r^{\zeta_2}$ in the inertial-convective range of scales, with the same exponent $\zeta_2$ as in the incompressible RDT model. The one-dimensional advection-diffusion equation with a rapidly decorrelating velocity field was numerically simulated in [334] for a single fixed Hurst exponent $H$ using a pseudo-spectral code, producing clean scaling behavior for the three structure functions reported over one to two decades of scales. The scaling exponents obtained from the slope of log-log plots exhibited anomaly ($\zeta_{2N} \neq N\zeta_2$). The values of these scaling exponents were moreover found to agree well, at the simulated value of $H$, with a Padé approximation constructed from the small Hurst exponent expansion technique in [35,116] adapted to the 1-D compressible analogue of Eq. (298).
Vergassola [333] also considered an RDT model version of the kinematic dynamo equations for the magnetic field, and showed that the second-order structure function in this model already possesses an anomalous scaling exponent $\zeta_2$ in the sense that it is not related to $H$ by standard dimensional analysis considerations (as it is in Eq. (296)). He derived an exact expression for $\zeta_2$ as a function of $H$, and showed that it was consistent in the $H \to 0$ limit with a prediction based on a small Hurst exponent expansion [35,116] for the scaling exponent of the zero modes of a closed equation for the correlation function analogous to Eq. (298). These results provide some support for the anomalous scaling predictions based on expansions in the Hurst exponent [35,116,278], but the reason for the disagreement with the competing theory [95,183,184] based on the linearity of $H(\delta T(r))$ remains to be clarified [114].
We finally make mention of an RDT "shell model" of Benzi et al. [33,340] in which Fourier space is discretized into geometrically distributed shells, and the nonlinear advective interaction between velocity and passive scalar Fourier shell modes is truncated to nearest neighbors. The Fourier components of the passive scalar correlation functions then satisfy band-diagonal systems of ordinary differential equations with respect to time. The equations for the second- and fourth-order correlation functions can be solved numerically, with the result that the second order structure function $S_2^{*}(r)$ scales normally in the inertial-convective range with exponent $\zeta_2 = 2 - H$ whereas the fourth order structure function $S_4^{*}(r)$ scales anomalously [340]. Direct Monte Carlo simulations of the advection-diffusion equations in the RDT shell model confirm these conclusions [340]. Stability analysis of the equations for the PS correlation functions shows that molecular diffusion and large-scale pumping have negligible effects in the RDT shell model on the scaling properties of the passive scalar correlation function deep within the inertial-convective range [33]. This is in agreement with the conventional wisdom for the continuous RDT model [64,95]. However, the insensitivity to molecular diffusion in the RDT shell model is not uniform in the $H \to 0$ limit, because the scalar dissipation length scale strongly invades the inertial range [33].
4.4.6. Other applications of RDT model to higher-order scalar statistics
The simplification of the temporal structure in the RDT model permits a number of other issues concerning the small-scale structure of the passive scalar field to be examined quantitatively. We refer the reader to the papers [164,292,295] wherein statistical properties of the gradient and level set contours of the passive scalar field are studied in an RDT velocity field without an inertial range.
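Several of the numerical studies summarized in Section 4.4 read off scaling exponents from the slope of log-log plots of a structure function against separation. The following is a minimal sketch of that fitting step only, applied to synthetic power-law data; the function name, the prefactor, and the exponent 1.5 are hypothetical choices for illustration and do not reproduce any of the cited simulations.

```python
import math


def loglog_slope(r_values, s_values):
    """Least-squares slope of log(S) against log(r), i.e. a scaling-exponent estimate."""
    xs = [math.log(r) for r in r_values]
    ys = [math.log(s) for s in s_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic "structure function" S(r) = 3 r^1.5 sampled over two decades of r.
# Both the prefactor and the exponent are made-up test values.
true_exponent = 1.5
r_values = [10.0 ** (-2.0 + 0.1 * k) for k in range(21)]
s_values = [3.0 * r ** true_exponent for r in r_values]
estimated_exponent = loglog_slope(r_values, s_values)

if __name__ == "__main__":
    print("estimated exponent:", estimated_exponent)
```

With real data the fit would have to be restricted to the inertial-convective range, and the statistical error of each $S(r)$ sample would limit how finely an anomaly such as $\zeta_{2N} \neq N\zeta_2$ can be resolved.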
5. Elementary models for scalar intermittency
Small-scale intermittency, of the type just discussed in Section 4.4, has been a recognized feature of turbulence since 1962, when Kolmogorov [170] and Obukhov [255] modified the original 1941 turbulence theory to account for its effects. The small-scale intermittency in both the velocity and advected scalar fields in a turbulent flow is typically associated with the patchiness and irregularity of regions of strong vorticity and scalar gradients [72,309]. Statistics which involve evaluation of the turbulent field at two closely spaced points are sensitive to small-scale intermittency, and can exhibit anomalous scaling with respect to Reynolds number or separation distance between the observation points. Single-point recordings, on the other hand, measure the total fluctuation coming from all scales, and are dominated by contributions from the large scale fluctuations because they have the larger amplitude. They are therefore insensitive to small-scale intermittency.
It was long thought that, at sufficiently high Reynolds number, the single-point measurements of the velocity field and of the passive scalar field in homogeneous turbulence ought to exhibit the Gaussian statistics typical of noisy processes with many degrees of freedom. Indeed, nearly Gaussian behavior for these quantities was observed in several experiments [106,311,316,324,326] dating back to 1947. An intriguing development came in the late 1980s with the report that single-point temperature measurements in a Rayleigh-Bénard convection cell experiment at the University of Chicago exhibited large fluctuations with greater frequency than what would arise from a Gaussian distribution [55,135]. More precisely, the single-point probability density function (PDF) $p_T(\rho)$ for the temperature $T$, defined by
$$\mathrm{Prob}\{a < T < b\} = \int_a^b p_T(\rho)\, d\rho ,$$
was found to decay only exponentially, $\sim C e^{-c|\rho|}$, for large values of $|\rho|$, and not like a Gaussian, $\sim C e^{-c\rho^2}$ (with $C$ and $c$ denoting generic positive constants). Broad tails in the single-point PDF indicate unusual activity occurring at the large scales, such as large random coherent structures amidst the turbulent "noise". This property of large fluctuations in the single-point statistics of a turbulent flow occurring with a significantly super-Gaussian probability is therefore often referred to as large-scale intermittency. To understand how this might come about, recall that in a Rayleigh-Bénard convection cell, the lower face of a cube of fluid is maintained at a temperature hotter than that of the upper face, while the side walls are insulated. When the applied temperature differential is sufficiently strong, the temperature profile becomes unstable to a large-scale convection rolling pattern, with hot fluid rising on one side of the cell and cold fluid descending on the other [293]. Superposed on this mean circulation are turbulent fluctuations, and one may envision the large-scale intermittency of the measured temperature as coming from the occasional passage of ascending (descending) plumes of hot (cold) fluid past the probe [135]. Another striking feature of the temperature PDF measured in this experiment, beyond its exponential tails, is its universality with respect to Rayleigh number (a nondimensionalized measure of the vigor of buoyant convection relative to dissipative processes).
These findings stimulated theoretical inquiry into whether the departure of the temperature from Gaussian behavior was due to the complex flow and velocity-temperature interactions in the
convection cell, or whether it might arise more generally. Pumir et al. [277] analyzed a phenomenological turbulent mixing model, and predicted that exponential tails should be observed in the PDF of a passive scalar advected by a moderate Reynolds number flow when a constant mean scalar gradient is imposed by the boundary conditions. That is, neither buoyancy nor a wide inertial range ought to be necessary for the appearance of large-scale scalar intermittency. Sinai and Yakhot [298] considered the general evolution of a freely decaying passive scalar in a statistically stationary velocity field (with no mean scalar gradient imposed). Through a closure approximation, they predicted that the scalar PDF would develop broad, algebraic tails ($p_T(\rho) \sim C|\rho|^{-\beta}$ for large $|\rho|$) in the long time limit.
These theoretical developments in turn prompted experimental investigation of the statistics of temperature fluctuations small enough in magnitude to be considered passive. Gollub and coworkers [127,191] conducted an experiment directly suggested in [277]. A weak temperature differential was applied to opposite sides of a desktop cell in which the fluid was stirred by an oscillating grid. Above a moderate threshold Reynolds number, the temperature displays broad exponential tails in its PDF, while the turbulent velocity field has a short inertial range and a Gaussian PDF. Meanwhile, Jayesh and Warhaft [146,147] studied the PDF of the temperature and velocity in a wind tunnel with grid-generated turbulence. A mean temperature gradient is impressed on the flow only at the inlet, and the velocity and temperature fluctuations decay freely in the tunnel. Despite the differences from the setup of Gollub et al., similar results were found: at comparable Reynolds number, the velocity PDF quickly relaxes to a Gaussian form, while the temperature PDF exhibits broad exponential tails far downstream of the grid. The temperature PDF in both experiments, however, reverts to a Gaussian form if the Reynolds number does not exceed some moderate value. Jayesh and Warhaft also observed Gaussian behavior for the temperature when a mean temperature gradient is not impressed on the flow. Large eddy [237,238] and direct numerical simulations [91,111,153,155], which were pursued even earlier and in a different spirit than the above-mentioned laboratory experiments, also indicate a similar variety of both Gaussian and non-Gaussian behavior for turbulently advected active and passive scalars. We shall quickly review some of the main results from physical and numerical experiments in Section 5.1.
In order to investigate some of the issues regarding large-scale passive scalar intermittency raised by the above-mentioned laboratory and numerical experiments, the first author designed a turbulent diffusion model for which the single-point passive scalar PDF could be computed exactly [207]. The velocity field in this Random Uniform Jet model was taken as a superposition of two parallel uniform shear flows, one with deterministic variation in time modelling a mean flow, and the other with random and rapid fluctuations modelling the effects of moderate Reynolds number turbulence. Beyond the specification of the form of the velocity field, no further assumptions are introduced; the advection-diffusion equation can be analyzed in its exact form. Though the model velocity field is relatively simple, the single-point statistics of the advected passive scalar field display a rich behavior. For example, the scalar PDF approaches a Gaussian or non-Gaussian form at long times, depending on the dynamics of the mean shear flow. The long-time limiting shape of the scalar PDF is universal in a restricted sense: it depends on three parameters involving only the large-scale features of the initial data and one parameter involving both the flow parameters and the large-scale features of initial data. We will present the Random Uniform Jet model in Section 5.2, and point to some qualitative connections between its findings and the results
of numerical and laboratory experiments. This discussion draws from the original paper [207] and subsequent elaborations by McLaughlin and the first author [233] and by Resnick [282].
Further issues concerning passive scalar intermittency were studied by Bronski and McLaughlin [51] in a variation of the Random Uniform Jet model in which the random shear flow is taken to be periodic, rather than uniform, in space. While this Random, Spatially Periodic Shear model is not exactly solvable in the same sense as the original Random Uniform Jet model, the passive scalar PDF can still be analyzed in the long-time limit through precise asymptotic expansions. The results of these direct computations can be compared with the predictions of homogenization theory (see Section 2), which applies rigorously in a certain long-time asymptotic limit in which the initial data is simultaneously rescaled to larger scales. As we shall discuss in Section 5.3, the scalar PDF in the Random, Spatially Periodic Shear model displays substantial qualitative differences from the homogenized description when evaluated at large but finite times with the initial data varying on a large but fixed length scale [51]. This is the appropriate limit procedure for studying the long-time characteristics of a particular system, and its departures from homogenization theory indicate some subtleties about the nature of the limit process involved in the latter.
We conclude in Section 5.4 with a brief discussion of some other recent theoretical studies which shed light on other aspects of large-scale passive scalar intermittency beyond those present in the exactly solvable simple models described above. Most of this work is based upon phenomenological or formal approximations, in contrast to the exact analysis of the basic advection-diffusion equation presented in Sections 5.2 and 5.3.
5.1. Empirical observations
As we have mentioned in the introduction, the PDF of a turbulently advected scalar has been empirically found to have exponential tails in some circumstances and to be Gaussian in others. It would be interesting to establish criteria classifying the ingredients and parameter ranges associated with large-scale scalar intermittency, but much more investigation appears necessary. (See [147] for the suggestion of such a criterion, which however does not appear to explain the results of [238] and [316].) The physical experiments and direct and large eddy numerical simulations do at least suggest, though, that the shape of the scalar PDF depends on the following features and parameters:
1. whether the turbulence is driven or freely decaying [153,238],
2. whether buoyancy effects are important [55,135,155,238],
3. the presence of a mean scalar gradient [147],
4. the presence of a mean shear flow (compare [147,316]),
5. Reynolds number [147,191],
6. the relative magnitude of the integral length scale of the velocity field and the correlation length of the passive scalar field [191,311],
7. the time at which the scalar is measured when it is freely decaying.
With regard to the last point, the scalar PDF in some freely decaying turbulent systems exhibits strong skewness [311] or broad tails [91] for a while, but slowly relaxes to a Gaussian distribution after a long time.
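To make the distinction between Gaussian and exponential tails concrete, the sketch below compares the two reference laws at unit variance, using exact tail probabilities and the sample flatness $\langle x^4 \rangle / \langle x^2 \rangle^2$. The flatness is a standard diagnostic introduced here as an assumption; it is not a measure prescribed in the text, and the synthetic samples stand in for, rather than reproduce, any experimental data. The exact flatness is 3 for a Gaussian and 6 for a two-sided exponential (Laplace) law.

```python
import math
import random


def gaussian_tail(t: float) -> float:
    """P(|X| > t) for a zero-mean, unit-variance Gaussian variable."""
    return math.erfc(t / math.sqrt(2.0))


def exponential_tail(t: float) -> float:
    """P(|X| > t) for a zero-mean, unit-variance two-sided exponential (Laplace) variable."""
    return math.exp(-math.sqrt(2.0) * t)


def flatness(samples):
    """Sample flatness <x^4> / <x^2>^2, with moments taken about the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2)


rng = random.Random(12345)  # fixed seed so the run is reproducible
n_samples = 200_000
gauss_samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
# Laplace samples: exponential magnitude with a random sign (flatness is scale invariant).
laplace_samples = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(n_samples)]
gauss_flatness = flatness(gauss_samples)
laplace_flatness = flatness(laplace_samples)

if __name__ == "__main__":
    print("tail probability beyond 5 standard deviations:")
    print("  Gaussian   :", gaussian_tail(5.0))
    print("  exponential:", exponential_tail(5.0))
    print("sample flatness:", gauss_flatness, laplace_flatness)
```

Because the tails differ by orders of magnitude only far from the mean, distinguishing the two forms in a measurement requires long records, which is one practical reason the observations listed above are difficult to classify.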
We are not aware of any positive experimental results concerning the universality of the scalar PDF other than those reported in the Chicago convection experiments [55], but there does not seem to have been much systematic investigation in this direction. 5.2. An exactly solvable model displaying scalar intermittency We shall now introduce a simpli"ed turbulent di!usion model in which the scalar PDF can be exactly analyzed without the need for any further ad hoc approximations or assumptions. Within the model, we will be able to characterize precisely the conditions under which the scalar PDF displays Gaussian or non-Gaussian features, and touch on the issues 4, 6 and 7 listed in Section 5.1. We can also describe the extent to which the limiting shape of the PDF in the long-time limit is universal. 5.2.1. Random uniform jetmodel For our mathematical investigation of passive scalar intermittency, we consider velocity "eld models from a class of three-dimensional jet #ows:
\[ \mathbf{v}(\mathbf{x},t) = \mathbf{v}(x,y,z,t) = \begin{pmatrix} 0 \\ \gamma_d(t)\,z + \gamma_r(t)\,V(x) \\ 0 \end{pmatrix} . \tag{303} \]

Here, $V(x)$ is a deterministic spatial profile, $\gamma_d(t)$ is a deterministic function of time, and $\gamma_r(t)$ is a stationary, Gaussian random function of time with mean zero and correlation function:
\[ \langle \gamma_r(t)\,\gamma_r(t+\tau) \rangle = R(\tau) . \]
The velocity field is directed only in the single direction $y$, and is composed of a deterministic shear flow $\gamma_d(t)z$ and a random shear flow $\gamma_r(t)V(x)$ varying in transverse directions. The deterministic component $\gamma_d(t)z$ is supposed to model a mean shearing motion which responds to some regular, external forcing. The random component $V(x)\gamma_r(t)$ represents a spatially coherent motion with random temporal fluctuations, qualitatively modelling an excited instability in the fluid flow. The class of models (303) thereby mimics some features of a flow with a moderate Reynolds number not far above the onset of the turbulent activity. To model a high-Reynolds number jet flow, we could superpose a further shear velocity field with random spatio-temporal fluctuations over a wide inertial range of scales (as in Section 3). The empirical evidence suggests, however, that the issue of passive scalar intermittency is already interesting for moderate Reynolds number flows [147,191].

We will consider the evolution of passive scalar initial data of the form
\[ T_0(x,y,z) = \bar T_0(x,y) + \tilde T_0(x,y) , \]
where $\bar T_0(x,y)$ is a deterministic mean profile and $\tilde T_0(x,y)$ is a mean zero, homogeneous, Gaussian random field of fluctuations about this profile, with correlation function
\[ P(x,y) = \langle \tilde T_0(x',y')\,\tilde T_0(x+x',y+y') \rangle = \int_{\mathbb{R}^2} e^{2\pi i(\eta x + k y)}\,\hat P(\eta,k)\,d\eta\,dk . \tag{304} \]
If the mean profile is nontrivial, it will be assumed to have a finite nonzero integral:
\[ \bar M_0 = \int_{\mathbb{R}^2} \bar T_0(x,y)\,dx\,dy \neq 0 . \tag{305} \]
If random fluctuations are present in the initial data, their spectrum $\hat P(\eta,k)$ will be assumed to have the following low-wavenumber asymptotics:
\[ \lim_{\eta,k \to 0} \hat P(\eta,k) \sim |k|^{-\alpha}\,\phi(\eta,k) , \tag{306} \]
where $\phi(\eta,k)$ is a smooth function with $\phi(0,0)$ finite and positive. The restriction that the initial data vary only in the $x$ and $y$ directions eases the complexity of the formulas; all the qualitative features we shall discuss carry over to the case in which the initial data also vary in the $z$ direction [207,233].

We will presently consider a uniform profile for the random shear ($V(x) = x$), and prescribe the random temporal fluctuations $\gamma_r(t)$ to be either
• a white noise process, with correlation function
\[ R(\tau) = A\,\delta(\tau) , \tag{307} \]
or
• a superposition of Ornstein–Uhlenbeck processes, with correlation function
\[ R(\tau) = \sum_j A_j\,e^{-|\tau|/\tau_j} . \tag{308} \]
The $\{\tau_j\}$ are the correlation time scales of the Ornstein–Uhlenbeck processes, while $A$ and $\{A_j\}$ determine the amplitude of the associated shear flows. The white noise process may be thought of as a certain rapid decorrelation limit of the Ornstein–Uhlenbeck process ($\tau_j \to \varepsilon\tau_j$ and $A_j \to \varepsilon^{-1}A_j$ with $\varepsilon \to 0$).

The passive scalar field will evolve freely in the statistically stationary turbulent velocity just defined, and neither a mean scalar gradient nor boundary conditions are imposed. The advection–diffusion model we have just defined will be called the Random Uniform Jet model:
\[ \frac{\partial T(x,y,z,t)}{\partial t} + \bigl(\gamma_d(t)\,z + \gamma_r(t)\,x\bigr)\frac{\partial T(x,y,z,t)}{\partial y} = \kappa\,\Delta T(x,y,z,t) , \tag{309} \]
\[ T(x,y,z,t=0) = \bar T_0(x,y) + \tilde T_0(x,y) . \]
The assumptions made concerning the spatial profile of the shear and the temporal dynamics permit the single-point statistics $\langle T^N(x,y,z,t) \rangle$ of the passive scalar field to be expressed by explicit quadrature formulas. Later, in Section 5.3, we shall consider a periodic spatial profile $V(x)$ for the random shear.

We present the large-scale scalar intermittency properties of the Random Uniform Jet model in Section 5.2.2 and qualitatively relate them to the empirical observations. The derivation of these results, in which the passive scalar correlation functions are represented through the solutions of
quantum-mechanical Schrödinger equations, will be sketched in Section 5.2.3. In the present case, these Schrödinger equations describe precisely the motion of a system of particles subject to a harmonic oscillator potential, and may be solved in fully explicit form through Mehler's formula [296]. This methodology was developed by the first author [207] and extended together with McLaughlin [233] for the white noise temporal structure of the random shear. Resnick [282] generalized the analysis to handle the Ornstein–Uhlenbeck temporal structure.

5.2.2. Exact results for models and relation to physical themes

Despite the relative simplicity of the Random Uniform Jet model (309), the PDF $p_T^{(x,y,z,t)}(\cdot)$ for the single-point passive scalar statistics:
\[ \mathrm{Prob}\{a \le T(x,y,z,t) \le b\} = \int_a^b p_T^{(x,y,z,t)}(\rho)\,d\rho \]
displays a variety of interesting features. First, the scalar PDF has a
• broader-than-Gaussian distribution at all finite times $0 < t < \infty$ and spatial locations, when the initial data is a mean zero, Gaussian, homogeneous random field.
In the long-time limit, the scalar PDF will become concentrated at zero because of dissipative processes, but it will converge to a nontrivial shape (independent of spatial location $(x,y,z)$) which exhibits the following features:
• broader-than-Gaussian tails when the mean shear flow $\gamma_d(t)z$ is weak or absent,
• a Gaussian distribution when the mean shear flow $\gamma_d(t)z$ is sufficiently persistent,
• dependence on the relative magnitude of initial velocity and passive scalar length scales (as measured by the parameter $\alpha$ in Eq. (306)), with the PDF becoming more Gaussian as the long-range correlations in the initial data become stronger,
• permanent skewness with memory of the sign of the integral of the mean initial scalar profile, $\bar M_0$,
• a phase transition with respect to the parameter $\alpha$ when the initial data has both deterministic and random components,
• dependence on molecular diffusivity at the phase transition value $\alpha = -3/2$,
• universality with respect to small-scale features of initial data,
• universality with respect to the random temporal correlation structure of the velocity field, whether it be a white noise process or a superposition of Ornstein–Uhlenbeck processes.
The fact that our exactly solvable model produces scalar intermittency with this wide range of features makes it an attractive candidate for testing approximate closure schemes [269]. We shall now elaborate upon the above results for the model, and connect them with the experimental and numerical findings presented in Section 5.1.

5.2.2.1. Finite-time intermittency. Suppose that the initial data is purely a mean zero, homogeneous, Gaussian random field $T_0(x,y) = \tilde T_0(x,y)$. Then, when $\gamma_r(t)$ is white noise in time, one can show through explicit formulas that the scalar PDF $p_T^{(x,y,z,t)}(\cdot)$ is broader-than-Gaussian at all finite positive times [207]. That is, the flatness factor
\[ F(x,y,z,t) \equiv \frac{\langle (T(x,y,z,t))^4 \rangle}{\langle (T(x,y,z,t))^2 \rangle^2} \]
strictly exceeds the Gaussian value of 3 for all $(x,y,z) \in \mathbb{R}^3$ and $0 < t < \infty$. A similar result can be rigorously deduced through the analysis of quantum mechanical analogies (see Section 5.2.3) when the shear flow profile $V(x)$ is periodic in space (Section 5.3 and [51]) or contains a more general spatio-temporal randomness [206].

Fefferman [101] pointed out that broader-than-Gaussian scalar PDFs should in fact arise at finite times for quite general models with random advection and molecular diffusion, when the initial data is a mean zero, homogeneous, Gaussian random field. His argument sharpens some related observations of Kimura and Kraichnan [162]. Observe first that since the advection–diffusion equation is linear:
\[ \frac{\partial T(\mathbf{x},t)}{\partial t} + \mathbf{v}(\mathbf{x},t)\cdot\nabla T(\mathbf{x},t) = \kappa\,\Delta T(\mathbf{x},t) , \qquad T(\mathbf{x},t=0) = T_0(\mathbf{x}) , \]
we may represent its solution as the integral of the initial data against a Green's function $p_t(\mathbf{x},\mathbf{x}')$:
\[ T(\mathbf{x},t) = \int_{\mathbb{R}^d} p_t(\mathbf{x},\mathbf{x}')\,T_0(\mathbf{x}')\,d\mathbf{x}' . \]
The Green's function appearing here is random because it depends on the random velocity field $\mathbf{v}$. Consider now a solution of the advection–diffusion equation which is conditioned upon a particular realization of the velocity field $\mathbf{v}$ chosen from the statistical ensemble. This conditioned passive scalar field, which we denote $T(\mathbf{x},t|\mathbf{v})$, is expressible as the integral of the initial data against a deterministic Green's function, since the velocity field is fixed at a given realization. Being a deterministic linear functional of the mean zero, Gaussian random field $T_0(\mathbf{x})$, the conditioned passive scalar field $T(\mathbf{x},t|\mathbf{v})$ must also be a mean zero, Gaussian random field. The original, unconditioned passive scalar field $T(\mathbf{x},t)$ can therefore be described as a random mixture of the mean zero, Gaussian random fields $T(\mathbf{x},t|\mathbf{v})$.

We now show that this implies that $T(\mathbf{x},t)$ must have a broader-than-Gaussian single point PDF at all space–time points, unless the conditional passive scalar variance
\[ \sigma_T^2(\mathbf{x},t;\mathbf{v}) \equiv \langle T^2(\mathbf{x},t|\mathbf{v}) \rangle_0 \]
is independent of the particular realization of the velocity field $\mathbf{v}$ which is held fixed in the average over the initial data. Recall our convention that the suffix "0" on the expectation brackets indicates an averaging only over the statistics of the initial data, while a suffix "v" will indicate an average only over the velocity statistics. (We assume these are statistically independent.) Note first that $\langle T(\mathbf{x},t) \rangle = \langle \langle T(\mathbf{x},t|\mathbf{v}) \rangle_0 \rangle_v = \langle 0 \rangle_v = 0$. The flatness factor of the passive scalar field may then be written:
\[ F(\mathbf{x},t) \equiv \frac{\langle (T(\mathbf{x},t) - \langle T(\mathbf{x},t) \rangle)^4 \rangle}{\langle (T(\mathbf{x},t) - \langle T(\mathbf{x},t) \rangle)^2 \rangle^2} = \frac{\langle T^4(\mathbf{x},t) \rangle}{\langle T^2(\mathbf{x},t) \rangle^2} = \frac{\langle \langle T^4(\mathbf{x},t|\mathbf{v}) \rangle_0 \rangle_v}{\langle \langle T^2(\mathbf{x},t|\mathbf{v}) \rangle_0 \rangle_v^2} . \]
Because $T(\mathbf{x},t|\mathbf{v})$ is a Gaussian random field for each fixed realization of $\mathbf{v}$, we have
\[ \langle T^4(\mathbf{x},t|\mathbf{v}) \rangle_0 = 3\,\langle T^2(\mathbf{x},t|\mathbf{v}) \rangle_0^2 = 3\,\bigl(\sigma_T^2(\mathbf{x},t;\mathbf{v})\bigr)^2 . \]
Therefore,
\[ F(\mathbf{x},t) = 3\,\frac{\langle (\sigma_T^2(\mathbf{x},t;\mathbf{v}))^2 \rangle_v}{\langle \sigma_T^2(\mathbf{x},t;\mathbf{v}) \rangle_v^2} . \]
But by the simplest moment inequality
\[ \frac{\langle (\sigma_T^2(\mathbf{x},t;\mathbf{v}))^2 \rangle_v}{\langle \sigma_T^2(\mathbf{x},t;\mathbf{v}) \rangle_v^2} \ge 1 , \]
with equality holding only when the functional of the random velocity field, $\sigma_T^2(\mathbf{x},t;\mathbf{v})$, has zero variance, i.e., behaves deterministically. We have shown that the passive scalar field will have a broader-than-Gaussian distribution at all space–time points $(\mathbf{x},t)$ for which $\sigma_T^2(\mathbf{x},t;\mathbf{v})$ has some nontrivial dependence on $\mathbf{v}$.

Therefore, the question of finite-time intermittency reduces to the question of whether the conditional variance of the passive scalar field at a certain location depends on the particular realization of the velocity field. If the velocity field were deterministic or absent, then there tautologically would be only one realization of the velocity field, and the passive scalar field must be exactly Gaussian at all times. On the other hand, if the passive scalar field were advected by an arbitrary incompressible random velocity field with no molecular diffusion, then the passive scalar field is simply advected along characteristics. Consequently, in every realization of the velocity field, the single-point scalar PDF will everywhere be identical to that of the homogeneous, Gaussian random initial data $T_0(\mathbf{x})$. Hence, as was also shown by Kimura and Kraichnan [162], the passive scalar field arising from homogeneous, Gaussian random initial conditions will remain Gaussian unless both random advection and molecular diffusion are active.

While these transport mechanisms in isolation always preserve Gaussianity of the initial, homogeneous, random passive scalar field, their interaction generically will produce intermittent passive scalar fields at all finite times. Consider, for example, the Random Uniform Jet Model, in which the random component of the velocity field is the stochastic process $\gamma_r(t)$ multiplying a uniform shear flow spatial structure. Scalar intermittency will arise provided that the conditional variance of the passive scalar field $\sigma_T^2(\mathbf{x},t;\mathbf{v})$ depends on the particular realization of $\gamma_r(t)$. This is clearly true: the shear flow interacts with the molecular diffusion to produce enhanced diffusion [31,258], and in each realization of the random shear, the mean-square displacement of a tracer along the shear is
\[ \sigma_Y^2(t\,|\,\gamma_r) = 2\kappa \int_0^t\!\!\int_0^t \gamma_r(s)\,\min(s,s')\,\gamma_r(s')\,ds\,ds' , \tag{310} \]
as follows from computations similar to those described in Section 2.3.1. The greater the resulting transport, the more rapidly the passive scalar fluctuations will average out and their variance decrease. In general, velocity fields with regions of stronger strains and shears give rise to more effective diffusion of the passive scalar field and consequently faster dissipation of the passive scalar variance than those with gentler gradients. Thus, in a generic random velocity field model in which the realizations of the statistical ensemble have different shearing and straining behavior, one can expect broader-than-Gaussian scalar PDFs at all finite space–time locations $(\mathbf{x},t)$, when the initial
passive scalar field is a mean zero, homogeneous, Gaussian random field. The assumption of statistical homogeneity is, in fact, inessential.

Note that the argument just presented does not suggest that the scalar PDF should generally remain broader-than-Gaussian in the long-time limit. The passive scalar variance $\sigma_T^2(\mathbf{x},t) = \langle \sigma_T^2(\mathbf{x},t;\mathbf{v}) \rangle_v = \langle T^2(\mathbf{x},t) \rangle$ will decay to zero as $t \to \infty$, so one must consider instead the normalized conditional variance:
\[ R_T(\mathbf{x},t;\mathbf{v}) \equiv \sigma_T^2(\mathbf{x},t;\mathbf{v}) / \sigma_T^2(\mathbf{x},t) . \]
By the same arguments as above, the scalar PDF approaches a broader-than-Gaussian shape in the long-time limit when $R_T^*(\mathbf{x};\mathbf{v}) \equiv \lim_{t\to\infty} R_T(\mathbf{x},t;\mathbf{v})$ is a nontrivial functional of the random velocity field, and will asymptotically become Gaussian when $R_T^*(\mathbf{x};\mathbf{v})$ is almost surely a deterministic constant (which must be unity). Intuitively, we might associate relaxation to a Gaussian distribution ($R_T^*(\mathbf{x}) \equiv 1$) with flows in which the passive scalar field has a "self-averaging property," so that the long-time properties of the passive scalar field are the same in each individual realization of the velocity field. But it is not clear how to decide which case holds for a general, given nontrivial model. For the Random Uniform Jet Model, we can actually compute the moments of the scalar PDF through explicit formulas, and thereby directly characterize when the scalar PDF is Gaussian or broader-than-Gaussian in the long-time limit.

5.2.2.2. Gaussianity and non-Gaussianity of the asymptotic scalar PDF. As we have just discussed, the single-point passive scalar PDF $p_T^{(x,y,z,t)}(\cdot)$ is broader-than-Gaussian at finite times in the Random Uniform Jet Model. The question of whether this PDF relaxes to a Gaussian in the long-time limit is determined [207] through the long-time behavior of the following function related to the temporal structure of the deterministic mean shear flow $\gamma_d(t)z$:
\[ I(t) = \kappa \int_0^t \left( \int_0^s \gamma_d(s')\,ds' \right)^2 ds . \]
• If $t^{-2}I(t)$ is bounded in time, then as $t \to \infty$, the passive scalar PDF $p_T^{(x,y,z,t)}(\cdot)$ converges to a broader-than-Gaussian shape which is completely independent of the mean shear flow.
• If $t^{-2}I(t) \to \infty$ as $t \to \infty$, then the passive scalar PDF $p_T^{(x,y,z,t)}(\cdot)$ approaches a Gaussian distribution in the long-time limit.
(If neither of these cases holds, then the passive scalar PDF never settles down to a limit, but forever oscillates in response to variations in the mean shear flow $\gamma_d(t)z$.)
In particular, in the absence of the mean shear flow ($\gamma_d(t) = 0$), the passive scalar PDF in our model converges to a broader-than-Gaussian limiting form in the long-time limit. The addition of a uniform mean shear flow $\gamma_d(t)z$ with $\gamma_d(t)$ a periodic function of time will not influence this limiting distribution. On the other hand, the addition of a steady mean shear flow ($\gamma_d(t)$ a nonzero constant) will cause the passive scalar PDF to relax to a Gaussian form. More generally, the asymptotic scalar PDF will be Gaussian or not according to whether or not the mean shear flow is sufficiently persistent, as measured by whether $t^{-2}I(t)$ diverges or stays bounded. Large values of $I(t)$ correspond to temporal dynamics $\gamma_d(t)$ of the mean shear flow which tend sufficiently strongly toward one direction or the other.
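
The persistence criterion can be explored numerically. The following is a minimal Python sketch rather than part of the original analysis. It takes $I(t) = \kappa\int_0^t (\int_0^s \gamma_d(s')\,ds')^2\,ds$, the form consistent with Eq. (311). The function name `persistence_integral`, the trapezoidal discretization, and the two sample shear histories (a steady shear $\gamma_d = 1$ and a zero-mean periodic shear $\gamma_d = \cos s$) are illustrative assumptions.

```python
import math


def persistence_integral(gamma_d, t, kappa=1.0, n=10000):
    """Trapezoidal estimate of I(t) = kappa * int_0^t (int_0^s gamma_d(s') ds')^2 ds."""
    h = t / n
    inner = 0.0      # running value of int_0^s gamma_d(s') ds'
    total = 0.0      # running value of the outer integral
    g_prev = gamma_d(0.0)
    sq_prev = 0.0    # square of the inner integral at s = 0
    for i in range(1, n + 1):
        g = gamma_d(i * h)
        inner += 0.5 * h * (g_prev + g)
        sq = inner * inner
        total += 0.5 * h * (sq_prev + sq)
        g_prev, sq_prev = g, sq
    return kappa * total


def steady(s):
    """Hypothetical steady mean shear rate."""
    return 1.0


def periodic(s):
    """Hypothetical zero-mean periodic mean shear rate."""
    return math.cos(s)


for t in (10.0, 100.0, 1000.0):
    n = int(50 * t)
    r_steady = persistence_integral(steady, t, n=n) / t**2
    r_periodic = persistence_integral(periodic, t, n=n) / t**2
    print(f"t={t:7.1f}  steady: {r_steady:10.3f}  periodic: {r_periodic:10.6f}")
```

For the steady shear the ratio $t^{-2}I(t)$ grows like $t/3$, so the criterion points to a Gaussian limit. For the zero-mean periodic shear it decays roughly like $1/(2t)$ and stays bounded, so the broader-than-Gaussian limit is retained.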
It can be readily checked that the above conclusions hold without change if the mean shear flow were to be given instead by $\gamma_d(t)x$, with its gradient directed parallel to that of the random shear flow $\gamma_r(t)x$.

One might understand why a persistent mean shear flow is associated with Gaussianity of the limiting passive scalar PDF through a "self-averaging effect". From Section 3, we know that molecular diffusion interacts with shear flows to produce enhanced diffusion of the passive scalar field along the direction of the shear. As discussed above, the interaction of the molecular diffusion with the random shear flow $\gamma_r(t)x$ is a source of scalar intermittency due to the variability of the resulting shear-enhanced diffusion and dissipation. On the other hand, the interaction of the molecular diffusion with a deterministic shear flow will lead to a rapid diffusion along the shearing direction which will tend to average out the passive scalar fluctuations over space in a regular, nonrandom fashion. By a central limit argument, one can expect that this averaging effect will tend to restore Gaussianity to the passive scalar statistics.

This intuition is reflected by the exact criterion concerning $I(t)$ described above. This quantity can be rewritten in a form
\[ I(t) = \kappa \int_0^t\!\!\int_0^t \gamma_d(t-s)\,\min(s,s')\,\gamma_d(t-s')\,ds\,ds' \tag{311} \]
which has a good deal of similarity with the formula for the mean-square tracer displacement along a time-dependent, deterministic uniform shear flow (cf. (310)):
\[ \sigma_Y^2(t) = 2\kappa \int_0^t\!\!\int_0^t \gamma_d(s)\,\min(s,s')\,\gamma_d(s')\,ds\,ds' . \]
The mean-square displacement along a randomly fluctuating shear flow $\gamma_r(t)x$ with white noise correlations can also be calculated by similar means:
\[ \sigma_Y^2(t) = 2\kappa \int_0^t\!\!\int_0^t \min(s,s')\,\langle \gamma_r(s)\gamma_r(s') \rangle\,ds\,ds' = \kappa A t^2 , \]
and a quadratic expression in $t$ also results upon averaging the formula (311) for $I(t)$ with $\gamma_r(t)$ replacing $\gamma_d(t)$. Therefore, the criterion for whether the asymptotic scalar PDF is Gaussian or not is related in some sense to whether the mean or random shear diffuses the passive scalar field more effectively.

The "Gaussianizing" property of the mean shear revealed in the Random Uniform Jet model might be responsible for the different properties of the temperature observed in wind tunnel experiments by Jayesh and Warhaft [147] and by Tavoularis and Corrsin [316]. In both experiments, a mean temperature gradient is impressed on the fluid at the inlet, after which it freely decays. Jayesh and Warhaft generate turbulence by passing a uniform flow through a grid, whereas Tavoularis and Corrsin introduce a nearly uniform mean shear flow into the tunnel, where turbulence develops due to flow instabilities. In the former experiment, broad tails in the temperature PDF persist far downstream. In the latter, the temperature PDF is very well approximated by a Gaussian, even though the Reynolds number is higher. A satisfactory investigation of this point would of course require a comparison between two experiments in which all conditions are held fixed, other than the presence of the mean shear.
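
The white-noise displacement formula earlier in this subsection can be checked by direct sampling. The sketch below is an illustration under stated assumptions, not part of the original derivation. It uses the identity $\min(s,s') = \int_0^t 1_{\{u<s\}} 1_{\{u<s'\}}\,du$ to rewrite the conditional displacement (310) as $2\kappa\int_0^t (W(t)-W(u))^2\,du$, where $W(t) = \int_0^t \gamma_r(s)\,ds$ is approximated by a Gaussian random walk whose increments have variance $A h$ (white noise with $R(\tau) = A\,\delta(\tau)$, Eq. (307)). The function names, parameter values and seed are hypothetical.

```python
import random


def conditional_displacement(kappa, amp, t, n, rng):
    """One realization of 2*kappa * int_0^t (W(t) - W(u))^2 du for a discretized white-noise shear."""
    h = t / n
    increments = [rng.gauss(0.0, (amp * h) ** 0.5) for _ in range(n)]
    tail = 0.0       # W(t) - W(u), built up from u = t back to u = 0
    integral = 0.0
    for dw in reversed(increments):
        tail += dw
        integral += tail * tail * h
    return 2.0 * kappa * integral


def sample_statistics(kappa=0.5, amp=2.0, t=3.0, n=100, realizations=4000, seed=1):
    """Sample mean of the displacement and the moment ratio <D^2>/<D>^2 over shear realizations."""
    rng = random.Random(seed)
    values = [conditional_displacement(kappa, amp, t, n, rng) for _ in range(realizations)]
    mean = sum(values) / realizations
    second = sum(v * v for v in values) / realizations
    return mean, second / mean**2


mean, ratio = sample_statistics()
print("sample mean of displacement:", mean)
print("kappa * A * t**2           :", 0.5 * 2.0 * 3.0**2)
print("moment ratio <D^2>/<D>^2   :", ratio)
```

The sample mean should lie close to $\kappa A t^2$, while the moment ratio clearly exceeds 1. The displacement therefore differs from one realization of the random shear to the next, which is the variability that the moment inequality of Section 5.2.2.1 turns into a flatness above 3.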
5.2.2.3. Properties of the non-Gaussian asymptotic PDF. From now on, we shall focus on the characterization of the broader-than-Gaussian shape of the scalar PDF which arises in the long-time limit when $t^{-2}I(t)$ is bounded. Its asymptotic shape will not depend on the mean shear flow, nor does it depend on whether the random shear is fluctuating in time according to a white noise or Ornstein–Uhlenbeck process. Rather, it depends only on some large-scale features of the initial passive scalar data $T_0(x,y)$. That is, one can define universality classes of the initial data, depending only on a few large-scale parameters, so that all initial data belonging to a particular universality class will approach a common broader-than-Gaussian PDF in the long-time limit.

First, we will separately consider the cases of purely random initial data and purely deterministic initial data. Subsequently, we will examine the long-time form of the scalar PDF when the initial data has both deterministic and random components. Finally, we describe the universality properties of the long-time scalar PDF shape in a little more detail.

Purely random initial data. If the initial data is a mean zero, homogeneous, Gaussian random field $T_0(x,y) = \tilde T_0(x,y)$, then by linearity of the advection–diffusion equation, the scalar PDF will have a symmetric form with mean zero at all times. The simplest characterization of the shape of the long-time asymptotic scalar PDF is thus the (first) flatness factor:
\[ F^* = \lim_{t\to\infty} \frac{\langle T^4(x,y,z,t) \rangle}{\langle T^2(x,y,z,t) \rangle^2} . \]
A Gaussian PDF has a flatness of 3, while the flatness of the asymptotic scalar PDF in the Random Uniform Jet model is given by the explicit formula [233]:
1
FI *"3 ?
"k"\?"k"\?((k)#(k))
dk dk (sinh((k)#(k)) . "k"\?> ("k")\?> dk dk 1 (sinh"k")(sinh"k")
(312)
The parameter $\alpha$ appearing here describes the low-wavenumber behavior of the spectrum of the initial passive scalar fluctuations:
\[ P(x,y) = \langle \tilde T_0(x',y')\,\tilde T_0(x+x',y+y') \rangle = \int_{\mathbb{R}^2} e^{2\pi i(\eta x + k y)}\,\hat P(\eta,k)\,d\eta\,dk , \]
\[ \lim_{\eta,k \to 0} \hat P(\eta,k) \sim |k|^{-\alpha}\,\phi(\eta,k) , \]
where $\phi(\eta,k)$ is a smooth function with $\phi(0,0)$ finite and positive.

One can rigorously show through calculus inequalities [207,233] that the asymptotic flatness $\tilde F^*_\alpha$ is greater than 3, and that the asymptotic scalar PDF is therefore strictly broader-than-Gaussian. In Table 14, we report some numerically computed values of $\tilde F^*_\alpha$.

The value $\alpha = 0$ is the most natural, since it corresponds to initial data for which
\[ 0 < \int_{\mathbb{R}^2} P(x,y)\,dx\,dy < \infty . \]
Table 14
Flatness of asymptotic scalar PDF in Random Uniform Jet Model (from [233])

$\alpha$      $\tilde F^*_\alpha$
-8.0          265.02
-4.0          22.84
-1.5          6.01
-0.8          4.46
0.0           3.44
0.4           3.16
0.8           3.02
This condition is satisfied for the typical case in which the initial data is predominantly positively correlated, with finite integral length scale. As we see from the table, the flatness for this class of "ordinary" initial data is 3.44, which indicates a scalar PDF which is slightly broader than a Gaussian shape but not as broad as an exponential distribution ($F = 6$).

Varying the parameter $\alpha$ corresponds, in a sense, to varying the ratio between the (initial) length scale of the passive scalar field and the length scale of the velocity field. Now, the length scale of the velocity field in the Random Uniform Jet model is strictly infinite, so one cannot literally define such a length scale ratio. But the parameter $\alpha$ does describe the strength of the large-wavelength (low-wavenumber) fluctuations in the initial data. Clearly, as $\alpha$ increases, the initial data becomes more strongly correlated over larger length scales. In particular, for $0 < \alpha < 1$, the integral length scale of the initial data is infinite, and one can relate this range of parameter values to situations in which the velocity and passive scalar field have comparable length scales. We see from Table 14 that as the long-range correlations of the passive scalar initial data strengthen ($\alpha \to 1$), the flatness factor approaches the Gaussian value of 3. On the other hand, for negative values of $\alpha$, the long-wavelength fluctuations of the passive scalar field are depleted, and the passive scalar field is initially correlated on much smaller length scales than the integral length scale of the velocity field. The flatness factor diverges rapidly as $\alpha$ decreases. Indeed, if the spectrum of passive scalar fluctuations vanishes in a neighborhood of the origin (formally, $\alpha = -\infty$), the flatness factor is infinite: the asymptotic scalar PDF has algebraic tails [207,233].
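
The reference values quoted here, $F = 3$ for a Gaussian and $F = 6$ for a PDF with exponential tails, and the random-mixture argument of Section 5.2.2.1 can be illustrated with a short sampling experiment. The sketch below is a hedged illustration only. The helper `flatness`, the sample size, the seed, and the two-valued choice of standard deviation in the mixture are assumptions introduced here, and "exponential distribution" is interpreted as the symmetric (two-sided) exponential.

```python
import random


def flatness(samples):
    """Sample flatness: fourth central moment divided by the squared second central moment."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2)


rng = random.Random(0)
n_samples = 200_000

# Plain Gaussian samples: flatness should be close to 3.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Two-sided exponential samples: flatness should be close to 6.
laplace = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(n_samples)]

# Random mixture of mean-zero Gaussians: each sample first draws its own standard deviation,
# mimicking a conditional variance that depends on the velocity realization.
mixture = [rng.gauss(0.0, rng.choice((0.5, 1.5))) for _ in range(n_samples)]

print("Gaussian flatness             :", flatness(gaussian))
print("two-sided exponential flatness:", flatness(laplace))
print("Gaussian mixture flatness     :", flatness(mixture))
```

For the mixture, the identity $F = 3\langle(\sigma^2)^2\rangle/\langle\sigma^2\rangle^2$ gives $3 \times 2.5625/1.5625 = 4.92$ for the two variances $0.25$ and $2.25$ chosen with equal probability, so the sampled value should fall well above the Gaussian value of 3.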
The qualitative paradigm we infer from the previous paragraph is that the passive scalar PDF relaxes at long times to an approximately Gaussian distribution when the length scale of the passive scalar field is comparable to or greater than that of the velocity field, but that strong scalar intermittency will persist if the passive scalar field is correlated on a much smaller length scale than that of the turbulent velocity field. From an intuitive standpoint, one can imagine that a turbulent velocity field will be able to effectively mix up a passive scalar field with larger-scale variations and produce a diffusive averaging over a wide area after a sufficient amount of time. A central limit argument suggests that the passive scalar statistics ought to then become Gaussian (see also [162]). On the other hand, when the passive scalar variations occur on scales smaller than that of the velocity field, the mixing is less efficient because the energetic large scales of the flow will mostly drag passive scalar structures around rather than chop them up. The passive scalar field will instead be distorted
through irregular small-scale straining and shearing processes, which are likely to produce or at least preserve intermittent features. This is the situation in Rayleigh–Bénard convection cells, where thin plumes of hot fluid rise and cold fluid falls through a relatively regular circulation pattern. In the "hard turbulence" regime, the temperature is found to be intermittent in the central region through which these plumes pass [55,135,155,293]. Temperature is of course not a passive scalar in these convection cells, but buoyancy plays no role in the intuitive argument.

Purely deterministic initial data. We now describe some characteristics of the asymptotic scalar PDF which arises from purely deterministic initial data $T_0(x,y) = \bar T_0(x,y)$. We assume that the total mass of the initial data is nonzero and finite:
\[ \bar M_0 \equiv \int_{\mathbb{R}^2} \bar T_0(x,y)\,dx\,dy \neq 0 . \]
In this setting, the single-point PDFs $p_T^{(x,y,z,t)}$ will vary with spatial location and will not necessarily be broader-than-Gaussian at finite times. In the Random Uniform Jet model [233], however, the passive scalar PDF at each point approaches a common and universal broader-than-Gaussian shape with flatness $\bar F^* \approx 3.52$. Moreover, the scalar PDF will exhibit a persistent asymmetry, which we may characterize by its skewness:
\[ \bar S(x,y,z,t) \equiv \frac{\langle (T(x,y,z,t) - \langle T(x,y,z,t) \rangle)^3 \rangle}{\langle (T(x,y,z,t) - \langle T(x,y,z,t) \rangle)^2 \rangle^{3/2}} . \]
In the long-time limit, the skewness everywhere approaches a common value [233]:
\[ \lim_{t\to\infty} \bar S(x,y,z,t) = \bar S^*\,\mathrm{sgn}\,\bar M_0 , \]
where $\bar S^* \approx 0.76$. Thus, the passive scalar PDF will forever remember the sign of the mass of the initial data, and will not converge to a symmetric shape.

Persistent skewness of temperature was observed in wind tunnel experiments by Sreenivasan and others [311]. The turbulence is generated by a grid, and temperature fluctuations are introduced by heating wires on the same grid, or another screen downstream. The temperature showed significant skewness relatively far downstream of the heated grid, relaxing to an approximately symmetric distribution only after 40–100 thermal mesh sizes.

Initial data with deterministic and random components. We shall finally discuss the asymptotic shape of the scalar PDF in the long-time limit when the initial data $T_0(x,y) = \bar T_0(x,y) + \tilde T_0(x,y)$ is a superposition of a mean profile $\bar T_0(x,y)$ with random fluctuations $\tilde T_0(x,y)$ with the same properties as above. As the advection–diffusion PDE is linear, the passive scalar field arising from this combined initial data will be a simple sum of the individual contributions. Because these summands are statistically correlated, the PDF $p_T^{(x,y,z,t)}(\cdot)$ of the total passive scalar field will, however, not be related to the individual PDFs in such a simple fashion.
Through explicit computations [233], we find that the long-time limit of the passive scalar PDF approaches at every point a common shape, which depends on the parameter $\alpha$. Its skewness and flatness are as follows:
• If $-3/2 < \alpha < 1$, the asymptotic skewness $S^*$ is zero, and the asymptotic flatness is $F^* = \tilde F^*_\alpha$, just as in the case of purely random initial data.
• If $\alpha < -3/2$, the asymptotic skewness $S^* = \bar S^*\,\mathrm{sgn}\,\bar M_0$ and the asymptotic flatness $F^* = \bar F^*$, just as in the case of purely deterministic initial data.
• If $\alpha = -3/2$, then the asymptotic skewness is given by
\[ \lim_{t\to\infty} S(x,y,z,t) = S^*\,\mathrm{sgn}\,\bar M_0 , \]
\[ S^* = \frac{C_3 |\bar M_0|^3 + C_4 (\kappa/A)^{1/4} |\bar M_0|\,\phi(0,0)}{\bigl(C_1 \bar M_0^2 + C_2 (\kappa/A)^{1/4} \phi(0,0)\bigr)^{3/2}} . \]
The asymptotic flatness is given by
\[ \lim_{t\to\infty} F(x,y,z,t) = F^* = \frac{C_5 \bar M_0^4 + C_6 (\kappa/A)^{1/4} \bar M_0^2\,\phi(0,0) + C_7 (\kappa/A)^{1/2} (\phi(0,0))^2}{\bigl(C_1 \bar M_0^2 + C_2 (\kappa/A)^{1/4} \phi(0,0)\bigr)^2} . \]
The numerical constants appearing in these formulas may be computed by quadrature: $C_1 \approx 0.27$, $C_2 \approx 0.50$, $C_3 \approx 0.10$, $C_4 \approx 0.74$, $C_5 \approx 0.25$, $C_6 \approx 2.55$, and $C_7 \approx 1.52$. The function $\phi(\eta,k)$ is part of the low-wavenumber description of the spectrum of initial passive scalar fluctuations (306).

We thus see a phase transition in the shape of the scalar PDF in the long-time limit. For $\alpha > -3/2$, the long-wavelength random fluctuations in the initial data are sufficiently strong that they alone determine the long-time skewness and flatness (as well as all the higher-order statistics [233]). For $\alpha < -3/2$, the mean initial scalar profile instead plays the dominant role. At the transition boundary $\alpha = -3/2$, both the mean and fluctuating components of the initial data are relevant. Moreover, for this special value of $\alpha$, the long-time scalar PDF depends upon an additional parameter:
\[ \phi(0,0)\,\kappa^{1/4} / \bigl(\bar M_0^2 A^{1/4}\bigr) , \]
which is irrelevant for all other $\alpha$. This parameter involves the relative magnitude of the large-scale variations in the mean and random components of the initial data, through $\bar M_0$ and $\phi(0,0)$ respectively, and further depends on the strength of the advection $A$ (see Eq. (307)) and of molecular diffusion $\kappa$. If the random temporal fluctuations are governed by a superposition of Ornstein–Uhlenbeck processes (308), then $A$ should be replaced by $\sum_j A_j \tau_j$.

The reason for the appearance of these factors is that at the special value of $\alpha = -3/2$, the variances of the contributions to the passive scalar field from the deterministic initial data $\bar T_0(x,y)$ and from the random initial data $\tilde T_0(x,y)$ decay at long times according to power laws $K t^{-\beta}$, with a common exponent $\beta = 3$ but different prefactors $K$. The statistics of the total passive scalar field will clearly depend on the relative magnitude of these prefactors, which in turn depends on the relative magnitude of the initial data as well as the advection–diffusion parameters. When $\alpha \neq -3/2$, the contributions from the mean and random components of the initial data decay according to power laws with distinct exponents, and thus one will asymptotically dominate the statistics of their sum.

5.2.2.4. Universality of asymptotic PDF shape. With the results on the long-time limiting form of the passive scalar PDF presented, we review the extent to which its shape is universal. We found
that the skewness and flatness depend in the long-time limit only on the following parameters in the model:
• the boundedness or divergence of the quantity $t^{-2}I(t)$, which determines whether the mean shear flow is persistent enough to drive the passive scalar PDF to Gaussianity,
• the parameter $\alpha$ characterizing the strength of the long-wavelength fluctuations of the random initial data (306) (and which roughly represents a ratio of the length scales of the velocity and passive scalar field),
• the sign of the mass of the mean profile of initial data $\bar M_0 = \int_{\mathbb{R}^2} \bar T_0(x,y)\,dx\,dy$,
• the following combination involving the strength of the random advection, the molecular diffusion, and the relative size of the low-wavenumber components of the mean and random components of the initial data:
\[ \phi(0,0)\,\kappa^{1/4} / \bigl(\bar M_0^2 A^{1/4}\bigr) . \]
This parameter is only relevant when $\alpha = -3/2$, a special value at which the random and mean components of the initial data decay at comparable rates at long times.
(It can be shown that the higher-order statistics depend on this same set of quantities [233].)

The asymptotic shape of the scalar PDF is universal with respect to all other features of the model, namely:
• The small-scale features of the initial data, i.e., anything other than the low-wavenumber asymptotics of its Fourier spectrum.
• The temporal correlation structure of the random shear $\gamma_r(t)$, whether it be governed by a white noise process or a superposition of Ornstein–Uhlenbeck processes [282]. It is of course possible that if $\gamma_r(t)$ has long-range temporal correlations, the asymptotic scalar PDF may be altered.
• The details of the temporal dynamics of the mean shear $\gamma_d(t)$, other than through the criterion involving the single quantity $I(t)$ mentioned above.
5.2.3. Derivation of results

The basis for all of the above results concerning the single-point passive scalar PDF in the Random Uniform Jet model is the ability to express the equal-time, multipoint passive scalar correlation functions
\[ P_N\bigl(\{(x^j,y^j,z^j)\}_{j=1}^N, t\bigr) \equiv \Bigl\langle \prod_{j=1}^N T(x^j,y^j,z^j,t) \Bigr\rangle \]
to all orders in terms of explicit integrals of elementary functions. In particular, all the single-point moments of the passive scalar field $\langle (T(x,y,z,t))^N \rangle$ can be computed by numerical quadrature. The derivation of the explicit formulas for the passive scalar statistics will now be illustrated for the case in which the random temporal process $\gamma_r(t)$ is white noise in time, and the initial data varies only in the shearing direction $y$:
\[ T(x,y,z,t=0) = T_0(y) = \bar T_0(y) + \tilde T_0(y) . \tag{313} \]
Our presentation will be taken for the most part from the original paper [207]. We will, at the end, indicate Resnick's modification [282] for treating Ornstein–Uhlenbeck temporal dynamics.
The reader may well note that the initial data (313) which we assume for the derivation is not actually a special case of the initial data considered in the main discussion of Section 5.2. (The quantities $\bar{M}$ and $\hat{P}(\eta, k)$ defined by Eqs. (304) and (305) would be infinite because of the absence of decay in the $x$ direction.) One consequence is that the numerical values of the asymptotic skewness and flatness factors which would be computed from the formulas derived below will differ from those presented above. We have opted to present the derivation for initial data varying in the single direction $y$ because it illustrates all the main ideas and yields formulas which are more easily derived than those which arise when the initial data varies in both the $x$ and $y$ directions. The formulas for the latter case may be found in [233].

To ease notation, we will write the coordinates of the observation points as components of a vector:

$$\vec{x} = \begin{pmatrix} x^{(1)} \\ x^{(2)} \\ \vdots \\ x^{(N)} \end{pmatrix}, \qquad
\vec{y} = \begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(N)} \end{pmatrix}, \qquad
\vec{z} = \begin{pmatrix} z^{(1)} \\ z^{(2)} \\ \vdots \\ z^{(N)} \end{pmatrix}. \tag{314}$$

To avoid confusion with our standard use of the symbol $\boldsymbol{x}$ for the vector of spatial coordinates of a single point, we have affixed an arrow, $\vec{(\ )}$, as a reminder that the indices of the vectors $\vec{x}$, $\vec{y}$, and $\vec{z}$ run over the labels of the observation points $1, \ldots, N$, and not over spatial directions. In the following derivation, differential operators such as $\nabla$ and $\Delta$ will always refer to vectors of the type (314).

5.2.3.1. PDE for multipoint correlation functions. As the random component of the Random Uniform Jet flow has white noise correlations in time, the passive scalar multipoint correlation functions of all orders obey closed diffusion equations (see Section 4.4):

$$\begin{aligned}
\frac{\partial P_N(\vec{x}, \vec{y}, \vec{z}, t)}{\partial t} + \bar{\gamma}(t)\, \vec{z} \cdot \nabla_{\vec{y}} P_N
&= \kappa (\Delta_{\vec{x}} + \Delta_{\vec{y}} + \Delta_{\vec{z}}) P_N
+ \frac{1}{2} A \sum_{j, j' = 1}^{N} V(x^{(j)}) V(x^{(j')}) \frac{\partial^2 P_N}{\partial y^{(j)} \partial y^{(j')}}, \\
P_N(\vec{x}, \vec{y}, \vec{z}, t=0) &= \Bigl\langle \prod_{j=1}^{N} T_0(y^{(j)}) \Bigr\rangle.
\end{aligned} \tag{315}$$
The advective contribution from the mean velocity field $\bar{\gamma}(t) z$ and the molecular diffusion term can be formally obtained by expanding the time derivative $\partial P_N / \partial t = (\partial / \partial t) \langle \prod_{j=1}^{N} T(x^{(j)}, y^{(j)}, z^{(j)}, t) \rangle$ and substituting in the advection-diffusion equation. The additional diffusion operator arising from the random white noise advection $V(x) \gamma(t)$ can be derived through the same techniques as used in the Rapid Decorrelation in Time model (Section 4.4), once one notes that the correlation function of the random shear is

$$\langle (V(x) \gamma(t)) (V(x') \gamma(t')) \rangle = A\, V(x) V(x')\, \delta(t - t').$$

Because the random advection is a shear flow, we could alternatively proceed by Fourier transforming the advection-diffusion equation with respect to $y$, as in Section 3.5. The molecular diffusion term is then handled through the Feynman-Kac formula. This is the approach adopted in the original paper [207].

5.2.3.2. Reformulation as quantum mechanical problem. We now transform the diffusion PDE (315) for $P_N$ into the form of a Schrödinger equation (in imaginary time) describing the evolution of a system of $N$ quantum-mechanical particles. We defer writing $V(x)$ as its linear form $x$ assumed in the Random Uniform Shear layer until later, because the quantum-mechanical formulation holds for general spatial profiles of the random shear. We begin by isolating the effects of the mean shear $\bar{\gamma}(t) z$ through the definition of a new variable

$$\vec{y}\,' = \vec{y} - \vec{z}\, \Gamma(t), \qquad \text{with} \qquad \Gamma(t) = \int_0^t \bar{\gamma}(s)\, ds, \tag{316}$$

so that $(\vec{x}, \vec{y}\,', \vec{z})$ are Lagrangian variables associated to the mean shear flow. In these "mean-Lagrangian" variables, the advection term vanishes:

$$\begin{aligned}
\frac{\partial \tilde{P}_N(\vec{x}, \vec{y}\,', \vec{z}, t)}{\partial t}
&= \kappa (\Delta_{\vec{x}} + \Delta_{\vec{y}'} + \Delta_{\vec{z}}) \tilde{P}_N
+ \kappa \Gamma^2(t) \Delta_{\vec{y}'} \tilde{P}_N
+ \frac{1}{2} A \sum_{j, j' = 1}^{N} V(x^{(j)}) V(x^{(j')}) \frac{\partial^2 \tilde{P}_N}{\partial y'^{(j)} \partial y'^{(j')}}, \\
\tilde{P}_N(\vec{x}, \vec{y}\,', \vec{z}, t=0) &= \Bigl\langle \prod_{j=1}^{N} T_0(y'^{(j)}) \Bigr\rangle.
\end{aligned} \tag{317}$$
The last two terms of the PDE explicitly indicate the enhanced diffusion along the shearing direction (in mean-Lagrangian coordinates) due to the interaction of the molecular diffusion with the mean shear and to the randomly fluctuating shear. Each of these operators is easily checked to be nonnegative definite. Note now that neither the initial data nor any coefficients of the PDE depend on $\vec{z}$. By symmetry, this implies that the solution is in fact independent of this set of variables: $\tilde{P}_N = \tilde{P}_N(\vec{x}, \vec{y}\,', t)$, and the $\Delta_{\vec{z}}$ term may be dropped.

Owing to the special shear structure of the model, we can see that a partial Fourier transform with respect to the $y'^{(j)}$ variables would conveniently convert the enhanced diffusion operators into multiplicative factors. We cannot, however, apply the ordinary Fourier transform when the initial data has homogeneous random fluctuations, because then $T_0(y)$ and $\tilde{P}_N$ will not decay at infinity. We must use a more general spectral representation, which for the initial data reads

$$\begin{aligned}
T_0(y) &= \bar{T}_0(y) + \tilde{T}_0(y), \\
\bar{T}_0(y) &= \int_{\mathbb{R}} \hat{\bar{T}}_0(k)\, e^{2 \pi i k y}\, dk, \\
\tilde{T}_0(y) &= \int_{\mathbb{R}} \hat{\tilde{T}}_0(k)\, e^{2 \pi i k y}\, d\tilde{W}(k).
\end{aligned} \tag{318}$$

The Fourier integral for $\tilde{T}_0(y)$ is a stochastic white noise integral ([341], Section 9), which we already encountered in Paragraph 3.2.2.1. For computational purposes, the white noise differential $d\tilde{W}(k)$ acts as a complex Gaussian random quantity with the formal properties:

$$\begin{aligned}
\langle d\tilde{W}(k) \rangle &= 0, \\
d\tilde{W}(-k) &= \overline{d\tilde{W}(k)}, \\
\langle d\tilde{W}(k)\, d\tilde{W}(k') \rangle &= \delta(k + k')\, dk\, dk'.
\end{aligned} \tag{319}$$

(An overbar denotes complex conjugation.) The Fourier coefficient of the random initial data is linked to its spectrum $\hat{P}(k)$ (304) by the relation $\hat{\tilde{T}}_0(k) = \sqrt{\hat{P}(|k|)}$. The initial data for the passive scalar multipoint correlation function can now be generally written as a superposition of exponentials:

$$\tilde{P}_N(\vec{x}, \vec{y}\,', t=0) = \Bigl\langle \int_{\mathbb{R}^N} e^{2 \pi i \vec{k} \cdot \vec{y}\,'} \prod_{j=1}^{N} \bigl( \hat{\bar{T}}_0(k^{(j)})\, dk^{(j)} + \hat{\tilde{T}}_0(k^{(j)})\, d\tilde{W}(k^{(j)}) \bigr) \Bigr\rangle. \tag{320}$$
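As a rough illustration of the spectral representation (318)-(319) for purely random initial data, the following Python (NumPy) sketch builds a discrete analogue: the white noise differential is replaced by independent complex Gaussians on a midpoint grid of positive wavenumbers, and the negative wavenumbers are supplied through the Hermitian symmetry in (319). The model spectrum $e^{-\pi k^2}$, the grid parameters, and all function names are assumptions made for this example only; they are not taken from [207].

```python
import numpy as np

def spectrum(k):
    """Hypothetical model spectrum; its inverse Fourier transform is exp(-pi*y**2)."""
    return np.exp(-np.pi * k**2)

def wavenumber_grid(k_max=6.0, n_modes=600):
    """Midpoints of n_modes equal cells covering 0 < k < k_max."""
    dk = k_max / n_modes
    return (np.arange(n_modes) + 0.5) * dk, dk

def sample_fields(y, n_real, rng, k_max=6.0, n_modes=600):
    """Draw n_real realizations of the discretized random field at the points y."""
    k, dk = wavenumber_grid(k_max, n_modes)
    shape = (n_real, n_modes)
    # Discrete stand-in for dW(k): zero mean, <|dW|^2> = dk, <dW^2> = 0.
    dW = np.sqrt(dk / 2.0) * (rng.standard_normal(shape)
                              + 1j * rng.standard_normal(shape))
    phases = np.exp(2j * np.pi * np.outer(k, y))  # shape (n_modes, len(y))
    # Modes at -k are the complex conjugates of those at +k, hence 2*Re(...).
    return 2.0 * np.real((dW * np.sqrt(spectrum(k))) @ phases)

def model_covariance(y, k_max=6.0, n_modes=600):
    """Exact ensemble covariance <T0(y') T0(y'+y)> of the discretized field."""
    k, dk = wavenumber_grid(k_max, n_modes)
    return 2.0 * np.sum(spectrum(k) * np.cos(2.0 * np.pi * k * y)) * dk

rng = np.random.default_rng(0)
fields = sample_fields(np.array([0.0, 0.5]), n_real=4000, rng=rng)
print("sample variance:", fields[:, 0].var(), "model:", model_covariance(0.0))
```

The ensemble covariance of the discretized field is a midpoint-rule approximation of $\int e^{2\pi i k y} \hat{P}(k)\, dk$, so for this spectrum it should be close to $e^{-\pi y^2}$; the sample statistics agree with it only up to Monte Carlo error.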
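The expectation brackets $\langle \cdot \rangle$ denote an average over the initial data, which here just amounts to averaging over the independent complex white noise differentials $\{ d\tilde{W}(k^{(j)}) \}_{j=1}^{N}$. This average can be evaluated in general through a cluster expansion and the rules (319). Because of the linearity of the PDE (317), $\tilde{P}_N$ can be represented in a similar fashion to Eq. (320) at all later times:

$$\tilde{P}_N(\vec{x}, \vec{y}\,', t) = \Bigl\langle \int_{\mathbb{R}^N} e^{2 \pi i \vec{k} \cdot \vec{y}\,'}\, Q_N(\vec{x}, \vec{k}, t) \prod_{j=1}^{N} \bigl( \hat{\bar{T}}_0(k^{(j)})\, dk^{(j)} + \hat{\tilde{T}}_0(k^{(j)})\, d\tilde{W}(k^{(j)}) \bigr) \Bigr\rangle,$$

where $Q_N$ satisfies the Fourier transformed PDE:

$$\begin{aligned}
\frac{\partial Q_N(\vec{x}, \vec{k}, t)}{\partial t}
&= \kappa \Delta_{\vec{x}} Q_N - 4 \pi^2 \kappa |\vec{k}|^2 Q_N - 4 \pi^2 \kappa \Gamma^2(t) |\vec{k}|^2 Q_N
- 2 \pi^2 A \Bigl( \sum_{j=1}^{N} k^{(j)} V(x^{(j)}) \Bigr)^2 Q_N, \\
Q_N(\vec{x}, \vec{k}, t=0) &= 1.
\end{aligned}$$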
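The undifferentiated terms with spatially constant coefficients can now be removed by an integrating factor $e^{-4 \pi^2 |\vec{k}|^2 (\kappa t + I(t))}$, with

$$I(t) = \kappa \int_0^t \Gamma^2(s)\, ds = \kappa \int_0^t \Bigl( \int_0^s \bar{\gamma}(s')\, ds' \Bigr)^2 ds. \tag{321}$$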
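The quantities $\Gamma(t)$ and $I(t)$ are simple to tabulate for any prescribed mean shear. The following is a minimal sketch, assuming a cumulative trapezoid rule on a user-supplied time grid; the function name `mean_shear_integrals` and the constant-shear example are illustrative choices and not part of the original analysis.

```python
import numpy as np

def mean_shear_integrals(gamma_bar, t, kappa):
    """Tabulate Gamma(t) (Eq. 316) and I(t) (Eq. 321) on the grid t.

    gamma_bar: callable returning the mean shear on an array of times.
    t:         increasing 1-D array of times starting at 0.
    kappa:     molecular diffusivity.
    """
    g = gamma_bar(t)
    dt = np.diff(t)
    # Cumulative trapezoid rule for Gamma(t) = int_0^t gamma_bar(s) ds.
    Gamma = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dt)))
    G2 = Gamma**2
    # Cumulative trapezoid rule for I(t) = kappa * int_0^t Gamma(s)^2 ds.
    I = kappa * np.concatenate(([0.0], np.cumsum(0.5 * (G2[1:] + G2[:-1]) * dt)))
    return Gamma, I

# Example: constant mean shear (hypothetical parameter values).
t = np.linspace(0.0, 3.0, 3001)
Gamma, I = mean_shear_integrals(lambda s: 2.0 * np.ones_like(s), t, kappa=0.5)
print(Gamma[-1], I[-1])
```

For a constant mean shear $\bar{\gamma}$ the closed forms are $\Gamma(t) = \bar{\gamma} t$ and $I(t) = \kappa \bar{\gamma}^2 t^3 / 3$, which the tabulated values should reproduce up to the quadrature error.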
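Through the above transformations, we find in the end that the passive scalar multipoint correlation function can be represented by

$$P_N(\vec{x}, \vec{y}, \vec{z}, t) = \Bigl\langle \int_{\mathbb{R}^N} e^{2 \pi i \vec{k} \cdot (\vec{y} - \Gamma(t) \vec{z})}\, e^{-4 \pi^2 |\vec{k}|^2 (\kappa t + I(t))}\, \psi_N(\vec{x}, \vec{k}, t) \prod_{j=1}^{N} \bigl( \hat{\bar{T}}_0(k^{(j)})\, dk^{(j)} + \hat{\tilde{T}}_0(k^{(j)})\, d\tilde{W}(k^{(j)}) \bigr) \Bigr\rangle, \tag{322}$$

where $\Gamma(t)$ and $I(t)$ are given by Eqs. (316) and (321), respectively, and $\psi_N(\vec{x}, \vec{k}, t)$ solves the PDE:

$$\begin{aligned}
\frac{\partial \psi_N(\vec{x}, \vec{k}, t)}{\partial t} &= \kappa \Delta_{\vec{x}} \psi_N - 2 \pi^2 A \Bigl( \sum_{j=1}^{N} k^{(j)} V(x^{(j)}) \Bigr)^2 \psi_N, \\
\psi_N(\vec{x}, \vec{k}, t=0) &= 1.
\end{aligned} \tag{323}$$

Note that the effects of the mean shear have been explicitly accounted for by $\Gamma(t)$ and $I(t)$ in Eq. (322); $\psi_N$ depends only on the molecular diffusivity and the spatial structure of the random shear. Moreover, the equation for $\psi_N$ has the form of a quantum-mechanical Schrödinger equation (in imaginary time)

$$\begin{aligned}
-\frac{\partial \psi_N(\vec{x}, \vec{k}, t)}{\partial t} &= -\kappa \Delta_{\vec{x}} \psi_N + U_N(\vec{x}, \vec{k}) \psi_N, \\
\psi_N(\vec{x}, \vec{k}, t=0) &= 1,
\end{aligned}$$

with potential

$$U_N(\vec{x}, \vec{k}) = 2 \pi^2 A \Bigl( \sum_{j=1}^{N} k^{(j)} V(x^{(j)}) \Bigr)^2.$$

We shall only discuss hereafter the case of purely random initial data ($T_0(y) = \tilde{T}_0(y)$), but the ideas carry over to handle initial data with a deterministic component as well [233]. Because $\hat{\bar{T}}_0(k)$ vanishes, the cluster expansion of $\langle \cdot \rangle$ becomes a simple Wick contraction, with the wavenumbers $\{ k^{(j)} \}$ matched in pairs [207]. We are particularly interested in the single-point moments of the passive scalar field $\langle T^N(x, y, z, t) \rangle$, and for these we find

$$\begin{aligned}
\langle T^{2N+1}(x, y, z, t) \rangle &= 0, \\
\langle T^{2N}(x, y, z, t) \rangle &= \frac{(2N)!}{2^N N!} \int_{\mathbb{R}^N} e^{-8 \pi^2 |\vec{k}|^2 (\kappa t + I(t))}\, \psi_{2N}(\vec{x}^{\,0}, \vec{k}^{\pm}, t) \prod_{j=1}^{N} \hat{P}(k^{(j)})\, dk^{(j)}.
\end{aligned} \tag{324}$$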
The arguments $\vec{x}^{\,0}$ and $\vec{k}^{\pm}$ of $\psi_{2N}$ are shorthand notation for the following collections of $2N$ variables:

$$\vec{x}^{\,0} = \begin{pmatrix} x \\ x \\ \vdots \\ x \end{pmatrix}, \qquad
\vec{k}^{\pm} = \begin{pmatrix} k^{(1)} \\ \vdots \\ k^{(N)} \\ -k^{(1)} \\ \vdots \\ -k^{(N)} \end{pmatrix}.$$

5.2.3.3. Solution of quantum mechanical problem. We have shown how to explicitly represent the moments of the passive scalar field in terms of the solution $\psi_N$ to the quantum-mechanical problem (323). We now insert the explicit form of the random shear profile $V(x) = x$ in the Random Uniform Jet model into this equation:

$$\begin{aligned}
\frac{\partial \psi_N(\vec{x}, \vec{k}, t)}{\partial t} &= \kappa \Delta_{\vec{x}} \psi_N - 2 \pi^2 A (\vec{k} \cdot \vec{x})^2 \psi_N, \\
\psi_N(\vec{x}, \vec{k}, t=0) &= 1.
\end{aligned} \tag{325}$$

The potential in the Schrödinger operator on the right-hand side is thus quadratic, corresponding to a certain harmonic oscillator potential for the collective motion of $N$ quantum particles. The potential is effectively one-dimensional, depending only on the weighted sum of the spatial coordinates $\vec{k} \cdot \vec{x}$, so Eq. (325) can be mapped onto the one-dimensional harmonic oscillator problem:

$$\frac{\partial \psi(x, t)}{\partial t} = \frac{\partial^2 \psi}{\partial x^2} - x^2 \psi, \qquad \psi(x, t=0) = 1.$$

An exact solution to this PDE in terms of elementary functions is given by Mehler's formula [296]:

$$\psi(x, t) = (\cosh(2t))^{-1/2}\, e^{-\frac{1}{2} \tanh(2t)\, x^2}.$$

The solution to the $N$-particle equation (325) is then expressed in terms of $\psi$ as follows:

$$\psi_N(\vec{x}, \vec{k}, t) = \psi\bigl( (2 \pi^2 \kappa^{-1} A)^{1/4} |\vec{k}|^{-1/2} (\vec{k} \cdot \vec{x}),\ (2 \pi^2 \kappa A)^{1/2} |\vec{k}|\, t \bigr).$$
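Both statements can be checked numerically. The sketch below, written in Python (NumPy) with illustrative function names and arbitrarily chosen parameter values, evaluates centered finite-difference residuals of the one-dimensional oscillator equation for Mehler's formula and of the $N$-particle equation with quadratic potential $2 \pi^2 A (\vec{k} \cdot \vec{x})^2$ for the rescaled solution. It is a consistency check under these assumptions, not a derivation.

```python
import numpy as np

def mehler(x, t):
    """Mehler-type solution of psi_t = psi_xx - x^2 psi with psi(x, 0) = 1."""
    return np.cosh(2.0 * t) ** -0.5 * np.exp(-0.5 * np.tanh(2.0 * t) * x**2)

def residual_1d(x, t, h=1e-4):
    """Finite-difference residual of the one-dimensional oscillator PDE."""
    psi_t = (mehler(x, t + h) - mehler(x, t - h)) / (2.0 * h)
    psi_xx = (mehler(x + h, t) - 2.0 * mehler(x, t) + mehler(x - h, t)) / h**2
    return psi_t - (psi_xx - x**2 * mehler(x, t))

def psi_N(x, k, t, kappa, A):
    """N-particle solution obtained by rescaling the one-dimensional one."""
    knorm = np.linalg.norm(k)
    xi = (2.0 * np.pi**2 * A / kappa) ** 0.25 * knorm**-0.5 * np.dot(k, x)
    s = (2.0 * np.pi**2 * kappa * A) ** 0.5 * knorm * t
    return mehler(xi, s)

def residual_N(x, k, t, kappa, A, h=1e-4):
    """Finite-difference residual of the N-particle equation with V(x) = x."""
    psi_t = (psi_N(x, k, t + h, kappa, A) - psi_N(x, k, t - h, kappa, A)) / (2.0 * h)
    lap = 0.0
    for j in range(len(x)):
        e = np.zeros(len(x))
        e[j] = h
        lap += (psi_N(x + e, k, t, kappa, A) - 2.0 * psi_N(x, k, t, kappa, A)
                + psi_N(x - e, k, t, kappa, A)) / h**2
    potential = 2.0 * np.pi**2 * A * np.dot(k, x) ** 2
    return psi_t - (kappa * lap - potential * psi_N(x, k, t, kappa, A))

# Hypothetical test values, chosen only for illustration.
r1 = np.max(np.abs(residual_1d(np.linspace(-2.0, 2.0, 9), 0.3)))
rN = abs(residual_N(np.array([0.2, 0.1]), np.array([0.3, -0.5]), 0.5, kappa=0.7, A=0.4))
print(r1, rN)
```

Both residuals should be small, limited only by the truncation and round-off error of the difference quotients.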
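Substitution of this exact solution into Eq. (324) then yields the following exact formulas for the single-point scalar moments:

$$\begin{aligned}
\langle T^{2N+1}(x, y, z, t) \rangle &= 0, \\
\langle T^{2N}(x, y, z, t) \rangle &= \frac{(2N)!}{2^N N!} \int_{\mathbb{R}^N} e^{-8 \pi^2 |\vec{k}|^2 (\kappa t + I(t))} \bigl( \cosh( 2 \pi (2 \kappa A)^{1/2} |\vec{k}^{\pm}|\, t ) \bigr)^{-1/2} \prod_{j=1}^{N} \hat{P}(k^{(j)})\, dk^{(j)},
\end{aligned} \tag{326}$$

where $|\vec{k}^{\pm}|^2 = 2 |\vec{k}|^2$ by the definition of $\vec{k}^{\pm}$.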
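Formula (326) can be evaluated directly for the second and fourth moments. The following Python (NumPy) sketch does this by tensor-product trapezoid quadrature and returns the flatness $\langle T^4 \rangle / \langle T^2 \rangle^2$ at a given time. The spectrum $\hat{P}(k) = e^{-k^2}$, the parameter values, and the names `log_cosh` and `flatness` are assumptions made for this illustration; $I(t)$ is passed in as a plain number, and the relation $|\vec{k}^{\pm}| = \sqrt{2}\, |\vec{k}|$ is used in the hyperbolic cosine.

```python
import numpy as np

def log_cosh(x):
    """Overflow-safe log(cosh(x))."""
    x = np.abs(x)
    return x + np.log1p(np.exp(-2.0 * x)) - np.log(2.0)

def flatness(t, kappa=0.1, A=1.0, I_t=0.0, k_max=5.0, n=801):
    """Flatness <T^4>/<T^2>^2 from the moment formula, hypothetical spectrum exp(-k^2)."""
    k = np.linspace(-k_max, k_max, n)
    w = np.full(n, k[1] - k[0])
    w[0] *= 0.5
    w[-1] *= 0.5                                   # trapezoid weights
    P = np.exp(-k**2)                              # assumed model spectrum
    # Argument of cosh is c*|k|, with |k^pm| = sqrt(2)*|k|.
    c = 2.0 * np.pi * np.sqrt(2.0 * kappa * A) * np.sqrt(2.0) * t

    def factor(knorm):
        damping = 8.0 * np.pi**2 * knorm**2 * (kappa * t + I_t)
        return np.exp(-damping - 0.5 * log_cosh(c * knorm))

    m2 = np.sum(w * P * factor(np.abs(k)))         # N = 1, prefactor 2!/(2*1!) = 1
    K1, K2 = np.meshgrid(k, k, indexing="ij")
    m4 = 3.0 * np.sum(np.outer(w * P, w * P) * factor(np.hypot(K1, K2)))  # N = 2
    return m4 / m2**2

for time in (0.0, 0.1, 0.5):
    print(time, flatness(time))
```

At $t = 0$ the integrals factorize and the flatness equals the Gaussian value 3 of the initial data; at later times the hyperbolic cosine factor no longer factorizes over the wavenumbers and the computed flatness exceeds 3.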
5.2.3.4. Derivation of properties of scalar PDF. The explicit formula (326) now permits direct analysis of the passive scalar moments in the Random Uniform Jet model. The fact that the scalar PDF is broader-than-Gaussian at finite times follows from the following calculus inequality:

$$\bigl( \operatorname{sech}(|\vec{k}|) \bigr)^{1/2} \ge \prod_{j=1}^{N} \bigl( \operatorname{sech}(|k^{(j)}|) \bigr)^{1/2},$$

as established in [207]. The long-time limit of the flatness factor

$$F^* = \lim_{t \to \infty} \frac{\langle T^4(x, y, z, t) \rangle}{\langle T^2(x, y, z, t) \rangle^2}$$

follows from a straightforward asymptotic consideration of Eq. (326):

$$F^* = \begin{cases}
3 & \text{if } t^{-2} I(t) \to \infty \text{ as } t \to \infty, \\
\tilde{F}^*_\alpha > 3 & \text{if } t^{-2} I(t) \text{ is bounded},
\end{cases}$$

where

$$\tilde{F}^*_\alpha = 3\, \frac{\displaystyle \int_{\mathbb{R}^2} \frac{|k|^{-\alpha} |k'|^{-\alpha}\, dk\, dk'}{\bigl( \cosh( (|k|^2 + |k'|^2)^{1/2} ) \bigr)^{1/2}}}{\displaystyle \int_{\mathbb{R}^2} \frac{|k|^{-\alpha} |k'|^{-\alpha}\, dk\, dk'}{\bigl( \cosh(|k|) \cosh(|k'|) \bigr)^{1/2}}}.$$

Recall that $\alpha$ describes the behavior of the initial passive scalar spectrum $\hat{P}(k)$ near $k = 0$ (see Eq. (306)). The flatness factors $\tilde{F}^*_\alpha$ may be readily and accurately evaluated through a numerical quadrature calculation.

The procedure described above for deriving explicit formulas for the single-point moments of the passive scalar field can be generalized to handle initial data varying in the $x$ and $z$ as well as $y$ directions, and with deterministic and random components. The formulas will differ in some details (cf. Eq. (312)); see [207,233] for details.

5.2.3.5. Ornstein-Uhlenbeck temporal fluctuations. We shall finally indicate how exact formulas for the passive scalar statistics can be obtained when the temporal fluctuations $\gamma(t)$ are given by a superposition of Ornstein-Uhlenbeck processes, rather than a white noise process as assumed above. Complete details may be found in the thesis of Resnick [282]. To communicate the main
idea, it suffices to consider $\gamma(t)$ as a single Ornstein-Uhlenbeck process, with correlation function

$$\langle \gamma(t) \gamma(t + \tau) \rangle = A\, e^{-|\tau| / \tau_\gamma}. \tag{327}$$
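A stationary process with the correlation function (327) is easy to simulate, which gives a quick sanity check on the parameters. The sketch below is written in Python (NumPy) and assumes the standard linear stochastic differential equation for an Ornstein-Uhlenbeck process with correlation time $\tau_\gamma$ and stationary variance $A$, advanced with the exact one-step update of that equation. The function names and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_ou(A, tau, dt, n_steps, rng):
    """Sample a stationary Ornstein-Uhlenbeck path with variance A and correlation time tau."""
    decay = np.exp(-dt / tau)
    kick = np.sqrt(A * (1.0 - decay**2))      # keeps the variance equal to A
    gamma = np.empty(n_steps)
    gamma[0] = np.sqrt(A) * rng.standard_normal()   # start in the stationary law
    noise = rng.standard_normal(n_steps - 1)
    for n in range(n_steps - 1):
        gamma[n + 1] = decay * gamma[n] + kick * noise[n]
    return gamma

def autocovariance(gamma, lag):
    """Time-averaged estimate of <gamma(t) gamma(t + lag*dt)> for a zero-mean path."""
    return np.mean(gamma[: len(gamma) - lag] * gamma[lag:])

rng = np.random.default_rng(1)
A, tau, dt = 2.0, 1.0, 0.1                    # hypothetical values
path = simulate_ou(A, tau, dt, n_steps=200_000, rng=rng)
for lag in (0, 10, 20):
    print(lag * dt, autocovariance(path, lag), A * np.exp(-lag * dt / tau))
```

The time-averaged estimates should track $A e^{-|\tau| / \tau_\gamma}$ up to sampling error, which shrinks as the path is made longer.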
The process $\gamma(t)$ may be described as the solution to the linear stochastic differential equation

$$d\gamma(t) = -\tau_\gamma^{-1} \gamma(t)\, dt + \sigma_\gamma\, dW_\gamma(t),$$

with $W_\gamma(t)$ a standard Brownian motion, $\gamma(0)$ a Gaussian random variable with mean zero and variance $A$, and $\sigma_\gamma = (2 A \tau_\gamma^{-1})^{1/2}$.

As discussed in Section 4, the reason closed equations can be written down for the passive scalar multipoint correlation functions when the velocity field is Gaussian and delta-correlated in time is that the random advection of tracer trajectories can be expressed as coupled Brownian motions. In the Random Uniform Jet model with white noise correlations, for example, the equations of motion for the locations $\{ (X^{(j)}(t), Y^{(j)}(t), Z^{(j)}(t)) \}_{j=1}^{N}$ of $N$ tracers are:

$$\begin{aligned}
dX^{(j)}(t) &= \sqrt{2 \kappa}\, dW_x^{(j)}(t), \\
dY^{(j)}(t) &= \bar{\gamma}(t) Z^{(j)}(t)\, dt + \sqrt{A}\, V(X^{(j)}(t))\, dW_\gamma(t) + \sqrt{2 \kappa}\, dW_y^{(j)}(t), \\
dZ^{(j)}(t) &= \sqrt{2 \kappa}\, dW_z^{(j)}(t),
\end{aligned}$$

where $W_\gamma(t)$ and $\{ W_x^{(j)}(t), W_y^{(j)}(t), W_z^{(j)}(t) \}_{j=1}^{N}$ are independent standard Brownian motions describing the random shear flow and the molecular diffusion, respectively; the factor $\sqrt{A}$ reflects the white noise correlation $\langle \gamma(t) \gamma(t') \rangle = A\, \delta(t - t')$. The diffusion equation (315) follows by the general relation between the statistics of $N$ tracer trajectories and the $N$th order passive scalar correlation function (see, for example, [185,244]).

With an Ornstein-Uhlenbeck law for the fluctuating shear, the trajectory equations instead read

$$\begin{aligned}
dX^{(j)}(t) &= \sqrt{2 \kappa}\, dW_x^{(j)}(t), \\
dY^{(j)}(t) &= \bar{\gamma}(t) Z^{(j)}(t)\, dt + V(X^{(j)}(t)) \gamma(t)\, dt + \sqrt{2 \kappa}\, dW_y^{(j)}(t), \\
dZ^{(j)}(t) &= \sqrt{2 \kappa}\, dW_z^{(j)}(t), \\
d\gamma(t) &= -\tau_\gamma^{-1} \gamma(t)\, dt + \sigma_\gamma\, dW_\gamma(t).
\end{aligned}$$

Note that these equations are again described in terms of independent Brownian motions $W_\gamma(t)$ and $\{ W_x^{(j)}(t), W_y^{(j)}(t), W_z^{(j)}(t) \}_{j=1}^{N}$, and therefore we can still write down a closed diffusion equation for the passive scalar multipoint correlation function. The main difference from the white noise case is that an extra stochastic differential equation is required to describe the dynamics of $\gamma(t)$ in terms of a Brownian motion. This fact manifests itself in the need to introduce an additional variable in the diffusion PDE. Namely, with the random shear fluctuating according to an Ornstein-Uhlenbeck process (327), the passive scalar multipoint correlation function may be expressed as follows [282]:

$$P_N(\vec{x}, \vec{y}, \vec{z}, t) = \int_{\mathbb{R}} \Psi_N(\vec{x}, \vec{y}, \vec{z}, \zeta, t)\, \frac{e^{-\zeta^2 / 2A}}{\sqrt{2 \pi A}}\, d\zeta, \tag{328}$$
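where $\Psi_N$ satisfies the diffusion equation

$$\begin{aligned}
\frac{\partial \Psi_N(\vec{x}, \vec{y}, \vec{z}, \zeta, t)}{\partial t} + \bar{\gamma}(t)\, \vec{z} \cdot \nabla_{\vec{y}} \Psi_N + \zeta \sum_{j=1}^{N} V(x^{(j)}) \frac{\partial \Psi_N}{\partial y^{(j)}} + \tau_\gamma^{-1} \zeta \frac{\partial \Psi_N}{\partial \zeta}
&= \kappa (\Delta_{\vec{x}} + \Delta_{\vec{y}} + \Delta_{\vec{z}}) \Psi_N + \frac{1}{2} \sigma_\gamma^2 \frac{\partial^2 \Psi_N}{\partial \zeta^2}, \\
\Psi_N(\vec{x}, \vec{y}, \vec{z}, \zeta, t=0) &= \Bigl\langle \prod_{j=1}^{N} T_0(y^{(j)}) \Bigr\rangle.
\end{aligned}$$

The auxiliary variable $\zeta$ keeps track of the dynamics of the temporal process $\gamma(t)$, and Eq. (328) comes from a weighting with respect to the distribution of $\gamma(0)$.

One can now proceed with the same transformations as in the white noise case to represent $\Psi_N$ as follows:

$$\Psi_N(\vec{x}, \vec{y}, \vec{z}, \zeta, t) = \Bigl\langle \int_{\mathbb{R}^N} e^{2 \pi i \vec{k} \cdot (\vec{y} - \Gamma(t) \vec{z})}\, e^{-4 \pi^2 |\vec{k}|^2 (\kappa t + I(t))}\, \psi_N^{OU}(\vec{x}, \vec{k}, \zeta, t) \prod_{j=1}^{N} \bigl( \hat{\bar{T}}_0(k^{(j)})\, dk^{(j)} + \hat{\tilde{T}}_0(k^{(j)})\, d\tilde{W}(k^{(j)}) \bigr) \Bigr\rangle,$$

where

$$\begin{aligned}
\frac{\partial \psi_N^{OU}(\vec{x}, \vec{k}, \zeta, t)}{\partial t} + \tau_\gamma^{-1} \zeta \frac{\partial \psi_N^{OU}}{\partial \zeta}
&= \kappa \Delta_{\vec{x}} \psi_N^{OU} + \frac{1}{2} \sigma_\gamma^2 \frac{\partial^2 \psi_N^{OU}}{\partial \zeta^2} - 2 \pi i \zeta \Bigl( \sum_{j=1}^{N} k^{(j)} V(x^{(j)}) \Bigr) \psi_N^{OU}, \\
\psi_N^{OU}(\vec{x}, \vec{k}, \zeta, t=0) &= 1.
\end{aligned}$$

After a further transformation designed to clear the first derivative term from the PDE,

$$\psi_N^{OU}(\vec{x}, \vec{k}, \zeta, t) = \exp\bigl[ (t + \zeta^2 / \sigma_\gamma^2) / 2 \tau_\gamma \bigr]\, \tilde{\psi}_N^{OU}(\vec{x}, \vec{k}, \zeta, t),$$

we arrive at a PDE in quantum-mechanical Schrödinger form

$$\begin{aligned}
\frac{\partial \tilde{\psi}_N^{OU}(\vec{x}, \vec{k}, \zeta, t)}{\partial t}
&= \kappa \Delta_{\vec{x}} \tilde{\psi}_N^{OU} + \frac{1}{2} \sigma_\gamma^2 \frac{\partial^2 \tilde{\psi}_N^{OU}}{\partial \zeta^2}
- \Bigl[ 2 \pi i \zeta \Bigl( \sum_{j=1}^{N} k^{(j)} V(x^{(j)}) \Bigr) + \frac{\zeta^2}{2 \sigma_\gamma^2 \tau_\gamma^2} \Bigr] \tilde{\psi}_N^{OU}, \\
\tilde{\psi}_N^{OU}(\vec{x}, \vec{k}, \zeta, t=0) &= e^{-\zeta^2 / 2 \sigma_\gamma^2 \tau_\gamma}.
\end{aligned} \tag{329}$$

In the Random Uniform Jet model $V(x) = x$, and the effective potential in the Schrödinger operator,

$$U_N^{OU}(\vec{x}, \vec{k}, \zeta) = (2 \pi i \zeta\, \vec{k} \cdot \vec{x}) + \frac{\zeta^2}{2 \sigma_\gamma^2 \tau_\gamma^2},$$

is again a quadratic form which varies only along a single direction in $(\vec{x}, \zeta)$ space. Hence, after a (complex) rotation of coordinates, Eq. (329) may be mapped onto the one-dimensional harmonic oscillator problem, and thus expressed in terms of Mehler's formula. Details and generalizations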
may be found in Resnick's thesis [282]. It is also shown there that the long-time limit of the single-point passive scalar moments is independent of whether the random temporal fluctuations $\gamma(t)$ are governed by white noise (307) or a superposition of Ornstein-Uhlenbeck processes (308). (One need only equate the quantity $A$ for the white noise case to $\sum_j A_j \tau_j$ for a superposition of Ornstein-Uhlenbeck processes.)

5.3. An example with qualitative finite-time corrections to the homogenized limit

We have seen how the exactly solvable Random Uniform Jet model permits the exploration of a number of issues pertaining to large-scale intermittency, such as the degree of universality of the scalar PDF in the long-time limit and the sensitivity to parameters such as the relative length scales of the velocity and passive scalar fields. As the uniform shear velocity field has an infinite length scale, we have up to now only been considering situations in which the length scale of the passive scalar field is small or comparable to the length scale of the velocity field. We shall now modify the spatial structure $V(x)$ of the random velocity field in the general jet model (303) to be a periodic function, so that we may also study long-time scalar intermittency properties when the length scale of the passive scalar field is much larger than that of the velocity field. This is a realm in which homogenization theory, as discussed in Section 2, can be applied. Given a velocity field with finite periodicity and/or short-ranged randomness, homogenization theory furnishes a rigorous asymptotic description of the passive scalar field under a large-scale rescaling of the initial data $T_0(\boldsymbol{x}) \to T_0(\varepsilon \boldsymbol{x})$ and a diffusively linked long-time limit $t \to t / \varepsilon^2$, with $\varepsilon \to 0$.

While this is a particularly relevant asymptotic limit to consider in many applications in which the velocity varies on much smaller scales than the scalar field, the real issue is the behavior of the passive scalar field at large but finite times and length scales. The rigorous asymptotic theory guarantees a certain abstract mode of convergence of the rescaled passive scalar statistics to the homogenized limit, but this by no means implies that homogenization theory describes the large-scale, long-time passive scalar statistics in a uniformly approximate way. We examined this question for the mean and variance of the displacement of a single tracer in some deterministic periodic flows in Section 2.3, and did find good agreement with the homogenized description at finite times [231]. The accuracy of the homogenization approximation at finite time for a quite different statistical aspect, the one-point scalar PDF, was explored by Bronski and McLaughlin [51] in a spatially periodic version of the Random Uniform Jet model discussed in Section 5.2. This Random Spatially Periodic Shear model will be defined in detail in Section 5.3.1.

In this random velocity field model, two qualitative departures of the scalar PDF from a straightforward homogenization picture emerge. In the Random Spatially Periodic Shear model, the passive scalar statistics become Gaussian in the homogenized large-scale, long-time limit, as one might expect from a central limit argument. The long-time limit of the scalar PDF evolving from initial data with large but fixed length scale, however, can become increasingly intermittent with flatness factors diverging in the long-time limit. The disagreement with the homogenized result implies that the limits of long times $t \to t / \varepsilon^2$ and large-scale variation in the initial data $T_0(\boldsymbol{x}) \to T_0(\varepsilon \boldsymbol{x})$ do not commute (Section 5.3.2). Moreover, even when the moments of the scalar PDF do approach their Gaussian values in the long-time limit, their convergence is very nonuniform.
The relaxation time grows quadratically with the order of the moment, indicating that while the scalar PDF develops a Gaussian core at long times, it will always exhibit broader-than-Gaussian tails
sufficiently far out (Section 5.3.3). The Random, Spatially Periodic Shear model teaches us that the passive scalar statistics evolving from fixed initial data can exhibit qualitative departures from homogenization theory at large but finite times.

The passive scalar statistics in the Random, Spatially Periodic Shear model cannot be represented as fully explicit quadrature expressions in the same way as in the Random Uniform Shear model. Nonetheless, the moments of the single-point scalar PDF at large but finite times may be computed precisely through a perturbation theory applied to quantum-mechanical analogies similar to those discussed in Section 5.2.3. In this way, it can be directly proved that the scalar PDF arising from mean zero, Gaussian, homogeneous, random initial data is broader-than-Gaussian at all later finite times. The calculations for this and the other results we shall describe are presented in full detail in [51], and will not be reproduced here other than for a few brief remarks in Section 5.3.4.

5.3.1. Random, Spatially Periodic Shear Model

The Random, Spatially Periodic Shear velocity field model considered by Bronski and McLaughlin is a shear flow:

$$\boldsymbol{v}(\boldsymbol{x}, t) = \boldsymbol{v}(x, y, t) = \begin{pmatrix} 0 \\ \gamma(t) V(x) \end{pmatrix},$$

with a deterministic spatial profile $V(x)$ of period one (in nondimensionalized units), and white noise temporal fluctuations:

$$\langle \gamma(t) \gamma(t + \tau) \rangle = A\, \delta(\tau).$$

The profile $V(x)$ will be assumed to be suitably normalized; the amplitude of the shear flow will be measured by $A$. Like the Random Uniform Jet model, the Random, Spatially Periodic Shear Flow is an element of the class of general jet models (303), but there is no mean shear, and the $z$ dimension has been omitted. The most fundamental difference between the two models is that the Random Uniform Jet velocity field has an infinite length scale of variation, while the Random, Spatially Periodic Shear flow has a finite period length scale.

The advection-diffusion equation for the present model reads

$$\begin{aligned}
\frac{\partial T(x, y, t)}{\partial t} + \gamma(t) V(x) \frac{\partial T(x, y, t)}{\partial y} &= \kappa \Delta T(x, y, t), \\
T(x, y, t=0) &= T_0(y),
\end{aligned}$$

where $T_0(y)$ is a mean zero, Gaussian, homogeneous random field, and is assumed for simplicity to only vary in the shearing direction. We shall now consider two types of spectra for the random initial data which will illustrate some possibilities for the long-time behavior of the single-point passive scalar PDF.

5.3.2. Persistent intermittency for initial data with no long-wavelength fluctuations

We shall first show that the long-time limit of the passive scalar field arising from fixed initial data can differ sharply from the homogenized limit in which the long-time limit is linked with a large-scale rescaling of the initial data. Suppose that the initial data $T_0(y)$ is a mean zero,
Gaussian, homogeneous, random field with spectrum supported above a fixed, positive wavenumber $k_0$:

$$P(y) = \langle T_0(y') T_0(y' + y) \rangle = \int_{\mathbb{R}} e^{2 \pi i k y} \hat{P}(k)\, dk, \qquad
\hat{P}(k) \begin{cases} = 0 & \text{for } |k| \le k_0, \\ > 0 & \text{for } |k| > k_0. \end{cases} \tag{330}$$

We will consider the long-time limiting shape of the single-point scalar PDF. To diagnose the deviation of this PDF from a Gaussian form, we will use not only the flatness factor $F(x, y, t)$ discussed in Section 5.2, but all the higher order flatness factors $F_N(x, y, t)$ as well:

$$F_N(x, y, t) \equiv \frac{\langle (T(x, y, t) - \langle T(x, y, t) \rangle)^{2N} \rangle}{\langle (T(x, y, t) - \langle T(x, y, t) \rangle)^2 \rangle^N} = \frac{\langle T^{2N}(x, y, t) \rangle}{\langle T^2(x, y, t) \rangle^N},$$

with the last equality holding because the mean passive scalar field vanishes. Note that $F_1(x, y, t) \equiv 1$ and $F_2(x, y, t) = F(x, y, t)$. The value of the flatness factor for a Gaussian distribution is

$$F_N^G = \frac{(2N)!}{2^N N!}.$$

An asymptotic computation for the long-time behavior of the flatness factors yields [51]:

$$F_N(x, y, t) = F_N^G\, e^{\delta\lambda_N t} \bigl[ 1 + O(A \kappa^{-1} k_0^2) + O(e^{-4 \pi^2 \kappa t}) \bigr], \tag{331}$$

with

$$\delta\lambda_N = C_V N (N - 1) A^2 \kappa^{-1} k_0^4 + O(A^3 \kappa^{-2} k_0^6),$$

and $C_V$ is a positive numerical constant depending only on the structure of the periodic shear profile $V(x)$. For sufficiently small but nonzero $k_0$, the flatness factors are clearly growing without bound as time progresses, so the passive scalar PDF is approaching a shape with broad algebraic tails.

Initial data of the form (330) also give rise to diverging flatness factors of the scalar PDF in the long-time limit of the Random Uniform Jet model (when the mean shear is weak) (see Section 5.2). There, we could intuitively understand this strong scalar intermittency as arising from an extreme ($\alpha = \infty$) situation in which the passive scalar field is correlated on much smaller scales than the velocity field. This interpretation does not explain the scalar intermittency in the present Random, Spatially Periodic Shear model, because the ratio of the passive scalar length scale $k_0^{-1}$ to the velocity length scale 1 is assumed large! According to our discussion in Section 5.2 as well as the prediction of homogenization theory, we might well have expected the passive scalar statistics to approach a Gaussian distribution when $k_0$ is taken very small.

The resolution of this seeming paradox is as follows. If we homogenize the passive scalar field by rescaling the initial data $T_0(y) \to T_0(\varepsilon y)$ and time by $t \to t / \varepsilon^2$, then the lowest wavenumber of the
initial passive scalar spectrum is decreased as $k_0 \to k_0 \varepsilon$. Formula (331) applies perfectly well as $\varepsilon \to 0$, and with these replacements:

$$\delta\lambda_N(\varepsilon) = C_V N (N - 1) \varepsilon^4 A^2 \kappa^{-1} k_0^4 + O(\varepsilon^6 A^3 \kappa^{-2} k_0^6).$$

Consequently,

$$\lim_{\varepsilon \to 0} F_N(x, y, t / \varepsilon^2) = \lim_{\varepsilon \to 0} F_N^G\, e^{\delta\lambda_N(\varepsilon)\, t / \varepsilon^2} \bigl( 1 + O(\varepsilon^2 A \kappa^{-1} k_0^2) + O(e^{-4 \pi^2 \varepsilon^{-2} \kappa t}) \bigr) = F_N^G.$$

Thus, the homogenized result is indeed consistent with the formula for the flatness factors (331). The strong qualitative discrepancy between their predictions is due to the fact that the homogenized limit links the large time with the large space scale of variation of the initial data. In an actual experiment or simulation, however, one is usually interested in fixing a large space scale for the initial data, and then looking at the long-time limit. The fact that these limit processes disagree means that the long-time limit does not commute with the limit of large-scale spatial variation of the initial data. Homogenization theory studies a particular large-scale, long-time limit, which may or may not describe the large-scale, long-time limit of interest in a certain application. We have seen explicitly how the passive scalar statistics may manifest strong and persistent intermittency despite the fact that their homogenized limit is Gaussian.

5.3.3. Non-uniform relaxation to Gaussian PDF for initial data with finite, nonzero intensity of long-wavelength fluctuations

We now present another way in which the long-time statistical behavior of the passive scalar field may differ qualitatively from that predicted by homogenization theory. Suppose the initial, Gaussian, homogeneous random passive scalar data has a spectrum $\hat{P}(k)$ which is smooth and nonvanishing at the origin:

$$P(y) = \langle T_0(y') T_0(y' + y) \rangle = \int_{\mathbb{R}} e^{2 \pi i k y} \hat{P}(k)\, dk, \qquad \hat{P}(0) \ne 0.$$

The initial passive scalar data then has fluctuations at arbitrarily large wavelengths, while the velocity field has a fixed period length $L_v = 1$. One may therefore expect that the passive scalar statistics should relax to a Gaussian form in the long-time limit, either by the homogenization result or by letting $k_0 \to 0$ in Eq. (331). A precise calculation shows that indeed, the flatness factors converge to their Gaussian values in the long-time limit:

$$\lim_{t \to \infty} F_N(x, y, t) = F_N^G.$$

To examine how rapidly the scalar PDF converges to the asymptotic Gaussian shape in the long-time limit, we keep the leading-order correction [51]:

$$F_N(x, y, t) = F_N^G + \frac{N (N - 1) \tilde{C}_V}{\kappa t} + O(t^{-2}),$$
where $\tilde{C}_V$ is a positive numerical constant depending on $A / \kappa$ and the periodic velocity field structure $V(x)$. (The further $O(t^{-2})$ corrections also depend on the low-wavenumber spectrum of the initial passive scalar data.)

An important observation is that the correction is not uniformly small with respect to the order of the flatness factor $N$. The time scale needed for $F_N(x, y, t)$ to approach its Gaussian value grows quadratically with the order of the moment $N$. Thus, at any large but finite time, the low-order flatness factors of the scalar PDF will be close to their Gaussian values, but moments of sufficiently high order ($N \gtrsim (\kappa \tilde{C}_V^{-1} t)^{1/2}$) will have significantly super-Gaussian values. Pictorially, this means that the scalar PDF at large times has a Gaussian core with broader-than-Gaussian tails. As time evolves, the broad tails become ever more remote relative to the core, that is, become noticeable beginning at an ever-larger number of standard deviations away from the origin. As $t \to \infty$, these broader-than-Gaussian tails get squeezed off to infinity, leaving behind a purely Gaussian limiting distribution. The convergence of the scalar PDF to its homogenized Gaussian limit is thus very nonuniform in the tail regions. This exact result for the present model is consistent with general conclusions drawn by Gao [111] through consideration of a mapping closure approximation to the evolution of the scalar PDF (see Paragraph 5.4.1.1).

We remark that $\tilde{C}_V$ is an increasing, bounded function of the ratio $A / \kappa$, which characterizes the relative strength of turbulent advection and molecular diffusion, and thus serves as a Péclet number in the Random, Spatially Periodic Shear Model. The finite-time corrections to homogenization theory are thus most evident in this model at high Péclet number.

An instance of slow convergence of higher order flatness factors to their Gaussian values is reported in the direct numerical simulations of Eswaran and Pope [91].
The passive scalar field is initialized as a random field assuming values $\pm 1$ over patches of a specified length scale, and allowed to evolve in a statistically stationary turbulent flow. The second and third order flatness factors $F_2(t)$ and $F_3(t)$ of the scalar PDF relax to their Gaussian values only after 6 to 8 large-scale eddy turnover times, by which point the scalar variance has decayed to a small fraction of its initial value.

5.3.4. Remarks on associated quantum mechanics problem

We close with some brief comments concerning the derivation of the above results. As we showed in Section 5.2.3, the quantum mechanics problems which arise in analysis of the scalar moments $\langle (T(x, y, t))^{2N} \rangle$ in the Random, Spatially Periodic Shear Model involve $2N$ particles and read

$$\frac{\partial \psi_{2N}(\vec{x}, \vec{k}, t)}{\partial t} = \kappa \Delta_{\vec{x}} \psi_{2N}(\vec{x}, \vec{k}, t) - U_{2N}^{P}(\vec{x}, \vec{k})\, \psi_{2N}(\vec{x}, \vec{k}, t), \tag{332a}$$

$$\psi_{2N}(\vec{x}, \vec{k}, t=0) = 1, \tag{332b}$$

with potentials

$$U_{2N}^{P}(\vec{x}, \vec{k}) = 2 \pi^2 A \Bigl( \sum_{j=1}^{2N} k^{(j)} V(x^{(j)}) \Bigr)^2.$$

The solution to the PDE (332) cannot be written in explicit form, as in the Random Uniform Jet model. Instead, one expands $\psi_{2N}(\vec{x}, \vec{k}, t)$ as a superposition of eigenfunctions of the Schrödinger operators on the right-hand side of Eq. (332a). See [51] for details.
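To make the role of the ground state concrete, the following Python (NumPy) sketch computes the lowest eigenvalue of a finite-difference version of the two-particle Schrödinger operator $-\kappa \Delta + 2 \pi^2 A k^2 (V(x^{(1)}) - V(x^{(2)}))^2$, which corresponds to the wavenumber pair $(k, -k)$, on the periodic unit square. The profile $V(x) = \sqrt{2} \sin(2 \pi x)$, the grid size, and the parameter values are assumptions chosen for illustration; this is not the perturbation calculation of [51].

```python
import numpy as np

def ground_state_energy(k, kappa=1.0, A=1.0, n=24):
    """Lowest eigenvalue of -kappa*Laplacian + U on an n-by-n periodic grid."""
    h = 1.0 / n
    x = np.arange(n) * h
    V = np.sqrt(2.0) * np.sin(2.0 * np.pi * x)   # hypothetical profile, mean 0, mean square 1
    eye = np.eye(n)
    # Periodic second-difference approximation of d^2/dx^2.
    D2 = (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1) - 2.0 * eye) / h**2
    laplacian = np.kron(D2, eye) + np.kron(eye, D2)
    V1, V2 = np.meshgrid(V, V, indexing="ij")
    U = 2.0 * np.pi**2 * A * k**2 * (V1 - V2) ** 2   # two particles with wavenumbers (k, -k)
    H = -kappa * laplacian + np.diag(U.ravel())
    return np.linalg.eigvalsh(H)[0]

def first_order_energy(k, A=1.0):
    """Spatial average of U for the profile above (first-order perturbation theory)."""
    return 4.0 * np.pi**2 * A * k**2

for k in (0.1, 0.2):
    E0 = ground_state_energy(k)
    print(k, E0, first_order_energy(k) - E0)
```

The spatial average of the potential is an upper bound for the ground-state energy, since the constant function is an admissible trial state, and in this sketch the gap between the two grows roughly like $k^4$ at small $k$. This is the type of second-order energy shift that enters a perturbative treatment of a weak potential.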
We have been considering above the passive scalar "eld at large scales at long-times, so the relevant wavenumbers k are small. The potentials ;.(x , k ) are thus weak, and the eigenfunc, tions of the SchroK dinger operators may be analyzed perturbatively. The exponents dj describing , the extent of scalar intermittency in Eq. (331) arise from a computation of the shifts in the energy of the ground state due to the potential ;. relative to N copies of the potential ;.. , One can also study the opposite limit in which the potential becomes very strong (relative to the `kinetic energya Laplacian term iDx ). In this situation, the particles are well localized near minima of the potential ;., and therefore they e!ectively feel a quadratic harmonic oscillator potential. , One can therefore plausibly replace ;. by a quadratic form obtained by Taylor expansion about , the minima; this is known in solid state physics as the `tight-binding approximationa [6]. Two situations in which this tight-binding approximation would be relevant in the Random, Spatially Periodic Shear model are: E passive scalar initial data with spectrum supported only at wavenumbers "k"5k as in Eq. (330), but now with k <1, E high PeH clet number Pe"A /i<1 at times su$ciently short t;(i(1#Pe\)) so that the passive scalar statistics are still dominated by high-wavenumber #uctuations in the initial data. The passive scalar statistics under either of these asymptotic conditions ought to at least qualitatively be describable by the Random Uniform Jet model considered above, as the tight-binding approximation results in a quadratic harmonic oscillator potential of the type which arises in that model (see [51] and Section 5.2.3). 5.4. 
Other theoretical work concerning scalar intermittency

The special structure of the velocity field in the Random Uniform Jet model (Section 5.2) and the Random Spatially Periodic Shear model (Section 5.3) has permitted a detailed and exact analysis of the advected passive scalar statistics, without the need for any ad hoc approximations. These models explicitly elucidate one mechanism by which large-scale scalar intermittency can be created, and indicate some features of a turbulent system which may suppress the intermittency or otherwise influence its nature. We conclude this section on intermittency by summarizing the main findings of some other recent theoretical studies of non-Gaussian features of the scalar PDF. We first discuss some formal considerations of the passive scalar statistics in a generic turbulent flow, and then turn to the analysis of some discrete, phenomenological mixing models.

5.4.1. General scalar intermittency considerations

5.4.1.1. Conditional dissipation rate formalism. A main theme in the theoretical investigation of large-scale scalar intermittency has been the variability of the local rate of scalar dissipation. Pope [268] had shown much earlier that, in a statistically spatially homogeneous setting, the evolution of the scalar PDF $p_{T(x,t)}(\rho) = p_{T(t)}(\rho)$ is described by a PDE:

$$\frac{\partial p_{T(t)}(\rho)}{\partial t} = -\frac{1}{2}\frac{\partial^2}{\partial \rho^2}\left(\chi(\rho,t)\,p_{T(t)}(\rho)\right). \tag{333}$$
The only coefficient appearing in this PDE is the conditional scalar dissipation rate $\chi(\rho,t) = 2\kappa\langle |\nabla T(x,t)|^2 \mid T(x,t) = \rho\rangle$, which is defined as the statistical average of $2\kappa|\nabla T(x,t)|^2$, conditioned upon $T(x,t)$ assuming the particular value $\rho$. Recall from Paragraph 4.3.1.2 that the full (unconditioned) average $\bar\chi(t) = 2\kappa\langle|\nabla T(x,t)|^2\rangle$ is the rate at which $\langle (T(x,t))^2\rangle$ decays in the absence of external driving.

As discussed previously, the scalar PDF $p_{T(t)}(\rho)$ can be expected to concentrate at $\rho = 0$ in the long-time limit when the passive scalar field is freely decaying. The shape of the PDF in the long-time limit can be studied, however, by normalizing it to zero mean and unit variance. This normalized PDF $p_{\tilde T(t)}(\tilde\rho)$ is just the PDF of the quantity

$$\tilde T(x,t) = \frac{T(x,t) - \langle T(x,t)\rangle}{\langle (T(x,t))^2\rangle^{1/2}}.$$

Sinai and Yakhot [298] adapted Pope's formalism to describe the evolution of the normalized scalar PDF in terms of the normalized conditional dissipation rate

$$\tilde\chi(\tilde\rho,t) = \frac{2\kappa\langle|\nabla T(x,t)|^2 \mid \tilde T(x,t) = \tilde\rho\rangle}{\bar\chi(t)}.$$

They furthermore found an explicit solution for the normalized PDF in terms of the normalized conditional dissipation rate in which both are independent of time:

$$p_{\tilde T}(\tilde\rho) = \frac{C}{\tilde\chi(\tilde\rho)}\exp\left(-\int_0^{\tilde\rho}\frac{\tilde\rho'}{\tilde\chi(\tilde\rho')}\,d\tilde\rho'\right). \tag{334}$$

Here $C$ is a normalization constant. This stationary solution is assumed (without proof) to describe the long-time limiting shape of the one-point PDF of a freely decaying passive scalar field. If $\tilde\chi(\tilde\rho)$ is constant, meaning that the (normalized) local scalar dissipation rate is independent of the (normalized) local value of the passive scalar field, then a Gaussian limiting distribution is indicated. A precise description of the tails of the PDF requires knowledge of the behavior of the normalized conditional scalar dissipation rate $\tilde\chi(\tilde\rho)$ for large $\tilde\rho$, but no useful exact formula for this quantity appears to be available. Sinai and Yakhot suggested a quadratic approximation for $\tilde\chi(\tilde\rho)$, which yields algebraic tails for the scalar PDF. This is not in agreement with empirical results; see [147] for some discussion.

A more elaborate approximation which permits progress in the conditional dissipation rate formalism is the mapping closure procedure developed by Chen et al. [58], wherein the passive scalar field is assumed to be representable as a distortion of a Gaussian random field. Gao [111] finds that within the mapping closure approximation, the conditional dissipation rate $\chi(\rho)$ forever assumes a nontrivial shape with lasting memory of the initial data. Because the passive scalar variance is decaying to zero, however, the normalized conditional dissipation rate $\tilde\chi(\tilde\rho)$ will approach unity over an interval of $\tilde\rho$ which expands as $t\to\infty$. Since constancy of the conditional scalar dissipation rate is associated with Gaussianity of the scalar PDF, Gao concludes that the scalar PDF has a Gaussian core with non-Gaussian tails at long times. The crossover between the Gaussian core and non-Gaussian tails occurs at a fixed value of $\rho$ in the unnormalized scalar PDF $p_{T(t)}(\rho)$ but at an ever-increasing value of $\tilde\rho$ in the normalized scalar PDF $p_{\tilde T(t)}(\tilde\rho)$.
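As a check on the stationary solution (334), two cases can be worked out in closed form. Here $\tilde\chi(\tilde\rho)$ denotes the normalized conditional dissipation rate and $p(\tilde\rho)$ the normalized PDF. The quadratic form $\tilde\chi = 1 + a\tilde\rho^2$ with $a > 0$ is an illustrative assumption; the specific coefficients proposed by Sinai and Yakhot are not reproduced here.

```latex
\begin{align*}
\tilde\chi \equiv 1:
  &\quad p(\tilde\rho) = C\exp\!\left(-\int_0^{\tilde\rho}\tilde\rho'\,d\tilde\rho'\right)
   = C\,e^{-\tilde\rho^{2}/2}, \\[4pt]
\tilde\chi = 1 + a\tilde\rho^{2}:
  &\quad \int_0^{\tilde\rho}\frac{\tilde\rho'}{1+a\tilde\rho'^{2}}\,d\tilde\rho'
   = \frac{1}{2a}\ln\!\left(1+a\tilde\rho^{2}\right), \\
  &\quad p(\tilde\rho) = C\left(1+a\tilde\rho^{2}\right)^{-1-\frac{1}{2a}}
   \sim |\tilde\rho|^{-2-1/a} \quad (|\tilde\rho|\to\infty).
\end{align*}
```

The first case recovers the Gaussian limit, and the second shows how a quadratic growth of $\tilde\chi$ produces algebraic tails. If $\tilde\chi \approx 1$ holds only for $|\tilde\rho|$ below some crossover value, the first computation applies on that interval alone, which gives a Gaussian core with tails set by the large-$|\tilde\rho|$ behavior of $\tilde\chi$.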
That is, the non-Gaussian features of the PDF are always present, but become increasingly remote relative to
the shrinking variance of the scalar PDF as $t\to\infty$. Therefore, the shape of the scalar PDF does converge to a Gaussian form in the long-time limit, but nonuniformly in the tails. This is consistent with the results reported in Section 5.3.3 for a certain class of initial data in the Random Spatially Periodic Shear model.

5.4.1.2. Lagrangian formalism. In the work described above, departures of the scalar PDF from Gaussianity are attributed to the nonconstancy of the conditional dissipation rate, which appears naturally in some exact formulas and equations (see Eqs. (333) and (334)). This object is a precise measure of the correlation between the local value of the passive scalar field and its gradient, but is quite challenging to model. A more intuitive perspective on how this correlation creates non-Gaussianity of the scalar PDF is set forth by Kimura and Kraichnan [162] through consideration of the history of the scalar field within a Lagrangian fluid element. The value of the scalar field in such a fluid element evolves only through molecular dissipation; advection alone would leave it unchanged. The rate of molecular dissipation in the Lagrangian fluid element depends, however, on the local scalar gradient, and this does depend very strongly on the advection. Regions of strong compressive strain in the flow will build large scalar gradients, and consequently produce rapid scalar dissipation. Therefore, the value of the scalar field in a Lagrangian fluid element at a time $t > 0$ depends on its initial value and the history of the local fluid straining. When the scalar field is measured at some given point in the fixed (Eulerian) laboratory frame, one observes the value of the scalar field in the Lagrangian fluid element which happens to be there at the time. If the initial passive scalar field is statistically homogeneous (with zero mean), then the originating location of the Lagrangian fluid element is unimportant.
Then the measured scalar value will depend only on the initial scalar value (specified by a common PDF) and the strain history of the Lagrangian fluid element which is passing by the probe. The scalar PDF at times $t > 0$ is thus modified from the initial PDF solely because the scalar is dissipated more rapidly in Lagrangian fluid elements in which greater scalar gradients have been generated due to stronger straining by the fluid. Kimura and Kraichnan illustrate this perspective for a flow in which the velocity field is a spatially uniform straining flow fluctuating randomly in time and the initial scalar data is a homogeneous, Gaussian random field. The passive scalar field observed at a given point at later times is shown to be a random mixture of mean zero Gaussian random variables, with variance depending on the realization of the velocity field (or equivalently, the straining history of a fluid element). The scalar PDF is consequently broader-than-Gaussian. We showed in Paragraph 5.2.2.1 through a line of reasoning suggested by Fefferman [101] that this result in fact applies to quite general random flows.

The Lagrangian point of view was utilized by Shraiman and Siggia [295] in their formal approximate analysis of the scalar PDF advected by a single-scale turbulent flow at high Péclet number with a constant mean scalar gradient imposed. Recall from Paragraph 4.3.1.3 that turbulent interaction with a mean scalar gradient provides a means of driving passive scalar fluctuations, so the scalar PDF will settle down at long times to a form with finite variance, in contrast to a freely decaying situation. One might expect scalar intermittency in the presence of a constant mean scalar gradient because the length scale of the scalar fluctuations will be naturally comparable to that of the velocity field. And indeed, Shraiman and Siggia derive exponential tails for the scalar PDF through a representation of the scalar value observed at a given location as
a functional integral over Lagrangian tracer trajectories. The large fluctuations are computed to come predominantly from situations in which a Lagrangian fluid element enjoys an unusually mild strain during its voyage. In particular, the shape of the tails is not primarily determined by the most obvious class of events in which a Lagrangian fluid element moves unusually persistently across the scalar gradient, with typical straining along the way (see Section 5.4.2 below). Exponential tails for the scalar PDF in the presence of a constant mean scalar gradient were observed in numerical simulations by Holzer and Siggia [139] in which the two-dimensional velocity field was evolved according to inviscid dynamics truncated to a finite band of Fourier modes.

Another functional integral approach by Falkovich et al. [96] and a Lagrangian formalism based on the analysis of line stretching by Chertkov et al. [65,66] indicate that the scalar PDF exhibits similar exponential tails if the driving of the fluctuations comes from an external, rapidly decorrelating pumping field rather than from turbulent interaction with a background scalar gradient. This conclusion was rigorously established by Bernard et al. [36] for the case in which both the pumping and velocity fields are smooth in space and rapidly decorrelating in time (as in the RDT model described in Section 4.3.1, with Hurst exponent $H = 1$ for the velocity field). Low-strain trajectories are again suggested to be the dominant contributors to the large-scale intermittency of the scalar field [96].

5.4.1.3. Nonlinear mean scalar profiles. Non-Gaussian scalar PDFs can also arise quite simply from Gaussian random initial data when the mean profile is nonlinear or the single-point variance is not constant, as pointed out by Kimura and Kraichnan [162] through theoretical arguments and numerical simulations with a synthetic velocity field.
Even without molecular diffusion, the passive scalar value observed at a later time at a given point will in such instances be a mixture of Gaussian random variables with different means and variances, which is not generally Gaussian. One experimental example with a nonlinear initial mean profile is a thermal mixing layer, in which half of the fluid in a wind tunnel is heated to a constant level, with the other half remaining at room temperature. The temperature PDF observed downstream is found to be strongly non-Gaussian at the edges of the evolving turbulent mixing layer [193,203]. Exponential tails have likewise been found in the single-point PDF for the concentration of a dye in jet flow experiments [272].

A situation in which a nonlinear mean scalar profile is imposed through boundary conditions rather than initial data was investigated by Ching and Tu [71] through finite-difference numerical simulations with a single-scale Gaussian random velocity field. They find that both nearly Gaussian and broader-than-Gaussian scalar PDFs can be obtained in the long-time limit, whether the imposed mean scalar profile is linear or nonlinear. They find for all cases considered that the scalar PDF develops broader-than-Gaussian tails at sufficiently high Péclet numbers, in agreement with laboratory experiments [191].

5.4.2. Phenomenological discrete mixing models

As it is difficult to directly analyze the advection-diffusion equation with a general turbulent velocity field model, or to conduct a properly resolved numerical simulation over long time intervals, some physicists and engineers have invented simplified phenomenological equations for the purposes of studying turbulent diffusion. These phenomenological models seek to capture the essential physics of turbulent advection and molecular diffusion without resolving the full dynamics. Notable among these is the linear eddy model of Kerstein [156]. Though originally developed
for engineering applications, it has been adopted in various forms in the theoretical investigation of large-scale scalar intermittency.

The linear eddy model is formulated on a one-dimensional discrete lattice, imagined to represent a one-dimensional cut through a turbulent flow [156]. Molecular diffusion is implemented directly through a finite-difference discretization of the ordinary diffusion equation, with constant time step. Turbulent advection is represented in the model by random exchanges of the scalar values at different sites. Both the times at which the exchanges occur and the sites affected are prescribed according to a random process with specified mapping structure. A standard numerical implementation of the linear eddy model with a mean scalar gradient imposed produced a scalar PDF with exponential tails [156].

Pumir et al. [277] considered an even simpler phenomenological model in which turbulent mixing is similarly represented by a superposition of a random exchange process and an averaging of neighboring passive scalar values, and showed analytically that exponential tails in the scalar PDF occur in the presence of a mean scalar gradient. Their model was subsequently demonstrated by Holzer and Pumir [138] to be essentially a mean-field approximation to the linear eddy model. These latter authors also formulated a simplified variation of the linear eddy model which can be analytically solved without the need to pass to the mean-field limit. Nearly exponential tails in the scalar PDF are again predicted in the presence of a background gradient, and their origin is traced to the Poisson process governing the times at which random exchange events occur in the model.
More precisely, the exponential tails of the PDF are associated with events in which a series of random exchanges occur in rapid succession, effectively dragging a parcel of fluid far along the gradient before molecular diffusion has time to equilibrate the associated scalar field to the local value at its new location.

Kerstein and McMurtry [157] introduced another mean-field theory of the linear eddy model based on a Langevin approximation, and it again predicts exponential tails in the scalar PDF when a constant mean scalar gradient is present. They also point out that other plausible mean-field theories can be constructed which lead to Gaussian tails for the scalar PDF. The exponential tails only come about in the above theories because they are built on the assumption that fluid parcels are transported across the scalar gradient according to a Poisson process. Thus, while these discrete models provide some insight into the nature of turbulent mixing, the mechanism by which they generate scalar intermittency is not general enough to relate directly to real-world turbulent diffusion. Indeed, Shraiman and Siggia [295] indicate that scalar intermittency in a continuous turbulent flow is due to events in which Lagrangian fluid elements have a history of low straining. None of the above discrete models accounts for variability in the strain rate, though Kerstein and McMurtry suggest that its effects can be phenomenologically included in their Langevin mean-field theory [157].

Finally, the linear eddy model was used by McMurtry, Gansauge, Kerstein, and Krueger [234] to simulate numerically the statistics of a decaying passive scalar field in statistically stationary turbulence (with no mean scalar gradient imposed). Through appropriate specification of the random exchange events, a high Reynolds number flow with inertial-range scaling properties can be modelled.
The flexibility and economy of the linear eddy model permit a study of the effects of both Reynolds number and Schmidt number on the shape of the scalar PDF. It is found that the scalar PDF is not much changed as the model Reynolds number increases beyond 100 (up to 10), but that the scalar PDF is very sensitive to variations in the model Schmidt number (over the range 0.1 to 10).
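The ingredients of such lattice models (a finite-difference diffusion step, random exchanges of scalar values between sites, and an imposed mean gradient) can be illustrated with a minimal toy sketch. This is not Kerstein's linear eddy model: the pairwise swap rule, the per-step exchange probability standing in for a Poisson process, the periodic lattice, and all names and parameter values are assumptions chosen for brevity. The code tracks the fluctuation `theta` about a linear mean profile `g * x`.

```python
import random


def step_diffusion(theta, kappa, dx, dt):
    """One explicit finite-difference diffusion step on a periodic lattice."""
    n = len(theta)
    r = kappa * dt / dx ** 2  # explicit scheme is stable for r <= 0.5
    return [theta[i] + r * (theta[(i + 1) % n] - 2.0 * theta[i] + theta[i - 1])
            for i in range(n)]


def random_exchange(theta, g, dx, max_shift, rng):
    """Swap the total scalar T = g*x + theta between two nearby sites (in place)."""
    n = len(theta)
    i = rng.randrange(n)
    d = rng.choice([s for s in range(-max_shift, max_shift + 1) if s != 0])
    j = (i + d) % n
    jump = g * d * dx  # difference of the mean profile between the two sites
    # Site i receives the parcel from j and vice versa; the fluctuation is
    # measured relative to the local mean, hence the +/- jump.
    theta[i], theta[j] = theta[j] + jump, theta[i] - jump


def run_toy_model(n=128, steps=2000, kappa=0.1, dx=1.0, dt=1.0, g=1.0,
                  exchange_prob=0.2, max_shift=8, seed=0):
    """Alternate random exchanges (Bernoulli approximation of Poisson timing)
    with diffusion steps; return the final fluctuation field."""
    rng = random.Random(seed)
    theta = [0.0] * n
    for _ in range(steps):
        if rng.random() < exchange_prob:
            random_exchange(theta, g, dx, max_shift, rng)
        theta = step_diffusion(theta, kappa, dx, dt)
    return theta
```

Both operations conserve the lattice sum of `theta`, so fluctuations are created only by carrying parcels along the mean gradient and are smoothed only by diffusion. A histogram of `theta` over many seeds would be the natural next diagnostic; no claim is made here about the tail shape this toy produces.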
6. Monte Carlo methods for turbulent diffusion

For the most part, we have been discussing mathematical passive scalar advection models with certain simplifying features which permit exact analysis. These special models play an important role in unambiguously elucidating various fundamental physical aspects of turbulent diffusion. In addressing specific applications and questions concerning complex turbulent flows, however, one wants to investigate tracer transport in a random velocity field model for which exact solutions are not available. It is natural to explore such models through computer simulations. We discussed simulations of tracer trajectories in deterministic, periodic velocity fields with molecular diffusion in Section 2.3.2. Here we consider in detail the numerical simulation of the motion of tracers in a steady, random velocity field $\mathbf{v}(\mathbf{x})$. (Examples of numerical simulations of tracers in the opposite extreme of rapidly decorrelating random velocity fields were described earlier in Paragraph 4.2.2.4; also see [108].)

A typical problem is the computation of the (absolute) mean-square displacement $\sigma_X^2(t) \equiv \langle|\mathbf{X}(t) - \mathbf{x}_0|^2\rangle$ of a tracer, where $\mathbf{X}(t)$ is the tracer trajectory and $\mathbf{x}_0 = \mathbf{X}(t=0)$. Let us suppose $\kappa = 0$ for simplicity. The statistical average in $\sigma_X^2(t)$ is then an average over the full (usually infinite) ensemble of velocity fields $\mathbf{v}(\mathbf{x})$ described by the given statistical model. In a numerical Monte Carlo simulation, this averaging operation is discretized as an average over a finite number $N$ of independent samples generated "pseudo-randomly" using a random number generator on the computational machine [274]. More explicitly, an algorithm for producing a random velocity field $\mathbf{v}_h(\mathbf{x})$ is prescribed which approximates $\mathbf{v}(\mathbf{x})$ in some statistical sense, but which can be fully described by a finite number of operations involving a finite number of random variables.
The $N$ independent realizations $\{\mathbf{v}_h^{(j)}\}_{j=1}^{N}$ of the approximate velocity field are then generated through successive calls to the random number generator. In each of these realizations of the velocity field, the equations of motion for the tracer particle,

$$d\mathbf{X}^{(j)}(t) = \mathbf{v}_h^{(j)}(\mathbf{X}^{(j)}(t), t)\,dt, \tag{335}$$

are solved numerically. Finally, a numerical approximation to the mean-square displacement of the tracer as a function of time is obtained by averaging over the finite sample size generated:

$$\sigma_{X_h,N}^2(t) = \frac{1}{N}\sum_{j=1}^{N}|\mathbf{X}^{(j)}(t) - \mathbf{x}_0|^2.$$

By the Law of Large Numbers ([102], Ch. 10), $\sigma_{X_h,N}^2(t)$ will approximate $\sigma_X^2(t)$ if $N$ is sufficiently large and the discretized random velocity field $\mathbf{v}_h(\mathbf{x})$ is a sufficiently accurate approximation to the true velocity field $\mathbf{v}(\mathbf{x})$. In principle, the Monte Carlo approach can be used in a similar way to compute numerical approximations to the statistical average of any functional of the particle trajectory.

One can account for the effects of molecular diffusion through the addition of a stochastic term $\sqrt{2\kappa}\,d\mathbf{W}(t)$ to the trajectory equation (335). This requires the generation of additional random variables at each time step, but its treatment is straightforward because this effect has a constant coefficient [163]. To keep focus on the more demanding main issues involving the simulation of the random velocity field, we will ignore molecular diffusion ($\kappa = 0$) for the duration of Section 6. The interested reader can consult Section 2.3.2 for Monte Carlo simulations with periodic velocity fields and $\kappa > 0$.
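The procedure just described (draw $N$ independent approximate velocity fields, integrate the tracer equation (335) in each, then average) can be sketched in a few lines. Everything specific below is an assumption made for illustration: the field is a steady shear flow with velocity `(w_bar, v(x))`, `v` is a hypothetical finite sum of random sinusoids with unit variance, only the shear-direction displacement is averaged, and the function names are invented for this sketch.

```python
import math
import random


def sample_shear_field(rng, n_modes=16, k_max=2.0):
    """Draw one realization of a hypothetical approximate shear field v_h(x)."""
    modes = [(rng.uniform(0.0, k_max), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
             for _ in range(n_modes)]
    scale = 1.0 / math.sqrt(n_modes)  # normalizes the ensemble variance to 1

    def v(x):
        return scale * sum(a * math.cos(2.0 * math.pi * k * x)
                           + b * math.sin(2.0 * math.pi * k * x)
                           for k, a, b in modes)
    return v


def integrate_tracer(v, w_bar, t_final, n_steps, x0=0.0, y0=0.0):
    """Integrate dX/dt = w_bar, dY/dt = v(X) with the midpoint rule (kappa = 0)."""
    dt = t_final / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        y += v(x + 0.5 * w_bar * dt) * dt  # X(t) is linear in t, so this is exact in x
        x += w_bar * dt
    return x, y


def mean_square_displacement(n_samples, w_bar, t_final, n_steps=20, seed=0):
    """Monte Carlo estimate of <|Y(t) - y0|^2> and its estimated standard error."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n_samples):
        v = sample_shear_field(rng)          # one independent realization
        _, y = integrate_tracer(v, w_bar, t_final, n_steps)
        total += y * y
        total_sq += y ** 4
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples)
```

With `w_bar = 0` the tracer stays at its initial $x$, so $Y(t) = v(x_0)\,t$ and the exact ensemble average for this assumed field is $t^2$. That gives a simple sanity check on the estimate, and the returned standard error indicates how far a finite sample can be expected to stray from it.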
Overview of Section 6: We begin in Section 6.1 with a brief summary of general accuracy considerations in Monte Carlo simulations. Then, in Section 6.2, we consider a class of three Monte Carlo methods for generating general Gaussian, homogeneous, random fields. We assess their utility for turbulent diffusion studies by applying them to the exactly solvable Random Steady Shear (RSS) Model [141], which we discussed in Section 3.2. This model includes flows leading to a wide variety of statistical tracer motion, and thereby provides a simple test for the performance of the Monte Carlo methods in simulating turbulent diffusion under various conditions. Comparison of the numerical simulations with the exact results illustrates certain strengths and inherent limitations of the methods, particularly in properly simulating long-range correlations in the random shear velocity field.

Next, we turn to the numerical simulation of turbulent diffusion in random velocity fields with a statistical self-similarity characteristic of the inertial range of scales of a turbulent flow. We continue in Section 6.3 to consider steady shear flows so that the velocity field is still specified by a random scalar function $v(x)$. We seek to simulate a mean zero, Gaussian random field $v(x)$ which has an inertial-range scaling law:

$$\langle (v(x) - v(x'))^2\rangle = S_v^I\,|x - x'|^{2H}, \tag{336}$$

with $0 < H < 1$ and a constant prefactor $S_v^I$, over a wide range of scales. To simulate the wide range of active scales of such a random velocity field efficiently, it is natural to formulate hierarchical Monte Carlo schemes in which the random velocity field is expressed as a superposition of independently generated random fields varying on different length scales. We first examine one popular hierarchical simulation method, Successive Random Addition [336], and cite results from a rigorous demonstration [87] that this method is fundamentally incapable of simulating a stationary random field obeying the self-similar scaling (336) with any quantitative accuracy.
We next describe a pair of hierarchical Monte Carlo methods using wavelets, introduced by Elliott, Horntrop, and the first author [82,84], which have been shown to be capable of generating a random field $v(x)$ with accurate self-similar scaling (336) over 12 decades of scales. By contrast, previous simulations using (variations of) the nonhierarchical Monte Carlo methods discussed in Section 6.2 have only achieved one to two decades of inertial-range scaling behavior. Moreover, the wavelet-based Monte Carlo methods have low variance (see Section 6.1); 100 to 1000 sample realizations are sufficient for statistical averages to be computed within a few percent error. We compare the wavelet-based Monte Carlo methods with the Randomization Method, the nonhierarchical Monte Carlo method with the greatest capacity for simulating velocity fields with an extended inertial range, and demonstrate their quantitative accuracy in simulating tracer transport on an exactly solvable model problem.

In Section 6.4, we describe a general method of approximating any statistically isotropic, incompressible, multi-dimensional Gaussian random velocity field as a superposition of Gaussian homogeneous random shear flows [85]. In this way, any of the Monte Carlo methods for simulating scalar random fields can be used to simulate statistically isotropic multi-dimensional vector fields as well. We show that this technique can be used with the wavelet-based Monte Carlo methods discussed in Section 6.3 to generate a statistically isotropic, incompressible, two-dimensional Gaussian random velocity field with an inertial range extending over twelve decades of scales.

In Section 6.5, we study tracer pair dispersion in two-dimensional synthetic turbulent velocity fields generated in this manner. Temporal dynamics are induced by sweeping the frozen random field past the laboratory frame at a constant speed, corresponding to the picture underlying Taylor's hypothesis ([320], p. 253). The mean-square separation between a pair of tracers as it evolves through the inertial range of this synthetic turbulent velocity field is found to obey the classical Richardson's $t^3$ law $\sigma_{\Delta X}^2(t) \equiv \langle|\mathbf{X}^{(1)}(t) - \mathbf{X}^{(2)}(t)|^2\rangle \sim t^3$ over eight decades of spatial scales. We relate these numerical results to experimental findings, other numerical simulations, and some theoretical work.

6.1. General accuracy considerations in Monte Carlo simulations

An examination of the error in the numerically evaluated Monte Carlo average brings out two main practical accuracy concerns in a Monte Carlo simulation. For specificity, we focus on the mean-square tracer displacement $\sigma_X^2(t)$, though these considerations are completely general. The discrepancy between the numerically computed Monte Carlo approximation $\sigma_{X_h,N}^2(t)$ and the true $\sigma_X^2(t)$ can be expressed as a sum of:

• a systematic error (bias) due to numerical discretization of the velocity field and the trajectory equations, and
• a random sampling error because $\sigma_{X_h,N}^2(t)$ is computed using a finite number of samples.

Mathematically, let

$$\sigma_{X_h}^2(t) \equiv \langle|\mathbf{X}_h(t) - \mathbf{x}_0|^2\rangle_{\mathbf{v}_h}$$

be the mean-square tracer displacement as would be computed (in principle) by a complete averaging over all the random variables appearing in the discrete numerical approximation $\mathbf{v}_h(\mathbf{x})$ of the velocity field. Then we can write the error in the numerical computation of $\sigma_X^2(t)$ as

$$\sigma_{X_h,N}^2(t) - \sigma_X^2(t) = E_{\mathrm{sys}}(t) + E_{\mathrm{samp}}(t),$$
$$E_{\mathrm{sys}}(t) = \sigma_{X_h}^2(t) - \sigma_X^2(t),$$
$$E_{\mathrm{samp}}(t) = \sigma_{X_h,N}^2(t) - \sigma_{X_h}^2(t),$$

where $E_{\mathrm{sys}}(t)$ is a deterministic, systematic error, and $E_{\mathrm{samp}}(t)$ is a purely random sampling error. There are two sources of the systematic error $E_{\mathrm{sys}}(t)$:

• the difference between the statistics of the true velocity field $\mathbf{v}(\mathbf{x})$ and the numerically specified random velocity field $\mathbf{v}_h(\mathbf{x})$ involving only a finite number of random variables, and
• the discretization error in the numerical integration of the tracer trajectories.
The accurate and efficient numerical integration of the tracer trajectory equations (335) requires a suitable (sometimes adaptive) choice of time step. We will not dwell on this technical but important issue here; see [84,86,140] for explicit examples of the kind of considerations involved, particularly when several particles are being simultaneously tracked. We will concentrate here on the issues pertaining to the simulation of the random velocity field $\mathbf{v}(\mathbf{x})$. To minimize the systematic error in the numerical approximation of the velocity field, the probability law of $\mathbf{v}_h(\mathbf{x})$ should be close in some sense to that of $\mathbf{v}(\mathbf{x})$ [163]. For example, the mean and correlation tensor of $\mathbf{v}_h(\mathbf{x})$
should approximate that of $\mathbf{v}(\mathbf{x})$. If $\mathbf{v}(\mathbf{x})$ is a Gaussian random field, then it is often desirable for $\mathbf{v}_h(\mathbf{x})$ also to be a Gaussian random field.

The random sampling error $E_{\mathrm{samp}}(t)$ arises solely because the mean-square tracer displacement is numerically computed using only a finite number of realizations. It has mean zero with respect to the statistics of the numerical scheme, $\langle E_{\mathrm{samp}}(t)\rangle_{\mathbf{v}_h} = 0$, and its variance may be computed as

$$\langle (E_{\mathrm{samp}}(t))^2\rangle_{\mathbf{v}_h} = \left\langle\left(\frac{1}{N}\sum_{j=1}^{N}|\mathbf{X}_h^{(j)}(t) - \mathbf{x}_0|^2 - \langle|\mathbf{X}_h(t) - \mathbf{x}_0|^2\rangle_{\mathbf{v}_h}\right)^2\right\rangle_{\mathbf{v}_h} = \frac{1}{N}\operatorname{Var}\left(|\mathbf{X}_h(t) - \mathbf{x}_0|^2\right),$$

where

$$\operatorname{Var}\left(|\mathbf{X}_h(t) - \mathbf{x}_0|^2\right) \equiv \left\langle\left(|\mathbf{X}_h(t) - \mathbf{x}_0|^2 - \sigma_{X_h}^2(t)\right)^2\right\rangle_{\mathbf{v}_h}$$

is just the variance of the numerical quantity $|\mathbf{X}_h(t) - \mathbf{x}_0|^2$ whose Monte Carlo average we are seeking. The random sampling error therefore decreases as the sample size becomes larger, but at the relatively slow rate $E_{\mathrm{samp}}(t) \sim N^{-1/2}$. Typically, the sample size is restricted to moderate values (say, a few thousand or million in turbulent diffusion applications), due to computational cost. Therefore, one would like to minimize $\operatorname{Var}(|\mathbf{X}_h(t) - \mathbf{x}_0|^2)$ to reduce sampling error. This quantity must necessarily be at least on the order of the variance of the true random variable $|\mathbf{X}(t) - \mathbf{x}_0|^2$ whose mean we are trying to estimate. The practical numerical issue is to avoid numerical approximation schemes which add a lot of extra variability and lead to excessively large values of the variance $\operatorname{Var}(|\mathbf{X}_h(t) - \mathbf{x}_0|^2)$. An intuitive rule of thumb for designing a low-variance Monte Carlo method is that each individual realization generated by the numerical scheme should have "typical" properties of the true random velocity field $\mathbf{v}(\mathbf{x})$.

We will now proceed to examine various Monte Carlo methods for turbulent diffusion with the above considerations in mind.

6.2. Nonhierarchical Monte Carlo methods

A simple context in which to discuss numerical Monte Carlo methods for turbulent diffusion is the class of steady, two-dimensional shear flows with constant cross sweep $\bar w$:

$$\mathbf{v}(\mathbf{x},t) = \mathbf{v}(x,y,t) = \begin{pmatrix}\bar w \\ v(x)\end{pmatrix}.$$

Then the numerical simulation of the velocity field reduces to the generation of $v(x)$, a scalar random field of a single variable, which we will further assume to be Gaussian and statistically homogeneous, with mean zero and correlation function

$$\langle v(x')\,v(x' + x)\rangle = R(x).$$

In Section 3.2, we discussed a particular one-parameter family of such flows as part of the Random Steady Shear (RSS) Model [141]. The elements of this model have a simple structure, the mean-square displacement of a tracer in these flows can be expressed by exact analytical formulas, and the tracer motion exhibits a wide variety of anomalous scaling behavior. These three properties make this model an excellent means of assessing the performance of numerical approximation schemes for turbulent diffusion. For convenience, we recapitulate in Section 6.2.1 the definition of
the Random Steady Shear (RSS) Model and the exact formulas for the mean-square displacement of a tracer advected by such flows [141]. We state the numerical values of the parameters used in the Monte Carlo simulations of this model in Paragraph 6.2.1.1.

Several numerical procedures for generating a Gaussian, homogeneous random field are directly suggested by two general expressions of the random field in terms of stochastic integrals. We have already encountered the Fourier stochastic integral representation

$$v(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\,E^{1/2}(|k|)\,d\tilde W(k) \tag{337}$$

in Paragraph 3.2.2.1. The integration measure $d\tilde W(k)$ is a complex Gaussian white noise with the formal properties:

$$d\tilde W(-k) = \overline{d\tilde W(k)}, \qquad \langle d\tilde W(k)\rangle = 0, \qquad \langle d\tilde W(k)\,d\tilde W(k')\rangle = \delta(k + k')\,dk\,dk', \tag{338}$$

where an overbar denotes complex conjugation. The function $E(k)$ in the integrand is the energy spectrum of the velocity field:

$$R(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\,E(|k|)\,dk = 2\int_0^{\infty}\cos(2\pi k x)\,E(k)\,dk. \tag{339}$$

The Fourier stochastic integral (337) formally represents the random field as a superposition of independent random fluctuations of various wavenumbers, with the amplitude of each fluctuation proportional to the square root of the energy spectral density at its wavenumber. One way of numerically simulating the random field $v(x)$ is to truncate this stochastic integral to a finite interval, and discretize it according to a midpoint rule with equispaced grid points. The random field $v(x)$ is thereby expressed as a discrete Fourier transform of a finite set of Gaussian random variables. This direct algorithm, which has been used by Viecelli and Canfield [335] and Voss [336] in the generation of fractal random fields, will be called the (standard) Fourier Method. It will be discussed in Section 6.2.2. Variations of this scheme have been adopted by Kraichnan [180] and by Sabelfeld and coworkers [190,240,291], in which the grid points of the discretization of the stochastic integral (337) are chosen randomly according to some appropriate probability distribution. We shall refer to the strategy of Sabelfeld's group as the Randomization Method, and discuss it in Section 6.2.3.

Another explicit expression for the random field $v(x)$ is given in terms of a physical-space stochastic integral ([341], Section 26.2):

$$v(x) = \int_{-\infty}^{\infty} G(x - r)\,dW(r) = \int_{-\infty}^{\infty} G(r)\,dW(x - r). \tag{340}$$

The integration measure $dW(\cdot)$ is now a real white noise measure, with formal properties:

$$\langle dW(r)\rangle = 0, \qquad \langle dW(r)\,dW(r')\rangle = \delta(r - r')\,dr\,dr'. \tag{341}$$
This white noise is convolved against the function $G(x)$, which is proportional to the inverse Fourier transform of the square root of the energy spectrum:

$$G(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, E^{1/2}(|k|)\, dk = 2\int_0^{\infty} \cos(2\pi k x)\, E^{1/2}(k)\, dk. \tag{342}$$
The function $G(x)$ provides another real-space description of the spatial correlations of the random field $v(x)$ in addition to the standard correlation function $R(x)$. Like $R(x)$, the function $G(x)$ is even, assumes its maximal value at $x = 0$, and generally decays for large $x$. The stochastic integral expression (340) represents the random field as a local average of an underlying white noise field on the same physical-space domain. One can intuitively imagine laying down a random white noise field on the real-space domain $\mathbb{R}$, and then computing the random field $v(\cdot)$ at a given point $x$ by summing up the values of the white noise field with weights specified by the value of the function $G$ centered at $x$. The value of the random field $v(\cdot)$ at any other point $x'$ is obtained by simply moving the weighting function $G$ so that it is centered at $x'$, and then summing as before. For this reason, the real-space expression (340) is often called a "moving-average" representation. Note that the averaging procedure produces nontrivial correlations in $v(\cdot)$ starting from the uncorrelated field $dW(\cdot)$ because the evaluation of $v(\cdot)$ at different points involves the same random values of $dW(\cdot)$; the weighting function is simply centered at different locations.

In a manner parallel to that of the Fourier Method, the real-space stochastic integral expression (340) can be implemented numerically through a straightforward truncation of the integration domain and a midpoint-rule discretization with equispaced grid points. This physical-space based method for simulating the random velocity field will be called the Moving Average Method. It was first studied in the thesis of McCoy [228], and we shall treat it in Section 6.2.4. There is no sensible analogue of the Randomization Method in physical space.

For each of the three Monte Carlo methods we have mentioned, the Fourier Method, the Randomization Method, and the Moving Average Method, we will first give some details about their implementation. Then we will discuss their performance for the family of Random Steady Shear Model flows summarized in Section 6.2.1. In this fashion, we will uncover certain inherent numerical artifacts of these methods, and obtain some understanding of circumstances in which they may be expected to perform well or not so well. We find, in particular, that the built-in periodicity of the direct Fourier Method creates strong systematic errors after a certain time (Section 6.2.2). The Randomization Method cures this periodicity problem, and performs quite well when the velocity field has strong, positive long-range correlations so that the tracer's motion is diffusive or super-diffusive. It suffers the drawback, however, that the simulated velocity field can be substantially non-Gaussian. Also, the Randomization Method does not perform as well in the class of test models for which the correlation function of the velocity field has slowly decaying negative tails and the tracer motion is sub-diffusive (Section 6.2.3). In contrast, the Moving Average Method can simulate sub-diffusive and diffusive tracer motion reasonably efficiently, but cannot accurately represent super-diffusive tracer motion because of an intrinsic shortcoming in handling strong long-range correlations in the velocity field (Section 6.2.4). The numerical studies discussed here were originally reported in the thesis of Horntrop [140] and in a paper by Elliott et al. [83].
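The moving-average representation (340) and its midpoint-rule discretization can be illustrated with a minimal sketch. The function names, the truncation length `r_max`, the cell width `dr`, and the Gaussian weighting function are assumptions made only for this illustration; in the text, $G$ is fixed by the energy spectrum through Eq. (342). The helper `discrete_covariance` evaluates the covariance implied by the discretized sum, which should approximate $\int G(r)\,G(r+x)\,dr$.

```python
import numpy as np

def midpoint_grid(r_max, dr):
    """Midpoints of equal cells of width dr covering [-r_max, r_max]."""
    n = int(round(2.0 * r_max / dr))
    return -r_max + (np.arange(n) + 0.5) * dr

def moving_average_field(x, G, r_max, dr, rng):
    """One realization of v(x) = sum_m G(x - r_m) dW_m, a truncated
    midpoint-rule discretization of the stochastic integral (340)."""
    r = midpoint_grid(r_max, dr)
    dW = rng.normal(0.0, np.sqrt(dr), size=r.size)   # white-noise increments, variance dr
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return G(x[:, None] - r[None, :]) @ dW           # weights G centered at each x

def discrete_covariance(x1, x2, G, r_max, dr):
    """Covariance <v(x1) v(x2)> of the discretized field (no sampling needed)."""
    r = midpoint_grid(r_max, dr)
    return float(np.sum(G(x1 - r) * G(x2 - r)) * dr)

def gaussian_kernel(x):
    """Illustrative weighting function only (an assumption, not the RSS kernel)."""
    return np.exp(-0.5 * np.asarray(x) ** 2)
```

For the illustrative Gaussian kernel the continuum covariance is $\sqrt{\pi}\, e^{-(x_1-x_2)^2/4}$, which gives a quick consistency check on the discretization before any random numbers are drawn.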
Each of the methods discussed above can be extended directly to simulate a multi-dimensional, vectorial velocity field by discretizing vector-valued versions ([341], Sections 20–22) of the stochastic integral representations (337) and (340). We will discuss a multi-dimensional version of the Randomization Method in Section 6.4, and will find better results from a less direct multidimensional implementation.

The methods presented here are also capable, in principle, of simulating non-Gaussian random fields. The stochastic integral representations of the random fields would then involve non-Gaussian random measures $d\tilde Z(k)$ and $dZ(r)$ in place of the white noise measures $d\tilde W(k)$ and $dW(r)$ ([341], Section 8). Of course, the non-Gaussian random variables in the discretized sums would have to be simulated in some fashion. Here we will restrict our attention to the simulation of Gaussian random fields.

6.2.1. Exact formulas for mean-square tracer displacement in Random Steady Shear Model

In our evaluation of Monte Carlo methods for turbulent diffusion, we will use a specific family of Random Steady Shear (RSS) Model flows with constant cross sweep, which was discussed in detail in Section 3.2.2 and the original paper [141]. The velocity field in this model is a steady, two-dimensional shear flow:

$$\mathbf{v}(\mathbf{x}, t) = \mathbf{v}(x, y, t) = \begin{pmatrix} \bar w \\ v(x) \end{pmatrix},$$

where $\bar w \neq 0$ and $v(x)$ is a mean zero, Gaussian random field with correlation function $R(x) = \langle v(x')\, v(x'+x)\rangle$. We now define a special, explicit one-parameter family of correlation functions in terms of their energy spectra:
$$R(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, E(|k|)\, dk = 2\int_0^{\infty} \cos(2\pi k x)\, E(k)\, dk, \tag{343a}$$

$$E(k) = (2\pi)^{2-\varepsilon} A_E\, k^{1-\varepsilon}\, e^{-2\pi L_K k}, \qquad -\infty < \varepsilon < 2, \tag{343b}$$

where $\varepsilon$ is the infrared scaling exponent, $A_E$ is a constant amplitude, and $L_K$ is a dissipation length scale defining the ultraviolet cutoff of the power law scaling at high wavenumber (small spatial scales). The special choice of ultraviolet cutoff made in Eq. (343b) permits the following closed-form expression for the correlation functions [141]:

$$R(x) = 2\,\Gamma(2-\varepsilon)\, A_E\, (L_K^2 + x^2)^{\varepsilon/2 - 1} \cos\!\left((2-\varepsilon)\arctan\frac{|x|}{L_K}\right).$$

The form of the correlation function $R(x)$ in the RSS Model for various values of the infrared scaling exponent $\varepsilon$ is shown in Fig. 20. A successful Monte Carlo method must generate a velocity field which closely reproduces the correct correlation function, because the mean-square displacement of a tracer particle along the shear, $\sigma_Y^2(t) = \langle (Y(t) - y_0)^2\rangle$, at long times involves an integration of $R(x)$ over a large interval (137):

$$\sigma_Y^2(t) = 2\int_0^t (t-s)\, R(\bar w s)\, ds. \tag{344}$$
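As a rough numerical companion to Eqs. (343b) and (344), the sketch below evaluates the closed-form correlation function and integrates Eq. (344) with a composite trapezoidal rule. The function names, the default values $A_E = L_K = \bar w = 1$, and the number of quadrature points are illustrative assumptions, not values prescribed by the text.

```python
import math
import numpy as np

def rss_correlation(x, eps, A_E=1.0, L_K=1.0):
    """Closed-form R(x) for the RSS Model spectrum (343b), valid for eps < 2."""
    x = np.abs(np.asarray(x, dtype=float))
    return (2.0 * math.gamma(2.0 - eps) * A_E
            * (L_K**2 + x**2) ** (eps / 2.0 - 1.0)
            * np.cos((2.0 - eps) * np.arctan(x / L_K)))

def mean_square_displacement(t, eps, w_bar=1.0, n=20001, A_E=1.0, L_K=1.0):
    """sigma_Y^2(t) = 2 * int_0^t (t - s) R(w_bar * s) ds, Eq. (344),
    approximated by the composite trapezoidal rule on n points."""
    s = np.linspace(0.0, t, n)
    f = (t - s) * rss_correlation(w_bar * s, eps, A_E, L_K)
    ds = s[1] - s[0]
    return 2.0 * ds * (f.sum() - 0.5 * (f[0] + f[-1]))
```

For $\varepsilon = 1$ and unit parameters the correlation function reduces to $R(x) = 2/(1+x^2)$, so Eq. (344) integrates to $4\,[\,t\arctan t - \tfrac12 \ln(1+t^2)\,]$, which offers a convenient check of the quadrature.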
Fig. 20. Plots of the velocity correlation function for the Random Steady Shear (RSS) Model for various values of $\varepsilon$ (from [83]). Upper graph: $\varepsilon = -1$ (solid line) and $\varepsilon =$ (dashed line). Lower graph: $\varepsilon = 1$ (solid line) and $\varepsilon =$ (dashed line).
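Table 15. Summary of long-time scaling behavior for the mean-square tracer displacement in Random Steady Shear (RSS) Model, with $\bar w \neq 0$ and $\kappa = 0$

| Parameter regime | Mean-square displacement | Qualitative behavior |
|---|---|---|
| $\varepsilon < 0$ | $\sigma_Y^2(t) \sim t^0$ | Trapping |
| $0 < \varepsilon < 1$ | $\sigma_Y^2(t) \sim t^{\varepsilon}$ | Sub-diffusive |
| $\varepsilon = 1$ | $\sigma_Y^2(t) \sim t$ | Diffusive |
| $1 < \varepsilon < 2$ | $\sigma_Y^2(t) \sim t^{\varepsilon}$ | Super-diffusive |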
Two features of $R(x)$ in the RSS Model present challenges to numerical modelling in this regard. First, for $\varepsilon < 1$, the correlation function $R(x)$ has negative tails which decay only algebraically for large $|x|$. Secondly, as $\varepsilon \nearrow 2$, the tails of the correlation function are positive but decay ever more slowly ($R(x) \sim |x|^{\varepsilon - 2}$ for $|x| \gg L_K$), reflecting the strong long-range correlations in the velocity field. The RSS Model therefore tests the capacity of Monte Carlo methods to simulate negative correlations and long-range correlations of a random field.

The exact solutions for the mean-square displacement $\sigma_Y^2(t)$ of a tracer in the various RSS Model flows were worked out in Section 3.2 and [141], and we will use these in graphical comparisons with the numerically simulated mean-square tracer displacement. For the purposes of our general discussion, we simply remind the reader in Table 15 of the long-time scaling behavior of $\sigma_Y^2(t)$ for various values of the infrared scaling exponent $\varepsilon$, when the cross sweep is nonzero ($\bar w \neq 0$) and molecular diffusion is absent ($\kappa = 0$). Note the wide range of long-time behavior assumed by the tracer in the RSS Model as the parameter $\varepsilon$ is varied. The reason we do not include molecular diffusion is that it would override the sub-diffusive and trapping behavior of the RSS flows for $\varepsilon < 1$. For $\kappa = 0$, the RSS Model can test how faithfully Monte Carlo methods replicate both sub-diffusive and super-diffusive tracer motion.

6.2.1.1. Numerical parameter values in Monte Carlo simulations. In the numerical simulations, the tracer is always started at the origin $(x_0, y_0) = (0, 0)$, and space and time are nondimensionalized so that $L_K = 1$ and $\bar w = 1$. The tracer displacement along the shear:

$$Y(t) = \int_0^t v(\bar w s)\, ds$$

is computed in every realization according to a trapezoidal rule with time step sufficiently small ($\Delta t = 0.1$) to resolve the fluctuations in the simulated velocity field. The value of $\sigma_Y^2(t) = \langle Y^2(t)\rangle$ is then obtained by averaging over a large number of independent simulations of the velocity field. It has been checked [140] that the error due to the finite time step in the integration of the trajectories is negligibly small relative to the errors arising from the finite sample size in the Monte Carlo average and discrepancies between the statistics of the simulated velocity field and of the true velocity field $v(x)$.
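The trajectory integration and Monte Carlo average just described can be sketched as follows. The function names and the `sample_field` callback (a hypothetical function returning one realization of the simulated velocity field as a callable) are assumptions introduced for illustration; only the trapezoidal rule, the time step $\Delta t = 0.1$, and $\bar w = 1$ come from the text.

```python
import numpy as np

def shear_displacement(v, t_final, dt=0.1, w_bar=1.0):
    """Y(t_n) = int_0^{t_n} v(w_bar * s) ds on the grid t_n = n * dt,
    computed with the cumulative trapezoidal rule."""
    n_steps = int(round(t_final / dt))
    t = dt * np.arange(n_steps + 1)
    vals = np.asarray(v(w_bar * t), dtype=float)
    increments = 0.5 * dt * (vals[1:] + vals[:-1])
    Y = np.concatenate(([0.0], np.cumsum(increments)))
    return t, Y

def mean_square_displacement_mc(sample_field, n_real, t_final, dt=0.1, w_bar=1.0):
    """Monte Carlo estimate of <Y(t)^2> from n_real >= 1 independent realizations.
    sample_field() must return a callable v(x) for one realization."""
    acc = 0.0
    for _ in range(n_real):
        t, Y = shear_displacement(sample_field(), t_final, dt, w_bar)
        acc = acc + Y**2
    return t, acc / n_real
```

Any of the field generators discussed in this section could be passed as `sample_field`; the time-stepping error can be checked separately by feeding in a deterministic field with a known integral.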
6.2.2. Fourier space-based method

We shall now define the Fourier Method in more detail, and apply it to the RSS Model. We will find inherent limitations of the method in simulating turbulent diffusion [83,140].

6.2.2.1. Derivation of Fourier method. We shall provide two simple derivations of the basic simulation formula for the Fourier Method. One is a direct discretization of the stochastic Fourier integral representation of the random field $v(x)$. The second circumvents the stochastic integral representation, and provides a useful framework for comparing the underpinnings of the Fourier Method with those of the Randomization Method to be discussed in Section 6.2.3.

Discretization of stochastic Fourier integral. A natural means of obtaining numerical schemes is through the truncation and discretization of exact continuum formulas. We apply this approach to the stochastic Fourier integral representation (337)

$$v(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, E^{1/2}(|k|)\, d\tilde W(k)$$

by a Riemann sum approximation over a finite symmetric partition of $2M+1$ intervals, with equal widths $\Delta k$. This partition extends over a finite segment $[-k_{\max}, k_{\max}]$, with $k_{\max} = (M + \tfrac12)\Delta k$. Evaluating the integrand at the midpoint of the intervals, we arrive at the following random Riemann–Stieltjes sum for the approximating velocity field:

$$v_F(x) = \sum_{j=-M}^{M} e^{-2\pi i j \Delta k\, x}\, E^{1/2}(|j|\Delta k)\, \Delta\tilde W_j, \tag{345}$$

where the complex random variables $\Delta\tilde W_j$ are defined in terms of the complex white noise process:

$$\Delta\tilde W_j = \int_{(j-1/2)\Delta k}^{(j+1/2)\Delta k} d\tilde W(k).$$

From the formal rules (338) for the statistics of the white noise process, we find that $\{\Delta\tilde W_j\}_{j=1}^{M}$ are statistically independent complex Gaussian random variables with the properties:

$$\Delta\tilde W_{-j} = \overline{\Delta\tilde W_j}, \qquad \langle \Delta\tilde W_j\rangle = 0, \qquad \langle (\Delta\tilde W_j)^2\rangle = 0, \qquad \langle \Delta\tilde W_j\, \overline{\Delta\tilde W_j}\rangle = \Delta k.$$

Also, $\Delta\tilde W_0$ is independent of all these variables, and is itself a mean zero, real Gaussian random variable with variance $\Delta k$. We can therefore rewrite Eq. (345) as

$$v_F(x) = E^{1/2}(0)\, \Delta\tilde W_0 + 2\, \mathrm{Re} \sum_{j=1}^{M} E^{1/2}(j\Delta k)\, e^{-2\pi i j \Delta k\, x}\, \Delta\tilde W_j,$$

where Re denotes the real part of the following expression. Expanding the complex random variable $\Delta\tilde W_j$ into real and imaginary parts, we obtain a concise expression for the approximate velocity field as a discrete random sum of real Fourier modes. We will call its numerical implementation the Fourier Method. The approximate velocity field is written:

$$v_F(x) = \sum_{j=0}^{M} \left(2 E(k_j)\, \Delta k_j\right)^{1/2} \left[\xi_j \cos(2\pi k_j x) + \eta_j \sin(2\pi k_j x)\right], \tag{346}$$
where the wavenumbers $k_j = j\Delta k$ denote the locations of the equispaced grid points, and $\Delta k_j = \Delta k$ for $j = 1, \dots, M$ and $\Delta k_0 = \tfrac12 \Delta k$. The $\{\xi_j, \eta_j\}_{j=0}^{M}$ are a collection of independent standard Gaussian random variables (mean zero and unit variance). (If $E(k)$ diverges at $k = 0$, as for the RSS Models with $1 < \varepsilon < 2$, then the $j = 0$ term requires some special treatment.)

The equal spacing of the grid points permits a rapid passage from the set of random Fourier coefficients to the random values of $v_F(x)$ on the (equispaced) physical space grid through the Fast Fourier Transform ([50], Ch. 18). The Gaussian random coefficients can be simulated by applying a Box–Muller transformation ([163], Section 1.3) to uniformly distributed random variables on the unit interval, which can be supplied by standard computer random number generators. The Fourier Method with equispaced grid points has been utilized by Voss [336] in the production of fractal sceneries and by Viecelli and Canfield [335] in the simulation of a fully developed turbulent velocity field with about one decade of an inertial range.

An important numerical feature of the Fourier Method is that the simulated random velocity field is periodic with period $(\Delta k)^{-1}$ in every realization. The true velocity field $v(x)$, however, has no such periodicity when the spectrum $E(k)$ is continuous.

Derivation by random Fourier sum ansatz. We now offer another means of arriving at the simulation formula (346) for the Fourier Method which has enough flexibility to yield the simulation formula for the Randomization Method as well. Rather than proceeding deductively from the stochastic Fourier integral representation for the random field $v(x)$, we simply declare that we will seek a finite spectral approximation. We begin by cutting off Fourier space to a finite segment $[0, k_{\max}]$, and partitioning this segment into $M+1$ disjoint intervals, which need not be of equal width (see Fig. 21). We define $k_0 = 0$ and $\Delta k_0$ as the width of the interval abutting this point, and take $\{k_j\}_{j=1}^{M}$ as the midpoints and $\{\Delta k_j\}_{j=1}^{M}$ as the widths of the remaining intervals comprising the partition, ordered from left to right. We think of $k_j$ as a representative wavenumber from its interval of wavenumber space. We then form a Fourier sum with these wavenumbers:

$$v_F(x) = \sum_{j=0}^{M} \left[a_j \cos(2\pi k_j x) + b_j \sin(2\pi k_j x)\right], \tag{347}$$

with real, random coefficients $\{a_j\}_{j=0}^{M}$ and $\{b_j\}_{j=0}^{M}$. We wish to choose the probability distribution of these random variables so that $v_F(x)$ approximates the random field $v(x)$.

First, $v_F(x)$ should be a Gaussian, homogeneous random field with mean zero. The fact that linear combinations of mean zero Gaussian random variables are mean zero and Gaussian suggests that $\{a_j, b_j\}_{j=0}^{M}$ should be taken according to a jointly Gaussian distribution with zero mean. By substituting the right-hand side of Eq. (347) into $\langle v_F(x')\, v_F(x'+x)\rangle$, and noting that this expression must be independent of $x'$ by statistical homogeneity, we find that the random
Fig. 21. Partition of a finite segment $[0, k_{\max}]$ of Fourier space into $M+1$ intervals in the Fourier Method.
variables $\{a_j, b_j\}_{j=0}^{M}$ must all be mutually independent of one another, and that $\langle a_j^2\rangle = \langle b_j^2\rangle \equiv v_j^2$ for $0 \le j \le M$. We therefore express our random Fourier sum as:

$$v_F(x) = \sum_{j=0}^{M} v_j \left[\xi_j \cos(2\pi k_j x) + \eta_j \sin(2\pi k_j x)\right], \tag{348}$$

where $\{\xi_j, \eta_j\}_{j=0}^{M}$ is a collection of independent, standard Gaussian random variables, and $\{v_j\}_{j=0}^{M}$ are constant amplitudes which we are left to choose. We pick these amplitudes by requiring that the correlation function $R_F(x) = \langle v_F(x')\, v_F(x'+x)\rangle$ of $v_F(x)$ approximates the true correlation function (339):

$$R(x) = \langle v(x')\, v(x'+x)\rangle = 2\int_0^{\infty} \cos(2\pi k x)\, E(k)\, dk. \tag{349}$$

Expanding the double sum and computing the averages in $R_F(x)$, we find that the correlation function of the approximate random field $v_F(x)$ is

$$R_F(x) = \sum_{j=0}^{M} v_j^2 \left[\cos(2\pi k_j x') \cos(2\pi k_j (x'+x)) + \sin(2\pi k_j x') \sin(2\pi k_j (x'+x))\right] = \sum_{j=0}^{M} v_j^2 \cos(2\pi k_j x). \tag{350}$$

A discrete sum of this form can be obtained by a Riemann sum approximation of the integral on the right-hand side of Eq. (349), using the partition defined in Fig. 21:

$$R(x) \approx \sum_{j=0}^{M} 2 E(k_j)\, \Delta k_j \cos(2\pi k_j x). \tag{351}$$

Note that we have implicitly dropped the contribution of the integral from $k \ge k_{\max}$, but this should not be a serious matter if the energy spectrum $E(k)$ decays rapidly for large $k$ and $k_{\max}$ is chosen sufficiently large. Upon comparison with Eq. (350), we find that

$$v_j = \left(2 E(k_j)\, \Delta k_j\right)^{1/2}$$

will make $v_F(x)$ a consistent approximation to the random field $v(x)$. We arrive therefore at exactly the same Fourier Method simulation formula (346) as before, but the wavenumbers $\{k_j\}_{j=0}^{M}$ need not be equispaced.

We note from Eqs. (349) and (350) that the energy spectrum of the velocity field simulated by the Fourier Method is the following discrete approximation:

$$E_F(k) = \sum_{j=0}^{M} E(k_j)\, \Delta k_j\, \delta(k - k_j) \tag{352}$$

to the continuous energy spectrum $E(k)$.
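The simulation formula (346) with equispaced wavenumbers can be sketched directly as a finite sum, without the fast Fourier transform. The function names are assumptions for illustration, and setting the $j = 0$ amplitude to zero when $E(0)$ is infinite is one simple choice (the "special treatment" of that term is not specified at this point in the text).

```python
import numpy as np

def rss_spectrum(k, eps, A_E=1.0, L_K=1.0):
    """Energy spectrum E(k) of Eq. (343b)."""
    k = np.asarray(k, dtype=float)
    return (2.0 * np.pi) ** (2.0 - eps) * A_E * k ** (1.0 - eps) * np.exp(-2.0 * np.pi * L_K * k)

def fourier_method_field(E, M, dk, rng):
    """One realization of Eq. (346) with k_j = j * dk, j = 0..M.
    Returns a callable v(x) accepting a scalar or an array of positions."""
    k = dk * np.arange(M + 1)
    widths = np.full(M + 1, dk)
    widths[0] = 0.5 * dk                              # Delta k_0 = dk / 2
    with np.errstate(divide="ignore", invalid="ignore"):
        amp = np.sqrt(2.0 * E(k) * widths)
    amp[~np.isfinite(amp)] = 0.0                      # assumption: drop j = 0 if E(0) diverges
    xi, eta = rng.standard_normal((2, M + 1))         # independent standard Gaussians

    def v(x):
        phase = 2.0 * np.pi * np.outer(np.atleast_1d(x), k)
        return np.cos(phase) @ (amp * xi) + np.sin(phase) @ (amp * eta)

    return v
```

Because every wavenumber is an integer multiple of `dk`, each realization returned by this sketch repeats itself after a distance `1 / dk`, which is the periodicity discussed in the text.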
There are at least two reasons one might want to choose unequally spaced wavenumbers, even though one thereby loses the possibility of using the fast Fourier transform. First, when the wavenumbers in the discrete sum are equispaced, then all the random Fourier modes are harmonically aligned, and the simulated random field will be exactly periodic (with period $k_1^{-1} = (\Delta k)^{-1}$). This can have undesirable consequences, as we shall explicitly see in the RSS Model application below. Secondly, one may wish to refine the partition of wavenumber space near regions of large values or rapid variation of $E(k)$. In particular, for the RSS Model with $1 < \varepsilon < 2$, $E(k)$ diverges at $k = 0$, and one might want to place extra points near $k = 0$ to improve the accuracy of the simulated velocity field at large scales. In Section 6.2.3 below, we will investigate the effects of using nonuniformly spaced wavenumber grid points $k_j$ within the Randomization Method, wherein these wavenumbers are chosen randomly. For our subsequent discussion of the Fourier Method, we restrict attention to equispaced wavenumber grid points $\{k_j\}_{j=0}^{M}$.

We remark that we could also have handled nonuniformly spaced wavenumbers through discretization of the stochastic Fourier integral (337). The present procedure generalizes more readily, however, to allow a random choice of wavenumbers $\{k_j\}_{j=0}^{M}$, as we will discuss in Section 6.2.3.

6.2.2.2. Fourier Method applied to RSS turbulent transport model. Examples of random fields generated by the (equispaced) Fourier Method can be found in the papers of Voss [336] and Viecelli and Canfield [335]. With sufficiently fine wavenumber spacing $\Delta k$, the method suffices to produce visually appealing fractal fields [336], with about a decade of statistically self-similar scaling [335]. The authors of each paper complain of the large number of wavenumbers needed to produce a satisfactory fractal field, and prefer the Successive Random Addition Method, which will be discussed in Section 6.3.1.

We shall examine the practicality of the Fourier Method for the particular application of simulating turbulent transport by trying it in the RSS Model [83,140]. We will emphasize the consequences of the inherent periodicity of the velocity field simulated by the random Fourier Method. These are clearly brought out in a simulation of a sub-diffusive RSS Model ($\varepsilon = 1/2$). In Fig. 22, the mean-square tracer displacement is shown for a Monte Carlo simulation for a Fourier sum with $M = 200$ wavenumbers, spaced by $\Delta k = 1/(40\pi)$, and averaged over 2000 realizations. (The error from truncating the wavenumbers $k \ge k_{\max} = (M + \tfrac12)\Delta k \approx 10$ is then less than 0.1%.) We see a systematic downward turning of the Fourier Method simulation from the exact result for times $t \gtrsim 15$. This can readily be traced to the fact that the simulated velocity field has period $1/\Delta k \approx 125$. From the exact formula for the trajectory:

$$Y(t) = \int_0^t v(\bar w s)\, ds,$$

and the fact that $v_F(x)$ has a vanishing coefficient of the $k_0 = 0$ mode, we see that (up to numerical integration error) the simulated value of $\sigma_Y^2(t) = \langle Y^2(t)\rangle$ must also be periodic and vanish at $t = (\bar w \Delta k)^{-1} \approx 125$. The true value of $\sigma_Y^2(t)$, on the other hand, continues to grow according to a $t^{1/2}$ power law for $t \gg 1$. The simulated mean-square displacement must therefore turn down from the true solution. The departure becomes noticeable in Fig. 22 after roughly an eighth of a period.
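The periodicity argument can be made concrete by integrating each Fourier mode of (346) in closed form. Assuming the $j = 0$ mode is absent, a short calculation (ours, not taken from the source) gives the ensemble-mean square displacement of the discretized field as $\sum_{j \ge 1} 2E(k_j)\,\Delta k\; 2\,[1 - \cos(\omega_j t)]/\omega_j^2$ with $\omega_j = 2\pi \bar w k_j$. The sketch below evaluates this sum; the function names and default parameters are illustrative assumptions.

```python
import numpy as np

def rss_spectrum(k, eps, A_E=1.0, L_K=1.0):
    """Energy spectrum E(k) of Eq. (343b)."""
    k = np.asarray(k, dtype=float)
    return (2.0 * np.pi) ** (2.0 - eps) * A_E * k ** (1.0 - eps) * np.exp(-2.0 * np.pi * L_K * k)

def fourier_method_msd(t, E, M, dk, w_bar=1.0):
    """Ensemble mean of Y(t)^2 for the equispaced Fourier sum (346),
    with the j = 0 mode omitted, obtained by integrating each mode exactly."""
    k = dk * np.arange(1, M + 1)
    amp2 = 2.0 * E(k) * dk                        # squared mode amplitudes v_j^2
    omega = 2.0 * np.pi * w_bar * k               # frequency seen by the swept tracer
    t = np.atleast_1d(np.asarray(t, dtype=float))
    mode_msd = 2.0 * (1.0 - np.cos(np.outer(t, omega))) / omega**2
    return mode_msd @ amp2
```

With $\Delta k = 1/(40\pi)$ and $\bar w = 1$, every term vanishes at $t = 40\pi \approx 125$ and the curve is symmetric about half that time, so the discretized model cannot follow a monotonically growing exact solution over a full period.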
Fig. 22. Mean-square tracer displacement along the shear for RSS Model with $\varepsilon = 1/2$ (from [83]). Thin line: exact formula. Thick line: Fourier Method simulation with $M = 200$ wavenumbers, $\Delta k = 1/(40\pi)$, and 2000 realizations.
To verify that this discrepancy is due to periodicity effects, and not due to finite sample size or truncation error, the simulation was repeated with $\Delta k = 1/(160\pi)$ and $M = 800$ wavenumbers. This increases the inherent period of the velocity field and of $\sigma_Y^2(t)$ to $160\pi \approx 500$, a factor of 4 greater than before. The results are plotted in Fig. 23. The agreement between the simulated mean-square displacement and the exact result is now good through time $t \lesssim 60$, again an eighth of the artificial period. (The simulated curve starts turning down at times greater than those shown in the figure [140].)

Therefore, we see that the periodicity of the Fourier Method is a definite obstacle in the accurate simulation of turbulent diffusion over long time scales. To contend with it, one would need to choose a wavenumber spacing so small that the tracer does not cross more than an eighth of the period of the simulated velocity field, and this may require an enormous amount of computational labor, even with the fast Fourier transform. As we shall discuss in Section 6.2.4, the Moving Average Method appears to be a preferable choice for simulating random velocity fields without strong long-range correlations, such as the RSS Model with $\varepsilon = 1/2$.

The Fourier Method has a further difficulty when simulating super-diffusive tracer motion in a velocity field with strong long-range correlations. In Fig. 24, we show a simulation of the RSS Model with $\varepsilon$ in the super-diffusive range $1 < \varepsilon < 2$, with the same choice of other numerical parameters as in Fig. 23. The $j = 0$ term in the random Fourier series (346) is problematic because of the infrared divergence of energy; in the present simulations it is just dropped. We see that the Fourier Method simulation undershoots the exact result even though the plotted times extend only up to an eighth of the artificial period. If we were to somehow retain a nontrivial $j = 0$ term in the random Fourier series (346), the
Fig. 23. Mean-square tracer displacement along the shear for RSS Model with $\varepsilon = 1/2$ (from [83]). Thin line: exact formula. Thick line: Fourier Method simulation with $M = 800$ wavenumbers, $\Delta k = 1/(160\pi)$, and 2000 realizations.

Fig. 24. Mean-square tracer displacement along the shear for RSS Model in the super-diffusive regime $1 < \varepsilon < 2$ (from [140]). Thin line: exact formula. Thick line: Fourier Method simulation with $M = 800$ wavenumbers, $\Delta k = 1/(160\pi)$, and 2000 realizations.
mean-square tracer displacement would instead show a ballistic overshoot of the true super-diffusive behavior at long times. The actual super-diffusive behavior of the tracer displacement is very sensitive to the way in which energy is concentrated at low wavenumbers, but the Fourier Method with equispaced grid points cannot adequately resolve the $k^{1-\varepsilon}$ singularity in $E(k)$ at $k = 0$. We will see in Section 6.2.3 that appropriately randomizing the wavenumbers in the discrete sum can overcome the deficiencies of the direct Fourier Method for the flows in the RSS Model with long-range correlations ($1 < \varepsilon < 2$).

6.2.2.3. Conclusions regarding Fourier Method. The Fourier Method (with equispaced grid points) has been unambiguously shown to fail in efficiently producing accurate statistics for the motion of a tracer over long-time intervals in the simple RSS Model. The main difficulty is the strong systematic error induced by the artificial periodicity of the simulated flow. The Fourier Method furthermore cannot resolve sufficiently strong long-range correlations when they are present. These deficiencies are inherent to the Fourier Method in general for the simulation of turbulent diffusion. Some better options for various applications will be discussed throughout the remainder of Section 6.

6.2.3. Randomization Method

One way in which various investigators have sought to overcome the numerical artifacts of the equispaced Fourier Method is to choose randomly the wavenumbers $\{k_j\}_{j=0}^{M}$ appearing in the finite Fourier sum approximation:

$$v(x) \approx \sum_{j=0}^{M} \left[a_j \cos(2\pi k_j x) + b_j \sin(2\pi k_j x)\right].$$

For example, Kraichnan [180] deterministically assigned the magnitudes of the wavevectors appearing in a simulated multidimensional velocity field, but selected their direction according to a random uniform distribution on the sphere. Sabelfeld and other scientists at the Computing Center at Novosibirsk [190,240,291] later developed a more substantial variation in which the magnitudes of the wavevectors are also randomly chosen. We shall call this latter algorithm the Randomization Method, and apply it to the problem of simulating the turbulent diffusion of a tracer in the RSS Model. We will see that it eliminates the periodicity problem intrinsic to the direct Fourier Method, and performs quite well for the diffusive and super-diffusive class of models $1 \le \varepsilon < 2$, which include velocity fields with strong long-range correlations. The Randomization Method is not very successful, however, in simulating sub-diffusive tracer motion in the RSS Models with slowly decaying negative correlations ($0 < \varepsilon < 1$). We shall first define the Randomization Method precisely, then present the results of the simulations in the RSS Model.

6.2.3.1. Definition of Randomization Method. In a manner similar to our second derivation of the Fourier Method (Paragraph 6.2.2.1), the prescription of the random velocity field in the Randomization Method begins with a deterministic partition of wavenumber space into $M$ disjoint subintervals $\{I_j\}_{j=1}^{M}$ (Fig. 25). We do not, however, confine the partition to a finite segment; $I_M$ extends all the way to $+\infty$. We now choose a representative wavenumber $k_j$ in each interval according to
Fig. 25. Partition of the Fourier space into $M$ intervals in the Randomization Method.
a probability density function (PDF) $p_j(k)$ weighted by the energy density $E(k)$:

$$\mathrm{Prob}\{k_j \in A\} = \int_A p_j(k)\, dk, \qquad p_j(k) = \begin{cases} \dfrac{2E(k)}{v_j^2} & \text{for } k \in I_j, \\[1ex] 0 & \text{for } k \notin I_j, \end{cases} \tag{353}$$

where

$$v_j^2 = 2\int_{I_j} E(k)\, dk. \tag{354}$$

The random velocity field $v(x)$ is then simulated as a random Fourier sum using these wavenumbers:

$$v_R(x) = \sum_{j=1}^{M} v_j \left[\xi_j \cos(2\pi k_j x) + \eta_j \sin(2\pi k_j x)\right], \tag{355}$$

where $\{\xi_j, \eta_j\}_{j=1}^{M}$ is a collection of independent standard Gaussian random variables.

Upon comparison with Eq. (346), we see that the random Fourier sum has the same form in the Randomization Method as in the direct Fourier Method. An inessential difference is the particular expression (354) for the amplitudes $\{v_j\}_{j=1}^{M}$; one could consistently use these expressions for the standard Fourier Method as well. The important distinction is that the wavenumbers $\{k_j\}_{j=1}^{M}$ appearing in the sum are chosen randomly within their associated interval in the Randomization Method, rather than at the midpoint as in the Fourier Method.

The amplitudes of the Fourier modes $v_j$ and probability distribution $p_j(k)$ for the wavenumbers which were described in Eqs. (353) and (354) are uniquely specified by insisting that the simulated random field $v_R(x)$ have the same covariance as the desired random field $v(x)$, as we now demonstrate. We begin by positing a general random Fourier sum approximation of the same form that arose in our alternate derivation of the Fourier Method (348):

$$v_R(x) = \sum_{j=1}^{M} v_j \left[\xi_j \cos(2\pi k_j x) + \eta_j \sin(2\pi k_j x)\right], \tag{356}$$

where $\{\xi_j, \eta_j\}_{j=1}^{M}$ is a collection of independent, standard Gaussian random variables. We further suppose the wavenumbers $k_j$ to be randomly distributed within their intervals $I_j$. We desire to choose the PDFs $\{p_j(k)\}_{j=1}^{M}$ of these wavenumbers as well as the constant nonnegative amplitudes
$v_j$ so that the correlation function $R_R(x)$ of the simulated random field $v_R(x)$ approximates the correlation function of $v(x)$,

$$R(x) = \langle v(x')\, v(x'+x)\rangle = 2\int_0^{\infty} \cos(2\pi k x)\, E(k)\, dk, \tag{357}$$

as well as possible.

The correlation function of the random Fourier sum (356) may be computed by first averaging over the Gaussian random variables $\{\xi_j, \eta_j\}_{j=1}^{M}$ as in our derivation of the standard Fourier Method, and then taking another average over the distribution of the wavenumbers $\{k_j\}_{j=1}^{M}$:

$$R_R(x) = \langle v_R(x')\, v_R(x'+x)\rangle = \left\langle \langle v_R(x')\, v_R(x'+x)\rangle_{\xi,\eta} \right\rangle_k = \left\langle \sum_{j=1}^{M} v_j^2 \cos(2\pi k_j x) \right\rangle_k = \sum_{j=1}^{M} \int_{I_j} \cos(2\pi k x)\, v_j^2\, p_j(k)\, dk.$$

(See Eq. (350) for the derivation of the third equality.) Comparing with Eq. (357), we see that in fact we can make $R_R(x)$ identically equal to $R(x)$ by choosing $p_j(k) = 2 v_j^{-2} E(k)$ for $k \in I_j$, $j = 1, \dots, M$. The formula for $v_j$ (354) then follows simply from the normalization $\int_{I_j} p_j(k)\, dk = 1$.
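A minimal sketch of the prescription (353)–(355) for the RSS spectrum (343b) follows. It assumes, purely for illustration, a single interval $I_1 = [0, \infty)$ and a sum of `n_modes` independent single-mode fields normalized by $\sqrt{n_{\text{modes}}}$; for that choice, $p(k) = 2E(k)/v^2$ works out to a gamma density with shape $2 - \varepsilon$ and rate $2\pi L_K$, and $v^2 = 2\int_0^\infty E\,dk = 2\Gamma(2-\varepsilon) A_E L_K^{\varepsilon - 2}$. The function name and defaults are hypothetical.

```python
import math
import numpy as np

def randomization_field(eps, n_modes, rng, A_E=1.0, L_K=1.0):
    """One realization of a Randomization Method field for the RSS spectrum.
    Returns (v, k): the field as a callable and the sampled wavenumbers."""
    shape = 2.0 - eps
    scale = 1.0 / (2.0 * np.pi * L_K)
    k = rng.gamma(shape, scale, size=n_modes)              # k_j drawn from p(k) = 2 E(k) / v^2
    v2 = 2.0 * math.gamma(shape) * A_E * L_K ** (eps - 2.0)  # total variance v^2 = R(0)
    amp = math.sqrt(v2 / n_modes)                          # equal share of variance per mode
    xi, eta = rng.standard_normal((2, n_modes))

    def v(x):
        phase = 2.0 * np.pi * np.outer(np.atleast_1d(x), k)
        return amp * (np.cos(phase) @ xi + np.sin(phase) @ eta)

    return v, k
```

Averaging $v^2 \cos(2\pi k x)$ over many sampled wavenumbers should reproduce the closed-form $R(x)$ up to sampling error, which is the content of the covariance calculation above.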
6.2.3.2. General comments on the Randomization Method. The above calculation points out another advantage of the Randomization Method over the Fourier Method besides solving the periodicity problem. Whereas the correlation function of the velocity field $v_F(x)$ simulated by the Fourier Method was only a discrete Riemann sum approximation of the true correlation function $R(x)$, the correlation function of the velocity field $v_R(x)$ simulated by the Randomization Method is exactly $R(x)$. The Randomization Method is therefore free of systematic truncation and discretization errors, at least insofar as second-order statistics are concerned. The reason this is possible is that the wavenumbers $\{k_j\}_{j=1}^{M}$ are allowed to vary over a continuum, so that an ensemble average can lead to the desired continuous energy spectrum $E(k)$. With fixed wavenumbers, as in the Fourier Method, the simulated spectrum can at best be a discrete approximation to $E(k)$ (352).

Improved computational efficiency can be expected for the Randomization Method because of its preferential distribution of wavenumbers toward the most energetic parts of the spectrum. In particular, the Randomization Method should resolve strong concentrations of energy in wavenumber space much better than the equispaced Fourier Method.

However, the Randomization Method does have some drawbacks in practical implementation. First of all, the fast Fourier transform cannot be used because the wavenumbers are not confined to a regular grid. If one is interested in the turbulent transport of a small number of tracers, however, then the random velocity field $v_R(x)$ need only be evaluated at their momentary positions, so the loss of the fast Fourier transform is of no great concern.

A potentially more serious disadvantage is the fact that the velocity field simulated by the Randomization Method is non-Gaussian, due to the randomness of the wavenumbers $\{k_j\}_{j=1}^{M}$. The single-point PDF of $v_R(x)$ is Gaussian, but the PDF of any two-point velocity increment $v_R(x) - v_R(x')$ is broader-than-Gaussian because such increments are mixtures of mean zero Gaussian random variables of different variances. (See the discussion in Section 5.2.2.) The Gaussianity of the simulated random field can be improved in principle by taking the intervals $\{I_j\}_{j=1}^{M}$ of the partition sufficiently fine in a certain sense [240,291]. It is not clear, however, whether the improved Gaussianity would require an inordinately large
M in practice, particularly for velocity "elds with a wide range of active scales. A more promising approach, suggested by Kraichnan in a multi-dimensional context [180], is to produce M I independent approximate velocity "elds +vH ,+I of the form (355), and then take the simulated "eld as 0 H a suitably normalized sum: 1 +I vH (x) . (358) v (x)" 0 (M I H 0 The resulting random "eld v (x) has mean zero and the appropriate correlation function R(x). 0 Moreover, by the Central Limit Theorem, the statistics of the random "eld v (x) will approach 0 a Gaussian form if MI is taken su$ciently large. We will return to the issue of non-Gaussianity of the Randomization Method in Section 6.4. We "nally remark that although the Randomization Method has no formal truncation or discretization errors, it does not magically avoid the error incurred in approximating a random "eld with continuous spectrum by a "nite sum of Fourier modes. This source of error rather becomes transferred to the Monte Carlo sampling error. The degree to which the statistics of a "nite sample size will accurately resemble those of the entire ensemble depends on how closely the realizations of the simulated random "eld mimic the properties of the true random "eld. For example, in a turbulent velocity "eld with a power law inertial-range spectrum, the number of intervals (M) in the partition of wavenumber space must be chosen su$ciently large to ensure that each simulated velocity "eld contains a typical distribution of scales [140,291]. A helpful strategy toward this end is to choose the intervals +I ,+ of the partition to contain equal amounts of H H energy:
$$2\int_{I_j} E(k)\,dk = \frac{2}{M}\int_0^\infty E(k)\,dk = \frac{1}{M}\langle (v(x))^2\rangle .$$

Note that this partition is naturally associated to a Lebesgue integration of the energy spectrum $E(k)$, whereas the equispaced Fourier Method is built from a standard Riemann sum approximation to the integral of $E(k)$.
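The equal-energy partition and the normalized sum (358) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in [140,291]: the spectrum $E(k) = e^{-k}$ is a hypothetical stand-in chosen because its cumulative energy can be inverted in closed form, each $M$-mode field is assumed to be a sum of cosine and sine modes with one random wavenumber per interval (Eq. (355) itself is not reproduced in this section), and the function names are made up for the example.

```python
import numpy as np

def equal_energy_edges(M):
    """Edges of M intervals of equal energy for the assumed spectrum E(k) = exp(-k)."""
    fractions = np.arange(M + 1) / M              # cumulative energy fractions 0, 1/M, ..., 1
    with np.errstate(divide="ignore"):
        return -np.log1p(-fractions)              # inverse of F(k) = 1 - exp(-k); last edge is inf

def sample_field(M, M_tilde, rng):
    """Random coefficients of M_tilde independent fields with M modes each."""
    # One uniform variate per energy interval, mapped through the inverse of F,
    # gives a wavenumber distributed proportionally to E(k) inside that interval.
    u = (np.arange(M) + rng.random((M_tilde, M))) / M
    k = -np.log1p(-u)
    xi = rng.standard_normal((M_tilde, M))
    eta = rng.standard_normal((M_tilde, M))
    return k, xi, eta

def evaluate(x, k, xi, eta, variance=2.0):
    """Normalized sum (358) at the points x; variance = 2 * integral of E = 2 here."""
    M_tilde, M = k.shape
    amplitude = np.sqrt(variance / M)             # equal energy: every mode has the same amplitude
    phase = 2.0 * np.pi * k[..., None] * np.atleast_1d(x)
    modes = xi[..., None] * np.cos(phase) + eta[..., None] * np.sin(phase)
    return amplitude * modes.sum(axis=(0, 1)) / np.sqrt(M_tilde)

rng = np.random.default_rng(0)
k, xi, eta = sample_field(M=4, M_tilde=8, rng=rng)
print(equal_energy_edges(4))
print(evaluate([0.0, 0.5, 1.0], k, xi, eta))
```

Setting `M=1` and `M_tilde=32` in this sketch corresponds to a trivial partition with 32 independently drawn single-mode fields.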
6.2.3.3. Randomization Method applied to RSS Turbulent Transport Model. The Randomization Method has been used by Sabelfeld and coworkers [190,240,291] to simulate the motion of tracers and pairs of tracers in a fully developed turbulent velocity field with a small inertial range. We will discuss the Randomization Method in this more demanding context later in Section 6.3.2. Here, we apply the Randomization Method to the RSS Model to assess how well it simulates various types of turbulent tracer transport behavior. We are particularly interested in examining the extent to which the Randomization Method alleviates some of the inherent difficulties of the Fourier Method.

The results presented here for the Randomization Method originate in the thesis of Horntrop [140]. These simulations adopt a simple incarnation of the Randomization Method, in which the random Fourier series of the simulated velocity field $v_R(x)$ consists of 32 wavenumbers chosen independently from $[0,\infty)$, with PDF given by $p(k) = 2E(k)/\langle v^2\rangle$, where $\langle v^2\rangle = 2\int_0^\infty E(k)\,dk$. For the energy spectra in the RSS Model (343b), the random wavenumbers are then distributed according to a gamma distribution [103], and can be efficiently generated on a computer. In the above general notation, this setup corresponds to taking a trivial partition of wavenumber space
($M = 1$ in Eq. (355)) and building the simulated field by summing $\tilde{M} = 32$ independent realizations of single-mode velocity fields (see Eq. (358)).

The superiority of the Randomization Method over the Fourier Method is clearly demonstrated in the super-diffusive regime of the RSS Model. In Fig. 26, the mean-square tracer displacement produced by the Randomization Method in this regime is indistinguishable from the exact result. The relative error is less than 8% throughout the simulation, and actually settles down to about 1% for $30 \le t \le 200$ [142]. The favorable comparison between the results of the Randomization Method (Fig. 26) and the Fourier Method (Fig. 24) becomes even more striking when it is noted that the Randomization Method uses only $\tilde{M} = 32$ wavenumbers (relative to $M = 800$ for the Fourier Method), and that the Randomization Method is plotted over a longer time interval. The success of the Randomization Method in simulating super-diffusive tracer motion can be attributed to its (random) selection of wavenumbers over a continuous range, with preferential weighting toward low wavenumbers where the energy is strongly concentrated ($E(k) \approx A_E k^{1-\varepsilon}$ for $k \to 0$). At least within the RSS Model, the Randomization Method takes proper account of the long-range correlations in the velocity field which give rise to super-diffusive tracer motion. The periodicity problem of the Fourier Method is also completely avoided. A sufficiently large sample size, however, is necessary to obtain the good agreement observed in Fig. 26. The sampling error becomes quite noticeable if the average involves only 500 independent realizations [140].

The Randomization Method also produces good results for RSS Models with diffusive ($\varepsilon = 1$) and trapping ($\varepsilon < 0$) behavior [140], but does not fare so well in the sub-diffusive regime ($0 < \varepsilon < 1$). In Fig. 27, we see that the simulated tracer motion persistently overshoots the correct behavior in this regime.
Evidently, the Randomization Method is not adequately accounting for the slowly decaying negative tail in the correlation function $R(x)$ (see Fig. 20), even with 2000 samples.

One concern which should be addressed when using the Randomization Method is the extent to which the simulated fields deviate from Gaussianity. The ensembles generated in the above simulations did exhibit significantly non-Gaussian sample statistics [140]. This issue plays no role, however, in the simulation of the mean-square tracer displacement in the RSS Model, since this quantity only depends on the second-order statistics of the velocity field (344).

6.2.3.4. Conclusions regarding Randomization Method. The above simulations show the Randomization Method to be a superior variation of the Fourier Method in the simulation of turbulent tracer transport. Its flexible choice of wavenumbers alleviates the periodicity problem and properly incorporates long-range correlations in the simulated velocity field, at least in the RSS Model. The Randomization Method, however, demonstrates some difficulties in simulating random fields which have a correlation function with slowly decaying negative tails. One must also be aware that the Randomization Method generally produces random fields with non-Gaussian statistics. To summarize, the Randomization Method appears to be a good candidate for Monte Carlo simulation of random fields which have long-ranged positive correlations. We shall return to it in Section 6.3.2 when we consider the numerical simulation of turbulent velocity fields which have even stronger positive long-range correlations than those present in the super-diffusive RSS Models.

6.2.4. Physical space-based method

The Fourier Method and the Randomization Method presented above are based on the stochastic Fourier integral representation (337) of the random velocity field $v(x)$. The final
Fig. 26. Mean-square tracer displacement along the shear for the RSS Model in the super-diffusive regime (from [140]). In the upper graph, the thin line shows the exact formula, whereas the (nearly coincident) thick line shows the Randomization Method simulation with $\tilde{M} = 32$ wavenumbers and 2000 realizations. The lower graph shows the ratio of the simulated to true mean-square tracer displacement.
nonhierarchical Monte Carlo method for simulating random fields which we will consider is the Moving Average Method, the simplest of another class of methods which are derived from the physical-space stochastic integral representation (340) of the random field. After a brief general discussion of the Moving Average Method, we will apply it to the RSS Model and compare the
Fig. 27. Mean-square tracer displacement along the shear for the RSS Model in the sub-diffusive regime (from [140]). In the upper graph, the thin line shows the exact formula, whereas the thick line shows the Randomization Method simulation with $\tilde{M} = 32$ wavenumbers and 2000 realizations. The lower graph shows the ratio of the simulated to true mean-square tracer displacement.
outcomes with those of the Fourier-space-based methods discussed above [83,140]. We find an intrinsic obstacle for the Moving Average Method in simulating velocity fields with strong long-range correlations. In fact, utilizing it in such situations can lead to ostensibly plausible scaling behavior which is nonetheless incorrect! The Moving Average Method, however, simulates tracer transport in
those RSS Models without strong long-range correlations ($\varepsilon < 1$) more efficiently than the Fourier Method. In Section 6.3, we will present a hierarchical version of the Moving Average Method which performs extremely well in simulating tracer transport in a fractal random field with very strong long-range correlations [84,85].

6.2.4.1. Definition of Moving Average Method. The Moving Average Method is the direct physical-space-based analogue of the Fourier Method. It is obtained from the general physical-space stochastic integral representation (340) of the random field $v(x)$:
$$v(x) = \int_{-\infty}^{\infty} G(x - r)\,dW(r) \qquad (359)$$
in much the same way that the Fourier Method was derived from the stochastic Fourier integral representation (337). In Eq. (359), $dW(\cdot)$ is a real white noise measure with properties stated in Eq. (341), and
$$G(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x} E^{1/2}(|k|)\,dk = 2\int_0^{\infty} \cos(2\pi k x)\,E^{1/2}(k)\,dk \qquad (360)$$
is a symmetric function peaked at the origin. To implement Eq. (359) numerically, we define a symmetric partition of the real line into intervals of equal width $\Delta r$, and use these to construct a Riemann sum approximation (with infinitely many terms) using a midpoint rule discretization:

$$v(x) \approx \sum_{j=-\infty}^{\infty} G(x - j\Delta r)\,\Delta W_j , \qquad \Delta W_j = \int_{(j-1/2)\Delta r}^{(j+1/2)\Delta r} dW(r) . \qquad (361)$$
It is readily checked from Eq. (341) that $\{\Delta W_j\}_{j=-\infty}^{\infty}$ is an infinite collection of independent Gaussian random variables with mean zero and variance $\Delta r$. We do not simply truncate Eq. (361) into a fixed, finite sum, as in the Fourier Method, because here the magnitude of the integrand peaks at the variable point $x$. Instead, we specify a bandwidth $b$, and restrict the summation in Eq. (361) to $|j - \lfloor x/\Delta r\rfloor| \le b$, where $\lfloor x\rfloor$ denotes the greatest integer not exceeding $x$. This completes the definition of the Moving Average Method:

$$v_M(x) = \sum_{j=\lfloor x/\Delta r\rfloor - b}^{\lfloor x/\Delta r\rfloor + b} G(x - r_j)\,\xi_j\,\sqrt{\Delta r} , \qquad (362)$$
where $\{r_j = j\Delta r\}_{j=-\infty}^{\infty}$ are the equispaced grid points in the integration and $\{\xi_j\}_{j=-\infty}^{\infty}$ is a collection of independent, standard Gaussian random variables.
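A minimal Python sketch of Eq. (362) may help fix ideas. It is not taken from [228] or [140]: the class name, the cached dictionary of random variables, and the Gaussian-shaped weighting function are all assumptions made for illustration, and the true $G$ of Eq. (360) would have to be supplied for an actual RSS Model simulation.

```python
import math
import random

class MovingAverageField:
    """Sketch of Eq. (362) with lazily generated, cached Gaussian variables xi_j."""

    def __init__(self, G, dr, b, seed=0):
        self.G, self.dr, self.b = G, dr, b
        self._rng = random.Random(seed)
        self._xi = {}                          # grid index j -> xi_j, filled on first use

    def xi(self, j):
        if j not in self._xi:                  # grid point not needed before: draw a new variable
            self._xi[j] = self._rng.gauss(0.0, 1.0)
        return self._xi[j]                     # otherwise recall the stored value

    def __call__(self, x):
        centre = math.floor(x / self.dr)       # index of the grid point at or to the left of x
        return math.sqrt(self.dr) * sum(
            self.G(x - j * self.dr) * self.xi(j)
            for j in range(centre - self.b, centre + self.b + 1)
        )

def gaussian_weight(x):
    """Hypothetical rapidly decaying weighting function, for illustration only."""
    return math.exp(-x * x)

field = MovingAverageField(gaussian_weight, dr=0.1, b=50, seed=1)
print(field(0.33), field(0.41), field(0.33))   # the first and third values coincide
```

Because the variables are drawn on first use, the realization in this sketch depends on the order in which positions are visited, while repeated evaluations at a visited point stay consistent.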
Note that the truncation corresponds to an integration over the mobile segment $[\lfloor x\rfloor_{\Delta r} - r_c,\ \lfloor x\rfloor_{\Delta r} + r_c]$, where $r_c = (b + \tfrac{1}{2})\Delta r$ and $\lfloor x\rfloor_{\Delta r}$ denotes the grid point $r_j$ lying closest to and to the left of $x$. The Moving Average Method was applied by McCoy [228] in turbulent diffusion simulations. Mandelbrot and Wallis [220-222] and Feder [100] simulated one-dimensional fractal random fields using an analogous algorithm based on a one-sided moving average representation [219].

Note that there is no sensible implementation of the Moving Average Method using random or unequally spaced physical-space grid points $r_j$, because the computation of the convolution would become extremely complicated. In any case, there is no motivation for randomizing the physical-space grid points. First of all, the Moving Average Method does not suffer from the false periodicity of the equispaced Fourier Method. Secondly, the random field is statistically homogeneous in physical space, so there is no need to resolve special regions as there is in Fourier space when the spectrum is strongly concentrated near $k = 0$.

6.2.4.2. General comments on the Moving Average Method. We can already discern a general disadvantage of the Moving Average Method relative to the Fourier-space-based methods in that the random field simulated by the Moving Average Method is built out of an infinite number of random variables. To be sure, the restriction of the random field to any finite region refers to only finitely many of these variables. The practical difficulty in simulating turbulent tracer transport is keeping track of the random variables needed to evaluate the velocity field at the current tracer location. When the $x$ position of the tracer moves across a grid point $r_j$ which it has never visited before, a new independent random variable $\xi_{j+b}$ (or $\xi_{j-b}$) must be generated to evaluate $v(x)$. But if, as the tracer meanders, its $x$ position turns around and crosses a grid point $r_j$ which it has already visited, then the previously generated value of $\xi_{j-b}$ (or $\xi_{j+b}$) must be recalled.
Consequently, in a standard implementation, one would either need to precompute all the random variables which would be needed over a specified domain, or dynamically store and index all the random numbers generated as the tracer moves into new territory. Computer memory limitations will necessarily restrict the spatial region which the tracer is allowed to explore. An alternative procedure is to utilize a reversible random number generator (such as a linear congruential generator) with an indexing scheme which allows any particular random number $\xi_j$ to be obtained on demand [84]. The sequence of random numbers $\{\xi_j\}_{j=-\infty}^{\infty}$ is not explicitly stored in this implementation, so the memory limitations are averted. Of course, for time-dependent random velocity fields, such considerations become less significant.

Another disadvantage of the Moving Average Method relative to the Fourier-space-based methods is that the simulated random field $v_M(x)$ is not precisely statistically homogeneous. The statistics of $v_M(x)$ are generally invariant only under shifts $v_M(x) \to v_M(x + h)$ for $h$ an integer multiple of the grid spacing $\Delta r$. The statistics of $v_M(x)$ do depend, through the weighting factors $G(x - r_j)$, on the location of $x$ relative to the grid points. We will examine this issue in more detail in the context of a much improved, hierarchical version of the Moving Average Method in Section 6.3.2.

A more pressing concern regarding the Moving Average Method is revealed by a comparison of the correlation function of the simulated field with the true correlation function. We find that for
$x = q\Delta r > 0$ and $x' = q'\Delta r$,

$$\langle v_M(x + x')\,v_M(x')\rangle = \sum_{j=q+q'-b}^{q+q'+b}\ \sum_{j'=q'-b}^{q'+b} G((q+q')\Delta r - r_j)\,G(q'\Delta r - r_{j'})\,\Delta r\,\langle \xi_j \xi_{j'}\rangle$$

$$= \sum_{j=q+q'-b}^{q'+b} G(r_{q'-j} + x)\,G(r_{q'-j})\,\Delta r = \sum_{j=-b}^{b-q} G(x + r_j)\,G(r_j)\,\Delta r , \qquad (363)$$

whereas the correlation between the true velocity field at these same points is, from Eqs. (359) and (341):

$$\langle v(x + x')\,v(x')\rangle = R(x) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x + x' - r)\,G(x' - r')\,\langle dW(r)\,dW(r')\rangle$$

$$= \int_{-\infty}^{\infty} G(x + x' - r)\,G(x' - r)\,dr = \int_{-\infty}^{\infty} G(x + r)\,G(r)\,dr . \qquad (364)$$

The correlation function (363) of the velocity field $v_M(x)$ is a quadrature approximation of the correlation function (364) of the true velocity field $v(x)$. The most serious difference between the approximate and exact correlation functions is the truncation of the integration interval. In fact, for $x > 2r_c$, the correlation function of the simulated velocity field vanishes, because the two observation points $x + x'$ and $x'$ make use of disjoint subsets of the random variables $\{\xi_j\}_{j=-\infty}^{\infty}$.

The Fourier Method involved a similar truncation of wavenumber space to a finite segment $[0, k_{\max}]$, and this was also reflected in expression (351) for the simulated correlation function $\langle v_F(x + x')\,v_F(x')\rangle$. There, the truncation was not a big concern because the energy spectrum $E(k)$ typically decays very rapidly (say, exponentially) at large wavenumber. The moving average weighting function $G(x)$, however, will not necessarily manifest such rapid decay. In fact, if the velocity field has long-range correlations, these must be reflected in slowly decaying, long-range tails of $G(x)$. The Moving Average Method must be expected to suffer a severe systematic truncation error in such circumstances.
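The effect of the truncation in Eq. (363) can be explored numerically. The short Python sketch below evaluates the last sum in Eq. (363) directly; the two weighting functions are hypothetical choices made only for illustration (a Gaussian whose exact correlation function is known in closed form, and an algebraically decaying weight standing in for a field with long-range correlations), not the $G$ of the RSS Model.

```python
import math

def simulated_correlation(G, q, dr, b):
    """Last sum in Eq. (363): correlation of the simulated field at lag x = q*dr (q >= 0)."""
    x = q * dr
    # The two observation points share the variables xi_j only for -b <= j <= b - q.
    return dr * sum(G(x + j * dr) * G(j * dr) for j in range(-b, b - q + 1))

def gaussian_weight(x):
    """Hypothetical rapidly decaying weight; its exact R(x) is sqrt(pi/2) * exp(-x**2 / 2)."""
    return math.exp(-x * x)

def slow_weight(x):
    """Hypothetical slowly decaying weight, standing in for long-range correlations."""
    return 1.0 / math.sqrt(1.0 + x * x)

dr, b = 0.05, 100                      # integration cutoff (b + 1/2) * dr, about 5
exact = math.sqrt(math.pi / 2.0) * math.exp(-0.5)
print(simulated_correlation(gaussian_weight, 20, dr, b), exact)   # lag 1: nearly equal
print(simulated_correlation(slow_weight, 150, dr, b))             # lag 7.5: truncated sum
print(simulated_correlation(slow_weight, 150, dr, 4000))          # wider window: larger value
print(simulated_correlation(slow_weight, 201, dr, b))             # lag beyond 2*b*dr: zero
```

For the rapidly decaying weight the finite bandwidth is harmless, whereas for the slowly decaying weight the truncated sum misses a visible part of the correlation and drops to zero once the lag exceeds twice the cutoff.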
6.2.4.3. Moving Average Method applied to RSS Model. We shall now use the RSS Model to illustrate explicitly that the Moving Average Method's inherent physical-space truncation makes it inadequate in simulating tracer transport in a velocity field with strong long-range correlations (so that $R(x)$ and $G(x)$ have very slow decay) [83,140]. We will find, however, that the Moving Average Method performs relatively efficiently for velocity fields with milder correlations [140]. An earlier study of the Moving Average Method in simulating turbulent diffusion may be found in McCoy's thesis [228]. For the RSS Model simulations presented here, we use a bandwidth $b = 800$ and grid spacing $\Delta r = 0.1$, so that the convolution in the moving average representation is effectively cut off at a distance $r_c = (b + \tfrac{1}{2})\Delta r \approx 80$ from the maximum of the weighting function $G$.
We consider first the $\varepsilon = 3/2$ RSS Model, which has strong long-range correlations and falls in the superdiffusive regime. The effects of the truncation of the moving average representation to a finite bandwidth $b$ in Eq. (362) are already apparent in a plot of the correlation function $R_M(x) = \langle v_M(x')\,v_M(x' + x)\rangle$ (with $x' = j\Delta r$) along with the correlation function $R(x) = \langle v(x')\,v(x' + x)\rangle$ of the true velocity field (Fig. 28). The simulated correlation function $R_M(x)$ drastically underrepresents the long-range $r^{-1/2}$ tail of the true correlation function $R(x)$. This reflects the fact that
Fig. 28. Correlation function of the velocity field for the RSS Model with $\varepsilon = 3/2$ (from [83]). In the upper graph, the thin line shows the true velocity correlation function $R(x)$, whereas the thick line shows the simulation by the Moving Average Method with bandwidth $b = 800$, grid spacing $\Delta r = 0.1$, and integration cutoff $r_c \approx 80$. The lower graph shows the ratio of the simulated to true velocity correlation function.
correlations on scales large compared to $r_c$ have been artificially filtered out. Recall that the exact mean-square displacement $\sigma_Y^2(t)$ of a tracer in the RSS Model is expressible as an integral of $R(x)$ over an interval of length $\bar{w}t$ (344). Therefore, the truncation in the Moving Average Method must necessarily lead to a systematic underprediction of $\sigma_Y^2(t)$ at large time, apart from any additional random errors due to finite sampling in Monte Carlo simulations. An actual simulation of $\sigma_Y^2(t)$ using $N = 2000$ realizations of the velocity field just described is shown in Fig. 29. The Moving Average Method severely undershoots the correct behavior and, even worse, produces an apparent scaling behavior at long time with the wrong exponent. Note that the error of the Moving Average Method already appears at $t = 40$, when the tracer has only moved across half the width $r_c \approx 80$ of the integration window in the convolution. Similar results are found [140] when the bandwidth is increased to $b = 2000$ ($r_c \approx 200$). The Moving Average Method is therefore dangerous to use in simulating turbulent diffusion in velocity fields with significant long-range correlations, since it can produce erroneous scaling behavior. Simulations of a one-dimensional fractal random field using a related one-sided Moving Average Method [220-222] with similarly large bandwidths and sample size can also predict incorrect scaling exponents for statistics of the random field itself ([100], Fig. 9.8).

The Moving Average Method, however, performs adequately for the $\varepsilon \le 1$ RSS Models, in which the tracer motion is diffusive, subdiffusive, or trapped. The subdiffusive motion of a tracer in the RSS Model can be tracked with reasonable accuracy over a time interval $0 \le t \lesssim 130$ (Fig. 30), whereas a Fourier Method simulation of comparable cost (described in Fig. 23 and [140])
Fig. 29. Mean-square tracer displacement along the shear for the RSS Model with $\varepsilon = 3/2$ (from [83]). Thin line: exact formula; thick line: Moving Average Method simulation with bandwidth $b = 800$, grid spacing $\Delta r = 0.1$, and integration cutoff $r_c \approx 80$.
Fig. 30. Mean-square tracer displacement along the shear for the RSS Model in the subdiffusive regime (from [83]). In the upper graph, the thin line shows the exact formula, whereas the thick line shows the Moving Average Method simulation with bandwidth $b = 800$, grid spacing $\Delta r = 0.1$, and integration cutoff $r_c \approx 80$. The lower graph shows the ratio of the simulated to true mean-square tracer displacement.
starts to systematically turn down due to artificial periodicity after $t \approx 60$. The Randomization Method, on the other hand, had a tendency to overshoot the correct subdiffusive tracer behavior (Fig. 27).
6.2.4.4. Conclusions regarding Moving Average Method. The Moving Average Method intrinsically cannot represent correlations of the velocity field on scales larger than the integration cutoff $r_c$, and this fact can lead to a grossly misleading simulation of the long-time tracer motion in a velocity field with strong long-range correlations. Based on the results for the $1 < \varepsilon < 2$ RSS Models (Figs. 26 and 29), the Randomization Method appears to be a much more economical and accurate Monte Carlo method for this kind of turbulent diffusion problem. On the other hand, the Moving Average Method performed reasonably well and more efficiently than the Fourier Method for the RSS Models without strong long-range correlations ($\varepsilon \le 1$), in which the tracer motion is diffusive, subdiffusive, or trapped. The Moving Average Method can be improved significantly by a proper hierarchical formulation, as we shall discuss in Section 6.3.2.

6.3. Hierarchical Monte Carlo methods for fractal random fields

We have analyzed three Monte Carlo methods for the simulation of turbulent diffusion in a class of steady, random shear flows. For the rest of Section 6, we will develop and examine Monte Carlo methods with a view toward simulating tracer motion in synthetic flows with some features in common with fully developed turbulence at high Reynolds number. One characterizing feature of such flows is the existence of a self-similar inertial range of scales $L_K \ll r \ll L_0$, where $L_K$ is the Kolmogorov dissipation length and $L_0$ is the integral length scale.

A random steady shear flow with such an inertial range was analyzed in Section 3.4.1. Its energy spectrum was expressed as

$$E(k) = A_E k^{1-\varepsilon}\,\psi_0(kL_0)\,\psi_\infty(kL_K), \qquad 2 < \varepsilon < 4 , \qquad (365)$$

where $\psi_0$ is an infrared cutoff and $\psi_\infty$ is an ultraviolet cutoff. While the energy spectrum has a self-similar form for $L_0^{-1} \ll k \ll L_K^{-1}$ for all $\varepsilon < 4$, it is only for $2 < \varepsilon < 4$ that the velocity field $v(x)$ exhibits statistical self-similarity within an inertial range $L_K \ll r \ll L_0$ in physical space.
In particular, for $2 < \varepsilon < 4$, the mean-square velocity difference (also called the structure function of the velocity field) has the following inertial-range scaling:

$$\langle (v(x + x') - v(x'))^2\rangle = S_v^I\,|x|^{2H} , \qquad (366)$$

where $H = (\varepsilon - 2)/2$ is the Hurst exponent, and

$$S_v^I = -2A_E\,\pi^{2H+1/2}\,\Gamma(-H)/\Gamma(H + \tfrac{1}{2}) . \qquad (367)$$

(See Section 4.2.1 for a closely analogous discussion in the context of a random shear velocity field with rapid decorrelation in time.)

Simulation of such a random field faces two main difficulties. First of all, the rapid growth of the mean-square velocity difference (366) with separation $x$ between the observation points manifests the very strong long-range correlation of the velocity field $v(\cdot)$. We saw in Section 6.2 how poorly the Fourier Method and Moving Average Method simulated random fields of the type (365) (with $L_0 = \infty$) for spectral exponents $1 < \varepsilon < 2$, because of their inability to represent accurately the long-range correlations in those fields. The velocity fields with inertial ranges we are now considering ($2 < \varepsilon < 4$) have even higher values of $\varepsilon$, and the long-range correlations become even more pronounced. A further challenge for numerical simulation of these velocity fields is to ensure
that the simulated velocity fields exhibit clean inertial-range scaling (366). This is particularly pertinent to applications in which one is seeking to determine how the inertial-range scaling of the velocity field is reflected in scaling properties of the passive scalar field on length scales within the inertial range. We already considered some of these relations in Section 4 for a velocity field with rapid decorrelation in time. High-quality numerical simulations permit investigations of scaling properties for the passive scalar field in flows with more complex features, as we shall illustrate in Section 6.5.

To keep focus on the two central issues, the very strong long-range correlations and the inertial-range scaling properties, we will remove the cutoffs from explicit consideration as much as possible. That is, we formally take the ideal field we are trying to simulate as having a vanishingly small Kolmogorov dissipation length $L_K = 0$ and an infinitely large integral length scale $L_0 = \infty$. This limit, taken at face value, requires some care in interpretation (Section 3.4.1). For the purposes of our discussion of Monte Carlo numerical methods, however, this is of no concern since computer limitations will impose definite upper and lower cutoff length scales on the inertial range of any simulated field. In what follows, it is only important to remember from Section 3.5 that as the cutoffs are removed,

• the velocity increments $v(x) - v(x')$ are statistically homogeneous, meaning that their PDF depends only on $x - x'$,
• the velocity increments are mean-zero Gaussian random variables, with variance converging to the finite inertial-range scaling limit $S_v^I\,|x - x'|^{2H}$, and
• the statistical dynamics of the separation between a pair of tracer particles converges to a well-defined limiting evolution.

We remark for our later discussion in Section 6.4 that these facts remain true for multidimensional velocity fields.
We shall therefore pose the Monte Carlo simulation problem of Section 6.3 as follows. We wish to generate a numerical random (steady shear flow) velocity field $v(x)$ for which the increments $v(x) - v(x')$ are homogeneous and Gaussian distributed, with mean zero and variance obeying a specified inertial-range scaling law:

$$\langle (v(x) - v(x'))^2\rangle = S_v^I\,|x - x'|^{2H} , \qquad (368)$$

with $0 < H < 1$, over an extensive range of scales. Secondly, we also wish the separation between a pair of tracers advected by such a velocity field to be simulated accurately.

With all the cutoffs removed from explicit consideration, the desired velocity field is in fact a fractal random field [215,100]. This simply means that the velocity field enjoys a statistical self-similarity (or, more precisely, statistical self-affinity [216]) under dilations. Namely, $v(x) - v(x')$ has the same PDF as $\lambda^{-H}(v(\lambda x) - v(\lambda x'))$, as may be checked from Eq. (368) and the fact that a Gaussian random variable is fully determined by its mean and variance. Fractal random fields have applications in a diversity of fields beyond turbulent diffusion, such as solute transport in groundwater [79], turbulent combustion [339], random topography in statistical physics [100], and many others [215,336]. We remark that the problem of simulating the random, one-dimensional shear velocity field $v(x)$ described above is equivalent to the simulation of a stochastic process known as fractional Brownian motion [100,215,219].
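As a small numerical companion to Eqs. (366)-(368), the following Python sketch evaluates the Hurst exponent, the prefactor $S_v^I$, and the increment variance, and checks the dilation property stated above for the variance. The function names and the choice $A_E = 1$ are illustrative assumptions; only the formulas themselves come from the text.

```python
import math

def hurst_exponent(eps):
    """H = (eps - 2) / 2 for a spectral exponent 2 < eps < 4."""
    return (eps - 2.0) / 2.0

def structure_constant(A_E, H):
    """Prefactor of Eq. (367) for 0 < H < 1."""
    return -2.0 * A_E * math.pi ** (2.0 * H + 0.5) * math.gamma(-H) / math.gamma(H + 0.5)

def increment_variance(A_E, H, separation):
    """Right-hand side of Eq. (368)."""
    return structure_constant(A_E, H) * abs(separation) ** (2.0 * H)

H = hurst_exponent(8.0 / 3.0)           # Kolmogorov scaling corresponds to H = 1/3
lam, d = 4.0, 0.7                       # arbitrary dilation factor and separation
lhs = increment_variance(1.0, H, d)
rhs = lam ** (-2.0 * H) * increment_variance(1.0, H, lam * d)
print(H, structure_constant(1.0, H), lhs, rhs)   # lhs and rhs agree up to rounding
```

Since $\Gamma(-H)$ is negative for $0 < H < 1$, the prefactor returned by `structure_constant` is positive, as a variance prefactor must be.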
Of the nonhierarchical Monte Carlo methods discussed in Section 6.2, the Randomization Method is evidently the best choice for simulating fractal random fields. The Fourier and Moving Average Methods were shown to be incapable of efficiently representing long-range correlations. In Section 6.3.3, we will compare the performance of the Randomization Method with another class of hierarchical Monte Carlo methods, which we now introduce.

These hierarchical methods are designed to respect the statistical self-similarity of the fractal random field. The simulated velocity field $v_{\mathrm{hier}}(x)$ is represented as a superposition of random fields associated to a hierarchy of scales:

$$v_{\mathrm{hier}}(x) = \sum_{m=m_{\min}}^{m_{\max}} 2^{-mH}\,v_m(2^m x) . \qquad (369)$$

Each $v_m(\cdot)$, $m_{\min} \le m \le m_{\max}$, is an independent, identically distributed, mean-zero Gaussian random field, which can be computed efficiently. Their precise specification characterizes the particular hierarchical Monte Carlo method. The integers $m_{\min}$ and $m_{\max}$ represent large-scale and small-scale cutoffs, as can be seen by noting that if the random fields $\{v_m(\cdot)\}$ have characteristic length scale $L$, then $v_m(2^m x)$ has characteristic length scale $2^{-m}L$.

In the idealized situation in which the truncation to a finite number of scales may be ignored ($m_{\min} = -\infty$ and $m_{\max} = +\infty$), any hierarchical method will automatically simulate a velocity field with the discrete scaling symmetry:

$$\langle (v_{\mathrm{hier}}(x) - v_{\mathrm{hier}}(x'))^2\rangle = 2^{-2H}\,\langle (v_{\mathrm{hier}}(2x) - v_{\mathrm{hier}}(2x'))^2\rangle .$$

Therefore, some of the inertial-range scaling property (368) is built into the hierarchical method, and this is the main theoretical motivation for representing the simulated velocity field by Eq. (369). It is important to note, however, that the increments of the simulated fractal random field $v_{\mathrm{hier}}(x)$ are not thereby guaranteed to have the continuous scaling symmetry and statistical homogeneity of the increments of the true fractal velocity field $v(x)$.

One intuitively appealing hierarchical Monte Carlo method for generating fractal random fields is the method of Successive Random Addition (SRA), developed by Voss [336]. This method has become quite popular in the physics community, due to its speed, efficiency, and flexibility in generating various random fractal surfaces and processes [100,336]. Viecelli and Canfield [335] have moreover shown how to exploit the local recursive nature of SRA to compute rapidly the fractal field at a given point. They therefore suggest SRA as a promising method to apply to the simulation of the turbulent diffusion of a small number of tracers.

Unfortunately, the random fields simulated by SRA have recently been shown rigorously to be inconsistent with the statistical homogeneity and full inertial-range scaling properties of the increments of a truly homogeneous fractal random field [87]. We discuss this deficiency of SRA in Section 6.3.1 through explicit and rigorous numerical estimates, which demonstrate that it fails in very practical ways to simulate a truly fractal random field. Of course, this says nothing about its capability of producing qualitatively convincing graphical representations of fractal surfaces and landscapes [100,336]. But to simulate turbulent diffusion in the inertial range of scales with quantitative precision, the simulated velocity field must have quantitatively accurate statistical scaling properties, and SRA is intrinsically incapable of meeting this need.
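Returning briefly to the discrete scaling symmetry noted above for Eq. (369), it can be checked formally in one line. The display below is a sketch under the idealized assumption that the sum runs over all integers $m$ (with convergence taken for granted at the level of increments); the symbol $v_{\mathrm{hier}}$ for the simulated field and the notation $\overset{d}{=}$ for equality in distribution are labels chosen here for illustration.

```latex
\begin{aligned}
v_{\mathrm{hier}}(2x)
  &= \sum_{m=-\infty}^{\infty} 2^{-mH}\, v_m(2^{m+1}x)
   = 2^{H} \sum_{m=-\infty}^{\infty} 2^{-(m+1)H}\, v_m(2^{m+1}x) \\
  &\overset{d}{=} 2^{H} \sum_{m'=-\infty}^{\infty} 2^{-m'H}\, v_{m'}(2^{m'}x)
   = 2^{H}\, v_{\mathrm{hier}}(x),
\end{aligned}
\qquad\text{so}\qquad
\langle (v_{\mathrm{hier}}(2x) - v_{\mathrm{hier}}(2x'))^2 \rangle
  = 2^{2H}\, \langle (v_{\mathrm{hier}}(x) - v_{\mathrm{hier}}(x'))^2 \rangle .
```

The relabeling $v_m \to v_{m'}$ with $m' = m + 1$ is permitted only because the fields $v_m$ are independent and identically distributed; the argument gives invariance under dilations by powers of 2 and says nothing about other dilation factors or about homogeneity of the increments.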
In Section 6.3.2, we present a pair of hierarchical methods recently developed by the first author with Elliott and Horntrop for the simulation of fractal random fields. The Multi-Wavelet Expansion
(MWE) Method [84] is based on the same physical-space stochastic integral representation (340) as the Moving Average Method discussed earlier. The Fourier-Wavelet Method [82] is a Fourier-space-based analogue. They are carefully designed to permit efficient local computation of the random field, so that the velocity field $v(x)$ may be evaluated rapidly at whatever positions $x = X^{(j)}(t)$ a small number of tracers happen to occupy at a given moment of time. The MWE Method is designed specifically for fractal random fields, while the Fourier-Wavelet Method is flexible enough to be applied in more general situations [82].

Some simulations of the velocity field $v(x)$ by the MWE, Fourier-Wavelet, and Randomization Methods are reported in Section 6.3.3. The wavelet approaches are able to generate a high-quality inertial range extending over an unprecedented twelve decades of scales using fewer than 2500 active computational elements [87,84]. The inertial-range scaling law (368) is accurately reproduced in detail from an average over only 100 or 1000 realizations. Among other things, this stresses the low variance of the wavelet-based Monte Carlo methods. We will explicitly contrast the scaling and homogeneity properties of the random fields generated by the MWE Method and the SRA Method. The Randomization Method is next compared with the wavelet methods. We find that the Fourier-Wavelet Method is the best choice when one wishes to simulate very wide inertial ranges with more than 4 to 5 decades of scaling behavior, or when it is important for the simulated velocity field to be truly Gaussian. Due to the relatively high fixed overhead of the wavelet methods, the Randomization Method is more computationally efficient when only 4 to 5 decades of scaling behavior are desired and the statistical quantities of interest do not depend sensitively on the higher-order statistics of the velocity field [82].
Finally, we apply the wavelet methods and the Randomization Method to the simulation of the relative turbulent diffusion of a pair of tracers in a steady fractal shear flow for which the exact statistics of the tracer separations can be expressed analytically. The numerical simulations are shown to match closely the exact results.

6.3.1. Successive Random Addition

We begin by formulating the Successive Random Addition (SRA) Method [336] as a hierarchical method as described in Eq. (369). Successive Random Addition constructs a random field by dyadic expansion. By definition, a dyadic rational number $x$ satisfies $x = 2^{-m}n$ for some integers $m$ and $n$, the octave and the translate, respectively. For each octave $m$, SRA constructs a piecewise linear field $v_m(\cdot)$ as follows. First, at each integer $x = n$, this field is assigned an independent random value $\xi_{mn}$ drawn from a standard Gaussian distribution (mean zero, unit variance):

$$v_m(n) = \xi_{mn} , \qquad n = 0, \pm 1, \pm 2, \ldots .$$

Next, the method extends $v_m(\cdot)$ to all other points by linear interpolation:

$$v_m(x) = \xi_{m,\lfloor x\rfloor}\,(1 - [x]) + \xi_{m,\lfloor x\rfloor + 1}\,[x] .$$

Here $\lfloor x\rfloor$ is the greatest integer not exceeding $x$, and $[x] = x - \lfloor x\rfloor$ is the fractional part of $x$. The simulated field is finally built up by summing a suitably rescaled finite collection of such independently generated random fields:

$$v_{SRA}(x) = \sum_{m=m_{\min}}^{m_{\max}} 2^{-mH}\,v_m(2^m x) .$$

The SRA algorithm can be readily generalized to multidimensional random fields [100,335,336].
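The construction just described translates almost literally into code. The Python sketch below is an illustration only, not the implementation of [336] or [335]: the class name, the cutoff arguments `m_min` and `m_max`, and the dictionary cache of the variables $\xi_{mn}$ are assumptions made for the example.

```python
import math
import random

class SRAField:
    """Literal sketch of the SRA construction: a sum of rescaled piecewise linear octaves."""

    def __init__(self, H, m_min, m_max, seed=0):
        self.H, self.m_min, self.m_max = H, m_min, m_max
        self._rng = random.Random(seed)
        self._xi = {}                          # (octave m, translate n) -> xi_mn

    def xi(self, m, n):
        key = (m, n)
        if key not in self._xi:                # draw each standard Gaussian variable once
            self._xi[key] = self._rng.gauss(0.0, 1.0)
        return self._xi[key]

    def octave(self, m, x):
        """Piecewise linear field v_m(x): interpolate between the integers bracketing x."""
        n = math.floor(x)
        frac = x - n                           # fractional part [x]
        return self.xi(m, n) * (1.0 - frac) + self.xi(m, n + 1) * frac

    def __call__(self, x):
        return sum(
            2.0 ** (-m * self.H) * self.octave(m, (2.0 ** m) * x)
            for m in range(self.m_min, self.m_max + 1)
        )

field = SRAField(H=1.0 / 3.0, m_min=-3, m_max=8, seed=2)
print([field(x) for x in (0.0, 0.3, 0.31, 1.0)])
```

Each evaluation touches only two random variables per octave, those attached to the dyadic points bracketing $x$.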
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237-574
We note that in practice SRA can be implemented more efficiently than this literal description suggests [335]. Namely, the computation of $v_{\mathrm{SRA}}(x)$ requires the simulation and interpolation of only the two random variables $\xi_{mn}$ at each octave $m$ which are associated with the dyadic numbers $2^{-m} n$ bracketing $x$. Our concern here, however, is not with the efficiency of the SRA method, but with the nature of the random field which it would simulate even in the ideal limit in which all errors due to computational constraints can be neglected.

The problem with the SRA method, as explicitly demonstrated in [87], is that it produces a random field with strong deviations from the statistical self-similarity and homogeneity of the increments which a fractal random field is supposed to possess. For the fractal random field $v(x)$, the following scaled variance of velocity fluctuations is an absolute constant:

$$ \frac{\langle (v(x) - v(x'))^2 \rangle}{|x - x'|^{2H}} = S'_v ; $$

see Eq. (368). Consistency would require that the corresponding ratio for the random field simulated by SRA,

$$ \frac{\langle (v_{\mathrm{SRA}}(x) - v_{\mathrm{SRA}}(x'))^2 \rangle}{|x - x'|^{2H}} = S'_{\mathrm{SRA}}(x, x') , $$

should settle down to approximately constant behavior for appropriately chosen simulation parameters. The function $S'_{\mathrm{SRA}}(x, x')$, however, always has order unity variations as a function of $x$ and $x'$. This variability in $S'_{\mathrm{SRA}}(x, x')$ is systematic, and not due to finite sample sizes; $S'_{\mathrm{SRA}}(x, x')$ as defined is a property of the simulated random field averaged over the full statistical ensemble. Moreover, there is no way to choose the parameters in the algorithm to reduce the variations in $S'_{\mathrm{SRA}}(x, x')$ to a desired tolerance. Substantial variations are present for any finite choice of $m_{\min}$ and $m_{\max}$, and persist in the ideal limit $m_{\min} \to -\infty$, $m_{\max} \to +\infty$. Consequently, the SRA method cannot consistently simulate a fractal random field with statistically homogeneous increments. The random field simulated by SRA does not even approximately obey the inertial range scaling law (368) with a constant prefactor.

We now summarize some of the results from [87] which quantify the systematic inconsistency of SRA. Rigorous numerical lower bounds on the variation of $S'_{\mathrm{SRA}}(x, x')$ are obtained by exact evaluation of this function at specially chosen points. The ratio

$$ \max S'_{\mathrm{SRA}}(x, x') \,/\, \min S'_{\mathrm{SRA}}(x, x') $$

is thereby found strictly to exceed unity for all Hurst exponents $0 < H < 1$, and to exceed 2 for a wide range of Hurst exponents $0.30 < H < 0.85$, including the value $H = 1/3$ associated with a turbulent velocity field with Kolmogorov scaling [87]. These numerical estimates hold not only for finite values of the cutoffs $m_{\min}$ and $m_{\max}$, but remain valid as these cutoffs are removed: $m_{\min} \to -\infty$ and $m_{\max} \to +\infty$. Note that due to the discrete self-similarity of the random field simulated by SRA, $S'_{\mathrm{SRA}}(x, x') = S'_{\mathrm{SRA}}(2x, 2x')$, and the order unity variations in $S'_{\mathrm{SRA}}(x, x')$ occur at every scale.

It might still be conceivable that the variations in $S'_{\mathrm{SRA}}(x, x')$ uncovered by the mathematical analysis are concentrated tightly around special points, and that for most values of $(x, x')$ the function $S'_{\mathrm{SRA}}(x, x')$ is nearly constant. To demonstrate that this is not the case, we plot in Fig. 31 a histogram for $H = 1/3$ describing the distribution of $S'_{\mathrm{SRA}}(x, x')$ in the ideal case in which cutoffs can be neglected ($m_{\min} = -\infty$ and $m_{\max} = \infty$) [87]. Recall that $S'_{\mathrm{SRA}}(x, x')$ is a statistical quantity fully averaged over the entire statistical ensemble, so these distributions also are associated with the ideal limit in which fluctuations due to finite sampling are negligible. The values of $S'_{\mathrm{SRA}}(x, x')$ are normalized in the plot by the exactly computable constant $M_{\mathrm{SRA}} \equiv \max_{x, x' \in \mathbb{R}} S'_{\mathrm{SRA}}(x, x')$. The area under the histogram between two points $h_1$ and $h_2$ represents the relative area in $(x, x') \in \mathbb{R}^2$ for which $h_1 \le S'_{\mathrm{SRA}}(x, x') / M_{\mathrm{SRA}} \le h_2$. By self-similarity, one may equivalently restrict attention to the unit square $0 \le x, x' \le 1$. This histogram is calculated numerically from exact discrete summation formulas, evaluated on a $32 \times 32$ discrete grid on the unit square [87]. The distribution of the values of the putative scaling coefficient $S'_{\mathrm{SRA}}(x, x')$ is rather broad. There are no parameters in the SRA algorithm which may be tuned to tighten these distributions so that $S'_{\mathrm{SRA}}(x, x')$ becomes approximately constant in space. Similar results are found for other values of the Hurst exponent [87]. SRA is therefore demonstrated to be an inconsistent algorithm for generating quantitatively accurate homogeneous random fractal fields, both in theoretical and practical terms.

Fig. 31. Distributions for the normalized scaling coefficient for simulations of a fractal random field with Hurst exponent $H = 1/3$ (from [87]). The broader distribution plots $S'_{\mathrm{SRA}}(x, x') / M_{\mathrm{SRA}}$ for an ideal random field ($m_{\min} = -\infty$, $m_{\max} = +\infty$) generated by Successive Random Addition. The thinner distribution plots $S'_{\mathrm{MWE}}(x, x') / S'_v$ for a random field generated by the MWE Method (to be discussed in Section 6.3.2) with $M = 40$ scales, wavelet order $q = 4$, and bandwidth $b = 5$.

6.3.2. Wavelet approaches

We will next describe a pair of recently developed low variance Monte Carlo methods which can efficiently simulate fractal random fields over a large range of scales [82,84]. These are based on hierarchical discretizations of the basic stochastic representation formulas (340) and (337):
$$ v(x) = \int_{-\infty}^{\infty} G(x - r)\, dW(r) , \tag{370a} $$

$$ v(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, E^{1/2}(|k|)\, d\tilde{W}(k) \tag{370b} $$

through specially designed orthonormal wavelet expansions. To motivate the introduction of wavelets in these methods, we first discuss the inadequacy of a more primitive hierarchical attempt to improve upon the Moving Average Method. The Moving Average Method and the Fourier Method are based on straightforward, equispaced discretizations of Eqs. (370a) and (370b), and were shown in Section 6.2 to fail seriously in representing long-range correlations in random fields. In the simulation of fractal random fields, however, it is natural to try hierarchical versions of these methods, since a hierarchical structure treats every simulated scale on an approximately evenhanded basis. Recall that the algorithm for the Moving Average Method was expressed:
$$ v_{\mathrm{MA}}(x) = \sum_{j = \lfloor x/\Delta r \rfloor - b}^{\lfloor x/\Delta r \rfloor + b} G(x - r_j)\, \xi_j \sqrt{\Delta r} , $$

where $\{ r_j = j \Delta r \}_{j=-\infty}^{\infty}$ are the equispaced grid points in the integration and $\{ \xi_j \}_{j=-\infty}^{\infty}$ is a collection of independent, standard, Gaussian random variables. The main difficulty of the Moving Average Method was that it could not accurately represent long-range correlations with a reasonable bandwidth $b$. One way to address this problem is to go back to the exact moving average representation on which the method is based:
$$ v(x) = \int_{-\infty}^{\infty} G(x - r)\, dW(r) , \tag{371} $$

and use the scaling properties of the weighting function $G(\cdot)$ to make a more efficient discrete approximation. For a fractal random field with Hurst exponent $H$ as expressed by the structure function scaling (366), $G(\cdot)$ may be computed from its relation (360) to the energy spectrum $E(k) = A_E k^{-1-2H}$, yielding [84]

$$ G(x) = K_H |x|^{H - 1/2} , \tag{372} $$

with a preconstant $K_H$, proportional to $\sqrt{A_E}$, whose explicit expression in terms of Gamma functions is given in [84]. (The relation between the prefactor $A_E$ in the energy spectrum and the prefactor in the structure function scaling (366) is given by Eq. (367).) As with all our discussions of fractal random fields, formula (371) with convolution kernel (372) only gives well-defined velocity differences $v(x) - v(x')$. Finite numerical approximations to this moving average formula will necessarily introduce cutoffs and be entirely well-defined.

Now, the weighting function $G(\cdot)$ is very long-ranged; it even grows with distance for $H > 1/2$. Based on our numerical demonstrations in Section 6.2.4, we have no hope of representing these long-range correlations with a reasonable bandwidth if we use a straightforward equispaced discretization of the convolution integral (371). The scaling property of $G(\cdot)$, however, suggests a much more economical way of evaluating this integral. Note that the derivative of $G(\cdot)$ decreases with distance according to a power law. Thus, it is clearly wasteful to try to integrate Eq. (371) over a large segment with an equispaced partition. The integration step can be made coarser as $|x - r|$ grows while maintaining a given level of accuracy, and the numerical integration interval can thereby be greatly increased without additional cost. More specifically, the self-similarity of the fractal field indicates that the integration step used to resolve correlations on a given scale should be proportional to that length scale. This suggests a hierarchical method in which an integration step $2^j \Delta r$ is used for $2^{j-1} b \Delta r < |x - r| \le 2^j b \Delta r$, with a suitable positive integer $b$. Note that this keeps all evaluations on an equispaced grid. While this method will certainly improve upon the Moving Average Method, the bandwidth required for an accurate simulation is still much too large for practical purposes [84]. The underlying idea of using a variable numerical resolution proportional to the length scale being considered is clearly promising, however.
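To make the cost argument concrete, the following sketch lays out the hierarchical partition described above on one side of the evaluation point and counts its cells. The function names are hypothetical, and the requirement that $b$ be even (so that every shell holds $b/2$ whole cells) is an assumption of this illustration rather than a statement from [84].

```python
def hierarchical_cells(b: int, dr: float, levels: int):
    """Cells (left_offset, width) covering 0 < |x - r| <= 2**levels * b * dr.

    The innermost shell uses step dr; shell j >= 1 covers
    2**(j-1) * b * dr < |x - r| <= 2**j * b * dr with step 2**j * dr.
    Offsets are distances from the evaluation point on one side only.
    """
    if b % 2:
        raise ValueError("b must be even so each shell holds b/2 whole cells")
    cells = [(i * dr, dr) for i in range(b)]  # finest shell
    for j in range(1, levels + 1):
        step = 2 ** j * dr
        start = 2 ** (j - 1) * b * dr
        cells += [(start + i * step, step) for i in range(b // 2)]
    return cells


def equispaced_cell_count(b: int, levels: int) -> int:
    """Cells of width dr needed to reach the same distance 2**levels * b * dr."""
    return 2 ** levels * b
```

With `b = 4` and ten shells the hierarchical partition reaches a distance of `4096 * dr` with 24 cells per side, whereas an equispaced partition of step `dr` needs 4096. The cell count grows only logarithmically with the range covered, which is the economy the text refers to; it says nothing about the accuracy of the resulting quadrature.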
What is needed is a more flexible means of constructing a finite hierarchical approximation to the moving average stochastic integral (371), and this is provided by the theory of orthonormal wavelet bases [80]. We will describe how to write the moving average representation as an exact discrete expansion with respect to an orthonormal wavelet family, and discuss the mathematical properties which the wavelet family should have so that finite truncations will be efficient approximations. The Alpert-Rokhlin multi-wavelet bases [3,4] meet the criteria, and their use in the orthonormal expansion produces what we call the Multi-Wavelet Expansion (MWE) Method [84]. We will then discuss the Fourier-Wavelet Method [82], another wavelet-based Monte Carlo method with certain improvements in simplicity, speed, and flexibility. It is derived from an analogous orthonormal wavelet expansion in Fourier space using a Meyer wavelet as a "mother wavelet" [80]. While the Fourier-Wavelet Method is formulated in Fourier space, it is not a hierarchical Fourier Method because the simulated random field is not represented as a random Fourier sum. It can be naturally understood as a spectral implementation of an orthonormal wavelet expansion of the moving average representation. We will only describe the theory and algorithms for the MWE and Fourier-Wavelet Methods here. Some demonstrations of their performance in practice will be given in Section 6.3.3 and throughout the remainder of Section 6.
6.3.2.1. Multi-Wavelet Expansion Method. We begin by showing how the moving average stochastic integral representation (371) may be expressed through an orthonormal basis as a randomly weighted sum of functions. The truncation of this sum permits a "finite element" type of discretization as an alternative to the "finite difference" discretizations of the stochastic integrals which led to the Moving Average Method and the Fourier Method. The MWE Method is based on such an orthonormal expansion using a wavelet basis which we will describe subsequently.

Orthonormal expansion of moving average representation. Let $\{\phi_j\}_{j=1}^{\infty}$ be a (complete) orthonormal basis for

$$ L^2(\mathbb{R}) = \left\{ g : \|g\|^2 \equiv \int_{-\infty}^{\infty} |g(x)|^2\, dx < \infty \right\} , $$

the Hilbert space of square integrable complex functions on the real line with inner product

$$ (g, h) = \int_{-\infty}^{\infty} g(x)\, \overline{h(x)}\, dx . $$

The orthonormal property means that

$$ (\phi_j, \phi_{j'}) = \delta_{jj'} = \begin{cases} 1 & \text{if } j = j' , \\ 0 & \text{otherwise} . \end{cases} $$

Moreover, any square integrable function $g \in L^2(\mathbb{R})$ can be expanded as a countably infinite and convergent sum of orthonormal basis functions [80]:

$$ g = \sum_{j} (g, \phi_j)\, \phi_j . \tag{373} $$

An orthonormal basis can be used to rewrite any stochastic integral with respect to white noise as an infinite sum of weighted independent standard Gaussian random variables:

$$ \int_{-\infty}^{\infty} g(r)\, dW(r) = \int_{-\infty}^{\infty} \sum_{j} (g, \phi_j)\, \phi_j(r)\, dW(r) = \sum_{j} (g, \phi_j)\, \xi_j , \tag{374} $$

where

$$ \xi_j = \int_{-\infty}^{\infty} \phi_j(r)\, dW(r) . $$

By the properties (341) of real white noise $dW(r)$, it is easily checked from the orthonormality of the $\phi_j(r)$ that $\{\xi_j\}_{j=1}^{\infty}$ is a sequence of independent, standard, real Gaussian random variables (mean zero, variance one). Because Eq. (374) holds for any $g \in L^2(\mathbb{R})$, we can define a general rule for representing the white noise measure in stochastic integrals:

$$ dW(r) = \sum_{j} \xi_j\, \phi_j(r)\, dr , \tag{375} $$
where $\{\phi_j(r)\}_{j=1}^{\infty}$ is an orthonormal basis of $L^2(\mathbb{R})$ and $\{\xi_j\}_{j=1}^{\infty}$ is a collection of independent, standard real Gaussian random variables. Applying this orthonormal expansion to the moving average representation (371), we obtain

$$ v(x) = \int_{-\infty}^{\infty} G(x - r)\, dW(r) = \sum_{j} G * \phi_j(x)\, \xi_j , $$

where the convolution operation is denoted by a star:

$$ f * g(x) = \int_{-\infty}^{\infty} f(x - r)\, g(r)\, dr . $$

Multi-wavelet orthonormal bases. To use the orthonormal expansion to construct a discrete hierarchical approximation for $v(x)$, we shall consider multi-wavelet orthonormal bases [80]. A multi-wavelet (a generalized wavelet) is a set of functions $\{\phi^{pq}\}_{p=1}^{q}$ that has the special property that its discrete translates and dilates,

$$ \{ \phi^{pq}_{mn}(x) = 2^{m/2} \phi^{pq}(2^m x - n) \;|\; p = 1, \ldots, q ;\ m, n = 0, \pm 1, \pm 2, \ldots \} , $$

form an orthonormal basis for $L^2(\mathbb{R})$. We will use the term wavelet to refer to a function from a multi-wavelet, although the term usually refers to a single function whose dilates and translates form a basis for $L^2(\mathbb{R})$ [80]. The double subscript notation $\phi^{pq}_{mn}(x)$ for dilation and translation is standard for multi-wavelets. The superscript $q$ denotes the order of the multi-wavelet basis. The first subscript $m$ is called the octave and the second subscript $n$ is called the translate. The expansion of the moving average representation in terms of the multi-wavelet basis is written:

$$ v(x) = \sum_{p=1}^{q} \sum_{m,n=-\infty}^{\infty} G * \phi^{pq}_{mn}(x)\, \xi^{p}_{mn} , \tag{376} $$

where $\{ \xi^{p}_{mn} \;|\; p = 1, \ldots, q ;\ m, n = 0, \pm 1, \pm 2, \ldots \}$ is a collection of independent, standard, Gaussian random variables.

Now we make use of the self-similar scaling (372) of the convolution kernel $G$ to write this expansion in an explicitly hierarchical form. By simple rescalings, one can check that

$$ G * \phi^{pq}_{mn}(x) = 2^{-mH}\, G * \phi^{pq}(2^m x - n) . $$

Therefore, we have the following hierarchical representation for the random fractal velocity field $v(x)$:

$$ v(x) = \sum_{m=-\infty}^{\infty} 2^{-mH}\, v_m(2^m x) , \tag{377a} $$

$$ v_m(x) = \sum_{p=1}^{q} \sum_{n=-\infty}^{\infty} G * \phi^{pq}(x - n)\, \xi^{p}_{mn} . \tag{377b} $$

We note that for a suitable choice of a multi-wavelet (including the Alpert-Rokhlin multi-wavelet which we will use), the convolution $G * \phi^{pq}$ is perfectly well-defined without cutoffs. The only divergence is in the sum over $m$, which, as usual, will be cut off by any numerical implementation. See [84] for a rigorous mathematical treatment.
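The rescaling identity $G * \phi_{mn}(x) = 2^{-mH}\, G * \phi(2^m x - n)$ can be checked in closed form for a simple wavelet. The sketch below is an illustration under stated assumptions: it uses the Haar wavelet ($-1$ on $(0, 1/2)$, $+1$ on $(1/2, 1)$) as a stand-in for a general multi-wavelet, sets the preconstant $K_H = 1$, and uses hypothetical function names. The left-hand side is integrated directly over the support of the dilated, translated wavelet, so the comparison is not circular.

```python
import math


def antiderivative(u: float, a: float) -> float:
    """Odd antiderivative of |u|**a, valid for a > -1."""
    return math.copysign(abs(u) ** (a + 1.0), u) / (a + 1.0)


def kernel_integral(x: float, c: float, d: float, a: float) -> float:
    """Integral of |x - r|**a over c <= r <= d."""
    return antiderivative(x - c, a) - antiderivative(x - d, a)


def conv_haar(x: float, H: float) -> float:
    """(G * phi)(x) for G(x) = |x|**(H - 1/2) and the Haar wavelet phi."""
    a = H - 0.5
    return -kernel_integral(x, 0.0, 0.5, a) + kernel_integral(x, 0.5, 1.0, a)


def conv_haar_mn(x: float, H: float, m: int, n: int) -> float:
    """(G * phi_mn)(x) with phi_mn(x) = 2**(m/2) * phi(2**m * x - n)."""
    a = H - 0.5
    s = 2.0 ** (-m)  # width of the support of phi_mn
    left, mid, right = n * s, (n + 0.5) * s, (n + 1.0) * s
    return 2.0 ** (m / 2.0) * (-kernel_integral(x, left, mid, a)
                               + kernel_integral(x, mid, right, a))
```

For any octave `m` and translate `n`, `conv_haar_mn(x, H, m, n)` should agree with `2.0 ** (-m * H) * conv_haar(2.0 ** m * x - n, H)` up to rounding, which is the statement that allows each octave to reuse a single function $G * \phi$.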
The multi-wavelet representation (377) is an exact formula for the fractal random field $v(x)$, equivalent to the moving average representation (371). A numerical implementation will of course require a truncation of the infinite sums over $m$ and $n$. The cost of such a wavelet-based algorithm will clearly be proportional to the number of terms retained in the sums. We can consequently keep more octaves (more terms in the $m$ summation) at a given cost if we keep fewer terms in the sums over translations (index $n$). The issue, then, is how to minimize the number of translates which must be summed over in each octave to meet a given accuracy.

Localization of sum over translates. The errors incurred in truncating the sum

$$ \sum_{n=-\infty}^{\infty} G * \phi^{pq}(x - n)\, \xi^{p}_{mn} \tag{378} $$

may appear at first glance to be as devastating as those which result from truncating the physical-space stochastic integral in the moving average representation:

$$ v(x) = \int_{-\infty}^{\infty} G(x - r)\, dW(r) . $$

Indeed, this latter expression may be written in a similar form to Eq. (378):

$$ v(x) = \sum_{n=-\infty}^{\infty} G * \chi(x - n)\, \xi_n , \tag{379} $$

where

$$ \chi(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1 , \\ 0 & \text{otherwise} , \end{cases} $$

and $\{\xi_n\}_{n=-\infty}^{\infty}$ is a collection of independent, standard Gaussian random variables. Both Eqs. (378) and (379) involve a sum over convolutions of the slowly decaying (or even increasing!) function $G(\cdot)$. The Moving Average Method is essentially derived by truncating the sum in Eq. (379) (with the step size rescaled to $\Delta r$), and we saw in Section 6.2.4 that this led to gross inaccuracies even for large bandwidths. There is, however, a crucial difference between the two expressions: Eq. (378) is a convolution of $G$ with multi-wavelets which we are still free to choose, whereas the summands in Eq. (379) are completely determined. The success of a wavelet-based Monte Carlo Method relies crucially upon choosing the multi-wavelets $\{\phi^{pq}\}_{p=1}^{q}$ in an intelligent manner so that $G * \phi^{pq}$ decays rapidly and the sum (378) may be well approximated by only a relatively small number of terms with $n \approx x$.

To see what properties the multi-wavelets should have to make this possible, we work out the far-field asymptotics of the convolution [84] by binomially expanding $G(x - r) = K_H |x - r|^{H - 1/2}$ for large $x$:

$$ G * \phi^{pq}(x) \sim C_{\pm} K_H I_Q\, |x|^{H - 1/2 - Q} \quad \text{as } x \to \pm\infty , $$

where $C_{\pm}$ are some explicitly computable combinatorial constants, and $Q$ is the minimal nonnegative integer $\ell$ for which the wavelet moment

$$ I_{\ell} \equiv \int_{-\infty}^{\infty} r^{\ell}\, \phi^{pq}(r)\, dr $$

is nonvanishing. The decay of the convolution $G * \phi^{pq}$ is therefore determined by the number of moments $I_{\ell}$ which vanish for all wavelets in the multi-wavelet $\{\phi^{pq}\}_{p=1}^{q}$. We are thus led to look for a multi-wavelet with good moment cancellation properties [37].

The Alpert-Rokhlin basis [3,4] meets our need. For each $q \ge 1$, there exists an Alpert-Rokhlin multi-wavelet $\{\phi^{pq}\}_{p=1}^{q}$ consisting of piecewise polynomial functions supported in the unit interval $[0, 1]$ with moments vanishing through order $q - 1$:

$$ \int \phi^{pq}(x)\, x^{\ell}\, dx = 0 \quad \text{for } \ell = 0, \ldots, q - 1 . $$
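The link between vanishing moments and far-field decay can be seen numerically in the two simplest cases. The sketch below is an illustration with hypothetical function names and $K_H = 1$: it compares the indicator function $\chi$ of Eq. (379), which has no vanishing moment ($Q = 0$), with the Haar wavelet ($-1$ on $(0, 1/2)$, $+1$ on $(1/2, 1)$), whose zeroth moment vanishes but whose first moment does not ($Q = 1$). Both convolutions with $|x|^{H - 1/2}$ are available in closed form, and the decay exponent is estimated from two far-field points.

```python
import math


def antiderivative(u: float, a: float) -> float:
    """Odd antiderivative of |u|**a, valid for a > -1."""
    return math.copysign(abs(u) ** (a + 1.0), u) / (a + 1.0)


def conv_indicator(x: float, H: float) -> float:
    """(G * chi)(x): integral of |x - r|**(H - 1/2) over 0 <= r <= 1."""
    a = H - 0.5
    return antiderivative(x, a) - antiderivative(x - 1.0, a)


def conv_haar(x: float, H: float) -> float:
    """(G * phi)(x) for the Haar wavelet: minus the left half plus the right half."""
    a = H - 0.5
    left = antiderivative(x, a) - antiderivative(x - 0.5, a)
    right = antiderivative(x - 0.5, a) - antiderivative(x - 1.0, a)
    return right - left


def decay_exponent(f, x1: float = 1.0e3, x2: float = 1.0e4) -> float:
    """Slope of log|f| against log x between two far-field points."""
    return ((math.log(abs(f(x2))) - math.log(abs(f(x1))))
            / (math.log(x2) - math.log(x1)))
```

For $H = 1/3$, `decay_exponent(lambda x: conv_indicator(x, 1/3))` should come out close to $H - 1/2$, while the Haar version should come out close to $H - 3/2$, one power of $|x|$ faster, in line with the asymptotic formula above with $Q = 0$ and $Q = 1$.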
Their essential properties for our purposes are summarized concisely in [84]. With a larger choice of $q$, the convolution $G * \phi^{pq}$ decays more rapidly and fewer terms are needed in the sum over $n$ to meet a specified accuracy. This consideration must be balanced against the cost of maintaining $q$ different wavelets.

Description of multi-wavelet expansion (MWE) method. According to the analysis in [84], the Alpert-Rokhlin multi-wavelet of order $q = 4$ is found to be sufficient for accurate simulation for Hurst exponent $H = 1/3$, the Kolmogorov value. One now chooses a suitable bandwidth $b$ so that the sum over $n$ is well represented by a sum over $|n - \lfloor x \rfloor| \le b$, and imposes cutoffs $m_{\min}$ and $m_{\max}$ on the summation over the octaves $m$ in Eq. (377a). By scale invariance, we can generally put $m_{\min} = 0$ and let $m_{\max} = M - 1$, where $M$ is the number of octaves simulated. The Multi-Wavelet Expansion (MWE) Method then takes the form

$$ v_{\mathrm{MWE}}(x) = \sum_{m=0}^{M-1} 2^{-mH}\, v_{\mathrm{MWE},m}(2^m x) , $$

$$ v_{\mathrm{MWE},m}(x) = \sum_{n = \lfloor x \rfloor - b}^{\lfloor x \rfloor + b} \sum_{p=1}^{q} f_{\mathrm{MWE},p}(x - n)\, \xi^{p}_{mn} , \tag{380} $$

$$ f_{\mathrm{MWE},p}(x) = G * \phi^{pq}(x) , $$

where $\{ \xi^{p}_{mn} \;|\; p = 1, \ldots, q ;\ m = 0, \ldots, M - 1 ;\ n = 0, \pm 1, \pm 2, \ldots \}$ is a collection of independent standard Gaussian random variables.

It may be desirable in certain applications to let the bandwidth depend on the octave, $b = b_m$; see [84]. In any case, rigorous estimates are available for the error in truncating the summation over $n$, and these can be used to choose $b$ according to the accuracy desired. Such analysis may be found in the original paper [84]. We note that with the $q = 4$ Alpert-Rokhlin multi-wavelet, bandwidths on the order of $b = 5$ are already sufficient to guarantee excellent accuracy [84].

One feature which the Multi-Wavelet Expansion Method shares with the Moving Average Method is its reference to an infinite collection of independent random variables. While only finitely many need to be evaluated to determine $v_{\mathrm{MWE}}(x)$ over any finite region, there is still the practical difficulty of storing and keeping track of the random numbers as they are generated. This problem can be overcome through the use of a reversible random number generator and an indexing scheme exploiting the hierarchical structure of the MWE Method. This procedure, developed and described in [84], completely avoids the enormous memory cost which would be incurred if the random numbers needed to be precomputed and stored.
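The structure of the truncated sum (380) can be sketched as follows. Two substitutions are made here and should be read as assumptions of the illustration, not as the method of [84]: the order $q = 1$ Haar wavelet ($-1$ on $(0, 1/2)$, $+1$ on $(1/2, 1)$) stands in for the $q = 4$ Alpert-Rokhlin multi-wavelet, whose explicit form is not given in this text, and each random variable is regenerated on demand by seeding a generator with its indices, which replaces the reversible generator and indexing scheme of [84]. The preconstant $K_H$ is set to one and all function names are hypothetical.

```python
import math
import random


def xi(m: int, n: int, seed: int = 0) -> float:
    """Standard Gaussian xi_{mn}, regenerated on demand so nothing is stored."""
    return random.Random(f"{seed}:{m}:{n}").gauss(0.0, 1.0)


def antiderivative(u: float, a: float) -> float:
    """Odd antiderivative of |u|**a, valid for a > -1."""
    return math.copysign(abs(u) ** (a + 1.0), u) / (a + 1.0)


def conv_haar(x: float, H: float) -> float:
    """f(x) = (G * phi)(x) for G(x) = |x|**(H - 1/2) and the Haar wavelet."""
    a = H - 0.5
    left = antiderivative(x, a) - antiderivative(x - 0.5, a)
    right = antiderivative(x - 0.5, a) - antiderivative(x - 1.0, a)
    return right - left


def active_indices(x: float, M: int, b: int):
    """Index pairs (m, n) with 0 <= m < M and |n - floor(2**m x)| <= b."""
    pairs = []
    for m in range(M):
        n0 = math.floor(2.0 ** m * x)
        pairs += [(m, n) for n in range(n0 - b, n0 + b + 1)]
    return pairs


def v_mwe_haar(x: float, H: float, M: int, b: int, seed: int = 0) -> float:
    """Truncated hierarchical sum over M octaves with bandwidth b."""
    return sum(2.0 ** (-m * H) * conv_haar(2.0 ** m * x - n, H) * xi(m, n, seed)
               for m, n in active_indices(x, M, b))
```

Each evaluation touches `M * (2 * b + 1)` random variables for a single wavelet, so the work per point grows linearly in the number of octaves. With the Haar stand-in a small bandwidth is not expected to be accurate, since $G * \phi$ then decays only one power faster than $G$ itself.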
We finally remark that the $q = 1$ Alpert-Rokhlin multi-wavelet is nothing more than the simplest of all wavelets, the Haar wavelet [80]:

$$ \phi(x) = \begin{cases} -1 & \text{if } 0 < x < 1/2 , \\ 1 & \text{if } 1/2 < x < 1 , \\ 0 & \text{otherwise} . \end{cases} $$

It can be checked that the MWE Method using these Haar wavelets is equivalent to the straightforward hierarchical version of the Moving Average Method which we described near the beginning of Section 6.3.2. As only the zeroth-order moment of the Haar wavelet vanishes, the summation over $n$ is not so well localized in this case: $G * \phi(x) \sim C_{\pm} |x|^{H - 3/2}$ as $x \to \pm\infty$. Computations in [84] using rigorous truncation error estimates show that, to obtain good accuracy, the bandwidth $b$ would have to be orders of magnitude larger if Haar wavelets were used instead of the $q = 4$ Alpert-Rokhlin multi-wavelets. This emphasizes the importance of the choice of multi-wavelet basis in the success of the MWE Method; the fact that it is hierarchical is not sufficient in itself.

6.3.2.2. Fourier-Wavelet Method. We now describe the Fourier-Wavelet Method, a variation of the MWE Method which is easier to implement, faster, and more flexible. The Fourier-Wavelet Method is based on the same general orthonormal multi-wavelet expansion (377) as the MWE Method, but is implemented spectrally. As we shall show in a moment, a single Meyer wavelet [80] is sufficient to generate an orthonormal wavelet basis with the desired properties, so we will drop the multi-wavelet indices $p$ and $q$ in our discussion of the Fourier-Wavelet Method:

$$ v(x) = \sum_{m,n=-\infty}^{\infty} G * \phi_{mn}(x)\, \xi_{mn} , \tag{381} $$

$$ \phi_{mn}(x) = 2^{m/2} \phi(2^m x - n) , \qquad m, n = 0, \pm 1, \pm 2, \ldots . $$

Here $\{\xi_{mn}\}_{m,n=-\infty}^{\infty}$ is a collection of independent, standard, real Gaussian random variables. For fractal random fields $v(x)$, the power law form of $G$ permits the following self-similar hierarchical expression:

$$ v(x) = \sum_{m=-\infty}^{\infty} 2^{-mH}\, v_m(2^m x) , \tag{382} $$

$$ v_m(x) = \sum_{n=-\infty}^{\infty} G * \phi(x - n)\, \xi_{mn} . $$

At the end we will show how the Fourier-Wavelet Method can be applied to generate more general random fields without self-similarity properties, but for the moment we focus on fractal random fields.

The Fourier-Wavelet Method departs from the MWE Method in that the convolutions in the sum are handled spectrally. The convolution theorem from Fourier analysis ([50], p. 108) states
that for any square-integrable functions $g, h \in L^2(\mathbb{R})$:

$$ g * h(x) = \mathcal{F}^{-1}(\mathcal{F}g\, \mathcal{F}h)(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, \hat{g}(k)\, \hat{h}(k)\, dk , $$

where hats and the operator $\mathcal{F}$ each denote a Fourier transform:

$$ \hat{g}(k) = (\mathcal{F}g)(k) \equiv \int_{-\infty}^{\infty} e^{2\pi i k x}\, g(x)\, dx , $$

and $\mathcal{F}^{-1}$ denotes the inverse of the Fourier transform:

$$ (\mathcal{F}^{-1}\psi)(x) = \int_{-\infty}^{\infty} e^{-2\pi i k x}\, \psi(k)\, dk . $$

Now, from the relation (342), the Fourier transform of $G(x)$ is the square root of the energy spectrum: $(\mathcal{F}G)(k) = E^{1/2}(|k|)$. Defining $\hat{\phi} = \mathcal{F}\phi$, we have

$$ G * \phi(x) = \mathcal{F}^{-1}(E^{1/2} \hat{\phi})(x) , $$

where $E(k)$ is understood here to be extended as an even function to the negative $k$ axis. The orthonormal wavelet expansion (382) may then be written:

$$ v(x) = \sum_{m=-\infty}^{\infty} 2^{-mH}\, v_m(2^m x) , $$

$$ v_m(x) = \sum_{n=-\infty}^{\infty} f(x - n)\, \xi_{mn} , \tag{383} $$

$$ f(x) = \mathcal{F}^{-1}(E^{1/2} \hat{\phi})(x) . $$

This expression for the random field $v(x)$ may also be derived from its stochastic Fourier integral representation (337) by expanding the complex white noise $d\tilde{W}(k)$ with respect to the orthonormal basis $\{\hat{\phi}_{mn} = \mathcal{F}\phi_{mn}\}_{m,n=-\infty}^{\infty}$ in a manner similar to Eq. (375) (see [82]). Note that regardless of the method of derivation of Eq. (383), the underlying wavelet basis $\{\phi_{mn} = 2^{m/2} \phi(2^m x - n)\}_{m,n=-\infty}^{\infty}$ is defined in physical space. Its Fourier transform is an orthonormal basis by the Plancherel theorem ([50], p. 113), but not (in general) a wavelet basis, since the $\{\hat{\phi}_{mn}\}_{m,n=-\infty}^{\infty}$ are not related to each other by dilation and translation.

We now tackle the problem of localizing the summation over $n$ in Eq. (383) from the spectral perspective. What is required is that $f$ decay rapidly. But decay of $f$ in physical space is linked to the smoothness of its Fourier transform $\hat{f} = \mathcal{F}f$. Namely, if

$$ \int_{-\infty}^{\infty} \left| \frac{d^p \hat{f}(k)}{dk^p} \right| dk = C_p < \infty , $$

then (see [50], p. 117):

$$ |f(x)| \le C_p\, |2\pi|^{-p}\, |x|^{-p} . $$
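The convolution theorem that underlies the spectral evaluation of $G * \phi$ has an exact discrete counterpart, which is what a fast Fourier transform implementation relies on. The sketch below checks it for circular convolution of two sampled sequences. It is an illustration only: the function names are hypothetical, the sequences are arbitrary test data rather than a kernel and a wavelet, and NumPy's forward transform uses the opposite sign convention in the exponent from the one adopted in the text, which does not affect the theorem.

```python
import numpy as np


def circular_convolve_direct(g: np.ndarray, h: np.ndarray) -> np.ndarray:
    """Circular convolution (g * h)[j] = sum_l g[(j - l) mod N] h[l]."""
    n = len(g)
    return np.array([sum(g[(j - l) % n] * h[l] for l in range(n))
                     for j in range(n)])


def circular_convolve_fft(g: np.ndarray, h: np.ndarray) -> np.ndarray:
    """Same convolution via pointwise product of discrete Fourier transforms."""
    return np.fft.ifft(np.fft.fft(g) * np.fft.fft(h)).real
```

The direct sum costs on the order of $N^2$ operations and the transform route on the order of $N \log N$, which is one reason a spectral formulation is attractive when $f$ must be tabulated on a fine grid and then interpolated.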
We are therefore led to choose the wavelet $\phi$ so that $\hat{f}(k) = E^{1/2}(k)\, \hat{\phi}(k)$ has a sufficiently large number of bounded derivatives for the energy spectrum of interest. For the fractal random field currently under consideration, $E(k) = A_E |k|^{-1-2H}$ has a nasty singularity at $k = 0$, but is otherwise smooth. We can therefore guarantee $\hat{f}(k)$ to have $p$ bounded derivatives if the Fourier transform of the wavelet is compactly supported away from the origin, and has $p$ classical derivatives. The Meyer wavelet based on a $p$th-order perfect B-spline satisfies these properties in an optimal fashion; see [7,80,82] for the details. Numerical studies in [82] indicate that a second order ($p = 2$) perfect B-spline is a good practical choice in defining the Meyer wavelet for a fractal random field with $H = 1/3$.

The Fourier-Wavelet Method is then implemented by keeping only a finite number of octaves $m = 0, \ldots, M - 1$ and using the rapid decay of $f(x)$ to approximate the sum over its translates to high accuracy using a reasonable bandwidth $b$:

$$ v_{\mathrm{FW}}(x) = \sum_{m=0}^{M-1} 2^{-mH}\, v_{\mathrm{FW},m}(2^m x) , $$

$$ v_{\mathrm{FW},m}(x) = \sum_{n = \lfloor x \rfloor - b}^{\lfloor x \rfloor + b} f_{\mathrm{FW}}(x - n)\, \xi_{mn} , \tag{384} $$

$$ f_{\mathrm{FW}}(x) = \mathcal{F}^{-1}(E^{1/2} \hat{\phi})(x) . $$

The $\{ \xi_{mn} ;\ m = 0, \ldots, M - 1 ;\ n = 0, \pm 1, \pm 2, \ldots \}$ are independent standard Gaussian random variables. The function $f_{\mathrm{FW}}(x)$ is evaluated by fast Fourier transform and interpolation. Rigorous estimates for the numerical errors incurred in this computation, as well as for the truncation error in the summation over $n$, are given in [82].

Extension of Fourier-Wavelet Method to random fields without self-similarity. One appealing feature of the Fourier-Wavelet Method is that it may be applied without significant change to the simulation of random fields with a wide range of active scales where the energy spectrum $E(k)$ is perhaps not a simple power law. To see this, let us return to the general orthonormal wavelet expansion (381), which did not assume self-similarity of $G$:

$$ v(x) = \sum_{m,n=-\infty}^{\infty} G * \phi_{mn}(x)\, \xi_{mn} , $$

$$ \phi_{mn}(x) = 2^{m/2} \phi(2^m x - n) , \qquad m, n = 0, \pm 1, \pm 2, \ldots . $$

Using the scaling properties of the wavelets, this can always be cast in a hierarchical form:

$$ v(x) = \sum_{m=-\infty}^{\infty} v_m(2^m x) , \tag{385a} $$

$$ v_m(x) = \sum_{n=-\infty}^{\infty} f_m(x - n)\, \xi_{mn} , \tag{385b} $$
but the functions $f_m$ do not in general satisfy the scaling relation $f_m(x) = 2^{-mH} f_0(x)$, which held for fractal random fields with Hurst exponent $H$. They must instead be computed separately for each $m$:

$$ f_m(x) = G_m * \phi(x) , \qquad G_m(x) = 2^{-m/2}\, G(2^{-m} x) . \tag{386} $$

The spectral representation of these functions is (see Eq. (383))

$$ f_m(x) = \mathcal{F}^{-1}(E_m^{1/2} \hat{\phi})(x) , \tag{387} $$

$$ E_m(k) = 2^m E(2^m k) . \tag{388} $$

Localization of the summations of translates of these functions in Eq. (385b) requires that the wavelet $\phi$ be chosen so that each of the $f_m$ decays rapidly. Viewed in physical space (386), this appears to be a complicated task, but it can be done quite simply in the spectral framework (387). We simply need to ensure that

$$ \hat{f}_m(k) = (\mathcal{F}f_m)(k) = E_m^{1/2}(k)\, \hat{\phi}(k) $$

has sufficiently many bounded derivatives for all octaves $m$ retained in the simulation. The Meyer wavelet based on a $p$th-order perfect B-spline still works well for this purpose [82]. Because its Fourier transform is compactly supported away from the origin and has $p$ bounded classical derivatives, $\hat{f}_m(k)$ will have $p$ bounded derivatives for any smooth spectrum $E(k)$, which may even have strong algebraic singularities at $k = 0$ and $k = \infty$. The form of the Fourier-Wavelet Method for general random fields can therefore be written:

$$ v_{\mathrm{FW}}(x) = \sum_{m=0}^{M-1} v_{\mathrm{FW},m}(2^m x) , $$

$$ v_{\mathrm{FW},m}(x) = \sum_{n = \lfloor x \rfloor - b}^{\lfloor x \rfloor + b} f_{\mathrm{FW},m}(x - n)\, \xi_{mn} , \tag{389} $$

$$ f_{\mathrm{FW},m}(x) = \mathcal{F}^{-1}(E_m^{1/2} \hat{\phi})(x) , $$

$$ E_m(k) = 2^m E(2^m k) , $$

where $\{ \xi_{mn} ;\ m = 0, \ldots, M - 1 ;\ n = 0, \pm 1, \pm 2, \ldots \}$ is a collection of independent standard Gaussian random variables, and $\phi$ is a Meyer wavelet constructed from a perfect B-spline of order $p$. Practical choices of $p$ may depend on the application. The functions $f_{\mathrm{FW},m}(x)$ are computed by fast Fourier transform and interpolation.

6.3.2.3. Comparison of the Fourier-Wavelet and Multi-Wavelet Expansion Methods. We have already pointed out one advantage of the Fourier-Wavelet Method over the Multi-Wavelet Expansion (MWE) Method, in that the Fourier-Wavelet Method is applicable to random fields with general spectra $E(k)$. In localizing the summation over translates of $f_{\mathrm{MWE},p}(x)$ in the MWE Method, Eq. (380), a moment cancellation criterion was used to choose the Alpert-Rokhlin multi-wavelet. This criterion arose from a far-field expansion of the specific moving average weighting function $G(x) = K_H |x|^{H - 1/2}$ associated with fractal random fields.
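The octave-rescaled spectrum $E_m(k) = 2^m E(2^m k)$ of Eq. (388) reduces to the fractal scaling when $E$ is a pure power law: $E_m^{1/2}(k) = 2^{-mH} E^{1/2}(k)$ for $E(k) = A_E |k|^{-1-2H}$, so that $f_m = 2^{-mH} f_0$ and Eq. (389) collapses to Eq. (384). The short sketch below checks this numerically and shows that it fails for a spectrum that is not a power law. The function names and the exponentially damped example spectrum are hypothetical choices made for illustration.

```python
import math


def rescaled_spectrum(E, m: int):
    """Return the function E_m(k) = 2**m * E(2**m * k)."""
    def E_m(k: float) -> float:
        return 2.0 ** m * E(2.0 ** m * k)
    return E_m


def power_law_spectrum(A: float, H: float):
    """Fractal spectrum E(k) = A * |k|**(-1 - 2H)."""
    return lambda k: A * abs(k) ** (-1.0 - 2.0 * H)


def damped_spectrum(k: float) -> float:
    """Hypothetical non-self-similar example: power law with exponential cutoff."""
    return abs(k) ** (-5.0 / 3.0) * math.exp(-abs(k))
```

For the power law the ratio `rescaled_spectrum(E, m)(k) / E(k)` is independent of `k`, so one function $f$ serves every octave. For the damped spectrum the ratio changes with `k`, which is why the $f_{\mathrm{FW},m}$ must then be tabulated octave by octave.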
The criterion used for localizing the summation over the analogous functions $f_{\mathrm{FW},m}(x)$ in the Fourier-Wavelet Method (389), on the other hand, does not rely on specific assumptions about the energy spectrum $E(k)$. Furthermore,
because the Fourier-Wavelet Method is a spectral method, it is compatible with spectral techniques for solving partial differential equations. The MWE Method has no such compatibility with anisotropic spectra generated by solutions of partial differential equations. The Fourier-Wavelet Method is also much simpler to implement and faster than the MWE Method, as detailed in [82,84].

6.3.3. Comparison of simulation results

We now demonstrate the excellent performance which results from the careful mathematical design of the MWE and Fourier-Wavelet Methods. We will also discuss the relative merits of these wavelet-based methods and the nonhierarchical Randomization Method in simulating fractal random fields. First we will focus on the simulation of the random velocity field $v(x)$ itself, and then turn to the simulation of the relative turbulent diffusion of a pair of tracers being swept at a constant rate across a steady, fractal random shear flow.

Recall that a Gaussian fractal random field is characterized by the velocity increment between two points, $v(x) - v(x')$, being a mean zero Gaussian random variable with variance varying as a power law of the separation distance between the observation points:

$$ \langle (v(x) - v(x'))^2 \rangle = S'_v\, |x - x'|^{2H} . \tag{390} $$
6.3.3.1. Consistency of Wavelet Methods. The "rst issue we check is that the wavelet methods will indeed generate random "elds with clean scaling behavior in the ideal limit in which "nite Monte Carlo sampling error can be ignored. We test whether the rescaled mean-square velocity di!erence of the velocity "eld simulated by the MWE method: 1(v (x)!v (x))2 +5# S' (x, x)" +5# +5# "x!x"& is indeed approximately constant, as it should be for a consistent approximation to a fractal random "eld satisfying Eq. (390). We plot in Fig. 31 a histogram of this function for a MWE random "eld with M"40 octaves, wavelet order q"4, and bandwidth b"5 alongside the corresponding histogram for the SRA algorithm with in"nitely many octaves [87]. (In the histogram, S' (x, x) is normalized by S' rather than a maximum of S' (x, x) over the sampled +5# T +5# unit square, but as can be seen from the histogram, these values are very nearly the same.) The histogram for the MWE Method is much narrower than that of the SRA Method, showing only a 6% variation of S' (x, x) for the sampled values in the square 04x, x41. Note that only +5# "nitely many octaves are retained for the MWE computation, so the histogram is broadened somewhat by the breakdown of scaling for "x!x"+1, the largest retained scale. Even sharper constancy for S' (x, x) can be expected on scales well separated from the cuto!s. In any case, +5# Fig. 31 demonstrates that the MWE Method consistently generates a fractal random "eld with accurate scaling (390) and homogeneity of its increments in the limit of in"nitely many realizations. 6.3.3.2. Simulations of velocity xeld structure function. A striking feature of the wavelet Monte Carlo methods is their low variance: averaging over an accessibly small number of realizations still produces phenomenally clean statistical scaling. In Fig. 32, we present the numerically computed
538
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
Fig. 32. Monte Carlo simulations for the structure function S (x) of a fractal velocity "eld with Hurst exponent H" T using MWE Method with M"40 octaves, wavelet order q"4, and bandwidth b"5. The simulated structure functions are plotted with#symbols, computed from averages over 10 (upper graph), 100 (middle graph), and 1000 (lower graph) realizations. The structure function of the true fractal random "eld is plotted with a solid line (from [87]).
structure function of the velocity field:

$$S_v(x) = \langle (v(x + x') - v(x'))^2 \rangle$$

for $x' = 0$, averaged over 10, 100, and 1000 independent realizations of $H = 1/3$ fractal random fields $v(x)$ generated by the MWE Method. For all three sample sizes, including the one with only 10 realizations, this structure function obeys a power law with exponent 0.66, in very good agreement with the correct value $2H = 2/3$. Moreover, the scaling coefficient $S^I_{MWE}(x, x')$ remains within 8% of the correct constant value $S^I_v$ over the entire 12 decades of scaling. Only 2500 active computational elements are needed to generate each realization. The use of the Alpert-Rokhlin wavelets to localize the computation is crucial to this efficiency. With Haar wavelets, orders of magnitude more wavelets would have to be retained for comparable accuracy [84]. Many more practical numerical details about the MWE Method may be found in [84].

The Fourier-Wavelet Method also enjoys great practical success. The plots in Fig. 33 depict the velocity field structure function simulation results using the Fourier-Wavelet Method with $M = 40$ octaves and bandwidth $b = 10$. We observe quite good agreement with the fractal scaling law (390) over nine decades. A log-log least-squares power law fit produces a scaling exponent 0.668 versus a true value of $2/3$, while the fit of the scaling coefficient is 0.635 versus a true value of $S^I_v = 0.639$ in this simulation. The statistics of the sample were shown in [82] to be highly Gaussian, in that the flatness factor of the two-point velocity difference

$$F_{FW}(x) = \frac{\langle (v_{FW}(x) - v_{FW}(0))^4 \rangle}{\langle (v_{FW}(x) - v_{FW}(0))^2 \rangle^2}$$

remains within 0.5 of the Gaussian value $F(x) = 3$ over the nine decades of accurate scaling. The deviations from Gaussianity are purely due to finite sample size; the underlying simulation formulas for the wavelet methods describe Gaussian random fields.
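The two diagnostics quoted above, a log-log least-squares power law fit and the flatness factor, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of [82,84]: the function names `fit_power_law` and `flatness` are hypothetical, the structure function data are synthetic (an exact power law with the prefactor 0.639 quoted above), and the increments are plain Gaussian samples rather than increments of a simulated velocity field.

```python
import math
import random


def fit_power_law(xs, ss):
    """Least-squares fit of log S = log C + p log x; returns (p, C)."""
    lx = [math.log(x) for x in xs]
    ls = [math.log(s) for s in ss]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_s = sum(ls) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxs = sum((a - mean_x) * (b - mean_s) for a, b in zip(lx, ls))
    p = sxs / sxx                       # fitted scaling exponent
    c = math.exp(mean_s - p * mean_x)   # fitted scaling coefficient
    return p, c


def flatness(increments):
    """Sample flatness <d^4> / <d^2>^2 of a list of velocity increments d."""
    n = len(increments)
    m2 = sum(d ** 2 for d in increments) / n
    m4 = sum(d ** 4 for d in increments) / n
    return m4 / m2 ** 2


# Synthetic structure function S(x) = 0.639 x^(2/3) over nine decades (assumed data).
xs = [10.0 ** (-k) for k in range(1, 10)]
ss = [0.639 * x ** (2.0 / 3.0) for x in xs]
exponent, prefactor = fit_power_law(xs, ss)

# Flatness of Gaussian samples should be close to 3, up to sampling error.
rng = random.Random(0)
gaussian_flatness = flatness([rng.gauss(0.0, 1.0) for _ in range(100_000)])
```

On real Monte Carlo output, `xs` and `ss` would be the separations and the sample-averaged squared increments, and the flatness would be evaluated separately at each separation.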
Finally, we apply the Randomization Method to the simulation of a fractal random field with a Hurst exponent $H = 1/3$ (see Section 6.2.3 for an introduction to the Randomization Method). The Randomization Method makes explicit reference to the energy spectrum, which for the present fractal random field is formally

$$E(k) = A_E |k|^{-5/3} .$$

For the formulas of the Randomization Method to be well defined, the total energy $\int E(k)\, dk$ must be finite, so we introduce a numerical cutoff: $E(k) = 0$ for $|k| \le k_0$.
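For a power-law spectrum truncated below a cutoff, the energy above any wavenumber is available in closed form, so a subdivision of wavenumber space into compartments of equal energy can be written down explicitly. The sketch below assumes the spectrum $A k^{-5/3}$ on $k > k_0$ only (positive wavenumbers), and assumes that the wavenumber inside a compartment is drawn with probability density proportional to the energy spectrum; the precise sampling rule of Section 6.2.3 is not restated here, and the names `compartment_edges`, `energy_between`, and `sample_wavenumber` are hypothetical.

```python
import math
import random


def compartment_edges(k0, M):
    """Edges b_0 = k0 < b_1 < ... < b_M = inf of M equal-energy compartments
    for E(k) = A k^(-5/3) on k > k0 (energy above k is (3A/2) k^(-2/3))."""
    return [k0 * (1.0 - j / M) ** -1.5 for j in range(M)] + [math.inf]


def energy_between(a, b, A=1.0):
    """Integral of A k^(-5/3) from a to b (b may be infinite)."""
    upper = 0.0 if math.isinf(b) else b ** (-2.0 / 3.0)
    return 1.5 * A * (a ** (-2.0 / 3.0) - upper)


def sample_wavenumber(j, k0, M, rng):
    """Draw one wavenumber from compartment j by inverting the cumulative
    energy fraction (assumed energy-weighted sampling)."""
    f = (j + rng.random()) / M          # cumulative energy fraction in [j/M, (j+1)/M)
    return k0 * (1.0 - f) ** -1.5


M = 256
edges = compartment_edges(1.0, M)
rng = random.Random(3)
modes = [sample_wavenumber(j, 1.0, M, rng) for j in range(M)]
```

The last compartment extends to infinite wavenumber, which is harmless here because it still carries only a fraction $1/M$ of the total energy.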
We subdivide wavenumber space into $M = 256$ compartments $\{I_j\}_{j=1}^{M}$ containing equal amounts of energy:

$$\int_{I_j} E(k)\, dk = \frac{1}{M} \int E(k)\, dk ,$$

and construct the simulated velocity field as a superposition of Fourier modes with one wavenumber selected randomly from each of these $M = 256$ bands. We are thereby able to obtain a random
Fig. 33. Monte Carlo simulations for the structure function $S_v(x)$ of a fractal velocity field with Hurst exponent $H = 1/3$ using the Fourier-Wavelet Method with $M = 40$ octaves, bandwidth $b = 10$ for 2000 realizations (from [82]). The upper graph shows the simulated structure function as dots and the structure function of the true fractal random field as a solid line. The lower plot shows the ratio of the simulated to the true structure function.

field with roughly 4 decades of accurate scaling behavior in the structure function when averaged over 2000 independent realizations (Fig. 34). Unlike the wavelet methods, the simulation formula for the Randomization Method does not describe a Gaussian random field, so one may well expect significant departures from Gaussianity in the simulated sample. As shown in [82], the flatness
Fig. 34. Monte Carlo simulations for the structure function $S_v(x)$ of a fractal velocity field with Hurst exponent $H = 1/3$ using the Randomization Method with $M = 256$ compartments in the partition and 2000 realizations (from [82]). The upper graph shows the simulated structure function as dots and the structure function of the true fractal random field as a solid line. The lower graph shows the ratio of the simulated to the true structure function.
factor computed from the above sample of 2000 fields simulated by the Randomization Method,

$$F_R(x) = \frac{\langle (v_R(x) - v_R(0))^4 \rangle}{\langle (v_R(x) - v_R(0))^2 \rangle^2} ,$$

is within 0.5 of the Gaussian value 3 only for the upper three decades of its four decade scaling regime. Over the lowest decade of scaling, the flatness factor becomes very large, and the simulated field on these scales is strongly non-Gaussian. This sort of behavior was also observed for other Randomization Method simulations with other choices of parameters; see [82] for further details and discussion.

6.3.3.3. Simulations of relative tracer diffusion in fractal random steady shear flow. We have seen above that the wavelet Monte Carlo methods and the Randomization Method are each capable of generating random fields with several decades of self-similar scaling behavior. Our particular interest is to apply these methods to simulating tracer motion in velocity fields with wide inertial ranges. Therefore, it is prudent to check directly that the tracer motion is simulated accurately in an exactly solvable model. Even though we have verified the quality of the simulated velocity fields in several ways, we attempt to understand subtle discrepancies which may have strong cumulative effects on the simulation of tracer motion. To this end, we introduce an extension of the Random Steady Shear (RSS) Model, which we used as a benchmark problem for nonhierarchical Monte Carlo methods in Section 6.2. As in the RSS Model, we take the velocity field as a two-dimensional steady random shear flow with constant cross sweep:
$$\mathbf{v}(\mathbf{x}, t) = \mathbf{v}(x, y, t) = \begin{pmatrix} \bar{w} \\ v(x) \end{pmatrix} ,$$

where $v(x)$ is a mean zero, homogeneous, Gaussian random field with correlation function expressed through its energy spectrum:

$$R(x) = \langle v(x + x') v(x') \rangle = \int_{-\infty}^{\infty} E(|k|)\, e^{2\pi i k x}\, dk = 2 \int_0^{\infty} E(k) \cos(2\pi k x)\, dk .$$

In the Fractal Random Steady Shear (FRSS) Model, we shall choose the steady shear flow as a fractal random field with formal energy spectrum

$$E(k) = A_E |k|^{-1-2H} ,$$

with Hurst exponent $0 < H < 1$. As we mentioned above, there is some technical difficulty in this definition because of the infrared divergence of the energy spectrum at $k = 0$, but in any numerical implementation there will be some effective cutoffs imposed at both large and small wavenumbers. For practical purposes, then, it is only important that the statistical quantities we consider are insensitive to such cutoffs when they are sufficiently remotely separated from the scales of interest. The velocity structure function is one such statistical quantity, and the mean-square relative displacement of a pair of tracers is another; see Section 3.5. The structure functions of the simulated velocity fields have been examined above, and we now turn to the problem of correctly simulating
the mean-square relative displacement of a pair of tracers along the direction of the shear:

$$\sigma_Y^2(t) \equiv \langle (Y^{(1)}(t) - Y^{(2)}(t))^2 \rangle .$$

We assume in the following that the cross sweep is nontrivial, $\bar{w} \ne 0$, and that there is no molecular diffusion ($\kappa = 0$). By the same methods used in [141] and Section 3.2, one can compute an exact formula for $\sigma_Y^2(t)$ in the FRSS Model [84]:

$$\sigma_Y^2(t) = \frac{S^I_v}{(1 + H)(1 + 2H)|\bar{w}|^2} \left( \tfrac{1}{2} |\Delta x - \bar{w} t|^{2+2H} + \tfrac{1}{2} |\Delta x + \bar{w} t|^{2+2H} - |\bar{w} t|^{2+2H} - |\Delta x|^{2+2H} \right) , \qquad S^I_v = -2 A_E \pi^{2H + 1/2} \frac{\Gamma(-H)}{\Gamma(H + \tfrac{1}{2})} . \tag{391}$$

Here $\Delta x = X^{(1)}(t) - X^{(2)}(t) = X^{(1)}(0) - X^{(2)}(0)$, and we have assumed that $Y^{(1)}(0) = Y^{(2)}(0)$. Since $\Gamma(-H) < 0$ for $0 < H < 1$, the constant $S^I_v$ is positive.

In the numerical studies, the Hurst exponent is chosen as the Kolmogorov value $H = 1/3$, and space and time are nondimensionalized so that $\bar{w} = 1$ and $A_E = 1$. The largest resolved scale (or lowest wavenumber) in the velocity field is taken as 1 in these nondimensionalized units. In [82,84], it is shown that the mean-square tracer displacements in an FRSS flow simulated by the MWE, Fourier-Wavelet, and Randomization Methods all closely follow the exact relation over several decades, provided that sufficiently many octaves are included, a sufficiently small integration step size is taken, and a sufficiently large sample size is used in the average. We will simply present a few illustrative results, and refer the reader to [82,84] for a much more thorough exploration and for practical guidelines concerning choices of parameters. Because of the extra expense of integrating particle trajectories, a smaller number of octaves is retained in the simulated velocity field in these validation studies than in the above demonstrations of the capacity of the wavelet methods to simulate velocity fields with extraordinarily wide self-similar inertial ranges (Figs. 32-34).

In Fig. 35, we show that the mean-square relative tracer displacement $\sigma_Y^2(t)$ can be simulated by the MWE Method [84] over five decades of length scales with error never exceeding 3.5%.
These results were obtained by averaging over 1000 realizations, with $M = 30$ octaves in the simulated velocity field and a specially chosen octave-dependent bandwidth yielding a total of 792 active wavelets. The integration step size $\Delta t$ for the particle trajectory is set to a suitable value of one-fifth of the initial separation $\Delta x = 100\delta$ of the tracers. The plot uses units rescaled by the parameter $\delta = 2^{-\ldots}$; note that the smallest length scale in the velocity field is $2^{-M} = 2^{-30} = 2^{-\ldots}\delta$. It is shown in [84] that the dominant contribution to the error comes from the finite sample size. Moreover, in contrast to the systematic errors generated by some of the nonhierarchical Monte Carlo Methods in Section 6.2, the errors in MWE simulations typically fluctuate about zero.

Similarly solid results are found for the Fourier-Wavelet Method. With $M = 16$ octaves and bandwidth $b = 10$, the mean-square relative tracer separation $\sigma_Y^2(t)$ can be simulated with 6% accuracy as it increases through a decade of scales [82]. The Randomization Method also gives similar results [82]. Note, however, that as in the RSS Model, $\sigma_Y^2(t)$ depends only on the second order statistics of the velocity field and is thus insensitive to the fact that the Randomization Method generates a velocity field with strong departures from Gaussianity over some length scales.
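The exact benchmark value of the mean-square relative displacement can be cross-checked numerically. The sketch below is a hedged illustration, not the validation code of [82,84]: it assumes the pure power-law structure function $S_v(x) = S^I_v |x|^{2H}$, uses the identity that the correlation of two velocity differences is a combination of structure functions, reduces the double time integral to a single one, and compares a midpoint-rule quadrature with our reading of the closed form (391). All function names are hypothetical.

```python
import math


def structure_constant(H, A_E=1.0):
    """S^I_v = -2 A_E pi^(2H + 1/2) Gamma(-H) / Gamma(H + 1/2)."""
    return -2.0 * A_E * math.pi ** (2.0 * H + 0.5) * math.gamma(-H) / math.gamma(H + 0.5)


def sigma_Y2_exact(t, dx, H, w=1.0, S=1.0):
    """Closed form for <(Y1(t) - Y2(t))^2> with Y1(0) = Y2(0)."""
    p = 2.0 + 2.0 * H
    bracket = (0.5 * abs(dx - w * t) ** p + 0.5 * abs(dx + w * t) ** p
               - abs(w * t) ** p - abs(dx) ** p)
    return S * bracket / ((1.0 + H) * (1.0 + 2.0 * H) * w * w)


def sigma_Y2_quadrature(t, dx, H, w=1.0, S=1.0, n=20000):
    """Midpoint-rule evaluation of 2 * int_0^t (t - tau) f(tau) dtau, where
    f(tau) = (S/2) (|dx + w tau|^2H + |dx - w tau|^2H - 2 |w tau|^2H)
    is the correlation of the relative shear velocity at time lag tau."""
    q = 2.0 * H
    h = t / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        f = 0.5 * S * (abs(dx + w * tau) ** q + abs(dx - w * tau) ** q
                       - 2.0 * abs(w * tau) ** q)
        total += (t - tau) * f
    return 2.0 * total * h


H = 1.0 / 3.0
exact = sigma_Y2_exact(3.0, 1.0, H)
numeric = sigma_Y2_quadrature(3.0, 1.0, H)
```

With the nondimensionalization used in the text, `w = 1`, and `S` would be set to `structure_constant(1/3)`; the comparison above uses `S = 1` because the prefactor cancels in the relative error.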
Fig. 35. Monte Carlo simulations for tracer pair dispersion in FRSS Model with Hurst exponent $H = 1/3$ using the MWE Method with $M = 30$ octaves, wavelet order $q = 4$, a total of 792 wavelets, averaged over 1000 realizations (from [84]). The upper graph compares the simulated relative mean-square tracer displacement $\sigma_Y^2(t)$ (dotted line) with the exact analytical value (solid line). The vertical axis is rescaled by a factor $\delta = 2^{-\ldots}$, the initial tracer separation is $\Delta x = 100\delta$, and the integration time step $h = 0.2\Delta x$. The lower plot displays the ratio of the simulated $\sigma_Y^2(t)$ to its exact value.
6.3.3.4. Relative advantages of wavelet and randomization Monte Carlo methods. Our above discussion is a brief synopsis of the validation studies in [82,84] which demonstrate that the MWE, Fourier-Wavelet, and Randomization Methods are each capable of generating ensembles of steady random shear flows with several decades of inertial-range scaling and sufficient accuracy
for turbulent diffusion simulations. We will now briefly mention some of the relative merits in efficiency and power of these three methods. A more complete discussion may be found in [82].

Both wavelet methods generate fractal random fields of comparable quality. As we discussed in Section 6.3.2, the Fourier-Wavelet Method has some advantages in simplicity, speed, and flexibility over the MWE Method, and we therefore consider it as the wavelet method of choice. The Fourier-Wavelet Method generates random fields which are much closer to Gaussian than those generated by the Randomization Method. Therefore, if the statistical quantities of interest are sensitive to the higher order statistics of the random field, then the Fourier-Wavelet Method is the preferred method. If it is only required that the second-order structure function of the random field exhibit several decades of self-similar scaling, then the choice between the Fourier-Wavelet Method and the Randomization Method comes down mostly to computational cost and difficulty, which we discuss briefly next.

Being hierarchical and local in nature, the wavelet methods have a memory cost that grows only linearly with the number of simulated decades of the fractal random field. The cost of the Randomization Method, on the other hand, is exponential in the number of scaling decades [82]. The wavelet methods, however, have a much higher overhead, and there is a crossover in the relative computational efficiency between the wavelet and Randomization Methods. The rule of thumb enunciated in [82] is that if 4-5 or fewer decades of scaling behavior are needed in the random field, the Randomization Method is more computationally efficient (and simpler to implement). If a wider scaling regime is desired, then the Fourier-Wavelet Method has superior efficiency.
To avoid serious memory limitations, it is important that the random numbers in the Fourier-Wavelet Method be generated on demand using a reversible random number generator, as described in [84,85].
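To make the idea of a reversible generator concrete, the sketch below uses a 64-bit linear congruential generator, whose state can be stepped backward as well as forward because an odd multiplier is invertible modulo $2^{64}$. This is only an illustration of reversibility and of cheap jumps to a distant position in the stream; it is an assumption of ours and not necessarily the generator described in [84,85]. The constants and the names `step_forward`, `step_backward`, and `jump` are illustrative choices.

```python
M_MOD = 2 ** 64
A_MUL = 6364136223846793005      # odd multiplier, hence invertible mod 2^64
C_INC = 1442695040888963407      # odd increment
A_INV = pow(A_MUL, -1, M_MOD)    # modular inverse (Python 3.8+)


def step_forward(state):
    """One forward step x -> a x + c (mod 2^64)."""
    return (A_MUL * state + C_INC) % M_MOD


def step_backward(state):
    """Exact inverse of step_forward."""
    return (A_INV * (state - C_INC)) % M_MOD


def jump(state, n):
    """Advance n steps (n may be negative) in O(log |n|) operations by
    composing the affine map with itself via repeated squaring."""
    if n >= 0:
        a, c = A_MUL, C_INC
    else:
        a, c = A_INV, (-A_INV * C_INC) % M_MOD
        n = -n
    acc_a, acc_c = 1, 0           # identity map x -> x
    while n:
        if n & 1:
            acc_a, acc_c = (a * acc_a) % M_MOD, (a * acc_c + c) % M_MOD
        a, c = (a * a) % M_MOD, (a * c + c) % M_MOD
        n >>= 1
    return (acc_a * state + acc_c) % M_MOD


def to_unit_interval(state):
    """Map the top 53 bits of the state to a float in [0, 1)."""
    return (state >> 11) / float(1 << 53)
```

With such a generator the random coefficient attached to a given wavelet can be recomputed from its index whenever it is needed, instead of being stored.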
6.4. Multidimensional simulations

Thus far, we have been considering one-dimensional random fields, appropriate for turbulent diffusion in random, steady shear flows. General turbulent velocity fields will be multidimensional vector fields, so we need to extend the successful one-dimensional Monte Carlo methods to multiple dimensions. The Randomization Method has a straightforward multidimensional implementation. One need only partition the multidimensional wavenumber space into compartments and define probability distributions for the wavenumber selected within each according to the same principle as that described for the one-dimensional case in Section 6.2.3. A minor and easily handled complication is that the amplitudes of the Fourier modes are now Gaussian random vectors rather than Gaussian random scalars, and one must account for correlations between the various components of the velocity field ([163], Section 1.4).

Developing a multidimensional version of the wavelet methods appears a bit more daunting. While the abstract wavelet expansions behind both the MWE and Fourier-Wavelet Methods have direct vector-valued analogues, one is faced with the task of choosing a vector-valued wavelet basis which will efficiently localize the computation. Fortunately, there is a simpler approach for the special but important case in which the turbulent velocity field $\mathbf{v}(\mathbf{x})$ is Gaussian, incompressible,
and statistically isotropic (see Section 4.2.2). Such a velocity field can be well approximated by an appropriate finite superposition of random shear waves rotated in various directions, and these random shear waves are in turn built out of one-dimensional Gaussian homogeneous random fields [85,208]. This Rotated Random Shear Wave Approximation therefore supplies a means of numerically simulating a Gaussian, incompressible, statistically isotropic vector field using any of the Monte Carlo methods for generating one-dimensional Gaussian homogeneous random fields described in Sections 6.2 and 6.3.

Elliott and the first author [85] have utilized this idea to construct a numerical approximation of a two-dimensional, Gaussian, isotropic, incompressible fractal random velocity field $\mathbf{v}(\mathbf{x})$ out of one-dimensional random shear flows generated by the wavelet methods. The resulting multidimensional synthetic random velocity field inherits the wide scaling ranges generated by the one-dimensional wavelet methods, and 1000 realizations are sufficient to yield nearly Gaussian and isotropic sample statistics. We will mention some explicit numerical validation results from [82,85] in Section 6.4.2. We also briefly discuss the application of the Rotated Random Shear Wave Approximation to the one-dimensional Randomization Method, which in fact produces two-dimensional random velocity fields with statistical properties superior to those generated by the direct multi-dimensional formulation of the Randomization Method indicated above.

We note that incompressibility is not an essential constraint. An arbitrary, Gaussian, statistically isotropic random vector field (which need not be incompressible) can be well approximated by a superposition of one-dimensional random fields through a more general random Radon plane wave decomposition. The interested reader may find the necessary modifications to the Rotated Random Shear Wave Approximation in [85].

6.4.1. Rotated random shear wave approximation

We shall now demonstrate how an incompressible, statistically isotropic, Gaussian random velocity field can be well approximated by a superposition of Gaussian random shear flows pointing in various directions. Let the given incompressible, statistically isotropic, Gaussian random velocity field $\mathbf{v}(\mathbf{x})$ have mean zero and correlation tensor $\mathbf{R}(\mathbf{x})$. It can be represented in terms of the (scalar) energy spectrum $E(k)$ in the following way (273):

$$\langle \mathbf{v}(\mathbf{x} + \mathbf{x}') \otimes \mathbf{v}(\mathbf{x}') \rangle = \mathbf{R}(\mathbf{x}) = \int_{\mathbb{R}^d} \frac{2 E(|\mathbf{k}|)}{(d - 1) A_{d-1} |\mathbf{k}|^{d-1}}\, e^{2\pi i \mathbf{k} \cdot \mathbf{x}}\, \mathbf{P}(\mathbf{k})\, d\mathbf{k} . \tag{392}$$

Here $A_{d-1}$ is the area of the unit sphere $S^{d-1}$ in $\mathbb{R}^d$. The tensor $\mathbf{P}(\mathbf{k})$ is the projection operator onto the plane perpendicular to $\mathbf{k}$:

$$\mathbf{P}(\mathbf{k}) = \mathbf{I} - \frac{\mathbf{k} \otimes \mathbf{k}}{|\mathbf{k}|^2} ,$$

and enforces incompressibility of $\mathbf{v}(\mathbf{x})$.

Next we introduce some useful notation. The symbol $\hat{\mathbf{e}}_j$ in what follows denotes a unit vector in the $j$th coordinate direction. $\mathbf{U}_{\hat{\boldsymbol{\vartheta}}} \in SO(d)$ is defined as the unique rotation matrix which maps $\hat{\mathbf{e}}_1$ to the unit vector $\hat{\boldsymbol{\vartheta}} \in S^{d-1}$, and leaves all vectors orthogonal to $\hat{\boldsymbol{\vartheta}}$ and $\hat{\mathbf{e}}_1$ invariant.
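The role of the projection operator can be checked directly. The sketch below, with hypothetical helper names `projection` and `matvec`, builds the matrix of the projection onto the plane perpendicular to a wavevector and verifies numerically the properties that make a Fourier mode with amplitude in its range divergence-free: the projected amplitude is orthogonal to the wavevector, the operator is idempotent, and its trace is $d - 1$. The wavevector and amplitude are arbitrary test values.

```python
def projection(k):
    """Matrix of P(k) = I - k k^T / |k|^2 as a list of rows."""
    d = len(k)
    k2 = sum(c * c for c in k)
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(d)]
            for i in range(d)]


def matvec(P, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in P]


k = [1.0, -2.0, 0.5]          # arbitrary wavevector in d = 3
a = [0.3, 0.7, -1.1]          # arbitrary mode amplitude
P = projection(k)
Pa = matvec(P, a)

# k . (P a) = 0 is the Fourier-space statement of incompressibility.
k_dot_Pa = sum(ki * pi for ki, pi in zip(k, Pa))
# Applying P twice changes nothing (idempotence).
PPa = matvec(P, Pa)
trace = sum(P[i][i] for i in range(len(k)))
```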
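The Rotated Random Shear Wave Approximation is motivated by writing the Fourier integral (392) as an iterated integral over the magnitude and direction of the wavevector $\mathbf{k}$:

$$\mathbf{R}(\mathbf{x}) = \frac{1}{A_{d-1}} \int_{S^{d-1}} \mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x})\, d\hat{\boldsymbol{\vartheta}} , \qquad \mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x}) = \int_0^{\infty} \cos(2\pi k\, \hat{\boldsymbol{\vartheta}} \cdot \mathbf{x})\, \frac{2 E(k)}{d - 1}\, \mathbf{P}(\hat{\boldsymbol{\vartheta}})\, dk . \tag{393}$$

We now observe that $\mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x})$ is the correlation tensor of a certain superposition of $d - 1$ simple random shear layers all varying along the direction $\hat{\boldsymbol{\vartheta}}$ and directed orthogonally to $\hat{\boldsymbol{\vartheta}}$ and one another. Specifically, let $\{v_j(x)\}_{j=2}^{d}$ be a collection of independent, homogeneous, Gaussian random scalar fields with mean zero and common correlation function:

$$\langle v_j(x + x') v_j(x') \rangle = R_{SW}(x) \equiv \int_{-\infty}^{\infty} e^{2\pi i k x}\, \frac{E(|k|)}{d - 1}\, dk = \int_0^{\infty} \cos(2\pi k x)\, \frac{2 E(k)}{d - 1}\, dk .$$

Next, we form the canonical shear wave velocity field $\mathbf{v}_{SW}(\mathbf{x})$ in $\mathbb{R}^d$ varying along the first coordinate direction, with components built from these random fields:

$$\mathbf{v}_{SW}(\mathbf{x}) = \sum_{j=2}^{d} v_j(x_1)\, \hat{\mathbf{e}}_j . \tag{394}$$

In $d = 2$ dimensions, $\mathbf{v}_{SW}(\mathbf{x})$ is just a standard shear layer. In $d = 3$ dimensions, $\mathbf{v}_{SW}(\mathbf{x})$ takes the form of a random planar shear wave, in which the velocity field is directed within planes of constant $x_1$ and is uniform within each of these shearing planes. The canonical random shear wave $\mathbf{v}_{SW}(\mathbf{x})$ is a Gaussian random vector field with mean zero and correlation tensor:

$$\mathbf{R}_{SW}(\mathbf{x}) \equiv \langle \mathbf{v}_{SW}(\mathbf{x} + \mathbf{x}') \otimes \mathbf{v}_{SW}(\mathbf{x}') \rangle = \sum_{j=2}^{d} R_{SW}(x_1)\, \hat{\mathbf{e}}_j \otimes \hat{\mathbf{e}}_j = R_{SW}(x_1) (\mathbf{I} - \hat{\mathbf{e}}_1 \otimes \hat{\mathbf{e}}_1) = R_{SW}(x_1)\, \mathbf{P}(\hat{\mathbf{e}}_1) .$$

Next, we define $\mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}$ as the random shear wave obtained by rotating the vector field $\mathbf{v}_{SW}(\mathbf{x})$ (via $\mathbf{U}_{\hat{\boldsymbol{\vartheta}}}$) to the definite (deterministic) direction $\hat{\boldsymbol{\vartheta}} \in S^{d-1}$:

$$\mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x}) \equiv \mathbf{U}_{\hat{\boldsymbol{\vartheta}}}\, \mathbf{v}_{SW}(\mathbf{U}_{\hat{\boldsymbol{\vartheta}}}^{T} \mathbf{x}) .$$

This rotated velocity field is also a mean zero Gaussian random vector field, and its correlation tensor is obtained by the following transformation:

$$\langle \mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x} + \mathbf{x}') \otimes \mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x}') \rangle = \mathbf{U}_{\hat{\boldsymbol{\vartheta}}}\, \mathbf{R}_{SW}(\mathbf{U}_{\hat{\boldsymbol{\vartheta}}}^{T} \mathbf{x})\, \mathbf{U}_{\hat{\boldsymbol{\vartheta}}}^{T} = R_{SW}(\hat{\mathbf{e}}_1 \cdot \mathbf{U}_{\hat{\boldsymbol{\vartheta}}}^{T} \mathbf{x})\, \mathbf{U}_{\hat{\boldsymbol{\vartheta}}} \mathbf{P}(\hat{\mathbf{e}}_1) \mathbf{U}_{\hat{\boldsymbol{\vartheta}}}^{T} = R_{SW}(\hat{\boldsymbol{\vartheta}} \cdot \mathbf{x})\, \mathbf{P}(\hat{\boldsymbol{\vartheta}}) = \mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x}) .$$

Thus, $\mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x})$ is exactly the correlation tensor of a Gaussian random shear wave varying along the direction $\hat{\boldsymbol{\vartheta}}$.

Since the correlation tensor of the desired statistically isotropic vector field $\mathbf{v}(\mathbf{x})$ is expressed as an average (393) of $\mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x})$ over all directions $\hat{\boldsymbol{\vartheta}}$, we are led to the following means of statistically approximating $\mathbf{v}(\mathbf{x})$ in terms of random shear waves. First choose a set of $M$ deterministic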
directions $\hat{\boldsymbol{\vartheta}}^{(j)} \in S^{d-1}$, $j = 1, \ldots, M$, at least approximately equally spaced around $S^{d-1}$. Define the Rotated Random Shear Wave Approximation:

$$\mathbf{v}_{RRSW}(\mathbf{x}) = \frac{1}{\sqrt{M}} \sum_{j=1}^{M} \mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}^{(j)}}(\mathbf{x}) ,$$

where the $\{\mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}^{(j)}}(\mathbf{x})\}_{j=1}^{M}$ are random shear waves obtained from rotating $M$ statistically independent realizations of the canonical random shear wave $\mathbf{v}_{SW}(\mathbf{x})$ (394) to the directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$.

The random velocity field described by this Rotated Random Shear Wave Approximation is a Gaussian random vector field with mean zero and correlation tensor:

$$\mathbf{R}_{RRSW}(\mathbf{x}) \equiv \langle \mathbf{v}_{RRSW}(\mathbf{x} + \mathbf{x}') \otimes \mathbf{v}_{RRSW}(\mathbf{x}') \rangle = \frac{1}{M} \sum_{j=1}^{M} \mathbf{R}_{SW}^{\hat{\boldsymbol{\vartheta}}^{(j)}}(\mathbf{x}) .$$

If the directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$ are chosen to be roughly equally spaced, then the simulated correlation tensor $\mathbf{R}_{RRSW}(\mathbf{x})$ is a finite quadrature approximation to formula (393) for the correlation tensor for the true, statistically isotropic, incompressible, Gaussian random velocity field $\mathbf{v}(\mathbf{x})$. Since a mean zero Gaussian random field is determined entirely by its correlation tensor, the statistical accuracy of the approximation of the velocity field $\mathbf{v}(\mathbf{x})$ by $\mathbf{v}_{RRSW}(\mathbf{x})$ is entirely determined by the explicitly computable error of this finite quadrature approximation. In particular, by suitable choices of directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$, the velocity field $\mathbf{v}_{RRSW}(\mathbf{x})$ obtained by superposition of random shear waves can be made approximately statistically isotropic.

6.4.2. Numerical implementation of Rotated Random Shear Wave Approximation

The Rotated Random Shear Wave Approximation suggests an immediate way to simulate numerically a given statistically isotropic, incompressible, Gaussian random vector field $\mathbf{v}(\mathbf{x})$ with correlation tensor $\mathbf{R}(\mathbf{x})$. Namely, we can use one of the efficient methods discussed in Section 6.3 to generate the independent Gaussian random fields $\{v_j(x)\}_{j=2}^{d}$ which comprise the canonical shear wave $\mathbf{v}_{SW}(\mathbf{x})$ (394). $M$ independent realizations of such a shear wave are then rotated to the directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$ to give $\mathbf{v}_{RRSW}(\mathbf{x})$, which will have mean zero and correlation tensor approximately equal to that of $\mathbf{v}(\mathbf{x})$.
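The quadrature viewpoint can be illustrated in $d = 2$ with equiangular directions. The sketch below examines only the scalar cosine factor of the angular average, $\frac{1}{M}\sum_j \cos(z \cos(\vartheta_j - \varphi))$ with $z = 2\pi k |\mathbf{x}|$ and $\varphi$ the polar angle of the separation vector; for an exactly isotropic field this average would not depend on $\varphi$. The projection factor is left out, the directions are assumed to be spaced by $2\pi/M$ around the full circle, and the names `angular_average` and `anisotropy` are hypothetical.

```python
import math


def angular_average(z, phi, M):
    """Mean of cos(z cos(theta_j - phi)) over M equiangular directions
    theta_j = 2 pi j / M."""
    return sum(math.cos(z * math.cos(2.0 * math.pi * j / M - phi))
               for j in range(M)) / M


def anisotropy(z, M):
    """Difference between a separation aligned with one of the directions
    (phi = 0) and one pointing midway between two of them (phi = pi / M)."""
    return abs(angular_average(z, 0.0, M) - angular_average(z, math.pi / M, M))


coarse = anisotropy(5.0, 4)     # few directions: strongly direction dependent
fine = anisotropy(5.0, 32)      # many directions: essentially isotropic at z = 5
```

For fixed $M$ the direction dependence reappears once $z$ becomes comparable to $M$, which is one way to see why the number of directions matters.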
The main issue in this multi-dimensional extension of the Monte Carlo Methods is the choice of the set of directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$. In $d = 2$ dimensions, it is natural and practical to distribute them with equiangular spacing about the unit circle. The task is a bit trickier in $d = 3$ dimensions, since there is no way to distribute more than 20 directions in an exactly equispaced fashion. In either case, the deviation of the simulated field from statistical isotropy can be quantified and bounded by explicit quadrature formulas [85].

We will here give an explicit description of the method of generating an approximately statistically isotropic, incompressible Gaussian velocity field using the Rotated Random Shear Wave Approximation for the case of $d = 2$ dimensions. In this case, the random shear waves $\mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x})$ are just random shear layers:

$$\mathbf{v}_{SW}^{\hat{\boldsymbol{\vartheta}}}(\mathbf{x}) = \hat{\boldsymbol{\vartheta}}^{\perp}\, v(\hat{\boldsymbol{\vartheta}} \cdot \mathbf{x}) ,$$
where $\hat{\boldsymbol{\vartheta}}^{\perp}$ is a vector perpendicular to $\hat{\boldsymbol{\vartheta}}$. The approximate velocity field is then symbolically written [85]:

$$\mathbf{v}_{RRSW}(\mathbf{x}) = \frac{1}{\sqrt{M}} \sum_{j=1}^{M} \hat{\boldsymbol{\vartheta}}^{(j)\perp}\, v_j(\hat{\boldsymbol{\vartheta}}^{(j)} \cdot \mathbf{x}) ,$$

where $M$ is the number of directions used, $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$ is a collection of unit vectors regularly spaced around the unit circle $S^1$, and $\{v_j\}_{j=1}^{M}$ is a collection of independent realizations of a random homogeneous scalar one-dimensional field with energy spectrum $E(k)$ equal to that of the two-dimensional velocity field $\mathbf{v}(\mathbf{x})$ being simulated. These random scalar fields can be computed using any simulation technique. It is shown in [85] that $M \ge 16$ is needed for the simulated velocity field to be approximately (within 8%) isotropic according to a natural energy criterion, regardless of the method of simulation used for the random scalar fields $\{v_j\}_{j=1}^{M}$.

We now report the results obtained by using the Rotated Random Shear Wave Approximation in conjunction with the Multi-Wavelet Expansion (MWE) Method to simulate a two-dimensional, statistically isotropic, incompressible, fractal Gaussian random field with energy spectrum $E(k) = 2k^{-5/3}$. (We by no means imply that this energy spectrum is appropriate for a high Reynolds number two-dimensional flow in nature.) We will apply this method in Section 6.5 in our numerical study of pair dispersion in a velocity field with a wide inertial range. The simulation of the scalar fields includes $M = 52$ octaves, and the bandwidth is chosen so that only $10^{-\ldots}$ of the energy is lost by the truncation [85,84]. The computation uses 46 592 active elements. In Fig. 36, we plot the simulated velocity structure function:

$$S_v(r) = \langle |\mathbf{v}(\mathbf{x} + r\hat{\mathbf{e}}) - \mathbf{v}(\mathbf{x})|^2 \rangle$$

averaged over 10, 100, and 1000 realizations, for $\hat{\mathbf{e}}$ directed midway between two of the $M = 32$ directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}$ used for the shear layers. The simulated structure function is found to match the exact analytical formula

$$S_v(r) = S^I_v\, r^{2/3} , \qquad S^I_v = -2\, \Gamma(-1/3)\, \pi^{7/6} / \Gamma(5/6)$$

accurately over 12 decades of scales.
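The two-dimensional superposition of rotated shear layers is simple to assemble once a generator of one-dimensional scalar fields is available. The sketch below is a toy version under stated assumptions: each scalar field is a short random sum of sines and cosines (a placeholder for the MWE, Fourier-Wavelet, or Randomization output, with no attempt to match a prescribed energy spectrum), the directions are spaced by $2\pi/M$ around the circle, and the perpendicular vector is taken as the direction rotated counterclockwise by a right angle. The names `make_scalar_field`, `make_rrsw_field`, and `divergence` are hypothetical. Because each layer is directed perpendicular to the direction along which it varies, the superposition is incompressible, which the finite-difference divergence confirms.

```python
import math
import random


def make_scalar_field(rng, wavenumbers):
    """Toy random 1D field: sum of cosines and sines with Gaussian amplitudes."""
    coeffs = [(k, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for k in wavenumbers]

    def v(s):
        return sum(a * math.cos(2.0 * math.pi * k * s) + b * math.sin(2.0 * math.pi * k * s)
                   for k, a, b in coeffs)
    return v


def make_rrsw_field(M, rng, wavenumbers):
    """Velocity (x, y) -> (ux, uy) built from M independent rotated shear layers."""
    layers = []
    for j in range(M):
        angle = 2.0 * math.pi * j / M
        theta = (math.cos(angle), math.sin(angle))
        perp = (-theta[1], theta[0])          # theta rotated by +90 degrees
        layers.append((theta, perp, make_scalar_field(rng, wavenumbers)))
    norm = 1.0 / math.sqrt(M)

    def velocity(x, y):
        ux = uy = 0.0
        for theta, perp, v in layers:
            amplitude = v(theta[0] * x + theta[1] * y)
            ux += perp[0] * amplitude
            uy += perp[1] * amplitude
        return norm * ux, norm * uy
    return velocity


def divergence(velocity, x, y, h=1e-5):
    """Central finite-difference estimate of d(ux)/dx + d(uy)/dy."""
    dux = (velocity(x + h, y)[0] - velocity(x - h, y)[0]) / (2.0 * h)
    duy = (velocity(x, y + h)[1] - velocity(x, y - h)[1]) / (2.0 * h)
    return dux + duy


field = make_rrsw_field(16, random.Random(1), [1.0, 2.0, 4.0])
div_estimate = divergence(field, 0.3, 0.7)
```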
Only one to two decades of approximate scaling behavior have been achieved in previous simulations of fractal fields in two dimensions by Viecelli and Canfield [335] using Successive Random Addition and the Fourier Method on a 256 × 256 grid, and in three dimensions by Fung et al. [109] using a variant of Kraichnan's method [180] of randomly directed sinusoidal shear waves with 84 computational elements.

Power law fits to the structure function evaluated along five different directions (including the one plotted in Fig. 36) reveal excellent quantitative accuracy for the MWE-based simulation method. With only 100 realizations, the error in the fitted exponent is never more than 1.1%, and the error in the fitted prefactor $S^I_v$ is never more than 6%. Moreover, relative measures of deviations from Gaussianity and isotropy are only a few percent for a sample size of 1000. We refer to [85] for the details of the stringent tests of the quality of the simulated velocity field.
Fig. 36. Structure function of the two-dimensional velocity field with energy spectrum $E(k) = k^{-5/3}$, simulated by the MWE Method ($M = 52$ octaves) with Rotated Random Shear Wave Approximation ($M = 32$ directions). The structure function is evaluated along the radial direction $\theta = \pi/32$. The Monte Carlo statistics for (A) 10, (B) 100, and (C) 1000 realizations are plotted with diamond symbols (from [85]).
The Rotated Random Shear Wave Approximation also works successfully when the one-dimensional scalar fields are simulated by the Fourier-Wavelet Method or the Randomization Method. Details may be found in [82]. We note only that the velocity fields simulated using the Randomization Method in conjunction with the Rotated Random Shear Wave Approximation are smoother and have longer scaling regimes than those simulated by a straightforward generalization
of the Randomization Method to two dimensions [142]. This points out once again that one must pay heed to the relative variance of Monte Carlo Methods in practice, and not just their theoretical accuracy in the asymptotic limit of infinitely many realizations.

Another closely related way of simulating statistically isotropic random vector fields by a superposition of shear waves is to choose the directions $\{\hat{\boldsymbol{\vartheta}}^{(j)}\}_{j=1}^{M}$ randomly from a uniform distribution over the sphere $S^{d-1}$ (see [208]). There would be two main disadvantages to this variation as compared to a regularly spaced, deterministic choice of directions. First, the simulated velocity field would be non-Gaussian. More importantly, the variance of the Monte Carlo Method would be greater, and a larger number of realizations would be required to achieve a desired accuracy.

6.5. Simulation of pair dispersion in the inertial range

We close our section on Monte Carlo methods for turbulent diffusion with a numerical study of the turbulent dispersion of a pair of tracers in a synthetic, statistically isotropic turbulent flow with a wide inertial range of scales. We have already analyzed this problem theoretically in two simplified contexts. In Section 3.5, we developed exact formulas for the pair distance function, the PDF for the separation between a pair of tracers, in an anisotropic turbulent shear flow (with no molecular diffusion). We also derived (following Kraichnan [179]) an explicit PDE in Section 4.2.1 for the pair-distance function in a statistically isotropic velocity field with extremely rapid decorrelations in time; see Eq. (268) and the ensuing discussion. No exact solutions, however, appear available for pair dispersion in multi-dimensional turbulent flows decorrelating at a finite rate. Such a problem is of significant applied interest in engineering and atmosphere-ocean science, since the relative diffusion of a pair of tracers is connected with the growth of the size of a cloud of tracers released in a fluid.
6.5.1. Richardson's law

We concentrate, as in our previous treatments of pair dispersion, on the growth of the separation distance $l(t) \equiv |\mathbf{X}^{(1)}(t) - \mathbf{X}^{(2)}(t)|$ between a pair of tracers as it evolves through a wide inertial range of scales. We will further specialize our attention to the mean-square tracer separation $\sigma_X^2(t) = \langle l^2(t) \rangle$ rather than the full pair-distance function. As we mentioned in Section 4.2.1, Richardson [284] empirically observed that the mean-square separation between balloons released into the atmosphere grows according to a cubic power law: $\sigma_X^2(t) \sim t^3$. Obukhov [252,253] later showed that such a result could be theoretically deduced through an inertial-range similarity hypothesis and dimensional analysis, and formulated it as the following universal inertial-range prediction:

$$\sigma_X^2(t) \approx C_R\, \bar{\varepsilon}\, t^3 \quad \text{for } L_K \ll (\sigma_X^2(t))^{1/2} \ll L_0 . \tag{395}$$

Here $\bar{\varepsilon}$ is the energy dissipation rate, and $C_R$ represents the universal Richardson's constant. The statement (395) is generally referred to as Richardson's $t^3$ law. There has been a large effort to derive this law and predict its associated constant $C_R$ by turbulence closure theories [178,192,200,241,322] and to empirically confirm it and measure $C_R$ through actual experiments [248,258,261,315] and numerical simulations [109,291,351]. We note that Richardson [284] also formulated a considerably stronger statement (see [31]) that the relative diffusivity of a pair of
tracers is proportional to the 4/3 power of their momentary (unaveraged) separation, and this has been called Richardson's 4/3 law. In what follows, we will strictly discuss Richardson's $t^3$ law.

6.5.2. Monte Carlo simulation of pair dispersion

Here we describe the first numerical experiments, performed by Elliott and the first author [86], which exhibited Richardson's $t^3$ law over many decades of pair separation. Synthetic, two-dimensional, incompressible, Gaussian random velocity fields were generated through the Multi-Wavelet Expansion (MWE) Method and the Rotated Random Shear Wave Approximation which we described in Section 6.4.2. Recall that this method is capable of simulating approximately statistically isotropic, incompressible, Gaussian random velocity fields which support an accurately self-similar inertial range:

$$\langle |\mathbf{v}(\mathbf{x}+\mathbf{r}) - \mathbf{v}(\mathbf{x})|^2 \rangle = S_v^I\, |\mathbf{r}|^{2H} , \tag{396}$$

extending over 12 decades of scales. The basic algorithm was validated for applications in turbulent diffusion on an exactly solvable steady shear layer model (see Section 6.3.3), and on an exactly solvable statistically isotropic model in which the velocity field is rapidly decorrelating in time (see Section 4.2.2).

The simulated velocity field $\mathbf{v}(\mathbf{x})$ varies only in space, and is frozen in time. Pair dispersion proceeds very differently in a frozen, random two-dimensional velocity field than in realistic, temporally evolving turbulent flows. To introduce temporal fluctuations in the numerical simulation, we sweep the frozen velocity field past the laboratory frame by a constant velocity field $\mathbf{w}$. This corresponds exactly to Taylor's hypothesis ([320], p. 253) for relating experimental time-series measurements to the spatial structure of the turbulence. The tracers are not transported by the constant sweep in the numerical simulation, and we also ignore molecular diffusion ($\kappa = 0$). The equations of motion for the tracers are then

$$\frac{d\mathbf{X}^{(j)}(t)}{dt} = \mathbf{v}(\mathbf{X}^{(j)}(t) - \mathbf{w}t) , \qquad \mathbf{X}^{(j)}(t=0) = \mathbf{x}^{(j)} . \tag{397}$$

Note how the constant sweeping explicitly induces temporal fluctuations in the velocity field seen by the tracers. It is natural to associate the sweep velocity $w$ with the magnitude of the velocity fluctuations at the largest simulated scale $L_u$ of the inertial range:

$$w \approx \langle ((\mathbf{v}(\mathbf{x} + L_u \hat{\mathbf{e}}) - \mathbf{v}(\mathbf{x})) \cdot \hat{\mathbf{e}})^2 \rangle^{1/2} . \tag{398}$$

The numerical length scale $L_u$ is roughly equivalent to the integral length scale $L_0$ in our theoretical studies. As usual, $\hat{\mathbf{e}}$ denotes any unit vector. Using the inertial-range relation for the root-mean-square longitudinal velocity difference appearing on the right-hand side of Eq. (398):

$$\langle ((\mathbf{v}(\mathbf{x}+\mathbf{r}) - \mathbf{v}(\mathbf{x})) \cdot (\mathbf{r}/|\mathbf{r}|))^2 \rangle = S_{v,\parallel}^I\, |\mathbf{r}|^{2H} ,$$

we are led to set

$$w = (S_{v,\parallel}^I)^{1/2} L_u^{H} . \tag{399}$$
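The tracer equations (397) can be illustrated with a minimal sketch. The frozen field below is a toy superposition of random shear waves with equiangular directions, not the Multi-Wavelet Expansion of [86]. The names `make_frozen_field` and `advect_pair`, the wavenumber range, the amplitude scaling, and the fixed-step Runge-Kutta integrator (the simulations in [86] use an adaptive time step) are all assumptions made for illustration.

```python
import numpy as np


def make_frozen_field(n_dirs=32, hurst=1.0 / 3.0, seed=0):
    """Toy frozen, incompressible 2-D field (hypothetical stand-in for MWE).

    Each mode is a shear wave: it varies along the unit vector d and points
    along perp, which is orthogonal to d, so its divergence vanishes.
    """
    rng = np.random.default_rng(seed)
    theta = np.pi * np.arange(n_dirs) / n_dirs          # equiangular directions
    d = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    perp = np.stack([-np.sin(theta), np.cos(theta)], axis=1)
    k = 2.0 ** rng.uniform(0.0, 6.0, n_dirs)            # assumed wavenumber range
    amp = k ** (-hurst) * rng.standard_normal(n_dirs)   # assumed k^(-H) amplitudes
    phase = rng.uniform(0.0, 2.0 * np.pi, n_dirs)

    def v(x):
        x = np.asarray(x, dtype=float)
        arg = (x @ d.T) * k + phase                     # shape (..., n_dirs)
        return (amp * np.cos(arg)) @ perp               # shape (..., 2)

    return v


def advect_pair(v, x0, w, dt, n_steps):
    """Integrate dX/dt = v(X - w t), Eq. (397), with fixed-step RK4."""
    x = np.array(x0, dtype=float)
    w = np.asarray(w, dtype=float)
    t = 0.0
    for _ in range(n_steps):
        k1 = v(x - w * t)
        k2 = v(x + 0.5 * dt * k1 - w * (t + 0.5 * dt))
        k3 = v(x + 0.5 * dt * k2 - w * (t + 0.5 * dt))
        k4 = v(x + dt * k3 - w * (t + dt))
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


v = make_frozen_field()
start = np.array([[0.0, 0.0], [1e-3, 0.0]])             # two tracers, small separation
end = advect_pair(v, start, w=[1.0, 0.0], dt=1e-3, n_steps=200)
separation = np.linalg.norm(end[0] - end[1])
```

Note that the sweep enters only through the argument of the velocity field: the tracers themselves are not carried along by the sweep, as stated above. Averaging `separation**2` over many seeds would give a crude estimate of the mean-square separation for this toy field.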
Statistical isotropy (in $d = 2$ dimensions) implies that $S_v^I = (2H+2)\, S_{v,\parallel}^I$. Quantities are next nondimensionalized with respect to the length scale $L_u$ and the time scale $L_u/|\mathbf{w}|$. In these nondimensionalized units, $L_u$, $w$, and $S_{v,\parallel}^I$ are all equal to unity.

6.5.3. Pair separation statistics obtained from Monte Carlo simulation

Here we present the results of the Monte Carlo simulations [86] for the pair separation statistics which utilize the algorithm described above with the Hurst exponent chosen as the Kolmogorov value $H = 1/3$. The initial particle separation is chosen as $l_0 = 10^{-[\ldots]}$, which is well within the resolution capabilities of the Monte Carlo algorithm being used. The adaptive time step strategy is described and validated in [86]. Averages are computed over 1024 realizations. The graph of the mean-square pair separation $\sigma_X^2(t) = \langle |\Delta\mathbf{X}|^2(t) \rangle$ in Fig. 37 indicates a power law behavior which sets in after about $t = 100$ and persists for eight decades of pair separation. The
Fig. 37. Plot of the mean-square tracer pair separation $\sigma_X^2(t)$ versus time (from [86]). Hurst exponent $H = 1/3$, initial separation $l_0 = 10^{-[\ldots]}$, averaged over 1024 realizations.
graph of the logarithmic derivative of $\sigma_X^2(t)$ versus time in Fig. 38 oscillates mildly with a mean value 3, providing an independent and much more stringent confirmation of Richardson's $t^3$ law. Finally, in Fig. 39, we graph the variation of $A(t) = \sigma_X^2(t)\, t^{-3}$, which is just the prefactor in Richardson's $t^3$ law. Remarkably, as the reader can see by comparing Figs. 37 and 39, the prefactor settles down over more than 7.5 decades of pair separation to the constant value $0.031 \pm 0.004$.

We recall that one of the main computational devices in the Monte Carlo algorithm used above is the approximation of an isotropic incompressible Gaussian random velocity field by a superposition of a large number ($M = 32$) of independent simple shear layers oriented in various directions with equiangular spacing. If instead only a small number of independent shear layer directions are utilized, then the simulated random field is anisotropic but has an energy spectrum similar to that of the isotropic case. Pair dispersion simulations using only $M = 2$ or 4 directions were conducted in
Fig. 38. Plot of the logarithmic derivative of the mean-square tracer pair separation, $\gamma = d \ln \sigma_X^2(t)/d \ln t$, versus time (from [86]). Solid line indicates $\gamma = 3$ predicted by Richardson's $t^3$ law. Hurst exponent $H = 1/3$, initial separation $l_0 = 10^{-[\ldots]}$, averaged over 1024 realizations.
Fig. 39. Plot of the scaling prefactor in the mean-square tracer pair separation, $A(t) \equiv \sigma_X^2(t)\, t^{-3}$, versus time (from [86]). Hurst exponent $H = 1/3$, initial separation $l_0 = 10^{-[\ldots]}$, averaged over 1024 realizations.
[83] to investigate the effects of anisotropy on Richardson's $t^3$ law. It was found that Richardson's $t^3$ law remains valid over many decades of separation. Moreover, the prefactor $A(t)$ is approximately constant over the scaling regime and nearly universal. For both $M = 2$ and $M = 4$, with various angles between the constant sweep $\mathbf{w}$ and the directions of the shear flows comprising the velocity field, the best fit constant values for the scaling coefficient $A(t)$ fell within the range of 0.029 to 0.032, which includes the isotropic value 0.031 computed above. These results give strong evidence that the Richardson constant $C_R$ in (395) is universal for Gaussian random fields with a wide self-similar inertial range, whether they are isotropic or anisotropic. The adjustment time to achieve the scaling behavior can vary, however, with the degree of anisotropy.

Other statistics are measured in [86] which quantify the intermittency of the pair separation process. In particular, the separation distance $\Delta\mathbf{X}(t)$ is found to have a broader-than-Gaussian distribution, and Richardson's $t^3$ law is crudely obeyed by the mean-square particle separation averaged over only two realizations.
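The two diagnostics plotted in Figs. 38 and 39 are straightforward to compute from a sampled time series. The following is a minimal sketch, assuming the mean-square separation is available on a grid of times; the function names and the synthetic test signal are illustrative and are not taken from [86].

```python
import numpy as np


def local_exponent(t, sigma2):
    """Logarithmic derivative gamma(t) = d ln(sigma2) / d ln(t)."""
    return np.gradient(np.log(sigma2), np.log(t))


def scaling_prefactor(t, sigma2, exponent=3.0):
    """Compensated separation A(t) = sigma2(t) * t**(-exponent)."""
    return sigma2 * t ** (-exponent)


# Synthetic check: an exact cubic law with the prefactor quoted in the text.
t = np.logspace(-4, 0, 200)
sigma2 = 0.031 * t ** 3
gamma = local_exponent(t, sigma2)
A = scaling_prefactor(t, sigma2)
```

For an exact power law both diagnostics are flat. The logarithmic derivative is the more sensitive of the two because it does not integrate over earlier times, which is why it serves as the more stringent test.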
6.5.4. Relation to other work concerning Richardson's $t^3$ law

In addition to providing a numerical demonstration of Richardson's $t^3$ law over many decades of scales, the results of the above Monte Carlo simulation pose some interesting challenges for various theories which seek to predict the statistics of pair separation in the inertial range. We shall separately discuss issues pertaining to the $t^3$ scaling of the mean-square particle separation and the computed value of the scaling preconstant.

6.5.4.1. Open problem: Theoretical explanations for Richardson's $t^3$ law for velocity field satisfying Taylor's hypothesis. The mean-square pair separation $\sigma_X^2(t)$ has been demonstrated to obey Richardson's $t^3$ law in an extraordinarily clean way over eight decades of scales, and the underlying numerical algorithm has been extensively validated for simulating turbulent diffusion [84,85]. It is therefore most remarkable that no theory of which we are aware clearly predicts that Richardson's $t^3$ law should hold for the velocity field with the spatio-temporal dynamics used in the simulation! Recall that the velocity field in the laboratory frame $\mathbf{v}_L(\mathbf{x}, t)$ is given by sweeping a frozen random velocity field $\mathbf{v}(\mathbf{x})$ at a constant velocity $\mathbf{w}$:

$$\mathbf{v}_L(\mathbf{x}, t) = \mathbf{v}(\mathbf{x} - \mathbf{w}t) .$$

The frozen field $\mathbf{v}(\mathbf{x})$ is a Gaussian random, statistically isotropic, incompressible flow with a wide inertial range and the Kolmogorov value $H = 1/3$ for the Hurst exponent. The tracers are advected by $\mathbf{v}_L(\mathbf{x}, t)$; see Eq. (397). A key difference between the simulated field $\mathbf{v}_L(\mathbf{x}, t)$ and the usual random velocity models assumed in turbulence theories is that the temporal decorrelation for $\mathbf{v}_L(\mathbf{x}, t)$ is explicitly set through Taylor's hypothesis by a constant sweep velocity $\mathbf{w}$ (which is naturally equated in magnitude with the large-scale velocity fluctuations in $\mathbf{v}(\mathbf{x})$).
Physical scaling considerations [86] indicate that the sweep velocity $\mathbf{w}$ should be included along with $\bar{\epsilon}$ and $t$ in the list of a priori relevant parameters describing the inertial-range dynamics of pair separation in the simulation described above. Dimensional analysis is then insufficient to predict a unique inertial-range scaling behavior for $\sigma_X^2(t)$. Obukhov's inertial-range similarity arguments therefore do not explain, even qualitatively, Richardson's $t^3$ law for a velocity field with spatial statistics given by Kolmogorov theory and temporal statistics set by Taylor's hypothesis. We now briefly mention some other modern theories which suggest why Richardson's $t^3$ law should be observed in various contexts, and indicate why none of these, as they stand, provides a clear explanation for the scaling behavior observed in the Monte Carlo numerical simulations.

Some researchers [21,258,351] have pointed out that a cubic growth of the mean-square displacement could arise for reasons having nothing to do with inertial-range scaling. For example, Babiano and coworkers [21,351] show that a cubic growth of the mean-square distance between a pair of tracers will occur over ranges of scales in which the accelerations of the tracers are independent of one another and statistically stationary. These considerations may well describe reasons why Richardson's $t^3$ law is observed in experimental situations and numerical simulations (such as [108]) where Obukhov's similarity arguments do not apply or on scales extending outside the inertial range of the velocity field. The $\sigma_X^2(t) \sim t^3$ scaling behavior in the Monte Carlo simulation reported in Section 6.5.3, however, cannot be explained so simply. This is demonstrated by other similar numerical simulations in [86] with different values of the Hurst exponent $H$ describing the inertial-range scaling of the velocity field (396). It is found for $H = 0.2$, 0.3, and 0.4 that the mean-square particle separation
has power law scaling $\sigma_X^2(t) \sim t^{\gamma}$ over several decades, with $\gamma \approx 2/(1-H)$ within the small error 0.03. Therefore, the scaling behavior of the pair dispersion in the Monte Carlo simulations under discussion is fundamentally related to the Hurst exponent $H$, and cannot be explained by the above class of theories which does not take the scaling properties of the inertial range into account.

It is moreover interesting to note that the dependence of the scaling exponent on the Hurst exponent, $\gamma = 2/(1-H)$, is in accord with a variety of theories [178,192,351] which assume that the only relevant time scale describing the pair separation dynamics at a scale $L$ in the inertial range is the eddy turnover time:

$$\tau_e(L) = \frac{L}{\delta v_\parallel(L)} \approx (S_{v,\parallel}^I)^{-1/2} L^{1-H} . \tag{400}$$

Here $\delta v_\parallel(L) \approx (S_{v,\parallel}^I)^{1/2} L^{H}$ is the root-mean-square longitudinal velocity difference observed between points separated by a distance $L$. As seen in Eq. (400), the eddy turnover time is simply a natural advective time scale at scale $L$. Consequently, any analytical or phenomenological theory for inertial-range pair dispersion (such as that described in [351]) which involves only length scales and the mean-square (longitudinal) velocity difference across such scales is implicitly assuming that the only relevant time scale is the eddy turnover time. For a flow satisfying Taylor's hypothesis, there is however another relevant time scale set by the time taken for the constant sweep to travel a distance $L$:

$$\tau_s(L) = L/|\mathbf{w}| .$$

When the sweep velocity is matched to the magnitude of the large-scale velocity fluctuations, as it is in the Monte Carlo simulations described above, then the sweeping time scale $\tau_s(L)$ is much shorter than the eddy turnover time $\tau_e(L)$ for all scales within the inertial range [86,319]. Therefore, the sweeping time scale has an a priori importance in the dynamics of tracers in a flow satisfying Taylor's hypothesis. Formally, it appears that $\tau_s(L)$ should be setting the Lagrangian correlation time, which, as we have discussed in Section 3, plays a crucial role in determining the statistical dynamics of a tracer. It is far from clear why pair separation in a flow satisfying Taylor's hypothesis should obey the scaling laws predicted by theories which ignore the presence of any large-scale sweeping mechanism. Indeed, there is unambiguous mathematical evidence [14,208] that the nature of the spatio-temporal energy spectrum can have a substantial influence on pair dispersion.
Moreover, if the Lagrangian History Direct Interaction Approximation (LHDIA) used by Kraichnan [178] or the Eddy-Damped Quasi-Normal Markovian Approximation (EDQNM) used by Larchevêque and Lesieur [192] are crudely modified to account for the sweeping by replacing each appearance of the eddy turnover time by the sweeping time scale $L/|\mathbf{w}|$, they will predict pair dispersion behavior very different from Richardson's $t^3$ law for $H = 1/3$ and its generalization $\sigma_X^2(t) \sim t^{2/(1-H)}$ for general $H$. It would be most interesting to see whether and how these or other [200] turbulence closure theories could properly take sweeping effects into account in a more sophisticated way, and to obtain a clear understanding of why Richardson's $t^3$ (or more general $t^{2/(1-H)}$) law should still be observed within the inertial range of a velocity field with temporal decorrelations set by Taylor's hypothesis. Some subtle consequences of sweeping have been explicitly and rigorously demonstrated for random shear flow models in [14,141], and were discussed in Section 3. The importance of sweeping effects is not limited to flows satisfying Taylor's hypothesis; tracer pairs in any turbulent flow are subjected to (nonconstant) sweeping by the large scales of the flow [319,320].
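The competition between the two time scales discussed above can be made concrete with a short numerical sketch. It assumes the nondimensional units of Section 6.5.2, in which the largest inertial-range scale, the sweep speed, and the longitudinal structure-function prefactor are all equal to one. The function names and the labels `tau_s` and `tau_e` for the sweeping and eddy turnover times are shorthand introduced here for illustration.

```python
def eddy_turnover_time(L, hurst, s_par=1.0):
    """tau_e(L) = L / dv(L), with rms velocity difference dv(L) = sqrt(s_par) * L**hurst."""
    return L / (s_par ** 0.5 * L ** hurst)


def sweeping_time(L, w=1.0):
    """tau_s(L) = L / |w|: time for the constant sweep to cross a distance L."""
    return L / abs(w)


def pair_dispersion_exponent(hurst):
    """Exponent gamma in sigma^2 ~ t**gamma when tau_e is the only time scale.

    If the separation l grows on the time scale tau_e(l) ~ l**(1 - hurst),
    then l ~ t**(1 / (1 - hurst)) and l**2 ~ t**(2 / (1 - hurst)).
    """
    return 2.0 / (1.0 - hurst)


H = 1.0 / 3.0
for L in (1e-2, 1e-4, 1e-6):
    ratio = sweeping_time(L) / eddy_turnover_time(L, H)   # equals L**H in these units
    print(f"L = {L:.0e}: tau_s / tau_e = {ratio:.3g}")

for h in (0.2, 0.3, 1.0 / 3.0, 0.4):
    print(f"H = {h:.3f}: gamma = {pair_dispersion_exponent(h):.3f}")
```

In these units the ratio of sweeping time to eddy turnover time is $L^{H}$, which is below one for every scale $L < 1$ and shrinks toward the small scales. This is the sense in which the sweeping time is the shorter of the two throughout the inertial range, while the exponent $2/(1-H)$ reduces to 3 at the Kolmogorov value.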
6.5.4.2. Theoretical overprediction of Richardson constant. Since the value of the Richardson constant $C_R$ in his $t^3$ law (395) has been the object of extensive experimental [248,261,315], theoretical [124,178,192,200,241,299,322], and numerical [109,291] investigation in various contexts, we relate the results reported in Section 6.5.3 to those developed elsewhere. By comparing the inertial-range relation used in Eq. (399), with the prefactor $S_{v,\parallel}^I = 1$, to the theoretical Kolmogorov relation for the longitudinal velocity fluctuation:

$$\langle ((\mathbf{v}(\mathbf{x}+\mathbf{r}) - \mathbf{v}(\mathbf{x})) \cdot (\mathbf{r}/|\mathbf{r}|))^2 \rangle = C_\parallel^I\, \bar{\epsilon}^{2/3} |\mathbf{r}|^{2/3} ,$$

with experimentally measured dimensionless constant $C_\parallel^I \approx 2.0$ (in three dimensions), we can associate an effective value of $\bar{\epsilon} = (C_\parallel^I)^{-3/2} \approx 0.35$, that is, $\bar{\epsilon}^{-1} \approx 2.8$, to the simulation. Dividing the measured prefactor $0.031 \pm 0.004$ by this value of $\bar{\epsilon}$, the Monte Carlo simulations presented here therefore predict a Richardson constant of

$$C_R = 0.09 \pm 0.01$$

in the scaling law (395) for pair dispersion in a two-dimensional, incompressible, Gaussian, random, isotropic velocity field which possesses an isotropic Kolmogorov spectrum and satisfies Taylor's hypothesis.

This value agrees reasonably well with the one obtained by Tatarski [315], $C_R = 0.06$, in his experiments. Ozmidov [261] has argued from his experimental data that the appropriate range for $C_R$ is $O(10^{-[\ldots]})$. Sabelfeld [291] used the Randomization Method (Section 6.2.3) to study pair dispersion over one decade of scales in a three-dimensional, statistically isotropic synthetic turbulent flow satisfying Taylor's hypothesis, and obtained the value $C_R = 0.25 \pm 0.03$. Fung et al., in an interesting paper [109], did not study pair dispersion in a flow satisfying Taylor's hypothesis, but instead built synthetic three-dimensional turbulent velocity fields with Kolmogorov spatio-temporal statistics as in Section 3.4.3. Inertial-range scaling (396) was satisfied for less than one decade (in contrast to the 12 decades in the methods [84–86] utilized above); nevertheless, Richardson's $t^3$ law was observed for 1.5 decades of pair separation with a Richardson constant $C_R = 0.1$.
All of the empirical work just mentioned points to a small value of the Richardson constant, $C_R$, and the direct simulations spanning many decades of pair separation reported in [86] and Section 6.5.3 confirm a small value $C_R = 0.09 \pm 0.01$ for pair dispersion in a flow which has a wide inertial scaling range and satisfies the assumptions of Taylor's hypothesis.

On the other hand, turbulence closure theories produce values of $C_R$ that are a full order of magnitude larger. With LHDIA, Kraichnan [178] predicted $C_R = 2.42$; with another Lagrangian-history closure, Lundgren [200] predicted $C_R = 3.06$; an EDQNM procedure [192] leads to $C_R = 3.50$; a stochastic, Markovian two-particle model [322] has $C_R = 1.33$; some quasi-Gaussian approximations predicted $C_R = 0.534$ [241] and $C_R = 2.49$ [57]; and some Langevin equation models [124,299] produced $C_R = 0.667$. What are the reasons for the wide discrepancies between these closure theories and the results mentioned in the previous paragraph regarding the value of $C_R$? One source may be the way in which the closure theories treat the temporal dynamics of the tracers [86]. Kraichnan [178] found that under a certain rapid decorrelation in time limit within his LHDIA calculation, the pair separation would continue to obey Richardson's $t^3$ law but with a scaling constant 50 times larger. Another possibility for the depression of Richardson's constant below theoretically predicted values is the flow topology. The theories may not be taking into account the slowing of the relative diffusion of a tracer pair as it passes through regions in which vorticity dominates strain [109].
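The arithmetic connecting the measured prefactor to the Richardson constant, and the size of the gap to the closure predictions, can be checked in a few lines. This sketch assumes only the numbers quoted above: a prefactor of 0.031 with uncertainty 0.004, a longitudinal Kolmogorov constant of about 2.0, and a unit structure-function prefactor, so that the effective dissipation rate is the Kolmogorov constant raised to the power $-3/2$ (its reciprocal is about 2.8). The variable names and dictionary labels are illustrative.

```python
C_PAR = 2.0                 # longitudinal Kolmogorov constant quoted in the text
A, A_ERR = 0.031, 0.004     # measured prefactor in sigma^2 = A t^3

# With a unit structure-function prefactor, C_PAR * eps_bar**(2/3) = 1.
eps_bar = C_PAR ** -1.5
c_r = A / eps_bar           # Richardson constant from sigma^2 = C_R * eps_bar * t^3
c_r_err = A_ERR / eps_bar

closure_predictions = {
    "LHDIA [178]": 2.42,
    "Lagrangian-history closure [200]": 3.06,
    "EDQNM [192]": 3.50,
    "Markovian two-particle model [322]": 1.33,
    "quasi-Gaussian [241]": 0.534,
    "quasi-Gaussian [57]": 2.49,
    "Langevin models [124,299]": 0.667,
}
ratios = {name: value / c_r for name, value in closure_predictions.items()}

print(f"eps_bar = {eps_bar:.3f}, C_R = {c_r:.3f} +/- {c_r_err:.3f}")
for name, r in ratios.items():
    print(f"{name}: {r:.1f} times the simulated value")
```

With these inputs the closure predictions exceed the simulated constant by factors ranging from about 6 to about 40.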
6.5.4.3. General remarks on the role of Monte Carlo simulations. We have seen in the above discussion an excellent instance of the valuable interaction between mathematics, reliable numerical simulations, and physical theories. Mathematical considerations suggested the basis of an efficient and accurate Monte Carlo algorithm for simulating turbulent diffusion in flows with a wide inertial range, and exactly solvable mathematical model problems and other considerations were used to validate and scrutinize the method (Section 6.3). This numerical algorithm was then utilized to explore turbulent diffusion in more realistic flows which are still described in a mathematically straightforward fashion (inertial-range scaling, Gaussian statistics, statistical isotropy), but for which exact solutions are no longer available. The results from these Monte Carlo simulations (Section 6.5.3) then pose new test problems for approximate physical theories for turbulent diffusion. One advantage of numerical simulations with synthetic velocity fields over laboratory experiments in this regard is the fact that the turbulent environment is specified in a mathematically transparent fashion, so the challenge for physical theories can be posed with a suitable degree of complexity. For example, the predictions of turbulent diffusion theories can be first examined for accuracy without taking into account intermittency and other nonideal features of a turbulent velocity field. Furthermore, as we discussed in Sections 4 and 5 above, Gaussian velocity fields can often induce non-Gaussian statistics in a passive scalar field at long times similar to those induced by more complex non-Gaussian velocity fields. For Richardson's $t^3$ law, such expected behavior has been confirmed recently [42].
7. Approximate closure theories and exactly solvable models

We have demonstrated throughout this report how simple mathematical models can illustrate various subtle physical mechanisms involved in turbulent diffusion. In Section 6, we also discussed how simple models manifesting a complex variety of behavior can be used to assess the virtues and shortcomings of numerical simulation methods, and how they can lead to and validate the design of more powerful and efficient algorithms. In this concluding section, we mention how the simple mathematical models can be used in a similar spirit to test the robustness of various approximate theoretical closure theories for turbulent diffusion. We will be intentionally brief because the reader may find extensive discussions of these applications in the original work of Avellaneda and the first author [13,17] and the recent review paper of Smith and Woodruff [300].

We discussed at the beginning of Section 3 the inherent difficulty of deriving effective large-scale equations for the mean passive scalar density due to the active fluctuations of the velocity field over a wide inertial range of scales. The rigorous homogenization theory described in Section 2 cannot be applied in general because there is usually no strong scale separation between the length scales of the passive scalar and velocity field. Various schemes for deriving approximate large-scale equations for the mean passive scalar density in the absence of scale separation have been proposed, proceeding from a diverse collection of frameworks and formal assumptions [48,49,57,175,177,182,227,285,286,321,327,328,344]. However, the equations produced by different theories are generally distinct, and it is usually difficult to determine whether the formal hypotheses on which the different theories are founded are satisfied. Tests of the theoretical predictions against laboratory experiments and direct numerical simulations are therefore crucial [133]. Experimental assessments, however, face certain limitations
concerning both the extent to which the input parameters can be faithfully matched between the theory and the laboratory setup, and the extent to which accurate and comprehensive data can be collected in high Reynolds number flows. Direct numerical simulations, on the other hand, are constrained by hardware limitations to moderate Reynolds numbers [59,154], particularly if nontrivial macroscale variations are present. Simplified mathematical models therefore provide an important complementary means of examining the accuracy and content of approximate closure theories. We have seen how exact characterization of the passive scalar statistics may be achieved in a variety of nontrivial mathematical models. These often allow precise characterization of turbulent diffusion in important asymptotic limits as well as at finite parameter values and over finite time intervals. We particularly mention in this regard the Simple Shear Models described in Section 3 for which exact equations describing the high Reynolds number behavior of the mean passive scalar density have been derived using a rigorous renormalization procedure [10]. It is particularly instructive to compare these exact equations with the predictions of approximate closure theories to gain some insight into their strengths and shortcomings. Such a study was carried out by Avellaneda and the first author [13] for closures based on the renormalization group theory (RNG) [300,344] and renormalized perturbation theory (in particular, Kraichnan's Direct Interaction Approximation [173–175,177,197,285] and the first-order smoothing (quasi-normal) approximation [48,49]). Each of the approximate closure theories recovers the correct large-scale equations for a subset of the phase diagram of scaling exponents $(\epsilon, z)$ (see Section 3.4.3), but predicts incorrect equations in other substantial regions [13].
In particular, the RNG theory is exact for those Simple Shear Models in which the correlation time of the velocity field is much shorter than the dynamical time scale of the passive scalar field, but fails otherwise [13,17,300]. The RNG theory always predicts a local effective diffusion equation with some enhanced eddy diffusivity, but the rigorous results of the Simple Shear Model indicate that this is inappropriate in a wide variety of situations [10]. The RNG theory also predicts incorrect space-time rescalings for certain regimes. The renormalized perturbation theories, by contrast, predict the correct scaling exponents (after an elaborate analysis) for all phase regimes in the Simple Shear Model, but sometimes mistakenly suggest nonlocal evolution equations when the exact equations are in fact local [13]. (Other examples of this latter phenomenon in simple stochastic problems are presented in [328].) Both the renormalized perturbation (RPT) and RNG theories predict correct large-scale equations in one phase region abutting the Kolmogorov values ($(\epsilon, z) = (8/3, 2/3)$), but introduce discrepancies from the exact renormalized equation at this point and in the other neighboring regime. Note that the Eulerian and Lagrangian versions [177] of the renormalized perturbation theories are equivalent for the Simple Shear Model without a sweep, as discussed in [13], because the Eulerian and Lagrangian velocity correlations coincide. We see in this way how simple mathematical models can yield both quantitative and qualitative insight into the strengths and shortcomings of approximate closure strategies. Recently, van den Eijnden and the current authors have investigated how various closure theories, including a new "modified direct interaction approximation" [328,329], fare under the introduction of a temporally fluctuating cross sweep to a shear flow (Section 3). This is the simplest model problem with complex behavior where the Eulerian and Lagrangian correlations are unequal.
These results will be reported in a forthcoming paper [330]. Our hope, in the long run, is to learn the strengths and weaknesses of various existing closure theories by applying them to the
simplified mathematical models described in this report, and thereby be instructed in the formulation of new and improved closure approximations.
Acknowledgements

The first author would like to thank his long time collaborator in turbulent diffusion, Marco Avellaneda, for his explicit and implicit contributions to this review. The authors also thank David Horntrop and Richard McLaughlin for their help with the figures for Sections 2 and 6. AJM is partially supported by grants NSF-DMS-9625795, ARO-DAAG55-98-1-0129, and ONR-N00014-95-1-0345. PRK is an NSF postdoctoral fellow whose work toward this review was supported by a Fannie and John Hertz Foundation Graduate Fellowship.
References

[1] R.J. Adler, The Geometry of Random Fields, Section 8.3, Wiley Series in Probability and Mathematical Statistics, Wiley, Chichester, 1981, pp. 198–203.
[2] L.Ts. Adzhemyan, N.V. Antonov, A.N. Vasil'ev, Renormalization group, operator product expansion, and anomalous scaling in a model of advected passive scalar, chao-dyn/9801033, 1998.
[3] B.K. Alpert, Ph.D. Thesis, Department of Computer Science, Yale University, 1990.
[4] B.K. Alpert, Construction of simple multiscale bases for fast matrix operations, in: Ruskai et al. (Eds.), Wavelets and their Applications, Jones and Bartlett Publishers, Boston, MA, 1992, pp. 211–226.
[5] R.A. Antonia, E.J. Hopfinger, Y. Gagne, F. Anselmet, Temperature structure functions in turbulent shear flows, Phys. Rev. A 30 (5) (1984) 2704–2707.
[6] N.W. Ashcroft, N.D. Mermin, Solid State Physics, Ch. 10, W.B. Saunders Company, Philadelphia, 1976, pp. 176–190.
[7] P. Auscher, G. Weiss, M.V. Wickerhauser, Local sine and cosine bases of Coifman and Meyer and the construction of smooth wavelets, in: C.K. Chui (Ed.), Wavelets: a Tutorial in Theory and Applications, Wavelet Analysis and its Applications, vol. 2, Academic Press, New York, 1992, pp. 237–256.
[8] M. Avellaneda, F. Elliott Jr., C. Apelian, Trapping, percolation, and anomalous diffusion of particles in a two-dimensional random field, J. Statist. Phys. 72 (5–6) (1993) 1227–1304.
[9] M. Avellaneda, A.J. Majda, Stieltjes integral representation and effective diffusivity bounds for turbulent transport, Phys. Rev. Lett. 62 (7) (1989) 753–755.
[10] M. Avellaneda, A.J. Majda, Mathematical models with exact renormalization for turbulent transport, Comm. Math. Phys. 131 (1990) 381–429.
[11] M. Avellaneda, A.J. Majda, Homogenization and renormalization of multiple-scattering expansions for Green functions in turbulent transport, in: Composite Media and Homogenization Theory (Trieste, 1990), Progr. Nonlinear Differential Equations Appl., vol. 5, Birkhäuser, Boston, MA, 1991, pp. 13–35.
[12] M. Avellaneda, A.J. Majda, An integral representation and bounds on the effective diffusivity in passive advection by laminar and turbulent flows, Comm. Math. Phys. 138 (1991) 339–391.
[13] M. Avellaneda, A.J. Majda, Approximate and exact renormalization theories for a model for turbulent transport, Phys. Fluids A 4 (1) (1992) 41–56.
[14] M. Avellaneda, A.J. Majda, Mathematical models with exact renormalization for turbulent transport, II: fractal interfaces, non-Gaussian statistics and the sweeping effect, Comm. Math. Phys. 146 (1992) 139–204.
[15] M. Avellaneda, A.J. Majda, Renormalization theory for eddy diffusivity in turbulent transport, Phys. Rev. Lett. 68 (20) (1992) 3028–3031.
[16] M. Avellaneda, A.J. Majda, Superdiffusion in nearly stratified flows, J. Statist. Phys. 69 (3/4) (1992) 689–729.
[17] M. Avellaneda, A.J. Majda, Application of an approximate R-N-G theory, to a model for turbulent transport, with exact renormalization, in: Turbulence in Fluid Flows, IMA Vol. Math. Appl., vol. 55, Springer, Berlin, 1993, pp. 1–31.
[18] M. Avellaneda, A.J. Majda, Simple examples with features of renormalization for turbulent transport, Phil. Trans. R. Soc. Lond. A 346 (1994) 205–233.
[19] M. Avellaneda, S. Torquato, I.C. Kim, Diffusion and geometric effects in passive advection by random arrays of vortices, Phys. Fluids A 3 (8) (1991) 1880–1891.
[20] M. Avellaneda, M. Vergassola, Stieltjes integral representation of effective diffusivities in time-dependent flows, Phys. Rev. E 52 (3) (1995) 3249–3251.
[21] A. Babiano, C. Basdevant, P. Le Roy, R. Sadourny, Relative dispersion in two-dimensional turbulence, J. Fluid Mech. 214 (1990) 535–557.
[22] A. Babiano, C. Basdevant, R. Sadourny, Structure functions and dispersion laws in two-dimensional turbulence, J. Atmospheric Sci. 42 (9) (1985) 941–949.
[23] E. Balkovsky, V. Lebedev, Instanton for the Kraichnan passive scalar problem, chao-dyn/9803018, 12 March, 1998.
[24] G.I. Barenblatt, Dimensional Analysis, Gordon and Breach, New York, 1987.
[25] G.I. Barenblatt, Scaling, Self-similarity, and Intermediate Asymptotics, Cambridge Texts in Applied Mathematics, vol. 14, Cambridge University Press, Cambridge, UK, 1996.
[26] G.I. Barenblatt, A.J. Chorin, Scaling laws and zero viscosity limits for wall-bounded shear flows and for local structure in developed turbulence, Commun. Pure Appl. Math. 50 (4) (1997) 381–398.
[27] G.K. Batchelor, Diffusion in a field of homogeneous turbulence II. The relative motion of particles, Proc. Cambridge Phil. Soc. 48 (1952) 345–362.
[28] G.K. Batchelor, Small-scale variation of convected quantities like temperature in turbulent fluid. Part 1. General discussion and the case of small conductivity, J. Fluid Mech. 5 (1959) 113–133.
[29] G.K. Batchelor, I.D. Howells, A.A. Townsend, Small-scale variation of convected quantities like temperature in a turbulent fluid. Part 2. The case of large conductivity, J. Fluid Mech. 5 (1959) 134–139.
[30] C.M. Bender, S.A. Orszag, Advanced Mathematical Methods for Scientists and Engineers, International Series in Pure and Applied Mathematics, McGraw-Hill, New York, 1978.
[31] A.F. Bennett, A Lagrangian analysis of turbulent diffusion, Rev. Geophys. 25 (4) (1987) 799–822.
[32] A. Bensoussan, J.-L. Lions, G. Papanicolaou, Asymptotic Analysis for Periodic Structures, Studies in Mathematics and its Applications, vol. 5, North-Holland-Elsevier Science Publishers, Amsterdam, 1978.
[33] R. Benzi, L. Biferale, A. Wirth, Analytic calculation of anomalous scaling in random shell models for a passive scalar, Phys. Rev. Lett. 78 (26) (1997) 4926–4929.
[34] R. Benzi, S. Ciliberto, R. Tripiccione, C. Baudet, F. Massaioli, S. Succi, Extended self-similarity in turbulent flows, Phys. Rev. E 48 (1) (1993) R29–R32.
[35] D. Bernard, K. Gawędzki, A. Kupiainen, Anomalous scaling in the N-point functions of passive scalar, Phys. Rev. E 54 (3) (1996) 2564–2572.
[36] D. Bernard, K. Gawędzki, A. Kupiainen, Slow modes in passive advection, J. Statist. Phys. 90 (3–4) (1998) 519–569.
[37] G. Beylkin, R. Coifman, V. Rokhlin, Wavelets in numerical analysis, in: Ruskai et al. (Eds.), Wavelets and their Applications, Jones and Bartlett Publishers, Boston, MA, 1992, pp. 181–210.
[38] R.N. Bhattacharya, A central limit theorem for diffusions with periodic coefficients, Ann. Probab. 13 (2) (1985) 385–396.
[39] R.N. Bhattacharya, V.K. Gupta, H.F. Walker, Asymptotics of solute dispersion in periodic porous media, SIAM J. Appl. Math. 49 (1) (1989) 86–98.
[40] L. Biferale, A. Crisanti, M. Vergassola, A. Vulpiani, Eddy diffusivities in scalar transport, Phys. Fluids 7 (11) (1995) 2725–2734.
[41] P. Billingsley, Probability and Measure, 3rd ed., Wiley, New York, 1995.
[42] G. Boffetta, A. Celani, A. Crisanti, A. Vulpiani, Relative dispersion in fully developed turbulence: from Eulerian to Lagrangian statistics in synthetic flows, preprint, 1998.
[43] R. Borghi, Turbulent combustion modelling, Prog. Energy Combust. Sci. 14 (1988) 245–292.
[44] A.N. Borodin, A limit theorem for solutions of differential equations with random right-hand side, Theory Probab. Appl. 22(3) (1977) 482–497.
[45] J.-P. Bouchaud, A. Comtet, A. Georges, P. Le Doussal, Anomalous diffusion in random media of any dimensionality, J. Physique 48 (1987) 1445–1450.
[46] J.-P. Bouchaud, A. Georges, J. Koplik, A. Provata, S. Redner, Superdiffusion in random velocity fields, Phys. Rev. Lett. 64(21) (1990) 2503–2506.
[47] J.-P. Bouchaud, A. Georges, Anomalous diffusion in disordered media: statistical mechanisms, models and physical applications, Phys. Rep. 195(4-5) (1990) 127–293.
[48] R.C. Bourret, An hypothesis concerning turbulent diffusion, Can. J. Phys. 38 (1960) 665–676.
[49] R.C. Bourret, Stochastically perturbed fields, with applications to wave propagation in random media, Nuovo Cimento (10) 26 (1962) 1–31.
[50] R.N. Bracewell, The Fourier Transform and its Applications, 2nd ed., McGraw-Hill, New York, 1986.
[51] J.C. Bronski, R.M. McLaughlin, Scalar intermittency and the ground state of periodic Schrödinger equations, Phys. Fluids 9(1) (1997) 181–190.
[52] R. Camassa, S. Wiggins, Transport of a passive tracer in time-dependent Rayleigh–Bénard convection, Phys. D 51(1-3) (1991) 472–481; Nonlinear science: the next decade, Los Alamos, NM, 1990.
[53] R.A. Carmona, J.P. Fouque, Diffusion-approximation for the advection-diffusion of a passive scalar by a space-time Gaussian velocity field, in: E. Bolthausen, M. Dozzi, F. Russo (Eds.), Seminar on Stochastic Analysis, Random Fields and Applications, Progress in Probability, vol. 36, Centro Stefano Franscini, Birkhäuser, Basel, 1995, pp. 37–49.
[54] R.A. Carmona, S.A. Grishin, S.A. Molchanov, Massively parallel simulations of motions in a Gaussian velocity field, Stochastic Modelling in Physical Oceanography, Progr. Prob., vol. 39, Birkhäuser, Boston, 1996, pp. 47–68.
[55] B. Castaing, G. Gunaratne, F. Heslot, L. Kadanoff, A. Libchaber, S. Thomae, X.-Z. Wu, S. Zaleski, G. Zanetti, Scaling of hard thermal turbulence in Rayleigh-Bénard convection, J. Fluid Mech. 204 (1989) 1–30.
[56] J. Chasnov, V.M. Canuto, R.S. Rogallo, Turbulence spectrum of a passive temperature field: results of a numerical simulation, Phys. Fluids 31(8) (1988) 2065–2067.
[57] V.R. Chechetkin, V.S. Lutovinov, A.A. Samokhin, On the diffusion of passive impurities in random flows, Physica A 175 (1991) 87–113.
[58] H. Chen, S. Chen, R.H. Kraichnan, Probability distribution of a stochastically advected scalar field, Phys. Rev. Lett. 63(24) (1989) 2657–2660.
[59] S. Chen, G.D. Doolen, R.H. Kraichnan, Z.-S. She, On statistical correlations between velocity increments and locally averaged dissipation in homogeneous turbulence, Phys. Fluids A 5 (1993) 458–463.
[60] S. Chen, R.H. Kraichnan, Sweeping decorrelation in isotropic turbulence, Phys. Fluids A 1(12) (1989) 2019–2024.
[61] S. Chen, R.H. Kraichnan, Simulations of a randomly advected passive scalar field, Phys. Fluids (1998) in press.
[62] M. Chertkov, Instanton for random advection, Phys. Rev. E 55(3) (1997) 2722–2735.
[63] M. Chertkov, G. Falkovich, Anomalous scaling exponents of a white-advected passive scalar, Phys. Rev. Lett. 76(15) (1996) 2706–2709.
[64] M. Chertkov, G. Falkovich, I. Kolokolov, V. Lebedev, Normal and anomalous scaling of the fourth-order correlation function of a randomly advected passive scalar, Phys. Rev. E 52(5) (1995) 4924–4941.
[65] M. Chertkov, G. Falkovich, I. Kolokolov, V. Lebedev, Statistics of a passive scalar advected by a large-scale two-dimensional velocity field: analytic solution, Phys. Rev. E 51(6) (1995) 5609–5627.
[66] M. Chertkov, A. Gamba, I. Kolokolov, Exact field-theoretical description of passive scalar convection in an N-dimensional long-range velocity field, Phys. Lett. A 192 (1994) 435–443.
[67] S. Childress, Alpha-effect in flux ropes and sheets, Phys. Earth Planet. Int. 20 (1979) 172–180.
[68] S. Childress, A.M. Soward, Scalar transport and alpha-effect for a family of cat's-eye flows, J. Fluid Mech. 205 (1989) 99–133.
[69] E.S.C. Ching, V.S. L'vov, E. Podivilov, I. Procaccia, Conditional statistics in scalar turbulence: theory versus experiment, Phys. Rev. E 54(6) (1996) 6364–6371.
[70] E.S.C. Ching, V.S. L'vov, I. Procaccia, Fusion rules and conditional statistics in turbulent advection, Phys. Rev. E 54(5) (1996) R4520–R4523.
[71] E.S.C. Ching, Y. Tu, Passive scalar fluctuations with and without a mean gradient: a numerical study, Phys. Rev. E 49(2) (1994) 1278–1282.
[72] A.J. Chorin, Vorticity and turbulence, Applied Mathematical Sciences, vol. 103, Springer, New York, 1994.
[73] J.P. Clay, Turbulent mixing of temperature in water, air, and mercury, Ph.D. Thesis, University of California at San Diego, 1973.
[74] P. Constantin, I. Procaccia, The geometry of turbulent advection: Sharp estimates for the dimensions of level sets, Nonlinearity 7 (1994) 1045–1054.
[75] P. Constantin, I. Procaccia, K.R. Sreenivasan, Fractal geometry of isoscalar surfaces in turbulence: theory and experiments, Phys. Rev. Lett. 67(13) (1991) 1739–1742.
[76] S. Corrsin, On the spectrum of isotropic temperature fluctuations in isotropic turbulence, J. Appl. Phys. 22 (1951) 469.
[77] A. Crisanti, M. Falcioni, G. Paladin, A. Vulpiani, Anisotropic diffusion in fluids with steady periodic velocity fields, J. Phys. A 23(14) (1990) 3307–3315.
[78] G.T. Csanady, Turbulent Diffusion in the Environment, Geophysics and Astrophysics Monographs, vol. 3, D. Reidel, Dordrecht, 1973.
[79] G. Dagan, Theory of solute transport by groundwater, in: Annual Review of Fluid Mechanics, vol. 19, Annual Reviews, Palo Alto, CA, 1987, pp. 183–215.
[80] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conf. Series in Applied Mathematics, vol. 61, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.
[81] D.R. Dowling, P.E. Dimotakis, Similarity of the concentration field of gas-phase turbulent jets, J. Fluid Mech. 218 (1990) 109–141.
[82] F.W. Elliott Jr, D.J. Horntrop, A.J. Majda, A Fourier-wavelet Monte Carlo method for fractal random fields, J. Comput. Phys. 132(2) (1997) 384–408.
[83] F.W. Elliott Jr, D.J. Horntrop, A.J. Majda, Monte Carlo methods for turbulent tracers with long range and fractal random velocity fields, Chaos 7(1) (1997) 39–48.
[84] F.W. Elliott Jr, A.J. Majda, A wavelet Monte Carlo method for turbulent diffusion with many spatial scales, J. Comput. Phys. 113(1) (1994) 82–111.
[85] F.W. Elliott Jr, A.J. Majda, A new algorithm with plane waves and wavelets for random velocity fields with many spatial scales, J. Comput. Phys. 117 (1995) 146–162.
[86] F.W. Elliott Jr, A.J. Majda, Pair dispersion over an inertial range spanning many decades, Phys. Fluids 8(4) (1996) 1052–1060.
[87] F.W. Elliott Jr, A.J. Majda, D.J. Horntrop, R.M. McLaughlin, Hierarchical Monte Carlo methods for fractal random fields, J. Statist. Phys. 81 (1995) 717.
[88] P.F. Embid, A.J. Majda, P.E. Souganidis, Effective geometric front dynamics for premixed turbulent combustion with separated velocity scales, Comb. Sci. Technol. 103 (1994) 85–115.
[89] P.F. Embid, A.J. Majda, P.E. Souganidis, Comparison of turbulent flame speeds from complete averaging and the G-equation, Phys. Fluids 7(8) (1995) 2052–2060.
[90] P.F. Embid, A.J. Majda, P.E. Souganidis, Examples and counterexamples for Huygens Principle in premixed combustion, Comb. Sci. Technol. 120(1-6) (1996) 273–303.
[91] V. Eswaran, S.B. Pope, Direct numerical simulations of the turbulent mixing of a passive scalar, Phys. Fluids 31(3) (1988) 506–520.
[92] G. Eyink, J. Xin, Dissipation-independence of the inertial-convective range in a passive scalar model, Phys. Rev. Lett. 77(13) (1996) 2674–2677.
[93] G. Eyink, J. Xin, Existence and uniqueness of L²-solutions at zero-diffusivity in the Kraichnan model of a passive scalar, chao-dyn/9605008, 15 May 1996.
[94] A.L. Fairhall, B. Galanti, V.S. L'vov, I. Procaccia, Direct numerical simulations of the Kraichnan Model: scaling exponents and fusion rules, Phys. Rev. Lett. 79(21) (1997).
[95] A.L. Fairhall, O. Gat, V. L'vov, I. Procaccia, Anomalous scaling in a model of passive scalar advection: exact results, Phys. Rev. E 53(4A) (1996) 3518–3535.
[96] G. Falkovich, I. Kolokolov, V. Lebedev, A. Migdal, Instantons and intermittency, Phys. Rev. E 54(5) (1996) 4896–4907.
[97] A. Fannjiang, G. Papanicolaou, Convection enhanced diffusion for periodic flows, SIAM J. Appl. Math. 54(2) (1994) 333–408.
[98] A. Fannjiang, G. Papanicolaou, Diffusion in turbulence, Probab. Theory Related Fields 105(3) (1996) 279–334.
[99] A. Fannjiang, G. Papanicolaou, Convection-enhanced diffusion for random flows, J. Statist. Phys. 88(5-6) (1997) 1033–1076.
[100] J. Feder, in: Fractals, Chs. 9–14, Physics of Solids and Liquids, Plenum Press, New York, 1988, pp. 163–243.
[101] C. Fefferman, private communication.
[102] W. Feller, An Introduction to Probability Theory and its Applications, 3rd ed., vol. 1, Wiley, New York, 1968.
[103] W. Feller, An Introduction to Probability Theory and its Applications, 2nd ed., vol. 2, Section II.2, Wiley, New York, 1971, pp. 47, 48.
[105] G.B. Folland, Introduction to Partial Differential Equations, 2nd ed., Princeton University Press, Princeton, 1995.
[106] F.N. Frenkiel, P.S. Klebanoff, Two-dimensional probability distribution in a turbulent field, Phys. Fluids 8 (1965) 2291–2293.
[107] A. Friedman, Stochastic Differential Equations and Applications, vol. 1, Academic Press, New York, 1975.
[108] U. Frisch, A. Mazzino, M. Vergassola, Intermittency in passive scalar advection, Phys. Rev. Lett. 80(25) (1998) 5532–5535.
[109] J.C.H. Fung, J.C.R. Hunt, N.A. Malik, R.J. Perkins, Kinematic simulation of homogeneous turbulence by unsteady random Fourier modes, J. Fluid Mech. 236 (1992) 281–318.
[111] F. Gao, Mapping closure and non-Gaussianity of the scalar probability density functions in isotropic turbulence, Phys. Fluids A 3(10) (1991) 2438–2444.
[112] T.C. Gard, Introduction to Stochastic Differential Equations, Pure and Applied Mathematics, vol. 114, Marcel Dekker, New York, 1988.
[113] A.E. Gargett, Evolution of scalar spectra with the decay of turbulence in a stratified fluid, J. Fluid Mech. 159 (1985) 379–407.
[114] O. Gat, V.S. L'vov, E. Podivilov, I. Procaccia, Nonperturbative zero modes in the Kraichnan model for turbulent advection, Phys. Rev. E 55(4) (1997) R3836–R3839.
[115] O. Gat, R. Zeitak, Multiscaling in passive scalar advection as stochastic shape dynamics, Phys. Rev. E 57(5) (1998) 5511–5519.
[116] K. Gawędzki, A. Kupiainen, Anomalous scaling of the passive scalar, Phys. Rev. Lett. 75(21) (1995) 3834–3837.
[117] K. Gawędzki, A. Kupiainen, Universality in turbulence: an exactly solvable model, in: Low-dimensional Models in Statistical Physics and Quantum Field Theory (Schladming, 1995), Lecture Notes in Physics, vol. 469, Springer, Berlin, 1996, pp. 71–105.
[118] I.M. Gel'fand, N.Ya. Vilenkin, Generalized Functions, Applications of Harmonic Analysis, Ch. 4, Academic Press, New York, 1964.
[119] L.W. Gelhar, A.L. Gutjahr, R.L. Naff, Stochastic analysis of macrodispersion in a stratified aquifer, Water Resour. Res. 15(6) (1979) 1387–1397.
[120] C.H. Gibson, Fine structure of scalar fields mixed by turbulence. I, Zero-gradient points and minimal gradient surfaces, Phys. Fluids 11(11) (1968) 2305–2315.
[121] C.H. Gibson, Fine structure of scalar fields mixed by turbulence. II, Spectral Theory, Phys. Fluids 11(11) (1968) 2316–2327.
[122] C.H. Gibson, W.T. Ashurst, A.R. Kerstein, Mixing of strongly diffusive passive scalars like temperature by turbulence, J. Fluid Mech. 194 (1988) 261–293.
[123] C.H. Gibson, W.H. Schwarz, The universal equilibrium spectra of turbulent velocity and scalar fields, J. Fluid Mech. 16 (1963) 357–384.
[124] F.A. Gifford, Horizontal diffusion in the atmosphere: a Lagrangian-dynamical theory, Atmos. Environ. 16 (1982) 505–512.
[125] K. Golden, S. Goldstein, J.L. Lebowitz, Classical transport in modulated structures, Phys. Rev. Lett. 55(24) (1985) 2629–2632.
[126] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, Frontiers in Physics, vol. 85, Addison-Wesley, Reading, MA, USA, 1992.
[127] J.P. Gollub, J. Clarke, M. Gharib, B. Lane, O.N. Mesquita, Fluctuations and transport in a stirred fluid with a mean gradient, Phys. Rev. Lett. 67(25) (1991) 3507–3510.
[128] F.C. Gouldin, Interpretation of jet mixing using fractals, AIAA J. 26 (1988) 1405–1407.
[129] H.L. Grant, B.A. Hughes, W.M. Vogel, A. Moilliet, The spectrum of temperature fluctuations in turbulent flow, J. Fluid Mech. 34 (1968) 423–442.
[130] V.K. Gupta, R.N. Bhattacharya, Solute dispersion in multidimensional periodic saturated porous media, Water Resour. Res. 22(2) (1986) 156–164.
[131] O. Güven, F.J. Molz, Deterministic and stochastic analyses of dispersion in an unbounded stratified porous medium, Water Resour. Res. 22(11) (1986) 1565–1574.
[132] H.G.E. Hentschel, I. Procaccia, Relative diffusion in turbulent media: the fractal dimension of clouds, Phys. Rev. A 29(3) (1983) 1461–1470.
[133] J.R. Herring, R.M. Kerr, Comparison of direct numerical simulations with predictions of two-point closures for isotropic turbulence convecting a passive scalar, J. Fluid Mech. 118 (1982) 205–219.
[134] R. Hersh, Random evolutions: a survey of results and problems, Rocky Mountain J. Math. 4(3) (1974) 443–477.
[135] F. Heslot, B. Castaing, A. Libchaber, Transition to turbulence in helium gas, Phys. Rev. A 36 (1987) 5870–5873.
[136] R.J. Hill, Models of the scalar spectrum for turbulent advection, J. Fluid Mech. 88(3) (1978) 541–562.
[137] R.J. Hill, Solution of Howell's model of the scalar spectrum and comparison with experiment, J. Fluid Mech. 96(4) (1980) 705–722.
[138] M. Holzer, A. Pumir, Simple models of non-Gaussian statistics for a turbulently advected passive scalar, Phys. Rev. E 47(1) (1993) 202–219.
[139] M. Holzer, E.D. Siggia, Turbulent mixing of a passive scalar, Phys. Fluids 6(5) (1994) 1820–1837.
[140] D.J. Horntrop, Monte Carlo simulation for turbulent transport, Ph.D. Thesis, Princeton University, 1995. Program in Applied and Computational Mathematics.
[141] D.J. Horntrop, A.J. Majda, Subtle statistical behavior in simple models for random advection-diffusion, J. Math. Sci. Univ. Tokyo 1 (1994) 1–48.
[142] D.J. Horntrop, A.J. Majda, An overview of Monte Carlo simulation techniques for the generation of random fields, Proc. 9th Aha Huliko Hawaiian Winter Workshop, 1997, to appear.
[143] I.A. Ibragimov, Yu.V. Linnik, Independent and Stationary Sequences of Random Variables, Ch. 17, Wolters-Noordhoff Publishing, Groningen, The Netherlands, 1971.
[144] M.B. Isichenko, Ya.L. Kalda, E.B. Tatarinova, O.V. Tel'kovskaya, V.V. Yan'kov, Diffusion in a medium with vortex flow, Sov. Phys. JETP 69(3) (1989) 517–524.
[145] Jayesh, C. Tong, Z. Warhaft, On temperature spectra in grid turbulence, Phys. Fluids 6(1) (1994) 306–312.
[146] Jayesh, Z. Warhaft, Probability distribution of a passive scalar in grid-generated turbulence, Phys. Rev. Lett. 67(25) (1991) 3503–3506.
[147] Jayesh, Z. Warhaft, Probability distribution, conditional dissipation, and transport of passive temperature fluctuations in grid-generated turbulence, Phys. Fluids A 4(10) (1992) 2292–2307.
[148] V.V. Jikov, S.M. Kozlov, O.A. Oleinik, Homogenization of Differential Operators and Integral Functionals, Springer, Berlin, 1994.
[149] V.V. Jikov, S.M. Kozlov, O.A. Oleinik, Homogenization of Differential Operators and Integral Functionals, Ch. 2, Springer, Berlin, 1994, pp. 55–85.
[150] F. John, Partial Differential Equations, Applied Mathematical Sciences, 4th ed., Ch. 1, Springer, Berlin, 1982, pp. 1–32.
[151] S. Karlin, H.M. Taylor, A Second Course in Stochastic Processes, Section 16.1, Academic Press, Boston, 1981.
[152] A.P. Kazantsev, Enhancement of a magnetic field by a conducting fluid, Sov. Phys. JETP 26 (1968) 1031.
[153] R.M. Kerr, Higher-order derivative correlations and the alignment of small-scale structures in isotropic numerical turbulence, J. Fluid Mech. 153 (1985) 31–58.
[154] R.M. Kerr, Velocity, scalar and transfer spectra in numerical turbulence, J. Fluid Mech. 211 (1990) 309–322.
[155] R.M. Kerr, Rayleigh number scaling in numerical convection, J. Fluid Mech. 310 (1996) 139–179.
[156] A.R. Kerstein, Linear-eddy modelling of turbulent transport. Part 6. Microstructure of diffusive scalar mixing fields, J. Fluid Mech. 231 (1991) 361–394.
[157] A.R. Kerstein, P.A. McMurtry, Mean-field theories of random advection, Phys. Rev. E 49(1) (1994) 474–482.
[158] J. Kevorkian, J.D. Cole, Perturbation Methods in Applied Mathematics, Applied Mathematical Sciences, vol. 34, Ch. 4, Springer, Berlin, 1981, pp. 330–480.
[159] R.Z. Khas'minskii, A limit theorem for the solutions of differential equations with random right-hand sides, Theory Probab. Appl. 11(3) (1966) 390–406.
[160] R.Z. Khas'minskii, On stochastic processes defined by differential equations with a small parameter, Theory Probab. Appl. 11(2) (1966) 211–228.
[161] C.-B. Kim, J.A. Krommes, Improved rigorous upper bounds for transport due to passive advection described by simple models of bounded systems, J. Statist. Phys. 53(5-6) (1988) 1103–1137.
[162] Y. Kimura, R.H. Kraichnan, Statistics of an advected passive scalar, Phys. Fluids A 5(9) (1993) 2264–2277.
[163] P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, Applications of Mathematics: Stochastic Modelling and Applied Probability, vol. 23, Springer, Berlin, 1992.
[164] V.I. Klyatskin, W.A. Woyczynski, D. Gurarie, Short-time correlation approximations for diffusing tracers in random velocity fields: a functional approach, Stochastic Modelling in Physical Oceanography, Progr. Probab., vol. 39, Birkhäuser, Boston, 1996, pp. 221–269.
[165] E. Knobloch, W.J. Merryfield, Enhancement of diffusive transport in oscillatory flows, Astrophys. J. 401(1) (Part 1) (1992) 196–205.
[166] D.L. Koch, J.F. Brady, A non-local description of advection-diffusion with application to dispersion in porous media, J. Fluid Mech. 180 (1987) 387–403.
[167] D.L. Koch, J.F. Brady, Anomalous diffusion due to long-range velocity fluctuations in the absence of a mean flow, Phys. Fluids A 1(1) (1989) 47–51.
[168] D.L. Koch, R.G. Cox, H. Brenner, J.F. Brady, The effect of order on dispersion in porous media, J. Fluid Mech. 200 (1989) 173–188.
[169] A.N. Kolmogorov, The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers, Dokl. Akad. Nauk SSSR 30 (1941) 301–305.
[170] A.N. Kolmogorov, A refinement of previous hypotheses concerning the local structure of turbulence in a viscous incompressible fluid at high Reynolds number, J. Fluid Mech. 13 (1962) 82–85.
[171] S. Komori, T. Kanzaki, Y. Murakami, H. Ueda, Simultaneous measurements of instantaneous concentrations of two species being mixed in a turbulent flow by using a combined laser-induced fluorescence and laser-scattering technique, Phys. Fluids A 1(2) (1989) 349–352.
[172] T.W. Körner, Fourier Analysis, Appendix C, Cambridge University Press, Cambridge, U.K., 1988, pp. 565–574.
[173] R.H. Kraichnan, Irreversible statistical mechanics of incompressible hydromagnetic turbulence, Phys. Rev. 109 (1958) 1407–1422.
[174] R.H. Kraichnan, The structure of isotropic turbulence, J. Fluid Mech. 5 (1959) 497–543.
[175] R.H. Kraichnan, Dynamics of nonlinear stochastic systems, J. Math. Phys. 2 (1961) 124–148.
[176] R.H. Kraichnan, Kolmogorov's hypotheses and Eulerian turbulence theory, Phys. Fluids 7(11) (1964) 1723–1734.
[177] R.H. Kraichnan, Lagrangian-history closure approximation for turbulence, Phys. Fluids 8(4) (1965) 575–598.
[178] R.H. Kraichnan, Dispersion of particle pairs in homogeneous turbulence, Phys. Fluids 9 (1966) 1728–1752.
[179] R.H. Kraichnan, Small-scale structure of a scalar field convected by turbulence, Phys. Fluids 11(5) (1968) 945–953.
[180] R.H. Kraichnan, Diffusion by a random velocity field, Phys. Fluids 13(1) (1970) 22–31.
[181] R.H. Kraichnan, Turbulent diffusion: evaluation of primitive and renormalized perturbation series by Padé approximants and by expansion of Stieltjes transforms into contributions from continuous orthogonal functions, in: G.A. Baker, L. Gammel (Eds.), The Padé Approximant in Theoretical Physics, Academic Press, New York, 1970, pp. 129–170.
[182] R.H. Kraichnan, Eddy viscosity and diffusivity: exact formulas and approximations, Complex Systems 1 (1987) 805–820.
[183] R.H. Kraichnan, Anomalous scaling of a randomly advected passive scalar, Phys. Rev. Lett. 72(7) (1994) 1016–1019.
[184] R.H. Kraichnan, V. Yakhot, S. Chen, Scaling relations for a randomly advected passive scalar field, Phys. Rev. Lett. 75(2) (1995) 240–243.
[185] P.R. Kramer, Passive scalar scaling regimes in a rapidly decorrelating turbulent flow, Ph.D. Thesis, Princeton University, November 1997.
[186] P.R. Kramer, Two different rapid decorrelation in time limits for turbulent diffusion, J. Statist. Phys. (1998) To be submitted.
[187] J.A. Krommes, R.A. Smith, Rigorous upper bounds for transport due to passive advection by inhomogeneous turbulence, Ann. Phys. 177(2) (1987) 246–329.
[188] R. Kubo, Stochastic Liouville equations, J. Math. Phys. 4(2) (1963) 174–183.
[189] H. Kunita, Stochastic Flows and Stochastic Differential Equations, Cambridge Studies in Advanced Mathematics, vol. 24, Cambridge University Press, Cambridge, UK, 1990.
[190] O.A. Kurbanmuradov, K.K. Sabelfeld, Statistical modelling of turbulent motion of particles in random velocity fields, Sov. J. Numer. Anal. Math. Modelling 4(1) (1989) 53–68.
[191] B.R. Lane, O.N. Mesquita, S.R. Meyers, J.P. Gollub, Probability distributions and thermal transport in a turbulent grid flow, Phys. Fluids A 5(9) (1993) 2255–2263.
[192] M. Larchevêque, M. Lesieur, The application of eddy-damped Markovian closures to the problem of dispersion of particle pairs, J. Mécanique 20 (1981) 113–134.
[193] J.C. LaRue, P.A. Libby, Thermal mixing layer downstream of half-heated turbulence grid, Phys. Fluids 24(4) (1981) 597–603.
[194] P. Le Doussal, Diffusion in layered random flows, polymers, electrons in random potentials, and spin depolarization in random fields, J. Statist. Phys. 69(5/6) (1992) 917–954.
[195] N.N. Lebedev, Special Functions and their Applications, Ch. 1, Dover, New York, 1972, pp. 1–15.
[196] M. Lesieur, Turbulence in Fluids, 2nd revised ed., vol. 1, in: Fluid Mechanics and its Applications, Kluwer, Dordrecht, 1990.
[197] D.C. Leslie, Developments in the Theory of Turbulence, Chs. 8–10, 12, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1983, pp. 156–226, 267–284.
[198] T.C. Lipscombe, A.L. Frenkel, D. ter Haar, On the convection of a passive scalar by a turbulent Gaussian velocity field, J. Statist. Phys. 63(1-2) (1991) 305–313.
[199] S. Lovejoy, Area-perimeter relation for rain and cloud areas, Science 216(9) (1982) 185–187.
[200] R. Lundgren, Turbulent pairs dispersion and scalar diffusion, J. Fluid Mech. 111 (1981) 25–57.
[201] V. L'vov, E. Podivilov, I. Procaccia, Scaling behavior in turbulence is doubly anomalous, Phys. Rev. Lett. 76(21) (1996) 3963–3966.
[202] V. L'vov, I. Procaccia, Fusion rules in turbulent systems with flux equilibria, Phys. Rev. Lett. 76(16) (1996) 2898–2901.
[203] B.-K. Ma, Z. Warhaft, Some aspects of the thermal mixing layer in grid turbulence, Phys. Fluids 29(10) (1986) 3114–3120.
[204] S.-K. Ma, Modern Theory of Critical Phenomena, Frontiers in Physics, vol. 46, Addison-Wesley, Reading, MA, USA, 1976.
[205] A.J. Majda, Lectures on turbulent diffusion, Lecture Notes at Princeton University, 1990.
[206] A.J. Majda, Explicit inertial range renormalization theory in a model for turbulent diffusion, J. Statist. Phys. 73 (1993) 515–542.
[207] A.J. Majda, The random uniform shear layer: an explicit example of turbulent diffusion with broad tail probability distributions, Phys. Fluids A 5(8) (1993) 1963–1970.
[208] A.J. Majda, Random shearing direction models for isotropic turbulent diffusion, J. Statist. Phys. 25(5/6) (1994) 1153–1165.
[209] A.J. Majda, Lectures on turbulent diffusion, Lecture Notes at Courant Institute of Mathematical Sciences, 1996.
[210] A.J. Majda, R.M. McLaughlin, The effect of mean flows on enhanced diffusivity in transport by incompressible periodic velocity fields, Stud. Appl. Math. 89(3) (1993) 245–279.
[211] A.J. Majda, P.E. Souganidis, Large-scale front dynamics for turbulent reaction-diffusion equations with separated velocity scales, Nonlinearity 7(1) (1994) 1–30.
[212] A.J. Majda, P.E. Souganidis, Bounds on enhanced turbulent flame speeds for combustion with fractal velocity fields, J. Statist. Phys. 83(5-6) (1996) 933–954.
[213] A.J. Majda, P.E. Souganidis, Flame fronts in a turbulent combustion model with fractal velocity fields, C.P.A.M. Fritz John Vol. (1998) To appear.
[214] B.B. Mandelbrot, On the geometry of homogeneous turbulence, with stress on the fractal dimension of the iso-surfaces of scalars, J. Fluid Mech. 72 (1975) 401–416.
[215] B.B. Mandelbrot, The Fractal Geometry of Nature, W.H. Freeman, San Francisco, New York, updated and augmented edition, 1983.
[216] B.B. Mandelbrot, Self-affine fractals and fractal dimension, Phys. Scripta 32 (1985) 257–260.
[217] B.B. Mandelbrot, Self-affine fractal sets, I: the basic fractal dimensions, in: L. Pietronero, E. Tosatti (Eds.), Fractals in Physics, ICTP, North Holland-Elsevier Science Publishers, Amsterdam, New York, 1986, pp. 3–15.
[218] B.B. Mandelbrot, Self-affine fractal sets, III: Hausdorff dimension anomalies and their implications, in: L. Pietronero, E. Tosatti (Eds.), Fractals in Physics, ICTP, North Holland-Elsevier Science Publishers, Amsterdam, New York, 1986, pp. 21–28.
[219] B.B. Mandelbrot, J.W. Van Ness, Fractional Brownian motions, fractional noises and applications, SIAM Rev. 10(4) (1968) 422–437.
[220] B.B. Mandelbrot, J.R. Wallis, Computer experiments with fractional Gaussian noises. Part 1, Averages and variances, Water Resour. Res. 5(1) (1969) 228–241.
[221] B.B. Mandelbrot, J.R. Wallis, Computer experiments with fractional Gaussian noises. Part 2, Rescaled ranges and spectra, Water Resour. Res. 5(1) (1969) 242–259.
[222] B.B. Mandelbrot, J.R. Wallis, Computer experiments with fractional Gaussian noises. Part 3, Mathematical appendix, Water Resour. Res. 5(1) (1969) 260–267.
[223] G. Matheron, G. de Marsily, Is transport in porous media always diffusive? A counterexample, Water Resour. Res. 16(5) (1980) 901–917.
[224] R. Mauri, Dispersion, convection, and reaction in porous media, Phys. Fluids A 3(5, part 1) (1991) 743–756.
[225] R.M. Mazo, C. Van den Broeck, The asymptotic dispersion of particles in N-layer systems: periodic boundary conditions, J. Chem. Phys. 86(1) (1987) 454–459.
[226] P. McCarty, W. Horsthemke, Effective diffusion coefficient for steady two-dimensional convective flow, Phys. Rev. A 37(6) (1988) 2112–2117.
[227] W.D. McComb, The Physics of Fluid Turbulence, Oxford Engineering Science Series, vol. 25, Clarendon Press, New York, 1991.
[228] A. McCoy, Ph.D. Thesis, Department of Mathematics, University of California at Berkeley, 1975.
[229] D.W. McLaughlin, G.C. Papanicolaou, O.R. Pironneau, Convection of microstructure and related problems, SIAM J. Appl. Math. 45(5) (1985) 780–797.
[230] R.M. McLaughlin, Turbulent transport, Ph.D. Thesis, Princeton University, November 1994, Program in Applied and Computational Mathematics.
[231] R.M. McLaughlin, Numerical averaging and fast homogenization, J. Statist. Phys. 90(3-4) (1998) 597–626.
[232] R.M. McLaughlin, M.G. Forest, An anelastic, scale-separated model for mixing, with application to atmospheric transport phenomena, Phys. Fluids (1998) Submitted.
[233] R.M. McLaughlin, A.J. Majda, An explicit example with non-Gaussian probability distribution for nontrivial scalar mean and fluctuation, Phys. Fluids 8(2) (1996) 536.
[234] P.A. McMurtry, T.C. Gansauge, A.R. Kerstein, S.K. Krueger, Linear eddy simulations of mixing in a homogeneous turbulent flow, Phys. Fluids A 5(4) (1993) 1023–1034.
[235] C. Meneveau, K.R. Sreenivasan, Interface dimension in intermittent turbulence, Phys. Rev. A 41(4) (1990) 2246–2248.
[236] P. Mestayer, Local isotropy and anisotropy in a high-Reynolds-number turbulent boundary layer, J. Fluid Mech. 125 (1982) 475–503.
[237] O. Métais, M. Lesieur, Large eddy simulations of isotropic and stably-stratified turbulence, in: H.H. Fernholz, H.E. Fiedler (Eds.), Advances in Turbulence 2, Springer, Berlin, 1989, pp. 371–376.
[238] O. Métais, M. Lesieur, Spectral large-eddy simulation of isotropic and stably stratified turbulence, J. Fluid Mech. 239 (1992) 157–194.
[239] I. Mezić, J.F. Brady, S. Wiggins, Maximal effective diffusivity for time-periodic incompressible fluid flows, SIAM J. Appl. Math. 56(1) (1996) 40–56.
[240] G.A. Mikhailov, Optimization of Weighted Monte Carlo Methods, Ch. 6, Springer Series in Computational Physics, Springer, Berlin, 1992, pp. 152–156, 161–164.
[241] T. Mikkelsen, S.E. Larsen, H.L. Pécseli, Diffusion of Gaussian Puffs, Q. J. Roy. Meteorol. Soc. 113 (1987) 81–105.
[242] P.L. Miller, P.E. Dimotakis, Measurements of scalar power spectra in high Schmidt number turbulent jets, J. Fluid Mech. 308 (1996) 129–146.
[243] H.K. Moffatt, Transport effects associated with turbulence with particular attention to the influence of helicity, Rep. Prog. Phys. 46 (1983) 621–664.
[244] S.A. Molchanov, Ideas in the theory of random media, Acta Applicandae Math. 22 (1991) 139–282.
[245] S.A. Molchanov, L.I. Piterbarg, Averaging in turbulent diffusion problems, Probability Theory and Random Processes, Kijev, Naukova Dumka, 1987, pp. 35–47 (in Russian).
[246] S.A. Molchanov, A.A. Ruzmaikin, D.D. Sokoloff, Dynamo equations in a random short-term correlated velocity field, Magnitnaja Gidrodinamika 4 (1983) 67–73 (in Russian).
[247] A.S. Monin, A.M. Yaglom, Statistical Fluid Mechanics: Mechanics of Turbulence, vol. 1, MIT Press, Cambridge, MA, 1975.
[248] A.S. Monin, A.M. Yaglom, Statistical Fluid Mechanics: Mechanics of Turbulence, vol. 2, MIT Press, Cambridge, MA, 1975.
[249] F. Morgan, Geometric Measure Theory: A Beginner's Guide, Section 2.3, Academic Press, New York, 1988, pp. 8–10.
[250] A.H. Nayfeh, Introduction to Perturbation Techniques, Section 3.4, Wiley Classics Library, Wiley, New York, 1981, p. 86.
[251] J.O. Nye, R.S. Brodkey, The scalar spectrum in the viscous-convective subrange, J. Fluid Mech. 29 (1967) 151–163.
[252] A.M. Obukhov, Spectral energy distribution in a turbulent flow, Dokl. Akad. Nauk SSSR 32(1) (1941) 22–24.
[253] A.M. Obukhov, Spectral energy distribution in a turbulent flow, Izv. Akad. Nauk. SSSR Ser. Geogr. Geophys. 5(4-5) (1941) 453–466.
[254] A.M. Obukhov, The structure of the temperature field in a turbulent flow, Izv. Akad. Nauk. SSSR Ser. Geogr. Geophys. 13 (1949) 58.
[255] A.M. Obukhov, Some specific features of atmospheric turbulence, J. Fluid Mech. 13 (1962) 77–81.
[256] K. Oelschläger, Homogenization of a diffusion process in a divergence-free random field, Ann. Probab. 16(3) (1988) 1084–1126.
[257] B. Øksendal, Stochastic Differential Equations, 5th ed., Universitext, Springer, Berlin, 1998. An introduction with applications.
[258] A. Okubo, Oceanic diffusion diagrams, Deep-Sea Res. 18 (1971) 789–802.
[259] S. Orey, Gaussian sample functions and the Hausdorff dimension of level crossings, Z. Wahrscheinlichkeitstheorie Verw. Geb. 15 (1970) 249–256.
[260] M.V. Osipenko, O.P. Pogutse, N.V. Chudin, Plasma diffusion in an array of vortices, Sov. J. Plasma Phys. 13(8) (1987) 550–554.
[261] R. Ozmidov, On the rate of dissipation of turbulent energy in sea currents and in the dimensionless constant in the '4/3 power law', Izv. Akad. Nauk. SSSR Ser. Geofiz. (1960) 821–823.
[262] G.C. Papanicolaou, W. Kohler, Asymptotic theory of mixing stochastic ordinary differential equations, Commun. Pure Appl. Math. 27 (1974) 641–668.
[263] G.C. Papanicolaou, O.R. Pironneau, The asymptotic behavior of motions in random flows, in: L. Arnold, R. Lefever (Eds.), Stochastic Nonlinear Systems in Physics, Chemistry and Biology, Springer Series in Synergetics, vol. 8, Springer, Berlin, 1981, pp. 36–41.
[264] G.C. Papanicolaou, S.R.S. Varadhan, Boundary value problems with rapidly oscillating random coefficients, in: J. Fritz, J.L. Lebowitz, D. Szasz (Eds.), Random Fields: Rigorous Results in Statistical Mechanics and Quantum Field Theory, Colloquia Mathematica Societatis János Bolyai, vol. 2, North Holland-Elsevier Science Publishers, Amsterdam, New York, Oxford, 1979, pp. 835–873.
[265] F. Pasquill, F.B. Smith, Atmospheric Diffusion, 3rd ed., Ellis Horwood Series in Environmental Science, Ellis Horwood, Chichester, 1983.
[267] A.M. Polyakov, Turbulence without pressure, Phys. Rev. E 52(6) (1995) 6183–6188.
[268] S.B. Pope, The probability approach to the modelling of turbulent reacting flows, Combust. Flame 27 (1976) 299–312.
[269] S.B. Pope, Lagrangian PDF methods for turbulent flows, Annual Review of Fluid Mechanics, vol. 26, Annual Reviews, Palo Alto, CA, 1994, pp. 23–63.
[270] S.C. Port, C.J. Stone, Random measures and their application to motion in an incompressible fluid, J. Appl. Probab. 13 (1976) 498–506.
[271] R.R. Prasad, K.R. Sreenivasan, The measurement and interpretation of fractal dimensions of the scalar interface in turbulent flows, Phys. Fluids A 2(5) (1990) 792–807.
[272] R.R. Prasad, K.R. Sreenivasan, Quantitative three-dimensional imaging and the structure of passive scalar fields in fully turbulent flows, J. Fluid Mech. 216 (1990) 1–34.
[273] A.A. Praskovsky, E.B. Gledzer, M.Yu. Karyakin, Ye. Zhou, The sweeping decorrelation hypothesis and energy-inertial scale interaction in high Reynolds number flows, J. Fluid Mech. 248 (1993) 493–511.
[274] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in FORTRAN, 2nd ed., Ch. 7, Cambridge University Press, Cambridge, 1992, pp. 266–319.
[275] A. Pumir, Anomalous scaling behaviour of a passive scalar in the presence of a mean gradient, Europhys. Lett. 34(1) (1996) 25–29.
[276] A. Pumir, Determination of the three-point correlation function of a passive scalar in the presence of a mean gradient, Europhys. Lett. 37(8) (1997) 529–534.
[277] A. Pumir, B. Shraiman, E.D. Siggia, Exponential tails and random advection, Phys. Rev. Lett. 66(23) (1991) 2984–2987.
[278] A. Pumir, B.I. Shraiman, E.D. Siggia, Perturbation theory for the δ-correlated model of passive scalar advection near the Batchelor limit, Phys. Rev. E 55(2) (1997) R1263–R1266.
[279] S. Redner, Superdiffusive transport due to random velocity fields, Physica D 38 (1989) 287–290.
[280] S. Redner, Superdiffusion in random velocity fields, Physica A 168 (1990) 551–560.
[281] M. Reed, B. Simon, Methods of Modern Mathematical Physics. I, 2nd ed., Ch. 6, Academic Press [Harcourt Brace Jovanovich Publishers], New York, Functional Analysis, 1980, pp. 182–220.
[282] S.G. Resnick, Dynamical problems in non-linear advective partial differential equations, Ph.D. Thesis, University of Chicago, August 1995.
[283] L.F. Richardson, Some measurements of atmospheric turbulence, Phil. Trans. Roy. Soc. Lond. A 221 (1920) 1–28.
[284] L.F. Richardson, Atmospheric diffusion shown on a distance-neighbor graph, Proc. Roy. Soc. Lond. A 110 (1926) 709–737.
[285] P.H. Roberts, Analytical theory of turbulent diffusion, J. Fluid Mech. 11 (1962) 257–283.
[286] H.A. Rose, Eddy diffusivity, eddy noise, and subgrid-scale modelling, J. Fluid Mech. 81(4) (1977) 719–734.
[287] M.N. Rosenbluth, H.L. Berk, I. Doxas, W. Horton, Effective diffusion in laminar convective flows, Phys. Fluids 30(9) (1987) 2636–2647.
[288] H.L. Royden, Real Analysis, 3rd ed., MacMillan, New York, 1988.
[290] J.H. Rust, A. Sesonske, Turbulent temperature fluctuations in mercury and ethylene glycol in pipe flow, Int. J. Heat Mass Transfer 9 (1966) 215–227.
[291] K.K. Sabelfeld, Monte Carlo Methods in Boundary Value Problems, Ch. 1, 5, Springer Series in Computational Physics, Springer, Berlin, 1991, pp. 31–47, 228–238.
[292] A.I. Saichev, W.A. Woyczynski, Probability distributions of passive tracers in randomly moving media, in: S.A. Molchanov (Ed.), Stochastic Models in Geosystems, IMA Volumes in Mathematics and its Applications, Springer, Berlin, 1996.
[293] M. Sano, X.Z. Wu, A. Libchaber, Turbulence in helium-gas free convection, Phys. Rev. A 40(11) (1989) 6421–6430.
[294] B.I. Shraiman, Diffusive transport in a Rayleigh–Bénard convection cell, Phys. Rev. A 36(1) (1987) 261–267.
[295] B.I. Shraiman, E.D. Siggia, Lagrangian path integrals and fluctuations in random flow, Phys. Rev. E 49(4) (1994) 2912–2927.
[296] B.
Simon, Functional Integration and Quantum Physics Section 4, Academic Press, New York, 1979, p. 38. [297] Ya.G. Sinai, Introduction to Ergodic Theory, Princeton University Press, Princeton, 1976.
572
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
[298] Ya.G. Sinai, V. Yakhot, Limiting probability distributions of a passive scalar in a random velocity "eld, Phys. Rev. Lett. 63(18) (1989) 1962}1964. [299] F.B. Smith, Conditional particle motions in a homogenous turbulent "eld, Atmos. Environ. 2 (1968) 491}508. [300] L.M. Smith, S.L. Woodru!, Renormalization-group analysis of turbulence, in: Annual Review of Fluid Mechanics, Annu. Rev. Fluid Mech., vol. 30, Annual Reviews, Palo Alto, CA, 1998, pp. 275}310. [301] T.H. Solomon, J.P. Gollub, Chaotic particle transport in time-dependent Rayleigh}BeH nard convection, Phys. Rev. A 38 (12) (1988) 6280}6286. [302] T.H. Solomon, J.P. Gollub, Passive transport in steady Rayleigh}BeH nard convection, Phys. Fluids 31(6) (1988) 1372}1379. [303] A.M. Soward, Fast dynamo action in a steady #ow, J. Fluid Mech. 180 (1987) 267}295. [304] A.M. Soward, S. Childress, Large magnetic Reynolds number dynamo action in a spatially periodic #ow with mean motion, Philos. Trans. Roy. Soc. Lond. A 331 (1990) 649}733. [305] K.R. Sreenivasan, Fractals and multifractals in #uid turbulence, in: Annual Review of Fluid Mechanics, vol. 23, Annual Reviews, Palo Alto, CA, 1991, pp. 539}600. [306] K.R. Sreenivasan, On local isotropy of passive scalars in turbulent shear #ows, Proc. Roy. Soc. Lond. A 434 (1991) 165}182. [307] K.R. Sreenivasan, On the Universality of the Kolmogorov constant, Phys. Fluids 7(11) (1995) 2778}2784. [308] K.R. Sreenivasan, The passive scalar spectrum and the Obukhov-Corrsin constant, Phys. Fluids 8(1) (1996) 189}196. [309] K.R. Sreenivasan, R.A. Antonia, The phenomenology of small-scale turbulence, in: Annual Review of Fluid Mechanics, vol. 29, Annu. Rev. Fluid Mech., Annual Reviews, Palo Alto, CA, 1997, 435}472. [310] K.R. Sreenivasan, R. Ramshankar, C. Meneveau, Mixing, entrainment and fractal dimension of surfaces in turbulent #ows, Proc. Roy. Soc. Lond. A 421 (1989) 79}108. [311] K.R. Sreenivasan, S. Tavoularis, R. Henry, S. 
Corrsin, Temperature #uctuations and scales in grid-generated turbulence, J. Fluid Mech. 100 (1980) 597}621. [312] E.M. Stein, Harmonic Analysis } Real-Variable Methods, Orthogonality, and Oscillatory Integrals, of Princeton Mathematical Series, vol. 43, Section 8.1.3, Princeton University Press, Princeton, 1993, p. 334. [313] R.L. Stratonovich, Topics in the Theory of Random Noise. vol. I: General Theory of Random Processes. Nonlinear Transformations of Signals and Noise, Sections 4.7}9, Gordon and Breach Science Publishers, New York, 1963, pp. 83}103. Revised English edition. Translated from the Russian by Richard A. Silverman. [314] E.B. Tatarinova, P.A. Kalugin, A.V. Sokol, What is the propagation rate of the passive component in turbulent #ows limited by?, Europhys. Lett. 14(8) (1991) 773}777. [315] V. Tatarski, Radiophysical methods of investigating atmospheric turbulence, Izv. Vyssh. Ucheb. Zaved. 3 Radio"zika 4 (1960) 551}583. [316] S. Tavoularis, S. Corrsin, Experiments in nearly homogenous turbulent shear #ow with a uniform mean temperature gradient. Part 1, J. Fluid Mech. 104 (1981) 311}347. [317] G.I. Taylor, Di!usion by continuous movements, Proc. Lond. Math. Soc. Ser. 2 (20) (1921) 196}212. [318] G.I. Taylor, Dispersion of soluble matter in solvent #owing slowly through a tube, Proc. Roy. Soc. Lond. A 219 (1953) 186}203. [319] H. Tennekes, Eulerian and Lagrangian time microscales in isotropic turbulence, J. Fluid Mech. 67(3) (1975) 561}567. [320] H. Tennekes, J.L. Lumley, A First Course in Turbulence, MIT Press, Cambridge, MA, 1972. [321] D.J. Thomson, Criteria for the selection of stochastic models of particle trajectories in turbulent #ows, J. Fluid Mech. 180 (1987) 529}556. [322] D.J. Thomson, A stochastic model for the motion of particle pairs in isotropic high-Reynolds-number turbulence, and its application to the problem of concentration variance, J. Fluid Mech. 216 (1990) 113}153. [323] E.C. 
Titchmarsh, Eigenfunction Expansions Associated with Second-Order Di!erential Equations Part 1, Ch. V, Clarendon Press, Oxford, 1962, pp. 107}128. [324] A.A. Townsend, The measurement of double and triple correlation derivatives in isotropic turbulence, Proc. Cambridge Phil. Soc. 43 (1947) 560.
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
573
[325] D.J. Tritton, Physical Fluid Dynamics, 2nd ed., Ch. 14.4, Clarendon, Press, Oxford, 1988, pp. 168}171. [326] C.W. van Atta, W.Y. Chen, Correlation measurements in grid turbulence using digital harmonic analysis, J. Fluid Mech. 34 (1968) 497}515. [327] H. van Dop, F.T.M. Nieuwstadt, J.C.R. Hunt, Random walk models for particle displacements in inhomogenous unsteady turbulent #ows, Phys. Fluids 28(6) (1985) 1639}1653. [328] E. Vanden Eijnden, Contribution to the statistical theory of turbulence: application to anomalous transport in plasmas, Ph.D. Thesis, UniversiteH Libre de Bruxelles, July 1997, FaculteH des Sciences, Physique Statistique. [329] E. Vanden Eijnden, An approximation for linear random di!erential equations, Phys. Rev. E 58 (1998) R5229}5232. [330] E. Vanden Eijnden, A.J. Majda, P.R. Kramer, Testing approximate closures for turbulent di!usion on some model #ows, In preparation, J. Statist. Phys. (1998) to be submitted. [331] S.R.S. Varadhan, Large Deviations and Applications, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 46, SIAM Publ., Philadelphia, 1984. [332] J.C. Vassilicos, On the geometry of lines in two-dimensional turbulence, in: Fernholz, Fiedler (Eds.), Advances in Turbulence 2, Springer, Berlin, 1989, pp. 404}411. [333] M. Vergassola, Anomalous scaling for passively advected magnetic "elds, Phys. Rev. E 53(4) (1996) R3021}R3024. [334] M. Vergassola, A. Mazzino, Structures and intermittency in a passive scalar model, Phys. Rev. Lett. 79 (10) (1997) 1849}1852. [335] J.A. Viecelli, E.H. Can"eld Jr., Functional representation of power-law random "elds and time series, J. Comput. Phys. 95 (1991) 29}39. [336] R.F. Voss, Random fractal forgeries, in: R.A. Earnshaw (Ed.), Fundamental Algorithms for Computer Graphics, NATO ASI Series F: Computer and System Sciences, vol 17, NATO Science A!airs Divison, Springer, Berlin, 1985, pp. 805}835. [337] J.C. Wheeler, R.G. Gordon, Bounds for averages using moment constraints, in: G.A. 
Baker, Gammel (Ed.), The PadeH Approximant in Theoretical Physics, Academic Press, New York, 1970, pp. 99}128. [338] B.S. Williams, D. Marteau, J.P. Gollub, Mixing of a passive scalar in magnetically forced two-dimensional turbulence, Phys. Fluids (1996), Submitted. [339] F.A. Williams, Combustion Theory: The Fundamental Theory of Chemically Reacting Flow Systems, Chs. 3, 7, Addison-Wesley Series in Engineering Science, Addison-Wesley, Reading, MA, USA, 1965. [340] A. Wirth, L. Biferale, Anomalous scaling in random shell models for passive scalars, Phys. Rev. E 54 (5) (1996) 4982}4989. [341] A.M. Yaglom, Correlation Theory of Stationary and Related Random Functions. Vol. I: Basic Results, Springer, Berlin, 1987. [342] A.M. Yaglom, Correlation Theory of Stationary and Related Random Functions. Vol. II: Supplementary Notes and References, Springer, Berlin, 1987. [343] V. Yakhot, Passive scalar advected by a rapidly changing random velocity "eld: probability density of scalar di!erences, Phys. Rev. E 55(1) (1997) 329}336. [344] V. Yakhot, S.A. Orszag, Renormalization group analysis of turbulence. I. Basic theory, J. Sci. Comput. 1(1) (1986) 3}51. [345] V. Yakhot, S.A. Orszag, Z.-S. She, Space}time correlations in turbulence: kinematic versus dynamical e!ects, Phys. Fluids A 1(2) (1989) 184}186. [346] W. Young, A. Pumir, Y. Pomeau, Anomalous di!usion of tracer in convection rolls, Phys. Fluids A 1(3) (1989) 462}469. [347] W.R. Young, P.B. Rhines, C.J.R. Garrett, Shear-#ow dispersion, internal waves and horizontal mixing in the ocean, J. Phys. Oceanogr. 12 (1982) 515}527. [348] Ya.B. Zel'dovich, Exact solution of the problem of di!usion in a periodic velocity "eld, and turbulent di!usion, Sov. Phys. Dokl. 27(10) (1982) 797}799. [349] C.L. Zirbel, E. C7 inlar, Mass transport by Brownian #ows, in: S.A. Molchanov (Ed.), Stochastic Models in Geosystems, IMA Volumes in Mathematics and its Applications, Springer, Berlin, 1996. [350] C.L. 
Zirbel, Stochastic #ows: dispersion of a mass distribution and Lagrangian observations of a random "eld, Ph.D. Thesis, Princeton University, 1993, Program in Applied and Computational Mathematics.
574
A.J. Majda, P.R. Kramer / Physics Reports 314 (1999) 237}574
[351] N. Zouari, A. Babiano, Derivation of the relative dispersion law in the inverse energy cascade of two-dimensional turbulence, Physica D 76 (1994) 318}328. [352] G. Zumofen, A. Blumen, J. Klafter, M.F. Shlesinger, LeH vy walks for turbulence: a numerical study, J. Statist. Phys. 54(5/6) (1989) 1519}1528. [353] G. Zumofen, J. Klafter, A. Blumen, Enhanced di!usion in random velocity "elds, Phys. Rev. A 42(8) (1990) 4601}4608. [354] G. Zumofen, J. Klafter, A. Blumen, Trapping aspects in enhanced di!usion, J. Statist. Phys. 65(5/6) (1991) 991}1013.
Physics Reports 314 (1999) 671
Erratum
Stopping of heavy ions in plasmas at strong coupling (Physics Reports 309 (1999) 117-208)
Günter Zwicknagel, Christian Toepffer, Paul-Gerhard Reinhard
Laboratoire de Physique des Gaz et des Plasmas, Bâtiment 212, Université Paris XI, F-91405 Orsay, France
Institut für Theoretische Physik, Universität Erlangen, D-91058 Erlangen, Germany
On page 172, a serious printing error occurs in Eq. (130). The correct equation should read as follows: (3ZC < (k)"!4nk ¹j " k#1/j
PII of the original article: S0370-1573(98)00056-8
0370-1573/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(99)00033-2
(130)
Physics Reports 314 (1999) 575-667

Gamma-ray bursts and the fireball model

Tsvi Piran
Racah Institute for Physics, The Hebrew University, Jerusalem, 91904, Israel (permanent address)
Physics Department, Columbia University, New York, NY 10027, USA

Received October 1998; editor: M.P. Kamionkowski

Supported by the US-Israel BSF grant 95-328 and by NASA grants NAG5-3516 and NAG5-3091.

0370-1573/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00127-6

Contents
1. Introduction
2. Observations
  2.1. Duration
  2.2. Temporal structure and variability
  2.3. Spectrum
  2.4. Spectral evolution
  2.5. Spectral lines
  2.6. Angular positions
  2.7. Angular distribution
  2.8. Quiescent counterparts and the historical "no host" problem
  2.9. Afterglow
  2.10. Repetition?
  2.11. Correlations with Abell clusters, quasars and supernovae
  2.12. $V/V_{\max}$, count and peak flux distributions
3. The distance scale
  3.1. Redshift measurements
  3.2. The angular distribution
  3.3. Interpretation of the peak flux distribution
  3.4. Time dilation
4. The compactness problem and relativistic motion
  4.1. Relativistic motion
  4.2. Relativistic beaming?
5. An overview of the generic model
  5.1. Models for the energy flow
  5.2. Models for the energy conversion
  5.3. Typical radii
6. Fireballs
  6.1. A simple model
  6.2. Extreme-relativistic scaling laws
  6.3. The radiation-dominated phase
  6.4. The matter-dominated phase
  6.5. Spreading
  6.6. Optical depth
  6.7. Anisotropic fireballs
7. Temporal structure and kinematic considerations
  7.1. Time scales
  7.2. Angular spreading and external shocks
  7.3. Angular variability and other caveats
  7.4. Temporal structure in internal shocks
8. Energy conversion
  8.1. Slowing down of relativistic particles
  8.2. Synchrotron emission from relativistic shocks
  8.3. Synchrotron self-absorption
  8.4. Inverse Compton emission
  8.5. Radiative efficiency
  8.6. Internal shocks
  8.7. Shocks with the ISM: external shocks
  8.8. The internal-external scenario
9. Afterglow
  9.1. Hydrodynamics of a slowing down relativistic shell
  9.2. Phases in a relativistic decelerating shell
  9.3. Synchrotron emission from a relativistic decelerating shell
  9.4. New puzzles from afterglow observations
10. Models of the inner engine
  10.1. The "inner engine"
  10.2. NSMs: binary neutron star mergers
  10.3. Binary neutron stars vs. black hole-neutron star mergers
11. Other related phenomena
  11.1. Cosmic rays
  11.2. UCHERs: ultra-high-energy cosmic rays
  11.3. High-energy neutrinos
  11.4. Gravitational waves
  11.5. Low energy neutrinos
  11.6. Black holes
12. Cosmological implications
13. Summary and conclusions
Acknowledgements
References
Abstract

Gamma-ray bursts (GRBs) have puzzled astronomers since their accidental discovery in the late 1960s. The BATSE detector on the COMPTON-GRO satellite has been detecting one burst per day for the last six years. Its findings have revolutionized our ideas about the nature of these objects. They have shown that GRBs are at cosmological distances. This idea was accepted with difficulty at first. The recent discovery of an X-ray afterglow by the Italian/Dutch satellite BeppoSAX has led to the detection of high-redshift absorption lines in the optical afterglow of GRB970508 and in several other bursts, and to the identification of host galaxies for others. This has confirmed the cosmological origin. Cosmological GRBs release $\sim 10^{51}$-$10^{53}$ erg in a few seconds, making them the most (electromagnetically) luminous objects in the Universe. The simplest, most conventional, and practically inevitable interpretation of these observations is that GRBs result from the conversion of the kinetic energy of ultra-relativistic particles, or possibly the electromagnetic energy of a Poynting flux, to radiation in an optically thin region. This generic "fireball" model has also been confirmed by the afterglow observations. The "inner engine" that accelerates the relativistic flow is hidden from direct observations. Consequently, it is difficult to infer its structure directly from current observations. Recent studies show, however, that this "inner engine" is responsible for the complicated temporal structure observed in GRBs. This temporal structure and energy considerations indicate that the "inner engine" is associated with the formation of a compact object, most likely a black hole. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 98.70.Rz; 95.30.Lz; 95.30.Sf
Keywords: Gamma-ray bursts
1. Introduction

Gamma-ray bursts (GRBs), short and intense bursts of $\sim 100$ keV-1 MeV photons, were discovered accidentally in the late 1960s by the Vela satellites [1]. The mission of these satellites was to monitor the "Outer Space Treaty" that forbade nuclear explosions in space. A wonderful by-product of this effort was the discovery of GRBs.

The discovery of GRBs was announced in 1973 [1]. It was confirmed quickly by Russian observations [2] and by observations on the IMP-6 satellite [3]. Since then, several dedicated satellites have been launched to observe the bursts and numerous theories have been put forward to explain their origin. Claims of observations of cyclotron spectral lines and of the discovery of optical archival counterparts led in the mid-1980s to a consensus that GRBs originate from Galactic neutron stars. This model was accepted quite generally and was even discussed in graduate textbooks [4-6] and encyclopedia articles [7,8].

The BATSE detector on the COMPTON-GRO (Gamma-Ray Observatory) was launched in the spring of 1991. It has revolutionized GRB observations and consequently our basic ideas on their nature. BATSE observations of the isotropy of GRB directions, combined with the deficiency of faint GRBs, ruled out the Galactic disk neutron star model and made a convincing case for their extra-galactic origin at cosmological distances [9]. This conclusion was recently confirmed by the discovery by BeppoSAX [10] of X-ray transient counterparts to several GRBs. This was followed by the discovery of optical [11,12] and radio transients [13]. Absorption lines with a redshift $z = 0.835$ were measured in the optical spectrum of the counterpart to GRB970508 [14], providing the first redshift of the optical transient and the associated GRB. Later, redshifted emission lines from galaxies associated with GRB971214 [15] (with $z = 3.418$) and GRB980703 [16] (with $z = 0.966$) were discovered. Galaxies have been discovered at the positions of other bursts.
There is little doubt now that some, and most likely all, GRBs are cosmological.

The cosmological origin of GRBs immediately implies that GRB sources are much more luminous than previously thought. They release $\sim 10^{51}$-$10^{53}$ erg or more in a few seconds, making them the most (electromagnetically) luminous objects in the Universe. This also implies that GRBs are rare events. BATSE observes on average one burst per day. This corresponds, with the simplest model (assuming that the rate of GRBs does not change with cosmological time), to one burst per million years per galaxy. The average rate changes, of course, if we allow beaming or a cosmic evolution of the rate of GRBs.

In spite of those discoveries, the origin of GRBs is still mysterious. This makes GRBs a unique phenomenon in modern astronomy. While pulsars, quasars and X-ray sources were all explained within a few years, if not months, after their discovery, the origin of GRBs remains unknown after more than 30 years. The fact that GRBs are a short transient phenomenon which until recently did not have any known counterpart is probably the main reason for this situation. Our inability to resolve this riddle also reflects the accidental and unexpected nature of this discovery, which was not made by an astronomical mission. Theoretical astrophysics was not ripe to cope with GRBs when they were discovered.

Footnote: A few GRBs, now called soft gamma repeaters, constitute a different phenomenon and are believed to form on Galactic neutron stars.
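As a rough consistency check on the rate quoted above, the two figures can be tied together with one line of arithmetic. The sketch below assumes, purely for illustration, that BATSE sees every burst in the sampled volume (it ignores partial sky coverage and detector dead time); the symbols $R_{\rm obs}$, $R_{\rm gal}$ and $N_{\rm gal}$ are introduced here and are not notation used in the review:

$$
N_{\rm gal} \approx \frac{R_{\rm obs}}{R_{\rm gal}} = \frac{365\ \mathrm{yr^{-1}}}{10^{-6}\ \mathrm{yr^{-1}\,galaxy^{-1}}} \approx 4 \times 10^{8}\ \text{galaxies}.
$$

Under these assumptions, one burst per day and one burst per million years per galaxy are mutually consistent if the bursts are drawn from a volume containing a few hundred million galaxies. Correcting for incomplete sky coverage would raise this number.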
A generic scheme of a cosmological GRB model has emerged in the last few years and most of this review is devoted to an exposition of this scheme. The recently observed X-ray, optical and radio counterparts were predicted by this picture [17-21]. This discovery can, to some extent, be considered as a confirmation of this model [22-25]. According to this scheme the observed $\gamma$-rays are emitted when an ultra-relativistic energy flow is converted to radiation. Possible forms of the energy flow are kinetic energy of ultra-relativistic particles or electromagnetic Poynting flux. This energy is converted to radiation in an optically thin region, as the observed bursts are not thermal. It has been suggested that the energy conversion occurs either due to the interaction with an external medium, like the ISM [27], or due to an internal process, such as internal shocks and collisions within the flow [28-30]. Recent work [20,31] shows that the external shock scenario is quite unlikely, unless the energy flow is confined to an extremely narrow beam, or else the process is highly inefficient. The only alternative is that the burst is produced by internal shocks.

The "inner engine" that produces the relativistic energy flow is hidden from direct observations. However, the observed temporal structure directly reflects this engine's activity. This model requires a compact internal "engine" that produces a wind (a long energy flow, long compared to the size of the "engine" itself) rather than an explosive "engine" that produces a fireball whose size is comparable to the size of the "engine". Not all the energy of the relativistic shell can be converted to radiation (or even to thermal energy) by internal shocks [32-34]. The remaining kinetic energy will most likely dissipate via external shocks that will produce an "afterglow" at different wavelengths [20]. This afterglow was recently discovered, confirming the fireball picture.
At present there is no agreement on the nature of the "engine", even though binary neutron star mergers [35] are a promising candidate. All that can be said with some certainty is that whatever drives a GRB must satisfy the following general features. It produces an extremely relativistic energy flow containing $\approx 10^{51}$-$10^{52}$ erg. The flow is highly variable, as most bursts have a variable temporal structure, and it should last for the duration of the burst (typically a few dozen seconds). It may continue at a lower level on a time scale of a day or so [36]. Finally, it should be a rare event occurring about once per million years in a galaxy. The rate is of course higher and the energy is lower if there is significant beaming of the gamma-ray emission. In any case the overall GRB emission in $\gamma$-rays is $\sim 10^{52}$ erg per $10^{6}$ years per galaxy.

We begin (Section 2) with a brief review of GRB observations (see [37-41] for additional reviews and [42-45] for a more extensive discussion). We then turn to an analysis of the observational constraints. We analyze the peak intensity distribution and show how the distance to GRBs can be estimated from this data. We also discuss the evidence for another cosmological effect: time dilation (Section 3). We then turn (Section 4) to the optical depth, or compactness, problem. We argue that the only way to overcome this problem is if the sources are moving at an ultra-relativistic velocity towards us. An essential ingredient of this model is the notion of a fireball: an optically thick, relativistically expanding electron-positron and photon plasma (for a different model see, however, [46]). We discuss fireball evolution in Section 6. Kinematic considerations, which determine the observed time scales of emission emerging from a relativistic flow, provide important clues on the location of the energy conversion process. We discuss these constraints in Section 7 and the energy conversion stage in Section 8.
We review the recent theories of afterglow formation in Section 9. We examine the confrontation of these models with observations and we discuss some of the quantitative problems.
We then turn to the "inner engine" and review the recent suggestions for cosmological models (Section 10). As this inner engine is hidden from direct observation, it is clear that there are only a few direct constraints that can be put on it. Among GRB models, the binary neutron star merger [35] is unique. It is the only model that is based on an independently observed phenomenon [48], is capable of releasing the required amounts of energy [49] within a very short time scale and takes place at approximately the same rate [50-52]. At present it is not clear if this merger can actually channel the required energy into a relativistic flow or if it could produce the very high energy observed in GRB971214. However, in view of the special status of this model we discuss its features and its possible observational confirmation in Section 10.2.

GRBs might have important implications for other branches of astronomy. The relation of GRBs to other astronomical phenomena such as UCHERs, neutrinos and gravitational radiation is discussed in Section 11. The Universe and our Galaxy are optically thin to low-energy $\gamma$-rays. Thus, GRBs constitute a unique cosmological population that is observed practically uniformly on the sky (there are small known biases due to CGRO's observation schedule). Most of these objects are located at $z \approx 1$ or greater. Thus, this population is farther away than any other systematic sample (QSOs are at larger distances but they suffer from numerous selection effects and there is no all-sky QSO catalog). GRBs are, therefore, an ideal tool to explore the Universe. Already in 1986, Paczyński [53] proposed that GRBs might be gravitationally lensed. This has led to the suggestion to employ the statistics of lensed bursts to probe the nature of the lensing objects and the dark matter of the Universe [54]. The fact that no lensed bursts were detected so far is sufficient to rule out a critical density of $10^{6.5} M_\odot$ to $10^{8.1} M_\odot$ black holes [55].
Alternatively we may use the peak-flux distribution to estimate cosmological parameters such as $\Omega$ and $\Lambda$ [56]. The angular distribution of GRBs can be used to determine the very large scale structure of the Universe [57,58]. The possible direct measurement of redshifts for some bursts greatly enhances the potential of these attempts. We conclude in Section 12 by summarizing these suggestions.

Over the years several thousand papers concerning GRBs have appeared in the literature. With the growing interest in GRBs, the number of GRB papers has recently been growing at an accelerated rate. It is, of course, impossible to summarize or even list all these papers here. I refer the interested reader to the complete GRB bibliography that was prepared by K. Hurley [59].

Footnote: This is assuming that there is no strong cosmic evolution in the rate of GRBs.

2. Observations

GRBs are short, non-thermal bursts of low-energy $\gamma$-rays. It is quite difficult to summarize their basic features. This difficulty stems from the enormous variety displayed by the bursts. I will review here some features that I believe hold the key to this enigma. I refer the reader to the proceedings of the Huntsville GRB meetings [42-45] and to other recent reviews for a more detailed discussion [37-41].

2.1. Duration

A "typical" GRB (if there is such a thing) lasts about 10 s. However, observed durations vary by six orders of magnitude, from several milliseconds [60] to several thousand seconds [61]. About 3% of the bursts are preceded by a precursor with a lower peak intensity than the main burst [62]. Other bursts were followed by low-energy X-ray tails [63]. Several bursts observed by the GINGA detector showed significant, apparently thermal, X-ray emission before and after the main part of the higher energy emission [64,65]. These are probably pre-discovery detections of the X-ray afterglow observed now by BeppoSAX and other X-ray detectors.

The definition of duration is, of course, not unique. BATSE's team characterizes it using $T_{90}$ ($T_{50}$), the time needed to accumulate from 5% to 95% (from 25% to 75%) of the counts in the 50-300 keV band. The shortest BATSE burst had a duration of 5 ms with structure on a scale of 0.2 ms [66]. GRB920229 has a spike with a rise time of 0.22 ms and a decay time of 0.4 ms [67]. The longest so far, GRB940217, displayed GeV activity one and a half hours after the main burst [68]. The bursts GRB961027a, GRB961027b, GRB961029a and GRB961029b occurred from the same region in the sky within two days [69]. If this "gang of four" is considered as a single very long burst, then the longest duration so far is two days! These observations may indicate that some sources display a continued activity (at a variable level) over a period of days [70]. It is also possible that the observed afterglow is an indication of a continued activity [36].

The distribution of burst durations is bimodal. BATSE confirmed earlier hints [71] that the burst duration distribution can be divided into two sub-groups according to $T_{90}$: long bursts with $T_{90} > 2$ s and short bursts with $T_{90} < 2$ s [72-77]. The ratio of observed long bursts to observed short bursts is three to one. This does not necessarily mean that there are fewer short bursts. BATSE's triggering mechanism makes it less sensitive to short bursts than to long ones. Consequently, short bursts are detected only out to smaller distances [75,78-80] and we observe a smaller number of short bursts.

2.2. Temporal structure and variability

The bursts have complicated and irregular time profiles which vary drastically from one burst to another. Several time profiles, selected from the second BATSE catalog, are shown in Fig. 1. In most bursts, the typical variation takes place on a time scale $\delta T$ significantly smaller than the total duration of the burst, $T$. Walker et al. [81] find that in the majority of the bursts that they have looked at, the rise time for individual peaks is shorter than 4 ms. In a minority of the bursts there is only one peak with no substructure, and in this case $\delta T \sim T$. It turns out that the observed variability provides an interesting clue to the nature of GRBs. We discuss this in Section 7. We define the ratio $N \equiv T/\delta T$, which is a measure of the variability. Fig. 2 depicts the total observed counts (at $E > 25$ keV) from GRB1676. The burst lasted $T \sim 100$ s and it had peaks of width $\delta T \sim 1$ s, leading to $N = 100$.

Fig. 1. Total number of counts vs. time for several bursts from the BATSE Catalogue. Note the large diversity of temporal structure observed.

2.3. Spectrum

GRBs are characterized by emission in the few hundred keV range with a non-thermal spectrum (see Fig. 3). X-ray emission is weaker: only a few percent of the energy is emitted below 10 keV, and prompt emission at lower energies has not been observed so far. The current best upper limits on such emission are given by LOTIS. For GRB970223 LOTIS finds $m_V > 11$ and provides an upper limit on the simultaneous optical to gamma-ray fluence ratio of $< 1.1 \times 10^{-4}$ [82]. Most bursts are accompanied, on the other hand, by a high-energy tail which contains a significant amount of energy: $E^2 N(E)$ is almost a constant. GRB940217, for example, had a high-energy tail up to 18 GeV [83]. In fact, EGRET and COMPTEL observations (these instruments are sensitive to higher energy emission but have a higher threshold and a smaller field of view) are consistent with the possibility that all bursts have high-energy tails [84,85].

An excellent phenomenological fit for the spectrum was introduced by Band et al. [86]:
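$$
N(\nu) = N_0
\begin{cases}
(h\nu)^{\alpha}\exp\left(-\dfrac{h\nu}{E_0}\right), & \text{for } h\nu < H,\\[6pt]
\left[(\alpha-\beta)E_0\right]^{\alpha-\beta}(h\nu)^{\beta}\exp(\beta-\alpha), & \text{for } h\nu > H,
\end{cases}
\qquad (1)
$$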
where H ≡ (α − β)E_0. There is no particular theoretical model that predicts this spectral shape. Still, this function provides an excellent fit to most of the observed spectra. It is characterized by two power laws joined smoothly at a break energy H. For most observed values of α and β, νF_ν ∝ ν²N(ν) peaks at E_p = (α + 2)E_0 = [(α + 2)/(α − β)]H. The "typical" energy of the observed radiation is E_p. This is where the source emits the bulk of its luminosity. E_p defined in this way should not be confused with the hardness ratio which is commonly used in analyzing BATSE's data, namely the ratio of photons observed in channel 3 (100–300 keV) to those observed in channel
Fig. 2. Counts vs. time for BATSE burst 1676. The burst lasted T ~ 60 s and had peaks of width δT ~ 1 s, leading to N ≈ 60.
Fig. 3. Observed spectrum of BATSE burst 228.
2 (50–100 keV). Sometimes we will use a simple power-law fit to the spectrum:

$$
N(E)\,dE \propto E^{-\alpha}\,dE. \qquad (2)
$$
In these cases the power-law index will be denoted by α. A typical spectral index is α ≈ 1.8–2 [87]. In several cases the spectrum was observed simultaneously by several instruments. Burst 920622, for example, was observed simultaneously by BATSE, COMPTEL and Ulysses. The time-integrated spectrum on those detectors, which ranges from 25 keV to 10 MeV, agrees well with a Band spectrum with E_0 = 457 ± 30 keV, α = −0.86 ± 0.15 and β = −2.5 ± 0.07 [88]. Schaefer et al. [89] present a complete spectrum from 2 keV to 500 MeV for three bright bursts.
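As an illustration of the Band function of Eq. (1), the following minimal Python sketch evaluates the spectrum for the fit parameters quoted above. It assumes that the quoted 457 keV is the parameter E_0 of Eq. (1) and sets the normalization N_0 to 1; the function and variable names are illustrative and are not taken from the original work.

```python
import math


def band_photon_spectrum(e_kev, alpha, beta, e0_kev, n0=1.0):
    """Band et al. photon spectrum N(E) of Eq. (1), with E = h*nu in keV.

    n0 is an arbitrary normalization (assumption: set to 1 here).
    """
    e_break = (alpha - beta) * e0_kev  # break energy H = (alpha - beta) * E_0
    if e_kev < e_break:
        # low-energy branch: power law with exponential cutoff
        return n0 * e_kev ** alpha * math.exp(-e_kev / e0_kev)
    # high-energy branch: pure power law, matched to the low branch at H
    return n0 * e_break ** (alpha - beta) * e_kev ** beta * math.exp(beta - alpha)


def peak_energy(alpha, e0_kev):
    """Energy at which nu*F_nu (proportional to E^2 N(E)) peaks: (alpha + 2) * E_0."""
    return (alpha + 2.0) * e0_kev


# Fit parameters quoted in the text (assumed here to refer to E_0 of Eq. (1)).
ALPHA, BETA, E0_KEV = -0.86, -2.5, 457.0
H_KEV = (ALPHA - BETA) * E0_KEV
E_PEAK_KEV = peak_energy(ALPHA, E0_KEV)

if __name__ == "__main__":
    print(f"break energy H = {H_KEV:.1f} keV")
    print(f"nu F_nu peak   = {E_PEAK_KEV:.1f} keV")
```

The peak formula applies only when the peak lies below the break, i.e. when β < −2, which holds for these parameters.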
Fig. 4. N(H), the number of bursts with hardness H, in the Band et al. [86] sample (dashed-dotted line) and in the Cohen et al. sample (solid line) [91], together with a theoretical fit of a distribution above H = 120 keV with γ ~ −0.5 (a slowly decreasing number of GRBs per decade of hardness).
Fig. 4 shows the distribution of observed values of H in several samples [86,90,91]. Most of the bursts are in the range 100 keV < H < 400 keV, with a clear maximum in the distribution around H ~ 200 keV. There are not many soft GRBs, that is, GRBs with peak energy in the tens of keV range. This low peak energy cutoff is real, as soft bursts would have been easily detected by current detectors. However, it is not known whether there is a real paucity of hard GRBs and an upper cutoff to the GRB hardness, or whether it just happens that detection is easiest in this (few hundred keV) band. BATSE triggers, for example, are based mostly on the count rate between 50 and 300 keV. BATSE is, therefore, less sensitive to harder bursts that emit most of their energy in the MeV range. Using BATSE's observations alone, one cannot rule out the possibility that there is a population of harder GRBs that emit equal power in total energy but are not observed because of this selection effect [91–94]. More generally, a harder burst with the same energy as a soft one emits fewer photons. Furthermore, the spectrum is generally flat in the high-energy range and it decays quickly at low energies. Therefore it is intrinsically more difficult to detect a harder burst. A study of the SMM data [95] suggests that there is a deficiency (by at least a factor of 5) of GRBs with hardness above 3 MeV, relative to GRBs peaking at ~0.5 MeV, but these data are consistent with a population of hardness that extends up to 2 MeV.

Overall the spectrum is non-thermal. This indicates that the source must be optically thin. The spectrum deviates from a black body at both the low- and the high-energy ends: the X-ray paucity constraint rules out optically thick models in which the γ-rays could be effectively degraded to X-rays [96]. The high-energy tails lead to another strong constraint on physical GRB models. These high-energy photons escape freely from the source without producing electron-positron pairs!
As we show later, this provides the first and most important clue on the nature of GRBs. The low-energy part of the spectrum behaves in many cases like a power law: F_ν ∝ ν^α with −… < α < … [19,97]. This is consistent with the low-energy tail of synchrotron emission from
relativistic electrons, that is, a distribution of electrons in which all the population, not just the upper tail, is relativistic. This is a direct indication for the existence of relativistic shocks in GRBs. More than 90% of the bright bursts studied by Schaefer et al. [89] satisfy this limit. However, there may be bursts whose low-energy tail is steeper [98]. Such a spectrum cannot be produced by a simple synchrotron emission model and it is not clear how it is produced.
2.4. Spectral evolution

Observations by earlier detectors as well as by BATSE have shown that the spectrum varies during the bursts. Different trends were found. Golenetskii et al. [99] examined two-channel data from five bursts observed by the KONUS experiment on Venera 13 and 14 and found a correlation between the effective temperature and the luminosity, implying that the spectral hardness is related to the luminosity. Similar results were obtained by Mitrofanov et al. [100]. Norris et al. [101] investigated ten bursts seen by instruments on the SMM (Solar Maximum Mission) satellite. They found that individual intensity pulses evolve from hard to soft, with the hardness peaking earlier than the intensity. This was supported by more recent BATSE data [102]. Ford et al. [103] analyzed 37 bright BATSE bursts and found that the spectral evolution is a mixture of those found by Golenetskii et al. [99] and by Norris et al. [101]: the peak energy either rises with or slightly precedes major intensity increases and softens for the remainder of the pulse. For bursts with multiple peak emission, later spikes tend to be softer than earlier ones. A related but different trend is shown by the observation that the bursts are narrower at higher energies, with T(ν) ∝ ν^-… [104]. As we show in Section 8.7.3 this behavior is consistent with synchrotron emission [105].
2.5. Spectral lines

Both absorption and emission features have been reported by various experiments prior to BATSE. Absorption lines in the 20–40 keV range have been observed by several experiments, but never simultaneously. GINGA has discovered several cases of lines with harmonic structure [106,107]. These lines were interpreted as cyclotron lines (reflecting a magnetic field of ≈10^… G) and provided one of the strongest arguments in favor of the galactic neutron star model. Emission features near 400 keV have been claimed in other bursts [108]. These have been interpreted as red-shifted 511 keV annihilation lines with a corresponding redshift of ≈20% due to the gravitational field on the surface of the neutron star. These provided additional evidence for the galactic neutron star model. So far BATSE has not found any of the spectral features (absorption or emission lines) reported by earlier satellites [109,110]. This can be interpreted as a problem with previous observations (or with the difficult analysis of the observed spectra) or as an unlucky coincidence. Given the rate of observed lines in previous experiments it is possible (at the ≈5% level) that the two sets of data are consistent [111]. Recently, Mészáros and Rees [112] suggested that within the relativistic fireball model the observed spectral lines could be blueshifted iron X-ray lines.
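The arithmetic behind the annihilation-line interpretation can be checked with a short Python sketch. It only applies the definition of redshift, E_obs = E_rest/(1 + z); the function names are illustrative, and the value z = 0.2 is the approximate redshift quoted above.

```python
ELECTRON_REST_ENERGY_KEV = 511.0  # rest energy of the electron, in keV


def observed_line_energy(rest_energy_kev, z):
    """Observed energy of a line emitted at rest_energy_kev and redshifted by z."""
    return rest_energy_kev / (1.0 + z)


def redshift_from_line(rest_energy_kev, observed_energy_kev):
    """Redshift needed to move a line from its rest energy to the observed energy."""
    return rest_energy_kev / observed_energy_kev - 1.0


if __name__ == "__main__":
    # A 20% redshift moves the 511 keV line to roughly 426 keV.
    print(observed_line_energy(ELECTRON_REST_ENERGY_KEV, 0.2))
    # Conversely, a feature at exactly 400 keV would need z of about 0.28.
    print(redshift_from_line(ELECTRON_REST_ENERGY_KEV, 400.0))
```

Features "near 400 keV" therefore correspond to redshifts of roughly 0.2 to 0.3 under this interpretation.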
2.6. Angular positions

BATSE is capable of estimating on its own the direction to a burst. It is composed of eight detectors that are pointed towards different directions in the sky. The relative intensity of the counts in the various detectors allows us to measure the direction to the burst. The positional error of a given burst is the square root of the sum of squares of a systematic error and a statistical error. The statistical error depends on the strength of the burst. It is as large as 20° for a weak burst, and it is negligible for a strong one. The estimated systematic error (using a comparison of BATSE positions with IPN (Inter Planetary Network) localization) is ≈1.6° [113]. A different analysis of this comparison [114,115] suggests that this might be slightly higher, around 3°.

The location of a burst is determined much better using the difference in arrival time of the burst at several detectors on different satellites. Detection by two satellites limits the position to a circle on the sky. Detection by three determines the position and detection by four or more overdetermines it. Even in this case the positional error depends on the strength of the burst. The stronger the burst, the easier it is to identify a unique moment of time in the incoming signals. Clearly, the accuracy of the positional determination improves with the distance between the satellites. The best positions that have been obtained in this way are with the IPN of detectors. For 12 events the positional error boxes are of a few arc-minutes [116].

The BeppoSAX Wide Field Camera (WFC), which covers about 5% of the sky, located a few bursts to within 3′ (3σ). BeppoSAX's Narrow Field Instrument (NFI) obtained the bursts' positions to within 50″. X-ray observations by ASCA and ROSAT have yielded error boxes of 30″ and 10″, respectively. Optical identification has led, as usual, to a localization within 1″. Finally, VLBI radio observation of GRB970508 has yielded a position within 200 μarcsec.
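The arrival-time localization described above can be sketched in a few lines of Python. The sketch assumes the standard plane-wave triangulation relation cos θ = cΔt/D, where θ is the angle between the burst direction and the baseline of length D between two spacecraft and Δt is the arrival-time difference; this relation, the function names and the numerical values in the usage lines are illustrative assumptions rather than quantities taken from the text.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
AU_KM = 1.495978707e8    # astronomical unit in km


def annulus_half_angle_deg(delay_s, baseline_km):
    """Angle between the burst direction and the inter-spacecraft baseline.

    Two spacecraft constrain the burst to a circle of this angular radius
    around the baseline direction (plane-wave assumption).
    """
    cos_theta = C_KM_S * delay_s / baseline_km
    if abs(cos_theta) > 1.0:
        raise ValueError("delay exceeds the light travel time along the baseline")
    return math.degrees(math.acos(cos_theta))


def annulus_width_arcmin(timing_error_s, baseline_km, half_angle_deg):
    """Width of the circle on the sky produced by a timing uncertainty."""
    sin_theta = math.sin(math.radians(half_angle_deg))
    width_rad = C_KM_S * timing_error_s / (baseline_km * sin_theta)
    return math.degrees(width_rad) * 60.0


if __name__ == "__main__":
    # Hypothetical numbers: 1 AU baseline, 250 s delay, 10 ms timing error.
    theta = annulus_half_angle_deg(250.0, AU_KM)
    print(theta, annulus_width_arcmin(0.01, AU_KM, theta))
```

The width scales as 1/D, which is the quantitative form of the statement that longer baselines give better positions.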
The position of at least one burst is well known.

2.7. Angular distribution

One of the most remarkable findings of BATSE was the observation that the angular distribution of GRBs' positions on the sky is perfectly isotropic. Early studies had shown an isotropic GRB distribution [117], which had even led to the suggestion that GRBs are cosmological [118]. In spite of this it was generally believed, prior to the launch of BATSE, that GRBs are associated with galactic disk neutron stars. It was expected that more sensitive detectors would discover an anisotropic distribution that would reflect the planar structure of the disk of the galaxy. BATSE's distribution is, within the statistical errors, in complete agreement with perfect isotropy. For the first 1005 BATSE bursts the observed dipole and quadrupole (corrected for BATSE sky exposure) relative to the galaxy are ⟨cos θ⟩ = 0.017 ± 0.018 and ⟨sin² b − 1/3⟩ = −0.003 ± 0.009. These values are, respectively, 0.9σ and 0.3σ from complete isotropy [39].

2.8. Quiescent counterparts and the historical "no host" problem

One of the main obstacles in resolving the GRB mystery was the lack of identified counterparts in other wavelengths. This has motivated numerous attempts to discover GRB counterparts (for a review see [119,120]). This is a difficult task: it was not known what to expect and where and when to look for it.
The search for counterparts is traditionally divided into efforts to find a flaring (burst), a fading or a quiescent counterpart. Fading counterparts (afterglows) have recently been discovered by BeppoSAX and, as expected, this discovery has revolutionized GRB studies. It also allowed the discovery of host galaxies in several cases, which will be discussed in the following section. Soft X-ray flaring (simultaneous with the GRB) was discovered in several bursts, but it is unclear whether this should be considered a part of the GRB itself or a separate component. Flaring has not been discovered in other wavelengths yet. Quiescent counterparts were not discovered either.

Most cosmological models suggest that GRBs are in a host galaxy. If so, then deep searches within the small error boxes of some GRBs localized by the IPN system should reveal the host galaxy. Until the discovery of GRB afterglow these searches yielded only upper limits on the magnitudes of possible hosts. This has led to what is called the "no host" problem. Schaefer et al. conducted searches in the near and far infrared [121] using IRAS, in radio using the VLA [122] and in archival optical photographs [123], and found only upper limits and no clear counterpart candidates. Similar results from multiple wavelength observations have been obtained by Hurley et al. [68]. Vrba et al. [124] monitored the error boxes of seven bursts for five years. They did not find any unusual objects. As for the "no host" problem, these authors, as well as Luginbuhl et al. [125] and Larson et al. [126], concluded, using the standard galaxy luminosity function, that there are enough dim galaxies in the corresponding GRB error boxes which could be the hosts of cosmological bursts and that, therefore, there is no "no host" problem. More recently Larson and McLean [127] monitored in the infrared nine of the smallest error boxes of bursts localized by the IPN, with a typical error box of eight arcmin.
They found in all error boxes at least one bright galaxy with K ≤ 15.5. However, the error boxes are too large to discern between the host galaxy and unrelated background galaxies. Schaefer et al. [128] searched the error boxes of five GRBs using the HST. Four of these are smaller boxes with a size of ~1 arcmin. They searched but did not find any unusual objects with UV excess, variability, parallax or proper motion. They found only faint galaxies. For the four small error boxes the luminosity upper limits of the host galaxy are 10–100 times smaller than the luminosity of an L* galaxy. Band and Hartmann [129] concluded that the error boxes of Larson and McLean [127] are too large to discriminate between the presence or the absence of host galaxies. However, they find that the absence of host galaxies in the Schaefer et al. [128] data is significant, at the 2 × 10^-… level, suggesting that there are no bright hosts.

This situation has changed drastically and the "no host" problem has disappeared with afterglow observations. These observations have allowed accurate position determination and the identification of host galaxies for several GRBs. Most of these host galaxies are dim, with magnitude 24.4 < R < 25.8. This supports the conclusions of the earlier studies that GRBs are not associated with bright galaxies and definitely not with cores of such galaxies (ruling out, for example, AGN-type models). These observations are consistent with the GRB rate being either constant or proportional to the star formation rate [130]. According to this analysis it is not surprising that most hosts are detected at R ~ 25. However, though these two models are consistent with the current data, both predict the existence of host galaxies brighter than 24 mag, which have not been observed so far. One could say now that the "no host" problem has been replaced by the "no bright host" problem. But this may not be a problem but rather an indication of the nature of the sources.
The three GRBs with measured cosmological redshifts lie in host galaxies with strong evidence for star formation. These galaxies display prominent emission lines associated with star formation. In all three cases the strength of those lines is high for galaxies of comparable magnitude and redshift [16,130–133]. The host of GRB980703, for example, shows a star formation rate of ~10 M_⊙ yr^-1 or higher, with a lower limit of 7 M_⊙ yr^-1 [131]. For most GRBs with afterglow the host galaxy was detected but no emission or absorption lines were found and no redshift was measured. This result is consistent with the hypothesis that all GRBs are associated with star-forming galaxies. For those hosts that are at redshift 1.3 < z < 2.5 the corresponding emission lines are not observed, as for this redshift range no strong lines are found in the optical spectroscopic window [133].

The simplest conclusion of the above observations is that all GRBs are associated with star-forming regions. Still, one has to keep in mind that those GRBs on which this conclusion was based had a strong optical afterglow, which not all GRBs show. It is possible that the conditions associated with star-forming regions (such as high interstellar matter density, or the existence of molecular clouds) are essential for the appearance of strong optical afterglow and not for the appearance of the GRB itself.

2.9. Afterglow

GRB observations were revolutionized on 28 February 1997 by the Italian-Dutch satellite BeppoSAX [134], which discovered an X-ray counterpart to GRB970228 [10]. GRB970228 was a double-peaked GRB. The first peak, which lasted ~15 s, was hard. It was followed, 40 s later, by a much softer second peak, which lasted some ~40 s. The burst was detected by the Gamma-Ray Burst Monitor (GRBM) as well as by the Wide Field Camera (WFC). The WFC, which has a 40° × 40° field of view, detected soft X-rays simultaneously with both peaks.
Eight hours later the Narrow Field Instrument (NFI) was pointed towards the burst's direction and detected a continuous X-ray emission. The accurate position determined by BeppoSAX enabled the identification of an optical afterglow [11]: a 20th-magnitude point source adjacent to a red nebula. HST observations [135] revealed that the nebula adjacent to the source is roughly circular with a diameter of 0.8″. The diameter of the nebula is comparable to that of galaxies of similar magnitude found in the Hubble Deep Field, especially if one takes into account a possible visual extinction in the direction of GRB970228 of at least one magnitude [136].

Subsequent X-ray detections by BeppoSAX [10,137], ROSAT [138] and ASCA [139] revealed a decaying X-ray flux ∝ t^-… (see Fig. 5). The decaying flux can be extrapolated as a power law directly to the X-ray flux of the second peak (even though this extrapolation requires some care in determining when t = 0 is). The optical emission also shows a decaying flux [140] (see Fig. 6). The source could not be observed from late March 1997 until early September 1997. When it was observed again, on 4 September by HST [141,133], it was found that the optical nebulosity does not decay and the point source shows no proper motion, refuting earlier suggestions. The visual magnitude of the nebula on 4 September was 25.7 ± 0.25, compared with V = 25.6 ± 0.25 on 26 March and 7 April. The visual magnitude of the point source on 4 September was V = 28.0 ± 0.25, which is consistent with a decay of the flux as t^-… [133]. In spite of extensive efforts no radio emission was detected and one can set an upper limit of ~10 μJy on the radio emission at 8.6 GHz [142].
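A power-law decay of the flux translates into a magnitude that grows linearly with log t, which is how two optical measurements constrain the decay index. The Python sketch below illustrates this relation; it assumes F ∝ t^-a with t measured from the burst and uses the standard definition of magnitudes. The function names and the numbers in the usage lines are hypothetical and are not measurements from the text.

```python
import math


def flux_ratio(mag_early, mag_late):
    """Ratio F_late / F_early implied by two magnitudes (Pogson relation)."""
    return 10.0 ** (-0.4 * (mag_late - mag_early))


def decay_index(t_early, mag_early, t_late, mag_late):
    """Index a of F proportional to t^-a from two epochs (same time units)."""
    return 0.4 * (mag_late - mag_early) / math.log10(t_late / t_early)


def magnitude_at(t, t_ref, mag_ref, index):
    """Magnitude predicted at time t for a source with F proportional to t^-index."""
    return mag_ref + 2.5 * index * math.log10(t / t_ref)


if __name__ == "__main__":
    # Hypothetical source: magnitude 20 at day 1, decay index 1.2.
    late = magnitude_at(100.0, 1.0, 20.0, 1.2)
    print(late, decay_index(1.0, 20.0, 100.0, late), flux_ratio(20.0, late))
```

The result depends on the choice of t = 0, which is the caveat noted above for extrapolating the X-ray decay back to the burst.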
Fig. 5. Decay of the X-ray afterglow from GRB970228. From [10]. Shown is the source flux in the 2–10 keV range. The data are fitted with a power law t^-….
Fig. 6. Decay of the optical afterglow in GRB970228, GRB970508 and GRB971214. A clear power-law decay can be seen in all cases.
GRB970508 was detected by both BATSE in γ-rays [143] and BeppoSAX in X-rays [144] on 8 May 1997. The γ-ray burst lasted for ~15 s, with a γ-ray fluence of ~3 × 10^-… erg/cm². Variable emission in X-rays, optical [12,145–149] and radio [13,150] followed the γ-rays. The spectrum of the optical transient taken by Keck revealed a set of absorption lines associated with Fe II and Mg II, and an O II emission line, at a redshift z = 0.835 [14]. A second absorption line system with z = 0.767 is also seen. These lines reveal the existence of an underlying, dim galaxy
Fig. 7. Decay of the optical afterglow in GRB970508. A clear transition from a power-law decay to a constant can be seen. From [132].
host. HST images [151,152] and Keck observations [132] show that this host is a very faint (R = 25.72 ± 0.2 mag), compact (≤1 arcsec) dwarf galaxy at z = 0.835, nearly coincident on the sky with the transient (see Fig. 7). The optical light curve peaks at around 2 days after the burst. Assuming isotropic emission (and using z = 0.835 and H_0 = 100 km/s/Mpc) this peak flux corresponds to a luminosity of a few × 10^… erg/s. The flux decline shows a continuous power-law decay ∝ t^-… [153–156,132]. After about 100 days the light curve began to flatten as the transient faded and became weaker than the host [155,157–159]. Integration of this light curve results in an overall emission of a few × 10^… erg in the optical band.

Radio emission was first observed one week after the burst [13] (see Fig. 8). This emission showed intensive oscillations which were interpreted as scintillation [160]. The subsequent disappearance of these oscillations after about three weeks enabled Frail et al. [13] to estimate the size of the fireball at this stage to be ~10^… cm. This was supported by the indication that the radio emission was initially optically thick [13], which yields a similar estimate of the size [25].

GRB970828 was a strong GRB that was detected by BATSE on 28 August 1997. Shortly afterwards RXTE [161,162] focused on the approximate BATSE position and discovered X-ray emission. This X-ray emission determined the position of the burst to within an elliptical error box of 5 × 2. However, in spite of enormous effort no variable optical counterpart brighter than R = 23.8 that changed by more than 0.2 magnitude was detected [163]. There was also no indication of any radio emission. Similarly, X-ray afterglow was detected from several other GRBs (GRB970615, GRB970402, GRB970815, GRB980519) with no optical or radio emission.
Seventeen GRBs had been detected with arcminute positions by 22 July 1998: 14 by the WFC of BeppoSAX and three by the All-Sky Monitor (ASM) on board the Rossi X-ray Timing Explorer (RXTE). Of these seventeen bursts, thirteen were followed up within a day in X-rays and all of those resulted in good candidates for X-ray afterglows. We will not discuss all of them here (see Table 1 for a short summary of some of the properties). Worth mentioning, however, are GRB971214, GRB980425 and GRB980703.
Fig. 8. Light curve of the radio afterglow from GRB970508. From [13].
Table 1
Observational data of several GRBs for which afterglow was detected. The two columns O and R indicate whether emission was detected in the optical and radio, respectively. The total energy of the burst is estimated through the observed fluence and redshift, assuming spherical emission and a flat Ω = 1, Λ = 0 universe with H_0 = 65 km/s/Mpc

Burst       X-ray detection   O   R   γ-ray fluence (erg/cm²)   Redshift   Total energy (erg)
GRB970228   BeppoSAX          +   -   1 × 10^-…                 -          -
GRB970508   BeppoSAX          +   +   2 × 10^-…                 0.835      2 × 10^…
GRB970616   BeppoSAX          -   -   4 × 10^-…                 -          -
GRB970815   RXTE              -   -   1 × 10^-…                 -          -
GRB970828   RXTE              -   -   7 × 10^-…                 -          -
GRB971214   RXTE              +   +   1 × 10^-…                 3.418      1 × 10^…
GRB971227   BeppoSAX          -   -   9 × 10^-…                 -          -
GRB980326   BeppoSAX          -   -   1 × 10^-…                 -          -
GRB980329   BeppoSAX          +   +   5 × 10^-…                 -          -
GRB980425   BeppoSAX          +   +   4 × 10^-…                 0.0085     7 × 10^…
GRB980515   BeppoSAX          -   -   1 × 10^-…                 -          -
GRB980519   BeppoSAX          +   +   3 × 10^-…                 -          -
GRB980703   RXTE              +   +   5 × 10^-…                 0.966      1 × 10^…
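The total energies in Table 1 follow from the fluence and the redshift under the stated cosmology. The Python sketch below shows one way such an estimate can be made. It assumes the standard Einstein-de Sitter (Ω = 1, Λ = 0) luminosity distance and the usual bolometric relation E = 4π d_L² S/(1 + z), neither of which is spelled out in the text, and it ignores any correction for the spectral shape. The function names are illustrative, and the fluence in the usage line is a placeholder rather than a value from the table.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
MPC_CM = 3.0857e24    # one megaparsec in cm


def luminosity_distance_mpc(z, h0=65.0):
    """Luminosity distance for Omega = 1, Lambda = 0, with H0 in km/s/Mpc.

    d_L = (2c/H0) (1 + z) [1 - (1 + z)^(-1/2)]
    """
    return 2.0 * C_KM_S / h0 * (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))


def isotropic_energy_erg(fluence_erg_cm2, z, h0=65.0):
    """Energy released for isotropic emission, given the observed fluence."""
    d_l_cm = luminosity_distance_mpc(z, h0) * MPC_CM
    return 4.0 * math.pi * d_l_cm ** 2 * fluence_erg_cm2 / (1.0 + z)


if __name__ == "__main__":
    # Placeholder fluence of 1e-6 erg/cm^2 at z = 0.835 (illustrative only).
    print(luminosity_distance_mpc(0.835), isotropic_energy_erg(1.0e-6, 0.835))
```

Because the energy scales as 1/H_0² and depends on Ω through the distance, the same burst yields different totals for the different cosmological parameters quoted for individual bursts.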
GRB971214 was a rather strong burst. It was detected on December 14.9 UT 1997 [164]. Its optical counterpart was observed with a magnitude of 21.2 ± 0.3 in the I band by Halpern et al. [165] on December 15.47 UT, 12 h after the burst. It was observed one day later, on December 16.47, with I magnitude 22.6. Kulkarni et al. [15] obtained a spectrum of the host galaxy for
GRB971214 and found a redshift of z = 3.418! With a total fluence of 1.09 × 10^-… erg/cm² [166] this large redshift implies, for isotropic emission, Ω = 1 and H_0 = 65 km/s/Mpc, an energy release of ~10^… erg in γ-rays alone. The familiar value of 3 × 10^… [15] is obtained for Ω = 0.3 and H_0 = 55 km/s/Mpc.

GRB980425 was a moderately weak burst with a peak flux of (3 ± 0.3) × 10^-… erg/cm² s. It was a single-peak burst with a rise time of 5 s and a decay time of about 25 s. The burst was detected by BeppoSAX (as well as by BATSE), whose WFC obtained a position with an error box of 8′. Inspection of an image of this error box taken by the New Technology Telescope (NTT) revealed a type Ic supernova, SN1998bw, that took place more or less at the same time as the GRB [167]. Since the probability for a chance association of the SN and the GRB is only 1.1 × 10^-… it is likely that this association is real. The host galaxy of this supernova (ESO 184-G82) has a redshift of z = 0.0085 ± 0.0002, putting it at a distance of 38 ± 1 Mpc for H_0 = 67 km/s/Mpc. The corresponding γ-ray energy is 5 × 10^… erg. With such a low luminosity it is inevitable that, if the association of this burst with the supernova is real, it must correspond to a new and rare subgroup of GRBs.

GRB980703 was a very strong burst with an observed gamma-ray fluence of (4.59 ± 0.42) × 10^-… erg/cm² [168]. Keck observations revealed that the host galaxy has a redshift of z = 0.966. The corresponding energy release (for isotropic emission, Ω = 0.2 and H_0 = 65 km/s/Mpc) is ~10^… erg [131].

2.10. Repetition?

Quashnock and Lamb [169] suggested that there is evidence from the data in the BATSE 1B catalog for repetition of bursts from the same source. If true, this would severely constrain most GRB models. In particular, it would rule out any model based on a "once in a lifetime" catastrophic event. This claim has been refuted by several authors [170,171] and most notably by the analysis of the 2B data [172] and 3B data [173].
A unique group of four bursts, "the gang of four", emerged from the same position on the sky within two days [69]. One of those bursts (the third burst, GRB961029a) was extremely strong, one of the strongest observed by BATSE so far. Consequently it was observed by the IPN network as well, and its position is known accurately. The other three (GRB961027a, GRB961027b and GRB961029d) were detected only by BATSE. The precise position of one burst is within the 1σ circles of the three other bursts. However, two of the bursts are almost 3σ away from each other. Is this a clear-cut case of repetition? It is difficult to assign a unique statistical significance to this question, as the significance depends critically on the a priori hypothesis that one tests. Furthermore, the time difference between the first and the last bursts is less than two days. This is only one order of magnitude longer than the longest burst observed beforehand. It might still be possible that all those bursts came from the same source and that they should be considered as one long burst.

Footnote: This value depends also on the spectral shape of the burst.

2.11. Correlations with Abell clusters, quasars and supernovae

Various attempts to search for a correlation between GRBs and other astronomical objects led to null results. For example, Blumenthal and Hartmann [174] found no angular correlation
between GRBs and nearby galaxies. They concluded that if GRBs are cosmological then they must be located at distances larger than 100 Mpc. Otherwise, they would have shown a positive correlation with the galaxy distribution.

The only exception is the correlation (at 95% confidence level) between GRBs in the 3B catalog and Abell clusters [78,175]. This correlation has been recently confirmed by Kompaneetz and Stern [176]. The correlation is strongest for a subgroup of strong GRBs whose position is accurately known. Comparison of the rich clusters' auto-correlation with the cross-correlation found suggests that ~26 ± 15% of the members of the accurate-position GRB sub-sample are located within 600 h^-1 Mpc. Recently, Schartel et al. [178] found that a group of 134 GRBs with position error radius smaller than 1.8° are correlated with radio-quiet quasars. The probability of such a correlation by chance coincidence is less than 0.3%.

It should be stressed that this correlation does not imply that there is a direct association between GRBs and Abell clusters, such as would arise if GRBs emerged from Abell clusters. All that it means is that GRBs are distributed in space like the large-scale structure of the universe. Since Abell clusters are the best tracers of this structure they are correlated with GRBs. Therefore the lack of excess Abell clusters in IPN error boxes (which are much smaller than BATSE's error boxes) [177] does not rule out this correlation.

2.12. ⟨V/V_max⟩, count and peak flux distributions
The limiting fluence observed by BATSE is ≈10^-… erg/cm². The actual fluence of the strongest bursts is larger by two or three orders of magnitude. A plot of the number of bursts vs. the peak flux depicts clearly a paucity of weak bursts. This is manifested by the low value of ⟨V/V_max⟩, a statistic designed to measure the distribution of sources in space [179]. A sample of the first 601 bursts has ⟨V/V_max⟩ = 0.328 ± 0.012, which is 14σ away from the homogeneous flat space value of 0.5 [180]. Correspondingly, the peak count distribution is incompatible with a homogeneous population of sources in Euclidean space. It is compatible, however, with a cosmological distribution (see Fig. 10). The distribution of short bursts has a larger ⟨V/V_max⟩ and it is compatible with a homogeneous Euclidean distribution [78–80].
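The homogeneous value ⟨V/V_max⟩ = 0.5 and the quoted 14σ discrepancy can be reproduced with a small Python sketch. It assumes standard candles distributed uniformly in Euclidean space out to the detection limit, so that V/V_max = (C/C_min)^(-3/2) for a burst with peak count rate C and detection threshold C_min; the function names, the sample size and the seed are illustrative choices.

```python
import random


def mean_v_over_vmax(n_sources, seed=0):
    """Monte Carlo estimate of <V/V_max> for a homogeneous Euclidean population.

    Sources are standard candles placed uniformly inside a sphere of unit
    radius, which is taken to be the detection limit (C_min = 1 at r = 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sources):
        # Uniform density in a sphere: r^3 is uniform on (0, 1].
        r = (1.0 - rng.random()) ** (1.0 / 3.0)
        counts = r ** -2.0          # inverse-square law, in units of C_min
        total += counts ** -1.5     # V/V_max = (C/C_min)^(-3/2)
    return total / n_sources


def sigma_from_homogeneous(observed, error, expected=0.5):
    """Number of standard deviations between the observed and homogeneous values."""
    return (expected - observed) / error


if __name__ == "__main__":
    print(mean_v_over_vmax(100000, seed=1))
    print(sigma_from_homogeneous(0.328, 0.012))
```

The simulated mean stays close to 0.5, while the BATSE value lies about 14 standard deviations below it, as stated above.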
3. The distance scale

3.1. Redshift measurements

The measurements of redshifts of several GRB optical counterparts provide the best and the only direct distance estimates for GRBs. Unfortunately, these measurements are available only for a few bursts.

3.2. The angular distribution

Even before these redshift measurements there was strong evidence that GRBs originate from cosmological distances. The observed angular distribution is incompatible with a galactic disk distribution unless the sources are at distances less than 100 pc. However, in this case we would
expect that ⟨V/V_max⟩ = 0.5, corresponding to a homogeneous distribution [179], while the observations yield ⟨V/V_max⟩ = 0.33.
A homogeneous angular distribution could be produced if the GRBs originate from the distant parts of the galactic halo. Since the solar system is located at d = 8.5 kpc from the galactic center, such a population will necessarily have a galactic dipole of order d/R, where R is a typical distance to a GRB [181]. The lack of an observed dipole strongly constrains this model. Such a distribution of sources is incompatible with the distribution of dark matter in the halo. The typical distance to the GRBs must be of the order of 100 kpc to comply with this constraint. For example, if one considers an effective distribution that is confined to a shell of a fixed radius then such a shell would have to be at a distance of 100 kpc in order to be compatible with current limits on the dipole [182].

3.3. Interpretation of the peak flux distribution

The counts distribution or the peak flux distribution of the bursts observed by BATSE shows a paucity of weak bursts. A homogeneous count distribution in a Euclidean space should behave like N(C) ∝ C^(−3/2), where N(C) is the number of bursts with more than C counts (or counts per second). The observed distribution is much flatter (see Fig. 10). This fact is reflected by the low ⟨V/V_max⟩ value of the BATSE data: there are fewer distant sources than expected.
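The Euclidean expectation for the cumulative count distribution, a logarithmic slope of −3/2, can be illustrated with a short Python simulation. The sketch assumes identical sources (standard candles) spread uniformly in a sphere and an inverse-square law for the peak counts; the function names, thresholds and sample size are illustrative.

```python
import math
import random


def simulate_peak_counts(n_sources, seed=0):
    """Peak counts of standard candles uniformly distributed in a unit sphere."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sources):
        # r^3 uniform on (0, 1] gives a constant number density in the sphere.
        r = (1.0 - rng.random()) ** (1.0 / 3.0)
        counts.append(r ** -2.0)  # inverse-square law, C = 1 at the edge
    return counts


def cumulative_slope(counts, c_low, c_high):
    """Logarithmic slope of N(>C) measured between two count thresholds."""
    n_low = sum(1 for c in counts if c > c_low)
    n_high = sum(1 for c in counts if c > c_high)
    return math.log(n_high / n_low) / math.log(c_high / c_low)


if __name__ == "__main__":
    sample = simulate_peak_counts(200000, seed=2)
    print(cumulative_slope(sample, 2.0, 8.0))
```

A measured slope that is flatter than −3/2 at the faint end, as in the BATSE data, signals a deficit of weak (distant) bursts relative to this homogeneous Euclidean case.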
The observed distribution is compatible with a cosmological distribution of sources. A homogeneous cosmological distribution displays the observed trend: a paucity of weak bursts relative to the number expected in a Euclidean distribution. In a cosmological population four factors combine to make distant bursts weaker and thereby reduce the rate of weak bursts:

(i) K correction: the observed photons are red-shifted. As the photon number decreases with energy this reduces the count rate of distant bursts for a detector at a fixed energy range.
(ii) The cosmological time dilation causes a decrease (by a factor 1 + z) in the rate of arrival of photons. For a detector, like BATSE, that measures the count rate within a given time window this reduces the detectability of distant bursts.
(iii) The rate of distant bursts also decreases by a factor 1 + z and there are fewer distant bursts per unit of time (even if the rate in the comoving frame does not change).
(iv) Finally, the distant volume element in a cosmological model is different from the corresponding volume element in a Euclidean space.

As could be expected, all these effects are significant only if the typical redshift of the sources is of order unity or larger.

The statistic ⟨V/V_max⟩ is a weighted average of the distribution N(>f). Already in 1992, Piran [56] compared the theoretical estimate of this statistic to the observed one and concluded that the typical redshift of the bursts observed by BATSE is z ~ 1. Later Fenimore et al. [203] compared the sensitivity of PVO (which observes N(>f) ∝ f^(−3/2)) with the sensitivity of BATSE and concluded that z_max(BATSE) ~ 1 (the maximal z from which bursts are detected by BATSE). This corresponds to a peak luminosity of ~10^… erg/s. Other calculations based on different statistical methods were performed by Horack and Emslie [188], Loredo and Wasserman [183,184], Rutledge et al. [187], Cohen and Piran [185], Mészáros and collaborators [189–192] and others. In particular, Loredo and Wasserman [183,184] give an extensive discussion of the statistical methodology involved.

Consider a homogeneous cosmological distribution of sources with a peak luminosity L, that may vary from one source to another. It should be noted that only the luminosity per unit solid angle is accessible by these arguments. If there is significant beaming, as inferred [25], the
distribution of total luminosity may be quite different. The sources emit bursts with a count spectrum $N(\nu)\,d\nu = (L/h\bar\nu)\,\tilde N(\nu)\,d\nu$, where $h\bar\nu$ is the average energy. The observed peak (energy) flux in a fixed energy range $[E_{\min}, E_{\max}]$ from a source at a redshift $z$ is

$$f(L,z) = \frac{(1+z)}{4\pi d_l^2(z)}\,\frac{L}{h\bar\nu}\int_{E_{\min}}^{E_{\max}} \tilde N[\nu(1+z)]\,h\nu(1+z)\,h\,d\nu, \tag{3}$$

where $d_l(z)$ is the luminosity distance [186]. To estimate the number of bursts with a peak flux larger than $f$, $N(>f)$, we need the luminosity function, $\psi(L,z)$: the number of bursts per unit proper (comoving) volume per unit proper time with a given luminosity at a given redshift. Using this function we can write

$$N(>f) = 4\pi \int_0^{\infty}\!\int_0^{z(f,L)} \psi(L,z)\,\frac{d_l^2}{(1+z)^3}\,\frac{dr_p(z)}{dz}\,dz\,dL, \tag{4}$$

where the redshift, $z(f,L)$, is obtained by inverting Eq. (3) and $r_p(z)$ is the proper distance to a redshift $z$. For a given theoretical model and a given luminosity function we can calculate the theoretical distribution $N(>f)$ and compare it with the observed one. A common simple model assumes that $\psi(L,z) = \phi(L)\rho(z)$: the luminosity does not change with time, but the rate of events per unit volume per unit proper time may change. In this case we have

$$N(>f) = 4\pi \int_0^{\infty} \phi(L) \int_0^{z(f,L)} \rho(z)\,\frac{d_l^2}{(1+z)^3}\,\frac{dr_p(z)}{dz}\,dz\,dL. \tag{5}$$

The emitted spectrum, $N(\nu)$, can be estimated from the observed data. The simplest shape is a single power law (Eq. (2)) with $\alpha = 1.5$ or $\alpha = 1.8$ [87]. More elaborate studies have used the Band et al. [86] spectrum or even a distribution of such spectra [187]. The cosmic evolution function $\rho(z)$ and the luminosity function $\phi(L)$ are unknown. To proceed one has to choose a functional shape for these functions and characterize it by a few parameters. Then, using maximum likelihood or some other technique, one estimates these parameters from the data. A simple characterization of $\rho(z)$ is

$$\rho(z) = \rho_0 (1+z)^{-\beta}. \tag{6}$$

Similarly the simplest characterization of the luminosity is as standard candles:

$$\phi(L) = \delta(L - L_0), \tag{7}$$

with a single parameter, $L_0$, or equivalently $z_{\max}$, the maximal $z$ from which the source is detected (obtained by inverting Eq. (3) for $f = f_{\min}$ and $L = L_0$).

There are two unknown cosmological parameters: the closure parameter, $\Omega$, and the cosmological constant $\Lambda$. With the evolution and luminosity functions given by Eqs. (6) and (7) we have three unknown parameters that determine the bursts' distribution: $L_0$, $\rho_0$, $\beta$. We calculate the likelihood function over this five-dimensional parameter space and find the range of acceptable models (those whose likelihood function is not less than 1% of the maximal likelihood). We then perform a KS (Kolmogorov–Smirnov) test to check whether the model with the maximal likelihood is an acceptable fit to the data.
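A minimal numerical sketch of the count integral of Eq. (5) for standard candles may help to see why a cosmological population has a flat peak-flux distribution. It rests on several simplifying assumptions that are not in the text: an Einstein–de Sitter cosmology, bolometric fluxes (the K correction is ignored), $r_p$ taken as the comoving distance, and units with $c/H_0 = 1$ and $\rho_0 = 1$. All function names are hypothetical.

```python
import math


def comoving_distance(z):
    """Comoving distance in an Einstein-de Sitter universe, in units of c/H0."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))


def luminosity_distance(z):
    """Luminosity distance d_l = (1 + z) * comoving distance."""
    return (1.0 + z) * comoving_distance(z)


def counts_integrand(z, beta):
    """rho(z) * d_l^2 / (1+z)^3 * dr/dz, with rho(z) = (1+z)^-beta."""
    dr_dz = (1.0 + z) ** -1.5  # Einstein-de Sitter, units of c/H0
    rho = (1.0 + z) ** -beta
    return rho * luminosity_distance(z) ** 2 / (1.0 + z) ** 3 * dr_dz


def n_brighter_than(z_limit, beta=0.0, steps=2000):
    """N(>f) for standard candles visible out to redshift z_limit."""
    h = z_limit / steps
    total = 0.0
    for i in range(steps):  # midpoint rule
        total += counts_integrand((i + 0.5) * h, beta)
    return 4.0 * math.pi * total * h


def log_slope(z_limit, beta=0.0, eps=1e-3):
    """d ln N / d ln f at the flux of a standard candle seen from z_limit."""
    z_a, z_b = z_limit * (1.0 - eps), z_limit * (1.0 + eps)
    d_ln_n = math.log(n_brighter_than(z_b, beta) / n_brighter_than(z_a, beta))
    # Bolometric flux f = L_0 / (4 pi d_l^2); no K correction (assumption).
    d_ln_f = -2.0 * math.log(luminosity_distance(z_b) / luminosity_distance(z_a))
    return d_ln_n / d_ln_f


print(log_slope(0.001))  # close to the Euclidean value of -1.5
print(log_slope(2.0))    # much flatter than -1.5
```

Nearby sources recover the Euclidean slope, while sources at redshifts of order unity give a much flatter distribution. A likelihood analysis of the kind described above would compare such model curves, with the K correction and detector efficiency included, to the observed counts.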
The likelihood function is practically independent of $\Omega$ in the range $0.1 < \Omega < 1$. It is also insensitive to the cosmological constant $\Lambda$ (in the range $0 < \Lambda < 0.9$, in units of the critical density). This simplifies the analysis as we are left only with the intrinsic parameters of the bursts' luminosity function. There is an interplay between evolution (change in the bursts' rate) and luminosity. Fig. 9 depicts the likelihood function in the $(z_{\max}, \beta)$ plane for sources with a varying intrinsic rate. The banana-shaped contour lines show that a population whose rate is independent of $z$ ($\beta = 0$) is equivalent to a population with an increasing number of bursts with cosmological time ($\beta > 0$) with a lower $L_0$ (lower $z_{\max}$). This tendency saturates at high intrinsic evolution (large $\beta$), for which the limiting $z_{\max}$ does not go below $\approx 0.5$, and at very high $L_0$, for which the limiting $\beta$ does not decrease below $-1.5$. This interplay makes it difficult to constrain the redshift distribution of GRBs using the peak flux distribution alone. For completeness we quote here "typical" results based on standard candles, no evolution and an Einstein–de Sitter cosmology [185]. Recall that $\langle V/V_{\max}\rangle$ of the short bursts distribution is rather close to the homogeneous Euclidean value of 0.5. This means that when analyzing the peak flux distribution one should analyze the long and the short bursts separately [185]. For long bursts (bursts with $t_{90} > 2$ s) the likelihood function peaks at $z_{\max} = 2.1$ (see Fig. 10) [185]. The allowed range at a 1% confidence level is $1.4 < z_{\max} < 3.1$ ($z_{\max} \approx 1.5$ for $\alpha = 2$). The maximal redshift, $z_{\max} = 2.1^{+1.0}_{-0.7}$, corresponds, with an estimated BATSE detection efficiency of $\approx 0.3$, to $\approx 2.3 \times 10^{-6}$ events per galaxy per year (for a galaxy density of $10^{-2} h^3\ \mathrm{Mpc^{-3}}$ [194]). The rate per galaxy is independent of $H_0$ and is only weakly dependent on $\Omega$. For $\Omega = 1$ and $\Lambda = 0$ the typical energy of a burst with an
Fig. 9. The likelihood function (levels 33%, 10%, 3.3%, 1%, etc.) in the $(\beta, L)$ plane for standard candles, $\alpha = 1.5$, $\Omega = 1$, and evolution given by $\rho(z) = (1+z)^{-\beta}$. Superimposed on this map are the luminosities of GRB970508 (solid curve), GRB971214 (dashed curve) and GRB980703 (dashed-dotted curve). We have used $h = 1$. From [204].
Fig. 10. The observed long burst peak flux distribution and three theoretical cosmological distributions with $\Omega = 1$, $\Lambda = 0$, $\alpha = -1.5$, standard candles and no source evolution: $L = 3.4 \times 10^{50}$ erg/s (solid line: best fit), $L = 7.2 \times 10^{50}$ erg/s (dashed line: lower 1% bound), $L = 1.4 \times 10^{50}$ erg/s (dashed-dotted line: upper 1% bound).
observed fluence, $F$, is $\approx 7 \times 10^{50}\,(F/10^{-7}\ \mathrm{erg/cm^2})$ erg. The distance to the sources decreases, and correspondingly the rate increases and the energy decreases, if the spectral index is 2 and not 1.5. These numbers vary slightly if the bursts have a wide luminosity function. Short bursts are detected only up to much nearer distances: $z_{\max}(\mathrm{short}) \approx 0.4$, again assuming standard candles and no source evolution. There is no significant lower limit on $z_{\max}$ for short bursts and their distribution is compatible with a homogeneous non-cosmological one. The estimate of $z_{\max}(\mathrm{short})$ corresponds to a comparable rate of $\approx 6.3 \times 10^{-6}$ events per year per galaxy and a typical energy of $\approx 3 \times 10^{49}\,(F/10^{-7}\ \mathrm{erg/cm^2})$ erg (there is no lower limit on the energy and no upper limit on the rate, since there is no lower limit on $z_{\max}(\mathrm{short})$). The fact that short bursts are detected only at nearer distances is also reflected by the higher $\langle V/V_{\max}\rangle$ of the population of these bursts [80]. Relatively wide luminosity distributions are allowed by the data [185]. For example, the KS test gives a probability of 80% for a double-peaked luminosity distribution with a luminosity ratio of 14. These results demonstrate that the BATSE data alone allow a variability of one order of magnitude in the luminosity. The above considerations should be modified if the rate of GRBs traces the star formation rate (SFR) [195–197]. The SFR has been determined recently by two independent studies [199–201]. The SFR peaks at $z \sim 1.25$. This is a strongly evolving non-monotonic distribution which is drastically different from the power laws considered so far. Sahu et al. [196] find that $\rho(z) \propto \mathrm{SFR}(z)$ yields an $N(>f)$ distribution that is compatible with the observed one (for $q_0 = 0.2$, $H_0 = 50$ km/s/Mpc) for a narrow luminosity distribution with $L_\gamma = 10^{51}$ erg/s. Wijers et al. [197]
find that the implied peak luminosity is higher, $L_\gamma = 8.3 \times 10^{51}$ erg/s, and that it corresponds to a situation in which the dimmest bursts observed by BATSE originate from $z \approx 6$. This result has been questioned by some authors. Specifically, Kommers et al. [198] find that the maximum redshift for a standard-candle model in which the GRB rate traces the SFR, with the BATSE threshold of 0.3 ph/cm$^2$/s in 50–300 keV, is $z \sim 3$. The direct redshift measurement of GRB970508 [14] agrees well with estimates made previously using peak-flux count statistics [203,184,185]. The redshift of GRB971214, $z = 3.418$, and of GRB980703, $z = 0.966$, and the implied luminosities disagree with these estimates. However, if we allow for a wide distribution function, a simple model of a power-law evolution with a wide luminosity function (corresponding to the width inferred from the three known luminosities) still agrees with the peak-flux distribution. Krumholz et al. [202] and Hogg and Fruchter [130] find that with a wide luminosity function both a constant GRB rate and a GRB rate following the star formation rate are consistent with the peak flux distribution and with the observed redshifts of the three GRBs. Future detections of redshifts for other bursts will enable us to estimate directly the luminosity function of GRBs. They will also enable us to determine the evolution of GRBs.

3.4. Time dilation

Norris et al. [205,206] examined 131 long bursts (with a duration longer than 1.5 s) and found that the dimmest bursts are longer by a factor of $\approx 2.3$ compared to the bright ones. With our canonical value of $z_{\max} = 2.1$ the bright bursts originate at $z_{\mathrm{bright}} \approx 0.2$. The corresponding expected ratio due to cosmological time dilation, 2.6, is in agreement with this measurement. Fenimore and Bloom [207] find, on the other hand, that when the decrease of the burst's duration with energy, $\Delta t \propto E^{-0.4}$, is included in the analysis, this time dilation corresponds to $z_{\max} > 6$. This would require a strong negative intrinsic evolution: $\beta \approx -1.5 \pm 0.3$.
Cohen and Piran [193] suggested a way to perform the time dilation, spectral and redshift analysis simultaneously. Unfortunately, current data are insufficient for this purpose.
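The time dilation ratio quoted in this subsection follows from the ratio of the $(1+z)$ stretch factors of the dim and the bright bursts. A small sketch, in which the function name is an illustrative assumption and the two redshifts are the values quoted in the text:

```python
def time_dilation_ratio(z_dim, z_bright):
    """Ratio of the cosmological stretch factors (1 + z) of dim and bright bursts."""
    return (1.0 + z_dim) / (1.0 + z_bright)


# Dim bursts near z = 2.1 and bright bursts near z = 0.2, as in the text.
ratio = time_dilation_ratio(2.1, 0.2)
print(round(ratio, 2))  # 2.58, i.e. the factor of about 2.6 quoted above
```

This simple ratio ignores the energy dependence of the burst duration, which is the effect that Fenimore and Bloom [207] add to the analysis.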
4. The compactness problem and relativistic motion

The key to understanding GRBs lies, I believe, in understanding how GRBs bypass the compactness problem. This problem was realized very early on, in one form by Ruderman [208] and in another by Schmidt [209]. Both used it to argue that GRBs cannot originate from cosmological distances. Now we understand that GRBs are cosmological and that special relativistic effects enable us to overcome this constraint. The simplest way to see the compactness problem is to estimate the average opacity of the high-energy gamma-rays to pair production. Consider a typical burst with an observed fluence, $F$. For a source emitting isotropically at a distance $D$ this fluence corresponds to a total energy release of

$$E = 4\pi D^2 F = 10^{50}\ \mathrm{erg}\,\left(\frac{D}{3000\ \mathrm{Mpc}}\right)^{2}\left(\frac{F}{10^{-7}\ \mathrm{erg/cm^2}}\right). \tag{8}$$
Cosmological effects change this equality by numerical factors of order unity that are not important for our discussion. The rapid temporal variability on a time scale $\delta T \approx 10$ ms implies that the sources are compact, with a size $R_i < c\,\delta T \approx 3000$ km. The observed spectrum (see Section 2.3) contains a large fraction of high-energy $\gamma$-ray photons. These photons (with energy $E_1$) could interact with lower energy photons (with energy $E_2$) and produce electron–positron pairs via $\gamma\gamma \to e^+e^-$ if $\sqrt{E_1 E_2} > m_e c^2$ (up to an angular factor). Denote by $f_p$ the fraction of photon pairs that satisfy this condition. The average optical depth for this process is [210–212]

$$\tau_{\gamma\gamma} = \frac{f_p\,\sigma_T F D^2}{R_i^2\, m_e c^2},$$

or

$$\tau_{\gamma\gamma} = 10^{13} f_p \left(\frac{F}{10^{-7}\ \mathrm{erg/cm^2}}\right)\left(\frac{D}{3000\ \mathrm{Mpc}}\right)^{2}\left(\frac{\delta T}{10\ \mathrm{ms}}\right)^{-2}, \tag{9}$$

where $\sigma_T$ is the Thomson cross-section. This optical depth is very large. Even if there are no pairs to begin with, they will form rapidly and then these pairs will Compton scatter lower energy photons, resulting in a huge optical depth for all photons. However, the observed non-thermal spectrum indicates with certainty that the sources must be optically thin. An alternative calculation is to consider the optical depth of the highest energy photons (say a GeV photon) to pair production with the lower energy photons. The observation of GeV photons shows that they are able to escape freely. In other words, this optical depth must be much smaller than unity [213–215]. This consideration leads to a slightly stronger but comparable limit on the opacity. The compactness problem stems from the assumption that the size of the sources emitting the observed radiation is determined by the observed variability time scale. There would be no problem if the source emitted the energy in another form and it was converted to the observed gamma-rays at a large distance, $R_X$, where the system is optically thin and $\tau_{\gamma\gamma}(R_X) < 1$. A trivial solution of this kind is based on a weakly interacting particle, which is converted in flight to electromagnetic radiation. The only problem with this solution is that there is no known particle that can play this role (see, however, [216]).

4.1. Relativistic motion

Relativistic effects can fool us and, when ignored, lead to wrong conclusions. This happened 30 years ago when rapid variability implied "impossible" temperatures in extragalactic radio sources. This puzzle was resolved when it was suggested [217,218] that these objects reveal ultra-relativistic expansion. This was confirmed later by VLBA measurements of superluminal jets with Lorentz factors of order two to ten. This also happened in the present case.
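The non-relativistic estimates of Eqs. (8) and (9) can be checked numerically. The sketch below evaluates the unnormalized expressions $E = 4\pi D^2 F$ and $\tau_{\gamma\gamma} = f_p \sigma_T F D^2 / (R_i^2 m_e c^2)$ with $R_i = c\,\delta T$, in CGS units. The constant values are rounded standard values and the function names are illustrative assumptions. Because order-unity factors are treated loosely in such estimates, only the order of magnitude of the result is meaningful.

```python
import math

MPC_CM = 3.0857e24     # centimetres per megaparsec
C_CM_S = 2.998e10      # speed of light, cm/s
SIGMA_T = 6.652e-25    # Thomson cross-section, cm^2
ME_C2_ERG = 8.187e-7   # electron rest energy, erg


def isotropic_energy(fluence, distance_mpc):
    """E = 4 pi D^2 F for an isotropic source (fluence in erg/cm^2)."""
    d = distance_mpc * MPC_CM
    return 4.0 * math.pi * d ** 2 * fluence


def pair_optical_depth(fluence, distance_mpc, dt_s, f_p=1.0):
    """tau = f_p sigma_T F D^2 / (R_i^2 m_e c^2), with R_i = c * dT."""
    d = distance_mpc * MPC_CM
    r_i = C_CM_S * dt_s
    return f_p * SIGMA_T * fluence * d ** 2 / (r_i ** 2 * ME_C2_ERG)


energy = isotropic_energy(1e-7, 3000.0)
tau = pair_optical_depth(1e-7, 3000.0, 0.01)
print(f"E   ~ {energy:.1e} erg")  # about 1e50 erg
print(f"tau ~ {tau:.1e}")         # of order 1e13 to 1e14, far above unity
```

Whatever the exact numerical prefactor, the optical depth exceeds unity by more than ten orders of magnitude, which is the essence of the compactness problem.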
Consider a source of radiation that is moving towards an observer at rest with a relativistic velocity characterized by a Lorentz factor $\gamma = 1/\sqrt{1 - v^2/c^2} \gg 1$. Photons with an observed energy $h\nu_{\mathrm{obs}}$ have been blueshifted and their energy at the source was $\approx h\nu_{\mathrm{obs}}/\gamma$. Since the energy at the source is lower, fewer photons have sufficient energy to produce pairs. Now the observed fraction $f_p$ of photons that could produce pairs is not equal to the fraction of photons that could produce pairs at the source. The latter is smaller by a factor $\gamma^{-2\alpha}$ (where $\alpha$ is the high-energy spectral index) than the observed fraction. At the same time, relativistic effects allow the radius from which the radiation is emitted, $R_e < \gamma^2 c\,\delta T$, to be larger than the original estimate, $R_e < c\,\delta T$, by a factor of $\gamma^2$. We have

$$\tau_{\gamma\gamma} = \frac{f_p}{\gamma^{2\alpha}}\,\frac{\sigma_T F D^2}{R_e^2\, m_e c^2},$$

or

$$\tau_{\gamma\gamma} \approx \frac{10^{13}}{\gamma^{4+2\alpha}}\, f_p \left(\frac{F}{10^{-7}\ \mathrm{erg/cm^2}}\right)\left(\frac{D}{3000\ \mathrm{Mpc}}\right)^{2}\left(\frac{\delta T}{10\ \mathrm{ms}}\right)^{-2}, \tag{10}$$

where the relativistic limit on $R_e$ was included in the second line. The compactness problem can be resolved if the source is moving relativistically towards us with a Lorentz factor $\gamma > 10^{13/(4+2\alpha)} \approx 10^2$. A more detailed discussion [213,214] gives comparable limits on $\gamma$. Such extreme relativistic motion exceeds the relativistic motion observed in any other celestial source. Extragalactic superluminal jets, for example, have Lorentz factors of $\sim 10$, while the known galactic relativistic jets [219] have Lorentz factors of $\sim 2$ or less. The potential of relativistic motion to resolve the compactness problem was realized in the eighties by Goodman [220], Paczyński [53] and Krolik and Pier [221]. There was, however, a difference between the first two approaches and the last one. Goodman [220] and Paczyński [53] considered relativistic motion in the dynamical context of fireballs, in which the relativistic motion is an integral part of the dynamics of the system. Krolik and Pier [221] considered, on the other hand, a kinematical solution, in which the source moves relativistically and this motion is not necessarily related to the mechanism that produces the burst. Is a purely kinematic scenario feasible? In this scenario the source moves relativistically as a whole. The radiation is beamed with an opening angle of $\gamma^{-1}$. The total energy emitted in the source frame is smaller by a factor $\gamma^{-3}$ than the isotropic estimate given in Eq. (8). The total energy required, however, is at least $(Mc^2 + 4\pi F D^2/\gamma^3)\gamma$, where $M$ is the rest mass of the source (the energy would be larger by an additional amount $E_{\mathrm{th}}\gamma$ if an internal energy, $E_{\mathrm{th}}$, remains in the source after the burst has been emitted). For most scenarios that one can imagine $Mc^2\gamma \gg (4\pi/\gamma^2) F D^2$. The kinetic energy is then much larger than the observed energy of the burst and the process is extremely wasteful energetically. Generally, the total energy required is so large that the model becomes infeasible.
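The Lorentz factor needed to resolve the compactness problem follows from requiring the relativistic optical depth of Eq. (10) to drop below unity. The sketch below assumes that the non-relativistic optical depth is $10^{13}$ for the fiducial burst parameters with $f_p = 1$; that prefactor and the function name are assumptions made for illustration.

```python
def min_lorentz_factor(tau_nonrel, alpha):
    """Smallest gamma for which tau_nonrel / gamma**(4 + 2*alpha) equals 1."""
    return tau_nonrel ** (1.0 / (4.0 + 2.0 * alpha))


# Assumed non-relativistic optical depth of 1e13 (fiducial burst, f_p = 1).
for alpha in (1.5, 2.0, 2.5):
    print(alpha, round(min_lorentz_factor(1e13, alpha)))
```

For spectral indices between 1.5 and 2.5 the required Lorentz factor lies between roughly 30 and 70, so the bound depends only weakly on the spectrum and is of order $10^2$, as stated above.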
The kinetic energy could be comparable to the observed energy if it also powers the observed burst. This is the most energetically economical situation. It is also the most conceptually economical situation, since in this case the $\gamma$-ray emission and the relativistic motion of the source are related and are not two independent phenomena. This will be the case if GRBs result from the slowing down of ultra-relativistic matter. This idea was suggested by Mészáros and Rees [27,222] in the context of the slowing down of fireball-accelerated material [223] by the ISM, and by Narayan et al. [28] and independently by Rees and Mészáros [29] and Paczyński and Xu [30] in the context of self-interaction and internal shocks within the fireball. It is remarkable that in both cases the introduction of energy conversion was motivated by the need to resolve the "baryonic contamination" problem (which we discuss in the next section). If the fireball contains even a small amount of baryons, all its energy will eventually be converted to kinetic energy of those baryons. A mechanism was needed to convert this energy back to radiation. However, it is clear now that the idea is much more general and that it is an essential part of any GRB model, regardless of the nature of the relativistic energy flow and of the specific way it slows down.
Fig. 11. Radiation from a relativistic beam with a width $\theta$. Each observer will detect radiation only from a very narrow beam with a width $\Gamma^{-1}$. The overall angular size of the observed phenomenon can vary, however, with $\Gamma^{-2} < \theta^2 < 4\pi$.
Assuming that GRBs result from the slowing down of a relativistic bulk motion of massive particles, the rest mass of the ultra-relativistic particles is

$$M = \frac{\theta^2 F D^2}{\gamma\,\epsilon_c\, c^2} \approx 10^{-6} M_\odot\, \epsilon_c^{-1} \left(\frac{\theta^2}{4\pi}\right)\left(\frac{F}{10^{-7}\ \mathrm{erg/cm^2}}\right)\left(\frac{D}{3000\ \mathrm{Mpc}}\right)^{2}\left(\frac{\gamma}{100}\right)^{-1}, \tag{11}$$

where $\epsilon_c$ is the conversion efficiency and $\theta$ is the opening angle of the emitted radiation. We see that the allowed mass is very small. Even though a way was found to convert the kinetic energy of the baryons back to radiation (via relativistic shocks), there is still a "baryonic contamination" problem. Too much baryonic mass will slow down the flow and it will not be relativistic.

4.2. Relativistic beaming?

Radiation from relativistically moving matter is beamed in the direction of the motion to within an angle $\gamma^{-1}$. In spite of this, the radiation produced by relativistically moving matter can spread over a much wider angle. This depends on the geometry of the emitting region. Let $\theta_M$ be the angular size of the relativistically moving matter that emits the burst. The beaming angle $\theta$ will be $\theta_M$ if $\theta_M > \gamma^{-1}$ and $\gamma^{-1}$ otherwise. Thus, if $\theta_M^2 = 4\pi$, that is, if the emitting matter has been accelerated spherically outwards from a central source (as will be the case if the source is a spherical fireball), the burst will be isotropic even though each observer will observe radiation coming only from a very small region (see Fig. 11). The radiation will be beamed into $\gamma^{-1}$ only if the matter has been accelerated along a very narrow beam. The opening angle can also have any intermediate value if the radiation emerges from a beam with an opening angle $\theta > \gamma^{-1}$, as will be the case if the source is an anisotropic fireball [225,226] or an electromagnetic accelerator with a modest beam width. Beaming requires, of course, an event rate larger by a ratio $4\pi/\theta^2$ compared to the observed rate. An observed rate of about $10^{-6}$ bursts per year per galaxy implies one event per hundred years per galaxy if $\theta \approx \gamma^{-1}$ with $\gamma$ given by the compactness limit of $\sim 100$.
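The baryonic mass of Eq. (11) can be evaluated directly as the burst energy within the emission solid angle divided by $\gamma \epsilon c^2$. The sketch below is an order-of-magnitude check in CGS units. The constant values are rounded standard values, the function name and default arguments are illustrative assumptions, and `theta_sq` stands for the solid angle $\theta^2$ of the emitted radiation ($4\pi$ for isotropic emission).

```python
import math

MPC_CM = 3.0857e24   # centimetres per megaparsec
C_CM_S = 2.998e10    # speed of light, cm/s
M_SUN_G = 1.989e33   # solar mass, g


def baryon_rest_mass(fluence, distance_mpc, gamma,
                     efficiency=1.0, theta_sq=4.0 * math.pi):
    """Rest mass (g) whose kinetic energy at Lorentz factor gamma powers the burst."""
    d = distance_mpc * MPC_CM
    burst_energy = theta_sq * fluence * d ** 2  # energy within the solid angle
    return burst_energy / (gamma * efficiency * C_CM_S ** 2)


mass = baryon_rest_mass(1e-7, 3000.0, 100.0)
print(f"M ~ {mass / M_SUN_G:.1e} solar masses")  # of order 1e-6
```

Lower conversion efficiency raises the required mass, while narrower beaming or a larger Lorentz factor lowers it, which is why only a very small baryonic load is allowed.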
5. An overview of the generic model

It is worthwhile to summarize now the essential features of the generic GRB model that arose from the previous discussion. Compactness has led us to the requirement of relativistic motion, with a Lorentz factor $\gamma \geq 100$. Ockham's razor and the desire to limit the total energy have led us to the idea that the observed gamma-rays arise in the process of slowing down of a relativistic energy flow, at a stage when the motion of the emitting particles is still highly relativistic. This leads us to the generic picture mentioned earlier and to the suggestion that GRBs are a three-stage phenomenon: (i) a compact inner hidden "engine" that produces a relativistic energy flow, (ii) the energy transport stage and (iii) the conversion of this energy to the observed prompt radiation. One may add a fourth stage: (iv) conversion of the remaining energy to radiation in other wavelengths and on a longer time scale, the "afterglow".

5.1. Models for the energy flow

The simplest mode of relativistic energy flow is in the form of kinetic energy of relativistic particles. A variant that has been suggested is based on the possibility that a fraction of the energy is carried by Poynting flux [70,227–230], although in all models the power must be converted to kinetic energy somewhere. An energy flow of $\sim 10^{50}$ erg/s from a compact object whose size is $\lesssim 10^{6}$ cm requires a magnetic field of $10^{15}$ G or higher at the source. This large value might be reached in stellar collapses of highly magnetized stars or amplified magnetohydrodynamically from smaller fields [70]. Overall, the different models can be characterized by two parameters: the ratio of the kinetic energy flux to the Poynting flux, and the location of the energy conversion stage ($\approx 10^{12}$ cm for internal conversion or $\approx 10^{16}$ cm for external conversion). This is summarized in Table 2. In the following section we will focus on the simplest possibility, that of a kinetic energy flux.

5.2. Models for the energy conversion

Within the baryonic model the energy transport is in the form of the kinetic energy of a shell of relativistic particles with a width $\Delta$.
The kinetic energy is converted to "thermal" energy of relativistic particles via shocks. These particles then release this energy and produce the observed radiation. There are two modes of energy conversion: (i) external shocks, which are due to interaction with an external medium like the ISM, and (ii) internal shocks, which arise within the flow when fast moving particles catch up with slower ones. A similar division into external and internal energy conversion occurs within other models for the energy flow.
Table 2
General scheme for energy transport

                                    Internal conversion    External conversion
Kinetic energy dominated            [28,29,18]             [27,222]
Kinetic energy and Poynting flux    [227]                  –
Poynting flux dominated             [228,70,230]           [229,70]
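Table 3
Critical radii

Symbol         Meaning                            Expression                                            Typical value
$R_i$          Initial radius                     $c\,\delta t$                                         $\approx 10^{7}$–$10^{8}$ cm
$R_\eta$       Matter dominates                   $R_i \eta$                                            $\approx 10^{9}$ cm
$R_{pair}$     Optically thin to pairs            $[(3E/4\pi R_i^3 a)^{1/4}/T_p]\,R_i$                  $\approx 10^{10}$ cm
$R_e$          Optically thin                     $(\sigma_T E/4\pi m_p c^2 \eta)^{1/2}$                $\approx 10^{13}$ cm
$R_\delta$     Internal collisions                $\delta\gamma^2$                                      $\approx 10^{12}$–$10^{14}$ cm
$R_\gamma$     External Newtonian shocks          $l\gamma^{-2/3}$                                      $\approx 10^{16}$ cm
$R_\Delta$     External relativistic shocks       $l^{3/4}\Delta^{1/4}$                                 $\approx 10^{16}$ cm
$l$ or $L$     Non-relativistic external shock    $l$ (a) or $l\gamma^{-1/3}$ (b)                       $\approx 10^{17}$–$10^{18}$ cm
$l$            Sedov length                       $l = (3E/4\pi n_{\mathrm{ism}} m_p c^2)^{1/3}$        $\approx 10^{18}$ cm

(a) Adiabatic fireball; (b) radiative fireball.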
Fig. 12. Fireball evolution from its initial formation at rest to the final Newtonian Sedov solution. The energy extraction is due to the interaction with the ISM via a relativistic forward shock and a Newtonian reverse shock. We have used for this calculation $\xi = 43$, $E_0 = 10^{52}$ erg, $\gamma_0 = 50$, $R_0 = 3 \times 10^{10}$ cm. Shown are the average value of the Lorentz factor (thick solid line), the value at the forward shock (thin solid line), the maximal value (dotted line) and an analytic estimate (dashed-dotted line). From [231].
External shocks arise from the interaction of the shell with external matter. The typical length scale is the Sedov length, $l \equiv (E/n_{\mathrm{ism}} m_p c^2)^{1/3}$. The rest mass energy within a sphere of radius $l$ equals the energy of the shell. Typically $l \sim 10^{18}$ cm. As we see later (see Section 8.7.1), relativistic external shocks (with a Newtonian reverse shock) convert a significant fraction of their kinetic energy at $R_\gamma = l/\gamma^{2/3} \approx 10^{15}$–$10^{16}$ cm, where the external mass encountered equals $\gamma^{-1}$ of the
Fig. 13. Fireball evolution from its initial formation at rest to the final Newtonian Sedov solution. The energy extraction is due to the interaction with the ISM via relativistic forward and reverse shocks. The parameters for this computation are $\xi = 0.1$, $E_0 = 10^{52}$ erg, $\gamma_0 = 10^{4}$, $R_0 = 4.3 \times 10^{9}$ cm. Shown are the average value of the Lorentz factor (thick solid line), the value at the forward shock (thin solid line), the maximal value (dotted line) and an analytic estimate (dashed-dotted line). From [231].
shell's mass. Relativistic shocks (with a relativistic reverse shock) convert their energy at $R_\Delta = l^{3/4}\Delta^{1/4} \approx 10^{16}$ cm, where the shock crosses the shell. Internal shocks occur when one shell overtakes another. If the initial separation between the shells is $\delta$ and both move with a Lorentz factor $\gamma$, with a difference of order $\gamma$, these shocks take place at $\delta\gamma^2$. A typical value is $10^{12}$–$10^{14}$ cm.

5.3. Typical radii

In Table 3 we list the different radii that arise in the fireball evolution. Figs. 12 and 13 (from [231]) depict a numerical solution of a fireball from its initial configuration at rest to its final Sedov phase.
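The length scales of this section are simple power laws and can be tabulated with a few lines of code. The sketch below uses the Sedov length in the form $l = (E/n\,m_p c^2)^{1/3}$, the external-shock radii $l\gamma^{-2/3}$ and $l^{3/4}\Delta^{1/4}$, and the internal-shock radius $\delta\gamma^2$. The parameter values ($E = 10^{52}$ erg, $n = 1\ \mathrm{cm^{-3}}$, $\gamma = 100$, $\Delta = \delta = 10^{8}$ cm) and the function names are assumptions chosen for illustration.

```python
M_P_C2_ERG = 1.503e-3  # proton rest energy, erg


def sedov_length(energy_erg, n_cm3=1.0):
    """Radius at which the swept-up rest mass energy equals the shell energy."""
    return (energy_erg / (n_cm3 * M_P_C2_ERG)) ** (1.0 / 3.0)


def radius_newtonian_reverse_shock(l_cm, gamma):
    """External shock radius with a Newtonian reverse shock: l * gamma**(-2/3)."""
    return l_cm * gamma ** (-2.0 / 3.0)


def radius_relativistic_reverse_shock(l_cm, delta_cm):
    """External shock radius with a relativistic reverse shock: l**(3/4) * Delta**(1/4)."""
    return l_cm ** 0.75 * delta_cm ** 0.25


def radius_internal_shocks(separation_cm, gamma):
    """Radius at which shells separated by delta collide: delta * gamma**2."""
    return separation_cm * gamma ** 2


l = sedov_length(1e52)
print(f"Sedov length           ~ {l:.1e} cm")
print(f"Newtonian reverse      ~ {radius_newtonian_reverse_shock(l, 100.0):.1e} cm")
print(f"Relativistic reverse   ~ {radius_relativistic_reverse_shock(l, 1e8):.1e} cm")
print(f"Internal shocks        ~ {radius_internal_shocks(1e8, 100.0):.1e} cm")
```

The outputs are sensitive to the assumed Lorentz factor and shell width, so they should be read as order-of-magnitude guides to the ordering of the radii rather than as precise values.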
6. Fireballs

Before turning to the question of how the kinetic energy of the relativistic flow is converted to radiation, we ask whether it is possible to produce the needed flows. More specifically, is it possible to accelerate particles to relativistic velocities? It is remarkable that a relativistic particle flow is almost the unavoidable outcome of a "fireball": a large concentration of energy (radiation) in a small region of space in which there are relatively few baryons. The relativistic fireball model was
proposed by Goodman [220] and by Paczyński [53]. They showed that the sudden release of a large quantity of gamma-ray photons into a compact region can lead to an opaque photon–lepton "fireball" through the production of electron–positron pairs. The term "fireball" refers here to an opaque radiation-plasma whose initial energy is significantly greater than its rest mass. Goodman [220] considered the sudden release of a large amount of energy, $E$, in a small volume, characterized by a radius, $R_i$. Such a situation could occur in an explosion. Paczyński [53] considered a steady radiation and electron–positron plasma wind that emerges from a compact region of size $R_i$ with an energy, $E$, released on a time scale significantly larger than $R_i/c$. Such a situation could occur if there is a continuous source that operates for a while. As will become clear later, both configurations display the same qualitative behavior, in spite of the seemingly large difference between them. Both Goodman [220] and Paczyński [53] considered pure radiation fireballs in which there are no baryons. Later Shemi and Piran [223] and Paczyński [224] considered the effect of a baryonic load. They showed that under most circumstances the ultimate outcome of a fireball with a baryonic load will be the transfer of all the energy of the fireball to the kinetic energy of the baryons. If the baryonic load is sufficiently small the baryons will be accelerated to a relativistic velocity with $\gamma \approx E/Mc^2$. If it is large the net result will be a Newtonian flow with $v \approx \sqrt{2E/M}$.

6.1. A simple model

The evolution of a homogeneous fireball can be understood by a simple analogy to the Early Universe [223]. Consider, first, a pure radiation fireball. If the initial temperature is high enough, pairs will form. Because of the opacity due to pairs, the radiation cannot escape. The pairs-radiation plasma behaves like a perfect fluid with an equation of state $p = \rho/3$. The fluid expands under its own pressure.
As it expands it cools with $T \propto R^{-1}$ ($T$ being the local temperature and $R$ the radius). The system resembles quite well a part of a Milne Universe in which gravity is ignored. As the temperature drops below the pair-production threshold the pairs annihilate. When the local temperature is around 20 keV the number of pairs becomes sufficiently small, the plasma becomes transparent and the photons escape freely to infinity. In the meantime the fireball has been accelerated and it is expanding relativistically outwards. Energy conservation (as viewed from the observer frame) requires that the Lorentz factor that corresponds to this outward motion satisfies $\gamma \propto R$. The escaping photons, whose local energy (relative to the fireball's rest frame) is $\approx 20$ keV, are blueshifted. An observer at rest detects them with a temperature of $T_{\mathrm{obs}} \propto \gamma T$. Since $T \propto R^{-1}$ and $\gamma \propto R$ we find that the observed temperature, $T_{\mathrm{obs}}$, approximately equals $T_0$, the initial temperature. The observed spectrum is, however, almost thermal [220] and it is still far from the one observed in GRBs. In addition to radiation and $e^+e^-$ pairs, astrophysical fireballs may also include some baryonic matter, which may be injected with the original radiation or may be present in an atmosphere surrounding the initial explosion. These baryons can influence the fireball evolution in two ways. The electrons associated with this matter increase the opacity, delaying the escape of radiation. Initially, when the local temperature $T$ is large, the opacity is dominated by $e^+e^-$ pairs [220]. This opacity, $\tau_p$, decreases exponentially with decreasing temperature, and falls to unity when $T = T_p \approx 20$ keV. The matter opacity, $\tau_b$, on the other hand decreases only as $R^{-2}$, where $R$ is the
radius of the fireball. If at the point where $\tau_p = 1$, $\tau_b$ is still $> 1$, then the final transition to $\tau = 1$ is delayed and occurs at a cooler temperature. More importantly, the baryons are accelerated with the rest of the fireball and convert part of the radiation energy into bulk kinetic energy. The expanding fireball has two basic phases: a radiation-dominated phase and a matter-dominated phase. Initially, during the radiation-dominated phase, the fluid accelerates with $\gamma \propto R$. The fireball is roughly homogeneous in its local rest frame but due to the Lorentz contraction its width in the observer frame is $\Delta \approx R_i$, the initial size of the fireball. A transition to the matter-dominated phase takes place when the fireball has a size

$$R_\eta = \frac{R_i E}{M c^2} \approx 10^{9}\ \mathrm{cm}\; R_{i7}\, E_{52}\, (M/5 \times 10^{-5} M_\odot)^{-1} \tag{12}$$

and the mean Lorentz factor of the fireball is $\gamma \approx E/Mc^2$. We have defined here $E_{52} \equiv E/10^{52}$ erg and $R_{i7} \equiv R_i/10^{7}$ cm. After that, all the energy is in the kinetic energy of the matter, and the matter coasts asymptotically with a constant Lorentz factor. The matter-dominated phase is itself further divided into two sub-phases. At first, there is a frozen-coasting phase in which the fireball expands as a shell of fixed radial width in its own local frame, with a width $\sim \gamma R_i \sim (E/Mc^2) R_i$. Because of Lorentz contraction the shell appears to an observer with a width $\Delta \approx R_i$. Eventually, when the size of the fireball reaches $R_s = \Delta\gamma^2 \approx 10^{11}\ \mathrm{cm}\,(\Delta/10^{7}\ \mathrm{cm})(\gamma/100)^2$, variability in $\gamma$ within the fireball results in a spreading of the fireball, which enters the coasting-expanding phase. In this final phase, the width of the shell grows linearly with the size of the shell, $R$:

$$\Delta(R) \approx R/\gamma^2 \approx 10^{7}\ \mathrm{cm} \left(\frac{R}{10^{11}\ \mathrm{cm}}\right)\left(\frac{\gamma}{100}\right)^{-2} \quad \text{for } R > R_s = 10^{11}\ \mathrm{cm} \left(\frac{R_i}{10^{7}\ \mathrm{cm}}\right)\left(\frac{\gamma}{100}\right)^{2}. \tag{13}$$
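The transition radii of Eqs. (12) and (13) can be evaluated with a short sketch. It assumes CGS units and rounded constants, and it models the observer-frame shell width as the larger of the frozen width $\Delta$ and the spreading width $R/\gamma^2$, which is one simple way to join the frozen-coasting and coasting-expanding phases. The function names are illustrative.

```python
C_CM_S = 2.998e10    # speed of light, cm/s
M_SUN_G = 1.989e33   # solar mass, g


def energy_to_mass_ratio(energy_erg, mass_g):
    """eta = E / (M c^2), which is also the final Lorentz factor when eta > 1."""
    return energy_erg / (mass_g * C_CM_S ** 2)


def matter_domination_radius(r_i_cm, energy_erg, mass_g):
    """R_eta = R_i * E / (M c^2), the radius of Eq. (12)."""
    return r_i_cm * energy_to_mass_ratio(energy_erg, mass_g)


def spreading_radius(delta_cm, gamma):
    """R_s = Delta * gamma^2, where the shell starts to spread."""
    return delta_cm * gamma ** 2


def shell_width(radius_cm, delta_cm, gamma):
    """Observer-frame width: frozen at Delta, then R / gamma^2 beyond R_s."""
    return max(delta_cm, radius_cm / gamma ** 2)


mass = 5e-5 * M_SUN_G
print(f"eta   ~ {energy_to_mass_ratio(1e52, mass):.0f}")
print(f"R_eta ~ {matter_domination_radius(1e7, 1e52, mass):.1e} cm")
print(f"R_s   ~ {spreading_radius(1e7, 100.0):.1e} cm")
print(f"width at 1e13 cm ~ {shell_width(1e13, 1e7, 100.0):.1e} cm")
```

For the fiducial values the fireball becomes matter dominated near $10^{9}$ cm with a Lorentz factor of about 100, and the shell begins to spread near $10^{11}$ cm.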
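The initial energy to mass ratio, $\eta = E/Mc^2$, determines the order of these transitions. There are two critical values for $\eta$ [223]:

$$\eta_{\mathrm{pair}} = \left(\frac{3\sigma_T^2 E\,\sigma T_p^4}{4\pi m_p^2 c^4 R_i}\right)^{1/2} \approx 3 \times 10^{10}\, E_{52}^{1/2} R_{i7}^{-1/2} \tag{14}$$

and

$$\eta_b = \left(\frac{3\sigma_T E}{8\pi m_p c^2 R_i^2}\right)^{1/3} \approx 10^{5}\, E_{52}^{1/3} R_{i7}^{-2/3}. \tag{15}$$

These correspond to four different types of fireballs (see Table 4):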
1. A pure radiation fireball ($\eta_{pair} < \eta$): The effect of the baryons is negligible and the evolution is that of a pure photon-lepton fireball. When the temperature reaches $T_p$, the pair opacity $\tau_p$ drops to 1 and $\tau_b \ll 1$. At this point the fireball is radiation-dominated ($E > Mc^2$) and most of the energy escapes as radiation.
2. Electron-dominated opacity ($\eta_b < \eta < \eta_{pair}$): In the late stages, the opacity is dominated by free electrons associated with the baryons. The comoving temperature decreases far below $T_p$ before $\tau$ reaches unity. However, the fireball continues to be radiation-dominated and most of the energy still escapes as radiation.
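Table 4
Different fireballs
Type                    $\eta = E/Mc^2$                 M
Pure radiation          $\eta_{pair} < \eta$            $M < M_{pair} = 10^{-12} M_\odot E_{52}^{1/2} R_{i7}^{1/2}$
Electrons opacity       $\eta_b < \eta < \eta_{pair}$   $M_{pair} < M < M_b = 2\times10^{-7} M_\odot E_{52}^{2/3} R_{i7}^{2/3}$
Relativistic baryons    $1 < \eta < \eta_b$             $M_b < M < 5\times10^{-3} M_\odot E_{52}$
Newtonian               $\eta < 1$                      $5\times10^{-3} M_\odot E_{52} < M$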
3. Relativistic baryonic fireball ($1 < \eta < \eta_b$): The fireball becomes matter-dominated before it becomes optically thin. Most of the initial energy is converted into bulk kinetic energy of the baryons, with a final Lorentz factor $\gamma_f \approx (E/Mc^2)$. This is the most interesting situation for GRBs.
4. Newtonian fireball ($\eta < 1$): This is the Newtonian regime. The rest energy exceeds the radiation energy and the expansion never becomes relativistic. This is the situation, for example, in supernova explosions in which the energy is deposited into a massive envelope.
6.2. Extreme-relativistic scaling laws
The above summary describes the qualitative features of a roughly homogeneous expanding fireball. Surprisingly similar scaling laws exist also for inhomogeneous fireballs [232] as well as for relativistic winds [53]. Consider a spherical fireball with an arbitrary radial distribution of radiation and matter. Under optically thick conditions, the radiation and the relativistic leptons (with energy density e) and the matter (with baryon mass density $\rho$) at each radius behave like a single fluid, moving with the same velocity. The pressure, p, and the energy density, e, are related by $p = e/3$. We can express the relativistic conservation equations of baryon number, energy and momentum using characteristic coordinates: r and $s \equiv t - r$ as [232]
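$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \rho u\right) = -\frac{\partial}{\partial s}\left(\frac{\rho}{\gamma+u}\right), \tag{16}$$
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 e^{3/4} u\right) = -\frac{\partial}{\partial s}\left(\frac{e^{3/4}}{\gamma+u}\right), \tag{17}$$
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left[r^2\left(\rho+\frac{4}{3}e\right)u^2\right] = -\frac{\partial}{\partial s}\left[\left(\rho+\frac{4}{3}e\right)\frac{u}{\gamma+u}\right] + \frac{1}{3}\left[\frac{\partial e}{\partial s} - \frac{\partial e}{\partial r}\right], \tag{18}$$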
where $u = u^r = \sqrt{\gamma^2 - 1}$, and we use units in which $c = 1$ and the mass of the particles $m = 1$. The derivative $\partial/\partial r$ now refers to constant s, i.e. is calculated along a characteristic moving outward at the speed of light. After a short acceleration phase we expect that the motion of a fluid shell will become highly relativistic ($\gamma \gg 1$). If we restrict our attention to the evolution of the fireball from this point on, we may treat $\gamma^{-1}$ as a small parameter and set $\gamma \approx u$, which is accurate to order $O(\gamma^{-2})$. Then, under a wide range of conditions the quantities on the right-hand sides of Eqs. (16)-(18) are
significantly smaller than those on the left. When we neglect the right-hand sides of Eqs. (16)-(18) the problem becomes effectively only r dependent. We obtain the following conservation laws for each fluid shell:
$$r^2\rho\gamma = {\rm const.}, \quad r^2 e^{3/4}\gamma = {\rm const.}, \quad r^2\left(\rho + \tfrac{4}{3}e\right)\gamma^2 = {\rm const.} \tag{19}$$
A scaling solution that is valid in both the radiation-dominated and matter-dominated regimes, as well as in the transition zone in between, can be obtained by combining the conserved quantities in Eq. (19) appropriately. Let $t_0$ be the time and $r_0$ be the radius at which a fluid shell in the fireball first becomes ultra-relativistic, with $\gamma \gtrsim$ few. We label various properties of the shell at this time by a subscript 0, e.g. $\gamma_0$, $\rho_0$, and $e_0$. Defining the auxiliary quantity D, where
$$\frac{1}{D} \equiv \frac{\gamma_0}{\gamma} + \frac{3\gamma_0\rho_0}{4 e_0 \gamma} - \frac{3\rho_0}{4 e_0}, \tag{20}$$
we find that
$$r = r_0\,\frac{\gamma_0^{1/2} D^{3/2}}{\gamma^{1/2}}, \quad \rho = \frac{\rho_0}{D^3}, \quad e = \frac{e_0}{D^4}. \tag{21}$$
These are parametric relations which give r, $\rho$, and e of each fluid shell at any time in terms of the $\gamma$ of the shell at that time. The parametric solution (21) describes both the radiation-dominated and matter-dominated phases of the fireball within the frozen pulse approximation, that is, as long as the fireball does not spread due to variation in the velocity.
6.3. The radiation-dominated phase
Initially, the fireball is radiation-dominated. For $\gamma \ll (e_0/\rho_0)\gamma_0$, the first term in Eq. (20) dominates and we find $D \propto r$, $\gamma \propto r$, recovering the radiation-dominated scaling:
$$\gamma \propto r, \quad \rho \propto r^{-3}, \quad e \propto r^{-4}. \tag{22}$$
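As a consistency check on the parametric solution, the following Python sketch evaluates Eqs. (20) and (21) for a single shell and confirms that the three conserved quantities of Eq. (19) stay constant as $\gamma$ increases. The initial values of $\gamma_0$, $\rho_0$, $e_0$ and $r_0$ are arbitrary illustrative assumptions (units with $c = 1$).

```python
# Sketch: check that the parametric solution (Eqs. (20)-(21)) conserves
# the three quantities of Eq. (19). Initial values are arbitrary assumptions.

def shell_state(gamma, gamma0, rho0, e0, r0):
    """Return (r, rho, e) of a fluid shell whose Lorentz factor is gamma."""
    inv_d = gamma0 / gamma + 3 * gamma0 * rho0 / (4 * e0 * gamma) - 3 * rho0 / (4 * e0)
    d = 1.0 / inv_d
    r = r0 * gamma0**0.5 * d**1.5 / gamma**0.5
    return r, rho0 / d**3, e0 / d**4

def invariants(gamma, r, rho, e):
    """The three conserved combinations of Eq. (19)."""
    return (r**2 * rho * gamma,
            r**2 * e**0.75 * gamma,
            r**2 * (rho + 4 * e / 3) * gamma**2)

gamma0, rho0, e0, r0 = 2.0, 1.0, 1000.0, 1.0   # radiation-dominated start (assumed)
reference = invariants(gamma0, *shell_state(gamma0, gamma0, rho0, e0, r0))

for gamma in (2.0, 20.0, 200.0, 2000.0):
    r, rho, e = shell_state(gamma, gamma0, rho0, e0, r0)
    ratios = [q / q0 for q, q0 in zip(invariants(gamma, r, rho, e), reference)]
    print(f"gamma = {gamma:7.1f}  r = {r:10.3e}  invariant ratios = {ratios}")
```

The printed ratios should all equal one up to rounding, and at small $\gamma$ the radius grows in proportion to $\gamma$, as in Eq. (22).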
The scalings of $\rho$ and e given in Eq. (22) correspond to those of a fluid expanding uniformly in the comoving frame. Indeed, all three scalings in Eq. (22) can be derived for a homogeneous radiation-dominated fireball by noting the analogy with an expanding universe. Although the fluid is approximately homogeneous in its own frame, because of Lorentz contraction it appears as a narrow shell in the observer frame, with a radial width given by $\Delta r \sim r/\gamma \sim$ constant $\sim R_i$, the initial radius of the fireball, or the initial width of the specific shell under discussion when we consider a continuous wind. We can now go back to Eqs. (16)-(18) and set $\partial/\partial s \sim \gamma/r$. We then find that the terms we neglected on the right-hand sides of these equations are smaller than the terms on the left by a factor $\sim 1/\gamma$. Therefore, the conservation laws (19) and the scalings (22) are valid so long as the radiation-dominated fireball expands ultra-relativistically with large $\gamma$. The only possible exception is in the very outermost layers of the fireball where the pressure gradient may be extremely steep and $\partial/\partial s$ may be $\gg \gamma/r$. Ignoring this minor deviation, we interpret Eq. (19) and the constancy of the radial width $\Delta r$ in the observer frame to mean that the fireball behaves like a pulse of energy with a frozen radial profile, accelerating outward at almost the speed of light.
6.4. The matter-dominated phase
The radiation-dominated regime extends out to a radius $r \sim (e_0/\rho_0) r_0$. At larger radii, the first and last terms in Eq. (20) become comparable and $\gamma$ tends to its asymptotic value of $\gamma_f = (4e_0/3\rho_0 + 1)\gamma_0$. This is the matter-dominated regime. (The transition occurs when $4e/3 = \rho$, which happens when $\gamma = \gamma_f/2$.) In this regime, $D \propto r^{2/3}$, leading to the scalings:
$$\gamma \to {\rm constant}, \quad \rho \propto r^{-2}, \quad e \propto r^{-8/3}. \tag{23}$$
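The statements about the transition can be verified directly from Eqs. (20) and (21). The Python sketch below computes the asymptotic Lorentz factor $\gamma_f$ and checks that $4e/3 = \rho$ when $\gamma = \gamma_f/2$. The initial values are illustrative assumptions (units with $c = 1$).

```python
# Sketch: locate the radiation-to-matter transition of a single shell.
# gamma0, rho0 and e0 are assumed initial values, not values from the text.

def asymptotic_gamma(gamma0, rho0, e0):
    """Coasting Lorentz factor gamma_f = (4 e0 / (3 rho0) + 1) * gamma0."""
    return (4 * e0 / (3 * rho0) + 1) * gamma0

def densities(gamma, gamma0, rho0, e0):
    """Matter and radiation densities (rho, e) from Eqs. (20)-(21)."""
    inv_d = gamma0 / gamma + 3 * gamma0 * rho0 / (4 * e0 * gamma) - 3 * rho0 / (4 * e0)
    return rho0 * inv_d**3, e0 * inv_d**4

gamma0, rho0, e0 = 2.0, 1.0, 300.0
gamma_f = asymptotic_gamma(gamma0, rho0, e0)
rho, e = densities(gamma_f / 2, gamma0, rho0, e0)
print(f"gamma_f = {gamma_f:.1f}")
print(f"at gamma_f / 2: (4e/3) / rho = {4 * e / (3 * rho):.6f}")
```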
The modified scalings of $\rho$ and e arise because the fireball now moves with a constant radial width in the comoving frame. (The steeper fall-off of e with r is because of the work done by the radiation through tangential expansion.) Moreover, since $e \ll \rho$, the radiation has no important dynamical effect on the motion and produces no significant radial acceleration. Therefore, $\gamma$ remains constant on streamlines and the fluid coasts with a constant asymptotic radial velocity. Of course, since each shell moves with a velocity that is slightly less than c and that is different from one shell to the next, the frozen pulse approximation on which Eq. (19) is based must ultimately break down at some large radius.
6.5. Spreading
At very late times in the matter-dominated phase the frozen pulse approximation begins to break down. In this stage the radiation density e is much smaller than the matter density $\rho$, and the Lorentz factor, $\gamma$, tends to a constant value $\gamma_f$ for each shell. We may therefore neglect the term $-(1/3)(\partial e/\partial r)$ in Eq. (18) and treat $\gamma$ and u in Eqs. (16)-(18) as constants. We then find that the flow moves strictly along the characteristic, $\beta_f t - r =$ constant, so that each fluid shell coasts at a constant radial speed, $\beta_f = u_f/\gamma_f$. We label the baryonic shells in the fireball by a Lagrangian coordinate $\tilde R$, moving with a fixed Lorentz factor $\gamma_f(\tilde R)$, and let $t_c$ and $r_c$ represent the time and radius at which the coasting phase begins, which corresponds essentially to the point at which the fluid makes the transition from being radiation-dominated to matter-dominated. We then find
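$$r(t,\tilde R) - r_c(\tilde R) = \frac{\sqrt{\gamma_f^2(\tilde R) - 1}}{\gamma_f(\tilde R)}\,\bigl(t - t_c(\tilde R)\bigr) \approx \left(1 - \frac{1}{2\gamma_f^2(\tilde R)}\right)\bigl[t - t_c(\tilde R)\bigr]. \tag{24}$$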
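The separation between two neighboring shells separated by a Lagrangian distance $\Delta\tilde R$ varies during the coasting phase as
$$\frac{d(\partial r/\partial \tilde R)}{dt}\,\Delta\tilde R = \frac{1}{\gamma_f^3(\tilde R)}\frac{\partial \gamma_f}{\partial \tilde R}\,\Delta\tilde R. \tag{25}$$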
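Thus the width of the pulse at time t is $\Delta r(t) \approx \Delta r_c + \Delta\gamma_f (t - t_c)/\bar\gamma_f^3 \approx R_i + (t - t_c)/\bar\gamma_f^2$, where $\Delta r_c \sim R_i$ is the width of the fireball when it begins coasting, $\bar\gamma_f$ is the mean $\gamma_f$ in the pulse, and $\Delta\gamma_f \sim \bar\gamma_f$ is the spread of $\gamma_f$ across the pulse. From this result we see that within the matter-dominated coasting phase there are two separate regimes. So long as $t - t_c < \bar\gamma_f^2 R_i$, we have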
a frozen-coasting phase in which $\Delta r$ is approximately constant and the frozen pulse approximation is valid. In this regime the scalings in Eq. (23) are satisfied. However, when $t - t_c > \bar\gamma_f^2 R_i$, the fireball switches to an expanding-coasting phase where $\Delta r \propto t - t_c$ and the pulse width grows linearly with time. In this regime, the scaling of $\rho$ reverts to $\rho \propto r^{-3}$, and, if the radiation is still coupled to the matter, $e \propto r^{-4}$.
6.6. Optical depth
Independently of the above considerations, at some point during the expansion, the fireball will become optically thin. For a pure fireball this happens when the local temperature drops to about 20 keV at
$$R_p \approx 10^{10}\,{\rm cm}\; E_{52}^{1/4} R_{i7}^{-1/4}. \tag{26}$$
In a matter-dominated fireball the optical depth is usually determined by the ambient electrons. In this case the fireball becomes optically thin at
$$R_e = \left(\frac{\sigma_T E}{4\pi m_p c^2 \eta}\right)^{1/2} \approx 6\times10^{13}\,{\rm cm}\; E_{52}^{1/2}\,(\eta/100)^{-1/2}. \tag{27}$$
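The numerical coefficient in Eq. (27) can be reproduced with a few lines of Python. The sketch below uses rounded CGS constants; the function name and the sample values of $\eta$ are illustrative assumptions.

```python
import math

# Rounded CGS constants.
SIGMA_T = 6.652e-25   # Thomson cross section, cm^2
M_P = 1.673e-24       # proton mass, g
C = 2.998e10          # speed of light, cm/s

def thinning_radius(energy_erg, eta):
    """R_e = sqrt(sigma_T E / (4 pi m_p c^2 eta)), Eq. (27)."""
    return math.sqrt(SIGMA_T * energy_erg / (4 * math.pi * M_P * C**2 * eta))

for eta in (100.0, 1000.0):
    print(f"eta = {eta:6.0f}: R_e = {thinning_radius(1e52, eta):.2e} cm")
```

For $E = 10^{52}$ erg and $\eta = 100$ this gives a radius close to $6\times10^{13}$ cm, and the result scales as $\eta^{-1/2}$.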
From this stage on the radiation and the baryons no longer move with the same velocity and the radiation pressure vanishes, leading to a breakdown of Eqs. (16)-(18). Any remaining radiation will escape freely now. The baryon shells will coast with their own individual velocities. If the fireball is already in the matter-dominated coasting phase there will be no change in the propagation of the baryons. However, if the fireball is in the radiation-dominated phase when it becomes optically thin, then the baryons will switch immediately to a coasting phase. This transition radius, $R_e$, has another crucial role in the fireball evolution. It is the minimal radius at which energy conversion and generation of the observed GRB can begin. Photons produced at $R < R_e$ cannot escape.
6.7. Anisotropic fireballs
It is unlikely that a realistic fireball will be spherically symmetric. In fact, strong deviations from spherical symmetry are expected in the most promising neutron star merger model, in which the radiation is expected to emerge through funnels along the rotation axis. The initial motion of the fireball might be fairly complex but once $\gamma \gg 1$ the motion of each fluid element decouples from the motion of its neighbors with angular distance larger than $\gamma^{-1}$. This motion can be described by the same asymptotic solution, as if it is a part of a spherical shell. We define the angular range over which different quantities vary as $\theta$. Additionally, now the motion is not radially outwards and $u^r \ne u$. We define the spread angle $\phi$ as $u^r \equiv u\cos\phi$. The spherical fireball equations hold locally if
$$\gamma^{-1} < \theta \quad {\rm and} \quad \phi < \sqrt{2/\gamma}. \tag{28}$$
7. Temporal structure and kinematic considerations
General kinematic considerations impose constraints on the temporal structure produced when the energy of a relativistic shell is converted to radiation. The enormous variability of the temporal profiles of GRBs from one burst to another, in contrast to the relatively regular spectral characteristics, was probably the reason that until recently this aspect of GRBs was largely ignored. However, it turns out that the observed temporal structure sets a strong constraint on the energy conversion models [20,233]. GRBs are highly variable (see Section 2.2) and some configurations cannot produce such temporal profiles.
7.1. Time scales
Special relativistic effects determine the observed duration of the burst from a relativistic shell (see Fig. 14).
• The radial time scale $T_{radial}$: Consider an infinitely thin relativistic shell with a Lorentz factor $\gamma_E$ (the subscript E is for the emitting region). Let $R_E$ be a typical radius characterizing the emitting region (in the observer frame) such that most of the emission takes place between $R_E$ and $2R_E$. The observed duration between the first photon (emitted at $R_E$) and the last one (emitted at $2R_E$) is [208,18]:
$$T_{radial} \approx R_E/2\gamma_E^2 c. \tag{29}$$
Fig. 14. Different time scales in terms of the arrival times of four photons: $T_{radial}$, $T_{ang}$ and $\Delta/c$ are each the difference between two of these arrival times.
• The angular time scale $T_{ang}$: Because of relativistic beaming an observer sees up to an angle of $\gamma_E^{-1}$ from the line of sight. Two photons emitted at the same time and radius $R_E$, one on the line of sight and the other at an angle of $\gamma_E^{-1}$ away, travel different distances to the observer. The difference leads to a delay in the arrival time by [208,18,233]:
$$T_{ang} = R_E/2\gamma_E^2 c. \tag{30}$$
Clearly this delay is relevant only if the angular width of the emitting region, $\theta$, is larger than $\gamma_E^{-1}$.
In addition there are two other time scales that are determined by the flow of the relativistic particles. These are:
• Intrinsic duration $\Delta T$: The duration of the flow. This is simply the time in which the source that produces the relativistic flow is active. $\Delta T = \Delta/c$, where $\Delta$ is the width of the relativistic wind (measured in the observer's rest frame). For an explosive source $\Delta \approx R_i$. However, $\Delta$ could be much larger for a wind. The observed duration of the burst must be longer than or equal to $\Delta/c$.
• Intrinsic variability $\delta T$: The time scale on which the inner source varies and produces a subsequent variability with a length scale $\delta = c\,\delta T$ in the flow. Naturally, $\delta T$ sets a lower limit to the variability time scale observed in any burst. Clearly $\Delta$ and $\delta$ must satisfy
$$R_i \le \delta \le \Delta. \tag{31}$$
Finally we have to consider the cooling time scale.
• The cooling time scale $T_{cool}$: This is the difference in arrival time of photons while the shocked material cools, measured in the rest frame of an observer at rest at infinity. It is related to the local cooling time, $e/P$ (where e is the internal energy density and P is the power radiated per unit volume) in the fluid's rest frame by
$$T_{cool} = e/P\gamma_E. \tag{32}$$
Note that this differs from the usual time dilation which gives $\gamma_E e/P$. For synchrotron cooling there is a unique energy dependence of the cooling time scale on frequency: $T_{cool}(\nu) \propto \nu^{-1/2}$ [105] (see Eq. (59)). If $T_{cool}$ determines the variability we will have $\delta T(\nu) \propto \nu^{-1/2}$. This is remarkably close to the observed relation $\delta T \propto \nu^{-0.4}$ [104]. Quite generally $T_{cool}$ is shorter than the hydrodynamic time scales [105,235,70]. However, during the late stages of an afterglow, $T_{cool}$ becomes the longest time scale in the system.
7.2. Angular spreading and external shocks
Comparison of Eqs. (29) and (30) (using $\Delta R_E \lesssim R_E$) reveals that $T_{ang} \approx T_{radial}$. As long as the shell's angular width is larger than $\gamma_E^{-1}$, any temporal structure that could have arisen due to irregularities in the properties of the shell or in the material that it encounters will be spread on a time given by $T_{ang}$. This means that $T_{ang}$ is the minimal time scale for the observed temporal variability: $\delta T \ge T_{ang}$.
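The near-equality of the radial and angular time scales can be checked from the geometry alone. The Python sketch below computes the exact arrival-time delays for a shell moving at speed $\beta c$ and compares them with the common approximation $R_E/2\gamma_E^2 c$ of Eqs. (29) and (30). The function names and the sample values ($R_E = 10^{15}$ cm, $\gamma_E = 100$) are illustrative assumptions.

```python
import math

C = 2.998e10  # speed of light, cm/s

def radial_delay(r_e, gamma):
    """Exact delay between line-of-sight photons emitted at R_E and at 2 R_E."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    # Time the shell needs to cover R_E minus the time light needs for the same distance.
    return r_e / (beta * C) - r_e / C

def angular_delay(r_e, gamma):
    """Exact extra light-travel time for a photon emitted at angle 1/gamma off the line of sight."""
    return r_e * (1.0 - math.cos(1.0 / gamma)) / C

def approx_delay(r_e, gamma):
    """Approximation R_E / (2 gamma^2 c) used in Eqs. (29) and (30)."""
    return r_e / (2.0 * gamma**2 * C)

r_e, gamma_e = 1e15, 100.0   # assumed sample values
print(f"radial  : {radial_delay(r_e, gamma_e):.4f} s")
print(f"angular : {angular_delay(r_e, gamma_e):.4f} s")
print(f"approx. : {approx_delay(r_e, gamma_e):.4f} s")
```

For $\gamma_E \gg 1$ the three numbers agree to a small fraction of a percent.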
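Comparison with the intrinsic time scales yields two cases:
$$T = \begin{cases} T_{ang} = R_E/\gamma_E^2 c & {\rm if}\ \Delta < R_E/\gamma_E^2 \quad \hbox{(Type-I)}, \\ \Delta/c & {\rm otherwise} \quad \hbox{(Type-II)}. \end{cases} \tag{33}$$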
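The case distinction of Eq. (33) is simple to encode. The following Python sketch returns the burst duration and the model type for a given emission radius, Lorentz factor and shell width; the function name and the two sample parameter sets are illustrative assumptions.

```python
C = 2.998e10  # speed of light, cm/s

def burst_duration(r_e, gamma_e, delta):
    """Return (duration in seconds, type label) following Eq. (33)."""
    if delta < r_e / gamma_e**2:
        # Thin shell: angular spreading sets the duration.
        return r_e / (gamma_e**2 * C), "Type-I"
    # Thick shell: the duration reflects the width of the flow.
    return delta / C, "Type-II"

# Assumed sample parameters: (R_E in cm, gamma_E, Delta in cm).
for r_e, gamma_e, delta in ((1e16, 100.0, 1e9), (1e14, 100.0, 3e12)):
    duration, kind = burst_duration(r_e, gamma_e, delta)
    print(f"R_E = {r_e:.0e} cm, Delta = {delta:.0e} cm -> {kind}, T = {duration:.1f} s")
```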
In Type-I models, the duration of the burst is determined by the emission radius and the Lorentz factor. It is independent of $\Delta$. This type of model includes the standard "external shock model" [27,18,236] in which the relativistic shell is decelerated by the ISM, the relativistic magnetic wind model [229] in which a magnetic Poynting flux runs into the ISM, or the scattering of star light by a relativistic shell [237,238]. In Type-II models, the duration of the burst is determined by the thickness of the relativistic shell, $\Delta$ (that is, by the duration that the source is active and produces the relativistic wind). The angular spreading time (which depends on the radius of emission) is shorter and therefore irrelevant. These models include the "internal shock model" [28-30], in which different parts of the shell are moving with different Lorentz factors and therefore collide with one another. A magnetic dominated version is given by Thompson [227]. The majority of GRBs have a complex temporal structure (e.g. Section 2.2) with $N \equiv T/\delta T$ of order 100. Consider a Type-I model. Angular spreading means that at any given moment the observer sees a whole region of angular width $\gamma_E^{-1}$. Any variability in the emission due to different conditions in different radii on a time scale smaller than $T_{ang}$ is erased unless the angular size of the emitting region is smaller than $\gamma_E^{-1}$. Thus, such a source can produce only a smooth single humped burst with $N = 1$ and no temporal structure on a time scale $\delta T < T$. In other words, a shell of a Type-I model with an angular width larger than $\gamma_E^{-1}$ cannot produce a variable burst with $N \gg 1$. This is the angular spreading problem. On the other hand a Type-II model contains a thick shell $\Delta > R_E/\gamma_E^2$ and it can produce a variable burst. The variability time scale is again limited, $\delta T > T_{ang}$, but now it can be shorter than the total duration T. The duration of the burst reflects the time that the "inner engine" operates.
The variability reflects the radial inhomogeneity of the shell which was produced by the source (or the cooling time if it is longer than $\delta/c$). The observed temporal variability provides an upper limit to the scale of the radial inhomogeneities in the shell and to the scale on which the "inner engine" varies. This is a remarkable conclusion in view of the fact that the fireball hides the "inner engine". Can an external shock give rise to a Type-II behavior? This would have been possible if we could set the parameters of the external shock model to satisfy $R_E \le 2\gamma_E^2 c\,\delta T$. As discussed in Section 8.7.1 the deceleration radius for a thin shell with an initial Lorentz factor $\gamma$ is given by
$$R_E = l\gamma^{-2/3}, \tag{34}$$
and the observed duration is $l\gamma^{-8/3}/c$. The deceleration is gradual and the Lorentz factor of the emitting region $\gamma_E$ is similar to the original Lorentz factor of the shell $\gamma$. It seems that with an arbitrarily large Lorentz factor $\gamma$ we can get a small enough deceleration radius $R_E$. However, Eq. (34) is valid only for thin shells satisfying $\Delta < l\gamma^{-8/3}$ [236]. As $\gamma$ increases above a critical value, $\gamma \ge \gamma_c = (l/\Delta)^{3/8}$, the shell can no longer be considered thin. In this situation the reverse shock penetrating the shell becomes ultra-relativistic and the shocked matter moves with Lorentz factor $\gamma_E = \gamma_c < \gamma$, which is independent of the initial Lorentz factor of the shell $\gamma$. The deceleration radius
Fig. 15. The deceleration radius $R_e$ and the Lorentz factor of the shocked shell $\gamma_e$ as functions of the initial Lorentz factor $\gamma$, for a shell of fixed width $\Delta = 3\times10^{12}$ cm. For low values of $\gamma$, the shocked material moves with Lorentz factor $\gamma_e \sim \gamma$. However, as $\gamma$ increases the reverse shock becomes relativistic, reducing significantly the Lorentz factor $\gamma_e < \gamma$. This phenomenon prevents the "external shock model" from being Type-II.
is now given by $R_E = \Delta^{1/4} l^{3/4}$, and it is independent of the initial Lorentz factor of the shell. The behavior of the deceleration radius $R_E$ and observed duration as function of the shell Lorentz factor $\gamma$ is given in Fig. 15 for a shell of thickness $\Delta = 3\times10^{12}$ cm. This emission radius $R_E$ is never smaller than $\Delta\gamma_E^2$; thus an external shock cannot be of Type-II.
7.3. Angular variability and other caveats
In a Type-I model, that is, for a shell satisfying $\Delta < R_E/\gamma_E^2$, variability is possible only if the emitting regions are significantly narrower than $\gamma_E^{-1}$. The source would emit for a total duration $T_{radial}$. To estimate the allowed opening angle of the emitting region imagine two points that emit radiation at the same (observer) time t. The difference in the arrival time between two photons emitted at $(R_E,\theta_1)$ and $(R_E,\theta_2)$ at the same (observer) time t is
$$\delta T \approx \frac{R_E\,|\theta_1^2 - \theta_2^2|}{2c} = \frac{R_E\,\bar\theta\,|\theta_1 - \theta_2|}{c} = \frac{R_E\,\bar\theta\,\delta\theta}{c}, \tag{35}$$
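where $\theta$ is the angle from the line of sight and we have used $\theta_1, \theta_2 \ll 1$, $\bar\theta \equiv (\theta_1 + \theta_2)/2$ and $\delta\theta \equiv |\theta_1 - \theta_2|$. Since an observer sees emitting regions up to an angle $\gamma_E^{-1}$ away from the line of sight, $\bar\theta \sim \gamma_E^{-1}$. The size of the emitting region $r_s = R_E\,\delta\theta$ is limited by:
$$r_s \le \gamma_E c\,\delta T. \tag{36}$$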
Fig. 16. A very narrow jet of angular size considerably smaller than $\gamma_e^{-1}$, for which the angular spreading problem does not exist. The duration of the burst is determined by the deceleration distance $\Delta R_e$, while the angular time is assumed small. The variability could now be explained by either variability in the source which leads to a pulsed jet (a) or by a uniform jet interacting with an irregular ISM (b).
The corresponding angular size is
$$\delta\theta \le \frac{\gamma_E c\,\delta T}{R_E} = \frac{1}{N\gamma_E}. \tag{37}$$
Note that Fenimore et al. [233], who examined this issue, considered only emitting regions that are directly on the line of sight with $\bar\theta \sim |\theta_1 - \theta_2|$ and obtained a larger $r_s$ which was proportional to $R_E^{1/2}$. However, only a small fraction of the emitting regions will be exactly on the line of sight. Most of the emitting regions will have $\bar\theta \sim \gamma_E^{-1}$, and thus Eq. (36) yields the relevant estimate. The above discussion suggests that one can produce GRBs with $T \approx T_{radial} \approx R_E/c\gamma_E^2$ and $\delta T = T/N$ if the emitting regions have angular size smaller than $1/N\gamma_E \approx 10^{-4}$. That is, one needs an extremely narrow jet. Relativistic jets are observed in AGNs and even in some galactic objects; however, their opening angles are of order of a few degrees, almost two orders of magnitude larger. A narrow jet with such a small opening angle would be able to produce the observed variability. Such a jet must be extremely cold (in its local rest frame); otherwise its internal pressure will cause it to spread. It is not clear what could produce such a jet. Additionally, for the temporal variability to be produced, either a rapid modulation of the jet or inhomogeneities in the ISM are needed. These two options are presented in Fig. 16. A second possibility is that the shell is relatively "wide" (wider than $\gamma_E^{-1}$) but the emitting regions are small. An example of this situation is schematically described in Fig. 17. This may occur if either the ISM or the shell itself is very irregular. This situation is, however, extremely inefficient. The area of the observed part of the shell is $\pi R_E^2/\gamma_E^2$. The emitting regions are much smaller and to comply with the temporal constraint their area is $\pi r_s^2$. For high efficiency all the area of the shell must eventually radiate. The number of emitting regions needed to cover the shell is at least $(R_E/\gamma_E r_s)^2$. In Type-I models, the relation $R_E = 2\gamma_E^2 cT$ holds, and the number of emitting regions required is $4N^2$.
But a sum of $4N^2$ peaks each of width 1/N of the total duration does not produce a complex time structure. Instead it produces a smooth time profile with small variations, of order $1/\sqrt{2N} \ll 1$, in the amplitude.
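The counting argument above can be made explicit. The Python sketch below evaluates the angular-size limit of Eq. (37), the number of emitting regions needed to cover the visible part of the shell, and the mean number of peaks that overlap at any moment. The function names and the sample values ($N = 100$, $\gamma_E = 100$) are illustrative assumptions.

```python
def max_angular_size(n_peaks, gamma_e):
    """Upper limit 1 / (N gamma_E) on the angular size of an emitting region, Eq. (37)."""
    return 1.0 / (n_peaks * gamma_e)

def regions_to_cover_shell(n_peaks):
    """(R_E / (gamma_E r_s))^2 = 4 N^2, using r_s = gamma_E c dT and R_E = 2 gamma_E^2 c T."""
    return 4 * n_peaks**2

def overlapping_peaks(n_peaks):
    """Mean number of simultaneous peaks: 4 N^2 peaks, each lasting 1/N of the burst."""
    return regions_to_cover_shell(n_peaks) // n_peaks

n_peaks, gamma_e = 100, 100.0   # assumed sample values
print(f"angular size limit : {max_angular_size(n_peaks, gamma_e):.1e} rad")
print(f"regions to cover   : {regions_to_cover_shell(n_peaks)}")
print(f"overlapping peaks  : {overlapping_peaks(n_peaks)}")
```

With many peaks overlapping at every instant, the summed profile is smooth, which is the point made in the text.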
Fig. 17. A shell with angular size $\gamma^{-1}$ (the angular size is highly exaggerated). The spherical symmetry is broken by the presence of bubbles in the ISM. The relative angular size of the shell and the bubbles is drawn to scale assuming that a burst with $N = 15$ is to be produced. Consequently, $N = 15$ bubbles are drawn (more bubbles will add up to a smooth profile). The fraction of the shell that will impact these bubbles is small, leading to high inefficiency. As N increases the efficiency problem becomes more severe, $\sim N^{-1}$. From [20].
In a highly variable burst there cannot be more than N sub-bursts of duration $\delta T = T/N$. The corresponding area covering factor (the fraction of radiating area of the shell) and the corresponding efficiency is less than $1/4N$. This result is independent of the nature of the emitting regions: ISM clouds, star light or fragments of the shell. This is the case, for example, in the Shaviv and Dar model [238] where a relativistic iron shell interacts with the starlight of a stellar cluster (a spherical shell interacting with an external fragmented medium). This low efficiency poses a serious energy crisis for most (if not all) cosmological models of this kind. In a recent paper, Fenimore et al. [234] consider other ways, which are based on a low surface covering factor, to resolve the angular spreading problem. None seems very promising.
7.4. Temporal structure in internal shocks
Type-II behavior arises naturally in the internal shock model. In this model different shells have different Lorentz factors. These shells collide with each other and convert some of their kinetic energy into internal energy and radiate (Fig. 18). If the emission radius is sufficiently small, angular spreading will not erase the temporal variability. This requires $R_E \le 2\gamma_E^2 c\,\delta T$. This condition is always satisfied as internal shocks take place at $R_E = R_\delta \approx \delta\gamma^2$ with $\gamma_E \approx \gamma$. Since $\delta < \Delta$ we have $T = \Delta/c > T_{ang} = R_\delta/2c\gamma^2 \approx \delta/c = \delta T$. Clearly, multiple shells are needed to account for the observed temporal structure. Each shell produces an observed peak of duration $\delta T$ and the whole complex of shells (whose width is $\Delta$) produces a burst that lasts $T = \Delta/c$. The angular spreading time is comparable to the temporal variability produced by the "inner engine". They determine the observed temporal structure provided that they are longer than the cooling time. Before concluding that internal shocks can actually produce GRBs we must address two issues. First, can internal shocks produce the observed variable structure?
Second, can it be done efficiently? We will address the first question here and the second one in Section 8.6 where we discuss energy conversion in internal shocks. Mochkovitch et al. [32] and Kobayashi et al. [33] used a simple model to calculate the temporal profiles produced by an internal shock. According to this model the relativistic flow that produces
Fig. 18. The internal shock scenario. The source produces multiple shells as shown in this figure. The shells will have different Lorentz factors. Faster shells will catch up with slower ones and will collide, converting some of their kinetic energy to internal energy. This model is Type-II and naturally produces variable bursts. From [20].
Fig. 19. A peak produced by a collision between two shells. The luminosity is plotted versus the arrival time. The solid line corresponds to $R = c\,\delta t_e$ and the dotted line corresponds to $R = 0$. From [33].
the shocks is characterized by multiple shells, each of width l and with a separation L. The shells are assigned random Lorentz factors (varying in the range $[\gamma_{min},\gamma_{max}]$) and random density or energy.
One can follow the motion of the shells and calculate the time of the binary (two-shell) collisions that take place, until all the shells that could collide have collided and the remaining flow has a monotonically decreasing velocity. The energy generated and the emitted radiation in each binary encounter are then combined into a synthetic sample of a temporal profile. The emitted radiation from each binary collision will be observed as a single pulse characterized by an amplitude and a width. The amplitude depends on the amount of energy converted to
radiation in a given collision (see Section 8.6). The time scale depends on the cooling time, the hydrodynamic time, and the angular spreading time. The internal energy is radiated via synchrotron emission and inverse Compton scattering. In most cases, the electrons' cooling time is much shorter than the hydrodynamic time scale [105,235,18] so we consider only the latter two. The hydrodynamic time scale is determined by the time that the shock crosses the shell, whose width is d. In fact, there are two shocks: a forward shock and a reverse shock. Detailed calculations [33] reveal that this time scale (in the observer's rest frame) is of order of the light crossing time of the shell, that is $d/c$. Let the distance between the shells be $\delta$. A collision takes place at $\sim\delta\gamma^2$ and the angular spreading yields an observed pulse whose width is $\sim\delta/c$. If $\delta > d$ the overall pulse width $\delta T$ is determined by angular spreading. The shape of the pulse becomes asymmetric with a fast rise and a slower decline (Fig. 19), which GRBs typically show (see Section 2.2). The amplitude of an individual pulse depends on the energy produced in the collision, which we calculate later in Section 8.6. Typical synthetic temporal profiles are shown in Fig. 20. Clearly, internal shocks can produce the observed highly variable temporal profiles, provided that the source of the relativistic particles, the "inner engine", produces a sufficiently variable wind. Somewhat surprisingly the resulting temporal profile follows, to a large extent, the shape of the pulse emitted by the source. Long bursts require long relativistic winds that last hundreds of seconds, with a rapid variability on a time scale of a fraction of a second. Thus, unlike previous worries [79,239] we find that there is some direct information on the "inner engine" of GRBs: It produces the observed complicated temporal structure. This severely constrains numerous models.
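The statement that two shells separated by $\delta$ collide at a radius of order $\delta\gamma^2$ follows from simple kinematics. The Python sketch below computes the exact catch-up distance for two shells with different Lorentz factors and compares it with the ultra-relativistic approximation $2\gamma_s^2\gamma_r^2\delta/(\gamma_r^2 - \gamma_s^2)$. The function names and sample Lorentz factors are illustrative assumptions (lengths in units of $\delta$, $c = 1$).

```python
import math

def collision_radius(separation, gamma_slow, gamma_fast):
    """Distance the fast shell travels before catching a slower shell `separation` ahead."""
    beta_slow = math.sqrt(1.0 - 1.0 / gamma_slow**2)
    beta_fast = math.sqrt(1.0 - 1.0 / gamma_fast**2)
    # Catch-up time (c = 1) multiplied by the speed of the fast shell.
    return beta_fast * separation / (beta_fast - beta_slow)

def collision_radius_approx(separation, gamma_slow, gamma_fast):
    """Ultra-relativistic limit 2 gs^2 gf^2 delta / (gf^2 - gs^2)."""
    gs2, gf2 = gamma_slow**2, gamma_fast**2
    return 2.0 * gs2 * gf2 * separation / (gf2 - gs2)

radius = collision_radius(1.0, 100.0, 1000.0)
print(f"exact  : {radius:.1f} delta")
print(f"approx.: {collision_radius_approx(1.0, 100.0, 1000.0):.1f} delta")
print(f"ratio to delta * gamma_slow^2: {radius / 100.0**2:.3f}")
```

When the fast shell is much faster than the slow one, the collision radius is about $2\gamma_s^2\delta$, so the angular spreading time at that radius is of order $\delta/c$.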
8. Energy conversion
8.1. Slowing down of relativistic particles
The cross section for a direct electromagnetic or nuclear interaction between the relativistic particles and the ISM particles is too small to convert the kinetic energy to radiation efficiently. The fireball particles can be slowed down only via some collective interaction such as a collisionless shock. Supernova remnants (SNRs), in which the supernova ejecta is slowed down by the ISM, show that collisionless shocks form in somewhat similar circumstances. One can expect that collisionless shocks will form here as well [27,18]. GRBs are the relativistic analogues of SNRs. In both cases the phenomenon results from the conversion of the kinetic energy of the ejected matter to radiation. Even the total energy involved is comparable. One crucial difference is the amount of ejected mass. SNRs involve a solar mass or more. The corresponding velocity is several thousand kilometers per second, much less than the speed of light. In GRBs, the masses are smaller by several orders of magnitude and with the same energy the matter attains ultra-relativistic velocities. A second crucial difference is that while SNRs result from the interaction of the ejecta with the ISM, GRBs result from internal collisions. The interaction of the ejecta in GRBs with the ISM produces the "afterglow" that follows some GRBs. The interaction between the SNR ejecta and the ISM takes place on scales of several pc and it is observed for thousands of years. The internal interaction of the relativistic matter in GRBs takes place on a scale of several hundred astronomical units and special relativistic effects reduce
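Fig. 20. Luminosity vs. observer time, for different synthetic models: (a) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 100$, $\eta = -1$ and $L/l = 5$; (b) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 100$, $\eta = 1$ and $L/l = 5$; (c) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 20$, $\eta = -1$ and $L/l = 5$; (d) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 20$, $\eta = -1$ and $L/l = 1$; (e) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 100$, random energy with $E_{max} = 1000$ and $L/l = 5$; (f) $\gamma_{min} = 100$, $\gamma_{max} = 1000$, $N = 100$, random density with $\rho_{max} = 1000$ and $L/l = 5$. From [33].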
the observed time scale to a fraction of a second. The interaction with the ISM that leads to the "afterglow" takes place on a scale of a tenth of a pc. Once more special relativistic effects reduce the observed time scale to several days. In the following sections I discuss the slowing down of matter due to relativistic shocks. The discussion begins with a general review of relativistic inelastic collisions and continues with the relativistic shock conditions. Then I turn to the radiation processes: synchrotron emission and inverse Compton. After the general discussion I apply the general results to internal shocks, to external shocks and to the afterglow.
8.1.1. Relativistic inelastic collisions
Consider a mass (denoted by the subscript r) that catches up with a slower one (denoted by s). The two masses collide and merge to form a single mass (denoted m). Energy and momentum conservation yield
$$m_r\gamma_r + m_s\gamma_s = (m_r + m_s + E/c^2)\gamma_m,$$
$$m_r\sqrt{\gamma_r^2 - 1} + m_s\sqrt{\gamma_s^2 - 1} = (m_r + m_s + E/c^2)\sqrt{\gamma_m^2 - 1}, \tag{38}$$
where E is the internal energy generated in the collision (in the rest frame of the merged mass). There are two interesting limits. First let m be at rest: c "1. This is the case in external shocks, or in a shock between relativistic ejecta and a non-relativistic material that was ejected from the source before it exploded. Eqs. (38) reveal that a mass m +m /c m (39) is needed to yield c +c /2 and E+m /2. The external mass needed to convert half of the kinetic
energy is smaller than the original mass by a factor of c [27,18]. The second case corresponds to an internal collision between shells that are moving at di!erent relativistic velocities: c 9c 1. Eqs. (38) yield
m c #m c . c K (40)
m /c #m /c The internal energy (in the frame of an external observer) of the merged shell, E "c E is the
di!erence of the kinetic energies before and after the collision: E "m c(c !c )#m c(c !c ).
The conversion e$ciency of kinetic energy into internal energy is [33,70]
(41)
(m #m )c . (42) e"1! (m c #m c ) As can be expected a conversion of a signi"cant fraction of the initial kinetic energy to internal energy requires that the di!erence in velocities between the shells will be signi"cant: c c and that the two masses will be comparable m +m . 8.1.2. Shock conditions Quite often the previous estimates based on the approximation that the whole shell behaves as a single object are good enough. However, the time scale between the interaction of di!erent parts of the shell with the ISM may be relatively long (compared to the time scale to collect an external mass M/c) and in this case one has to turn to the hydrodynamics of the interaction. This calculation takes into account the shocks that form during the collision. Consider the situation a cold shell (whose internal energy is negligible compared to the rest mass) that overtakes another cold shell or moves into the cold ISM. Generally, two shocks form:
Fig. 21. The Lorentz factor $\gamma$, the density $\rho$ and the pressure $p$ in the shocks. There are four regions: the ISM (region 1), the shocked ISM (region 2), the shocked shell (region 3) and the unshocked shell (region 4), which are separated by the forward shock (FS), the contact discontinuity (CD) and the reverse shock (RS). The initial parameters are the same as in [13]. From [231].
an outgoing shock that propagates into the ISM or into the external shell, and a reverse shock that propagates into the inner shell, with a contact discontinuity between the shocked material (see Fig. 21). Two quantities determine the shocks' structure: $\gamma$, the Lorentz factor of the motion of the inner shell (denoted 4) relative to the outer one or the ISM (denoted 1), and the ratio between the particle number densities in these regions, $f\equiv n_4/n_1$. There are three interesting cases:

(i) Ultra-relativistic shock ($\gamma\gg1$) with $f>\gamma^2$. This happens during the early phase of an external shock or during the very late external shock evolution when there is only a single shock. We call this configuration "Newtonian" because the reverse shock is non-relativistic (or mildly relativistic). In this case the energy conversion takes place in the forward shock: let $\gamma_2$ be the Lorentz factor of the motion of the shocked fluid relative to the rest frame of the fluid at 1 (an external observer for interaction with the ISM and the outer shell in case of internal collision) and let $\bar\gamma_3$ be the Lorentz
factor of the motion of this fluid relative to the rest frame of the relativistic shell [4]. Then

$$\gamma_2\approx\gamma, \qquad \bar\gamma_3\approx1. \tag{43}$$

The particle and energy densities $(n,e)$ in the shocked regions satisfy

$$n_2\approx4\gamma n_1, \quad e\equiv e_2=4\gamma^2n_1m_pc^2, \quad n_3=7n_4, \quad e_3=e. \tag{44}$$

(ii) Later, during the propagation of the shell, the density ratio decreases and $f<\gamma^2$. Both the forward and the reverse shocks are relativistic. The shock equations between regions 1 and 2 yield [241,242,225]

$$\gamma_2=f^{1/4}\gamma^{1/2}/\sqrt2, \quad n_2=4\gamma_2n_1, \quad e\equiv e_2=4\gamma_2^2n_1m_pc^2. \tag{45}$$

Similar relations hold for the reverse shock:

$$\bar\gamma_3=f^{-1/4}\gamma^{1/2}/\sqrt2, \quad n_3=4\bar\gamma_3n_4. \tag{46}$$

In addition, we have $e_3=e$ and $\bar\gamma_3\cong(\gamma/\gamma_2+\gamma_2/\gamma)/2$, which follow from the equality of pressures and velocity on the contact discontinuity. Comparable amounts of energy are converted to thermal energy in both shocks when both shocks are relativistic. But only a negligible amount of energy is converted to thermal energy in the reverse shock if it is Newtonian [236].

(iii) Internal shocks are characterized by $f\approx1$ (both shells have similar densities) and by a Lorentz factor of order of a few ($2<\gamma<10$) describing the relative motion of the shells. In this case, for an adiabatic index of 4/3, we have

$$\gamma_2=\sqrt{(\gamma+1)/2}\approx\max[1,\sqrt{\gamma/2}], \quad n_2=(4\gamma_2+3)n_1\approx4\gamma_2n_1, \quad e_2=\gamma_2n_2m_pc^2. \tag{47}$$

Both shocks are mildly relativistic and their strength depends on the relative Lorentz factors of the two shells.

8.1.3. Lorentz factors in different emitting regions

Before concluding this section and turning to the radiation mechanisms we summarize briefly the different relativistic motions encountered when considering different emitting regions. The relativistic shocks are characterized by $\gamma_{sh}$, which describes the shock's velocity as well as the "thermal" motion of the shocked particles. It is measured relative to a rest frame in which the unshocked material is at rest. The Lorentz factor of the forward shock is usually different from the Lorentz factor of the reverse shock. The emitting region (the shocked material) moves relativistically relative to an observer at rest at infinity. This is characterized by a Lorentz factor, $\gamma_E$. Table 5 summarizes the different values of $\gamma_{sh}$ and $\gamma_E$ for external and internal shocks and for the afterglow.

8.2. Synchrotron emission from relativistic shocks

8.2.1. General considerations

The most likely radiation process in GRBs is synchrotron emission [243,18,225,105]. The observed low-energy spectra provide an indication that this is indeed the case [97,244].
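Table 5
Lorentz factors

Shock type                            $\gamma_E$           $\gamma_{sh}$
External, Newtonian      Forward      $\gamma$             $\gamma$
                         Reverse      $\gamma$             1
External, relativistic   Forward      $\gamma\xi^{3/4}$    $\gamma\xi^{3/4}$
                         Reverse      $\gamma\xi^{3/4}$    $\xi^{-3/4}$
Internal                              $\gamma$             $\sqrt{\gamma_{int}/2}\sim$ a few
Afterglow                             $\gamma(t)$          $\gamma(t)$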
The parameters that determine synchrotron emission are the magnetic field strength, $B$, and the electrons' energy distribution (characterized by the minimal Lorentz factor $\gamma_{e,min}$ and the index $p$ of the expected power-law electron energy distribution). These parameters should be determined from the microscopic physical processes that take place in the shocks. However, it is difficult to estimate them from first principles. Instead we define two dimensionless parameters, $\epsilon_B$ and $\epsilon_e$, that incorporate our ignorance and uncertainties [225,115].

The dimensionless parameter $\epsilon_B$ measures the ratio of the magnetic field energy density to the total thermal energy $e$:

$$\epsilon_B\equiv U_B/e = B^2/8\pi e, \tag{48}$$

so that, after substituting the shock conditions, we have

$$B=\sqrt{32\pi}\,c\,\epsilon_B^{1/2}\gamma_{sh}m_p^{1/2}n^{1/2}. \tag{49}$$

There have been different attempts to estimate $\epsilon_B$ [227,243,245]. We keep it as a free parameter. Additionally, we assume that the magnetic field is randomly oriented in space.

The second parameter, $\epsilon_e$, measures the fraction of the total thermal energy $e$ which goes into random motions of the electrons:

$$\epsilon_e\equiv U_e/e. \tag{50}$$
8.2.2. The electron distribution

We consider a "typical" electron to be one that has the average $\gamma_e$ of the electron distribution:

$$\langle\gamma_e\rangle=(m_p/m_e)\,\epsilon_e\gamma_{sh}. \tag{51}$$

Collisionless acceleration of electrons can be efficient if they are tightly coupled to the protons and the magnetic field by means of plasma waves [246]. Since the electrons receive their random motions through shock-heating, we assume (following the treatment of non-relativistic shocks) that they develop a power-law distribution of Lorentz factors:

$$N(\gamma_e)\sim\gamma_e^{-p} \quad\text{for}\quad \gamma_e>\gamma_{e,min}. \tag{52}$$
We require $p>2$ so that the energy does not diverge at large $\gamma_e$. Since the shocks are relativistic we assume that all the electrons participate in the power law, not just a small fraction in the tail of the distribution as in the Newtonian case. An indication that this assumption is correct is given by the lower energy spectrum observed in some GRBs [97,244]. The minimum Lorentz factor, $\gamma_{e,min}$, of the distribution is related to $\epsilon_e$ and to the total energy $e\sim\gamma_{sh}nm_pc^2$:

$$\gamma_{e,min}=\frac{m_p}{m_e}\,\frac{p-2}{p-1}\,\epsilon_e\gamma_{sh}=\frac{p-2}{p-1}\,\langle\gamma_e\rangle, \tag{53}$$

where $\gamma_{sh}$ is the relative Lorentz factor across the corresponding shock.

The energy index $p$ can be fixed by requiring that the model should be able to explain the high-energy spectra of GRBs. If we assume that most of the radiation observed in the soft gamma-rays is due to synchrotron cooling, then it is straightforward to relate $p$ to the power-law index of the observed spectra of GRBs, $\beta$. The mean spectral index of GRBs at high photon energies, $\beta\approx-2.25$ [86], corresponds to $p\approx2.5$. This agrees, as we see later (Section 9.3.2), with the value inferred from afterglow observations ($p\sim2.25$). We assume this value of $p$ in what follows. The corresponding ratio that appears in Eq. (53), $(p-2)/(p-1)$, equals 1/3 and we have $\gamma_{e,min}=610\,\epsilon_e\gamma_{sh}$.

The shock acceleration mechanisms cannot accelerate the electrons to arbitrarily high energy. For the maximal electron energy, with a corresponding $\gamma_{e,max}$, the acceleration time equals the cooling time. The acceleration time is determined by the Larmor radius $R_L$ and the Alfvén velocity $v_A$ [247]:

$$t_{acc}=cR_L/v_A^2. \tag{54}$$

This time scale should be compared with the synchrotron cooling time $\gamma_em_ec^2/P_{syn}$ (in the local frame). Using $v_A^2\approx\epsilon_Bc^2$, Eq. (49) to estimate $B$ and Eq. (57) below to estimate $P_{syn}$, one finds

$$\gamma_{e,max}=\left(\frac{24\pi e\,\epsilon_B}{B\sigma_T}\right)^{1/2}=3.7\times10^{8}\,\frac{\epsilon_B^{1/4}}{\gamma_{sh}^{1/2}n^{1/4}}. \tag{55}$$

This value is quite large and generally it does not affect the observed spectrum in the soft gamma-ray range.
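As a rough numerical check of Eqs. (49), (53) and (55), the following Python sketch evaluates the magnetic field and the minimal and maximal electron Lorentz factors in CGS units. The function names and the rounded physical constants are illustrative choices that do not appear in the text, and the Alfvén velocity is taken as $v_A^2\approx\epsilon_Bc^2$, which is an assumption of this sketch.

```python
import math

# Rounded CGS constants (illustrative values, not taken from the text)
C_LIGHT = 2.998e10    # speed of light [cm/s]
M_P = 1.6726e-24      # proton mass [g]
M_E = 9.109e-28       # electron mass [g]
Q_E = 4.803e-10       # electron charge [esu]
SIGMA_T = 6.652e-25   # Thomson cross section [cm^2]


def magnetic_field(eps_b, gamma_sh, n):
    """Eq. (49): B = sqrt(32 pi) c eps_B^(1/2) gamma_sh m_p^(1/2) n^(1/2), in gauss."""
    return math.sqrt(32.0 * math.pi * eps_b * M_P * n) * C_LIGHT * gamma_sh


def gamma_e_min(eps_e, gamma_sh, p=2.5):
    """Eq. (53): minimal electron Lorentz factor of the power-law distribution."""
    return (M_P / M_E) * (p - 2.0) / (p - 1.0) * eps_e * gamma_sh


def gamma_e_max(eps_b, gamma_sh, n):
    """Eq. (55): Lorentz factor at which acceleration time equals cooling time."""
    b = magnetic_field(eps_b, gamma_sh, n)
    return math.sqrt(24.0 * math.pi * Q_E * eps_b / (b * SIGMA_T))


if __name__ == "__main__":
    # Equipartition, gamma_sh = 1 and n = 1 cm^-3 recover the numerical prefactors
    print(f"gamma_e_min = {gamma_e_min(1.0, 1.0):.0f}")
    print(f"gamma_e_max = {gamma_e_max(1.0, 1.0, 1.0):.2e}")
```

With all scaling parameters set to unity the two functions should return values close to the prefactors 610 and $3.7\times10^8$ quoted above, which is a useful sanity check on the units.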
8.2.3. Synchrotron frequency and synchrotron power

The typical energy of synchrotron photons as well as the synchrotron cooling time depend on the Lorentz factor $\gamma_e$ of the relativistic electron under consideration and on the strength of the magnetic field (see e.g. [248]). Since the emitting material moves with a Lorentz factor $\gamma_E$ the photons are blue shifted. The characteristic photon energy in the observer frame is given by

$$(h\nu_{syn})_{obs}=\frac{\hbar q_eB}{m_ec}\,\gamma_e^2\gamma_E. \tag{56}$$

The power emitted by a single electron due to synchrotron radiation in the local frame is

$$P_{syn}=\frac{4}{3}\sigma_TcU_B\gamma_e^2, \tag{57}$$

where $\sigma_T$ is the Thomson cross section. The cooling time of the electron in the fluid frame is then $\gamma_em_ec^2/P_{syn}$. The observed cooling time $t_{syn}$ is shorter by a factor of $\gamma_E$:

$$t_{syn}(\gamma_e)=\frac{3m_ec}{4\sigma_TU_B\gamma_e\gamma_E}. \tag{58}$$
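The relations of Eqs. (56) to (58) can be evaluated directly. The sketch below is a minimal illustration in CGS units; the function names and rounded constants are assumptions introduced here, and $U_B=B^2/8\pi$ is used for the magnetic energy density.

```python
import math

# Rounded CGS constants (illustrative values)
C_LIGHT = 2.998e10     # speed of light [cm/s]
M_E = 9.109e-28        # electron mass [g]
Q_E = 4.803e-10        # electron charge [esu]
HBAR = 1.0546e-27      # reduced Planck constant [erg s]
SIGMA_T = 6.652e-25    # Thomson cross section [cm^2]
ERG_PER_EV = 1.602e-12


def synchrotron_photon_energy_ev(b, gamma_e, gamma_emit):
    """Eq. (56): observed characteristic photon energy, converted to eV."""
    energy_erg = HBAR * Q_E * b / (M_E * C_LIGHT) * gamma_e**2 * gamma_emit
    return energy_erg / ERG_PER_EV


def synchrotron_power(b, gamma_e):
    """Eq. (57): power radiated by one electron in the local frame [erg/s]."""
    u_b = b**2 / (8.0 * math.pi)
    return 4.0 / 3.0 * SIGMA_T * C_LIGHT * u_b * gamma_e**2


def cooling_time_observed(b, gamma_e, gamma_emit):
    """Eq. (58): observed synchrotron cooling time [s]."""
    u_b = b**2 / (8.0 * math.pi)
    return 3.0 * M_E * C_LIGHT / (4.0 * SIGMA_T * u_b * gamma_e * gamma_emit)


if __name__ == "__main__":
    b, g_e, g_emit = 1.0e4, 1836.0, 100.0   # hypothetical example values
    print(synchrotron_photon_energy_ev(b, g_e, g_emit), "eV")
    print(cooling_time_observed(b, g_e, g_emit), "s")
```

By construction, the observed cooling time multiplied by $\gamma_E$ equals $\gamma_em_ec^2/P_{syn}$, the local-frame cooling time stated in the text.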
Substituting the value of $\gamma_e$ from Eq. (56) into the cooling rate Eq. (58) we obtain the cooling time scale as a function of the observed photon energy:

$$t_{syn}(\nu)=\frac{3}{\sigma_T}\sqrt{\frac{2\pi c\,m_eq_e}{B^3\gamma_E}}\;\nu^{-1/2}. \tag{59}$$

Since $\gamma_e$ does not appear explicitly in this equation, $t_{syn}$ at a given observed frequency is independent of the electrons' energy distribution within the shock. This is provided, of course, that there are electrons with the required $\gamma_e$ so that there will be emission in the frequency considered. As long as there is such an electron the cooling time is "universal". This equation shows a characteristic scaling of $t_{syn}(\nu)\propto\nu^{-1/2}$. This is not very different from the observed relation $\delta T\propto\nu^{-0.4}$ [104]. However, it is not clear whether the cooling time, rather than another time scale, determines the temporal profile.

The cooling time calculated above sets a lower limit to the variability time scale of a GRB since the burst cannot possibly contain spikes that are shorter than its cooling time. Observations of GRBs typically show asymmetric spikes in the intensity variation, where a peak generally has a fast rise and a slower exponential decline (FRED). A plausible explanation of this observation is that the shock heating of the electrons happens rapidly (though episodically), and that the rise time of a spike is related to the heating time. The decay time is then set by the cooling, so that the widths of spikes directly measure the cooling time.

8.2.4. The integrated synchrotron spectrum

The instantaneous synchrotron spectrum of a single electron with an initial energy $\gamma_em_ec^2$ is a power law with $F_\nu\propto\nu^{1/3}$ up to $\nu_{syn}(\gamma_e)$ and an exponential decay above it. If the electron is energetic it will cool rapidly until it reaches $\gamma_{e,c}$. This is the Lorentz factor of an electron that cools on a hydrodynamic time scale. For a rapidly cooling electron we have to consider the time-integrated spectrum above $h\nu_{syn}(\gamma_{e,c})$: $F_\nu\propto\nu^{-1/2}$ from $\nu_{syn}(\gamma_{e,c})$ up to $\nu_{syn}(\gamma_e)$.

To calculate the overall spectrum due to all the electrons we need to integrate over $\gamma_e$. Our discussion here follows [249]. We consider a power-law electron distribution with a power index $p$ and a minimal Lorentz factor $\gamma_{e,min}$ (see Eq. (52)). Overall, we expect a broken power-law spectrum with a break frequency around the synchrotron frequency of the lowest energy electrons, $\nu_{syn}(\gamma_{e,min})$. The power-law indices depend on the cooling rate. The most energetic electrons will always be cooling rapidly (independently of the behavior of the "typical electron"). Thus, the highest-frequency part of the spectrum will always satisfy

$$F_\nu=N[\gamma(\nu)]\,m_ec^2\,\gamma(\nu)\,d\gamma/d\nu\propto\nu^{-p/2}. \tag{60}$$

Similarly, the low-energy electrons will always be slow cooling and thus the lowest part of the spectrum will behave like $F_\nu\propto\nu^{1/3}$.

For slow cooling we have the instantaneous spectrum $F_\nu\propto\nu^{1/3}$ for the lower part of the spectrum. For the upper part we have

$$F_\nu=N[\gamma(\nu)]\,P[\gamma(\nu)]\,d\gamma/d\nu\propto\nu^{-(p-1)/2}, \tag{61}$$

where $\gamma(\nu)$ is the Lorentz factor for which the synchrotron frequency equals $\nu$.
For fast cooling we have $F_\nu\propto\nu^{-1/2}$ for the lower part and $F_\nu\propto\nu^{-p/2}$ for the upper part. Here at the lower end the least energetic electrons will be cooling slowly even when the typical electron is cooling rapidly. Thus we will have $F_\nu\propto\nu^{1/3}$ in the lowest part of the spectrum.

The critical parameter that determines if the electrons are cooling fast or slow is $\gamma_{e,c}$, the Lorentz factor of an electron that cools on a hydrodynamic time scale. To estimate $\gamma_{e,c}$ we compare $t_{syn}$ (Eq. (58)) with $t_{hyd}$, the hydrodynamic time scale (in the observer's rest frame):

$$\gamma_{e,c}=\frac{3m_ec}{4\sigma_TU_B\gamma_Et_{hyd}}. \tag{62}$$
Fast cooling occurs if $\gamma_{e,c}<\gamma_{e,min}$. All the electrons cool rapidly and the electrons' distribution effectively extends down to $\gamma_{e,c}$. If $\gamma_{e,c}>\gamma_{e,min}$ only the high-energy tail of the distribution (those electrons above $\gamma_{e,c}$) cools and the system is in the slow cooling regime.

For the GRB itself we must impose the condition of fast cooling: the relativistic shocks must emit their energy effectively, otherwise there will be a serious inefficiency problem. Additionally, we will not be able to explain the variability if the cooling time is too long. The electrons must cool rapidly and release all their energy. In this case $\gamma_{e,min}>\gamma_{e,c}$ [105] and all the electrons cool down roughly to $\gamma_{e,c}$. The observed flux, $F_\nu$, is given by

$$F_\nu\propto\begin{cases}(\nu/\nu_c)^{1/3}F_{\nu,max}, & \nu_c>\nu,\\ (\nu/\nu_c)^{-1/2}F_{\nu,max}, & \nu_m>\nu>\nu_c,\\ (\nu_m/\nu_c)^{-1/2}(\nu/\nu_m)^{-p/2}F_{\nu,max}, & \nu>\nu_m,\end{cases} \tag{63}$$

where $\nu_m\equiv\nu_{syn}(\gamma_{e,min})$, $\nu_c\equiv\nu_{syn}(\gamma_{e,c})$ and $F_{\nu,max}$ is the observed peak flux.

It is most likely that during the later stages of an external shock (that is, within the afterglow phase, provided that it arises due to external shocks) there will be a transition from fast to slow cooling [21,23,47,250,25]. When $\gamma_{e,c}>\gamma_{e,min}$, only those electrons with $\gamma_e>\gamma_{e,c}$ can cool. We call this slow cooling, because the electrons with $\gamma_e\sim\gamma_{e,min}$, which form the bulk of the population, do not cool. Integration over the electron distribution gives in this case:

$$F_\nu\propto\begin{cases}(\nu/\nu_m)^{1/3}F_{\nu,max}, & \nu_m>\nu,\\ (\nu/\nu_m)^{-(p-1)/2}F_{\nu,max}, & \nu_c>\nu>\nu_m,\\ (\nu_c/\nu_m)^{-(p-1)/2}(\nu/\nu_c)^{-p/2}F_{\nu,max}, & \nu>\nu_c.\end{cases} \tag{64}$$

For fast cooling $\nu_c<\nu_m$. We find that the peak flux is at $\nu_c$ while the peak energy emitted (which corresponds to the peak of $\nu F_\nu$) is at $\nu_m$. For slow cooling the situation reverses: $\nu_m<\nu_c$. The peak flux is at $\nu_m$ while the peak energy emitted is at $\nu_c$.
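The broken power laws of Eqs. (63) and (64) translate naturally into a piecewise function. The Python sketch below is only an illustration: the function name and argument names are hypothetical, self-absorption is ignored, and the regime is selected simply by comparing $\nu_c$ with $\nu_m$.

```python
def synchrotron_flux(nu, nu_m, nu_c, f_max=1.0, p=2.5):
    """Broken power-law synchrotron flux F_nu, following Eqs. (63) and (64).

    nu_m is the synchrotron frequency of electrons with gamma_e,min and
    nu_c that of electrons with gamma_e,c; f_max is the peak flux.
    """
    if nu_c < nu_m:
        # Fast cooling, Eq. (63): peak flux at nu_c
        if nu < nu_c:
            return (nu / nu_c) ** (1.0 / 3.0) * f_max
        if nu < nu_m:
            return (nu / nu_c) ** (-0.5) * f_max
        return (nu_m / nu_c) ** (-0.5) * (nu / nu_m) ** (-p / 2.0) * f_max
    # Slow cooling, Eq. (64): peak flux at nu_m
    if nu < nu_m:
        return (nu / nu_m) ** (1.0 / 3.0) * f_max
    if nu < nu_c:
        return (nu / nu_m) ** (-(p - 1.0) / 2.0) * f_max
    return (nu_c / nu_m) ** (-(p - 1.0) / 2.0) * (nu / nu_c) ** (-p / 2.0) * f_max


if __name__ == "__main__":
    # Fast-cooling example with nu_c = 1 and nu_m = 100 (arbitrary units)
    for nu in (0.1, 1.0, 10.0, 100.0, 1000.0):
        print(nu, synchrotron_flux(nu, nu_m=100.0, nu_c=1.0))
```

Each segment is normalized so that the spectrum is continuous at both break frequencies, which is the property the prefactors $(\nu_m/\nu_c)^{-1/2}$ and $(\nu_c/\nu_m)^{-(p-1)/2}$ are there to guarantee.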
Typical spectra corresponding to fast and slow cooling are shown in Fig. 22. The light curve depends on the hydrodynamic evolution, which in turn determines the time dependence of $\nu_m$, $\nu_c$ and $F_{\nu,max}$.

For fast cooling the power emitted is simply the power given to the electrons, that is $\epsilon_e$ times the power generated by the shock:

$$P_{fast}=\epsilon_e\,dE/dt. \tag{65}$$
Fig. 22. Synchrotron spectrum of a relativistic shock with a power-law electron distribution. (a) Fast cooling, which is expected at early times ($t<t_0$). The spectrum consists of four segments, identified as A, B, C, D. Self-absorption is important below $\nu_a$. The frequencies $\nu_m$, $\nu_c$, $\nu_a$ decrease with time as indicated; the scalings above the arrows correspond to an adiabatic evolution, and the scalings below, in square brackets, to a fully radiative evolution. (b) Slow cooling, which is expected at late times ($t>t_0$). The evolution is always adiabatic. The four segments are identified as E, F, G, H. From [249].
For slow cooling the emitted power is determined by the ability of the electrons to radiate their energy:

$$P_{slow}=N_eP_{syn}(\gamma_{e,min}), \tag{66}$$

where $N_e$ is the number of electrons in the emitting region and $P_{syn}(\gamma_{e,min})$, the synchrotron power of an electron with $\gamma_{e,min}$, is given by Eq. (57).

8.3. Synchrotron self-absorption

An important effect that we have ignored so far is the possibility of self-absorption. This is irrelevant during the GRB itself. One of the essential features of the GRB spectrum is that it is
produced in an optically thin region. However, self-absorption may appear at late time and typically in radio emission [18,25,250-252]. When it appears it will cause a steep cutoff of the low-energy spectrum, either as the commonly known $\nu^{5/2}$ or as $\nu^2$. To estimate the self-absorption frequency we need the optical depth along the line of sight. A simple approximation is $\alpha_{\nu'}R/\gamma$, where $\alpha_{\nu'}$ is the absorption coefficient [248]:

$$\alpha_{\nu'}=\frac{(p+2)}{8\pi m_e\nu'^2}\int_{\gamma_{e,min}}^{\infty}d\gamma_e\,P_{\nu',e}(\gamma_e)\,\frac{N(\gamma_e)}{\gamma_e}. \tag{67}$$
The self-absorption frequency $\nu_a$ satisfies $\alpha_{\nu'_a}R/\gamma=1$. It can be estimated only once we have a model for the hydrodynamics and for how $R$ and $\gamma$ change with time [251,252].

The spectrum below the self-absorption frequency depends on the electron distribution. One obtains the well-known [248] $\nu^{5/2}$ slope when the synchrotron frequency of the electron emitting the self-absorbed radiation is inside the self-absorption range. One obtains a slope of $\nu^2$ if there is self-absorption, but the radiation in that range is due to the low-energy tail of electrons radiating effectively at higher energies. For this latter case, which is more appropriate for GRB afterglow, we find that [18,25]

$$F_\nu\propto\nu^2\,[k_BT_e/(\gamma m_pc^2)]\,R^2, \tag{68}$$

where $R$ is the radius of the radiating shell and the factor $k_BT_e/(\gamma m_pc^2)$ describes the degree of electron equipartition in the plasma shock-heated to an internal energy per particle $\gamma m_pc^2$ and moving with Lorentz factor $\gamma$.

8.4. Inverse Compton emission

Inverse Compton (IC) scattering may modify our analysis in several ways. IC can influence the spectrum even if the system is optically thin (as it must be) to Compton scattering (see e.g. [248]). In view of the high energies involved we assume that only one IC scattering takes place. After this scattering the photon's energy is so high that in the electron's rest frame it is above the Klein-Nishina energy, and in this case the decrease in the Compton cross section makes a further scattering unlikely. The effect of IC depends on the Comptonization parameter $Y=\gamma_e^2\tau_e$. For fast cooling one can show [105] that $Y$ satisfies

$$Y=\begin{cases}\epsilon_e/\epsilon_B & \text{if } \epsilon_e\ll\epsilon_B,\\ \sqrt{\epsilon_e/\epsilon_B} & \text{if } \epsilon_e\gg\epsilon_B.\end{cases} \tag{69}$$
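For quick estimates, the two limits of Eq. (69) can be joined into a single function. The sketch below is illustrative only: Eq. (69) is stated for the limits $\epsilon_e\ll\epsilon_B$ and $\epsilon_e\gg\epsilon_B$, and switching between the two branches at $\epsilon_e=\epsilon_B$ is an assumption made here for convenience, as is the function name.

```python
import math


def compton_y(eps_e, eps_b):
    """Approximate Comptonization parameter Y for fast cooling, after Eq. (69).

    The two asymptotic branches are matched at eps_e = eps_b, where both give
    Y = 1; the behavior near that point is an interpolation assumption.
    """
    ratio = eps_e / eps_b
    if ratio < 1.0:
        return ratio            # eps_e << eps_B: Y = eps_e / eps_B
    return math.sqrt(ratio)     # eps_e >> eps_B: Y = sqrt(eps_e / eps_B)


if __name__ == "__main__":
    print(compton_y(0.01, 1.0))   # weak IC: Y < 1
    print(compton_y(1.0, 0.01))   # strong IC: Y > 1
```

In this form $Y>1$ occurs only when $\epsilon_e>\epsilon_B$, which is the regime in which IC losses matter.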
IC is unimportant if $Y<1$ and in this case it can be ignored.

If $Y>1$, which corresponds to $\epsilon_e>\epsilon_B$ and to $Y=\sqrt{\epsilon_e/\epsilon_B}$, then a large fraction of the low-energy synchrotron radiation will be up-scattered by IC and a large fraction of the energy will be emitted via the IC process. If those IC up-scattered photons are in the observed energy band then the observed radiation will be IC and not synchrotron photons. Those IC photons might be too energetic, that is, their energy may be beyond the observed energy range. In this case IC will not influence the observed spectra directly. However, as IC will take a significant fraction of the energy
of the cooling electrons it will influence the observations in two ways. First, it will shorten the cooling time (the emitting electrons will be cooled by both the synchrotron and the IC process). Second, assuming that the observed $\gamma$-ray photons result from synchrotron emission, IC will influence the overall energy budget and reduce the efficiency of the production of the observed radiation. We turn now to each of these cases.

Consider, first, the situation in which $Y>1$ and the IC photons are in the observed range so that some of the observed radiation may be due to IC rather than synchrotron emission. This is an interesting possibility since one might expect that the IC process will ease the requirement of rather large magnetic fields that is imposed by the synchrotron process. We show here that, somewhat surprisingly, this cannot be the case.

An IC scattering boosts the energy of the photon by a factor $\gamma_e^2$. Typical IC photons will be observed at the energy

$$(h\nu_{IC})_{obs}=\frac{\hbar q_eB}{m_ec}\,\gamma_e^4\gamma_E=12\ \mathrm{MeV}\;B_{1G}\,(\gamma_E)_{100}\left(\frac{\gamma_e}{m_p/m_e}\right)^4, \tag{70}$$

where $B_{1G}=B/1$ G and $(\gamma_E)_{100}=\gamma_E/100$. The Lorentz factor of electrons radiating synchrotron photons which are IC scattered on electrons with the same Lorentz factor and have energy $h\nu$ in the observed range is the square root of the $\gamma_e$ required to produce synchrotron radiation at the same frequency. The required value for $\gamma_e$ is rather low relative to what one may expect in an external shock (in which $\gamma_e\sim\epsilon_e(m_p/m_e)\gamma_{sh}$). In internal shocks we expect lower values ($\gamma_e\sim\epsilon_e(m_p/m_e)$) but in this case the equipartition magnetic field is much stronger (of the order of a few thousand gauss, or higher). Thus IC might produce the observed photons in internal shocks if $\epsilon_B$ is rather small (of order $10^{-12}$).

These electrons are cooled both by synchrotron and by IC. The latter is more efficient and the cooling is enhanced by the Compton parameter $Y$. The cooling time scale is

$$t_{IC}=\frac{6\pi c^{3/4}\sqrt{\epsilon_B/\epsilon_e}\;\hbar^{1/4}m_e^{3/4}q_e^{1/4}}{B^{7/4}(h\nu)^{1/4}\gamma_E^{3/4}\sigma_T}=8\times10^{3}\ \mathrm{s}\;\sqrt{\frac{\epsilon_B}{\epsilon_e}}\;B_{1G}^{-7/4}\,(\gamma_E)_{100}^{-3/4}\,(h\nu/100\ \mathrm{keV})^{-1/4}. \tag{71}$$
As we see in the following discussion, for external shocks $t_{IC}$(100 keV), the IC cooling time if the IC radiation is in the observed range (soft gamma-rays), is too long, while for internal shocks $t_{IC}$(100 keV) is marginal.

However, even if IC does not produce the observed $\gamma$-ray photons it still influences the process if $Y>1$. It will speed up the cooling of the emitting regions and shorten the cooling time, $t_{syn}$, estimated earlier (Eq. (59)) by a factor of $Y$. Additionally, IC also reduces the efficiency by the same factor, and the efficiency becomes extremely low as described below.

8.5. Radiative efficiency

The efficiency of a burst depends on three factors. First, only the electrons' energy is available. This yields a factor $\epsilon_e$. Second, if $\epsilon_B<\epsilon_e$ there is an additional factor of $\min[1,\sqrt{\epsilon_B/\epsilon_e}]$ if the IC
radiation is not observed. Third, there is a specific Lorentz factor, $\hat\gamma_e$, of an electron which emits synchrotron (or IC) radiation in the 100 keV energy band. Therefore, only the energy radiated by electrons with $\gamma_e\ge\hat\gamma_e$ is observed as soft $\gamma$-rays. Assuming a power-law electron distribution with an index $p=2.5$ (see [105]) this gives a factor of $(\gamma_{e,min}/\hat\gamma_e)^{1/2}$ (which is valid, of course, provided that $\gamma_{e,min}<\hat\gamma_e$). The total efficiency is the product of those three factors and it is given by

$$\epsilon=\epsilon_e\,\min[1,\sqrt{\epsilon_B/\epsilon_e}]\,(\gamma_{e,min}/\hat\gamma_e)^{1/2}. \tag{72}$$

The efficiency depends first of all on the electrons' energy density and to a lesser extent on the magnetic energy density. Both should be close to equipartition in order for the efficiency to be large. Additionally, in order that there will be photons in the 100 keV range, $\gamma_{e,min}$ should be smaller than $\hat\gamma_e$. However, efficient production of soft $\gamma$-rays requires that $\gamma_{e,min}$ not be too small compared with $\hat\gamma_e$. This estimate is, of course, different if the observed $\gamma$-rays are produced by inverse Compton scattering.

8.6. Internal shocks

Internal shocks are the leading mechanism for energy conversion and production of the observed gamma-ray radiation. We discuss, in this section, the energy conversion process, the typical radiation frequency and its efficiency.

8.6.1. Parameters for internal shocks

Internal shocks take place when an inner shell overtakes a slower outer shell. Consider a fast inner shell with a Lorentz factor $\gamma_r$ that collides with a slower shell whose Lorentz factor is $\gamma_s$. If $\gamma_r\gtrsim\gamma_s\sim\gamma$ then the inner shell will overtake the outer one at

$$R_\delta\approx\gamma^2\delta\approx10^{14}\ \mathrm{cm}\;\delta_{10}\gamma_{100}^2, \tag{73}$$

where $\delta$ is the initial separation between the shells in the observer's rest frame, $\delta_{10}=\delta/10^{10}$ cm and $\gamma_{100}=\gamma/100$. Clearly internal shocks are relevant only if they appear before the external shock that is produced as the shell sweeps up the ISM. We show in Section 8.7.1 that the necessary condition for internal shocks to occur before the external shock is

$$\xi^{3/2}>\zeta, \tag{74}$$

where $\xi$ and $\zeta$ are two dimensionless parameters. The parameter $\xi$ characterizes the interaction of the flow with the external medium and it is defined in Eq. (92) (see Section 8.7.1). The second parameter, $\zeta$, characterizes the variability of the flow:

$$\zeta\equiv\delta/\Delta\le1. \tag{75}$$
We have seen in Section 7.4 that for internal shocks the duration of the burst is $T\approx\Delta/c$ and the duration of individual spikes is $\delta T\approx\delta/c$. The observed ratio $N$ defined in Section 2.2 must equal $1/\zeta$ and this sets $\zeta\approx0.01$.

The overall duration of a burst produced by internal shocks equals $\Delta/c$. Thus, whereas external shocks require an extremely large value of $\gamma$ to produce a very short burst, internal shocks can
Fig. 23. Different scenarios in the $\Delta$ (in cm) vs. $\gamma$ plane for $\zeta\equiv\delta/\Delta=0.01$. Relativistic ES occur for large $\Delta$ and large $\gamma$ (upper right), above the $\xi=1$ line (dark gray and light gray regions). Newtonian ES occur below $\xi=1$ (lower left, white region). IS occur, if there is sufficient variation in $\gamma$, below the $\xi^{3/2}=\zeta$ line (light gray and white regions). The equal-duration $T=1$ s curve is shown for Newtonian ES (solid line), a relativistic ES (dotted line) and IS (dashed line). Note that a relativistic ES and an internal shock with the same parameters have the same overall duration $T$ but different temporal substructure depending on $\delta$. From [235].
produce a short burst with a modest value of the Lorentz factor $\gamma$. This eases somewhat the baryon purity constraints on source models. Condition (74) can be turned into a condition that $\gamma$ is sufficiently small:

$$\gamma\le2800\,\zeta_{0.01}^{-1/2}\,T_{10}^{-3/8}\,l_{18}^{3/8}, \tag{76}$$

where we have used $T=\Delta/c$ and we have defined $T_{10}=T/10$ s and $\zeta_{0.01}=\zeta/0.01$. It follows that internal shocks take place in a relatively "low" $\gamma$ regime. Fig. 23 depicts the regimes in the physical parameter space ($\Delta$, $\gamma$) in which various shocks are possible. It also depicts an example of a $T=\Delta/c=1$ s line.

Too low a value of the Lorentz factor leads to a large optical depth in the internal shocks region. Using Eq. (27) for $R_e$, at which the optical depth for Compton scattering of the photons on the shell's electrons equals one, Eq. (73) for $R_\delta$ and the condition $R_e\le R_\delta$ we find

$$\gamma\ge\left(\frac{E\sigma_T}{4\pi\delta^2m_pc^2}\right)^{1/5}=130\,T_{10}^{-2/5}\,\zeta_{0.01}^{-2/5}\,E_{52}^{1/5}. \tag{77}$$

In addition, the radius of emission should be large enough so that the optical depth for $\gamma\gamma\to e^+e^-$ will be less than unity ($\tau_{\gamma\gamma}<1$). There are several ways to consider this constraint. The strongest constraint is obtained if one demands that the optical depth of an observed high-energy photon, e.g. 100 MeV, will be less than unity [213,214]. Following these calculations and using Eq. (73) to express $R_\delta$ we find

$$\gamma>570\,(\zeta_{0.01}T_{10})^{-\ldots}. \tag{78}$$
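The window of Lorentz factors allowed by Eqs. (76) and (77) can be explored numerically. The Python sketch below is a hedged illustration: the function names are hypothetical, the upper bound uses the scalings $\zeta_{0.01}^{-1/2}T_{10}^{-3/8}l_{18}^{3/8}$ with $l_{18}=l/10^{18}$ cm (the normalization of $l$ is an assumption of this sketch), and the Compton lower bound is evaluated from $\gamma\ge[E\sigma_T/(4\pi\delta^2m_pc^2)]^{1/5}$ with $\delta=\zeta cT$.

```python
import math

# Rounded CGS constants (illustrative values)
C_LIGHT = 2.998e10     # speed of light [cm/s]
M_P = 1.6726e-24       # proton mass [g]
SIGMA_T = 6.652e-25    # Thomson cross section [cm^2]


def collision_radius(gamma, delta_cm):
    """Eq. (73): radius at which a faster shell overtakes a slower one [cm]."""
    return gamma**2 * delta_cm


def gamma_upper_internal(zeta, duration_s, l_cm):
    """Eq. (76): largest gamma for which internal shocks precede the external shock."""
    return (2800.0 * (zeta / 0.01) ** -0.5
            * (duration_s / 10.0) ** -0.375 * (l_cm / 1.0e18) ** 0.375)


def gamma_lower_compton(zeta, duration_s, energy_erg):
    """Eq. (77): smallest gamma for which the shells are optically thin to Compton scattering."""
    delta_cm = zeta * C_LIGHT * duration_s   # shell separation, delta = zeta * c * T
    ratio = energy_erg * SIGMA_T / (4.0 * math.pi * delta_cm**2 * M_P * C_LIGHT**2)
    return ratio ** 0.2


if __name__ == "__main__":
    lo = gamma_lower_compton(0.01, 10.0, 1.0e52)
    hi = gamma_upper_internal(0.01, 10.0, 1.0e18)
    print(f"allowed range: {lo:.0f} <= gamma <= {hi:.0f}")
    print(f"R_delta at gamma = 100: {collision_radius(100.0, 1.0e10):.1e} cm")
```

For the fiducial values $\zeta=0.01$, $T=10$ s and $E=10^{52}$ erg, the lower bound evaluated this way comes out close to the prefactor 130 of Eq. (77), which supports the exponent 1/5 used here.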
Fig. 24. Allowed regions for internal shocks in the $\delta$ (in cm), $\gamma$ plane. Note that the horizontal $\delta$ axis also corresponds to the typical peak duration, $\delta t$, multiplied by $c$. Internal shocks are impossible in the upper right (light gray) region. The lower boundary of this region depends on $\zeta\equiv\delta/\Delta$ and is marked by two solid curves, the lower one for $\zeta=1$ and the upper one for $\zeta=0.01$. Also shown are $\tau_{\gamma\gamma}=1$ for an observed spectrum with no upper bound (dotted line), $\tau_{\gamma\gamma}=1$ for an observed spectrum with an upper bound of 100 MeV (dashed line) and $\tau_e=1$ (dashed-dotted). The optically thin internal shock region is above the $\tau=1$ curves and below the $\xi^{3/2}=\zeta$ (solid) lines. From [235].
This constraint, which is due to the $\gamma\gamma$ interaction, is generally more important than the constraint due to Compton scattering, that is $\tau_{\gamma\gamma}>\tau_e$.

Eq. (76), and the more restrictive Eq. (78), constrain $\gamma$ to a relatively narrow range:

$$570\,(\zeta_{0.01}T_{10})^{-\ldots}\le\gamma_E\le2800\,\zeta_{0.01}^{-1/2}\,T_{10}^{-3/8}\,l_{18}^{3/8}. \tag{79}$$

This can be translated to a rather narrow range of emission radii:

$$10^{15}\ \mathrm{cm}\;(\zeta_{0.01}T_{10})^{\ldots}\le R_\delta\le2.5\times10^{16}\ \mathrm{cm}\;T_{10}^{1/4}\,l_{18}^{3/4}. \tag{80}$$

In Fig. 24, we plot the allowed regions in the $\gamma$ and $\delta$ parameter space. Using the less restrictive $\tau_e$ limit, Eq. (77), we find

$$5\times10^{13}\ \mathrm{cm}\;\zeta_{0.01}^{1/5}\,T_{10}^{1/5}\,E_{52}^{2/5}\le R_\delta\le2.5\times10^{16}\ \mathrm{cm}\;T_{10}^{1/4}\,l_{18}^{3/4},$$

$$130\,\zeta_{0.01}^{-2/5}\,T_{10}^{-2/5}\,E_{52}^{1/5}\le\gamma_E\le2800\,\zeta_{0.01}^{-1/2}\,T_{10}^{-3/8}\,l_{18}^{3/8}.$$

Three main conclusions emerge from the discussion so far. First, if the spectrum of the observed photons extends beyond 100 MeV (as was the case in the bursts detected by EGRET [85]) and if those high-energy photons are emitted in the same region as the low-energy ones, then the condition on the pair production, $\tau_{\gamma\gamma}$, Eq. (78), is stronger than the condition on Compton scattering, Eq. (77). This increases the required Lorentz factors. Second, the Compton scattering limit (which is independent of the observed high-energy tail of the spectrum) poses also a lower
limit on $\gamma$. However, this is usually less restrictive than the $\tau_{\gamma\gamma}$ limit. Finally, one sees in Fig. 24 that optically thin internal shocks are produced only in a narrow region in the ($\delta$, $\gamma$) plane. The region is quite small if the stronger pair production limit holds. In this case there is no single value of $\gamma$ that can produce peaks over the whole range of observed durations. The allowed region is larger if we use the weaker limits on the opacity. But even with this limit there is no single value of $\gamma$ that produces peaks with all durations. The IS scenario suggests that bursts with very narrow peaks should not have very high-energy tails and that very short bursts may have a softer spectrum.

8.6.2. Physical conditions and emission from internal shocks

Provided that the different parts of the shell have comparable Lorentz factors, differing by a factor of $\sim2$, the internal shocks are mildly relativistic. The protons' thermal Lorentz factor will be of order unity, and the shocked regions will still move highly relativistically towards the observer with approximately the initial Lorentz factor $\gamma$. In front of the shocks the particle density of the shell is given by the total number of baryons, $E/\gamma m_pc^2$, divided by the co-moving volume of the shell at the radius $R_\delta$, which is $4\pi R_\delta^2\Delta\gamma$. The particle density behind the shock is higher by a factor of 7, which is the limiting compression possible by Newtonian shocks (assuming an adiabatic index of relativistic gas, i.e., 4/3). We estimate the pre-shock density of the particles in the shells as $[E/(\gamma m_pc^2)]/(4\pi(\delta\gamma^2)^2\gamma\Delta)$. We introduce $\gamma_{int}$ as the Lorentz factor of the internal shock. As this shock is relativistic (but not extremely relativistic) $\gamma_{int}$ is of order of a few. Using Eq. (48) for the particle density $n$ and the thermal energy density $e$ behind the shocks we find

$$n_{int}\approx\frac{4E\,(\gamma_{int}/2)^{1/2}}{4\pi\gamma^6c^2m_p\delta^2\Delta}=2\times10^{10}\ \mathrm{cm^{-3}}\;E_{52}\,(\gamma_{int}/2)^{1/2}\,\gamma_{100}^{-6}\,\Delta_{12}^{-1}\,\delta_{10}^{-2}, \tag{81}$$

$$e_{int}=(\gamma_{int}/2)^{1/2}\,n_{int}m_pc^2. \tag{82}$$

We have defined here $\Delta_{12}=\Delta/10^{12}$ cm. Using Eq. (49) we find

$$B_{int}=6\times10^{4}\ \mathrm{G}\;\epsilon_B^{1/2}\,E_{52}^{1/2}\,\gamma_{100}^{-3}\,\Delta_{12}^{-1/2}\,\delta_{10}^{-1}\,(\gamma_{int}/2)^{1/2}. \tag{83}$$

Using Eqs. (49), (51), (56) and (81) we can estimate the typical synchrotron frequency from an internal shock. This is the synchrotron frequency of an electron with a "typical" Lorentz factor:

$$(h\nu_{syn})_{obs}\big|_{\langle\gamma_e\rangle}=\frac{\hbar q_eB}{m_ec}\,\gamma_e^2\gamma=220\ \mathrm{keV}\;E_{52}^{1/2}\,\epsilon_B^{1/2}\,\zeta_{0.01}^{-1}\,(\gamma_{int}/2)^{1/2}\,\gamma_{100}^{-2}\,\Delta_{12}^{-3/2}\,[\gamma_e/(m_p/m_e)]^2. \tag{84}$$

The corresponding observed synchrotron cooling time is

$$t_{syn}\big|_{\langle\gamma_e\rangle}=1.3\times10^{-6}\ \mathrm{s}\;\epsilon_B^{-1}\,\delta_{10}^{2}\,\Delta_{12}\,\gamma_{100}^{5}\,E_{52}^{-1}\,(\gamma_{int}/2)^{-1}. \tag{85}$$

Using Eq. (53) we can express $\gamma_{e,min}$ in terms of $\gamma_{int}$ to estimate the minimal synchrotron frequency:

$$(h\nu_{syn})_{obs}\big|_{\gamma_{e,min}}=24\ \mathrm{keV}\;E_{52}^{1/2}\,\epsilon_B^{1/2}\,\epsilon_e^{2}\,\zeta_{0.01}^{-1}\,(\gamma_{int}/2)^{3/2}\,\gamma_{100}^{-2}\,\Delta_{12}^{-3/2}. \tag{86}$$
The energy emitted by a `typical electrona is around 220 keV. The energy emitted by a `minimal energya electron is about one order of magnitude lower than the typical observed energy of &100 keV. This should correspond to the break energy of the spectrum. This result seems in a good agreement with the observations. But this estimate might be misleading as both e and e might be signi"cantly lower than unity. Still these values of (hl ) are remarkably close to the C observations. One might hope that this might explain the observed lower cuto! of the GRB spectrum. Note that a lower value of e or e might be compensated by a higher value of c . This is C advantageous as shocks with higher c are more e$cient (see Section 8.6.4). The synchrotron cooling time at a given frequency (in the observer's frame) is given by
hl \ t (hl)"2;10\ s e\ 100 keV ;d*c E(c /2)\. (87) We recover the general trend t J(hl)\ of synchrotron emission. However if (as we expect quite generally) this cooling time is much shorter than ¹ it does not determine the width of the observed peaks. It will correspond to the observed time scales if, for example, e is small. But then the `typicala photon energy will be far below the observed range. Therefore, it is not clear this relation can explain the observed dependence of the width of the bursts on the observed energy. 8.6.3. Inverse Compton in internal shocks The calculations of Section 8.4 suggest that the typical inverse Compton (IC) (actually synchrotron } self-Compton) radiation from internal shocks will be at energy higher by a factor c then the C typical synchrotron frequency. Since synchrotron emission is in the keV range and c +m /m , C N C the expected IC emission should be in the GeV or even TeV range. This radiation might contribute to the prompt very high energy emission that accompanies some of the GRBs [85]. However, if the magnetic "eld is extremely low: e &10\ then we would expect the IC photons to be in the observed &100 keV region: hl "800 keV (e /10\) d\*\c\ [c /(mp/me)]E[c /2]. (88) '!\ C Using Eqs. (71) and (83) we "nd that the cooling time for synchrotron-self-Compton in this case is (89) q "1 s e\(e /10\)\ d * c [c /(mp/me)]\E\(c /2)\. C C '! This is marginal. It is too large for some bursts and possibly adequate for others. It could possibly be adjusted by a proper choice of the parameters. It is more likely that if Inverse Compton is important then it contributes to the very high (GeV or even TeV) signal that accompanies the lower energy GRB (see also [253]). 8.6.4. Ezciency in internal shocks The elementary unit in the internal shock model (see Section 7.4) is a binary (two shells) encounter between a rapid shell (denoted by the subscript r) that catches up a slower one (denoted s). 
The two shells merge to form a single shell (denoted m). The system behaves like an inelastic collision between two masses $m_r$ and $m_s$.
The efficiency of a single collision between two shells was calculated earlier in Section 8.1.1. For multiple collisions the efficiency depends on the nature of the random distribution. It is highest if the energy is distributed equally among the different shells. This can be explained analytically. Consider a situation in which the mass of the shell, $m_i$, is correlated with the (random) Lorentz factor, $\gamma_i$, as $m_i \propto \gamma_i^{\eta}$. Let all the shells collide and merge and only then emit the thermal energy as radiation. Using conservation of energy and momentum we can calculate the overall efficiency:
$$\epsilon = 1 - \frac{\sum_i \gamma_i^{\eta}}{\sqrt{\sum_i \gamma_i^{\eta-1}\,\sum_i \gamma_i^{\eta+1}}} . \tag{90}$$
Averaging over the random variables $\gamma_i$, and assuming a large number of shells $N \to \infty$, we obtain
$$\langle\epsilon\rangle \sim 1 - \frac{\sqrt{\eta(\eta+2)}}{\eta+1}\,\frac{(\gamma_{max}/\gamma_{min})^{\eta+1}-1}{\sqrt{[(\gamma_{max}/\gamma_{min})^{\eta}-1]\,[(\gamma_{max}/\gamma_{min})^{\eta+2}-1]}} . \tag{91}$$
This formula explains qualitatively the numerical results: the efficiency is maximal when the energy is distributed equally among the different shells (which corresponds to $\eta = -1$). In a realistic situation we expect that the internal energy will be emitted after each collision, and not after all the shells have merged. In this case there is no simple analytical formula. However, numerical calculations show that the efficiency of this process is low (less than 2%) if the initial spread in $\gamma$ is only a factor of two [32]. However, the efficiency could be much higher [33]. The most efficient case is when the shells have a comparable energy but very different Lorentz factors. In this case ($\eta = -1$, and a spread of Lorentz factor $\gamma_{max}/\gamma_{min} > 10$) the efficiency is as high as 40%.
For a moderate spread of Lorentz factor, $\gamma_{max}/\gamma_{min} = 10$, with $\eta = -1$, the efficiency is 20%.
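The merged-shell efficiency of Eq. (90) is easy to evaluate for any list of Lorentz factors. The following is a minimal sketch, not taken from the paper: the function name `merger_efficiency` and the evenly spaced grid of Lorentz factors are illustrative assumptions, and the formula is the relativistic ($\gamma_i \gg 1$) limit in which all shells merge before radiating.

```python
import math

def merger_efficiency(gammas, eta):
    """Eq. (90): efficiency when shells with masses m_i ~ gamma_i**eta
    all merge before radiating (valid for gamma_i >> 1)."""
    sum_mass = sum(g ** eta for g in gammas)            # ~ total rest mass
    sum_low = sum(g ** (eta - 1.0) for g in gammas)     # ~ sum of m_i / gamma_i
    sum_high = sum(g ** (eta + 1.0) for g in gammas)    # ~ total energy
    return 1.0 - sum_mass / math.sqrt(sum_low * sum_high)

# Hypothetical example: equal-energy shells (eta = -1) with Lorentz factors
# spread evenly over one decade, gamma_max / gamma_min = 10.
grid = [100.0 + 900.0 * k / 10000 for k in range(10001)]
print(round(merger_efficiency(grid, -1.0), 3))
```

For this grid the all-merge formula gives an efficiency of roughly 0.19. Identical Lorentz factors give zero efficiency, as expected for shells that never collide.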
The efficiency discussed so far is the efficiency of conversion of kinetic energy to internal energy. One should multiply this by the radiative efficiency, discussed in Section 8.5 (Eq. (72)), to obtain the overall efficiency of the process. The resulting values may be rather small, and this indicates that some sort of beaming may be required in most GRB models in order not to come up with an unreasonable energy requirement.
8.6.5. Summary – internal shocks
Internal shocks provide the best way to explain the observed temporal structure in GRBs. These shocks, which take place at distances of &10 cm from the center, convert 2–20% of the kinetic energy of the flow to thermal energy. Under reasonable conditions the typical synchrotron frequency of the relativistic electrons in the internal shocks is around 100 keV, more or less in the observed region. Internal shocks require a variable flow. The situation in which an inner shell is faster than an outer shell is unstable [254]. The instability develops before the shocks form and it may affect the energy conversion process. The full implications of this instability are not understood yet. Internal shocks can extract at most half of the shell's energy [32,33,70]. A highly relativistic flow with a kinetic energy and a Lorentz factor comparable to the original one remains after the internal shocks. Sari and Piran [20] pointed out that if the shell is surrounded by ISM and a collisionless shock occurs, the relativistic shell will dissipate by "external shocks" as well. This predicts an additional smooth burst, with a comparable or possibly greater energy. This is most probably the source of the observed "afterglow" seen in some counterparts to GRBs, which we discuss later. This leads to the internal–external scenario [255,20,26] in which the GRB itself is produced by an
internal shock, while the "afterglow" that was observed to follow some GRBs is produced by an external shock. The main concern with the internal shock model is its low efficiency of conversion of kinetic energy to $\gamma$-rays. This could be of order 20% under favorable conditions and significantly lower otherwise. If we assume that the "inner engine" is powered by the gravitational binding energy of a compact object (see Section 10.1), a low efficiency may require beaming to overcome an overall energy crisis.
8.7. Shocks with the ISM – external shocks
We turn now to the interaction of a relativistic shell with the ISM. We have seen in Section 7.4 that external shocks cannot produce bursts with a complicated temporal structure. Still it is worthwhile to explore this situation. First, there are some smooth bursts that might be produced in this way. Second, one needs to understand the evolution of external shocks in order to see why they cannot satisfy the condition $R_E/\gamma_E^2 \le \Delta$. Third, it is possible that in some bursts emission is observed from both internal and external shocks [256]. Finally, as we see in the following Section 9, the observed afterglow is most likely produced by external shocks.
8.7.1. Newtonian vs. relativistic reverse shocks
The interaction between a relativistic flow and an external medium depends, as in SNRs, on the Sedov length, $l \equiv (E/n_{ism} m_p c^2)^{1/3}$. The ISM rest mass energy within a volume $l^3$ equals the energy of the GRB, $E$. For a canonical cosmological burst with $E \approx 10^{52}$ erg and a typical ISM density $n_{ism} = 1$ particle/cm$^3$ we have $l \approx 10^{18}$ cm. A second length scale that appears in the problem is $\Delta$, the width of the relativistic shell in the observer's rest frame. There are two possible types of external shocks [236]. They are characterized according to the nature of the reverse shock: Newtonian reverse shock (NRS) vs. relativistic reverse shock (RRS). If the reverse shock is relativistic (RRS) then it significantly reduces the kinetic energy of each layer that it crosses.
Each layer within the shell loses its energy independently of the rest of the shell. The energy conversion process is over once the reverse shock crosses the shell (see Fig. 13). A Newtonian or even mildly relativistic reverse shock (NRS) is comparatively weak. Such a shock reduces the energy of the layer that it crosses by a relatively small amount. Significant energy conversion takes place only after the shock has crossed the shell several times, after it has been reflected as a rarefaction wave from the inner edge (see Fig. 12). The shell behaves practically like a single object and it loses its energy only by the time that it accumulates an external mass equal to $M/\gamma$. Which scenario takes place depends on the parameters of the shell relative to the parameters of the ISM. As we see shortly, it depends on a single dimensionless parameter $\xi$ constructed from $l$, $\Delta$ and $\gamma$ [236]:
$$\xi \equiv (l/\Delta)^{1/2}\,\gamma^{-4/3} . \tag{92}$$
As the shell propagates outwards it is initially very dense and the density ratio between the shell and the ISM, $f \equiv n_4/n_1$, is extremely large (more specifically $f > \gamma^2$). The reverse shock is initially Newtonian (see Eq. (43)). Such a shock converts only a small fraction of the kinetic energy to thermal energy. As the shell propagates the density ratio, $f$, decreases (like $R^{-2}$ if the width of the
shell is constant and like $R^{-3}$ if the shell is spreading). Eventually the reverse shock becomes relativistic at $R_N$, where $f = \gamma^2$. Where the kinetic energy is converted depends on whether the reverse shock reaches the inner edge of the shell before or after it becomes relativistic. There are four different radii that should be considered. The following estimates assume a spherically symmetric shell, or that $E$ and $M$ are the energy and rest mass divided by the fraction of a sphere into which they are launched. The reverse shock becomes relativistic at $R_N$, where $f/\gamma^2 = n_4/(n_1\gamma^2) = 1$:
$$R_N = l^{3/2}/(\Delta^{1/2}\gamma^{2}) . \tag{93}$$
Using the expression for the velocity of the reverse shock into the shell (Eq. (46)) we find that the reverse shock reaches the inner edge of the shell at $R_\Delta$ [236]:
$$R_\Delta = l^{3/4}\Delta^{1/4} . \tag{94}$$
A third radius is $R_\gamma$, where the shell collects an ISM mass of $M/\gamma$ [27,18]. For NRS this is where an effective energy release occurs:
$$R_\gamma = \frac{l}{\gamma^{2/3}} = \left(\frac{E}{n_{ism} m_p c^2 \gamma^2}\right)^{1/3} = 5.4\times 10^{16}\ {\rm cm}\ E_{52}^{1/3}\, n_1^{-1/3}\, \gamma_{100}^{-2/3} , \tag{95}$$
where we defined $n_1 = n_{ism}/1$ particle/cm$^3$. Finally, we have $R_\delta = \delta\gamma^2$ (see Eq. (73)). The different radii are related by the dimensionless parameter $\xi$, and this determines the character of the shock:
$$R_\delta/\zeta = R_\Delta/\xi^{3/2} = R_\gamma\,\xi^{-2} = R_N/\xi^{3} . \tag{96}$$
If $\xi > 1$ then
$$R_\delta < R_\Delta < R_\gamma < R_N . \tag{97}$$
The reverse shock remains Newtonian or at best mildly relativistic during the whole energy extraction process. The reverse shock reaches the inner edge of the shell at $R_\Delta$ while it is still Newtonian. At this stage a reflected rarefaction wave begins to move forwards. This wave is, in turn, reflected from the contact discontinuity between the shell's material and the ISM material, and another reverse shock begins. The overall outcome of these waves is that in this case the shell acts as a single fluid element of mass $M \approx E/\gamma c^2$ that is interacting collectively with the ISM. It follows from Eq. (39) that an external mass $m = M/\gamma$ is required to reduce $\gamma$ to $\gamma/2$ and to convert half of the kinetic energy to thermal energy. Energy conversion takes place at $R_\gamma$. Comparison of $R_\gamma$ with $R_e$ (Eq. (27)) shows that the optical depth is much smaller than unity.
If the shell propagates with a constant width then $R_N/\xi = R_\gamma = \sqrt{\xi}\,R_\Delta$ (see Fig. 25) and for $\xi > 1$ the reverse shock remains Newtonian during the energy extraction period. If there are significant variations in the particles' velocity within the shell it will spread during the expansion. If the typical variation in $\gamma$ is of the same order as $\gamma$ then the shell width increases like $R/\gamma^2$. Thus $\Delta$ changes with time in such a manner that at each moment the current width, $\Delta(t)$, satisfies $\Delta(t) \sim \max[\Delta(0), R/\gamma^2]$. This delays the time that the reverse shock reaches the inner edge of the shell and increases $R_\Delta$. It also reduces the shell's density which, in turn, reduces $f$ and leads to a decrease in $R_N$. The overall result is a triple coincidence $R_N \approx R_\gamma \approx R_\Delta$ with a mildly relativistic reverse shock and a significant energy conversion in the reverse shock as well. This means that due to spreading a shell which begins with a value of $\xi > 1$ adjusts itself so as to satisfy $\xi = 1$.
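The classification by $\xi$ can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: the function name `external_shock_radii` and the sample values of $l$, $\Delta$ and $\gamma$ are hypothetical, numerical factors of order unity are ignored as in Eqs. (92)–(95), and spreading of the shell is not modelled.

```python
import math

def external_shock_radii(l, Delta, gamma):
    """Return xi (Eq. 92) and the radii R_N, R_Delta, R_gamma (Eqs. 93-95).

    l and Delta are lengths in the same units (e.g. cm); gamma is the
    initial Lorentz factor of the shell. Constant shell width is assumed.
    """
    xi = math.sqrt(l / Delta) * gamma ** (-4.0 / 3.0)
    return {
        "xi": xi,
        "R_N": l ** 1.5 / (math.sqrt(Delta) * gamma ** 2),
        "R_Delta": l ** 0.75 * Delta ** 0.25,
        "R_gamma": l / gamma ** (2.0 / 3.0),
        # xi > 1: Newtonian reverse shock; otherwise relativistic reverse shock
        "regime": "NRS" if xi > 1.0 else "RRS",
    }

# Hypothetical shells: l = 1e18 cm, Delta = 1e12 cm, two Lorentz factors.
slow = external_shock_radii(1e18, 1e12, 100.0)
fast = external_shock_radii(1e18, 1e12, 1000.0)
print(slow["regime"], fast["regime"])
```

With these sample values the $\gamma = 100$ shell has $\xi \approx 2.2$ (NRS ordering $R_\Delta < R_\gamma < R_N$), while the $\gamma = 1000$ shell has $\xi = 0.1$ (RRS ordering $R_N < R_\gamma < R_\Delta$). In both cases $R_N/\xi = R_\gamma = \sqrt{\xi}\,R_\Delta$.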
Fig. 25. (a) Schematic description of the di!erent radii for the case m'1. The di!erent distances are marked on a logarithmic scale. Beginning from the inside we have *R, the initial size of the shell, R , the radius in which a "reball E becomes matter-dominated (see the following discussion), R , the radius where inner shells overtake each other and A collide, R , where the reverse shock reaches the inner boundary of the shell, and RC, where the kinetic energy of the shell is converted into thermal energy. (b) Same as (a) for m(1. RC does not appear here since it is not relevant. R marks the , place where the reverse shock becomes relativistic. From [31].
For $\xi \ge 1$ we find that $T_{radial} \sim T_{ang} \sim R_\gamma/\gamma^2 c > \Delta/c$. Therefore, NRS can produce only smooth bursts. The bursts' duration is determined by the slowing down time of the shell. In Section 7 we have shown that only one time scale is possible in this case. Given the typical radius of energy conversion, $R_\gamma$, this time scale is
$$\delta T \approx T_{ang} \approx R_\gamma/(\gamma_E^2 c) \approx R_\gamma/(\gamma^2 c) \approx 170\ {\rm s}\ E_{52}^{1/3}\, n_1^{-1/3}\, \gamma_{100}^{-8/3} . \tag{98}$$
If $\gamma$ or $\Delta$ are larger then $\xi < 1$. In this case the order is reversed:
$$R_N < R_\gamma < R_\Delta . \tag{99}$$
The reverse shock becomes relativistic very early (see Fig. 25). Since $\gamma_2 = \gamma_3 \ll \gamma$, the relativistic reverse shock converts the kinetic energy of the shell to thermal energy very efficiently. Each layer of the shell that is shocked loses effectively all its kinetic energy at once, and the time scale of converting the shell's kinetic energy to thermal energy is the shell crossing time. The kinetic energy is consumed at $R_\Delta$, where the reverse shock reaches the inner edge of the shell. Using Eq. (94) for $R_\Delta$ and Eq. (45) we find that at $R_\Delta$
$$\gamma_E = \gamma_2 = (l/\Delta)^{3/8} . \tag{100}$$
Note that $\gamma_E$ is independent of $\gamma$. The observed radial or angular time scales are
$$T_{radial} \approx T_{ang} \approx R_\Delta/(\gamma_E^2 c) \approx \Delta/c = 30\ {\rm s}\ \Delta_{12} . \tag{101}$$
Thus, even for RRS we find that $\delta T \sim T$ and there is only one time scale. This time scale depends only on $\Delta$ and it is independent of $\gamma$! Spreading does not affect this estimate since for $\xi < 1$ spreading does not occur before the energy extraction. In the following discussions we focus on the RRS case and we express all results in terms of the parameter $\xi$. By setting $\xi < 1$ in the expressions we obtain results corresponding to RRS, and by
choosing $\xi = 1$ in the same expressions we obtain the spreading NRS limit. We shall not discuss the case of non-spreading NRS ($\xi \gg 1$), since spreading will always bring these shells to the mildly relativistic limit ($\xi \sim 1$). Therefore, in this way, the same formulae are valid for both the RRS and NRS limits. If $\xi > 1$ it follows from Eq. (97) that internal shocks will take place before external shocks. If $\xi < 1$ then the condition for internal shocks, $R_\delta < R_\Delta$, becomes Eq. (74): $\xi^{3/2} > \zeta$. As we have seen earlier (see Section 8.6.1) this sets an upper limit on $\gamma$ for internal shocks.
8.7.2. Physical conditions in external shocks
The interaction between the outward moving shell and the ISM takes place in the form of two shocks: a forward shock that propagates into the ISM and a reverse shock that propagates into the relativistic shell. This results in four distinct regions: the ISM at rest (denoted by the subscript 1 when we consider properties in this region), the shocked ISM material which has passed through the forward shock (subscript 2 or f), the shocked shell material which has passed through the reverse shock (3 or r), and the unshocked material in the shell (4); see Fig. 21. The nature of the emitted radiation and the efficiency of the cooling processes depend on the conditions in the shocked regions 2 and 3. Both regions have the same energy density $e$. The particle densities $n_2$ and $n_3$ are, however, different and hence the effective "temperatures", i.e. the mean Lorentz factors of the random motions of the shocked protons and electrons, are different. The bulk of the kinetic energy of the shell is converted to thermal energy via the two shocks at around the time the shell has expanded to the radius $R_\Delta$. At this radius, the conditions at the forward shock are as follows:
$$\gamma_2 = \gamma\xi^{3/4}, \quad n_2 = 4\gamma_2 n_1, \quad e_2 = 4\gamma_2^2 n_1 m_p c^2 , \tag{102}$$
while at the reverse shock we have
$$\bar\gamma_3 = \xi^{-3/4}, \quad \gamma_3 = \gamma\xi^{3/4}, \quad n_3 = 4\xi^{9/4}\gamma^2 n_1, \quad e_3 = e_2 . \tag{103}$$
Substitution of $\gamma_{sh} = \gamma_2 = \gamma\xi^{3/4}$ in Eq. (49) yields
$$B = \sqrt{32\pi}\, c\, \epsilon_B^{1/2}\, \gamma\, \xi^{3/4}\, m_p^{1/2}\, n_1^{1/2} = (40\ {\rm G})\ \epsilon_B^{1/2}\, \xi^{3/4}\, \gamma_{100}\, n_1^{1/2} . \tag{104}$$
If the magnetic field in region 2 behind the forward shock is obtained purely by shock compression of the ISM field, the field would be very weak, with $\epsilon_B \ll 1$. Such low fields are incompatible with observations of GRBs. We therefore consider the possibility that there may be some kind of a turbulent instability which may bring the magnetic field to approximate equipartition. In the case of the reverse shock, magnetic fields of considerable strength might be present in the pre-shock shell material if the original exploding fireball was magnetic. The exact nature of magnetic field evolution during fireball expansion depends on several assumptions. Thompson [227] found that the magnetic field will remain in equipartition if it started off originally in equipartition. Mészáros et al. [243], on the other hand, estimated that if the magnetic field was initially in equipartition then it would be below equipartition by a factor of 10\ by the time the shell expands to $R_\Delta$. It is uncertain which, if either, is right. As in the forward shock, an instability could boost the field back to equipartition. Thus, while both shocks may have $\epsilon_B \ll 1$ with pure flux freezing, both could achieve $\epsilon_B \to 1$ in the presence of instabilities. In principle, $\epsilon_B$ could be different for the two shocks, but we limit ourselves to the same $\epsilon_B$ in both shocks.
In both regions 2 and 3 the electrons have a power-law distribution with a minimal Lorentz factor $\gamma_{e,min}$ given by Eq. (53) with the corresponding Lorentz factors for the forward and the reverse shock.
8.7.3. Synchrotron cooling in external shocks
The typical energy of synchrotron photons as well as the synchrotron cooling time depend on the Lorentz factor $\gamma_e$ of the relativistic electrons under consideration and on the strength of the magnetic field. Using Eq. (53) for $\gamma_{e,min}$ we find the characteristic synchrotron energy for the forward shock:
$$(h\nu_{syn})_{obs}|_{\gamma_{e,min}} = 160\ {\rm keV}\ \epsilon_B^{1/2}\, \epsilon_e^{2}\, (\gamma_2/100)^{4}\, n_1^{1/2} \tag{105}$$
and t " C "0.085 s e\e\(c /100)\n\ . (106) A C The characteristic frequency and the corresponding cooling time for the `typicala electron are larger by a factor of [(p!2)/(p!1)] and shorter by a factor of [(p!2)/(p!1)], correspondingly. These photons seems to be right in the observed soft gamma-ray range. However, one should recall that the frequency calculated in Eq. (105) depends on the fourth power of c . An increase of the canonical c by a factor of 3 (that is c "300 instead of c "100) will yield a `typicala synchrotron emission at the 16 MeV instead of 160 keV. The Lorentz factor of a `typical electrona in the reverse shock is lower by a factor m. Therefore, the observed energy is lower by a factor m while the cooling time scale is longer by a factor m\. Alternatively, we can check the conditions in order that there are electrons with a Lorentz factor c( that be emitting soft gamma-rays with energies &100 keV. Using Eq. (56) we calculate c( : C C m chl hl "5;10e\ c\ mn\ . (107) c( " C C
q c B 100 keV C Electrons with c "c( are available in the shocked material if c (c( . This corresponds to the C C C C condition
hl e (80e\ c\ n\ CP @ 100 keV
(108)
in the reverse shock, and the condition
hl c\ m\n\ e D(0.8e\ C 100 keV
(109)
in the forward shock. Since by definition $\epsilon_e \le 1$, we see that the reverse shock always has electrons with the right Lorentz factors to produce soft gamma-ray synchrotron photons. However, the situation is marginal in the case of the forward shock. If $\gamma > 100$ and if the heating of the electrons is efficient, i.e. if $\epsilon_e \sim 1$, then most of the electrons may be too energetic. Of course, as an electron cools, it radiates at progressively softer energies. Therefore, even if $\gamma_e$ is initially too large for the
synchrotron radiation to be in soft gamma-rays, the same electrons would at a later time have $\gamma_e \sim \hat\gamma_e$ and become visible. However, the energy remaining in the electrons at the later time will
also be lower (by a factor $\hat\gamma_e/\gamma_e$), which means that the burst will be inefficient. For simplicity, we
ignore this radiation. Substituting the value of $\hat\gamma_e$ from Eq. (107) into the cooling rate, Eq. (58), we obtain the cooling time scale as a function of the observed photon energy to be
hl \ c\ n\ . t (hl)"1.4;10\ s e\ 100 keV
(110)
Eq. (110) is valid for both the forward and reverse shock, and is moreover independent of whether the reverse shock is relativistic or Newtonian. The cooling time calculated above sets a lower limit to the variability time scale of a GRB since the burst cannot possibly contain spikes that are shorter than its cooling time. However, it is unlikely that this cooling time actually determines the observed time scales. 8.8. The internal}external scenario Internal shocks can convert only a fraction of the total energy to radiation [32,33,70]. After the #ow has produced a GRB via internal shocks it will interact via an external shock with the surrounding medium [20]. This will produce the afterglow } a signal that will follow the GRB. The idea of an afterglow in other wavelengths was suggested earlier [17,18,21] but it was suggested as a follow up of the, then standard, external shock scenario. In this case, the afterglow would have been a direct continuation of the GRB activity and its properties would have scaled directly to the properties of the GRB. According to internal}external models (internal shocks for the GRB and external shocks for the afterglow) di!erent mechanisms produce the GRB and the afterglow. Therefore, the afterglow should not be scaled directly to the properties of the GRB. This was in fact seen in the recent afterglow observations [25,26]. In all models of external shocks the observed time satisfy tJR/c C and the typical frequency satis"es lJc. Since most of the emission takes place at practically the C same radius and all that we see is the variation of the Lorentz factor we expect quite generally [25]: lJt!ι. The small parameter ι re#ects the variation of the radius and it depends on the speci"c assumptions made in the model. We would expect that t /t &5 and t /t &300. The observations V A A of GRB970508 show that (t /t ) +10. This is in a clear disagreement with the single A external shock model for both the GRB and the afterglow. 
Under quite general conditions the initial typical synchrotron energy for either the forward or the reverse external shock may fall in the soft GRB band. In this case the initial stage of the afterglow might overlap the $\gamma$-ray emission from the internal shock [256]. The result will be a superposition of a rapidly varying signal on top of a long, smooth and softening pulse. This possibility should be explored in greater detail.
9. Afterglow
It is generally believed that the observed afterglow results from the slowing down of a relativistic shell on the external ISM. The afterglow is produced, in this case, by an external shock. A second
Table 6
Afterglow models

                            Slow cooling              Fast cooling
Adiabatic hydrodynamics     Arbitrary $\epsilon_e$    $\epsilon_e < 1$
Radiative hydrodynamics     Impossible                $\epsilon_e \approx 1$
alternative is that of "continuous emission". The "inner engine" that powers the GRB continues to emit energy for a much longer duration with a lower amplitude (36) and may produce the earlier part (first day or two in GRB970228 and GRB970508) of the afterglow. It is most likely that both processes take place to some extent [26]. We discuss in this section theoretical models for the production of the afterglow, focusing on the external shock model.
9.1. Hydrodynamics of a slowing down relativistic shell
Within the external shock model there are several possible physical assumptions that one can make. The "standard" model assumes adiabatic hydrodynamics (energy losses are negligible and do not influence the hydrodynamics), slow cooling (the electrons radiate a small fraction of the energy that is generated by the shock) and synchrotron emission [17,18,21–23,250,47]. However, there are other possibilities. First, the electrons' energy might be radiated rapidly. In this case the radiation process is fast and the observed flux is determined by the rate of energy generation by the shock. If the electrons carry a significant fraction of the total internal energy, fast cooling will influence the hydrodynamics, which will not be adiabatic any more. In this case we have a radiative solution [25,24] which differs in its basic scaling laws from the adiabatic one. The different possibilities are summarized in Table 6.
9.1.1. A simple collisional model
We consider first a simple model for the slowing down of the shell. In this model the slowing down is described by a series of infinitesimal inelastic collisions between the shell and infinitesimal external masses. We assume a homogeneous shell described by its rest frame energy $M$ (rest mass and thermal energy) and its Lorentz factor $\gamma$. Initially, $E_0 = M_0 c^2 \gamma_0$. The shell collides with the surrounding matter. We denote the mass of the ISM that has already collided with the shell by $m(R)$. As the shell propagates it sweeps up more ISM mass.
Additional ISM mass elements, $dm$, which are at rest, collide inelastically with the shell. Energy and momentum conservation yield
$$\frac{d\gamma}{\gamma^2 - 1} = -\frac{dm}{M} \tag{111}$$
and
$$dE = (\gamma - 1)\,dm , \tag{112}$$
where $dE$ is the thermal energy produced in this collision. We define $\epsilon$ as the fraction of the shock-generated thermal energy (relative to the observer frame) that is radiated. The incremental total
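mass satisfies
$$dM = (1-\epsilon)\,dE + dm = [(1-\epsilon)\gamma + \epsilon]\,dm . \tag{113}$$
These equations yield analytic relations between the Lorentz factor and the total mass of the shell,
$$\frac{(\gamma-1)(\gamma+1)^{1-2\epsilon}}{(\gamma_0-1)(\gamma_0+1)^{1-2\epsilon}} = (M/M_0)^{-2} , \tag{114}$$
and between $m(R)$ (and therefore $R$) and $\gamma$:
$$\frac{m(R)}{M_0} = -(\gamma_0-1)^{1/2}(\gamma_0+1)^{1/2-\epsilon} \int_{\gamma_0}^{\gamma} (\gamma'-1)^{-3/2}(\gamma'+1)^{-3/2+\epsilon}\,d\gamma' . \tag{115}$$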
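The collisional model can also be integrated numerically and compared with the analytic relation (114). The following is a minimal sketch under stated assumptions: it works in units with $c = 1$ and $M_0 = 1$, uses a fixed-step fourth-order Runge–Kutta scheme chosen here for simplicity, and the names `evolve_shell` and `invariant` are illustrative, not from the paper.

```python
def evolve_shell(gamma0, M0=1.0, eps=0.0, m_end=0.1, steps=20000):
    """Integrate Eqs. (111) and (113) in the swept-up mass m (units c = 1).

    gamma0, M0: initial Lorentz factor and rest-frame energy of the shell.
    eps: fraction of the shock-generated thermal energy that is radiated.
    Returns (gamma, M) after a mass m_end has been swept up.
    """
    def rhs(gamma, M):
        dgamma_dm = -(gamma ** 2 - 1.0) / M      # Eq. (111)
        dM_dm = (1.0 - eps) * gamma + eps        # Eq. (113)
        return dgamma_dm, dM_dm

    h = m_end / steps
    gamma, M = gamma0, M0
    for _ in range(steps):
        k1 = rhs(gamma, M)
        k2 = rhs(gamma + 0.5 * h * k1[0], M + 0.5 * h * k1[1])
        k3 = rhs(gamma + 0.5 * h * k2[0], M + 0.5 * h * k2[1])
        k4 = rhs(gamma + h * k3[0], M + h * k3[1])
        gamma += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        M += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return gamma, M

def invariant(gamma, M, eps):
    """Quantity that Eq. (114) states is conserved along the evolution."""
    return (gamma - 1.0) * (gamma + 1.0) ** (1.0 - 2.0 * eps) * M ** 2

g_ad, M_ad = evolve_shell(100.0, eps=0.0)    # fully adiabatic
g_rad, M_rad = evolve_shell(100.0, eps=1.0)  # fully radiative
print(g_ad, g_rad)
```

For the same swept-up mass the radiative shell ends with a lower Lorentz factor than the adiabatic one, and in both cases the combination $(\gamma-1)(\gamma+1)^{1-2\epsilon}M^2$ stays constant to numerical precision.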
These relations completely describe the hydrodynamical evolution of the shell. Two basic features can be seen directly from Eq. (115). First, we can estimate the ISM mass $m$ that should be swept to get significant deceleration. Solving Eq. (115) with an upper limit $\gamma_0/2$ and using $\gamma_0 \gg 1$ we obtain the well-known result: a mass $m \cong M_0/(2\gamma_0)$ is required to reach $\gamma = \gamma_0/2$. Apparently, this result is independent of the cooling parameter $\epsilon$. A second simple result can be obtained in the limit that $\gamma_0 \gg \gamma \gg 1$:
$$m(R) = \frac{M_0}{(2-\epsilon)\gamma_0}\,(\gamma/\gamma_0)^{-2+\epsilon} , \tag{116}$$
so that $\gamma \propto R^{-3/(2-\epsilon)}$. For $\epsilon = 0$ this yields the well-known adiabatic result
$$\frac{4\pi}{3} R^3 n m_p c^2 \gamma^2 = E_0 , \tag{117}$$
and $\gamma \propto R^{-3/2}$ [241,18,22,23,250]. For $\epsilon = 1$ this yields the completely radiative result
$$\frac{4\pi}{3} R^3 n m_p c^2 \gamma\gamma_0 = E_0 , \tag{118}$$
and $\gamma \propto R^{-3}$ [241,24,25]. For comparison with observations we have to calculate the observed time that corresponds to different radii and Lorentz factors. The well-known formula
$$t_{obs} = R/2\gamma^2 c \tag{119}$$
is valid only for emission along the line of sight from a shell that propagates with a constant velocity. Sari [256] pointed out that as the shell decelerates this formula should be used only in a differential sense:
$$dt_{obs} = dR/2\gamma^2 c . \tag{120}$$
Eq. (120) should be combined with the relation (117) or (118) and integrated to get the actual relation between observed time and emission radius. For an adiabatic expansion, for example, this yields $t_{obs} = R/16\gamma^2 c$ [256]. Eq. (120) is valid only along the line of sight. The situation is complicated further if we recall that the emission reaches the observer from an angle of order
$\gamma^{-1}$ around the line of sight. Averaging over all angles yields another numerical factor [257–259] and altogether we get
$$t_{obs} \approx R/c_\gamma \gamma^2 c , \tag{121}$$
where the value of the numerical factor, $c_\gamma$, depends on the details of the solution and varies between $\sim 3$ and $\sim 7$. Using Eqs. (121) and (117) or (118) we obtain the following relations between $R$, $\gamma$ and $t$:
$$R(t) \simeq \begin{cases} (3Et/\pi m_p n c)^{1/4}, & {\rm ad}, \\ (4ct/L)^{1/7} L, & {\rm rad}, \end{cases} \tag{122}$$
$$\gamma(t) \simeq \begin{cases} (3E/256\pi n m_p c^5 t^3)^{1/8}, & {\rm ad}, \\ (4ct/L)^{-3/7}, & {\rm rad}, \end{cases} \tag{123}$$
where $L \equiv (3E/4\pi n m_p c^2 \gamma_0)^{1/3}$ is the radius where the external mass equals the mass of the shell.
One can proceed and use the relation between $R$, $\gamma$ and $t$ (Eqs. (122) and (123)) to estimate the physical conditions in the shocked material using Eqs. (44). Then one can estimate the emitted radiation from this shock using Eqs. (56) and (57). However, before doing so we explore the Blandford–McKee self-similar solution [241], which describes the adiabatic expansion more precisely. This solution is inhomogeneous with a well-determined radial profile. The matter at the front of the shell moves faster than the average speed. This influences the estimates of the radiation emitted from the shell.
9.1.2. The Blandford–McKee self-similar solution
Blandford and McKee [241] discovered a self-similar solution that describes the adiabatic slowing down of an extremely relativistic shell propagating into the ISM. Using several simplifications and some algebraic manipulations we rewrite the Blandford–McKee solution as [256]
$$n(r,t) = 4n\gamma(t)\,[1 + 16\gamma(t)^2(1 - r/R)]^{-5/4} ,$$
$$\gamma(r,t) = \gamma(t)\,[1 + 16\gamma(t)^2(1 - r/R)]^{-1/2} ,$$
$$e(r,t) = 4n m_p c^2 \gamma(t)^2\,[1 + 16\gamma(t)^2(1 - r/R)]^{-17/12} , \tag{124}$$
where $n(r,t)$, $e(r,t)$ and $\gamma(r,t)$ are, respectively, the density, energy density and Lorentz factor of the material behind the shock (not to be confused with the ISM density $n$) and $\gamma(t) = \gamma(R(t))$ is the Lorentz factor of material just behind the shock. $n(r,t)$ and $e(r,t)$ are measured in the fluid's rest frame while $\gamma(r,t)$ is relative to an observer at rest. The total energy in this adiabatic flow equals $E = E_0$, the initial energy. The scaling laws of $R(t)$ and $\gamma(t)$ that follow from these profiles and from the condition that the total energy in the flow equals $E$ are
$$R(t) = \left(\frac{17Et}{\pi m_p n c}\right)^{1/4} = 3.2\times 10^{16}\ {\rm cm}\ E_{52}^{1/4}\, n_1^{-1/4}\, t_s^{1/4} ,$$
$$\gamma(t) = \frac{1}{4}\left(\frac{17E}{\pi n m_p c^5 t^3}\right)^{1/8} = 260\ E_{52}^{1/8}\, n_1^{-1/8}\, t_s^{-3/8} . \tag{125}$$
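The numerical coefficients in Eq. (125) can be reproduced directly from the closed-form expressions. The sketch below assumes cgs units, with rounded values of the proton mass and the speed of light; the function names `bm_radius` and `bm_lorentz` are illustrative, not from the paper.

```python
import math

M_P = 1.6726e-24   # proton mass [g] (rounded)
C = 2.9979e10      # speed of light [cm/s] (rounded)

def bm_radius(E, n, t):
    """Shock radius R(t) of Eq. (125); E in erg, n in cm^-3, t in s."""
    return (17.0 * E * t / (math.pi * M_P * n * C)) ** 0.25

def bm_lorentz(E, n, t):
    """Lorentz factor gamma(t) just behind the shock, Eq. (125)."""
    return 0.25 * (17.0 * E / (math.pi * n * M_P * C ** 5 * t ** 3)) ** 0.125

# Canonical normalisation: E = 1e52 erg, n = 1 cm^-3, t = 1 s.
R1 = bm_radius(1e52, 1.0, 1.0)
g1 = bm_lorentz(1e52, 1.0, 1.0)
print(f"R = {R1:.2e} cm, gamma = {g1:.0f}")
```

At the canonical values this gives $R \approx 3.2\times10^{16}$ cm and $\gamma \approx 260$, and the two expressions satisfy $R = 16\gamma^2 ct$ identically.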
The scalings (125) are consistent with the scalings (122) and (123), which were derived using conservation of energy and momentum. They provide the exact numerical factor that cannot be calculated by the simple analysis of Section 9.1.1. These equations can serve as a starting point for a detailed radiation emission calculation and a comparison with observations. The Blandford–McKee solution is adiabatic and as such it does not allow for any energy losses. With some simplifying assumptions it is possible to derive a self-similar radiative solution in which an arbitrary fraction of the energy generated by the shock is radiated away [260].
9.2. Phases in a relativistic decelerating shell
There are several phases in the deceleration of a relativistic shell: fast cooling (with either radiative or adiabatic hydrodynamics) is followed by slow cooling (with adiabatic hydrodynamics). Then, if the shell is non-spherical, its evolution changes and a phase of sideways expansion and much faster slowing down begins when the Lorentz factor reaches $\theta^{-1}$ [261]. Finally, the shell becomes Newtonian when enough mass is collected and $\gamma \approx 1$. In the following, we estimate the time scale for the different transitions. We define $\gamma_e \equiv c_\gamma \epsilon_e (m_p/m_e)\gamma$ and $t = (1+z)R/4c_t\gamma^2 c$ such that the factors $c_\gamma$ and $c_t$ reflect some of the uncertainties in the model. The canonical values of these factors are $c_\gamma \approx 0.5$ and $c_t \approx 1$.
The deceleration begins in a fast cooling phase. If $\epsilon_e$ is close to unity then this cooling phase will also be radiative. The first transition is from fast to slow cooling. There are several different ways to estimate this transition. One can compare the cooling time scale to the hydrodynamic time scale; alternatively one can calculate the fast cooling rate (given by the rate of energy generation by the shell) and compare it to the slow cooling rate (given by the emissivity of the relativistic electrons).
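We have chosen here to calculate this time as the time when the "typical electron" cools, that is, when $\nu_c = \nu_m$:
$$t_0 = \begin{cases} 210\ {\rm days}\ \epsilon_B^{2}\,\epsilon_e^{2}\,E_{52}\,n_1, & {\rm ad}, \\ 4.6\ {\rm days}\ \epsilon_B^{7/5}\,\epsilon_e^{7/5}\,E_{52}^{4/5}\,\gamma_{100}^{-4/5}\,n_1^{3/5}, & {\rm rad}. \end{cases} \tag{126}$$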
All methods of estimating t give the same dependence on the parameters. However, the numerical factor is quite sensitive to the de"nition of this transition. If the solution is initially radiative the transition from fast to slow cooling and from a radiative hydrodynamics to adiabatic hydrodynamics takes place at t
"1.3 days Enee((1#z)/2)(c /0.5)c\(c /100)\ . \ C A R
(127)
During a radiative evolution the energy in the shock decreases with time. The energy that appears in Eqs. (122) in the radiative scalings is the initial energy. When a radiative shock switches to adiabatic evolution, it is necessary to use the reduced energy to calculate the subsequent adiabatic evolution. The energy $E_{f,52}$ which one should use in the adiabatic regime is related to the initial energy $E_{i,52}$ of the fireball by

$$E_{f,52} = 0.022\,\epsilon_B^{-3/5}\epsilon_e^{-3/5}E_{i,52}^{4/5}\gamma_2^{-4/5}n_1^{-2/5}. \qquad (128)$$
If the shell is not spherical and has an opening angle $\theta$, then the evolution will change when $\gamma \sim \theta^{-1}$ [261]. Earlier on, the jet expands too rapidly to expand sideways and it evolves as if it were part of a spherical shell. After this stage the jet expands sideways, accumulates much more mass and slows down much faster. This transition will take place, quite generally, during the adiabatic phase at

$$t_\theta \approx 0.5\ \mathrm{days}\ E_{52}^{1/3}n_1^{-1/3}\left(\frac{1+z}{2}\right)\left(\frac{\theta}{0.1}\right)^{8/3}c_t^{-1}. \qquad (129)$$

The shell eventually becomes non-relativistic. This happens at $R \approx l = (4E_0/4\pi n_1 m_p c^2)^{1/3}$ for an adiabatic solution. This corresponds to a transition at

$$t_{NR} \approx l/c \approx 300\ \mathrm{days}\ E_{52}^{1/3}n_1^{-1/3}. \qquad (130)$$

A radiative shell loses energy faster and it becomes non-relativistic at $R = L = l/\gamma_0^{1/3} = (4E_0/4\pi n_1 m_p c^2\gamma_0)^{1/3}$. This will take place at

$$t_{NR} \approx 65\ \mathrm{days}\ E_{52}^{1/3}n_1^{-1/3}(\gamma_0/100)^{-1/3}. \qquad (131)$$

However, the earlier estimate of the transition from fast to slow cooling suggests that the shell cannot remain radiative for such a long time.

9.3. Synchrotron emission from a relativistic decelerating shell

We proceed now to estimate the expected instantaneous spectrum and light curve from a relativistic decelerating shell. The task is fairly simple at this stage as all the ground rules have been set in the previous sections. We limit the discussion here to a spherical shock propagating into homogeneous external matter. We consider two extreme limits for the hydrodynamic evolution: fully radiative and fully adiabatic. If $\epsilon_e$ is somewhat less than unity during the fast cooling phase ($t < t_0$) then only a fraction of the shock energy is lost to radiation. The scalings will be intermediate between the two limits of fully radiative and fully adiabatic discussed here. For simplicity, we assume that all the observed radiation reaches the observer from the front of the shell and along the line of sight. Actually, to obtain the observed spectrum we should integrate over the shell's profile and over different angles relative to the line of sight. A full calculation [262] of the integrated spectrum over a Blandford-McKee profile shows that different radial points from which the radiation reaches the observer simultaneously conspire to have practically the same synchrotron frequency and therefore they emit the same spectrum. Hence, the radial integration over a Blandford-McKee profile does not change the observed spectrum (note that this result differs from the calculation for a homogeneous shell [258]). On the other hand, the contribution from angles away from the line of sight is important and it shapes the observed spectrum, the light curve and the shape of the afterglow (see Fig. 26).
The instantaneous synchrotron spectra from a relativistic shock were described in Section 8.2.4. They do not depend on the hydrodynamic evolution but rather on the instantaneous conditions at the shock front, which determine the break energies $\nu_c$ and $\nu_m$. The only assumption made is that the shock properties are fairly constant over a time scale comparable to the observation time $t$.
Fig. 26. Calculated spectra or light curve from a Blandford-McKee solution. $F_\nu(t)$ is plotted as a function of $\mathrm{Const.}\times\nu t^{3/2}$. Thus, if we consider a constant time this figure yields the spectrum, while if we consider a fixed frequency it yields the light curve. The solid curve depicts emission from the full fireball. The dashed line depicts the spectrum resulting from emission along the line of sight. From [262].
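Using the adiabatic shell conditions (Eqs. (122)-(123)), Eqs. (44) for the shock conditions, Eq. (56) for the synchrotron energy and Eq. (62) for the "cooling energy" we find

$$\nu_c = 2.7\times10^{12}\ \mathrm{Hz}\ \epsilon_B^{-3/2}E_{52}^{-1/2}n_1^{-1}t_d^{-1/2},$$
$$\nu_m = 5.7\times10^{14}\ \mathrm{Hz}\ \epsilon_B^{1/2}\epsilon_e^{2}E_{52}^{1/2}t_d^{-3/2},$$
$$F_{\nu,\max} = 1.1\times10^{5}\ \mu\mathrm{Jy}\ \epsilon_B^{1/2}E_{52}n_1^{1/2}D_{28}^{-2}, \qquad (132)$$

where $t_d$ is the time in days, $D_{28} = D/10^{28}$ cm and we have ignored cosmological redshift effects. Fig. 22 depicts the instantaneous spectrum in this case. For a fully radiative evolution we find

$$\nu_c = 1.3\times10^{13}\ \mathrm{Hz}\ \epsilon_B^{-3/2}E_{52}^{-4/7}\gamma_2^{4/7}n_1^{-13/14}t_d^{-2/7},$$
$$\nu_m = 1.2\times10^{14}\ \mathrm{Hz}\ \epsilon_B^{1/2}\epsilon_e^{2}E_{52}^{4/7}\gamma_2^{-4/7}n_1^{-1/14}t_d^{-12/7},$$
$$F_{\nu,\max} = 4.5\times10^{3}\ \mu\mathrm{Jy}\ \epsilon_B^{1/2}E_{52}^{8/7}\gamma_2^{-8/7}n_1^{5/14}D_{28}^{-2}t_d^{-3/7}, \qquad (133)$$

where we have scaled the initial Lorentz factor of the ejecta by a factor of 100: $\gamma_2 \equiv \gamma_0/100$. These instantaneous spectra are also shown in Fig. 22.

9.3.1. Light curves

The light curves at a given frequency depend on the temporal evolution of the break frequencies $\nu_m$ and $\nu_c$ and the peak power $N_e P(\gamma_e)$ (see Eq. (66)). These depend, in turn, on how $\gamma$ and $N_e$ scale as a function of $t$.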
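The break-frequency scalings lend themselves to a small numerical helper. The following is a minimal sketch, assuming the coefficients of Eqs. (132) and (133) together with the standard exponents for the adiabatic and radiative cases. The function and argument names (`break_frequencies`, `cooling_regime`, `eps_B`, `E52`, ...) are hypothetical and chosen here only for illustration.

```python
def break_frequencies(t_days, regime="ad", eps_B=1.0, eps_e=1.0,
                      E52=1.0, n1=1.0, gamma2=1.0):
    """Return (nu_c, nu_m) in Hz at observer time t_days (assumed scalings)."""
    if regime == "ad":
        nu_c = 2.7e12 * eps_B**-1.5 * E52**-0.5 * n1**-1 * t_days**-0.5
        nu_m = 5.7e14 * eps_B**0.5 * eps_e**2 * E52**0.5 * t_days**-1.5
    elif regime == "rad":
        nu_c = (1.3e13 * eps_B**-1.5 * E52**(-4 / 7) * gamma2**(4 / 7)
                * n1**(-13 / 14) * t_days**(-2 / 7))
        nu_m = (1.2e14 * eps_B**0.5 * eps_e**2 * E52**(4 / 7)
                * gamma2**(-4 / 7) * n1**(-1 / 14) * t_days**(-12 / 7))
    else:
        raise ValueError("regime must be 'ad' or 'rad'")
    return nu_c, nu_m


def cooling_regime(t_days, **params):
    """'fast' while nu_c < nu_m, 'slow' once nu_c > nu_m."""
    nu_c, nu_m = break_frequencies(t_days, **params)
    return "fast" if nu_c < nu_m else "slow"


if __name__ == "__main__":
    for t in (1.0, 1000.0):
        print(t, cooling_regime(t))
```

Because $\nu_m$ falls faster than $\nu_c$ in both regimes, the helper reports fast cooling at early times and slow cooling at late times for any fixed parameter set.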
The spectra presented in Fig. 22 show the positions of $\nu_c$ and $\nu_m$ for typical parameters. In both the adiabatic and radiative cases $\nu_c$ decreases more slowly with time than $\nu_m$. Therefore, at sufficiently early times we have $\nu_c < \nu_m$, i.e. fast cooling. At late times we have $\nu_c > \nu_m$, i.e. slow cooling. The transition between the two occurs when $\nu_c = \nu_m$ at $t_0$ (see Eq. (126)). At $t = t_0$, the spectrum changes from fast cooling (Fig. 22a) to slow cooling (Fig. 22b). In addition, if $\epsilon_e \approx 1$, the hydrodynamical evolution changes from radiative to adiabatic. However, if $\epsilon_e \ll 1$, the evolution remains adiabatic throughout. Once we know how the break frequencies, $\nu_c$, $\nu_m$, and the peak flux $F_{\nu,\max}$ vary with time, we can calculate the light curve. Consider a fixed frequency (e.g. $\nu = 10^{15}\nu_{15}$ Hz). From the first two equations in (132) and (133) we see that there are two critical times, $t_c$ and $t_m$, when the break
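frequencies, $\nu_c$ and $\nu_m$, cross the observed frequency $\nu$:

$$t_c = \begin{cases} 7.3\times10^{-6}\ \mathrm{days}\ \epsilon_B^{-3}E_{52}^{-1}n_1^{-2}\nu_{15}^{-2}, & \mathrm{ad},\\ 2.7\times10^{-7}\ \mathrm{days}\ \epsilon_B^{-21/4}E_{52}^{-2}\gamma_2^{2}n_1^{-13/4}\nu_{15}^{-7/2}, & \mathrm{rad}, \end{cases} \qquad (134)$$

$$t_m = \begin{cases} 0.69\ \mathrm{days}\ \epsilon_B^{1/3}\epsilon_e^{4/3}E_{52}^{1/3}\nu_{15}^{-2/3}, & \mathrm{ad},\\ 0.29\ \mathrm{days}\ \epsilon_B^{7/24}\epsilon_e^{7/6}E_{52}^{1/3}\gamma_2^{-1/3}\nu_{15}^{-7/12}n_1^{-1/24}, & \mathrm{rad}. \end{cases} \qquad (135)$$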
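There are only two possible orderings of the three critical times, $t_c$, $t_m$, $t_0$, namely $t_0 > t_m > t_c$ and $t_0 < t_m < t_c$. We define the critical frequency, $\nu_0 = \nu_c(t_0) = \nu_m(t_0)$:

$$\nu_0 = \begin{cases} 1.8\times10^{11}\ \epsilon_B^{-5/2}\epsilon_e^{-1}E_{52}^{-1}n_1^{-3/2}\ \mathrm{Hz}, & \mathrm{ad},\\ 8.5\times10^{12}\ \epsilon_B^{-19/10}\epsilon_e^{-2/5}E_{52}^{-4/5}\gamma_2^{4/5}n_1^{-11/10}\ \mathrm{Hz}, & \mathrm{rad}. \end{cases} \qquad (136)$$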
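The two orderings of the critical times can be checked numerically. The sketch below covers the adiabatic case only and assumes the coefficients of Eqs. (126) and (134)-(136) with the standard exponents. All function names are hypothetical.

```python
def critical_times_ad(nu_hz, eps_B=1.0, eps_e=1.0, E52=1.0, n1=1.0):
    """Return (t_c, t_m, t_0) in days for the adiabatic case (assumed scalings)."""
    nu15 = nu_hz / 1e15
    t_c = 7.3e-6 * eps_B**-3 * E52**-1 * n1**-2 * nu15**-2
    t_m = 0.69 * eps_B**(1 / 3) * eps_e**(4 / 3) * E52**(1 / 3) * nu15**(-2 / 3)
    t_0 = 210.0 * eps_B**2 * eps_e**2 * E52 * n1
    return t_c, t_m, t_0


def critical_frequency_ad(eps_B=1.0, eps_e=1.0, E52=1.0, n1=1.0):
    """Critical frequency nu_0 in Hz for the adiabatic case."""
    return 1.8e11 * eps_B**-2.5 * eps_e**-1 * E52**-1 * n1**-1.5


def light_curve_type(nu_hz, **params):
    """'high' for nu > nu_0 (t_0 > t_m > t_c), otherwise 'low'."""
    return "high" if nu_hz > critical_frequency_ad(**params) else "low"


if __name__ == "__main__":
    for nu in (1e15, 1e10):
        print(nu, light_curve_type(nu), critical_times_ad(nu))
```

For unit parameters an optical frequency of $10^{15}$ Hz lies on the high-frequency side, while a radio frequency of $10^{10}$ Hz lies on the low-frequency side. Very close to $\nu_0$ the rounded coefficients may not reproduce the ordering exactly.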
When $\nu > \nu_0$, we have $t_0 > t_m > t_c$ and we refer to the corresponding light curve as the high-frequency light curve. Similarly, when $\nu < \nu_0$, we have $t_0 < t_m < t_c$, and we obtain the low-frequency light curve. Fig. 27a depicts a typical high-frequency light curve. At early times the electrons cool fast and $\nu < \nu_m$ and $\nu < \nu_c$. Ignoring self-absorption, the situation corresponds to segment B in Fig. 22, and the flux varies as $F_\nu \sim F_{\nu,\max}(\nu/\nu_c)^{1/3}$. If the evolution is adiabatic, $F_{\nu,\max}$ is constant, and $F_\nu \sim t^{1/6}$. In the radiative case, $F_{\nu,\max} \sim t^{-3/7}$ and $F_\nu \sim t^{-1/3}$. The scalings in the other segments, which correspond to C, D, H in Fig. 22, can be derived in a similar fashion and are shown in Fig. 27a. Fig. 27b shows the low-frequency light curve, corresponding to $\nu < \nu_0$. In this case, there are four phases in the light curve, corresponding to segments B, F, G and H. The time dependences of the flux are indicated on the plot for both the adiabatic and the radiative cases. For a relativistic electron distribution with a power-law distribution $\gamma^{-p}$ the uppermost spectral part behaves like $\nu^{-p/2}$. The corresponding temporal index (for adiabatic hydrodynamics) is $-3p/4 + 1/2$. In terms of the spectral index $\alpha = p/2$, this yields the relation $F_\nu \propto t^{-(3\alpha-1)/2}$. Alternatively, for slow cooling there is also another frequency range (between $\nu_m$ and $\nu_c$) for which the spectrum is given by $\nu^{-(p-1)/2}$ and the temporal decay is $-3(p-1)/4$. Now, with $\alpha = (p-1)/2$, we have $F_\nu \propto t^{-3\alpha/2}$. Note that in both cases there is a specific relation between the spectral index and the temporal index which could be tested by observations.
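The temporal indices of the individual light-curve segments follow from simple exponent bookkeeping. The sketch below uses exact fractions and assumes the standard time dependences of $F_{\nu,\max}$, $\nu_c$ and $\nu_m$ in the adiabatic and radiative cases, together with the usual synchrotron segment shapes. The segment letters follow Fig. 22, and the names `SCALINGS`, `segment_B_index`, etc. are illustrative.

```python
from fractions import Fraction as F

# Assumed time exponents of F_max, nu_c and nu_m in each hydrodynamic regime.
SCALINGS = {
    "ad": {"F_max": F(0), "nu_c": F(-1, 2), "nu_m": F(-3, 2)},
    "rad": {"F_max": F(-3, 7), "nu_c": F(-2, 7), "nu_m": F(-12, 7)},
}


def segment_B_index(regime):
    """F_nu = F_max (nu/nu_c)^(1/3): fast cooling, nu below both breaks."""
    s = SCALINGS[regime]
    return s["F_max"] - F(1, 3) * s["nu_c"]


def segment_G_index(regime, p):
    """F_nu = F_max (nu/nu_m)^(-(p-1)/2): slow cooling, nu_m < nu < nu_c."""
    s = SCALINGS[regime]
    return s["F_max"] + (p - 1) / 2 * s["nu_m"]


def segment_H_index(regime, p):
    """F_nu = F_max nu_c^(1/2) nu_m^((p-1)/2) nu^(-p/2): nu above both breaks."""
    s = SCALINGS[regime]
    return s["F_max"] + F(1, 2) * s["nu_c"] + (p - 1) / 2 * s["nu_m"]


if __name__ == "__main__":
    p = F(5, 2)
    print(segment_B_index("ad"), segment_B_index("rad"))
    print(segment_G_index("ad", p), segment_H_index("ad", p))
```

For the adiabatic case this bookkeeping gives $(2-3p)/4$ above both breaks and $-3(p-1)/4$ between $\nu_m$ and $\nu_c$, which are the two spectral-temporal relations that observations can test.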
Fig. 27. Light curve due to synchrotron radiation from a spherical relativistic shock, ignoring the effect of self-absorption. (a) The high-frequency case ($\nu > \nu_0$). The light curve has four segments, separated by the critical times, $t_c$, $t_m$, $t_0$. The labels, B, C, D, H, indicate the correspondence with spectral segments in Fig. 22. The observed flux varies with time as indicated; the scalings within square brackets are for radiative evolution (which is restricted to $t < t_0$) and the other scalings are for adiabatic evolution. (b) The low-frequency case ($\nu < \nu_0$). From [249].
9.3.2. Parameter fitting for GRBs from afterglow observation and GRB970508

Shortly after the observation of GRB970228, Mészáros et al. [22] showed that the decline in the intensity in X-ray and several visual bands (from B to K) fit the afterglow model well (see Fig. 28). The previous discussion indicates that this agreement shows that the high-energy tail (or late-time behavior) is produced by synchrotron emission from a power-law distribution. There are much more data on the afterglow of GRB970508. The light curves in the different optical bands generally peak around two days. There is a rather steep rise before the peak which is
Fig. 28. Decline in the afterglow of GRB970228 in different wavelengths, together with the theoretical model. From [22].
followed by a long power-law decay (see Figs. 6 and 7). In the optical band the observed power-law decay for GRB970508 is $-1.141 \pm 0.014$ [153]. This implies for an adiabatic slow cooling model (which Mészáros et al. [22] use) a spectral index $\alpha = -0.761 \pm 0.009$. However, the observed spectral index is $\alpha = -1.12 \pm 0.04$ [263]. A fast cooling model, for which the spectral index is $p/2$ and the temporal behavior is $3p/4 - 1/2$, fits the data better, as the temporal power law implies that $p = 2.188 \pm 0.019$ while the spectral index implies, consistently, $p = 2.24 \pm 0.08$ [249,264] (see Fig. 29). Unfortunately, at present this fit does not tell us much about the nature of the hydrodynamical processes and the slowing down. Using both the optical and the radio data one can try to fit the whole spectrum and to obtain the unknown parameters that determine the fireball evolution [251,252]. Wijers and Galama [251] have attempted to do so using the spectrum of GRB970508. They have obtained a reasonable set of parameters. However, a more detailed analysis [252] reveals that the solution is very sensitive to assumptions made on how to fit the observational data to the theoretical curve. Moreover, the initial phase of the light curve of GRB970508 does not fit any of the theoretical curves. This suggests that at least initially an additional process might be taking place. Because of the inability to obtain a good fit for this initial phase there is a large uncertainty in the parameters obtained in this way. No deviation in the observed decaying light curve from a single power law was observed for GRB970508, until it faded below the level of the surrounding nebula. This suggests that there was no significant beaming in this case. If the outflow is in the form of a jet the temporal behavior will change drastically when the opening angle of the jet equals $1/\gamma$ [261].
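The numbers quoted for GRB970508 can be reproduced with linear error propagation. This is a minimal sketch that works with magnitudes of the indices (the text quotes them with a minus sign) and uses only the relations stated above: decay $= 3\alpha/2$ for adiabatic slow cooling, and spectral index $p/2$ with decay $3p/4 - 1/2$ for fast cooling. The function names are illustrative.

```python
def alpha_from_slow_cooling_decay(decay, err):
    """Adiabatic slow cooling: decay = 3*alpha/2, so alpha = 2*decay/3."""
    return 2.0 * decay / 3.0, 2.0 * err / 3.0


def p_from_fast_cooling_decay(decay, err):
    """Fast cooling: decay = 3p/4 - 1/2, so p = (decay + 1/2) * 4/3."""
    return (decay + 0.5) * 4.0 / 3.0, err * 4.0 / 3.0


def p_from_fast_cooling_spectrum(alpha, err):
    """Fast cooling: alpha = p/2, so p = 2*alpha."""
    return 2.0 * alpha, 2.0 * err


if __name__ == "__main__":
    print(alpha_from_slow_cooling_decay(1.141, 0.014))
    print(p_from_fast_cooling_decay(1.141, 0.014))
    print(p_from_fast_cooling_spectrum(1.12, 0.04))
```

The slow cooling relation predicts a spectral index of magnitude about 0.76, well below the observed 1.12, whereas the two fast cooling estimates of $p$ (about 2.19 and 2.24) agree within their uncertainties.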
Fig. 29. The X-ray to radio spectrum of GRB 970508 on May 21.0 UT (12.1 days after the event). The "t to the low-frequency part, a "0.44$0.07, is shown as well as the extrapolation from X-ray to optical (solid lines). \ %& The local optical slope (2.1}5.0 days after the event) is indicated by the thick solid line. Also indicated is the extrapolation F Jl\ (lines). Indicated are the rough estimates of the break frequencies l , l and l for May 21.0 UT. From [263]. J
9.4. New puzzles from afterglow observations

Afterglow observations fit well the fireball picture that was developed for explaining the GRB phenomena. The available data are not good enough to distinguish between different specific models. But in the future we expect to be able to distinguish between those models and even to determine the parameters of the burst $E_0$ and $\gamma_0$ (if the data are taken early enough), the surrounding ISM density and the intrinsic parameters of the relativistic shock $\epsilon_e$, $\epsilon_B$ and $p$. Still, the current data are sufficient to raise new puzzles and present us with new questions.

• Why do afterglows accompany some GRBs and not others? X-ray, optical and radio afterglows have been observed in some bursts but not in others. According to the current model, afterglow is produced when the ejecta that produced the GRB is shocked by the surrounding matter. Possible explanations of this puzzle invoke environmental effects. A detectable afterglow might be generated efficiently in some range of ISM densities and inefficiently in another. High ISM densities would slow down the ejecta more rapidly. This could make some afterglows detectable and others undetectable. ISM absorption is another alternative. While most interstellar environments are optically thin to gamma-rays, high-density ISM regions can efficiently absorb and attenuate X-rays and optical radiation.
• Jets and the energy of GRB971214. How can we explain the $10^{53}$ erg required for isotropic emission in GRB971214? As we discuss in the next section this amount is marginal for most models that are based on the formation of a compact source. This problem can be resolved if we invoke beaming, with $\theta \sim 0.1$. However, such beaming would result in a break in the light curve when the local Lorentz factor reaches a value of $1/\theta$. Such a break was not seen in other afterglows for which there are good data. Note that recently Perna and Loeb [266] inferred from the lack of radio transients that GRB beams cannot be very narrow. If typical GRBs are beamed, the beam width $\theta$ should be larger than 6°.

• GRB980425 and SN1998bw. SN1998bw (and the associated GRB980425) is a factor of a hundred nearer than a typical GRB (which are expected to be at $z \sim 1$). The corresponding (isotropic) gamma-ray energy, $\sim 5\times10^{47}$ erg, is four orders of magnitude lower than that of a regular burst. This can be in agreement with the peak flux distribution only if the bursts with such a low luminosity compose a very small fraction of GRBs. This leads naturally to the question: is there an observational coincidence between GRBs and SNs? To this there are conflicting answers [267-270].
10. Models of the inner engine

We turn now to the most difficult part: the nature of the beast that produces the GRB, that is, the modeling of the inner engine. We examine a few general considerations in Section 10.1 and then we turn to the binary neutron star merger model in Section 10.2.

10.1. The "inner engine"

The fireball model is based on an "inner engine" that supplies the energy and accelerates the baryons. This "engine" is well hidden from direct observations and it is impossible to determine what it is from current observations. Unfortunately, the discovery of afterglow does not shed additional direct light on this issue. However, it adds some indirect evidence from the association of the location of the bursts with star-forming regions. Once the cosmological origin of GRBs was established we had two direct clues on the nature of the "inner engine": the rate and the energy output. GRBs occur at a rate of about one per $10^{6}$ years per galaxy [56] and the total energy is $\sim 10^{52}$ erg. These estimates assume isotropic emission. Beaming with an angle $\theta$ changes these estimates by a factor $4\pi/\theta^2$ in the rate and $\theta^2/4\pi$ in the total energy involved. These estimates are also based on the assumption that the burst rate does not vary with cosmic time. The observations that GRB hosts are star-forming galaxies [16,130-133] indicate that the rate of GRBs may follow the star formation rate [195-197]. In this case the bursts are further away, they take place at a lower rate and they have a significantly higher energy output. The fireball model poses an additional constraint: the inner engine should be capable of accelerating $\sim 10^{-5} M_\odot$ to relativistic energies. One can imagine various scenarios in which $10^{52}$ erg are generated within a short time. The requirement that this energy should be converted to a relativistic flow is much more difficult, as it requires a "clean" system with a very low but non-zero baryonic load.
This requirement suggests a preference for models based on electromagnetic energy transfer or electromagnetic energy generation, as these could more naturally satisfy this condition (see [271,227,70,230]). Paczyński [46] has recently suggested a unique hydrodynamical model in
which $10^{54}$ erg are dumped into an atmosphere with a decreasing density profile. This is a cosmological variant of Colgate's [265] galactic model. This would lead to an acceleration of fewer and fewer baryons and eventually to relativistic velocities. Overall one could say that the "baryonic load" problem is presently the most bothersome open question in the "fireball model". The recent realization that energy conversion is most likely via internal shocks rather than via external shocks provides additional information about the inner engine: the relativistic flow must be irregular (to produce the internal shocks), it must be variable on a short time scale (as this time scale is seen in the variability of the bursts), and it must be active for up to a few hundred seconds and possibly much longer [36], as this determines the observed duration of the burst. These requirements rule out all explosive models. The engine must be compact ($\sim 10^{7}$ cm) to produce the observed variability and it must operate for a million light crossing times to produce a signal lasting a few hundred seconds. There are more than a hundred GRB models [272]. At a certain stage, before BATSE, there were probably more models than observed bursts. Most of these models are, however, galactic and those have been ruled out if we accept the cosmological origin of GRBs. This leaves a rather modest list of viable GRB models: binary neutron star mergers (NSMs) [35] (see also [273,274,53,275,276]), failed supernova [277], white dwarf collapse [271] and hypernova [46]. All these are based on the formation of a compact object of one type or another and the release of its binding energy. With a binding energy of $\sim 5\times10^{53}$ erg or higher, all these models have, in principle, enough energy to power a GRB. However, they face similar difficulties in channeling enough energy to a relativistic flow. This would be particularly difficult if indeed $10^{53}$ erg are needed, as some recent bursts have indicated.
Paczyński's hypernova is an exception, as in this model all the energy is channeled initially to a non-relativistic flow and only later is a small fraction of it converted to relativistic baryons. All these models are consistent with the possibility that GRBs are associated with star-forming regions, as the lifetime of massive stars is quite short and even the typical lifetime of a neutron star binary ($\sim 10^{8}$ yr) is sufficiently short to allow for this coincidence. Other models are based on an association of GRBs with massive black holes associated with quasars or AGNs in galactic centers (e.g. [278]). These are ruled out, as none of the GRBs with optical afterglow is associated with such objects. Furthermore, such objects do not appear in other small GRB error boxes searched by Schaefer et al. [128]. From a theoretical point of view it is difficult to explain the observed energy and time scales with such objects.

10.2. NSMs: binary neutron star mergers

Binary neutron star mergers (NSMs) [35] or, with a small variant, neutron star-black hole mergers [275] are probably the best candidates for GRB sources. These mergers take place because of the decay of the binary orbits due to gravitational radiation emission. A NSM results, most likely, in a rotating black hole [280]. The process releases $\approx 5\times10^{53}$ erg [281]. Most of this energy escapes as neutrinos and gravitational radiation, but a small fraction of this energy suffices to power a GRB. The discovery of the famous binary pulsar PSR 1913+16 [48] demonstrated that this decay is taking place [279]. The discovery of other binary pulsars, and in particular of PSR 1534+12 [282], has shown that PSR 1913+16 is not unique and that such systems are common. These observations suggest that NSMs take place at a rate of $\approx 10^{-6}$ events per year per galaxy [50-52]. This rate is comparable to the simple estimate of the GRB event rate (assuming no beaming and no cosmic evolution of the rate) [56,276,185].
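The effect of beaming on the rate and energy estimates of Section 10.1 can be illustrated with a few lines of code. This is a minimal sketch that assumes the small-angle solid-angle factors $4\pi/\theta^2$ (rate) and $\theta^2/4\pi$ (energy). The function name and the sample inputs (an isotropic rate of $10^{-6}$ per year per galaxy and $10^{53}$ erg) are illustrative choices, not values fitted to data.

```python
import math


def beaming_corrected(rate_iso_per_yr, energy_iso_erg, theta_rad):
    """Scale an isotropic rate up and an isotropic energy down for a beam
    of opening angle theta (assumed factor 4*pi/theta**2)."""
    factor = 4.0 * math.pi / theta_rad**2
    return rate_iso_per_yr * factor, energy_iso_erg / factor


if __name__ == "__main__":
    rate, energy = beaming_corrected(1e-6, 1e53, 0.1)
    print(f"rate = {rate:.2e} per yr per galaxy, energy = {energy:.2e} erg")
```

For $\theta = 0.1$ the factor is about $1.3\times10^{3}$: the true event rate rises by three orders of magnitude while the energy per event drops by the same factor, so the total energy output per galaxy is unchanged.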
It has been suggested [283,284] that most neutron star binaries are born with very close orbits and hence with very short lifetimes, (see, however, [285,286]). If this idea is correct, then the merger rate will be much higher. This will destroy, of course, the nice agreement between the rates of GRBs and NSMs. Consistency can be restored if we invoke beaming, which might even be advantageous as far as the energy budget is concerned. Unfortunately, the short lifetime of those systems, which is the essence of this idea means that at any given moment of time there are only about a hundred such systems in the Galaxy (compared to about 10 wider neutron star binaries). This makes it very hard to con"rm or rule out this speculation. We should be extremely lucky to detect such a system. It is not clear yet how NSMs form. The question is how does the system survive the second supernova event? The binary system will be disrupted if this explosion ejects more than half of its total mass. There are two competing scenarios for the formation of NSMs. In one scenario the "rst neutron star that forms sinks into the envelope of its giant companion and its motion within this envelope lead to a strong wind that carries away most of the secondary's mass. When the secondary reaches core collapse it has only a small envelope and the total mass ejected is rather small. In a second scenario the second supernova explosion is asymmetric. The asymmetric explosion gives a velocity of a few hundred km/s to the newborn neutron star. In a fraction of the cases this velocity is in the right direction to keep the binary together. Such a binary system will have a comparable center of mass velocity [28,287}289]. This second scenario has several advantages. First it explains both the existence of binary neutron stars and the existence of high velocity pulsars [287,290]. 
Second, and more relevant to GRBs, with these kick velocities these binaries could escape from their parent galaxy, provided that this galaxy is small enough. Such escaping binaries will travel a distance of $\sim 200\ \mathrm{kpc}\,(v/200\ \mathrm{km/s})(T/10^{9}\ \mathrm{yr})$ before they merge. The GRB will occur when the system is at a distance of the order of a hundred kpc from the parent galaxy. Clearly there is no "no host" problem in this case [28].

While a NSM has enough energy available to power a GRB it is not clear how the GRB is produced. A central question is, of course, how does a NSM generate the relativistic wind required to power a GRB. Most of the binding energy (which is around $5\times10^{53}$ erg) escapes as neutrinos [281]. Eichler et al. [35] suggested that about one thousandth of these neutrinos annihilate and produce pairs that in turn produce gamma-rays via $\nu\bar\nu \to e^+e^- \to \gamma\gamma$. This idea was criticized on several grounds by different authors. Jaroszyński [291,292] pointed out that a large fraction of the neutrinos will be swallowed by the black hole that forms. Davies et al. [280] and Ruffert and Janka [293-295], who simulated neutron star mergers, suggested that the central object will not be warm enough to produce a significant neutrino flux because the merger is nearly adiabatic [70]. The neutrinos are also emitted over a diffusion time of several seconds, too long to explain the rapid variations observed in GRBs [80], but too short to explain the observed GRB durations. Wilson and Mathews [296,297] included approximate general relativistic effects in a numerical simulation of a neutron star merger. They found that the neutron stars collapse to a single black hole before they collide with each other. This again will suppress the neutrino emission from the merger. However, the approximation that they have used has been criticized by various authors and it is not clear yet that the results are valid. Others suggested that the neutrino wind will carry too many baryons.
However, it seems that the most severe problem with this model stems from the fact that the prompt neutrino burst could produce only a single smooth pulse. This explosive burst is incompatible with the internal shocks scenario.
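The travel distance quoted earlier for binaries that escape their parent galaxy is a straightforward product of velocity and lifetime. The sketch below assumes a constant velocity with no deceleration in the galactic potential, and uses rounded conversion constants. The function name and the sample values (200 km/s, $10^{9}$ yr) are illustrative.

```python
KM_PER_KPC = 3.0857e16   # kilometres in one kiloparsec
SEC_PER_YR = 3.156e7     # seconds in one year (rounded)


def travel_distance_kpc(v_km_s, lifetime_yr):
    """Distance covered at constant speed v over the binary lifetime."""
    return v_km_s * lifetime_yr * SEC_PER_YR / KM_PER_KPC


if __name__ == "__main__":
    print(f"{travel_distance_kpc(200.0, 1e9):.0f} kpc")
```

A kick of 200 km/s sustained for $10^{9}$ yr carries the binary roughly 200 kpc, and shorter-lived systems travel proportionally less far.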
An alternative source of energy within the NSM is the accretion power of a disk that forms around the black hole [287,70,230]. Various numerical simulations of neutron star mergers [280,293-295] find that a $\sim 0.1 M_\odot$ disk forms around the central black hole. Accretion of this disk onto the central black hole may take a few dozen seconds [70]. It may produce the wind needed to produce internal shocks that could produce, in turn, a GRB.

How can one prove or disprove this, or any other, GRB model? Theoretical studies concerning specific details of the model can, of course, make it more or less appealing. But in view of the fact that the observed radiation emerges from a distant region which is very far from the inner "engine" I doubt if this will ever be sufficient. It seems that the only way to confirm any GRB model will be via detecting in time coincidence another astronomical phenomenon, whose source could be identified with certainty. Unfortunately, while the recent afterglow observations take us closer to this target they do not tell us what the sources of GRBs are. We still have to search for additional signals. NSMs have two accompanying signals, a neutrino signal and a gravitational radiation signal. Both signals are extremely difficult to detect. The neutrino signal could be emitted by some of the other sources that are based on a core collapse. Furthermore, with present technology detection of neutrino signals from a cosmological distance is impossible. On the other hand, the gravitational radiation signal has a unique characteristic form. This provides a clear prediction of coincidence that could be proved or falsified sometime in the not too distant future when suitable gravitational radiation detectors become operational.

10.3. Binary neutron stars vs. black hole-neutron star mergers

We have grouped together binary neutron star mergers and black hole-neutron star mergers.
At present several neutron star binaries are known while no black hole-neutron star binary has been found. Still, on theoretical grounds one should expect a similar rate for both events [50]. Some even suggest that there are more black hole-neutron star binaries than neutron star-neutron star binaries [286]. There is a lot of similarity between the two processes, which are both driven by gravitational radiation emission and both result in a single black hole. First, unless the mass of the black hole is rather small the neutron star will not be tidally disrupted before it is captured by the black hole. Even if such a tidal disruption takes place, then while in the binary neutron star merger we expect a collision, in the black hole-neutron star merger we expect at most a tidal disruption followed by infall of the debris onto the black hole. This could lead to a situation in which one of the two events will produce a GRB and the other will not. Presently, it is too difficult to speculate which of the two is the right one. One should recall, however, that there is a marked difference between the gravitational signatures of those events, and thus hopefully when we discover a coincidence between a GRB and a gravitational radiation signal we will also be able to find which of the two mechanisms is the right one.
11. Other related phenomena

It is quite likely that other particles (in addition to $\gamma$-rays) are emitted in these events. Let $f_{x\gamma}$ be the ratio of energy emitted in other particles relative to $\gamma$-rays. These particles will appear as a burst
accompanying the GRB. The total #uence of a `typicala GRB observed by BATSE, F is A 10\ erg/cm, and the #uence of a `stronga burst is about hundred times larger. Therefore, we should expect accompanying bursts with typical #uences of
F particles A f F "10\ V cm VA 10\ erg/cm
E \ V , GeV
(137)
where E is the energy of these particles. This burst will be spread in time and delayed relative to the V GRB if the particles do not move at the speed of light. Relativistic time delay will be signi"cant (larger than 10 s) if the particles are not massless and their Lorentz factor is smaller than 10! similarly a de#ection angle of 10\ will cause a signi"cant time delay. In addition to the prompt burst we should expect a continuous background of these particles. With one 10 erg GRB per 10 years per galaxy we expect &10 events per galaxy in a Hubble time (provided of course that the event rate is constant in time). This corresponds to a background #ux of
E particles A f F "3;10\ V cm s VA 10 erg
R 10\ yr/galaxy
E \ V . GeV
(138)
For any specific particle that could be produced one should calculate the ratio $f_{x\gamma}$ and then compare the expected fluxes with fluxes from other sources and with the capabilities of current detectors. One should distinguish between two types of predictions: (i) predictions of the generic fireball model, which include low-energy cosmic rays [223], UCHERs [298-300] and high-energy neutrinos [301], and (ii) predictions of specific models and in particular the NSM model. These include low-energy neutrinos [281] and gravitational waves [35,305].

11.1. Cosmic rays

Already in 1990, Shemi and Piran [223] pointed out that the fireball model is closely related to cosmic rays. A "standard" fireball model involved the acceleration of $\sim 10^{-5} M_\odot$ of baryons to a typical energy of 100 GeV per baryon. Protons that leak out of the fireball will become low-energy cosmic rays. However, a comparison of the GRB rate (one per $10^{6}$ yr per galaxy) with the observed low-energy cosmic ray flux suggests that even if $f_{CR-\gamma} \approx 1$ this will amount to only 1-10% of the observed cosmic ray flux at these energies. Cosmic rays are believed to be produced by SNRs. Since supernovae are 10 000 times more frequent than GRBs, unless GRBs are much more efficient in producing cosmic rays in some specific energy range their contribution will be swamped by the SNR contribution.

11.2. UCHERs: ultra-high-energy cosmic rays

Waxman [298] and Vietri [299] have shown that the observed flux of UCHERs (above $10^{19}$ eV) is consistent with the idea that these are produced by the fireball shocks, provided that $f_{UCHER-\gamma} \approx 1$. Milgrom and Usov [300] pointed out that the error boxes of the two highest energy UCHERs contain strong GRBs, suggesting an association between the two phenomena. The relativistic fireball shocks that appear in GRBs are among the few astronomical objects that satisfy
the conditions for shock acceleration of UCHERs. Waxman [302] has shown that the spectrum of UCHERs is consistent with the one expected from Fermi acceleration within those shocks.

11.3. High-energy neutrinos

Waxman and Bahcall [301] suggested that collisions between protons and photons within the relativistic fireball shocks produce pions. These pions produce high-energy neutrinos with $E_\nu \sim 10^{14}$ eV and $f_{\nu-\gamma} > 0.1$. The flux of these neutrinos is comparable to the flux of atmospheric neutrinos but those will be correlated with the position of strong GRBs. This signal might be detected in future km-size neutrino detectors.

11.4. Gravitational waves

If GRBs are associated with NSMs then they will be associated with gravitational waves and low-energy neutrinos. The spiraling-in phase of a NSM produces a clean chirping gravitational radiation signal. This signal is the prime target of LIGO [303] and VIRGO [304], the two large interferometers that are being built now in the USA and in Europe. The observational scheme of these detectors is heavily dependent on digging deeply into the noise. Kochanek and Piran [305] suggested that a coincidence between a chirping gravitational radiation signal from a neutron star merger and a GRB could greatly enhance the statistical significance of the detection of the gravitational radiation signal. At the same time this will also verify the NSM GRB model.

11.5. Low-energy neutrinos

Most of the energy generated in any core collapse event, and in particular in a NSM, is released as low-energy ($\sim 5$-10 MeV) neutrinos [281]. The total energy is quite large, $\sim$ a few $\times 10^{53}$ erg, leading to $f_{\nu-\gamma} \approx 10$. However, this neutrino signal will be quite similar to a supernova neutrino signal, and at present only galactic SN neutrinos can be detected. Supernovae are ten thousand times more frequent than GRBs and therefore low-energy neutrinos associated with GRBs constitute an insignificant contribution to the background in this energy range.

11.6. Black holes

An NSM results, inevitably, in a black hole [280]. Thus a direct implication of the NSM model is that GRBs signal to us (indirectly) that a black hole has just (with the appropriate time of flight in mind) formed.
12. Cosmological implications

Cosmological GRBs seem to be a relatively homogeneous population of sources with a narrow luminosity function (the peak luminosity of GRBs varies by less than a factor of 10 [185,188]) that is located at relatively high redshifts [56,306–308,185]. The universe and our Galaxy are transparent to MeV $\gamma$-rays (see e.g. [309]). Hence GRBs constitute a unique homogeneous population of
sources which does not suffer from any angular distortion due to absorption by the Galaxy or by any other object. Could GRBs be the holy grail of cosmology and provide us with the standard candles needed to determine the cosmological parameters $H_0$, $\Omega$, and $\Lambda$?

Lacking any spectral feature, there is no indication of the redshift of individual bursts. The available number vs. peak luminosity distribution is not suitable for distinguishing between different cosmological models, even when the sources are perfect standard candles with no source evolution [185]. The situation might be different if optical afterglow observations were to yield an independent redshift measurement for a large number of bursts. If the GRB luminosity function is narrow enough this might allow us, in the future, to determine the cosmological closure parameter $\Omega$ using a peak-flux vs. redshift diagram (or the equivalent, more common, magnitude–redshift diagram). For example, a hundred bursts with a measured $z$ are needed to estimate $\Omega$ with an accuracy of $\sigma_\Omega = 0.2$, if $\sigma_L/L = 1$ [204].

Currently, the rate of detection of bursts with counterparts is a few per year, and of those detected until now only two have a measured redshift. This rate is far too low for any cosmological measurement. However, there is an enormous potential for improvement. For example, systematic measurements of the redshift of all bursts observed by BATSE ($\approx 300$ per year) would yield, within one year, an independent estimate of $\Omega$ with $\sigma_\Omega = 0.1$, even if the luminosity function is wide ($\sigma_L/L = 0.9$).

Direct redshift measurements would also enable us to determine the cosmological evolution of the rate of GRBs [204]. Most current cosmological GRB models suggest that the GRB rate follows (with a rather short time delay) the rate of star formation [310]. Consequently, measurements of the rate of GRBs as a function of redshift will provide an independent tool to study star formation and galactic evolution.
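
The two accuracy figures quoted from [204] (0.2 for a hundred bursts with a luminosity width of 1, and 0.1 for about 300 bursts with a width of 0.9) can be compared with a simple counting-statistics scaling. The sketch below assumes that the uncertainty on $\Omega$ scales as the fractional luminosity width divided by the square root of the number of bursts. This scaling is an illustrative assumption, not the method of [204], and the function name `sigma_omega` and its parameters are hypothetical.

```python
import math


def sigma_omega(n_bursts, width, ref_n=100, ref_width=1.0, ref_sigma=0.2):
    """Rescale a reference accuracy on Omega to another sample.

    Assumption (not taken from [204]): the uncertainty is proportional to
    the fractional luminosity width and to 1/sqrt(number of bursts).
    The defaults are the reference case quoted in the text:
    100 bursts, width 1.0, accuracy 0.2.
    """
    return ref_sigma * (width / ref_width) * math.sqrt(ref_n / n_bursts)


# One year of BATSE bursts (about 300) with a wide luminosity function (0.9).
one_year = sigma_omega(n_bursts=300, width=0.9)
print(f"{one_year:.2f}")
```

Under this assumed scaling the one-year BATSE case comes out close to the quoted accuracy of 0.1, so the two numbers in the text are at least mutually consistent with simple $1/\sqrt{N}$ statistics.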
It is also expected that the bursts' sources follow the matter distribution. GRBs can then map the large-scale structure of the Universe on scales that cannot be spanned directly otherwise. Lamb and Quashnock [311] have pointed out that a population of several thousand cosmological bursts should show angular deviations from isotropy on a scale of a few degrees. This would immediately lead to new interesting cosmological limits. So far there is no detected anisotropy in the 1112 bursts of the BATSE 3B catalog [312]. But the potential of this population is clear and quite promising. A more ambitious project would be to measure the multipole moments of the GRB distribution and from this to estimate cosmological parameters [58]. However, it seems that too many bursts are required to reach a useful signal-to-noise ratio in such measurements.

GRBs can also serve to explore cosmology as a background population which could be lensed by foreground objects [53]. While a standard gravitationally lensed object appears as several images of the same object, the low angular resolution of GRB detectors is insufficient to distinguish between the positions of different images of a lensed GRB. However, the time delay along the different lines of sight of a gravitationally lensed burst will cause such a burst to appear as repeated bursts with the same time profile but different intensities from practically the same position on the sky. Mao [313] estimated that the probability for lensing of a GRB by a regular foreground galaxy is 0.04–0.4%. Hence, the lack of a confirmed lensed event so far [240] is not problematic yet. In the future, the statistics of lensed bursts could probe the nature of the lensing objects and the dark matter of the Universe [54]. The fact that no lensed bursts have been detected so far is sufficient to rule out a critical density ($\Omega = 1$) of $10^{6.5}\,M_\odot$ to $10^{8.1}\,M_\odot$ black holes [55]. Admittedly, this was not the leading candidate for cosmological dark matter.
Still this result is a demonstration of the power of
this technique and the potential of GRB lensing. The statistics of lensing depend on the distance to the lensed objects, which is quite uncertain at present. The detection of a significant number of counterparts whose redshift could be measured would significantly improve this technique as well.
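
The statement that the absence of a confirmed lensed burst is not yet problematic can be illustrated with Poisson statistics. The sketch below assumes that each burst is lensed independently with the per-burst probability quoted from [313] (0.04–0.4%) and that every lensed burst would be recognized, which overestimates the expected count since real searches have limited efficiency. The sample size of 1000 bursts and the function name `no_lens_probability` are hypothetical choices for illustration only.

```python
import math


def no_lens_probability(n_bursts, p_lens):
    """Return (expected lensed bursts, Poisson probability of seeing none).

    Assumptions: independent bursts, a fixed lensing probability per burst,
    and perfect detection of every lensed event.
    """
    expected = n_bursts * p_lens
    return expected, math.exp(-expected)


# Hypothetical sample of 1000 bursts, at both ends of the quoted range.
for p_lens in (0.0004, 0.004):
    expected, p_none = no_lens_probability(1000, p_lens)
    print(f"p = {p_lens:.4f}: expected = {expected:.1f}, P(no lensed burst) = {p_none:.3f}")
```

At the low end of the range a null result is the most likely outcome for such a sample, while at the high end it would already be improbable under these idealized assumptions, which is why larger samples make the lensing statistics a useful probe.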
13. Summary and conclusions

Some 30 years after the discovery of GRBs a generic GRB model is beginning to emerge. The observations of isotropy, peak flux distribution and time dilation indicated that GRBs are cosmological. The measurement of a redshift provided a final confirmation of this idea. All cosmological models are based on the fireball model. The discovery of the afterglow confirmed this general scheme. The fireball internal–external shocks model seems to have the necessary ingredients to explain the observations. Relativistic motion, which is the key component of all fireball models, provided the solution to the compactness problem. The existence of such motion was confirmed by the radio afterglow observations in GRB970508. Energy conversion via internal shocks can produce the observed highly variable light curves, while the external shock model agrees, at least qualitatively, with afterglow observations. There are some indications that the two kinds of shocks might combine and operate within the GRB itself, producing two different components of the signal. This fireball model has some fascinating immediate implications for accompanying UCHER and high-energy neutrino signals. An observation of these phenomena in coincidence with a GRB could provide a final confirmation of this model.

In spite of this progress we are still far from a complete solution. There are many open questions that have to be resolved. Within the internal–external shocks model there is a nagging efficiency problem in the conversion of the initial kinetic energy to the observed radiation. If the overall efficiency is too low the initial energy required might be larger than $10^{53}$ erg, and it is difficult to imagine a source that could provide so much energy. Beaming might provide a solution to this energy crisis. However, so far there is no indication of the corresponding break in the afterglow light curve, which is essential in any relativistic beaming model when $\gamma \sim \theta^{-1}$.
These last two facts might be consistent if GRB970508 took place within a very low density ISM, an issue that should be explored further. Afterglow observations agree qualitatively, but not quantitatively, with the model. Better observations and more detailed theoretical modeling are needed. Another nagging open question is what determines the appearance of an afterglow. Why was there no X-ray afterglow in the very strong GRB970111? Why was an optical afterglow observed in GRB970228 and in GRB970508 but not in others (in particular in GRB970828 [46])?

Finally we turn to the GRB itself and wonder why the observed radiation is always in the soft $\gamma$-ray band. Is there an observational bias? Are there other bursts that are not observed by current instruments? If there are none and we do observe all or most of the bursts, why is the emitted radiation always in the soft $\gamma$-ray range? Why is it insensitive to a likely variability in the Lorentz factors of the relativistic flow and to variability of other parameters in the model?

While there are many open questions concerning the fireball and the radiation-emitting regions, the first and foremost open question concerning GRBs is: what are the inner engines that power GRBs? In spite of all the recent progress we still don't know what produces GRBs. My personal impression is that binary neutron star mergers are the best candidates. But other models that are based
on the formation of a compact object and release a significant amount of its binding energy on a short time scale are also viable. A nagging question in all these models is what produces the 'observed' ultra-relativistic flow. How are $\sim 10^{-5}\,M_\odot$ of baryons accelerated to an ultra-relativistic velocity with $\gamma \sim 100$ or larger? Why is the baryonic load so low? Why is it not lower? There is no simple model for that. An ingenious theoretical idea is clearly needed here. However, I believe that theoretical reasoning won't be enough and only observations can provide a final resolution of the question: what are the sources of GRBs?

The binary neutron star merger model has one specific observational prediction: a coincidence between a (nearby and therefore strong) GRB and a characteristic gravitational radiation signal. Luckily these events have a unique gravitational radiation signature. The detection of these gravitational radiation events is the prime target of three gravitational radiation detectors that are being built now. Hopefully, they will become operational within the next decade and their observations might confirm or rule out this model. Such predictions of independently observed phenomena are clearly needed for all other competing models.

GRBs seem to be the most relativistic phenomenon discovered so far. They involve a macroscopic relativistic motion not found elsewhere before. As cosmological objects they display numerous relativistic cosmological phenomena. According to the NSM model they are associated with the best sources of gravitational radiation emission and, more than that, they signal, though indirectly, the formation of a new black hole.
Acknowledgements

I thank E. Cohen, J. Granot, J.I. Katz, S. Kobayashi, R. Narayan, and R. Sari for many helpful discussions and D. Band and G. Blumenthal for helpful remarks. I thank Columbia University Physics Department and Marc Kamionkowski for hospitality while this manuscript was completed.
References

[1] R.W. Klebesadel, I.B. Strong, R.A. Olson, Astrophys. J. Lett. 182 (1973) L85.
[2] E.P. Mazets, S.V. Golenetskii, V.N. Ilyinskii, JETP Lett. 19 (1974) 77.
[3] T. Cline, Astrophys. J. Lett. 185 (1973) L1.
[4] R. Hillier, Gamma-Ray Astronomy, Clarendon Press, Oxford, England, 1984.
[5] P.V. Ramana Murthy, A.W. Wolfendale, Gamma-Ray Astronomy, Cambridge University Press, Cambridge, England, 1986.
[6] P. Mészáros, High Energy Radiation from Magnetized Neutron Stars, The University of Chicago Press, Chicago, 1992.
[7] T.L. Cline, in: S.P. Maran (Ed.), The Astronomy & Astrophysics Encyclopedia, Van Nostrand Reinhold & Cambridge University Press, New York, 1992, p. 284.
[8] J.P. Luminet, in: J. Adouze, G. Israel (Eds.), The Cambridge Atlas of Astronomy, Cambridge University Press, Cambridge, 1992.
[9] C.A. Meegan et al., Nature 355 (1992) 143.
[10] E. Costa et al., Nature 387 (1997) 783.
[11] J. van Paradijs et al., Nature 386 (1997) 686.
[12] H.E. Bond, IAU circ. 6665, 1997.
[13] D.A. Frail et al., Nature 389 (1997) 261.
[14] M.R. Metzger et al., Nature 387 (1997) 878.
[15] S. Kulkarni et al., Nature 393 (1998) 35.
[16] S.G. Djorgovski et al., GCN notice 139, 1998.
[17] B. Paczyński, J. Rhoads, Astrophys. J. Lett. 418 (1993) L5.
[18] J.I. Katz, Astrophys. J. 422 (1994) 248.
[19] J.I. Katz, Astrophys. J. 432 (1994) L107.
[20] R. Sari, T. Piran, Astrophys. J. 485 (1997) 270, astro-ph/9701002.
[21] P. Mészáros, M.J. Rees, Astrophys. J. 476 (1997) 232.
[22] A.M.J. Wijers, M.J. Rees, P. Mészáros, Mon. Not. R. Astron. Soc. 288 (1997) L51.
[23] E. Waxman, Astrophys. J. Lett. 485 (1997) L5.
[24] M. Vietri, Astrophys. J. Lett. 478 (1997) L9.
[25] J.I. Katz, T. Piran, Astrophys. J. 490 (1997) 772.
[26] J.I. Katz, T. Piran, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[27] P. Mészáros, M.J. Rees, Mon. Not. R. Astron. Soc. 258 (1992) 41p.
[28] R. Narayan, B. Paczyński, T. Piran, Astrophys. J. Lett. 395 (1992) L83.
[29] M.J. Rees, P. Mészáros, Astrophys. J. Lett. 430 (1994) L93.
[30] B. Paczyński, G. Xu, Astrophys. J. 427 (1994) 709.
[31] T. Piran, in: J.N. Bahcall, J.P. Ostriker (Eds.), Some Unsolved Problems in Astrophysics, Princeton University Press, Princeton, NJ, 1997.
[32] R. Mochkovitch, V. Maitia, R. Marques, in: K. Bennett, C. Winkler (Eds.), Towards the Source of Gamma-Ray Bursts, Proc. 29th ESLAB Symp., p. 531.
[33] S. Kobayashi, T. Piran, R. Sari, Astrophys. J. 490 (1997) 92.
[34] R. Mochkovitch, F. Daigne, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[35] D. Eichler, M. Livio, T. Piran, D.N. Schramm, Nature 340 (1989) 126.
[36] J.I. Katz, T. Piran, R. Sari, Phys. Rev. Lett. 80 (1998) 1580.
[37] G.J. Fishman, C.A. Meegan, Annu. Rev. Astron. Astrophys. 33 (1995) 415.
[38] G.J. Fishman, PASP 107 (1995) 1145.
[39] M.S. Briggs, Astrophys. Space Sci. 231 (1995) 3.
[40] C. Kouveliotou, in: M. Matsuoka, N. Kawai (Eds.), All-Sky X-ray Observations in the Next Decade, RIKEN, Japan, 1997, p. 201.
[41] D. Hartmann, Astron. Astrophys. Rev. 6 (1995) 225.
[42] W.S. Paciesas, G.J. Fishman (Eds.), AIP Conf. Proc., vol. 265, Gamma-Ray Bursts, AIP, New York, 1991.
[43] G.J. Fishman, J.J. Brainerd, K. Hurley (Eds.), AIP Conf. Proc., vol. 307, Gamma-Ray Bursts, Second Workshop, Huntsville, Alabama, AIP, New York, 1994.
[44] C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), AIP Conf. Proc., vol. 384, Gamma-Ray Bursts, 3rd Huntsville Symp., AIP, New York, 1995.
[45] C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP, New York.
[46] B. Paczyński, Astrophys. J. Lett. 494 (1998) L45 (see also astro-ph/9706232).
[47] P. Mészáros, M.J. Rees, A.M.J. Wijers, astro-ph/9709273.
[48] R.A. Hulse, J.H. Taylor, Astrophys. J. 368 (1975) 504.
[49] J.P.A. Clark, D. Eardley, Astrophys. J. 215 (1977) 311.
[50] R. Narayan, T. Piran, A. Shemi, Astrophys. J. Lett. 379 (1991) L1.
[51] E.S. Phinney, Astrophys. J. 380 (1991) L17.
[52] E.P.J. van den Heuvel, D.R. Lorimer, Mon. Not. R. Astron. Soc. 283 (1996) L37.
[53] B. Paczyński, Astrophys. J. Lett. 308 (1986) L43.
[54] O.M. Blaes, R.L. Webster, Astrophys. J. 391 (1992) L66.
[55] R.J. Nemiroff et al., Astrophys. J. 414 (1993) 36.
[56] T. Piran, Astrophys. J. Lett. 389 (1992) L45.
[57] D.Q. Lamb, J.M. Quashnock, Astrophys. J. Lett. 415 (1993) L1.
[58] T. Piran, A. Singh, Astrophys. J. 483 (1997) 552.
[59] K. Hurley, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc., vol. 428, AIP, New York.
[60] G.J. Fishman et al., Astron. Astrophys. Suppl. 97 (1993) 17.
[61] R. Klebesadel, J. Laros, E.E. Fenimore, Bull. Am. Astron. Soc. 16 (1984) 1016.
[62] T.M. Koshut et al., Astrophys. J. 452 (1995) 145.
[63] J. Laros et al., Proc. 19th ICRC, San Diego, CA, 1995, OG 1.1-2, 5.
[64] T. Murakami et al., Nature 350 (1991) 592.
[65] A. Yoshida et al., PASJ 41 (1989) 509.
[66] P.N. Bhat et al., Nature 359 (1992) 217.
[67] B.E. Schaefer, K.C. Walker, astro-ph/9810270.
[68] K. Hurley, Astrophys. J. Suppl. 90 (1994) 857.
[69] C. Meegan et al., IAU Circ. 6518, 1996.
[70] J.I. Katz, Astrophys. J. 490 (1997) 633.
[71] K. Hurley, in: W. Paciesas, G.J. Fishman (Eds.), Gamma Ray Bursts, AIP, New York, p. 3.
[72] E.P. Mazets et al., Astrophys. Space Sci. 80 (1981) 3.
[73] C. Kouveliotou et al., Astrophys. J. 413 (1993) L101.
[74] D.Q. Lamb, C. Graziani, I.A. Smith, Astrophys. J. 413 (1993) L11.
[75] S. Mao, R. Narayan, T. Piran, Astrophys. J. 420 (1994) 171.
[76] R.W. Klebesadel, in: C. Ho, R.I. Epstein, E.E. Fenimore (Eds.), Gamma-Ray Bursts, Cambridge University Press, Cambridge, 1992, p. 161.
[77] J.-P. Dezaley et al., in: G.J. Fishman, J.J. Brainerd (Eds.), Gamma-Ray Bursts, 2nd Huntsville Symp., AIP Conf. Proc. 307, AIP, New York, 1992.
[78] E. Cohen, T. Kollat, T. Piran, astro-ph/9406012, 1994.
[79] T. Piran, in: E.P.J. van den Heuvel, J. van Paradijs (Eds.), IAU Symp. 165 on Compact Stars in Binaries, The Hague, Netherlands, 15–19 Aug. 1995, Kluwer Publishing, Dordrecht, 489, 1996.
[80] J.I. Katz, L.M. Canel, Astrophys. J. 471 (1996) 527.
[81] K.C. Walker, B.E. Schaefer, E.E. Fenimore, astro-ph/9810271.
[82] H.S. Park et al., Astrophys. J. Lett. (1997) submitted, astro-ph/9708130.
[83] K.J. Hurley et al., Nature 372 (1994) 652.
[84] B. Dingus et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[85] R.M. Kippen et al., in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, 3rd Huntsville Symp., AIP Conf. Proc. 384, AIP, New York, 1996.
[86] D.L. Band et al., Astrophys. J. 413 (1993) 281.
[87] B.E. Schaefer et al., in: G.J. Fishman, J.J. Brainerd (Eds.), Gamma-Ray Bursts, 2nd Huntsville Symp., AIP Conf. Proc. 307, AIP, New York, 1992.
[88] J. Greiner et al., Astron. Astrophys. 302 (1994) 1216.
[89] B.E. Schaefer et al., Astrophys. J. 492 (1998) 696.
[90] R.S. Mallozzi et al., Astrophys. J. 454 (1995) 597.
[91] E. Cohen, R. Narayan, T. Piran, Astrophys. J. 500 (1998) 888, astro-ph/9710064.
[92] T. Piran, R. Narayan, in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, 3rd Huntsville Symp., AIP Conf. Proc. 384, AIP, New York, 1995.
[93] N.M. Lloyd, V. Petrosian, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[94] R. Lingenfelter, J. Higdon, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[95] M. Harris, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[96] J.N. Imamura, R.I. Epstein, Astrophys. J. 313 (1987) 711.
[97] E. Cohen et al., Astrophys. J. 480 (1997) 330.
[98] R.D. Preece et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[99] S.V. Golenetskii et al., Nature 306 (1983) 451.
[100] I.G. Mitrofanov et al., Sov. Astron. 28 (1984) 547.
[101] J.P. Norris et al., Astrophys. J. 301 (1986) 213.
[102] D. Band et al., in: W.S. Paciesas, G.J. Fishman (Eds.), Gamma-Ray Bursts, AIP Conf. Proc. 265, AIP, New York, 1991.
[103] L.A. Ford et al., Astrophys. J. 439 (1995) 307.
[104] E.E. Fenimore et al., Astrophys. J. Lett. 448 (1995) L101.
[105] R. Sari, R. Narayan, T. Piran, Astrophys. J. 473 (1996) 204.
[106] T. Murakami et al., Nature 335 (1988) 234.
[107] E.E. Fenimore et al., Astrophys. J. Lett. 335 (1988) L71.
[108] E. Mazets et al., Sov. Astron. Lett. 6 (1980) 372.
[109] D.M. Palmer et al., Astrophys. J. Lett. 433 (1994) L77.
[110] D.L. Band et al., Astrophys. J. 458 (1996) 746.
[111] D.L. Band et al., Astrophys. J. 447 (1995) 289.
[112] P. Mészáros, M.J. Rees, Astrophys. J. Lett. (1993) in press, astro-ph/9804119.
[113] G.M. Pendleton et al., in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, 3rd Huntsville Symp., AIP Conf. Proc. 384, AIP, New York, 1995.
[114] C. Graziani, in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, 3rd Huntsville Symp., AIP Conf. Proc. 384, AIP, New York, 1995.
[115] M. Briggs, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[116] K. Hurley et al., Astron. Astrophys. Suppl. 97 (1993) 39.
[117] T.L. Cline, Ann. N. Y. Acad. Sci. 262 (1975) 159.
[118] S. Van Den Bergh, Astrophys. Space Sci. 97 (1983) 385.
[119] B.R. Schaefer, in: G.J. Fishman, J.J. Brainerd (Eds.), Gamma-Ray Bursts, 2nd Huntsville Symp., AIP Conf. Proc. 307, AIP, New York, 1992.
[120] F.J. Vrba, in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, 3rd Huntsville Symp., AIP Conf. Proc. 384, AIP, New York.
[121] B.R. Schaefer et al., Astrophys. J. 313 (1987) 226.
[122] B.R. Schaefer et al., Astrophys. J. 340 (1989) 455.
[123] B.R. Schaefer, Astrophys. J. 364 (1990) 590.
[124] F.J. Vrba, D.H. Hartmann, M.C. Jennings, Astrophys. J. 446 (1995) 115.
[125] C.B. Luginbuhl et al., Astrophys. Space Sci. 231 (1995) 289.
[126] S.B. Larson, I.S. McLean, E.E. Becklin, Astrophys. J. Lett. 460, L95.
[127] S.B. Larson, I.S. McLean, Astrophys. J. Lett. 491 (1997) 93L.
[128] B.R. Schaefer et al., Astrophys. J. 489 (1997) 636.
[129] D. Band, D.H. Hartmann, Astrophys. J. 493 (1998) 555.
[130] D.W. Hogg, A.S. Fruchter, astro-ph/9807262, 1998.
[131] S.G. Djorgovski, astro-ph/9808188, 1998.
[132] J.S. Bloom et al., astro-ph/9807315, 1998.
[133] A. Fruchter et al., astro-ph/9807295, 1998.
[134] L. Piro, L. Scarsi, R. Butler, Proc. SPIE 2517 (1995) 169.
[135] K. Sahu et al., Nature 387 (1997) 476.
[136] D. Lamb, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[137] F. Frontera et al., 1997, preprint, astro-ph/9711279.
[138] F. Frontera et al., IAU Circ. 6637, 1997.
[139] T. Murakami et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[140] T.J. Galama et al., Nature 387 (1997) 497.
[141] A. Fruchter et al., IAU Circ. 6747, 1997.
[142] D.A. Frail, S. Kulkarni, private communication, 1997.
[143] C. Kouveliotou et al., IAU Circ. No. 6660, 1997.
[144] L. Piro, IAU Circ. No. 6656, 1997.
[145] S.G. Djorgovski et al., IAU Circ. No. 6655.
[146] S.G. Djorgovski et al., IAU Circ. No. 6658.
[147] S.G. Djorgovski et al., IAU Circ. No. 6660.
[148] M. Mignoli et al., IAU Circ. No. 6661.
[149] C. Chevalier, S.A. Ilovaisky, IAU Circ. No. 6663, 1997.
[150] G. Taylor et al., Nature 389 (1997) 263.
[151] E. Pian et al., Astrophys. J. Lett. 492 (1998) 103.
[152] P. Natarajan et al., New Astron. 2 (1997) 471.
[153] T.J. Galama et al., Astrophys. J. Lett. 497 (1998) L13.
[154] A.J. Castro-Tirado et al., Science 279 (1998) 1011.
[155] H. Pedersen et al., Astrophys. J. 496 (1998) 311.
[156] V.V. Sokolov et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[157] J.S. Bloom, GCN Note C30, 1998.
[158] A.J. Castro-Tirado et al., IAUC 6848, 1998.
[159] V.V. Sokolov et al., Astron. Astrophys. 334 (1998) 117.
[160] J. Goodman, New Astron. 2 (1997) 449.
[161] R.A. Remillard et al., IAU Circ. 6726, 1997.
[162] D. Smith et al., IAU Circ. 6728, 1997.
[163] P.J. Groot et al., Astrophys. J. Lett. 493 (1998) L27, astro-ph/9711171.
[164] J. Heise et al., IAU Circ. 6787, 1997.
[165] J. Halpern et al., IAU Circ. 6788, 1997.
[166] C. Meegan et al., BATSE catalogue, 1998, http://www.batse.msfc.nasa.gov/data/grb/catalog/.
[167] T.J. Galama et al., Nature, submitted; astro-ph/9806175 (1998).
[168] R.M. Kippen et al., GCN Circ. no. 143, 1998.
[169] J. Quashnock, D. Lamb, Mon. Not. R. Astron. Soc. 265 (1993) L59.
[170] R. Narayan, T. Piran, Mon. Not. R. Astron. Soc. 265 (1993) L65.
[171] D.H. Hartmann et al., in: G.J. Fishman, J.J. Brainerd, K. Hurley (Eds.), AIP Conf. Proc., Gamma-Ray Bursts, Second Workshop, Huntsville, Alabama, 1993, 307, AIP, New York, 1994.
[172] C.A. Meegan et al., Astrophys. J. 434 (1995) 552.
[173] M. Tegmark et al., Astrophys. J. 466 (1996) 757.
[174] D.H. Hartmann, G.R. Blumenthal, Astrophys. J. 342 (1989) 521.
[175] T. Kollat, T. Piran, Astrophys. J. Lett. 467 (1996) 41L.
[176] D. Kompaneetz, B. Stern, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[177] K. Hurley et al., Astrophys. J. Lett. 479 (1997) 113.
[178] N. Schartel, H. Andernach, J. Greiner, Astron. Astrophys. 323 (1997) 659.
[179] M. Schmidt, J.C. Higdon, G. Jueter, Astrophys. J. Lett. 329 (1988) L85.
[180] G.N. Pendleton et al. (1995) in preparation (quoted in [39]).
[181] J.I. Katz, Astrophys. Space Sci. 197 (1992) 163.
[182] D.H. Hartmann et al., Astrophys. Space Sci. 231 (1995) 361.
[183] T.J. Loredo, I.M. Wassermann, Astrophys. J. Suppl. 96 (1995) 59.
[184] T.J. Loredo, I.M. Wassermann, Astrophys. J. Suppl. 96 (1995) 261.
[185] E. Cohen, T. Piran, Astrophys. J. 444 (1995) L25.
[186] S. Weinberg, General Relativity & Cosmology, Wiley, New York, 1972.
[187] R.E. Rutledge, L. Hui, W.H.G. Lewin, Mon. Not. R. Astron. Soc. 276 (1995) 753.
[188] J.M. Horack, A.G. Emslie, Astrophys. J. 428 (1994) 620.
[189] P. Mészáros, A. Mészáros, Astrophys. J. 449 (1995) 9.
[190] A. Mészáros, P. Mészáros, Astrophys. J. 466 (1996) 29.
[191] I. Horváth, P. Mészáros, A. Mészáros, Astrophys. J. 470 (1996) 56.
[192] D.E. Reichart, P. Mészáros, Astrophys. J. 483 (1997) 597.
[193] E. Cohen, T. Piran, in: C. Kouveliotou, M.S. Briggs, G.J. Fishman (Eds.), Gamma-Ray Bursts, Huntsville, Alabama, 1995, AIP, New York, 1996.
[194] A.P. Kirshner et al., Astrophys. J. 88 (1983) 1285.
[195] T. Totani, Astrophys. J. Lett. 486 (1997) 71.
[196] K. Sahu et al., Astrophys. J. Lett. 489 (1997) L127.
[197] R.A.M.J. Wijers et al., Mon. Not. R. Astron. Soc. 294 (1998) 13.
[198] J. Kommers et al., astro-ph/9809300, 1988.
[199] S.J. Lilly et al., Astrophys. J. Lett. 460 (1996) L1.
[200] P. Madau et al., Mon. Not. R. Astron. Soc. 283 (1996) 1388.
[201] A.J. Connolly et al., Astrophys. J. Lett. 486, L11.
[202] M. Krumholz, S.E. Thorsett, F.A. Harrison, astro-ph/9807117, 1998.
[203] E.E. Fenimore et al., Nature 366 (1993) 40.
[204] E. Cohen, T. Piran, Astrophys. J. Lett. 488 (1997) L7.
[205] R.J. Nemiroff et al., Astrophys. J. 423 (1994) 432.
[206] J.P. Norris et al., Astrophys. J. 439 (1995) 542.
[207] E.E. Fenimore, J.S. Bloom, 453 (1995) 25.
[208] M. Ruderman, Ann. N. Y. Acad. Sci. 262 (1975) 164.
[209] W.K.H. Schmidt, Nature 271 (1978) 525.
[210] P.W. Guilbert, A.C. Fabian, M.J. Rees, Mon. Not. R. Astron. Soc. 205 (1983) 593.
[211] B.J. Carigan, J.I. Katz, Astrophys. J. 399 (1992) 100.
[212] T. Piran, A. Shemi, Astrophys. J. Lett. 403 (1993) L67.
[213] E.E. Fenimore, R.I. Epstein, C.H. Ho, Astron. Astrophys. Suppl. 97 (1993) 59.
[214] E. Woods, A. Loeb, Astrophys. J. 383 (1995) 292.
[215] M.G. Baring, A.K. Harding, Astrophys. J. 491 (1997) 663.
[216] A. Loeb, Phys. Rev. D48 (1993) 3419.
[217] L. Woltjer, Astrophys. J. Lett. 146 (1966) 597.
[218] M.J. Rees, Mon. Not. R. Astron. Soc. 135 (1967) 345.
[219] I.F. Mirabel, L.F. Rodriguez, Nature 371 (1995) 46.
[220] J. Goodman, Astrophys. J. 308 (1986) L47.
[221] J.H. Krolik, E.A. Pier, Astrophys. J. 373 (1991) 277.
[222] P. Mészáros, M.J. Rees, Astrophys. J. 405 (1993) 278.
[223] A. Shemi, T. Piran, Astrophys. J. 365 (1990) L55.
[224] B. Paczyński, Astrophys. J. 363 (1990) 218.
[225] T. Piran, in: G.J. Fishman, J.J. Brainerd, K. Hurley (Eds.), AIP Conf. Proc. 307, Gamma-Ray Bursts, Second Workshop, Huntsville, Alabama, 1993, AIP, New York, 1994, p. 495.
[226] R. Narayan, T. Piran, 1995, in preparation.
[227] C. Thompson, Mon. Not. R. Astron. Soc. 270 (1994) 480.
[228] V.V. Usov, Mon. Not. R. Astron. Soc. 267 (1994) 1035.
[229] V.V. Usov, M.V. Smolsky, Astrophys. J. 461 (1996) 858.
[230] P. Mészáros, M.J. Rees, Astrophys. J. Lett. 482 (1997) L89.
[231] S. Kobayashi, T. Piran, R. Sari, astro-ph/9803217, 1998.
[232] T. Piran, A. Shemi, R. Narayan, Mon. Not. R. Astron. Soc. 263 (1993) 861.
[233] E.E. Fenimore, C. Madras, S. Nayakshin, Astrophys. J. 473 (1996) 998.
[234] E.E. Fenimore et al., astro-ph/9802200, 1998.
[235] R. Sari, T. Piran, Mon. Not. R. Astron. Soc. 287 (1997) 110.
[236] R. Sari, T. Piran, Astrophys. J. Lett. 455 (1995) 143.
[237] A. Shemi, Mon. Not. R. Astron. Soc. 269 (1994) 1112.
[238] N. Shaviv, A. Dar, Mon. Not. R. Astron. Soc. 277 (1995) 287.
[239] P. Mészáros, The 17th Texas Symp. on Relativistic Astrophysics and Cosmology, 1995.
[240] R.J. Nemiroff et al., Astrophys. J. 432 (1994) 478.
[241] R.D. Blandford, C.F. McKee, Phys. Fluids 19 (1976) 1130.
[242] R.D. Blandford, C.F. McKee, Mon. Not. R. Astron. Soc. 180 (1976) 343.
[243] P. Mészáros, P. Laguna, M.J. Rees, Astrophys. J. 415 (1993) 181.
[244] B.R. Schaefer et al., Astrophys. J. 492 (1997) 696.
[245] V.V. Usov, M.V. Smolsky, Phys. Rev. E57 (1998) 2267.
[246] J.G. Kirk, in: A.O. Benz, T.J.L. Courvoisier (Eds.), Plasma Astrophysics, Springer, New York, 1994.
[247] A.M. Hillas, Annu. Rev. Astron. Astrophys. 22 (1984) 42.
[248] G.B. Rybicki, A.P. Lightman, Radiative Processes in Astrophysics, 1979.
[249] R. Sari, T. Piran, R. Narayan, Astrophys. J. Lett. 497 (1998) L41.
[250] E. Waxman, Astrophys. J. Lett. 489 (1997) L33.
[251] R.A.M.J. Wijers, T.J. Galama, astro-ph/9805341.
[252] J. Granot, T. Piran, R. Sari, astro-ph/9808007.
[253] R. Pilla, A. Loeb, Astrophys. J. Lett. 494 (1998) 167, astro-ph/9710219.
[254] E. Waxman, T. Piran, Astrophys. J. 433 (1994) L85.
[255] R. Sari, T. Piran, in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[256] R. Sari, Astrophys. J. Lett. 489 (1997) 37.
[257] R. Sari, Astrophys. J. Lett. 494 (1998) L49.
[258] E. Waxman, Astrophys. J. Lett. 491 (1997) L19.
[259] A. Panaitescu, P. Mészáros, Astrophys. J. 492 (1998) 683, astro-ph/9709284.
[260] E. Cohen, T. Piran, R. Sari, astro-ph/9803258.
[261] J.E. Rhoads, Astrophys. J. Lett. 487 (1997) L1.
[262] J. Granot, T. Piran, R. Sari, astro-ph/9806192.
[263] T.J. Galama et al., Astrophys. J. Lett., 1998, in press, astro-ph/9804191.
[264] T.J. Galama et al., Astrophys. J. Lett., 1998, in press, astro-ph/9804190.
[265] S.A. Colgate, Astrophys. J. 187 (1974) 333.
[266] R. Perna, A. Loeb, astro-ph/9810085.
[267] L. Wang, J.C. Wheeler, Astrophys. J. Lett. 584 (1998) 87, astro-ph/9806212.
[268] R.M. Kippen et al., astro-ph/9806364.
[269] J.S. Bloom et al., astro-ph/9807050.
[270] C. Graziani, D.Q. Lamb, G.H. Marion, astro-ph/9810374.
[271] V.V. Usov, Nature 357 (1992) 472.
[272] R.J. Nemiroff, Commun. Astrophys. 17 (1993) 189.
[273] S.I. Belinikov, I.D. Novikov, T.V. Perevodchikova, A.G. Polnarev, Sov. Astron. Lett. 10 (1984) 177.
[274] J. Goodman, A. Dar, S. Nussinov, Astrophys. J. Lett. 314 (1987) L7.
[275] B. Paczyński, Acta Astron. 41 (1991) 257.
[276] T. Piran, R. Narayan, A. Shemi, in: W.S. Paciesas, G.J. Fishman (Eds.), AIP Conf. Proc. 265, Gamma-Ray Bursts, Huntsville, Alabama, 1991, AIP, New York, 1992, p. 149.
[277] S.E. Woosley, Astrophys. J. 405 (1993) 273.
[278] B. Carter, Astrophys. J. Lett. 391 (1992) L67.
[279] J.H. Taylor, J.M. Weisberg, Astrophys. J. 253 (1982) 908.
[280] M.B. Davies, W. Benz, T. Piran, F.K. Thielemann, Astrophys. J. 431 (1994) 742.
[281] J.P.A. Clark, D. Eardley, Astrophys. J. 215 (1977) 311.
[282] A. Wolszczan, Nature 350 (1991) 688.
[283] A.V. Tutukov, L.R. Yungelson, Mon. Not. R. Astron. Soc. 268 (1994) 871.
[284] V.M. Lipunov, K.A. Postnov, M.E. Prokhorov, Mon. Not. R. Astron. Soc. 288, 245.
[285] L.R. Yungelson, S.F. Portegies Zwart, astro-ph/9801127, 1998.
[286] H.A. Bethe, G.E. Brown, astro-ph/9802084.
[287] A.G. Lyne, D.R. Lorimer, Nature 369 (1994) 127.
[288] N.E. White, J. van Paradijs, Astrophys. J. Lett. 473 (1996) L25.
[289] C. Fryer, V. Kalogera, Astrophys. J. (1997), in press.
[290] J.M. Cordes, D.F. Chernoff, astro-ph/9707308.
[291] M. Jaroszyński, Acta Astron. 43 (1993) 183.
[292] M. Jaroszyński, Astron. Astrophys. 305 (1996) 839.
[293] M. Ruffert, H.-T. Janka, Astron. Astrophys. 307 (1996) L33.
[294] M. Ruffert, H.-T. Janka, G. Schaefer, Astron. Astrophys. 311, 532.
[295] M. Ruffert, H.-T. Janka, K. Takahashi, G. Schaefer, Astron. Astrophys. 319, 122.
[296] J.R. Wilson, G.J. Mathews, P. Marronetti, Phys. Rev. D 54 (1996) 1317.
[297] G.J. Mathews et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[298] E. Waxman, Astrophys. J. Lett. 452 (1995) 1.
[299] M. Vietri, Astrophys. J. 453 (1995) 883.
[300] M. Milgrom, V. Usov, Astrophys. J. Lett. 449 (1995) L37.
[301] E. Waxman, J.N. Bahcall, Phys. Rev. Lett. 78 (1997) 2292.
[302] E. Waxman, Phys. Rev. Lett. 75 (1995) 386.
[303] A. Abramovici et al., Science 256 (1992) 325.
[304] C. Bradaschia et al., Nucl. Inst. A. 289 (1990) 518.
[305] C. Kochanek, T. Piran, Astrophys. J. Lett. 417 (1993) L17.
[306] S. Mao, B. Paczyński, Astrophys. J. Lett. 388 (1992) L45.
[307] C.D. Dermer, Phys. Rev. Lett. 68 (1992) 1799.
[308] W.A.D.T. Wickramasinghe et al., Astrophys. J. Lett. 411 (1993) L55.
[309] A.A. Zdziarski, R. Svensson, Astrophys. J. 344 (1989) 551.
[310] M. Livio et al., in: C. Meegan, R. Preece, T. Koshut (Eds.), Gamma-Ray Bursts, 4th Huntsville Symp., AIP Conf. Proc. 428, AIP, New York, 1997.
[311] D.Q. Lamb, J.M. Quashnock, Astrophys. J. Lett. 415 (1993) L1.
[312] M. Tegmark, D.H. Hartmann, M.S. Briggs, C.A. Meegan, Astrophys. J. 486 (1995) 214.
[313] S. Mao, Astrophys. J. Lett. 389 (1992) L41.
Physics Reports 314 (1999) 669
Erratum
The physics and mathematics of the second law of thermodynamics (Physics Reports 310 (1999) 1–96)*
Elliott H. Lieb, Jakob Yngvason
Departments of Physics and Mathematics, Princeton University, Jadwin Hall, P.O. Box 708, Princeton, NJ 08544, USA
Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, A 1090 Vienna, Austria

Due to an oversight by the Publisher, on page 74 an incorrect version of Fig. 8 has been printed. The correct figure is reproduced below.

Fig. 8. This shows isotherms in the (U, V) plane near the triple point of a simple system. If one substituted pressure or temperature for U or V as a coordinate then the full two-dimensional region would be compressed into a one-dimensional region. In the triple point region the temperature is constant, which shows that the isotherms need not be one-dimensional curves.

* PII of the original article: S0370-1573(98)00082-9

0370-1573/99/$ - see front matter © 1999 E.H. Lieb and J. Yngvason. Published by Elsevier Science B.V. All rights reserved. PII: S0370-1573(99)00029-0