Preface
This monograph, based on the author’s 1997 EMS Lectures given at the University of Helsinki in May/June 1997, outlines the Loeb measure construction (a way to construct rich measure spaces using Robinson’s nonstandard analysis) and discusses recent applications in stochastic fluid mechanics, stochastic calculus of variations (“Malliavin calculus” and related topics), and mathematical finance theory.

The four lectures in Helsinki were designed for a general audience, as is the expanded version presented here. No previous knowledge of either nonstandard analysis or the fields of application is assumed, beyond the general knowledge of the working mathematician.

The aim in Chapter 1 is to provide a brief but coherent account of the fundamentals of nonstandard analysis (NSA) and the Loeb construction that is sufficient to make sense of the applications of the later chapters. For each of these we have endeavoured to provide enough introduction to the topics concerned to enable even the reader unfamiliar with them to appreciate the basic ideas of the field and then the particular contributions that can be made using NSA and Loeb measures. In fact, one of the major contributions that NSA has made to many fields of application is to aid in understanding the basic ideas of that field¹ – and it is hoped that among other things this will come over in this monograph.

To cover both an introduction to NSA and Loeb measures together with applications to three advanced and diverse fields of current research in four lectures (and now in four corresponding chapters) is somewhat ambitious. Necessarily the treatment will omit many details. The present volume should be seen then as something of a trailer for in-depth study of both Loeb measures and the way in which they can make useful contributions to mathematical research.
¹ A classic example of this is Anderson’s construction of Brownian motion as an infinitesimal random walk, discussed in Chapter 3; at a more elementary level is Robinson’s original discovery of how NSA can be used to develop real analysis rigorously, using infinitesimals to make precise the informal ideas of differentiation and integration.

The topics chosen for discussion in Chapters 2–4 are drawn mainly from work of the author in collaboration with others, and bring together material that has mostly been published elsewhere but is scattered. In each of the areas the applications include results that represent advances in the standard theory.

The applications in Chapters 2–4 are in the three seemingly unrelated areas mentioned in the opening paragraph. The link between them, from the point of view of this monograph, is the common methodology of Loeb measure techniques. This stems of course from the fact that each field involves measures and integration – and in most cases there is the more specific common feature of stochastic analysis in a variety of guises. But there is also a less obvious unifying factor that is harder to pin down precisely. This involves the idea of passage to a limit in a very generalised way. NSA facilitates this because, having constructed or defined an object $X_n$, say, for each finite $n$ in the standard world, we have automatically in the nonstandard world an object $X_N$ for infinite natural number $N$ (the meaning of this will be made precise in Chapter 1). Such $N$ is called hyperfinite – that is, finite from the point of view of NSA but infinite in that $N > n$ for all $n \in \mathbb{N}$.² The work is then to find some standard (real world) object associated with $X_N$ which will provide the solution to the problem in question.

In the applications to fluid mechanics, for example (Chapter 2), the fundamental equations (PDEs and stochastic PDEs) are normally solved by solving finite dimensional approximations, which are ODEs and stochastic DEs, and then “passing to the limit”. In our approach, the passage to the limit is achieved by taking $X_N$ (in the terminology above) to be a solution for the (infinite but) hyperfinite dimensional approximation of dimension $N$. From the nonstandard solution $X_N$ a standard solution is obtained.

The “Malliavin” calculus – treated in Chapter 3 – is at heart a kind of differential calculus for the Wiener space $C_0[0,1]$, thought of as a subspace of $\mathbb{R}^{[0,1]}$, which is itself viewed as a product space generalising $\mathbb{R}^n$, with its associated differential calculus.
Conventional expositions do not make this so clear. In our approach we achieve the “passage to the limit” from $\mathbb{R}^n$ to $C_0[0,1]$ by considering $^*\mathbb{R}^N$ for infinite hyperfinite $N$. The Malliavin calculus can then be seen as a suitable projection of classical calculus in $^*\mathbb{R}^N$ onto $C_0[0,1]$. Among other applications, this shows how Wiener measure “is” simply the uniform probability measure on the sphere $S^\infty(\sqrt{\infty})$, and allows a precise formulation of the experts’ intuition that the infinite dimensional Ornstein–Uhlenbeck process “is” Brownian motion on $S^\infty(\sqrt{\infty})$.

² As the reader may be aware, the starting point of NSA is to construct a field $^*\mathbb{R} \supset \mathbb{R}$ that contains both infinitesimal and infinite elements – and this contains a corresponding extension $^*\mathbb{N}$ of $\mathbb{N}$. Then an infinite hyperfinite number $N$ is an element of $^*\mathbb{N} \setminus \mathbb{N}$.

The final applications, in Chapter 4, are in the field of modern mathematical finance theory, which, in common with the previous topics, has stochastic analysis (particularly Brownian motion and Itô integration) at its foundation. Here there is great interest in connecting the approach using financial models based on a discrete model of time with the other main approach, using continuous time. The latter is in some sense obtained by “passing to the limit” in discrete-time models – and again the NSA framework greatly facilitates this. In essence, if we have for each $n \in \mathbb{N}$ a discrete financial model $M_n$ then we immediately have $M_N$ for infinite hyperfinite $N$. Then we can show that the continuous model is obtained as a suitable projection of $M_N$, and as a result obtain some new powerful convergence results.

Acknowledgments

On a personal note, I should like to thank
- the EMS for the kind invitation to give the 1997 EMS Lectures;
- the Mathematics Department of the University of Helsinki, for their generous hospitality;
- my collaborators Marek Capiński, Jerry Keisler, Siu-Ah Ng, Ekkehard Kopp and Walter Willinger, who have worked jointly with me on much of the work that is described in these lectures.

Many thanks too to my wife Mary, who has been so supportive, patient and encouraging over the many years during which the material surveyed here was developed.

Hull, November 1999
Nigel Cutland
Contents

1 Loeb Measures
  1.1 Introduction
  1.2 Nonstandard Analysis
    1.2.1 The hyperreals
    1.2.2 The nonstandard universe
    1.2.3 ℵ1-saturation
    1.2.4 Nonstandard topology
  1.3 Construction of Loeb Measures
    1.3.1 Example: Lebesgue measure
    1.3.2 Example: Haar measure
    1.3.3 Example: Wiener measure
    1.3.4 Loeb measurable functions
  1.4 Loeb Integration Theory
  1.5 Elementary Applications
    1.5.1 Lebesgue integration
    1.5.2 Peano’s Existence Theorem
    1.5.3 Itô integration and stochastic differential equations

2 Stochastic Fluid Mechanics
  2.1 Introduction
    2.1.1 Function spaces
    2.1.2 Functional formulation of the Navier–Stokes equations
    2.1.3 Definition of solutions to the stochastic Navier–Stokes equations
    2.1.4 Nonstandard topology in Hilbert spaces
  2.2 Solution of the Deterministic Navier–Stokes Equations
    2.2.1 Uniqueness
  2.3 Solution of the Stochastic Navier–Stokes Equations
    2.3.1 Stochastic flow
    2.3.2 Nonhomogeneous stochastic Navier–Stokes equations
  2.4 Stochastic Euler Equations
  2.5 Statistical Solutions
    2.5.1 The Foias equation
    2.5.2 Construction of statistical solutions using Loeb measures
    2.5.3 Measures by nonstandard densities
    2.5.4 Construction of statistical solutions using nonstandard densities
    2.5.5 Statistical solutions for stochastic Navier–Stokes equations
  2.6 Attractors for Navier–Stokes Equations
    2.6.1 Introduction
    2.6.2 Nonstandard attractors and standard attractors
    2.6.3 Attractors for 3-dimensional Navier–Stokes equations
  2.7 Measure Attractors for Stochastic Navier–Stokes Equations
  2.8 Stochastic Attractors for Navier–Stokes Equations
    2.8.1 Stochastic attractors
    2.8.2 Existence of a stochastic attractor for the Navier–Stokes equations
  2.9 Attractors for 3-dimensional Stochastic Navier–Stokes Equations

3 Stochastic Calculus of Variations
  3.1 Introduction
    3.1.1 Notation
  3.2 Flat Integral Representation of Wiener Measure
  3.3 The Wiener Sphere
  3.4 BM on the Wiener Sphere and the Infinite Dimensional O–U Process
  3.5 Malliavin Calculus
    3.5.1 Notation and preliminaries
    3.5.2 The Wiener-Itô chaos decomposition
    3.5.3 The derivation operator
    3.5.4 The Skorohod integral
    3.5.5 The Malliavin operator

4 Mathematical Finance Theory
  4.1 Introduction
  4.2 The Cox-Ross-Rubinstein Models
  4.3 Options and Contingent Claims
    4.3.1 Pricing a claim
  4.4 The Black-Scholes Model
  4.5 The Black-Scholes Model and Hyperfinite CRR Models
    4.5.1 The Black-Scholes formula
    4.5.2 General claims
  4.6 Convergence of Market Models
  4.7 Discretisation Schemes
  4.8 Further Developments
    4.8.1 Poisson pricing models
    4.8.2 American options
    4.8.3 Incomplete markets
    4.8.4 Fractional Brownian motion
    4.8.5 Interest rates

References
Index
1. Loeb Measures
1.1 Introduction

Loeb measures, discovered by Peter Loeb in 1975 [71], are very rich standard measure spaces, constructed using nonstandard analysis (NSA). The range of fields in which they have found significant applications is vast, including measure and probability theory, stochastic analysis, differential equations (ordinary, partial and stochastic), functional analysis, control theory, mathematical physics, economics and mathematical finance theory.

The richness of Loeb measures makes them good for
- constructing measures with special properties (for example the rich probability spaces of Fajardo & Keisler [49, 50, 62]);
- representing complex measures in ways that make them more manageable (for example Wiener measure) – see Section 1.3.3 below;
- modelling physical and other phenomena;
- proving existence results in analysis – for example solving differential equations (DEs) of all kinds (including partial DEs, stochastic DEs and even stochastic partial DEs) and showing the existence of attractors.

Later lectures will describe some recent uses of Loeb spaces that illustrate these themes – in fluid mechanics, in stochastic calculus of variations and related topics, and in mathematical finance theory. This lecture will outline the basic Loeb measure construction and give some simple applications, with a little of the theory of Loeb integration.

From one point of view, Loeb measures are simply ultraproducts of previously given measure spaces, such as were considered in an early paper of Dacunha-Castelle & Krivine [46]. The rôle of NSA in their construction is to provide a systematic way to understand their properties, which opens the way for efficient and powerful applications; without this we would have a supply of very rich measure spaces but only ad hoc means to comprehend them.

Necessarily these lectures will be somewhat informal and lacking in a great deal of detail. The aim is to convey some of the basic ideas and flavour of Loeb measures and how they work, as well as pointing to the literature where the topics can be pursued in depth. We must begin with a brief and informal look at NSA itself.
N.J. Cutland: LNM 1751, pp. 1–28, 2000. © Springer-Verlag Berlin Heidelberg 2000
1.2 Nonstandard Analysis

1.2.1 The hyperreals

Nonstandard analysis (discovered by Abraham Robinson in 1960 [83]) begins with the construction of a richer real line $^*\mathbb{R}$ called the hyperreals or nonstandard reals. This is an ordered field that extends the (standard) reals $\mathbb{R}$ in two main ways: (1) $^*\mathbb{R}$ contains non-zero infinitesimal numbers; and (2) $^*\mathbb{R}$ contains positive and negative infinite numbers. This is made precise by the following definitions (where $|\cdot|$ is the extension¹ of the modulus function to $^*\mathbb{R}$).

Definition 1.1 Let $x \in {}^*\mathbb{R}$. We say that
(i) $x$ is infinitesimal if $|x| < \varepsilon$ for all $\varepsilon > 0$, $\varepsilon \in \mathbb{R}$;
(ii) $x$ is finite if $|x| < r$ for some $r \in \mathbb{R}$;
(iii) $x$ is infinite if $|x| > r$ for all $r \in \mathbb{R}$.
(iv) We say that $x$ and $y$ are infinitely close, denoted by $x \approx y$, if $x - y$ is infinitesimal. So $x \approx 0$ is another way to say that $x$ is infinitesimal.
(v) The monad of a real number $r$ is the set $\mathrm{monad}(r) = \{x : x \approx r\}$ of hyperreals that are infinitely close to $r$. Thus $\mathrm{monad}(0)$ is the set of infinitesimals, and $\mathrm{monad}(r) = r + \mathrm{monad}(0)$.

Of course, once a field has non-zero infinitesimals, then there must be infinite elements also – these are the reciprocals of infinitesimals. It follows also that $\mathbb{R}$ is enriched in having, for each $r \in \mathbb{R}$, new elements $x$ with $x \approx r$ (taking $x = r + \delta$ where $\delta$ is infinitesimal).

One way to construct $^*\mathbb{R}$ is as an ultrapower of the reals

$$^*\mathbb{R} = \mathbb{R}^{\mathbb{N}}/\mathcal{U}$$

where $\mathcal{U}$ is a nonprincipal ultrafilter² (or maximal filter) on $\mathbb{N}$. That is, $^*\mathbb{R}$ consists of equivalence classes of sequences of reals under the equivalence relation $\equiv_{\mathcal{U}}$, defined by

$$(a_n) \equiv_{\mathcal{U}} (b_n) \iff \{n : a_n = b_n\} \in \mathcal{U}.$$

Sets in $\mathcal{U}$ should be thought of as big sets, or more strictly $\mathcal{U}$-big, with those not in $\mathcal{U}$ designated $\mathcal{U}$-small. The ultrafilter property means that every set is either $\mathcal{U}$-big (those in $\mathcal{U}$) or $\mathcal{U}$-small (those not in $\mathcal{U}$). It is convenient to use the terminology $\mathcal{U}$-almost all to mean “for a set $A$ of natural numbers with $A \in \mathcal{U}$”. Using this terminology we can say that the equivalence relation $\equiv_{\mathcal{U}}$ identifies sequences $(a_n)$ and $(b_n)$ that agree on a $\mathcal{U}$-big set of indices $n$, or that agree $\mathcal{U}$-almost always. We denote the equivalence class of a sequence $(a_n)$ by $(a_n)_{\mathcal{U}}$ (sometimes the notation $[(a_n)]$ is used instead). The reals $\mathbb{R}$ are identified with the equivalence classes of constant sequences, so that $^*\mathbb{R}$ is then an extension of $\mathbb{R}$. The algebraic operations $+$, $\times$ and the order relation $<$ are extended to $^*\mathbb{R}$ pointwise (after checking that this is safe); strictly the extensions should be denoted $^*{+}$, $^*{\times}$, $^*{<}$, but there is usually no ambiguity if the $*$ is dropped. It is almost immediate that an example of a non-zero infinitesimal is given by $(1, \tfrac12, \tfrac13, \ldots)/\mathcal{U}$.

¹ This takes its values in $^*\mathbb{R}$, and is defined just as in $\mathbb{R}$, so that $|x| = x$ if $x \ge 0$ and $|x| = -x$ if $x < 0$.
² A nonprincipal ultrafilter $\mathcal{U}$ on $\mathbb{N}$ is a collection of subsets of $\mathbb{N}$ that is closed under intersections and supersets, contains no finite sets, and for every set $A \subseteq \mathbb{N}$ has either $A \in \mathcal{U}$ or $\mathbb{N} \setminus A \in \mathcal{U}$.

The way to picture $^*\mathbb{R}$ is as follows (note that some features in this diagram are yet to be explained).
[Figure: The Hyperreals. Labels recoverable from the diagram: the hyperreal line $^*\mathbb{R}$ with points $0, 1, 2, r$ and infinite elements $N, N+1$; an infinitesimal microscope showing $\mathrm{monad}(r) = \{x \in {}^*\mathbb{R} : x \approx r\}$; the standard part mapping onto the real line $\mathbb{R}$ with points $0, 1, 2, r$.]

With the above construction of $^*\mathbb{R}$ it is easy to prove:

Theorem 1.2 $(^*\mathbb{R}, +, \times, <)$ is an ordered field.

Exercise Prove this! Most of the field axioms follow easily from the fact that they hold at each co-ordinate of the representing sequences. The axiom of inverses is not quite so obvious. If $x = (a_n)_{\mathcal{U}} \neq 0$ then we could have $a_n = 0$ for some indices $n$. As a hint, note that nevertheless we have $a_n \neq 0$ for $\mathcal{U}$-almost all $n$, so we can define $y = (b_n)_{\mathcal{U}}$ with $b_n = a_n^{-1}$ for those $n$. For the remaining $\mathcal{U}$-small set of $n$ define $b_n = 0$. Now show that $xy = 1$. The axioms for an ordered field are proved in somewhat similar fashion.

To see what else can be said about $^*\mathbb{R}$, first note that all functions $f$ and relations $R$ on $\mathbb{R}$ (including unary relations – that is, subsets of $\mathbb{R}$) can be extended to $^*\mathbb{R}$ pointwise – with the extensions denoted by $^*f$ and $^*R$ say.³ As an exercise the reader might like to show that the extension $^*|\cdot|$ of the modulus function defined in this way is the same as that used in Definition 1.1; that is, if $x = (a_n)_{\mathcal{U}}$ and $y = {}^*|x| = (|a_n|)_{\mathcal{U}}$ then $y = x$ if $x \mathrel{{}^*{\ge}} 0$ and $y = -x$ otherwise. Further, show that $x$ is finite (according to Definition 1.1) if there is some $r \in \mathbb{R}$ with $|a_n| < r$ for $\mathcal{U}$-almost all $n$, and $x$ is infinitesimal if, for every real $\varepsilon > 0$, we have $|a_n| < \varepsilon$ for $\mathcal{U}$-almost all $n$.

Important examples of extensions of relations include $^*\mathbb{N}$, $^*\mathbb{Z}$ and $^*\mathbb{Q}$, the sets of hypernatural numbers, hyperintegers and hyperrationals respectively. A hyperrational number is thus an element $x = (a_n)_{\mathcal{U}}$ with $a_n \in \mathbb{Q}$ for $\mathcal{U}$-almost all $n$.

It is not hard to see that the properties of functions $f$ and relations $R$ are transferred to (or inherited by) $^*f$ and $^*R$ – for example, if $f$ is an injection, so is $^*f$, and if $R$ is an equivalence relation then so is $^*R$. If $f : A \to B$ then $^*f : {}^*A \to {}^*B$. Moreover, connections between functions and relations are also transferred – for example $^*\sin^2 x + {}^*\cos^2 x = 1$ for all $x \in {}^*\mathbb{R}$. The full extent of this idea is described neatly by the Transfer Principle discussed below. First let us write $\mathfrak{R} = (\mathbb{R}, (f)_{f \in \mathcal{F}}, (R)_{R \in \mathcal{R}})$ for the full structure with domain $\mathbb{R}$ together with every possible function and relation on it, and then write

$$^*\mathfrak{R} = (^*\mathbb{R}, (^*f)_{f \in \mathcal{F}}, (^*R)_{R \in \mathcal{R}}).$$
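As a small worked check of the pointwise definitions above, the claim that $(1, \tfrac12, \tfrac13, \ldots)/\mathcal{U}$ is a non-zero infinitesimal can be written out as follows. This is only a sketch: it uses the criterion for infinitesimals in terms of representing sequences stated above, together with the fact that a nonprincipal ultrafilter contains no finite sets and hence contains every cofinite set. The symbol $n_\varepsilon$ is introduced here purely for illustration.

```latex
\begin{align*}
&\text{Let } a_n = \tfrac1n \ (n = 1, 2, \ldots) \text{ and } x = (a_n)_{\mathcal U}. \\
&\text{Fix a real } \varepsilon > 0 \text{ and choose } n_\varepsilon \in \mathbb N \text{ with } n_\varepsilon > 1/\varepsilon. \\
&\{n : |a_n| < \varepsilon\} \supseteq \{n : n \ge n_\varepsilon\},
  \text{ which is cofinite and so belongs to } \mathcal U. \\
&\text{Since } \mathcal U \text{ is closed under supersets, } |a_n| < \varepsilon
  \text{ for } \mathcal U\text{-almost all } n, \text{ i.e. } |x| < \varepsilon. \\
&\{n : a_n = 0\} = \emptyset \notin \mathcal U, \text{ so } x \neq 0.
\end{align*}
```

The same pattern (identify the set of indices where a property holds, then decide whether that set is $\mathcal{U}$-big) is what underlies the hint for the axiom of inverses.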
The following fundamental result gives the complete picture as to which properties of $\mathfrak{R}$ are inherited by (or transferred to) $^*\mathfrak{R}$.

Theorem 1.3 (Transfer Principle) Let $\varphi$ be any first order statement. Then

$$\varphi \text{ holds in } \mathfrak{R} \iff {}^*\varphi \text{ holds in } {}^*\mathfrak{R}.$$

A first order statement $\varphi$ (respectively $^*\varphi$) is one that refers to elements of $\mathbb{R}$ (respectively $^*\mathbb{R}$), both fixed and variable, and to fixed relations and functions $f, R$ (respectively $^*f, {}^*R$). First order statements can use the usual logical connectives of mathematics, namely and (symbolically $\wedge$), or ($\vee$), implies ($\to$) and not ($\neg$). Moreover, we can quantify over elements ($\forall x$, $\exists y$) but not over relations or functions (so $\forall f$, $\exists R$ are not allowed).

³ By the pointwise extension of a binary relation $R \subset \mathbb{R} \times \mathbb{R}$, say, we mean that $((a_n)_{\mathcal{U}}, (b_n)_{\mathcal{U}}) \in {}^*R \Leftrightarrow (a_n, b_n) \in R$ for $\mathcal{U}$-almost all $n$; so $^*R \subset {}^*\mathbb{R} \times {}^*\mathbb{R}$. It is easy to see that this is equivalent to defining $^*R$ using a pointwise extension of the characteristic function – i.e. $\chi_{{}^*R}((a_n)_{\mathcal{U}}, (b_n)_{\mathcal{U}}) = (\chi_R(a_n, b_n))_{\mathcal{U}}$.

Here are some illustrations of this.

1. Density of the rationals in the reals. The density of the rationals in the reals can be expressed by a first order statement $\varphi$ that is a formal version of the following.

Between every two distinct reals there is a rational.

We could for example take $\varphi$ as the statement

$$\forall x \forall y\, (x < y \to \exists z (z \in \mathbb{Q} \wedge (x < z < y)))$$

The Transfer Principle tells us that $^*\varphi$ holds in $^*\mathfrak{R}$, which means that the hyperrationals are dense in the hyperreals.

2. Discreteness of the ordering of the integers. This can be expressed by a first order statement $\psi$ which is a formal version of the following.

Every $n \in \mathbb{Z}$ has an immediate predecessor and successor in $\mathbb{Z}$.

The Transfer Principle tells us that $^*\psi$ holds in $^*\mathfrak{R}$, which means that

Every $n \in {}^*\mathbb{Z}$ has an immediate predecessor and successor in $^*\mathbb{Z}$.

Thus the discreteness of $\mathbb{Z}$ is inherited by $^*\mathbb{Z}$. The reader is invited to check that the immediate predecessor and successor of a hyperinteger $(m_n)_{\mathcal{U}} \in {}^*\mathbb{Z}$ are given by $(m_n \mp 1)_{\mathcal{U}}$. Likewise the density of the hyperrationals can be established quite easily from first principles (if $x = (a_n)_{\mathcal{U}}$ and $y = (b_n)_{\mathcal{U}}$ then take $z = (c_n)_{\mathcal{U}}$ with $a_n < c_n < b_n$ where possible).

The proof of the Transfer Principle is just a generalisation of the procedure involved in a direct verification of these two examples. The Transfer Principle itself avoids the need to verify properties of $^*\mathbb{R}$ on an ad hoc basis, and instead gives us all properties from the beginning.

A key result that allows us to get back to $\mathbb{R}$ from $^*\mathbb{R}$ (and extends to more general topological situations) is the following (recall Definition 1.1 of a finite hyperreal).

Theorem 1.4 (Standard Part Theorem) If $x \in {}^*\mathbb{R}$ is finite, then there is a unique $r \in \mathbb{R}$ such that $x \approx r$; i.e. any finite hyperreal $x$ is uniquely expressible as $x = r + \delta$ with $r$ a standard real and $\delta$ infinitesimal.

Proof Put $r = \sup\{a \in \mathbb{R} : a \le x\} = \sup A$, say. The set $A$ is nonempty and bounded above (in $\mathbb{R}$) since $x$ is finite, and so the least upper bound $r$ exists. It is routine to check that $|x - r| < \varepsilon$ for every real $\varepsilon > 0$.

Definition 1.5 (Standard Part) If $x$ is a finite hyperreal the unique real $r \approx x$ is called the standard part of $x$.
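The “routine” check at the end of the proof of the Standard Part Theorem can be spelled out as below. This is one possible way of writing it, not necessarily the argument the author had in mind; it uses only the set $A$ and the real $r = \sup A$ from the proof, and it also records why the real $r$ is unique.

```latex
\begin{align*}
&\text{Fix a real } \varepsilon > 0, \text{ with } A = \{a \in \mathbb R : a \le x\}
  \text{ and } r = \sup A. \\
&r + \varepsilon > \sup A \ \Rightarrow\ r + \varepsilon \notin A
  \ \Rightarrow\ x < r + \varepsilon. \\
&r - \varepsilon < \sup A \ \Rightarrow\ \exists\, a \in A \text{ with }
  r - \varepsilon < a \le x \ \Rightarrow\ x > r - \varepsilon. \\
&\text{Hence } |x - r| < \varepsilon \text{ for every real } \varepsilon > 0,
  \text{ i.e. } x \approx r. \\
&\text{Uniqueness: if } x \approx r \text{ and } x \approx r' \text{ with } r, r' \in \mathbb R,
  \text{ then } |r - r'| \le |r - x| + |x - r'| < \varepsilon \\
&\qquad \text{for every real } \varepsilon > 0, \text{ so } r = r'.
\end{align*}
```

The only place where completeness of $\mathbb{R}$ enters is the existence of $\sup A$.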
For a finite hyperreal $x \in {}^*\mathbb{R}$ there are two notations (both useful) for the standard part of $x$:

$${}^\circ x = \mathrm{st}(x) = \text{the standard part of } x.$$

On occasions, when considering extended real valued functions (with values in $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$), it is convenient to write ${}^\circ x = \pm\infty$ if $x$ is positive (resp. negative) infinite.

Remark The Standard Part Theorem is equivalent to the completeness of $\mathbb{R}$.

The next two theorems illustrate the way in which real analysis develops using the additional structure of $^*\mathbb{R}$. For the sake of completeness we give brief proofs that provide a flavour of the nonstandard methodology, and especially the use of the Transfer Principle. For a full account of the development of real analysis using infinitesimals, see any of the references [30, 54, 47, 56, 58, 60, 69].

For the following, note that since a sequence $s = (s_n)_{n \in \mathbb{N}}$ of reals is just a function $s : \mathbb{N} \to \mathbb{R}$, its nonstandard extension $^*s = (s_n)_{n \in {}^*\mathbb{N}}$ is simply a function $^*s : {}^*\mathbb{N} \to {}^*\mathbb{R}$.

Theorem 1.6 Let $(s_n)$ be a sequence of real numbers and let $l \in \mathbb{R}$. Then

$$s_n \to l \text{ as } n \to \infty \iff {}^*s_K \approx l \text{ for all infinite } K \in {}^*\mathbb{N}.$$

Proof Suppose first that $s_n \to l$, and fix infinite $K \in {}^*\mathbb{N}$. We have to show that $|{}^*s_K - l| < \varepsilon$ for all real $\varepsilon > 0$. For any such $\varepsilon$ there is a number $n_0 \in \mathbb{N}$ such that the following holds in $\mathbb{R}$:

$$\forall n \in \mathbb{N}\,[n \ge n_0 \to |s_n - l| < \varepsilon]$$

The Transfer Principle now tells us that

$$\forall N \in {}^*\mathbb{N}\,[N \ge n_0 \to |{}^*s_N - l| < \varepsilon]$$

is true in $^*\mathbb{R}$. In particular, taking $N = K$ (and noting that $K \ge n_0$ because $K$ is infinite) we see that $|{}^*s_K - l| < \varepsilon$ as required.

Conversely, suppose that $^*s_K \approx l$ for all infinite $K \in {}^*\mathbb{N}$. Then, for any given real $\varepsilon > 0$, the following holds in $^*\mathbb{R}$ (any infinite $K$ is a witness):

$$\exists K \in {}^*\mathbb{N}\ \forall N \in {}^*\mathbb{N}\,[N \ge K \to |{}^*s_N - l| < \varepsilon]$$

The Transfer Principle applied to this statement shows that in $\mathbb{R}$:

$$\exists k \in \mathbb{N}\ \forall n \in \mathbb{N}\,[n \ge k \to |s_n - l| < \varepsilon]$$

Taking $n_0$ to be any such $k$ proves that $s_n \to l$.
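Two small examples may help to show how the criterion of Theorem 1.6 is used in practice. They are illustrations chosen here, not taken from the lectures, and they rely only on the Transfer Principle and the definition of an infinite hypernatural.

```latex
\begin{align*}
&\textbf{(i) } s_n = 1/n \ (n \ge 1): \text{ for infinite } K \text{ and any real }
  \varepsilon > 0 \text{ we have } K > 1/\varepsilon, \\
&\qquad \text{so } 0 < {}^*s_K = 1/K < \varepsilon, \text{ i.e. } {}^*s_K \approx 0;
  \text{ hence } s_n \to 0. \\
&\textbf{(ii) } s_n = (-1)^n: \text{ transfer of }
  \forall n \in \mathbb N\,[\, s_{n+1} = -s_n \wedge |s_n| = 1 \,] \text{ gives} \\
&\qquad {}^*s_{K+1} = -\,{}^*s_K \text{ and } |{}^*s_K| = 1 \text{ for every infinite } K. \\
&\qquad \text{If } {}^*s_K \approx l \text{ and } {}^*s_{K+1} \approx l \text{ for some real } l,
  \text{ then } 2 = |{}^*s_K - {}^*s_{K+1}| \approx 0, \\
&\qquad \text{a contradiction; so } (s_n) \text{ has no limit.}
\end{align*}
```

In (ii) note that $K + 1$ is infinite whenever $K$ is, so both indices are covered by the theorem.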
For the next result note that if $f$ is a real function defined on the open interval $]a, b[$ then $^*f$ is defined on the hyperreal interval $^*]a, b[ = \{x \in {}^*\mathbb{R} : a < x < b\}$, and takes values in $^*\mathbb{R}$.
Theorem 1.7 Let $c \in\, ]a, b[$ (where $a, b, c \in \mathbb{R}$) and $f :\, ]a, b[\, \to \mathbb{R}$. Then

$$f \text{ is continuous at } c \iff {}^*f(z) \approx f(c) \text{ whenever } z \approx c \text{ in } {}^*\mathbb{R}.$$
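Before the proof, here are two small illustrations of how this criterion is used. They are examples chosen here rather than taken from the lectures, and they assume the elementary facts (not proved in the text) that a finite hyperreal times an infinitesimal is infinitesimal and that a sum of two infinitesimals is infinitesimal. The functions are considered on $]-1, 1[$ for definiteness.

```latex
\begin{align*}
&\textbf{(i) } f(x) = x^2 \text{ at a point } c: \text{ let } z \approx c,
  \text{ so } z = c + \delta \text{ with } \delta \text{ infinitesimal.} \\
&\qquad \text{By transfer } {}^*f(z) = z^2, \text{ so }
  {}^*f(z) - f(c) = (c + \delta)^2 - c^2 = 2c\delta + \delta^2 \approx 0. \\
&\qquad \text{Hence } {}^*f(z) \approx f(c), \text{ and } f \text{ is continuous at } c. \\
&\textbf{(ii) } g(x) = 1 \text{ if } x > 0,\ g(x) = 0 \text{ if } x \le 0, \text{ at } c = 0: \\
&\qquad \text{for a positive infinitesimal } \delta \text{ we have } \delta \approx 0
  \text{ but, by transfer, } {}^*g(\delta) = 1 \not\approx 0 = g(0), \\
&\qquad \text{so } g \text{ is not continuous at } 0.
\end{align*}
```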
Proof The proof is very similar to that of Theorem 1.6. Suppose first that $f$ is continuous at $c$, and fix a hyperreal $z \approx c$. We have to show that $|{}^*f(z) - f(c)| < \varepsilon$ for all real $\varepsilon > 0$. For any such $\varepsilon$ there is a number $0 < \delta \in \mathbb{R}$ such that the following holds in $\mathbb{R}$:

$$\forall x\,[|x - c| < \delta \to |f(x) - f(c)| < \varepsilon]$$

The Transfer Principle now tells us that

$$\forall X\,[|X - c| < \delta \to |{}^*f(X) - f(c)| < \varepsilon]$$

is true in $^*\mathbb{R}$. In particular, taking $X = z$ (and noting that $|z - c| < \delta$ because $z \approx c$) we see that $|{}^*f(z) - f(c)| < \varepsilon$ as required.

Conversely, suppose that $|{}^*f(z) - f(c)| \approx 0$ for all $z \approx c$ in $^*\mathbb{R}$. Let a real $\varepsilon > 0$ be given. Then the following holds in $^*\mathbb{R}$, as is seen by taking $Y$ to be any positive infinitesimal:

$$\exists Y > 0\ \forall X\,[|X - c| < Y \to |{}^*f(X) - f(c)| < \varepsilon]$$

The Transfer Principle applied to this statement gives, in $\mathbb{R}$:

$$\exists y > 0\ \forall x\,[|x - c| < y \to |f(x) - f(c)| < \varepsilon]$$

Taking $\delta$ to be any such $y$ shows that $f$ is continuous at $c$ as required.

Before moving to the next section, it should be pointed out that there are several other ways to construct the hyperreals. Moreover, the conventional terminology is misleading in that different constructions do not necessarily give isomorphic structures. All versions of the hyperreals however obey the Transfer Principle, and this is all that is needed to do basic nonstandard real analysis. Indeed, one perfectly workable approach to the subject is an axiomatic one, which merely specifies that $^*\mathbb{R}$ is an extension of $\mathbb{R}$ that obeys the Transfer Principle. (This approach would be parallel to a development of real analysis that proceeds without being concerned with any particular construction of $\mathbb{R}$, using only the assumption that $\mathbb{R}$ is a complete ordered field.)

1.2.2 The nonstandard universe

To use Robinson’s ideas beyond the realm of real analysis, it is necessary to repeat the construction of $^*\mathbb{R}$ for any mathematical object $M$ that might be needed, giving a nonstandard version $^*M$ of $M$ that contains ideal elements (such as infinitesimals in the case of $^*\mathbb{R}$). $M$ could be a group, ring, measure space, metric space or any mathematical object, however complicated.
Rather than construct each nonstandard extension $^*M$ as required, it is more economical to construct at the outset a nonstandard version $^*\mathbb{V}$ of a working portion of the mathematical universe $\mathbb{V}$ that contains each object $M$ that might be needed. Then $^*\mathbb{V}$ will contain $^*M$ for every $M \in \mathbb{V}$. Such a construction has the additional advantage that the corresponding Transfer Principle preserves connections between structures as well as their intrinsic properties. Here, briefly, is the way it works.

First, for most mathematical practice, an adequate portion of the mathematical universe is the superstructure over $\mathbb{R}$, denoted by $\mathbb{V} = V(\mathbb{R})$, defined as follows.

$$V_0(\mathbb{R}) = \mathbb{R}, \qquad V_{n+1}(\mathbb{R}) = V_n(\mathbb{R}) \cup \mathcal{P}(V_n(\mathbb{R})), \quad n \in \mathbb{N},$$

and

$$\mathbb{V} = V(\mathbb{R}) = \bigcup_{n \in \mathbb{N}} V_n(\mathbb{R}).$$

(If $V(\mathbb{R})$ is not big enough to contain all the objects⁴ required, simply replace the starting set $\mathbb{R}$ by a suitable larger set $S$, giving $\mathbb{V} = V(S)$.)

The next step is to construct a mapping $* : V(\mathbb{R}) \to V(^*\mathbb{R})$ which associates to each object $M \in \mathbb{V}$ a nonstandard extension $^*M \in V(^*\mathbb{R})$. Roughly, we have $M \subset {}^*M$ with $^*M \setminus M$ consisting of “ideal” or “nonstandard” elements. For example $^*\mathbb{N} \setminus \mathbb{N}$ consists of infinite (hyper)natural numbers; if $M$ is an infinite dimensional Hilbert space $H$ together with its finite dimensional subspaces then $^*M$ will contain some infinite hyperfinite dimensional subspaces.

The way to visualise the resulting nonstandard universe is as follows.
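[Figure: The Nonstandard Universe. Labels recoverable from the diagram: the standard universe $\mathbb{V} = V(\mathbb{R})$ over $\mathbb{R}$, containing standard objects such as $A$; the larger universe $V(^*\mathbb{R})$ over $^*\mathbb{R}$, containing $^*V(\mathbb{R})$ (= internal objects), including $^*A$, together with external objects.]

⁴ We are now taking the view that all mathematical objects are sets.

The nonstandard universe is in fact the collection

$$^*\mathbb{V} = \{x : x \in {}^*M \text{ for some } M \in \mathbb{V}\}$$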
consisting of all new and old members of sets in V. Although ∗V ⊂ V (∗R), it is crucial to realise that ∗V is not the same as V (∗R). Sets in ∗V are known as internal sets. One way5 to construct ∗V is by means of an ultrapower VN /U although there is a little more work to do (compared to the corresponding construction of ∗R). The set membership relation ∈ that gives the structure (V, ∈), when extended pointwise to the ultrapower VN /U, gives a “pseudomembership” relation E, say, resulting in the structure (VN /U, E). It is then necessary to take the “Mostowski collapse” of this structure, which constructs simultaneously the collection ∗V and an injection i : (∗V, ∈) → (VN /U, E). Although i is not surjective, its range includes the equivalence class of each constant sequence, and then ∗M is defined by ∗
M = i−1 ((M, M, M, . . . M)/U).
The key property of the nonstandard universe ∗V is a Transfer Principle which again indicates precisely which properties of the superstructure V are inherited by ∗V.
Theorem 1.8 (The Transfer Principle) Suppose that ϕ is a bounded quantifier statement. Then ϕ holds in V if and only if ∗ϕ holds in ∗V.
A bounded quantifier statement (bqs) is simply a statement of mathematics that can be written in such a way that all quantifiers range over a prescribed set. That is, we have subclauses such as ∀x ∈ A and ∃y ∈ B but not unbounded quantifiers such as ∀x and ∃y. Most quantifiers in mathematical practice are bounded (often only implicitly in exposition). A bqs ϕ may also contain fixed sets M from V, which will be replaced in ∗ϕ by ∗M. Members of internal sets are internal (this follows easily from the construction) and since the sets ∗M are also internal, it follows that the information we obtain from the Transfer Principle is entirely about internal sets. To illustrate, the Transfer Principle tells us that any internal bounded subset of ∗R
⁵ This sketch of a construction of ∗V can be skipped without any loss – it is included to show that a nonstandard universe is a very down-to-earth and non-mysterious mathematical construct.
1 Loeb Measures
has a least upper bound, whereas this can fail for external⁶ sets. For example, the set N is a subset of ∗R that is bounded (by any infinite hyperreal) but has no least upper bound – from which we deduce that N is external. Incidentally this demonstrates that there actually are external sets – i.e. V(∗R) \ ∗V ≠ ∅. An easy application of the Transfer Principle gives the following very useful properties.
Proposition 1.9 Let A ⊆ ∗R be an internal set. (a) (Overflow) If A contains arbitrarily large finite numbers then it also contains an infinite number; (b) (Underflow) If A contains arbitrarily small positive infinite⁷ numbers then it contains a positive finite number.
Taking reciprocals gives a corresponding pair of principles for the set of infinitesimals. As with ∗R it is possible (and quite convenient) to take an axiomatic approach to ∗V, which simply postulates the existence of a set ∗V and a mapping ∗ : V → ∗V that obeys the Transfer Principle. For most purposes (and certainly the construction of Loeb measures) one further assumption is needed, which we now discuss.
1.2.3 ℵ1-saturation
A nonstandard universe constructed as a countable ultrapower has an additional property called ℵ1-saturation, which we highlight here because of its importance.
Definition 1.10 A nonstandard universe ∗V is said to be ℵ1-saturated if the following holds: if (A_m)_{m∈N} is a countable decreasing sequence of internal sets with each A_m ≠ ∅, then ⋂_{m∈N} A_m ≠ ∅.
Theorem 1.11 A nonstandard universe ∗V constructed as a countable ultrapower is ℵ1-saturated.
Proof (Sketch) Each set A_m is represented by a sequence of standard sets (X_{m,n})_{n∈N}. Since each A_m is nonempty and the sequence is decreasing, for U-almost all⁸ n we have X_{m+1,n} ⊆ X_{m,n} and X_{m,n} ≠ ∅. By a systematic modification of the sets X_{m,n} on a U-small⁹ set of indices n we may assume that X_{m+1,n} ⊆ X_{m,n} and X_{m,n} ≠ ∅ for all n and m. Now pick x_n ∈ X_{n,n}
⁶ An external set is one that is not internal.
⁷ That is, for every positive infinite x ∈ ∗R there is an element a ∈ A with a infinite and a < x.
⁸ That is, the set {n : X_{m+1,n} ⊆ X_{m,n}} belongs to U.
⁹ I.e. a set that is not in the ultrafilter U.
and let y = (x_n)_U be the element represented by this sequence. Then y ∈ A_m for every m since x_n ∈ X_{m,n} for n ≥ m.
ℵ1-saturation is a kind of compactness property that is essential for the Loeb measure construction. For the rest of these lectures we assume that ∗V is a nonstandard universe that is ℵ1-saturated. It is possible to build nonstandard universes with stronger saturation type properties, by an extension of the techniques discussed above. These are needed in some applications of Loeb measures involving topological spaces that do not have a countable sub-base, and in other “non-separable” mathematical applications.
An equivalent and very useful formulation of ℵ1-saturation, known as countable comprehension, goes as follows.
Countable comprehension Given any sequence (A_n)_{n∈N} of internal subsets of an internal set A, there is an internal sequence¹⁰ (A_n)_{n∈∗N} of subsets of A that extends the original sequence.
To see that ℵ1-saturation implies countable comprehension, apply ℵ1-saturation to the sets B_m consisting of internal sequences (C_n)_{n∈∗N} with C_n = A_n for n ≤ m. The reverse implication is proved using the overflow principle. (The reader may like to try proving this as an exercise.)
1.2.4 Nonstandard topology
We gather together here some of the basic nonstandard topological notions that will be referred to later. First, we note that the idea of being infinitely close generalises to any topological space, extending the idea of a monad. Recall that for a ∈ R the monad of a is the set monad(a) = {x ∈ ∗R : x ≈ a}. More generally we have:
Definition 1.12 Let (X, T) be a topological space.
(i) For a ∈ X the monad of a is
\[ \operatorname{monad}(a) = \bigcap_{a \in U \in \mathcal{T}} {}^*U. \]
(ii) If x ∈ ∗X, we write x ≈ a to mean x ∈ monad(a). (Note that in general this is not a symmetric relationship.)
(iii) x ∈ ∗X is nearstandard if x ≈ a for some a ∈ X.
(iv) ns(Y) is the set of nearstandard points in Y, for any Y ⊆ ∗X.
(v) st(Y) = {a ∈ X : x ≈ a for some x ∈ Y}; this is called the standard part of Y.
¹⁰ That is, an internal function with domain ∗N.
The idea of the pointwise standard part mapping for ∗R generalizes to Hausdorff spaces because of the next result.
Proposition 1.13 A topological space X is Hausdorff if and only if
monad(a) ∩ monad(b) = ∅ for a ≠ b, a, b ∈ X.
Proof An easy exercise.
This means that for Hausdorff spaces we can define the standard part mapping st : ns(∗X) → X by st(x) = the unique a ∈ X with x ≈ a. The following notation is often used:
◦x = st(x).
If necessary we write st_X or st_T to denote the space or topology concerned. The following is another important notion that plays a key rôle in constructing solutions to differential equations of all kinds.
Definition 1.14 Suppose that Y is a subset of ∗X for some topological space X, and F : ∗X → ∗R is internal. Then F is said to be S-continuous on Y if for all x, y ∈ Y we have x ≈ y ⟹ F(x) ≈ F(y).
The importance of this notion is seen in the following result.
Theorem 1.15 If F : ∗R → ∗R is S-continuous on an interval ∗[a, b] for real a, b, and F(x) is finite for some x ∈ ∗[a, b], then the standard function defined on [a, b] by f(t) = ◦F(t) is continuous, and ∗f(τ) ≈ F(τ) for all τ ∈ ∗[a, b].
Remark This theorem shows that S-continuous functions in ∗C[a, b] are precisely those that are nearstandard in the uniform topology on C[a, b], and the function f defined above is the standard part ◦F for this topology.
One final result from general nonstandard topology that we will need is:
Proposition 1.16 Let (X, T) be separable, Hausdorff. Suppose that Y ⊆ ∗X is internal, and A ⊆ X. Then
(a) st(Y) is closed,
(b) if X is regular and Y ⊆ ns(∗X) then st(Y) is compact,
(c) st(∗A) = Ā (the closure of A),
(d) if X is regular, then A is relatively compact iff ∗A ⊆ ns(∗X).
Remark The condition that X should be separable in Proposition 1.16 can be omitted if the nonstandard model has more saturation – namely κ-saturation (see the Remark at the end of the previous section), where the topology on X has a base of cardinality κ. However, in all our applications the relevant spaces X are separable, and so ℵ1-saturation (which we have in our model) is sufficient.
1.3 Construction of Loeb Measures
A Loeb measure is a measure constructed from a nonstandard measure by the following construction of Peter Loeb [71]. We confine our attention in these lectures mainly to finite (or bounded) Loeb measures. Suppose that an internal set Ω and an internal algebra A of subsets of Ω are given, and suppose further that µ is a finite internal finitely additive measure on A. This means that µ is an internal mapping µ : A → ∗[0, ∞) with µ(A ∪ B) = µ(A) + µ(B) for disjoint A, B ∈ A, and that µ(Ω) is finite.¹¹ Thus µ(A) is finite for each A ∈ A, so we may define the mapping
◦µ : A → [0, ∞)
by ◦µ(A) = ◦(µ(A)). Clearly ◦µ is finitely additive, so that (Ω, A, ◦µ) is a standard finitely additive measure space. In general this is not a measure space, because A is not a σ-algebra unless A is finite. Nevertheless, if (A_n)_{n∈N} is a family of sets from A, then the set ⋃_{n∈N} A_n is almost in A. It differs from a set in A by a null set (a notion to be defined shortly); see the Key Lemma (Lemma 1.19) and its corollary below. This is what lies at the heart of the following fundamental result proved by Loeb.
Theorem 1.17 There is a unique σ-additive extension of ◦µ to the σ-algebra σ(A) generated by A. The completion of this measure is the Loeb measure corresponding to µ, denoted µL, and the completion of σ(A) is the Loeb σ-algebra, denoted by L(A).
Proof For a quick proof we can apply Caratheodory’s extension theorem. It is only necessary to check σ-additivity of ◦µ on A. Suppose that (A_n)_{n∈N} is a sequence of pairwise disjoint sets from A such that
\[ A = \bigcup_{n \in \mathbb{N}} A_n \in \mathcal{A}. \]
¹¹ It also means of course that all sets in A are internal.
By ℵ1-saturation (applied to the decreasing sequence of sets A \ ⋃_{n=1}^{m} A_n) there is m ∈ N such that
\[ \bigcup_{n \in \mathbb{N}} A_n = \bigcup_{n=1}^{m} A_n. \]
So A_k = ∅ for k > m, and
\[ {}^\circ\mu\Big(\bigcup_{n \in \mathbb{N}} A_n\Big) = {}^\circ\mu\Big(\bigcup_{n=1}^{m} A_n\Big) = \sum_{n=1}^{m} {}^\circ\mu(A_n) = \sum_{n \in \mathbb{N}} {}^\circ\mu(A_n), \]
using finite additivity. Caratheodory’s theorem (see [88] for example) now gives the result.
It is quite straightforward and rather more illuminating to prove Loeb’s theorem from “first principles” and here is one way to proceed – based around the idea of a Loeb null set. (See [29] for full details of this approach.)
Definition 1.18 Let B ⊆ Ω (not necessarily internal). We say that B is a Loeb null set if for each real ε > 0 there is a set A ∈ A with B ⊆ A and µ(A) < ε.
The following result makes it clear that A is almost a σ-algebra.
Lemma 1.19 (Key Lemma) Let (A_n)_{n∈N} be an increasing family of sets, with each A_n in A, and let B = ⋃_{n∈N} A_n. Then there is a set A ∈ A such that (a) B ⊆ A; (b) ◦µ(A) = lim_{n→∞} ◦µ(A_n); (c) A \ B is null.
Proof
Let α = lim_{n→∞} ◦µ(A_n). For each finite n,
\[ \mu(A_n) \le {}^\circ\mu(A_n) + \frac{1}{n} \le \alpha + \frac{1}{n}. \]
Now, using ℵ1-saturation, take an increasing internal sequence (A_n)_{n∈∗N} of sets in A extending the sequence (A_n)_{n∈N}. Overflow gives an infinite N such that
\[ \mu(A_N) \le \alpha + \frac{1}{N}. \]
Let A = A_N. Then (a) holds because A ⊇ A_n for each finite n. Moreover, µ(A_n) ≤ µ(A) for each finite n, so ◦µ(A_n) ≤ ◦µ(A) ≤ α, giving ◦µ(A) = α, which is (b). Moreover, ◦µ(A \ A_n) = ◦µ(A) − ◦µ(A_n) → 0. Now A \ B ⊆ A \ A_n, so A \ B is null.
From this Key Lemma, it is clear that A is almost a σ-algebra – and in fact it is a σ-algebra modulo null sets. The following makes this precise.
Definition 1.20 (i) Let B ⊆ Ω. We say that B is Loeb measurable if there is a set A ∈ A such that A∆B¹² is Loeb null. Denote the collection of all Loeb measurable sets by L(A). (ii) For B ∈ L(A) define µL(B) = ◦µ(A) for any A ∈ A with A∆B null, and call µL(B) the Loeb measure of B.
It is then quite straightforward to prove:
Theorem 1.21 L(A) is a σ-algebra, and µL is a complete (σ-additive) measure on L(A).
The measure space Ω = (Ω, L(A), µL) is called the Loeb space given by (Ω, A, µ), and L(A) is called the Loeb algebra. Of course L(A) depends on both A and µ, so strictly we should write L(A, µ), but usually it is clear which measure is intended. If µ(Ω) = 1 then Ω is a Loeb probability space and µL is the Loeb probability measure given by µ. The following are alternative characterisations of Loeb measurable sets, and are often taken as the fundamental definition (see [3], [20] or [69] for example). First some definitions are required.
Definition 1.22 Let B ⊆ Ω (not necessarily internal). (i) B is µ-approximable if for every real ε > 0 there are sets A, C ∈ A with A ⊆ B ⊆ C and µ(C \ A) < ε. (ii) The inner and outer Loeb measure of B, $\underline{\mu}(B)$ and $\overline{\mu}(B)$, are given by
\[ \underline{\mu}(B) = \sup\{{}^\circ\mu(A) : A \subseteq B,\ A \in \mathcal{A}\}, \qquad \overline{\mu}(B) = \inf\{{}^\circ\mu(A) : A \supseteq B,\ A \in \mathcal{A}\}. \]
Then we have
Theorem 1.23 The following are equivalent: (a) B is Loeb measurable. (b) B is µ-approximable. (c) $\underline{\mu}(B) = \overline{\mu}(B)$.
¹² A∆B is the symmetric difference (A \ B) ∪ (B \ A).
Loeb counting measure
For a simple illustration of the Loeb construction (but one which has far reaching applications) consider the Loeb counting measure, as follows. Let Ω = {1, 2, . . . , N} where N ∈ ∗N \ N, so that Ω is an infinite hyperfinite set (necessarily internal), and let ν be the counting probability measure on Ω, defined by
\[ \nu(A) = \frac{|A|}{|\Omega|} = \frac{|A|}{N} \]
for A ∈ ∗P(Ω) = A, say.¹³ Here |A| denotes the number¹⁴ of elements in A. Note that ∗P(Ω) is a proper subset of P(Ω), since, for example, the set N ∈ P(Ω) \ ∗P(Ω), which in turn shows that A is not a σ-algebra. The Loeb counting measure νL is the completion of the extension to σ(A) of the finitely additive measure ◦ν.
1.3.1 Example: Lebesgue measure
A first simple application of Loeb measure is an intuitive construction of Lebesgue measure. First we define the hyperfinite (time)¹⁵ line T corresponding to the interval [0, 1].
Definition 1.24 Fix N ∈ ∗N \ N and let ∆t = N⁻¹. The hyperfinite time line (based on ∆t, for the interval [0, 1]) is the set T = {0, ∆t, 2∆t, 3∆t, . . . , 1 − ∆t}.
(In applications hyperfinite time lines may be taken with different end points, according to need.) We will use sans-serif symbols 𝗍, 𝗌 for elements of T to distinguish them from those in [0, 1].
Theorem 1.25 Let νL be the Loeb counting measure on the hyperfinite time line T. Define
(i) M = {B ⊆ [0, 1] : st_T⁻¹(B) is Loeb measurable}, where st_T⁻¹(B) = {𝗍 ∈ T : ◦𝗍 ∈ B};
(ii) λ(B) = νL(st_T⁻¹(B)) for B ∈ M.
Then ([0, 1], M, λ) is Lebesgue measure (i.e. M is the Lebesgue completion of the Borel sets B[0, 1], and λ(B) is the Lebesgue measure of B ∈ M).
¹³ An application of the Transfer Principle tells us that this is the collection of all internal subsets of Ω.
¹⁴ The Transfer Principle tells us that for internal subsets A of Ω there is a unique M ∈ ∗N, M ≤ N, such that there is an internal bijection F : {1, 2, . . . , M} → A – and this M is what is meant by |A|. Equivalently, | · | is the extension to ∗V of the standard function | · | that gives the cardinality of finite sets.
¹⁵ This has become the conventional terminology for this discrete representation of the interval [0, 1] when it is used to represent time.
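The content of Theorem 1.25 can be made plausible with a finite experiment. The sketch below is only an analogue: a large finite N stands in for the infinite hyperfinite N, so no Loeb measure is actually constructed, and the function names are illustrative choices rather than anything from the text. It counts the grid points of {0, ∆t, . . . , 1 − ∆t} that fall in an interval [a, b] and compares the counting probability with b − a.

```python
from fractions import Fraction

def time_line(N):
    """Finite stand-in for the time line T = {0, dt, 2dt, ..., 1 - dt}, dt = 1/N."""
    return [Fraction(k, N) for k in range(N)]

def counting_measure(points, N):
    """nu(A) = |A| / N for a finite set A of grid points."""
    return Fraction(len(points), N)

def measure_of_interval(a, b, N):
    """Counting probability of the grid points lying in the interval [a, b]."""
    return counting_measure([t for t in time_line(N) if a <= t <= b], N)

if __name__ == "__main__":
    a, b = Fraction(1, 4), Fraction(3, 4)
    for N in (10, 100, 1000):
        # The discrepancy from b - a is at most of order 1/N.
        print(N, measure_of_interval(a, b, N), abs(measure_of_interval(a, b, N) - (b - a)))
```

For an infinite N the discrepancy of order 1/N becomes infinitesimal and disappears on taking standard parts, which is the mechanism behind λ([a, b]) = b − a.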
Proof (Sketch) It is routine to check that M is a σ-algebra that contains each standard interval [a, b] (since st_T⁻¹([a, b]) = ⋂_{n∈N} (∗[a − 1/n, b + 1/n] ∩ T), which is a countable intersection of internal sets), and that λ is a complete probability measure on M. Showing that λ is translation invariant and λ([a, b]) = b − a is straightforward, so that ([0, 1], M, λ) is an extension of Lebesgue measure. Now take B ∈ M, and an inner approximation A ⊆ st_T⁻¹(B) with A internal. Then the set st(A) is a closed inner approximation of B, and this suffices to show that B is Lebesgue measurable.
This result is a particular case of a general theorem of Anderson [6] that shows how any Radon measure on a Hausdorff space can be represented by a hyperfinite Loeb counting measure. A famous example of this is Anderson’s representation of Wiener measure, below. A less well known but very pleasant example is David Ross’ very intuitive construction of Haar measure¹⁶, as follows (taken from [85, 87]).
1.3.2 Example: Haar measure
Let G be a compact group, and take an internal infinitesimal neighbourhood¹⁷ V of 1. Take a minimal ∗open cover Ω of ∗G consisting of sets that are translates of V. So Ω = {V_1, . . . , V_N} say with each V_i = g_i V for some g_i ∈ ∗G. Let νL be the Loeb counting probability measure on Ω. For Borel sets B ⊆ G define
m(B) = νL(st_Ω⁻¹(B))
where st_Ω : Ω → G is the generalisation of the standard part mapping¹⁸ to this context. Then m is Haar measure on G. To see this, first it is routine to show that m is a Borel probability measure on G; the other required property is that m should be translation invariant – that is, m(B) = m(gB) for each B ∈ B and g ∈ G. It is sufficient to show that m(B) ≤ m(gB), and for this take an internal set A ⊆ st_Ω⁻¹(B). Let C = {V_j : V_j ∩ gV_k ≠ ∅ for some V_k ∈ A} and note that C ⊆ st_Ω⁻¹(gB). It is easy to check that the collection (Ω \ A) ∪ g⁻¹C is a cover of ∗G by sets that are translates of V, so by minimality of the collection Ω this gives |C| ≥ |A|. Thus m(gB) ≥ m(B) as required.
1.3.3 Example: Wiener measure
Perhaps the best known measure construction using Loeb measure theory is Anderson’s construction [5] of Wiener measure, which we now describe. Recall
¹⁶ Haar measure on a compact group is the unique probability measure that is invariant under multiplication by group elements.
¹⁷ This means that V ⊂ ∗U for each open neighbourhood U of 1.
¹⁸ Actually we have the mapping st_G : ∗G → G, but since V is an infinitesimal neighbourhood, the set st_G(V_i) is a singleton for each i, so it makes sense to define st_Ω : Ω → G.
that Wiener measure W on C = C0[0, 1] (the set of continuous functions x with x_0 = 0) is the unique Borel probability on C such that
\[ W(\{x : x_t - x_s \in B\}) = \frac{1}{(2\pi(t-s))^{1/2}} \int_B \exp\Big(\frac{-y^2}{2(t-s)}\Big)\, dy \]
for s < t and Borel B ⊂ R, and such that disjoint increments x_t − x_s of paths x ∈ C are independently distributed under W. Take the hyperfinite time line T ∪ {1}, where T is as above, and let CN be the set of all polygonal paths (B(𝗍))_{𝗍 ∈ T ∪ {1}} filled in linearly between the time points 𝗍 ∈ T, with B(0) = 0 and
B(𝗍 + ∆t) − B(𝗍) = ∆B(𝗍) = ±√∆t.
Let WN be the counting probability on CN, giving the internal probability space (CN, AN, WN) where AN = ∗P(CN). This gives the corresponding Loeb space Ω = (CN, L(AN), PN) where PN = (WN)L.
[Figure: An infinitesimal random walk. Horizontal axis: time, marked at ∆t, 5∆t, 10∆t, 15∆t; vertical axis: position, marked from −3√∆t to 3√∆t in steps of √∆t.]
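The random walk just described is easy to imitate with a large finite N, and doing so shows two features that Anderson's construction exploits. The following sketch is an illustration under that assumption only (finite N in place of an infinite one; the function names are hypothetical): it samples one path with increments ±√∆t, checks that the sum of the squared increments over [0, 1] equals 1 up to rounding, and computes the exact counting probability that the endpoint is at most x, which can be compared with the normal distribution function.

```python
import math
import random

def random_walk(N, rng):
    """Return [B(0), B(dt), ..., B(1)] for a walk with increments +-sqrt(dt), dt = 1/N."""
    step = math.sqrt(1.0 / N)
    path = [0.0]
    for _ in range(N):
        path.append(path[-1] + rng.choice((-step, step)))
    return path

def quadratic_variation(path):
    """Sum of squared increments; each increment squared is exactly dt in exact arithmetic."""
    return sum((b1 - b0) ** 2 for b0, b1 in zip(path, path[1:]))

def prob_endpoint_at_most(x, N):
    """Exact counting probability of {B(1) <= x}.

    With k upward steps out of N, B(1) = (2k - N) / sqrt(N), and there are
    comb(N, k) such paths among the 2**N equally likely ones.
    """
    favourable = sum(math.comb(N, k) for k in range(N + 1)
                     if (2 * k - N) / math.sqrt(N) <= x)
    return favourable / 2 ** N

def normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

if __name__ == "__main__":
    path = random_walk(1000, random.Random(1))
    print("quadratic variation:", quadratic_variation(path))
    print("P(B(1) <= 0.5), N = 400:", prob_endpoint_at_most(0.5, 400))
    print("normal cdf at 0.5:", normal_cdf(0.5))
```

The closeness of the last two numbers is the de Moivre–Laplace central limit theorem at work; for infinite N the discrepancy is infinitesimal.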
Theorem 1.26 (Anderson) (a) For a.a.¹⁹ B ∈ CN, B is S-continuous, and gives a continuous path b = ◦B ∈ C. (b) For Borel D ⊆ C, W(D) = PN(st⁻¹(D)) is Wiener measure.²⁰ (c) Writing Ω = CN and ω instead of B for a generic point in Ω, the process b : [0, 1] × Ω → R defined by b(t, ω) = ◦ω(t) is Brownian motion on the probability space Ω.
This is arguably the most intuitive of all the many constructions of Brownian motion/Wiener measure, and captures precisely the stochastic analyst’s rule of thumb “db² = dt”, since we really do have ∆B² = ∆t. Anderson [5] used it to give an elementary proof of Donsker’s invariance principle, together with a pathwise construction of the Itô integral and an intuitive proof of Itô’s Lemma. His construction opened the way for a large number of important applications in stochastic analysis and related fields, either directly or as an inspiration in more general situations. One of the first and most important of these, due to Keisler [61], is the idea of solving stochastic differential equations by means of hyperfinite difference equations. The paper [63] indicates some of the more recent developments in this area. We will discuss these ideas later (see section 1.5.3 below, and also Lecture 2), after we have outlined the basics of Loeb integration theory. First it is necessary to consider Loeb measurable functions.
1.3.4 Loeb measurable functions
Suppose we have a Loeb space Ω = (Ω, L(A), µL) constructed from the internal space (Ω, A, µ). A Loeb measurable function f : Ω → R is simply a function that is measurable in the conventional sense with respect to the Loeb algebra L(A). That is, f⁻¹(]−∞, a]) ∈ L(A) for every real a. There is of course another concept of measurable function, given by the transfer of the standard definition. A function F : Ω → ∗R is ∗measurable if F is internal and F⁻¹([α, β]) ∈ A for every hyperreal interval [α, β] (with α, β ∈ ∗R).
The fundamental connection between these two notions is as follows.
¹⁹ With respect to the Loeb measure PN of course.
²⁰ The standard part mapping here is the restriction to CN of the mapping st : ∗C → C for the uniform topology – see section 1.2.4 above.
Theorem 1.27 Let f : Ω → R. Then the following are equivalent. (a) f is Loeb measurable; (b) there is a ∗measurable function F : Ω → ∗R such that f(ω) ≈ F(ω) for almost all ω ∈ Ω (with respect to the Loeb measure µL).²¹
For a proof see [29], [87], or the original paper of Anderson [6], who proved the result for measurable functions into a second-countable Hausdorff space. David Ross has extended this further to include all metric spaces [86].
Definition 1.28 A function F as given by Theorem 1.27 is called a lifting of f; that is, a lifting (with respect to µL) of a function f : Ω → R is an internal ∗measurable function F : Ω → ∗R such that f(ω) ≈ F(ω) for almost all ω ∈ Ω (with respect to the Loeb measure µL).
A general lifting result that is very useful is Anderson’s ‘Luzin’ theorem [6].
Theorem 1.29 Let (X, C, µ) be a complete Radon space and suppose that f : X → R is measurable. Then ∗f is a lifting of f with respect to µL. That is, ∗f(x) ≈ f(◦x) for (∗µ)L almost all x ∈ ∗X.
Remarks 1. The kind of lifting given by this theorem is known as a two-legged lifting, to distinguish it from the kind of lifting in Definition 1.28. 2. Anderson actually established this result for the situation where the range of f is any Hausdorff space with a countable base of open sets.
1.4 Loeb Integration Theory
Given a Loeb space Ω = (Ω, L(A), µL) and its originating internal space (Ω, A, µ), there are two integrals to consider. First, there is the internal integral
\[ \int_\Omega F\, d\mu \]
²¹ This result also holds for extended real valued functions f : Ω → R provided we adopt the terminology ◦x = ±∞ and hence x ≈ ±∞ if x ∈ ∗R is positive (resp. negative) infinite.
for any (internal) ∗integrable function F : Ω → ∗R. The value of this integral is a hyperreal that is given by the transfer of the construction of the integral on a standard space. Secondly there is the classical Lebesgue integral
\[ \int_\Omega f\, d\mu_L \]
defined in the usual way for a Loeb integrable function f : Ω → R: the term Loeb integrable function f means simply that f is integrable (in the conventional sense) with respect to the Loeb measure µL on Ω. Loeb integration theory gives the connection between these two integrals. Its importance stems from the fact that the internal integral ∫F dµ may be quite simple (for example a hyperfinite sum) while a closely related Loeb integral can represent a general standard integral (such as a Lebesgue integral on the real line or a Wiener integral). Here are the details.
Theorem 1.30 If F is a finitely bounded internal measurable function then
\[ {}^\circ\!\int F\, d\mu = \int {}^\circ F\, d\mu_L. \]
Corollary 1.31 If F is a (finitely) bounded lifting of a Loeb measurable f, then
\[ \int f\, d\mu_L = {}^\circ\!\int F\, d\mu. \]
We cannot in general expect equality of ◦∫F dµ and ∫◦F dµL since F may be large on a set of infinitesimal measure, as in the following example.
Example Consider Ω = ∗[0, 1] and, for a positive infinite K, define F : ∗[0, 1] → ∗R by
\[ F(\tau) = \begin{cases} K & \text{for } \tau \le \frac{1}{K}, \\ 0 & \text{otherwise.} \end{cases} \]
Let Λ denote ∗Lebesgue measure. Then ◦F(τ) = 0 almost everywhere with respect to ΛL, and hence ∫◦F dΛL = 0. But ∫F dΛ = 1. We always have
Theorem 1.32 For any internal A-measurable F with F ≥ 0
\[ \int {}^\circ F\, d\mu_L \le {}^\circ\!\int F\, d\mu, \]
where we allow the value ∞ on either side.
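The example preceding Theorem 1.32 has a familiar standard shadow, which the following sketch makes explicit. It is only an analogue under a stated assumption: a large finite K replaces the infinite K, and the function names are illustrative. Exact rational arithmetic shows that the integral of F stays equal to 1 however large K is, while the set where F is nonzero has measure 1/K, and any bounded truncation of F has integral tending to 0.

```python
from fractions import Fraction

def integral_of_F(K):
    """Exact integral over [0, 1] of F = K on [0, 1/K] and 0 elsewhere."""
    return K * Fraction(1, K)

def measure_of_support(K):
    """Lebesgue measure of the set [0, 1/K] on which F is nonzero."""
    return Fraction(1, K)

def integral_of_truncation(K, c):
    """Exact integral of min(F, c): the value min(K, c) on a set of measure 1/K."""
    return min(K, c) * Fraction(1, K)

if __name__ == "__main__":
    for K in (10, 10**3, 10**6):
        print(K, integral_of_F(K), measure_of_support(K), integral_of_truncation(K, 5))
```

The mass of F is carried entirely by a set whose measure shrinks to zero, which is exactly the behaviour that a condition for equality in Theorem 1.32 has to exclude.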
To obtain equality of ◦∫F dµ and ∫◦F dµL it is necessary to have some condition on F akin to standard integrability – roughly, so that F is not too big on small sets. The following is the appropriate condition.
Definition 1.33 Let a function F : Ω → ∗R be A-measurable and internal and µ an internal finite measure. Then F is S-integrable if
(i) ∫_Ω |F| dµ is finite,
(ii) if A ∈ A and µ(A) ≈ 0, then ∫_A |F| dµ ≈ 0.
Note If µ is not finite an extra condition has to be added:
(iii) if A ∈ A and F ≈ 0 on A, then ∫_A |F| dµ ≈ 0.
This is always satisfied for a finite measure µ. If F ≈ 0 on A and µ(A) ≠ 0, then for any 0 < ε ∈ R we have |F| < ε on A. Hence ∫_A |F| dµ < εµ(A), which is enough since µ(A) is finite.
The function in the example above is not S-integrable because A = ∗[0, 1/K] has Λ(A) ≈ 0 but ∫_A F dΛ = 1.
Note that F is S-integrable if and only if its positive and negative parts F⁺ and F⁻ are S-integrable, and equivalently if |F| is S-integrable. The next result shows the importance of S-integrability.
Theorem 1.34 Let F : Ω → ∗R be A-measurable with F ≥ 0. Then the following conditions are equivalent: (a) F is S-integrable, (b) ◦F is Loeb integrable and
\[ {}^\circ\!\int F\, d\mu = \int {}^\circ F\, d\mu_L. \]
The following is an equivalent formulation of S-integrability (the proof is left as an exercise).
Proposition 1.35 An internal function F is S-integrable if and only if for all infinite K
\[ \int_{|F|>K} |F|\, d\mu \approx 0. \]
To complete the basic theory of Loeb integration we have:
Theorem 1.36 Let f : Ω → R be Loeb measurable. Then f is µL-integrable if and only if it has an S-integrable lifting F : Ω → ∗R.
Definition 1.37 We say that F : Ω → ∗R is SL^p (p > 0) if |F|^p is S-integrable (so SL^1 means S-integrable).
Here is a very useful test for S-integrability isolated by Lindstrøm [68] and frequently applied in the case p = 2.
Theorem 1.38 Suppose µ(Ω) < ∞. If F : Ω → ∗R is internal, A-measurable, and
\[ \int_\Omega |F|^p\, d\mu < \infty \]
for some p > 1, p ∈ R, then F is S-integrable.
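One way to see why Theorem 1.38 holds is via the transferred Hölder inequality. The following is a sketch rather than the proof given in [68]; it reads “< ∞” as “is finite”, and the conjugate exponent q (with 1/p + 1/q = 1) is notation introduced here, not in the text.

```latex
\[
  \int_A |F|\, d\mu \;\le\; \Big(\int_\Omega |F|^p\, d\mu\Big)^{1/p}\, \mu(A)^{1/q},
  \qquad \frac{1}{p} + \frac{1}{q} = 1, \quad A \in \mathcal{A}.
\]
```

Taking A = Ω shows that ∫_Ω |F| dµ is finite, which is condition (i) of Definition 1.33. If µ(A) ≈ 0 then µ(A)^{1/q} ≈ 0, and the product of a finite number with an infinitesimal is infinitesimal, which gives condition (ii).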
1.5 Elementary Applications
As a warm up for the more substantial applications of Loeb measures in later lectures, we present here a few simple illustrations of their power.
1.5.1 Lebesgue integration
Recall the hyperfinite time set T defined above (Definition 1.24), which carries the counting Loeb measure νL. For any function f : [0, 1] → R we may define a corresponding function f̂ : T → ∗R by f̂(𝗍) = f(◦𝗍). The characterisation (or definition) of Lebesgue measure given by Theorem 1.25, combined with Theorem 1.27 yields immediately:
Theorem 1.39 The following are equivalent: (a) f is Lebesgue measurable; (b) f̂ is Loeb measurable (wrt νL); (c) there is an internal function F : T → ∗R (a lifting of f̂) such that for a.a. 𝗍 ∈ T
f(◦𝗍) = ◦F(𝗍).
The lifting F of f̂ is a two-legged lifting in the sense described earlier. Now apply Theorem 1.36 to give the following pleasant characterisation of the Lebesgue integral.
Theorem 1.40 Suppose that f, f̂ are as above. Then the following are equivalent: (a) f is Lebesgue integrable; (b) f̂ is Loeb integrable; (c) there is an S-integrable function F : T → ∗R that is a lifting of f (and f̂). If any of (a)–(c) holds then
\[ \int_0^1 f\, d\lambda = \int_{\mathbf{T}} \hat{f}\, d\nu_L = {}^\circ\!\sum_{\mathsf{t} \in \mathbf{T}} F(\mathsf{t})\, \Delta t, \]
the summation term Σ_{𝗍∈T} F(𝗍)∆t being another way of writing ∫_T F dν.
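The sum in Theorem 1.40 can be tried out with a large finite N in place of the hyperfinite one. The sketch below is an analogue under that assumption only; `hyperfinite_sum` and `inv_sqrt_lifting` are illustrative names, and the choice F(0) = 0 in the second example is an assumption made here to avoid the singularity at 0. Finite sums cannot verify S-integrability, so the code only shows the sums approaching the Lebesgue integrals as N grows.

```python
import math

def hyperfinite_sum(F, N):
    """Sum of F(t) * dt over the grid t in {0, dt, ..., 1 - dt}, with dt = 1/N."""
    dt = 1.0 / N
    return sum(F(k * dt) for k in range(N)) * dt

def inv_sqrt_lifting(t):
    """Candidate lifting of f(t) = 1 / (2 sqrt(t)), set to 0 at the single point t = 0."""
    return 0.0 if t == 0 else 0.5 / math.sqrt(t)

if __name__ == "__main__":
    for N in (100, 10_000, 1_000_000):
        # Integrals over [0, 1]: t^2 integrates to 1/3, 1/(2 sqrt(t)) integrates to 1.
        print(N, hyperfinite_sum(lambda t: t * t, N), hyperfinite_sum(inv_sqrt_lifting, N))
```

The second example is unbounded near 0, so Corollary 1.31 does not apply to it directly; it is the kind of function for which the S-integrability hypothesis in Theorem 1.40 matters.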
1.5.2 Peano’s Existence Theorem
The above characterisation of the Lebesgue integral as a hyperfinite sum leads naturally to the method of hyperfinite difference equations for solving ODEs – an appealing technique pioneered by Keisler and extended to great effect especially for stochastic differential equations – see [61]. Here is an outline of a proof of Peano’s fundamental existence theorem using this technique.
Theorem 1.41 (Peano) Suppose that f : [0, 1] × R → R is bounded, measurable, and continuous in the second variable, and let x0 ∈ R. Then there is a solution to the differential equation
dx(t) = f(t, x(t))dt, x(0) = x0. (1.1)
(Of course, what is meant is really the corresponding integral equation.)
Proof Without any loss of generality we may assume that x0 = 0 (otherwise consider the equation for x(t) − x0). Suppose that |f| ≤ c. An extension of the Lifting Theorem 1.27 is used (see below for details) to obtain an internal function F : T × ∗[−c, c] → ∗R such that |F| ≤ c and for almost all 𝗍 ∈ T
F(𝗍, X) ≈ f(◦𝗍, ◦X) (1.2)
for all |X| ≤ c. The hyperfinite difference equation corresponding to (1.1) is now
∆X(𝗍) = F(𝗍, X(𝗍))∆t,
where ∆X(𝗍) = X(𝗍 + ∆t) − X(𝗍), together with the initial condition X(0) = 0. This is an internal equation for an internal function X : T → ∗R, with solution X(𝗍) defined recursively by
X(0) = 0, X(𝗍 + ∆t) = X(𝗍) + F(𝗍, X(𝗍))∆t.
Then X is S-continuous, and |X(𝗍)| ≤ c𝗍 ≤ c for all 𝗍 ∈ T. So we may define a continuous function x : [0, 1] → R by x(t) = ◦X(𝗍) for any 𝗍 ≈ t. Clearly |x(t)| ≤ c. To see that x(t) is a solution, observe that, by (1.2) and the definition of x, for almost all 𝗍 ∈ T
F(𝗍, X(𝗍)) ≈ f(◦𝗍, ◦X(𝗍)) = f(◦𝗍, x(◦𝗍))
which means that the function G(𝗍) = F(𝗍, X(𝗍)) is a lifting of the function g(t) = f(t, x(t)). So, applying Theorem 1.40 to g(t) and its lifting G(𝗍) we have (putting t = ◦𝗍)
\[ x(t) = {}^\circ X(\mathsf{t}) = {}^\circ\!\sum_{\mathsf{s} < \mathsf{t}} F(\mathsf{s}, X(\mathsf{s}))\, \Delta t = \int_0^t f(s, x(s))\, ds \]
as required.
The lifting F above satisfying (1.2) is obtained as follows. Define the measurable function f̂ : [0, 1] → C([−c, c]) by f̂(t)(z) = f(t, z) for |z| ≤ c. From this we obtain (using Theorem 1.25) a Loeb measurable function f̌ : T → C([−c, c]) (where T is the hyperfinite time line, endowed with the counting measure ν as above) by f̌(𝗍) = f̂(◦𝗍) for 𝗍 ∈ T. Taking the uniform topology on C([−c, c]) and the extension of Theorem 1.27 to separable metric spaces, we obtain a lifting F̂ : T → ∗C([−c, c]) such that for almost all 𝗍 ∈ T (with respect to νL)
F̂(𝗍) ≈ f̌(𝗍) = f̂(◦𝗍)
(in the uniform topology) and |F̂| ≤ c. This means that for all such 𝗍
F̂(𝗍)(X) ≈ f̌(𝗍)(◦X) = f̂(◦𝗍)(◦X) = f(◦𝗍, ◦X)
for all |X| ≤ c. Now define F : T × ∗[−c, c] → ∗R by F(𝗍, X) = F̂(𝗍)(X). Then |F| ≤ c and for almost all 𝗍 ∈ T
F(𝗍, X) ≈ f(◦𝗍, ◦X)
for all |X| ≤ c, which is (1.2).
A slightly different Loeb measure approach to differential equations is to work with an infinitesimal delayed equation, and we illustrate this with an alternative proof of the Peano theorem.
Alternative Proof of Theorem 1.41 Let ∆ = ∆t = N⁻¹ as above, and define an internal function X : ∗[−∆, 1] → ∗R by
\[ X(\tau) = \begin{cases} x_0 & \text{for } -\Delta \le \tau \le 0, \\ x_0 + \int_0^\tau {}^*f(\sigma, X(\sigma - \Delta))\, d\sigma & \text{for } 0 \le \tau \le 1. \end{cases} \]
Note that X(τ) is defined recursively on [k∆, (k + 1)∆] for 0 ≤ k < N.
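The recursive nature of this definition can be seen in a numerical imitation. The sketch below is an analogue under stated assumptions: a large finite N replaces the infinite one, each interval [k∆, (k + 1)∆] is further divided into m sub-steps and the integral is replaced by a left-endpoint sum (this extra discretisation is not part of the text), and the test function f(t, x) = 1/(1 + x²) is an illustrative choice.

```python
def delayed_solution(f, x0, N, m=4):
    """Approximate X(tau) = x0 + integral_0^tau f(sigma, X(sigma - Delta)) d sigma on [0, 1].

    Delta = 1/N is the delay. The grid has step h = Delta / m, and X[j] approximates
    X(j * h). On each interval [k Delta, (k+1) Delta] the integrand only uses values
    of X from the previous interval, which are already known.
    """
    h = 1.0 / (N * m)
    X = [x0]
    for j in range(N * m):
        # Value of X at time j*h - Delta; this is x0 whenever j*h - Delta <= 0.
        delayed = X[j - m] if j >= m else x0
        X.append(X[j] + f(j * h, delayed) * h)
    return X

if __name__ == "__main__":
    f = lambda t, x: 1.0 / (1.0 + x * x)   # bounded by 1 and continuous in x
    for N in (10, 100, 1000):
        x1 = delayed_solution(f, 0.0, N)[-1]
        # The exact solution of x' = 1/(1 + x^2), x(0) = 0 satisfies x + x^3/3 = t.
        print(N, x1, x1 + x1 ** 3 / 3.0 - 1.0)
```

As the delay shrinks, the residual in the last column shrinks with it, mirroring the way an infinitesimal delay disappears on taking standard parts.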
Since f and hence ∗f is bounded, X is S-continuous and we can define a standard function x : [0, 1] → R by x(t) = ◦X(t) = ◦X(τ) for any τ ≈ t. We claim that x(t) is a solution to equation (1.1). Let Λ = ∗λ = ∗Lebesgue measure. Using the extension of Anderson’s Luzin Theorem 1.29, mentioned above, and considering the function f̂ : [0, 1] → C(R) defined by f̂(t)(z) = f(t, z) we have that for almost all τ (with respect to ΛL)
∗f(τ, y) ≈ f(◦τ, ◦y)
for all finite y ∈ ∗R. Hence, for almost all τ ∈ ∗[0, 1]
∗f(τ, X(τ − ∆)) ≈ f(◦τ, ◦X(τ − ∆)) = f(◦τ, x(◦τ))
since ◦(τ − ∆) = ◦τ. Now this means that G(τ) = ∗f(τ, X(τ − ∆)) is a bounded lifting of g(τ) = f(◦τ, x(◦τ)) and so for any t ∈ [0, 1]
\[ x(t) = {}^\circ X(t) = x_0 + {}^\circ\!\int_0^t G(\tau)\, d\tau = x_0 + \int_0^t g(\tau)\, d_L\tau \]
where d_Lτ denotes integration with respect to ΛL. Since ΛL ∘ st⁻¹ is Lebesgue measure, we have
\[ \int_0^t g(\tau)\, d_L\tau = \int_0^t f({}^\circ\tau, x({}^\circ\tau))\, d_L\tau = \int_0^t f(s, x(s))\, ds \]
which shows that x(t) is a solution to equation (1.1).
Loeb Differential Equations
The existence of the Loeb-Lebesgue measure ∗λL on ∗R makes it possible (and natural) to formulate and solve Loeb differential equations for the “rich” time line ∗R. By this we mean integral equations of the following kind:
\[ x(\tau) = x_0 + \int_0^\tau f(\sigma, x(\sigma))\, d_L\sigma \]
where f : ∗[0, 1] × R → R is Loeb measurable in τ and continuous in the second variable. The solution x(τ) will be S-continuous and real valued, so it will really be a continuous function. Such equations occur in the study of optimal control theory, where it is natural to consider Loeb measurable controls. In particular, it can be shown [55] that a general optimal control problem will always have an optimal Loeb control, even when there is no optimal Lebesgue control. There are close connections here with Young measures: this was shown in the context of control theory in [31], and is discussed in greater detail in the forthcoming paper [45].
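Returning to the first proof of Theorem 1.41, the hyperfinite difference recursion X(𝗍 + ∆t) = X(𝗍) + F(𝗍, X(𝗍))∆t is easy to imitate with a large finite N. The sketch below is an analogue under that assumption; it takes F to be a continuous bounded function directly (so no lifting argument is needed), and the name `difference_scheme` and the test function are illustrative choices.

```python
def difference_scheme(F, N, x0=0.0):
    """Solve X(0) = x0, X(t + dt) = X(t) + F(t, X(t)) * dt on the grid with dt = 1/N.

    Returns the list [X(0), X(dt), ..., X(1)], so that X[k] approximates x(k / N).
    """
    dt = 1.0 / N
    X = [x0]
    for k in range(N):
        X.append(X[k] + F(k * dt, X[k]) * dt)
    return X

if __name__ == "__main__":
    F = lambda t, x: 1.0 / (1.0 + x * x)   # |F| <= c with c = 1
    for N in (10, 100, 10_000):
        x1 = difference_scheme(F, N)[-1]
        # The exact solution of x' = 1/(1 + x^2), x(0) = 0 satisfies x + x^3/3 = t.
        print(N, x1, x1 + x1 ** 3 / 3.0 - 1.0)
```

The bound |X(𝗍)| ≤ c𝗍 used in the proof is visible here as well: each step changes X by at most c∆t, which is also what makes X S-continuous when N is infinite.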
1.5.3 Itô integration and stochastic differential equations
The hyperfinite difference approach has been used to great effect in the solution of Itô stochastic differential equations (SDEs), based on Anderson’s hyperfinite random walk construction of Brownian motion and the Itô integral [5]. Without going into details, the Itô integral gives a meaning to the expression
\[ I(t, \omega) = \int_0^t f(s, \omega)\, db(s, \omega) \]
where b(t, ω) is Brownian motion and f(t, ω) is a certain kind of random function (an adapted function). The Itô integral I(t, ω) is a continuous stochastic process, and is defined as an L²-limit of simpler random processes. In the standard theory a pathwise (that is, ω-wise) definition of I is not possible. Nevertheless Anderson [5] showed how to represent the Itô integral pathwise as a hyperfinite sum, in a direct generalisation of the above representation of the Lebesgue integral. First recall Anderson’s Brownian motion b(t, ω) constructed earlier on the Loeb space Ω = (CN, L(AN), PN). This came from the canonical internal random walk B(𝗍, ω) defined on Ω = CN by
B(𝗍, ω) = ω(𝗍).
Thus ∆B(𝗍, ω) = B(𝗍 + ∆t, ω) − B(𝗍, ω) = ±√∆t. A generalisation of Theorem 1.27 gives:
Theorem 1.42 Let f(t, ω) be an adapted function. Then there is a nonanticipating²² lifting F : T × Ω → ∗R of f such that f(◦𝗍, ω) ≈ F(𝗍, ω) for almost all (𝗍, ω) ∈ T × Ω.
Anderson [5] proved the following stochastic generalisation of Theorem 1.40.
Theorem 1.43 (Anderson) Let F be a nonanticipating lifting of an adapted function f as above, and define an internal hyperfinite stochastic integral G : T × Ω → ∗R by
\[ G(\mathsf{t}, \omega) = \sum_{\mathsf{s} < \mathsf{t}} F(\mathsf{s}, \omega)\, \Delta B(\mathsf{s}, \omega). \]
Then G is a nonanticipating lifting of the stochastic integral I(t, ω) = ∫_0^t f db defined above. That is, for almost all ω, the function G(𝗍, ω) is S-continuous and
²² This means that F(𝗍, ω) depends only on the values ω(𝗌) for 𝗌 ≤ 𝗍.
28
1 Loeb Measures
I(◦t, ω) ≈ G(t, ω) for all t. In [61] Keisler pioneered the use of Anderson’s representation of the Itˆ o integral in the solution of stochastic differential equations (SDEs), generalising the technique described above to prove the Peano Existence Theorem for ODEs. Subsequently this technique has been developed by many authors, both in solving SDEs and in applications such as optimal control theory [22] and mathematical finance theory ([35] for example). The delay approach is also appropriate for certain SDEs – see [21]. Loeb space methods for SDEs have been extended to equations involving general stochastic integrals against martingales and semimartingales, beginning with Hoover & Perkins [57] and Lindstrøm [68]. For partial differential equations (PDEs) – or, more generally, infinite dimensional differential equations, in addition to the above approaches there are new possibilities for constructing solutions using hyperfinite dimensional representation of the objects concerned (and not necessarily using hyperfinite representation of time). The book [13] develops this idea in some detail for the Navier–Stokes equations, which are formulated as a differential equation in a certain separable Hilbert space H. We will discuss this in greater detail in the next lecture – which includes some developments since the publication of [13]. Measure valued equations on an infinite dimensional space can also be treated successfully using hyperfinite dimensional representation, together with the idea of nonstandard densities – and this is also touched upon in the next lecture.
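Anderson's representation in Theorem 1.43 can be imitated numerically by replacing the infinite hyperfinite step count with a large finite one. The sketch below is only a finite caricature under that assumption: the names `anderson_walk` and `ito_sum` and the choice of integrand $F = B$ are illustrative and do not come from the text. Because every increment satisfies $(\Delta B)^2 = \Delta t$ exactly, the left-endpoint sum $\sum_{s<t} B(s)\Delta B(s)$ equals $\tfrac12 (B(t)^2 - t)$ on every path, which mirrors the Itô formula for $\int_0^t b\, db$.

```python
import math
import random


def anderson_walk(n_steps, rng):
    """Return dt and a +-sqrt(dt) random walk B on the grid {0, dt, ..., 1}."""
    dt = 1.0 / n_steps
    step = math.sqrt(dt)
    b = [0.0]
    for _ in range(n_steps):
        b.append(b[-1] + (step if rng.random() < 0.5 else -step))
    return dt, b


def ito_sum(f_values, b):
    """Left-endpoint sums G(t) = sum_{s<t} F(s) * (B(s+dt) - B(s)) for each grid time t."""
    g = [0.0]
    for j in range(len(b) - 1):
        g.append(g[-1] + f_values[j] * (b[j + 1] - b[j]))
    return g


rng = random.Random(0)          # fixed seed so the run is reproducible
n = 10_000                      # large finite stand-in for the infinite N
dt, b = anderson_walk(n, rng)
g = ito_sum(b, b)               # integrand F = B uses only past values of the walk

# Pathwise identity: G(t) = (B(t)^2 - t) / 2 at every grid time.
max_err = max(abs(g[j] - 0.5 * (b[j] ** 2 - j * dt)) for j in range(n + 1))
print("largest deviation from (B(t)^2 - t)/2:", max_err)
```

The integrand is evaluated at the left endpoint of each step, which is the discrete counterpart of the nonanticipating condition; evaluating it at the right endpoint would change the sum by exactly $t$.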
2. Stochastic Fluid Mechanics
2.1 Introduction

The Navier–Stokes equations describe the time evolution of the velocity of a fluid at each point in an appropriate domain. In this Lecture we illustrate a number of nonstandard techniques that have been developed with Marek Capiński to shed fresh light on these equations, particularly when they are perturbed by some external noise. In this case the power of Loeb spaces has led to a number of new existence results.

We mainly confine attention to the equations for a fluid in a bounded domain $D \subseteq \mathbb{R}^d$ for $d = 2, 3$, with zero velocity on the boundary $\partial D$. On occasions we will consider periodic boundary conditions in dimension 2.

A general version of the stochastic Navier–Stokes equations in two or three dimensions (i.e. $d = 2$ or $3$) is as follows:
N.J. Cutland: LNM 1751, pp. 29–60, 2000. © Springer-Verlag Berlin Heidelberg 2000
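\[
\begin{aligned}
du &= \bigl[\nu \Delta u - \langle u, \nabla \rangle u + f(t, u) - \nabla p\bigr]\, dt + g(t, u)\, dw_t \\
\operatorname{div} u &= 0
\end{aligned}
\tag{2.1}
\]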
Here $u(t, x, \omega)$ is the (random) velocity of the fluid at the location $x \in D$ at time $t$. Thus we have $u : [0, \infty) \times D \times \Omega \to \mathbb{R}^d$ where $\Omega$ is an underlying probability space. The randomness enters the equations explicitly by way of the random external force $g(t, u)\, dw_t$, where $w$ is a Wiener process, and implicitly through the feedback of $u$ in the noise coefficient $g(t, u)$ and the non-random external force $f(t, u)$. Both $f$ and $g$ depend on $t$ and the whole of the current velocity field $u(t, \cdot, \omega)$. The other quantities in the equations are $\nu$ (the viscosity) and $p$ (the pressure).

The deterministic Navier–Stokes equations with feedback in the forces are obtained by taking $g = 0$, and were first solved by Leray in 1933–34. Even here there remain some major open problems, particularly for space dimension 3; here existence for all time is established only for so-called weak solutions, while uniqueness has only been proved for strong solutions (where the problem of global existence is still to be settled).

Stochastic Navier–Stokes equations were first considered in 1973 by Bensoussan and Temam [7]. They considered the case where $f$ is independent of $u$ and $g(t, u) = 1$; that is, with noise given by a one-dimensional Wiener process acting in an additive way. Subsequently this equation was studied by many authors but the noise always appeared in additive form (that is, with $g$ independent of $u$).

The general stochastic Navier–Stokes equations (2.1) were first solved in 1991 in [10] using the Loeb space methods that we will describe below. These ideas have been used to establish a range of new results for the Navier–Stokes equations, including existence results for statistical solutions and stochastic attractors, as well as a recent extension to the stochastic non-homogeneous Navier–Stokes equations. The case $\nu = 0$ (the Euler equation) has also been treated.
2.1.1 Function spaces

The classical framework (which we adopt) for the solution of the Navier–Stokes equations involves certain Hilbert spaces and operators, as follows. Let $H$ be the closure of the set $\{u \in C_0^\infty(D, \mathbb{R}^d) : \operatorname{div} u = 0\}$ in the $L^2$ norm $|u| = (u, u)^{1/2}$, where

\[ (u, v) = \sum_{j=1}^{d} \int_D u^j(x)\, v^j(x)\, dx. \]
The letters $u, v, w$ will be used for elements of $H$. The space $V$ is the closure of $\{u \in C_0^\infty(D, \mathbb{R}^d) : \operatorname{div} u = 0\}$ in the stronger norm $|u| + \|u\|$ where $\|u\| = ((u, u))^{1/2}$ and

\[ ((u, v)) = \sum_{j=1}^{d} \left( \frac{\partial u}{\partial x_j}, \frac{\partial v}{\partial x_j} \right). \]

$H$ and $V$ are Hilbert spaces with scalar products $(\cdot, \cdot)$ and $((\cdot, \cdot))$ respectively, and $|\cdot| \le c\,\|\cdot\|$ for some constant $c$. By $A$ we denote the self adjoint extension of the projection of $-\Delta$ in $H$. Classical theory shows that there is an orthonormal basis $\{e_k\}$ of eigenfunctions of $A$ with corresponding eigenvalues $\lambda_k$, $\lambda_k > 0$, $\lambda_k \nearrow \infty$. For $u \in H$ we write $u_k = (u, e_k)$, and write $\mathrm{Pr}_m$ for the projection of $H$ on the subspace $H_m$ spanned by $\{e_1, \dots, e_m\}$. Since each $e_k \in V$ we have $H_m \subset V$.

A trilinear form $b$ is defined by

\[ b(u, v, w) = \sum_{i,j=1}^{d} \int_D u^j(x)\, \frac{\partial v^i}{\partial x_j}(x)\, w^i(x)\, dx = (\langle u, \nabla \rangle v, w) \]

whenever the integrals make sense. Note the following well-known properties of the form $b$:

\[ b(u, v, w) = -b(u, w, v), \qquad b(u, v, v) = 0, \qquad |b(u, v, w)| \le c\, \|u\|\, \|v\|\, \|w\|. \]

The last is a continuity property of $b$ with respect to the norm $\|\cdot\|$. There are a number of other important continuity properties for other norms, but we are not concerned with such detail here.

2.1.2 Functional formulation of the Navier–Stokes equations

In the above framework, the stochastic Navier–Stokes equations may be formulated as a stochastic differential equation in $H$ as follows:

\[ du = [-\nu A u - B(u) + f(t, u)]\, dt + g(t, u)\, dw_t \tag{2.2} \]
where $B(u) = b(u, u, \cdot)$. This is regarded as an equation in $V'$ (the dual of $V$) although it turns out that the solution lives in $H$ (and in fact in $V$ for almost all times). Compared to (2.1), note that the pressure has disappeared, because $\nabla p = 0$ in $V'$: for $v \in V$ an integration by parts gives $(\nabla p, v) = -(p, \operatorname{div} v) = 0$, using $\operatorname{div} v = 0$ in $V$. The equation (2.2) is really an integral equation, with the first integral being the Bochner integral and the second an extension of the Itô integral to Hilbert spaces, due to Ichikawa [59]. The noise is given by a Wiener process $w : [0, \infty) \times \Omega \to H$ and so the noise coefficient $g$ belongs to $L(H, H)$. So it is assumed that
\[ g : [0, \infty) \times V \to L(H, H) \quad \text{while} \quad f : [0, \infty) \times V \to V'. \]
(The restriction to $V$ in the domains is sufficient because we will have the solution in $V$ for almost all times.)

2.1.3 Definition of solutions to the stochastic Navier–Stokes equations

The following makes precise what is meant by a solution to the Navier–Stokes equations as formulated above. In fact there is a range of solution concepts of varying strength, each of which is appropriate in certain circumstances.

Definition 2.1 Suppose that $u_0 \in H$ and $f, g$ as above are given, together with a probability space $\Omega$ carrying an $H$-valued Wiener process $w$. A weak solution of the stochastic Navier–Stokes equations is a stochastic process $u : [0, \infty) \times \Omega \to H$ such that for a.a. $\omega$
(i) $u \in L^2(0, T; V) \cap L^\infty(0, T; H) \cap C(0, T; H_{\mathrm{weak}})$ for all $T < \infty$,
(ii) for all $t \ge 0$

\[ u(t) = u_0 + \int_0^t [-\nu A u(s) - B(u(s)) + f(s, u(s))]\, ds + \int_0^t g(s, u(s))\, dw_s \tag{2.3} \]
A strong$^{1}$ solution has in addition that for a.a. $\omega$

\[ \sup_{t \le T} \|u(t)\|^2 + \int_0^T |A u(t)|^2\, dt < \infty \]

for all $T$. The notion of solution for the deterministic case is given by taking $g = 0$ and removing the random parameter $\omega$ throughout, so the solution is a single function $u \in L^2(0, T; V) \cap L^\infty(0, T; H) \cap C(0, T; H_{\mathrm{weak}})$.

$^{1}$ Sometimes a strong solution is required to have the stronger property $E\bigl( \sup_{t \le T} \|u(t)\|^2 + \int_0^T |A u(t)|^2\, dt \bigr) < \infty$ for all $T$; we prefer to call this strictly strong.

The standard approach to the solution of the Navier–Stokes equations is to first formulate an approximate version in the finite dimensional space $H_n$ for each $n$. This is the so-called Galerkin approximation, and it can be solved easily using standard techniques from ODEs (or SDEs in the stochastic case). Calling the Galerkin approximate solutions $u_n(t)$, the idea is then to pass to the limit to obtain a solution to the Navier–Stokes equations. However, this is where the difficulties lie. They are two-fold. First, some specialised compactness theorems are required to show that there is a subsequence of $(u_n(t))_{n \in \mathbb{N}}$ that does converge in an appropriate sense to a limit $u(t)$ say. Second, there is the problem of showing that this limit $u(t)$ actually is a solution.

Using nonstandard techniques, the above difficulties are handled as follows. Take the Galerkin approximate solution $U = u_N$ corresponding to an infinite $N \in {}^*\mathbb{N}$. This is of course a nonstandard function, whose existence is given by the transfer of the standard theory that gives the standard Galerkin approximations $u_n$. Consideration of the energy equation shows that $U$ is nearstandard in an appropriate sense to a standard function (or process) $u$. Then it is necessary to show that this solves the Navier–Stokes equations, using basic Loeb theory.

The above procedure will be explained carefully for the deterministic case, because it is the basis of all subsequent applications of these methods to the Navier–Stokes equations. We will then indicate the additional complications in the stochastic case. First it is necessary to set up a little notation and record a few basic facts about the nonstandard space $H_N$.

2.1.4 Nonstandard topology in Hilbert spaces

First note that we always denote elements of ${}^*[0, \infty)$ by $\sigma, \tau$ to distinguish them from elements $s, t$ of $[0, \infty)$. We use $x < \infty$ (for $x \in {}^*\mathbb{R}$) to mean '$x$ is finite'. Fix an infinite integer $N \in {}^*\mathbb{N} \setminus \mathbb{N}$ and let $H_N$ be the space spanned by $\{E_1, \dots, E_N\}$ where $E_k = {}^*e_k$. We use $U$, $V$, $W$ for elements of $H_N$ and write $U_k = (U, E_k)$ for $k \in {}^*\mathbb{N}$. Then

\[
(U, V) = \sum_{k=1}^{N} U_k V_k, \qquad
|U|^2 = \sum_{k=1}^{N} U_k^2, \qquad
((U, V)) = \sum_{k=1}^{N} \lambda_k U_k V_k, \qquad
\|U\|^2 = \sum_{k=1}^{N} \lambda_k U_k^2.
\]
Thus we can identify the space $H_N$ with the (nonstandard) Euclidean space ${}^*\mathbb{R}^N$. Note that for $U \in H_N$ and $u \in H$:
(1) $U \approx u$ in the weak topology of $H$ iff $(U, {}^*v) \approx (u, v)$ for all $v \in H$ (and then $u_k = {}^\circ U_k$ for all finite $k$).
(2) $U$ is nearstandard in $H_N$, i.e. nearstandard in the strong topology of $H$, iff

\[ \sum_{k=1}^{N} U_k^2 \approx \sum_{k=1}^{\infty} ({}^\circ U_k)^2 < \infty. \]
(3) If $U$ is nearstandard in $H_N$, then $U$ is weakly nearstandard in $H_N$ and the standard parts coincide.
(4) If $|U| < \infty$ then $U$ is weakly nearstandard and $|\text{w-st}\,U| \le {}^\circ|U|$.
(5) If $\|U\| < \infty$ then $U$ is weakly nearstandard in $V$ and nearstandard in $H$. The standard parts coincide and $\|\mathrm{st}\,U\| \le {}^\circ\|U\|$.

In the above, $\text{w-st}\,U$ or ${}^\circ U$ is used to denote the weak standard part of $U$ in ${}^*H_N$, defined on the set $\text{w-ns}({}^*H_N)$ of weakly nearstandard points in $H$. (There should be no confusion with the conventional notation ${}^\circ x$ to denote the real standard part for a finite number $x \in {}^*\mathbb{R}$.) By $\mathrm{st}\,U$ we denote the standard part in the strong topology of $H_N$, defined on the set $\text{ns}({}^*H_N)$ of strongly nearstandard points.

Recall that a function $U : {}^*\mathbb{R} \to H_N$ is S-continuous for finite time in a topology $\mathcal{T}$ on $H$ if $U(\tau) \approx_{\mathcal{T}} U(\sigma)$ whenever $\tau \approx \sigma < \infty$. For example, weak S-continuity means that $(U(\tau) - U(\sigma), {}^*v) \approx 0$ for all standard $v$.
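A concrete case may help to separate weak from strong nearstandardness in the facts (1)–(5). The following worked computation is an illustration added here, using only the notation of this section; it takes $U = E_N$ for the fixed infinite $N$.

```latex
% U = E_N, so U_k = 0 for k < N and U_N = 1.
\begin{align*}
(U, {}^*v) &= ({}^*v)_N \approx 0
  \quad \text{for every standard } v \in H,
  \text{ since } \textstyle\sum_k v_k^2 < \infty \text{ forces } v_k \to 0, \\
\text{w-st}\,U &= 0, \qquad {}^\circ U_k = 0 \text{ for all finite } k, \\
|U|^2 &= \sum_{k=1}^{N} U_k^2 = 1
  \;\not\approx\; 0 = \sum_{k=1}^{\infty} ({}^\circ U_k)^2, \\
\|U\|^2 &= \sum_{k=1}^{N} \lambda_k U_k^2 = \lambda_N \quad \text{(infinite)}.
\end{align*}
```

So $E_N$ is weakly nearstandard with weak standard part $0$, as (4) predicts because $|E_N| = 1$ is finite, but it fails the criterion in (2) and is not strongly nearstandard. Fact (5) does not apply, since $\|E_N\|$ is infinite.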
2.2 Solution of the Deterministic Navier–Stokes Equations

In this section set $g = 0$ and assume that we are given an element $u_0 \in H$ (the initial condition) and a forcing term $f : [0, \infty) \times V \to V'$ which is continuous and has linear growth in the second variable. Let $U_0 = \mathrm{Pr}_N\, {}^*u_0 \in H_N$ and consider the following system of $N$ equations for the internal function $U : {}^*[0, \infty) \to H_N$. In vector form,

\[ \dot{U}(\tau) = -\nu\, {}^*\!A\, U(\tau) - B_N(U(\tau)) + F(\tau, U(\tau)) \tag{2.4} \]

with initial condition $U(0) = U_0$, where

\[ F(\tau, V) = \mathrm{Pr}_N\, {}^*f(\tau, V) \quad \text{and} \quad B_N(V) = \mathrm{Pr}_N\, {}^*b(V, V, \cdot). \]

Putting $U(\tau) = \sum_{k=1}^{N} U_k(\tau) E_k \in H_N$, that is

\[ \dot{U}_k(\tau) = -\nu \lambda_k U_k(\tau) - {}^*b(U(\tau), U(\tau), E_k) + F_k(\tau, U(\tau)) \tag{2.5} \]

for $k = 1, \dots, N$. This is simply the Galerkin approximation to the Navier–Stokes equations in dimension $N$. Transfer of standard results (the theory of ODEs in $\mathbb{R}^n$) shows immediately that there is a nonstandard solution.$^{2}$

$^{2}$ This will be unique if $f$ satisfies a Lipschitz condition; alternatively, if it were needed, uniqueness of solution to the Galerkin equation on $H_N$ could be achieved by an infinitesimal modification of $F$ so that it is ${}^*$Lipschitz.
Elementary calculus on $H_N$ (which is, after all, isomorphic to ${}^*\mathbb{R}^N$) gives the following energy equation (using $(B(V), V) = 0$ and $(AV, V) = \|V\|^2$):

\[ \frac{1}{2} \frac{d}{d\tau} |U(\tau)|^2 = -\nu \|U(\tau)\|^2 + (F(\tau, U(\tau)), U(\tau)). \tag{2.6} \]

Thus

\[
\frac{d}{d\tau} |U(\tau)|^2 + 2\nu \|U(\tau)\|^2
\le 2\, |F(\tau, U(\tau))|_{V'}\, \|U(\tau)\|
\le \nu \|U(\tau)\|^2 + \frac{1}{\nu} |F(\tau, U(\tau))|_{V'}^2
\tag{2.7}
\]

Integrating with respect to $\tau$ and using the growth condition on $f$ (and hence $F$) allows an application of Gronwall's Lemma to give

\[ \sup_{\tau \in [0, T]} |U(\tau)|^2 + \nu \int_0^T \|U(\tau)\|^2\, d\tau < \infty \tag{2.8} \]
for finite $T$. For finite times $\tau$, this means that $|U(\tau)|$ is finite and so $U(\tau)$ is weakly nearstandard. Moreover, $U(\tau)$ is weakly S-continuous (since each component $U_k(\tau)$ is S-continuous for finite $\tau$). This allows the definition of a weakly continuous standard function $u : [0, \infty) \to H$ as follows:

\[ u(t) = {}^\circ U(t) = {}^\circ U(\tau) \quad \text{for any } \tau \approx t. \]

In terms of co-ordinates, $u_k(t) = {}^\circ U_k(t)$ for all finite $k$; in particular $u(0) = u_0$. Now we have

Theorem 2.2 The function $u(t)$ defined above is a weak solution to the (deterministic) Navier–Stokes equations with

\[ \sup_{s \le t} |u(s)|^2 + \nu \int_0^t \|u(s)\|^2\, ds < \infty \tag{2.9} \]
for all $t$.

Proof (Sketch) Note first that the inequality (2.9) follows immediately from (2.8) using the inequalities $|{}^\circ V| \le {}^\circ|V|$, $\|{}^\circ V\| \le {}^\circ\|V\|$ and Theorem 1.32. Thus $u \in L^2(0, T; V) \cap L^\infty(0, T; H) \cap C(0, T; H_{\mathrm{weak}})$ as required. To see that it provides a solution, it is sufficient to show that

\[ (u(t), v) = (u_0, v) + \int_0^t \bigl[ -\nu ((u(s), v)) - b(u(s), u(s), v) + (f(s, u(s)), v) \bigr]\, ds \tag{2.10} \]
for all $t$ and $v \in V$. For this it is enough to consider $v = e_k$ for $k = 1, 2, 3, \dots$; that is

\[ u_k(t) = (u_0, e_k) + \int_0^t [-\nu \lambda_k u_k(s) - b(u(s), u(s), e_k) + f_k(s, u(s))]\, ds \]

where $f_k = (f, e_k)$. For this it is an elementary application of Loeb theory (after checking that all the relevant integrands are S-integrable) to show that

(1)
\[
\int_0^t \lambda_k U_k(\tau)\, d\tau
\approx \int_0^t \lambda_k u_k({}^\circ\tau)\, d_L\tau
= \int_0^t \lambda_k u_k(s)\, ds
= \int_0^t ((u(s), e_k))\, ds;
\]

(2)
\[
\int_0^t {}^*b(U(\tau), U(\tau), E_k)\, d\tau
\approx \int_0^t {}^\circ\,{}^*b(U(\tau), U(\tau), E_k)\, d_L\tau
= \int_0^t b(u({}^\circ\tau), u({}^\circ\tau), e_k)\, d_L\tau
= \int_0^t b(u(s), u(s), e_k)\, ds
\]
using the continuity properties of $b$;

(3)
\[
\int_0^t F_k(\tau, U(\tau))\, d\tau
\approx \int_0^t {}^\circ F_k(\tau, U(\tau))\, d_L\tau
= \int_0^t f_k(s, u(s))\, ds
\]
using the continuity property of $f$.

Putting all this together, using (2.5), shows that $u$ is indeed a solution to the Navier–Stokes equations.

Remark It is shown in [13] (Section 4.3) that all weak solutions to the Navier–Stokes equations can be obtained by this method if we allow an infinitesimal perturbation of the force term $F$ in (2.4).

2.2.1 Uniqueness

In the case $d = 2$, provided that $f$ is locally Lipschitz in $u$, it can be shown that the solution constructed above is unique; the technique is almost identical to the standard proof of uniqueness in dimension two. When $d = 3$, again assuming that $f$ is Lipschitz, the solution $U = u_N$ to the Galerkin approximation is unique, but note that this does not imply uniqueness of the solution to the Navier–Stokes equations. Possible sources of non-uniqueness (i.e. constructions that might give a different solution) are:
(a) make an infinitesimal perturbation of $U(0)$;
(b) make an infinitesimal perturbation of $F$;
(c) choose a different infinite value for $N$.
Any of these modifications in the construction might produce different solutions. The book [13] (Section 4.3) contains a fuller discussion of the information about the uniqueness problem that can be gleaned from the nonstandard method of solution.
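The Galerkin system (2.5) and the energy estimate derived from (2.6) and (2.7) can be tried out on a small finite-dimensional model. The sketch below is not the Navier–Stokes Galerkin system: the coefficient array `C`, the eigenvalues $\lambda_k = k^2$, the constant forcing and the use of a Runge–Kutta step are all assumptions made for illustration. The only structural feature kept from the text is that the toy trilinear form satisfies $b(u, v, w) = -b(u, w, v)$, so that $b(U, U, U) = 0$. With constant forcing, integrating (2.7) gives $|U(T)|^2 + \nu \int_0^T \|U\|^2\, d\tau \le |U(0)|^2 + (T/\nu)\, |F|_{V'}^2$, where $|F|_{V'}^2 = \sum_k F_k^2 / \lambda_k$.

```python
import math

N = 4                                           # number of modes (finite stand-in for N)
NU = 0.5                                        # viscosity
LAM = [float(k * k) for k in range(1, N + 1)]   # assumed eigenvalues lambda_k = k^2
FORCE = [1.0, 0.0, 0.0, 0.0]                    # assumed constant forcing F_k


def _a(i, j, k):
    return math.sin(1 + i + 2 * j + 3 * k)


# Hypothetical coefficients, antisymmetric in the last two indices, so that the
# toy form below satisfies b(u, v, w) = -b(u, w, v) and hence b(u, v, v) = 0.
C = [[[_a(i, j, k) - _a(i, k, j) for k in range(N)] for j in range(N)] for i in range(N)]


def b_form(u, v, w):
    """Toy trilinear form b(u, v, w) = sum_ijk C[i][j][k] u_i v_j w_k."""
    return sum(C[i][j][k] * u[i] * v[j] * w[k]
               for i in range(N) for j in range(N) for k in range(N))


def rhs(u):
    """Right-hand side of (2.5): -nu*lambda_k*U_k - b(U, U, E_k) + F_k."""
    out = []
    for k in range(N):
        b_k = sum(C[i][j][k] * u[i] * u[j] for i in range(N) for j in range(N))
        out.append(-NU * LAM[k] * u[k] - b_k + FORCE[k])
    return out


def rk4_step(u, dt):
    """One classical Runge-Kutta step for dU/dtau = rhs(U)."""
    k1 = rhs(u)
    k2 = rhs([x + 0.5 * dt * d for x, d in zip(u, k1)])
    k3 = rhs([x + 0.5 * dt * d for x, d in zip(u, k2)])
    k4 = rhs([x + dt * d for x, d in zip(u, k3)])
    return [x + dt * (p + 2 * q + 2 * r + s) / 6.0
            for x, p, q, r, s in zip(u, k1, k2, k3, k4)]


def h_norm_sq(u):
    return sum(x * x for x in u)                      # |U|^2


def v_norm_sq(u):
    return sum(l * x * x for l, x in zip(LAM, u))     # ||U||^2


def energy_check(u0, t_end=2.0, steps=2000):
    """Return (|U(T)|^2 + nu * int ||U||^2, |U(0)|^2 + (T/nu) * |F|_{V'}^2)."""
    dt = t_end / steps
    u = list(u0)
    integral = 0.0
    for _ in range(steps):
        new = rk4_step(u, dt)
        integral += 0.5 * dt * (v_norm_sq(u) + v_norm_sq(new))   # trapezoid rule
        u = new
    dual_sq = sum(f * f / l for f, l in zip(FORCE, LAM))
    return h_norm_sq(u) + NU * integral, h_norm_sq(u0) + t_end * dual_sq / NU


U0 = [1.0, -0.5, 0.3, 0.2]
lhs, bound = energy_check(U0)
print("energy side:", lhs, "bound:", bound)
```

The nonlinear term drops out of the energy balance for the same algebraic reason as in (2.6), which is why the bound does not depend on the particular coefficients chosen for `C`.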
2.3 Solution of the Stochastic Navier–Stokes Equations

Let us now return to the general stochastic Navier–Stokes equations

\[ du = [-\nu A u - B(u) + f(t, u)]\, dt + g(t, u)\, dw_t \tag{2.11} \]
in the Hilbert space setting, for which we seek solutions in the sense of Definition 2.1. We impose appropriate growth and continuity conditions on the coefficients $f, g$.

Attempts to solve these equations using the standard Galerkin approximation technique encounter not only the difficulties discussed at the beginning of Section 2.1.3 but new ones on account of the stochastic terms. It is straightforward to construct a sequence of solutions $u_n$ to the Galerkin approximations to equation (2.11), each of which is a stochastic process in $H_n$ carried by some probability space $\Omega_n$. The problems arise when looking for a convergent subsequence, and in deciding where to find its limit. It turns out$^{3}$ to be necessary to construct a new richer probability space $\Omega$ to accommodate a process $u$ that is in the appropriate sense a limit of a subsequence of the processes $u_n$ and their spaces $\Omega_n$. Then it is necessary to show that $u$ is a solution to (2.11) for some Wiener process $w$ on $\Omega$.

$^{3}$ The first proof of existence of solutions to the general stochastic Navier–Stokes equations (2.11) was given in [10] using the Loeb space methods described here. Later it was discovered how to construct the limit of the Galerkin approximations, using, among other techniques, the Skorohod embedding theorem to construct a richer space.

This difficult procedure is circumvented by the use of a Loeb space, which can be given (or constructed) in advance, carrying a prescribed Wiener process $w$ that is the standard part of an internal ${}^*$Wiener process $W$ on $H_N$. The richness of the Loeb space means that all constructions can be carried out without leaving this space. The solution lives here and uses the prescribed Wiener process. In this sense the Loeb space solutions are stronger than those that were subsequently constructed by standard limiting arguments.

The pattern of the solution is the same as for the deterministic case: first solve the Galerkin approximation on $H_N$ and then take standard parts, after working to show that the internal solution $U_N$ is nearstandard in an appropriate sense. The solution then lives on the same space $\Omega$ and is driven by the Wiener process that was given in advance.
There are (whatever the approach) a number of additional technicalities to take into account before the equations can be solved. To make sense of the stochastic term (i.e. to be able to define the Ichikawa integral), the Wiener process $w$ must have covariance $Q$, a nuclear (or trace class) operator on $H$. The Galerkin approximations living on $H_n$ will then have a stochastic term driven by a Wiener process $w_n$ with $n \times n$ covariance matrix $Q_n = \mathrm{Pr}_n Q\, \mathrm{Pr}_n$. Moving to the internal space $H_N$ for infinite $N$, the Galerkin approximation to (2.11) is the following internal $N$-dimensional SDE for a stochastic process $U(\tau, \omega) \in H_N$:

\[
\begin{aligned}
dU(\tau) &= [-\nu\, {}^*\!A\, U(\tau) - B(U(\tau)) + F(\tau, U(\tau))]\, d\tau + G(\tau, U(\tau))\, dW(\tau) \\
U(0) &= \mathrm{Pr}_N\, {}^*u_0
\end{aligned}
\tag{2.12}
\]

where $F$, $G$ are given by

\[ F(\tau, U) = \mathrm{Pr}_N\, {}^*f(\tau, U), \qquad G(\tau, U) V = \mathrm{Pr}_N\, {}^*g(\tau, U) V. \]

Let $\Omega_0 = (\Omega, A, (A_\tau)_{\tau \ge 0}, \Pi)$ be an internal filtered space carrying an internal Wiener process $W$ on $H_N$ with covariance $Q_N$.$^{4}$ The growth and continuity conditions imposed on $f$ and $g$ ensure (using the transfer of the standard theory of SDEs) that (2.12) has an internal solution $U(\tau, \omega)$ for all $\tau \in {}^*[0, \infty)$ on $\Omega_0$, and $U$ is adapted to $(A_\tau)_{\tau \ge 0}$. The transfer of Itô's formula gives
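\[
\begin{aligned}
|U(\tau)|^2 + 2\nu \int_0^\tau \|U(\sigma)\|^2\, d\sigma
&= |U(0)|^2 + 2 \int_0^\tau (F(\sigma, U(\sigma)), U(\sigma))\, d\sigma \\
&\quad + \int_0^\tau \operatorname{tr}\bigl[ G(\sigma, U(\sigma))\, Q_N\, G(\sigma, U(\sigma))^T \bigr]\, d\sigma
+ 2 \int_0^\tau (U(\sigma), G(\sigma, U(\sigma)))\, dW(\sigma),
\end{aligned}
\]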
which corresponds to the equation (2.6). From this, a rather technical argument involving the Burkholder–Davis–Gundy inequalities is used to establish the following counterpart of (2.8):

\[ E\left( \sup_{\tau \le T} |U(\tau)|^2 + \int_0^T \|U(\tau)\|^2\, d\tau \right) < \infty \tag{2.13} \]
for all finite $T$. The expectation here is with respect to the internal probability on $\Omega_0$. From this we have, almost surely with respect to the corresponding Loeb probability

\[ \sup_{\tau \le T} |U(\tau)|^2 + \int_0^T \|U(\tau)\|^2\, d\tau < \infty \tag{2.14} \]

for all finite $T$. S-continuity properties of internal stochastic integrals show that in addition, almost every path $U(\cdot, \omega)$ is weakly S-continuous for finite $\tau$. This allows the construction of a standard process $u(t, \omega)$ on the Loeb space as in the deterministic case: for a.a. $\omega$

\[ u(t, \omega) = {}^\circ U(t, \omega) = {}^\circ U(\tau, \omega) \quad \text{for any } \tau \approx t. \]

$^{4}$ For example we may take the canonical process on the space of ${}^*$continuous paths in $H_N$.

Theorem 2.3 The process $u(t, \omega)$ is a solution to the stochastic Navier–Stokes equations (2.11).

Proof (Sketch) It is clear that this process has the property 2.1(i) required of a solution. To see that it satisfies the stochastic integral equation (2.11) proceed just as in the deterministic case for the terms in the deterministic (Bochner) integral. For the stochastic term it is necessary to establish that for a.a. $\omega$

\[ {}^\circ\!\int_0^T G(\tau, U(\tau, \omega))\, dW(\tau) = \int_0^T g(t, u(t, \omega))\, dw_t \]

for all finite $T$. This is achieved by extending to the Ichikawa integral the theory initiated by Anderson and developed by Hoover, Perkins and Lindstrøm [57, 68] relating internal stochastic integrals to standard stochastic integrals, as discussed briefly in Lecture 1 (Theorem 1.43).

An easy extension of the above method provides a solution to the stochastic Navier–Stokes equations when the initial condition is random (and independent of the Wiener process $w$).
2.3.1 Stochastic flow

It was shown in [12] (see also [13] Section 6.3) that in two dimensions a stochastic flow of solutions to the stochastic Navier–Stokes equations can be constructed using the above methods, provided that the noise $g$ is homogeneous in time, linear in $u$ and is orthogonal to the velocity, i.e. $(g(u), u) = 0$. A stochastic flow of solutions is a single measurable function $\varphi : [0, \infty) \times H \times \Omega \to H$ such that $\varphi(\cdot, \cdot, \omega)$ is continuous for a.a. $\omega$, and for each fixed $u_0 \in H$ the process $u(t, \omega) = \varphi(t, u_0, \omega)$ is a solution of the stochastic Navier–Stokes equations (2.11) with initial condition $u(0) = u_0$. This will be discussed further in Section 2.8 below.
2.3.2 Nonhomogeneous stochastic Navier–Stokes equations

In a recent extension of the methods described above, Brendan Enright [48] has solved a new general version of the non-homogeneous stochastic Navier–Stokes equations: these describe the time evolution of the velocity of a fluid for which the density is not assumed to be constant either in space or time. The system (2.2) is augmented by an additional equation for the density $\rho(t, x)$ and the density itself figures in the equation for the velocity. The equations become

\[
\begin{aligned}
\rho\, du &= [\nu \Delta u - \rho \langle u, \nabla \rangle u - \nabla p + \rho f(t, u)]\, dt + \rho\, g(t, u)\, dw_t \\
\frac{\partial \rho}{\partial t} + \langle u, \nabla \rangle \rho &= 0 \\
\operatorname{div} u &= 0.
\end{aligned}
\tag{2.15}
\]

As in the deterministic case a weak solution $u(t, \omega) \in H$ is sought, so that the pressure term disappears initially. There are considerable complications as compared with (2.2), due to the extra density equation, and the solutions constructed have almost all paths $u(\cdot, \omega)$ in the space $L^\infty(0, T; H) \cap L^2(0, T; V)$. However it turns out that we may take $\rho(t, \cdot) u(t, \omega)$ to be weakly continuous in $t$. The density $\rho(\cdot, \cdot, \omega)$ is in $L^\infty([0, T] \times D)$ almost surely.

Even in the deterministic case ($g = 0$) the nonhomogeneous equations are much harder to solve. The standard approach [1] based on the original proof by Kazhikov in 1974 uses the Galerkin approximation technique, but showing the existence of a limit involves many tedious calculations. In his thesis [48] Enright shows how the nonstandard framework allows considerable simplification for the deterministic non-homogeneous Navier–Stokes equations. He goes on to develop these techniques for the stochastic non-homogeneous equations with quite general forces and feedback of the solution. The results generalise earlier work on the stochastic nonhomogeneous equations by Yashima [100], who solved the equation (2.15) with additive noise, which is essentially the case $g = 1$ in (2.15).
2.4 Stochastic Euler Equations

The Euler equations for an ideal fluid are simply the Navier–Stokes equations with the viscosity $\nu = 0$. They are considerably harder to solve than the Navier–Stokes equations, where the presence of the Laplacian term greatly enhances the regularity of solutions. In particular, even in two dimensions uniqueness is a problem for the Euler equations. The Loeb space methods used for the stochastic Navier–Stokes equations have been used nevertheless to establish some results for the Euler equations, which we outline in this section. For details see the paper [16]. The stochastic Euler equations considered take the form

\[ du = [-B(u) + f(t, u)]\, dt + g(t, u)\, dw_t \tag{2.16} \]
(together with the incompressibility condition $\operatorname{div} u = 0$, which is implicit since we continue to work in the function spaces $H$, $V$ as before). Attention is restricted to two dimensions with periodic boundary conditions in the space variable (so the domain of the velocity field $u$ can be identified with the torus). This gives the additional property

\[ b(u, u, Au) = 0 \]

which is used to overcome problems caused by the absence of the Laplacian. The main result proved for this equation in [16] is the following.

Theorem 2.4 Suppose $u_0 \in V$ and

\[ f : [0, \infty) \times V \to V, \qquad g : [0, \infty) \times V \to L(H, V) \]

are jointly measurable with appropriate continuity and growth conditions. Then the stochastic Euler equations (2.16) have a solution $u$ with almost all trajectories in $C(0, T; V_{\mathrm{weak}}) \cap L^\infty(0, T; V)$ for all $T$, and satisfying

\[ E\Bigl( \sup_{t \le T} \|u(t)\|^2 \Bigr) < \infty. \tag{2.17} \]
The proof is very similar to the proof of the main existence theorem for the stochastic Navier–Stokes equations (Theorem 2.3).

The paper [16] also derives information about the connection between solutions to the stochastic Euler equations constructed by Loeb space methods, and the solution to the stochastic Navier–Stokes equations as the viscosity $\nu \to 0$. To describe this, first fix a Loeb space used in the construction of the solution to the stochastic Navier–Stokes equations as in Theorem 2.3; this carries a Wiener process $w$ with covariance $Q$ (constructed from an internal ${}^*$Wiener process $W$ on $H_N$). Denote by $u_\nu$ the solution to the stochastic Navier–Stokes equations given by Theorem 2.3, with $\nu$ now a parameter that can vary. The proof of Theorem 2.4 shows that it also makes sense to define $u_\nu$ for infinitesimal $\nu$. For any $\nu$ let $\mu_\nu$ be the law of the process $u_\nu$. Then we have:

Theorem 2.5
(a) If $\nu \approx 0$ then $u_\nu$ solves the stochastic Euler equations.
(b) The set $\{\mu_\nu : \nu \in (0, 1]\}$ is relatively compact in the weak topology on the space of probability measures.
(c) If $\nu_n \to 0$ and $\mu_{\nu_n} \to \mu$, then $\mu$ is the law of a solution to the stochastic Euler equation (2.16) on the Loeb space $\Omega$ with the same driving Wiener process $w$. In fact, for all sufficiently small infinite $N$ the process $u_{\nu_N}$ is a solution to (2.16) with law $\mu$.
2.5 Statistical Solutions

Before discussing the way in which Loeb measures help in constructing statistical solutions for the Navier–Stokes equations we introduce the notion of a statistical solution in a general setting.

2.5.1 The Foias equation

Consider the following abstract evolution equation taking place in a normed space $H$

\[ \frac{d}{dt} u(t) = F(t, u(t)), \qquad t > 0. \tag{2.18} \]

Suppose that for each initial value $v \in H$ there is a unique solution $u(t)$ with $u(0) = v$. Denote this solution by $u(t) = S(t, v)$ to emphasize the dependence on the initial function. Suppose now that the initial value is a random variable $v : \Omega \to H$. This random variable induces a probability measure $\mu_0$ on $H$ by

\[ \mu_0(A) = P(v \in A) \]

where $P$ is the given probability on $\Omega$. The function $S(t, v(\omega))$ is then a stochastic process with initial distribution $\mu_0$. The probability distributions $\mu_t$ of the random variables $S(t, \cdot)$ are the measures on $H$ given by

\[ \mu_t(A) = P(S(t, v) \in A) = \mu_0(\{v : S(t, v) \in A\}). \]

In other words, $(\mu_t)_{t \ge 0}$ is the family of time evolving measures on $H$ induced by the equation (2.18) from the given initial measure $\mu_0$. The idea behind statistical solutions to (2.18) is to find an equation that describes the time evolution of $\mu_t$. Note that to describe each measure $\mu_t$ it is sufficient to characterize the integrals

\[ \int_H \theta(u)\, d\mu_t(u) \tag{2.19} \]

for a sufficiently broad class of test functions $\theta$. Computing the time derivative of (2.19) heuristically, assuming that $\theta' \in H$ and drawing on equation (2.18) gives:

\[
\begin{aligned}
\frac{d}{dt} \int_H \theta(u)\, d\mu_t(u)
&= \frac{d}{dt} \int_H \theta(S(t, v))\, d\mu_0(v) && \text{(by the definition of } \mu_t) \\
&= \int_H (\theta'(S(t, v)), F(t, S(t, v)))_H\, d\mu_0(v) && \text{(from (2.18))} \\
&= \int_H (\theta'(u), F(t, u))_H\, d\mu_t(u) && \text{(by definition of } \mu_t \text{ again)}.
\end{aligned}
\]
After integrating from $0$ to $t$ we obtain the so-called Foias equation corresponding to the original evolution equation (2.18):

\[ \int_H \theta(u)\, d\mu_t(u) - \int_H \theta(u)\, d\mu_0(u) = \int_0^t \int_H (\theta'(u), F(s, u))_H\, d\mu_s(u)\, ds. \tag{2.20} \]
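The heuristic derivation of the Foias equation (2.20) can be checked numerically in a case where the solution operator $S$ is known in closed form. The sketch below is an illustration under stated assumptions, none of which come from the text: the space is the real line, the evolution equation is $du/dt = -u$ (so $S(t, v) = v e^{-t}$), the test function is $\theta(u) = u^2$, and $\mu_0$ is the uniform measure on five arbitrarily chosen points. The function names are hypothetical.

```python
import math


def flow(t, v):
    """Solution operator S(t, v) of the assumed equation du/dt = -u."""
    return v * math.exp(-t)


def drift(u):
    """Right-hand side F(u) = -u of the assumed equation."""
    return -u


def theta(u):
    """Test function theta(u) = u^2."""
    return u * u


def dtheta(u):
    """Derivative theta'(u) = 2u."""
    return 2.0 * u


# mu_0 is taken to be the uniform measure on these points (an arbitrary choice).
initial_points = [-1.5, -0.2, 0.4, 1.0, 2.3]


def integrate_mu(func, t):
    """Integral of func against mu_t, the image of mu_0 under S(t, .)."""
    return sum(func(flow(t, v)) for v in initial_points) / len(initial_points)


def foias_residual(t, n=2000):
    """Left side minus right side of (2.20), with a trapezoid rule in time."""
    lhs = integrate_mu(theta, t) - integrate_mu(theta, 0.0)

    def inner(s):
        return integrate_mu(lambda u: dtheta(u) * drift(u), s)

    h = t / n
    rhs = h * (0.5 * inner(0.0) + sum(inner(j * h) for j in range(1, n)) + 0.5 * inner(t))
    return lhs - rhs


print("Foias residual at t = 1:", foias_residual(1.0))
```

The residual is zero up to the quadrature error of the time integral, and shrinks as `n` grows.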
Definition 2.6 Any solution $(\mu_t)_{t \ge 0}$ to the Foias equation is called a statistical solution to the equation (2.18).

Note that the above derivation of the Foias equation requires that (2.18) have the uniqueness property; otherwise the function $S$ does not exist. However, $S$ does not occur in the Foias equation, so as an abstract equation for the time evolution of a family of measures it makes sense even when the underlying equation does not have a unique solution. This is the crucial point that was observed by Foias. Thus, in the case of the Navier–Stokes equations in 3 dimensions, since it is not known whether there is a unique solution, the above derivation does not make sense; however, the end result (the corresponding Foias equation) does make sense and it is possible to find statistical solutions.

The Foias equation for the Navier–Stokes equations takes the form

\[
\int_H \theta(u)\, d\mu_t(u) = \int_H \theta(u)\, d\mu(u)
+ \int_0^t \int_H \bigl[ -\nu ((u, \theta'(u))) - b(u, u, \theta'(u)) + (f(s, u), \theta'(u)) \bigr]\, d\mu_s(u)\, ds
\]
There is a corresponding notion of statistical solution for the stochastic Navier–Stokes equations which we will discuss later (see Section 2.5.5).

2.5.2 Construction of statistical solutions using Loeb measures

Loeb measures allow a simple construction of statistical solutions for the deterministic Navier–Stokes equations by utilising the uniqueness property$^{5}$ for the Galerkin equation in $H_N$. The steps are as follows.

Step 1. For fixed $V \in H_N$ with $|V|$ finite, solve the Galerkin equation (2.4) in $H_N$ with initial condition $U(0) = V$, and follow the rest of the proof of Theorem 2.2 to see that it provides a solution $u(t)$ to the Navier–Stokes equations with initial condition $u(0) = {}^\circ V$. Write $u(t) = S_t V$ for this solution, to indicate its dependence on $V$.

Step 2. Let $M = {}^*\mu_0 \circ \mathrm{Pr}_N^{-1}$, which is an internal probability measure on $H_N$, and consider the corresponding Loeb measure $M_L$.

Step 3. Define a family of measures $\mu_t$ on $H$, for $t \ge 0$, by

\[ \mu_t(A) = M_L(\{V \in H_N : S_t V \in A\}). \]

Step 4. Carry out Foias' heuristic argument in $H_N$, using the fact that we have uniqueness of solutions in $H_N$, and combine with Loeb space techniques, to show that $\mu_t$ is a statistical solution to the Navier–Stokes equations.

$^{5}$ If necessary, to give uniqueness, we make an infinitesimal modification of $F$ so that it is ${}^*$Lipschitz in $U$.

A minor variation on this approach would be to write $U(\tau) = T_\tau V$ to denote the internal solution to the Galerkin approximation on $H_N$ and let $M_\tau$ be the corresponding internal family of measures induced on $H_N$ by means of $T_\tau$ with $M_0 = M$. Foias' argument shows that $(M_\tau)_{\tau \ge 0}$ is an internal statistical solution (i.e. solves the Foias equation corresponding to the Galerkin equation on $H_N$). Now define $\mu_t(A) = (M_t)_L \circ \mathrm{st}^{-1}(A)$ for Borel $A \subseteq H$; it follows quite easily that this is a statistical solution.

2.5.3 Measures by nonstandard densities

An alternative Loeb space technique for constructing statistical solutions involves the use of nonstandard densities, and is particularly pleasant when one considers the stochastic Navier–Stokes equations. Measures on infinite dimensional spaces cannot be described by densities due to the lack of Lebesgue measure as a reference measure. However, the hyperfinite space $H_N$ carries the nonstandard Lebesgue measure, since it can be identified with ${}^*\mathbb{R}^N$. Thus we can introduce nonstandard densities here, and they turn out to be quite useful.

Definition 2.7 An internal function $\Phi : H_N \to {}^*\mathbb{R}$ is a nonstandard density of the probability measure $\mu$ on the Hilbert space $H$ if $\Phi$ is non-negative, ${}^*$integrable with respect to ${}^*$Lebesgue measure on ${}^*\mathbb{R}^N$, with $\int \Phi(U)\, dU = 1$, and $\mu(B) = M_L(\mathrm{st}_H^{-1}(B) \cap H_N)$, where $M_L$ is the Loeb measure corresponding to the internal measure $M$ on $H_N$ given by

\[ M(A) = \int_A \Phi(U)\, dU. \]
To see that nonstandard densities exist for Borel probabilities, let N (X, C) denote the normal density on ∗RN with mean X and covariance C, and we have: Proposition 2.8 The function Φ
N (U − PrN v, ε2 · I) d ∗µ(v), Φ(U ) = ∗H
2.5 Statistical Solutions
45
is a nonstandard density of µ. Moreover
2 |U | dM (U ) ≤ |u|2 dµ(u) + N ε2 .
2.5.4 Construction of statistical solutions using nonstandard densities

We first heuristically derive an equation for the densities of solutions to the Foias equation (2.21). Suppose that Φ(t, U) is a density of µt so that for a test functional θ we have

\[ \int \theta(u)\, d\mu_t(u) \approx \int {}^*\theta(U)\, \Phi(t, U)\, dU. \]

Then it is natural to rewrite the Foias equation replacing θ′ by the vector (∂∗θ/∂Uk), dµt by Φ dU, etc. Then after integration by parts and dropping θ from both sides we obtain

\[ \frac{\partial}{\partial \tau}\Phi(\tau, U) + \sum_{k=1}^{N} \frac{\partial}{\partial U_k}\Big[\big(-\nu\lambda_k U_k - {}^*b(U, U, E_k) + F_k(\tau, U)\big)\cdot \Phi(\tau, U)\Big] = 0 \tag{2.21} \]

which is now a nonstandard equation with τ ∈ ∗[0, T] and U ∈ HN; we call it the density equation (for the Navier–Stokes equations). It is in fact a hyperfinite version of the Liouville equation. The next result shows how its solution provides statistical solutions.

Theorem 2.9 Let µ be a Borel probability measure on H satisfying

\[ \int |u|^2\, d\mu(u) < \infty. \]

Let Φ0 be a nonstandard density of µ with

\[ \int |U|^2\, \Phi_0(U)\, dU < \infty. \]

Let Φ(τ, U) be the solution to the density equation (2.21) with initial function Φ0. Then the internal measures Mτ determined by Φ(τ) are nearstandardly concentrated and the standard family of measures given by

\[ \mu_t = (M_t)_L \circ \mathrm{st}^{-1}, \qquad t \in [0, T], \]

is a statistical solution of the Navier–Stokes equation with initial measure µ.

Proof (Sketch) The density equation can be solved by the method of characteristics. By reversing the derivation of the density equation, it can be seen that the measures Mτ solve the Foias equation on HN – and it follows as before that the family µt is a statistical solution.
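As a finite-dimensional illustration of the method of characteristics mentioned in the proof sketch, the following Python sketch treats a one-dimensional toy analogue of the density equation, with drift v(U) = −aU + F. The constants A and F, the Gaussian initial density and all function names are assumptions made for this example only; it is not the construction on HN, merely the same Liouville mechanism in dimension one.

```python
import math

A = 0.5  # stands in for nu * lambda_k (assumed toy value)
F = 1.0  # constant forcing term (assumed toy value)


def flow(tau, u0):
    """Characteristic through u0: the solution of dU/dtau = -A*U + F."""
    return F / A + (u0 - F / A) * math.exp(-A * tau)


def backflow(tau, u):
    """Inverse of flow(tau, .): where the characteristic ending at u started."""
    return F / A + (u - F / A) * math.exp(A * tau)


def phi0(u):
    """Initial density (standard normal, chosen only for illustration)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def phi(tau, u):
    """Solution of d(phi)/d(tau) + d(v*phi)/dU = 0 with v(U) = -A*U + F.

    Along a characteristic, phi grows by exp(A*tau) because div v = -A.
    """
    return phi0(backflow(tau, u)) * math.exp(A * tau)


def integrate(g, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def residual(tau, u, h=1e-4):
    """Central-difference residual of the density equation at (tau, u)."""
    def flux(x):
        return (-A * x + F) * phi(tau, x)
    d_tau = (phi(tau + h, u) - phi(tau - h, u)) / (2.0 * h)
    d_u = (flux(u + h) - flux(u - h)) / (2.0 * h)
    return d_tau + d_u


if __name__ == "__main__":
    print("total mass at tau = 1:", integrate(lambda u: phi(1.0, u), -20.0, 20.0))
    print("residual at (1, 0.5):", residual(1.0, 0.5))
```

In this toy case the measure with density phi(tau, ·) is exactly the initial measure transported along the flow, which is the one-dimensional shadow of the statement that the measures Mτ solve the Foias equation.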
2.5.5 Statistical solutions for stochastic Navier–Stokes equations

The heuristic argument for the Foias equation for the Navier–Stokes equations can be carried through for the stochastic Navier–Stokes equations, resulting in an extra term as compared with equation (2.21):

\[ \int_H \theta(u)\, d\mu_t(u) - \int_H \theta(u)\, d\mu_0(u) = \int_0^t \!\! \int_H \Big[ -\nu((u, \theta'(u))) - b(u, u, \theta'(u)) + (f(s, u), \theta'(u)) + \tfrac{1}{2}\,\mathrm{tr}\big(Q\, g^T(s, u)\, \theta''(u)\, g(s, u)\big) \Big]\, d\mu_s(u)\, ds \tag{2.22} \]

(here g^T denotes the adjoint of g). Once we have a solution u(t, ω) to the stochastic Navier–Stokes equations (2.11) with random initial condition u(0, ω) distributed according to the initial measure µ0, then it is easy to see that a statistical solution is given by µt(A) = P(u(t, ω) ∈ A).

Alternatively, nonstandard densities give a way to find statistical solutions. The advantage in this case is that there is no need for any stochastic analysis to solve (2.22) – after all, this is a deterministic equation for the evolution of a family of probability measures, so it might be expected that a solution can be found without the need for the development of the stochastic integral in H. The density equation is the same as the deterministic case (2.21) except for an additional second order term on the left:

\[ -\sum_{i,j=1}^{N} \frac{\partial^2 (\gamma_{ij}\Phi)}{\partial U_i\, \partial U_j} \]

where γ(τ, U) = ½ G(τ, U) QN G^T(τ, U). Although it is a little more complicated, the construction of a statistical solution to the stochastic Navier–Stokes equations by first solving the density equation proceeds along the same lines as the deterministic case.
2.6 Attractors for Navier–Stokes Equations

2.6.1 Introduction

The notion of attractor is concerned with the asymptotic behaviour of trajectories of semigroups of operators. Recall that a semigroup on a topological space X is a one parameter family (S(t))t≥0 of operators with S(t) : X → X, satisfying

1) S(0) = idX,
2) S(t + s) = S(t) ◦ S(s).

A set A ⊆ X is an attractor for S if

1) it is invariant, that is, S(t)A = A for all t;
2) there is an open neighbourhood U of A, called the basin of attraction, such that for all x ∈ U, S(t)x → A (in the sense that for each open neighbourhood V of A, S(t)x ∈ V for t sufficiently large).

If 2) holds with U = X, then A is a global attractor. Condition 1) is trivially fulfilled for A = ∅, while 2) holds for A = X, so the interest lies in having both conditions fulfilled simultaneously. Variations of this definition are possible and occur in the literature – for example, requirement 2) can be replaced by the stronger condition that A attracts sets from a certain class ℬ, i.e. for all B ∈ ℬ, S(t)B ⊆ V for sufficiently large t. An example of such a class is where ℬ is the family of all bounded sets (assuming now that X is metric).

For an evolution equation a semigroup S(t) is defined on the phase space by: S(t)x is the solution to the equation in question with the initial value x ∈ X. To get the semigroup property it is necessary that the equation be homogeneous, that is, the coefficients must be independent of time.

Here we are concerned with existence of attractors for the Navier–Stokes equations. For deterministic Navier–Stokes equations, the existence of a global attractor in dimension 2 goes back to the work of Ladyzhenskaya [65] and Foias & Temam; for a full exposition see Chapter III (sec. 2) of Temam’s book [95]. The restriction to dimension 2 is because it is only in that case that the semigroup S(t) exists for the Navier–Stokes equations – in higher dimensions the question of uniqueness is still open. The nonstandard method of solution described in Section 2.2 above does give uniqueness of solution at the level of HN even for dimension 3; this was exploited in [14] to give a kind of attractor, and is discussed below (section 2.6.3).
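To make the definitions of semigroup and global attractor concrete, here is a minimal Python sketch for a scalar toy equation, dx/dt = x − x³, whose solution operator has a closed form and whose global attractor is the interval [−1, 1]. The equation, the function names and the sample points are assumptions chosen for illustration; they are unrelated to the Navier–Stokes setting except through the definitions.

```python
import math


def S(t, x):
    """Solution operator of dx/dt = x - x**3 for t >= 0 (closed form)."""
    decay = math.exp(-2.0 * t)
    return x / math.sqrt(decay + x * x * (1.0 - decay))


def dist_to_A(x):
    """Distance from x to the candidate global attractor A = [-1, 1]."""
    return max(0.0, abs(x) - 1.0)


if __name__ == "__main__":
    x = 2.0
    # Semigroup property: S(t + s) = S(t) o S(s).
    print("S(1.1)x       =", S(1.1, x))
    print("S(0.7)S(0.4)x =", S(0.7, S(0.4, x)))
    # Invariance: the endpoints of A are fixed and S(t) is increasing in x,
    # so S(t) maps [-1, 1] onto itself.
    print("S(3)(1), S(3)(-1) =", S(3.0, 1.0), S(3.0, -1.0))
    # Attraction: points outside A approach A as t grows.
    for t in (0.0, 1.0, 5.0, 10.0):
        print("t =", t, " dist(S(t)5, A) =", dist_to_A(S(t, 5.0)))
```

Note that the unstable fixed point 0 lies inside A: a global attractor contains all bounded complete trajectories, not only the stable equilibria.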
The new difficulties encountered when seeking attractors for the stochastic equations are twofold. First there is a problem with the very definition of attractors for stochastic equations, since the noise is not homogeneous in time – see the discussion below. Second, for the stochastic Navier–Stokes equations there is the issue of existence of solutions to the equations themselves – particularly existence of the stochastic equivalent of a semigroup of solutions. These difficulties can be tackled in various ways. In [19], for example, Crauel & Flandoli use the notion of a cocycle, a generalization of the semigroup idea to non homogeneous situations. They then formulate the attraction property in a special way, by introducing stochastic attractors as random sets that are stationary and attract at finite time the trajectories started at −∞, after the equations (and their
solution) have been suitably extended to the whole real line. For this approach to make sense, however, one needs a stochastic flow. This has been constructed for Navier–Stokes equations with a special form of noise (referred to in section 2.3.1 above). In a recent paper [17] it was shown that this flow gives a cocycle, and the first general existence result for stochastic attractors for Navier–Stokes equations was obtained. We enlarge on this in section 2.8.

For general time-homogeneous stochastic Navier–Stokes equations of the form

\[ du = (\nu\Delta u - \langle u, \nabla\rangle u + f(u))\,dt + g(u)\,dw_t \tag{2.23} \]

an alternative approach is to study the time evolution of the initial measure, that is, the probability distribution of the initial value. In other words, this is to consider attractors for statistical solutions to the stochastic Navier–Stokes equations. Such attractors are called measure attractors. In dimension 2, where there is global uniqueness for statistical solutions, this evolution of measures gives a semigroup on the space M1(H) of probability measures defined on the underlying Hilbert space. The results obtained for measure attractors using Loeb space methods are described in Section 2.7 below.

2.6.2 Nonstandard attractors and standard attractors

In this section we outline a general nonstandard approach to attractors that underlies all of the applications to the Navier–Stokes equations below. In the general setting described above, suppose that X is now a Banach space with norm | · | and there is a norm-bounded set E that is an absorbing set, defined as follows.

Definition 2.10 The set E is an absorbing set if for any bounded set B there is t0 > 0 such that St B ⊆ E whenever t ≥ t0.

If there is a bounded absorbing set E then a simple way to construct an attractor is as follows. First note that ∗E is S-absorbing – meaning that for any finitely bounded set B ⊂ ∗X there is a finite time τ0 such that ∗Sτ B ⊆ ∗E for all τ ≥ τ0. This follows by transfer of the absorbing property of E.

Now write Tτ for ∗Sτ and define the internal set C by

\[ C = \bigcup_{\tau\ \text{infinite}} T_\tau\,{}^*E = \bigcap_{n\in\mathbb{N}}\ \bigcup_{\tau \ge n} T_\tau\,{}^*E \]

(the equivalence follows by an application of ℵ1-saturation). Then it follows easily that C is a global S-attractor, by which we mean that C has the three properties noted in the following theorem.

Theorem 2.11 (a) C is a countable intersection of internal sets; in fact

\[ C = \bigcap_{n\in\mathbb{N}} C_n \quad\text{where}\quad C_n = \bigcup_{\tau \ge n} T_\tau\,{}^*E. \]
(b) Tτ C = C for all finite τ ; (c) For each n ∈ N and finitely bounded set B ⊂ ∗X there is t0 ∈ [0, ∞) such that T (τ )B ⊆ Cn for all τ ≥ t0 . The set C is bounded and hence weakly nearstandard in ∗X, so we may take the standard part in the weak topology A = w-stC = {w-stx : x ∈ C} and this is a weakly compact subset of X. It is now quite straightforward to see that A is a global attractor for the semigroup St , given certain natural continuity assumptions. The proof draws on the fact that C is an S-attractor. This idea for constructing attractors is adapted to give the particular applications to the Navier–Stokes equations that are described in the following sections. 2.6.3 Attractors for 3-dimensional Navier–Stokes equations In three dimensions, because of possible non-uniqueness, some alternative ideas are needed to discuss attractors for Navier–Stokes equations. The radical approach adopted by Sell [92] was to work with the entire set of solutions as phase space, and then define the semigroup St as translation of the solution in time by t. In the paper [14] the above nonstandard ideas were used, for 3-dimensional Navier–Stokes equations, but starting from the internal semigroup Tτ on HN which gives solutions to the Galerkin equation (2.4) on HN (this was mentioned in 2.5.2) – which exists because of uniqueness at this level. For simplicity the force f was taken to be constant. Briefly, the idea is as follows. The techniques used to give an attractor in dimension 2 can be applied to the Galerkin approximation on HN in dimension 3 to give a ball B(ρ) of finite radius in HN that is a global Sattractor for the semigroup Tτ . Then define C ⊂ HN and A = w-stC (the weak standard part) as above, and we have a weakly compact subset of H, which is an attractor in some sense for the 3-dimensional Navier–Stokes equations. To be more precise, we can define two multi-valued semiflows
\[ S_t u = \{\text{w-st}(T_t U) : \text{w-st}\,U = u,\ |U| < \infty\}, \]

\[ \hat{S}_t u = \begin{cases} \text{w-st}\{T_t U : \mathrm{st}(U) = u\} & \text{if } u \notin A, \\ \text{w-st}\{T_t U : \text{w-st}(U) = u,\ U \in C\} & \text{if } u \in A. \end{cases} \]

A sample of the results obtained for these semiflows is as follows, where An = w-st Cn:

Theorem 2.12 (a) S_{t1+t2} u ⊆ S_{t1}(S_{t2} u), and similarly for Ŝ; (b) A ⊆ St A for all t; (c) A = Ŝt A for all t; (d) for each n and bounded set B ⊂ H there is t0 such that for t ≥ t0, Ŝt B ⊆ An; (e) for each weakly open set O ⊇ A and bounded set B ⊂ H there is t0 such that for t ≥ t0, Ŝt B ⊆ O.

As is to be expected, the results are less pleasing than in dimension 2. Rather more satisfactory results were obtained for an approach that uses small initial pieces of trajectories of solutions as phase space – this is an idea that is intermediate between the above and that of Sell. For full details, and further variations on this theme, see [14].
2.7 Measure Attractors for Stochastic Navier–Stokes Equations

As noted above, the notion of measure attractor is the most appropriate for the general time-homogeneous stochastic Navier–Stokes equations in two dimensions

\[ du = (\nu\Delta u - \langle u, \nabla\rangle u + f(u))\,dt + g(u)\,dw_t. \tag{2.24} \]

A measure attractor is, roughly speaking, an attractor for the Foias equation. The precise definition is given below. Measure attractors for the Navier–Stokes equations were first studied by B. Schmalfuß [89], [90] (see also [75], which deals with a general class of equations that does not include the Navier–Stokes equations). A measure attractor for statistical solutions will be a subset of the set M1(H) of Borel probabilities on H, viewed as a subset of the space of Borel measures M(H) on H (which we equip with the topology of weak convergence). The paper [15], on which the present section is based, establishes results on existence of measure attractors for Navier–Stokes equations that are more general than those obtained by Schmalfuß.

Here are brief details. First, the phase space (where all the activity takes place) is the subset of M1(H) given by

\[ X = \Big\{\mu \in M_1(H) : \int |u|^2\, d\mu(u) < \infty\Big\}. \]

Write

\[ B^r = \Big\{\mu \in X : \int |u|^2\, d\mu(u) \le r\Big\}. \]

These sets play the rôle of bounded sets in the discussion of attractors in the introduction above. The semigroup St on X is defined as the evolution of the (unique) statistical solution to the Navier–Stokes equations in two dimensions. That is, if (µt)t≥0 is the statistical solution with µ0 = µ then St µ = µt. An alternative description is that St µ = µt is defined by

\[ \int_H \vartheta(u)\, d\mu_t(u) = \int_H E\,\vartheta(v(t, u))\, d\mu(u) \]

for any continuous bounded ϑ, where v(t, u) is the solution to the stochastic Navier–Stokes equations with initial condition u. In other words, drawing on the uniqueness property we transport the initial measure along the trajectories of the solution, to give the semigroup St.

Loeb space methods are used to give a simple proof of continuity of the function u → Eϑ(v(t, u)) (using the nonstandard construction of solution to the stochastic Navier–Stokes equations, as in Section 2.3). This is needed to show that for any r > 0 and t ≥ 0 the set St B^r is compact. The next step is to show that there is ρ > 0 such that B^ρ absorbs the sets B^r. Having done this, a measure attractor is obtained as described in Section 2.6.2.

Theorem 2.13 Write Tτ for ∗Sτ and define the internal set C ⊂ ∗X by

\[ C = \bigcup_{\tau\ \text{infinite}} T_\tau\,{}^*B^\rho = \bigcap_{n\in\mathbb{N}}\ \bigcup_{\tau \ge n} T_\tau\,{}^*B^\rho \]

and let A = st C. Then A is a measure attractor for the stochastic Navier–Stokes equations (2.24). That is, (a) A is weakly compact; (b) St A = A for all t; (c) for each open set O ⊇ A, and for each r > 0, St B^r ⊆ O for all sufficiently large t.
Remarks The above is a slightly simplified account of the main results of the paper [15]. There are two natural weak topologies that may be considered on M1 (H) and the above holds for each – but with slightly differing requirements on the growth and continuity of the coefficients f, g.
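The idea of a semigroup acting on measures, with moment-bounded sets B^r absorbed by a fixed B^ρ, can be seen in miniature in the following Python sketch. It uses a scalar Ornstein–Uhlenbeck equation dX = −X dt + σ dW in place of the Navier–Stokes equations and restricts attention to Gaussian initial laws, which stay Gaussian, so a law is just a pair (mean, variance). The equation, the constant SIGMA and all function names are assumptions for illustration only.

```python
import math

SIGMA = 1.0  # noise intensity in the toy equation dX = -X dt + SIGMA dW (assumed)


def S(t, mu):
    """Evolve a Gaussian law mu = (mean, variance) for time t >= 0."""
    mean, var = mu
    decay = math.exp(-2.0 * t)
    return (mean * math.exp(-t), var * decay + 0.5 * SIGMA ** 2 * (1.0 - decay))


def second_moment(mu):
    """Integral of |u|^2 with respect to the Gaussian law mu."""
    mean, var = mu
    return mean * mean + var


# The invariant law; in this toy model the measure attractor is this single point.
STATIONARY = (0.0, 0.5 * SIGMA ** 2)


def absorption_time(r, rho):
    """A time after which every law with second moment <= r has second moment <= rho.

    Uses m(t) = m(0) * exp(-2t) + (SIGMA^2 / 2) * (1 - exp(-2t)), so rho must
    exceed SIGMA^2 / 2; r is assumed positive.
    """
    gap = rho - 0.5 * SIGMA ** 2
    if gap <= 0.0:
        raise ValueError("rho must exceed SIGMA**2 / 2")
    return max(0.0, 0.5 * math.log(r / gap))


if __name__ == "__main__":
    mu = (10.0, 3.0)
    print("semigroup check:", S(1.1, mu), S(0.7, S(0.4, mu)))
    print("absorption time for r = 100, rho = 1:", absorption_time(100.0, 1.0))
    print("law at t = 20:", S(20.0, mu), " stationary law:", STATIONARY)
```

The second-moment formula in absorption_time is the toy counterpart of an energy estimate: it is what makes a fixed set B^ρ absorb every B^r.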
2.8 Stochastic Attractors for Navier–Stokes Equations

Consider the stochastic Navier–Stokes equations in 2 dimensions

\[ u(t) = u_0 + \int_0^t [-\nu A u(s) - B(u(s)) + f]\, ds + \int_0^t g(u(s))\, dw_s \tag{2.25} \]
where w is a 1-dimensional Wiener process and g has the special property that (g(u), v) = −(u, g(v)) (which is equivalent to (g(u), u) = 0 if g is linear). As noted in Section 2.3.1, this allows the construction of a stochastic flow of solutions to (2.25). This in turn makes it possible to investigate the possibility of a stochastic attractor for this particular case. We first explain what is meant by this. 2.8.1 Stochastic attractors For a stochastic system such as (2.25), the introduction of new noise as time evolves means that there cannot be an attractor in the sense considered so far for a deterministic system. The idea developed by Crauel & Flandoli [19] is to introduce a system of shifts of the noise in time, and then consider a stochastic attractor to be a set which, at time 0, attracts trajectories “starting at −∞” (compared to the usual idea of an attractor being a set “at time ∞” that attracts trajectories starting at time 0). This idea is spelled out below. In proving the existence of a stochastic attractor for the system (2.25) the nonstandard framework makes it particularly easy to consider −∞. In a general setting, let ϕ be a stochastic flow of solutions to a system such as (2.25), as discussed in section 2.3.1. That is, ϕ is a measurable function ϕ : [0, ∞) × H × Ω → H such that ϕ(·, ·, ω) is continuous for a.a. ω, and for each fixed initial condition u0 the process u(t, ω) = ϕ(t, u0 , ω) is a solution to (2.25). Suppose that in addition there is given a one parameter group θt : Ω → Ω of measure preserving maps, which should be thought of as a shift of the noise to the left by t. The notion of a semigroup of operators in the definition of a deterministic attractor, along with the notion of an attractor itself is now replaced by the following.
Definition 2.14 (i) The flow ϕ is a crude cocycle if for each s ∈ R there is a full set Ωs such that for all ω ∈ Ωs

\[ \varphi(s + t, x, \omega) = \varphi(t, \varphi(s, x, \omega), \theta_s\omega) \]

holds for each x ∈ H and t ∈ R.
(ii) A cocycle is perfect if Ωs does not depend on s.
(iii) Given a perfect cocycle ϕ, a random global attractor is a random compact subset A(ω) of H such that for almost all ω

\[ \varphi(t, A(\omega), \omega) = A(\theta_t\omega), \qquad t \ge 0, \]

and

\[ \lim_{t\to\infty} \mathrm{dist}(\varphi(t, B, \theta_{-t}\omega), A(\omega)) = 0 \]
for each bounded set B ⊂ H. Note that the existence of a perfect cocycle is necessary for the possibility of having a stochastic attractor. Constructing a perfect cocycle is difficult, particularly for infinite dimensional systems that are truly stochastic (as compared to random dynamical systems where paths may be treated individually). The papers [2], [19] for example, discuss this in a slightly more general setting. Under the condition of invertibility of ϕ(t, ·, ω) in finite dimensional spaces, it is shown in [2] that a crude cocycle can be made perfect, i.e. it may be modified to a stochastically indistinguishable perfect one. In the setting described below we can improve on this using Loeb space machinery: we construct a perfect cocycle in the infinite dimensional space H for a flow ϕ that is not invertible. Although the equation considered is a special case of the stochastic Navier–Stokes equations, the technique is applicable to a more general class of stochastic flows. 2.8.2 Existence of a stochastic attractor for the Navier–Stokes equations Turning now to the particular system (2.25) above in dimension 2, with the special form of the noise g as described, we assume in addition that we have periodic boundary conditions. The main result of the paper [17] is the following. Theorem 2.15 (a) With appropriate growth and continuity conditions on g, there is an adapted Loeb space carrying a stochastic flow of solutions to (2.25) that is a perfect cocycle, and there is a stochastic attractor A(ω) (compact in the strong topology of H) for this system. (b) If g has the additional property that ((g(v), v)) = 0 for v ∈ V the stochastic attractor is bounded and weakly compact in V.
The proof of this result is quite long and complicated, so here we only sketch the basic ideas. Step 1 We begin with the Galerkin approximation (2.12) on HN with a one dimensional Wiener process Wτ defined for all time τ ∈ ∗(−∞, ∞) – so that the paths of this process all lie in ∗C0 (R) (∗continuous functions that are 0 at 0.) As underlying nonstandard probability space we take Ω = ∗C0 (R) with the Wiener measure. On Ω take the group of measure preserving shifts Θτ given by (Θτ ω)(σ) = ω(τ + σ) − ω(τ ). Step 2 On Ω construct a flow of solutions Φ(τ, V, ω) – which is a crude cocycle by the transfer of a result of [2] for finite dimensional stochastic systems. Step 3 From Φ define a superflow Ψ of solutions to (2.12) by Ψ (τ, s, V, ω) = Φ(τ − s, V, Θs ω). for all τ ≥ s, for s ∈ ∗Q, for all V ∈ HN , for all ω in a ∗full subset Ω1 of Ω. Ψ has the following properties (i) for each s the process U (τ, ω) = Ψ (τ, s, V, ω) is a solution to (2.12) on ∗ [s, ∞) with initial condition U (0) = V (ii) for all r ≥ s ∈ ∗Q, all τ ≥ r, all V ∈ HN and all ω ∈ Ω1 Ψ (τ, s, V, ω) = Ψ (τ, r, Ψ (r, s, V, ω), ω) Step 4 Show that there is a Loeb full subset of Ω1 such that (i) Ω1 is invariant under Θt for all finite t ∈ ∗Q; (ii) for all ω ∈ Ω1 the superflow Ψ (τ, s, V, ω) is S-continuous in (τ, s, V ) for finite τ and s and (strongly) nearstandard V ∈ HN . It is this step that is the crucial one, and requires a special adaptation of the Lindstrøm-Kolmogorov Continuity Theorem in [3]. In order to carry it out, it is necessary to establish some new strong regularity properties of the solutions to (2.12) in the particular case under consideration. Step 5 Show that there is a fixed finite radius ρ such that B(ρ) absorbs the paths of the superflow Ψ . This is where the special form of the noise is used – since it gives a deterministic equation for the evolution of the energy |U (τ )|2 of solutions to (2.12). Step 6 Define a stochastic S-attractor C(ω) ⊆ HN by
\[ C(\omega) = \bigcup_{0 < s \in {}^*\mathbb{Q},\ s\ \text{infinite}} \Psi(0, -s, B(\rho), \omega) = \bigcap_{n\in\mathbb{N}}\ \bigcup_{n \le s \in {}^*\mathbb{Q}} \Psi(0, -s, B(\rho), \omega) = \bigcap_{n\in\mathbb{N}} C_n(\omega), \]

say. This has the following properties: (i) for finite s ∈ ∗Q, s ≥ 0, Φ(s, C(ω), ω) = C(Θs ω); (ii) for finite r and n there is finite t0(n, r) ∈ Q such that Φ(t, BN(r), Θ−t ω) ⊂ Cn(ω) for all t with t0(n, r) ≤ t ∈ ∗Q.

Step 7 Take standard parts! The natural probability space to take is the quotient of Ω1 under the equivalence ω ∼ ω′ iff ω′ = Θt ω for some hyperrational t ≈ 0. Then, abusing notation somewhat, on this quotient space define ϕ(◦τ, ◦V, ω) = ◦Φ(τ, V, ω) for nearstandard τ, V. Then this is a perfect cocycle, and the random set A(ω) = ◦C(ω) is a stochastic attractor, as required by Theorem 2.15.
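The cocycle identity of Definition 2.14 and the idea of attracting trajectories “starting at −∞” can be tried out on a discrete-time toy model. The Python sketch below uses the scalar recursion x_{k+1} = a·x_k + ω(k) driven by a fixed two-sided noise path ω; the recursion, the constant A_COEF, the way the path is generated and all names are assumptions for illustration, not part of the construction in [17].

```python
import random

A_COEF = 0.5  # contraction factor of the toy map x -> A_COEF * x + noise (assumed)


def noise(k, seed=0):
    """Value omega(k) of one fixed two-sided noise path, for any integer k."""
    return random.Random(f"{seed}:{k}").uniform(-1.0, 1.0)


def shift(omega, s):
    """theta_s: shift of the noise path to the left by s."""
    return lambda k: omega(k + s)


def phi(n, x, omega):
    """State after n >= 0 steps of x_{k+1} = A_COEF * x_k + omega(k)."""
    for k in range(n):
        x = A_COEF * x + omega(k)
    return x


def pullback_point(omega, t=60):
    """Approximate the one-point random attractor a(omega): start far in the
    past (time -t) and run up to time 0, i.e. phi(t, x, theta_{-t} omega)."""
    return phi(t, 0.0, shift(omega, -t))


if __name__ == "__main__":
    omega = noise
    # Cocycle identity: phi(s + t, x, omega) = phi(t, phi(s, x, omega), theta_s omega).
    print(phi(7, 3.0, omega), phi(4, phi(3, 3.0, omega), shift(omega, 3)))
    # Pullback attraction: different starting points at time -60 agree at time 0.
    print(phi(60, 100.0, shift(omega, -60)), phi(60, -100.0, shift(omega, -60)))
    # Invariance: phi(1, a(omega), omega) = a(theta_1 omega).
    print(phi(1, pullback_point(omega), omega), pullback_point(shift(omega, 1)))
```

Here the attractor is a single random point because the map is a contraction; the point of the sketch is only the bookkeeping of shifts, which is the same as in the definition.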
2.9 Attractors for 3-dimensional Stochastic Navier–Stokes Equations The difficulties encountered in considering attractors for 3-dimensional Navier– Stokes equations noted above (Section 2.6.3) are compounded when considering the corresponding stochastic equations. Recently Keisler and the author [34] have used Loeb space methods to develop an extension of Sell’s approach [92] (see Section 2.6.3 above) to these equations. The idea is to take as phase space the set X of all stochastic processes on a given Loeb space that are solutions to the stochastic Navier–Stokes equations satisfying some natural conditions that can be derived heuristically. The equations considered take the form
\[ u(t) = u_0 + \int_0^t [-\nu A u(s) - B(u(s)) + f(u(s))]\, ds + \int_0^t g(u(s))\, dw_s \tag{2.26} \]
These are similar to the equations discussed in Sections 2.7 and 2.8 but in dimension 3, with a general time-homogeneous multiplicative noise g and taking w to be a 1-dimensional Wiener process. (The only restrictions on f, g are that they are continuous and obey a certain growth condition which is certainly satisfied if they are bounded.) For the above equations, Sell’s idea was used by Flandoli & Schmalfuss in the paper [51] for the Navier–Stokes equations with a special form of multiplicative noise g which allowed essentially a pathwise solution. At the same time the notion of stochastic attractor had to be the one developed by Crauel & Flandoli [19], as in the previous Section 2.8 – involving running time backwards to −∞. In a later paper [52] Flandoli & Schmalfuss consider in the same framework 3D-Navier–Stokes equations with an irregular forcing term in place of gdw, having no feedback. The idea in the paper [34], outlined here, is to use Sell’s approach at the level of processes rather than paths. In this way the idea of an attractor is formulated in the conventional sense, examining the long term behaviour of solutions as t → ∞. To do this, it is necessary to have a single underlying probability space that is rich enough to carry a supply of solutions to the 3D stochastic Navier–Stokes equations that is sufficient for the concepts to make sense. For this we need the adapted Loeb space described later. But first we describe the basic idea in a little more detail. Suppose we are given a 1-dimensional Wiener process wt on a filtered6 probability space Ω = (Ω, F, (Ft )t≥0 , P ). It is necessary to assume that the space Ω is equipped with some additional structure – namely a family of measure preserving maps θt : Ω → Ω for t ≥ 0 with the following properties: (i) θ0 =identity and θt ◦ θs = θt+s ; (ii) θt Fs = Ft+s for all s, t ≥ 0; (iii) w(t + s, θt ω) − w(t, θt ω) = w(s, ω) for all s ≥ 0. 
Note that the property (iii) tells us that for a fixed t the increments of the process w(t + s, θt ω) are the same as those of the process w(s, ω). Thus θt can be thought of as a shift of the noise to the right by t. (This is in contrast to the shifts discussed in Section 2.8, which by convention are essentially shifts to the left. Here it is simply more convenient and natural to consider shifts to the right.) The family (θt ) allows the following definition of a semiflow Sr of stochastic processes. Definition 2.16 (Semiflow of Processes) Suppose that u = u(t, ω) is a stochastic process defined for t > 0. Then for any r ≥ 0 the process v = Sr u is defined by v(t, ω) = u(r + t, θr ω) 6
For the reader unfamiliar with stochastic analysis, the σ-algebras Ft are an increasing sequence of sub-algebras of F that represent the information available at each time t.
Note that this ensures that if u(t, ·) is Ft-measurable, then so is v(t, ·).⁷

Suppose now that X is a class of solutions to the stochastic Navier–Stokes equations (2.26) on Ω (so that u ∈ X will be a stochastic process) with the property that St X ⊆ X for all t ≥ 0. We can make a preliminary definition of an attractor for the semiflow St as follows.

Definition 2.17 (Provisional) An attractor for the semiflow St on X is a set A ⊆ X such that
(i) St A = A for all t ≥ 0;
(ii) A attracts bounded subsets of X, in some sense; that is, if B is a bounded subset of X then eventually St B gets “close” to A;
(iii) A is “compact” in some sense to be made precise.

Since existence results for the stochastic Navier–Stokes equations require a rather large probability space, a fortiori any single space Ω carrying an entire class of solutions X with the required closure properties is likely to be too big to allow a compact attractor in the usual sense. A suitable notion however is that of neocompactness, to be explained later. We can now state (modulo the definitions being made precise) the main theorem of the paper [34].

Theorem 2.18 On a suitable space Ω (which carries solutions to the stochastic Navier–Stokes equations for all L²(F0)-measurable initial conditions), there is a neocompact attractor for the class of solutions X described below.

Here are a few more details showing how this result is established and fulfilling the promises made earlier to explain the undefined notions. For Ω take Ω = ∗C0(R), the internal space of ∗continuous functions ω : ∗R → ∗R with ω(0) = 0, and let Q be the internal ∗Wiener measure on Ω. Thus the canonical process W(t, ω) = ω(t) is a two-sided ∗Wiener process under Q. This gives the internal filtered space Ω = (Ω, A, (Aτ)τ≥0, Q) where Aτ = ∗σ({W(τ′) : τ′ ≤ τ}) and A = ⋁_{τ∈∗R} Aτ. A family of internal measure preserving maps Θτ : Ω → Ω is defined for τ ≥ 0 by

\[ (\Theta_\tau(\omega))(\sigma) = \omega(\sigma - \tau) - \omega(-\tau). \]

That is, Θτ is a shift of the path ω to the right by τ. (Again note the contrast with the shifts Θτ defined in Section 2.8.2.)

⁷ Hence, if the process u is adapted then so is v.
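The right shift of paths and the semiflow of processes of Definition 2.16 are easy to experiment with when paths are represented as ordinary functions. The Python sketch below does this in a purely standard setting; the particular path sample_path, the test process u and all function names are assumptions for illustration. It checks the increment property (iii), the composition rule for the shifts, and the semiflow property of Sr.

```python
import math


def theta(tau, omega):
    """Right shift by tau of a path omega with omega(0) = 0:
    (theta_tau omega)(sigma) = omega(sigma - tau) - omega(-tau)."""
    return lambda sigma: omega(sigma - tau) - omega(-tau)


def w(t, omega):
    """Canonical process: w(t, omega) = omega(t)."""
    return omega(t)


def S(r, u):
    """Semiflow of processes: (S_r u)(t, omega) = u(r + t, theta_r omega)."""
    return lambda t, omega: u(r + t, theta(r, omega))


def sample_path(x):
    """A fixed continuous two-sided path with value 0 at 0 (illustrative only)."""
    return math.sin(1.3 * x) + 0.5 * x * math.cos(0.7 * x)


if __name__ == "__main__":
    omega = sample_path
    t, s = 0.8, 1.7
    shifted = theta(t, omega)
    # Property (iii): the increments of w(t + ., theta_t omega) are those of w(., omega).
    print(w(t + s, shifted) - w(t, shifted), w(s, omega))
    # Composition of shifts: theta_t o theta_s = theta_{t+s}.
    print(theta(0.5, theta(0.3, omega))(1.1), theta(0.8, omega)(1.1))
    # Semiflow property for a sample process u(t, omega) = omega(t)^2 + t.
    u = lambda time, path: path(time) ** 2 + time
    print(S(0.5, S(0.3, u))(1.1, omega), S(0.8, u)(1.1, omega))
```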
Now let P = QL be the Loeb measure obtained from Q with the corresponding Loeb σ-algebra F = L(A), giving the Loeb probability space Ω = (Ω, L(A), QL) = (Ω, F, P). The filtration (Ft)t≥0 is obtained from (Aτ)τ≥0 in a routine way (see [61] or [13] for example). The family θt is simply the restriction of Θτ to standard times.

The appropriate class X of solutions to (2.26) is described by a number of energy inequalities which can be derived heuristically from the equations themselves using an informal version of Itô’s lemma. These take the following form. For a.a. t0 > 0 and all t1 ≥ t0

\[ E(|u(t_1)|^2) \le E(|u(t_0)|^2)\exp(-k(t_1 - t_0)) + k \tag{2.27} \]

and

\[ E\Big( \sup_{t_0 \le s \le t_1} |u(s)|^2 + \nu \int_{t_0}^{t_1} \|u(s)\|^2\, ds \Big) \le \alpha E(|u(t_0)|^2) + \beta(t_1 - t_0) + \gamma \tag{2.28} \]

and various generalisations involving powers |u(t)|^q for 1 ≤ q ≤ 2. The paths of solutions in X are also required to have certain natural regularity properties – such as weak continuity and various forms of Lp boundedness. Denoting by Xk the solutions in X that are bounded in the appropriate norm⁸ by k, we have that X = ∪_{k∈N} Xk.

The notion of neocompactness is that developed by Keisler & Fajardo in a series of papers (see [49, 50, 62]), and is a weakening of the standard notion of compactness that is appropriate for highly non-separable metric spaces. Although the general definition of neocompact requires considerable elaboration, in the context of a metric space obtained from an internal space, a neocompact set is one that is obtained as the standard part of a set C = ∩_{n∈N} Cn with each Cn internal and with all members of the set C nearstandard.

The power of the notion of neocompactness derives from the connection with a related notion of neocontinuity which is stronger than the classical notion of continuity. Neocompactness and neocontinuity can be used together in much the same way that compactness and continuity are often combined to good effect. The papers noted above develop a powerful approach to existence theorems in analysis and probability theory, using the fact that many functions involved are actually neocontinuous, and frequently sets and spaces constructed are neocompact. Keisler and the author used these ideas in [33] to provide an alternative proof of existence of solutions to the general 3-dimensional stochastic Navier–Stokes equations together with some new optimality results for such solutions.

With almost all notions now explained, here is an outline of the procedure involved in the proof of Theorem 2.18.

⁸ The norm is given by |u| = (E ∫_0^∞ |u(t, ω)|² exp(−t) dt)^{1/2}.
2.9 Attractors for 3-dimensional Stochastic Navier–Stokes Equations
59
First, using the method of construction described in Section 2.3, show that the set X of solutions is non-empty. For the required energy inequalities, use the internal Itˆ o lemma, which is valid because internally we are essentially working in ∗RN . Next, formulate the idea of an internal approximate solution to the equations (2.26). The approximation extends to every aspect of being a solution, so that in addition to the equation itself, the energy inequalities are also only approximately satisfied - that is, to within an infinitesimal. The class of all approximate solutions is denoted X , and a crucial result shows that X = ◦X - where ◦U denotes the solution process obtained from an internal process U as in Section 2.3. Moreover, the set X = k∈N Xk where Xk denotes those approximate solutions that are bounded by k, and Xk = ◦Xk for each k. The natural internal counterpart of the semiflow St is denoted Tτ and it is easy to see from the definition that Tτ X ⊂ X for finite τ . Having shown that there is an S-absorbing set B ⊂ X , continue by defining the set Cn C= n∈N
where for each finite n Cn =
Tτ B.
τ ≥n
τ finite
Now show that C is an S-attractor ; that is Tτ C = C
(2.29)
for all finite τ , and for every finite n and k there is finite t(n, k) such that Tτ Xk ⊆ Cn for all finite τ ≥ t(n, k). Finally, the required attractor is defined to be A = ◦C having shown that all members of C are nearstandard. The neocompactness of A follows by showing that C is a countable intersection of internal sets, and the invariance property St A = A is a direct consequence of (2.29). The ◦ sense in which A attracts bounded sets is as follows. Putting An = Cn we have A = n∈N An . Then for every finite n and k there is finite t(n, k) such that St Xk ⊂ An for all t ≥ t(n, k). The above is necessarily an abbreviated account of the results and methods of [34] – and it is also slightly simplified in parts. Nevertheless it outlines
the main ideas. The need (or so it seems) for the use of a Loeb space is that only such a space would be rich enough to carry a class of solutions to the equations (2.26) that is closed under the semiflow St and contains the “limit” processes that must appear in an attractor. As a consequence of the main result above, we can extend Sell’s approach to measure attractors for the 3-dimensional stochastic Navier–Stokes equations – simply by taking the laws of all the processes involved. For details consult [34].
3. Stochastic Calculus of Variations
3.1 Introduction

This Lecture is a story in four acts. The common thread is provided by some intuitions about Brownian motion that can be made precise using the technology of nonstandard analysis, particularly Loeb measures. Recall that a one-dimensional Brownian motion (also called a Wiener process) on the interval $[0,1]$ is a stochastic process $b : [0,1] \times \Omega \to \mathbb{R}$ (where $\Omega$ is some probability space) with the following properties.

1. $b(0,\omega) = 0$;
2. For a.a. $\omega$ the path $b(\cdot,\omega)$ is continuous;
3. The increments $b(t,\cdot) - b(s,\cdot)$ are Gaussian distributed¹ with mean 0 and variance $|t-s|$;
4. Disjoint increments are independent (for example if $s_1 < t_1 \le s_2 < t_2$ then the random variables $b(t_i,\cdot) - b(s_i,\cdot)$ ($i = 1, 2$) are independent).

¹ We will write $N(\mu,\sigma^2)$ for the Gaussian (normal) distribution with mean $\mu$ and variance $\sigma^2$.

Wiener measure $W$ on $C = C_0[0,1]$ (the set of continuous functions that are zero at zero) is the probability measure induced by Brownian motion. That is, for Borel $A \subseteq C$
\[ W(A) = P(b(\cdot,\omega) \in A). \]

Recall Anderson's famous construction of Brownian motion that we described in Section 1.3.3: the process $b$ is obtained as the standard part of an infinitesimal random walk $B(t)$ with step sizes $\pm\sqrt{\Delta t}$, where $\Delta t \approx 0$ (in fact we take $\Delta t = N^{-1}$ for a fixed, but arbitrary, infinite natural number $N$). Since $\Delta B(t)^2 = \Delta t$ this construction captures the stochastic analyst's rule of thumb (or intuition) that "$db_t^2 = dt$". (Note that we use interchangeably the notation $b(t) = b_t$.) Another intuitive formula, this time for Wiener measure, that is often employed by physicists, is Donsker's "flat integral"
\[ W(A) = \kappa \int_A \exp\Bigl(-\tfrac12 \int_0^1 \dot{x}_t^2\,dt\Bigr)\,d\lambda(x) \tag{3.1} \]
N.J. Cutland: LNM 1751, pp. 61–84, 2000. © Springer-Verlag Berlin Heidelberg 2000
for the Wiener measure $W(A)$ of a Borel subset $A \subseteq C$. Here $\lambda$ is 'Lebesgue' measure on $C$ (i.e. a hypothetical translation invariant measure) and $\kappa$ is a constant; $\dot{x}$ denotes the derivative of a path in $C$. This formula fails to make sense for many reasons. First, there is no 'Lebesgue' measure on $C$ (or $C[0,1]$). Second, the only way to make sense of the term $\int_0^1 \dot{x}_t^2\,dt$, which is the action (i.e. the integral of the kinetic energy along a path), is to define it to be infinite unless $x$ is absolutely continuous with $\dot{x} \in L^2[0,1]$. This means that the integrand in (3.1) is almost surely zero; thus, finally, the constant $\kappa$ would need to be infinite to have $W(A)$ finite.

Donsker's flat integral is the precursor of many other intuitive flat integral formulae that occur in physics, and can be used to give heuristic arguments for correct results, such as the Cameron–Martin formula for the translation of Wiener measure (see Theorem 3.2 below). It is based on the intuitive idea that the increments $db_t$ of Brownian motion are Gaussian distributed with mean 0 and variance $dt$. Thus for each fixed $t$ the random variable $x_t = db_t$ has "density"
\[ (2\pi\,dt)^{-1/2} \exp\Bigl(-\tfrac12 \frac{x_t^2}{dt}\Bigr) = (2\pi\,dt)^{-1/2} \exp\Bigl(-\tfrac12 \Bigl(\frac{x_t}{dt}\Bigr)^2 dt\Bigr) \]
against Lebesgue measure. The informal product of these terms over $t \in [0,1]$ then gives (3.1). In Section 3.2 (the first Act) we will show how this can be made precise by a modification of Anderson's construction, taking $\Delta B(t)$ to be $N(0,\Delta t)$ distributed. Although we lose the fact that $\Delta B(t)^2 = \Delta t$ we do have $E(\Delta B(t)^2) = \Delta t$, which is quite sufficient.

Act 2 (that is, Section 3.3) begins with another intuition about Wiener measure, this time due to Wiener himself [98]. He originally thought in terms of the differential $\dot{b}_t = db_t/dt$ of a path of Brownian motion, and informally calculated its $L^2[0,1]$ norm $\|\dot{b}\|$ thus, using $db_t^2 = dt$:
\[ \|\dot{b}\|^2 = \int_0^1 \dot{b}_t^2\,dt = \int_0^1 (db_t^2/dt^2)\,dt = \int_0^1 (1/dt)\,dt = 1/dt = \infty \]
So $\|\dot{b}\| = \sqrt{\infty}$, and $\dot{b}$ lives in the space $S^\infty(\sqrt{\infty})$, which Wiener called differential space. He then thought of Wiener measure as the uniform probability on differential space. The apparatus of nonstandard analysis allows us to make sense of this; the appropriate space that gives precision to the above ideas turns out to be the infinite sphere $S^{N-1}(\sqrt{N})$ in ${}^*\mathbb{R}^N$ for an infinite integer $N$. For convenience only², in discussing this in Section 3.3 we rescale and work with the unit sphere $S^{N-1}(1)$, which we called the Wiener sphere in [41].

² because it is more natural to work with the Euclidean norm in ${}^*\mathbb{R}^N$ than to use the $L^2$ norm $\|X\|^2 = \sum_t X_t^2\,\Delta t$.
In this framework it is shown that Wiener measure "is" the uniform measure on the Wiener sphere, and a pleasant geometrical derivation of the Cameron–Martin formula emerges. There is a discussion of differential space (which we can think of as the Wiener sphere) and related issues in the brilliant little paper [72].

In the third Act (Section 3.4) we pick up a remark of David Williams concerning the infinite dimensional Ornstein–Uhlenbeck process $v$, which plays an important rôle in the stochastic calculus of variations (the 'Malliavin Calculus'). This process is fundamental to infinite dimensional stochastic calculus, and it has been suggested that $v$ should be regarded as the infinite-dimensional counterpart of Brownian motion. Williams [99] remarks that the "correct" way to think of the infinite dimensional Ornstein–Uhlenbeck process is as "Brownian motion on the infinite dimensional sphere $S^\infty(\sqrt{\infty})$". Given the Wiener sphere, it is possible to make sense of this by considering ${}^*$Brownian motion on $S^{N-1}(1)$, which is a scaled version of $S^{N-1}(\sqrt{N})$ (again with $N$ infinite). This, and some consequences, is spelled out in Section 3.4.

The final part of the story (Act 4, Section 3.5) returns to the idea that the increments of Brownian motion $b_t$ are $N(0,\Delta t)$, and discusses the so-called Malliavin calculus on the space $L^2(W)$ of functions $\varphi : C \to \mathbb{R}$ that are $L^2$ with respect to the Wiener measure. Informally, we can think of a sample point $b \in C$ as completely determined by the 'vector' of increments $x = (db_t)_{t\in[0,1]}$. Writing $x_t = db_t$ and $x = (x_t)_{t\in[0,1]}$ we can therefore think of $\varphi \in L^2(W)$ as $\varphi = \varphi(x) = \varphi((x_t)_{t\in[0,1]})$. At the heart of the Malliavin calculus is the idea of differentiation with respect to the variables $x_t$. Thus, intuitively, the gradient or derivation operator $D$ is given by
\[ D\varphi(b,t) = \frac{\partial\varphi}{\partial x_t}(b). \]
There is a similar intuitive definition of the other basic operators of this calculus.
In Section 3.5 we show how the construction of Brownian motion from infinitesimal increments $\Delta B(t)$ that are $N(0,\Delta t)$ distributed allows the above ideas to be made precise using the transfer of classical calculus on ${}^*\mathbb{R}^N$, and the basic results of the Malliavin calculus are derived as simple applications.

3.1.1 Notation

Throughout this Lecture we will work with the following notation. Fix $N \in {}^*\mathbb{N} \setminus \mathbb{N}$ and let $\Delta t = N^{-1}$. As in Section 1.3.1 we take the hyperfinite time line
\[ \mathbf{T} = \{0, \Delta t, 2\Delta t, 3\Delta t, \ldots, 1 - \Delta t\}. \]
On occasions we need to include the end point, so write $\overline{\mathbf{T}} = \mathbf{T} \cup \{1\}$. We use sanserif letters $\mathsf{s}, \mathsf{t}, \mathsf{u}$ as variables from $\mathbf{T}$. For any internal function $F : \overline{\mathbf{T}} \to {}^*\mathbb{R}$ write
\[ \Delta F(t) = F(t + \Delta t) - F(t) \quad\text{and}\quad \dot{F}(t) = \frac{\Delta F(t)}{\Delta t} \]
so that $\Delta F, \dot{F} \in {}^*\mathbb{R}^{\mathbf{T}}$. Write
\[ \mathcal{C} = \{Y \in {}^*\mathbb{R}^{\overline{\mathbf{T}}} : Y(0) = 0\}, \]
which we identify with the space of polygonal paths obtained by joining the points $(t, Y(t))$. Then $\mathcal{C}$ is the internal path space that we will be working with throughout. The standard part mapping $\mathrm{st}(Y) = {}^\circ Y \in C$ is defined for S-continuous paths in $\mathcal{C}$ (see Section 1.2.4).

A vector $X$ in the internal space ${}^*\mathbb{R}^{\mathbf{T}}$ will usually be regarded as a vector of increments of a path $Y = \Sigma X \in \mathcal{C}$, where $\Sigma : {}^*\mathbb{R}^{\mathbf{T}} \to \mathcal{C}$ is the following mapping:
\[ (\Sigma X)(0) = 0, \qquad (\Sigma X)(t) = \sum_{s<t} X(s) \quad\text{if } t > 0. \]
The inverse mapping $\Sigma^{-1} : \mathcal{C} \to {}^*\mathbb{R}^{\mathbf{T}}$ is simply $\Delta$ as defined above.

Finally, denote by $\gamma$ the Gaussian measure on ${}^*\mathbb{R}$ with mean zero and variance $\Delta t$ (i.e. with the normal distribution $N(0,\Delta t)$) and let $\Gamma = \gamma^{\mathbf{T}}$ be the product Gaussian measure on ${}^*\mathbb{R}^{\mathbf{T}}$. Thus for internal Borel sets $A \subseteq {}^*\mathbb{R}^{\mathbf{T}}$ we have
\[ \Gamma(A) = (2\pi\Delta t)^{-N/2} \int_A \exp\Bigl(-\tfrac12 \sum_{t\in\mathbf{T}} \frac{X^2(t)}{\Delta t}\Bigr)\,dX \tag{3.2} \]
where $dX$ denotes ${}^*$Lebesgue measure on ${}^*\mathbb{R}^{\mathbf{T}}$.
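The notation above can be mimicked at a large but finite $N$. The following is only an illustrative sketch, assuming an ordinary finite grid in place of the hyperfinite time line; the function names (`make_timeline`, `Sigma`, `Delta`, `sample_Gamma`) are hypothetical and chosen to mirror the symbols $\mathbf{T}$, $\Sigma$, $\Delta$ and $\Gamma$.

```python
import numpy as np

def make_timeline(N):
    """Finite stand-in for T = {0, dt, 2dt, ..., 1 - dt} with dt = 1/N."""
    dt = 1.0 / N
    return np.arange(N) * dt, dt

def Sigma(X):
    """Path from increments: (Sigma X)(0) = 0 and (Sigma X)(t) = sum_{s<t} X(s)."""
    return np.concatenate(([0.0], np.cumsum(X)))

def Delta(Y):
    """Forward difference (Delta Y)(t) = Y(t + dt) - Y(t); inverse of Sigma."""
    return np.diff(Y)

def sample_Gamma(N, rng):
    """One increment vector with i.i.d. N(0, dt) coordinates (finite analogue of Gamma)."""
    dt = 1.0 / N
    return rng.normal(0.0, np.sqrt(dt), size=N)

rng = np.random.default_rng(0)
N = 1000
T, dt = make_timeline(N)
X = sample_Gamma(N, rng)   # increments indexed by T
Y = Sigma(X)               # polygonal path with N + 1 nodes, Y[0] = 0
```

Here `Y` has one more entry than `X` because the path is also evaluated at the end point 1, which is the reason for adjoining 1 to the time line.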
3.2 Flat Integral Representation of Wiener Measure

The main result of [23] is

Theorem 3.1 Let $\mathcal{W}$ be the internal Borel probability defined on $\mathcal{C}$ by
\[ \mathcal{W}(A) = (2\pi\Delta t)^{-N/2} \int_A \exp\Bigl(-\tfrac12 \sum_{t\in\mathbf{T}} \dot{Y}(t)^2\,\Delta t\Bigr)\,dY \tag{3.3} \]
Then with respect to the Loeb measure $\mathcal{W}_L$, almost every path $Y \in \mathcal{C}$ is S-continuous, and the Borel measure $W$ on $C$ given by
\[ W(B) = \mathcal{W}_L(\mathrm{st}^{-1}(B)) \tag{3.4} \]
is Wiener measure.

The proof is quite similar to that of Anderson's Theorem, and in fact a little easier, since under $\mathcal{W}$ the increments $\Delta Y(t)$ are independent $N(0,\Delta t)$ (see the next remark) and so $Y(t)$ is $N(0,t)$. Note that, making the change of variable $X = \Delta Y$ in (3.3) and checking that $\det(\Delta) = 1$, the above definition of $\mathcal{W}$ is equivalent to the following:
\[ \mathcal{W}(A) = \Gamma(\Delta A) = \Gamma(\Sigma^{-1}(A)) \tag{3.5} \]
so it is clear that the increments of the paths $Y$ under the measure $\mathcal{W}$ are independent Gaussian. Note also that the constant $\kappa = (2\pi\Delta t)^{-N/2}$ in the formula (3.3) is indeed infinite, as was noted in the informal discussion above.

Here now, as an application of this, is a quick proof of the Cameron–Martin Theorem, making precise a heuristic proof based on the flat integral (3.1). The Cameron–Martin formula (3.6) is concerned with translations of Wiener measure. For $z \in C$ define $W^z(B) = W(B - z)$. Then we have:

Theorem 3.2 (Cameron–Martin) If $z \in C$ with $\dot{z} \in L^2[0,1]$ (i.e. $z$ belongs to the so-called Cameron–Martin subspace of $C$) then $W^z$ is absolutely continuous with respect to $W$ and
\[ \frac{dW^z}{dW}(b) = \exp\Bigl(\int_0^1 \dot{z}_t\,db_t - \tfrac12 \int_0^1 \dot{z}_t^2\,dt\Bigr) \tag{3.6} \]
\[ \phantom{\frac{dW^z}{dW}(b)} = \rho(b) \ \text{say.} \tag{3.7} \]

Proof (Sketch) Let $Z(t) = {}^*z_t$ and write $\mathcal{W}^Z(A) = \mathcal{W}(A - Z)$. Then it is easy to check that
\[ W^z(B) = \mathcal{W}^Z_L(\mathrm{st}^{-1}(B)) \tag{3.8} \]
Writing $J(Y) = \tfrac12 \sum_{t\in\mathbf{T}} \dot{Y}_t^2\,\Delta t$ (the internal counterpart of the action of a path $Y$) we have
\begin{align*}
\mathcal{W}^Z(A) &= \kappa \int_{A-Z} \exp(-J(Y))\,dY \\
&= \kappa \int_A \exp(-J(Y - Z))\,dY \qquad \text{since $dY$ is translation invariant} \\
&= \kappa \int_A \exp\Bigl(\sum_{t\in\mathbf{T}} \dot{Z}(t)\,\Delta Y(t) - J(Z)\Bigr) \exp(-J(Y))\,dY \\
&= \int_A R(Y)\,d\mathcal{W}(Y)
\end{align*}
where $R(Y) = \exp\bigl(\sum_{t\in\mathbf{T}} \dot{Z}(t)\,\Delta Y(t) - J(Z)\bigr)$. So $d\mathcal{W}^Z/d\mathcal{W} = R$. The proof is finished by showing that $R$ is a lifting of the required density $\rho$.

Further applications of the above construction of Wiener measure may be found in [23], in particular an intuitive and elementary proof of Schilder's large deviation result. These ideas were developed in subsequent papers, in some cases giving new large deviation results. See [25, 26, 27, 28].
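The key algebraic step in the proof sketch, $\exp(-J(Y-Z)) = R(Y)\exp(-J(Y))$, can be checked numerically at finite $N$. The sketch below assumes an ordinary finite grid in place of the hyperfinite one; the helper names and the choice $z(t) = \sin t$ are illustrative assumptions, not taken from the text.

```python
import numpy as np

def log_density_Gamma(X, dt):
    """Log density of an increment vector X with i.i.d. N(0, dt) coordinates."""
    n = X.size
    return -0.5 * n * np.log(2 * np.pi * dt) - 0.5 * np.sum(X**2) / dt

def J(Y, dt):
    """Discrete action J(Y) = 1/2 * sum_t Ydot(t)^2 * dt, where Ydot = (Delta Y)/dt."""
    return 0.5 * np.sum(np.diff(Y) ** 2) / dt

def log_R(Y, Z, dt):
    """log R(Y) = sum_t Zdot(t) * Delta Y(t) - J(Z)."""
    Zdot = np.diff(Z) / dt
    return np.sum(Zdot * np.diff(Y)) - J(Z, dt)

rng = np.random.default_rng(0)
N = 50
dt = 1.0 / N
grid = np.linspace(0.0, 1.0, N + 1)
Z = np.sin(grid)                                   # a smooth shift with Z(0) = 0
Y = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), N))))

# Density of the shifted measure at Y is the unshifted density at Y - Z.
log_ratio = log_density_Gamma(np.diff(Y - Z), dt) - log_density_Gamma(np.diff(Y), dt)
```

The quantity `log_ratio` agrees with `log_R(Y, Z, dt)` up to rounding, which is the finite-dimensional form of $d\mathcal{W}^Z/d\mathcal{W} = R$.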
3.3 The Wiener Sphere

In this section we show how Loeb measures make it possible to give rigorous meaning to Wiener's intuition that Wiener measure is uniform probability on $S^\infty(\sqrt{\infty})$. The results described below are based on the joint work with Siu-Ah Ng reported in [41].

Definition 3.3 The Wiener sphere $S$ is the unit sphere in ${}^*\mathbb{R}^{\mathbf{T}}$. That is
\[ S = \{X \in {}^*\mathbb{R}^{\mathbf{T}} : |X| = 1\} \]
where $|X|^2 = \sum_{t\in\mathbf{T}} X(t)^2$ is the Euclidean norm.

As discussed in the introduction, Wiener thought of the differential of a path in $C$, in the $L^2$ norm. For $X \in S$ this means considering the vector of differentials $\dot{X} = X/\Delta t = (X(t)/\Delta t)_{t\in\mathbf{T}}$ and calculating
\[ \|\dot{X}\|^2 = \sum_{t\in\mathbf{T}} (X(t)/\Delta t)^2\,\Delta t = \Delta t^{-1}|X|^2 = N \]
which is consistent with Wiener's idea of $S^\infty(\sqrt{\infty})$. We find it more convenient to work with the increment vectors $X$ and use the Euclidean norm, which means taking the Wiener sphere as the scene of activity.

Now let $\mu$ be the uniform (internal) probability on $S$. The next theorem was proved in [41].

Theorem 3.4 (a) With respect to the uniform Loeb measure $\mu_L$ on $S$, almost all of the paths $\Sigma X \in \mathcal{C}$ are S-continuous;
(b) Wiener measure $W$ on $C$ is defined by
\[ W(B) = \mu_L(\Sigma_S^{-1}\,\mathrm{st}^{-1} B) \tag{3.9} \]
where $\Sigma_S$ denotes the restriction of the map $\Sigma$ to $S$. In other words, $W(B)$ is the uniform Loeb measure of the set of increment vectors $X \in S$ for which ${}^\circ\Sigma X \in B$.

The proof is a simple application of Theorem 3.1, and will be given below. In preparation, recall the construction of Wiener measure in Theorem 3.1 from the Gaussian probability $\Gamma$ on the space ${}^*\mathbb{R}^{\mathbf{T}}$ of increment vectors $X$. The connection is that since $\Gamma$ is rotation invariant, the projection of $\Gamma$ onto $S$ gives $\mu$. To be precise, let $\pi : {}^*\mathbb{R}^{\mathbf{T}} \setminus \{0\} \to S$ be the projection mapping $\pi(X) = X/|X|$. Then

Proposition 3.5 For $A \subseteq S$
\[ \mu(A) = \Gamma(\pi^{-1} A). \]

Note further the following elementary relationship between S-continuity of $\Sigma X$ and $\Sigma\pi(X)$.

Proposition 3.6 If $X \in {}^*\mathbb{R}^{\mathbf{T}}$ has $|X| \approx 1$ then $\Sigma X$ is S-continuous if and only if $\Sigma\pi(X)$ is S-continuous, and if either holds then ${}^\circ\Sigma X = {}^\circ\Sigma\pi(X)$.

Now we see that this proposition is applicable to almost all $X \in {}^*\mathbb{R}^{\mathbf{T}}$.

Proposition 3.7 $|X| \approx 1$ for almost all $X \in {}^*\mathbb{R}^{\mathbf{T}}$ with respect to $\Gamma_L$.

Proof Simply calculate that under $\Gamma$
\[ E\bigl((|X|^2 - 1)^2\bigr) = 2\Delta t \approx 0 \]
using elementary properties of the Gaussian distribution $\gamma$ (namely its second and fourth moments are $\Delta t$ and $3\Delta t^2$ respectively).

Remark In [41] and a recent paper of Li & Li [67] the thickness of the shell around $S$ that supports the Loeb–Gaussian measure $\Gamma_L$ is calculated. It is shown that the following set has full $\Gamma_L$ measure
\[ \bigcup_{m\in\mathbb{N}} \Bigl\{X : 1 - \frac{m}{\sqrt{N}} \le |X| \le 1 + \frac{m}{\sqrt{N}}\Bigr\} \]
and this is optimal, i.e. no set in the union has full measure. Li & Li also show that exactly half of the measure of $\Gamma_L$ is contained inside $S$ (and half outside).

Now we can verify the main theorem.

Proof of Theorem 3.4. From (3.4) and (3.5)
\begin{align*}
W(B) &= \Gamma_L(\{X \in {}^*\mathbb{R}^{\mathbf{T}} : {}^\circ\Sigma X \in B\}) \\
&= \Gamma_L(\{X \in {}^*\mathbb{R}^{\mathbf{T}} : {}^\circ\Sigma\pi(X) \in B\}) && \text{by Propositions 3.6 and 3.7} \\
&= \mu_L(\{X \in S : {}^\circ\Sigma X \in B\}) && \text{by Proposition 3.5}
\end{align*}

The idea underlying Theorem 3.4 goes back to Poincaré [82], although this is hidden in the above proof. He noticed that the uniform probability on $S^{n-1}(\sqrt{n})$, when projected onto any axis³, approaches the Gaussian distribution $N(0,1)$ as $n \to \infty$. A proof of Theorem 3.4 from first principles using this fact is possible.

³ More generally, any one-dimensional subspace.

This result gives almost immediately an invariance result for Wiener measure: $W$ is the weak limit of measures on polygonal paths induced by the uniform probability measure on spheres $S^n(1)$ as $n \to \infty$. (By this we mean of course that a point $x \in S^n(1)$ is regarded as the vector of increments of a polygonal path $\Sigma x \in \mathbb{R}^{n+1}$.)

This construction of Wiener measure gives another interesting approach to the Cameron–Martin formula (3.6). The Wiener sphere is not invariant under shifts, so we must use projection back onto $S$ to define the counterpart of the shifted Wiener measure $W^z$. Take $z$ in the Cameron–Martin space (i.e. $\dot{z} \in L^2[0,1]$) and define $Z(t) = {}^*z_t \in \mathcal{C}$ as before. The following definition will be applied for $D = \Delta Z$ but it makes sense more generally.

Definition 3.8 Let $D \in {}^*\mathbb{R}^{\mathbf{T}}$ with $|D| < 1$. Define the internal measure $\mu_D$ on $S$ by
\[ \mu_D(A) = \mu\Bigl(\Bigl\{\frac{X - D}{|X - D|} : X \in A\Bigr\}\Bigr). \]

It is straightforward to prove:

Theorem 3.9 For $D = \Delta Z$ as above
\[ W^z(B) = (\mu_D)_L(\Sigma_S^{-1}\,\mathrm{st}^{-1} B) \]
for $B \subseteq C$.

Turning now to the calculation of the density of the shifted measure, we use the transfer of classical finite dimensional calculus, involving calculation of the Jacobian of the mapping $X \mapsto (X - D)/|X - D|$ from $S$ to itself, to find the derivative of $\mu_D$ with respect to $\mu$.

Theorem 3.10 If $D \in {}^*\mathbb{R}^{\mathbf{T}}$ with $|D| < 1$ then for $X \in S$
\[ \frac{d\mu_D}{d\mu}(X) = \frac{1 - X.D}{|X - D|^N} \tag{3.10} \]
where $X.D$ denotes the Euclidean inner product.
It remains to see how to interpret the right-hand side of (3.10) to give the Cameron–Martin formula (3.6). We are now assuming that $D = \Delta Z$ with $Z$ coming from $z$ in the Cameron–Martin space. Note first that in this case $|D| \approx 0$ and so $1 - X.D \approx 1$ for $X \in S$. Turning to the denominator in (3.10), and using $|X - D|^2 = 1 - 2X.D + |D|^2$ for $X \in S$, we then have
\begin{align*}
|X - D|^{-N} &= \Bigl(1 - \frac{2N\,X.D - N|D|^2}{N}\Bigr)^{-N/2} \\
&\approx \exp\bigl(N\,X.D - \tfrac12 N|D|^2\bigr) && \text{for a.a. } X \\
&\approx \rho({}^\circ\Sigma X) && \text{for a.a. } X
\end{align*}
where $\rho(b)$ is the Cameron–Martin density as before. The last line above follows because
\[ N\,X.D = \sum_{t\in\mathbf{T}} \frac{\Delta Z(t)}{\Delta t}\,\Delta B(t) \approx \int_0^1 \dot{z}_t\,db_t \]
since $X(t) = \Delta B(t)$ under $\mu_L$ gives the infinitesimal increments of Brownian motion according to Theorem 3.4. For the second term we have
\[ N|D|^2 = \sum_{t\in\mathbf{T}} \frac{\Delta Z(t)^2}{\Delta t^2}\,\Delta t \approx \int_0^1 \dot{z}_t^2\,dt. \]

Remark It is worth comparing the construction of Brownian motion from uniform measure $\mu$ on the Wiener sphere $S$ above with Anderson's original construction from an infinitesimal random walk. In Anderson's process the vectors of increments $X$ are all of the form
\[ X = (\pm\sqrt{\Delta t}, \pm\sqrt{\Delta t}, \ldots, \pm\sqrt{\Delta t}) \in \{\sqrt{\Delta t}, -\sqrt{\Delta t}\}^{\mathbf{T}} \]
with the counting probability. Note that any such $X$ has norm 1 and thus lies on $S$. The family of all such increment vectors for Anderson's process is thus a collection of $2^N$ vectors uniformly distributed on the Wiener sphere, lying on the lines through the origin that are equidistant from the axes.
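A finite-$N$ numerical sketch may help to visualise the picture in this section. It assumes a large finite $N$ instead of an infinite one and uses the standard fact that normalising a Gaussian vector gives a uniformly distributed point on the sphere (the finite form of Proposition 3.5); the function names are hypothetical.

```python
import numpy as np

def uniform_on_sphere(N, size, rng):
    """Uniform points on the unit sphere of R^N, obtained as pi(X) = X/|X| for Gaussian X."""
    G = rng.normal(size=(size, N))
    return G / np.linalg.norm(G, axis=1, keepdims=True)

def anderson_increments(N, size, rng):
    """Anderson's increment vectors: every coordinate is +sqrt(dt) or -sqrt(dt), dt = 1/N."""
    return rng.choice([-1.0, 1.0], size=(size, N)) / np.sqrt(N)

rng = np.random.default_rng(1)
N, M = 400, 5000
X = uniform_on_sphere(N, M, rng)          # M points on the finite "Wiener sphere"
A = anderson_increments(N, M, rng)        # M random-walk increment vectors
gauss = rng.normal(0.0, np.sqrt(1.0 / N), size=(M, N))   # M draws of N(0, dt) increments

endpoint_var = np.var(X.sum(axis=1))      # variance of the path end point (Sigma X)(1)
shell = np.mean((np.sum(gauss**2, axis=1) - 1.0) ** 2)   # estimate of E(|X|^2 - 1)^2
```

With these sample sizes `endpoint_var` should be close to 1 and `shell` close to $2\Delta t = 2/N$, in line with Proposition 3.7, while every row of `A` has Euclidean norm 1 as noted in the Remark.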
3.4 BM on the Wiener Sphere and the Infinite Dimensional O–U Process

The infinite dimensional Ornstein–Uhlenbeck process $v$ of Malliavin [73] and Stroock [94] can be defined in a number of ways.

(1) As an infinite dimensional process $v : [0,\infty) \times \Omega \to C$ it is continuous and strong Markov with semigroup given by
\[ (P_t f)(\theta) = \int_C f\bigl(e^{-\frac12 t}\theta + \sqrt{1 - e^{-t}}\,\varphi\bigr)\,dW(\varphi) \tag{3.11} \]
(where $f : C \to \mathbb{R}$ is $L^2(W)$, and $W$ is Wiener measure and $\theta, \varphi \in C$).

(2) Thinking of $v$ as a two parameter process $v = v(t,s)$ ($= v(t)(s)$ in terms of (1)), for $t \ge 0$ and $0 \le s \le 1$, an alternative description (e.g. Meyer [74]) is that $v(t,s)$ is jointly continuous and
\[ v(t,s) = e^{-\frac12 t}\bigl(v(0,s) + w(e^t - 1, s)\bigr) \tag{3.12} \]
where $w$ is the Brownian sheet (a two-time parameter generalisation of Brownian motion).

The connection with the Malliavin calculus is that the generator of $v$ is one of the fundamental operators of the calculus (see the next section).

The aim here is to describe another construction [32] of the infinite-dimensional O-U process based on the idea that $v$ is "Brownian motion on $S^\infty(\sqrt{\infty})$", which for us can be interpreted (after scaling) as Brownian motion on the Wiener sphere $S$.

An earlier nonstandard approach to $v$ was given by Lindstrøm [70], using Anderson's infinitesimal random walk construction of Brownian motion [5]. As noted above, the paths of Anderson's random walk correspond to a lattice of $2^N$ points distributed uniformly on the Wiener sphere $S$, and Lindstrøm's construction is based on a random walk on these points.

Lindstrøm [70] mentions another nonstandard construction of $v$, taking as starting point a nonstandard hyperfinite dimensional O-U process $U$ in ${}^*\mathbb{R}^{\mathbf{T}}$. This is an internal process $U : \Omega \times {}^*[0,\infty) \to {}^*\mathbb{R}^{\mathbf{T}}$ given by
\[ U(\tau) = e^{-\frac12\tau}\Bigl(U(0) + \sqrt{\Delta t}\int_0^\tau e^{\frac12\sigma}\,dB(\sigma)\Bigr) \tag{3.13} \]
where B(τ, ω) is Brownian motion in ∗RT . Note that the Gaussian measure Γ on ∗RT (see Section 3.1.1 above) is invariant for this process. The infinite dimensional Ornstein–Uhlenbeck process is now obtained as v = ◦ΣU . That is, V (τ ) = ΣU (τ ) gives a process in C whose standard part (in the uniform topology) gives v on the corresponding Loeb space. See Theorem 3.14 below. The construction we present here takes as starting point an appropriately scaled Brownian motion on S as the process X(τ ) of increments. Now define the process Y = ΣX : ∗[0, ∞) × Ω → C. The main result of the paper [32] is the following.
Theorem 3.11 If $Y(0) = (Y(0,s))_{s\in\mathbf{T}}$ is a.s. S-continuous, then $Y(\tau,s)$ is almost surely S-continuous in both variables for finite $\tau \ge 0$, and the process $v = {}^\circ Y$ is the infinite dimensional O-U process with initial distribution $v(0) = {}^\circ Y(0,\cdot)$.

The following is an immediate corollary.

Corollary 3.12 Wiener measure on $C$ is an invariant measure for the infinite dimensional O-U process $v$. That is, if $v(0) \in C$ is distributed according to Wiener measure, then so is $v(t)$ for all $t \ge 0$.

Proof Take $X(0)$ in the above theorem to be uniformly distributed on the Wiener sphere. Clearly $X(\tau)$ is also uniformly distributed on $S$. By Theorem 3.4 both ${}^\circ Y(0) = v(0)$ and ${}^\circ Y(t) = v(t)$ are distributed according to Wiener measure.

Here is a sketch of the proof of Theorem 3.11. From any standard reference (for example [66]) we have that a scaled Brownian motion $X(\tau)$ on $S$ is given by the SDE
\[ dX(\tau) = -\tfrac12(1 - \Delta t)X(\tau)\,d\tau + \sqrt{\Delta t}\,dB(\tau) - \sqrt{\Delta t}\,X(\tau)\,\langle X(\tau), dB(\tau)\rangle \tag{3.14} \]
where $B$ is Brownian motion in ${}^*\mathbb{R}^{\mathbf{T}}$ as above. (Note that this is the projection onto $S$ of "standard" Brownian motion on the sphere $S^{N-1}(\sqrt{N})$.) Theorem 3.11 is proved by showing that the processes $Y = \Sigma X$ and $V$ are infinitely close for all finite time.

Theorem 3.13 Let $V = \Sigma U$ and $Y = \Sigma X$ be the internal processes on $\mathcal{C}$ defined above with $U(0) = X(0)$, and let $Z(\tau) = V(\tau) - Y(\tau)$. Suppose that $Y(0,\cdot)$ is S-continuous for Loeb almost all $\omega$. Then for a.a. $\omega$
\[ Z(\tau,s) \approx 0 \]
for all finite $\tau$ and all $s \in \mathbf{T}$.

The proof of this is carried out by investigation of the process $\tilde{Z}(\tau) = e^{\alpha\tau}Z(\tau)$ where $\alpha = \tfrac12(1 - \Delta t)$, which has $\tilde{Z}(0) = 0$ and satisfies the internal SDE
\[ d\tilde{Z}(\tau) = -\tfrac12\Delta t\,e^{\alpha\tau}V(\tau)\,d\tau + \sqrt{\Delta t}\,e^{\alpha\tau}V(\tau)\,d\gamma(\tau) - \sqrt{\Delta t}\,\tilde{Z}(\tau)\,d\beta(\tau) \]
where $\beta$ is a standard 1-dimensional Brownian motion in ${}^*\mathbb{R}$.

The proof of Theorem 3.11 is now completed once we have proved that $v = {}^\circ V$.
Theorem 3.14 Let $U$ and $V = \Sigma U$ be as above, and suppose that $V(0) = V(0,\cdot)$ is a.s. S-continuous. Then $V(\tau,s)$ is almost surely S-continuous in both variables for finite $\tau \ge 0$, and the process $v = {}^\circ V$ is the infinite dimensional O-U process.

The proof of this follows closely the pattern of Lindstrøm's paper [70] and is quite routine. The key is to work with the process $M(\tau) = \exp(\tfrac12\tau)V(\tau) - V(0)$.

A new invariance principle is obtained as a corollary to Theorem 3.11. Let $\xi^{(n)}$ be a standard Brownian motion on $S^{n-1}(\sqrt{n})$ and let $x^{(n)} = \xi^{(n)}/\sqrt{n}$ be the corresponding scaled Brownian motion on $S^{n-1}(1)$; we allow $x^{(n)}(0)$ to be random. Writing $x^{(n)} = (x^{(n)}_1, x^{(n)}_2, \ldots, x^{(n)}_n)$, define $y^{(n)}(t,s)$ for $t \ge 0$ and $s \in [0,1]$ by
\[ y^{(n)}(t,0) = 0, \qquad y^{(n)}(t,k/n) = \sum_{i\le k} x^{(n)}_i(t) \]
with $y^{(n)}(t,s)$ defined by linear interpolation for other $s \in [0,1]$. Then

Theorem 3.15 Suppose that the initial random variables $y^{(n)}(0)$ converge weakly to a random variable $v_0$ on $C$. Then for each finite $T \ge 0$ the processes $y^{(n)}(t,s)$ for $t \le T$ and $s \in [0,1]$ converge weakly to the infinite dimensional Ornstein–Uhlenbeck process $v$ with $v(0) = v_0$.

The proof is routine and similar to Anderson's proof of Donsker's invariance principle. This invariance principle (and the construction of the infinite dimensional O-U process from Brownian motion on the Wiener sphere, from which it follows) is related to a more complicated invariance principle of Morrow & Silverstein [76].
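The invariance of the Gaussian measure for the finite dimensional O-U process of (3.13) can be illustrated at finite $N$. This is a sketch under assumptions: it uses the exact one-step transition of the linear SDE $dU = -\tfrac12 U\,d\tau + \sqrt{\Delta t}\,dB$ (the differential form of (3.13)), an ordinary finite $N$, and hypothetical function names.

```python
import numpy as np

def ou_step(U, h, dt, rng):
    """Exact transition over a time step h:
    U(tau + h) = exp(-h/2) * U(tau) + sqrt(dt * (1 - exp(-h))) * xi, with xi standard normal."""
    decay = np.exp(-0.5 * h)
    noise_sd = np.sqrt(dt * (1.0 - np.exp(-h)))
    return decay * U + noise_sd * rng.normal(size=U.shape)

def variance_after_step(h, dt):
    """Coordinate variance after one step when the start is N(0, dt) in each coordinate."""
    return np.exp(-h) * dt + dt * (1.0 - np.exp(-h))

rng = np.random.default_rng(2)
N = 200
dt = 1.0 / N
U = rng.normal(0.0, np.sqrt(dt), size=(2000, N))   # 2000 independent copies started in Gamma
for _ in range(10):
    U = ou_step(U, 0.3, dt, rng)

# V = Sigma U: the path-valued process, one path per row, with V(., 0) = 0.
V = np.concatenate((np.zeros((U.shape[0], 1)), np.cumsum(U, axis=1)), axis=1)
```

The update has the same shape as the argument of $f$ in (3.11), namely $e^{-t/2}\theta + \sqrt{1-e^{-t}}\,\varphi$, so started from Gaussian increments the end point `V[:, -1]` keeps variance close to 1 at every step, as Corollary 3.12 suggests.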
3.5 Malliavin Calculus

In this final Act of the story we return to the construction of Wiener measure $W$ that is discussed in Section 3.2, where, intuitively, the paths $b \in C$ have increments $x_t = db_t$ that are independent and Gaussian $N(0,dt)$ distributed. We made this precise by considering internal paths in $\mathcal{C}$ with increments $\Delta B_t = X_t$ that are Gaussian with distribution $N(0,\Delta t)$. Here we discuss joint work with Siu-Ah Ng [43] that shows how to use this framework to make precise the intuition behind the so-called Malliavin calculus on $L^2(W)$. This is more accurately described as the stochastic calculus of variations (see [80] for example). Since its discovery, this calculus has been extended to a very much wider setting than the one we describe, but the essential ideas remain the same.
At the core of this theory there are three operators, given intuitively as follows, where $b \in C$ and $x_t = db_t$:

(1) the gradient operator $D : L^2(W) \to L^2(W \times [0,1])$, given by
\[ D\varphi(b,t) = \frac{\partial\varphi}{\partial x_t}(b). \tag{3.15} \]

(2) the Skorohod integral operator $\delta : L^2(W \times [0,1]) \to L^2(W)$, given by
\[ \delta u(b) = \int_0^1 u(b,t)\,db_t - \int_0^1 \frac{\partial u}{\partial x_t}(b,t)\,dt \tag{3.16} \]

(3) the Malliavin operator $L : L^2(W) \to L^2(W)$, given by
\[ L\varphi(b) = \delta D\varphi(b) = \int_0^1 \frac{\partial\varphi}{\partial x_t}(b)\,db_t - \int_0^1 \frac{\partial^2\varphi}{\partial x_t^2}(b)\,dt \tag{3.17} \]
Of course, none of these formulae make sense as they stand, and the above informal definitions do not address the question of the domains of the operators.

3.5.1 Notation and preliminaries

In this section we take as the fundamental probability space $\Omega = {}^*\mathbb{R}^{\mathbf{T}}$ with the internal Gaussian product measure $\Gamma$ given by (3.2), and the corresponding Loeb measure $P = \Gamma_L$. The notation $\|\cdot\|$ will be used to denote any of the internal or standard $L^2$ norms that appear in the discussion, and $\langle\cdot,\cdot\rangle$ will denote corresponding $L^2$ inner products.

In this section only we use $x = (x_t)_{t\in\mathbf{T}}$ to range over $\Omega$. Each $x \in \Omega$ gives a path $\Sigma x \in \mathcal{C}$ as above, and we will write $B_t(x) = \Sigma x(t)$ for the internal process on $\Omega$ thus defined. This means in particular that $\Delta B_t(x) = x_t$. The results noted in Section 3.3 tell us that

Theorem 3.16 For a.a. $x \in \Omega$, the path $B_{\mathsf{t}}(x)$ is S-continuous, and the process $b_t(x)$ defined on $\Omega$ by
\[ b_t(x) = {}^\circ B_{\mathsf{t}}(x) \quad\text{for } t = {}^\circ\mathsf{t} \]
is Brownian motion. Thus Wiener measure on $C$ is given by
\[ W(B) = P(\{x \in \Omega : b(x) \in B\}) \]
for Borel $B \subseteq C$.
In the nonstandard approach outlined below, the idea is as follows. For appropriate $\varphi \in L^2(W)$ take a lifting $\Phi(x)$ that is ${}^*$differentiable in the classical sense. Then $\partial\Phi/\partial x_t$ makes sense, and if $\Phi$ is chosen with a little care, the function
\[ \nabla\Phi(x,t) = \frac{\partial\Phi}{\partial x_t}(x) \]
is an $SL^2$ lifting of the gradient $D\varphi$. The approach for the other operators is similar.

One standard approach to the definition of the operators $D, \delta, L$ is via the Wiener–Itô chaos decomposition of $L^2(W)$ in terms of multiple Wiener integrals, which we now mention briefly. First, we have the following representation of the Itô integral that is the exact counterpart of Anderson's representation using his Brownian motion, Theorem 1.43. For this we assume that $\Omega$ is equipped with the usual filtration (generated by the internal process $B$).

Theorem 3.17 Let $f : \Omega \times [0,1] \to \mathbb{R}$ be an adapted $L^2$ process; that is, with $E\bigl(\int_0^1 f^2\bigr) < \infty$. Then
(a) $f$ has an $SL^2$-lifting $F : \Omega \times \mathbf{T} \to {}^*\mathbb{R}$ that is nonanticipating;
(b) for any such lifting, define
\[ G(x,t) = \sum_{s<t} F(x,s)\,x_s \]
and then for a.a. $x$, the function $G(x,\cdot)$ is S-continuous, and
\[ \int_0^{{}^\circ t} f(x,s)\,db_s = {}^\circ G(x,t) \]
for all $t \in \mathbf{T}$.

It is routine to extend this representation to multiple Wiener integrals, as follows. Let
\[ \Delta_n = \{(t_1, \ldots, t_n) : 0 \le t_1 \le \ldots \le t_n \le 1\} \]
and suppose that $f \in L^2(\Delta_n)$ (with respect to Lebesgue measure). Then the multiple Wiener integral $\int_{\Delta_n} f\,db^{(n)}$ is defined by
\begin{align*}
\int_{\Delta_n} f\,db^{(n)} &= \int_{t_n=0}^{1} \int_{t_{n-1}=0}^{t_n} \cdots \int_{t_1=0}^{t_2} f(t_1, \ldots, t_n)\,db_{t_1} \ldots db_{t_{n-1}}\,db_{t_n} \\
&= \int_{\Delta_n} f(t_1, \ldots, t_n)\,db_{t_1} \ldots db_{t_n}.
\end{align*}
This will be abbreviated as $\int f\,db^{(n)}$ or $I_n(f)$ when convenient. The counterpart of Theorem 3.17 above is as follows, where we define
\[ \Delta_n^{\mathbf{T}} = \{(t_1, t_2, \ldots, t_n) \in \mathbf{T}^n : t_1 < t_2 < \ldots < t_n\}. \]
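and, for $F : \Delta_n^{\mathbf{T}} \to {}^*\mathbb{R}$ we write
\begin{align*}
I_n(F) &= \sum_{\Delta_n^{\mathbf{T}}} F(t_1, t_2, \ldots, t_n)\,\Delta B_{t_1} \ldots \Delta B_{t_n} \\
&= \sum_{\Delta_n^{\mathbf{T}}} F(t_1, t_2, \ldots, t_n)\,x_{t_1} \ldots x_{t_n}
\end{align*}

Theorem 3.18 If $F : \Delta_n^{\mathbf{T}} \to {}^*\mathbb{R}$ is an $SL^2$ lifting of $f \in L^2(\Delta_n)$ then $I_n(F)$ is an $SL^2$ lifting of $I_n(f)$.

Remark Since $E(x_s x_t) = \Delta t\,\delta_{s,t}$ it is clear that for $F : \Delta_m^{\mathbf{T}} \to {}^*\mathbb{R}$ and $G : \Delta_n^{\mathbf{T}} \to {}^*\mathbb{R}$ we have
\[ E_\Gamma\bigl(I_m(F)\,I_n(G)\bigr) = \begin{cases} 0 & \text{if } m \ne n \\ \sum_{\Delta_n^{\mathbf{T}}} F(t_1, t_2, \ldots, t_n)\,G(t_1, t_2, \ldots, t_n)\,\Delta t^n & \text{if } n = m \end{cases} \]
It follows easily that $I_m(f)$ and $I_n(g)$ are orthogonal in $L^2(W)$ if $m \ne n$, while $E(I_n(f)\,I_n(g)) = \int_{\Delta_n} fg$.

3.5.2 The Wiener–Itô chaos decomposition

Let $Z_n = \{I_n(f) : f \in L^2(\Delta_n)\}$, which is a closed subspace of $L^2(W)$. The Wiener–Itô decomposition theorem is, putting $Z_0 = \mathbb{R}$:

Theorem 3.19 (Wiener–Itô)
\[ L^2(W) = \bigoplus_{n=0}^{\infty} Z_n \]
That is, $\varphi \in L^2(W)$ has a unique expression as
\[ \varphi = \varphi_0 + \sum_{n=1}^{\infty} I_n(f_n) \tag{3.18} \]
where $\varphi_0 \in \mathbb{R}$ and $f_n \in L^2(\Delta_n)$ for $n \ge 1$. Moreover
\[ \|\varphi\|^2 = \varphi_0^2 + \sum_{n=1}^{\infty} \|f_n\|^2 \]
where $\|\cdot\|$ denotes the appropriate $L^2(\Delta_n)$ norms.

Remark The last part follows from the remark above about the orthogonality of multiple Wiener integrals.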
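The orthogonality relations for the discrete multiple integrals $I_n(F)$ can be verified exactly for a small finite $N$. As a stated substitution, the sketch below replaces the Gaussian increments of $\Gamma$ by Anderson-style increments $\pm\sqrt{\Delta t}$, which also satisfy $E(x_s x_t) = \Delta t\,\delta_{s,t}$ and are independent, so that expectations can be computed by enumerating all $2^N$ sign patterns. The coefficient tables `F1`, `F2`, `G2` and the function names are hypothetical.

```python
import itertools
import numpy as np

def discrete_multiple_integral(F, x):
    """I_n(F)(x) = sum over t1 < ... < tn of F(t1, ..., tn) * x[t1] * ... * x[tn].
    F is a dict mapping strictly increasing index tuples to coefficients."""
    return sum(c * np.prod([x[t] for t in idx]) for idx, c in F.items())

def expectation(func, N):
    """Exact expectation over all 2^N equally likely vectors with entries +-sqrt(1/N)."""
    s = np.sqrt(1.0 / N)
    return sum(func(p) for p in itertools.product((-s, s), repeat=N)) / 2**N

N = 4
dt = 1.0 / N
F1 = {(0,): 1.0, (2,): -2.0}          # integrand of a first-order integral
F2 = {(0, 1): 1.0, (1, 3): 0.5}       # integrands of two second-order integrals
G2 = {(0, 1): 2.0, (2, 3): 1.0}

# Different orders: expectation of the product vanishes.
e_cross = expectation(
    lambda x: discrete_multiple_integral(F1, x) * discrete_multiple_integral(F2, x), N)

# Same order: expectation equals sum of F * G * dt^n over common index tuples.
e_same = expectation(
    lambda x: discrete_multiple_integral(F2, x) * discrete_multiple_integral(G2, x), N)
expected_same = sum(F2[k] * G2[k] for k in F2 if k in G2) * dt**2
```

Only the index tuple `(0, 1)` is common to `F2` and `G2`, so `expected_same` is `2.0 * dt**2`, and `e_same` matches it while `e_cross` is zero.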
There are a number of proofs of Theorem 3.19 in the literature, including two elementary nonstandard proofs in [42], which we will not repeat here. The nonstandard framework we have set up does give some insights into why this theorem holds. The first relates to an insight of Wiener who, according to McKean [72], thought of the integrands $f_n$ in the chaos expansion as being given by the "recipe"
\[ f_n(t_1, \ldots, t_n) = E\bigl(\varphi(b)\,\dot{b}_{t_1}\dot{b}_{t_2} \ldots \dot{b}_{t_n}\bigr) \tag{3.19} \]
with $f_0 = E(\varphi) = \varphi_0$. The underlying idea here is that finite products of the independent random variables $db_t$ (or scaled versions $\dot{b}_t$) form a basis for $L^2(W)$, and a sum of the form (3.18) is the most general expansion of $\varphi$ in this basis. To make this precise, suppose that $\Phi$ is an $SL^2$ lifting of $\varphi \in L^2(W)$. Then we have

Theorem 3.20 For each finite $n$ the function $F_n : \Delta_n^{\mathbf{T}} \to {}^*\mathbb{R}$ given by
\[ F_n(t_1, t_2, \ldots, t_n) = E\bigl(\Phi(B)\,\dot{B}_{t_1}\dot{B}_{t_2} \ldots \dot{B}_{t_n}\bigr) \tag{3.20} \]
is an $SL^2$ lifting of $f_n$ in the expansion (3.18) of $\varphi$, and $F_0 \approx \varphi_0$. Hence for all infinite $M$ the function
\[ \Psi = F_0 + \sum_{n=1}^{M} I_n(F_n) \tag{3.21} \]
is an $SL^2$ lifting of $\varphi$.

Remark Note that any lifting $\Psi$ such as above that reflects the chaos expansion of $\varphi$ is a monomial lifting of $\varphi$; that is, $\Psi = \Psi(x)$ is a polynomial in $(x_t)_{t\in\mathbf{T}}$ with no quadratic or higher powers of any one $x_t$. Thus $\Psi$ is a simpler kind of lifting than $\Phi$ from which it was obtained. An elementary calculation using the independence of the random variables $\dot{B}_t = x_t/\Delta t$ shows that if $\Psi$ is given by (3.21) then for finite $n$
\[ F_n(t_1, t_2, \ldots, t_n) = E\bigl(\Psi(B)\,\dot{B}_{t_1}\dot{B}_{t_2} \ldots \dot{B}_{t_n}\bigr) \]
which explains why the recipe (3.19) is intuitively correct.

Another way to understand the Wiener–Itô theorem involves Hermite polynomials. An $SL^2$ lifting $\Phi$ of $\varphi$ belongs to $L^2({}^*\mathbb{R}^{\mathbf{T}}, \Gamma)$ and thus has an expansion in terms of the ${}^*$Hermite polynomials, which form an orthonormal basis for this space. Theorem 3.19 shows that in this expansion, the nonmonomial terms make only an infinitesimal contribution.⁴

⁴ Of course this is only true for members $\Phi$ of $L^2({}^*\mathbb{R}^{\mathbf{T}}, \Gamma)$ that originate as liftings of functions in $L^2(W)$.

Throughout the following sections, if $\varphi \in L^2(W)$, then by $\varphi_n$ we denote the projection of $\varphi$ into $Z_n$, which is the term $I_n(f_n)$ in the chaos expansion
(3.18) of $\varphi$. The same scheme is adopted for $\Phi$, so that $\Phi_n = I_n(F_n)$ for any $n \in {}^*\mathbb{N}$, where $F_n$ is given by the recipe (3.20). We also write
\[ \Phi^{(m)} = \sum_{n=0}^{m} \Phi_n \]
for $\Phi \in {}^*L^2(\Gamma)$.

3.5.3 The derivation operator

The derivation or gradient $D$ is a densely defined operator $D : L^2(W) \to L^2(W \times [0,1])$ with domain $\mathbb{D}^{2,1}$ defined as follows.

Definition 3.21 (i) $\mathbb{D}^{2,1} = \{\varphi \in L^2(W) : \sum_{n=1}^{\infty} n\|\varphi_n\|^2 < \infty\}$
(ii) for $\varphi \in \mathbb{D}^{2,1}$ with chaos expansion given by (3.18) define
\[ D\varphi(b,t) = \sum_{n=1}^{\infty} I_{n-1}\bigl(\hat{f}_n(\cdots, t)\bigr) \]
where $\hat{f}_n$ is the symmetric extension of $f_n$ to $[0,1]^n$.

It is easy to check that $\|I_{n-1}(\hat{f}_n)\|^2 = n\|f_n\|^2$ (in $L^2(W \times [0,1])$) and so
\[ \|D\varphi\|^2 = \sum_{n=1}^{\infty} n\|\varphi_n\|^2. \]
For a Wiener integral $\varphi(b) = \int_0^1 f(t)\,db_t$ (i.e. $\varphi \in Z_1$) it is straightforward to see that $D\varphi(b,t) = f(t) =$ "$\partial\varphi/\partial(db_t)$", which agrees with the intuitive description of $D$ above. For general $\varphi \in L^2(W)$ the fact that $D$ is a derivative is clear from the following nonstandard approach.

For $\Phi \in {}^*L^2(\Gamma)$ let $\nabla$ be the internal classical derivative
\[ \nabla\Phi(x,t) = \frac{\partial\Phi}{\partial x_t}(x) \]
defined first for ${}^*$differentiable $\Phi$ and extended by linearity using the ${}^*$Hermite polynomial expansion for $\Phi$ in the domain $\mathrm{dom}(\nabla)$ of $\nabla$. Here is a summary of the results that provide the basis of the nonstandard approach to the derivation operator.

Theorem 3.22 Let $\varphi \in L^2(W)$.
(a) If $\varphi = I_n(f)$ and $F$ is an $SL^2$ lifting of $f$, and $\Phi = I_n(F)$, then $\nabla\Phi$ is an $SL^2$ lifting of $D\varphi$.
(b) (i) $\varphi \in \mathbb{D}^{2,1}$ if and only if $\varphi$ has an $SL^2$ lifting $\Phi$ with $\|\nabla\Phi\| < \infty$;
(ii) if $\varphi \in \mathbb{D}^{2,1}$ and $\Phi$ is any $SL^2$ lifting of $\varphi$, then $\nabla\Phi^{(M)}$ is an $SL^2$ lifting of $D\varphi$ for all sufficiently small infinite $M$;
(iii) if ϕ ∈ D^{2,1} with SL² lifting Φ, then ‖Dϕ‖ ≤ ◦‖∇Φ‖, and ∇Φ is an SL² lifting of Dϕ if and only if ‖∇Φ‖ ≈ ‖Dϕ‖.

These results show that there is an analogy between S-integrable liftings of an integrable function (which give the correct integral), and liftings of ϕ ∈ L²(W) that give the correct derivative. The appropriate notion is an SD^{2,1} lifting, defined as follows:

Definition 3.23 Let Φ ∈ ∗L²(Γ). Then Φ is SD^{2,1} if it is SL² and
(i) ‖∇Φ‖ is finite,
(ii) ‖∇(Φ − Φ^{(M)})‖ ≈ 0 for all infinite M.

The importance of this definition is seen in the following.

Theorem 3.24 Let ϕ ∈ L²(W). Then
(a) ϕ ∈ D^{2,1} if and only if ϕ has an SD^{2,1} lifting;
(b) if ϕ has an SL² lifting Φ, then ∇Φ is an SL² lifting of Dϕ if and only if Φ is SD^{2,1}.

The above characterisations provide natural and elementary proofs of the basic results concerning the derivation operator such as the chain rule D(f ∘ ϕ) = f′(ϕ)Dϕ for f ∈ C¹ with f′ bounded, and the product rule D(ϕψ) = ϕDψ + ψDϕ provided that ϕ, ψ ∈ D^{2,1} and in addition ϕDψ + ψDϕ ∈ L²(W × [0, 1]).

A pleasant application is a proof of Ocone’s formula for the integrand in the martingale representation theorem (or stochastic differentiation theorem) for ϕ ∈ D^{2,1}, which tells us that

$$\varphi(b) = \varphi_0 + \int_0^1 g(b,t)\,db_t$$

for a uniquely determined adapted L² function g, the stochastic differential of ϕ.

Theorem 3.25 (Ocone) Let ϕ ∈ D^{2,1} with martingale representation as above. The integrand (stochastic differential) g is given by g(b, t) = E(Dϕ(b, t) | F_t), where F_t is the σ-algebra generated by (b_s)_{s≤t}.
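As a sanity check on Theorem 3.25, here is a worked example for the simple functional ϕ(b) = b_1². The choice of functional is ours, made purely for illustration and not taken from the text. The computation uses only the Wiener-integral case Dϕ = f, the product rule quoted above, and the martingale property of b.

```latex
\begin{align*}
\varphi(b) &= b_1^2, \qquad \varphi_0 = E(b_1^2) = 1,\\
D b_1(t) &= 1 \quad (0 \le t \le 1), \qquad\text{since } b_1 = \int_0^1 1\,db_t,\\
D\varphi(b,t) &= 2\,b_1\,D b_1(t) = 2\,b_1 \qquad\text{(product rule)},\\
g(b,t) &= E\bigl(2\,b_1 \mid \mathcal{F}_t\bigr) = 2\,b_t,\\
b_1^2 &= 1 + \int_0^1 2\,b_t\,db_t .
\end{align*}
```

The last line agrees with what Itô's formula gives directly for b_t².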
The idea of the proof is to rewrite the internal chaos representation

$$\Phi = \sum_{n=0}^{M} \Phi_n = \sum_{n=0}^{M} I_n(F_n)$$

as

$$\Phi = \varphi_0 + \sum_{t\in T} G(x,t)\,x_t \tag{3.22}$$

where G(x, t) involves only x_s for s < t. (This is possible because Φ is a monomial.) Now take Φ that is an SD^{2,1} lifting of ϕ and verify that G(x, t) is a nonanticipating lifting of the integrand g. Working now with (3.22) it is clear that

$$\nabla\Phi(x,t) = G(x,t) + \sum_{t<s} \frac{\partial G(x,s)}{\partial x_t}\,x_s .$$

This gives

$$D\varphi(b,t) = g(b,t) + \int_t^1 Dg_s(b,t)\,db_s$$
where we are writing g_s(b) for g(b, s) and Dg_s is the gradient of the function g_s ∈ L²(W). Taking conditional expectations gives the result.

3.5.4 The Skorohod integral

The Skorohod integral δu (also called the divergence) is defined for certain integrands u ∈ L²(W × [0, 1]) and is an extension of the Itô integral that allows u to be anticipating. It was first introduced in [93]. The usual standard definitions of δu (see [78, 80] for example) obscure the fact that it really is an integral with the intuitive meaning given by (3.16). The nonstandard approach below makes this quite apparent. First, for reference, we give the standard definition. Consider a function u ∈ L²(W × [0, 1]), which has chaos expansion

$$u(b,t) = \sum_{n} u_n(b,t) = u_0(t) + \sum_{n=1}^{\infty} I_n\bigl(f_n(\cdots, t)\bigr)$$

with f_n ∈ L²(∆_n × [0, 1]). The symmetrisation of f = f_n is the function f̃ ∈ L²(∆_{n+1}) given by

$$\tilde{f}(t_1, \ldots, t_{n+1}) = \sum_{i=1}^{n+1} f(t_1, \ldots, t_{i-1}, t_{i+1}, \ldots, t_i).$$

With this we have:

Definition 3.26 The Skorohod integral δu is the L²(W) sum

$$\delta u = \sum_{n=0}^{\infty} I_{n+1}(\tilde{f}_n)$$
when it exists. It is routine to see that δ is linear and Z_n ⊂ dom(δ) for each n. Since δu is given explicitly in terms of its chaos expansion, with zero constant term, we see that E(δu) = 0.

Some routine but tedious combinatorics show that for F : ∆_n^T × T → ∗R and U(x, t) = I_n(F(···, t)), if F̃ is defined in the same way as f̃ we have

$$I_{n+1}(\tilde{F})(x) = \sum_{t\in T} U(x,t)\,x_t - \sum_{t\in T} \frac{\partial U(x,t)}{\partial x_t}\,x_t^2 \tag{3.23}$$
Although we do not have x_t² = ∆t as in Anderson’s model, the fact that these are close⁵ suggests the following representation of δ by the internal operator δ : ∗L²(Γ × T) → ∗L²(Γ) defined thus:

$$\delta U(x) = \sum_{t\in T} U(x,t)\,x_t - \sum_{t\in T} \frac{\partial U(x,t)}{\partial x_t}\,\Delta t$$
for U ∈ ∗L²(Γ × T). The identity (3.23) is the key to proving:

Theorem 3.27 Suppose that u ∈ dom(δ) with SL² lifting U. Then δU^{(M)} is an SL² lifting of δu for all sufficiently small infinite M.

The same result holds for the alternative nonstandard Skorohod integral taken directly from (3.23):

$$\check{\delta}U(x) = \sum_{t\in T} U(x,t)\,x_t - \sum_{t\in T} \frac{\partial U(x,t)}{\partial x_t}\,x_t^2$$

which is more convenient for some purposes.
which is more convenient for some purposes. If U is nonanticipating (that is, U (x, t) does not depend on xs for s ≥ t) ˇ give the usual nonstandard then ∂U (x, t)/∂xt = 0, and so both δU and δU representation of the Itˆ o integral since xt = ∆Bt . Thus
1 udb δu = 0
if u ∈ L2 (W × [0, 1]) is adapted. The following, together with Theorem 3.27 above, parallels Theorem 3.22 for the derivation operator. 5
In fact E(x2t ) = ∆t and E((x2t − ∆t)2 ) = 2∆t2
Theorem 3.28 Let u ∈ L²(W × [0, 1]).
(a) u ∈ dom(δ) if and only if u has an SL² lifting U with ‖δU‖ < ∞;
(b) if u ∈ dom(δ) with SL² lifting U, then ‖δu‖ ≤ ◦‖δU‖, and δU is an SL² lifting of δu if and only if ‖δU‖ ≈ ‖δu‖.

In the literature on the Malliavin calculus, there is much mention of various integration by parts formulae. The most fundamental of these shows that δ and D are dual to one another; at the nonstandard level of liftings, it is nothing other than classical integration by parts on Ω = ∗R^T.

Theorem 3.29 Suppose that ϕ ∈ D^{2,1} and u ∈ dom(δ). Then

$$E(\varphi\,\delta u) = E(D\varphi \,.\, u) \tag{3.24}$$

where v.u denotes the inner product in L²([0, 1]). That is, in terms of the inner products in L²(W) and L²(W × [0, 1]) respectively,

$$\langle \varphi, \delta u\rangle = \langle D\varphi, u\rangle .$$

Proof (Sketch) Take SL² liftings Φ and U such that ∇Φ and δU are SL² liftings of Dϕ and δu. It is sufficient to show that E(ΦδU) = E(∇Φ.U), where E = E_Γ and V.U = Σ_{t∈T} V(t)U(t)∆t. That is

$$E\Bigl[\Phi(x)\Bigl(\sum_{t\in T} U(x,t)\,x_t - \sum_{t\in T} \frac{\partial U(x,t)}{\partial x_t}\,\Delta t\Bigr)\Bigr] = E\Bigl[\sum_{t\in T} \frac{\partial\Phi(x)}{\partial x_t}\,U(x,t)\,\Delta t\Bigr].$$

For each fixed t in the sums, this follows immediately by scaling the classical integration by parts formula for the standard Gaussian measure γ₁ (distribution N(0, 1)) on R:

$$E(f'g + fg') = E(\xi f g)$$

where Eh = ∫_R h(ξ) dγ₁(ξ) = (2π)^{−1/2} ∫_R h(ξ) exp(−½ξ²) dξ.

Early applications of the Malliavin calculus used the integration by parts formula (3.24) in a crucial way in proving the existence of smooth densities for the measures induced by certain L²(W) functionals – notably those coming from solutions of SDEs. A simple example to illustrate this kind of application is outlined below. As in many applications it involves analysis of the so-called Malliavin covariance σ(ϕ) ∈ L²(W) for ϕ ∈ D^{2,1}, given by

$$\sigma(\varphi)(b) = \int_0^1 D\varphi(b,t)^2\,dt = \|D\varphi(b,\cdot)\|^2_{L^2[0,1]}$$

If Φ is an SD^{2,1} lifting of ϕ then (∇Φ.∇Φ)(x) = Σ_{t∈T} ∇Φ(x, t)∇Φ(x, t)∆t is an SL¹ lifting of σ(ϕ). (In higher dimensions σ(ϕ) is matrix valued.)
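The classical Gaussian integration by parts formula used in the proof of Theorem 3.29 can be checked exactly for polynomials, since the moments of N(0, 1) are known in closed form. The sketch below is only an illustration: the helper names and the particular polynomials f and g are our own assumptions, not anything from the text. It also confirms the second-moment identity E((ξ² − 1)²) = 2, which after scaling by ∆t is the statement that x_t² is close to ∆t.

```python
def gaussian_moment(k):
    """E(xi^k) for xi ~ N(0, 1): 0 for odd k, (k - 1)!! for even k."""
    if k % 2:
        return 0
    m = 1
    for j in range(1, k, 2):
        m *= j
    return m

# Polynomials are coefficient lists: p[k] is the coefficient of xi^k.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_diff(p):
    return [k * a for k, a in enumerate(p)][1:] or [0]

def expect(p):
    """E(p(xi)) under the standard Gaussian measure."""
    return sum(a * gaussian_moment(k) for k, a in enumerate(p))

f = [1, 2, 0, 3]   # f(xi) = 1 + 2 xi + 3 xi^3   (arbitrary test polynomial)
g = [0, -1, 4]     # g(xi) = -xi + 4 xi^2        (arbitrary test polynomial)

lhs = expect(poly_add(poly_mul(poly_diff(f), g), poly_mul(f, poly_diff(g))))  # E(f'g + fg')
rhs = expect(poly_mul([0, 1], poly_mul(f, g)))                                # E(xi f g)
second_moment = expect(poly_mul([-1, 0, 1], [-1, 0, 1]))                      # E((xi^2 - 1)^2)
```

Integer coefficients keep the arithmetic exact, so the two sides can be compared with `==` rather than with a floating-point tolerance.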
Theorem 3.30 Suppose that ϕ ∈ L²(W) with
(a) ϕ ∈ D^{2,2} (that is, ϕ ∈ D^{2,1} and Dϕ ∈ D^{2,1}),
(b) σ(ϕ) ∈ D^{2,1},
(c) σ(ϕ) ≠ 0 a.s.
Then the measure induced on R by ϕ is absolutely continuous with respect to Lebesgue measure.

Proof Here is a sketch of the proof; full details are given in [43]. For any real ε > 0 define P_ε on Ω by

$$\frac{dP_\varepsilon}{dP} = \frac{\sigma}{\sigma + \varepsilon}$$

where σ = σ(ϕ), and let µ_ε = P_ε ∘ ϕ^{−1}. It is sufficient to show that µ_ε ≪ Lebesgue measure for each ε. For a fixed ε, take an SD^{2,1} lifting Φ of ϕ such that ∇²Φ is SL², and an SD^{2,1} lifting S of σ. Since σ ≥ 0 a.s. we may take S ≥ −½ε surely. Take ψ ∈ C_b¹ and then

$$\begin{aligned}
\Bigl|\int \psi'\,d\mu_\varepsilon\Bigr| &\approx \Bigl|E\Bigl({}^*\psi'(\Phi)\,\frac{\nabla\Phi.\nabla\Phi}{S+\varepsilon}\Bigr)\Bigr| \\
&= \bigl|E\,\langle \nabla\,{}^*\psi(\Phi),\ (S+\varepsilon)^{-1}\nabla\Phi\rangle_T\bigr| \\
&= \bigl|E\bigl({}^*\psi(\Phi)\,\delta((S+\varepsilon)^{-1}\nabla\Phi)\bigr)\bigr| \\
&\le \|\psi\|_\infty\,E\,\bigl|\delta((S+\varepsilon)^{-1}\nabla\Phi)\bigr|
\end{aligned}$$

where we have used classical calculus including the duality of ∇ and δ (integration by parts) as above. Further internal classical calculus shows that

$$\delta((S+\varepsilon)^{-1}\nabla\Phi) = (S+\varepsilon)^{-1}\,\delta\nabla\Phi - (S+\varepsilon)^{-2}\,\langle \nabla S, \nabla\Phi\rangle_T$$

which in consequence is S-integrable. Thus

$$\Bigl|\int \psi'\,d\mu_\varepsilon\Bigr| \le c\,\|\psi\|_\infty$$

where c = ◦E|δ((S + ε)^{−1}∇Φ)| < ∞, and this is sufficient to show that µ_ε ≪ Lebesgue measure.

A simple application of this result shows that any non-zero functional ϕ ∈ Z_n induces a measure that is absolutely continuous with respect to Lebesgue measure. First show that any such ϕ has σ(ϕ) ∈ Z_{2n−2} as follows. If ϕ = Σ_{m≤n} I_m(f_m), take SL² liftings F_m of f_m so that ∇Φ = Σ_{m≤n} I_{m−1}(F̂_m) is an SL² lifting of Dϕ and S = ∇Φ.∇Φ is an S-integrable lifting of σ(ϕ). In fact it can be shown that in this case S is SL². Since the monomial terms in S have degree ≤ 2(n − 1) this means that σ(ϕ) ∈ Z_{2n−2}. It is clear that σ(ϕ) ≠ 0, and it is then routine using the local property of the operator D to show that σ(ϕ) ≠ 0 a.s. This is now sufficient to apply Theorem 3.30. (For further details consult [43].)
3.5.5 The Malliavin operator

The Malliavin operator L played a prominent rôle in the early development of the Malliavin calculus, but it is now understood that it is not the most fundamental operator of the calculus. One of the simplest definitions is:

Definition 3.31 The operator L on L²(W) is given by L = δD with domain the set of ϕ ∈ D^{2,1} with Dϕ ∈ dom(δ).

The corresponding internal operator is given in the same way.

Definition 3.32 Define LΦ by

$$L\Phi = \sum_{t\in T} \frac{\partial\Phi(x)}{\partial x_t}\,x_t - \sum_{t\in T} \frac{\partial^2\Phi(x)}{\partial x_t^2}\,\Delta t = \delta\nabla\Phi$$

for all relevant Φ ∈ ∗L²(Ω).

It is straightforward to see that for Φ = I_n(F) we have LΦ = nΦ and so from the results above it follows that:

Theorem 3.33 L is the number operator on L²(W); that is

$$\operatorname{dom}(L) = \Bigl\{\varphi \in L^2(W) : \sum_{n=1}^{\infty} n^2\,\|\varphi_n\|^2 < \infty\Bigr\} = D^{2,2}$$

say, and for ϕ ∈ D^{2,2}

$$L\varphi = \sum_{n=1}^{\infty} n\,\varphi_n$$
The following results follow from the corresponding results for D and δ.

Theorem 3.34 Let ϕ ∈ L²(W).
(a) ϕ ∈ D^{2,2} if and only if ϕ has an SL² lifting Φ with ‖LΦ‖ < ∞;
(b) if ϕ ∈ D^{2,2} with SL² lifting Φ, then L(Φ^{(M)}) = (LΦ)^{(M)} is an SL² lifting of Lϕ for all sufficiently small infinite M;
(c) if ϕ ∈ D^{2,2} with SL² lifting Φ, then ‖Lϕ‖ ≤ ◦‖LΦ‖, and LΦ is an SL² lifting of Lϕ if and only if ‖LΦ‖ ≈ ‖Lϕ‖.
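To see concretely why L acts as a number operator, the following short computation applies the internal formula of Definition 3.32 to three simple polynomials in the coordinates x_s, x_t (with s ≠ t). The examples are our own choice for illustration; the second and third are the rescaled Hermite polynomials of degree 2 and 3 in the single variable x_t.

```latex
\begin{align*}
\Phi = x_s x_t:\quad & L\Phi = (x_t\,x_s + x_s\,x_t) - 0 = 2\,\Phi,\\
\Phi = x_t^2 - \Delta t:\quad & L\Phi = 2x_t\cdot x_t - 2\,\Delta t = 2\,\Phi,\\
\Phi = x_t^3 - 3\,\Delta t\,x_t:\quad & L\Phi = (3x_t^2 - 3\,\Delta t)\,x_t - 6x_t\,\Delta t = 3\,\Phi .
\end{align*}
```

In each case the eigenvalue is the degree of the polynomial, in line with LΦ = nΦ for Φ = I_n(F).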
Some of the deeper properties of the operator L are derived as a consequence of the fact that it is the generator of the infinite dimensional Ornstein–Uhlenbeck process of Section 3.4. This is seen by observing that L is the generator of the Ornstein–Uhlenbeck process U in ∗R^T. For more details as to how the basics of the Malliavin calculus are developed in the framework we have outlined, see the paper [43]. In some more recent work the operators D, δ, L have been extended to a considerable portion of L²(Ω, Γ_L), the Loeb space that sits above L²(W). Similar ideas have been developed by Horst Osswald in [81].
4. Mathematical Finance Theory
4.1 Introduction

Mathematical finance theory, an area of research that has seen remarkable growth and interest in recent years, involves modelling of financial markets, including the random evolution of stock and interest prices, and the use of sophisticated ideas from stochastic analysis to price (fairly) financial instruments such as futures and options. There are, broadly speaking, two approaches to the fundamental modelling question – the discrete and the continuous. The former recognises that in reality financial transactions and other events that affect the market do take place at discrete moments in time, and prices are drawn from a discrete set of values. By contrast, the continuous approach is based on the limiting case where no lower bound is set on the time between events in the model, and prices can take values across the continuum. There are of course hybrid models, but we will not consider them here. An important issue is the relationship between the two approaches – in particular whether the discrete models converge to the continuous ones, and if so, in what sense. There is a natural rôle here for ideas from nonstandard analysis, since this framework allows us to build discrete models with infinitesimal time intervals between events. This has been explored in a series of papers with Ekkehard Kopp and Walter Willinger [35, 36, 37, 39, 40]. These show that, roughly speaking, continuous models are the standard parts of hyperfinite discrete models (with infinitesimal time steps). As a consequence a new mode of convergence has been isolated (see [37]), sufficiently strong to give convergence of the discrete models and all their apparatus to the continuous counterparts.
The convergence results in [37] are in fact a particular case of a general set of results showing how discrete stochastic calculus (including stochastic differentiation as well as stochastic integration) converges to the continuous counterpart – with stochastic differentiation given by the martingale representation theorem, and integration as formulated by Itô. These ideas were presented in [38], and are discussed in more detail later in this Lecture. Since modern mathematical finance is based firmly on stochastic analysis, the development of the nonstandard approach based on Loeb measures (beginning with Anderson’s Brownian motion and developed by Keisler, Hoover
N.J. Cutland: LNM 1751, pp. 85–101, 2000. © Springer-Verlag Berlin Heidelberg 2000
& Perkins, Lindstrøm, and many others) meant that all the tools needed for the work mentioned above were readily available. The survey presented here will avoid as far as possible the technical details of the subject (which are many), seeking to convey the basic ideas and the ways in which the nonstandard methodology provides new insight and results. The discussion is confined to the Cox-Ross-Rubinstein (CRR) models for the discrete theory, and the Black-Scholes (BS) model for the continuous version (the setting of the famous Black-Scholes formula – see below). Both models give a mathematical representation of the evolution in time of stocks that fluctuate randomly, and bonds¹ which grow in value deterministically. Both models allow the discussion of so-called derivative securities – and in particular options, which will be defined below.
4.2 The Cox-Ross-Rubinstein Models

To begin we describe (a version of) the discrete Cox-Ross-Rubinstein (CRR) model M_n based on discrete time steps of length 1/n. Set ∆_n = 1/n, and write T_n for the time set T_n = {k∆_n : 0 ≤ k < n} and T̄_n = T_n ∪ {1}. We use sans serif symbols t, s etc. for elements of T_n or T̄_n. Write C_n for the set of paths of a simple random walk based on T_n with step size ±√∆_n; i.e. if X ∈ C_n then

$$X(0) = 0, \qquad X(t + \Delta_n) = X(t) \pm \sqrt{\Delta_n} \quad\text{for } t \in T_n$$

and X is filled in linearly between points of T_n. This gives the probability space Ω_n = (C_n, A_n, W_n) where W_n is the counting probability on C_n, and A_n = P(C_n). The random walk B_n on Ω_n is the canonical process B_n(t, X) = X(t) for each X ∈ C_n. By B_n(X) we mean the path (B_n(t, X))_{t∈T_n}, and we write ∆B_n(t, X) = B_n(t + ∆_n, X) − B_n(t, X) = ∆X(t). At the risk of confusion, but in order to emphasise the rôle of C_n as the set of sample points of a probability space, we will on occasions write Ω_n = C_n and use ω as well as X to denote an element of C_n, so we could equally well write B_n(t, ω) above.

The CRR model gives a mathematical representation of the evolution in time of the unit price of a randomly fluctuating stock² S_n(t) for t ∈ T_n. The random walk B_n is not appropriate for this, since it can take negative values and has mean zero! The CRR model takes a geometrical version of this as follows. The initial value of the stock S_n(0) = s_0 is assumed to be given, and then for t > 0 the price process is defined by

$$S_n(t + \Delta_n, \omega) = S_n(t, \omega)\bigl(1 + \mu\Delta_n + \sigma\Delta B(t, \omega)\bigr) \tag{4.1}$$

or

$$\Delta S_n(t, \omega) = S_n(t, \omega)\bigl(\mu\Delta_n + \sigma\Delta B(t, \omega)\bigr) \tag{4.2}$$

giving S_n : T_n × Ω_n → R_{>0} provided that 1 + µ∆_n ± σ√∆_n > 0, which is assumed. In this definition the parameters µ and σ of the price process are the drift and volatility respectively. The drift µ gives the general trend of the price and the volatility σ > 0 is a measure of the random fluctuation.³

¹ Equivalently, funds held in a bank or other interest bearing account.
² The model we are describing is a one stock model; an m-stock model would represent the random evolution of m different stocks.
³ If σ = 0 the price would behave like a fixed interest rate investment giving S_n(t) = S_n(0)(1 + µ∆_n)^{nt}, whereas if µ = 0 the price fluctuates but the mean remains at the original price; in fact the average change in price at each time step is zero.

The other component of the CRR model is the price of a bond, which is deemed to be a safe investment at a fixed interest rate r. By discounting to current prices, and by changing the unit of currency, we may without loss of generality take r = 0 and the initial price of the bond as 1.

Within the CRR model a variety of financial activities can be represented. First, consider an investor who at any given time t ∈ T_n holds a portfolio consisting of a number of units each of the stock and the bond. Denote this as follows:

Θ^b(t) = the number of units of the bond held at time t
Θ^s(t) = the number of units of the stock held at time t
Θ(t) = (Θ^b(t), Θ^s(t))

The investor may well, at time t, take account of the performance of the stock up to the time t, and so Θ can depend on ω up to time t. The nonanticipating⁴ function Θ : T_n × Ω_n → R² is the trading strategy of the investor; that is, it is a function that describes the changes he makes to his portfolio over time. The idea is that he holds the portfolio Θ(t) on the time interval [t, t + ∆_n) and at time t + ∆_n adjusts it to become Θ(t + ∆_n). The value of this portfolio at time t is clearly given by the random process

$$V(t, \omega) = \Theta^b(t, \omega) + \Theta^s(t, \omega)\,S_n(t, \omega). \tag{4.3}$$

⁴ Recall that in this context this means that at time t the dependence in ω is only on ω(s) for s ≤ t.

If the portfolio is not changed at time t + ∆_n then at that time the investment will be worth

$$\Theta^b(t, \omega) + \Theta^s(t, \omega)\,S_n(t + \Delta_n, \omega) = V^-(t + \Delta_n, \omega) \tag{4.4}$$
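The two special cases mentioned in the footnote to the price process (σ = 0 and µ = 0) can be checked by brute force on a small CRR model. The following is a minimal sketch, assuming illustrative parameter values of our own choosing (n = 8, s_0 = 100); the function name `crr_prices` is hypothetical. It builds the price path from (4.1) and averages over all 2^n paths with the counting probability.

```python
from itertools import product
from math import sqrt, isclose

def crr_prices(n, s0, mu, sigma, steps):
    """Price path S_n(k/n), k = 0..n, built from (4.1); steps is a tuple of +1/-1."""
    dt = 1.0 / n
    path = [s0]
    for e in steps:
        # Delta B = e * sqrt(dt) is the increment of the underlying random walk
        path.append(path[-1] * (1 + mu * dt + sigma * e * sqrt(dt)))
    return path

n, s0 = 8, 100.0  # illustrative values only

# sigma = 0: deterministic growth S_n(1) = s0 (1 + mu/n)^n
flat = crr_prices(n, s0, 0.05, 0.0, (1,) * n)

# mu = 0: the mean of S_n(1) under the counting probability W_n stays at s0
mean_final = sum(
    crr_prices(n, s0, 0.0, 0.3, w)[-1] for w in product((1, -1), repeat=n)
) / 2**n
```

Enumerating all paths is only feasible for small n, but it mirrors the definition of W_n as the counting probability directly.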
say, and a gain⁵ of Θ^s(t, ω)(S_n(t + ∆_n, ω) − S_n(t, ω)) = Θ^s(t, ω)∆S_n(t, ω) will have been made. The gains process for the trading strategy Θ is thus defined by

$$G(t, \omega) = \sum_{s<t} \Theta^s(s, \omega)\,\Delta S_n(s, \omega).$$

In general it is assumed that the portfolio is changed at time t + ∆_n, and this raises the question of financing any changes⁶ to the holding of stocks or bonds. If the investor neither injects new capital nor withdraws any funds, then his modified portfolio at time t + ∆_n must have total value equal to the funds available to invest, namely V^−(t + ∆_n, ω) as given by (4.4) above – although the pattern of holding of bonds and stocks may change. When an investor pursues such a course of action throughout, the trading strategy Θ is called self-financing. An equivalent formulation of this notion is to say that any changes in the value of the portfolio are given entirely by the gains (or losses). Thus we have

Definition 4.1 A trading strategy Θ is self-financing if any one of the following equivalent conditions holds.
(i) V(t + ∆_n, ω) = V^−(t + ∆_n, ω) for all t ∈ T_n;
(ii) V(t, ω) = V(0, ω) + G(t, ω) for all t ∈ T_n;
(iii) ∆V(t, ω) = Θ^s(t, ω)∆S_n(t, ω).

Remark If no changes to a portfolio are made, then this is of course self-financing.

An important concept in financial models is that of an arbitrage opportunity; in the model M_n this is a self-financing strategy Θ with V(0) = 0 and V(1, ω) ≥ 0 for all ω, with V(1, ω) > 0 for some ω. Informally an arbitrage opportunity is known as a free lunch. It turns out that in the models M_n there are no free lunches – and the models are said to be viable.
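The equivalence of conditions (i) and (ii) in Definition 4.1 can be seen in a small numerical experiment. The sketch below is an illustration under assumptions of our own: the parameter values, the helper names, and in particular the stock-holding rule `theta_s` are hypothetical. The bond holding is chosen at each step so that condition (i) holds, and the code then checks that condition (ii) follows on every path.

```python
from itertools import product
from math import sqrt, isclose

n, s0, mu, sigma, v0 = 3, 100.0, 0.1, 0.2, 50.0   # illustrative values only
dt = 1.0 / n

def price(w, k):
    """S_n(k*dt, w) from (4.1) along the path w, a tuple of +1/-1 steps."""
    s = s0
    for e in w[:k]:
        s *= 1 + mu * dt + sigma * e * sqrt(dt)
    return s

def theta_s(w, k):
    """Hypothetical nonanticipating stock holding: uses w[:k] only."""
    return 1.0 + 0.5 * sum(w[:k])

def value_and_gains(w):
    """Return (V, G) as lists indexed by k = 0..n for a self-financed portfolio."""
    V, G = [v0], [0.0]
    for k in range(n):
        th_s = theta_s(w, k)
        th_b = V[k] - th_s * price(w, k)            # bond holding from (4.3)
        v_minus = th_b + th_s * price(w, k + 1)     # V^-(t + dt) from (4.4)
        V.append(v_minus)                           # condition (i): no cash in or out
        G.append(G[k] + th_s * (price(w, k + 1) - price(w, k)))
    return V, G

# Condition (ii), V = V(0) + G, then holds on every path and at every time.
ok = all(
    isclose(V[k], v0 + G[k], rel_tol=1e-12, abs_tol=1e-9)
    for w in product((1, -1), repeat=n)
    for V, G in [value_and_gains(w)]
    for k in range(n + 1)
)
```

Any other rule for `theta_s` that looks only at `w[:k]` would serve equally well; the check does not depend on the particular strategy.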
4.3 Options and Contingent Claims

Financial options constitute one of the main kinds of derivative that are currently (and increasingly) being traded in financial markets, with levels of activity exceeding the trading of the underlying stocks on which options are based.

⁵ This could of course be negative, in which case it would be a loss.
⁶ In this simple financial model it is assumed that the cost of making changes is zero, i.e. there are no transaction costs, which is of course unrealistic in practice.
A call option on a stock⁷ is a right (in the form of a contract that has been purchased by the owner) to buy a certain quantity of shares at a fixed price K per share (written into the contract now) at some future date T. The owner of the option is not obliged to buy those shares, so he may or may not exercise the option when the time T comes. Whether or not he does will depend on the price of the shares at that time T, and how it compares with the strike price K. The fundamental question of option pricing is to determine the fair price of an option. Clearly the owner of the option cannot lose at time T: he will buy (and make a gain) if the market price of the shares at that time is more than the predetermined purchase price K guaranteed to him in his contract; otherwise he will not exercise his option. Plainly, an option of this kind is a valuable right, and so there will be a price to pay for it now.

⁷ The complementary option – the right to sell a quantity of stock – is called a put option. This is not, of course, the same as selling a call option.

A simple example is the European call option, which we can describe in the above CRR model taking T = 1, and thinking of the present as time 0. A European call option with exercise time 1 and strike price K gives its owner the right to buy one unit of stock at time 1 for the price K. At time 1 it makes sense to exercise the option if and only if S_n(1, ω) ≥ K. This gives a guaranteed immediate profit of S_n(1, ω) − K. In the other situation, if S_n(1, ω) < K, to purchase the stock at price K would be absurd – if the stock were desired it could be obtained more cheaply directly from the market. To summarise, the owner of the European call option with strike price K and exercise time T = 1 owns an asset that will be worth (S_n(1, ω) − K)⁺ at time 1. A slightly more sophisticated option is an American call option, which gives the owner the right to buy at any time of his choosing, up to and including a set expiry time. There are many other so-called ‘exotic’ options that are more complex and require more sophistication on the part of the trader.

Abstractly an option is an example of a contingent claim, which is, in the CRR model, simply a non-negative random variable C(ω). This represents the value at the time t = 1 of an asset that can be purchased now. In the case of the European call, we have C(ω) = (S_n(1, ω) − K)⁺ and in the case of the American option there is a more complicated formula involving stopping times. One of the first goals of the theory is to discover (if possible) the fair price at which such a claim should be traded now (i.e. at time t = 0). In the next section we will outline the general approach to pricing a claim C(ω) in the CRR model, and mention briefly the price so obtained for the European call.
4.3.1 Pricing a claim

The idea for determining the ‘fair’ price Π(C) of an option C(ω) is the following. Suppose I (the person thinking of buying the option now, at time t = 0) knew that there was a self-financing trading strategy Θ such that at time t = 1 its value V(1, ω) was exactly C(ω) for all possible ω. Then in order to embark on that strategy now, I would have to invest a sum of V(0) = Θ^b(0) + Θ^s(0)s_0 to buy the portfolio of bonds and stocks indicated by the strategy (note that at time t = 0 a strategy is non-random). Thus, assuming that there are no arbitrage opportunities, V(0) would be a fair price to buy (or sell) the option C(ω). This is because buyer or seller could equally well, for that price, purchase a random asset which would be worth C(ω) at time t = 1 by investing in stocks and bonds using the strategy Θ. This argument for fixing the fair price hinges crucially on the existence of a hedging or replicating strategy Θ, so the following result is fundamental.

Theorem 4.2 Let C(ω) ≥ 0 be a contingent claim in the CRR model M_n. Then there is a unique self-financing trading strategy Θ such that

$$C(\omega) = V(1, \omega) \tag{4.5}$$

for all ω.

Proof (Sketch) This is actually simple linear algebra rather than probability theory, and is best illustrated by the case n = 1 and thinking about how it generalises. In the case n = 1 we have ∆_n = 1 and C_1 = Ω_1 consists of two points (paths) ω = X⁺ and ω = X⁻ say, for which X⁺(1) = 1 and X⁻(1) = −1. The claim C is given by its two values C(X⁺) and C(X⁻) and the desired trading strategy Θ has two unknown values Θ^b(0) and Θ^s(0). The equation V(1, ω) = C(ω) together with the requirement that Θ be self-financing (so that V(1, ω) = V^−(1, ω)) gives the following equations for Θ

$$\begin{aligned}
C(X^+) &= \Theta^b(0) + \Theta^s(0)\,S_1(1, X^+)\\
C(X^-) &= \Theta^b(0) + \Theta^s(0)\,S_1(1, X^-)
\end{aligned}$$

which have a unique solution for Θ. The general case n > 1 is similar.
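The two linear equations in the proof sketch for n = 1 can be solved directly. Here is a minimal sketch, assuming illustrative values s_0 = 100, µ = 0.05, σ = 0.2 and a European call with strike K = 100 as the claim; the function name `replicate_one_step` and these numbers are our own choices, not the author's.

```python
def replicate_one_step(s0, mu, sigma, c_up, c_down):
    """Solve C(X+-) = Theta_b(0) + Theta_s(0) * S_1(1, X+-) for the case n = 1."""
    s_up = s0 * (1 + mu + sigma)     # Delta_1 = 1 and Delta B = +1
    s_down = s0 * (1 + mu - sigma)   # Delta B = -1
    theta_s = (c_up - c_down) / (s_up - s_down)
    theta_b = c_up - theta_s * s_up
    return theta_b, theta_s

s0, mu, sigma, K = 100.0, 0.05, 0.2, 100.0   # illustrative values only
s_up, s_down = s0 * (1 + mu + sigma), s0 * (1 + mu - sigma)
c_up, c_down = max(s_up - K, 0.0), max(s_down - K, 0.0)

theta_b, theta_s = replicate_one_step(s0, mu, sigma, c_up, c_down)
initial_cost = theta_b + theta_s * s0   # V(0), the cost of setting up the portfolio
```

With these numbers the stock ends at 125 or 85, the claim pays 25 or 0, and the replicating portfolio is short the bond and long 0.625 units of stock. The system is solvable because σ > 0 forces the two terminal stock prices to differ.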
Remark There are alternative slightly more sophisticated ways to formulate the proof of the above result in the general case. One is to solve for the 2^n values⁸ V(0) and Θ^s(t, ω) for t ∈ T_n using the 2^n linear equations

$$C(\omega) = V(1, \omega) = V(0) + \sum_{t<1} \Theta^s(t, \omega)\,\Delta S_n(t, \omega) \tag{4.6}$$

and deriving Θ^b(t, ω) from (4.3) and the fact that⁹

$$V(t, \omega) = V(0) + \sum_{s<t} \Theta^s(s, \omega)\,\Delta S_n(s, \omega).$$

⁸ Since Θ is required to be nonanticipating, if t = k∆_n is fixed then Θ^s(t, ω) takes only 2^k values.
⁹ These equations and (4.6) are derived from the definition of the value (4.3) using the self-financing property.

This approach, when combined with Theorem 4.4 below, shows that Theorem 4.2 is an application of the discrete martingale representation theorem.

Definition 4.3 The fair price Π(C) of a contingent claim C(ω) is defined to be Π(C) = V(0) where V is the value process for the self-financing strategy Θ that generates C.

Note that no probability theory was involved in the above result and definition, even though we used probabilistic notation. However, for the continuous counterpart of the above theory, the tools of probability and stochastic analysis are essential, and the counterpart of the following result plays an important rôle.

Theorem 4.4 There is a unique probability measure Q_n on Ω_n, called the equivalent martingale measure, that makes the price process S_n(t, ω) a martingale¹⁰. For any claim C(ω) the fair price Π(C) = V(0) derived from the unique replicating strategy Θ is also given by

$$\Pi(C) = E_{Q_n}(C(\cdot))$$

¹⁰ For those not familiar with this notion, a martingale is, in the present context, a nonanticipating stochastic process M(t, ω) such that at each t the average of the increments ∆M(t, ω), given knowledge of ω(s) for s ≤ t, is zero.

Proof (Sketch) This is again routine combinatorics, simply defining the probability Q_n so that the conditional mean, given S_n(t, ω), of the two values for ∆S_n(t, ω) = S_n(t, ω)(µ∆_n + σ∆B(t, ω)) = S_n(t, ω)(µ∆_n ± σ√∆_n) is zero. Once S_n is a martingale under Q_n, the expression (4.6) gives V(0) = E_{Q_n}(C) immediately.

The upshot of the theory outlined above is that if we consider the following three entities:
(i) a contingent claim C(ω);
(ii) the self-financing strategy Θ(t, ω) that generates C(ω);
(iii) the associated value process V(t, ω)
then any one element of the triple¹¹ (C, Θ, V) determines the other two, using Theorem 4.2 to obtain Θ from C, and the equations (4.3) and (4.5) to obtain V from Θ and C from V. In the next section we describe the continuous counterpart of this structure, before addressing the question of convergence of discrete models to the continuous one. The main theorem will show that with the “right” mode of convergence, if we have a sequence of triples (C_n, Θ_n, V_n) then convergence of any one of the sequences C_n, Θ_n or V_n to a continuous version implies convergence of the other components of the triples.

Before moving on, as promised above, we mention the application of the above theory to the particular case of a European call option, for which an explicit formula is obtained by routine (but tedious) combinatorics. A little notation is required. Let

$$u = 1 + \mu\Delta_n + \sigma\sqrt{\Delta_n}, \qquad d = 1 + \mu\Delta_n - \sigma\sqrt{\Delta_n}, \qquad q = \tfrac{1}{2}\Bigl(1 - \frac{\mu}{\sigma}\sqrt{\Delta_n}\Bigr),$$

and let A be the first integer m for which s_0 u^m d^{n−m} > K.

Theorem 4.5 Let C(ω) = (S_n(1, ω) − K)⁺ be a European call option in the CRR model. Then the CRR price Π(C) is given by

$$\Pi(C) = s_0\,\Phi(A; n, uq) - K\,\Phi(A; n, q) \tag{4.7}$$

where Φ is the complementary binomial distribution function

$$\Phi(m; n, p) = \sum_{j=m}^{n} \binom{n}{j}\,p^j (1 - p)^{n-j}.$$
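Formula (4.7) can be compared numerically with the expectation E_{Q_n}(C) of Theorem 4.4. In the sketch below the function names and parameter values are our own illustrative assumptions. It uses the fact, which follows from the proof sketch of Theorem 4.4, that under Q_n each up-step has probability q, so the 2^n paths can be grouped by their number of up-steps.

```python
from math import comb, sqrt, isclose

def crr_params(n, mu, sigma):
    """The quantities u, d, q defined before Theorem 4.5."""
    dt = 1.0 / n
    u = 1 + mu * dt + sigma * sqrt(dt)
    d = 1 + mu * dt - sigma * sqrt(dt)
    q = 0.5 * (1 - (mu / sigma) * sqrt(dt))
    return u, d, q

def comp_binom(m, n, p):
    """Complementary binomial distribution function Phi(m; n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(m, n + 1))

def crr_call_formula(n, s0, K, mu, sigma):
    """Right-hand side of (4.7)."""
    u, d, q = crr_params(n, mu, sigma)
    # A = first m with s0 u^m d^(n-m) > K; if there is none, both sums are empty
    A = next((m for m in range(n + 1) if s0 * u**m * d**(n - m) > K), n + 1)
    return s0 * comp_binom(A, n, u * q) - K * comp_binom(A, n, q)

def crr_call_expectation(n, s0, K, mu, sigma):
    """E_{Q_n}((S_n(1) - K)^+), grouping paths by the number j of up-steps."""
    u, d, q = crr_params(n, mu, sigma)
    return sum(
        comb(n, j) * q**j * (1 - q)**(n - j) * max(s0 * u**j * d**(n - j) - K, 0.0)
        for j in range(n + 1)
    )

args = (50, 100.0, 100.0, 0.05, 0.2)   # n, s0, K, mu, sigma: illustrative values
by_formula = crr_call_formula(*args)
by_expectation = crr_call_expectation(*args)
```

The agreement rests on the identity uq + d(1 − q) = 1, which is what makes uq a legitimate probability in the first term of (4.7).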
¹¹ We could also include the gains process G(t, ω) = V(t, ω) − V(0) and make this a quadruple, provided V(0) is specified along with G. See the discussion in [37].

4.4 The Black-Scholes Model

The simplest continuous time one-stock pricing model of Black-Scholes takes as underlying probability space Ω = C = C_0[0, 1] with P the Wiener measure. On this space let b(t, ω) = b_t(ω) = ω(t) be the canonical Brownian motion. The price of a single risky stock s with price s_0 at time t = 0 is modelled by the stochastic differential equation

$$ds_t = \sigma s_t\,db_t + \mu s_t\,dt$$

which is the counterpart of (4.2) for a given volatility σ > 0 and (long-term) drift µ ∈ R. By Itô’s formula the solution on [0, 1] is given by

$$s_t = s_0 \exp\bigl(\sigma b_t + (\mu - \tfrac{1}{2}\sigma^2)t\bigr).$$

A process of this kind is a geometric Brownian motion (with drift¹²). The model also has a bond which grows at a fixed interest rate r. As in the discrete case, we may assume without loss of generality that r = 0 and, moreover, that the fixed price of a unit of the bond is 1. The continuous time counterpart of the first part of Theorem 4.4 is:

Theorem 4.6 There is a unique probability measure Q on Ω, called the equivalent martingale measure, that makes the price process s(t, ω) a martingale.

In this model a trading strategy θ = (θ^s, θ^b) is a pair of adapted processes θ^s, θ^b : [0, 1] × Ω → R giving the portfolio of number of units of stock and bond respectively held at time t. The value of this holding at time t is then

$$v(t, \omega) = \theta^b(t, \omega) + \theta^s(t, \omega)\,s(t, \omega).$$

The strategy θ is self-financing if for all t we have

$$v(t, \omega) = v(0) + \int_0^t \theta^s(\tau, \omega)\,ds(\tau, \omega),$$
which is the counterpart of Definition 4.1(iii). It is a consequence of Theorem 4.6 that, as with the CRR models, there are no arbitrage opportunities in the BS model.

¹² A (pure) geometric Brownian motion would have µ = 0.

A contingent claim in this model is an L²(Q) random variable c(ω) ≥ 0. The fair price π(c) of such a claim is defined to be π(c) = v(0) where v is the value process for any self-financing strategy θ for which v(1, ω) = c(ω). The validity of this definition follows from the following, which is a disguised version of the martingale representation theorem for the Itô calculus. (See Theorem 3.25, Section 3.5.3.)

Theorem 4.7 Let c(ω) ≥ 0 be a contingent claim in the Black-Scholes pricing model. Then there is a unique self-financing trading strategy θ such that

$$c(\omega) = v(1, \omega) \tag{4.8}$$

for a.a. ω.

The continuous time counterpart of the second part of Theorem 4.4 is:
Theorem 4.8 For any claim c(ω) ∈ L²(Q) the fair price π(c) = v(0) derived from the unique replicating strategy θ is also given by

$$\pi(c) = E_Q(c(\cdot))$$

The famous Black-Scholes formula gives explicitly the fair price π(c) of a European call c(ω) = (s(1, ω) − K)⁺ according to the preceding theory. Black and Scholes argued that the unique value of a European call option with strike K and expiry at time t = 1 is given by¹³

$$\pi(c) = s_0\,\psi\Bigl(\sigma^{-1}\log\bigl(\tfrac{s_0}{K}\bigr) + \tfrac{1}{2}\sigma\Bigr) - K\,\psi\Bigl(\sigma^{-1}\log\bigl(\tfrac{s_0}{K}\bigr) - \tfrac{1}{2}\sigma\Bigr) \tag{4.9}$$

where ψ denotes the normal cumulative distribution function.
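Formula (4.9) is easy to evaluate, and for large finite n it can be compared with the CRR price of a European call computed as an expectation under Q_n. The sketch below is an illustration only: the function names and parameter values are our assumptions, and a large finite n gives no more than a numerical hint of the limiting relationship between the two models.

```python
from math import comb, erf, log, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function (psi in the text)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def black_scholes_call(s0, K, sigma):
    """Formula (4.9): expiry at time 1 and bond interest rate 0."""
    a = log(s0 / K) / sigma
    return s0 * normal_cdf(a + 0.5 * sigma) - K * normal_cdf(a - 0.5 * sigma)

def crr_call(n, s0, K, mu, sigma):
    """CRR price E_{Q_n}((S_n(1) - K)^+), with up-step probability q under Q_n."""
    dt = 1.0 / n
    u = 1 + mu * dt + sigma * sqrt(dt)
    d = 1 + mu * dt - sigma * sqrt(dt)
    q = 0.5 * (1 - (mu / sigma) * sqrt(dt))
    return sum(
        comb(n, j) * q**j * (1 - q)**(n - j) * max(s0 * u**j * d**(n - j) - K, 0.0)
        for j in range(n + 1)
    )

s0, K, mu, sigma = 100.0, 100.0, 0.05, 0.2   # illustrative values only
bs_price = black_scholes_call(s0, K, sigma)
crr_price_500 = crr_call(500, s0, K, mu, sigma)
```

Note that the drift µ enters the CRR computation but not (4.9); its effect on the price disappears in the limit because pricing is done under the martingale measure.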
4.5 The Black-Scholes Model and Hyperfinite CRR Models
The first connection between the Black-Scholes (BS) model of the previous section and the CRR models Mn is that the former is in some sense the limit of the latter as n → ∞. One way to make this precise is to consider a hyperfinite (infinite) CRR model MN and “take standard parts”. Here are the details.
Fix an infinite natural number N and consider the CRR model MN based on the internal probability space ΩN = (CN, AN, WN). Recall first from Lecture 1 (Section 1.3.3) Anderson’s construction of Wiener measure/Brownian motion via the standard part mapping st : CN → C defined on the set of S-continuous paths in CN. On the Loeb space Ω = (CN, L(AN), (WN)L) Brownian motion is given by b(t, ω) = ◦ω(t). The space Ω carries the internal price process SN for the risky stock, given by the formula (4.1). Then we have, writing P = PN for the Loeb measure (WN)L:
Theorem 4.9 With respect to P, almost all paths of SN are S-continuous and
$${}^{\circ}S_N(t,\omega) = s_t = s_0 \exp\!\left(\sigma b_t + (\mu - \tfrac{1}{2}\sigma^2)\,t\right)$$
which is the Black-Scholes price model with s0 = ◦SN(0).
Footnote 13: The usual Black-Scholes formula is a little more complex since it considers any future time T as expiry date, and also allows the bond interest rate to be r > 0. In full generality it gives the fair price at any time t < T in the future that is prior to the expiry date T.
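Theorem 4.9 can be illustrated with a large finite n standing in for the infinite N. The sketch below is only that: it assumes that formula (4.1), which is not reproduced in this section, has the multiplicative form S(t + Δt) = S(t)(1 + µΔt + σΔB) with ΔB = ±√Δt and Δt = 1/n. Under that assumption it compares the discrete price at t = 1 along one random-walk path with the value s0 exp(σb1 + µ − σ²/2) computed from the endpoint b1 of the same walk. All function names and parameter values are hypothetical.

```python
import math
import random

def crr_terminal_price(s0, mu, sigma, signs):
    """Price at t = 1 along a path of +1/-1 steps (assumed form of (4.1))."""
    n = len(signs)
    dt = 1.0 / n
    price = s0
    for xi in signs:
        price *= 1.0 + mu * dt + sigma * math.sqrt(dt) * xi
    return price

def gbm_terminal_price(s0, mu, sigma, signs):
    """s0 * exp(sigma * b_1 + (mu - sigma^2 / 2)), b_1 = endpoint of the walk."""
    n = len(signs)
    b1 = sum(signs) / math.sqrt(n)
    return s0 * math.exp(sigma * b1 + mu - 0.5 * sigma ** 2)

def relative_gap(n, s0=100.0, mu=0.05, sigma=0.2, seed=0):
    """Relative difference between the two prices on one sampled path."""
    rng = random.Random(seed)
    signs = [rng.choice((-1, 1)) for _ in range(n)]
    discrete = crr_terminal_price(s0, mu, sigma, signs)
    limit = gbm_terminal_price(s0, mu, sigma, signs)
    return abs(discrete - limit) / limit

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, relative_gap(n))
```

A second-order expansion of log(1 + µΔt ± σ√Δt) suggests that the gap is of order Δt on typical paths, which is the finite shadow of the statement that the two sides are infinitely close when N is infinite.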
4.5.1 The Black-Scholes formula
One of the main results of [35] shows that the Black-Scholes formula is the standard part of the formula for the price of a European option (4.7) in the CRR model MN.
Theorem 4.10 Let C(ω) = (SN(1, ω) − K)+ and c(ω) = (s(1, ω) − K)+ be European calls in the CRR and Black-Scholes market models respectively. Then
π(c) = ◦Π(C)
where the option prices π(c) and Π(C) are given by (4.9) and (4.7).
The proof, which can be found in [35], is a straightforward application of the Central Limit Theorem, which links the complementary binomial function Φ with the normal cdf ψ. These results show that in some sense “the Black-Scholes market model contains a built-in version of the CRR model” – which is the economists’ intuition. Further elaboration of this point is provided in the following sections.
4.5.2 General claims
The link between the Black-Scholes formula and the corresponding CRR formula is a particular case of a general relationship between entities in the BS and hyperfinite CRR models. Recall first the idea of liftings (see Section 1.3.4 of Lecture 1). In the present context we are concerned with two-legged liftings as follows.
Definition 4.11
(i) Let f : C → R be an L2(Q) random variable and F : CN → ∗R. Then F is an SL2 lifting of f if F is SL2 (with respect to QN) and for a.a. ω ∈ CN
F(ω) ≈ f(◦ω)
(ii) Let g : [0, 1] × C → R be an adapted L2 function and G : TN × CN → ∗R be nonanticipating. Then G is an SL2 lifting of g if G is SL2(QN) and for a.a. (t, ω) ∈ TN × CN
G(t, ω) ≈ g(◦t, ◦ω)
(iii) Let h : [0, 1] × C → R be an adapted L2 function with h(·, ω) a.s. continuous and suppose that H : TN × CN → ∗R is nonanticipating. Then H is an S-continuous SL2 lifting of h if H is an SL2(QN) lifting and for a.a. ω ∈ CN the function H(·, ω) is S-continuous, so in fact for a.a. ω ∈ CN
H(t, ω) ≈ h(◦t, ◦ω) for all t ∈ TN.
The following result is implicit in [35] and explicit in [37].
Theorem 4.12 Let (C, Θ, V) be a triple in the hyperfinite CRR model ΩN consisting of a claim C(ω), the self-financing strategy Θ that generates it, and the corresponding value process V(t, ω) as in Section 4.3.1. Suppose further that (c, θ, v) is a triple of the same kind in the BS model, with θ the self-financing strategy that generates the claim c and v the corresponding value process, as given in Section 4.4. Then the following are equivalent:
(a) C is an SL2 lifting of c;
(b) Θ is an SL2 nonanticipating lifting of θ;
(c) V is an S-continuous SL2 lifting of v.
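The relationship between prices asserted in Theorem 4.10 (the European call being one instance of the claims covered by Theorem 4.12) can be explored numerically by replacing the infinite N with a large finite n. The Python sketch below is a hedged illustration rather than the construction of [35]: it assumes that the CRR price moves by the factors 1 + µΔt ± σ√Δt over each step Δt = 1/n, that the bond interest rate is zero, and that the CRR price is the expectation of the payoff under the probability q that makes the stock price a martingale. The helper names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(s0, strike, sigma):
    """Formula (4.9): expiry t = 1, zero interest rate."""
    m = math.log(s0 / strike) / sigma
    return s0 * normal_cdf(m + 0.5 * sigma) - strike * normal_cdf(m - 0.5 * sigma)

def crr_call_price(s0, strike, mu, sigma, n):
    """European call price in an n-step binomial model (assumed form of M_n)."""
    dt = 1.0 / n
    up = 1.0 + mu * dt + sigma * math.sqrt(dt)
    down = 1.0 + mu * dt - sigma * math.sqrt(dt)
    q = (1.0 - down) / (up - down)  # martingale probability of an up-move
    price = 0.0
    for k in range(n + 1):  # k = number of up-moves
        log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(q) + (n - k) * math.log(1.0 - q))
        terminal = s0 * math.exp(k * math.log(up) + (n - k) * math.log(down))
        price += math.exp(log_prob) * max(terminal - strike, 0.0)
    return price

if __name__ == "__main__":
    target = bs_call_price(100.0, 100.0, 0.2)
    for n in (10, 100, 1000):
        print(n, crr_call_price(100.0, 100.0, 0.05, 0.2, n), target)
```

The binomial weights are computed through logarithms so that large n does not overflow. Note that the drift µ enters the discrete price only through the step factors and drops out in the limit, consistent with (4.9) not involving µ.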
4.6 Convergence of Market Models
When considering the issue of convergence of entities (for example claims, value processes, trading strategies) in the discrete CRR market models Mn to similar entities in the BS model M it must be recognised that the underlying probability spaces Ωn are changing as well as the entities themselves. In this sort of situation the conventional mode of convergence is weak convergence, but it is easy to see that this does not preserve the intrinsic structure and relationships such as that between a claim, the generating strategy and the corresponding value process that was described above (see footnote 14). The nonstandard approach suggests an alternative mode of convergence, picking up the fundamental idea from basic calculus that a real sequence sn converges to s ∈ R if and only if ∗sN ≈ s for all infinite N. So we can tentatively make the following definitions.
Definition 4.13 Suppose that (Cn)n∈N, (Θn)n∈N and (Vn)n∈N are sequences of claims, self-financing strategies and value processes in the CRR models Mn, and c is a claim, θ a self-financing strategy and v a value process in the BS model on Ω. Then
(i) Cn D2-converges to c if CN is an SL2(QN) lifting of c for all infinite N;
(ii) Θn D2-converges to θ if ΘN is a nonanticipating SL2(QN) lifting of θ for all infinite N;
(iii) Vn D2-converges to v if VN is an S-continuous SL2(QN) lifting of v for all infinite N.
Footnote 14: It is easy to construct an example of claims Cn in Mn that converge weakly whose generating trading strategies do not converge weakly.
When convenient we will write $\xrightarrow{D^2}$ to denote D2-convergence. The reason for the choice of the name D2-convergence will be clear a little later on. First there are a number of issues to discuss. Since this notion depends on nonstandard analysis, it could conceivably be a notion whose meaning depends on the particular nonstandard universe being used. A related question is whether there is a standard convergence notion that is equivalent to the above definitions; if so then it would be independent of the nonstandard model. There is next the question of finding D2-convergent sequences for a given entity in the BS model, and the relationship with other modes of convergence. Finally, there is the question as to whether D2-convergence is either useful or natural. Taking the last point first, as motivation for the effort involved in answering the others, we have:
Theorem 4.14 Suppose that (Cn, Θn, Vn)n∈N is a sequence of claims, together with their generating strategies and value processes, in the CRR models Mn, and that (c, θ, v) is a similar triple in the BS model M. Then the following are equivalent:
(a) $C_n \xrightarrow{D^2} c$;
(b) $\Theta_n \xrightarrow{D^2} \theta$;
(c) $V_n \xrightarrow{D^2} v$.
Proof This is an immediate application of Theorem 4.12.
This result can be described by saying that D2-convergence preserves the basic constructions of option pricing – unlike weak convergence – and this suggests that it is an interesting notion. In answer to the other issues raised, although the nonstandard characterisation is perhaps the most natural one for D2-convergence (given the Loeb space machinery), there are two equivalent standard definitions, so this is a genuine down-to-earth notion of the theory. For simplicity we will restrict our remarks here to the case of claims. It is necessary to recall the equivalent martingale measures Qn and Q for the CRR and BS models. Then the first standard characterisation of D2-convergence is given by the following result.
Theorem 4.15 Let (Cn) be a sequence of contingent claims in the CRR models Mn and let c ∈ L2(Q). The following are equivalent:
(a) $C_n \xrightarrow{D^2} c$;
(b) (Bn, Cn(Bn)) → (b, c(b)) weakly (see footnote 15) and $E_{Q_n}(C_n^2) \to E_Q(c^2)$.
For details of the proof see [37], where a similar characterisation of D2-convergence of trading strategies and value processes may be found.
Footnote 15: By this we mean convergence of the distribution of (Bn, Cn(Bn)) in Cn × R under Qn to that of (b, c(b)) in C × R under Q.
4.7 Discretisation Schemes
The second standard characterisation of D2-convergence uses the idea of discretisation schemes to relate the continuous space C to the discrete approximations Cn. Here is the definition.
Definition 4.16 A family (dn) of measurable maps C → Cn is an adapted Q-discretisation scheme if for each n:
(i) dn is adapted (i.e. (dn(b))(t) depends only on the values (bs)s≤t);
(ii) dn is measure-preserving (with respect to Q and Qn);
(iii) dn(b) → b in Q-probability, i.e. ∀ε > 0 : Q(|dn(b) − b| < ε) → 1 as n → ∞. (Here |.| denotes the sup norm in C.)
The existence of such a scheme can be established by modifying a construction given by Frank Knight in 1962, using polygonal paths approximating Brownian motion; the proof is a little technical, so we refer the reader to [37] for details. The results below indicate that for the present purposes any two discretisation schemes are equivalent. The rôle of such schemes in characterising D2-convergence is as follows.
Theorem 4.17 Let (Cn) be a sequence of contingent claims in the CRR models Mn and let c ∈ L2(Q). Suppose that an adapted Q-discretisation scheme (dn) is given. The following are equivalent:
(a) Cn is D2-convergent to c;
(b) Cn(dn(·)) converges to c(·) in L2(Q)-norm.
Remark It is from this characterisation of D2-convergence that the name is derived – it is L2-convergence with respect to a discretisation scheme. Note that this theorem shows that as far as D2-convergence is concerned any two discretisation schemes are equivalent.
There is a similar characterisation of D2-convergence of self-financing strategies and value processes; since there is an extra time parameter it involves discretisation of time also, but this is quite straightforward – see [37] for details. The discretisation scheme characterisation of D2-convergence may be of value in devising numerical approximation schemes for the calculation of prices and the like. At the least it provides the information that calculations using such schemes give approximations that converge.
The results described above regarding convergence of discrete models to the continuous BS one are really a special case of a more abstract formulation of the convergence of discrete stochastic calculus to the continuous counterpart on the classical Wiener space. As has already been mentioned, obtaining the self-financing strategy that replicates a given contingent claim is an application of the martingale representation theorem – which can be thought of as stochastic differentiation. Going in the direction from a trading strategy to the value process is, abstractly, simply stochastic integration. So D2-convergence is a mode of convergence that preserves these basic operations of the stochastic calculus. These more abstract (but financial-jargon-free) formulations of the results described here are discussed in the paper [38], in the slightly simpler context of ordinary Brownian motion (as opposed to geometric Brownian motion as here). That paper shows that there are other operations that are stable under D2-convergence – for example the Wiener-Itô chaos representation of L2 functionals.
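The two operations just mentioned have elementary discrete counterparts in a finite CRR model, and a small Python sketch may make them concrete. It is an illustration under stated assumptions, not the formulation of [38]: the price is assumed to move by the factors 1 + µΔt ± σ√Δt per step, the bond interest rate is taken to be zero, and all names are hypothetical. "Differentiation" is the difference quotient that produces the stock holding Θ from the value process, and "integration" is the sum V(0) + Σ Θ ΔS that recovers the value process from Θ.

```python
import math
import random

def build_value_tree(s0, strike, mu, sigma, n):
    """Price and value tables; node (i, k) = time step i with k up-moves so far."""
    dt = 1.0 / n
    up = 1.0 + mu * dt + sigma * math.sqrt(dt)
    down = 1.0 + mu * dt - sigma * math.sqrt(dt)
    q = (1.0 - down) / (up - down)  # martingale probability of an up-move
    price = [[s0 * up ** k * down ** (i - k) for k in range(i + 1)]
             for i in range(n + 1)]
    value = [[0.0] * (i + 1) for i in range(n + 1)]
    value[n] = [max(s - strike, 0.0) for s in price[n]]  # call payoff at t = 1
    for i in range(n - 1, -1, -1):  # backward induction under q
        for k in range(i + 1):
            value[i][k] = q * value[i + 1][k + 1] + (1.0 - q) * value[i + 1][k]
    return price, value

def hedge(price, value, i, k):
    """Discrete 'stochastic derivative': stock holding over step i at node (i, k)."""
    return ((value[i + 1][k + 1] - value[i + 1][k])
            / (price[i + 1][k + 1] - price[i + 1][k]))

def replicate_along_path(price, value, signs):
    """Discrete 'stochastic integral': V(0) plus the sum of Theta * (change in S)."""
    wealth = value[0][0]
    k = 0
    for i, xi in enumerate(signs):
        theta = hedge(price, value, i, k)
        k_next = k + 1 if xi == 1 else k
        wealth += theta * (price[i + 1][k_next] - price[i][k])
        k = k_next
    return wealth, value[len(signs)][k]

if __name__ == "__main__":
    n = 50
    price, value = build_value_tree(100.0, 100.0, 0.05, 0.2, n)
    rng = random.Random(1)
    signs = [rng.choice((-1, 1)) for _ in range(n)]
    print(replicate_along_path(price, value, signs))  # wealth and payoff at t = 1
```

Along any path the accumulated wealth should match the claim's payoff up to rounding, which is the self-financing replication property in this discrete setting.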
4.8 Further Developments
The ideas outlined in this lecture so far have been extended in a number of different directions more recently, as follows.
4.8.1 Poisson pricing models
An alternative discrete market model was suggested by Cox & Ross [18], based on jump processes modelling the stock prices. This model converges to a continuous model where prices are modelled by a geometric Poisson process. An early result of Loeb in his original paper [71] shows how a Poisson process is obtained as the standard part of a discrete jump process with infinitesimal time steps, using Loeb measure and the standard part mapping, in the same way that Brownian motion is obtained from an infinitesimal random walk. The paper [36] shows how this can be used to give results parallel to those linking the CRR discrete market model to the BS continuous time model discussed earlier. In particular the theory of D2-convergence applies to this setting also and preserves the operations of finding self-financing trading strategies and value processes. Abstractly this is again a preservation result concerning stochastic integration and differentiation, this time with respect to a driving martingale that is a (compensated) Poisson process.
4.8.2 American options
An American option is one where the exercise time (see footnote 16) is at the owner’s discretion at any time up to a fixed expiry time. Such options, in particular an American put option, are more complicated than European options both to deal in and also to investigate theoretically. The time to exercise is obviously a stopping time – that is, a random time that depends only on what has happened so far and doesn’t require any foreknowledge. The question arises as to what is the best or optimal time to exercise the option – naturally defined in terms of the stopping time that gives the best return. In the Black–Scholes model the value vt of an American put option to its owner at time t is therefore defined as the supremum over stopping times τ ≥ t of the expected return if the option is exercised at the time τ. From an abstract point of view this is an optimal stopping problem, with the value process given by the Snell envelope of the return process (K − su)+, where K is the strike price for this option. It turns out that there is a unique optimal stopping time that gives the first occurrence of the critical price (or optimal stopping boundary) sc at which it is optimal to exercise the option. The critical price is a deterministic function sc(u) that depends on the time u ≥ 0 at which the option is being valued. There is a discrete version of this theory for the CRR models Mn, and it is shown in [40] that all the entities mentioned above D2-converge to the continuous version. In the case of the critical price, which is non-random, the results show that the discrete critical price functions $S_n^c$ converge uniformly to sc in the BS model. Thus, in addition to providing a more transparent link between the theory of American options in the discrete setting and those in the continuous context, the machinery of D2-convergence, coupled with the hyperfinite Loeb space machinery, again gives new stronger convergence information. For a complete exposition of this see the paper [40].
Footnote 16: that is, the time when the owner can exercise his right to buy (for a call option) or sell (for a put option).
4.8.3 Incomplete markets
A model of a financial market is called complete if every claim can be uniquely replicated by a self-financing strategy as in Theorems 4.2 and 4.7. In such models the fair price for an option is the same whether calculated from the self-financing strategy or using the equivalent martingale idea. In incomplete markets the question of a fair price for an option is still a topic of discussion and research. A number of different but related approaches have been suggested.
In his recent thesis [97] Wellmann suggests that stability under convergence – in particular D2-convergence – should be one basic requirement for a satisfactory pricing methodology. He considers the mean-variance hedging and the variance-optimal pricing methodologies, and using an extension of the methods described above shows that both are stable under D2-convergence. The thesis also examines practical aspects of these models in terms of their utility for numerical approximations.
4.8.4 Fractional Brownian motion
Self-similar processes such as fractional Brownian motion (FBM) have been proposed as providing more realistic models for stock prices than those such as geometric Brownian motion that form the basis for the BS model. From observing the stock market it is clear that the time series in actual stock price data do not conform to the BS hypotheses, but appear to display some evidence of long-term dependence (though the latter is still hotly debated).
Fractional Brownian motion (FBM) models are good limiting models for long-term dependence, and thus have been proposed as alternative pricing models. However, FBM is not a semimartingale, which makes it impossible to use the apparatus of stochastic calculus in its current form. More seriously, a host of other issues is raised by the fact that the absence of the semimartingale property means that there is no equivalent martingale measure for FBM, and the rationale for pricing of options that gave rise to the theory outlined in the CRR and BS models is no longer valid. A nonstandard definition of FBM, based on a fractional version of the Anderson random walk, was constructed in [39] and arbitrage opportunities were identified in the hyperfinite model. How these might be adjusted to yield a set of fractional Brownian paths of positive Loeb measure along which arbitrage is possible remains an open question. Other (standard) discussions of FBM as a pricing model have not so far displayed an explicit set of ‘arbitrage paths’.
4.8.5 Interest rates
So-called term structure models for interest rates form a major preoccupation of finance theorists at present. There is no consensus on the ‘correct’ model to use and competing alternatives abound, based mostly on the insights gained from the Black-Scholes model and its generalisations. Work in this field using nonstandard methods was initiated by Wellmann [96], who discusses a hyperfinite version of the Heath-Jarrow-Morton model. This area seems to offer much scope for the development of nonstandard approaches.
References
1. S.N. Antontsev, A.V. Kazhikhov & V.N. Monakhov, Boundary value problems in mechanics of nonhomogeneous fluids, Amsterdam, 1990.
2. L. Arnold & M. Scheutzow, Perfect cocycles through stochastic differential equations, Probability Theory and Related Fields 101(1995), 65–88.
3. S. Albeverio, J.-E. Fenstad, R. Høegh-Krohn & T. Lindstrøm, Nonstandard Methods in Stochastic Analysis and Mathematical Physics, Academic Press, New York, 1986.
4. S. Albeverio, W.A.J. Luxemburg & M.P.H. Wolff (eds.), Advances in Analysis, Probability and Mathematical Physics – Contributions from Nonstandard Analysis, Kluwer Academic Publishers, Dordrecht, Boston, London, 1995.
5. R.M. Anderson, A nonstandard representation for Brownian motion and Itô integration, Israel J. Math. 25(1976), 15–46.
6. R.M. Anderson, Star-finite representations of measure spaces, Trans. Amer. Math. Soc. 271(1982), 667–687.
7. A. Bensoussan & R. Temam, Equations stochastiques du type Navier–Stokes, J. Functional Analysis 13(1973), 195–222.
8. L.O. Arkeryd, N.J. Cutland & C.W. Henson (eds.), Nonstandard Analysis: Theory and Practice, NATO Advanced Study Institutes Series Vol. 493, Kluwer Academic Publishers, Netherlands, 1997.
9. M. Capiński & N.J. Cutland, Statistical solutions of Navier–Stokes equations by nonstandard densities, Mathematical Models and Methods in Applied Sciences 1:4(1991), 447–460.
10. M. Capiński & N.J. Cutland, Stochastic Navier–Stokes equations, Acta Applicandae Mathematicae 25(1991), 59–85.
11. M. Capiński & N.J. Cutland, A simple proof of existence of weak and statistical solutions of Navier–Stokes equations, Proc. Roy. Soc. Lond. Ser. A 436(1992), 1–11.
12. M. Capiński & N.J. Cutland, Navier–Stokes equations with multiplicative noise, Nonlinearity 6(1993), 71–77.
13. M. Capiński & N.J. Cutland, Nonstandard Methods for Stochastic Fluid Mechanics, World Scientific, Singapore, London, 1995.
14. M. Capiński & N.J. Cutland, Attractors for three-dimensional Navier–Stokes equations, Proc. Roy. Soc. Lond. Ser. A 453(1997), 2413–2426.
15. M. Capiński & N.J. Cutland, Measure attractors for stochastic Navier–Stokes equations, Electronic J. Prob. 3(1998), Paper 8, 1–15.
16. M. Capiński & N.J. Cutland, Stochastic Euler equations on the torus, The Annals of Applied Probability, to appear.
17. M. Capiński & N.J. Cutland, Existence of global stochastic flow and attractors for Navier–Stokes equations, Probability Theory & Related Fields 115(1999), 121–151.
18. J. Cox & S. Ross, The valuation of options for alternative stochastic processes, J. Fin. Econ. 3(1976), 145–166.
19. H. Crauel & F. Flandoli, Attractors for random dynamical systems, Probability Theory & Related Fields 100(1994), 365–393.
20. N.J. Cutland, Nonstandard measure theory and its applications, Bull. London Math. Soc. 15(1983), 529–589.
21. N.J. Cutland, Simplified existence for solutions to stochastic differential equations, Stochastics 14(1985), 319–325.
22. N.J. Cutland, Infinitesimal methods in control theory: deterministic and stochastic, Acta Applicandae Mathematicae 5(1986), 105–135.
23. N.J. Cutland, Infinitesimals in action, J. London Math. Soc. 35(1987), 202–216.
24. N.J. Cutland (ed.), Nonstandard Analysis and its Applications, Cambridge University Press, Cambridge, 1988.
25. N.J. Cutland, An extension of the Ventcel-Freidlin large deviation principle, Stochastics 24(1988), 121–149.
26. N.J. Cutland, The Brownian bridge as a flat integral, Math. Proc. Cambridge Phil. Soc. 106(1989), 343–354.
27. N.J. Cutland, An action functional for Lévy Brownian motion, Acta Applic. Math. 18(1990), 261–281.
28. N.J. Cutland, On large deviations in Hilbert space, Proc. Edinburgh Math. Soc. 34(1991), 487–495.
29. N.J. Cutland, Loeb measure theory, in [44], 151–177.
30. N.J. Cutland, Nonstandard Real Analysis, in [8], 51–76.
31. N.J. Cutland, Internal controls and relaxed controls, J. London Math. Soc. 27(1983), 130–140.
32. N.J. Cutland, Brownian motion on the Wiener sphere is the infinite dimensional Ornstein-Uhlenbeck process, Stochastic Processes and their Applications 79(1999), 95–107.
33. N.J. Cutland & H.J. Keisler, Neocompact sets and stochastic Navier–Stokes equations, in Stochastic Partial Differential Equations (ed. A. Etheridge), LMS Lecture Notes Series 216, CUP, 1995, 31–54.
34. N.J. Cutland & H.J. Keisler, Global attractors for 3-dimensional stochastic Navier–Stokes equations, in preparation.
35. N.J. Cutland, P.E. Kopp & W. Willinger, A nonstandard approach to option pricing, Mathematical Finance 1(4)(1991), 1–38.
36. N.J. Cutland, P.E. Kopp & W. Willinger, A nonstandard treatment of options driven by Poisson prices, Stochastics and Stochastics Reports 42(1993), 115–133.
37. N.J. Cutland, P.E. Kopp & W. Willinger, From discrete to continuous financial models: new convergence results for option pricing, Mathematical Finance 3(1993), 101–123.
38. N.J. Cutland, P.E. Kopp & W. Willinger, From discrete to continuous stochastic calculus, Stochastics and Stochastics Reports 52(1995), 173–192.
39. N.J. Cutland, P.E. Kopp & W. Willinger, Stock price returns and the Joseph effect: a fractional version of the Black-Scholes model, Progress in Probability 36(1995), 327–351.
40. N.J. Cutland, P.E. Kopp, W. Willinger & M.C. Wyman, Convergence of Snell envelopes and critical prices in the American put, in Mathematics of Derivative Securities (eds. M.A.H. Dempster & S.R. Pliska), CUP, 1997, 126–140.
41. N.J. Cutland & S-A. Ng, The Wiener sphere and Wiener measure, Annals of Probability 21(1993), 1–13.
42. N.J. Cutland & S-A. Ng, On homogeneous chaos, Math. Proc. Camb. Phil. Soc. 110(1991), 353–363.
43. N.J. Cutland & S-A. Ng, A nonstandard approach to Malliavin calculus, in Applications of Nonstandard-Analysis to Analysis, Functional Analysis, Probability Theory and Mathematical Physics (eds. S. Albeverio, W.A.J. Luxemburg & M. Wolff), 149–170, D. Reidel-Kluwer, Dordrecht, 1995.
44. N.J. Cutland, F. Oliveira, V. Neves & J. Sousa-Pinto (eds.), Developments in Nonstandard Mathematics, Pitman Research Notes in Mathematics Series Vol. 336, Longman, 1995.
45. N.J. Cutland & D.A. Ross, Young measures, in preparation.
46. D. Dacunha-Castelle & J.L. Krivine, Applications des ultraproduits à l’étude des espaces et des algèbres de Banach, Studia Mathematica 41(1972), 315–334.
47. M. Davis, Applied Nonstandard Analysis, Wiley, New York, 1977.
48. B.E. Enright, A Nonstandard Approach to the Stochastic Nonhomogeneous Navier–Stokes Equations, PhD thesis, University of Hull, 1999.
49. S. Fajardo & H.J. Keisler, Neometric spaces, Advances in Mathematics 118(1996), 134–175.
50. S. Fajardo & H.J. Keisler, Existence theorems in probability theory, Advances in Mathematics 120(1996), 191–257.
51. F. Flandoli & B. Schmalfuss, Random attractors for the 3D stochastic Navier–Stokes equation with multiplicative white noise, Stochastics and Stochastics Reports 59(1996), 21–45.
52. F. Flandoli & B. Schmalfuss, Weak solutions and attractors for three-dimensional Navier–Stokes equations with nonregular force, J. Dynamics and Differential Equations 11(1999), 355–398.
53. C. Foias, Statistical study of Navier–Stokes equations I, Rend. Sem. Mat. Univ. Padova 48(1973), 219–348.
54. R. Goldblatt, Lectures on the Hyperreals: An Introduction to Nonstandard Analysis, Graduate Texts in Mathematics 188, Springer-Verlag, 1998.
55. D.R. Gordon, Applications of Nonstandard Analysis in Differential Game Theory, PhD thesis, University of Hull, 1996.
56. J.M. Henle & E.M. Kleinberg, Infinitesimal Calculus, MIT Press, Cambridge, Massachusetts, 1979.
57. D.N. Hoover & E. Perkins, Nonstandard constructions of the stochastic integral and applications to stochastic differential equations I, II, Trans. Amer. Math. Soc. 275(1983), 1–58.
58. A.E. Hurd & P.A. Loeb, An Introduction to Nonstandard Real Analysis, Academic Press, New York, 1985.
59. A. Ichikawa, Stability of semilinear stochastic evolution equations, Journal of Mathematical Analysis and Applications 90(1982), 12–44.
60. H.J. Keisler, Foundations of Infinitesimal Calculus, Prindle, Weber & Schmidt, Boston, 1976.
61. H.J. Keisler, An infinitesimal approach to stochastic analysis, Mem. Amer. Math. Soc. 297(1984).
62. H.J. Keisler, A neometric survey, in [44], 233–250.
63. H.J. Keisler, Stochastic differential equations with extra properties, in [8], 259–277.
64. P.E. Kopp, Hyperfinite mathematical finance, in [8], 279–307.
65. O.A. Ladyzhenskaya, A dynamical system generated by the Navier–Stokes equations, J. Soviet Math. 3(1975), 458–479.
66. J.T. Lewis, Brownian motion on a submanifold of Euclidean space, Bull. London Math. Soc. 18(1986), 616–620.
67. B.-H. Li & Y.-Q. Li, Optimal estimation of shell thickness in Cutland’s construction of Wiener measure, Proc. Amer. Math. Soc. 126(1998), 225–229.
68. T.L. Lindstrøm, Hyperfinite stochastic integration I, II, III, Math. Scand. 46(1980), 265–333.
69. T.L. Lindstrøm, An invitation to nonstandard analysis, in [24], 1–105.
70. T.L. Lindstrøm, Anderson’s Brownian motion and the infinite dimensional Ornstein–Uhlenbeck process, in [4].
71. P.A. Loeb, Conversion from nonstandard to standard measure spaces and applications in probability theory, Trans. Amer. Math. Soc. 211(1975), 113–122.
72. H.P. McKean, Geometry of differential space, Annals of Probability 1(1973), 197–206.
73. P. Malliavin, Stochastic calculus of variation and hypo-elliptic operators, in Proc. Inter. Sym. Stoch. Differential Equations (ed. K. Itô), 195–263, Kinokuniya-Wiley, 1978.
74. P.A. Meyer, Notes sur les processus d’Ornstein-Uhlenbeck, Seminaire de Prob XVI, Lecture Notes in Mathematics 920, 95–133, Springer, 1982.
75. H. Morimoto, Attractors of probability measures for semilinear stochastic evolution equations, Stoch. Anal. Appl. 2:10(1992), 205–212.
76. G.J. Morrow & M.L. Silverstein, Two parameter extension of an observation of Poincaré, Seminaire de Probabilité XX (eds. J. Azema & M. Yor), Lecture Notes in Mathematics 1204, Springer, 1986.
77. D. Nualart, The Malliavin Calculus and Related Topics, Springer-Verlag, 1995.
78. D. Nualart & E. Pardoux, Stochastic calculus with anticipating integrands, Probability Theory and Related Fields 78(1988), 535–581.
79. D. Ocone, Malliavin’s calculus and stochastic integral representation of functionals of diffusion processes, Stochastics 12(1984), 161–185.
80. D. Ocone, A guide to the stochastic calculus of variations, Lecture Notes in Mathematics 1316 (eds. H. Korezlioglu & A.S. Ustunel), 1–79, Springer, 1988.
81. H. Osswald, Introduction to the analysis on the Wiener-space using infinitesimals, pre-print, Mathematics Institute, University of Munich.
82. H. Poincaré, Calcul des Probabilités, Gauthier-Villars, Paris, 1912.
83. A. Robinson, Non-standard analysis, Proc. Roy. Acad. Amsterdam Ser. A 64(1961), 432–440.
84. A. Robinson, Nonstandard Analysis, North-Holland, Amsterdam, 1966 (2nd, revised edition 1974).
85. D.A. Ross, Measures invariant under local homeomorphisms, Proc. Amer. Math. Soc. 102(1988), 901–905.
86. D.A. Ross, Unions of Loeb nullsets, Proc. Amer. Math. Soc. 124(1996), 1883–88.
87. D.A. Ross, Loeb measure and probability, in [8], 91–120.
88. H.L. Royden, Real Analysis, Macmillan, New York, 1968.
89. B. Schmalfuß, Long-time behaviour of the stochastic Navier–Stokes equations, Math. Nach. 152(1991), 7–20.
90. B. Schmalfuß, Measure attractors of the stochastic Navier–Stokes equation, Bremen Report no. 258, 1991.
91. B. Schmalfuß, Measure attractors and stochastic attractors, Bremen Report no. 332, 1995.
92. G. Sell, Global attractors for the three-dimensional Navier–Stokes equations, J. Dyn. Differential Eqns. 8(1996), 1–33.
93. A.V. Skorohod, On a generalisation of a stochastic integral, Theor. Prob. Appl. 20(1975), 219–233.
94. D. Stroock, The Malliavin calculus and its applications to second order parabolic differential operators, I & II, Math Systems Theory 14(1981), 25–65; 141–171.
95. R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics and Physics, Springer-Verlag, New York, 1988; 2nd edition 1997.
96. V. Wellmann, Stochastic Models for the Term Structure of Interest Rates, MSc thesis, University of Hull, 1996.
97. V. Wellmann, Convergence in incomplete market models, PhD thesis, University of Hull, 1998.
98. N. Wiener, Differential space, J. Math. and Phys. 2(1923), 132–174.
99. D. Williams, To begin at the beginning..., in Stochastic Integrals (ed. D. Williams), Lecture Notes in Mathematics 851(1981), 1–55.
100. H.J. Yashima, Equations de Navier–Stokes stochastiques non homogènes et applications, Tesi di Perfezionamento, Scuola Normale Superiore, Pisa, 1992.