OXFORD GRADUATE TEXTS IN MATHEMATICS
Series Editors R. COHEN S.K. DONALDSON S. HILDEBRANDT T.J. LYONS M.J. TAYLOR
OXFORD GRADUATE TEXTS IN MATHEMATICS

Books in the series
1. Keith Hannabuss: An introduction to quantum theory
2. Reinhold Meise and Dietmar Vogt: Introduction to functional analysis
3. James G. Oxley: Matroid theory
4. N.J. Hitchin, G.B. Segal, and R.S. Ward: Integrable systems: twistors, loop groups, and Riemann surfaces
5. Wulf Rossmann: Lie groups: An introduction through linear groups
6. Qing Liu: Algebraic geometry and arithmetic curves
7. Martin R. Bridson and Simon M. Salamon (eds): Invitations to geometry and topology
8. Shmuel Kantorovitz: Introduction to modern analysis
9. Terry Lawson: Topology: A geometric approach
10. Meinolf Geck: An introduction to algebraic geometry and algebraic groups
11. Alastair Fletcher and Vladimir Markovic: Quasiconformal maps and Teichmüller theory
12. Dominic Joyce: Riemannian holonomy groups and calibrated geometry
13. Fernando Villegas: Experimental Number Theory
14. Péter Medvegyev: Stochastic Integration Theory
Stochastic Integration Theory
Péter Medvegyev
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in

Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© Péter Medvegyev, 2007

The moral rights of the author have been asserted
Database right Oxford University Press (maker)

First published 2007

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer

British Library Cataloguing in Publication Data
Data available

Library of Congress Cataloging in Publication Data
Data available

Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Biddles Ltd., King's Lynn, Norfolk

ISBN 978–0–19–921525–6

1 3 5 7 9 10 8 6 4 2
To the memory of my father
Contents

Preface  xiii

1 Stochastic processes  1
1.1 Random functions  1
1.1.1 Trajectories of stochastic processes  2
1.1.2 Jumps of stochastic processes  3
1.1.3 When are stochastic processes equal?  6
1.2 Measurability of Stochastic Processes  7
1.2.1 Filtration, adapted, and progressively measurable processes  8
1.2.2 Stopping times  13
1.2.3 Stopped variables, σ-algebras, and truncated processes  19
1.2.4 Predictable processes  23
1.3 Martingales  29
1.3.1 Doob's inequalities  30
1.3.2 The energy equality  35
1.3.3 The quadratic variation of discrete time martingales  37
1.3.4 The downcrossings inequality  42
1.3.5 Regularization of martingales  46
1.3.6 The Optional Sampling Theorem  49
1.3.7 Application: elementary properties of Lévy processes  58
1.3.8 Application: the first passage times of the Wiener processes  80
1.3.9 Some remarks on the usual assumptions  91
1.4 Localization  92
1.4.1 Stability under truncation  93
1.4.2 Local martingales  94
1.4.3 Convergence of local martingales: uniform convergence on compacts in probability  104
1.4.4 Locally bounded processes  106

2 Stochastic Integration with Locally Square-Integrable Martingales  108
2.1 The Itô–Stieltjes Integrals  109
2.1.1 Itô–Stieltjes integrals when the integrators have finite variation  111
2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales  117
2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales  124
2.1.4 Properties of the Itô–Stieltjes integral  126
2.1.5 The integral process  126
2.1.6 Integration by parts and the existence of the quadratic variation  128
2.1.7 The Kunita–Watanabe inequality  134
2.2 The Quadratic Variation of Continuous Local Martingales  138
2.3 Integration when Integrators are Continuous Semimartingales  146
2.3.1 The space of square-integrable continuous local martingales  147
2.3.2 Integration with respect to continuous local martingales  151
2.3.3 Integration with respect to semimartingales  162
2.3.4 The Dominated Convergence Theorem for stochastic integrals  162
2.3.5 Stochastic integration and the Itô–Stieltjes integral  164
2.4 Integration when Integrators are Locally Square-Integrable Martingales  167
2.4.1 The quadratic variation of locally square-integrable martingales  167
2.4.2 Integration when the integrators are locally square-integrable martingales  171
2.4.3 Stochastic integration when the integrators are semimartingales  176

3 The Structure of Local Martingales  179
3.1 Predictable Projection  182
3.1.1 Predictable stopping times  182
3.1.2 Decomposition of thin sets  188
3.1.3 The extended conditional expectation  190
3.1.4 Definition of the predictable projection  192
3.1.5 The uniqueness of the predictable projection, the predictable section theorem  194
3.1.6 Properties of the predictable projection  201
3.1.7 Predictable projection of local martingales  204
3.1.8 Existence of the predictable projection  206
3.2 Predictable Compensators  207
3.2.1 Predictable Radon–Nikodym Theorem  207
3.2.2 Predictable Compensator of locally integrable processes  213
3.2.3 Properties of the Predictable Compensator  217
3.3 The Fundamental Theorem of Local Martingales  219
3.4 Quadratic Variation  222

4 General Theory of Stochastic Integration  225
4.1 Purely Discontinuous Local Martingales  225
4.1.1 Orthogonality of local martingales  227
4.1.2 Decomposition of local martingales  232
4.1.3 Decomposition of semimartingales  234
4.2 Purely Discontinuous Local Martingales and Compensated Jumps  235
4.2.1 Construction of purely discontinuous local martingales  240
4.2.2 Quadratic variation of purely discontinuous local martingales  244
4.3 Stochastic Integration With Respect To Local Martingales  246
4.3.1 Definition of stochastic integration  248
4.3.2 Properties of stochastic integration  250
4.4 Stochastic Integration With Respect To Semimartingales  254
4.4.1 Integration with respect to special semimartingales  257
4.4.2 Linearity of the stochastic integral  261
4.4.3 The associativity rule  262
4.4.4 Change of measure  264
4.5 The Proof of Davis' Inequality  277
4.5.1 Discrete-time Davis' inequality  279
4.5.2 Burkholder's inequality  287

5 Some Other Theorems  292
5.1 The Doob–Meyer Decomposition  292
5.1.1 The proof of the theorem  292
5.1.2 Dellacherie's formulas and the natural processes  299
5.1.3 The sub-, super-, and the quasi-martingales are semimartingales  303
5.2 Semimartingales as Good Integrators  308
5.3 Integration of Adapted Product Measurable Processes  314
5.4 Theorem of Fubini for Stochastic Integrals  319
5.5 Martingale Representation  328

6 Itô's Formula  351
6.1 Itô's Formula for Continuous Semimartingales  353
6.2 Some Applications of the Formula  359
6.2.1 Zeros of Wiener processes  359
6.2.2 Continuous Lévy processes  366
6.2.3 Lévy's characterization of Wiener processes  368
6.2.4 Integral representation theorems for Wiener processes  373
6.2.5 Bessel processes  375
6.3 Change of Measure for Continuous Semimartingales  377
6.3.1 Locally absolutely continuous change of measure  377
6.3.2 Semimartingales and change of measure  378
6.3.3 Change of measure for continuous semimartingales  380
6.3.4 Girsanov's formula for Wiener processes  382
6.3.5 Kazamaki–Novikov criteria  386
6.4 Itô's Formula for Non-Continuous Semimartingales  394
6.4.1 Itô's formula for processes with finite variation  398
6.4.2 The proof of Itô's formula  401
6.4.3 Exponential semimartingales  411
6.5 Itô's Formula For Convex Functions  417
6.5.1 Derivative of convex functions  418
6.5.2 Definition of local times  422
6.5.3 Meyer–Itô formula  429
6.5.4 Local times of continuous semimartingales  438
6.5.5 Local time of Wiener processes  445
6.5.6 Ray–Knight theorem  450
6.5.7 Theorem of Dvoretzky, Erdős, and Kakutani  457

7 Processes with Independent Increments  460
7.1 Lévy processes  460
7.1.1 Poisson processes  461
7.1.2 Compound Poisson processes generated by the jumps  464
7.1.3 Spectral measure of Lévy processes  472
7.1.4 Decomposition of Lévy processes  480
7.1.5 Lévy–Khintchine formula for Lévy processes  486
7.1.6 Construction of Lévy processes  489
7.1.7 Uniqueness of the representation  491
7.2 Predictable Compensators of Random Measures  496
7.2.1 Measurable random measures  497
7.2.2 Existence of predictable compensator  501
7.3 Characteristics of Semimartingales  508
7.4 Lévy–Khintchine Formula for Semimartingales with Independent Increments  513
7.4.1 Examples: probability of jumps of processes with independent increments  513
7.4.2 Predictable cumulants  518
7.4.3 Semimartingales with independent increments  523
7.4.4 Characteristics of semimartingales with independent increments  530
7.4.5 The proof of the formula  534
7.5 Decomposition of Processes with Independent Increments  538

Appendix  547
A Results from Measure Theory  547
A.1 The Monotone Class Theorem  547
A.2 Projection and the Measurable Selection Theorems  550
A.3 Cramér's Theorem  551
A.4 Interpretation of Stopped σ-algebras  555
B Wiener Processes  559
B.1 Basic Properties  559
B.2 Existence of Wiener Processes  567
B.3 Quadratic Variation of Wiener Processes  571
C Poisson processes  579

Notes and Comments  594
References  597
Index  603
Preface

I started to write this book a few years ago mainly because I wanted to understand the theory of stochastic integration. Stochastic integration theory is a very popular topic. The main reason for this is that the theory provides the necessary mathematical background for derivative pricing theory. Of course, many books purport to explain the theory of stochastic integration. Most of them concentrate on the case of Brownian motion, and a few of them discuss the general case. Though the first type of book is quite readable, somehow they disguise the main ideas of the general theory. On the other hand, the books concentrating on the general theory were, for me, a bit sketchy. I very often had quite serious problems trying to decode what the ideas of the authors were, and it took me a long time, sometimes days and weeks, to understand some basic ideas of the theory. I was nearly always able to understand the main arguments but, looking back, I think some simple notes and hints could have made my suffering shorter.

The theory of stochastic integration is full of non-trivial technical details. Perhaps from a student's point of view the best way to study and to understand measure theory and the basic principles of modern mathematical analysis is to study probability theory. Unfortunately, this is not true for the general theory of stochastic integration. The reason for this is very simple: the general theory of stochastic integration contains too much measure theory. Perhaps the best way to understand the limits of measure theory is to study the general theory of stochastic integration. I think this beautiful theory pushes modern mathematics to its very limits. On the other hand, despite the many technical details, there are just a few simple issues which make up the backbone of stochastic analysis.

1. The first one is, of course, martingales and local martingales. The basic concept of stochastic analysis is random noise.
But what is the right mathematical model for random noise? Perhaps the most natural idea would be the random walk, that is, processes with stationary and independent increments: the so-called Lévy processes with mean value zero. But, unfortunately, this class of processes has some very unpleasant properties. Perhaps the biggest problem is that the sum of two Lévy processes is, in general, not a Lévy process again. Modern mathematics is very much built on the idea of linearity. Unless there is some very fundamental and very clear reason to the contrary, every reasonable class of mathematical objects should be closed under linear combinations. The concept of random noise comes
very much from applications. One of the main goals of mathematics is to build safe theoretical tools and, like other scientific instruments, mathematical tools should be both simple and safe, similar to computer tools. Most computer users never read the footnotes in computer manuals; they just have a general feeling about the limits of the software. It is the responsibility of the writer of the software to make the software work in a plausible way. If the behaviour of the software is not reasonable, then its use becomes dangerous, e.g. you could easily lose your files, or delete or modify something and make the computer behave unpredictably. Likewise, if an applied mathematical theory cannot guarantee that the basic objects of the theory behave reasonably, then the theory is badly written, and as one can easily make hidden errors in it, its usage is dangerous. In our case, if the theory cannot guarantee that the sum of two random noises is again a random noise, then the theory is very dangerous from the point of view of sound applications.

The main reason for introducing martingales is that from the intuitive point of view they are very close to the idea of a random walk, but if we fix the amount of observable information they form a linear space. The issue of local martingales is a bit more tricky. It is of course local martingales, and not just true martingales, that form the class of random noises. Without doubt, local martingales make the life of a stochastic analyst very difficult. From an intuitive, applied point of view, local martingales and martingales are very close, and that is why it is easy to make mistakes. Therefore, in most cases the mathematical proofs have to be very detailed and cautious. On the other hand the local martingales form a large and stable class, so the resulting theory is very stable and simple to use. As in elementary algebra, most of the problems come from the fact that one cannot divide by zero.
In stochastic analysis most of the problems come from the fact that not every local martingale is a martingale, and therefore one can take expected values only with care. Is there some intuitive reason why one should introduce local martingales? Perhaps, yes. First of all one should realize that the real objects of the theory are not martingales but uniformly integrable martingales. If we observe a martingale up to a fixed, finite moment of time we get a uniformly integrable martingale, but most of the natural moments of time are special random variables. The measurement of the time-line is, in some sense, very arbitrary. Traditionally we measure it with respect to some physical, astronomical movements. For some processes this coordinate system is rather arbitrary. It is more natural, for example, to say 'after lunch I called my friend' than to say 'I called my friend at twenty-three past one, and sometimes at twenty-two past one, depending on the amount of food my wife gave me'. Of course the moment of time after lunch is a random variable with respect to the coordinate system generated by the relative position of the earth and the sun, but as a basis for observing my general habits this random time, 'after lunch', is the natural point of orientation. So, in some ways, it is very natural to say that a process is a random noise if one can define a sequence of random moments, so-called stopping times, τ_0 < τ_1 < . . ., such that if we observe the random noise up to τ_k the truncated processes are uniformly integrable martingales, which is exactly the definition of local martingales. The idea that local martingales are the good
mathematical models for random noise comes from the fact that sometimes we want to perturb the measurement of the time-line in an order-preserving way, and we want the class of 'random noise processes' to be invariant under these transformations.

2. The second most important concept is quadratic variation. One can think of stochastic analysis as the mathematical theory of quadratic variation. In classical analysis one can define an integral only when the integrator has bounded variation. Even in this case, one can define two different concepts of integration. One is the Lebesgue–Stieltjes type of integration and the other is the Riemann–Stieltjes concept of integration. If the integrand is continuous, then the two concepts coincide. It is easy to see that if the integrand is left-continuous, and in the Riemann–Stieltjes approximating sums one may choose only the starting points of the sub-intervals of the partitions as test points, then these approximating sums of Riemann–Stieltjes type converge and the integrals are equal to the Lebesgue–Stieltjes integrals. One may ask whether one can extend this trick to some more general class of integrators. The answer is yes. It turns out that the same concept works if the integrators are local martingales. There is just one new element: the convergence of the approximating sums holds only in probability. If the integrators are local martingales or if they have finite variation then for this integral the so-called integration by parts formula is valid. In this formula the most notable factor is the quadratic co-variation [X, Y](t). If, for example, X is continuous and Y has finite variation then [X, Y](t) = 0, but in general [X, Y](t) ≠ 0. As the stochastic integrals are defined only by convergence in probability, the random variable [X, Y](t) is defined only up to a measure-zero set. This implies that the trajectories of the process t → [X, Y](t) are undefined.
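For orientation, the integration by parts formula referred to here can be written out explicitly (a standard statement, reproduced here; it is proved in the book for the relevant classes of integrators):

```latex
X(t)\,Y(t) = X(0)\,Y(0) + \int_0^t X_{-}\,dY + \int_0^t Y_{-}\,dX + [X,Y](t)
```

where X₋ and Y₋ denote the left-continuous versions of the trajectories and the last term is the quadratic co-variation.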
One can exert quite a lot of effort to show that there is a right-continuous process with limits from the left, denoted by [X, Y], such that for every t the value of [X, Y] at time t is a version of the random variable [X, Y](t). The key observation in the proof of this famous theorem is that XY − [X, Y] is a local martingale, and [X, Y] is the only process for which this holds and for which the jump process of [X, Y] is ∆X∆Y. The integration by parts formula is the prototype of Itô's formula, which is the main analytical tool of stochastic analysis. Perhaps it is not without interest to emphasize that the main difficulty in proving this famous formula, in the general case of discontinuous processes, is to establish the existence of the quadratic variation. It is worth mentioning that it is relatively easy to show the existence of the quadratic variation for the so-called locally square-integrable martingales. It is nearly trivial to show the existence of the quadratic variation when the trajectories of the process have finite variation. Hence, it is not so difficult to prove the existence of [X] := [X, X] if the process X has a decomposition X = V + H, where the trajectories of V have finite variation and H is a so-called locally square-integrable martingale. The main problem is that we do not know that every local martingale has this decomposition! To
prove that this decomposition exists one should show the Fundamental Theorem of Local Martingales, which is perhaps the most demanding result of the theory.

3. The third most important concept of the theory is predictability. There are many interrelated objects in the theory modified by the adjective predictable. Perhaps the simplest and most intuitive one is the concept of predictable stopping time. Stopping times describe the occurrence of random events. The occurrence of a random event is predictable if there is a sequence of other events which announces the predictable event. That is, a stopping time τ is predictable if there is a sequence of stopping times (τ_n) with τ_n ↑ τ and τ_n < τ whenever τ > 0. This definition is very intuitive and appealing. If τ is a predictable stopping time, then one can say that the event [τ, ∞) := {(t, ω) : τ(ω) ≤ t} ⊆ R+ × Ω is also predictable. The σ-algebra generated by these types of predictable random intervals is called the σ-algebra of predictable events. One should agree that this definition of predictability is in some sense very close to the intuitive idea of predictability. Quite naturally, a stochastic process is called predictable if it is measurable with respect to the σ-algebra of the predictable events. It is an important and often useful observation that the σ-algebra of predictable events is the same as the σ-algebra generated by the left-continuous adapted processes. Recall that a process is called adapted when its value at every moment of time is measurable with respect to the σ-algebra representing the amount of information available at that time. The values of left-continuous processes are at least infinitesimally predictable. One of the most surprising facts of stochastic integration theory is that in the general case the integrands of stochastic integrals should be predictable. Although it looks like a very deep mathematical observation, one should also admit that this is a very natural result.
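In discrete time this principle is the statement that the martingale transform of a predictable strategy is again a martingale: if the stake may depend only on the information available before the next round, the cumulative gain has constant (zero) expectation. The following simulation is purely illustrative and not part of the original text; all names in it are ours.

```python
import random

random.seed(0)

def mean_gain_of_predictable_strategy(n_paths=100_000, n_steps=10):
    """Average terminal gain of a predictable betting strategy played
    against a fair coin-toss random walk (a discrete martingale)."""
    total = 0.0
    for _ in range(n_paths):
        gain, walk = 0.0, 0.0
        for _ in range(n_steps):
            # Predictable stake: fixed BEFORE the next toss, using only
            # the past of the walk ("double the stake when losing").
            stake = 2.0 if walk < 0 else 1.0
            toss = random.choice([-1.0, 1.0])  # fair increment
            gain += stake * toss
            walk += toss
        total += gain
    return total / n_paths

mean_gain = mean_gain_of_predictable_strategy()
print(mean_gain)  # close to 0: no predictable strategy beats a fair game
```

However clever the predictable rule, the transform stays a zero-mean martingale; allowing the stake to depend on the current toss would break this immediately.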
The best interpretation of stochastic integrals is that they are the net results of continuous-time trading or gaming processes. Everybody knows that in a casino one should play a trading strategy only if one decides about the stakes before the random events generating the gains occur. This means that the playing strategy should be predictable.

An important concept related to predictability is that of the predictable compensator. If one has a risky stochastic process X, one can ask whether there is a compensator P for the risk of the process X. The compensator should be 'simpler' than the process itself. Generally it is assumed that P is monotone, or at least that it has finite variation. The compensator P should be predictable, and one should assume that X − P is a totally random process, that is, X − P is a local martingale. This is of course a very general setup, but it appears in most of the applications of stochastic analysis. For a process X there are many compensators, that is, there are many processes Y such that X − Y is a local martingale. Perhaps the simplest one is X itself. But it is very important that the predictable
compensator of X, if it exists and if it has finite variation, is in fact unique. The reason for this is that every predictable local martingale is continuous, and if the trajectories of a continuous local martingale have finite variation then the local martingale is constant.

4. Stochastic integration theory is built on probability theory. Therefore every object of the theory is well-defined only almost surely, and this means that stochastic integrals are also defined only almost surely. In classical integration theory one first defines the integral over some fixed set and then defines the integral function. In stochastic integration theory this approach does not work, as it is entirely non-trivial how one can construct the integral process from the almost surely defined separate integrals. Therefore, in stochastic integration theory one immediately defines the integral processes, so stochastic integrals are processes and not random variables.

5. There are basically two types of local martingales: continuous and purely discontinuous ones. The canonical examples of continuous local martingales are the Wiener processes, and the simplest purely discontinuous local martingales are the compensated Poisson processes. Every local martingale which has trajectories with finite variation is purely discontinuous, but there are purely discontinuous local martingales with infinite variation. Every local martingale has a unique decomposition L = L(0) + L^c + L^d, where L^c is a continuous local martingale and L^d is a purely discontinuous local martingale. A very important property of purely discontinuous local martingales is that they are sums of their continuously compensated single jumps. A process S_i is, by definition, a single jump if there is a stopping time τ such that every trajectory of S_i is constant before and after the random jump-time τ.
The single jumps obviously have trajectories with finite variation and, as the compensators P_i, by definition, also have finite variation, the compensated single jumps L_i := S_i − P_i also have trajectories with finite variation. Of course this does not imply that the trajectories of L, as infinite sums, should also have finite variation. If L is a purely discontinuous local martingale and L = Σ_i L_i, where the L_i are continuously compensated single jumps, then one can think about the stochastic integral with respect to L as the sum of the stochastic integrals with respect to the L_i. Every L_i has finite variation so, in this case, the stochastic integral, as a pathwise integral, is well-defined, and if the integrand is predictable then the integral is a local martingale. Of course one should restrict the class of integrands, as one has to guarantee the convergence of the sum of the already defined integrals. If the integrand is predictable then the stochastic integral with respect to a purely discontinuous local martingale is a sum of local martingales; therefore it is also a local martingale.

6. The stochastic integral with respect to continuous local martingales is a bit more tricky. The fundamental property of stochastic integrals with respect to local martingales is that the resulting process is also a local martingale. The intuition behind this observation is that the basic interpretation of stochastic integration is that it is the cumulative gain of investing in a randomly changing price process. At every moment of time we decide about the size
of our investment, which is the integrand, and our short-term gains are the product of our investment and the change of the random price-integrator. Our total gain is the sum of the short-term gains. If we can choose our strategy only in a predictable way it is quite natural to assume that our cumulative gain process will also be totally random. That is, if the investment strategy is predictable and the random integrator price process is a local martingale, then the net, cumulative gain process is also a local martingale. What is the quadratic variation of the resulting gain process? If H • L denotes the integral of H with respect to the local martingale L then one should guarantee the very natural identity [H • L] = H² • [L], where the right-hand side expression H² • [L] denotes the classical pathwise integral of H² with respect to the increasing process [L]. The identity is really very natural, as [L] describes the 'volatility' of L along the time-line, and if at every moment of time we have H pieces of L then our short-term change in 'volatility' will be (H∆L)² ≈ H²·∆[L], so our aggregated 'volatility' is Σ H²∆[L] ≈ H² • [L]. It is a very nice observation that there is just one continuous local martingale, denoted by H • L, for which [H • L, N] = H • [L, N] holds for every continuous local martingale N. The stochastic integral with respect to a local martingale L is the sum of two integrals: the integral H • L^c with respect to the continuous part and the integral H • L^d with respect to the purely discontinuous part of L.

7. As there are local martingales which have finite variation, one can ask whether the new and the classical definitions are the same or not. The answer is that if the integrand is predictable the two concepts of integration do not differ. This allows us to further generalize the concept of stochastic integration.
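The identity [H • L] = H² • [L] mentioned above is transparent in a discrete-time sketch, where the stochastic integral is a martingale transform and the identity reduces to per-step algebra: the increment of H • L is H∆L, so its squared increment is H²(∆L)². The real content of the theorem is that this survives the passage to the continuous-time limit. The code below is illustrative only; the names in it are ours.

```python
import random

random.seed(1)

def discrete_qv_identity(n_steps=100_000):
    """Compare [H . L] with H^2 . [L] along one path, where L is a
    scaled random-walk approximation of a Wiener process on [0, 1]
    and H is a predictable integrand depending on the past of L."""
    dt = 1.0 / n_steps
    L = 0.0
    qv_HL = 0.0    # [H . L]: sum of squared increments of the integral
    H2_qvL = 0.0   # H^2 . [L]: pathwise integral of H^2 against [L]
    for _ in range(n_steps):
        H = 1.0 + L * L                # predictable: uses the past only
        dL = random.choice([-1.0, 1.0]) * dt ** 0.5
        qv_HL += (H * dL) ** 2         # squared increment of H . L
        H2_qvL += H ** 2 * dL ** 2     # H^2 times the increment of [L]
        L += dL
    return qv_HL, H2_qvL

qv_HL, H2_qvL = discrete_qv_identity()
print(abs(qv_HL - H2_qvL))  # negligible: the identity is per-step algebra
```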
We say that a process S is a semimartingale if S = L + V, where L is a local martingale and V is adapted and has finite variation. One can define the integral with respect to S as the sum of the integrals with respect to L and with respect to V. A fundamental problem is that in the discontinuous case, as we have local martingales with finite variation, the decomposition is not unique. But as for processes with finite variation the two concepts of integration coincide, the stochastic integral with respect to semimartingales is well-defined.

In the first chapter of the book we introduce the basic definitions and some of the elementary theorems of martingale theory. In the second chapter we give an elementary introduction to stochastic integration theory. Our introduction is built on the concept of Itô–Stieltjes integration. In the third chapter we shall discuss the structure of local martingales, and in Chapter Four we shall discuss the general theory of stochastic integration. In Chapter Six we prove Itô's formula. In Chapter Seven we apply the general theory to the classical theory of processes with independent increments.

Finally, it is a pleasure to thank those who have helped me to write this book. In particular I would like to thank Tamás Badics from the University of Pannonia and Petrus Potgieter from the University of South Africa for their efforts. They read most of the book and without their help perhaps I would not have been able
to finish the book. I wish to thank István Dancs and János Száz from Corvinus University for their support and help. I would like to express my gratitude to the Magyar Külkereskedelmi Bank for their support.

Budapest, 2006
[email protected] medvegyev.uni-corvinus.hu
1 STOCHASTIC PROCESSES
In this chapter we first discuss the basic definitions of the theory of stochastic processes. Then we discuss the simplest properties of martingales, the Martingale Convergence Theorem and the Optional Sampling Theorem. In the last section of the chapter we introduce the concept of localization.
1.1 Random functions
Let us fix a probability space (Ω, A, P). As in probability theory, we refer to the set of real-valued (Ω, A)-measurable functions as random variables. We assume that the space (Ω, A, P) is complete, that is, all subsets of measure-zero sets are also measurable. This assumption is not a serious restriction, but it is a bit surprising that we need it. We shall need this assumption many times, for example when we prove that the hitting times¹ of Borel measurable sets are stopping times². When we prove this we shall use the so-called Projection Theorem³, which is valid only when the space (Ω, A, P) is complete. We shall also use the Measurable Selection Theorem⁴ several times, which is again valid only when the measure space is complete. Let us remark that all applications of the completeness assumption are connected to the Predictable Projection Theorem, which is the main tool in the discussion of discontinuous semimartingales.

In the theory of stochastic processes, random variables very often have infinite values. Hence the image space of the measurable functions is not R but the set of extended real numbers R̄ := [−∞, ∞]. The most important examples of random variables with infinite values are stopping times. Stopping times give the random time of the occurrence of observable events. If for a certain outcome ω the event never occurs, it is reasonable to say that the value of the stopping time for this ω is +∞.

1 See: Definition 1.26, page 15.
2 See: Definition 1.21, page 13.
3 See: Theorem A.12, page 550.
4 See: Theorem A.13, page 551.
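As a toy illustration of a stopping time taking the value +∞ (our sketch, not the book's): the first hitting time of a level by a random walk, with the convention inf ∅ = +∞, is infinite exactly for those outcomes ω whose trajectories never reach the level.

```python
import math
import random

random.seed(2)

def hitting_time(path, level):
    """First index n with path[n] >= level; +infinity if there is none
    (the convention: the infimum of the empty set is +infinity)."""
    for n, x in enumerate(path):
        if x >= level:
            return n
    return math.inf

def random_walk(n_steps):
    """One trajectory of a simple symmetric random walk started at 0."""
    x, path = 0, [0]
    for _ in range(n_steps):
        x += random.choice([-1, 1])
        path.append(x)
    return path

times = [hitting_time(random_walk(50), level=10) for _ in range(1000)]
finite = sum(1 for t in times if t < math.inf)
print(finite, len(times))  # only some trajectories reach the level in time
```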
1.1.1 Trajectories of stochastic processes
In the most general sense, stochastic processes are functions X(t, ω) such that for every fixed parameter t the mapping ω → X(t, ω) is a random variable on (Ω, A, P). The set of possible time parameters Θ is some subset of the extended real numbers. In the theory of continuous-time stochastic processes Θ is an interval, generally Θ = R+ := [0, ∞), but sometimes Θ = [0, ∞] or Θ = (0, ∞) is also possible. If we do not say explicitly what the domain of definition of the stochastic process is, then Θ is R+.
It is very important to append some remarks to this definition. In probability theory the random variables are equivalence classes, which means that the random variables X(t) are defined only up to measure-zero sets. This means that in general X(t, ω) is meaningless for a fixed ω. If the possible values of the time parameter t are countable then we can select one element from each equivalence class X(t) and fix a measure-zero set outside of which the expressions X(t, ω) are meaningful. But this is impossible if Θ is not countable5. Therefore, we shall always assume that X(t) is a function already carefully selected from its equivalence class. To put it another way: when one defines a stochastic process, one should fix the space of possible trajectories, and the stochastic processes are function-valued random variables defined on the space (Ω, A, P).

Definition 1.1 Let us fix the probability space (Ω, A, P) and the set of possible time parameters6 Θ. The function X defined on Θ × Ω is a stochastic process over Θ × Ω if for every t ∈ Θ it is measurable on (Ω, A, P) in its second variable.

Definition 1.2 If we fix an outcome ω ∈ Ω then the function t → X(t, ω) defined over Θ is the trajectory or realization of X corresponding to the outcome ω. If all7 the trajectories of the process X have a certain property then we say that the process itself has this property.
For example, if all the trajectories of X are continuous then we say that X is continuous; if all the trajectories of X have finite variation then we say that X has finite variation, etc.
Recall that in probability theory the role of the space (Ω, A, P) is a bit problematic. All the relevant questions of probability theory are related to the joint distributions of random variables, and the whole theory is independent of the specific space carrying the random variables having these joint distributions.

5 This is what the author prefers to call the revenge of the zero sets. This is very serious and it will make our life quite difficult. The routine solution to this challenge is that all the processes which we are going to discuss have some sort of continuity property. In fact, we shall nearly always assume that the trajectories of the stochastic processes are regular, that is, at every point all the trajectories have limits from both sides and they are either right- or left-continuous. As we want to guarantee that the martingales have proper trajectories we shall need the so-called usual assumptions.
6 In most of the applications Θ is the time parameter. Sometimes the natural interpretation of Θ is not time but some spatial parameter. See: Example 1.126, page 90. In the continuous-'time' theory of stochastic processes Θ is an interval in the half-line R+.
7 Not almost all trajectories. See: Definition 1.8, page 6, and Example 1.11, page 8.
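Definitions 1.1 and 1.2 can be illustrated with a small simulation. The sketch below is our own (the function names and the discretization are ad hoc, not from the book): a discretized Wiener-like process is stored as a table, fixing the outcome ω gives a trajectory, and fixing the time index gives a random variable.

```python
import random

def simulate_process(n_outcomes, n_steps, dt=0.01, seed=0):
    """Simulate a process X(t, ω) on a finite grid: one list per outcome ω,
    one entry per time point, built from independent Gaussian increments."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_outcomes):
        x, path = 0.0, [0.0]
        for _ in range(n_steps):
            x += rng.gauss(0.0, dt ** 0.5)
            path.append(x)
        paths.append(path)
    return paths

X = simulate_process(n_outcomes=3, n_steps=5)

# Trajectory (Definition 1.2): t -> X(t, ω) for a fixed outcome ω.
trajectory = X[0]

# Random variable (Definition 1.1): ω -> X(t, ω) for a fixed time index.
X_at_t3 = [X[omega][3] for omega in range(3)]

assert len(trajectory) == 6                  # 6 grid points per trajectory
assert all(path[0] == 0.0 for path in X)     # every trajectory starts at 0
assert len(X_at_t3) == 3                     # one value per outcome
```

Of course, on a countable grid the subtleties about equivalence classes discussed above disappear; the sketch only shows the two 'facets' of a process.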
Of course, it is not sufficient to define the distributions alone. For instance, it is very important to clarify the relation between the lognormal and the normal distribution, and we can do this only when we refer directly to random variables. Hence, somehow, we should assume that there is a measure space carrying the random variables with the given distributions: if ξ has a normal distribution then exp(ξ) has a lognormal distribution. This is a very simple and very important relation which is not directly evident from the density functions. The existence of a space (Ω, A, P) enables us to use the power of measure theory in probability theory, but the specific structure of (Ω, A, P) is highly irrelevant. The space (Ω, A, P) contains the 'causes', but we see only the 'consequences' ξ(ω). We never observe the outcome ω; we can see only its consequence ξ(ω). As the space (Ω, A, P) is irrelevant, one can define it in a 'canonical way'. In probability theory, generally, Ω := R, A := B(R) and P is the measure generated by the distribution function of ξ, or in the multidimensional case Ω := Rn and A := B(Rn). In both cases Ω is the space of all possible realizations. Similarly, in the theory of stochastic processes the only entities which one can observe are the trajectories. Sometimes it is convenient if Ω is the space of possible trajectories. In this case we say that Ω is given in its canonical form. It is worth emphasizing that in probability theory there is no advantage at all in using any specific representation. In the theory of stochastic processes the relevant questions are related to time, and all the information about the time should be somehow coded in Ω. Hence, it is very plausible to assume that the elements of Ω are not just abstract objects which somehow describe the information about the timing of certain events, but are also functions over the set of possible time values.
That is, in the theory of stochastic processes, the canonical model is not just one of the possible representations: it is very often the right model for discussing certain problems.

1.1.2 Jumps of stochastic processes
Of course, the theory of stochastic processes is an application of mathematical analysis. Hence the basic mathematical tool of the theory of stochastic processes is measure theory. To put it another way, perhaps one of the most powerful applications of measure theory is the theory of stochastic processes. But measure theory is deeply sequential, related on a fundamental level to countable objects. We can apply measure theory to continuous-time stochastic processes only if we restrict the trajectories of the stochastic processes to 'countably determined functions'.

Definition 1.3 Let I ⊆ R be an interval and let Y be an arbitrary topological space. We say that the function f : I → Y is regular if at any point t ∈ I, where it is meaningful, f has left-limits

f(t−) := f−(t) := lim_{s↑t} f(s) ∈ Y
and right-limits

f(t+) := f+(t) := lim_{s↓t} f(s) ∈ Y.
We say that f is right-regular if it is regular and right-continuous. We say that f is left-regular if it is regular and left-continuous.
If f is a real-valued function, that is, if Y = R in the above definition, then the existence of limits means that the function has finite limits. As, in this book, stochastic processes are mainly real-valued, to make the terminology as simple as possible we shall always assume that regular processes have finite limits. If the process X is regular and t is an interior point of Θ then, as the limits are finite, it is meaningful to define the jump

∆X(t) := X(t+) − X(t−)

of X at t. It is not too important, but a bit confusing, that one should somehow fix the definition of the jumps of regular processes at the endpoints of the time interval Θ. If Θ = R+ then what is the jump of the function χΘ at t = 0? Is it zero or one?

Definition 1.4 We do not know anything about X before t = 0, so by definition we shall assume that X(0−) := X(0). Therefore for any right-regular process on R+

∆X(0) := X(0+) − X(0−) = 0.    (1.1)
In a similar way, if, for example, Θ := [0, 1) and X := χΘ, then X is right-regular and does not have a jump at t = 1. Observe that in both examples the trajectories were continuous functions on Θ, so it would be a bit strange to say that the jump process of a continuous process is not zero8.
It is not entirely irrelevant how we define the jump process at t = 0. If we consider the process F := χR+ as the distribution function of a measure, then how much is the integral ∫_[0,1] 1dF? We shall assume that the distribution functions are right-regular and not left-regular. By definition9 ∫_0^1 1dF is the integral over (0, 1] and as F is right-regular

8 One can take another approach. In general: what is the value of an undefined variable? If X is the value process of a game and τ is some exit strategy, then what is the value of the game if we never exit from the game, that is, if τ = ∞? It is quite reasonable to say that in this case the value of the game is zero. Starting from this example one can say that once a variable is undefined we shall assume that its value is zero. If one uses this approach then X(0−) := 0 and ∆X(0) = X(0+).
9 In measure theory one can very often find the convention ∫_a^b f dµ := ∫_[a,b) f dµ. We shall assume that the integrator processes are right- and not left-continuous, so we shall use the convention ∫_a^b f dµ := ∫_(a,b] f dµ.
the measure of (0, 1] is F(1) − F(0) = 0, so ∫_0^1 1dF = 0. According to our convention one can think that

∫_[0,1] 1dF = F(1) − F(0−) = F(1) − F(0) = 1 − 1 = 0.

On the other hand one can correctly argue that

∫_[0,1] 1dF = ∫_R χ([0, 1])dF = 1.
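The difference between the two conventions can be checked numerically. The sketch below is our own (the function names are ad hoc): the measure generated by F = χR+ is a unit point mass at 0, and we integrate the constant 1 over (0, 1] versus [0, 1].

```python
def stieltjes_integral(f, atoms, a, b, include_left=False):
    """Integrate f against a purely atomic Stieltjes measure given as
    {point: mass}, over (a, b] by default (the book's convention),
    or over [a, b] when include_left=True."""
    total = 0.0
    for point, mass in atoms.items():
        if a < point <= b or (include_left and point == a):
            total += f(point) * mass
    return total

atoms_of_F = {0.0: 1.0}     # F = χ_{R+} jumps by 1 at t = 0
one = lambda s: 1.0

# ∫_(0,1] 1dF = 0 versus ∫_[0,1] 1dF = 1:
assert stieltjes_integral(one, atoms_of_F, 0.0, 1.0) == 0.0
assert stieltjes_integral(one, atoms_of_F, 0.0, 1.0, include_left=True) == 1.0
```

The whole discrepancy comes from the single atom sitting at the left endpoint, which is exactly why the text excludes {t = 0} from the domain of integration.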
To avoid this type of problem we shall never include the set {t = 0} in the domain of integration.
The regular functions have many interesting properties. We shall very often use the next propositions:

Proposition 1.5 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. For any c > 0 the number of jumps in [a, b] bigger in absolute value than c is finite. The number of jumps of f is at most countable.

Proof. The second part of the proposition is an easy consequence of the first part. Assume that there is an infinite number of points (tn) in [a, b] for which |∆f(tn)| ≥ c. As [a, b] is compact, one can assume that tn → t∗. Obviously we can assume that for an infinite number of points tn ≤ t∗ or t∗ ≤ tn. Hence we can assume that tn ↗ t∗. But f has a left-limit at t∗, so if x, y < t∗ are close enough to t∗ then |f(x) − f(y)| ≤ c/4. If tn is close enough to t∗ and x < tn < y are close enough to tn and to t∗ then

c ≤ |f(tn+) − f(tn−)| ≤ |f(tn+) − f(y)| + |f(y) − f(x)| + |f(x) − f(tn−)| ≤ (3/4)c,
which is impossible.

Proposition 1.6 If a function f is real-valued and regular then it is bounded on any compact interval.

Proof. Fix a finite closed interval [a, b]. If f were not bounded on [a, b] then there would be a sequence (tn) for which |f(tn)| ≥ n. As [a, b] is compact one could assume that tn → t∗. We could also assume that, e.g., tn ↗ t∗, and therefore f(tn) → f(t∗−) ∈ R, which is impossible.
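Propositions 1.5 and 1.6 can be checked on a concrete right-regular step function. In the sketch below (the encoding and names are our own, not from the book), f is constant between its breakpoints, so ∆f(tk) = f(tk+) − f(tk−) is just the difference of consecutive levels; only finitely many jumps exceed any given c > 0, and f is bounded.

```python
import bisect

# f is right-regular: f ≡ levels[k] on [breaks[k], breaks[k+1])
breaks = [0.0, 0.2, 0.5, 0.8]
levels = [1.0, 1.5, 1.25, 3.0]

def f(s):
    return levels[bisect.bisect_right(breaks, s) - 1]

def jump(k):
    # Δf(t_k) = f(t_k+) - f(t_k-); by Definition 1.4, Δf(0) = 0
    return levels[k] - levels[k - 1] if k > 0 else 0.0

big_jump_times = [breaks[k] for k in range(len(breaks)) if abs(jump(k)) > 0.4]
assert big_jump_times == [0.2, 0.8]    # finitely many jumps bigger than c = 0.4
assert jump(2) == -0.25                # the small jump at t = 0.5
assert max(abs(f(s / 100)) for s in range(100)) == 3.0   # bounded (Proposition 1.6)
```

A genuine regular function may of course have countably many jumps; the finiteness claim only concerns the jumps above a fixed threshold.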
Proposition 1.7 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. If the jumps of f are smaller than c then for any ε > 0 there is a δ such that

|f(t′) − f(t″)| < c + ε whenever |t′ − t″| ≤ δ.

Proof. If such a δ were not available then for some δn ↘ 0 for all n there would be t′n, t″n such that |t′n − t″n| ≤ δn and

|f(t′n) − f(t″n)| ≥ c + ε.    (1.2)
As [a, b] is compact, one could assume that t′n → t∗ and t″n → t∗ for some t∗. Notice that, except for a finite number of indexes, (t′n) and (t″n) are on different sides of t∗, since if, for instance, for an infinite number of indexes t′n, t″n ≥ t∗ then for some subsequences t′nk ↘ t∗ and t″nk ↘ t∗ and, as the trajectories of f are regular, limk→∞ f(t′nk) = limk→∞ f(t″nk), which contradicts (1.2). So we can assume that t′n ↗ t∗ and t″n ↘ t∗. Using again the regularity of f, one has |∆f(t∗)| ≥ c + ε, which contradicts the assumption |∆f| ≤ c.

1.1.3 When are stochastic processes equal?
A stochastic process X has three natural 'facets'. The first one is the process itself, which is the two-dimensional 'view'. We shall refer to this as X(t, ω) or just as X. With the first notation we want to emphasize that X is a function of two variables. For instance, the different concepts of measurability, like predictability or progressive measurability, characterize X as a function of two variables. We shall often use the notation X(t), or sometimes Xt, which denotes the random variable ω → X(t, ω), that is, the random variable belonging to moment t. Similarly, we shall use the symbols X(ω) or Xω as well, which refer to the trajectory belonging to ω, that is, X(ω) is the 'facet' t → X(t, ω) of X.

Definition 1.8 Let X and Y be two stochastic processes on the probability space (Ω, A, P).
1. The process X is a modification of the process Y if for all t ∈ Θ the variables X(t) and Y(t) are almost surely equal, that is, for all t ∈ Θ

P(X(t) = Y(t)) := P({ω : X(t, ω) = Y(t, ω)}) = 1.

By this definition, the set of outcomes ω where X(t, ω) = Y(t, ω) can depend on t ∈ Θ.
2. The processes X and Y are indistinguishable if there is a set N ⊆ Ω which has probability zero such that whenever ω ∉ N then X(ω) = Y(ω), that is,

X(t, ω) = Y(t, ω) for all t ∈ Θ and ω ∉ N.
Proposition 1.9 Assume that the realizations of X and Y are almost surely continuous from the left, or that they are almost surely continuous from the right. If X is a modification of Y then X and Y are indistinguishable.

Proof. Let N0 be the set of outcomes where X and Y are not left-continuous or right-continuous. Let (rk) be the set of rational points10 in Θ and let

Nk := {X(rk) ≠ Y(rk)} := {ω : X(rk, ω) ≠ Y(rk, ω)}.

X is a modification of Y, hence P(Nk) = 0 for all k. Therefore if N := ∪_{k=0}^∞ Nk then P(N) = 0. If ω ∉ N then X(rk, ω) = Y(rk, ω) for all k; hence, as the trajectories X(ω) and Y(ω) are continuous from the same side, X(t, ω) = Y(t, ω) for all t ∈ Θ. Therefore outside N obviously X(ω) = Y(ω), that is, X and Y are indistinguishable.

Example 1.10 With modification one can change the topological properties of trajectories.
In the definition of stochastic processes one should always fix the analytic properties, like continuity, regularity, differentiability, etc., of the trajectories. It is not a great surprise that with modification one can dramatically change these properties. For example, let (Ω, A, P) := ([0, 1], B, λ) and Y(t, ω) ≡ 0. The trajectories of Y are continuous. If χQ is the characteristic function of the rational numbers and X(t, ω) := χQ(t + ω), then for all ω the trajectory of X is nowhere continuous, but X is a modification of Y. From the example it is also obvious that it is possible for X to be a modification of Y but for X and Y not to be indistinguishable.
If X and Y are stochastic processes then, unless we explicitly say otherwise, X = Y means that X and Y are indistinguishable.
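A toy version of this phenomenon is easy to test numerically (this is our own illustration, simpler than the χQ example): on Ω = [0, 1] with Lebesgue measure, let Y ≡ 0 and X(t, ω) = 1 if t = ω and 0 otherwise. For each fixed t the exceptional set {ω : ω = t} is a single point, hence a null set, so X is a modification of Y; yet every single trajectory of X differs from the zero trajectory.

```python
import random

def X(t, omega):
    # X(t, ω) = χ_{{t}}(ω): equals 1 only on the diagonal t = ω
    return 1.0 if t == omega else 0.0

rng = random.Random(1)
omegas = [rng.random() for _ in range(10_000)]

# Modification: for a fixed t, X(t) = 0 almost surely
# (a sample practically never hits the single point ω = t).
t = 0.5
assert sum(X(t, w) for w in omegas) == 0.0

# Not indistinguishable: every trajectory t -> X(t, ω) differs
# from the zero trajectory at the point t = ω.
assert all(X(w, w) == 1.0 for w in omegas)
```

The asymmetry is exactly the one in Definition 1.8: the null set of exceptional outcomes moves with t, and its union over uncountably many t is the whole of Ω.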
1.2 Measurability of Stochastic Processes
As we have already mentioned, the theory of stochastic processes is an application of measure theory. On the one hand this remark is almost unnecessary, as measure theory is the cornerstone of every serious application of mathematical analysis. On the other hand, it is absolutely critical how one defines the class of measurable functions which one can use in stochastic analysis. Every stochastic process is a function of two variables, so it is obvious to assume that every process is product measurable.

10 Recall that Θ is an interval in R. If X and Y are left-continuous then left-continuity is meaningless at the left endpoint of Θ, so if Θ has a left endpoint then we assume that this left endpoint is part of (rk). Similarly, when X and Y are right-continuous and Θ has a right endpoint then we assume that this endpoint is in (rk).

Example 1.11 An almost surely continuous process is not necessarily product measurable.
Let (Ω, A, P) := ([0, 1], B, λ) and let E be a subset of [0, 1] which is not Lebesgue measurable. The process

X(t, ω) := χE(t) if ω = 0, and X(t, ω) := 0 if ω ≠ 0,

is almost surely continuous. X is not product measurable, as by Fubini's theorem product measurability implies partial measurability, but if ω = 0 then t → X(t, ω) is not measurable.
Although the example is trivial, it is not without interest. Processes X and Y are considered to be equal if they are indistinguishable. So in theory it can happen that X is product measurable and X = Y but Y is not product measurable. To avoid this type of measurability problem we should, for example, assume that the different objects of stochastic analysis, like martingales, local martingales, semimartingales, etc., are right-regular and not just almost surely right-regular. Every trajectory of a Wiener process should be continuous, but it can happen that it starts only almost surely from zero.

1.2.1 Filtration, adapted, and progressively measurable processes
A fundamental property of time is its 'irreversibility'. This property of time is expressed with the introduction of the filtration.

Definition 1.12 Let us fix a probability space (Ω, A, P). For every t ∈ Θ let us select a σ-algebra Ft ⊆ A in such a way that whenever s < t then Fs ⊆ Ft. The correspondence t → Ft is called a filtration and we shall denote this correspondence by F. The quadruplet (Ω, A, P, F) is called a stochastic basis. With the filtration F one can define the σ-algebras

Ft+ := ∩_{s>t} Fs,    Ft− := σ(∪_{s<t} Fs),    F∞ := σ(Ft : t ∈ Θ).

1. The filtration F is right-continuous if Ft = Ft+ for all t.
2. The filtration F is left-continuous if Ft = Ft− for all t.
3. We say that the filtration F satisfies the usual conditions if F is right-continuous and Ft contains, for all t, all the measure-zero sets of (Ω, A, P).
4. We say that the stochastic basis (Ω, A, P, F) satisfies the usual conditions if (Ω, A, P) is complete and the filtration F satisfies the usual conditions.
It is obvious from the introduced terminology that generally we shall assume that the filtration and the stochastic basis satisfy the usual conditions. The usual interpretation of the σ-algebra Ft is that it contains the events which occurred up to time t, that is, Ft contains the information which is available at moment t. As Ft is the information at moment t, one can interpret Ft− as the information available before t and Ft+ as the information available just after11 t.
A quite natural question is how one can define a filtration F. Let X be a stochastic process, that is, let X be a function of two variables. Assume that X is product measurable. In this case X(t) is A-measurable for all t. Let us define the σ-algebras FtX ⊆ A generated by the sets

{X(t1) ∈ I1, . . . , X(tn) ∈ In}    (1.3)

where t1, . . . , tn ≤ t are arbitrary elements in Θ and I1, . . . , In are arbitrary intervals. Obviously if s < t then FsX ⊆ FtX, hence FX is really a filtration. FX is called the filtration generated by X.

Example 1.13 If w is the canonical Wiener process then Fw, the filtration generated by w, is not right-continuous.
Let w be the canonical Wiener process. By definition this means that the set of trajectories of w is the space

Ω := {f : f ∈ C(R+) and f(0) = 0}.

Let F be the set of outcomes ω ∈ Ω for which there is an ε > 0 such that on the interval [0, ε] the trajectory w(ω) is zero. Obviously F = ∪n Fn, where Fn is the set of outcomes ω for which w(ω) is zero on the interval [0, 1/n]. Fn is measurable as it is equal to the set

∩ {w(rn) = 0 : rn ∈ [0, 1/n] ∩ Q}.

Obviously P(Fn) = 0, therefore P(F) = 0. By definition w(0) ≡ 0, therefore F0w = {Ω, ∅}. Hence F ∉ F0w. If t > 0 and 1/n ≤ t, then obviously Fn ∈ Ftw, therefore ∪1/n≤t Fn ∈ Ftw. On the other hand, for every t > 0 evidently ∪1/n≤t Fn = F, since obviously ∪1/n≤t Fn ⊆ F, and if ω ∈ F then ω ∈ Fn ⊆ ∪1/n≤t Fn for some index n. Hence F ∈ ∩t>0 Ftw = F0+w, that is, F0w ≠ F0+w.
Let us remark that, as we shall see later, if N is the collection of sets with

11 One can observe that the interpretation of Ft− is intuitively quite appealing, but the interpretation of Ft+ looks a bit unclear. It is intuitively not obvious what type of information one can get in an infinitesimally short time interval after t, or, to put it another way, it is not too clear why one can have Ft ≠ Ft+. Therefore from an intuitive point of view it is not a great surprise that we shall generally assume that Ft = Ft+.
measure zero in A then the filtration Ft := σ(Ftw ∪ N) is right-continuous, so this extended F satisfies the usual conditions12. The σ-algebra F0w = {Ω, ∅} is complete, which implies that to make F right-continuous one should add to the σ-algebra Ftw all the null sets from A, or at least the null sets of Ftw for all t; it is not sufficient to complete the σ-algebras Ftw separately.

Definition 1.14 We say that a process X is adapted to the filtration F if X(t) is measurable with respect to Ft for every t. A set A ⊆ Θ × Ω is adapted if the process χA is adapted.

In the following we shall fix a stochastic basis (Ω, A, P, F), and if we do not say otherwise we shall always assume that all stochastic processes are adapted with respect to the filtration F of the stochastic basis. It is easy to see that the adapted sets form a σ-algebra.

Example 1.15 If Ft ≡ {∅, Ω} for all t then only the deterministic processes are adapted. If Ft ≡ A for all t then every product measurable stochastic process is adapted.
The concept of adapted processes is a dynamic generalization of partial measurability. The dynamic generalization of product measurability is progressive measurability:

Definition 1.16 A set A ⊆ Θ × Ω is progressively measurable if for all t ∈ Θ

A ∩ ([0, t] × Ω) ∈ Rt := B([0, t]) × Ft,

that is, for all t the restriction of A to [0, t] × Ω is measurable with respect to the product σ-algebra Rt := B([0, t]) × Ft. The progressively measurable sets form a σ-algebra R. We say that a process X is progressively measurable if it is measurable with respect to R. It is clear from the definition that every progressively measurable process is adapted.

Example 1.17 An adapted process which is not progressively measurable.
12 See: Proposition 1.103, page 67.
Let Ω := Θ := [0, 1] and let Ft := A be the σ-algebra generated by the finite subsets of Ω. If D := {t = ω} then the function X := χD is obviously adapted. We prove that it is not product measurable. Assume that {X = 1} = D ∈ B(Θ) × A. By the definition of product measurability Y := [0, 1/2] × Ω ∈ B(Θ) × A. So if D ∈ B(Θ) × A then Y ∩ D ∈ B(Θ) × A. Therefore by the projection theorem13 [0, 1/2] ∈ A, which is impossible. Therefore D ∉ B(Θ) × A. Hence, if Ft := A for all t, then X is adapted but not progressively measurable.

Example 1.18 Every adapted, left-continuous process and every adapted, right-continuous process is progressively measurable14.
Assume, for example, that X is adapted and continuous from the right. Fix a t and let 0 = t0(n) < t1(n) < . . . < tk(n) = t be a partition of [0, t]. Let us define the processes

Xn(s) := X(0) if s = 0, and Xn(s) := X(tk(n)) if s ∈ (tk−1(n), tk(n)].

As X is adapted, Xn is measurable with respect to the σ-algebra Rt := B([0, t]) × Ft. If the sequence of partitions (tk(n)) is infinitesimal, that is, if

lim_{n→∞} max_k (tk(n) − tk−1(n)) = 0,

then, as X is right-continuous, Xn → X. Therefore the restriction of X to [0, t] is Rt-measurable. Hence X is progressively measurable.

Example 1.19 If X is regular then ∆X is progressively measurable.
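The approximation used in Example 1.18 can be sketched numerically (the function names and the test path below are our own). Xn copies the value of X from the right endpoint of each partition interval, so for a right-continuous path Xn(s) → X(s) as the mesh shrinks.

```python
def step_approximation(X, t, n):
    """X_n(0) = X(0); X_n(s) = X(t_k) for s in (t_{k-1}, t_k],
    over the uniform partition t_k = k t / n of [0, t]."""
    grid = [k * t / n for k in range(n + 1)]
    def Xn(s):
        if s == 0:
            return X(0.0)
        for k in range(1, n + 1):
            if grid[k - 1] < s <= grid[k]:
                return X(grid[k])
    return Xn

# A right-continuous (but not left-continuous) test path:
X = lambda s: 1.0 if s >= 0.3 else 0.0

s = 0.29
approximations = [step_approximation(X, 1.0, n)(s) for n in (4, 64, 1024)]
assert approximations[0] == 1.0       # coarse grid overshoots the jump at 0.3
assert approximations[-1] == X(s)     # fine grids recover X(s) by right-continuity
```

Note that the approximation samples X only at the right endpoints t_k ≥ s, which is exactly why right-continuity, not left-continuity, is what makes Xn(s) converge to X(s) here.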
Like product measurability, progressive measurability is also a very mild assumption. It is perhaps the mildest measurability concept one can use in stochastic analysis. The main reason why one should introduce this concept is the following much-used observation:

Proposition 1.20 Assume that V is a right-regular, adapted process and assume that every trajectory of V has finite variation on every finite interval [0, t].
1. If for every ω the trajectories X(ω) are integrable on any finite interval with respect to the measure generated by V(ω) then the parametric

13 If P(N) := 0 when N is countable and P(N) := 1 otherwise, then the probability space (Ω, A, P) is complete.
14 In particular, if X(t, ω) is measurable in ω and continuous in t then X is product measurable.
integral process

Y(t, ω) := ∫_0^t X(s, ω) V(ds, ω) := ∫_(0,t] X(s, ω) V(ds, ω)    (1.4)
forms a right-regular process and ∆Y = X · ∆V.
2. If additionally X is progressively measurable then Y is adapted.

Proof. The first statement of the proposition is a direct consequence of the Dominated Convergence Theorem. Observe that to prove the second statement one cannot directly apply Fubini's theorem, but one can easily adapt its usual proof: Let H denote the set of bounded processes for which Y(t) in (1.4) is Ft-measurable. As the measure of finite intervals is finite, H is a linear space, it contains the constant process X ≡ 1, and if 0 ≤ Hn ∈ H and Hn ↗ H and H is bounded then by the Monotone Convergence Theorem H ∈ H. This implies that H is a λ-system. If C ∈ Ft and s1, s2 ≤ t, and B := (s1, s2] × C, then as V is adapted the integral
∫_0^t χB dV = χC [V(s2) − V(s1)]
is Ft-measurable. These processes form a π-system, hence by the Monotone Class Theorem H contains the processes which are measurable with respect to the σ-algebra generated by the processes χC χ((s1, s2]). As C ∈ Ft, the π-system generates the σ-algebra of the product measurable sets B((0, t]) × Ft. X is progressively measurable, so its restriction to (0, t] is (B((0, t]) × Ft)-measurable. Hence the proposition is true if X is bounded. From this the general case follows from the Dominated Convergence Theorem.
What is the intuitive idea behind progressive measurability? Generally the filtration F is generated by some process X. Recall that if Z := (ξα)α∈A is a set of random variables and X := σ(ξα : α ∈ A) denotes the σ-algebra generated by them, then X = ∪S⊆A XS, where the subsets S are arbitrary countable subsets of A and for any S the set XS denotes the σ-algebra generated by the countably many variables (ξαi)αi∈S of Z, that is, XS := σ(ξαi : αi ∈ S). By this structure of the generated σ-algebras, FtX contains all the information one can obtain by observing X up to time t countably many times. If a process Y is adapted with respect to FX then Y reflects the information one can obtain from countably many observations of X. But sometimes, like in (1.4), we want information
which depends on an uncountable number of observations of the underlying random source. In these cases one needs progressive measurability!

1.2.2 Stopping times
After filtration, stopping time is perhaps the most important concept of the theory of stochastic processes. As stopping times describe the moments when certain random events occur, it is not a great surprise that most of the relevant questions of the theory are somehow related to stopping times. It is important that not every random time is a stopping time. Stopping times are related to events described by the filtration of the stochastic basis15. At every time t one can observe only the events of the probability space (Ω, Ft, P). If τ is a random time then at time t one cannot observe the whole of τ; one can observe only the random variable τ ∧ t! By definition τ is a stopping time if τ ∧ t is an (Ω, Ft, P)-random variable for all t.

Definition 1.21 Let Ω be the set of outcomes and let F be a filtration on Ω. Let τ : Ω → Θ ∪ {∞}.
1. The function τ is a stopping time if for every t ∈ Θ

{τ ≤ t} ∈ Ft.

We denote the set of stopping times by Υ.
2. The function τ is a weak stopping time if for every t ∈ Θ

{τ < t} ∈ Ft.

Example 1.22 Almost-surely zero functions and stopping times.
Assume that the probability space (Ω, A, P) is complete and for every t the σ-algebra Ft contains the measure-zero sets of A. If N ⊆ Ω is a measure-zero set and the function τ ≥ 0 is zero on the complement of N, then τ is a stopping time, as for all t the set {τ > t} ⊆ N has measure zero, hence {τ > t} ∈ Ft and therefore {τ ≤ t} ∈ Ft. In a similar way, if σ ≥ 0 is almost surely +∞ then σ is a stopping time. These examples are special cases of the following: if (Ω, A, P, F) satisfies the usual conditions, τ is a stopping time, and σ ≥ 0 is almost surely equal to τ, then σ is also a stopping time.
We shall see several times that in the theory of stochastic processes the time axis is not symmetric. The filtration defines an orientation on the real axis.

15 If we travel from a city to the countryside then the moment when we arrive at the first pub after we leave the city is a stopping time, but the time when we arrive at the last pub before we leave the city is not a stopping time. In a similar way, when X is a stochastic process the first time X is zero is a stopping time, but the last time it is zero is not a stopping time. One of the most important random times which is generally not a stopping time is the moment when X reaches its maximum on a certain interval. See: Example 1.110, page 73.
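Definition 1.21 and the pub example of footnote 15 can be made concrete in a finite model (this construction is entirely our own: the coin-flip space, the names, and the measurability test are ad hoc). Take Ω to be the length-3 coin-flip sequences and let Fn be generated by the first n flips; an event belongs to Fn exactly when it is a union of flip-prefix atoms.

```python
from itertools import product

# Ω = all coin-flip sequences of length 3, uniform model.
Omega = list(product('HT', repeat=3))

def is_F_n_measurable(event, n):
    # F_n-atoms are length-n flip prefixes; an event is in F_n iff
    # outcomes sharing a prefix agree on membership (a <= b is "a implies b").
    return all((w[:n] == v[:n]) <= ((w in event) == (v in event))
               for w in Omega for v in Omega)

def tau(omega):
    # first time a head appears, +∞ if it never does
    return next((k + 1 for k, flip in enumerate(omega) if flip == 'H'),
                float('inf'))

# {τ ≤ n} ∈ F_n for every n, so τ is a stopping time:
for n in (1, 2, 3):
    assert is_F_n_measurable({w for w in Omega if tau(w) <= n}, n)

# The time of the *last* head is not a stopping time:
# {last ≤ 1} requires knowledge of the future flips.
last = lambda w: max((k + 1 for k, f in enumerate(w) if f == 'H'), default=0)
assert not is_F_n_measurable({w for w in Omega if last(w) <= 1}, 1)
```

The failing event contains ('H','T','T') but not ('H','H','H'), two outcomes with the same first flip, so it cannot belong to F1.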
An elementary but very important consequence of this orientation is the following proposition:

Proposition 1.23 Every stopping time is a weak stopping time. If the filtration F is right-continuous then every weak stopping time is a stopping time.

Proof. As the filtration F is increasing, if τ is a stopping time then for all n

{τ ≤ t − 1/n} ∈ Ft−1/n ⊆ Ft.

Therefore

{τ < t} = ∪n {τ ≤ t − 1/n} ∈ Ft.

On the other hand, if F is right-continuous, that is, if Ft+ = Ft, then

{τ ≤ t} = ∩n {τ < t + 1/n} ∈ ∩n Ft+1/n = Ft+ = Ft.

The right-continuity of the filtration is used in the next proposition as well.

Proposition 1.24 If τ and σ are stopping times then τ ∧ σ and τ ∨ σ are also stopping times. If (τn) is an increasing sequence of stopping times then

τ := lim_{n→∞} τn

is a stopping time. If the filtration F is right-continuous and (τn) is a decreasing sequence of stopping times then

τ := lim_{n→∞} τn

is a stopping time.

Proof. If τ and σ are stopping times then

{τ ∧ σ ≤ t} = {τ ≤ t} ∪ {σ ≤ t} ∈ Ft,
{τ ∨ σ ≤ t} = {τ ≤ t} ∩ {σ ≤ t} ∈ Ft.

If τn ↗ τ then for all t

{τ ≤ t} = ∩n {τn ≤ t} ∈ Ft.

If τn ↘ τ then for all t

{τ ≥ t} = ∩n {τn ≥ t} = ∩n {τn < t}^c ∈ Ft,
that is,

{τ < t} = ∪n {τn < t} ∈ Ft.

Hence τ is a weak stopping time. If the filtration F is right-continuous then τ is a stopping time.

Corollary 1.25 If the filtration F is right-continuous and (τn) is a sequence of stopping times then

sup_n τn,    inf_n τn,    lim sup_{n→∞} τn,    lim inf_{n→∞} τn

are stopping times.

The next definition concretizes the abstract definition of stopping times:

Definition 1.26 If Γ ⊆ R+ × Ω then the expression

τΓ(ω) := inf {t : (t, ω) ∈ Γ}    (1.5)
is called the début of the set Γ. If B ⊆ Rn and X is a vector-valued stochastic process then

τB(ω) := inf {t : X(t, ω) ∈ B}    (1.6)

is called the hitting time of the set B. If B ⊆ R and X is a stochastic process and if Γ := {X ∈ B} then τΓ = τB, which means that every hitting time is a special début.

Example 1.27 The most important hitting times are the random functions τa(ω) := inf {t : X(t, ω) R a}, where R is one of the relations ≥, >, ≤, <. These types of hitting times are the so-called first passage times.
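First passage times are easy to experiment with in discrete time (a sketch with our own names; the random walk and the level are arbitrary choices). The key points are that whether τa ≤ n is decided by the path up to time n alone, and that τa = +∞ when the level is never reached, matching the convention for stopping times stated earlier.

```python
import random

def first_passage(path, a):
    """τ_a = inf{ n : path[n] >= a }, with inf over the empty set = +∞."""
    for n, x in enumerate(path):
        if x >= a:
            return n
    return float('inf')

# A simple symmetric random walk:
rng = random.Random(7)
x, path = 0, [0]
for _ in range(1000):
    x += rng.choice([-1, 1])
    path.append(x)

tau = first_passage(path, 10)
if tau != float('inf'):
    # {τ ≤ n} depends only on the path up to time n:
    assert first_passage(path[:tau + 1], 10) == tau
    assert all(v < 10 for v in path[:tau]) and path[tau] >= 10

# An unreachable level gives τ = +∞:
assert first_passage(path, 10**9) == float('inf')
```

By contrast, the time of the walk's maximum over the whole horizon cannot be computed from any proper prefix of the path, which is the discrete analogue of the remark that the argmax is generally not a stopping time.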
Theorem 1.28 (Construction of stopping times) If the stochastic basis (Ω, A, P, F) satisfies the usual conditions and Γ is progressively measurable then the début of Γ is a stopping time.

Proof. Define the set Γt := Γ ∩ ([0, t) × Ω). If τΓ(ω) < t then for some s obviously (s, ω) ∈ Γt. Hence ω ∈ projΩ(Γt). On the other hand, if ω is in projΩ(Γt)
then for some s ∈ [0, t) obviously (s, ω) ∈ Γ, hence τ Γ (ω) ≤ s < t, that is

{τ Γ < t} = projΩ (Γt ) .

Γ is progressively measurable, hence

Γt = (Γ ∩ ([0, t] × Ω)) ∩ ([0, t) × Ω) ∈ B ([0, t]) × Ft .

Recall that the projections of product measurable sets are not necessarily measurable. By the usual conditions A is complete, and also by the usual conditions Ft contains all the measure-zero sets, hence Ft is also complete and, therefore, by the Projection Theorem¹⁶, the projection of the product measurable set Γt is Ft -measurable, so

{τ Γ < t} = projΩ (Γt ) ∈ Ft .

As F is right-continuous¹⁷ every weak stopping time is a stopping time, so τ Γ is a stopping time.

Corollary 1.29 If the stochastic base (Ω, A, P, F) satisfies the usual conditions, the process X is progressively measurable and B is a Borel set then the hitting time of B is a stopping time.

Let X be a progressively measurable process and let σ be a stopping time. Instead of (1.5) very often we are interested in variables of the type

τ := inf {t > σ : X(t) ∈ B} .

The set

Γ := {(t, ω) : X(t, ω) ∈ B} ∩ {(t, ω) : t > σ (ω)}

is progressively measurable since, by the progressive measurability of X, the first set in the intersection is progressively measurable, and the characteristic function of the other set is adapted and left-continuous, hence it is also progressively measurable. By the theorem above, if (Ω, A, P, F) satisfies the usual conditions, then the expression τ = τ Γ := inf {t : (t, ω) ∈ Γ} is a stopping time.

¹⁶ See: Theorem A.12, page 550.
¹⁷ It can happen that (s, ω) ∈ Γ for all s > t, but (t, ω) ∉ Γ. In this case τ Γ (ω) = t, but ω ∉ projΩ (Γ ∩ ([0, t) × Ω)); therefore in the proof we used the right-continuity of the filtration.
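Along a sampled path the recursion τ 0 := 0, τ n+1 := inf {t > τ n : X(t) ∈ B} of successive visit times can be mimicked step by step. The following is a hedged sketch of ours (the grid, the path and the set B are invented for illustration):

```python
import math

def hitting_times(times, values, in_B, n_max=10):
    """Successive visits: tau_0 = 0, tau_{n+1} = inf{t > tau_n : X(t) in B},
    evaluated along a sampled trajectory. in_B tests membership in the set B."""
    taus = [0.0]
    for _ in range(n_max):
        nxt = math.inf
        for t, x in zip(times, values):
            if t > taus[-1] and in_B(x):
                nxt = t
                break
        taus.append(nxt)
        if nxt == math.inf:       # B is never visited again
            break
    return taus

# Illustration: a path with isolated spikes into B = [1, inf) at t = 1, 2, 3, 4.
grid = [i / 10 for i in range(41)]
path = [1.0 if abs(t - round(t)) < 1e-9 and round(t) >= 1 else 0.0 for t in grid]
taus = hitting_times(grid, path, lambda x: x >= 1.0)
```

Each τ n here is the début of the set {t > τ n−1 : X(t) ∈ B} restricted to the sampling grid.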
MEASURABILITY OF STOCHASTIC PROCESSES
17
Corollary 1.30 If the stochastic base (Ω, A, P, F) satisfies the usual conditions, the process X is progressively measurable and B is a Borel set then the hitting times

τ 0 := 0,   τ n+1 := inf {t > τ n : X(t) ∈ B}

are stopping times. Example 1.31 If X is not progressively measurable then the hitting times of Borel sets are not necessarily stopping times.
Let X := χD be the adapted but not progressively measurable process in Example 1.17. The hitting time of the set B := {1} is obviously not a stopping time as

{τ B ≤ 1/2} = [0, 1/2] ∉ A = F1/2 .

The main advantage of the above construction is its generality. An obvious disadvantage of the just proved theorem is that it builds on the Projection Theorem. Very often we do not need the generality of the above construction and we can construct stopping times without referring to the Projection Theorem. Example 1.32 Construction of stopping times without the Projection Theorem.
1. If the set B is closed and X is a continuous, adapted process then one can easily prove that the hitting time (1.6) is a stopping time. As the trajectories are continuous the sets K(t, ω) := X ([0, t] , ω) are compact for every outcome ω. As B is closed, K(t, ω) ∩ B = ∅ if and only if the distance between the two sets is positive. Therefore K(t, ω) ∩ B = ∅ if and only if τ B (ω) > t. As the trajectories are continuous, X([0, t] ∩ Q, ω) is dense in the set K(t, ω). As the metric is a continuous function

{τ B ≤ t} = {K(t) ∩ B = ∅} = {d (K(t), B) = 0} = {ω : inf {d(X (s, ω) , B) : s ≤ t, s ∈ Q} = 0} .

X(s) is Ft -measurable for a fixed s ≤ t, hence, as x → d (x, B) is continuous, d(X(s), B) is also Ft -measurable. The infimum of a countable number of measurable functions is measurable, hence {τ B ≤ t} ∈ Ft .
2. We prove that if B is open, the trajectories of X are right-continuous and adapted, and the filtration F is right-continuous, then the hitting time (1.6) is a stopping time. It is sufficient to prove that {τ B < t} ∈ Ft for all t. As the trajectories are right-continuous and as B is open, X(s, ω) ∈ B if and only if there is an ε > 0 such that whenever u ∈ [s, s + ε) then X(u, ω) ∈ B. From this

{τ B < t} = ∪s∈Q∩[0,t) {X(s) ∈ B} ∈ Ft .

3. In a similar way one can prove that if X is left-continuous and adapted, F is right-continuous, and B is open, then the hitting time τ B is a stopping time.
4. If the filtration is right-continuous, and X is a right- or left-continuous adapted process, then for any number c the first passage time τ := inf {t : X(t) > c} is a stopping time.
5. If B is open and the filtration is not right-continuous, then even for continuous processes the hitting time τ B is not necessarily a stopping time¹⁸. If X(t, ω) := t · ξ(ω), where ξ is a Gaussian random variable, and Ft is the filtration generated by X, then F0 = {∅, Ω}, and the hitting time τ B of the set B := {x > 0} is

τ B (ω) = 0 if ξ(ω) > 0, and ∞ if ξ(ω) ≤ 0.

Obviously {τ B ≤ 0} ∉ F0 , so τ B is not a stopping time.
6. Finally we show that if σ is an arbitrary stopping time, X is a right-regular, adapted process and c > 0, then the first passage time

τ (ω) := inf {t > σ : |∆X(t, ω)| ≥ c}

is a stopping time. Let us fix an outcome ω and let us assume that ∞ > tn ↘ τ (ω), where |∆X(tn , ω)| ≥ c. The trajectory X(ω) is right-regular, therefore the jumps which are bigger than c do not have an accumulation point. Hence for all indexes n large enough tn is already constant, that is τ (ω) = tn > σ (ω), so |∆X(τ (ω))| = |∆X(tn )| ≥ c for some n. This means that |∆X (τ )| ≥ c on the set {τ < ∞} and on the set {σ < ∞} one has τ > σ. Let A(t) := ([0, t] ∩ Q) ∪ {t}. We prove that τ (ω) ≤ t if and only if for all n ∈ N one can find a pair qn , pn ∈ A(t) for which σ(ω) < pn < qn < pn + 1/n and

|X(pn , ω) − X(qn , ω)| ≥ c − 1/n.   (1.7)

One implication is evident: if τ (ω) ≤ t, then as the jumps bigger than c do not have accumulation points, |∆X(s, ω)| ≥ c for some σ(ω) < s ≤ t. Hence by the regularity of the trajectories one can construct the necessary sequences. On the other hand, let us assume that the sequences (pn ), (qn ) exist. Without loss of generality one can assume that (pn ) and (qn ) are convergent. Let σ (ω) ≤ s ≤ t be the common limit point of these sequences. If pn ≥ s for an infinite number of indexes, then in any right neighbourhood of s there is an infinite number of intervals [pn , qn ] on which X changes more than c/2 > 0, which is impossible as X is right-continuous. Similarly, qn ≤ s holds only for a finite number of indexes, as otherwise pn < qn ≤ s for an infinite number of indexes, which is impossible as X(ω) has left limits. This means that for indexes n big enough σ (ω) < pn ≤ s ≤ qn . Taking the limit in (1.7), |∆X(s, ω)| ≥ c and hence τ (ω) ≤ s ≤ t. Using this property one can easily prove that

{τ ≤ t} = ∩n∈N ∪p,q∈A(t), p<q<p+1/n ( {σ < q} ∩ { |X (p) − X (q)| ≥ c − 1/n } ) .

A(t) is countable, X is adapted, therefore {τ ≤ t} ∈ Ft , which means that τ is a stopping time.
7. If X is a regular process and c > 0 then the hitting time τ := inf {t : |∆X(t)| ≥ c} is a stopping time.

¹⁸ The reason for this is clear, as the event {τ B = t} can contain outcomes ω whose trajectory hits the set B just after t; therefore one should investigate the events {τ B < t}.

1.2.3
Stopped variables, σ-algebras, and truncated processes
With stopping times one can define stopped variables, truncated processes, and the stopped σ-algebras: Definition 1.33 Let X be a stochastic process, and let τ be a stopping time. 1. By a stopped or truncated process we mean the process X τ (t, ω) := X (τ (ω) ∧ t, ω) . 2. We shall call the random variable Xτ (ω) := X (τ (ω) , ω)
a stopped variable. Instead of Xτ we shall very often use the more readable notation X (τ ). Observe that the definition of the stopped variable is not entirely correct as X is generally not defined on the set {τ = ∞} and it is not clear what the definition of Xτ on this set is. If τ (ω) ∉ Θ then one can use the definition¹⁹ Xτ (ω) := 0. If one uses the convention that the product of an undefined value with zero is zero, then one can write the definition of the stopped variable Xτ in the following way:

Xτ (ω) := X (τ (ω) , ω) χ (τ ∈ Θ) (ω) .

3. The stopped σ-algebra Fτ is the set of events A ∈ A for which for all t

A ∩ {τ ≤ t} ∈ Ft .

4. Fτ + is the set of events A ∈ A for which A ∩ {τ ≤ t} ∈ Ft+ for all t.

One can easily check that Fτ and Fτ + are really σ-algebras. For example, if A ∈ Fτ then Ac ∈ Fτ as for every t

Ac ∩ {τ ≤ t} = {τ ≤ t} \ (A ∩ {τ ≤ t}) ∈ Ft ,

and if An ∈ Fτ then (∪n An ) ∩ {τ ≤ t} = ∪n (An ∩ {τ ≤ t}) ∈ Ft . It is easy to see that if τ ≡ t, then Fτ = Ft and Fτ + = Ft+ , hence the notation is unambiguous. If we assume that the usual conditions are satisfied then of course Fτ = Fτ + , hence there are not too many important theorems where the σ-algebra Fτ + plays a role. There are many simple observations related to the stopped processes, variables, and σ-algebras. Their proofs are generally one or two lines. Let us show some of them:

Proposition 1.34 Let F be a filtration, and let τ and σ be stopping times.
1. τ is Fτ -measurable.
2. If σ ≤ τ then Fσ ⊆ Fτ .

¹⁹ If X is the value process of some game and τ is an exit strategy then the present definition of Xτ is quite reasonable.
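On sampled paths the two operations of the definition are elementary to emulate. The sketch below is our own illustration (function names and data are ours, not the book's), including the convention Xτ := 0 on {τ = ∞}:

```python
import math

def stopped_process(times, values, tau):
    """The truncated process X^tau(t) = X(tau ∧ t) along a sampled path."""
    x_at_tau = None
    out = []
    for t, x in zip(times, values):
        if t <= tau:
            x_at_tau = x          # last observed value before or at tau
            out.append(x)
        else:
            out.append(x_at_tau)  # frozen at X(tau) after the stopping time
    return out

def stopped_variable(times, values, tau):
    """X_tau = X(tau) on {tau < inf}, and 0 otherwise (the text's convention)."""
    if math.isinf(tau):
        return 0.0
    return next(x for t, x in zip(times, values) if t >= tau)

grid = [0.0, 1.0, 2.0, 3.0, 4.0]
path = [0.0, 2.0, 4.0, 6.0, 8.0]
frozen = stopped_process(grid, path, tau=2.0)
```

After time τ the truncated path stays constant at the stopped value, which is exactly what the definition X τ (t) = X (τ ∧ t) prescribes.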
3. Fσ ∩ Fτ = Fσ∧τ .
4. {σ ≤ τ } , {σ < τ } , {σ = τ } are Fσ∧τ -measurable.

Proof. The proofs are simple consequences of the definitions.
1. We prove that {τ ≤ s} ∈ Fτ for all s. Let t be arbitrary. As τ is a stopping time

{τ ≤ s} ∩ {τ ≤ t} = {τ ≤ s ∧ t} ∈ Fs∧t ⊆ Ft ,

so {τ ≤ s} ∈ Fτ by the definition of Fτ . Hence τ is Fτ -measurable.
2. If σ ≤ τ then {τ ≤ t} ⊆ {σ ≤ t}. If A ∈ Fσ then

A ∩ {τ ≤ t} = (A ∩ {σ ≤ t}) ∩ {τ ≤ t} ∈ Ft ,

as both sets in the intersection are in Ft . Hence A ∈ Fτ .
3. By the previous property Fτ∧σ ⊆ Fτ ∩ Fσ . On the other hand if A ∈ Fτ ∩ Fσ then

A ∩ {σ ∧ τ ≤ t} = A ∩ ({σ ≤ t} ∪ {τ ≤ t}) = (A ∩ {σ ≤ t}) ∪ (A ∩ {τ ≤ t}) ∈ Ft .

Hence A ∈ Fσ∧τ .
4. It is sufficient to prove that if σ and τ are stopping times then {σ ≤ τ } , {τ ≤ σ} ∈ Fσ . From this, by symmetry, {σ ≤ τ } ∈ Fσ ∩ Fτ = Fσ∧τ , and {σ = τ } = {σ ≤ τ } ∩ {τ ≤ σ} ∈ Fσ∧τ and {σ < τ } = {σ ≤ τ } \ {σ = τ } ∈ Fσ∧τ . From the definition of stopping times, if r ≤ t then

{σ > r > τ } = {σ > r} ∩ {τ < r} = {σ ≤ r}ᶜ ∩ {τ < r} ∈ Ft .

From this

{σ ≤ τ }ᶜ ∩ {σ ≤ t} = {σ > τ } ∩ {σ ≤ t} = ∪r∈Q {σ > r > τ } ∩ {σ ≤ t} = ∪r∈Q,r≤t {σ > r > τ } ∩ {σ ≤ t} ∈ Ft .

Hence by the definition of Fσ one has {σ ≤ τ } ∈ Fσ . On the other hand

{τ ≤ σ} ∩ {σ ≤ t} = {σ ≤ t} ∩ {τ ≤ t} ∩ {τ ∧ t ≤ σ ∧ t} ∈ Ft ,
since the first two sets, by the definition of stopping times, are in Ft and the two random variables in the third set are Ft -measurable. Hence {τ ≤ σ} ∈ Fσ .

Proposition 1.35 If X is progressively measurable and τ is an arbitrary stopping time then the stopped variable Xτ is Fτ -measurable, and the truncated process X τ is progressively measurable.

Proof. The first part of the proposition is an easy consequence of the second as, if B is a Borel measurable set and X τ is adapted, then for all s

{Xτ ∈ B} ∩ {τ ≤ s} = {X (τ ∧ s) ∈ B} ∩ {τ ≤ s} = {X τ (s) ∈ B} ∩ {τ ≤ s} ∈ Fs ,

that is, in this case the stopped variable Xτ is Fτ -measurable. Therefore it is sufficient to prove that if X is progressively measurable then X τ is also progressively measurable. Let

Y (t, ω) := 1 if t < τ (ω), and 0 if t ≥ τ (ω).

Y is right-regular. τ is a stopping time so {Y (t) = 0} = {τ ≤ t} ∈ Ft . Hence Y is adapted, therefore it is progressively measurable²⁰. Obviously if τ (ω) > 0 then²¹

Z (t, ω) := ∫_(0,t] X (s, ω) Y (ds, ω) = 0 if t < τ (ω), and −X (τ (ω) , ω) if t ≥ τ (ω).

As X is progressively measurable Z is adapted²² and also right-regular so it is again progressively measurable. As

X τ = XY − Z + X (0) χ (τ = 0) ,

X τ is obviously progressively measurable.

Corollary 1.36 If G := σ(X(τ ) : X is right-regular and adapted) then G = Fτ .

Proof. As every right-regular and adapted process is progressively measurable, G ⊆ Fτ . If A ∈ Fτ then the process X(t) := χA χ (τ ≤ t) is right-regular and by the definition of Fτ

{X(t) = 1} = A ∩ {τ ≤ t} ∈ Ft .

Hence X is adapted. Obviously X (τ ) = χA . Therefore Fτ ⊆ G.

²⁰ See: Example 1.18, page 11.
²¹ If τ (ω) = 0 then Z (ω) = 0.
²² See: Proposition 1.20, page 11.

1.2.4
Predictable processes
The class of progressively measurable processes is too large. As we have already remarked, the interesting stochastic processes have regular trajectories. There are two types of regular processes: some of them have left- and some of them have right-continuous trajectories. It is a bit surprising that there is a huge difference between these two classes. But one should recall that the trajectories are not just functions: the time parameter has an obvious orientation: the time line is not symmetric, the time flows from left to right. Definition 1.37 Let (Ω, A, P, F) be a stochastic base, and let us denote by P the σ-algebra of the subsets of Θ × Ω generated by the adapted, continuous processes. The sets in the σ-algebra P are called predictable. A process X is predictable if it is measurable with respect to P. Example 1.38 A deterministic process is predictable if and only if its single trajectory is a Borel-measurable function.
Obviously we call a process X deterministic if it does not depend on the random parameter ω; more exactly, a process X is called deterministic if it is a stochastic process on (Ω, {Ω, ∅}). If A := {Ω, ∅} then the set of continuous stochastic processes is equivalent to the set of continuous functions, and the σ-algebra generated by the continuous functions is equivalent to the σ-algebra of the Borel measurable sets on Θ. The set of predictable processes is closed under the usual operations of analysis²³. The most important and specific operation related to stochastic processes is the truncation:

Proposition 1.39 If τ is an arbitrary stopping time and X is a predictable stochastic process then the truncated process X τ is also predictable.

Proof. Let L be the set of bounded stochastic processes X for which X τ is predictable. It is obvious that L is a λ-system. If X is continuous then X τ is also continuous, hence the π-system of the bounded continuous processes is in L. From the Monotone Class Theorem it is obvious that L contains the set of bounded predictable processes. If X is an arbitrary predictable process then Xn := Xχ (|X| ≤ n) is a predictable bounded process and therefore Xnτ is also predictable. Xnτ → X τ , therefore X τ is obviously predictable.

To discuss the structure of the predictable processes let us introduce some notation:

Definition 1.40 If τ and σ are stopping times then one can define the random intervals

{(t, ω) ∈ [0, ∞) × Ω : τ (ω) R1 t R2 σ (ω)}

where R1 and R2 are one of the relations < or ≤. One can define four random intervals [σ, τ ], [σ, τ ), (σ, τ ] and (σ, τ ), where the meaning of these notations is obvious. One should emphasize that, in the definition of the stochastic intervals, the value of the time parameter t is always finite. Therefore if τ (ω) = ∞ for some ω then (∞, ω) ∉ [τ , τ ]. In measure theory we are used to the fact that the σ-algebras generated by the different types of intervals are the same. In R or in Rn one can construct every type of interval from any other type of interval with a countable number of set operations. For random intervals this is not true! For example, if we want to construct the semi-closed random interval [0, τ ) with random closed segments [0, σ] then we need a sequence of stopping times (σ n ) for which σ n ↗ τ and σ n < τ . If there is such a sequence²⁴ then of course [0, σ n ] ↗ [0, τ ), that is, in this case [0, τ ) is in the σ-algebra generated by the closed random segments. But for an arbitrary stopping time τ such a sequence does not exist. If τ is a stopping time and c > 0 is a constant, then τ − c is generally not a stopping time! On the other hand if c > 0 then τ + c is always a stopping time, hence as [0, τ ] = ∩n [0, τ + 1/n) the closed random intervals [0, τ ] are in the σ-algebra generated by the intervals [0, σ). This shows again that in the theory of the stochastic processes the time line is not symmetric!

Definition 1.41 Y is a predictable simple process if there is a sequence of stopping times 0 = τ 0 < τ 1 < . . . < τ n < . . . such that

Y = η 0 χ ({0}) + Σi η i χ ((τ i , τ i+1 ])   (1.8)

²³ Algebraic and lattice type operations, usual limits etc.
²⁴ If for τ there is a sequence of stopping times σ n ↗ τ , σ n ≤ τ and σ n < τ on the set {τ > 0} then we shall say that τ is a predictable stopping time. Of course the main problem is that not every stopping time is predictable. See: Definition 3.5, page 182. The simplest examples are the jumps of the Poisson processes. See: Example 3.7, page 183.
where η 0 is F0 -measurable and η i are Fτ i -measurable random variables. If the stopping times (τ i ) are constant then we say that Y is a predictable step process. Now we are ready to discuss the structure of predictable processes.

Proposition 1.42 Let X be a stochastic process on Θ := [0, ∞). The following statements are equivalent²⁵:
1. X is predictable.
2. X is measurable with respect to the σ-algebra generated by the adapted left-regular processes.
3. X is measurable with respect to the σ-algebra generated by the adapted left-continuous processes.
4. X is measurable with respect to the σ-algebra generated by the predictable step processes.
5. X is measurable with respect to the σ-algebra generated by the predictable simple processes.

Proof. Let P1 , P2 , P3 , P4 and P5 denote the σ-algebras in the proposition. Obviously it is sufficient to prove that these five σ-algebras are equal.
1. Obviously P1 ⊆ P2 ⊆ P3 .
2. Let X be one of the processes generating P3 , that is let X be a left-continuous, adapted process. As X is adapted

Xn (t) := X (0) χ({0}) + Σk X (k/2^n) χ ((k/2^n, (k + 1)/2^n])

is a predictable step process. As X is left-continuous obviously Xn → X, so X is P4 -measurable, hence P3 ⊆ P4 .
3. Obviously P4 ⊆ P5 .
4. Let F ∈ F0 and let fn be continuous functions such that fn (0) = 1 and fn is zero on the interval [1/n, ∞). If Xn := fn χF then Xn is obviously P1 -measurable, therefore the process

χF χ({0}) = lim_{n→∞} Xn
²⁵ Let us recall that by definition X (0−) := X (0). Therefore if ξ is an arbitrary F0 -measurable random variable then the process X := ξχ ({0}) is adapted and left-regular, so if Z is predictable then Z + X is also predictable. Hence we cannot generate P without the measurable rectangles {0} × F, F ∈ F0 . If one wants to avoid these sets then one should define the predictable processes on the open half line (0, ∞). This is not necessarily a bad idea as the predictable processes are the integrands of stochastic integrals, and we shall always integrate only on the intervals (0, t], so in the applications of the predictable processes the value of these processes is entirely irrelevant at t = 0.
is also P1 -measurable. If η 0 is an F0 -measurable random variable then η 0 is a limit of F0 -measurable step functions, therefore the process η 0 χ ({0}) is P1 -measurable. This means that the first term in (1.8) is P1 -measurable. Let us now discuss the second kind of term in (1.8). Let τ be an arbitrary stopping time. If

Xn (t, ω) := 1 if t ≤ τ (ω); 1 − n (t − τ (ω)) if τ (ω) < t < τ (ω) + 1/n; 0 if t ≥ τ (ω) + 1/n,

then Xn has continuous trajectories, and it is easy to see that Xn is adapted. Therefore

χ ([0, τ ]) = lim_{n→∞} Xn ∈ P1 .
If σ ≤ τ is another stopping time then

χ ((σ, τ ]) = χ ([0, τ ] \ [0, σ]) = χ ([0, τ ]) − χ ([0, σ]) ∈ P1 .

If F ∈ Fσ then

σ F (ω) := σ (ω) if ω ∈ F, and ∞ if ω ∉ F

is also a stopping time as {σ F ≤ t} = {σ ≤ t} ∩ F ∈ Ft . If σ ≤ τ then Fσ ⊆ Fτ , therefore not only σ F but τ F is also a stopping time.

χF χ ((σ, τ ]) = χ ((σ F , τ F ]) ∈ P1 .

If η is Fσ -measurable, then η is a limit of step functions, hence if η is Fσ -measurable and σ ≤ τ then the process ηχ ((σ, τ ]) is P1 -measurable. By the definition of the predictable simple processes every predictable simple process is P1 -measurable. Hence P5 ⊆ P1 .

Corollary 1.43 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and (σ, τ ] generate the σ-algebra of the predictable sets.

Corollary 1.44 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and [0, τ ] generate the σ-algebra of the predictable sets.

Definition 1.45 Let T denote the set of measurable rectangles

{0} × F, F ∈ F0
and

{(s, t] × F, F ∈ Fs } .

The sets in T are called predictable rectangles.

Corollary 1.46 If Θ = [0, ∞) then the predictable rectangles generate the σ-algebra of predictable sets.

It is quite natural to ask what the difference is between the σ-algebras generated by the right-regular and by the left-regular processes.

Definition 1.47 The σ-algebra generated by the adapted, right-regular processes is called the σ-algebra of the optional sets. A process is called optional if it is measurable with respect to the σ-algebra of the optional sets.

As every continuous process is right-regular, the σ-algebra of the optional sets is never smaller than the σ-algebra of the predictable sets P.

Example 1.48 Adapted, right-regular process which is not predictable.
The simplest example of a right-regular process which is not predictable is the Poisson process. Unfortunately, at the present moment it is a bit difficult to prove²⁶. The next example is ‘elementary’. Let Ω := [0, 1] and for all t let

Ft := σ (B ([0, t]) ∪ (t, 1]) if t < 1, and B ([0, 1]) if t ≥ 1.

If s ≤ t then Fs ⊆ Ft , and hence F is a filtration. It is easy to see that the random function τ (ω) := ω is a stopping time. Let A := [τ ] := [τ , τ ] be the graph of τ , which is the diagonal of the closed rectangle [0, 1]².

1. Let us show that A is optional. It is easy to see that the process Xn := χ ([τ , τ + 1/n)) is right-continuous. Xn is adapted as

{Xn (t) = 1} = {τ ≤ t < τ + 1/n} ∈ Ft .

As Xn → χA , A is optional.

²⁶ See: Example 3.36, page 200, Example 3.56, page 219.
2. Now we show that [τ , τ ] is not predictable. We show that if D ⊆ R+ × Ω and D = P ∩ [0, τ ] for some P ∈ P then there is a B ∈ B ([0, 1]) such that

D = (B × Ω) ∩ [0, τ ] .   (1.9)

Obviously [τ , τ ] = [τ , τ ] ∩ [0, τ ], therefore if P := [τ , τ ] ∈ P then for some B ∈ B ([0, 1])

[τ , τ ] = (B × Ω) ∩ [0, τ ] ,

which is impossible and therefore [τ , τ ] cannot be predictable.

3. It remains to show the validity of the decomposition (1.9). Let F ∈ Fs and R := (s, t] × F with s < t. There are two possibilities²⁷. If F ∩ (s, 1] = ∅ then F ⊆ [0, s], and hence

R := (s, t] × F ⊆ (s, t] × [0, s] ⊆ {(x, y) : x > y} ,

so, as

[0, τ ] = {(t, ω) : t ≤ τ (ω)} = {(t, ω) : t ≤ ω} = {(x, y) : x ≤ y} ,

obviously

R ∩ [0, τ ] = ∅ = (∅ × Ω) ∩ [0, τ ] .

By the structure of Fs the interval (s, 1] is an atom of Fs . Hence if F ∩ (s, 1] ≠ ∅, then (s, 1] ⊆ F , hence for some B ∈ B ([0, s])

R := (s, t] × F = (s, t] × (B ∪ (s, 1]) .

So

R ∩ [0, τ ] = ((s, t] × (B ∪ (s, 1])) ∩ {(x, y) : x ≤ y} = ((s, t] × (s, 1]) ∩ {(x, y) : x ≤ y} = ((s, t] × Ω) ∩ [0, τ ]

and therefore in both cases the intersection has a representation of the type (B × Ω) ∩ [0, τ ]. This remains true if we take the rectangles of type {0} × F, F ∈ F0 . As

²⁷ If we draw Ω on the y-axis and the time line on the x-axis then [τ , τ ] is the line y = x and [0, τ ] is the upper triangle. In the following argument F is under the diagonal hence the whole rectangle R is under the diagonal.
MARTINGALES
29
the generation and the restriction of the σ-algebras are interchangeable operations,

P ∩ [0, τ ] = σ (T ) ∩ [0, τ ] = σ (T ∩ [0, τ ]) = σ ((B × Ω) ∩ [0, τ ]) = (σ (B × Ω)) ∩ [0, τ ] = (B ([0, 1]) × Ω) ∩ [0, τ ] ,

which is exactly (1.9).

4. As the left-regular χ ([0, τ ]) is adapted and χ ([τ , τ ]) is not predictable, the right-regular, adapted process

χ ([0, τ )) = χ ([0, τ ]) − χ ([τ , τ ])

is also not predictable.
1.3
Martingales
In this section we introduce and discuss some important properties of continuous-time martingales. As martingales are stochastic processes one should fix the properties of their trajectories. We shall assume that the trajectories of the martingales are right-regular. The right-continuity of martingales is essential in the proof of the Optional Sampling Theorem, which describes one of the most important properties of martingales. There are a lot of good books on martingales, so we will not try to prove the theorems in their most general form. We shall present only those results from martingale theory which we shall use in the following. The presentation below is a bit redundant. We could have first proved the Downcrossing Inequality and from it we could have directly proved the Martingale Convergence Theorem. But I don’t think that it is a waste of time and paper to show these theorems from different angles.

Definition 1.49 Let us fix a filtration F. The adapted process X is a submartingale if
1. the trajectories of X are right-regular,
2. for any time t the expected value of X + (t) is finite²⁸,
3. if s < t, then E(X (t) | Fs ) ≥ X(s) a.s.

²⁸ Some authors, see: [53], assume that if X is a submartingale then X (t) is integrable for all t. If we need this condition then we shall say that X is an integrable submartingale. The same remark holds for supermartingales as well. Of course martingales are always integrable.
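The defining inequality can be checked numerically in the simplest discrete model. The sketch below is a Monte Carlo illustration of ours (not part of the text): |S| for a symmetric random walk S is a convex image of a martingale, hence a submartingale, and the simulated conditional mean of |S_t| given a realized past indeed dominates |S_s|.

```python
import random

random.seed(0)

# Estimate E(|S_t| | F_s) >= |S_s| for a symmetric random walk S:
# fix a realized past S_s = s_val, then average |S_t| over continuations.
s, t = 5, 20
s_val = 3                     # a fixed (realized) value of S_s
n_paths = 20000
total = 0.0
for _ in range(n_paths):
    x = s_val
    for _ in range(t - s):    # continue the walk from time s to time t
        x += random.choice((-1, 1))
    total += abs(x)
cond_mean = total / n_paths   # Monte Carlo estimate of E(|S_t| | S_s = 3)
```

Up to Monte Carlo error, cond_mean exceeds |s_val| = 3, in line with the submartingale inequality.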
We say that X is a supermartingale if −X is a submartingale. X is a martingale if X is a supermartingale and a submartingale at the same time. This means that
1. the trajectories of X are right-regular,
2. for any time t the expected value of X (t) is finite,
3. if s < t, then E (X (t) | Fs ) = X (s) a.s.

The conditional expectation is always a random variable—that is, the conditional expectation E(X(t) | Fs ) is always an equivalence class. As X is a stochastic process, X(s) is a function and not an equivalence class. Hence the two sides in the definition can be equal only in an almost sure sense. Generally we shall not emphasize this, and we shall use the simpler =, ≥ and ≤ relations. If X is a martingale, and g is a convex function²⁹ on R and E (g(X(t))⁺) < ∞ for all t, then the process Y (t) := g (X(t)) is a submartingale as by Jensen’s inequality

g (X (s)) = g (E (X (t) | Fs )) ≤ E (g (X (t)) | Fs ) .

In particular, if X is a martingale, p ≥ 1, and |X (t)|^p is integrable for all t, then the process |X|^p is a submartingale. If X is a submartingale, g is convex and increasing, and Y (t) := g(X(t)) is integrable, then Y is a submartingale, as in this case

E (g (X (t)) | Fs ) ≥ g (E (X (t) | Fs )) ≥ g (X (s)) .

In particular, if X is a submartingale, then X⁺ is also a submartingale.

1.3.1
Doob’s inequalities
The most well-known inequalities of the theory of martingales are Doob’s inequalities. First we prove the discrete-time versions, and then we discuss the continuous-time cases.

Proposition 1.50 (Doob’s inequalities, discrete-time) Let X := (Xk , Fk )_{k=1}^n be a non-negative submartingale.
1. If λ ≥ 0, then

λP ( max_{1≤k≤n} Xk ≥ λ ) ≤ E (Xn ) .   (1.10)
2. If p > 1, then³⁰

‖Xk‖_p ≤ ‖ max_{1≤k≤n} Xk ‖_p ≤ (p/(p − 1)) ‖Xn‖_p = q ‖Xn‖_p .   (1.11)

²⁹ Convex functions are continuous so g(X) is adapted.
³⁰ Of course, as usual, 1/p + 1/q = 1.
31
Proof. Let us remark that both inequalities estimate the size of the maximum of the non-negative submartingales. 1. Let λ > 0. A1 {X1 ≥ λ} ,
Ak
max Xi < λ ≤ Xk ,
A
1≤i
max Xk ≥ λ .
1≤k≤n
We show that λP (A) λP
max Xk ≥ λ
≤
1≤k≤n
Xn dP.
(1.12)
A
As Xn ≥ 0
Xn dP ≤
Xn dP = E (Xn ) , Ω
A
therefore from (1.12) inequality (1.10) is obvious. The sets Ak are disjoint and A = ∪k Ak . Observe that Ak ∈ Fk . Hence by the submartingale property, by the definition of the sets Ak and by the definition of the conditional expectation
λP (A) = λ
n
P (Ak ) ≤
k=1
=
n k=1
Ak
n k=1
Xk dP ≤
Ak
n k=1
E (Xn | Fk ) dP =
Ak
Xn dP =
Xn dP. A
2. Let us prove inequality (1.11). Let us introduce the notation Xn∗ max1≤k≤n Xk . As 0 ≤ Xk ≤ Xn∗ , the first inequality is trivial. If Xn∗ p ≥ Xn p = ∞, then the second inequality is also obvious. Let us assume that Xn p < ∞. One cannot exclude that Xn∗ p = ∞, hence we should truncate the variable Xn∗ . Fix a number N, and let η N ∧ Xn∗ . If A (x) {Xn∗ ≥ x} then by (1.12) xP (A (x)) ≤
Xn dP. A(x)
32
STOCHASTIC PROCESSES
As Xn ≥ 0 by Fubini’s theorem and by the Fundamental Theorem of the Calculus ∗ Xn ∧N
η
E (η p ) = E
E
pxp−1 dx 0
pxp−1 dx
N p−1
=E
px
χ (Xn∗
≥ x) dx
E
px
xp−1
xp−1 P (A (x)) dx ≤ 0
Ω
N
xp−2 0
η
0
Ω
Xn xp−2 χA(x) dPdx =
0
xp−2 dxdP =
Xn
N
Xn dPdx = p A(x)
=p
=
N
χA(x) dPdx = p
0
χA(x) dx
0
N
≤p
N p−1
0
=p
=
0
p p−1
Ω
Xn η p−1 dP. Ω
By H¨older’s inequality 1/q E (η p ) ≤ q Xn p η p−1 q = q Xn p E (η p ) . 1/q
Dividing31 both sides by E (η p ) Convergence Theorem
we get ηp ≤ q Xn p . By the Monotone
lim N ∧ Xn∗ p = Xn∗ p ,
N →∞
from which the inequality (1.11) follows. n
Corollary 1.51 If X (Xk , Fk )k=1 is a submartingale, then for arbitrary λ
λP
max Xk ≥ λ
1≤k≤n
≤ max E Xk+ = 1≤k≤n
(1.13)
= E Xn+ ≤ E (|Xn |) .
Proof.
We can assume that λ ≥ 0. As we remarked, if (Xk ) is a submartingale, then Xk+ is a non-negative submartingale. Hence the sequence of expected
values of Xk+ is not decreasing, so
max E Xk+ = E Xn+ .
1≤k≤n 31 If
E (η p ) = 0 then the inequality holds.
MARTINGALES
33
As λ ≥ 0 obviously
P
max Xk ≥ λ
=P
1≤k≤n
max Xk+ ≥ λ ,
1≤k≤n
so (1.13) follows from (1.10). n
Corollary 1.52 If X (Xk , Fk )k=1 is a martingale or a non-negative submartingale, then for any λ ≥ 0 and exponent p ≥ 1
λp P
max |Xk | ≥ λ
1≤k≤n
p
p
≤ max E (|Xk | ) ≤ E (|Xn | ) . 1≤k≤n
(1.14) n
Proof. The function |x|^p is convex, hence if X is a martingale then (|Xk | , Fk )_{k=1}^n is a non-negative submartingale, so we can assume that X is a non-negative submartingale. By Jensen’s inequality and by the definition of submartingales

E (Xn^p ) = E (E (Xn^p | Fk )) ≥ E ((E (Xn | Fk ))^p ) ≥ E (Xk^p ) ,

so the first inequality holds. If E (Xk^p ) = ∞ for some k, then (1.14) trivially holds. On the half-line R+ the function x^p is convex and increasing. If E(Xk^p ) < ∞ for all k, then (Xk^p ) is a submartingale, therefore we can apply (1.13). As

{ max_{1≤k≤n} |Xk | ≥ λ } = { max_{1≤k≤n} |Xk |^p ≥ λ^p } ,
from (1.13) the inequality (1.14) is obvious.

If Θ is an interval and the process X has right-regular trajectories, then to calculate the supremum of the trajectories it is sufficient to calculate the supremum over the rational numbers in Θ. Hence, if X has right-regular trajectories, then X∗ := sup_t X (t) is measurable. The submartingales by definition have right-regular trajectories, and, therefore, applying the just proved Doob inequalities to a finite number of rational numbers and using the Monotone Convergence Theorem, one can easily prove the next continuous-time inequalities:

Corollary 1.53 (Doob’s inequalities, continuous-time) Let Θ be an interval.
1. If p ≥ 1 and X is a martingale, or non-negative submartingale, then

λ^p P ( sup_{t∈Θ} |X (t)| ≥ λ ) ≤ sup_{t∈Θ} ‖X (t)‖_p^p .   (1.15)
2. If p > 1, then

‖ sup_{t∈Θ} |X (t)| ‖_p ≤ (p/(p − 1)) sup_{t∈Θ} ‖X (t)‖_p .   (1.16)
3. If Θ is closed and b is the finite or infinite right endpoint of Θ, then under the conditions above

λP ( sup_{t∈Θ} X (t) ≥ λ ) ≤ ‖X⁺ (b)‖_1 ,   (1.17)

λ^p P ( sup_{t∈Θ} |X (t)| ≥ λ ) ≤ ‖X (b)‖_p^p ,   ‖ sup_{t∈Θ} |X (t)| ‖_p ≤ (p/(p − 1)) ‖X (b)‖_p .   (1.18)
We shall very often use the following corollary of (1.16):

Corollary 1.54 If X is a martingale and p > 1, then

X∗ := sup_{t∈Θ} |X (t)| ∈ Lp (Ω),  or equivalently  (X∗)^p = sup_{t∈Θ} |X (t)|^p ∈ L1 (Ω),

if and only if X is bounded in Lp (Ω).

Definition 1.55 If p ≥ 1, then Hp will denote the space of martingales X for which

‖ sup_t |X(t)| ‖_p < ∞ .
Hp also denotes the equivalence classes of these martingales, where two martingales are equivalent whenever they are indistinguishable.

Definition 1.56 If X ∈ H2 , then we shall say that X is a square-integrable martingale.

If ‖ sup_t |Xn (t) − X(t)| ‖_p → 0 then for a subsequence

sup_t |Xn_k (t) − X(t)| → 0  a.s.,
35
hence if Xn is right-regular for every n, then X is almost surely right-regular. From the definition of the Hp spaces it is trivial that for all p ≥ 1 the Hp martingales are uniformly integrable. From these the next observation is obvious: Proposition 1.57 Hp as a set of equivalence classes with the norm XHp
sup |X (t)| t
(1.19)
p
is a Banach space. If p > 1 then by Corollary 1.54 X ∈ Hp if and only if X is bounded in Lp (Ω). 1.3.2
The energy equality
An important elementary property of martingales is the following: Proposition 1.58 (Energy equality) Let X be a martingale and assume that X (t) is square integrable for all t. If s < t then
2 E (X (t) − X (s)) = E X 2 (t) − E X 2 (s) . Proof. The difference of the two sides is d 2 · E (X (s) · (X (s) − X (t))) . As s < t, by the martingale property dn 2 · E (X (s) χ (|X (s)| ≤ n) · (X (s) − X (t))) = = 2 · E (E (X (s) χ (|X (s)| ≤ n) · (X (s) − X (t)) | Fs )) = = 2 · E (X (s) χ (|X (s)| ≤ n) · E (X (s) − X (t) | Fs )) = = 2 · E (X (s) χ (|X (s)| ≤ n) · 0) = 0. As X (s) , X (t) ∈ L2 (Ω) obviously |X (s) · (X (s) − X (t))| is integrable. Hence one can use the Dominated Convergence Theorem on both sides so d = lim dn = 0. n→∞
Corollary 1.59 If X ∈ H then there is a random variable, denoted by X(∞), such that X(∞) ∈ L2 (Ω, F∞ , P) and 2
a.s.
X(t) = E(X(∞) | Ft )
(1.20)
for every t. In L2 (Ω)-convergence

lim_{t→∞} X(t) = X(∞).
Proof. Let tn ↗ ∞ be arbitrary. By the energy equality the sequence ‖X(tn )‖₂² is increasing, and by the definition of H2 it is bounded from above. Also by the energy equality, if n > m then

‖X(tn ) − X(tm )‖₂² = ‖X(tn )‖₂² − ‖X(tm )‖₂² ,

hence (X(tn )) is a Cauchy sequence in L2 (Ω). As L2 (Ω) is complete, the sequence (X(tn )) is convergent in L2 (Ω). It is obvious from the construction that the limit X (∞) as an object in L2 (Ω) is unique, that is, X (∞) ∈ L2 (Ω) is independent of the sequence (tn ). X is a martingale, so if s ≥ 0 then

X (t) = E (X (t + s) | Ft )  a.s.

In probability spaces L1 -convergence follows from L2 -convergence and, as the conditional expectation is continuous in L1 (Ω), if s → ∞ then

X (t) = E ( lim_{s→∞} X (t + s) | Ft ) = E (X (∞) | Ft )  a.s.
Let u < ∞ and let w be a Wiener process on the interval Θ := [0, u]. As w has independent increments, for every t ≤ u

E(w(u) | F_t) = E(w(u) − w(t) | F_t) + E(w(t) | F_t) = w(t).

On the half-line R+, w is not bounded in L^2(Ω); that is, if Θ = R+ then w ∉ H^2, and of course the representation (1.20) does not hold.

Proposition 1.61 Let X be a martingale and let p ≥ 1. If for some random variable X(∞)

X(t) → X(∞)  in L^p(Ω),

then

X(t) → X(∞)  almost surely

and

X(t) = E(X(∞) | F_t)  a.s.,   t ≥ 0.   (1.21)
Proof. As the conditional expectation is continuous in L^1(Ω), letting s → ∞ in the relation

X(t) = E(X(t + s) | F_t)  a.s.,   t ≥ 0,

(1.21) follows. For an arbitrary s the increment N(u) := X(u + s) − X(s) is a martingale with respect to the filtration G_u := F_{s+u}. Let

β(s) := sup_{u≥0} |X(u + s) − X(∞)| ≤ sup_u |N(u)| + |X(s) − X(∞)|.

X is right-regular, so it is sufficient to take the supremum over the rational numbers, so β(s) is measurable. Also

sup_u ‖N(u)‖_p ≤ ‖X(s) − X(∞)‖_p + sup_u ‖X(u + s) − X(∞)‖_p.

Let ε > 0 be arbitrary. As X(s) → X(∞) in L^p, if s is large enough then the right-hand side is less than ε. By Doob's and by Markov's inequalities

P(β(s) > 2δ) ≤ P(|X(s) − X(∞)| > δ) + P(sup_u |N(u)| > δ) ≤ ‖X(s) − X(∞)‖_p^p / δ^p + ε^p / δ^p.

Therefore if s → ∞ then β(s) → 0 in probability. Every stochastically convergent sequence has an almost surely convergent subsequence. By the definition of β(s), if β(s_k) → 0 a.s. then X(t) → X(∞) a.s.

Corollary 1.62 If X ∈ H^2 then there is a random variable X(∞) ∈ L^2(Ω) such that X(t) → X(∞), where the convergence holds in L^2(Ω) and almost surely.

1.3.3 The quadratic variation of discrete-time martingales
Our goal is to extend the result just proved to the spaces H^p, p ≥ 1. The main tool of stochastic analysis is the so-called quadratic variation. Let us first investigate the quadratic variation of discrete-time martingales.

Proposition 1.63 (Austin) Let Z denote the set of integers and let X = (X_n, F_n)_{n∈Z} be a martingale over Z, that is, let us assume that Θ = Z. If X is bounded in L^1(Ω) then the 'quadratic variation' of X is almost surely finite:

∑_{n=−∞}^{∞} (X_{n+1} − X_n)^2 < ∞  a.s.   (1.22)

Proof. As X is bounded in L^1(Ω) there is a k < ∞ such that ‖X_n‖_1 ≤ k for all n ∈ Z. Let X* := sup_n |X_n|. |X| is a non-negative submartingale, so by Doob's inequality

P(X* ≥ p) ≤ k/p,

therefore X* is almost surely finite. Fix a number p and define the continuously differentiable, convex function

f(t) := t^2 if |t| ≤ p,   f(t) := 2p|t| − p^2 if |t| > p.

As f is convex, the expression

g(s_1, s_2) := f(s_2) − f(s_1) − (s_2 − s_1) f'(s_1)

is non-negative. If |s_1|, |s_2| ≤ p then

g(s_1, s_2) = s_2^2 − s_1^2 − (s_2 − s_1) · 2s_1 = (s_2 − s_1)^2.

By the definition of f, obviously f(t) ≤ 2p|t|. Therefore

E(f(X_n)) ≤ 2p E(|X_n|) ≤ 2pk.   (1.23)
By the elementary properties of the conditional expectation

E((X_{n+1} − X_n) f'(X_n)) = E(E((X_{n+1} − X_n) f'(X_n) | F_n)) = E(f'(X_n) E(X_{n+1} − X_n | F_n)) = 0

for all n ∈ Z. From this and from (1.23), using the definition of g, for all n

2pk ≥ E(f(X_n)) ≥ E(f(X_n) − f(X_{−n})) = ∑_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i))
    = ∑_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i) − (X_{i+1} − X_i) f'(X_i)) = ∑_{i=−n}^{n−1} E(g(X_i, X_{i+1})).

By the Monotone Convergence Theorem

E(∑_{n∈Z} (X_{n+1} − X_n)^2 χ(X* ≤ p)) = E(∑_{n∈Z} g(X_n, X_{n+1}) χ(X* ≤ p))
    ≤ E(∑_{n∈Z} g(X_n, X_{n+1})) = ∑_{n∈Z} E(g(X_n, X_{n+1})) ≤ 2pk.

As X* is almost surely finite, ∑_{n∈Z} (X_{n+1} − X_n)^2 is almost surely convergent.
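The almost sure finiteness of the quadratic variation is easy to observe numerically. The sketch below (an illustration of ours, not from the text) runs a symmetric ±1 walk started at 5 and frozen when it first hits 0; the frozen walk is a non-negative, hence L^1(Ω)-bounded, martingale, and on every trajectory the quadratic variation simply counts the steps taken before absorption.

```python
import random

random.seed(7)

def stopped_walk(start=5, horizon=10_000):
    # Symmetric +/-1 walk started at `start`, frozen at 0.
    # Frozen at 0 it is a non-negative martingale, hence bounded in L1.
    path = [start]
    for _ in range(horizon):
        x = path[-1]
        path.append(x if x == 0 else x + random.choice((-1, 1)))
    return path

path = stopped_walk()
qv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))

# Every increment before absorption is +/-1 and every increment after it is 0,
# so the quadratic variation equals the number of pre-absorption steps.
moving_steps = sum(1 for a in path[:-1] if a != 0)
print(qv, moving_steps)
```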
Corollary 1.64 Let X := (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L^1(Ω) and (τ_n) is an increasing sequence of stopping times, then almost surely

∑_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞.   (1.24)

Proof. For every m let us introduce the bounded stopping times τ_n^m := τ_n ∧ m. By the discrete-time version of the Optional Sampling Theorem^32

X^m := (X(τ_n^m), F_{τ_n^m})_n

is a martingale, and therefore, from the proof of the previous proposition,

2pk ≥ E(∑_{n=1}^{∞} (X(τ_{n+1}^m) − X(τ_n^m))^2 χ(X* ≤ p)).

If m → ∞ then by Fatou's lemma

E(∑_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 χ(X* ≤ p)) ≤ 2pk,

from which (1.24) is obvious.

32 See: Lemma 1.83, page 49.
Corollary 1.65 Let X := (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L^1(Ω) then there is a variable X_∞ such that |X_∞| < ∞ and

lim_{n→∞} X_n = X_∞  a.s.

Proof. Assume that for some ε > 0, on a set A of positive measure,

lim sup_{p,q→∞} |X_p − X_q| ≥ 2ε.   (1.25)

Let τ_0 := 1, and let

τ_{n+1} := inf{m ≥ τ_n : |X_m − X_{τ_n}| ≥ ε}.

Obviously τ_n is a stopping time for all n and the sequence (τ_n) is increasing. On the set A, |X(τ_{n+1}) − X(τ_n)| ≥ ε for every n. By (1.24), almost surely ∑_{n=0}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞, which is impossible.

Corollary 1.66 If X = (X_n, F_n) is a non-negative martingale then there exists a finite, non-negative variable X_∞ such that X_∞ ∈ L^1(Ω) and almost surely X_n → X_∞.

Proof. X is non-negative and the expected value of X_n is the same for all n, hence X is obviously bounded in L^1(Ω). So the almost sure limit X_n → X_∞ exists. By Fatou's lemma

X(0) = E(X_n | F_0) = lim_{n→∞} E(X_n | F_0) ≥ E(lim inf_{n→∞} X_n | F_0) = E(X_∞ | F_0) ≥ 0,

and therefore X_∞ ∈ L^1(Ω).

Corollary 1.67 Assume that Θ = R+. If X is a uniformly integrable martingale then there is a variable X(∞) ∈ L^1(Ω) such that X(t) → X(∞), where the convergence holds in L^1(Ω) and almost surely. For all t

X(t) = E(X(∞) | F_t)  a.s.   (1.26)

Proof. Every uniformly integrable set is bounded in L^1, so if t_n ↗ ∞ then there is an X(∞) such that X(t_n) → X(∞) a.s. By the uniform integrability the convergence holds in L^1(Ω) as well. Obviously X(∞) as an equivalence class is independent of (t_n). The relation (1.26) is an easy consequence of the L^1(Ω)-continuity of the conditional expectation.
Corollary 1.68 Assume that p ≥ 1 and Θ = R+. If X ∈ H^p then there is a variable X(∞) ∈ L^p(Ω) such that X(t) → X(∞), where the convergence holds in L^p(Ω) and almost surely. For all t

X(t) = E(X(∞) | F_t)  a.s.   (1.27)

Proof. If the measure is finite and p ≤ q then L^q ⊆ L^p. Hence if p ≥ 1 and X ∈ H^p then X ∈ H^1, so if t_n ↗ ∞ then there is a variable X(∞) such that X(t_n) → X(∞) a.s. As, by the definition of the H^p spaces, |X(t)|^p ≤ sup_s |X(s)|^p ∈ L^1(Ω), X(∞) ∈ L^p(Ω), and by the Dominated Convergence Theorem the convergence holds in L^p(Ω) as well. Obviously X(∞), as an equivalence class, is independent of (t_n). The relation (1.27) is an easy consequence of the L^1(Ω)-continuity of the conditional expectation.

Theorem 1.69 (Lévy's convergence theorem) If (F_n) is an increasing sequence of σ-algebras, ξ ∈ L^1(Ω) and F_∞ := σ(∪_n F_n), then

X_n := E(ξ | F_n) → E(ξ | F_∞),

where the convergence holds in L^1(Ω) and almost surely.

Proof. Let X_n := E(ξ | F_n). As

E(|X_n|) = E(|E(ξ | F_n)|) ≤ E(E(|ξ| | F_n)) = E(|ξ|) < ∞,

X = (X_n, F_n) is an L^1(Ω)-bounded martingale. Therefore X_n → X_∞ a.s. After the proof we shall prove as a separate lemma that the sequence (X_n) is uniformly integrable, hence X_n → X_∞ in L^1. If A ∈ F_n and m ≥ n, then

∫_A X_m dP = ∫_A ξ dP,

hence, as X_m → X_∞ in L^1,

∫_A X_∞ dP = ∫_A ξ dP,   A ∈ ∪_n F_n.   (1.28)

As X_∞ and ξ are integrable, it is easy to see that the collection of sets A for which (1.28) is true is a λ-system. As (F_n) is increasing, ∪_n F_n is obviously a π-system. Therefore, by the Monotone Class Theorem, (1.28) is true if A ∈ F_∞ := σ(∪_n F_n). X_∞ is obviously F_∞-measurable, hence X_∞ = E(ξ | F_∞) a.s.
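Lévy's convergence theorem has a very concrete special case: on ([0, 1), B, λ) take ξ(ω) = ω and let F_n be generated by the dyadic intervals of rank n. Then X_n = E(ξ | F_n) is the midpoint of the dyadic interval containing ω, and X_n(ω) → ω. A small Python sketch (the function name is ours):

```python
def dyadic_conditional_expectation(omega, n):
    # E(xi | F_n)(omega) for xi(omega) = omega on ([0, 1), Borel, Lebesgue),
    # where F_n is generated by the intervals [k/2^n, (k+1)/2^n):
    # the conditional expectation is the midpoint of the interval containing omega.
    k = int(omega * 2 ** n)
    return (k + 0.5) / 2 ** n

omega = 0.3
values = [dyadic_conditional_expectation(omega, n) for n in range(1, 25)]
print(values[0], values[-1])  # 0.25, then a value within 2**-25 of 0.3
```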
Lemma 1.70 If ξ ∈ L^1 and (F_α)_{α∈A} is an arbitrary family of σ-algebras, then the set of random variables

X_α := E(ξ | F_α),   α ∈ A,   (1.29)

is uniformly integrable.

Proof. By Markov's inequality, for every α

P(|X_α| ≥ n) ≤ (1/n) E(E(|ξ| | F_α)) = (1/n) E(|ξ|).

Therefore for any δ there is an n_0 such that if n ≥ n_0 then P(|X_α| ≥ n) < δ. As ξ ∈ L^1(Ω), the integral function A ↦ ∫_A |ξ| dP is absolutely continuous, that is, for arbitrary ε > 0 there is a δ such that if P(A) < δ then ∫_A |ξ| dP < ε. Hence if n is large enough, then

∫_{{|X_α|>n}} |X_α| dP ≤ ∫_{{|X_α|>n}} E(|ξ| | F_α) dP = ∫_{{|X_α|>n}} |ξ| dP < ε,

which means that the set (1.29) is uniformly integrable.

1.3.4 The downcrossings inequality
Let X be an arbitrary adapted stochastic process and let a < b. Let us fix a point of time t, and let S := {s_0 < s_1 < ⋯ < s_m} be a finite set of moments in the time interval [0, t). Let^33

τ_0 := inf{s ∈ S : X(s) > b} ∧ t.

By induction define

τ_{2k+1} := inf{s ∈ S : s > τ_{2k}, X(s) < a} ∧ t,
τ_{2k} := inf{s ∈ S : s > τ_{2k−1}, X(s) > b} ∧ t.

It is easy to check that τ_k is a stopping time for all k. It is also easy to see that if X is an integrable submartingale then the inequality

τ_{2k} ≤ τ_{2k+1} < t  a.s.

is impossible, as in this case X(τ_{2k}) > b, X(τ_{2k+1}) < a, and by the submartingale property^34

b < E(X(τ_{2k})) ≤ E(X(τ_{2k+1})) < a,

which is impossible. We say that a function f downcrosses the interval [a, b] if there are points u < v with f(u) > b and f(v) < a. By definition f has n downcrossings with thresholds a, b on the set S if there are points u_1 < v_1 < u_2 < v_2 < ⋯ < u_n < v_n in S with f(u_k) > b, f(v_k) < a. Let us denote by D_S^{a,b} the number of a < b downcrossings of X in the set S. Obviously

{D_S^{a,b} ≥ n} = {τ_{2n−1} < t} ∈ F_t,

and hence D_S^{a,b} is F_t-measurable. We show that

χ(D_S^{a,b} ≥ n) ≤ (∑_{k=0}^{m} (X(τ_{2k}) − X(τ_{2k+1})) + (X(t) − b)^+) / (n(b − a)).   (1.30)

Recall that m is the number of points of S, so the maximum number of possible downcrossings is obviously m. If we have more than n downcrossings then each of the first n terms of the sum is bigger than b − a. For every trajectory all but the last non-zero term of the sum are positive, as they are all not smaller than b − a > 0. There are two possibilities: in the last non-zero term either τ_{2k+1} < t or τ_{2k+1} = t. In the first case X(τ_{2k}) − X(τ_{2k+1}) > b − a > 0. In the second case still X(τ_{2k}) > b, therefore in this case X(τ_{2k}) − X(τ_{2k+1}) > b − X(t). Of course b − X(t) can be negative. This is the reason why we added the correction term (X(t) − b)^+ to the sum. If b − X(t) < 0 then

X(τ_{2k}) − X(τ_{2k+1}) + (X(t) − b)^+ = X(τ_{2k}) − X(τ_{2k+1}) + X(t) − b = X(τ_{2k}) − X(t) + X(t) − b = X(τ_{2k}) − b > 0,

33 If the set after inf is empty, then the infimum is by definition +∞.
34 See: Lemma 1.83, page 49.
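The combinatorial definition above translates directly into code. The following sketch (a helper of our own, written purely for illustration) counts the downcrossings of a finite sequence with thresholds a < b exactly as in the text: it waits for a value above b, then for a later value below a, and so on.

```python
def downcrossings(values, a, b):
    # Number of completed downcrossings of [a, b]: a point u with f(u) > b
    # followed by a later point v with f(v) < a, repeated disjointly.
    assert a < b
    count, looking_for_high = 0, True
    for x in values:
        if looking_for_high and x > b:
            looking_for_high = False   # found u with f(u) > b
        elif not looking_for_high and x < a:
            count += 1                 # found v with f(v) < a
            looking_for_high = True
    return count

seq = [0, 3, -1, 2, 4, 0, -2, 1]
print(downcrossings(seq, 0, 2))  # 2
```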
which means that (1.30) always holds. Taking the expectation on both sides,

P(D_S^{a,b} ≥ n) ≤ E((∑_{k=0}^{m} (X(τ_{2k}) − X(τ_{2k+1})) + (X(t) − b)^+) / (n(b − a)))
    = (1/(n(b − a))) ∑_{k=0}^{m} E(X(τ_{2k}) − X(τ_{2k+1})) + (1/(n(b − a))) E((X(t) − b)^+).

Now assume that X is an integrable submartingale. As t ≥ τ_{2k+1} ≥ τ_{2k}, by the discrete Optional Sampling Theorem^35 E(X(τ_{2k}) − X(τ_{2k+1})) ≤ 0, so

P(D_S^{a,b} ≥ n) ≤ E((X(t) − b)^+) / (n(b − a)).

If the number of points of S increases by refining S, then the number of downcrossings D_S^{a,b} does not decrease. If S is an arbitrary countable set then the number of downcrossings in S is the supremum of the downcrossings over the finite subsets of S. With the Monotone Convergence Theorem we get the following important inequality:

Theorem 1.71 (Downcrossing inequality) If X is an integrable submartingale and S is an arbitrary finite or countable subset of the time interval [0, t), then

P(D_S^{a,b} ≥ n) ≤ E((X(t) − b)^+) / (n(b − a)).

In particular

P(D_S^{a,b} = ∞) = 0.

There are many important consequences of this inequality. The first one is a generalization of the martingale convergence theorem.

Corollary 1.72 (Submartingale convergence theorem) Let X := (X_n, F_n) be a submartingale over the natural numbers N. If X is bounded in L^1(Ω) then

35 See: Lemma 1.83, page 49.
there is a variable X_∞ ∈ L^1(Ω) such that

lim_{n→∞} X_n = X_∞  a.s.   (1.31)

Proof. If X_n → X_∞ a.s. then by Fatou's lemma

E(|X_∞|) ≤ lim inf_{n→∞} E(|X_n|) ≤ k < ∞

and X_∞ ∈ L^1(Ω). Let a < b be rational thresholds and let S_m := {1, 2, …, m}. As E(|X_m|) ≤ k for all m,

P(D_{S_m}^{a,b} ≥ n) ≤ E((X_m − b)^+) / (n(b − a)) ≤ k / (n(b − a)).

If m ↗ ∞ then for all n

P(D_N^{a,b} = ∞) ≤ P(D_N^{a,b} ≥ n) ≤ k / (n(b − a)),

which implies that P(D_N^{a,b} = ∞) = 0. The convergence in (1.31) easily follows from the next lemma:

Lemma 1.73 Let (c_n) be an arbitrary sequence of real numbers. If for every pair of rational thresholds a < b the number of downcrossings of the sequence (c_n) is finite, then the (finite or infinite) limit lim_{n→∞} c_n exists.

Proof. The extended real numbers lim sup_n c_n and lim inf_n c_n always exist. If

lim inf_{n→∞} c_n < a < b < lim sup_{n→∞} c_n,

then the number of downcrossings of (c_n) is infinite.

Definition 1.74 Let ξ ∈ L^1(Ω) and let X_n := E(ξ | F_n), n ∈ N. Assume that the sequence of σ-algebras (F_n) is decreasing, that is, F_{n+1} ⊆ F_n for all n ∈ N. Sequences of this type are called reversed martingales. If Y_{−n} := X_n for all n ∈ N and G_{−n} := F_n, then Y = (Y_n, G_n) is a martingale over the parameter set Θ = {−1, −2, …}. If (X_n, F_n) is a reversed martingale then one can assume that X_n = E(X_0 | F_n) for all n. If X is a continuous-time martingale and t_n ↘ t_∞ then the sequence (X(t_n), F_{t_n})_n is a reversed martingale.
Theorem 1.75 (Lévy) If (F_n) is a decreasing sequence of σ-algebras, X_0 ∈ L^1(Ω) and F_∞ := ∩_n F_n, then

X_n := E(X_0 | F_n) → E(X_0 | F_∞),

where the convergence holds in L^1(Ω) and almost surely.

Proof. As (X_n) is uniformly integrable^36, it is sufficient to prove that (X_n) is almost surely convergent. Let a < b be rational thresholds. On the set

A := {lim inf_{n→∞} X_n < a < b < lim sup_{n→∞} X_n}

the number of downcrossings is infinite. As n ↦ X_{−n} is a martingale on Z, the probability of A is zero. Hence

lim inf_{n→∞} X_n = lim sup_{n→∞} X_n  a.s.

1.3.5 Regularization of martingales
Recall that, by definition, every continuous-time martingale is right-regular. Let F be an arbitrary filtration and let ξ ∈ L^1(Ω). In discrete time the sequence X_n := E(ξ | F_n) is a martingale, as for every s < t

E(X(t) | F_s) := E(E(ξ | F_t) | F_s) = E(ξ | F_s) =: X(s)  a.s.

In continuous time X is not necessarily a martingale, as the trajectories of X are not necessarily right-regular.

Definition 1.76 A stochastic process X has a martingale structure if E(X(t)) is finite for every t and

E(X(t) | F_s) = X(s)  a.s.

for all s < t.

Our goal is to show that if the filtration F satisfies the usual conditions then every stochastic process with a martingale structure has a modification which is a martingale.

36 See: Lemma 1.70, page 42.

The proof depends on the following simple lemma:

Lemma 1.77 If X has a martingale structure then there is an Ω_0 ⊆ Ω with P(Ω_0) = 1, such that for every trajectory X(ω) with ω ∈ Ω_0 and for every pair of rational thresholds a < b the number of downcrossings D_Q^{a,b} over the rational numbers is finite. In particular, if ω ∈ Ω_0 then for every t ∈ Θ the (finite or infinite) limits

lim_{s↘t, s∈Q} X(s, ω),   lim_{s↗t, s∈Q} X(s, ω)

exist.

Proof. The first part of the lemma is a direct consequence of the downcrossings inequality. If lim_n X(s_n, ω) does not exist for some s_n ↘ t, then for some rational thresholds a < b the number of downcrossings of (X(s_n, ω)) is infinite, so ω ∉ Ω_0.

Assume that X has a martingale structure. Let Ω_0 ⊆ Ω be the subset in the lemma above. The process

X̃(t, ω) := lim_{s↘t, s∈Q} X(s, ω) if ω ∈ Ω_0, and X̃(t, ω) := 0 if ω ∉ Ω_0,   (1.32)

is right-regular. Indeed, let t < s and ε > 0. Then

|X̃(t, ω) − X̃(s, ω)| ≤ |X̃(t, ω) − X(t_n, ω)| + |X(t_n, ω) − X(s_n, ω)| + |X(s_n, ω) − X̃(s, ω)|.

As for an arbitrary ω ∈ Ω_0 the number of ε/3 downcrossings of X over Q is finite, one can assume that in a right neighbourhood (t, t + u) of t, for every t_n, s_n ∈ Q,

|X(t_n, ω) − X(s_n, ω)| < ε/3.

From this, obviously, there is a δ such that if t < s < t + δ then

|X̃(t, ω) − X̃(s, ω)| < ε.

In a similar way one can prove that on Ω_0 the trajectories of X̃ have left limits. Of course, without further assumptions we do not know that X̃ is a modification of X. Assume that F satisfies the usual conditions. If t_n ↘ t and t_n ∈ Q, then
by Lévy's theorem^37

X̃(t) = lim_{n→∞} X(t_n) = lim_{n→∞} E(X(t_0) | F_{t_n}) = E(X(t_0) | F_{t+})  a.s.

As F is right-continuous, F_{t+} = F_t. As X has a martingale structure,

X̃(t) = E(X(t_0) | F_t) = X(t)  a.s.,

and therefore X̃ is a modification of X. As F contains the measure-zero sets, X̃ is F-adapted. This proves the following observation:

Theorem 1.78 If X has a martingale structure and the filtration F satisfies the usual conditions, then X has a modification which is a martingale.

Corollary 1.79 If F satisfies the usual conditions and X is a uniformly integrable martingale over Θ = [0, ∞), then there is a martingale X̃ over Θ̃ := [0, ∞] which is indistinguishable from X over Θ.

Proof. From the martingale convergence theorem^38 there is an X(∞) ∈ L^1(Ω) such that X(t) = E(X(∞) | F_t) a.s. for all t. On [0, ∞] X has a martingale structure, and so it has a modification X̃ which is a martingale over [0, ∞]. On [0, ∞) X and X̃ are right-regular, so X and X̃ are indistinguishable^39.

From now on, if X is a uniformly integrable martingale over Θ := R+, then we shall always assume that X is a martingale over [0, ∞].

Example 1.80 One cannot extend the theorem to submartingales.
For arbitrary Ω let F_t := A := {∅, Ω}. Every function f on R+ is an adapted stochastic process, and (Ω, A, P, F) obviously satisfies the usual conditions. If f is increasing then f has a 'submartingale structure', but if f is not right-continuous then f is not a submartingale. If f is not right-continuous then it does not have a right-continuous modification.

Sometimes one cannot assume that the filtration satisfies the usual conditions. In this case one can use the following proposition:

Proposition 1.81 Assume that X has a martingale structure and X is continuous in probability from the right. If F contains the measure-zero sets, then X has a modification which is a martingale.

37 See: Theorem 1.75, page 46.
38 See: (1.26), page 40.
39 See: Proposition 1.9, page 7.
Proof. X is continuous from the right in probability, therefore X(s_n) → X(t) a.s. for some s_n ↘ t. So if X̃ is the right-regular process in (1.32), then

X̃(t) = lim_{n→∞} X(s_n) = X(t)  a.s.

Therefore X̃ is a right-regular and adapted modification of X.

The regularity of the trajectories is an essential condition.

Example 1.82 If the trajectories of martingales were not regular, then most of the results of the continuous-time martingale theory would not be true.

Let (Ω, A, P) := ([0, 1], B([0, 1]), λ), where λ denotes the Lebesgue measure. Let F_t := A := B([0, 1]) and let ∆ denote the diagonal of [0, 1]^2. Then X := χ_∆ has a martingale structure, but if a = 1 then

1 = a P(sup_{t∈[0,1]} X(t) ≥ a) > sup_{t∈I} ‖X^+(t)‖_1 = 0,

that is, without the regularity of the trajectories Doob's inequality does not hold. Of course Y ≡ 0 is a regular modification of X, and for Y Doob's inequality holds.

1.3.6 The Optional Sampling Theorem
As a first step let us prove the discrete-time version of the Optional Sampling Theorem^40.

Lemma 1.83 Let X = (X_n, F_n) be a discrete-time, integrable submartingale. If τ_1 and τ_2 are stopping times and for some p < ∞

P(τ_1 ≤ τ_2) = P(τ_2 ≤ p) = 1,

then

X(τ_1) ≤ E(X(τ_2) | F_{τ_1})

and

E(X_0) ≤ E(X(τ_1)) ≤ E(X(τ_2)) ≤ E(X_p).

If X is a martingale then in both lines above equality holds everywhere.

40 The reader should observe that we have already used this lemma several times. Of course the proof of the lemma is independent of the results above.
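Before turning to the proof, the martingale case of the lemma, E(X(τ)) = E(X(0)) for bounded stopping times, can be verified by brute force. The sketch below (our illustration) enumerates all eight paths of a three-step ±1 walk and stops at the first visit to −1, capped at time 3.

```python
from itertools import product

# Symmetric +/-1 walk over 3 steps; all 8 paths are equally likely.
paths = list(product([-1, 1], repeat=3))

def X(path, n):
    return sum(path[:n])

def tau(path):
    # First time the walk hits -1, capped at 3: a bounded stopping time,
    # since it only inspects the path up to the current moment.
    for n in range(1, 4):
        if X(path, n) == -1:
            return n
    return 3

expected_stopped = sum(X(p, tau(p)) for p in paths) / len(paths)
print(expected_stopped)  # 0.0 = E(X(0)), as the lemma asserts for martingales
```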
Proof. Let τ_1 ≤ τ_2 ≤ p and

φ_k := χ(τ_1 < k ≤ τ_2).

Observe that

{φ_k = 1} = {τ_1 < k, τ_2 ≥ k} = {τ_1 ≤ k − 1} ∩ {τ_2 ≤ k − 1}^c ∈ F_{k−1}.

By the assumptions X_k is integrable for all k, so X_k − X_{k−1} is also integrable, therefore the conditional expectation of the variable X_k − X_{k−1} with respect to the σ-algebra F_{k−1} exists. φ_k is bounded, hence

E(η) := E(∑_{k=1}^{p} φ_k [X_k − X_{k−1}])
    = ∑_{k=1}^{p} E(E(φ_k [X_k − X_{k−1}] | F_{k−1}))
    = ∑_{k=1}^{p} E(φ_k E(X_k − X_{k−1} | F_{k−1})) ≥ 0.

If τ_1(ω) = τ_2(ω) for some outcome ω, then φ_k(ω) = 0 for all k, hence η(ω) := 0. If τ_1(ω) < τ_2(ω), then

η(ω) := X(τ_1(ω) + 1) − X(τ_1(ω)) + X(τ_1(ω) + 2) − X(τ_1(ω) + 1) + ⋯ + X(τ_2(ω)) − X(τ_2(ω) − 1),

which is X(τ_2(ω)) − X(τ_1(ω)). Therefore

E(η) = E(X(τ_2) − X(τ_1)) ≥ 0.

X_k is integrable for all k, therefore E(X(τ_1)) and E(X(τ_2)) are finite. By the finiteness of these expected values

E(X(τ_2) − X(τ_1)) = E(X(τ_2)) − E(X(τ_1)),

hence E(X(τ_2)) ≥ E(X(τ_1)). Let A ∈ F_{τ_1} ⊆ F_{τ_2}, and let us define the variables

τ*_k(ω) := τ_k(ω) if ω ∈ A, and τ*_k(ω) := p + 1 if ω ∉ A.   (1.33)
τ*_1 and τ*_2 are stopping times, since if n ≤ p then

{τ*_k ≤ n} = A ∩ {τ_k ≤ n} ∈ F_n.

By (1.33)

E(X(τ*_2)) = ∫_A X(τ_2) dP + ∫_{A^c} X(p + 1) dP ≥ E(X(τ*_1)) = ∫_A X(τ_1) dP + ∫_{A^c} X(p + 1) dP.

As X_{p+1} is integrable, one can cancel ∫_{A^c} X(p + 1) dP from both sides of the inequality, so

∫_A X(τ_2) dP ≥ ∫_A X(τ_1) dP.
X(τ_1) is F_{τ_1}-measurable and therefore

E(X(τ_2) | F_{τ_1}) ≥ X(τ_1).

To prove the continuous-time version of the Optional Sampling Theorem we need some technical lemmas:

Lemma 1.84 If τ is a stopping time, then there is a sequence of stopping times (τ_n) such that τ_n has a finite number of values^41, τ < τ_n for all n and τ_n ↘ τ.

Proof. Divide the interval [0, n) into n2^n equal parts I_k^{(n)} := [(k − 1)/2^n, k/2^n). Let

τ_n(ω) := k/2^n if ω ∈ τ^{−1}(I_k^{(n)}), and τ_n(ω) := +∞ otherwise.

Obviously τ < τ_n. At every step the subintervals I_k^{(n)} are divided equally, and the value of τ_n on τ^{−1}(I_k^{(n)}) is always the right endpoint of the interval I_k^{(n)}. Therefore τ_n ↘ τ. τ is a stopping time, hence, using that every stopping time is a weak stopping time,

τ^{−1}(I_k^{(n)}) = {τ < k/2^n} ∩ {τ < (k − 1)/2^n}^c ∈ F_{k/2^n}.

41 τ_n(ω) = +∞ is possible.
STOCHASTIC PROCESSES
Therefore
i τn ≤ n 2
=
k τn = n 2
k≤i
=
(n) τ −1 Ik ∈ Fi/2n . k≤i
The possible values of τ n are among the dyadic numbers i/2n and therefore τ n is a stopping time. Lemma 1.85 If (τ n ) is a sequence of stopping times and τ n τ then Fτ n + Fτ + . If τ n > τ and τ n τ then Fτ n Fτ + . Proof. Recall that by definition A ∈ Fρ+ if A ∩ {ρ ≤ t} ∈ Ft+ for every t. If A ∈ Fρ+ , then A ∩ {ρ < t} =
n
1 A∩ ρ≤t− n
∈ ∪n F(t−1/n)+ ⊆ Ft .
1. Let A ∈ Fρ+ and let ρ ≤ σ. A ∩ {σ ≤ t} = A ∩ {ρ ≤ t} ∩ {σ ≤ t} ∈ Ft+ as A ∩ {ρ ≤ t} ∈ Ft+ and {σ ≤ t} ∈ Ft . From this it is easy to see that Fτ + ⊆ ∩n Fτ n + . If A ∈ ∩n Fτ n + , then as τ n τ A ∩ {τ < t} = A
(∪n {τ n < t}) =
(A ∩ {τ n < t}) ∈ Ft .
n
So A ∩ {τ ≤ t} =
n
1 A∩ τ
∈
Ft+1/n = Ft+
n
that is A ∈ Fτ + . 2. If A ∈ Fρ+ and ρ < σ then as {σ ≤ t} ∈ Ft A ∩ {σ ≤ t} = A ∩ {ρ < t} ∩ {σ ≤ t} ∈ Ft , so A ∈ Fσ . From this it is easy to see that if τ < τ n then Fτ + ⊆ ∩n Fτ n . On the other hand ∩n Fτ n ⊆ ∩n Fτ n + = Fτ + and so if τ < τ n then ∩n Fτ n = Fτ + .
MARTINGALES
53
Theorem 1.86 (Optional Sampling Theorem for martingales) Let X be a martingale, and let τ 1 ≤ τ 2 be stopping times. If X is uniformly integrable, or if τ 1 and τ 2 are bounded then a.s.
X (τ 1 ) = E (X (τ 2 ) | Fτ 1 + ) .
(1.34)
Proof. Let τ be a bounded stopping time and let τ (n) τ , τ (n) > τ be a finitevalued approximating sequence42 . As τ is bounded there is an N large enough that τ (n) ≤ N . By the first lemma X(τ (n) ) = E (X(N ) | Fτ (n) ) .
(1.35)
As τ (n) > τ , by the last lemma ∩n Fτ (n) = Fτ + . So by the definition of the conditional expectation (1.35) means that X(τ (n) )dP = X(N )dP, A ∈ Fτ + . A
A
X(N ) is integrable therefore the sequence X(τ (n) ) is uniformly integrable43
by (1.35). By the right-continuity of the martingales X (τ ) = limn→∞ X τ (n) , so if A ∈ Fτ + then (n) X(N )dP = lim X(τ )dP = lim X(τ (n) )dP = n→∞
A
=
A
A n→∞
X(τ )dP. A
As X (τ ) is Fτ -measurable and Fτ ⊆ Fτ + , X (τ ) = E (X (N ) | Fτ + ) . If X is uniformly integrable then one can assume that X is a martingale on [0, ∞]. There is a continuous bijective time transformation f between the intervals [0, ∞] and [0, 1]. During this transformation the properties of X and τ do not change, but f (τ ) will be bounded, so using the same argument as above one can prove that X (τ ) = E (X (∞) | Fτ + ) . Finally if τ 1 ≤ τ 2 , then as Fτ 1 + ⊆ Fτ 2 + E (X (τ 2 ) | Fτ 1 + ) = E (E (X (N ) | Fτ 2 + ) | Fτ 1 + ) = = E (X (N ) | Fτ 1 + ) = X (τ 1 ) , where if X is uniformly integrable, then N ∞. 42 See: 43 See:
Lemma 1.84, page 51. Lemma 1.70, page 42.
54
STOCHASTIC PROCESSES
Corollary 1.87 If X is a non-negative martingale and τ 1 ≤ τ 2 , then X(τ 1 ) ≥ E (X(τ 2 ) | Fτ 1 + ) .
(1.36)
Proof. First of all let us remark, that as X is a non-negative martingale X(∞) is meaningful 44 , and if n ∞ then X (τ ∧ n) → X (τ ) for every stopping time τ . Let G σ ∪n F(τ ∧n)+ . Obviously G ⊆ Fτ + . Let A ∈ Fτ + . A ∩ {τ ≤ n} ∩ {τ ∧ n ≤ t} = A ∩ {τ ≤ t ∧ n} ∈ Ft+ , therefore A ∩ {τ ≤ n} ∈ F(τ ∧n)+ . So A ∩ {τ < ∞} ∈ G. Also A ∩ {τ > n} ∩ {τ ∧ n ≤ t} = A ∩ {t ≥ τ > n} ∈ Ft+ so A ∩ {τ > n} ∈ F(τ ∧n)+ . Hence A ∩ {τ = ∞} = A ∩ (∩n {τ > n}) ∈ G, therefore G = Fτ + . Let n1 ≤ n2 . By the Optional Sampling Theorem
X(τ 1 ∧ n1 ) = E X(τ 2 ∧ n2 ) | F(τ 1 ∧n1 )+ . X(τ 2 ∧ n2 ) ∈ L1 (Ω) and therefore by L´evy’s theorem X(τ 1 ) = E (X(τ 2 ∧ n2 ) | Fτ 1 + ) . By Fatou’s lemma X(τ 1 ) = lim E (X(τ 2 ∧ n2 ) | Fτ 1 + ) ≥ E n2 →∞
lim X(τ 2 ∧ n2 ) | Fτ 1 +
n2 →∞
=
= E (X(τ 2 ) | Fτ 1 + ) .
Proposition 1.88 (Optional Sampling Theorem for submartingales) Let τ 1 ≤ τ 2 bounded stopping times. If X is an integrable submartingale then X (τ 1 ) and X (τ 2 ) are integrable and X (τ 1 ) ≤ E (X (τ 2 ) | Fτ 1 ) .
(1.37)
The inequality also holds if τ 1 ≤ τ 2 are arbitrary stopping times and X can be extended as an integrable submartingale to [0, ∞]. Proof. The proof of the proposition is nearly the same as the proof in the martingale case. Again it is sufficient to prove the inequality in the bounded 44 See:
Corollary 1.66, page 40.
MARTINGALES (n)
55
(n)
case. Assume that τ 1 ≤ τ 2 ≤ K and let (τ 1 )n and (τ 2 )n be the finite-valued (n) (n) approximating sequences of τ 1 and τ 2 . By the construction τ 1 ≤ τ 2 , so by the first lemma of the subsection (n) (n) X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
(n)
By the right-continuity of submartingales X(τ k ) → X(τ k ) and therefore one should prove that the convergence holds in L1 (Ω), that is, one should prove the (n) uniform integrability of the sequences (X(τ k )). Since in this case one can take the limits under the integral signs therefore X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
As X(τ 1 ) is Fτ 1 + -measurable by the definition of the conditional expectation X (τ 1 ) = E (X (τ 1 ) | Fτ 1 ) ≤ E (X (τ 2 ) | Fτ 1 + ) . This means that (1.37) holds. Let us prove that the sequence uniformly integrable.
(n) X(τ k ) is
1. As X is submartingale, X + is also submartingale, therefore from the finite Optional Sampling Theorem
(n) ≤ E X + (K) | Fτ (n) . 0 ≤ X+ τ k k
The right-hand side is uniformly integrable45 , so the left-hand side is also uniformly integrable. (n)
2. Let Xn X(τ k ). By the finite Optional Sampling Theorem (Xn ) is obviously an integrable reversed submartingale. Let n > m. As (Xn ) is a reversed submartingale 0≤ Xn− dP = − Xn dP = Xn dP − E(Xn ) ≤ {Xn− ≥N } {Xn− ≥N } {Xn−
m and as Xn is a reversed submartingale E (X (0)) ≤ E(Xn ) ≤ E(Xm ) ≤ E(X (K)). 45 See:
Lemma 1.70, page 42.
56
STOCHASTIC PROCESSES
X is an integrable submartingale, so E(X (0)) is finite. Hence the sequence (E(Xn )) is convergent. If m is large enough then 0 ≤ E(Xm ) − E(Xn ) ≤ ε/2. Xm is integrable so the integral function A Xm dP is absolutely continuous. Therefore it is sufficient to prove that for an arbitrary δ there is an N such that for all n
P Xn− ≥ N ≤ δ. By Markov’s inequality
E(Xn− ) Xn 1 E(|Xn |) ≤ = . P Xn− ≥ N ≤ N N N But since (Xn+ ) is also a reversed submartingale
Xn 1 = E Xn+ − Xn− = E Xn+ + Xn+ − Xn =
= 2E Xn+ − E (Xn ) ≤ 2E X + (K) − E (Xn ) ≤
≤ 2E X + (K) − E (X (0)) L. Hence for all n
L →0 P Xn− ≥ N ≤ N and therefore (Xn ) is uniformly integrable. Example 1.89 The right-regularity and the assumption of uniform integrability of the martingales are important.
Let P be a Poisson process with parameter λ, and let π(t) := P(t) − λt be the so-called compensated Poisson process. Recall that Poisson processes are, by definition, right-continuous, so it is easy to see that π is a martingale. Let τ denote the time of the first jump of P. If N > 0, then σ := τ ∧ N is a bounded stopping time. If π were not right- but left-continuous, then one could not apply the Optional Sampling Theorem: if P were left-continuous then P(σ) = 0, and

E(π(0)) = 0 ≠ E(−λσ) = E(P(σ) − λσ) = E(π(σ)).

Let w be a Wiener process and let

τ_a := inf{t : w(t) = a}

be the first passage time of a level a ≠ 0. As w is not uniformly integrable and τ_a is unbounded, one cannot apply the Optional Sampling Theorem: almost surely^46 τ_a < ∞, hence w(τ_a) = a a.s. Therefore

E(w(τ_a)) = E(a) = a ≠ 0 = E(w(0)).
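For the right-continuous compensated Poisson process the Optional Sampling Theorem does apply at the bounded stopping time σ = τ ∧ N, and the identity E(π(σ)) = E(π(0)) = 0 can be checked by simulation. In the sketch below (the parameter values are our own choice) only the first jump matters, since P(σ) = χ(τ ≤ N).

```python
import random

random.seed(42)

lam, N, n_samples = 2.0, 1.0, 200_000
total = 0.0
for _ in range(n_samples):
    tau = random.expovariate(lam)       # time of the first jump of P
    sigma = min(tau, N)                 # bounded stopping time sigma = tau /\ N
    P_sigma = 1.0 if tau <= N else 0.0  # P is right-continuous, so P(tau) = 1
    total += P_sigma - lam * sigma      # pi(sigma) = P(sigma) - lam * sigma
print(total / n_samples)                # close to 0 = E(pi(0))
```

Replacing P(σ) by its left limit, the left-continuous reading of the example, would instead give the mean −λ E(σ) ≠ 0.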
Example 1.90 The exponential martingales of Wiener processes are not uniformly integrable.
Let w be a Wiener process. If the so-called exponential martingale

X(t) := exp(w(t) − t/2)

were uniformly integrable, then for every stopping time one could apply the Optional Sampling Theorem. X is a non-negative martingale, therefore there is^47 a random variable X(∞) such that almost surely X(t) → X(∞). For almost all trajectories of w the set {w = 0} is unbounded^48, therefore w(σ_n) = 0 for some sequence σ_n ↗ ∞. Therefore

X(∞) = lim_{n→∞} X(σ_n) := lim_{n→∞} exp(w(σ_n) − σ_n/2) = lim_{n→∞} exp(−σ_n/2) = 0  a.s.

Since X(0) = 1, X(∞) = 0 a.s. and X is continuous, if a < 1 then almost surely

τ_a := inf{t : X(t) = a} < ∞.

That is, X(τ_a) = a a.s. So if a < 1, then

E(X(0)) = 1 > a = E(X(τ_a)).

Hence X is not uniformly integrable.

Proposition 1.91 (Martingales and conservation of the expected value) Let X be an adapted and right-regular process. X is a martingale if and only if

X(τ) ∈ L^1(Ω)  and  E(X(τ)) = E(X(0))

for all bounded stopping times τ. This property holds for every stopping time τ if and only if X is a uniformly integrable martingale.

46 See: Proposition B.7, page 564.
47 See: Corollary 1.66, page 40.
48 See: Corollary B.8, page 565.
Proof. If X is a martingale, or a uniformly integrable martingale, then by the Optional Sampling Theorem the proposition holds. Let s < t and let A ∈ F_s. It is easy to check that

τ := tχ_{A^c} + sχ_A   (1.38)

is a bounded stopping time. By the assumption of the proposition

E(X(0)) = E(X(τ)) = E(X(t)χ_{A^c}) + E(X(s)χ_A).

As τ ≡ t is also a stopping time,

E(X(0)) = E(X(t)) = E(X(t)χ_{A^c}) + E(X(t)χ_A).

Comparing the two equations,

E(X(s)χ_A) = E(X(t)χ_A),

that is, E(X(s) | F_s) = E(X(t) | F_s). As X is adapted, X(s) is F_s-measurable, so X(s) = E(X(t) | F_s). If the property E(X(τ)) = E(X(0)) holds for every stopping time τ, then one can apply it to the stopping time τ ≡ ∞ as well. Hence X(∞) exists and in (1.38) t = ∞ is possible, hence

X(s) = E(X(∞) | F_s),

so X is uniformly integrable^49.

Corollary 1.92 (Conservation of the martingale property under truncation) If X is a martingale and τ is a stopping time, then the truncated process X^τ is also a martingale.

Proof. If X is right-regular then the truncated process X^τ is also right-regular. By Proposition 1.35, X^τ is adapted. Let φ be a bounded stopping time. As υ := φ ∧ τ is a bounded stopping time, by Proposition 1.91

E(X^τ(φ)) = E(X(υ)) = E(X(0)) = E(X^τ(0)),

and therefore X^τ is a martingale.

1.3.7 Application: elementary properties of Lévy processes
Lévy processes are natural generalizations of Wiener and Poisson processes. Let us fix a stochastic base space (Ω, A, P, F) and assume that Θ = [0, ∞).

Definition 1.93 Let X be an adapted stochastic process. X is a process with independent increments with respect to the filtration F if

1. X(0) = 0,
2. X is right-regular,
3. whenever s < t, the increment X(t) − X(s) is independent of the σ-algebra F_s.

A process X with independent increments is a Lévy process if it has stationary or homogeneous increments, that is, for every t and for every h > 0 the distribution of the increment X(t + h) − X(t) is the same as the distribution of X(h) − X(0).

By definition every Lévy process and every process with independent increments has right-regular trajectories. This topological assumption is very important, as it is not implied by the other assumptions:

Example 1.94 Not every process starting from zero and having stationary and independent increments is a Lévy process.

49 See: Lemma 1.70, page 42.
Let Ω be arbitrary, let A = F_t = {∅, Ω}, and let (x_α)_α be a Hamel basis of R over the rational numbers. For every t let X(t) be the sum of the coordinates of t in the Hamel basis. Obviously X(t + s) = X(t) + X(s), so X has stationary and independent increments. But as X is highly discontinuous⁵⁰, it does not have a modification which is a Lévy process.

Example 1.95 The sum of two Lévy processes is not necessarily a Lévy process⁵¹.
We show that even the sum of two Wiener processes is not necessarily a Wiener process. The present counterexample is very important as it shows that, although Lévy processes are the canonical and most important examples of semimartingales, they are not the right objects from the point of view of the theory. The sum of two semimartingales⁵² is a semimartingale, and the same is true for martingales and for local martingales. But it is not true for Lévy processes!

1. Let Ω be the set of two-dimensional continuous functions R₊ → R² with the property f(0) = (0, 0). Let P₁ be a measure on the Borel σ-algebra of Ω for which the canonical stochastic process X(ω, t) := ω(t) is a two-dimensional Wiener process with correlation coefficient 1. In the same way let P₂ be the measure on Ω under which X is a Wiener process with correlation coefficient −1. Let P := (P₁ + P₂)/2. It is easy to see that the coordinate processes w₁(t) and

⁵⁰ The image space of X is the rational numbers!
⁵¹ The example depends on results which we shall prove later, so the reader can skip it at a first reading.
⁵² We shall introduce the definitions of semimartingales and local martingales later.
60
STOCHASTIC PROCESSES
w₂(t) are Wiener processes. On the other hand, a simple calculation shows that the distribution of Z := w₁ + w₂ is not Gaussian. Z is continuous, and every continuous Lévy process is a linear combination of a Wiener process and a linear trend⁵³; therefore, as Z is not a Gaussian process, it cannot be a Lévy process.

2. The next example is a bit more technical, but very similar. Let w be a Wiener process with respect to some filtration F. Let X(t) := ∫₀ᵗ sign(w) dw, where the integral, of course, is an Itô integral. The quadratic variation of X is

[X](t) = ∫₀ᵗ (sign(w))² d[w] = ∫₀ᵗ 1 ds = t,

so by Lévy's characterization theorem⁵⁴ the continuous local martingale X is also a Wiener process⁵⁵ with respect to F. If

Z := w + X = 1 • w + sign(w) • w = (1 + sign(w)) • w,

then Z is a continuous martingale with respect to F with zero expected value. But

[Z](t) = ∫₀ᵗ (1 + sign(w))² d[w] = ∫₀ᵗ (1 + sign(w(s)))² ds,

so Z is not a Wiener process. As in the first example, every continuous Lévy process is a linear combination of a Wiener process and a linear trend; therefore, as Z is not a Wiener process, it cannot be a Lévy process.

During the proof of the next proposition we shall need the following very useful simple observation:

Lemma 1.96 ξ₁ and ξ₂ are independent vector-valued random variables if and only if ϕ = ϕ₁ · ϕ₂, where ϕ₁ is the Fourier transform of ξ₁, ϕ₂ is the Fourier transform of ξ₂, and ϕ is the Fourier transform of the joint distribution of (ξ₁, ξ₂).

Proof. If ξ₁ and ξ₂ are independent then the decomposition obviously holds. The other implication is an easy consequence of the Monotone Class Theorem:

⁵³ See: Theorem 6.11, page 367.
⁵⁴ See: Theorem 6.13, page 368.
⁵⁵ See: Example 6.14, page 370.
fix a vector v and let L be the set of bounded functions u for which

E(u(ξ₁) · exp(i(v, ξ₂))) = E(u(ξ₁)) · E(exp(i(v, ξ₂))).

L is obviously a λ-system. Under the conditions of the lemma L contains the π-system of the functions u(x) = exp(i(u, x)), so it contains the characteristic functions of the sets of the σ-algebra generated by these exponential functions. Therefore it is easy to see that for every Borel measurable set B

E(χ_B(ξ₁) · exp(i(v, ξ₂))) = P(ξ₁ ∈ B) · E(exp(i(v, ξ₂))).

Now let L be the set of bounded functions v for which

E(χ_B(ξ₁) · v(ξ₂)) = P(ξ₁ ∈ B) · E(v(ξ₂)).

With the same argument as above, by the Monotone Class Theorem, for any Borel measurable set D one can choose v = χ_D. So

P(ξ₁ ∈ B, ξ₂ ∈ D) = E(χ_B(ξ₁) · χ_D(ξ₂)) = P(ξ₁ ∈ B) · P(ξ₂ ∈ D);

therefore, by definition, the random vectors ξ₁ and ξ₂ are independent.
Proposition 1.97 For an adapted process X the increments are independent if and only if the σ-algebra G_t generated by the increments

X(u) − X(v),   u ≥ v ≥ t,

is independent of F_t for every t.

Proof. To make the notation as simple as possible let X(t₀) denote an arbitrary F_{t₀}-measurable random variable. Let 0 = t₋₁ ≤ t = t₀ ≤ t₁ ≤ t₂ ≤ … ≤ t_n. We show that if X has independent increments then the random variables

X(t₀), X(t₁) − X(t₀), X(t₂) − X(t₁), …, X(t_n) − X(t_{n−1})   (1.39)

are independent. To prove this one should show that the Fourier transform of the joint distribution of the variables in (1.39) is the product of the Fourier
transforms of the distributions of these increments. Writing ΔX(t_j) := X(t_j) − X(t_{j−1}) and denoting by ϕ_{t,s} the Fourier transform of X(t) − X(s),

ϕ(u) := E( exp( i Σ_{j=0}^{n} u_j [X(t_j) − X(t_{j−1})] ) ) =
= E( E( exp( i Σ_{j=0}^{n} u_j ΔX(t_j) ) | F_{t_{n−1}} ) ) =
= E( exp( i Σ_{j=0}^{n−1} u_j ΔX(t_j) ) E( exp(iu_n ΔX(t_n)) | F_{t_{n−1}} ) ) =
= E( exp( i Σ_{j=0}^{n−1} u_j ΔX(t_j) ) ) E( exp(iu_n ΔX(t_n)) ) =
= E( exp( i Σ_{j=0}^{n−1} u_j ΔX(t_j) ) ) ϕ_{t_n,t_{n−1}}(u_n) = ⋯ =
= ∏_{j=0}^{n} ϕ_{t_j,t_{j−1}}(u_j).

Of course this means that the σ-algebra generated by a finite number of increments is independent of F_t for any t. As the union of the σ-algebras generated by a finite number of increments is a π-system, by the uniqueness of the extension of probability measures from π-systems one can prove that the σ-algebra generated by the increments is independent of F_t.

Let us denote by ϕ_t the Fourier transform of X(t). As X has stationary and independent increments, for every u

ϕ_{t+s}(u) := E(exp(iuX(t + s))) =
= E(exp(iu(X(t + s) − X(t))) exp(iuX(t))) =
= E(exp(iu(X(t + s) − X(t)))) · E(exp(iuX(t))) =
= E(exp(iuX(s))) · E(exp(iuX(t))) = ϕ_s(u) · ϕ_t(u),

therefore

ϕ_{t+s}(u) = ϕ_t(u) · ϕ_s(u).   (1.40)
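The factorization of the joint Fourier transform and the semigroup property (1.40) can be illustrated numerically on the simplest Lévy process, a Wiener process, for which ϕ_t(u) = exp(−tu²/2). The following sketch is illustrative only; the parameter values and the sample size are arbitrary choices, not from the text.

```python
import cmath
import math
import random

# Monte Carlo illustration of the identities just derived, for a Wiener
# process: independent increments make the joint Fourier transform
# factorize, and phi_{t+s}(u) = phi_t(u) * phi_s(u) holds exactly.
random.seed(7)
n, u1, u2, t1, t2 = 200_000, 0.7, -1.1, 0.8, 1.5

joint, f1, f2 = 0j, 0j, 0j
for _ in range(n):
    inc1 = random.gauss(0.0, math.sqrt(t1))        # X(t1)
    inc2 = random.gauss(0.0, math.sqrt(t2 - t1))   # X(t2) - X(t1)
    joint += cmath.exp(1j * (u1 * inc1 + u2 * inc2))
    f1 += cmath.exp(1j * u1 * inc1)
    f2 += cmath.exp(1j * u2 * inc2)
joint, f1, f2 = joint / n, f1 / n, f2 / n
print(abs(joint - f1 * f2))                        # close to 0 (independence)

def phi(t, u):                                     # exact phi_t(u) of a Wiener process
    return math.exp(-t * u * u / 2)

print(abs(phi(t1 + t2, u1) - phi(t1, u1) * phi(t2, u1)))  # 0 up to rounding
```

The first printed number is small only up to Monte Carlo error; the second identity is exact.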
As |ϕ_t(u)| ≤ 1 for all u and |ϕ₀(u)| = 1, from Cauchy's functional equation

|ϕ_t(u)| = exp(t · c(u)).

This implies that ϕ_t(u) is never zero. Let h > 0.

|ϕ_t(u) − ϕ_{t+h}(u)| = |ϕ_t(u)| · |1 − ϕ_{t+h}(u)/ϕ_t(u)| ≤ |1 − ϕ_h(u)|.

X is right-continuous, so if h ↘ 0 then by the Dominated Convergence Theorem, using that X(0) = 0,

lim_{h↘0} ϕ_h(u) = ϕ₀(u) = 1.

So ϕ_t(u) is right-continuous. If t > 0 then

|ϕ_t(u) − ϕ_{t−h}(u)| = |ϕ_{t−h}(u)| · |1 − ϕ_t(u)/ϕ_{t−h}(u)| ≤ |1 − ϕ_h(u)| → 0,

so ϕ_t(u) is also left-continuous. Hence ϕ_t(u) is continuous in t. Therefore

E(exp(iuΔX(t))) = lim_{h↘0} E(exp(iu(X(t) − X(t − h)))) = lim_{h↘0} ϕ_t(u)/ϕ_{t−h}(u) = 1,

so ΔX(t) = 0 almost surely: for some subsequence X(t_{n_k}) → X(t) almost surely, which implies that X(t−) = X(t) almost surely. Therefore one can make the next important observation:

Proposition 1.98 If X is a Lévy process then ϕ_t(u) ≠ 0 for every u, and the probability of a jump at time t is zero for every t. This implies that every Lévy process is continuous in probability.

We shall need the following generalization:

Proposition 1.99 If X is a process with independent increments and X is continuous in probability, then ϕ_t(u) := ϕ(u, t) := E(exp(iuX(t))) is never zero.
Proof. Let us fix the parameter u. As X is continuous in probability, ϕ(u, t) is continuous in t. Let

t₀(u) := inf{t : ϕ(u, t) = 0}.

One should prove that t₀(u) = ∞. By definition X(0) = 0, therefore ϕ(u, 0) = 1, and as ϕ(u, t) is continuous in t, obviously t₀(u) > 0. Let

h(u, s, t) := E(exp(iu(X(t) − X(s)))).

X has independent increments, so if s < t then

ϕ(u, t) = ϕ(u, s) h(u, s, t).   (1.41)

Assume that t₀(u) < ∞. By the continuity of ϕ in t, ϕ(u, t₀(u)) = 0. As X(t) has limits from the left, ϕ(u, t₀(u)−) is well-defined. We show that it is not zero. By (1.41), if s < t₀(u) < ∞ then

ϕ(u, t₀(u)−) = ϕ(u, s) h(u, s, t₀(u)−).

ϕ(u, s) ≠ 0 by the definition of t₀(u), so if ϕ(u, t₀(u)−) = 0 then h(u, s, t₀(u)−) = 0 for every s < t₀(u). But

0 = lim_{s↗t₀(u)} h(u, s, t₀(u)−) = lim_{s↗t₀(u)} E(exp(iuX(t₀(u)−) − iuX(s))) = E(exp(0)) = 1,

which is impossible. Therefore ϕ(u, t₀(u)−) ≠ 0 = ϕ(u, t₀(u)), which is impossible since ϕ is continuous in t. Hence t₀(u) = ∞.

Let us recall the following simple observation:

Proposition 1.100 Let ψ be a complex-valued, continuous curve defined on R. If ψ(t) ≠ 0 for every t, then it has a logarithm, that is, there is a continuous curve φ with the property that ψ = exp(φ). If φ₁(t₀) = φ₂(t₀) for some point t₀ and ψ = exp(φ₁) = exp(φ₂) for some continuous curves φ₁ and φ₂, then φ₁ = φ₂.
Proof. The proposition and its proof are quite well-known, so we just sketch it:

1. ψ ≠ 0, so if ψ = exp(φ₁) = exp(φ₂) then

1 = ψ/ψ = exp(φ₁)/exp(φ₂) = exp(φ₁ − φ₂).

Hence for all t, φ₁(t) = φ₂(t) + 2πi n(t), where n(t) is a continuous integer-valued function. As n(t₀) = 0, obviously n ≡ 0, so φ₁ = φ₂.

2. The complex series

ln(1 + z) = Σ_{n=1}^∞ (−1)^{n+1} zⁿ/n

is convergent if |z| < 1. On the real line

exp(ln(1 + z)) = 1 + z.   (1.42)

As ln(1 + z) is analytic, (1.42) holds for every |z| < 1. To simplify the notation as much as possible, let us assume that t₀ = 0 and ψ(t₀) = 1, and let us look for a curve with φ(t₀) = 0. From (1.42) there is an r > 0 such that φ(t) := ln(ψ(t)) is well-defined for |t| < r.

3. Let a be the infimum and b the supremum of the endpoints of the closed intervals on which one can define such a φ. If a_n ↘ a, b_n ↗ b and φ is defined on [a_n, b_n], then by the first point of the proof φ(t) is well-defined on (a, b). Assume that b < ∞. As ψ(b) ≠ 0 we can consider the curve θ(t) := ψ(b + t)/ψ(b). Applying the part of the proposition just proved, for some r > 0

ψ(t)/ψ(b) = exp(θ(t)),   |b − t| < r,

with θ(b) = 0. Let t ∈ (b − r, b). As the range of the complex exponential function is C\{0}, there is a z ∈ C with ψ(b) = exp(z). Then

exp(φ(t)) = ψ(t) = ψ(b) exp(θ(t)) = exp(z + θ(t)).

Hence φ(t) = z + θ(t) + 2nπi for some integer n. With z + θ(t) + 2nπi one can easily continue φ to (a, b + r). This contradiction shows that one can define φ on the whole of R.
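The construction behind Proposition 1.100 is easy to demonstrate numerically: on a grid, a continuous logarithm of a non-vanishing curve is log|ψ| plus an "unwrapped" argument, where the 2π jumps of the principal argument are removed. The curve below is an arbitrary illustrative choice, not from the text.

```python
import cmath
import math

# Continuous logarithm of a sampled non-vanishing complex curve psi:
# phi = log|psi| + i * (unwrapped argument). Whenever the principal
# argument jumps by ~2*pi between neighbouring samples, the jump is
# compensated so that phi stays continuous.
def continuous_log(values):
    phi, prev_arg, offset = [], cmath.phase(values[0]), 0.0
    for z in values:
        arg = cmath.phase(z)
        if arg - prev_arg > math.pi:       # wrapped from -pi side to +pi side
            offset -= 2 * math.pi
        elif arg - prev_arg < -math.pi:    # wrapped from +pi side to -pi side
            offset += 2 * math.pi
        phi.append(complex(math.log(abs(z)), arg + offset))
        prev_arg = arg
    return phi

ts = [k * 1e-3 for k in range(10_000)]
psi = [cmath.exp(3j * t) * (2 + math.sin(t)) for t in ts]   # never zero
phi = continuous_log(psi)

print(max(abs(cmath.exp(p) - z) for p, z in zip(phi, psi)))  # ~ 0: exp(phi) = psi
print(max(abs(q - p) for p, q in zip(phi, phi[1:])))         # small: phi is continuous
```

Since the compensating offset is always a multiple of 2πi, exp(φ) reproduces ψ exactly, while the unwrapping keeps the increments of φ of the same order as the grid spacing.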
ϕ₁(u) := E(exp(iuX(1))) is non-zero and by the Dominated Convergence Theorem it is obviously continuous in u. By the observation just proved,

ϕ₁(u) = exp(log ϕ₁(u)) := exp(φ(u)),

where by definition φ(0) = 0. From this, by (1.40), ϕ_n(u) = exp(nφ(u)) and ϕ_{1/n}(u) = exp(n⁻¹φ(u)) for every n ∈ N. Hence if r is a rational number then ϕ_r(u) = exp(rφ(u)). By the just proved continuity in t,

ϕ_t(u) = exp(tφ(u)),   t ∈ R₊.   (1.43)
Lévy processes are not martingales, but we can use martingale theory to investigate their properties. The key tool is the so-called exponential martingale of X. Let us define the process

Z_t(u, ω) := Z(t, u, ω) := exp(iuX(t, ω)) / ϕ_t(u).   (1.44)

ϕ_t(u) is continuous in t for every fixed u, and therefore Z_t(u, ω) is a right-regular stochastic process. Let t > s.

E(Z_t(u) | F_s) := E( exp(iuX(t))/ϕ_t(u) | F_s ) =
= E( exp(iu(X(t) − X(s)))/ϕ_{t−s}(u) · exp(iuX(s))/ϕ_s(u) | F_s ) =
= exp(iuX(s))/ϕ_s(u) · E(exp(iu(X(t) − X(s))))/ϕ_{t−s}(u) =
= Z_s(u) · E(exp(iuX(t − s)))/ϕ_{t−s}(u) =
= Z_s(u) · 1 = Z_s(u),

therefore Z_t(u) is a martingale in t for any fixed u.

Definition 1.101 Z_t(u) is called the exponential martingale of X.

Example 1.102 The exponential martingale of a Wiener process.

If w is a Wiener process then

Z_t(u, ω) = exp(iuw(t))/exp(−tu²/2) = exp(iuw(t) + u²t/2).
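As a quick numerical sanity check, not part of the text's argument: for a Wiener process E(Z_t(u)) should equal 1 for every t, since the exponential martingale starts from Z₀(u) = 1. The parameter values and sample size below are arbitrary illustrative choices.

```python
import cmath
import math
import random

# Monte Carlo check that the exponential martingale of a Wiener process
# has constant expectation 1: E(exp(i*u*w(t))) / exp(-t*u^2/2) ~ 1.
random.seed(0)
n, u = 200_000, 1.3
for t in (0.5, 1.0, 2.0):
    acc = 0j
    for _ in range(n):
        w_t = random.gauss(0.0, math.sqrt(t))      # w(t) ~ N(0, t)
        acc += cmath.exp(1j * u * w_t)
    z_mean = (acc / n) / math.exp(-t * u * u / 2)
    print(t, abs(z_mean - 1.0))                    # small for every t
```

Note that the normalization 1/ϕ_t(u) = exp(tu²/2) grows in t, so the Monte Carlo error is amplified for large t; the check is only meaningful on moderate horizons.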
If instead of the Fourier transform we normalize with the Laplace transform, then⁵⁶

exp(uw(t))/exp(tu²/2) = exp(uw(t) − u²t/2).

Let X be a Lévy process and assume that the filtration is generated by X. Denote this filtration by F^X. Obviously F^X does not necessarily contain the measure-zero sets⁵⁷, so F^X does not satisfy the usual conditions. Let N denote the collection of measure-zero sets and let us introduce the so-called augmented filtration:

F_t := σ(σ(X(s) : s ≤ t) ∪ N).   (1.45)

It is a bit surprising, but for every Lévy process the augmented filtration satisfies the usual conditions. That is, for Lévy processes the augmented filtration F is always right-continuous⁵⁸:

Proposition 1.103 If X is a Lévy process then (1.45) is right-continuous, that is, F_t = F_{t+}.

Proof. Let us take the exponential martingale of X. If t < w < s then

Z_w(u) = E(Z_s(u) | F_w) := E( exp(iuX(s))/ϕ_s(u) | F_w ),

therefore

exp(iuX(w)) ϕ_s(u)/ϕ_w(u) = Z_w(u) ϕ_s(u) = E(exp(iuX(s)) | F_w).

If w ↘ t then from the continuity of ϕ_t and from the right-continuity of X, with Lévy's theorem⁵⁹,

exp(iuX(t)) ϕ_s(u)/ϕ_t(u) = E(exp(iuX(s)) | F_{t+})   a.s.

As exp(iuX(t)) is F_t-measurable, and Z_t(u) is a martingale,

exp(iuX(t)) ϕ_s(u)/ϕ_t(u) = E(exp(iuX(s)) | F_t)   a.s.

⁵⁶ See: Example 1.118, page 82.
⁵⁷ See: Example 1.13, page 9.
⁵⁸ See: Example 1.13, page 9.
⁵⁹ See: Theorem 1.75, page 46.
Therefore

E(exp(iuX(s)) | F_t) = E(exp(iuX(s)) | F_{t+})   a.s.   (1.46)

This equality can be extended to multidimensional trigonometric polynomials. For example, if t < w ≤ s₁ ≤ s₂ and η := u₁X(s₁) + u₂X(s₂), then, as X(s₂) − X(s₁) is independent of F_{s₁}:

E(exp(iη) | F_w) = E(exp(iu₁X(s₁)) · exp(iu₂X(s₂)) | F_w) =
= E(exp(i(u₁ + u₂)X(s₁)) · E(exp(iu₂(X(s₂) − X(s₁))) | F_{s₁}) | F_w) =
= E(exp(i(u₁ + u₂)X(s₁)) · E(exp(iu₂(X(s₂) − X(s₁)))) | F_w) =
= E(exp(i(u₁ + u₂)X(s₁)) · ϕ_{s₂−s₁}(u₂) | F_w) =
= ϕ_{s₂−s₁}(u₂) · E(exp(i(u₁ + u₂)X(s₁)) | F_w) =
= ϕ_{s₂−s₁}(u₂) · ϕ_{s₁}(u₁ + u₂) · Z_w(u₁ + u₂).

If w ↘ t then by the right-continuity of Z and by Lévy's theorem⁶⁰,

E(exp(iη) | F_{t+}) = ϕ_{s₂−s₁}(u₂) · ϕ_{s₁}(u₁ + u₂) · Z_t(u₁ + u₂)   a.s.

On the other hand, with the same calculation, if w = t,

E(exp(iη) | F_t) = ϕ_{s₂−s₁}(u₂) · ϕ_{s₁}(u₁ + u₂) · Z_t(u₁ + u₂)   a.s.

Therefore E(exp(iη) | F_t) = E(exp(iη) | F_{t+}) a.s. That is, if s_k > t then

E( exp(i Σ_k u_k X(s_k)) | F_{t+} ) = E( exp(i Σ_k u_k X(s_k)) | F_t )   a.s.   (1.47)

If s_k ≤ t then equation (1.47) trivially holds. Hence if L is the set of bounded functions f for which

E(f(X(s₁), …, X(s_n)) | F_{t+}) = E(f(X(s₁), …, X(s_n)) | F_t)   a.s.,

then L contains the π-system of the trigonometric polynomials. L is trivially a λ-system; therefore, by the Monotone Class Theorem, L contains the characteristic functions of the sets of the σ-algebra generated by the trigonometric polynomials.

⁶⁰ See: Theorem 1.75, page 46.
That is, if B ∈ B(Rⁿ) then one can write in place of f the characteristic function χ_B. The collection Z of sets A for which

E(χ_A | F_{t+}) = E(χ_A | F_t)   a.s.

is also a λ-system, which contains the sets of the π-system

∪_n σ((X(s_k))_{k=1}^n, s_k ≥ 0).

Again, by the Monotone Class Theorem, Z contains the σ-algebra

F⁰_∞ := σ(X(s) : s ≥ 0).

If A ∈ F_{t+} := ∩_n F_{t+1/n} then A ∈ F_∞ := σ(F⁰_∞ ∪ N). Therefore there is an Ã ∈ F⁰_∞ with χ_Ã = χ_A a.s. As Ã ∈ F⁰_∞ ⊆ Z,

χ_A = E(χ_A | F_{t+}) = E(χ_Ã | F_{t+}) = E(χ_Ã | F_t)   a.s.

Hence, up to a measure-zero set, χ_A is equal to the F_t-measurable function E(χ_Ã | F_t). As F_t contains all the measure-zero sets, χ_A is F_t-measurable, that is, A ∈ F_t.

In a similar way one can prove the next proposition:

Proposition 1.104 If X is a process with independent increments and X is continuous in probability, then (1.45) is right-continuous, that is, F_t = F_{t+}.

Example 1.105 One cannot drop the condition of independent increments.

If ζ ≅ N(0, 1) and X(t, ω) := tζ(ω), then the trajectories of X are continuous and X has stationary increments. If F is the augmented filtration, then F₀ = σ(N), and if t > 0, then F_t = σ(σ(X), N); hence F is not right-continuous.

Example 1.106 The augmentation is important: if w is a Wiener process then F_t^w := σ(w(s) : s ≤ t) is not necessarily right-continuous⁶¹.

From now on we shall assume that the filtration of every Lévy process satisfies the usual assumptions.

⁶¹ See: Example 1.13, page 9.
Proposition 1.107 If the process X is left-continuous then the filtration F_t^X := σ(X(s) : s ≤ t) is left-continuous. This remains true for the augmented filtration.

Proof. Let F_{t−}^X := σ(∪_{s<t} F_s^X). As X is left-continuous, X(t) = lim_{s↗t} X(s) is F_{t−}^X-measurable, hence F_t^X = F_{t−}^X.

Corollary 1.108 The augmented filtration of a Wiener process is continuous.

A crucial property of Lévy processes is the so-called strong Markov property:

Proposition 1.109 (Strong Markov property for Lévy processes) Let τ < ∞ be a stopping time and let X be a Lévy process.

1. If

X*(t, ω) := X(τ(ω) + t, ω) − X(τ(ω), ω),   t ≥ 0,

then X* is a Lévy process with respect to the filtration⁶² F_t* := F_{τ+t}.
2. The distributions of X and X* are the same⁶³.
3. In particular the set {X*(t) : t ≥ 0} is independent of the stopped σ-algebra F₀* = F_τ.

Proof. Since X is right-regular, X is progressively measurable, so X* is obviously F_t*-adapted.

1. For every u, (1.44) is a martingale. If τ is a bounded stopping time then by the Optional Sampling Theorem

E(Z(τ + t) | F_τ) = Z(τ).

Z(τ) is F_τ-measurable, and as Z ≠ 0 and τ is bounded, Z(τ)⁻¹ is bounded, so

E( Z_{τ+t}/Z_τ | F_τ ) = 1.   (1.48)

⁶² Observe that if t ≥ 0 then σ := τ + t is a stopping time. Obviously F_t* := F_{τ+t} := F_σ is the stopped σ-algebra.
⁶³ That is, the infinite-dimensional distributions of {X(t) : t ≥ 0} and {X*(t) : t ≥ 0} are equal.
By (1.40), for every ω

ϕ_{τ(ω)}(u)/ϕ_{τ(ω)+t}(u) = ϕ_{τ(ω)}(u)/(ϕ_{τ(ω)}(u) ϕ_t(u)) = 1/ϕ_t(u).

So for every A ∈ F_τ, using (1.48), by the definition of the conditional expectation

ϕ_t(u) P(A) = ϕ_t(u) ∫_A 1 dP = ϕ_t(u) ∫_A Z_{τ+t}/Z_τ dP =
= ϕ_t(u) ∫_A exp(iu(X(τ + t) − X(τ)))/ϕ_t(u) dP =
= ∫_A exp(iuX*(t)) dP.   (1.49)

If τ is arbitrary and τ_n := n ∧ τ then, as τ < ∞, for every outcome

X(τ_n + t) − X(τ_n) → X(τ + t) − X(τ) := X*(t).   (1.50)

If A ∈ F_τ then for every t ≥ 0

A ∩ {τ ≤ n} ∩ {τ_n ≤ t} ∈ F_t,   (1.51)

since if t ≤ n then {τ_n ≤ t} = {τ ≤ t} ⊆ {τ ≤ n}, and if t > n then {τ_n ≤ t} = {τ ≤ n}. From (1.51), by the definition of the stopped σ-algebra,

A_n := A ∩ {τ ≤ n} ∈ F_{τ_n}.

As τ_n is bounded, by (1.49)

∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP = P(A_n) ϕ_t(u).   (1.52)
From (1.50) and by the Dominated Convergence Theorem

∫_A exp(iuX*(t)) dP = ∫_A lim_{n→∞} χ(τ ≤ n) exp(iu(X(τ_n + t) − X(τ_n))) dP =
= lim_{n→∞} ∫_A χ(τ ≤ n) exp(iu(X(τ_n + t) − X(τ_n))) dP =
= lim_{n→∞} ∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP =
= lim_{n→∞} P(A_n) ϕ_t(u) = P(A) ϕ_t(u) = P(A) ∫_Ω exp(iuX(t)) dP.
2. If A := Ω then the equation above means that the Fourier transform of X*(t) is ϕ_t. That is, the distributions of X*(t) and X(t) are the same. Let L be the set of bounded functions f for which, for all A ∈ F_τ,

∫_A f(X*(t)) dP = P(A) ∫_Ω f(X*(t)) dP.

Obviously L is a λ-system, and L contains the π-system of the trigonometric polynomials

x ↦ exp(iux),   u ∈ R.

By the Monotone Class Theorem, L contains the functions f := χ_B with B ∈ B(R). Therefore for every A ∈ F_τ and B ∈ B(R)

∫_A χ_B(X*(t)) dP = P(A ∩ {X*(t) ∈ B}) = P(A) ∫_Ω χ_B(X*(t)) dP = P(A) · P(X*(t) ∈ B).

So X*(t) is independent of F_τ.

3. One should prove that X* has stationary and independent increments. If σ := τ + t and

X**(h) := X(σ + h) − X(σ),
then, using the part of the proposition already proved for the stopping time σ,

X*(t + h) − X*(t) := (X(τ + t + h) − X(τ)) − (X(τ + t) − X(τ)) =
= X(σ + h) − X(σ) = X**(h) ≅ X(h),

which is independent of t, and therefore X* has stationary increments. Also, by the already proved part of the proposition, X*(t + h) − X*(t) = X**(h) is independent of F_σ := F_t*. Obviously X*(0) = 0 and X* is right-regular, therefore X* is a process with independent increments.

4. Now we prove that X and X* have the same distribution. Let 0 = t₀ < t₁ < … < t_n be arbitrary. As we proved,

X*(t_k) − X*(t_{k−1}) ≅ X*(t_k − t_{k−1}) ≅ X(t_k − t_{k−1}) ≅ X(t_k) − X(t_{k−1}).

As the increments are independent, (X*(t_k) − X*(t_{k−1}))_{k=1}^n has the same distribution as (X(t_k) − X(t_{k−1}))_{k=1}^n. This implies that (X(t_k))_{k=1}^n has the same distribution as (X*(t_k))_{k=1}^n, which, by the Monotone Class Theorem, implies that X* and X have the same distribution.

5. As we proved, X* is a process with independent increments, so F_t* is independent of the σ-algebra G_t* generated by the increments⁶⁴

X*(u) − X*(v),   u ≥ v ≥ t.

So, as a special case, the set {X*(t) : t ≥ 0} is independent of F₀* = F_τ.

Example 1.110 Random times which are not stopping times.

Let a > 0 and let w be a Wiener process.

1. Let

γ_a := sup{0 ≤ s ≤ a : w(s) = 0} = inf{s ≥ 0 : w(a − s) = 0}.

⁶⁴ See: Proposition 1.97, page 61.
Obviously γ_a is F_a-measurable, so it is a random time. As P(w(a) = 0) = 0, almost surely γ_a < a. Assume that γ_a is a stopping time. In this case, by the strong Markov property, w*(t) := w(t + γ_a) − w(γ_a) is also a Wiener process. It is easy to see that if w* is a Wiener process then ŵ(t) := tw*(1/t) is also a Wiener process⁶⁵. As every one-dimensional Wiener process almost surely returns to the origin⁶⁶, with the strong Markov property it is easy to prove that ŵ returns to the origin almost surely after any time t. This means that there is a sequence t_n ↘ 0 with t_n > 0 such that almost surely w*(t_n) = 0. But this is impossible, as almost surely w* does not have a zero on the interval (0, a − γ_a].

2. Let

β_a := max{w(s) : 0 ≤ s ≤ a},   ρ_a := inf{0 ≤ s ≤ a : w(s) = β_a}.

We show that ρ_a is not a stopping time. As P(w(a) − w(a/2) < 0) = 1/2,

P(ρ_a < a) > 0.

If ρ_a were a stopping time, then by the strong Markov property w*(t) := w(t + ρ_a) − w(ρ_a) would be a Wiener process. But this is impossible, as with positive probability the interval (0, a − ρ_a] is not empty, and on this interval w* cannot have a positive value.

An important consequence of the strong Markov property is the following:

Proposition 1.111 If the sizes of the jumps of a Lévy process X are smaller than a constant c > 0, that is, |ΔX| ≤ c, then on any interval [0, t] the moments of X are uniformly bounded. That is, for each m there is a constant K(m, t) such that

E(|X^m(s)|) ≤ K(m, t),   s ∈ [0, t].

Proof. One may assume that the stopping time⁶⁷

τ₁ := inf{t : |X(t)| > c}

⁶⁵ See: Corollary B.10, page 566.
⁶⁶ See: Corollary B.8, page 565.
⁶⁷ Recall that F satisfies the usual assumptions. See: Example 1.32, page 17.
is finite, as by the zero-one law the set of outcomes ω where τ₁(ω) = ∞ has probability 0 or 1. If with probability one τ₁(ω) = ∞, then X is uniformly bounded, hence in this case the proposition holds. Then define the stopping time

τ₂ := inf{t : |X*(t)| > c} + τ₁ := inf{t : |X(t + τ₁) − X(τ₁)| > c} + τ₁.

In a similar way let us define τ₃, etc. By the strong Markov property the variables {X*(t) : t ≥ 0} are independent of the σ-algebra F_{τ₁}. The variable

τ₂ − τ₁ := inf{t ≥ 0 : |X*(t)| > c}

is measurable with respect to the σ-algebra generated by the variables {X*(t) : t ≥ 0}, hence τ₂ − τ₁ is independent of F_{τ₁}. In general τ_n − τ_{n−1} is independent of F_{τ_{n−1}}. Also, by the strong Markov property, for all n the distribution of τ_n − τ_{n−1} is the same as the distribution of τ₁. Therefore if τ₀ := 0, then, using the independence of the variables (τ_k − τ_{k−1}),

E(exp(−τ_n)) = E( exp( −Σ_{k=1}^n (τ_k − τ_{k−1}) ) ) = (E(exp(−τ₁)))ⁿ := qⁿ,

where 0 < q ≤ 1. If q = 1 then almost surely τ₁ = 0, which by the right-continuity implies that |X(0)| ≥ c > 0, which, by the definition of Lévy processes, is not the case; so q < 1. As the jumps are smaller than c,

|X(τ₁)| ≤ |X(τ₁−)| + |ΔX(τ₁)| ≤ |X(τ₁−)| + c ≤ 2c.

In the same way it is easy to see that in general

sup_t |X^{τ_n}(t)| = sup{|X(t)| : t ∈ [0, τ_n]} ≤ 2nc.

Therefore by Markov's inequality

P(|X(t)| > 2nc) ≤ P(τ_n < t) = P(exp(−τ_n) > exp(−t)) ≤ E(exp(−τ_n))/exp(−t) ≤ exp(t) qⁿ.

As q < 1,

L(m) := Σ_{n=0}^∞ [2(n + 1)c]^m qⁿ < ∞,
so

E(|X(t)|^m) ≤ Σ_{n=0}^∞ [2(n + 1)c]^m · P(|X(t)| > 2nc) ≤ exp(t) Σ_{n=0}^∞ [2(n + 1)c]^m qⁿ := exp(t) L(m),

from which the proposition is evident.

One can generalize these observations.

Proposition 1.112 (Strong Markov property for processes with independent increments) Let X be a process with independent increments and assume that X is continuous in probability. Let D([0, ∞)) denote the space of right-regular functions over [0, ∞) and let H be the σ-algebra over D([0, ∞)) generated by the coordinate functionals. If f is a non-negative H-measurable functional⁶⁸ over D([0, ∞)), then for every stopping time τ < ∞

E(f(X*) | F_τ) = E(f(X_s*))|_{s=τ},

where X_s*(t) := X(s + t) − X(s).

Proof. Let ϕ(u, t) be the Fourier transform of X(t). As X is continuous in probability, ϕ(u, t) ≠ 0 and

Z(u, t) := exp(iuX(t))/ϕ(u, t)

is a martingale⁶⁹. Let τ be a bounded stopping time. By the Optional Sampling Theorem

E(Z(u, τ + s) | F_τ) = Z(u, τ).

ϕ(u, τ + t) is F_τ-measurable. Therefore

E(exp(iuX*(t)) | F_τ) := E(exp(iu(X(τ + t) − X(τ))) | F_τ) =
= ϕ(u, τ + t)/ϕ(u, τ) = [ϕ(u, s + t)/ϕ(u, s)]|_{s=τ} =   (1.53)

⁶⁸ It is easy to see that f(X) = g(X(t₁), X(t₂), …), where g is an R^∞ → R Borel measurable function and (t_k) is a countable sequence in R₊. The canonical example is f(X) := sup_{s≤t} |X(s)|.
⁶⁹ See: Proposition 1.99, page 63.
= [ϕ(u, s) E(exp(iu(X(t + s) − X(s))))/ϕ(u, s)]|_{s=τ} = E(exp(iuX_s*(t)))|_{s=τ}.
If τ is not bounded then τ_n := τ ∧ n is a bounded stopping time. Let

h(s) := E(exp(iu(X(s + t) − X(s)))).

As τ < ∞,

X(τ_n + t) − X(τ_n) → X(τ + t) − X(τ),

so by the Dominated Convergence Theorem h(τ_n) → h(τ). If A ∈ F_τ then A ∩ {τ ≤ n} ∈ F_{τ_n}, therefore

∫_A χ(τ ≤ n) exp(iu(X(τ_n + t) − X(τ_n))) dP = ∫_A χ(τ ≤ n) h(τ_n) dP.

By the Dominated Convergence Theorem one can take the limit n → ∞. Hence in (1.53) we can drop the condition that τ is bounded. With the Monotone Class Theorem one can prove that for any Borel measurable set B

E(χ_B(X*(t)) | F_τ) = E(χ_B(X_s*(t)))|_{s=τ}.

In the usual way, using multidimensional trigonometric polynomials and the Monotone Class Theorem several times, one can extend the relation to every H-measurable and bounded function. Finally one can prove the proposition with the Monotone Convergence Theorem.

Corollary 1.113 Under the same conditions as above,

E(f(X*) | τ = s) = E(f(X_s*)).

Let us remark that if X is a Lévy process then the distribution of X_s* is the same as the distribution of X for every s, so

E(f(X*) | F_τ) = E(f(X))
for every τ < ∞. If f(X) := exp(i Σ_{k=1}^n u_k X(t_k)), then

E( exp( i Σ_{k=1}^n u_k X*(t_k) ) | F_τ ) = E( exp( i Σ_{k=1}^n u_k X(t_k) ) ).

The right-hand side is deterministic, which implies that (X*(t₁), X*(t₂), …, X*(t_n)) is independent of F_τ and has the same distribution as (X(t₁), X(t₂), …, X(t_n)).

Proposition 1.114 If X is a process with independent increments, X is continuous in probability, and the jumps of X are bounded by some constant c, then all the moments of X are uniformly bounded on any finite interval; that is, for every t

E(|X^m(s)|) ≤ K(m, t) < ∞,   s ∈ [0, t].
Proof. Let us fix a t. X has right-regular trajectories, so on any finite interval the trajectories are bounded. Therefore sup_{s≤2t} |X(s)| < ∞. Hence if b is sufficiently large then

P( sup_{s≤2t} |X(s)| > b/2 ) < q < 1.

Let τ := inf{s : |X(s)| > a} ∧ 2t. By the definition of τ,

{τ < t} ⊆ { sup_{s≤t} |X(s)| > a } ⊆ {τ ≤ t}.

If for some ω

ω ∈ { sup_{s≤t} |X(s)| > a } \ {τ < t},

then sup_{s<t} |X(s, ω)| ≤ a but |X(t, ω)| > a, so the process X has a jump at (t, ω), which by the stochastic continuity of X has probability zero. As the size of the jumps is bounded,
by the right-continuity

sup_{s≤τ} |X(s)| ≤ sup_{s≤τ} |X(s−)| + sup_{s≤τ} |ΔX(s)| ≤ a + c.

We show that this implies that

{ sup_{s≤t} |X(s)| > a + b + c } ⊆ { sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b }.

If sup_{s≤t} |X(s)| > a + b + c, then obviously sup_{s≤t} |X(s)| > a, hence τ ≤ t; so if sup_{s≤t} |X(τ + s) − X(τ)| ≤ b, then

sup_{s≤t} |X(s)| ≤ sup_{s≤τ} |X(s)| + sup_{s≤t} |X(τ + s) − X(τ)| ≤ a + b + c,

which is impossible. If u ≤ t, then

sup_{s≤t} |X(u + s) − X(u)| ≤ 2 sup_{s≤2t} |X(s)|.

Therefore if u ≤ t, then

{ sup_{s≤t} |X(u + s) − X(u)| > b } ⊆ { sup_{s≤2t} |X(s)| > b/2 }.

Let F be the distribution function of τ. By the just proved strong Markov property
P( sup_{s≤t} |X(s)| > a + b + c ) ≤
≤ P( sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b ) =
= P( τ < t, sup_{s≤t} |X(τ + s) − X(τ)| > b ) =
= ∫_{[0,t)} P( sup_{s≤t} |X(τ + s) − X(τ)| > b | τ = u ) dF(u) =
= ∫_{[0,t)} P( sup_{s≤t} |X(u + s) − X(u)| > b ) dF(u) ≤
≤ P( sup_{s≤2t} |X(s)| > b/2 ) · P(τ < t) =
= q · P(τ < t) ≤ q · P( sup_{s≤t} |X(s)| > a ).

From this, for an arbitrary n,

P( sup_{s≤t} |X(s)| > n(b + c) ) ≤ qⁿ.

Hence

E(|X(t)|^m) ≤ E( sup_{s≤t} |X(s)|^m ) ≤ Σ_{n=1}^∞ (n(b + c))^m q^{n−1} < ∞.
We shall return to Lévy processes in Section 7.1. If the reader is interested only in Lévy processes then they can continue the reading there.

1.3.8
Application: the first passage times of the Wiener processes
In this subsection we present some applications of the Optional Sampling Theorem. Let w be a Wiener process. We shall discuss some properties of the first passage times

τ_a := inf{t : w(t) = a}.   (1.54)

The set {a} is closed and w is continuous, hence τ_a is a stopping time⁷⁰. Recall that⁷¹ almost surely

limsup_{t→∞} w(t) = ∞,   liminf_{t→∞} w(t) = −∞.   (1.55)

Therefore, as w is continuous, τ_a is almost surely finite.

Example 1.115 The martingale convergence theorem does not hold in L¹(Ω).

Let w be a Wiener process and let X := w + 1. Let τ be the first passage time of zero for X, that is, let

τ := inf{t : X(t) = 0} = τ₋₁ := inf{t : w(t) = −1}.

⁷⁰ See: Example 1.32, page 17.
⁷¹ See: Proposition B.7, page 564.
As X is a martingale, X^τ is a non-negative martingale. By the martingale convergence theorem for non-negative martingales⁷², if t ↗ ∞ then X^τ(t) is almost surely convergent. As we remarked, τ is almost surely finite, therefore obviously X^τ(∞) = 0. By the Optional Sampling Theorem

‖X^τ(t)‖₁ = ‖X(τ ∧ t)‖₁ = E(X(τ ∧ t)) = E(X(0)) = 1

for any t. Hence the convergence does not hold in L¹(Ω).

Example 1.116 If a < 0 < b and τ_a and τ_b are the respective first passage times of some Wiener process w, then

P(τ_a < τ_b) = b/(b − a),   P(τ_b < τ_a) = −a/(b − a).

By (1.55), with probability one the trajectories of w are unbounded. Therefore, as w starts from the origin, the trajectories of w finally leave the interval [a, b]. So

P(τ_a < τ_b) + P(τ_b < τ_a) = 1.

If τ := τ_a ∧ τ_b then w^τ is a bounded martingale, hence one can use the Optional Sampling Theorem. Obviously w^τ(τ) = w(τ) is either a or b, hence

E(w(τ)) = aP(τ_a < τ_b) + bP(τ_b < τ_a) = E(w^τ(0)) = 0.

We have two equations with two unknowns. Solving this system of linear equations one can easily deduce the formulas above.

Example 1.117 Let a < 0 < b and let τ_a and τ_b be the respective first passage times of some Wiener process w. If τ := τ_a ∧ τ_b, then E(τ) = |ab|.
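Both exit formulas (and the expected exit time of Example 1.117) can be checked by simulation. A symmetric simple random walk satisfies the same exit identities exactly, so the sketch below uses one; the barriers a = −2, b = 3 and the run count are arbitrary illustrative choices, not from the text.

```python
import random

# Monte Carlo check of the exit identities on a symmetric simple random
# walk started from 0 with absorbing barriers a < 0 < b:
#   P(hit a before b) = b/(b - a)   and   E(exit time) = |a*b|.
random.seed(42)
a, b, n_runs = -2, 3, 20_000
hits_a = total_steps = 0
for _ in range(n_runs):
    pos = steps = 0
    while a < pos < b:
        pos += 1 if random.random() < 0.5 else -1
        steps += 1
    hits_a += pos == a
    total_steps += steps
print(hits_a / n_runs)       # ~ b/(b - a) = 0.6
print(total_steps / n_runs)  # ~ |a*b| = 6
```

The same optional-sampling argument as in the text works for the walk, with pos and pos² − steps as the two martingales.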
With direct calculation it is easy to see that the process w²(t) − t is a martingale. From this it is easy to show that the process X(t) := (w(t) − a)(b − w(t)) + t is also a martingale. By the Optional Sampling Theorem

|ab| = −ab = E(X(0)) = E(X(τ ∧ n)) = E((w(τ ∧ n) − a)(b − w(τ ∧ n))) + E(τ ∧ n).

⁷² See: Corollary 1.66, page 40.
If n ↗ ∞ then, by the Monotone and the Dominated Convergence Theorems, the limit of the right-hand side is E(τ).

Example 1.118 Let w be a Wiener process. The Laplace transform of the first passage time τ_a is

L(s) := E(exp(−sτ_a)) = exp(−|a|√(2s)),   s ≥ 0.   (1.56)
Let a > 0. For every u the process X(t) := exp(u·w(t) − t·u²/2) is a martingale⁷³, so the truncated process X^{τ_a} is also a martingale. If u ≥ 0 then

0 ≤ X^{τ_a}(t) = exp( u·w(τ_a ∧ t) − (τ_a ∧ t)·u²/2 ) ≤ exp(au),

hence X^{τ_a} is a bounded martingale. Every bounded martingale is uniformly integrable, therefore one can apply the Optional Sampling Theorem. So

E(X^{τ_a}(τ_a)) = E(exp(ua − u²τ_a/2)) = E(X^{τ_a}(0)) = 1.

Hence

E(exp(−u²τ_a/2)) = exp(−ua).

If u := √(2s) ≥ 0 then

L(s) := E(exp(−sτ_a)) = exp(−a√(2s)).

If a < 0 then, repeating the calculations for the Wiener process −w,

L(s) = exp(−|a|√(2s)).
Example 1.119 The Laplace transform of the first passage time τ̃_a of the reflected Wiener process |w| is
$$\widetilde{L}(s) \triangleq \mathbf{E}\left(\exp(-s\tilde\tau_a)\right) = \frac{1}{\cosh\left(a\sqrt{2s}\right)}, \qquad s \ge 0. \tag{1.57}$$

73 See: (1.44), page 66.
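As with (1.56), formula (1.57) can be checked by simulation before the proof; the sketch below (our own construction, names ours) again uses the random-walk approximation. Here the exit time of (−a, a) has finite mean a², so no truncation is needed.

```python
import math, random

def laplace_reflected_tau(a=1.0, s=0.5, dt=0.01, n_paths=10000, seed=3):
    """Estimate E[exp(-s*tau)] for tau = inf{t : |w(t)| = a} using a
    symmetric random walk with steps +/- sqrt(dt)."""
    rng = random.Random(seed)
    target = round(a / math.sqrt(dt))
    total = 0.0
    for _ in range(n_paths):
        pos, n = 0, 0
        while abs(pos) < target:
            pos += 1 if rng.random() < 0.5 else -1
            n += 1
        total += math.exp(-s * n * dt)
    return total / n_paths

est = laplace_reflected_tau()
print(est, 1.0 / math.cosh(1.0))   # exact: 1/cosh(a*sqrt(2s)) ≈ 0.648
```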
MARTINGALES
83
By definition τ̃_a ≜ inf{t : |w(t)| = a}. Let
$$X(t) \triangleq \frac{\exp(uw(t)) + \exp(-uw(t))}{2}\exp\left(-\frac{u^2 t}{2}\right) = \cosh(uw(t))\exp\left(-\frac{u^2 t}{2}\right).$$
X is the sum of two martingales, hence it is a martingale. X^{τ̃_a} ≤ cosh(ua), therefore one can again apply the Optional Sampling Theorem:
$$\mathbf{E}\left(X^{\tilde\tau_a}(\tilde\tau_a)\right) = \mathbf{E}\left(\cosh(ua)\exp\left(-\frac{u^2\tilde\tau_a}{2}\right)\right) = 1,$$
therefore
$$\mathbf{E}\left(\exp\left(-\frac{u^2\tilde\tau_a}{2}\right)\right) = \frac{1}{\cosh(ua)}.$$
If u ≜ √(2s), then
$$\mathbf{E}\left(\exp(-s\tilde\tau_a)\right) = \frac{1}{\cosh\left(a\sqrt{2s}\right)}.$$
Example 1.120 The density function of the distribution of the first passage time τ_a of a Wiener process is
$$f(x) = |a|\left(2\pi x^3\right)^{-1/2}\exp\left(-\frac{a^2}{2x}\right). \tag{1.58}$$
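Before the proof, one can also verify (1.58) numerically: integrating exp(−sx)f(x) with a simple midpoint rule should reproduce exp(−|a|√(2s)). A minimal sketch (our own helper, with the choice a = s = 1):

```python
import math

def density_laplace(a=1.0, s=1.0, x_max=60.0, n=200000):
    """Midpoint-rule integral of exp(-s*x) * f(x) for the claimed
    density f(x) = |a| * (2*pi*x^3)^(-1/2) * exp(-a^2/(2x))."""
    h = x_max / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        f = abs(a) / math.sqrt(2 * math.pi * x ** 3) * math.exp(-a * a / (2 * x))
        total += math.exp(-s * x) * f * h
    return total

est = density_laplace()
print(est, math.exp(-math.sqrt(2.0)))   # exp(-|a|*sqrt(2s)) ≈ 0.2431
```

The truncation at x_max is harmless here because the integrand decays like exp(−sx); note that for s = 0 the same truncation would be visible, since τ_a has a heavy tail.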
By the uniqueness of the Laplace transform it is sufficient to prove that the Laplace transform of (1.58) is exp(−|a|√(2s)). By the definition of the Laplace transform
$$L(s) \triangleq \int_0^\infty \exp(-sx)f(x)\,dx, \qquad s \ge 0.$$
If F denotes the distribution function of (1.58), then
$$F(x) \triangleq \int_0^x f(t)\,dt = 2\int_a^\infty \frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{u^2}{2x}\right)du, \tag{1.59}$$
since if we substitute t ≜ xa²/u², then
$$F(x) = \int_a^\infty \frac{au^3}{a^3\sqrt{2\pi x^3}}\exp\left(-\frac{u^2}{2x}\right)2xa^2u^{-3}\,du = 2\int_a^\infty \frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{u^2}{2x}\right)du.$$
Integrating by parts and using that F(0) = 0, if s > 0 then
$$L(s) = \left[\exp(-sx)F(x)\right]_0^\infty + \int_0^\infty s\exp(-sx)F(x)\,dx = s\int_0^\infty \exp(-sx)F(x)\,dx.$$
By (1.59)
$$L(s) = 2s\int_0^\infty \exp(-sx)\int_a^\infty \frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{u^2}{2x}\right)du\,dx.$$
Fix s and let us take L(s) as a function of a. Let us denote this function by g(a). We show that if a > 0 then g(a) satisfies the differential equation
$$\frac{d^2g}{da^2}(a) = 2s\,g(a). \tag{1.60}$$
The integrand is non-negative, so by Fubini's theorem one can change the order of the integration, so
$$g(a) = 2s\int_a^\infty\int_0^\infty \exp(-sx)\frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{u^2}{2x}\right)dx\,du.$$
As
$$\int_0^\infty \frac{1}{\sqrt{2\pi x}}\exp(-sx)\,dx = \frac{1}{\sqrt{2\pi s}}\,\Gamma\!\left(\frac12\right) < \infty,$$
the function x ↦ exp(−sx)/√(2πx) is integrable and dominates the integrand. Hence the inner integral is a continuous function of u, so one can differentiate the outer integral with respect to its lower limit:
$$g'(a) = -2s\int_0^\infty \exp(-sx)\frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{a^2}{2x}\right)dx.$$
We can differentiate under the integral sign since, on any interval a ∈ (b, c) with 0 < b < c, the function
$$x \mapsto \exp(-sx)\frac{c}{\sqrt{2\pi x^3}}\exp\left(-\frac{b^2}{2x}\right)$$
is integrable and dominates the partial derivatives
$$\frac{\partial}{\partial a}\left(\exp(-sx)\frac{1}{\sqrt{2\pi x}}\exp\left(-\frac{a^2}{2x}\right)\right) = \exp(-sx)\frac{-a}{\sqrt{2\pi x^3}}\exp\left(-\frac{a^2}{2x}\right).$$
Hence
$$g''(a) = 2s\int_0^\infty \exp(-sx)\frac{a}{\sqrt{2\pi x^3}}\exp\left(-\frac{a^2}{2x}\right)dx = 2s\int_0^\infty \exp(-sx)f(x)\,dx = 2s\cdot L(s) \triangleq 2s\cdot g(a).$$
The characteristic polynomial of this second-order linear differential equation is λ² − 2s = 0, which has the roots λ_{1,2} = ±√(2s). So the general solution of equation (1.60) is
$$A\exp\left(a\sqrt{2s}\right) + B\exp\left(-a\sqrt{2s}\right).$$
As L(0) = A + B = 1 and L(∞) = 0,
$$L(s) = \exp\left(-a\sqrt{2s}\right).$$
Example 1.121 The Fourier transform of τ_a is
$$\varphi(t) = \exp\left(-a\sqrt{|t|}\,(1 - i\cdot\operatorname{sgn} t)\right).$$

The formula exp(−a√(2s)) has an analytic extension to the half plane Re(z) > 0. If s > 0 and z ≜ s + it, then
$$z^{1/2} = \exp\left(\frac12\log z\right) = \exp\left(\frac12\ln|z|\right)\exp\left(\frac{i}{2}\arg z\right) = \sqrt[4]{s^2+t^2}\left(\cos\frac{\arctan(t/s)}{2} + i\sin\frac{\arctan(t/s)}{2}\right).$$
The complex Laplace transform is continuous, so
$$\varphi(t) = L(-it) = \lim_{s\searrow 0}\exp\left(-a\sqrt{2}\,\sqrt[4]{s^2+t^2}\left(\cos\frac{\arctan(-t/s)}{2} + i\sin\frac{\arctan(-t/s)}{2}\right)\right) =$$
$$= \exp\left(-a\sqrt{2|t|}\left(\cos\left(-\frac{\pi}{4}\operatorname{sgn} t\right) + i\sin\left(-\frac{\pi}{4}\operatorname{sgn} t\right)\right)\right) = \exp\left(-a\sqrt{|t|}\,(1 - i\cdot\operatorname{sgn} t)\right).$$
Example 1.122 The maximum process of a Wiener process. Let w be a Wiener process, and let us introduce the maximum process
$$S(t) \triangleq \sup_{s\le t} w(s) = \max_{s\le t} w(s).$$
We show that for every a ≥ 0 and t ≥ 0
$$\mathbf{P}(S(t) \ge a) = \mathbf{P}(\tau_a \le t) = 2\cdot\mathbf{P}(w(t)\ge a) = \mathbf{P}(|w(t)| \ge a). \tag{1.61}$$
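Equation (1.61) invites a direct numerical check: on simulated paths the frequency of {S(t) ≥ a} should match twice the frequency of {w(t) ≥ a}. A sketch with our own parameter choices; note that the discretely monitored maximum slightly underestimates the true supremum:

```python
import math, random

def reflection_check(a=0.5, t=1.0, n_steps=1000, n_paths=4000, seed=4):
    """Estimate P(max_{s<=t} w(s) >= a) and 2*P(w(t) >= a) on
    Gaussian-increment paths."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n_steps)
    hit, end = 0, 0
    for _ in range(n_paths):
        w, m = 0.0, 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
            if w > m:
                m = w
        if m >= a:
            hit += 1
        if w >= a:
            end += 1
    return hit / n_paths, 2 * end / n_paths

p_max, p_twice = reflection_check()
print(p_max, p_twice)   # both near 2*(1 - Phi(0.5)) ≈ 0.617
```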
The first and last equalities are trivial. We prove the second one: recall that the density function of the distribution of τ_a is
$$\frac{d}{dt}\mathbf{P}(\tau_a\le t) \triangleq \frac{d}{dt}F(t) \triangleq f(t) = a\frac{1}{\sqrt{2\pi t^3}}\exp\left(-\frac{a^2}{2t}\right).$$
w(t) ≅ N(0, t), so
$$U(t) \triangleq 2\cdot\mathbf{P}(w(t)\ge a) = 2\left(1-\Phi\left(\frac{a}{\sqrt t}\right)\right) = \frac{2}{\sqrt{2\pi}}\int_{a/\sqrt t}^\infty \exp\left(-\frac{u^2}{2}\right)du.$$
Differentiating with respect to t,
$$\frac{d}{dt}U(t) = \frac{a}{\sqrt{2\pi}}\exp\left(-\frac{a^2}{2t}\right)t^{-3/2},$$
hence the derivatives of P(τ_a ≤ t) and 2·P(w(t) ≥ a) with respect to t are the same. The two functions are equal if t = 0, therefore 2·P(w(t) ≥ a) = P(τ_a ≤ t) for every t.
Example 1.123 The density function of S(t) ≜ sup_{s≤t} w(s) is
$$f(x) = \frac{2}{\sqrt{2\pi t}}\exp\left(-\frac{x^2}{2t}\right), \qquad x > 0.$$

By (1.61), P(S(t) ≥ x) = 2(1 − Φ(x/√t)). Differentiating we get the formula.

Example 1.124 If w is a Wiener process then
$$\mathbf{E}\left(\sup_{s\le 1}|w(s)|\right) = \sqrt{\frac{\pi}{2}}, \qquad \mathbf{E}\left(\sup_{s\le 1} w(s)\right) = \sqrt{\frac{2}{\pi}}.$$
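Both expectations in Example 1.124 can be estimated from discretized paths before the proof. A minimal sketch (our own parameters; the discrete-time maximum is biased slightly downwards):

```python
import math, random

def expected_suprema(n_steps=1000, n_paths=4000, seed=5):
    """Estimate E[sup_{s<=1} |w(s)|] and E[sup_{s<=1} w(s)] from
    Gaussian-increment paths on [0, 1]."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / n_steps)
    tot_abs, tot_pos = 0.0, 0.0
    for _ in range(n_paths):
        w, m_abs, m_pos = 0.0, 0.0, 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
            if abs(w) > m_abs:
                m_abs = abs(w)
            if w > m_pos:
                m_pos = w
        tot_abs += m_abs
        tot_pos += m_pos
    return tot_abs / n_paths, tot_pos / n_paths

e_abs, e_pos = expected_suprema()
print(e_abs, math.sqrt(math.pi / 2))   # ≈ 1.2533
print(e_pos, math.sqrt(2 / math.pi))   # ≈ 0.7979
```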
Let
$$S(t) \triangleq \sup_{s\le t}|w(s)| = \max_{s\le t}|w(s)|, \qquad \tilde\tau_a \triangleq \inf\{t : |w(t)| = a\}.$$
If x > 0, then^74
$$\mathbf{P}\left(S(t)\le x\right) = \mathbf{P}\left(\max_{s\le t}\left|xw\left(\frac{s}{x^2}\right)\right|\le x\right) = \mathbf{P}\left(\max_{s\le t}\left|w\left(\frac{s}{x^2}\right)\right|\le 1\right) = \mathbf{P}\left(\max_{s\le t/x^2}|w(s)|\le 1\right) = \mathbf{P}\left(\tilde\tau_1\ge\frac{t}{x^2}\right) = \mathbf{P}\left(\frac{1}{\sqrt{\tilde\tau_1}}\le\frac{x}{\sqrt t}\right).$$
If σ > 0, then
$$\sqrt{\frac{2}{\pi}}\int_0^\infty \exp\left(-\frac{x^2}{2\sigma^2}\right)dx = \sigma.$$

74 Recall that s ↦ xw(s/x²) is also a Wiener process.
The expected value depends only on the distribution, so by Fubini's theorem and by (1.57)
$$\mathbf{E}\left(S(1)\right) = \mathbf{E}\left(\frac{1}{\sqrt{\tilde\tau_1}}\right) = \mathbf{E}\left(\sqrt{\frac{2}{\pi}}\int_0^\infty \exp\left(-\frac{x^2\tilde\tau_1}{2}\right)dx\right) = \sqrt{\frac{2}{\pi}}\int_0^\infty \mathbf{E}\left(\exp\left(-\frac{x^2\tilde\tau_1}{2}\right)\right)dx =$$
$$= \sqrt{\frac{2}{\pi}}\int_0^\infty \frac{1}{\cosh x}\,dx = 2\sqrt{\frac{2}{\pi}}\int_0^\infty \frac{\exp(x)}{\exp(2x)+1}\,dx = 2\sqrt{\frac{2}{\pi}}\int_1^\infty \frac{1}{y^2+1}\,dy = 2\sqrt{\frac{2}{\pi}}\cdot\frac{\pi}{4} = \sqrt{\frac{\pi}{2}}.$$
In a similar way, if S denotes the supremum of w, then by (1.56)
$$\mathbf{E}(S(1)) = \mathbf{E}\left(\frac{1}{\sqrt{\tau_1}}\right) = \mathbf{E}\left(\sqrt{\frac{2}{\pi}}\int_0^\infty \exp\left(-\frac{x^2\tau_1}{2}\right)dx\right) = \sqrt{\frac{2}{\pi}}\int_0^\infty \mathbf{E}\left(\exp\left(-\frac{x^2\tau_1}{2}\right)\right)dx = \sqrt{\frac{2}{\pi}}\int_0^\infty \exp(-x)\,dx = \sqrt{\frac{2}{\pi}}.$$
One can prove the last relation with (1.61) as well:
$$\mathbf{E}(S(1)) = \mathbf{E}(|w(1)|) = \sqrt{\frac{2}{\pi}}\int_0^\infty x\exp\left(-\frac{x^2}{2}\right)dx = \sqrt{\frac{2}{\pi}}.$$
Example 1.125 The intersection of a two-dimensional Wiener process with a line has Cauchy distribution.
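The proof below reduces w₂(τ_a) to the ratio a·w₂(1)/|w₁(1)| of independent standard normals. That final step is cheap to test numerically (a sketch with our own helper name): the empirical distribution of the ratio should match the Cauchy distribution with scale a, for example P(|w₂(τ_a)| ≤ a) = 1/2.

```python
import math, random

def ratio_cdf_points(a=1.0, n=200000, seed=6):
    """Sample a*Z2/|Z1| for independent standard normals Z1, Z2 and
    return the empirical P(|X| <= a) and P(|X| <= 3a)."""
    rng = random.Random(seed)
    in1 = in3 = 0
    for _ in range(n):
        x = a * rng.gauss(0.0, 1.0) / abs(rng.gauss(0.0, 1.0))
        if abs(x) <= a:
            in1 += 1
        if abs(x) <= 3 * a:
            in3 += 1
    return in1 / n, in3 / n

f1, f3 = ratio_cdf_points()
print(f1, 0.5)                            # Cauchy: (2/pi)*atan(1) = 1/2
print(f3, 2 / math.pi * math.atan(3.0))   # ≈ 0.7952
```

This only checks the distributional identity, not the path construction itself, but that identity is exactly where the Cauchy law enters.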
Let w₁ and w₂ be independent Wiener processes, and let us consider the line^75 L ≜ {x = a}, where a > 0. The two-dimensional process w(t) ≜ (w₁(t), w₂(t)) meets L the first time at
$$\tau_a \triangleq \inf\{t : w_1(t) = a\}.$$
What is the distribution of the y coordinate, that is, what is the distribution of w₂(τ_a)?

1. For an arbitrary u the process t ↦ u⁻¹w₁(u²t) is also a Wiener process, hence the distribution of its maximum process is the same as the distribution of the maximum process of w₁. Let us denote this maximum process by S₁.
$$\mathbf{P}(\tau_a \ge x) = \mathbf{P}\left(S_1(x)\le a\right) = \mathbf{P}\left(\sqrt x\,S_1\left(\frac{x}{(\sqrt x)^2}\right)\le a\right) = \mathbf{P}\left(\sqrt x\,S_1(1)\le a\right) = \mathbf{P}\left(\frac{a^2}{S_1^2(1)}\ge x\right).$$
w intersects L at w₂(τ_a). τ_a is σ(w₁)-measurable, and as w₁ and w₂ are independent, that is, the σ-algebras σ(w₂) and σ(w₁) are independent, τ_a is independent of w₂. We show that
$$w_2(\tau_a) \cong \sqrt{\tau_a}\cdot w_2(1), \tag{1.62}$$
that is, the distribution of w₂(τ_a) is the same as the distribution of √τ_a·w₂(1). Using the independence of τ_a and w₂,
$$\mathbf{P}(w_2(\tau_a)\le x \mid \tau_a = t) = \mathbf{P}(w_2(t)\le x) = \mathbf{P}\left(\sqrt t\,w_2(1)\le x\right),$$
and
$$\mathbf{P}\left(\sqrt{\tau_a}\,w_2(1)\le x \mid \tau_a = t\right) = \mathbf{P}\left(\sqrt t\,w_2(1)\le x\right).$$
Integrating both equations by the distribution of τ_a we get (1.62). Hence
$$w_2(\tau_a) \cong \sqrt{\tau_a}\cdot w_2(1) \cong \frac{a}{S_1(1)}\cdot w_2(1) \cong \frac{a}{|w_1(1)|}\cdot w_2(1).$$
w₁(1) and w₂(1) are independent with distribution N(0, 1). Therefore w₂(τ_a) has a Cauchy distribution.

2. One can also prove the relation with Fourier transforms. Let us calculate the Fourier transform of w₂(τ_a). The Fourier transform of N(0, 1) is exp(−t²/2). If G denotes the distribution function of τ_a, then by the independence of τ_a and w₂ and by (1.56)
$$\varphi(t) \triangleq \mathbf{E}\left(\exp(itw_2(\tau_a))\right) = \int_0^\infty \mathbf{E}\left(\exp(itw_2(\tau_a)) \mid \tau_a = u\right)dG(u) = \int_0^\infty \mathbf{E}\left(\exp(itw_2(u))\right)dG(u) =$$
$$= \int_0^\infty \exp\left(-\frac{t^2}{2}u\right)dG(u) = \mathbf{E}\left(\exp\left(-\frac{t^2}{2}\tau_a\right)\right) = L\left(\frac{t^2}{2}\right) = \exp\left(-a\sqrt{t^2}\right) = \exp(-a|t|),$$
which is the Fourier transform of a Cauchy distribution.

75 The Wiener processes are invariant under rotation so the result is true for an arbitrary line. One can generalize the result to an arbitrary dimension. In the general case, we are investigating the distribution of the intersection of the Wiener processes with hyperplanes.

Example 1.126 The process of first passage times of Wiener processes.
Let w be a Wiener process and let us define the hitting times
$$\tau_a \triangleq \inf\{t : w(t) = a\}, \qquad \sigma_a \triangleq \inf\{t : w(t) > a\}.$$
w is continuous and the set {x > a} is open, hence σ_a is a weak stopping time. As the augmented filtration of w is right-continuous, σ_a is a stopping time^76. w has continuous trajectories, so obviously τ_a ≤ σ_a. As the trajectories of w can contain ‘peaks and flat segments’, it can happen that for some outcomes τ_a is strictly smaller than σ_a. As we shall immediately see, almost surely τ_a = σ_a. One can define the stochastic processes
$$T(a,\omega) \triangleq \tau_a(\omega), \qquad S(a,\omega) \triangleq \sigma_a(\omega)$$
with a ∈ R₊. It is easy to see that T and S have strictly increasing trajectories. If a_n ↗ a then w(τ_{a_n}) = a_n → a, hence obviously τ_{a_n} ↗ τ_a, so T is left-continuous. On the other hand, it is easy to see that if a_n ↘ a, then σ_{a_n} ↘ σ_a, hence S is right-continuous. It is also easy to see that T(a+, ω) = S(a, ω) and S(a−, ω) = T(a, ω) for all ω. Obviously τ_a and σ_a are almost surely finite. By the strong Markov property of w,
$$w^*(t) \triangleq w(\tau_a + t) - w(\tau_a)$$
is also a Wiener process. {τ_a < σ_a} is in the set {w^*(t) ≤ 0 on some interval [0, r], r ∈ Q}. As w^* is a Wiener process, it is not difficult to prove^77 that if r > 0 then P(w^*(t) ≤ 0, ∀t ∈ [0, r]) = 0.

76 See: Example 1.32, page 17.
77 See: Corollary B.12, page 566.
Hence P(τ_a ≠ σ_a) = P(τ_a < σ_a) = 0 for every a. Therefore S is a right-continuous modification of T. Obviously, if b > a and τ*_{b−a} is the first passage time of w* to b − a, then τ_b − τ_a = τ*_{b−a}. By the strong Markov property τ*_{b−a} is independent of F_{τ_a}. Therefore T(b) − T(a) is independent of F_{τ_a}. In general, one can easily prove that T and therefore S have independent increments with respect to the filtration G_a ≜ F_{τ_a}. Obviously S(0) = 0, hence S is a Lévy process with respect to the filtration G.

1.3.9 Some remarks on the usual assumptions
The usual assumptions are crucial conditions of stochastic analysis. Without them very few statements of the theory would hold. The most important objects of stochastic analysis are related to stopping times, as these objects express the timing of events. The main tool of stochastic analysis is measure theory. In measure theory, objects are defined up to measure-zero sets. From a technical point of view it is therefore not a great surprise that we want to guarantee that every random time which is almost surely equal to a stopping time should also be a stopping time. The definition of a stopping time is very natural: at time t one can observe only τ ∧ t, so we should assume τ ∧ t to be F_t-measurable for every t. Hence if τ and τ′ are almost surely equal and they differ on a set N, then every subset of N should also be F_t-measurable. This implies that one should add all the measure-zero sets and all their subsets to the filtration^78. The right-continuity of the filtration is more problematic; it assumes that somehow we can foresee the events of the near future. At first sight it seems natural; in our usual experience we always have some knowledge about the near future. Our basic experience is speed and momentum, and these objects are by definition the derivatives of the trajectories. By definition, differentiability means that the right-derivative is equal to the left-derivative, and the left-derivative depends on the past and the present. So in our differentiable world we always know the right-derivative, hence—infinitesimally—we can always see the future. But in stochastic analysis we are interested in objects which are non-differentiable. Recall that for a continuous process the hitting time of a closed set is a stopping time^79. At the moment that we hit a closed set we know that we are in the set. But what about the hitting times^80 of open sets? We hit an open set at its boundary, and when we hit it we are generally still outside the set.
Recall that the hitting time of an open set is a stopping time only when the filtration is right-continuous^81. That is, when we hit the boundary of an open set—by the

78 See: Example 6.37, page 386.
79 See: Example 1.32, page 17.
80 See: Definition 1.26, page 15.
81 See: Example 1.32, page 17.
right-continuity of the filtration—we can ask for some extra information about the future which tells us whether we shall really enter the set or not. This is, of course, a very strong assumption. If we want to go to a restaurant and we are at the door, we know that we shall enter the restaurant. But a Wiener process can easily turn back at the door. One of the most surprising statements of the theory is that the augmented filtration of a Lévy process is right-continuous. This is true not only for Lévy processes, but under more general conditions^82. It is important to understand the reason behind this phenomenon. The event that a one-dimensional Wiener process hits the boundary of an open set without actually entering the set itself has zero^83 probability! And in general the right-continuity of an augmented filtration means that all the events which would need some insight into the future^84 have zero probability. We cannot see the future, we are just ignoring the irrelevant information!
1.4 Localization
Localization is one of the most frequently used concepts of mathematical analysis. For example, if f is a continuous function on R, then of course generally f is not integrable on the whole real line. But this is not a problem at all. We can still talk about the integral function F(x) ≜ ∫₀ˣ f(t)dt of f. The functions of Calculus are not integrable, they are just locally integrable. In real analysis we say that a certain property holds locally if it holds on every compact subset of the underlying topological space^85. On the real line it is enough to ask that the property holds on every closed, bounded interval; in particular, for any t the property should hold on the interval [0, t]. Very often, as in the case of local integrability, it is sufficient to ask that the property should hold on some intervals [0, t_n] where t_n ↗ ∞. In stochastic analysis we should choose the upper bounds t_n in a measurable way with respect to the underlying filtration. This explains the next definition:

Definition 1.127 Let X be a family of processes. We say that a process X is locally in X if there is a sequence of stopping times (τ_n) for which almost surely^86 τ_n ↗ ∞, and the truncated processes X^{τ_n} belong to X for every n. The sequence (τ_n) is called the localizing sequence of X. X_loc denotes the set of processes locally belonging to X.

A specific problem with the definition above is that with localization one cannot modify the value of the variable X(0), since every truncated process X^{τ_n} at the
time t = 0 has the same value X(0). To overcome this problem some authors^87, instead of using X^{τ_n}, use the process X^{τ_n}χ(τ_n > 0) in the definition of the localization, or instead of X they localize the process X − X(0). In most cases it does not matter how we define the localization. First of all we shall use the localization procedure to define the different classes of local martingales. From the point of view of stochastic analysis, one can always assume that every local martingale is zero at time t = 0, as our final goal is to investigate the class of semimartingales, and the semimartingales have the representation X(0) + L + V, where L is a local martingale, zero at time t = 0. Just to fix the ideas, we shall later explicitly concretize the definitions in the cases of local martingales and locally bounded processes. In both cases we localize the processes X − X(0).

1.4.1 Stability under truncation
It is quite natural to ask for which type of processes X one has (X_loc)_loc = X_loc.

Definition 1.128 We say that a space of processes X is closed or stable under truncation or closed under stopping if whenever X ∈ X then X^τ ∈ X for an arbitrary stopping time τ.

It is an important consequence of this property that if X is closed under truncation, X_k ∈ X_loc, and (τ_n^{(k)}) are the localizing sequences of the processes X_k, then τ_n ≜ ∧_{k=1}^m τ_n^{(k)} for any finite m is a common localizing sequence of the first m processes. That is, if X is closed under truncation, then for a finite number of processes we can always assume that they have a common localizing sequence. From the definition it is clear that if X is closed under truncation, then X_loc is also closed under truncation as, if (τ_n) is a localizing sequence of X and τ is an arbitrary stopping time, then (τ_n) is obviously a localizing sequence of the truncated process X^τ.

Example 1.129 M, the space of uniformly integrable martingales, H², the space of square-integrable martingales, and K, the set of bounded processes, are closed under truncation.

It is obvious from the definition that K is closed under truncation. By the Optional Sampling Theorem if M ∈ M, then M^τ ∈ M. As
$$\mathbf{E}\left(\sup_{t\ge 0}\left|X^{\tau}(t)\right|^2\right) \le \mathbf{E}\left(\sup_{t\ge 0}\left|X(t)\right|^2\right) < \infty,$$
H² is also closed under truncation.
87 See: e.g. [78].
Proposition 1.130 If the filtration F is right-continuous and if X is closed under truncation, then (X_loc)_loc = X_loc.

Proof. τ_n ≡ ∞ is a localizing sequence, therefore X_loc ⊆ (X_loc)_loc. Let X ∈ (X_loc)_loc and let (τ_n) be a localizing sequence with X^{τ_n} ∈ X_loc. As X_loc is closed under truncation, X^{τ_n∧n} ∈ X_loc for an arbitrary n, so one can assume that τ_n is bounded. Let σ_k be a stopping time such that (X^{τ_k})^{σ_k} = X^{τ_k∧σ_k} ∈ X and P(σ_k ≤ τ_k) ≤ 2^{−k}. Let us define the sequence
$$\rho_n \triangleq \tau_n \wedge \inf_{k\ge n}\sigma_k.$$
Obviously
$$\left\{\inf_{k\ge n}\sigma_k < t\right\} = \cup_{k\ge n}\{\sigma_k < t\} \in \mathcal{F}_t,$$
so inf_{k≥n} σ_k is a weak stopping time. The filtration F is right-continuous, hence inf_{k≥n} σ_k is a stopping time and therefore ρ_n is a stopping time. The sequence (ρ_n) is increasing and
$$\mathbf{P}(\rho_n < \tau_n) \le \sum_{k=n}^\infty \mathbf{P}(\sigma_k < \tau_n) \le \sum_{k=n}^\infty \mathbf{P}(\sigma_k \le \tau_k) \le \sum_{k=n}^\infty 2^{-k} = 2^{-n+1} \to 0.$$
Almost surely τ_n ↗ ∞ and therefore, outside of a measure-zero set, ρ_n ↗ ∞; that is, (ρ_n) is a localizing sequence^88 of X ∈ (X_loc)_loc, so X ∈ X_loc.

1.4.2 Local martingales
Let us specify the definition of localization to local martingales.

Definition 1.131 Let us denote by M the set of uniformly integrable martingales. A process X is a local martingale if X − X(0) ∈ M_loc; that is, a process X is a local martingale if there is a sequence of stopping times (τ_n), τ_n ↗ ∞ almost surely, such that X^{τ_n} − X(0) ∈ M for all n. Let us denote^89 by L the set of local martingales which are zero at time t = 0. X is a local martingale if there is an L ∈ L such that X = X(0) + L.

88 Let us observe that (ρ_n) converges to infinity just almost surely.
89 See: Definition 3.1, page 179.
Example 1.132 Every martingale is a local martingale.
If M is a martingale, then M(0) is integrable. The process X ≡ M(0) is a martingale, so without loss of generality one can assume that M(0) = 0. If M is a martingale and τ_n ≜ n, then (τ_n) is a localizing sequence and
$$M^{\tau_n}(t) = M(t\wedge n) = \mathbf{E}(M(n) \mid \mathcal{F}_t),$$
therefore the set (M^{τ_n}(t))_{t≥0} is uniformly integrable^90 for all n.

Example 1.133 Local martingale which is not a martingale. Let (Ω, A, P) ≜ ([0,1], B([0,1]), λ), where λ denotes the Lebesgue measure. If F_t ≜ A and ξ is not integrable, then the constant process X(t) ≜ ξ is not a martingale, but as X − X(0) ≡ 0, by the definition of local martingales X is a local martingale.
Proposition 1.134 If X and Y are local martingales and ξ and η are F₀-measurable random variables, then Z ≜ ξX + ηY is also a local martingale.

Proof. If L ∈ L and ζ is an arbitrary F₀-measurable variable then ζ + L is a local martingale, therefore one can assume that X, Y ∈ L. Let (τ_n) be the localizing sequence of X and (σ_n) be the localizing sequence of Y. As ξ and η are F₀-measurable, the variables
$$\alpha_n \triangleq \begin{cases} \infty & \text{if } |\xi|\le n \\ 0 & \text{if } |\xi| > n \end{cases}, \qquad \beta_n \triangleq \begin{cases} \infty & \text{if } |\eta|\le n \\ 0 & \text{if } |\eta| > n \end{cases}$$
are stopping times. Obviously ρ_n ≜ τ_n ∧ σ_n ∧ α_n ∧ β_n is a stopping time and ρ_n ↗ ∞, so (ρ_n) is a localizing sequence.
$$Z^{\rho_n} \triangleq (\xi X + \eta Y)^{\rho_n} = \chi(|\xi|\le n)\,\xi X^{\rho_n} + \chi(|\eta|\le n)\,\eta Y^{\rho_n}. \tag{1.63}$$
As X^{ρ_n}, Y^{ρ_n} ∈ M and as χ(|ξ| ≤ n)ξ and χ(|η| ≤ n)η are bounded F₀-measurable variables, obviously Z^{ρ_n} ∈ M and therefore Z is a local martingale. Let us observe that in line (1.63) we used that X, Y ∈ L, that is, X(0) = Y(0) = 0. If in the definition of local martingales one had used the simpler X ∈ M_loc definition, then in this proposition one should have assumed ξ and η to be bounded.

90 See: Lemma 1.70, page 42.
One can observe that in the definition of local martingales we used the class of uniformly integrable martingales and not the class of martingales. If L^{τ_n} is a martingale for some τ_n, then L^{τ_n∧n} ∈ M, so the class of local martingales is the same as the class of ‘locally uniformly integrable martingales’. Very often we prove different theorems first for uniformly integrable martingales and then with localization we extend the proofs to local martingales. In most cases one should use the same method if one wants to extend the result from uniformly integrable martingales just to martingales. An important subclass of local martingales is the space of locally square-integrable martingales:

Definition 1.135 X is a locally square-integrable martingale if X − X(0) ∈ H²_loc.
Example 1.136 Every martingale which has square-integrable values is a locally square-integrable martingale.

By definition a martingale X is square-integrable if X(t) ∈ L²(Ω) for every t. In this case X(0) ∈ L²(Ω), therefore for all t, X(t) − X(0) ∈ L²(Ω), so again one can assume that X(0) = 0. If τ_n ≜ n then (τ_n) is a localizing sequence. By Doob's inequality
$$\left\|\sup_t\left|X^{\tau_n}(t)\right|\right\|_2 = \left\|\sup_{t\le n}|X(t)|\right\|_2 \le 2\cdot\|X(n)\|_2 < \infty,$$
so X^{τ_n} ∈ H² and therefore X ∈ H²_loc.
Example 1.137 Every continuous local martingale is locally square-integrable^91.

Let X be a continuous local martingale and let (τ_n) be a localizing sequence of X. As X is continuous, σ_n ≜ inf{t : |X(t)| ≥ n} is a stopping time. If ρ_n ≜ τ_n ∧ σ_n then ρ_n ↗ ∞ and |X^{ρ_n}| ≤ n by the continuity of X, so X^{ρ_n} is a bounded, hence square-integrable, martingale. Therefore X ∈ H²_loc.

Example 1.138 Martingales which are not in H²_loc.

91 One can easily generalize this example. If the jumps of X are bounded then X is in H²_loc. See: Proposition 1.152, page 107.
Let us denote by σ(N) the σ-algebra generated by the measure-zero sets. Let
$$\mathcal{F}_t \triangleq \begin{cases} \sigma(N) & \text{if } t < 1 \\ \mathcal{A} & \text{if } t \ge 1 \end{cases},$$
and let ξ ∈ L¹(Ω), but ξ ∉ L²(Ω). Let us also assume that E(ξ) = 0. F satisfies the usual conditions, hence X(t) ≜ E(ξ | F_t) is a martingale. X(0) = 0 almost surely, hence not only X ∈ M_loc, but also X ∈ L. On the other hand X ∉ H²_loc as, if the stopping time τ is not almost surely constant, then almost surely τ ≥ 1, hence for all t ≥ 1, X^τ(t) = ξ ∉ L²(Ω).

It is a quite natural, but wrong, guess that local martingales are badly integrable martingales. The local martingales are far more mysterious objects.

Example 1.139 Integrable local martingale which is not a martingale.
Let Ω ≜ C[0,∞); that is, let Ω be the set of continuous functions defined on the half-line R₊. Let X be the canonical coordinate process, that is, if ω ∈ Ω, then let X(t, ω) ≜ ω(t), and let the filtration F be the filtration generated by X. Let P be the probability measure defined on Ω for which X is a Wiener process starting from point 1. Let
$$\tau_0 \triangleq \inf\{t : X(t) = 0\}.$$
Let us define the measure Q(t) on the σ-algebra F_t with the Radon–Nikodym derivative
$$\frac{dQ(t)}{dP} \triangleq X(t\wedge\tau_0) = X(t)\chi(t<\tau_0) + X(\tau_0)\chi(t\ge\tau_0) = X(t)\chi(t<\tau_0).$$
As the truncated martingales are martingales, X^{τ_0} is a martingale under the measure P. Hence
$$\mathbf{E}(X(t\wedge\tau_0) \mid \mathcal{F}_s) = X(s\wedge\tau_0).$$
The measures (Q(t))_{t≥0} are consistent: if s < t and F ∈ F_s ⊆ F_t, then
$$Q(s)(F) = \int_F \frac{dQ(s)}{dP}\,dP = \int_F X(s\wedge\tau_0)\,dP = \int_F X(t\wedge\tau_0)\,dP = \int_F \frac{dQ(t)}{dP}\,dP = Q(t)(F).$$
In particular
$$Q(t)(\Omega) \triangleq \int_\Omega X(t\wedge\tau_0)\,dP = \int_\Omega X(0)\,dP = 1,$$
so Q(t) is a probability measure for every t. The space C[0,∞) is a Kolmogorov-type measure space, so on the Borel sets of C[0,∞) there is a probability measure Q which, restricted to F_t, is Q(t). {τ_0 ≤ t} ∈ F_t for every t, so
$$Q(\tau_0\le t) = Q(t)(\tau_0\le t) \triangleq \int_\Omega \chi(\tau_0\le t)X(\tau_0\wedge t)\,dP = \int_\Omega \chi(\tau_0\le t)X(\tau_0)\,dP = 0,$$
so Q(τ_0 = ∞) = 1, that is, X is almost surely never zero under Q. Hence X > 0 under Q, so under Q the process Y ≜ 1/X is almost surely well-defined.

1. As a first step let us show that Y is not a martingale under Q. To show this it is sufficient to prove that the Q-expected value of Y is decreasing to zero. As P(τ_0 < ∞) = 1, if t → ∞ then
$$\mathbf{E}_Q(Y(t)) \triangleq \int_\Omega Y(t)\,dQ = \int_\Omega \frac{1}{X(t)}\,dQ(t) = \int_\Omega \frac{1}{X(t)}\chi(t<\tau_0)X(t)\,dP = \int_\Omega \chi(t<\tau_0)\,dP = \mathbf{P}(t<\tau_0) \to 0.$$
2. Now we prove that Y is a local martingale under Q. Let ε > 0 and let
$$\tau_\varepsilon \triangleq \inf\{t : X(t) = \varepsilon\}.$$
X is continuous, therefore if ε ↘ 0 then τ_ε(ω) ↗ τ_0(ω) for every outcome ω. Since Q(τ_0 = ∞) = 1, obviously Q-almost surely^92 τ_ε ↗ ∞. Let us show that under Q the truncated process Y^{τ_ε} is a martingale. Almost surely 0 < Y^{τ_ε} ≤ 1/ε, hence Y^{τ_ε} is almost surely bounded, hence it is uniformly integrable. One should only prove that Y^{τ_ε} is a martingale under Q. If s < t and F ∈ F_s, then as τ_ε < τ_0
$$\int_F Y^{\tau_\varepsilon}(t)\,dQ \triangleq \int_F \frac{1}{X(t\wedge\tau_\varepsilon)}\,dQ(t) = \int_F \frac{1}{X(t\wedge\tau_\varepsilon)}\,X(t\wedge\tau_0)\,dP = \tag{1.64}$$
$$= \int_F \left(\frac{\chi(t<\tau_\varepsilon)}{X(t)} + \frac{\chi(t\ge\tau_\varepsilon)}{X(\tau_\varepsilon)}\right)X(t)\chi(t<\tau_0)\,dP = \int_F \left(\chi(t<\tau_\varepsilon) + \frac{X(t)}{\varepsilon}\chi(\tau_0 > t \ge \tau_\varepsilon)\right)dP = \frac{1}{\varepsilon}\int_F \left(\varepsilon + (X^{\tau_0}(t)-\varepsilon)\chi(t\ge\tau_\varepsilon)\right)dP.$$
Let us prove that M(t) ≜ (X^{τ_0}(t) − ε)χ(t ≥ τ_ε) is a martingale under P. If σ is a bounded stopping time, then as τ_ε < τ_0, by the elementary properties of the conditional expectation^93 and by the Optional Sampling Theorem
$$\mathbf{E}(M(\sigma)) \triangleq \mathbf{E}\left((X^{\tau_0}(\sigma)-\varepsilon)\chi(\sigma\ge\tau_\varepsilon)\right) = \mathbf{E}\left(\mathbf{E}\left((X^{\tau_0}(\sigma)-\varepsilon)\chi(\sigma\ge\tau_\varepsilon) \mid \mathcal{F}_{\sigma\wedge\tau_\varepsilon}\right)\right) = \mathbf{E}\left(\mathbf{E}\left(X^{\tau_0}(\sigma)-\varepsilon \mid \mathcal{F}_{\sigma\wedge\tau_\varepsilon}\right)\chi(\sigma\ge\tau_\varepsilon)\right) =$$
$$= \mathbf{E}\left(\mathbf{E}\left(X(\tau_0\wedge\sigma)-\varepsilon \mid \mathcal{F}_{\sigma\wedge\tau_\varepsilon}\right)\chi(\sigma\ge\tau_\varepsilon)\right) = \mathbf{E}\left((X(\sigma\wedge\tau_\varepsilon)-\varepsilon)\chi(\sigma\ge\tau_\varepsilon)\right) = \mathbf{E}\left((X(\tau_\varepsilon)-\varepsilon)\chi(\sigma\ge\tau_\varepsilon)\right) = 0,$$
which means that M is really a martingale^94. As M is a martingale under P, in the last integral of (1.64) one can substitute s in the place of t, so, calculating backwards,
$$\int_F Y^{\tau_\varepsilon}(t)\,dQ = \int_F \frac{1}{X(t\wedge\tau_\varepsilon)}\,dQ = \int_F \frac{1}{X(s\wedge\tau_\varepsilon)}\,dQ = \int_F Y^{\tau_\varepsilon}(s)\,dQ,$$
that is, Y^{τ_ε} is a martingale under Q. Therefore (τ_{1/n}) localizes Y under Q.

92 Let us recall that, by the definition of the localizing sequence, it is sufficient if the localizing sequence converges just almost surely to infinity.
93 See: Proposition 1.34, page 20.
94 See: Proposition 1.91, page 57.
STOCHASTIC PROCESSES
Example 1.140 L2 (Ω) bounded local martingale, which is not a martingale
95
.
Let w be a standard Wiener process in R3 , and let X(t) w(t)+u where u = 0 is a fixed vector. By the elementary properties of Wiener processes96 if t → ∞ then R (t) X (t)2 → ∞.
(1.65)
With direct calculation it is easy to check that on R3 \ {0} the function g (x)
1 1 =" 2 x2 x1 + x22 + x23
is harmonic, that is97 ∆g
∂2 ∂2 ∂2 g + g + g = 0. ∂x21 ∂x22 ∂x23
Hence by Itˆo’s formula98 M 1/R is a local martingale. The density function of the X (t) is
1 2 ft (x) √ 3 exp − x − u2 . 2t 2πt 1
If t ≥ 1 then ft is uniformly bounded so if t ≥ 1 then obviously
E M 2 (t) =
R3
≤
R3
1
2 ft
x2 1
(x) dλ3 (x) ≤
2 dλ3
(x) .
x2
Evidently the last integral can diverge only around x = 0. I
x≤1
1
2 dλ3 (x) =
x2
k
G(k)
1
2 dλ3
x2
(x)
95 We shall use several results which we shall prove later, so one can skip this example during the first reading. 96 See: Proposition B.7, page 564, Corollary 6.9, page 363. 97 Now ∆ denotes the Laplace operator. 98 See: Theorem 6.2, page 353. As n = 3 almost surely X(t) = 0 hence we can use the formula. See: Theorem 6.7, page 359.
where
$$G(k) \triangleq \left\{\frac{1}{2^{k+1}} < \|x\|_2 \le \frac{1}{2^k}\right\}.$$
As 2^k G(k) = G(0), using the transformation T(x) ≜ 2^k x,
$$\int_{G(0)}\frac{1}{\|x\|_2^2}\,d\lambda^3(x) = \int_{G(k)}\frac{2^{3k}}{\|2^k x\|_2^2}\,d\lambda^3(x) = 2^k\int_{G(k)}\frac{1}{\|x\|_2^2}\,d\lambda^3(x).$$
Hence
$$I = \sum_{k=0}^\infty 2^{-k}\int_{G(0)}\frac{1}{\|x\|_2^2}\,d\lambda^3(x) < \infty.$$
It is easy to show that E(M²(t)) is continuous in t, therefore it is bounded on every [0, t]. Hence E(M²(t)) is bounded on R₊. By (1.65), M(t) → 0. M is bounded in L²(Ω), therefore it is uniformly integrable, so M(t) → 0 in L¹(Ω) as well. If M were a martingale then, by uniform integrability,
$$M(t) = \mathbf{E}(M(\infty)\mid\mathcal{F}_t) = \mathbf{E}(0\mid\mathcal{F}_t) = 0,$$
which is impossible, since M > 0.

As the uniformly integrable local martingales are not necessarily martingales, even the next, nearly trivial observation is very useful:

Proposition 1.141 Every non-negative local martingale is a supermartingale.

Proof: Let M = M(0) + L be a non-negative local martingale. Observe that, by the definition of supermartingales, M(t) ≥ 0 is not necessarily integrable, so one cannot assume that M(0) is integrable. As L ∈ L there is a localizing sequence (τ_n) such that L^{τ_n} ∈ M for all n. If t > s, then as M ≥ 0, by Fatou's lemma
$$\mathbf{E}(M(t)\mid\mathcal{F}_s) = \mathbf{E}\left(\liminf_{n\to\infty} M^{\tau_n}(t)\,\Big|\,\mathcal{F}_s\right) \le \liminf_{n\to\infty}\mathbf{E}\left(M^{\tau_n}(t)\mid\mathcal{F}_s\right) = M(0) + \liminf_{n\to\infty}\mathbf{E}\left(L^{\tau_n}(t)\mid\mathcal{F}_s\right) = M(0) + \liminf_{n\to\infty} L^{\tau_n}(s) = M(s).$$
Corollary 1.142 If M ∈ L and M ≥ 0, then M = 0.

Proof: As M is a supermartingale, 0 ≤ E(M(t)) ≤ E(M(0)) = 0 for all t ≥ 0, so M(t) = 0 almost surely.
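The failure of the martingale property in Example 1.140 is also visible numerically: E(M(t)) = E(1/‖w(t)+u‖₂) strictly decreases in t, whereas a martingale would keep it constant. A minimal sketch (our own choices: u = (1, 0, 0), and w(t) sampled exactly as an N(0, tI) vector):

```python
import math, random

def mean_inverse_norm(t, n_paths=20000, seed=7):
    """Estimate E[1/||w(t)+u||] for a 3-dimensional Wiener process w
    and u = (1, 0, 0); w(t) is sampled exactly as N(0, t*I)."""
    rng = random.Random(seed)
    sd = math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        x = 1.0 + rng.gauss(0.0, sd)
        y = rng.gauss(0.0, sd)
        z = rng.gauss(0.0, sd)
        total += 1.0 / math.sqrt(x * x + y * y + z * z)
    return total / n_paths

means = [mean_inverse_norm(t) for t in (1.0, 4.0, 16.0)]
print(means)   # strictly decreasing towards 0
```

The decrease mirrors Example 1.139: for this process one can show E(M(t)) equals the probability that a one-dimensional Wiener process started from 1 has not yet hit 0 by time t.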
The most striking and puzzling feature of local martingales is that even uniform integrability is not sufficient to guarantee that local martingales are proper martingales. The reason for it is the following: if Γ is a set of stopping times, then the uniform integrability of the family (X(t))_{t∈R₊} does not guarantee the uniform integrability of the stopped family (X(τ))_{τ∈Γ}. This cannot happen if the local martingale belongs to the so-called class D.

Definition 1.143 Process X belongs to the Dirichlet–Doob class^99, shortly X is in class D, if the set
{X(τ) : τ < ∞ is an arbitrary finite-valued stopping time}
is uniformly integrable. We shall also denote by D the set of processes in class D.

Proposition 1.144 Let L be a local martingale. L is in class D if and only if L ∈ M, that is, if L is a uniformly integrable martingale.

Proof: Recall that we constructed a non-negative L²(Ω)-bounded local martingale which is not a proper martingale.

1. Let L ∈ D and let L be a local martingale. As τ = 0 is a stopping time, by the definition of D, L(0) is integrable, so one can assume that L ∈ L. If (τ_n) is a localizing sequence of L, then
$$L(\tau_n\wedge s) = L^{\tau_n}(s) = \mathbf{E}\left(L^{\tau_n}(t)\mid\mathcal{F}_s\right) = \mathbf{E}\left(L(\tau_n\wedge t)\mid\mathcal{F}_s\right).$$
τ_n ↗ ∞, hence the sequences (L(τ_n ∧ s))_n and (L(τ_n ∧ t))_n converge to L(s) and L(t). By uniform integrability the convergence L(τ_n ∧ t) → L(t) holds in L¹(Ω) as well. By the L¹-continuity of the conditional expectation
$$L(s) = \mathbf{E}(L(t)\mid\mathcal{F}_s),$$
hence L is a martingale^100. Obviously the set {L(t)}_t ⊆ {L(τ)}_τ is uniformly integrable, so L ∈ M.

2. The reverse implication is obvious: if L is a uniformly integrable martingale, then by the Optional Sampling Theorem L(τ) = E(L(∞) | F_τ) for every stopping time τ, hence the family (L(τ))_τ is uniformly integrable^101.

99 In [77] on page 244 class D is called Dirichlet class. [74] on page 107 remarks that class D is for Doob's class and the definition was introduced by P.A. Meyer in 1963.
100 Observe that it is enough to assume that {L(τ)}_τ is uniformly integrable for the set of bounded stopping times τ.
101 See: Lemma 1.70, page 42.
Corollary 1.145 If a process X is dominated by an integrable variable then X ∈ D; hence if X is a local martingale and X is dominated by an integrable variable^102 then X ∈ M.

Example 1.146 Let us assume that L has independent increments. If X ≜ exp(L) then X is a local martingale if and only if X is a martingale.

One should only prove that if X is a local martingale, then X is a martingale. By the definition of processes with independent increments, L(0) = 0, hence X(0) = 1. X is a non-negative local martingale, so it is a supermartingale^103. If m(t) denotes the expected value of X(t), then by the supermartingale property 1 ≥ m(t) > 0. Let us prove that M(t) ≜ X(t)/m(t) is a martingale. As L has independent increments, if t > s, then
$$m(t) \triangleq \mathbf{E}(X(t)) = \mathbf{E}(X(s))\,\mathbf{E}\left(\exp(L(t)-L(s))\right) \triangleq m(s)\,\mathbf{E}\left(\exp(L(t)-L(s))\right).$$
From this
$$\mathbf{E}(M(t)\mid\mathcal{F}_s) \triangleq \mathbf{E}\left(\frac{\exp(L(t))}{m(t)}\,\Big|\,\mathcal{F}_s\right) = \mathbf{E}\left(\frac{\exp(L(t)-L(s)+L(s))}{m(t)}\,\Big|\,\mathcal{F}_s\right) = \frac{\exp(L(s))}{m(t)}\,\mathbf{E}\left(\exp(L(t)-L(s))\mid\mathcal{F}_s\right) =$$
$$= \frac{\exp(L(s))}{m(t)}\,\mathbf{E}\left(\exp(L(t)-L(s))\right) = \frac{\exp(L(s))}{m(s)} \triangleq M(s),$$
hence M is a martingale. For arbitrary T < ∞, on the interval [0, T] M is uniformly integrable, that is, M is in class D. As on the interval [0, T], 0 ≤ X = Mm ≤ M, X is also in class D. Therefore X ∈ D and X is a local martingale on [0, T]. This means that X is a martingale on [0, T] for every T, hence X is a martingale on R₊.

102 See: Davis' inequality, Theorem 4.62, page 277.
103 See: Proposition 1.141, page 101.
STOCHASTIC PROCESSES
If a process has independent increments and the expected value of the process is zero, then it is obviously a martingale. Therefore martingales are a generalization of random walks. From an intuitive point of view one can also think about local martingales as generalized random walks, as we shall later prove the next, somewhat striking, theorem:

Theorem 1.147 Assume that the stochastic base satisfies the usual conditions. If a local martingale has independent increments then it is a true martingale104.

1.4.3 Convergence of local martingales: uniform convergence on compacts in probability
Let X be an arbitrary space of processes. In X_loc it is very natural to define the topology with localization: X_m → X if X and the elements of the sequence (X_m) have a common localizing sequence (τ_n) and for every n, in the topology of X,

lim_{m→∞} X_m^{τ_n} = X^{τ_n}.

Let us assume105 that (X_m) and X are in H^p_loc. In H^p one should define the topology with the norm

‖X‖_{H^p} := ‖ sup_s |X(s)| ‖_p.

If τ_n ↗ ∞ and t < ∞, then for every δ > 0 one can find an n such that P(τ_n ≤ t) < δ. Let ε > 0 be arbitrary. If

A := { sup_{s≤t} |X_m(s) − X(s)| > ε },

then

P(A) = P((τ_n ≤ t) ∩ A) + P((τ_n > t) ∩ A) ≤
≤ P(τ_n ≤ t) + P((τ_n > t) ∩ A) ≤ δ + P((τ_n > t) ∩ A) ≤
≤ δ + P( sup_{s≤t} |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε ) ≤
≤ δ + P( sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε ).
104 Of course the main point is that a local martingale with independent increments has finite expected value. See: Theorem 7.97, page 545.
105 It is an important consequence of the Fundamental Theorem of Local Martingales that every local martingale is in H¹_loc. See Corollary 3.59, page 221.
By Markov's inequality the stochastic convergence follows from the convergence in L^p(Ω). Therefore if lim_{m→∞} X_m^{τ_n} = X^{τ_n} in H^p then

lim_{m→∞} P( sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε ) = 0.

This implies that for every ε > 0 and for every t

lim_{m→∞} P( sup_{s≤t} |X_m(s) − X(s)| > ε ) = 0.
Hence one should expect that the next definition is very useful106:

Definition 1.148 We say that the sequence of stochastic processes (X_n) converges uniformly on compacts in probability to the process X if for arbitrary107 t < ∞

sup_{s≤t} |X_n(s) − X(s)| → 0 in probability.

We shall denote this type of convergence by →^{ucp}.

Every stochastically convergent sequence has an almost surely convergent subsequence. Hence if X_n →^{ucp} X_∞ and X_n →^{ucp} Y_∞, then for every t < ∞ for some subsequence

sup_{s≤t} |X_{n_k}(s) − X_∞(s)| → 0 almost surely,

and

sup_{s≤t} |X_{n_k}(s) − Y_∞(s)| → 0 almost surely,

therefore X_∞ and Y_∞ are indistinguishable. One can also easily prove the next observations:

Proposition 1.149 If for all n the processes X_n are right-regular and X_n →^{ucp} X_∞ then X_∞ is indistinguishable from a process which has right-regular trajectories. If X_n is continuous for all n and X_n →^{ucp} X_∞ then X_∞ is indistinguishable from a process which has continuous trajectories.

106 Later, as a consequence of the Fundamental Theorem of Local Martingales, we shall prove that every local martingale is in H¹_loc, see Corollary 3.59, page 221, so the natural topology for the space of local martingales is the uniform convergence on compacts in probability.
107 If X_n and X have regular trajectories then the suprema are measurable.
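The role of the compact time horizon in Definition 1.148 can be illustrated with a deterministic toy example (a hypothetical sketch, not from the text): the processes X_n(s) = s/n converge to zero uniformly on every compact [0, t], yet the supremum of |X_n(s)| over the whole half-line is infinite for every n, so there is no uniform convergence on [0, ∞).

```python
# Hypothetical illustration of "uniform convergence on compacts":
# X_n(s) = s / n converges to 0 uniformly on every [0, t], but not on [0, inf).

def sup_on_compact(n, t, grid=10_000):
    # sup_{s <= t} |X_n(s) - 0| evaluated on a grid; for X_n(s) = s / n the
    # supremum is attained at s = t, so the grid value is exact there.
    return max(s * t / grid / n for s in range(grid + 1))

for t in (1.0, 10.0, 100.0):
    sups = [sup_on_compact(n, t) for n in (1, 10, 100, 1000)]
    # for each fixed compact [0, t] the suprema t/n shrink to 0 ...
    assert sups == sorted(sups, reverse=True)
    assert sups[-1] == t / 1000

# ... while for a fixed n the supremum over [0, t] grows without bound in t,
# so there is no uniform convergence on the whole half-line.
assert sup_on_compact(n=5, t=1e6) == 2e5
```

The choice of grid size is arbitrary; any refinement gives the same suprema here because X_n is monotone in s.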
1.4.4 Locally bounded processes
One can use localization to define the class of locally bounded processes:

Definition 1.150 Denote by K the set of bounded processes: X ∈ K if and only if there is a real number k such that |X(t, ω)| ≤ k for all (t, ω). A process X is locally bounded if X − X(0) ∈ K_loc.

As we have seen108, every regular function is bounded on every compact interval. Let us consider the stopping times

τ_a := inf{t : |X(t)| > a}.

If X is right-regular then |X(τ_a)| ≥ a, but as X can reach the level a with a jump, it can happen that for certain outcomes |X(τ_a)| > a. For right-continuous processes one can only use the estimation

|X(τ_a)| ≤ a + |ΔX(τ_a)|.

As the jump |ΔX(τ_a)| can be arbitrarily large, X is not necessarily bounded on the random interval

[0, τ_a] := {(t, ω) : 0 ≤ t ≤ τ_a(ω) < ∞}.   (1.66)

On the other hand, let us assume that X is left-continuous. If τ_a(ω) > 0 and |X(τ_a(ω), ω)| > a for some outcome ω then by the left-continuity one could decrease the value of τ_a(ω), which by definition is impossible. Hence |X(τ_a)| ≤ a on the set {τ_a > 0}. This means that if X is left-continuous and X(0) = 0 then X is bounded on the random interval (1.66). These observations are the core of the next two propositions:

Proposition 1.151 If the filtration is right-continuous then every left-regular process is locally bounded.

Proof: Let X be left-regular. The process X − X(0) is also left-regular so one can assume that X(0) = 0. Define the random times

τ_n := inf{t : |X(t)| > n}.

The filtration is right-continuous and X is left-regular, so τ_n is a stopping time109. As X(0) = 0, if τ_n(ω) = 0 then |X(τ_n)| ≤ n. If τ_n(ω) > 0 then |X(τ_n(ω), ω)| > n is impossible, as in this case, by the left-continuity of X, one could decrease τ_n(ω).

108 See: Proposition 1.6, page 5.
109 See: Example 1.32, page 17.
Hence the truncated process X^{τ_n} is bounded. Let us show that τ_n ↗ ∞, that is, let us show that the sequence (τ_n) is a localizing sequence. Obviously (τ_n) is never decreasing. If for some outcome ω the sequence (τ_n(ω)) were bounded then one would find a bounded sequence (t_n) for which |X(t_n, ω)| > n. Let (t_{n_k})_k be a monotone, convergent subsequence of (t_n). If t_{n_k} → t*, then |X(t_{n_k}, ω)| → ∞, which is impossible as X has finite left and right limits.

Proposition 1.152 If the filtration is right-continuous and the jumps of the right-regular process X are bounded then X is locally bounded.

Proof: We can again assume that X(0) = 0. Assume that |ΔX| ≤ a. As in the previous proposition, if τ_n := inf{t : |X(t)| > n} then (τ_n) is a localizing sequence, |X(τ_n−)| ≤ n, and therefore

|X^{τ_n}| ≤ n + |ΔX(τ_n)| ≤ n + a.
Example 1.153 In the previous propositions one cannot drop the condition of regularity.
The process

X(t) := 1/t if t > 0,  0 if t = 0

is continuous from the left but not regular, and it is obviously not locally bounded. The process

X(t) := 1/(1 − t) if t < 1,  0 if t ≥ 1

is continuous from the right but it is also not locally bounded.
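The first-passage times τ_a and the jump overshoot discussed above can be sketched on discretized paths (a hypothetical illustration; the paths and levels are arbitrary choices, not from the text). A right-continuous path can exceed the level a by the size of its jump, so a + |ΔX(τ_a)| is the best available bound, while a finely sampled continuous path essentially cannot overshoot.

```python
# Hypothetical sketch of tau_a = inf{t : |X(t)| > a} on sampled paths:
# a path with jumps can overshoot the level a by the size of its jump.

def first_passage(path, a):
    # path: list of (t, X(t)) samples; returns (t, X(t)) at the first time
    # the sampled path exceeds level a in absolute value, or None.
    for t, x in path:
        if abs(x) > a:
            return t, x
    return None

# a right-continuous step path that jumps from 1 to 5 at t = 2
step_path = [(0, 0.0), (1, 1.0), (2, 5.0), (3, 5.0)]
t, x = first_passage(step_path, a=2.0)
assert (t, x) == (2, 5.0)                # overshoot: |X(tau_a)| = 5 > a = 2
assert abs(x) <= 2.0 + abs(5.0 - 1.0)    # the bound a + |jump| holds

# a finely sampled continuous ramp X(t) = t cannot overshoot by more than
# one grid increment, mirroring the bound |X(tau_a)| <= a of the text
ramp_path = [(k / 1000, k / 1000) for k in range(4001)]
t, x = first_passage(ramp_path, a=2.0)
assert abs(x - 2.0) <= 1 / 1000
```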
2
STOCHASTIC INTEGRATION WITH LOCALLY SQUARE-INTEGRABLE MARTINGALES

In this chapter we shall present a relatively simple introduction to stochastic integration theory. Our main simplifying assumption is that the integrators are locally square-integrable martingales. Every continuous process is locally bounded, hence the space H²_loc contains the continuous local martingales. In most of the applications the integrator is continuous, therefore in this chapter we shall mainly concentrate on the continuous case. As we shall see, the slightly more general case, when the integrator is in H²_loc, is nearly the same as the continuous one.

The central concept of this chapter is the quadratic variation [X]. We shall show that if X is a continuous local martingale then [X] is continuous, increasing and X² − [X] is also a local martingale. It is a crucial observation that in the continuous case these properties characterize the quadratic variation. When the integrator X is discontinuous then the quadratic variation [X] is also discontinuous. As in the continuous case, X² − [X] is still a local martingale, but this property does not characterize the quadratic variation for local martingales in general. The jump process of the quadratic variation satisfies the identity Δ[X] = (ΔX)², and [X] is the only right-continuous, increasing process for which X² − [X] is a local martingale and the identity Δ[X] = (ΔX)² holds.

When the integrators are continuous one can define the stochastic integral for progressively measurable integrands. The main difference between the continuous and the H²_loc case is that in the discontinuous case we should take into account the jumps of the integral. Because of this extra burden, in the discontinuous case one can define the stochastic integral only when the integrands are predictable. In the first part of the chapter we shall introduce the so-called Itô–Stieltjes integral.
We shall use the existence theorem of the Itô–Stieltjes integral to prove the existence of the quadratic variation. After this, we present the construction of the stochastic integral when the integrators are continuous local martingales. At the end of the chapter we briefly discuss the difference between the continuous and the H²_loc case.

In the present chapter we assume that the filtration is right-continuous and that if N ∈ A has probability zero, then N ∈ F_s for all s. But we shall not need the assumption that (Ω, A, P) is complete.
2.1 The Itô–Stieltjes Integrals
In this section we introduce the simplest concept of stochastic integration, which I prefer to call Itô–Stieltjes integration. Every integral is basically a limit of certain approximating sums. The meaning of the integral is generally obvious for the finite approximations and by definition the integral operator extends the meaning of the finite sums to some more complicated infinite objects. In stochastic integration theory we have two stochastic processes: the integrator X and the integrand Y. As in elementary analysis, let us fix an interval [a, b] and let

Δ_n : a = t_0^{(n)} < t_1^{(n)} < · · · < t_{m_n}^{(n)} = b   (2.1)

be a partition of [a, b]. For a fixed partition Δ_n let us define the finite approximating sum

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) ( X(t_k^{(n)}) − X(t_{k−1}^{(n)}) ),

where the test points τ_k^{(n)} have been chosen in some way from the time subintervals [t_{k−1}^{(n)}, t_k^{(n)}]. If the integrator X is the price of some risky asset then X(t_k^{(n)}) − X(t_{k−1}^{(n)}) is the change of the price during the time interval [t_{k−1}^{(n)}, t_k^{(n)}], and if Y(τ_k^{(n)}) is the number of assets one holds during this time period then S_n is the net change of the value of the portfolio during the whole time period [a, b]. If

lim_{n→∞} max_k ( t_k^{(n)} − t_{k−1}^{(n)} ) = 0

then the sequence of partitions (Δ_n) is called infinitesimal. In this section we say that the integral ∫_a^b Y dX exists if for any infinitesimal sequence of partitions of [a, b] the sequence of approximating sums (S_n) is convergent and the limit is independent of the partition (Δ_n). The main problem is the following: under which conditions and in which sense does the limit lim_{n→∞} S_n exist? Generally we can only guarantee that the approximating sequence (S_n) is convergent in probability, and for the existence of the integral we should assume that the test points τ_k^{(n)} have been chosen in a very restricted way. That is, we should assume that τ_k^{(n)} = t_{k−1}^{(n)}. This type of integral we shall call the Itô–Stieltjes integral of Y against X. Perhaps the most important and most unusual point in the theory is that we should restrict the choice of the test points τ_k^{(n)}. The simplest example showing why it is necessary follows:

Example 2.1 Let w be a Wiener process. Try to define the integral
∫_a^b w dw!
Consider the approximating sums

S_n := Σ_k w(t_k^{(n)}) ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) )

and

I_n := Σ_k w(t_{k−1}^{(n)}) ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) ).
In the first case τ_k^{(n)} := t_k^{(n)} and in the second case τ_k^{(n)} := t_{k−1}^{(n)}. Obviously

S_n − I_n = Σ_k ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) )²,

which is the approximating sum for the quadratic variation of the Wiener process. As we will prove1, if n → ∞ then in L²(Ω)-norm

lim_{n→∞} (S_n − I_n) = b − a ≠ 0,
that is, the limit of the approximating sums depends on the choice of the test points τ_k^{(n)}. As the interpretation of the stochastic integral is basically the net gain of some gambling process, it is quite reasonable to choose τ_k^{(n)} as t_{k−1}^{(n)}, as one should decide about the size of a portfolio before the prices change, since it is quite unrealistic to assume that one can decide about the size of an investment after the new prices have already been announced. It is very simple to see that

I_n = (1/2) Σ_k [ ( w²(t_k^{(n)}) − w²(t_{k−1}^{(n)}) ) − ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) )² ] =
= (1/2) ( w²(b) − w²(a) ) − (1/2) Σ_k ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) )²,

1 See: Example 2.27, page 129, Theorem B.17, page 571.
hence

lim_{n→∞} I_n = (1/2) [w²(t)]_a^b − (1/2)(b − a) = (1/2) w²(b) − (1/2) w²(a) − (1/2)(b − a),
and similarly

lim_{n→∞} S_n = (1/2) [w²(t)]_a^b − (1/2)(b − a) + (b − a) = (1/2) w²(b) − (1/2) w²(a) + (1/2)(b − a).
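The two limits of Example 2.1 can be checked by simulation (a hypothetical sketch using only the standard library; the step count, path count and tolerances are arbitrary choices): on [a, b] = [0, 1] the right-point sums S_n exceed the left-point (Itô) sums I_n by approximately the quadratic variation b − a = 1.

```python
import random

# Monte Carlo sketch (not from the text) of Example 2.1 on [a, b] = [0, 1]:
# the left-point sums I_n and right-point sums S_n of "w dw" differ, in the
# limit, by the quadratic variation b - a = 1.

def wiener_sums(steps, rng):
    dt = 1.0 / steps
    w, s_right, s_left = 0.0, 0.0, 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, dt ** 0.5)   # Wiener increment over one step
        s_left += w * dw                  # Ito choice: tau_k = t_{k-1}
        s_right += (w + dw) * dw          # tau_k = t_k
        w += dw
    return s_right, s_left, w

rng = random.Random(0)
paths = [wiener_sums(2000, rng) for _ in range(200)]

# S_n - I_n is the sum of squared increments; its mean is exactly b - a = 1
gap = sum(s - i for s, i, _ in paths) / len(paths)
assert abs(gap - 1.0) < 0.05

# I_n is close to w(1)^2 / 2 - 1/2 on each path (here checked in the mean)
ito_err = sum(i - (w * w / 2 - 0.5) for _, i, w in paths) / len(paths)
assert abs(ito_err) < 0.05
```

The tolerances are generous: the variance of the per-path quadratic variation is 2/steps, so the averaged discrepancies are far smaller than 0.05.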
2.1.1 Itô–Stieltjes integrals when the integrators have finite variation
Integration theory is quite simple when the trajectories of the integrator X have finite variation on any finite interval. As a point of departure it is worth recalling a classical theorem from elementary analysis. The following simple proposition is well-known and it is just a parametrized version of one of the most important existence theorems of the calculus.

Proposition 2.2 (Existence of Riemann–Stieltjes integrals) Let us fix a finite time interval [a, b]. If the trajectories of the integrator X have finite variation and the integrand Y is continuous, then for all outcomes ω the limit of the approximating sums

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) ( X(t_k^{(n)}) − X(t_{k−1}^{(n)}) )   (2.2)

exists and it is independent of the choice of the infinitesimal sequence of partitions (2.1) and of the choice of the test points τ_k^{(n)} ∈ [t_{k−1}^{(n)}, t_k^{(n)}].

Proof. As the trajectories Y(ω) are continuous on [a, b] they are uniformly continuous, and therefore for any ε > 0 there is a δ(ω) > 0 such that if |t′ − t″| < δ(ω), then2

|Y(t′, ω) − Y(t″, ω)| < ε / Var(X(ω), a, b).   (2.3)

2 We can assume that Var(X(ω), a, b) > 0, otherwise X(ω) is constant on [a, b] and the integral trivially exists.
If all partitions of [a, b] are finer than δ(ω)/2, that is, if for all n

max_k ( t_k^{(n)} − t_{k−1}^{(n)} ) < δ(ω)/2,

then by (2.3)

0 ≤ |S_i − S_j| = | Σ_k Y(τ_k^{(i)}) ( X(t_k^{(i)}) − X(t_{k−1}^{(i)}) ) − Σ_l Y(τ_l^{(j)}) ( X(t_l^{(j)}) − X(t_{l−1}^{(j)}) ) | =
= | Σ_r ( Y(θ_r^{(i)}) − Y(θ_r^{(j)}) ) ( X(s_r) − X(s_{r−1}) ) | ≤
≤ max_r | Y(θ_r^{(i)}) − Y(θ_r^{(j)}) | Σ_r | X(s_r) − X(s_{r−1}) | ≤
≤ max_r | Y(θ_r^{(i)}) − Y(θ_r^{(j)}) | · Var(X, a, b) ≤ ε,

where (s_r) is any partition containing the points (t_k^{(i)}) and (t_l^{(j)}), and θ_r^{(i)} and θ_r^{(j)} are the original test points τ_k^{(i)} and τ_k^{(j)} corresponding to [s_{r−1}, s_r], respectively. So for any ω, (S_n(ω)) is a Cauchy sequence, so for all ω the limit

( ∫_a^b Y dX )(ω) := lim_{n→∞} S_n(ω)

exists. If (S_p) and (S_q) are two different approximating sequences generated by different infinitesimal sequences of partitions of [a, b], or they belong to different choices of test points, and

I_n := S_p if n = 2p,  S_q if n = 2q − 1,

then by the argument just presented (I_n) also has a limit, which is of course the common limit of (S_p) and (S_q). Hence the limit does not depend on the infinitesimal sequence of partitions (t_k^{(n)}) and does not depend on the way of choosing the test points (τ_k^{(n)}).

Definition 2.3 If the value of the integral is independent of the choice of the test points (τ_k^{(n)}) then the integral is called the Riemann–Stieltjes integral of Y against X. Of course the integral is denoted by ∫_a^b Y dX.
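Proposition 2.2 is easy to illustrate numerically (a hypothetical sketch; the integrand Y(t) = t², integrator X(t) = t³ and mesh are arbitrary choices): for a smooth finite-variation integrator and continuous integrand, left-point, right-point and midpoint sums all tend to the same value, here ∫₀¹ t² d(t³) = ∫₀¹ 3t⁴ dt = 3/5.

```python
# Hypothetical numerical check of Proposition 2.2: with the finite-variation
# integrator X(t) = t^3 and the continuous integrand Y(t) = t^2 on [0, 1],
# every choice of test points yields the same limit int Y dX = 3/5.

def rs_sum(Y, X, a, b, n, rule):
    total = 0.0
    for k in range(1, n + 1):
        t0, t1 = a + (b - a) * (k - 1) / n, a + (b - a) * k / n
        tau = {"left": t0, "right": t1, "mid": (t0 + t1) / 2}[rule]
        total += Y(tau) * (X(t1) - X(t0))
    return total

Y, X = (lambda t: t * t), (lambda t: t ** 3)
sums = {rule: rs_sum(Y, X, 0.0, 1.0, 20_000, rule)
        for rule in ("left", "right", "mid")}
for value in sums.values():
    assert abs(value - 0.6) < 1e-3        # all three rules approach 3/5
# the rules agree with each other up to the mesh error
assert abs(sums["left"] - sums["right"]) < 1e-3
```

This is precisely what fails for a Wiener-process integrator in Example 2.1 of the next section: there the left- and right-point sums stay a fixed distance apart however fine the mesh.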
Example 2.4 If Y and X have common points of discontinuity then the Riemann–Stieltjes integral ∫_a^b Y dX does not exist.

If

Y(t) := 0 if t ≤ 0,  1 if t > 0,   and   X(t) := 0 if t < 0,  1 if t ≥ 0,

then the Riemann–Stieltjes integral ∫_{−1}^{1} X dY does not exist. If τ_k^{(n)} ≤ 0 for the subinterval containing t = 0 then S_n = 0, otherwise S_n = 1. Observe that if the test point τ_k^{(n)} is the left endpoint of the subinterval, then S_n = 0, hence the so-called Itô–Stieltjes integral3 is zero.

Our goal is to extend the integral to discontinuous integrands. As a first step, we extend the integral to regular integrands. As we saw in the previous example, even for left-regular integrands we cannot choose the test points τ_k^{(n)} arbitrarily.
Definition 2.5 If the value of the test point τ_k^{(n)} is always the left endpoint of the subinterval [t_{k−1}^{(n)}, t_k^{(n)}], that is, if τ_k^{(n)} = t_{k−1}^{(n)} for all k, then the integral is called the Itô–Stieltjes integral of Y against X. Of course the Itô–Stieltjes integrals are also denoted by ∫_a^b Y dX.

Example 2.6 If f is a simple predictable jump, that is

f(t) := c₁ if t ≤ t₀,  c₂ if t > t₀,

then for any regular function g the Itô–Stieltjes integral is

∫_a^b f dg = c₁ ( g(t₀+) − g(a) ) + c₂ ( g(b) − g(t₀+) ).   (2.4)

If f is a simple jump, that is

f(t) := c₁ if t < t₀,  c₃ if t = t₀,  c₂ if t > t₀,

then for any right-regular function g the Itô–Stieltjes integral is again (2.4).

3 See the definition below.
If t₀ = b then by definition g(t₀+) = g(b+) = g(b), so in this case (2.4) is obvious. Let (t_k^{(n)}) be an infinitesimal sequence of partitions. By the definition of the integral

S_n := Σ_k f(t_{k−1}^{(n)}) ( g(t_k^{(n)}) − g(t_{k−1}^{(n)}) ) =
= c₁ ( g(t_j^{(n)}) − g(a) ) + c₂ ( g(b) − g(t_j^{(n)}) ),

where t₀ ∈ [t_{j−1}^{(n)}, t_j^{(n)}). If n → ∞, then t_j^{(n)} ↘ t₀+, and as g is regular the limit lim_n S_n exists and it is equal to the formula given. Assume that g is right-regular. If t₀ ≠ t_{j−1}^{(n)} then the approximating sums do not change. If t₀ = t_{j−1}^{(n)} then

S_n = c₁ ( g(t₀) − g(a) ) + c₃ ( g(t_j^{(n)}) − g(t₀) ) + c₂ ( g(b) − g(t_j^{(n)}) ).

g is right-continuous at t₀ so g(t_j^{(n)}) − g(t₀) → 0, hence the limit is again the same as in the previous case. One can easily generalize the example above4:

Lemma 2.7 If every trajectory of the integrand Y is a step function with a finite number of jumps and X is a right-continuous process, then for arbitrary a < b the Itô–Stieltjes integral ∫_a^b Y dX exists and the approximating sums converge for every outcome ω.

Example 2.8 If f is a simple spike, that is, if

f(t) := c if t = t₀,  0 if t ≠ t₀,

then for any right-continuous integrator the Itô–Stieltjes integral of f is zero.
The approximating sum is

S_n = 0 if t₀ ≠ t_j^{(n)} for every j,   S_n = c · ( g(t_{j+1}^{(n)}) − g(t_j^{(n)}) ) if t₀ = t_j^{(n)}.

In the first case of course lim_n S_n = 0; in the second case, as g is right-continuous,

lim_{n→∞} S_n = c lim_{n→∞} ( g(t_{j+1}^{(n)}) − g(t_j^{(n)}) ) = c lim_{n→∞} ( g(t_{j+1}^{(n)}) − g(t₀) ) = 0.

4 Let us observe that the Itô–Stieltjes integral is, trivially, additive.
Observe that if g has bounded variation, then g defines a signed measure on R. The Lebesgue–Stieltjes integral is ∫_a^b f dg = f(t₀)Δg(t₀), which is different from the Itô–Stieltjes integral. Later5 we shall show that for left-regular processes the Lebesgue–Stieltjes and the Itô–Stieltjes integrals are equal but, as in this case f is not left-regular, the theorem is not applicable6.

We shall very often use the following simple observation:

Proposition 2.9 (The existence of the Itô–Stieltjes integral) If the integrator X is right-continuous7 and it has finite variation, and the integrand Y is regular, then for any time interval [a, b] the Itô–Stieltjes integral ∫_a^b Y dX exists and for all outcomes ω the approximating sequences

I_n(ω) := Σ_k Y(t_{k−1}^{(n)}, ω) ( X(t_k^{(n)}, ω) − X(t_{k−1}^{(n)}, ω) )

are convergent.

Proof. The proof is similar to the proof of the existence of Riemann–Stieltjes integrals. Fix an outcome ω and let (I_n) be the sequence of the approximating sums. Fix an ε > 0 and an outcome ω. By the regularity of Y(ω) there are only a finite number of jumps bigger than8

c := ε / ( 4 · Var(X(ω), a, b) ).

Let

J := ΔY · χ(|ΔY| ≥ c)  and  Z := Y − J.

1. Let us denote by (I_n^{(J)}) the approximating sums formed with J. As Y is regular the number of 'big jumps' on every trajectory is finite. X is right-continuous, hence by the previous lemma the integral ∫_a^b J(ω) dX(ω) exists for any ω. Hence if i and j are big enough, then

| I_i^{(J)}(ω) − I_j^{(J)}(ω) | ≤ ε/2.

5 It is an easy consequence of the Dominated Convergence Theorem. See: Theorem 2.88, page 174. See also the properties of the stochastic integral on page 434.
6 Recall that the Riemann–Stieltjes integral ∫_a^b f dg does not exist.
7 If X is not right-continuous then we should assume that Y is left-regular.
8 See: Proposition 1.5, page 5. We can assume that Var(X(ω), a, b) > 0, otherwise X(ω) is constant on [a, b] and the proposition is trivially satisfied.
2. Finally let us define the approximating sums

I_n^{(Z)} := Σ_k Z(t_{k−1}^{(n)}, ω) ( X(t_k^{(n)}, ω) − X(t_{k−1}^{(n)}, ω) ).

The jumps of Z are smaller than c and Z is regular, hence9 there is a δ(ω) such that if |s − t| ≤ δ(ω) then |Z(s, ω) − Z(t, ω)| ≤ 2c. If max_k ( t_k^{(n)} − t_{k−1}^{(n)} ) ≤ δ(ω)/2 for all n ≥ N then, as in the case of the ordinary Riemann–Stieltjes integral,

| I_i^{(Z)}(ω) − I_j^{(Z)}(ω) | ≤ 2c · Var(X(ω), a, b) ≤ ε/2.

3. Adding up the two inequalities above, if i and j are sufficiently large then

| I_i(ω) − I_j(ω) | ≤ | I_i^{(J)}(ω) − I_j^{(J)}(ω) | + | I_i^{(Z)}(ω) − I_j^{(Z)}(ω) | ≤ ε.   (2.5)

This means that (I_n(ω)) is a Cauchy sequence for any ω. The rest of the proof is the same as the last part of the proof of the previous proposition.

Example 2.10 The Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal.
One should emphasize that as X has bounded variation one can also define the pathwise Lebesgue–Stieltjes integral of Y with respect to the measures generated by the trajectories of X. If Y is left-continuous then

Y = lim_{n→∞} Σ_k Y(t_{k−1}^{(n)}) χ( (t_{k−1}^{(n)}, t_k^{(n)}] ),

so by the Dominated Convergence Theorem the two integrals are equal. But in general the Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal. If

Y(t) = X(t) := 0 if t < 1/2,  1 if t ≥ 1/2,

then the measure generated by X is the Dirac measure δ_{1/2}, so the Lebesgue–Stieltjes integral over (0, 1] is one, while the Itô–Stieltjes integral is zero10.

9 See: Proposition 1.7, page 6.
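The discrepancy of Example 2.10 is easy to reproduce numerically (a hypothetical sketch; the partitions are arbitrary choices): the left-point Itô-type sums of X against X over (0, 1] vanish for every partition, while the Lebesgue–Stieltjes value is Y(1/2) · ΔX(1/2) = 1.

```python
# Hypothetical check of Example 2.10: Y = X = indicator of [1/2, 1].
# The Ito-Stieltjes (left-point) sums are 0, while the Lebesgue-Stieltjes
# integral against the Dirac measure at 1/2 is 1.

def ind(t):
    return 1.0 if t >= 0.5 else 0.0

def ito_sum(n):
    # left-point sums over the partition k/n of (0, 1]; the only nonzero
    # increment of X sits on the subinterval whose left endpoint is strictly
    # below 1/2, where the integrand is 0, so every summand vanishes
    return sum(ind((k - 1) / n) * (ind(k / n) - ind((k - 1) / n))
               for k in range(1, n + 1))

assert all(ito_sum(n) == 0.0 for n in (3, 10, 101, 1000))

# Lebesgue-Stieltjes: dX is a unit point mass at 1/2, so the integral of
# Y over (0, 1] is Y(1/2) * 1 = 1.
assert ind(0.5) * 1.0 == 1.0
```

When 1/2 is itself a partition point, the subinterval starting at 1/2 carries a zero increment and the subinterval ending at 1/2 has a left endpoint below 1/2, so the sum is still zero.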
2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales
Perhaps the most important stochastic processes are the Wiener processes. As the trajectories of Wiener processes almost surely do not have finite variation11, we cannot apply the previous construction when the integrator is a Wiener process.

Theorem 2.11 (Fisk) Let L be a continuous local martingale. If the trajectories of L have finite variation then for almost all outcomes ω the trajectories of L are constant functions.

Proof. Consider the local martingale M := L − L(0). It is sufficient to prove that M = 0. Let V := Var(M) and let (ρ_n) be a localizing sequence of M. As the variation of a continuous function is continuous,

υ_n(ω) := inf{t : |M(t, ω)| ≥ n}  and  κ_n(ω) := inf{t : V(t, ω) ≥ n}

are stopping times. Hence τ_n := υ_n ∧ κ_n ∧ ρ_n is also a stopping time. Obviously τ_n ↗ ∞, hence if M^{τ_n} = 0 for all n then M is zero on [0, τ_n] for all n and therefore M will be zero on ∪_n [0, τ_n] = R₊ × Ω, so M = 0. As the trajectories of M^{τ_n} and V^{τ_n} are bounded, one can assume that M and V := Var(M) are bounded. Let (t_k^{(n)}) be an arbitrary infinitesimal sequence of partitions of [0, t]. By the energy identity12 if u > v then

E( (M(u) − M(v))² ) = E( M²(u) − M²(v) ),   (2.6)

hence as M(0) = 0

E( M²(t) ) = E( M²(t) ) − E( M²(0) ) = E( Σ_k [ M²(t_k^{(n)}) − M²(t_{k−1}^{(n)}) ] ) = E( Σ_k ( M(t_k^{(n)}) − M(t_{k−1}^{(n)}) )² ).

10 See: Example 2.6, page 113.
11 See: Theorem B.17, page 571.
12 See: Proposition 1.58, page 35.
V is bounded, hence V := Var(M) ≤ c.

E( M²(t) ) ≤ E( Σ_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | · max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | ) ≤
≤ E( V(t) · max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | ) ≤
≤ c · E( max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | ).

The trajectories of M are continuous, hence they are uniformly continuous on [0, t], so

lim_{n→∞} max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | = 0.

On the other hand

max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | ≤ V(t) ≤ c,

so we can use the Dominated Convergence Theorem:

lim_{n→∞} E( max_k | M(t_k^{(n)}) − M(t_{k−1}^{(n)}) | ) = 0.
Hence M(t) = 0 almost surely for every t. The trajectories of M are continuous and therefore13 for almost all outcomes ω one has that M(t, ω) = 0 for all t.

This means that when the integrators are continuous local martingales we need another approach. First we prove two very simple lemmata:

Lemma 2.12 Let (M_k, F_k) be a discrete-time martingale and let (N_k) be an F := (F_k) adapted process. If the variables N_{k−1} · (M_k − M_{k−1}) are integrable then the sequence

Z_0 := 0,  Z_n := Σ_{k=1}^{n} N_{k−1} · (M_k − M_{k−1})

13 See: Proposition 1.9, page 7.
is an F-martingale. Specifically, if N is uniformly bounded and M is an arbitrary discrete-time martingale then Z is a martingale.

Proof. By the assumptions N_{k−1} · (M_k − M_{k−1}) is integrable, hence if k − 1 ≥ m then

E( N_{k−1}(M_k − M_{k−1}) | F_m ) = E( E( N_{k−1}(M_k − M_{k−1}) | F_{k−1} ) | F_m ) =
= E( N_{k−1} E( M_k − M_{k−1} | F_{k−1} ) | F_m ) = E( N_{k−1} · 0 | F_m ) = 0,

from which the lemma is evident.

Lemma 2.13 Let (M_k, F_k) be a discrete-time L²(Ω)-valued martingale. If (N_k) is an F-adapted sequence with |N_k| ≤ c and

Z_0 := 0,  Z_n := Σ_{k=1}^{n} N_{k−1} · (M_k − M_{k−1}),
then

‖Z_n‖₂ ≤ c √( ‖M_n‖₂² − ‖M_0‖₂² ).

Proof. By the previous lemma (Z_n) is a martingale, so by the energy equality

‖Z_n‖₂² = Σ_{k=1}^{n} ‖ N_{k−1}(M_k − M_{k−1}) ‖₂².

Using the energy equality again

‖Z_n‖₂² ≤ c² Σ_{k=1}^{n} ‖ M_k − M_{k−1} ‖₂² = c² Σ_{k=1}^{n} ( ‖M_k‖₂² − ‖M_{k−1}‖₂² ) = c² ( ‖M_n‖₂² − ‖M_0‖₂² ).

First we prove the existence of the integral for continuous integrands.

Proposition 2.14 (Existence of Itô–Stieltjes integrals for continuous integrands) If X ∈ H² and Y is adapted and continuous on a finite interval
[a, b] then the Itô–Stieltjes integral ∫_a^b Y dX exists and the approximating sums

I_n := Σ_k Y(t_{k−1}^{(n)}) ( X(t_k^{(n)}) − X(t_{k−1}^{(n)}) )

converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation.

1. The basic, but not entirely correct, trick is that as Y is continuous it is uniformly continuous, hence if I_n and I_m are two approximating sums of the integral then, rewriting both over the common refinement (s_r) with corresponding left endpoints θ_r^{(n)} and θ_r^{(m)}, by the previous lemma

‖ I_n − I_m ‖₂ = ‖ Σ_r ( Y(θ_r^{(n)}) − Y(θ_r^{(m)}) ) ( X(s_r) − X(s_{r−1}) ) ‖₂ ≤ c √( ‖X(b)‖₂² − ‖X(a)‖₂² ).

Of course the main problem with this estimation is that one cannot guarantee that for any fixed partition

| Y(θ_r^{(n)}, ω) − Y(θ_r^{(m)}, ω) | ≤ c   (2.7)

for every ω. What one can show is that if the partitions (t_k^{(n)}) and (t_k^{(m)}) are sufficiently fine then, outside of an event with small probability, the estimation (2.7) is valid. That is the reason why one can prove only that the approximating sums converge in probability and not in L²(Ω).

2. To show the correct proof fix an α and a β and let

c := √( βα² / ( 2 ( ‖X(b)‖₂² − ‖X(a)‖₂² ) ) ).

For every δ > 0 let us define the modulus of continuity of Y:

M_δ(ω, u) := sup { |Y(t, ω) − Y(s, ω)| : |t − s| ≤ δ, t, s ∈ [a, u] }.

As Y is continuous one can calculate the supremum when s and t are rational numbers, so M_δ is adapted, and as Y is continuous obviously M_δ is also continuous.
Y is continuous, so every trajectory of Y is uniformly continuous on [a, b], hence for every ω

lim_{δ↘0} M_δ(ω, b) = 0.

This means that if δ is sufficiently small then

P( M_δ(b) ≥ c ) ≤ β/2.

Fix this δ and let us define the stopping time

τ := inf{ u : M_δ(u) ≥ c } ∧ b.

As τ is a stopping time, Z := Y^τ is adapted, and if |x − y| ≤ δ then |Z(x) − Z(y)| ≤ c. Let

I_n^{(Z)} := Σ_k Z(t_{k−1}^{(n)}) ( X(t_k^{(n)}) − X(t_{k−1}^{(n)}) ).

If the partitions (t_k^{(i)}) and (t_k^{(j)}) are finer than δ/2 then by the previous lemma

‖ I_i^{(Z)} − I_j^{(Z)} ‖₂² ≤ c² ( ‖X(b)‖₂² − ‖X(a)‖₂² ) = βα²/2.

Let A := { M_δ(b) ≥ c }. It is easy to see that Z = Y on A^c. By Chebyshev's inequality

P( |I_i − I_j| > α ) = P( {|I_i − I_j| > α} ∩ A ) + P( {|I_i − I_j| > α} ∩ A^c ) ≤
≤ P(A) + P( {|I_i − I_j| > α} ∩ A^c ) = P(A) + P( { |I_i^{(Z)} − I_j^{(Z)}| > α } ∩ A^c ) ≤
≤ β/2 + P( |I_i^{(Z)} − I_j^{(Z)}| > α ) ≤ β/2 + ‖ I_i^{(Z)} − I_j^{(Z)} ‖₂² / α² ≤
≤ β/2 + c² ( ‖X(b)‖₂² − ‖X(a)‖₂² ) / α² = β/2 + β/2 = β.

Hence (I_n) is convergent in probability. Now we generalize the theorem for regular integrands.
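Before moving on, the discrete martingale-transform bound of Lemma 2.13, which drives the estimate above, can be checked by exhaustive enumeration on coin-flip paths (a hypothetical sketch; the predictable "betting rule" predictable_N below is an arbitrary bounded function of the past, not from the text).

```python
from itertools import product

# Exhaustive check, on all coin-flip paths of length 4, of Lemma 2.13:
# for the symmetric random walk M and any predictable sequence N with
# |N| <= c, the transform Z_n = sum N_{k-1} (M_k - M_{k-1}) satisfies
# E(Z_n^2) <= c^2 (E(M_n^2) - E(M_0^2)).

STEPS, C = 4, 2.0

def predictable_N(past):
    # an arbitrary bounded function of the path so far (a hypothetical
    # choice): bet harder after each loss, but never more than C
    return min(C, 1.0 + sum(1 for e in past if e < 0))

z_sq = m_sq = 0.0
for eps in product((-1, 1), repeat=STEPS):   # all 16 equally likely paths
    m, z = 0.0, 0.0
    for k, e in enumerate(eps):
        z += predictable_N(eps[:k]) * e      # N_{k-1} * (M_k - M_{k-1})
        m += e
    z_sq += z * z / 2 ** STEPS               # accumulates E(Z_n^2)
    m_sq += m * m / 2 ** STEPS               # accumulates E(M_n^2)

assert m_sq == STEPS                         # energy identity: E(M_n^2) = n
assert z_sq <= C ** 2 * (m_sq - 0.0)         # Lemma 2.13's bound
```

Since the multiplier N is decided before each increment is revealed, the cross terms in E(Z_n²) vanish, which is exactly the content of the energy equality used in the proof of Lemma 2.13.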
Proposition 2.15 (The existence of the Itô–Stieltjes integral for H² integrators) If on a finite interval [a, b] the adapted stochastic process Y is regular and X ∈ H², then the Itô–Stieltjes integral ∫_a^b Y dX exists and the Itô-type approximating sums converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation. Let (I_n) be an approximating sequence of the integral ∫_a^b Y dX. Fix an ε and a β. Let again

c := √( βε² / ( 48 ( ‖X(b)‖₂² − ‖X(a)‖₂² ) ) ),

J := ΔY · χ(|ΔY| ≥ c),  Z := Y − J.

1. As the trajectories of Y are regular, for any ω the trajectory Y(ω) has a finite number of jumps which are larger than c. X ∈ H² and by definition X is right-continuous, hence the integral ∫_a^b J dX exists. As it converges for every outcome ω, it converges stochastically as well, so if i and j are big enough, then

P( | I_i^{(J)} − I_j^{(J)} | > ε/2 ) ≤ β/3.

2. The jumps of Z are smaller than c. As in the continuous case14, if δ > 0 is small enough then there is a stopping time τ such that

P(τ < b) := P(A) ≤ β/3

and if |x − y| ≤ δ then |Z(x) − Z(y)| ≤ 2c on the random interval [a, τ]. If V := Z^τ then |V(x) − V(y)| ≤ 2c whenever |x − y| ≤ δ. If the partitions (t_k^{(i)}) and (t_k^{(j)}) are finer than δ/2 then, again as in the continuous case,

‖ I_i^{(V)} − I_j^{(V)} ‖₂² ≤ (2c)² ( ‖X(b)‖₂² − ‖X(a)‖₂² ).

By Chebyshev's inequality

P( | I_i^{(V)} − I_j^{(V)} | > ε/2 ) ≤ (2c)² ( ‖X(b)‖₂² − ‖X(a)‖₂² ) / (ε/2)² = β/3.

14 See: Proposition 1.7, page 6.
3. If i and j are big enough, then

P( |I_i − I_j| > ε ) ≤ P( | I_i^{(J)} − I_j^{(J)} | > ε/2 ) + P( | I_i^{(Z)} − I_j^{(Z)} | > ε/2 ) ≤
≤ β/3 + P( | I_i^{(Z)} − I_j^{(Z)} | > ε/2 ) ≤
≤ β/3 + P(A) + P( A^c ∩ { | I_i^{(Z)} − I_j^{(Z)} | > ε/2 } ) ≤
≤ 2β/3 + P( | I_i^{(V)} − I_j^{(V)} | > ε/2 ) ≤ β.
This means that (I_n) is a Cauchy sequence in probability and hence it converges in probability.

Corollary 2.16 Let Y be an adapted, regular process on a finite interval [a, b]. If X ∈ H²_loc then the Itô–Stieltjes integral ∫_a^b Y dX exists and the approximating sums converge in probability.

Proof. Assume that X ∈ H²_loc and let (τ_n) be a localizing sequence of X. As τ_n ↗ ∞, for any β > 0 if s is big enough then P(τ_s ≤ b) < β/2. Let

I_n := Σ_k Y(t_{k−1}^{(n)}) ( X(t_k^{(n)}) − X(t_{k−1}^{(n)}) ),
S_n := Σ_k Y(t_{k−1}^{(n)}) ( X^{τ_s}(t_k^{(n)}) − X^{τ_s}(t_{k−1}^{(n)}) ).

For any α > 0

P( |I_n − I_m| > α ) ≤ P(τ_s ≤ b) + P( |I_n − I_m| > α, τ_s ≥ b ) ≤
≤ β/2 + P( |I_n − I_m| > α, τ_s ≥ b ) ≤
≤ β/2 + P( |S_n − S_m| > α ).

As X^{τ_s} ∈ H², by the previous proposition P( |S_n − S_m| > α ) → 0. Hence (I_n) is a stochastic Cauchy sequence, so it is convergent in probability.
2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales
As we can integrate with respect to processes with finite variation and with respect to locally square-integrable martingales, the next definition is very natural:

Definition 2.17 An adapted process X is called a semimartingale if X has a decomposition

X = X(0) + V + H,   (2.8)

where V is a right-continuous, adapted process with finite variation, H ∈ H²_loc, and V(0) = H(0) = 0.
It is important to emphasize that at the moment we do not know too much about the class of semimartingales. As there are martingales which are not locally square-integrable it is not even evident from the definition that every martingale is a semimartingale. Later we shall prove that every local martingale is a semimartingale in the above sense15 . We shall later also prove that every integrable sub- and supermartingale is a semimartingale16 . Therefore the class of semimartingales is a very broad one. Every continuous local martingale is locally square-integrable 17 , therefore in the continuous case we can use the following definition: Definition 2.18 An adapted continuous stochastic process X is called a continuous semimartingale if X has a decomposition (2.8) where H is a continuous local martingale and V is a continuous, adapted process with finite variation. Proposition 2.19 If X is a continuous semimartingale then the decomposition (2.8) is unique. Proof. If X = X (0)+H1 +V1 and X = X (0)+H2 +V2 then H1 −H2 = V2 −V1 is a continuous local martingale having finite variation. Hence by Fisk’s theorem18 H1 − H2 = V1 − V2 = 0. Example 2.20 For discontinuous semimartingales the decomposition (2.8) necessarily unique.
is not
15. This is the so-called Fundamental Theorem of Local Martingales. See Theorem 3.57, page 220.
16. This is a direct consequence of the so-called Doob–Meyer decomposition. See Proposition 5.11, page 303.
17. See Example 1.137, page 96.
18. See Theorem 2.11, page 117.
THE ITÔ–STIELTJES INTEGRALS
The simplest example is the compensated Poisson process. If π is a Poisson process with parameter λ then the compensated Poisson process X(t) ≜ π(t) − λt is in H²_loc and the trajectories of X on any finite interval have finite variation. So H ≜ X, V ≜ 0 and H ≜ 0, V ≜ X are both proper decompositions of X.

Almost surely convergent sequences are convergent in probability, therefore one can easily prove the following theorem:

Theorem 2.21 (Existence of Itô–Stieltjes integrals) If X is a semimartingale and Y is a regular and adapted process then for any finite interval [a,b] the Itô–Stieltjes integral ∫_a^b Y dX exists and it is convergent in probability. The value of the integral is independent of the value of the jumps of Y, that is, for any regular Y

∫_a^b Y dX = ∫_a^b Y₋ dX = ∫_a^b Y₊ dX.

Proof. We have already proved the first part of the theorem. Let (I_n) be the sequence of approximating sums for ∫_a^b Y dX and let (S_n) be the sequence of approximating sums when the integrand is Y₋. We need to prove that

I_n − S_n = Σ_k (Y(t_{k−1}^{(n)}) − Y₋(t_{k−1}^{(n)})) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})) → 0 in probability.   (2.9)

Observe that the situation is very similar to that in the proof of Theorem 2.15. We can separate the big jumps and the small jumps and apply the same argument as above¹⁹.

Example 2.22 Wiener integrals.
The simplest case of stochastic integration is the so-called Wiener integral: the integrator is a Wiener process w, the integrand is a deterministic function f. If f is regular, then f, as a stochastic process, is adapted and regular, hence by the above theorem the expression ∫_a^b f(s) dw(s) is meaningful. The increments of a Wiener process are independent. As the sum of independent normally distributed variables is again normally distributed,

Σ_i f(t_{i−1}^{(n)}) (w(t_i^{(n)}) − w(t_{i−1}^{(n)})) ≅ N(0, Σ_i f²(t_{i−1}^{(n)}) (t_i^{(n)} − t_{i−1}^{(n)})).

19. See Example 2.8, page 114.
Stochastic convergence implies convergence in distribution, hence

∫_a^b f dw ≅ N(0, ∫_a^b f²(t) dt),
where N(µ, σ²) denotes the normal distribution with expected value µ and variance σ².
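The distribution of the Wiener integral can be spot-checked by simulation. The sketch below is not from the text; the choice f(s) = s, the horizon [0, 1], the path count and all variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not from the text): approximate the Wiener integral
# \int_0^1 f(s) dw(s) for f(s) = s by left-endpoint approximating sums over
# many simulated Brownian paths, and compare the sample mean and variance
# with the predicted N(0, \int_0^1 f^2(t) dt) = N(0, 1/3) distribution.
rng = np.random.default_rng(0)
n_paths, n_steps = 20000, 1000
t = np.linspace(0.0, 1.0, n_steps + 1)
dt = 1.0 / n_steps

f_left = t[:-1]                                        # f at the left endpoints
dw = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))  # Wiener increments
I = dw @ f_left                                        # sum_i f(t_{i-1}) dw_i

print(I.mean())   # close to 0
print(I.var())    # close to 1/3 = \int_0^1 t^2 dt
```

The left-endpoint evaluation of f mirrors the approximating sums used in Theorem 2.21.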
2.1.4 Properties of the Itô–Stieltjes integral
The next properties of the Itô–Stieltjes integral are obvious:

Proposition 2.23 If X₁, X₂ and X are semimartingales, Y₁, Y₂ and Y are adapted regular processes, and α and β are constants, then:

1. α ∫_a^b Y₁ dX + β ∫_a^b Y₂ dX = ∫_a^b (αY₁ + βY₂) dX a.s.;
2. ∫_a^b Y d(αX₁ + βX₂) = α ∫_a^b Y dX₁ + β ∫_a^b Y dX₂ a.s.;
3. if a < c < b, then ∫_a^b Y dX = ∫_a^c Y dX + ∫_c^b Y dX a.s.;
4. if Y₁χ_A is an equivalent modification of Y₂χ_A for some A ⊆ Ω then the integrals ∫_a^b Y₁ dX and ∫_a^b Y₂ dX are almost surely equal on A.

Since the approximating sums are convergent only in probability, it is important to note that the Itô–Stieltjes integral is defined only as an equivalence class. In the following we shall not distinguish between functions and equivalence classes, so when it is not important to emphasize this difference we shall use the simpler sign = instead of the sign "= a.s.".
2.1.5 The integral process
Let us briefly investigate the integral process

(Y • X)(t) ≜ ∫_a^t Y dX.
We have defined the stochastic integral only for fixed time intervals. On every time interval the definition determines the value of the stochastic integral only up to a measure-zero set, hence the properties of the integral process t ↦ (Y • X)(t) are unclear. It is not a stochastic process, just an indexed set of random variables! When does it have a version which is a martingale? Assume that X ∈ H² and that Y is adapted. Assume also that Y is uniformly bounded, that is, |Y| ≤ c for some constant c. As the filtration F is right-continuous, the right-regular process
Y₊ is also adapted. As we have seen²⁰, for every t ∈ [a,b]

E(I_n(t)²) ≤ E(Σ_k Y₊²(t_{k−1}^{(n)} ∧ t) (X(t_k^{(n)} ∧ t) − X(t_{k−1}^{(n)} ∧ t))²) ≤ c² (E(X²(b)) − E(X²(a))) ≜ K,

hence the sequence

I_n(t) ≜ Σ_k Y₊(t_{k−1}^{(n)} ∧ t) (X(t_k^{(n)} ∧ t) − X(t_{k−1}^{(n)} ∧ t))

is bounded in L²(Ω), so the sequence of the approximating sums is uniformly integrable; hence not only I_n(t) → (Y • X)(t) in probability but also I_n(t) → (Y • X)(t) in L¹. It is easy to see²¹ that if s < t then

E(I_n(t) | F_s) = I_n(s).

As I_n(t) → ∫_a^t Y dX in L¹, using the L¹(Ω)-continuity of the conditional expectation operator,

E(∫_a^t Y dX | F_s) = ∫_a^s Y dX.
Observe that I_n(t) is right-regular, so I_n is a martingale for every n. As I_m − I_n is a martingale, by Doob's inequality, for any λ > 0

λ P(sup_t |I_n(t) − I_m(t)| ≥ λ) ≤ ‖I_n(b) − I_m(b)‖₁.

(I_n(b)) is convergent in L¹(Ω), so

sup_t |I_n(t) − I_m(t)| → 0 in probability,
20. See Lemma 2.13, page 119.
21. See Lemma 2.12, page 118.
hence for a subsequence

sup_t |I_{n_k}(t) − I_{m_k}(t)| → 0 almost surely,   (2.10)

so except for a measure-zero set the continuity-type properties of the trajectories of (I_n) are preserved, and we get the following proposition:

Proposition 2.24 If Y is an adapted, regular, and uniformly bounded process and X ∈ H², then the integral process

(Y • X)(t) ≜ ∫_a^t Y dX,  t ≥ a,

has a version which is a martingale. If (I_n) is the sequence of approximating sums then for every t

sup_{a≤s≤t} |I_n(s) − (Y • X)(s)| → 0 in probability.   (2.11)

If X is continuous and bounded then Y • X has a continuous version.

Let us emphasize that in the argument above the set of exceptional points N in (2.10) is in F_b. Of course we should define the integral process on N as well, and of course we should guarantee that the integral process is adapted. We can do this only when we assume that N ∈ F_s for all s ≤ b. This assumption is part of the usual conditions. Observe that in the continuous case we do not explicitly use the right-continuity of the filtration. On the other hand, this is a very uninteresting remark since, in most cases²², if we add the measure-zero sets to the filtration then the augmented filtration is right-continuous.
2.1.6 Integration by parts and the existence of the quadratic variation
One of the most important concepts of stochastic analysis is the quadratic variation. The main reason to introduce the Itô–Stieltjes integral is that from the existence theorem of the Itô–Stieltjes integral one can easily deduce the existence of the quadratic variation of semimartingales.

Definition 2.25 Let U and V be stochastic processes on [a,b]. If for every infinitesimal sequence of partitions (t_k^{(n)}) of [a,b] the sequence

Q_n ≜ Σ_k (U(t_k^{(n)}) − U(t_{k−1}^{(n)})) (V(t_k^{(n)}) − V(t_{k−1}^{(n)}))
22. E.g. if the filtration is generated by a Lévy process. See Proposition 1.103, page 67.
is convergent in probability, then the limit lim_{n→∞} Q_n is called the quadratic co-variation of U and V. The quadratic co-variation of U and V on [a,b] is denoted by [U,V]_a^b. If V = U then [U,U]_a^b ≜ [U]_a^b is called the quadratic variation of U. Of course, in stochastic convergence,

[U]_a^b ≜ lim_{n→∞} Σ_k (U(t_k^{(n)}) − U(t_{k−1}^{(n)}))².
Example 2.26 If the trajectories of X are continuous and the trajectories of V have finite variation then [X,V]_a^b = 0 a.s. for any interval [a,b].

By the continuity assumption, the trajectories of X are uniformly continuous on the compact interval [a,b]. Hence if max_k (t_k^{(n)} − t_{k−1}^{(n)}) → 0 then for every ω

lim_{n→∞} max_k |X(t_k^{(n)}, ω) − X(t_{k−1}^{(n)}, ω)| = 0.

Therefore, as Var(V, a, b) < ∞,

|Q_n| ≤ Σ_k |X(t_k^{(n)}) − X(t_{k−1}^{(n)})| |V(t_k^{(n)}) − V(t_{k−1}^{(n)})| ≤ max_k |X(t_k^{(n)}) − X(t_{k−1}^{(n)})| · Var(V, a, b) → 0.

Example 2.27 If w is a Wiener process²³ then [w]_0^t = t a.s. If π is a Poisson process then [π]_0^t = π(t) a.s.
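The first claim of Example 2.27 can be illustrated numerically. The sketch below is not from the text; the horizon t = 2 and the partition size are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not from the text) of [w]_0^t = t: along a simulated
# Brownian path the approximating sum Q_n = sum_k (w(t_k) - w(t_{k-1}))^2
# over a fine partition of [0, t] concentrates around t.
rng = np.random.default_rng(1)
t_end, n = 2.0, 200000
dw = rng.normal(0.0, np.sqrt(t_end / n), size=n)  # increments over the partition
Q_n = np.sum(dw ** 2)                             # approximating sum for [w]_0^t
print(Q_n)   # close to t_end = 2
```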
If π is a Poisson process then for any ω the number of jumps on any finite interval [0,t] is finite, so for a fine enough partition one can assume that every subinterval contains at most one jump; hence Q_n(t, ω) is the number of jumps of the trajectory π(ω) during the time interval [0,t]. So evidently Q_n(t, ω) = π(t, ω).

Proposition 2.28 (Integration By Parts Formula) If M and N are semimartingales then:

1. For any finite interval [a,b] the quadratic co-variation [M,N]_a^b exists.
2. The following integration by parts formula holds:

(MN)(b) − (MN)(a) = ∫_a^b M₋ dN + ∫_a^b N₋ dM + [M,N]_a^b.   (2.12)

23. See Theorem B.17, page 571.
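Formula (2.12) is the limit of an exact discrete identity, which can be checked directly. The sketch below is not from the text and uses arbitrary simulated discrete paths; all names are illustrative.

```python
import numpy as np

# Sketch (not from the text): for ANY discrete paths sampled on a partition,
#   M(b)N(b) - M(a)N(a)
#     = sum_k M(t_{k-1}) dN_k + sum_k N(t_{k-1}) dM_k + sum_k dM_k dN_k,
# which is the exact algebraic identity whose limit is formula (2.12).
rng = np.random.default_rng(2)
M = np.cumsum(rng.normal(size=1000))       # arbitrary discrete path M
N = np.cumsum(rng.normal(size=1000))       # arbitrary discrete path N
dM, dN = np.diff(M), np.diff(N)
lhs = M[-1] * N[-1] - M[0] * N[0]
rhs = np.sum(M[:-1] * dN) + np.sum(N[:-1] * dM) + np.sum(dM * dN)
print(abs(lhs - rhs))   # zero up to rounding error
```

The identity holds path by path, before any limit is taken; the probabilistic content of the proposition is only in the convergence of the last sum.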
Proof. By definition semimartingales are right-regular processes, so the processes M₋ and N₋ are well-defined left-regular processes. For any partition (t_k^{(n)}) of [a,b] let us define the approximating sums

Σ_k M(t_{k−1}^{(n)}) ∆N(t_k^{(n)}) + Σ_k N(t_{k−1}^{(n)}) ∆M(t_k^{(n)}) + Σ_k ∆M(t_k^{(n)}) ∆N(t_k^{(n)}).

With elementary calculation, for all k,

M(t_k^{(n)}) N(t_k^{(n)}) − M(t_{k−1}^{(n)}) N(t_{k−1}^{(n)}) = M(t_{k−1}^{(n)}) (N(t_k^{(n)}) − N(t_{k−1}^{(n)})) + N(t_{k−1}^{(n)}) (M(t_k^{(n)}) − M(t_{k−1}^{(n)})) + (M(t_k^{(n)}) − M(t_{k−1}^{(n)})) (N(t_k^{(n)}) − N(t_{k−1}^{(n)})).

Adding up over k, on the left side one gets a telescoping sum which adds up to

M(b) N(b) − M(a) N(a),

which is the expression on the left-hand side of (2.12). The approximating sums on the right-hand side converge to the Itô–Stieltjes integrals

∫_a^b M dN = ∫_a^b M₋ dN  and  ∫_a^b N dM = ∫_a^b N₋ dM,

so [M,N]_a^b exists and the formula (2.12) holds.

Example 2.29 The jumps of independent Poisson processes.
Let N₁ and N₂ be two Poisson processes with respect to the same filtration²⁴ F. For s ≥ 0 let

U_i(s, t) ≜ exp(−sN_i(t)) / E(exp(−sN_i(t))),  i = 1, 2,

be the exponential martingales defined by the Laplace transforms of the Poisson processes. By the Integration By Parts Formula

U₁(s₁, t) U₂(s₂, t) − 1 = ∫_0^t U₁(s₁, r−) U₂(s₂, dr) + ∫_0^t U₂(s₂, r−) U₁(s₁, dr) + [U₁(s₁), U₂(s₂)](t).

It is easy to see that U₁ and U₂ are bounded martingales with respect to F for any s ≥ 0 on any finite interval [0,t]. As they are also F-adapted, the stochastic integrals are martingales²⁵. Therefore the expected values of the stochastic integrals are zero. So

E(U₁(s₁, t) U₂(s₂, t)) − 1 = E([U₁(s₁), U₂(s₂)](t)).

By the definition of U₁ and U₂,

E(exp(−Σ_{i=1}^{2} s_i N_i(t))) = Π_{i=1}^{2} E(exp(−s_i N_i(t)))

if and only if

E([U₁(s₁), U₂(s₂)](t)) = 0.   (2.13)

That is, N₁(t) and N₂(t) are independent if and only if (2.13) holds²⁶. As the Laplace transform is continuous in time,

∆U_i(s, r) = (exp(−sN_i(r)) − exp(−sN_i(r−))) / E(exp(−sN_i(r))) ≤ 0,

so it is easy to see that

[U₁(s₁), U₂(s₂)](t) = Σ_{r≤t} ∆U₁(s₁, r) ∆U₂(s₂, r) ≥ 0.

Therefore its expected value is zero if and only if it is almost surely zero. Hence N₁(t) and N₂(t) are independent if and only if with probability one N₁ and N₂ do not have common jumps on the interval [0,t].

24. That is, N₁ and N₂ are counting Lévy processes with respect to the same filtration.
25. See Proposition 2.24, page 128.
26. One can easily modify the proof of Lemma 1.96 on page 60.
The next property of the quadratic co-variation is obvious:

Proposition 2.30 If M, N and U are arbitrary semimartingales and ξ and η are F₀-measurable random variables then for any interval [a,b]

[ξM + ηN, U]_a^b = ξ [M,U]_a^b + η [N,U]_a^b  a.s.

Specifically,

[M+N] = [M] + 2[M,N] + [N]  a.s.

Example 2.31 If X = X(0) + L + V is a continuous semimartingale then [X]_a^b = [L]_a^b a.s. for any interval [a,b], where L is the continuous local martingale part of X.
As V and L are continuous and the trajectories of V have finite variation, [V]_a^b = 0 a.s. and [V,L]_a^b = 0 a.s. By the additivity:

[X]_a^b ≜ [X(0) + L + V]_a^b = [L + V]_a^b = [L]_a^b + 2[L,V]_a^b + [V]_a^b = [L]_a^b  a.s.

Example 2.32 Assume that F is a deterministic, right-regular function with finite variation. If w is a Wiener process then

∫_0^t w(s) dF(s) ≅ N(0, ∫_0^t (F(t) − F(s))² ds).
w is continuous and F has finite variation, therefore [w, F] = 0. By the integration by parts formula

w(t) F(t) = ∫_0^t w dF + ∫_0^t F₋ dw,

hence

∫_0^t w dF = w(t) F(t) − ∫_0^t F₋ dw = ∫_0^t F(t) dw − ∫_0^t F₋ dw = ∫_0^t (F(t) − F(s−)) dw(s).

The last integral is a Wiener integral, so

∫_0^t w dF ≅ N(0, ∫_0^t (F(t) − F(s−))² ds) = N(0, ∫_0^t (F(t) − F(s))² ds).
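Example 2.32 can also be spot-checked by simulation. The sketch below is not from the text; taking F(s) = s on [0, 1] (an illustrative assumption), the predicted variance is ∫_0^1 (1 − s)² ds = 1/3.

```python
import numpy as np

# Illustrative sketch (not from the text) of Example 2.32 with F(s) = s on
# [0, 1]: then \int_0^1 w dF = \int_0^1 w(s) ds and the predicted variance
# is \int_0^1 (1 - s)^2 ds = 1/3.
rng = np.random.default_rng(3)
n_paths, n = 20000, 500
dt = 1.0 / n
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n)), axis=1)
integral = w.sum(axis=1) * dt     # Riemann sums for \int_0^1 w(s) ds
print(integral.mean())            # close to 0
print(integral.var())             # close to 1/3
```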
As we have remarked, if X has finite variation and Y is continuous then²⁷ [X,Y] = 0. Hence in this case the integration by parts formula is

XY − X(0) Y(0) = Y • X + X₋ • Y.

For this formula we do not in fact need the continuity of Y. Observe that as X has finite variation, every trajectory of X defines a measure on ℝ₊. Let Y be an arbitrary semimartingale, and let ∆Y denote the jumps of Y. We show that in this case

[Y,X] = ∆Y • X,

where the integral is the Lebesgue–Stieltjes integral defined by the trajectories of X. If U ≜ ∆Y χ(|∆Y| ≥ ε) are the jumps of Y which are bigger than ε then, as the number of such jumps on every finite interval is finite,

[Y,X] = [Y − U, X] + [U, X] = [Y − U, X] + Σ ∆Y χ(|∆Y| ≥ ε) ∆X = [Y − U, X] + ∆Y χ(|∆Y| ≥ ε) • X.

The jumps of the regular process Z ≜ Y − U are smaller than ε, hence if the partition of the interval [a,b] is fine enough then²⁸

|Z(t_k^{(n)}, ω) − Z(t_{k−1}^{(n)}, ω)| ≤ 2ε

for any ω. Therefore for n large enough

|Σ_k (Z(t_k^{(n)}) − Z(t_{k−1}^{(n)})) (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))| ≤ 2ε Var(X, a, b) → 0 as ε → 0.

As X has finite variation and the integral is a Lebesgue–Stieltjes integral, one can use the Dominated Convergence Theorem. From this theorem, for every trajectory,

∆Y χ(|∆Y| ≥ ε) • X → ∆Y • X = Σ ∆Y ∆X,

assuming of course that for every trajectory, on every finite interval, |∆Y| is integrable. But this has to be true, as the trajectories of Y are regular, so on every finite interval every trajectory of Y is bounded²⁹.

Proposition 2.33 If X is right-continuous and has finite variation and Y is an arbitrary semimartingale then

[X,Y] = Σ ∆Y ∆X = ∆Y • X,   (2.14)

therefore³⁰

XY − X(0) Y(0) = Y₋ • X + X₋ • Y + [X,Y] = Y₋ • X + X₋ • Y + ∆Y • X = Y • X + X₋ • Y,

where the integral with respect to X is a Lebesgue–Stieltjes integral and the integral with respect to Y is an Itô–Stieltjes integral.

27. See Example 2.26, page 129.
28. See Proposition 1.7, page 6.
2.1.7 The Kunita–Watanabe inequality
In the construction of the stochastic integral below we shall use the following simple inequality:

Proposition 2.34 (Kunita–Watanabe inequality) If X, Y are product measurable processes, M, N are semimartingales, a ≤ b ≤ ∞ and V ≜ Var([M,N]) then

∫_a^b |XY| dV ≤ √(∫_a^b X² d[M]) · √(∫_a^b Y² d[N])  a.s.   (2.15)
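In discrete form the inequality reduces to the Cauchy–Schwarz inequality, which can be checked directly. The sketch below is not from the text; the simulated increments and all names are arbitrary illustrative data.

```python
import numpy as np

# Discrete sketch (not from the text) of the Kunita-Watanabe inequality
# (2.15): for discrete paths, d[M] = dM^2, d[N] = dN^2, d[M,N] = dM*dN and
# dV = |dM*dN|, so the inequality reduces to the Cauchy-Schwarz inequality.
rng = np.random.default_rng(4)
dM, dN = rng.normal(size=1000), rng.normal(size=1000)  # martingale increments
X, Y = rng.normal(size=1000), rng.normal(size=1000)    # integrand samples
lhs = np.sum(np.abs(X * Y) * np.abs(dM * dN))
rhs = np.sqrt(np.sum(X**2 * dM**2)) * np.sqrt(np.sum(Y**2 * dN**2))
print(lhs <= rhs)   # True, by Cauchy-Schwarz
```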
Remark first that the meaning of the proposition is not entirely clear, as it is not clear what the meaning of [M], [N] and [M,N] is. So far we have defined the quadratic variation only for fixed time intervals, and the quadratic variation for every time interval is defined as a limit in stochastic convergence; hence the quadratic variation on any interval is defined just up to a measure-zero set. If X is a semimartingale then for every t one can define [X](t) ≜ [X]_0^t, but this [X] is not a stochastic process, since for a fixed ω and t the value of [X](t, ω) is undefined. Of course, if t is restricted to the set of the rational numbers then we can collect the corresponding measure-zero sets in just one measure-zero set, but it is unclear how one can extend this process to the irrational values of t, as at the moment we have not proved any continuity property of the quadratic variation.

Observe that we do not know anything about integral processes. In particular we do not know when they will be martingales. If the integral process is a semimartingale then, by definition, it has a right-continuous version, so by (2.12) the quadratic variation also has a right-continuous version. One of the goals of the later developments will be to provide a right-continuous version for the quadratic variation process or, which is the same, to prove some martingale-type properties for the stochastic integral. So, to prove the inequality, up to the end of the section we assume that there are processes [M], [N] and [M,N] which are right-continuous and which for any t provide a version of the related quadratic variation. In this case [M](ω), [N](ω) and Var([M,N], ω) are increasing, right-continuous functions for every ω, hence they define measures, and for every ω the integrals in (2.15) are defined as Lebesgue–Stieltjes integrals.

Proof. It is sufficient to prove the proposition for finite a and b; one can prove the case b = ∞ by the Monotone Convergence Theorem. Also by the Monotone Convergence Theorem one can assume that X and Y are bounded. We should prove the inequality when on the left-hand side we have ∫_a^b XY d[M,N], since to prove (2.15) one can replace Y by

Ỹ ≜ Y · sgn(XY) · dV/d[M,N].

1. First assume that X = 1 and Y = 1. In this case the inequality is

|[M,N]_a^b| ≤ √([M]_a^b) · √([N]_a^b)  a.s.   (2.16)

29. See Proposition 1.6, page 5.
30. Observe that the Lebesgue–Stieltjes integral Y • X exists: the trajectories of Y are regular, hence they are bounded on every finite interval.
Fix u and v. The proof of (2.16) is nearly the same as the proof of the classical Cauchy–Schwarz inequality. It is easy to see that for all rational numbers r

0 ≤ [M + rN]_u^v = [M,M]_u^v + 2r · [M,N]_u^v + r² · [N,N]_u^v ≜ Ar² + Br + C  a.s.

Hence there is a measure-zero set Z such that on the complement of Z the inequality above is true for all rational, and therefore all real, r. Hence, as in the proof of the Cauchy–Schwarz inequality, B² − 4AC ≤ 0 a.s., so (2.16) holds with a = u and b = v. Unifying the measure-zero sets one can easily prove (2.16) for
every rational u and v. By the assumption above the quadratic variation is right-continuous, so the relation (2.16) holds for every real a = u and b = v.

2. Let (t_k) be a partition of [a,b] and assume that X and Y are constant on every subinterval (t_{k−1}, t_k]. We are integrating by trajectory, so

|∫_a^b XY d[M,N]| ≤ Σ_k |X(t_k) Y(t_k)| |[M,N]_{t_k}^{t_{k+1}}| ≤ Σ_k |X(t_k) Y(t_k)| √([M]_{t_k}^{t_{k+1}}) √([N]_{t_k}^{t_{k+1}}).

Using the Cauchy–Schwarz inequality we can continue:

|∫_a^b XY d[M,N]| ≤ √(Σ_k X²(t_k) [M]_{t_k}^{t_{k+1}}) · √(Σ_k Y²(t_k) [N]_{t_k}^{t_{k+1}}) = √(∫_a^b X² d[M]) · √(∫_a^b Y² d[N]).
3. Using standard measure theory one can easily prove³¹ that if µ is a finite, regular measure on the real line and g is a bounded Borel measurable function, then there is a sequence of step functions

s_n ≜ Σ_i c_i χ((t_i^{(n)}, t_{i+1}^{(n)}])

such that s_n → g almost surely in µ. As µ is finite and g is bounded, s_n → g in L²(µ).

31. Use Lusin's theorem [80], page 56, and the uniform continuity of continuous functions on compact sets.

4. We prove that the Kunita–Watanabe inequality holds for every outcome where (2.16) holds for every real a and b. Fix the process Y and an outcome ω, and consider the set of processes X for which the inequality (2.15) holds for this ω. Let s_n → X(ω) be a sequence of step functions. By (2.16) the measure generated by [M,N](ω) is absolutely continuous with respect to the measure generated by [M](ω). Hence s_n → X(ω) almost surely in [M,N](ω). Therefore by the Dominated Convergence Theorem, using that X and Y are bounded, that a and b are finite, and that the convergence holds almost everywhere in [M,N](ω) and in L²([M](ω)),

|∫_a^b XY d[M,N]| ≤ √(∫_a^b X² d[M]) · √(∫_a^b Y² d[N])
for the outcome ω. If X is product measurable then by Fubini's theorem every trajectory of X is Borel measurable. Hence if X is product measurable then inequality (2.15) holds for almost all outcomes ω.

5. Now we fix X and repeat the argument for Y.

Corollary 2.35 If p, q ≥ 1 and 1/p + 1/q = 1 then

E(∫_0^∞ |XY| d[M,N]) ≤ ‖√(∫_0^∞ X² d[M])‖_p · ‖√(∫_0^∞ Y² d[N])‖_q.

Proof. By Hölder's inequality and by (2.15)

E(∫_0^∞ |XY| d[M,N]) ≤ E(√(∫_0^∞ X² d[M]) · √(∫_0^∞ Y² d[N])) ≤ ‖√(∫_0^∞ X² d[M])‖_p · ‖√(∫_0^∞ Y² d[N])‖_q.
Corollary 2.36 If M and N are semimartingales then

|[M,N]| ≤ √([M][N])   (2.17)

and

[M+N]^{1/2} ≤ [M]^{1/2} + [N]^{1/2}

and

[M+N] ≤ 2([M] + [N]).

Proof. The first inequality is just the Kunita–Watanabe inequality when X = Y = 1. Further,

[M+N] = [M] + 2[M,N] + [N] ≤ [M] + 2√([M][N]) + [N] = ([M]^{1/2} + [N]^{1/2})²,
from which the second inequality is obvious. In a similar way

[M+N] ≤ [M] + 2√([M][N]) + [N] ≤ [M] + ([M] + [N]) + [N] = 2([M] + [N]).
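The inequalities of Corollary 2.36 have exact discrete analogues, where the second one is the triangle inequality for Euclidean norms of the increment vectors. The sketch below is not from the text and uses arbitrary simulated increments.

```python
import numpy as np

# Discrete sketch (not from the text) of Corollary 2.36: with d[M] = dM^2
# and d[N] = dN^2 the three bounds become statements about the Euclidean
# norms of the increment vectors, e.g. ||dM + dN|| <= ||dM|| + ||dN||.
rng = np.random.default_rng(5)
dM, dN = rng.normal(size=1000), rng.normal(size=1000)

def qv(d):
    return np.sum(d ** 2)              # discrete quadratic variation

cov = np.sum(dM * dN)                  # discrete [M, N]
print(abs(cov) <= np.sqrt(qv(dM) * qv(dN)))                       # (2.17)
print(np.sqrt(qv(dM + dN)) <= np.sqrt(qv(dM)) + np.sqrt(qv(dN)))
print(qv(dM + dN) <= 2 * (qv(dM) + qv(dN)))
```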
2.2 The Quadratic Variation of Continuous Local Martingales
The following proposition is the starting point in our construction of the stochastic integral process.

Proposition 2.37 (Simple Doob–Meyer decomposition) If M is a uniformly bounded, continuous martingale, then:

1. The quadratic variation [M](t) ≜ [M]_0^t exists.
2. [M] has a version which is increasing and continuous.
3. For this version M² − [M] is a martingale.
4. [M] is indistinguishable from any increasing, continuous process P for which P(0) = 0 and M² − P is a martingale.

If (t_k^{(n)}) is an infinitesimal sequence of partitions of [0,t] then

sup_{s≤t} |Q_n(s) − [M](s)| → 0 in probability   (2.18)

for any t, where

Q_n(s) ≜ Σ_k (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s))².

Proof. By the Integration By Parts Formula, for any t,

M²(t) − M²(0) = 2 ∫_0^t M dM + [M](t) = 2 · (M • M)(t) + [M](t).

As M is continuous and uniformly bounded, the integral process M • M has a version which is a continuous martingale³², therefore, as M² is continuous,

[M] ≜ M² − M²(0) − 2 · (M • M)

is continuous, and by Proposition 2.24

M² − [M] = M²(0) + 2 · (M • M)

32. See Proposition 2.24, page 128.
is a martingale. [M](t) is a version of the quadratic variation [M]_0^t for any t. For any rational numbers p ≤ q we have [M]_0^p ≤ [M]_0^q a.s. Taking the union of the measure-zero sets and using the continuity of [M], we can construct a version which is increasing. If P is another continuous, increasing process for which P(0) = 0 and M² − P is a martingale, then N ≜ P − [M] is also a continuous martingale and N(0) = 0. As N is the difference of two increasing processes, the trajectories of N have finite variation. By Fisk's theorem³³ N = 0, so P is indistinguishable from [M]. The convergence (2.18) is a simple consequence of (2.11).

First we extend the proposition to continuous local martingales. In order to do that we need the following rule:

Proposition 2.38 Under the assumptions of the previous proposition, if τ is an arbitrary stopping time then [M^τ] = [M]^τ.
Proof. As (M^τ)² = (M²)^τ,

(M^τ)² − [M]^τ = (M²)^τ − [M]^τ = (M² − [M])^τ.

Stopped martingales are martingales, hence (M² − [M])^τ is a martingale. [M]^τ is increasing, so by the uniqueness of the quadratic variation [M^τ] = [M]^τ.

Proposition 2.39 If M is a continuous local martingale then there is one and only one continuous, increasing process [M] such that:

1. [M](0) = 0 and
2. M² − [M] is a continuous local martingale.

For any t, if (t_k^{(n)}) is an infinitesimal sequence of partitions of [0,t] then

sup_{s≤t} |Q_n(s) − [M](s)| → 0 in probability,   (2.19)

where

Q_n(s) ≜ Σ_k (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s))².

Proof. Let M be a continuous local martingale and let (σ_n) be a localizing sequence of M. As M is continuous, the hitting times

υ_n ≜ inf{t : |M(t)| ≥ n}

33. See Theorem 2.11, page 117.
are stopping times. Stopped martingales are martingales, so if instead of σ_n we take the localizing sequence τ_n ≜ σ_n ∧ υ_n then the processes M_n ≜ M^{τ_n} are bounded martingales.

1. As M_n is a bounded, continuous martingale, [M_n] is an increasing process and M_n² − [M_n] is a continuous martingale. By the previous proposition

[M_{n+1}]^{τ_n} = [M_{n+1}^{τ_n}] = [M_n],

hence [M_n] = [M_{n+1}] on the interval [0, τ_n]. As τ_n ↗ ∞, one can define the process [M] as the 'union' of the processes [M_n], that is,

[M](t, ω) ≜ [M_n](t, ω),  t ≤ τ_n(ω).

Evidently [M] is continuous, increasing and [M](0) = 0. Of course

(M² − [M])^{τ_n} = (M^{τ_n})² − [M]^{τ_n} = M_n² − [M_n],

which is a martingale, hence M² − [M] is a local martingale.

2. Assume that A(0) = 0 and M² − A is a continuous local martingale for some continuous, increasing process A.
Then

Z ≜ (M² − [M]) − (M² − A) = A − [M]

is a continuous local martingale, and Z, as the difference of two increasing processes, has finite variation. So by Fisk's theorem Z is constant. As Z(0) = A(0) − [M](0) = 0, obviously Z ≡ 0.

3. Finally, let us prove (2.19). Fix ε, δ, t > 0 and (t_k^{(n)})_k. Let Q_n be the approximating sum for [M] and let Q_n^{(m)} be the approximating sum for [M_m]. Let

A ≜ {sup_{s≤t} |Q_n(s) − [M](s)| > ε},  A^{(m)} ≜ {sup_{s≤t} |Q_n^{(m)}(s) − [M_m](s)| > ε}.

As τ_m ↗ ∞, for m large enough P(τ_m ≤ t) ≤ δ/2 and P(A^{(m)}) ≤ δ/2. Obviously

P(A) = P(A ∩ {τ_m ≤ t}) + P(A ∩ {τ_m > t}) ≤ P(τ_m ≤ t) + P(A ∩ {τ_m > t}) ≤ δ/2 + P(A ∩ {τ_m > t}) = δ/2 + P(A^{(m)} ∩ {τ_m > t}) ≤ δ/2 + P(A^{(m)}) ≤ δ/2 + δ/2 = δ,

hence (2.19) holds.

Proposition 2.40 If M and N are continuous local martingales then [M,N] is the only continuous process with finite variation on finite intervals for which:

1. [M,N](0) = 0 and
2. MN − [M,N] is a continuous local martingale.
For any infinitesimal sequence of partitions (t_k^{(n)}) of [0,t],

sup_{s≤t} |Q_n(s) − [M,N](s)| → 0 in probability,

where

Q_n(s) ≜ Σ_k (M(t_k ∧ s) − M(t_{k−1} ∧ s)) (N(t_k ∧ s) − N(t_{k−1} ∧ s)).   (2.20)

Proof. From Fisk's theorem the uniqueness of [M,N] is again trivial: if MN − A and MN − B are continuous local martingales for some A and B, then A − B is a continuous local martingale with finite variation, so A − B is a constant. As A(0) = B(0) = 0, obviously A = B. Since

MN = (1/4)((M+N)² − (M−N)²),

it is easy to see that Proposition 2.39 can be applied to

[M,N] ≜ (1/4)([M+N] − [M−N])   (2.21)

in order to show that MN − [M,N] is a continuous local martingale and that (2.20) holds.

Definition 2.41 If for some process X there is a process P such that X − P is a local martingale, then we say that P is a compensator of X. If P is continuous then we say that P is a continuous compensator of X. If P is predictable then we say that P is a predictable compensator of X, etc.

So far we have proved that if M is a continuous local martingale then [M] is the only increasing, continuous compensator of M². It is important to emphasize that this property of [M] holds only for continuous local martingales.
Example 2.42 Quadratic variation of the compensated Poisson process.

Let π be a Poisson process with parameter λ. The increments of π are independent and the expected value of π(t) is λt, hence the compensated process ν(t) ≜ π(t) − λt is a martingale. We show that ν²(t) − λt is also a martingale, that is: λt is a continuous, increasing compensator for ν². Indeed,

E(ν²(t) − λt | F_s) = ν²(s) + 2ν(s) E(ν(t) − ν(s) | F_s) + E((ν(t) − ν(s))² | F_s) − λt.

The increments of π are independent, hence the conditional expectations are ordinary expectations. Given that the increments are stationary,

2ν(s) E(ν(t) − ν(s) | F_s) = 2ν(s) E(ν(t−s)) = 0,
E((ν(t) − ν(s))² | F_s) = E(ν²(t−s)) = λ(t−s),

hence

E(ν²(t) − λt | F_s) = ν²(s) + λ(t−s) − λt = ν²(s) − λs.

If we partition the interval [0,t] and Q_n^{(ν)} is the sequence of approximating sums for [ν] and Q_n^{(π)} is the one for [π], then

Q_n^{(ν)} = Q_n^{(π)} − 2λ Σ_k (π(t_k^{(n)}) − π(t_{k−1}^{(n)})) (t_k^{(n)} − t_{k−1}^{(n)}) + λ² Σ_k (t_k^{(n)} − t_{k−1}^{(n)})².

It is easy to see that if max_k (t_k^{(n)} − t_{k−1}^{(n)}) → 0 then the limit of Q_n^{(π)} is the process π. The limits of the other expressions are zero. Hence [ν] = π.

Proposition 2.43 If M, N and U are continuous local martingales and ξ and η are F₀-measurable random variables then

[ξM + ηN, U] = ξ [M, U] + η [N, U].

Proof. MU − [M,U] and NU − [N,U] are local martingales, hence (M+N)U − ([M,U] + [N,U]) is also a local martingale, and by the uniqueness property of
the quadratic co-variation,

[M + N, U] = [M, U] + [N, U].

In a similar way: MU − [M,U] is a local martingale and ξ is F₀-measurable, hence ξ(MU − [M,U]) is also a local martingale; hence, again by the uniqueness property of the quadratic co-variation, [ξM, U] = ξ [M, U].

Proposition 2.44 If M and N are continuous local martingales then

[M,N] = [M − M(0), N − N(0)] = [M − M(0), N].

Proof. Obviously [M − M(0), N] = [M,N] − [M(0), N]. As M(0) is F₀-measurable, M(0)N is a continuous local martingale. Hence [M(0), N] = 0.

Proposition 2.45 (Stopping rule for quadratic variation) Let τ be an arbitrary stopping time.

1. If M is a continuous local martingale then [M^τ] = [M]^τ.
2. If M and N are continuous local martingales then [M^τ, N^τ] = [M,N]^τ = [M^τ, N].

Proof. [M^τ] is the only continuous, increasing process A for which A(0) = 0 and (M^τ)² − A is a continuous local martingale. M² − [M] is a continuous local martingale, hence

(M² − [M])^τ = (M²)^τ − [M]^τ = (M^τ)² − [M]^τ

is a continuous local martingale, hence by the uniqueness [M]^τ = [M^τ]. From (2.21) and from the first part of the proof

[M^τ, N^τ] = (1/4)([(M+N)^τ] − [(M−N)^τ]) = (1/4)([M+N]^τ − [M−N]^τ) = [M,N]^τ.

If U and V are martingales and τ is a stopping time, then for any bounded stopping time σ, by the Optional Sampling Theorem,

E((U^τ · (V − V^τ))(σ)) = E(U(τ∧σ) · E(V(σ) − V(τ∧σ) | F_{τ∧σ})) = E(U(τ∧σ) · 0) = 0,
STOCHASTIC INTEGRATION
hence U τ (V − V τ ) is a martingale. From this it is easy to prove with localization that M τ (N − N τ ) is a local martingale, hence τ
τ
M τ N − [M, N ] = M τ N − M τ N τ + M τ N τ − [M, N ] = τ
τ
= M τ (N − N τ ) + ((M N ) − [M, N ] ) is also a local martingale. From the uniqueness of the quadratic co-variation τ
[M τ , N ] = [M, N ] = [M τ , N τ ] .
Example 2.46 If M and N are independent and they are continuous local martingales with respect to their own filtration then [M, N ] = 0.
Let F M and F N be the filtrations generated by M and N . Let Fs be the σ-algebra generated by the sets A ∩ B,
A ∈ FsM , B ∈ FsN .
We shall prove that if M and N are independent martingales then M N is a martingale under the filtration F. As M and N are martingales, M (t) and N (t) are integrable. M (t) and N (t) are independent for any t. Hence the product M (t) N (t) is also integrable. If F A ∩ B, A ∈ FsM and B ∈ FsN then E (M N (t) χF ) = E (M (t) χA N (t) χB ) = E (M (t) χA ) E (N (t) χB ) = = E (M (s) χA ) E (N (s) χB ) = E (M N (s) χF ) , which by the uniqueness of the extension of finite measures can be extended for every F ∈ Fs . Hence M N is an F-martingale so [M, N ] = 0. The quadratic co-variation is independent of the filtration34 so [M, N ] = 0 under the original filtration. If M and N are local martingales with respect to their own filtration, then the localized processes are independent martingales. Hence if τ (τ n ) is a common localizing sequence then [M, N ] n = [M τ n , N τ n ] = 0. Hence [M, N ] = 0. Proposition 2.47 Let M be a continuous local martingale. M is indistinguishable from a constant if and only if the quadratic variation [M ] is zero. 34 Here we directly used the definition of the quadratic variation as the limit of the approximating sums.
THE QUADRATIC VARIATION OF CONTINUOUS LOCAL MARTINGALES
145
Proof. If M is a constant then M 2 is also a constant, hence M 2 is a local martingale35 so [M ] = 0. On the other hand if [M ] = 0 then M 2 − [M ] = M 2 is a local martingale. The proposition follows from the next proposition. Proposition 2.48 M and M 2 are continuous local martingales, if and only if M is a constant. Proof. If M is constant then M and M 2 are local martingales. On the other hand 2
(M − M (0)) = M 2 − 2 · M · M (0) + M 2 (0) . Since M and M 2 are local martingales and M (0) is F0 -measurable, 2 (M − M (0)) is also a local martingale. Let (τ n ) be a localizing sequence for 2 (M − M (0)) . By the martingale property
2 2 E (M τ n (t) − M τ n (0)) = E (M τ n (0) − M τ n (0)) = 0, hence for any t a.s.
M (t ∧ τ n ) = M (0) . Therefore for any t a.s.
M (t) = lim M (t ∧ τ n ) = M (0) . n→∞
The local martingales are right-regular therefore M is indistinguishable from M (0). Corollary 2.49 Let a ≤ b < ∞. A continuous local martingale M is constant on [a, b] if and only if [M ] is constant on [a, b]. Proof. If τ n ∞ then a process X is constant on an interval [a, b] if and only τ if X τ n is constant on [a, b] for all n. Using this fact and that [M τ n ] = [M ] n one can assume that M is a martingale. 1. Define the stochastic process N (t) M (t + a) − M (a) . N is trivially a martingale for the filtration Gt Ft+a , t ≥ 0. N 2 (t) − ([M ] (t + a) − [M ] (a)) = M 2 (t + a) − ([M ] (t + a) − [M ] (a)) − − 2M (t + a) M (a) + M 2 (a) . 35 See:
Definition 1.131, page 94.
146
STOCHASTIC INTEGRATION
Obviously M 2 (t + a) − ([M ] (t + a) − [M ] (a)) is a G-martingale. M (t + a) is also a G-martingale hence M (t + a) M (a) + M 2 (a) is obviously a G-local martingale, hence by the uniqueness of the quadratic variation [N ] (t) = [M ] (t + a) − [M ] (a) . 2. M is constant on the interval [a, b] if and only if N is zero on the interval [0, b − a]. As we proved N is constant on [0, b − a] if and only if [N ] = 0 on [0, b − a]. Hence M is constant on [a, b] if and only if [M ] is constant on [a, b]. We summarize the statements above in the following proposition: Proposition 2.50 [M, N ] is a symmetric bilinear form and [M ] ≥ 0. [M ] = 0 if and only if M is constant. This is also true on any half-line [a, ∞) if instead of [M, N ] we use the increments [M, N ] − [M, N ] (a).
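The fact used above, that independent martingales have vanishing covariation, can be checked numerically. The following sketch is our own illustration (not part of the text): it approximates $[M,N](1)$ and $[M](1)$ by their defining approximating sums for two independent discretized Wiener processes; the cross sums concentrate near $0$, while $[M](1) \approx 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_paths, t = 1000, 2000, 1.0
dt = t / n_steps
dM = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
dN = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))  # independent of dM

cov_MN = (dM * dN).sum(axis=1)     # approximating sums for [M, N](1)
quad_M = (dM * dM).sum(axis=1)     # approximating sums for [M](1)
mean_cov = float(cov_MN.mean())    # near 0 for independent martingales
mean_quad = float(quad_M.mean())   # near t = 1
```

The grid and path counts are arbitrary choices; refining the partition only tightens the concentration around the theoretical values.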
2.3 Integration when Integrators are Continuous Semimartingales
In this section we introduce a simple construction of the stochastic integral when the integrator $X$ is a continuous semimartingale and the integrand $Y$ is progressively measurable³⁶. Every continuous semimartingale has a unique decomposition of the type $X = X(0) + L + V$, where $V$ is continuous and has finite variation and $L$ is a continuous local martingale. Integration with respect to $V$ is a simple measure-theoretic exercise: $V(\omega)$ generates a $\sigma$-finite measure on $\mathbb{R}_+$ for every $\omega$. Every progressively measurable process is product measurable, hence all trajectories $Y(\omega)$ are measurable. For every $\omega$ and for every $t$ one can define the pathwise integral

$$(Y \bullet V)(t, \omega) := \int_0^t Y(s, \omega)\, V(ds, \omega),$$

where the integrals are simple Lebesgue integrals³⁷. The main problem is how to define the stochastic integral with respect to the local martingale part $L$.

³⁶ See: [78]. ³⁷ See: Proposition 1.20, page 11.
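For a fixed trajectory, $(Y \bullet V)(t,\omega)$ is an ordinary Lebesgue–Stieltjes integral. The following sketch is our own illustration and an assumption of convenience: it approximates such a pathwise integral by left-point Riemann–Stieltjes sums on a grid, using smooth deterministic functions so the exact value is known.

```python
import numpy as np

def pathwise_integral(Y, V, grid):
    """Approximate int_0^t Y(s) V(ds) along one trajectory by left-point
    Riemann-Stieltjes sums on `grid` (a numerical stand-in for the
    measure-theoretic Lebesgue-Stieltjes integral used in the text)."""
    Yv = Y(grid[:-1])
    dV = np.diff(V(grid))
    return float(np.sum(Yv * dV))

# Test case with a known value: V(s) = s^2, so V(ds) = 2s ds and
# int_0^1 s V(ds) = int_0^1 2 s^2 ds = 2/3.
grid = np.linspace(0.0, 1.0, 100_001)
value = pathwise_integral(lambda s: s, lambda s: s ** 2, grid)
```

For trajectories of finite variation the same sums converge as the grid is refined; no probabilistic structure is needed for this part of the construction.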
2.3.1 The space of square-integrable continuous local martingales

Recall the definition and some elementary properties of square-integrable martingales:

Definition 2.51 As before, $\mathcal{H}^2$ is the space of $L^2(\Omega)$-bounded martingales³⁸ on $\mathbb{R}_+$. Let $\mathcal{G}^2 := \mathcal{H}^2_c$ denote the space of $L^2(\Omega)$-bounded, continuous martingales, and let

$$\mathcal{H}^2_0 := \left\{ M \in \mathcal{H}^2 : M(0) = 0 \right\}, \qquad \mathcal{G}^2_0 := \left\{ M \in \mathcal{G}^2 : M(0) = 0 \right\}.$$

The elements of $\mathcal{H}^2$, $\mathcal{G}^2$, $\mathcal{H}^2_0$ and $\mathcal{G}^2_0$ are equivalence classes: $M_1$ and $M_2$ are in the same equivalence class if they are indistinguishable.

Proposition 2.52 $M \in \mathcal{H}^2$ if and only if $\sup_t M^2(t) \in L^1(\Omega)$. $\left(\mathcal{H}^2, \|\cdot\|_{\mathcal{H}^2}\right)$ is a Hilbert space, where

$$\|M\|_{\mathcal{H}^2} := \|M(\infty)\|_2 = \lim_{t \to \infty} \|M(t)\|_2.$$

The set of continuous square-integrable martingales $\mathcal{G}^2$ is a closed subspace of $\mathcal{H}^2$.

Proof. The first statement follows from Doob's inequality³⁹. The relation $\|M(\infty)\|_2 = \lim_{t\to\infty}\|M(t)\|_2$ is obviously true as $M(t)$ converges⁴⁰ to $M(\infty)$ in $L^2(\Omega)$, and the norm is a continuous function. In order to show that $\mathcal{G}^2$ is closed, let $(M_n)$ be a sequence of continuous square-integrable martingales and assume that $M_n \xrightarrow{\mathcal{H}^2} M$. By Doob's inequality⁴¹

$$E\left(\sup_t |M_n(t) - M(t)|^2\right) \leq 4\,\|M_n(\infty) - M(\infty)\|_2^2 = 4\,\|M_n - M\|_{\mathcal{H}^2}^2 \to 0.$$

³⁸ That is, if $M$ is a martingale then $M \in \mathcal{H}^2$, that is $M$ is square-integrable, if and only if $\sup_t \|M(t)\|_2 < \infty$. ³⁹ See: Corollary 1.54, page 34. ⁴⁰ See: Corollary 1.59, page 35. ⁴¹ See: line (1.18), page 34.
From the $L^2$-convergence one has a subsequence for which almost surely

$$\sup_t |M_{n_k}(t) - M(t)| \to 0,$$

hence $M_{n_k}(t,\omega) \to M(t,\omega)$ uniformly in $t$ for almost all $\omega$. Hence $M(t,\omega)$ is continuous in $t$ for almost all $\omega$. So the trajectories of $M$ are almost surely continuous, therefore $\mathcal{G}^2$ is closed.

Our direct goal is to prove that if $M$ is a square-integrable martingale and $M(0) = 0$ then

$$\|M\|^2_{\mathcal{H}^2} := \|M(\infty)\|_2^2 = E\left(M^2(\infty)\right) = E\left([M](\infty)\right).$$

To do this one should prove that $M^2 - [M]$ is not only a local martingale but a uniformly integrable martingale.

Proposition 2.53 (Characterization of square-integrable martingales) Let $M$ be a continuous local martingale. The following statements are equivalent:

1. $M$ is square-integrable,
2. $M(0) \in L^2(\Omega)$ and $E([M](\infty)) < \infty$.

In both cases $M^2 - [M]$ is a uniformly integrable martingale.

Proof. The proof of the equivalence of the statements is the following:

1. Let $(\tau_n)$ be a localizing sequence of the local martingale $M^2 - [M]$ and let $\sigma_n := \tau_n \wedge n$. By the martingale property of $(M^{\tau_n})^2 - [M^{\tau_n}]$

$$E\left(M^2(\sigma_n) - [M](\sigma_n)\right) = E\left(M^2(0)\right). \tag{2.22}$$

As $M$ is square-integrable,

$$M^2(\sigma_n) \leq \sup_t M^2(t) \in L^1(\Omega),$$

so by the Dominated Convergence Theorem

$$\lim_{n\to\infty} E\left(M^2(\sigma_n)\right) = E\left(\lim_{n\to\infty} M^2(\sigma_n)\right) = E\left(M^2(\infty)\right) < \infty.$$

$[M]$ is increasing, therefore by the Monotone Convergence Theorem and by (2.22)

$$E\left([M](\infty)\right) = \lim_{n\to\infty} E\left([M](\sigma_n)\right) = \lim_{n\to\infty}\left(E\left(M^2(\sigma_n)\right) - E\left(M^2(0)\right)\right) < \infty,$$

that is, $[M](\infty) \in L^1(\Omega)$ and 1. implies 2. For every stopping time $\tau$

$$\left|\left(M^2 - [M]\right)(\tau)\right| \leq \sup_t M^2(t) + \sup_t [M](t) = \sup_t M^2(t) + [M](\infty) \in L^1(\Omega),$$

hence the set $\left\{ M^2(\tau) - [M](\tau) \right\}_\tau$ is dominated by an integrable variable and therefore it is uniformly integrable. By this $M^2 - [M]$ is a class D local martingale, hence it is a uniformly integrable martingale⁴².

2. Let $\tau$ be an arbitrary stopping time. Let $(\sigma_n)$ be a localizing sequence of $M$. One can assume that $M^{\sigma_n} - M(0)$ is bounded⁴³. Let $N := M^{\tau \wedge \sigma_n} - M(0)$. By the definition of the quadratic variation

$$N^2(t) = 2\int_0^t N_-\,dN + [N](t).$$

As $N_-$ is bounded, the Itô–Stieltjes integral defines a martingale⁴⁴. So

$$E\left(N^2(t)\right) = E\left([N](t)\right) = E\left([M^{\tau\wedge\sigma_n}](t)\right) \leq E\left([M](\infty)\right).$$

Applying Fatou's lemma,

$$E\left((M - M(0))^2(\tau)\right) \leq E\left([M](\infty)\right). \tag{2.23}$$

By the second assumption of 2. the expected value on the right-hand side is finite, so the set of variables $S$ of the type $(M - M(0))(\tau)$ is bounded in $L^2(\Omega)$. Hence $S$ is a uniformly integrable set and therefore $M - M(0)$ is a class D local martingale and hence it is a martingale⁴⁵. By (2.23) $M - M(0)$ is trivially bounded in $L^2(\Omega)$, that is, $M - M(0) \in \mathcal{G}^2$. As $M(0) \in L^2(\Omega)$ by the first assumption of 2., obviously $M \in \mathcal{G}^2$.

Corollary 2.54 If $M \in \mathcal{G}^2$ and $\sigma \leq \tau$ are stopping times then

$$E\left(M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma\right) = E\left([M](\tau) - [M](\sigma) \mid \mathcal{F}_\sigma\right) = E\left((M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma\right),$$

specifically

$$E\left(M^2(\tau)\right) - E\left(M^2(0)\right) = E\left([M](\tau)\right). \tag{2.24}$$

⁴² See: Proposition 1.144, page 102. ⁴³ In the general case, when $M$ is not necessarily continuous, one can assume that $M_-^{\sigma_n} - M(0)$ is bounded. ⁴⁴ See: Proposition 2.24, page 128. ⁴⁵ See: Proposition 1.144, page 102.
Proof. By the previous proposition $M^2 - [M]$ is a uniformly integrable martingale, hence if $\sigma \leq \tau$ then by the Optional Sampling Theorem

$$E\left(M^2(\tau) - [M](\tau) \mid \mathcal{F}_\sigma\right) = M^2(\sigma) - [M](\sigma),$$

from which the first equation follows. $M$ is also uniformly integrable, hence again by the Optional Sampling Theorem $M(\sigma) = E(M(\tau) \mid \mathcal{F}_\sigma)$. Therefore

$$E\left((M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma\right) = E\left(M^2(\tau) + M^2(\sigma) - 2M(\sigma)M(\tau) \mid \mathcal{F}_\sigma\right) =$$
$$= E\left(M^2(\tau) + M^2(\sigma) - 2M^2(\sigma) \mid \mathcal{F}_\sigma\right) = E\left(M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma\right).$$

Let $M$ be a semimartingale. Let us define

$$\alpha_M(C) := E\left(\int_0^\infty \chi_C\,d[M]\right),$$

where the integral with respect to $[M]$ is the pathwise Lebesgue–Stieltjes integral generated by the increasing, right-regular⁴⁶ process $[M]$. It is not entirely trivial that $\alpha_M$ is well-defined, that is, that the expression under the expected value is measurable. By the Monotone Convergence Theorem

$$E\left(\int_0^\infty \chi_C\,d[M]\right) = E\left(\lim_{n\to\infty}\int_0^n \chi_C\,d[M]\right).$$

As $\int_0^n \chi_C\,d[M]$ is measurable⁴⁷ for every $n$, the parametric integral under the expected value is measurable. Obviously $\alpha_M$ is a measure on $\mathcal{B}(\mathbb{R}_+) \times \mathcal{A}$.

Example 2.55 If $M \in \mathcal{G}^2$ and $\tau$ is a stopping time then

$$\alpha_M([0,\tau]) = E\left(M^2(\tau)\right) - E\left(M^2(0)\right).$$

If $M \in \mathcal{G}^2_0$ then

$$\|M\|^2_{\mathcal{H}^2} = E\left(M^2(\infty)\right) = E\left([M](\infty)\right) = \alpha_M(\mathbb{R}_+ \times \Omega). \tag{2.25}$$

⁴⁶ Of course, tacitly we again assume that $[M]$ has a right-regular version. ⁴⁷ See: Proposition 1.20, page 11.
If $\tau$ is an arbitrary random time then

$$\alpha_M([0,\tau]) := E\left(\int_0^\infty \chi([0,\tau])\,d[M]\right) = E\left([M](\tau) - [M](0)\right) = E\left([M](\tau)\right).$$

By (2.24), for every stopping time

$$E\left([M](\tau)\right) = E\left(M^2(\tau)\right) - E\left(M^2(0)\right),$$

hence

$$\alpha_M([0,\tau]) = E\left([M](\tau)\right) - E\left([M](0)\right) = E\left([M](\tau)\right) = E\left(M^2(\tau)\right) - E\left(M^2(0)\right).$$

If $M \in \mathcal{G}^2_0$ then $M(0) = 0$, hence by (2.24)

$$E\left(M^2(\infty)\right) = E\left(M^2(\infty)\right) - E\left(M^2(0)\right) = E\left([M](\infty)\right).$$

The other relations are consequences of the definitions.

Definition 2.56 $\alpha_M$ is called the Doléans measure⁴⁸ generated by the quadratic variation of $M$.

2.3.2 Integration with respect to continuous local martingales
Let us start with the simplest case:

Definition 2.57 Let $M$ be a continuous local martingale. Let $L^2(M)$ denote the space of equivalence classes of square-integrable, progressively measurable functions on the measure space $(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, that is, let $L^2(M) := L^2(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, where $\mathcal{R}$, as before, denotes the $\sigma$-algebra of progressively measurable sets. Let $\|\cdot\|_M$ denote the norm of the Hilbert space $L^2(M)$:

$$\|X\|_M := \sqrt{\int_{\mathbb{R}_+ \times \Omega} X^2\,d\alpha_M} = \sqrt{E\left(\int_0^\infty X^2\,d[M]\right)}.$$

Example 2.58 The space $L^2(w)$.

⁴⁸ See: Definition 5.4, page 295.
The quadratic variation of a Wiener process on an interval $[0,s]$ is $s$. Hence on the interval $[0,t]$ we have $\|X\|_w^2 = E\left(\int_0^t X^2(s)\,ds\right)$. If $t < \infty$ then $w \in L^2(w)$, since by Fubini's theorem

$$\|w\|_w^2 = E\left(\int_0^t w^2(s)\,ds\right) = \int_0^t E\left(w^2(s)\right)ds = \int_0^t s\,ds < \infty.$$
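The value $\|w\|_w^2 = \int_0^t s\,ds = t^2/2$ can be checked by simulation. The following Monte Carlo sketch is our own illustration (the grid, seed and path counts are arbitrary assumptions); for $t = 1$ the exact value is $1/2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_paths = 500, 4000
dt = 1.0 / n_steps
# discretized Wiener paths on [0, 1]
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)
# Monte Carlo estimate of ||w||_w^2 = E int_0^1 w^2(s) ds  (exact value 1/2)
norm_sq = float(((w ** 2).sum(axis=1) * dt).mean())
```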
The main result of this section is the following:

Proposition 2.59 (Stochastic integration and quadratic variation) If $M$ is a continuous local martingale and $X \in L^2(M)$ then there is a unique process in $\mathcal{G}^2_0$, denoted by $X \bullet M$, such that for every $N \in \mathcal{G}^2$

$$[X \bullet M, N] = X \bullet [M,N]. \tag{2.26}$$

If we denote $X \bullet M$ by $\int_0^t X\,dM$ then (2.26) can be written as

$$\left[\int_0^t X\,dM,\, N\right] = \int_0^t X\,d[M,N].$$

Proof. We divide the proof into several steps. We prove that $X \bullet M$ exists and that the definition of $X \bullet M$ is correct, that is, that the process $X \bullet M$ is unique.

1. The proof of uniqueness is easy. If $I_1$ and $I_2$ are two processes in $\mathcal{G}^2_0$ satisfying (2.26) then $[I_1, N] = [I_2, N]$ for all $N \in \mathcal{G}^2_0$. Hence $[I_1 - I_2, N] = 0$ for all $N \in \mathcal{G}^2_0$. As $I_1 - I_2 \in \mathcal{G}^2_0$,

$$[I_1 - I_2, I_1 - I_2] := [I_1 - I_2] = 0,$$

hence $I_1 - I_2$ is constant⁴⁹. As $I_1 - I_2 \in \mathcal{G}^2_0$, $I_1 - I_2 = 0$, so $I_1 = I_2$.

2. Now we prove the existence of $X \bullet M$. Assume first that $N \in \mathcal{G}^2_0$. By the Kunita–Watanabe inequality⁵⁰ and by formula (2.25)

$$E\left(\left|\int_0^\infty X\,d[M,N]\right|\right) \leq \sqrt{E\left(\int_0^\infty X^2\,d[M]\right)}\,\sqrt{E\left(\int_0^\infty d[N]\right)} = \|X\|_M\,\sqrt{E\left([N](\infty)\right)} = \|X\|_M\,\|N\|_{\mathcal{H}^2}. \tag{2.27}$$

⁴⁹ See: Proposition 2.47, page 144. ⁵⁰ See: Corollary 2.35, page 137.
Observe that $\|X\|_M\,\|N\|_{\mathcal{H}^2} < \infty$, hence $\int_0^\infty X\,d[M,N]$ is almost surely finite. So the right-hand side of (2.26) is well-defined. By the bilinearity of the quadratic co-variation,

$$N \mapsto E\left(\int_0^\infty X\,d[M,N]\right)$$

is a continuous linear functional on the Hilbert space $\mathcal{G}^2_0$. As every continuous linear functional on a Hilbert space has a scalar product representation, there is an $X \bullet M \in \mathcal{G}^2_0$ such that for every $N \in \mathcal{G}^2_0$

$$E\left(\int_0^\infty X\,d[M,N]\right) = (X \bullet M, N) := E\left((X \bullet M)(\infty)\,N(\infty)\right). \tag{2.28}$$

3. The main part of the proof is to show that for $X \bullet M$ the identity (2.26) holds. Define the process

$$S := (X \bullet M)\,N - X \bullet [M,N].$$

To prove (2.26) we show that $S$ is a continuous martingale, hence by the uniqueness of the quadratic co-variation $[X \bullet M, N] = X \bullet [M,N]$. First observe that $S$ is adapted: $(X \bullet M)N$ is a product of two martingales, that is, the product of two adapted processes; $X$ is progressively measurable by the definition of $L^2(M)$, so the integral $\int_0^t X\,d[M,N]$ is also adapted⁵¹. $S$ is continuous: by the construction $(X \bullet M)N$ is a product of two continuous functions, so it is continuous, and since $M$ and $N$ are continuous the quadratic variation $[M,N]$ is also continuous; therefore the integral $\int_0^t X\,d[M,N]$ as a function of $t$ is continuous. Finally, to show that $S$ is a martingale one should prove that⁵²

$$E(S(\tau)) = E(S(0)) = 0 \tag{2.29}$$

for every bounded stopping time $\tau$. By definition $X \bullet M$ is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem

$$(X \bullet M)(\tau) = E\left((X \bullet M)(\infty) \mid \mathcal{F}_\tau\right).$$

⁵¹ See: Proposition 1.20, page 11. ⁵² See: Proposition 1.91, page 57.
Using that $N^\tau \in \mathcal{G}^2_0$ and (2.28),

$$E(S(\tau)) := E\left((X \bullet M)(\tau)N(\tau) - \int_0^\tau X\,d[M,N]\right) =$$
$$= E\left((X \bullet M)(\tau)N(\tau)\right) - E\left(\int_0^\tau X\,d[M,N]\right) =$$
$$= E\left(E\left((X \bullet M)(\infty) \mid \mathcal{F}_\tau\right)N(\tau)\right) - E\left(\int_0^\tau X\,d[M,N]\right) =$$
$$= E\left(E\left((X \bullet M)(\infty)N(\tau) \mid \mathcal{F}_\tau\right)\right) - E\left(\int_0^\tau X\,d[M,N]\right) =$$
$$= E\left((X \bullet M)(\infty)N(\tau)\right) - E\left(\int_0^\infty X\,d[M,N^\tau]\right) = 0,$$

where the last expression vanishes by (2.28), since $N(\tau) = N^\tau(\infty)$ and $\int_0^\tau X\,d[M,N] = \int_0^\infty X\,d[M,N^\tau]$. Therefore (2.29) holds.

4. Finally, if $N \in \mathcal{G}^2$ then $N - N(0) \in \mathcal{G}^2_0$, hence

$$[X \bullet M, N] = [X \bullet M, N - N(0)] = X \bullet [M, N - N(0)] = X \bullet [M,N].$$
Proposition 2.60 (Stopping rule for stochastic integrals) If $M$ is an arbitrary continuous local martingale, $X \in L^2(M)$ and $\tau$ is an arbitrary stopping time then

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M = (X \bullet M)^\tau = X^\tau \bullet M^\tau. \tag{2.30}$$

Proof. By (2.26) and by the stopping rule for the quadratic variation, if $N \in \mathcal{G}^2$ then

$$[(X \bullet M)^\tau, N] = [X \bullet M, N]^\tau = (X \bullet [M,N])^\tau = X \bullet [M,N]^\tau = X \bullet [M^\tau, N] = [X \bullet M^\tau, N].$$

By the bilinearity of the quadratic variation

$$[(X \bullet M)^\tau - X \bullet M^\tau, N] = 0, \qquad N \in \mathcal{G}^2,$$

from which $[(X \bullet M)^\tau - X \bullet M^\tau] = 0$, that is,

$$(X \bullet M)^\tau = X \bullet M^\tau.$$
If $X \in L^2(M)$ then trivially $\chi([0,\tau])X \in L^2(M)$. For every $N \in \mathcal{G}^2$

$$[X \bullet M^\tau, N] = X \bullet [M^\tau, N] = X \bullet [M,N]^\tau = (\chi([0,\tau])X) \bullet [M,N] = [(\chi([0,\tau])X) \bullet M, N],$$

hence again

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M.$$

Using stopping rule (2.30) we can extend the stochastic integral to the space $L^2_{loc}(M)$.

Definition 2.61 Let $M$ be a continuous local martingale. The space $L^2_{loc}(M)$ is the set of progressively measurable processes $X$ for which there is a localizing sequence of stopping times $(\tau_n)$ such that

$$E\left(\int_0^\infty X^2\,d[M^{\tau_n}]\right) = E\left(\int_0^{\tau_n} X^2\,d[M]\right) = E\left(\int_0^\infty \chi([0,\tau_n])X^2\,d[M]\right) = \int_{(0,\infty)\times\Omega} \chi([0,\tau_n])X^2\,d\alpha_M < \infty.$$
Example 2.62 If $M$ is a continuous local martingale and $X$ is locally bounded then $X \in L^2_{loc}(M)$.

One can assume that $X(0) = 0$, as obviously every $\mathcal{F}_0$-measurable constant process is in $L^2_{loc}$. As $M$ is continuous, $M \in \mathcal{H}^2_{loc}$. Let $(\tau_n)$ be a common localizing sequence of $X$ and $M$. $M^{\tau_n} \in \mathcal{H}^2$, so⁵³ $[M^{\tau_n}](\infty) \in L^1(\Omega)$. Therefore

$$E\left(\int_0^\infty X^2\,d[M^{\tau_n}]\right) \leq \sup_{t \leq \tau_n} X^2(t)\; E\left([M^{\tau_n}](\infty)\right) < \infty,$$

where the supremum is bounded by a constant, since $(\tau_n)$ localizes the locally bounded process $X$.

Proposition 2.63 If $M$ is a continuous local martingale then for every $X \in L^2_{loc}(M)$ there is a process denoted by $X \bullet M$ such that

1. $(X \bullet M)(0) = 0$ and $X \bullet M$ is a continuous local martingale,
2. for every continuous local martingale $N$

$$[X \bullet M, N] = X \bullet [M,N]. \tag{2.31}$$

⁵³ See: Proposition 2.53, page 148.
$X \bullet M$ is unambiguously defined by (2.31); that is, $X \bullet M$ is the only continuous local martingale for which (2.31) holds for every continuous local martingale $N$.

Proof. $M$ is a continuous local martingale, so it is locally bounded, hence $M \in \mathcal{H}^2_{loc}$. Assume that $X \in L^2_{loc}(M)$ and let $(\tau_n)$ be such a localizing sequence of $X$ for which $E\left(\int_0^\infty X^2\,d[M^{\tau_n}]\right) < \infty$, that is, let $X \in L^2(M^{\tau_n})$. Consider the integrals $I_n := X \bullet M^{\tau_n}$. Then

$$I_{n+1}^{\tau_n} := \left(X \bullet M^{\tau_{n+1}}\right)^{\tau_n} = X \bullet \left(M^{\tau_{n+1}}\right)^{\tau_n} = X \bullet M^{\tau_n} = I_n,$$

hence $I_{n+1}$ and $I_n$ are equal on $[0, \tau_n]$. One can define the integral process $X \bullet M$ unambiguously if for all $n$ the value of $X \bullet M$ is, by definition, $I_n$ on the interval $[0, \tau_n]$. By the stopping rule for stochastic integrals it is obvious from the construction that $X \bullet M$ is independent of the localizing sequence $(\tau_n)$. Obviously $(X \bullet M)(0) = 0$ and $X \bullet M$ is continuous. Trivially

$$(X \bullet M)^{\tau_n} := \left(X \bullet M^{\tau_n}\right)^{\tau_n} = X \bullet M^{\tau_n},$$

and $X \bullet M^{\tau_n} \in \mathcal{G}^2_0$, hence $(X \bullet M)^{\tau_n}$ is a uniformly integrable martingale, so $X \bullet M$ is a local martingale. We should prove (2.31). Let $(\tau_n)$ be a localizing sequence such that $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$. As $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$, by the stopping rule for the quadratic variation⁵⁴

$$[X \bullet M, N]^{\tau_n} = [(X \bullet M)^{\tau_n}, N^{\tau_n}] := [X \bullet M^{\tau_n}, N^{\tau_n}] = X \bullet [M^{\tau_n}, N^{\tau_n}] = X \bullet [M,N]^{\tau_n} = (X \bullet [M,N])^{\tau_n},$$

hence (2.31) is valid.

Let us prove some elementary properties of the stochastic integral. The most important properties are simple consequences of (2.31), the basic properties of the quadratic variation and the analogous properties of pathwise integration.

Proposition 2.64 (Itô's isometry) If $M$ is a continuous local martingale then the mapping $X \mapsto X \bullet M$ is an $L^2(M) \to \mathcal{G}^2_0$ isometry. That is, if $X \in L^2(M)$ then
2
0
54 See:
Proposition 2.45, page 143.
∞
X d [M ] . 2
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
157
Proof. Using the definition of the norm in H2 and (2.25), by (2.31)
2 2 X • M H2 E (X • M ) (∞) = E ([X • M ] (∞)) E ([X • M, X • M ] (∞)) = ∞ =E Xd [X • M, M ] = E 0
∞
Xd (X • [M ]) .
0
In the right-hand side of the identity [X • M, M ] = X • [M, M ]. The integral is taken pathwise, hence ∞ 2 X • M H2 = E Xd (X • [M ]) = 0
∞
=E 0
2 X 2 d [M ] XM ,
and hence the mapping X → X • M is an isometry. 1
Example 2.65 The standard deviation of
0
√ wdw is 1/ 2.
The integral is meaningful and as on finite intervals w ∈ L2 (w) the integral 1 process w • w is a martingale. Hence the expected value of the integral 0 wdw is zero. By Itˆo’s isometry and by Fubini’s theorem 2 1 1 1
2 wdw w (s) ds = E w2 (s) ds = =E E 0
0
0
1
=
sds = 0
1 . 2
√ Hence the standard deviation is 1/ 2. We can calculate the standard deviation in the following way as well:
2
t
wdw
−
0
wdw 0
is a martingale, hence 2 1 E wdw =E 0
1
wdw
E
0
1
0
w2 d [w] = 0
1
wdw,
0
=E using (2.26) directly.
t
1
1
wdw 0
1 E w2 (s) ds = , 2
=
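Example 2.65 can be reproduced by Monte Carlo. The sketch below is our own illustration (seed, grid and path counts are arbitrary assumptions): it builds the Itô sums for $\int_0^1 w\,dw$ with left-endpoint evaluation and checks that their mean is near $0$ and their variance near $1/2$.

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_paths = 1000, 4000
dt = 1.0 / n_steps
dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
w = np.cumsum(dw, axis=1)
w_prev = np.hstack([np.zeros((n_paths, 1)), w[:, :-1]])  # left endpoints w(t_{k-1})
ito_sum = (w_prev * dw).sum(axis=1)   # Ito sums approximating int_0^1 w dw

mean_est = float(ito_sum.mean())      # expected value 0: the integral is a martingale
var_est = float(ito_sum.var())        # should be close to 1/2
```

Note that the left-endpoint evaluation is essential: midpoint or right-endpoint sums would converge to a different (Stratonovich-type) limit with a different variance.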
Proposition 2.66 If $M$ is a continuous local martingale and $X \in L^2_{loc}(M)$ then

$$[X \bullet M] = X^2 \bullet [M]. \tag{2.32}$$

Proof. By a simple calculation using (2.31), and the fact that on the right-hand side of (2.31) we have a pathwise integral,

$$[X \bullet M] := [X \bullet M, X \bullet M] = X \bullet [M, X \bullet M] = X \bullet (X \bullet [M,M]) = X^2 \bullet [M].$$

Corollary 2.67 If $M$ is a continuous local martingale and $X$ is a progressively measurable process then $X \in L^2_{loc}(M)$ if and only if for all $t$ almost surely

$$\int_0^t X^2\,d[M] := \left(X^2 \bullet [M]\right)(t) < \infty. \tag{2.33}$$

Proof. The quadratic variation $[X \bullet M]$, like every quadratic variation, is almost surely finite, hence if $X \in L^2_{loc}(M)$ then by (2.32), (2.33) holds. On the other hand, assume that (2.33) holds. For all $n$ let us define the stopping times

$$\tau_n := \inf\left\{ t : \int_0^t X^2\,d[M] \geq n \right\}.$$

As $[M]$ is continuous, $X^2 \bullet [M]$ is also continuous, hence

$$\int_0^{\tau_n} X^2\,d[M] \leq n,$$

that is, $X \in L^2(M^{\tau_n})$, hence $X \in L^2_{loc}(M)$. So the space $L^2_{loc}(M)$ contains all the $\mathcal{R}$-measurable processes for which (2.33) holds for all $t$.

Corollary 2.68 Assume that $M$ is a local martingale and $X \in L^2_{loc}(M)$. If on an interval $[a,b]$

1. $X(t,\omega) = 0$ for all $\omega$, or
2. $M(t,\omega) = M(a,\omega)$,

then $X \bullet M$ is constant on $[a,b]$.

Proof. The integral $X^2 \bullet [M,M]$ is a pathwise integral, hence under the assumptions $X^2 \bullet [M,M]$ is constant on $[a,b]$. As $[X \bullet M] = X^2 \bullet [M]$, the local martingale $X \bullet M$ is constant⁵⁵ on $[a,b]$.

⁵⁵ See: Proposition 2.47, page 144.
Proposition 2.69 (Stopping rule for stochastic integrals) If $M$ is a continuous local martingale, $X \in L^2_{loc}(M)$ and $\tau$ is an arbitrary stopping time then

$$(X \bullet M)^\tau = \chi([0,\tau])X \bullet M = X^\tau \bullet M^\tau = X \bullet M^\tau. \tag{2.34}$$

Proof. Let $\tau$ be an arbitrary stopping time. If $X \in L^2_{loc}(M)$, then as $|\chi([0,\tau])X| \leq |X|$, trivially $\chi([0,\tau])X \in L^2_{loc}(M)$. Using the analogous properties of the $L^2(M)$ integrals,

$$\left((X \bullet M)^\tau\right)^{\tau_n} = \left((X \bullet M)^{\tau_n}\right)^\tau := \left(X \bullet M^{\tau_n}\right)^\tau = (\chi([0,\tau])X) \bullet M^{\tau_n} := \left((\chi([0,\tau])X) \bullet M\right)^{\tau_n}.$$

The proofs of the other parts of (2.34) are analogous.

Proposition 2.70 (Linearity) $X \bullet M$ is bilinear, that is, if $\alpha_1$ and $\alpha_2$ are constants then

$$X \bullet (\alpha_1 M_1 + \alpha_2 M_2) = \alpha_1 (X \bullet M_1) + \alpha_2 (X \bullet M_2)$$

and

$$(\alpha_1 X_1 + \alpha_2 X_2) \bullet M = \alpha_1 (X_1 \bullet M) + \alpha_2 (X_2 \bullet M),$$

when all the expressions are meaningful. In these relations, if two integrals are meaningful then the third one is meaningful.

Proof. If $X \in L^2_{loc}(M_1) \cap L^2_{loc}(M_2)$ then for all $t$
$$\int_0^t X^2\,d[M_1] < \infty \qquad \text{and} \qquad \int_0^t X^2\,d[M_2] < \infty.$$

Obviously, by the Kunita–Watanabe inequality⁵⁶, $[M_1 + M_2] \leq 2([M_1] + [M_2])$, hence

$$\int_0^t X^2\,d[M_1 + M_2] \leq 2\left(\int_0^t X^2\,d[M_1] + \int_0^t X^2\,d[M_2]\right) < \infty,$$

⁵⁶ See: Corollary 2.36, page 137.
therefore $X \in L^2_{loc}(M_1 + M_2)$. From the linearity of the pathwise integration and from the bilinearity of the quadratic variation

$$[X \bullet (\alpha_1 M_1 + \alpha_2 M_2), N] = X \bullet [(\alpha_1 M_1 + \alpha_2 M_2), N] = X \bullet (\alpha_1 [M_1,N] + \alpha_2 [M_2,N]) =$$
$$= \alpha_1 X \bullet [M_1,N] + \alpha_2 X \bullet [M_2,N] = [\alpha_1 X \bullet M_1 + \alpha_2 X \bullet M_2, N],$$

from which the linearity of the integral in the integrator is evident. The linearity in the integrand is also evident, as

$$[(\alpha_1 X_1 + \alpha_2 X_2) \bullet M, N] = (\alpha_1 X_1 + \alpha_2 X_2) \bullet [M,N] = \alpha_1 X_1 \bullet [M,N] + \alpha_2 X_2 \bullet [M,N] =$$
$$= [\alpha_1 X_1 \bullet M, N] + [\alpha_2 X_2 \bullet M, N] = [\alpha_1 X_1 \bullet M + \alpha_2 X_2 \bullet M, N].$$

The remark about the integrability is evident from the trivial linearity of the space $L^2_{loc}(M)$.

Proposition 2.71 (Associativity) If $X \in L^2(M)$ then $Y \in L^2(X \bullet M)$ if and only if $XY \in L^2(M)$. If $X \in L^2_{loc}(M)$ then $Y \in L^2_{loc}(X \bullet M)$ if and only if $XY \in L^2_{loc}(M)$. In both cases

$$(YX) \bullet M = Y \bullet (X \bullet M). \tag{2.35}$$

Proof. Using the construction of the stochastic integral and the fact that the associativity formula (2.35) is valid for pathwise integration,

$$[X \bullet M] = [X \bullet M, X \bullet M] = X \bullet [M, X \bullet M] = X \bullet (X \bullet [M,M]) = X^2 \bullet [M,M].$$

By the associativity of the pathwise integration for non-negative integrands

$$E\left(\int_0^\infty Y^2\,d[X \bullet M]\right) = E\left(\int_0^\infty Y^2\,d\left(\int_0^s X^2\,d[M]\right)\right) = E\left(\int_0^\infty Y^2 X^2\,d[M]\right),$$

hence $YX \in L^2(M)$ if and only if $Y \in L^2(X \bullet M)$. If $X \in L^2(M)$, then by the Kunita–Watanabe inequality, for almost all $\omega$ the trajectory $X(\omega)$ is integrable
with respect to $[M,N](\omega)$. If $XY \in L^2(M)$ then, using (2.26) again,

$$[(YX) \bullet M, N] = (YX) \bullet [M,N] := \int_0^t YX\,d[M,N] = \int_0^t Y\,d\left(\int_0^s X\,d[M,N]\right) := Y \bullet (X \bullet [M,N]). \tag{2.36}$$

Using (2.26) and that $Y \in L^2(X \bullet M)$,

$$Y \bullet (X \bullet [M,N]) = Y \bullet [X \bullet M, N] = [Y \bullet (X \bullet M), N].$$

Comparing this with line (2.36),

$$[(YX) \bullet M, N] = [Y \bullet (X \bullet M), N].$$

Hence by the uniqueness of the stochastic integral $(YX) \bullet M = Y \bullet (X \bullet M)$.

To prove the general case, observe that $XY \in L^2_{loc}(M)$ if and only if for some localizing sequence $(\tau_n)$

$$E\left(\left(\chi([0,\tau_n])X^2Y^2 \bullet [M]\right)(\infty)\right) < \infty.$$

As

$$\chi([0,\tau_n])Y^2 \bullet \left(X^2 \bullet [M]\right) = \chi([0,\tau_n])Y^2X^2 \bullet [M],$$

$XY \in L^2_{loc}(M)$ if and only if $Y \in L^2_{loc}(X \bullet M)$. Let $(\tau_n)$ be a common localizing sequence for $M$ and $X \bullet M$. If $Y \in L^2_{loc}(X \bullet M)$ then evidently

$$Y \in L^2\left((X \bullet M)^{\tau_n}\right) = L^2\left(X \bullet M^{\tau_n}\right).$$

So

$$\left(Y \bullet (X \bullet M)\right)^{\tau_n} := Y \bullet (X \bullet M)^{\tau_n} = Y \bullet \left(X \bullet M^{\tau_n}\right) = (YX) \bullet M^{\tau_n} := \left((YX) \bullet M\right)^{\tau_n},$$

from which the associativity is evident.
2.3.3 Integration with respect to semimartingales

We can extend the definition of the stochastic integration again, to semimartingales:

Definition 2.72 Let $X = X(0) + L + V$ be a continuous semimartingale. If for some process $Y$ the integrals $Y \bullet L$ and $Y \bullet V$ are meaningful, then the stochastic integral $Y \bullet X$ of $Y$ with respect to $X$ is by definition the sum

$$Y \bullet X := Y \bullet L + Y \bullet V.$$

Remember that by Fisk's theorem the decomposition $X = X(0) + L + V$ is unique, hence the integral is well-defined.

Proposition 2.73 The most important properties of the stochastic integral $Y \bullet X$ are the following:

1. $Y \bullet X$ is bilinear, that is,
$$Y \bullet (\alpha_1 X_1 + \alpha_2 X_2) = \alpha_1 (Y \bullet X_1) + \alpha_2 (Y \bullet X_2)$$
and
$$(\alpha_1 Y_1 + \alpha_2 Y_2) \bullet X = \alpha_1 (Y_1 \bullet X) + \alpha_2 (Y_2 \bullet X),$$
assuming that all the expressions are meaningful. If two integrals are meaningful then the third is meaningful.
2. For all locally bounded processes $Y, Z$: $\;Z \bullet (Y \bullet X) = (ZY) \bullet X$.
3. For every stopping time $\tau$: $\;(Y \bullet X)^\tau = (Y\chi([0,\tau])) \bullet X = Y \bullet X^\tau$.
4. If the integrator $X$ is a local martingale, or if $X$ has bounded variation on finite intervals, then the same is true for the integral process $Y \bullet X$.
5. $Y \bullet X$ is constant on any interval where either $Y = 0$ or $X$ is constant.
6. $[Y \bullet X, Z] = Y \bullet [X,Z]$ for any continuous semimartingale $Z$.

2.3.4 The Dominated Convergence Theorem for stochastic integrals
A crucial property of every integral is that under some conditions one can swap the order of taking the limit and the integration:

Proposition 2.74 (Dominated Convergence Theorem for stochastic integrals) Let $X$ be a continuous semimartingale, and let $(Y_n)$ be a sequence of progressively measurable processes. Assume that $(Y_n(t,\omega))$ converges to $Y_\infty(t,\omega)$ at every point $(t,\omega)$. If there is an integrable process $Y$ such that⁵⁷ $|Y_n| \leq Y$ for all $n$, then $Y_n \bullet X \to Y_\infty \bullet X$, where the convergence is uniform in probability on every compact interval, that is,

$$\sup_{s \leq t} |(Y_n \bullet X)(s) - (Y_\infty \bullet X)(s)| \xrightarrow{p} 0$$

for all $t \geq 0$.

Proof. One can prove the proposition separately when $X$ has finite variation and when $X$ is a local martingale. It is sufficient to prove the proposition when $Y_\infty \equiv 0$.

1. First, assume that $X$ has finite variation. In this case the integrability of $Y$ means that for every $t$

$$\int_0^t |Y|\,d\mathrm{Var}(X) < \infty.$$

As $|Y_n| \leq Y$, for every $\omega$ the trajectory $Y_n(\omega)$ is also integrable on every interval $[0,t]$. Applying the classical Dominated Convergence Theorem to every trajectory individually, for all $s \leq t$

$$\left|\int_0^s Y_n\,dX\right| \leq \int_0^t |Y_n|\,d\mathrm{Var}(X) \to 0.$$

Hence the integral, as a function of the upper bound, converges uniformly to zero. Pointwise convergence on a finite measure space implies convergence in measure, so when the integrator has finite variation the proposition holds.

2. Let $X$ be a local martingale. $Y$ is integrable with respect to $X$, hence by definition $Y \in L^2_{loc}(X)$. Let $\varepsilon, \delta > 0$ be arbitrary, and let $(\tau_n)$ be a localizing sequence of $Y$. To make the notation simpler, let us denote by $\sigma$ a $\tau_n$ for which $P(\tau_n < t) \leq \delta/2$. By the stopping rule $(Y_n \bullet X)^\sigma = Y_n \bullet X^\sigma$, that is, if $s \leq \sigma(\omega)$ then $(Y_n \bullet X)(s,\omega) = (Y_n \bullet X^\sigma)(s,\omega)$. If

$$A := \left\{ \sup_{s \leq t} |Y_n \bullet X|(s) > \varepsilon \right\}, \qquad A^\sigma := \left\{ \sup_{s \leq t} |Y_n \bullet X^\sigma|(s) > \varepsilon \right\},$$

⁵⁷ The integrability of $Y$ depends on the integrator $X$. If $X$ is a local martingale, then by definition this means that $Y \in L^2_{loc}(X)$.
then

$$P(A) = P((\sigma < t) \cap A) + P((\sigma \geq t) \cap A) \leq P(\sigma < t) + P((t \leq \sigma) \cap A) \leq \frac{\delta}{2} + P(A^\sigma).$$

Since $Y \in L^2(X^\sigma)$, obviously $Y_n \in L^2(X^\sigma)$. Hence by the classical Dominated Convergence Theorem, as $Y_n \to 0$ and $|Y_n| \leq Y$,

$$\|Y_n\|^2_{X^\sigma} := E\left(\int_0^\infty Y_n^2\,d[X^\sigma]\right) = E\left(\int_0^\sigma Y_n^2\,d[X]\right) = E\left(\int_0^\infty \chi([0,\sigma])Y_n^2\,d[X]\right) \to 0,$$

that is, $Y_n \to 0$ in $L^2(X^\sigma)$. By Itô's isometry the correspondence $Z \mapsto Z \bullet X^\sigma$ is an $L^2(X^\sigma) \to \mathcal{H}^2$ isometry⁵⁸. Hence $Y_n \bullet X^\sigma \xrightarrow{\mathcal{H}^2} 0$. By Doob's inequality⁵⁹

$$E\left(\sup_{s \leq \infty} |Y_n \bullet X^\sigma|^2(s)\right) \leq 4\,E\left((Y_n \bullet X^\sigma)^2(\infty)\right) := 4\,\|Y_n \bullet X^\sigma\|^2_{\mathcal{H}^2} \to 0.$$

By Markov's inequality, stochastic convergence follows from the $L^2(\Omega)$-convergence, hence

$$P(A^\sigma) := P\left(\sup_{s \leq t} |Y_n \bullet X^\sigma|(s) > \varepsilon\right) \to 0.$$

Hence for $n$ large enough

$$P(A) := P\left(\sup_{s \leq t} |Y_n \bullet X|(s) > \varepsilon\right) \leq \delta.$$
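The uniform convergence asserted by the theorem can be visualized numerically. The following sketch is our own, deliberately simple illustration (all names and parameters are assumptions): the integrands $Y_k = \sin(w) + 1/k$ converge pointwise to $\sin(w)$ and are dominated by $|\sin(w)| + 1$, and the difference of the discretized integral processes is exactly $w(t)/k$, so the supremum errors shrink proportionally to $1/k$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
dw = rng.normal(0.0, np.sqrt(1.0 / n), n)
w = np.concatenate([[0.0], np.cumsum(dw)])

def integral_process(integrand, dX):
    """Left-point sums of the integral process on the full grid."""
    return np.concatenate([[0.0], np.cumsum(integrand[:-1] * dX)])

Y_limit = np.sin(w)                    # the limit integrand Y_infinity
sup_errors = []
for k in (1, 2, 4, 8, 16):
    Y_k = Y_limit + 1.0 / k            # Y_k -> Y_limit, dominated by |Y_limit| + 1
    diff = integral_process(Y_k, dw) - integral_process(Y_limit, dw)
    sup_errors.append(float(np.max(np.abs(diff))))
```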
2.3.5 Stochastic integration and the Itô–Stieltjes integral

As we mentioned, every integral is in some sense the limit of certain approximating sums. From the construction above it is not clear in which sense the integral $X \bullet M$ is a limit of the approximating sums.

⁵⁸ See: Itô's isometry, Proposition 2.64, page 156. ⁵⁹ See: line (1.17), page 34; Proposition 2.52, page 147.
Lemma 2.75 If $X$ is a continuous semimartingale and

$$Y := \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}])$$

is an integrable, non-negative predictable simple process⁶⁰ then

$$(Y \bullet X)(t) = \sum_i \eta_i \cdot \left(X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)\right).$$

Proof. If $\sigma \leq \tau$ are stopping times, then using the linearity and the stopping rule,

$$\chi((\sigma,\tau]) \bullet X = (\chi([0,\tau]) - \chi([0,\sigma])) \bullet X = (1 \bullet X)^\tau - (1 \bullet X)^\sigma = X^\tau - X^\sigma.$$

Hence the formula holds with $\eta \equiv 1$. It is easy to check that if $F \in \mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$ then

$$\sigma_F(\omega) := \begin{cases} \sigma(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases}, \qquad \tau_F(\omega) := \begin{cases} \tau(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases}$$

are also stopping times, hence

$$(\chi_F\,\chi((\sigma,\tau])) \bullet X = \chi((\sigma_F, \tau_F]) \bullet X = X^{\tau_F} - X^{\sigma_F} = \chi_F\left(X^\tau - X^\sigma\right),$$

hence the formula is valid if $\eta = \chi_F$, $F \in \mathcal{F}_\sigma$. If $\eta$ is an $\mathcal{F}_\sigma$-measurable step function, then since the integral is linear one can write $\eta$ in the place of $\chi_F$. It is easy to show that for any $\mathcal{F}_\sigma$-measurable function $\eta$ the process $\eta\chi((\sigma,\tau])$ is integrable with respect to $X$, hence using the Dominated Convergence Theorem one can prove the formula when $\eta$ is an arbitrary $\mathcal{F}_\sigma$-measurable function. As $Y \geq 0$,

$$0 \leq Y_n := \sum_{i=1}^n \eta_i \chi((\tau_i, \tau_{i+1}]) \leq Y.$$

The general case follows from the Dominated Convergence Theorem and from the linearity of the integral.

Corollary 2.76 If $X$ is a continuous semimartingale, $\tau_n \nearrow \infty$ and $Y := \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}])$ is a predictable simple process, then

$$\int_0^t Y\,dX := (Y \bullet X)(t) = \sum_i \eta_i \cdot \left(X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)\right).$$

⁶⁰ See: Definition 1.41, page 24.
Proof. As $\tau_n \nearrow \infty$, $Y$ is left-continuous and has right-hand limits. So $Y$ is locally bounded on $[0,\infty)$ and therefore $Y^{\pm}$ are integrable.

Proposition 2.77 If $X$ is a continuous semimartingale and $Y$ is a left-continuous, adapted and locally bounded process, then $(Y \bullet X)(t)$ is the Itô–Stieltjes integral for every $t$. The convergence of the approximating sums is uniform in probability on every compact interval. The partitions of the intervals can be random as well.

Proof. More precisely, let $\tau_k^{(n)} \leq \tau_{k+1}^{(n)} \nearrow \infty$ be a sequence of stopping times. For each $t$ let

$$\sum_k Y(\tau_k^{(n)})\left(X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t)\right)$$

be the sequence of Itô-type approximating processes. Assume that for each $\omega$

$$\lim_{n\to\infty} \max_k \left(\tau_{k+1}^{(n)}(\omega) - \tau_k^{(n)}(\omega)\right) = 0.$$

Define the locally bounded simple predictable processes

$$Y^{(n)} := \sum_k Y(\tau_k^{(n)})\,\chi\left(\left(\tau_k^{(n)}, \tau_{k+1}^{(n)}\right]\right).$$

As we saw,

$$\left(Y^{(n)} \bullet X\right)(t) = \sum_k Y(\tau_k^{(n)})\left(X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t)\right).$$

$Y$ is continuous from the left, hence at every point $Y^{(n)} \to Y$. Let

$$K(t) := \sup_{s \leq t} |Y(s)|.$$

$Y$ is continuous from the left, so one can take the supremum over the rational points only, hence $K$ is progressively measurable. If $Y$ is locally bounded then $K$ is also locally bounded, and as $|Y^{(n)}| \leq K$, by the Dominated Convergence Theorem $Y^{(n)} \bullet X \to Y \bullet X$, where the convergence is uniform in probability on every compact interval.
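The convergence of the Itô-type approximating processes can be observed directly for $\int_0^t w\,dw$, whose value is $(w^2(t) - t)/2$ by the identity $N^2 = 2\int N_-\,dN + [N]$ used earlier. The sketch below is our own illustration (deterministic subgrids play the role of the partitions; all parameters are assumptions): the left-point approximating process is compared with the exact integral process at common grid times, and the uniform error shrinks as the partition is refined.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
t_grid = np.linspace(0.0, 1.0, n + 1)
w = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))])

def ito_approx(w, step):
    """Left-point approximating process of int_0^t w dw on the subgrid w[::step]."""
    ws = w[::step]
    return np.concatenate([[0.0], np.cumsum(ws[:-1] * np.diff(ws))])

# int_0^t w dw = (w(t)^2 - t)/2; compare at the coarse grid times.
exact = 0.5 * (w[::100] ** 2 - t_grid[::100])
err_coarse = float(np.max(np.abs(ito_approx(w, 100) - exact)))    # 200 subintervals
err_fine = float(np.max(np.abs(ito_approx(w, 10)[::10] - exact))) # 2000 subintervals
```

The residual error at mesh size $h$ is $(t - \sum_k (\Delta w_k)^2)/2$, i.e. exactly the fluctuation of the approximating quadratic variation around $t$, so it vanishes in probability as the partition is refined.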
2.4 Integration when Integrators are Locally Square-Integrable Martingales

So far we have assumed that the integrator processes are continuous. It is obvious from the construction that during the discussion the continuity of the integrator was rarely used explicitly. In fact, the only place where the continuity is used is the construction and the characterization of the quadratic variation process. The main point in the continuous case is that if $M$ is a continuous local martingale then the quadratic variation is continuous, hence by Fisk's theorem it is the only increasing process $P$ for which $M^2 - P$ is a local martingale. In this section we briefly discuss the case when the integrators are in $\mathcal{H}^2_{loc}$. If $M \in \mathcal{H}^2_{loc}$ then the quadratic variation is generally not continuous, hence one cannot use Fisk's theorem. On the other hand, as we shall show, the jumps $\Delta[M]$ of $[M]$ are the squares of the jumps of $M$, and $[M] - \Delta[M] = [M] - (\Delta M)^2$ is continuous, so $[M]$ is the only right-continuous, increasing process $P$ for which $M^2 - P$ is a local martingale and $\Delta P = (\Delta M)^2$. If we use this observation then the rest of the construction is nearly the same as for continuous local martingales, and hence we can make the discussion of the $\mathcal{H}^2_{loc}$ case very short.

2.4.1 The quadratic variation of locally square-integrable martingales
Recall that we have defined the quadratic variation only for finite intervals. The first step of the discussion is to construct the quadratic variation process for H²_loc local martingales.

Proposition 2.78 If M ∈ H² and M₋ is uniformly bounded then there is an increasing and right-continuous process [M] such that [M](t) = [M]_0^t a.s. for any t and ∆[M] = (∆M)². This version is indistinguishable from any increasing, right-continuous process P for which P(0) = 0, ∆P = (∆M)² and M² − P is a martingale. If (t_k^{(n)}) is an infinitesimal sequence of partitions of [0, t], then

    sup_{s≤t} |Q_n(s) − [M](s)| → 0 in probability,    (2.37)

where

    Q_n(s) := Σ_k (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s))².    (2.38)
Proof. By the integration by parts formula, for any t

    M²(t) − [M](t) = M²(0) + 2 ∫₀ᵗ M₋ dM = M²(0) + 2 (M₋ • M)(t).

As M ∈ H² and M₋ is uniformly bounded, the integral process M₋ • M has a version which is a martingale⁶¹ and

    sup_{s≤t} |I_n(s) − (M₋ • M)(s)| = sup_{s≤t} |I_n(s) − (M • M)(s)| → 0 in probability,    (2.39)

where

    I_n(s) := Σ_k M(t_{k−1}^{(n)} ∧ s) (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s)).

As M² and M₋ • M are right-continuous,

    [M] := M² − M²(0) − 2 M₋ • M

is also right-continuous. [M](t) is a version of the quadratic variation [M]_0^t for any t, and [M]_0^p ≤ [M]_0^q a.s. for rational numbers p ≤ q. Unifying the measure-zero sets and using the right-continuity of [M] we can construct an increasing version. (2.37) follows from (2.39). By (2.37) there is a subsequence such that almost surely

    sup_{s≤t} |Q_{n_k}(s) − [M](s)| → 0.

From the uniform convergence

    ∆[M](s) := [M](s) − [M](s−) = lim_{n→∞} Q_n(s) − lim_{n→∞} Q_n(s−) =
    = lim_{n→∞} (Q_n(s) − Q_n(s−)) = lim_{n→∞} ∆Q_n(s) =
    = lim_{n→∞} [(M(s) − M(t_{k−1}^{(n)}))² − (M(s−) − M(t_{k−1}^{(n)}))²] =
    = (∆M(s))² − 0² = (∆M(s))².

If P is another right-continuous, increasing process for which P(0) = 0, ∆P = (∆M)² and M² − P is a martingale, then N := P − [M] is a continuous martingale, N(0) = 0 and the trajectories of N have finite variation. By Fisk's theorem N = 0, so P is indistinguishable from [M].

⁶¹ See: Proposition 2.24, page 128.
Proposition 2.79 Under the assumptions of the previous proposition, if τ is an arbitrary stopping time then [M^τ] = [M]^τ.

Proof. As (M^τ)² = (M²)^τ,

    (M^τ)² − [M]^τ = (M²)^τ − [M]^τ = (M² − [M])^τ.

Since stopped martingales are martingales, stopping the martingale M² − [M] at τ we get a martingale again. [M]^τ is increasing and

    ∆([M]^τ) = (∆[M])^τ = ((∆M)²)^τ = (∆M^τ)²,
so by the uniqueness property of the quadratic variation [M^τ] = [M]^τ.

Proposition 2.80 If M ∈ H²_loc then there is one and only one right-continuous increasing process [M] such that [M](0) = 0, ∆[M] = (∆M)² and M² − [M] is a local martingale. For any t, if (t_k^{(n)}) is an infinitesimal sequence of partitions of [0, t] then

    sup_{s≤t} |Q_n(s) − [M](s)| → 0 in probability,

where

    Q_n(s) := Σ_k (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s))².
Proof. If M ∈ H²_loc then by definition there is a localizing sequence (τ_n) such that M^{τ_n} ∈ H². As M₋ is left-regular, M₋ is locally bounded⁶², hence we can assume that M₋^{τ_n} is bounded. If M_n := M^{τ_n} then by the previous proposition

    [M_{n+1}]^{τ_n} = [M_{n+1}^{τ_n}] = [M_n^{τ_n}] = [M_n]^{τ_n},

hence [M_n] = [M_{n+1}] on the interval [0, τ_n] and [M_n] is constant on [τ_n, ∞). As τ_n ↗ ∞ one can define the process [M] as the 'union' of the processes [M_n], that is,

    [M](t, ω) := [M_n](t, ω),   t ≤ τ_n(ω).

Evidently [M] is right-continuous, increasing, [M](0) = 0 and ∆[M] = (∆M)². Of course

    (M² − [M])^{τ_n} = (M^{τ_n})² − [M]^{τ_n} = M_n² − [M_n],

⁶² See: Proposition 1.151, page 106.
is a martingale, hence M² − [M] is a local martingale. The proof of the uniqueness is the same as above.

Corollary 2.81 If M, N ∈ H²_loc then there is one and only one right-continuous process with bounded variation [M, N] such that

1. [M, N](0) = 0,
2. ∆[M, N] = ∆M∆N and
3. MN − [M, N] is a local martingale.

For any t, if (t_k^{(n)})_k is an infinitesimal sequence of partitions of [0, t] then

    sup_{s≤t} |Q_n(s) − [M, N](s)| → 0 in probability,

where

    Q_n(s) := Σ_k (M(t_k^{(n)} ∧ s) − M(t_{k−1}^{(n)} ∧ s)) (N(t_k^{(n)} ∧ s) − N(t_{k−1}^{(n)} ∧ s)).
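The partition sums Q_n are directly computable on sampled paths. The following minimal Python sketch (not from the text; the step paths and the grid are invented for illustration) evaluates the approximating sum for two paths sampled on a common fine grid and shows that, for paths of finite jump activity, Q_n is dominated by the products of the simultaneous jumps ∆M∆N:

```python
import numpy as np

def q_n(m, n):
    """Partition sum Q_n(t) of the quadratic co-variation on a common grid:
    sum over partition intervals of (M(t_k) - M(t_{k-1})) * (N(t_k) - N(t_{k-1}))."""
    return float(np.sum(np.diff(m) * np.diff(n)))

# Illustrative paths on [0, 1]: drifts plus jumps at t = 0.3 and t = 0.7.
t = np.linspace(0.0, 1.0, 10_001)
m = -t + 1.0 * (t >= 0.3) + 2.0 * (t >= 0.7)   # jumps of size 1 and 2
n_path = 0.5 * t + 3.0 * (t >= 0.7)            # jump of size 3 at t = 0.7

# The drift contributions vanish as the mesh shrinks; the jumps survive.
print(q_n(m, m))       # ~ 1^2 + 2^2 = 5  (quadratic variation of m)
print(q_n(m, n_path))  # ~ 2 * 3 = 6      (co-variation: only the common jump)
```

The squared drift increments sum to O(mesh), which is exactly why only the jump terms survive in the limit, in line with ∆[M, N] = ∆M∆N.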
One can easily prove the next propositions with a trivial modification of the proofs of the corresponding theorems in the continuous case⁶³.

Proposition 2.82 If M is a local martingale then the quadratic variation [M] is zero if and only if M is indistinguishable from a constant⁶⁴.

Proposition 2.83 (Stopping rule for quadratic variation) Let τ be an arbitrary stopping time⁶⁵.

1. If M is a local martingale then [M^τ] = [M]^τ.
2. If M and N are local martingales then [M^τ, N^τ] = [M, N]^τ = [M, N^τ].

Proposition 2.84 (Characterization of H² martingales) Let M be a local martingale. The following statements are equivalent:

1. M ∈ H².
2. M(0) ∈ L²(Ω) and E([M](∞)) < ∞.

In both cases M² − [M] is a uniformly integrable martingale⁶⁶.

⁶³ Let us emphasize that we have not yet proved that for an arbitrary local martingale L the difference L² − [L] is also a local martingale. To prove this we shall need the Fundamental Theorem of Local Martingales. See: 3.62, page 222. At the moment we have proved only for locally square-integrable martingales that L² − [L] is a local martingale, so at the moment one can use the results below only for locally square-integrable martingales.
⁶⁴ See: Proposition 2.47, page 144.
⁶⁵ See: Proposition 2.45, page 143.
⁶⁶ See: Proposition 2.53, page 148.
Proposition 2.85 If M ∈ H₀² then

    ‖M‖_{H²} = √(E(M²(∞))) = √(E([M](∞))) = √(α_M(R₊ × Ω)).

2.4.2 Integration when the integrators are locally square-integrable martingales
In the discontinuous case we can define the stochastic integral only when the integrand X is predictable; therefore when M is not continuous then by definition we assume that the members of L²(M), and of course the members of L²_loc(M), are predictable.

Proposition 2.86 If M ∈ H² and X ∈ L²(M), then there is a unique process in H₀², denoted by X • M, such that for every N ∈ H²

    [X • M, N] = X • [M, N].    (2.40)

If we denote X • M as ∫₀ᵗ X dM then (2.40) can be written as

    [∫ X dM, N](t) = ∫₀ᵗ X d[M, N].

Proof. Most of the proof is the same as in the continuous case.

1. One can prove the existence of X • M as in the continuous case. One can also define the process S := (X • M)N − X • [M, N]. As in the continuous case one can show that S is a martingale.

2. As X • [M, N] is a pathwise integral, it is easy to show⁶⁷ that the jumps of X • [M, N] are X · ∆[M, N] = X · ∆M∆N. Using the predictability of the integrands we prove that the jumps of X • M are X∆M, that is,

    ∆(X • M) = X · ∆M.    (2.41)

From this the jumps of (X • M)N are (X∆M)∆N. By the characterization of the quadratic co-variation this implies that [X • M, N] = X • [M, N]. That is, if N ∈ H₀² then (2.40) holds!

⁶⁷ See: Proposition 1.20, page 11.
3. So let us prove (2.41)! Assume that M ∈ H². First let X := ξχ((a, b]), where ξ is bounded and F_a-measurable. Observe that X is left-regular, hence it is predictable. We prove that X • M = ξ(M^b − M^a).

    E(∫₀^∞ X d[M, N]) := E(ξ ∫₀^∞ χ((a, b]) d[M, N]) =
    = E(ξ([M, N](b) − [M, N](a))) =
    = E(ξ([M^b, N](∞) − [M^a, N](∞))) =
    = E(ξ [M^b − M^a, N](∞)).

As M^b − M^a ∈ H², if N ∈ H² then

    (M^b − M^a)N − [M^b − M^a, N]

is uniformly integrable. As ξ is bounded and F_a-measurable,

    ξ(M^b − M^a)N − ξ[M^b − M^a, N]

is also uniformly integrable. Hence

    E(ξ [M^b − M^a, N](∞)) = E(ξ(M^b − M^a)(∞) N(∞)) := (ξ(M^b − M^a), N).

By the definition of the integral, for all N ∈ H²

    (ξ(M^b − M^a), N) = E(∫₀^∞ X d[M, N]) = (X • M, N).

This means that, as we said,

    X • M = ξ(M^b − M^a).

4. The mapping X → E(∫₀^∞ X d[M, N]) is linear, and the mapping X → X • M is obviously also linear; hence if

    X := Σ_{i=1}^n ξ_i χ((t_i, t_{i+1}]),    (2.42)
where the ξ_i are bounded and F_{t_i}-measurable, then

    (X • M)(t) = Σ_{i=1}^n ξ_i (M(t ∧ t_{i+1}) − M(t ∧ t_i)).

For the processes (2.42) relation (2.40) and the jump condition (2.41) obviously hold. As the elements of L²(M) are predictable, the bounded predictable step processes⁶⁸ are dense in L²(M). Let X ∈ L²(M) and let X_n → X, where the (X_n) are step processes. By Doob's inequality and by (2.27) and (2.28)

    E(sup_t |(X_n • M)(t) − (X • M)(t)|²) ≤
    ≤ 4 ‖((X_n − X) • M)(∞)‖₂² =
    = 4 ((X_n − X) • M, (X_n − X) • M) =
    = 4 E(∫₀^∞ (X_n − X) d[M, (X_n − X) • M]) ≤
    ≤ 4 ‖X_n − X‖_M ‖(X_n − X) • M‖_{H²} ≤
    ≤ 4 ‖X_n − X‖_M (‖X_n • M‖_{H²} + ‖X • M‖_{H²}).

(2.40) holds for step processes, so

    ‖X_n • M‖_{H²} = √(E([X_n • M](∞))) = √(E(∫₀^∞ X_n² d[M])) := ‖X_n‖_M

is a bounded sequence. As ‖X_n − X‖_M → 0,

    E(sup_t |(X_n • M)(t) − (X • M)(t)|²) → 0.

Hence for a subsequence (X_{n_k} • M) almost surely

    sup_t |(X_{n_k} • M)(t) − (X • M)(t)| → 0.

Therefore for the jumps almost surely

    ∆(X_{n_k} • M) → ∆(X • M).

⁶⁸ See: Proposition 1.42, page 25.
As we have proved, ∆(X_{n_k} • M) = X_{n_k}∆M, therefore

    ∆(X_{n_k} • M) = X_{n_k}∆M → X∆M.

This means that if M ∈ H² then ∆(X • M) = X∆M, hence the proposition holds.

We can extend the stochastic integral X • M to processes M ∈ H²_loc and X ∈ L²_loc(M) exactly as we did for continuous local martingales. It is easy to show that if X is locally bounded then X ∈ L²_loc(M). If M ∈ H²_loc and (τ_n) is a localizing sequence of M then, as in the continuous case, one can prove that

    X • M^{τ_n} = (X • (M^{τ_{n+1}}))^{τ_n} = (X • M^{τ_{n+1}})^{τ_n},

so one can 'paste together' X • M. Let us observe that

    (∆(X • M))^{τ_n} = ∆(X • M^{τ_n}) = X∆M^{τ_n},

which implies that ∆(X • M) = X∆M; it was using the predictability of the members of L²(M) that we showed ∆(X • M) = X∆M. With localization one can easily prove the following important observation:

Corollary 2.87 If M ∈ H²_loc and X ∈ L²_loc(M) then ∆(X • M) = X∆M.
Let us summarize the properties of stochastic integration when the integrator is in H²_loc. The proofs of these properties are direct modifications of the proofs of the corresponding properties presented in the continuous case.

Theorem 2.88 (Properties of stochastic integration) If the integrators are in H²_loc then stochastic integration has the following properties:

1. We defined the stochastic integral only for predictable integrands.
2. If M ∈ H²_loc and X ∈ L²_loc(M) then (X • M)(0) = 0 and X • M ∈ H²_loc.
3. If M ∈ H²_loc and X is locally bounded and predictable then X • M exists.
4. Observe that when M is continuous then X ∈ L²_loc(M) if and only if X² • [M] < ∞. In the general case this characterization is not true.
5. If M ∈ H²_loc then X → X • M is an L²(M) → H₀² isometry.
6. If M, N ∈ H²_loc and X ∈ L²_loc(M) then [X • M, N] = X • [M, N].
7. Assume that M ∈ H²_loc and X ∈ L²_loc(M). If on an interval [a, b] one has X(t, ω) = 0 or M(t, ω) = M(a, ω) for all ω then X • M is constant on [a, b].
8. If M ∈ H²_loc, X ∈ L²_loc(M) and τ is an arbitrary stopping time then

    (X • M)^τ = χ([0, τ])X • M = X^τ • M^τ = X • M^τ.

9. X • M is bilinear.
10. Assume that M ∈ H²_loc. If X ∈ L²(M) then Y ∈ L²(X • M) if and only if XY ∈ L²(M). If X ∈ L²_loc(M) then Y ∈ L²_loc(X • M) if and only if XY ∈ L²_loc(M). In both cases (YX) • M = Y • (X • M).
11. If M ∈ H²_loc and X ∈ L²_loc(M) then ∆(X • M) = X∆M.
12. If M ∈ H²_loc and (X_n) is a sequence of predictable processes, X_n → X_∞ in every point and there is an X ∈ L²_loc(M) such that |X_n| ≤ X, then X_n • M → X_∞ • M, where the convergence is uniform on every compact interval in probability, that is,

    sup_{s≤t} |(X_n • M)(s) − (X_∞ • M)(s)| → 0 in probability for every t ≥ 0.

13. If M ∈ H²_loc, τ_n ↗ ∞ and X := Σ_i ξ_i χ((τ_i, τ_{i+1}]) is a predictable simple process then X • M exists and

    (X • M)(t) = ∫₀ᵗ X dM = Σ_i ξ_i (M(τ_{i+1} ∧ t) − M(τ_i ∧ t)).

14. If M ∈ H²_loc and X is left-regular then (X • M)(t) is an Itô–Stieltjes integral for every t, where the convergence of the approximating sums is uniform in probability on every compact interval. The approximating partitions can be random as well⁶⁹.

Remark that if M ∈ H²_loc it is possible that the trajectories of M have finite variation on finite intervals⁷⁰. In this case we potentially might have two different definitions of the stochastic integral. Fortunately this is not the case.
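Several properties of the list — for instance the associativity (YX) • M = Y • (X • M) of item 10 and the simple-process formula of item 13 — have exact discrete-time counterparts, where the stochastic integral reduces to a martingale transform. The following Python sketch is an illustration only (the discrete paths are invented, and the discrete transform is a toy analogue, not the text's construction):

```python
import numpy as np

def transform(X, M):
    """Discrete stochastic integral (martingale transform):
    (X • M)(t_k) = sum_{j<k} X[j] * (M[j+1] - M[j]),
    X[j] playing the role of the predictable integrand on (t_j, t_{j+1}]."""
    return np.concatenate(([0.0], np.cumsum(X[:-1] * np.diff(M))))

M = np.array([0.0, 1.0, -0.5, 2.0, 1.5])   # a sample path of the integrator
X = np.array([1.0, -2.0, 0.5, 3.0, 0.0])   # integrand values
Y = np.array([2.0, 1.0, -1.0, 0.5, 0.0])

# Associativity of item 10: (Y X) • M = Y • (X • M), exact in discrete time,
# because the increments of X • M are exactly X * (increments of M).
lhs = transform(Y * X, M)
rhs = transform(Y, transform(X, M))
print(np.allclose(lhs, rhs))  # True
```

The reason the identity is exact here is the discrete version of item 11: ∆(X • M) = X∆M, so integrating Y against X • M multiplies each increment of M by YX.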
Proposition 2.89 Let us assume that for some process M we can define two different concepts of integration. Assume that

1. both concepts of integration are linear over the bounded processes,
2. for both concepts of integration bounded predictable processes are integrable,

⁶⁹ If X is not left-regular then the property does not hold. See: Example 2.8, page 114.
⁷⁰ See: Example 2.20, page 124.
3. the integral of the bounded simple processes

    X := Σ_{i=1}^n ξ_i χ((t_i, t_{i+1}])

is

    (X • M)(t) = Σ_{i=1}^n ξ_i (M(t_{i+1} ∧ t) − M(t_i ∧ t)),

4. in both cases the Theorem on Dominated Convergence is true.

If for some predictable process X both integrals exist then they are indistinguishable.

Proof. Let us denote by L the set of bounded processes where the two concepts of integration coincide. L is obviously a linear space and 1 ∈ L. By the dominated convergence property it is obvious that L is a λ-system. The set of bounded elementary processes is a π-system, hence by the Monotone Class Theorem L contains all the bounded predictable processes⁷¹. If X is predictable then for all n the integrals of X_n := Xχ(|X| ≤ n) are equal. If X is integrable for both concepts then by the Dominated Convergence Theorem the two integrals must be equal.

2.4.3 Stochastic integration when the integrators are semimartingales
The decomposition of continuous semimartingales is unique. In the discontinuous case this is not true. Hence we need the following new definition:

Definition 2.90 Let X be a semimartingale. We say that the predictable process Y is integrable with respect to X if there is a decomposition

    X = X(0) + H + V    (2.43)

where H ∈ H²_loc and V has finite variation, and the integrals Y • H and Y • V exist. In this case

    Y • X := Y • H + Y • V.

The next example is very important.

⁷¹ See: Proposition 1.42, page 25.
Example 2.91 If the integrand is not locally bounded then it is possible that for some decomposition of a semimartingale the two integrals in the above definition exist, but for some other decomposition they do not exist.

If X is a Poisson process with parameter λ, then X has two different decompositions. One can write

    X(t) = (X(t) − λt) + λt := H(t) + V(t),

where H is the compensated Poisson process, and we can decompose X as

    X = 0 + X := H + V,

where H := 0 and V := X has finite variation. Let τ be the time of the first jump of X, and let Y(t, ω) := χ((0, τ(ω)])/t. Y is predictable as it is the limit of the predictable processes Y_n := χ((1/n, τ])/t, but it is not locally bounded. If we use the decomposition X = X + 0 then Y • 0 = 0 and Y • X = 1/τ, hence the integral exists. On the other hand, if V(t) := λt then for all ω the integral ∫₀ᵗ Y dV = λ ∫₀ᵗ Y(s) ds is ∞, hence in the other decomposition the integral does not exist.

If Y is locally bounded and X is a semimartingale then obviously for any decomposition (2.43) of X the integrals Y • H and Y • V exist. If we restrict the set of possible integrands to locally bounded processes then stochastic integration with respect to semimartingales is very simple. One can easily show that the integral has all the usual properties of stochastic integration⁷². If Y is integrable then there is a decomposition (2.43) such that Y • H and Y • V exist. If |Y_n| ≤ Y then Y_n • H and Y_n • V exist for every n. For stochastic integration with respect to an H²_loc integrator and for the classical pathwise integration the Dominated Convergence Theorem holds. For bounded predictable processes the stochastic integral exists for any decomposition of the semimartingale integrator and, for any fixed decomposition, it is linear over the bounded integrands. Therefore, by Proposition 2.89, if for two different decompositions of a semimartingale the integral exists for some predictable process, then the two possible integrals are equal. Hence for predictable processes the definition of the integral is independent of the decomposition of the semimartingale. Every left-regular process is locally bounded, hence one can easily prove the following very important observation:

⁷² See: Theorem 2.88, page 174.
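The divergence in the second decomposition of Example 2.91 can be seen numerically: the pathwise integral λ∫₀ᵗ s⁻¹χ(s ≤ τ) ds grows like λ·ln(1/ε) as the lower cut-off ε shrinks, while the integral against the pure-jump part stays 1/τ. The Python sketch below is illustrative only (the values of λ, τ and the cut-offs are assumptions, not from the text):

```python
import math

lam, tau = 1.0, 0.5   # intensity and a fixed, illustrative first jump time

def integral_against_drift(eps):
    """lam * integral from eps to tau of (1/s) ds -- blows up as eps -> 0."""
    return lam * math.log(tau / eps)

def integral_against_jump_part():
    """Pathwise Stieltjes integral of Y against the pure-jump part X:
    the single jump of size 1 at tau contributes Y(tau) = 1/tau."""
    return 1.0 / tau

for eps in (1e-2, 1e-4, 1e-8):
    print(eps, integral_against_drift(eps))   # grows without bound
print(integral_against_jump_part())           # 2.0 = 1/tau
```

So the same integrand Y is integrable in one decomposition and not in the other, which is exactly why Definition 2.90 needs Proposition 2.89 to be well posed.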
Proposition 2.92 (Existence of quadratic variation) If X and Y are arbitrary semimartingales then the quadratic co-variation [X, Y] has a right-continuous version and

    XY − X(0)Y(0) = X₋ • Y + Y₋ • X + [X, Y],

where the integrals are stochastic integrals. The jumps of the quadratic co-variation are ∆[X, Y] = ∆X∆Y.

Proof. It is sufficient to prove the relation ∆[X, Y] = ∆X∆Y. From the formula for the jumps of stochastic integrals

    ∆[X, Y] = ∆(XY) − X₋∆Y − Y₋∆X =
    = XY − X₋Y₋ − X₋∆Y − Y₋∆X =
    = XY − X₋(Y₋ + ∆Y) − Y₋(∆X + X₋) + Y₋X₋ =
    = XY − X₋Y − Y₋X + Y₋X₋ =
    = (X − X₋)(Y − Y₋) = ∆X∆Y.

Recall that we have defined semimartingales as the sums of locally square-integrable martingales and processes with finite variation. At this stage there are three things we do not know about semimartingales.

1. We have not proved that the local martingales are semimartingales⁷³.
2. We also have not proved that if M is a local martingale then M² − [M] is a local martingale.
3. Let X be a semimartingale and let us assume that Y₁ • X and Y₂ • X exist. Can we prove the existence of (Y₁ + Y₂) • X? At the moment, of course, not: if Y₁ and Y₂ are not locally bounded then it is possible that Y₁ • X exists in one decomposition and Y₂ • X exists in some different decomposition of X.

These are serious problems; to overcome them in the next chapters will take a concerted effort!
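The integration-by-parts identity of Proposition 2.92 holds exactly for discretely sampled paths, where every term is a finite sum. The following Python sketch (the paths are invented for illustration) verifies XY − X(0)Y(0) = X₋ • Y + Y₋ • X + [X, Y] term by term:

```python
import numpy as np

def parts_identity(x, y):
    """Discrete integration by parts for paths sampled at t_0 < ... < t_n:
    returns  sum x_- dy  +  sum y_- dx  +  sum dx dy,
    which telescopes to x(t_n)y(t_n) - x(t_0)y(t_0)."""
    dx, dy = np.diff(x), np.diff(y)
    x_int = np.sum(x[:-1] * dy)   # discrete analogue of X_- • Y
    y_int = np.sum(y[:-1] * dx)   # discrete analogue of Y_- • X
    bracket = np.sum(dx * dy)     # discrete analogue of [X, Y]
    return x_int + y_int + bracket

x = np.array([1.0, 2.0, -1.0, 0.5])
y = np.array([0.0, 3.0, 1.0, 4.0])
lhs = x[-1] * y[-1] - x[0] * y[0]
print(np.isclose(lhs, parts_identity(x, y)))  # True
```

The identity follows from the one-step expansion ∆(xy) = x∆y + y∆x + ∆x∆y, which is precisely the discrete form of the jump computation in the proof.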
⁷³ To prove that every local martingale is a semimartingale one needs the Fundamental Theorem of Local Martingales. See: Theorem 3.57, page 220.
3

THE STRUCTURE OF LOCAL MARTINGALES

The main result of this chapter is the theorem which we shall call the Fundamental Theorem of Local Martingales, which states that every local martingale can be decomposed as the sum of a locally bounded local martingale and a local martingale which has locally integrable variation¹. The Fundamental Theorem of Local Martingales has many important consequences. Perhaps the most important one is that for any local martingale L the quadratic variation process [L] exists and the difference L² − [L] is a local martingale.

From now on we assume that the space (Ω, A, P) is complete, that is, if N ⊆ A ∈ A and P(A) = 0 then N ∈ A. We also assume that the filtration F satisfies the usual conditions. Of course the usual conditions are not a big surprise, but it is remarkable that we need the completeness of the base space (Ω, A, P). The reader can take this fact as an indicator of the forthcoming measure-theoretic difficulties. Let us introduce some useful notation.

Definition 3.1 Fix a stochastic base (Ω, A, P, F).

1. L will denote the set of local martingales L for which L(0) = 0.
2. V will denote the set of right-regular, adapted processes V which have finite variation on every finite interval and for which V(0) = 0.
3. A will denote the set of processes A ∈ V for which E(Var(A)(∞)) < ∞. A is called the space of processes with integrable variation².

¹ See: Definition 3.1 below.
² The careful reader would notice that the symbol A denotes two objects. A is the set of possible events in the probability space (Ω, A, P) and A is also the set of processes with integrable variation. In the theory of stochastic processes the events of (Ω, A, P) play a very minor role, so generally A will denote the set of processes with integrable variation. Nevertheless we apologize for this inconvenience.
4. If X ∈ A_loc, that is, there is a localizing sequence (τ_n) such that X^{τ_n} ∈ A for all n, then we shall say that X has locally integrable variation.
5. V⁺ will denote the processes in V which have increasing trajectories.
6. A⁺ will denote the processes in A which have increasing trajectories.
7. The adapted process S is a semimartingale if it has a decomposition

    S = S(0) + L + V    (3.1)

where V ∈ V and L ∈ L. S will denote the set of semimartingales³.
8. A semimartingale S is a special semimartingale if there is a decomposition (3.1) where V is predictable. S_p will denote the space of special semimartingales.

Observe that we have changed the definition of semimartingales⁴. An important message of the Fundamental Theorem is that the present definition is the same as the old one.

The sequence of jumps of a Poisson process does not have an accumulation point, hence every compound Poisson process is in V.

Example 3.2 If X is a compound Poisson process then the distribution of the jumps of X has finite expected value if and only if X ∈ A_loc.
Let (τ_n) be the jump-times of X and let (ξ_n) be the sizes of the jumps. If for the common distribution of the jumps the expected value m := E(|ξ_n|) is finite, then

    E(Var(X^{τ_n})(∞)) = E(Σ_{k=1}^n |ξ_k|) = nm < ∞,

hence X ∈ A_loc. On the other hand assume that X ∈ A_loc but m = ∞. Let (σ_n) be an A-localizing sequence of X. Define the stopping times ρ_n := σ_n ∧ τ_1. ρ_n ≤ τ_1, hence F_{ρ_n} ⊆ F_{τ_1}, so ρ_n is F_{τ_1}-measurable. P(τ_1 < ∞) = 1, hence if n is large enough then P(ρ_n = τ_1) > 0. The compound Poisson processes are Lévy processes, hence by the strong Markov property of Lévy processes ξ_1 = ∆X(τ_1) is independent of F_{τ_1}. Hence

    ∞ > E(Var(X^{σ_n})(∞)) ≥ E(Var(X^{ρ_n})(∞)) = E(|ξ_1| χ(τ_1 = ρ_n)) =
    = E(|ξ_1|) P(τ_1 = ρ_n) = ∞,

which is impossible. Of course, in the argument we did not use the fact that the distributions of the jumps were the same. If (τ_n) is a strictly increasing sequence

³ The decomposition is not necessarily unique.
⁴ See: Definition 2.17, page 124.
of stopping times and at each τ_k we have a jump ξ_k which is independent of F_{τ_k}, then E(|ξ_k|) is finite for all k if and only if the process X(t) := Σ_k ξ_k χ(τ_k ≤ t) is in A_loc.

Example 3.3 If L ∈ L and L*(t) := sup_{s≤t} |L(s)| then L* ∈ A⁺_loc.

Since L is right-continuous,

    sup_{s≤t} |L(s)| = sup_{s≤t, s∈Q} |L(s)|,

hence L* is adapted and increasing. As L is right-continuous, L* is also right-continuous. Let (τ_n) be a localizing sequence of L. Let

    σ_n := inf{t : |L(t)| > n} ∧ τ_n.

Since L is right-continuous, σ_n is a stopping time⁵ for all n. If

    σ_∞(ω) := sup_n σ_n(ω) < ∞

for some outcome ω then |L(σ_∞(ω)−)| = ∞, which is impossible as for every outcome L has finite limits from the left. Therefore obviously σ_n ↗ ∞.

    (L*)^{σ_n}(t) ≤ (L*)^{σ_n}(∞) = L*(σ_n) ≤ L*(σ_n−) + |∆L*(σ_n)| ≤
    ≤ n + |L(σ_n)| = n + |L^{τ_n}(σ_n)|.

L^{τ_n} ∈ M, therefore by the Optional Sampling Theorem |L^{τ_n}(σ_n)| is integrable, hence (L*)^{σ_n} ∈ A⁺ and so L* ∈ A⁺_loc.
Proposition 3.4 If V ∈ V ∩ L then V ∈ A_loc.

Proof. Let (τ_n) be a localizing sequence of V ∈ L. Let

    σ_n := inf{t : Var(V)(t) > n} ∧ τ_n.

V is right-continuous and adapted, hence Var(V) is also right-continuous and adapted. The filtration is right-continuous, hence σ_n is a stopping time, and by the right-regularity of Var(V) again σ_n ↗ ∞. If ∆V denotes the jumps of V,

⁵ See: Example 1.32, page 17.
then

    Var(V)(σ_n) = Var(V)(σ_n−) + ∆Var(V)(σ_n) =
    = Var(V)(σ_n−) + |∆V(σ_n)| ≤
    ≤ n + |V(σ_n)| + |V(σ_n−)| ≤ 2n + |V(σ_n)|.

V^{τ_n} is a uniformly integrable martingale, hence V(σ_n) is integrable, so Var(V)(σ_n) is also integrable; hence by the definition of A obviously V ∈ A_loc.
3.1 Predictable Projection

Our main tool in analysing the structure of local martingales is the so-called predictable projection. It is very natural to ask how 'far' the predictable processes are from the other classes of measurable processes. If X is a product measurable process, then with the predictable projection one can find a predictable process, denoted by ᵖX, which is in some sense 'close' to X. This closeness means that for every so-called predictable stopping time τ the expected values of the stopped variables X(τ) and (ᵖX)(τ) are equal. If X is the gain process of some game and τ is an exit strategy, then the stopped variable X(τ) is the value of the game if one plays the exit strategy τ. If a stopping time is 'predictable' then somehow we can foresee, predict it. As X(τ) and (ᵖX)(τ) have the same expected value for predictable exit rules, it is irrelevant, on average, whether we play the game X or the predictable game ᵖX. So, as an interpretation, one can say that ᵖX is the predictable part⁶ of X. If L is a local martingale then the 'unpredictable' part of L is the jumps ∆L of L. The most important examples of the predictable projection are the rules ᵖL = L₋ and ᵖ(∆L) = 0. This means that one cannot 'predict' the size of the jumps of a local martingale⁷.

3.1.1 Predictable stopping times
Let us first define when a stopping time is predictable.

Definition 3.5 We say that a stopping time σ announces τ if σ(ω) ≤ τ(ω) for all outcomes ω and σ(ω) < τ(ω) whenever τ(ω) > 0. We say that the stopping time τ is predictable if there is a sequence of stopping times (σ_n) such that σ_n ↗ τ and σ_n announces τ for all n. The sequence (σ_n) is called the announcing or predicting sequence of τ.

⁶ More exactly, of course, ᵖX by definition is the predictable part of X, since ᵖX is mathematically well-defined, but the expression 'predictable part' is not a mathematical concept.
⁷ It is very natural to ask how one can predict the jumps of stock prices. As in mathematical finance we assume that the stock prices are basically driven by some local martingale, the answer is that nobody can predict the jumps of the price processes. So the humble relation ᵖ(∆L) = 0 just mentioned has extraordinarily important theoretical and applied implications.
183
Definition 3.6 We say that a stopping time σ is totally inaccessible if P (τ = σ < ∞) = 0 for every predictable stopping time τ . Example 3.7 The jump-times of Poisson processes are totally inaccessible8 .
Let N be a Poisson process with parameter λ > 0. Let τ be the time of the first jump of N that is let τ inf {t : N (t) = 1} . Obviously τ is a stopping time and almost surely 0 < τ < ∞. It is well-known9 that τ has an exponential distribution with parameter λ. We show that τ is not predictable. Suppose (τ n ) to be an announcing sequence of τ . τ n < τ < ∞, so trivially N (τ n ) = 0 a.s. and by the strong Markov property of L´evy processes N (n) (t) N (t + τ n ) − N (τ n ) = N (t + τ n ) is also a Poisson process with parameter λ. If σ n is the time of the first jump of N (n) , then σ n = τ − τ n , and σ n also has an exponential distribution with parameter λ. 1 = P (τ n τ ) = P (σ n 0) , which is impossible since the convergence of distributions follows from the almost sure convergence. In the same way one can prove that no part of τ is predictable, that there is no predictable stopping time ρ for which on a set B with positive probability ρ = τ : Assume that there is such a ρ and let (ρn ) be a sequence announcing ρ. For the stopping times τ n ρn ∧ τ N (n) (t) N (t + τ n ) − N (τ n ) is again a Poisson process with parameter λ. Again, if σ n denotes the time of the first jump of N (n) then σ n has exponential distribution with parameter λ. (σ n ) is almost surely convergent, hence it converges in distribution, which is impossible since if σ ∞ is the limit of (σ n ) then σ ∞ must be zero on B where B has positive probability. 8 See: 9 See:
Example 7.5, page 465 and Example 7.75, page 517. page 461.
184
THE STRUCTURE OF LOCAL MARTINGALES
Example 3.8 If w is a Wiener process then for all a the first passage time τ a inf {t: w (t) = a} is a predictable stopping time.
By the continuity of the trajectories of w the sequence announces τ .
τ a−1/n
obviously
Proposition 3.9 If τ and σ are predictable stopping times then τ ∧ σ and τ ∨ σ are predictable stopping times. If τ n τ and τ n are predictable stopping times then τ is a predictable stopping time. Proof. If (τ n ) announces τ and (σ n ) announces σ then (τ n ∧ σ n ) will be an announcing sequence for τ ∧ σ and (τ n ∨ σ n ) will be an announcing sequence for (n) τ ∨ σ. If (τ k ) announces τ n , then obviously σ n max τ (k) : k = 1, . . . , n n announces τ and (k) σ n ≤ max τ n+1 : k = 1, . . . , n ≤ σ n+1 . For any ε > 0 and for any outcome ω, there is an N , depending on the outcome,
(N ) such that for any outcome τ n (ω) ≥ τ (ω) − ε for every n ≥ N (ω). τ k (N )
announces τ N hence τ k
≥ τ N − ε for all k ≥ M ≥ N.
(k) (M ) (N ) σ M max τ M : k = 1, . . . , M ≥ τ M ≥ τ M ≥ τ N − ε ≥ τ − 2ε so σ n τ . Example 3.10 If the stopping times τ n are predictable and τ n τ then τ is not necessarily predictable.
Let τ be an arbitrary non-predictable stopping time. Obviously τ n τ + 1/n is a predictable stopping time for every n and τ n τ . a.s.
Proposition 3.11 If τ = 0 then τ is a predictable stopping time. If τ is a a.s. predictable stopping time and σ = τ then σ is also a predictable stopping time. Proof. (Ω, A, P) is complete and the filtration contains the measure-zero sets a.s.
{τ ≤ t} = {τ = 0} ∈ Ft
for any t. Hence τ is a stopping time. As τ is announced by

    τ_n := τ − 1/n if τ > 1/n, and τ_n := 0 if τ ≤ 1/n,

τ is predictable. To prove the second statement let (τ_n) be an announcing sequence of τ. By the usual conditions, if

    σ_n := τ_n if τ = σ;  σ_n := σ − 1/n if τ ≠ σ and σ > 1/n;  σ_n := 0 otherwise,

then (σ_n) is an announcing sequence for σ.

Definition 3.12 If τ is a predictable stopping time and (τ_n) is an announcing sequence of τ then let

    F_{τ−} := σ(∪_n F_{τ_n}).

Of course one should now prove that the definition of F_{τ−} is independent of the announcing sequence (τ_n).

Proposition 3.13 If τ is a predictable stopping time then

    F_{τ−} = σ(F_0, {A ∩ {t < τ} : A ∈ F_t}),    (3.2)

hence the definition of F_{τ−} is independent of the announcing sequence (τ_n).

Proof. For an arbitrary stopping time ρ let

    F̂_ρ := σ(F_0, {A ∩ {t < ρ} : A ∈ F_t}).

Assume that (τ_n) announces τ.

1. We show that if σ announces τ then F_σ ⊆ F̂_τ, hence σ(∪_n F_{τ_n}) := F_{τ−} ⊆ F̂_τ. Since σ announces τ, for every A

    A = (∪_{r∈Q} A ∩ {σ < r < τ}) ∪ (A ∩ {τ = 0}).

If A ∈ F_σ then B := A ∩ {σ < r} ∈ F_r, hence by the definition of F̂_τ

    A ∩ {σ < r < τ} = B ∩ {r < τ} ∈ F̂_τ.
As σ ≤ τ,

    A ∩ {τ = 0} = A ∩ {σ = 0} ∩ {τ = 0}.

As τ is a stopping time, {τ = 0} ∈ F_0. A ∈ F_σ, so A ∩ {σ = 0} ∈ F_0; hence again by the definition of F̂_τ

    A ∩ {τ = 0} = A ∩ {σ = 0} ∩ {τ = 0} ∈ F̂_τ,

so F_σ ⊆ F̂_τ.

2. We show that F̂_σ ⊆ F_σ for every stopping time σ. If A ∈ F_t for some t then for all possible u

    {t < σ} ∩ A ∩ {σ ≤ u} = A ∩ {t < σ ≤ u} ∈ F_u,

hence {t < σ} ∩ A ∈ F_σ. 0 ≤ σ, therefore F_0 ⊆ F_σ, so by the definition of F̂_σ obviously F̂_σ ⊆ F_σ.

3. Let A ∈ F_t. Since τ_n ↗ τ,

    A ∩ {t < τ} = ∪_n (A ∩ {t < τ_n}) ∈ ∪_n F_{τ_n} ⊆ σ(∪_n F_{τ_n}) := F_{τ−}.

Trivially F_0 ⊆ F_{τ_n} ⊆ F_{τ−}. Hence F̂_τ ⊆ F_{τ−}. So F_{τ−} = F̂_τ, that is, the proposition holds.

Proposition 3.14 If τ is predictable then τ is measurable with respect to F_{τ−}.

Proof. If (τ_n) announces τ then τ_n is measurable with respect to F_{τ_n} ⊆ F_{τ−} for all n. τ_n ↗ τ and therefore τ is also measurable with respect to F_{τ−}.

Proposition 3.15 If X is a predictable process and τ is a predictable stopping time, then the stopped variable X_τ := X(τ)χ(τ < ∞) is measurable with respect to F_{τ−}.

Proof. If X is left-continuous and adapted then it is progressively measurable. Hence if (τ_n) announces τ then X_{τ_n} := χ(τ_n < ∞)X(τ_n) is measurable¹⁰ with respect to F_{τ_n} ⊆ F_{τ−}. As X is left-continuous, X_{τ_n} → X_τ. Hence X_τ is measurable with respect to F_{τ−}. The set of bounded processes X for which the proposition holds is a λ-system. The left-continuous, adapted processes form a π-system and the predictable processes are measurable with

¹⁰ See: Proposition 1.35, page 22.
respect to the σ-algebra generated by the adapted left-continuous processes¹¹. Hence by the Monotone Class Theorem one can prove the F_{τ−}-measurability of X_τ in the usual way for every predictable process X.

Proposition 3.16 If τ is a predictable stopping time and B ∈ F_{τ−} then

    τ_B(ω) := τ(ω) if ω ∈ B, and τ_B(ω) := ∞ if ω ∉ B,

is also a predictable stopping time.

Proof. Let B denote the collection of those sets B for which the proposition holds both for B and for Bᶜ. As the maximum and the minimum of a finite number of predictable stopping times are again predictable stopping times, B is an algebra. As the limit of an increasing sequence of predictable stopping times is again a predictable stopping time, B is a σ-algebra. Let (τ_n) be an announcing sequence of τ. Fix a C ∈ F_{τ_n}. If m ≥ n then F_{τ_n} ⊆ F_{τ_m} and therefore

    τ_C^{(m)} := τ_m(ω) if ω ∈ C, and τ_C^{(m)} := ∞ if ω ∉ C,

is again a stopping time. Hence τ_C^{(m)} ∧ m is also a stopping time, and (τ_C^{(m)} ∧ m)_{m≥n} announces τ_C. Hence τ_C is predictable. By the definition of F_{τ−} := σ(∪_n F_{τ_n}), using the fact that B is a σ-algebra, F_{τ−} ⊆ B, hence the proposition holds.

Proposition 3.17 If δ and ρ are predictable stopping times then

    {ρ = δ}, {ρ ≤ δ}, {δ ≤ ρ}, {δ < ρ}, {ρ < δ} ∈ F_{ρ−}.

Proof. It is enough to prove that

    {ρ ≤ δ}, {δ ≤ ρ} = {ρ < δ}ᶜ ∈ F_{ρ−}.

If (ρ_n) announces ρ and (δ_n) announces δ, then¹²

    {ρ ≤ δ} = ∩_n {ρ_n ≤ δ} = (∪_n {ρ_n ≤ δ}ᶜ)ᶜ ∈ σ(∪_n F_{ρ_n}) := F_{ρ−},

    {ρ < δ} = (∪_n {ρ ≤ δ_n}) \ {ρ = δ = 0} =
    = (∪_n ∩_m {ρ_m ≤ δ_n}) \ {ρ = δ = 0} ∈ σ(∪_m F_{ρ_m}) := F_{ρ−}.
11 See: Proposition 1.42, page 25.
12 See: Proposition 1.34, page 20.
3.1.2
Decomposition of thin sets
Assume that the trajectories of some process X are regular. This implies that for every outcome ω the set {t : ∆X (t, ω) ≠ 0} is at most countable and the jumps which are larger than a given c > 0 cannot have an accumulation point. If τ0 = 0,
τn+1 ≡ inf {t > τn : |∆X| ≥ c},
then the union of the graphs of the stopping times13 (τn) covers the set {|∆X| ≥ c}. This implies that the set {∆X ≠ 0} can be covered by the graphs of at most a countable number of stopping times.
Definition 3.18 A set A ⊆ R+ × Ω is called thin if A ⊆ ∪n [ρn] where (ρn) is a sequence of stopping times. If [ρn] ∩ [ρm] = ∅ for all m ≠ n then we say that (ρn) is an exhausting sequence of A.
Proposition 3.19 (Accessible and inaccessible part of stopping times) For every stopping time τ there are at most countably many predictable stopping times (σk) and a set X ∈ Fτ for which τX is totally inaccessible and the graph of τXc is covered by the disjoint union of the graphs of the predictable stopping times (σk).
Proof. If τ(0) ≡ τ is totally inaccessible then there is nothing to prove. If τ(0) is not totally inaccessible then P (τ = σ1 < ∞) > 0 for some predictable stopping time σ1. Let
B1 ≡ {τ(0) = σ1 < ∞} = {τ = σ1 < ∞}.
Let us delete from τ the set B1: as14 B1 ∈ Fτ(0) = Fτ,
τ(1) (ω) ≡ τB1c (ω) ≡ { τ (ω) if ω ∉ B1; ∞ if ω ∈ B1 }
is a stopping time. If τ(1) is totally inaccessible then we stop. If τ(1) is not totally inaccessible then P (τ(1) = σ2 < ∞) > 0 for some predictable stopping time σ2. Let
B2 ≡ {τ(1) = σ2 < ∞} = {τ = σ2 < ∞} ∩ {τ = σ1 < ∞}c ∈ Fτ
etc. The only problem is that we should finish the process in at most countably many steps. Let
bn ≡ sup {P (τ(n) = σ < ∞) : σ is predictable}
13 See: Example 1.32, page 17.
14 See: Proposition 1.34, page 20.
and for any n let σn+1 be such a predictable stopping time that
P (Bn+1) ≡ P (τ(n) = σn+1 < ∞) ≥ bn / 2.
If bn ≠ 0 then as above
Bn+1 ≡ {τ(n) = σn+1 < ∞} = {τ = σn+1 < ∞} \ (∪k=1,...,n Bk) ∈ Fτ
and
τ(n+1) (ω) ≡ τBn+1c(n) (ω) = { τ (ω) if ω ∉ ∪k=1,...,n+1 Bk; ∞ if ω ∈ ∪k=1,...,n+1 Bk }.
As the sets Bn are disjoint,
∑n=1,...,∞ bn / 2 ≤ ∑n=1,...,∞ P (Bn) = P (∪n Bn) < ∞.
Therefore bn → 0. Bn ∈ Fτ for every n, so X ≡ Ω \ (∪n Bn) ∈ Fτ. Let σ be a predictable stopping time. For an arbitrary n
P (τX = σ < ∞) ≤ P (τ(n) = σ < ∞) ≤ bn → 0,
so P (τX = σ < ∞) = 0. That is,
τX (ω) ≡ { τ (ω) if ω ∉ ∪n Bn; ∞ if ω ∈ ∪n Bn }
is totally inaccessible.
Definition 3.20 A stopping time σ is called accessible if there is a sequence of predictable stopping times (τn) such that
P (∪n {σ = τn < ∞}) = P (σ < ∞).
Corollary 3.21 If τ is a stopping time then there are disjoint events X, Y ∈ Fτ such that X ∪ Y = {τ < ∞} a.s., τX is totally inaccessible, τY is accessible and a.s.
τ = τX ∧ τY.
Proposition 3.22 Every thin set X has an exhausting sequence. We may assume that the members of the exhausting sequence are either predictable or totally inaccessible.
Proof. By the previous proposition every stopping time can be covered by a countable number of predictable and one totally inaccessible stopping time. Hence one can cover the set X with the sets ∪n [σn] and ∪m [τm] where σn is totally inaccessible for all n and τm is predictable for all m. We show that one can find such a sequence of stopping times which are either predictable or totally inaccessible, the graphs of the stopping times being disjoint. As τm is predictable and σn is totally inaccessible, {τm = σn < ∞} has zero measure, so it is sufficient to make the sequences (τn) and (σm) disjoint15.
1. One can easily make the sequence (σn) disjoint. Since16 {σ1 = σ2} ∈ Fσ2,
σ2′ (ω) ≡ { σ2 (ω) if ω ∉ {σ1 = σ2}; ∞ if ω ∈ {σ1 = σ2} }
is also a totally inaccessible stopping time and σ1 and σ2′ have disjoint graphs.
2. To do the same with the sequence (τn) it is sufficient to observe that if τ1 and τ2 are predictable stopping times then17 {τ1 = τ2} ∈ Fτ2−, hence18
τ2′ (ω) ≡ { τ2 (ω) if ω ∉ {τ1 = τ2}; ∞ if ω ∈ {τ1 = τ2} }
is a predictable stopping time.
3.1.3
The extended conditional expectation
In this subsection we introduce a generalization of the conditional expectation. Let us first recall the definition of the conditional expectation:
Definition 3.23 (Conditional expectation) Let η be an arbitrary random variable and let F be a σ-algebra. We say that the random variable ξ is the conditional expectation of η with respect to the σ-algebra F if
1. ξ is F-measurable and
2. ∫F ξ dP = ∫F η dP for any F ∈ F.
The usual notation of the conditional expectation is E (η | F).
15 Observe that we have used the completeness of (Ω, A, P) implicitly as we tacitly assumed that if a function is almost surely zero, then it is a predictable stopping time.
16 See: Proposition 1.34, page 20.
17 See: Proposition 3.17, page 187.
18 See: Proposition 3.16, page 187.
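The defining property in Definition 3.23 can be checked by hand when F is generated by a finite partition. The following sketch is our own illustration, not part of the text (the function name `cond_exp` is ours): on each cell of the partition the conditional expectation is the probability-weighted average of η, and the assertions verify the defining identity ∫F ξ dP = ∫F η dP on every generator F.

```python
# Conditional expectation with respect to a finitely generated σ-algebra:
# on each cell of the generating partition, E(eta | F) equals the
# probability-weighted average of eta over that cell.

def cond_exp(eta, prob, partition):
    """eta, prob: dicts outcome -> value; partition: list of sets of outcomes."""
    xi = {}
    for cell in partition:
        mass = sum(prob[w] for w in cell)
        avg = sum(eta[w] * prob[w] for w in cell) / mass
        for w in cell:
            xi[w] = avg
    return xi

# A four-point probability space and the partition {a, b} | {c, d}.
prob = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}
eta = {"a": 5.0, "b": -1.0, "c": 2.0, "d": 0.0}
partition = [{"a", "b"}, {"c", "d"}]
xi = cond_exp(eta, prob, partition)

# Property 2 of Definition 3.23: the integrals agree on every generator F.
for cell in partition:
    int_xi = sum(xi[w] * prob[w] for w in cell)
    int_eta = sum(eta[w] * prob[w] for w in cell)
    assert abs(int_xi - int_eta) < 1e-12
```

The general case of the Radon–Nikodym construction below replaces this averaging by a density argument; the finite case is only meant to fix intuition.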
Let us recall that if η ≥ 0 then almost surely E (η | F) ≥ 0, but it is possible that E (η | F) has infinite values. The following theorem is well known and is a direct consequence of the Radon–Nikodym theorem:
Theorem 3.24 Let F be an arbitrary σ-algebra.
1. If η is non-negative then E (η | F) exists.
2. If η is quasi-integrable19 then E (η | F) exists and E (η | F) = E (η+ | F) − E (η− | F).
3. E (η | F) is unique up to a measure-zero set.
Definition 3.25 (Generalized conditional expectation) Let η be an arbitrary random variable and let F be a σ-algebra. If the conditional expectations E (η+ | F) and E (η− | F) are finite then by definition the generalized conditional expectation of η with respect to the σ-algebra F will be the difference
E (η+ | F) − E (η− | F).
We shall also denote the generalized conditional expectation by E (η | F). One can easily reformulate most of the properties of the conditional expectation for the generalized conditional expectation:
Proposition 3.26 The generalized conditional expectation has the following properties:
1. The generalized conditional expectation E (ξ | F) exists if and only if the conditional expectation of |ξ| is almost surely finite, that is, E (|ξ| | F) < ∞ a.s.
2. The generalized conditional expectation E (ξ | F) is unique up to a measure-zero set.
3. If ξ is F-measurable then E (ξ | F) = ξ a.s.
4. If the variables ξ and η have generalized conditional expectation then for arbitrary numbers a and b the variable aξ + bη also has generalized conditional expectation and a.s.
E (aξ + bη | F) = aE (ξ | F) + bE (η | F).
5. If the generalized conditional expectation E (ξ | F) exists and η is F-measurable then the generalized conditional expectation E (ηξ | F) also exists and a.s.
E (ηξ | F) = ηE (ξ | F).
19 That is, the expectation E (η) ≡ E (η+) − E (η−) exists. Recall that by definition the integral E (η) exists if the difference E (η+) − E (η−) is not of the form ∞ − ∞. See: [71].
6. If G ⊆ F and the generalized conditional expectation E (ξ | G) exists then the generalized conditional expectation E (ξ | F) also exists and a.s.
E (E (ξ | G) | F) = E (ξ | G),
E (E (ξ | F) | G) = E (ξ | G).
Recall that the difference E (η+ | F) − E (η− | F) is not necessarily meaningful for every η. To make the definition of the predictable projection as simple as possible we introduce the operator
x ⊖ y ≡ { x − y if x, y ≥ 0 are not both infinite; ∞ otherwise }.
Definition 3.27 (Extended conditional expectation) Let η be an arbitrary random variable and let F be an arbitrary σ-algebra.
1. The extended conditional expectation Ê (η | F) is by definition
Ê (η | F) ≡ E (η+ | F) ⊖ E (η− | F).   (3.3)
2. We say that the extended conditional expectation Ê (η | F) is well-defined if, except on a measure-zero set, one need not use the convention ∞ ⊖ ∞ ≡ ∞.
Let us remark that Ê (η | F) is F-measurable, but it is different from the conditional expectation and from the generalized conditional expectation. Observe that Ê (η | F) is the generalized conditional expectation if and only if Ê (η | F) is finite. Observe also that in general the extended conditional expectation operator Ê is not linear.
3.1.4
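The convention behind ⊖ is easy to misread in print, so here is a direct transcription as code (an illustrative sketch of ours; the name `circ_minus` is not in the text). The only non-obvious branch is the doubly infinite one, which is exactly what makes the operator Ê non-linear.

```python
# The operator x ⊖ y of the text: the ordinary difference x - y when
# x, y >= 0 are not both infinite, and +infinity by convention otherwise.
INF = float("inf")

def circ_minus(x, y):
    if x >= 0 and y >= 0 and not (x == INF and y == INF):
        return x - y
    return INF

assert circ_minus(3.0, 1.0) == 2.0     # ordinary difference
assert circ_minus(INF, 5.0) == INF
assert circ_minus(5.0, INF) == -INF
assert circ_minus(INF, INF) == INF     # the convention: infinity ⊖ infinity = infinity
```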
Definition of the predictable projection
Using the extended conditional expectation, let us define the predictable projection. Before the definition let us discuss the next, very important example. Example 3.28 The predictable projection of a local martingale.
Let L ∈ L be a local martingale and let L be the net gain of some game. Let τ < ∞ be some exit strategy from L. Let us assume that we want to predict, foresee, at least infinitesimally, the value of L (τ). Of course we should assume that τ is a predictable stopping time, otherwise there is no hope to predict the value of L (τ). If (τn) announces τ and L is a uniformly integrable martingale then at τn our reasonable prediction of L (τ) is E (L (τ) | Fτn) = L (τn). By Lévy's
theorem
lim n→∞ E (L (τ) | Fτn) = E (L (τ) | Fτ−),
which is obviously
lim n→∞ L (τn) = L (τ−).
If L is a local martingale and (σm) is localizing L then
lim m→∞ L^σm (τ−) = L (τ−)
is a reasonable infinitesimal prediction of L (τ). Therefore we have got that the 'predictable part' of L is L−, which is not a great surprise after all. Perhaps it is more interesting to study the limit of the estimating formula E (L^σm (τ) | Fτ−). Observe that
L^σm (τ) = L (τ) χ ({τ > σm}c) + L (σm) χ (τ > σm).
Obviously20 {τ > σm} = ∪n {τn > σm} ∈ σ (∪n Fτn) = Fτ−, therefore
Ê (L^σm (τ) | Fτ−) ≡ E ((L^σm)+ (τ) | Fτ−) ⊖ E ((L^σm)− (τ) | Fτ−) =
= χ (τ ≤ σm) E (L+ (τ) | Fτ−) +
+ χ (τ > σm) E (L+ (σm) | Fτ−) −
− χ (τ ≤ σm) E (L− (τ) | Fτ−) −
− χ (τ > σm) E (L− (σm) | Fτ−).
If m → ∞ then, as τ < ∞, the limit is almost surely
E (L+ (τ) | Fτ−) ⊖ E (L− (τ) | Fτ−).
Almost surely
χ (τ ≤ σm) E (L+ (τ) | Fτ−) = E (χ (τ ≤ σm) L+ (τ) | Fτ−) < ∞
for all m and almost surely ∪m {τ ≤ σm} = Ω. Therefore from the calculation it is clear that almost surely E (L± (τ) | Fτ−) < ∞. Hence
L− (τ) = E (L+ (τ) | Fτ−) − E (L− (τ) | Fτ−) ≡ Ê (L (τ) | Fτ−),
where Ê denotes the extended conditional expectation with respect to Fτ−.
20 See: Proposition 1.34, page 20.
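The prediction argument of the example can be sanity-checked numerically for the simplest discontinuous local martingale, the compensated Poisson process M (t) = N (t) − λt. The sketch below is our own illustration (parameters and names are ours, not from the text): at a deterministic, hence predictable, time t it checks the two facts the example relies on, namely the martingale property E [M (t)] = 0 and that no simulated path jumps exactly at the fixed time t, so that M (t−) = M (t) almost surely.

```python
# Monte Carlo sketch: for the compensated Poisson process
# M(t) = N(t) - lam*t, a deterministic time t is predictable and the
# prediction E(M(t) | F_{t-}) = M(t-) rests on (i) E[M(t)] = 0 and
# (ii) jumps at the fixed time t having probability zero.
import random

random.seed(0)
lam, t, n_paths = 2.0, 3.0, 20000

def poisson_count(lam, t):
    """Number of jumps of a rate-lam Poisson process in [0, t], built
    from exponential interarrival times; also reports a jump exactly at t."""
    s, k, at_t = 0.0, 0, False
    while True:
        s += random.expovariate(lam)
        if s > t:
            return k, at_t
        if s == t:
            at_t = True
        k += 1

jump_at_t = 0
total = 0.0
for _ in range(n_paths):
    k, at_t = poisson_count(lam, t)
    jump_at_t += at_t
    total += k - lam * t        # one sample of M(t)

mean_M_t = total / n_paths
assert abs(mean_M_t) < 0.1      # E[M(t)] = 0 up to Monte Carlo error
assert jump_at_t == 0           # no path jumps exactly at the fixed time t
```

The tolerance 0.1 is generous: the standard deviation of the sample mean is roughly sqrt(λt / n_paths) ≈ 0.017 here.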
Definition 3.29 (Predictable projection) We say that the predictable process pX is the predictable projection of a process X if
pX (τ) = Ê (X (τ) | Fτ−) a.s. on the set {τ < ∞},   (3.4)
that is21,
χ (τ < ∞) Ê (X (τ) | Fτ−) = Ê (X (τ) χ (τ < ∞) | Fτ−) = pX (τ) χ (τ < ∞) a.s.
for every predictable stopping time τ.
1. pX is well-defined if for every predictable stopping time τ the extended conditional expectation in (3.4) is well-defined.
2. pX is finite if for every predictable stopping time τ the extended conditional expectation in (3.4) is a.s. finite. In this case the definition of the predictable projection is the following: for every predictable stopping time τ, on the set {τ < ∞},
pX (τ) = Ê (X (τ) | Fτ−) = E (X (τ) | Fτ−) = E (X+ (τ) | Fτ−) − E (X− (τ) | Fτ−) a.s.,
where E denotes the generalized conditional expectation.
Example 3.30 For non-negative processes, if the predictable projection exists then it is well-defined; for bounded processes, if it exists then it is finite.
3.1.5
The uniqueness of the predictable projection, the predictable section theorem
Our direct goal in this subsection is to prove the next observation22:
Theorem 3.31 (Uniqueness of the predictable projection) If X1 ≥ X2 and pX1 is a predictable projection of X1 and pX2 is a predictable projection of X2 then pX1 (ω) ≥ pX2 (ω) for almost all outcomes ω. Specifically, if (pX)1 and (pX)2 are both predictable projections of some process X then (pX)1 and (pX)2 are indistinguishable.
Before we start to prove the theorem, let us remark that from the definition of the predictable projection it is obvious that the result of the operation pX is unique up to modification. But stochastic processes are not just indexed sets of random
21 See: Proposition 3.14, page 186. Observe that X (τ) can have a meaning even on {τ = ∞}.
22 At the moment we do not know when the predictable projection exists. Later we shall prove that every product measurable process has a predictable projection. See: Proposition 3.44, page 206.
variables, they are functions of two variables! The main point of the theorem is that we can select from every equivalence class pX (t) a representative in such a way that the resulting process, as a function of two variables, is predictable. From this it is not really surprising that the proof of the theorem depends on some deep and difficult properties of measurable sets. The theorem is an easy and direct consequence of the next result:
Proposition 3.32 If B ⊆ R+ × Ω is a predictable set and P (projΩ B) > 0 then there is a predictable stopping time σ such that
[σ] = Graph (σ) ≡ {(σ (ω), ω) : σ (ω) < ∞} ⊆ B,
and 0 < P (σ < ∞).
Proof of the Theorem: If the theorem were not true then the projection on Ω of the set
B ≡ {(t, ω) : pX1 (t, ω) < pX2 (t, ω)}
would have positive probability. The set B is predictable, since pX1 and pX2 are predictable by definition. Hence, by the proposition above, there is a predictable stopping time σ such that with positive probability the graph of σ is in B. It is easy to see that the extended conditional expectation is a monotone operation. Hence, as X1 ≥ X2, by the definition of the predictable projection, on the set {σ < ∞}
pX1 (σ) = Ê (X1 (σ) | Fσ−) ≥ Ê (X2 (σ) | Fσ−) = pX2 (σ) a.s.,
which is impossible as on a set of size P (σ < ∞) > 0
pX1 (σ) < pX2 (σ).
If (pX)1 and (pX)2 are two different predictable projections of some X then for almost all outcomes (pX)1 (ω) ≥ (pX)2 (ω) and (pX)2 (ω) ≥ (pX)1 (ω). Hence the trajectories of (pX)1 and (pX)2 are equal almost surely.
Proposition 3.32 is a direct consequence of the next important 'technical' theorem. This theorem is the 'hard part' of the whole stochastic analysis, and during its proof we shall use the completeness of (Ω, A, P) several times.
Theorem 3.33 (Predictable Section Theorem) If B ⊆ R+ × Ω is a predictable set then for every ε > 0 there is a predictable stopping time σ such that
[σ] = Graph (σ) ≡ {(t, ω) : t = σ (ω) < ∞} ⊆ B,
and P (projΩ B) ≤ P (σ < ∞) + ε.
Proof. The proof of the theorem contains several steps.
1. Observe that if τ is an arbitrary stopping time then the process χ ([0, τ]) is adapted and left-continuous. Recall that the random intervals of type [0, τ] together with the products {0} × F, F ∈ F0, generate the predictable sets23 P. As (τ, ∞) = [0, τ]c, the random intervals (τ, ∞) together with the sets {0} × F, F ∈ F0, also generate the predictable sets. If σ is a predictable stopping time and (σn) announces σ then [σ, ∞) = ∩n (σn, ∞). Hence if σ is a predictable stopping time then the random interval [σ, ∞) is a predictable set. For an arbitrary stopping time τ
(τ, ∞) = ∪n [τ + 1/n, ∞).
τ + 1/n is trivially a predictable stopping time, hence the intervals [σ, ∞), where σ is a predictable stopping time, together with the sets {0} × F, F ∈ F0, generate the predictable sets P. Let I denote the set of random intervals [σ, τ), where σ and τ are predictable stopping times. Using that the minima and the maxima of predictable stopping times are predictable stopping times, it is easy to see that I is a semi-algebra. Observe that if F ∈ F0 then
σ (ω) ≡ { 0 if ω ∈ F; ∞ if ω ∉ F }
is a predictable stopping time. If τ ≡ 0 then
{0} × F = [σ, τ] = ∩n [σ, τ + 1/n),
that is, {0} × F ∈ σ (I). This implies that P = σ (I).
2. Since (Ω, A, P) is complete, by the Measurable Selection Theorem24 there exists a measurable function f : Ω → [0, ∞] such that
Graph (f) ⊆ B and {f < ∞} = projΩ B.
On the product σ-algebra (R+ × Ω, B (R+) × A) let us define the set-function
µ (E) ≡ P (projΩ (E ∩ Graph (f))).
Since f is measurable the set Graph (f) is product measurable. Hence the set after the operator projΩ is product measurable. As (Ω, A, P) is complete, by the Projection Theorem25
projΩ (E ∩ Graph (f)) ∈ A.
23 See: Corollary 1.44, page 26.
24 See: Theorem A.13, page 551.
25 See: Theorem A.12, page 550.
This implies that the definition of µ is correct. One should prove that µ is a measure. To show this it is sufficient to observe that if E1 ∩ E2 = ∅ then26
projΩ (E1 ∩ Graph (f)) ∩ projΩ (E2 ∩ Graph (f)) = ∅.   (3.5)
3. Consider the algebra H generated by I. As we have remarked, P = σ (H). By the assumption of the theorem B ∈ P and, as P = σ (H), of course B ∈ σ (H). Obviously µ is a finite measure. Hence from the uniqueness of the extension of finite measures from an algebra to the generated σ-algebra and from the construction of the extended measure, µ (B) ≡ µ* (B), there is a C ⊆ B, where C = ∩n Hn, Hn ∈ H, such that
P (projΩ B) ≡ µ (B) ≤ µ (C) + ε.
Therefore it is sufficient to prove that there is a predictable stopping time σ for which almost surely Graph (σ) ⊆ C.
4. Consider the début of C:
τC (ω) ≡ inf {t : (t, ω) ∈ C}.
Random intervals are progressively measurable and therefore C is progressively measurable. Hence τC is a stopping time27. We should show that τC is a predictable stopping time and for almost all ω if τC (ω) < ∞ then (τC (ω), ω) ∈ C.
5. Assume first that C ≡ [σ, ρ) ∈ I. If D ≡ {σ < ρ} then
τC (ω) = σD (ω) ≡ { σ (ω) if ω ∈ D; ∞ if ω ∉ D } = { σ (ω) if σ (ω) < ρ (ω); ∞ if ρ (ω) ≤ σ (ω) }.
Obviously τC (ω) < ∞ if and only if (τC (ω), ω) ∈ C. By the definition of I the stopping times σ and ρ are predictable. Therefore28 D ≡ {σ < ρ} ∈ Fσ−. As σ is predictable, σD is also a predictable stopping time. Therefore if C ∈ I the theorem holds.
26 If the intersection were not empty and x was in the intersection then (x, y) ∈ Graph (f) if and only if y = f (x), so (x, f (x)) ∈ E1 ∩ E2 = ∅. The σ-additivity of µ is obvious as for an arbitrary mapping F (∪n An) = ∪n F (An).
27 See: Theorem 1.28, page 15.
28 See: Proposition 3.17, page 187.
6. Assume that the theorem is true for C1, C2, . . . , Cn and let C ≡ ∪i=1,...,n Ci. Obviously
τC = mini τCi ∈ C.
As the minimum of a finite number of predictable stopping times is a predictable stopping time, the theorem holds for C as well. This means that the theorem is valid if C ∈ H.
7. If C1, C2, . . . , Cn ∈ H and C ≡ ∩i=1,...,n Ci then
τC = maxi τCi ∈ C
and, as the maximum of a finite number of predictable stopping times is a predictable stopping time, the theorem holds again.
8. Finally let Cn ↘ C and let us assume that τCn is predictable and [τCn] ⊆ Cn for all n. Let
τ ≡ ess sup {σ : σ ≤ τC, σ is a predictable stopping time}.
By the usual construction of the essential supremum, given that the increasing limit of predictable stopping times is again a predictable stopping time, one can easily prove that τ is a predictable stopping time. Let Dn ≡ Cn ∩ [τ, ∞) and let τDn be the début of Dn. Obviously τDn = τCn ∨ τ. τ is predictable, hence τDn is also predictable. As [τCn] ⊆ Cn it is obvious that [τDn] ⊆ Dn. As τ ≤ τC,
C ⊆ Cn ∩ [τC, ∞) ⊆ Cn ∩ [τ, ∞) ≡ Dn ⊆ Cn ↘ C.
Hence C = ∩n Dn. Obviously τCn ≤ τC so
τ ≤ τCn ∨ τ = τDn ≤ τC.
So, by the definition of τ,
τ = τDn a.s. for all n.
As [τDn] ⊆ Dn, if τ (ω) < ∞ then
(τ (ω), ω) ∈ ∩n Dn = C   (3.6)
for almost all ω. This implies that τ ≥ τC a.s., so τ = τC a.s. The filtration is complete and τ is predictable so τC is a predictable stopping time. By (3.6), (τC (ω), ω) ∈ C if τC (ω) < ∞ for almost all ω.
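For intuition only, the début construction of step 4 can be imitated on a finite grid, where the measure-theoretic difficulties disappear: the début is a first-hit index and its graph is automatically contained in the set. This toy model is ours, not from the text, and it hides precisely the subtlety that the Predictable Section Theorem addresses.

```python
# Toy debut computation on a finite grid: C is a set of (time, omega)
# pairs; the debut tau_C(omega) is the first time at which the section
# of C at omega is hit (None plays the role of +infinity).

def debut(C, omegas, times):
    tau = {}
    for w in omegas:
        hits = [t for t in times if (t, w) in C]
        tau[w] = min(hits) if hits else None
    return tau

omegas = ["u", "v", "w"]
times = range(5)
# A few "interval"-like bands per outcome, plus an isolated point.
C = {(1, "u"), (2, "u"), (3, "u"), (4, "v"), (0, "w"), (2, "w")}
tau = debut(C, omegas, times)

# The graph of the debut is contained in C wherever the debut is finite.
assert all((tau[w], w) in C for w in omegas if tau[w] is not None)
assert tau == {"u": 1, "v": 4, "w": 0}
```

In continuous time the début of a general measurable set need not belong to the set, which is why the approximation through the semi-algebra I of random intervals is needed.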
Corollary 3.34 A random variable τ is a predictable stopping time if and only if [τ] ≡ [τ, τ] = Graph (τ) is a predictable set.
Proof. If τ is a predictable stopping time and (τn) announces τ then
Graph (τ) = (∩n (τn, τ]) ∪ ({0} × {τ = 0}),
hence Graph (τ) is a predictable set.
1. On the other hand let us assume that Graph (τ) is predictable. Applying the Predictable Section Theorem to Graph (τ), for every n one can find a predictable stopping time σn such that τ = σn for the finite values of τ outside an event with probability smaller than 1/n. If τn ≡ mink≤n σk then τn is a predictable stopping time and, except on a measure-zero set, τn ↘ τ. Almost surely zero functions are stopping times, hence τ is a stopping time. But we should prove that τ is a predictable stopping time29.
2. Let (τk(n))k be an announcing sequence for τn. If d denotes the metric generating the topology of [0, ∞] then we can assume that for all k
P (d (τk(n), τn) > 2^{−k}) < 2^{−(k+n)}.
Introduce the stopping times ρk ≡ infn τk(n). Obviously ρk ≤ ρk+1 and ρk ≤ τk(n) ≤ τn for every k and n. Let ρ ≡ limk ρk. If for some ω and for some k
d (τk(n) (ω), τ (ω)) ≤ 2^{−k} for all n,
then d (ρk (ω), τ (ω)) ≤ 2^{−k}. Therefore d (ρ (ω), τ (ω)) ≤ 2^{−k}. Therefore
P (d (ρ, τ) > 2^{−k}) ≤ ∑n P (d (τk(n), τ) > 2^{−k}).
If τn is finite then τn = τ, so
{d (τk(n), τ) > 2^{−k}} ⊆ {d (τk(n), τn) > 2^{−k}},
29 Remember that in general if τn ↘ τ and the τn are predictable stopping times for all n, then τ is not necessarily a predictable stopping time.
hence
P (d (ρ, τ) > 2^{−k}) ≤ ∑n P (d (τk(n), τn) > 2^{−k}) ≤ 2^{−k}.
Therefore τ = ρ a.s., so τ is a predictable stopping time30.
We shall often use the following observation:
Proposition 3.35 Every right-regular, predictable process is locally bounded.
Proof. To simplify the notation we assume that X (0) = 0.
τn ≡ inf {t : |X (t)| > n}
is a stopping time. X is regular, hence on every finite time-interval every trajectory of X is bounded, so τn ↗ ∞. Observe that X can have a jump at time τn, so X is not necessarily bounded on the interval [0, τn]. As X is right-continuous, |X (τn)| ≥ n. Hence
[τn, τn] ≡ {(t, ω) : t = τn (ω)} = [0, τn] ∩ {(t, ω) : |X (t, ω)| ≥ n}.
X is predictable, hence the set {|X| ≥ n} is predictable. So the graph of τn is predictable. Hence by the just proved corollary τn is a predictable stopping time. If (σm(n)) announces τn then |X^{σm(n)}| ≤ n. For any n let us choose an index m and a σn ≡ σm(n) for which
P (τn − σn ≥ 2^{−n}) ≤ 2^{−n}.
By the Borel–Cantelli lemma, outside of a measure-zero set, for all ω there exists an n0 (ω) such that if n ≥ n0 (ω) then
σn < τn < σn + 2^{−n}.
This implies that σn → ∞ a.s. If ρm ≡ max1≤k≤m σk then almost surely ρm ↗ ∞. It is easy to see that X^{ρm} is bounded for all m, so X is locally bounded.
Example 3.36 The Poisson processes are not predictable31.
30 See: Proposition 3.11, page 184.
31 See: Example 3.56, page 219.
Let X be a Poisson process. Recall32 that τ1 = inf {t : X (t) = 1} is not predictable, so [τ1] is not a predictable set. As [τ1] = [0, τ1] ∩ {X ≥ 1} and [0, τ1] is predictable, {X ≥ 1} is not predictable. So X is not a predictable process.
3.1.6
Properties of the predictable projection
Later we shall prove that every product measurable process has a predictable projection. During the proof of the existence we shall need some properties of the operation, so we first summarize them.
Proposition 3.37 (Properties of the predictable projection) Up to indistinguishability the predictable projection has the following properties:
1. If Y is predictable then pY exists and pY = Y.
2. If Y is a finite-valued predictable process and X has a finite predictable projection pX, then Y X also has a finite-valued predictable projection p(Y X) and
(Y · X) = Y · (p X) .
(3.7)
The identity (3.7) also holds if Y is non-negative and predictable, X is nonnegative and p X exists, or when Y is non-negative, predictable and finite, and p X exists. 3. The correspondence X → p X is increasing, that is if 0 ≤ X ≤ Y and p X and p Y are meaningful then 0 ≤ p X ≤ p Y . 4. If processes X and Y have finite predictable projection, then X + Y also has finite predictable projection and p
(X + Y ) = p X + p Y.
The additivity property also holds when X and Y are non-negative.
5. The correspondence X → pX is homogeneous, that is, if the process X has a predictable projection and a is an arbitrary real number, then aX also has a predictable projection. If a ≥ 0 then p(aX) = a (pX).
6. The predictable projection satisfies the Monotone Convergence Theorem, that is, if 0 ≤ Xn ↗ X∞ and the processes Xn have predictable projections then the process X∞ also has a predictable projection and
pXn (t, ω) ↗ pX∞ (t, ω)   (3.8)
for all t and for almost all ω.
32 See: Example 3.7, page 183.
7. If σ is an arbitrary stopping time then p(X^σ) = (pX)^σ on the random interval [0, σ].
8. The predictable projection is localizable. If (σn) is a localizing sequence, p(X^σn) exists for all n and p(X^σn) = Y^σn on the random intervals [0, σn], then pX exists and pX = Y. If p(X^σn) is well-defined or finite for all n then pX is also well-defined or finite.
Proof. The proof is built on the analogous properties of the conditional expectation and on the uniqueness of the predictable projection.
1. If Y is predictable then Y+ (τ) and Y− (τ) are Fτ−-measurable33. If pY exists then for every predictable stopping time τ, on the set {τ < ∞},
pY (τ) = Ê (Y (τ) | Fτ−) ≡ E (Y+ (τ) | Fτ−) ⊖ E (Y− (τ) | Fτ−) = Y+ (τ) ⊖ Y− (τ) = Y (τ).
As the predictable projection is unique, pY and Y are indistinguishable. Reading the line above in reverse order one can see that pY exists.
2. If Y is predictable, then the stopped variable Yτ is Fτ−-measurable34 for every predictable stopping time τ. By the first assumption pX is finite, hence on the set {τ < ∞}
pX (τ) = Ê (X (τ) | Fτ−) = E (X (τ) | Fτ−).
Multiplying the equation by Y (τ) and using the analogous properties of the generalized conditional expectation,
Y (τ) (pX (τ)) = Y (τ) E (X (τ) | Fτ−) = E (Y (τ) X (τ) | Fτ−).
As the predictable projection is unique, (pX) Y = p(XY). If X and Y are non-negative, one can use the same argument, but instead of the generalized conditional expectation one should use the similar properties of the conditional expectation. To prove the last property it is sufficient to remark that if Y is non-negative and finite we can multiply the relation
pX (τ) = Ê (X (τ) | Fτ−) = E (X+ (τ) | Fτ−) ⊖ E (X− (τ) | Fτ−)
by Y.
3. We have already proved this property35.
33 See: Proposition 3.15, page 186.
34 See: Proposition 3.15, page 186.
35 See: Theorem 3.31, page 194.
4. By the definition of the predictable projection, for an arbitrary predictable stopping time τ, on the set {τ < ∞},
pX (τ) = Ê (X (τ) | Fτ−) a.s.,
pY (τ) = Ê (Y (τ) | Fτ−) a.s.
By the assumption the predictable projections are finite or non-negative, so one can write E instead of Ê. Using the additivity of the generalized conditional expectation,
(pX (τ) + pY (τ)) χ (τ < ∞) =
= (E (X (τ) | Fτ−) + E (Y (τ) | Fτ−)) χ (τ < ∞) =
= E (X (τ) χ (τ < ∞) | Fτ−) + E (Y (τ) χ (τ < ∞) | Fτ−) =
= E ((X (τ) + Y (τ)) χ (τ < ∞) | Fτ−) =
= Ê (X (τ) + Y (τ) | Fτ−) χ (τ < ∞),
hence p(X + Y) exists and p(X + Y) = pX + pY.
5. The proof is analogous.
6. Assume that 0 ≤ Xn ↗ X∞ and that pXn exists for all n. For an arbitrary predictable stopping time τ, on the set {τ < ∞},
pXn (τ) = E (Xn (τ) | Fτ−) a.s.
The limit Z ≡ lim infn→∞ pXn is predictable. By the monotonicity of the predictable projection, limn→∞ pXn (τ) ≡ Z (τ) exists for every trajectory except on a measure-zero set. The Monotone Convergence Theorem is true for the conditional expectation, hence if n → ∞, then a.s.
Z (τ) = lim infn→∞ pXn (τ) = limn→∞ pXn (τ) = limn→∞ E (Xn (τ) | Fτ−) =
= E (limn→∞ Xn (τ) | Fτ−) ≡ E (X∞ (τ) | Fτ−),
so Z (τ) = E (X∞ (τ) | Fτ−) a.s., hence Z is the predictable projection of X∞.
7. First of all let us remark that in general the relation p(X^σ) = (pX)^σ is not true. As we shall prove36, if L ∈ L then pL = L−, and obviously (L−)^σ and (L^σ)− are in general different. To prove the property let us notice that χ ([0, σ]) is a predictable process so by
36 See: Proposition 3.39, page 205.
Proposition 3.39, page 205.
204
THE STRUCTURE OF LOCAL MARTINGALES
the second property of the predictable projection just proved χ ([0, σ]) ·
p
(X σ ) =
p
(χ ([0, σ]) · X σ ) =
=
p
(χ ([0, σ]) · X) = σ
= χ ([0, σ]) · p X = χ ([0, σ]) · ( p X) . σ
that is p (X σ ) = ( p X) on the random interval [0, σ]. 8. Let τ be an arbitrary predictable stopping time. If (τ k ) announces τ then {τ ≤ σ n } = ∩k {τ k ≤ σ n } ∈ σ (∪k Fτ k ) ⊆ Fτ − . In a similar way as above 0 (X (τ ) | Fτ − ) = χ (τ ≤ σ n ) · E
(3.9)
0 (X (τ ) | Fτ − ) = = χ (τ ≤ σ n ) · E 2
0 (χ (τ ≤ σ n ) X (τ ) | Fτ − ) = = χ (τ ≤ σ n ) · E 0 (χ (τ ≤ σ n ) X σn (τ ) | Fτ − ) = = χ (τ ≤ σ n ) · E 0 (X σn (τ ) | Fτ − ) = = χ (τ ≤ σ n ) · E = χ (τ ≤ σ n ) · (p (X σn )) (τ ) = = χ (τ ≤ σ n ) · Y σn (τ ) . 0 (X (τ ) | Fτ − ) a.s. = Y (τ ) on the set {τ < ∞}. As σ n ∞, obviously E Y is a limit of predictable processes so it is predictable, hence the first part of the property holds. The second part of the property follows from (3.9). 3.1.7
Predictable projection of local martingales
Let us first prove some interesting results. Proposition 3.38 (Predictable Optional Sampling) If X is a uniformly integrable martingale and τ is a predictable stopping time then a.s.
X (τ −) = E (X (τ ) | Fτ − ) = E (X (∞) | Fτ − ) . Proof. If (τ n ) announces τ then by the Optional Sampling Theorem a.s.
X (τ n ) = E (X (τ ) | Fτ n ) = E (X (∞) | Fτ n ) .
(3.10)
PREDICTABLE PROJECTION
205
By the uniform integrability X (∞) ∈ L1 (Ω). Hence X (τ ) ∈ L1 (Ω). Every martingale has left-limits so lim X(τ n ) = X(τ −).
n→∞
As Fτ n Fτ − by L´evy’s martingale convergence theorem37 lim E (X (∞) | Fτ n ) = E (X (τ ) | Fτ − ) = E (X (∞) | Fτ − ) .
n→∞
Therefore (3.10) holds. Proposition 3.39 If L is a local martingale then L has a finite predictable projection p L and p
L (t) = L− (t) L (t−) .
Proof. L− is left-continuous therefore it is predictable. If L is a uniformly integrable martingale and τ is a predictable stopping time then by (3.10) a.s.
L− (τ ) = L (τ −) = E (L (τ ) | Fτ − ) that is by the definition of the predictable projection p L = L− . The general case is evident from the last property of the predictable projection. Corollary 3.40 A local martingale is predictable if and only if it is continuous. Proof. If L is continuous then L is predictable. The reverse implication follows from L+ = L = p L = L− . Corollary 3.41 If X is a special semimartingale then there is just one decomposition of the kind X = X (0) + V + L, V ∈ V, L ∈ L where V is predictable. This decomposition is called the canonical decomposition of X. Proof. If V1 + L1 = V2 + L2 then M V1 − V2 = L2 − L1 is a predictable local martingale, hence M is a continuous local martingale. On the other hand the a.s. trajectories of M have finite variation, so by Fisk’s theorem38 M = 0. The most important example of the predictable projection is the following. 37 See: 38 See:
Theorem 1.69, page 41. Theorem 2.11, page 117.
206
THE STRUCTURE OF LOCAL MARTINGALES
Corollary 3.42 If L is a local martingale and ∆L is the jump process of L then

p(∆L) = 0.

Proof. It is evident from the additivity of the predictable projection:

p(∆L) = p(L − L−) = pL − p(L−) = L− − L− = 0.

3.1.8
Existence of the predictable projection
Now we are ready to discuss the question of the existence of the predictable projection.

Corollary 3.43 Let η be an integrable random variable and let M(t) := E(η | Ft) be the martingale generated by the random variable η. If X ≡ η then X has a predictable projection and pX = M−.

Proof. As the usual conditions hold, M(t) := E(η | Ft) has a right-regular version, hence M is a uniformly integrable martingale. If τ is a predictable stopping time then by the Predictable Optional Sampling Theorem on the set {τ < ∞}

M(τ−) = E(M(∞) | Fτ−) = E(η | Fτ−) = E(X(τ) | Fτ−),

that is, X has a predictable projection and pX = M−.

Proposition 3.44 (Existence of the predictable projection) Every product-measurable process X has a predictable projection.

Proof. If η is an integrable random variable then, as we have seen, X ≡ η has a predictable projection. If I ∈ B(R+) then χI is predictable, hence

p(χI · X) = χI · pX,

that is, the processes of type X := χI · η also have a predictable projection. The processes of type χI · η form a π-system which generates the set of product-measurable processes. Denote by H the set of bounded processes X which have a predictable projection. Observe that by the properties of the predictable projection, H is a λ-system:

1. The constant process 1 is predictable, hence trivially 1 ∈ H.
2. By the linearity of the predictable projection among the bounded processes, H is a linear space.
3. By the Monotone Convergence Theorem for the predictable projection, H is a monotone class.
PREDICTABLE COMPENSATORS
207
By the Monotone Class Theorem H contains the set of processes generated by the π-system of the processes χI · η, that is, H contains the bounded product-measurable processes. Again by the Monotone Convergence Theorem every non-negative product-measurable process has a predictable projection. If X is a product-measurable process then the process

pX := p(X+) − p(X−)

satisfies (3.4) and therefore X has a predictable projection.
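The defining identity pX(τ) a.s.= E(X(τ) | Fτ−) has a transparent discrete-time caricature in which the conditional expectation is a plain average over atoms. The following sketch is an illustrative toy model of ours, not the book's: on the space of three fair coin flips, the analogue of the predictable projection at time n is E(ξn | F_{n−1}), computed by averaging over the paths sharing the first n − 1 flips.

```python
from itertools import product

# Toy probability space: all 2^3 equally likely paths of flips in {-1, +1}.
paths = list(product([-1, 1], repeat=3))

def projection(n):
    """E(xi_n | F_{n-1}) evaluated on each atom (= fixed prefix of n-1 flips)."""
    out = {}
    for prefix in product([-1, 1], repeat=n - 1):
        atom = [p for p in paths if p[:n - 1] == prefix]
        out[prefix] = sum(p[n - 1] for p in atom) / len(atom)
    return out

# Independent fair flips: the projection vanishes on every atom, the
# discrete analogue of Corollary 3.42 (projection of martingale jumps is 0).
print(projection(2))  # {(-1,): 0.0, (1,): 0.0}
```

The averaging over atoms is exactly what the measure-theoretic conditional expectation does in general; in continuous time the atoms are replaced by the σ-algebra Fτ−.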
3.2
Predictable Compensators
Assume that X is some non-negative increasing process, that is, assume that X describes some ‘unpredictable’ cumulative losses39. Although the losses are ‘unpredictable’ one can still ask whether they are ‘insurable’, that is, whether there is an insurance contract which compensates for the losses of X. By compensation we mean that there is some payment process P for which the net risk X − P is a local martingale40. Of course, since the process X is ‘risky’, one cannot forecast the jumps of X, so to make the insurance contract fair one should pay the price of the compensator ‘before’ the jumps; hence we are looking for a ‘predictable’ payment process P. The main result of this section is that if a right-continuous, increasing process X has locally integrable variation, then X has an increasing, right-continuous and locally integrable predictable compensator. This theorem is the ‘home edition’ of the ‘professional’ Doob–Meyer decomposition41.

3.2.1
Predictable Radon–Nikodym Theorem
Before the proof of the existence of the compensator, we introduce some tools which we shall use during the proof. Let V ∈ V+. Since V is right-continuous and increasing, every trajectory V(ω) generates a σ-finite measure µ(ω) on R+. Let us denote by G the set of possible events42 in Ω. If the process Y is measurable with respect to the product σ-algebra B(R+) × G then the trajectories Y(ω) are B(R+)-measurable. Hence if Y(ω) ≥ 0 then the integrals

∫₀^∞ Y(s, ω) dµ(s, ω) := ∫₀^∞ Y(s, ω) dV(s, ω)    (3.11)

are meaningful. Of course the value of the integral depends on ω. Let Y := χ(s,t] · χF for some F ∈ G. V is adapted, hence V(s, ω), as a function of ω, is

39 Of course in this paragraph we use the word ‘predictable’ in a colloquial sense.
40 See: Definition 2.41, page 141.
41 See: Theorem 5.1, page 292.
42 Recall that the symbol A is ambiguous. It denotes the set of events and the set of processes with integrable variation. To avoid confusion in the present subsection we shall use the symbol G for the events of Ω.
208
G-measurable for all s and therefore the integral

∫₀^∞ Y(s, ω) dV(s, ω) = χF · (V(t, ω) − V(s, ω))

is also G-measurable. With the Monotone Class Theorem it is now easy to prove that (3.11) is G-measurable43 for every non-negative product-measurable process Y. Hence the expression

E(∫₀^∞ Y dV) := ∫_Ω ∫₀^∞ Y(s, ω) dV(s, ω) dP(ω)

is meaningful for all non-negative product-measurable Y.

Definition 3.45 For V ∈ V+ let us define

µV(B) := E(∫₀^∞ χB dV) := E(∫₀^∞ χB(s, ω) dV(s, ω)),  B ∈ B(R+) × G.
It is easy to see that µV is a measure on the product σ-algebra B(R+) × G. If V ∈ A+, that is if V ∈ V+ and E(V(∞)) < ∞, then µV is finite. If V ∈ A+loc then µV is σ-finite.

Definition 3.46 If µ is a measure on the product σ-algebra B(R+) × G and for some V ∈ V+

µV(B) = µ(B),  B ∈ B(R+) × G,

then we say that the measure µ is generated by V.

Definition 3.47 We say that the product-measurable set N is evanescent if its characteristic function is indistinguishable from the characteristic function of the empty set, that is, outside an event with zero probability the trajectories of χN are zero44.

Definition 3.48 Let µ be a measure on the product σ-algebra45 B(R+) × G. We say that µ is absolutely continuous if µ(N) = 0 for every evanescent set N.

Proposition 3.49 (Generalized Radon–Nikodym Theorem) Let µ be a measure on (R+ × Ω, B(R+) × G).

43 Observe that we cannot directly apply the classical Radon–Nikodym theorem. The present argument is exactly that which one should use during the proof of the Radon–Nikodym theorem.
44 It is contained in R+ × N for some N with P(N) = 0.
45 Recall that G denotes the set of events in Ω.
209
There exists an A ∈ A+loc for which

µ(B) = µA(B) := E(∫₀^∞ χB dA),  whenever B ∈ B(R+) × G,

if and only if

1. µ([0]) := µ({0} × Ω) = 0,
2. µ([0, t] × ·) is σ-finite for every t,
3. µ is absolutely continuous.

If A, B ∈ V are two processes representing µ then A and B are indistinguishable.

Proof. It is easy to see that the conditions are necessary. For example, if B is evanescent then, by definition, outside an event of zero probability the trajectories of χB are zero so, trivially, µA(B) = 0. Let us prove the sufficiency of the conditions. Let us define the measures

ρt(H) := µ([0, t] × H) = µ([0] × H) + µ((0, t] × H) = µ((0, t] × H),  H ∈ G.

By the third assumption ρt is absolutely continuous with respect to P for every t. Hence the classical Radon–Nikodym derivative dρt/dP exists for all t. By the second condition dρt/dP is finite. Let At be a non-negative version of the derivative. Trivially A0 a.s.= 0 and if s ≤ t then As a.s.≤ At. By this monotonicity, if tn ↓ t∞ then Atn ↓ A∗ a.s. By the second assumption and by the Dominated Convergence Theorem

µ([0, t∞] × F) = lim_{n→∞} µ([0, tn] × F) = lim_{n→∞} ∫_F Atn dP = ∫_F A∗ dP.

Hence by the uniqueness of the Radon–Nikodym derivative A∗ a.s.= At∞. Of course we have the usual problem: since the variables At are defined up to a measure-zero set we can unambiguously define the process t → At only on the rational numbers. To define A for all real numbers t ≥ 0 we use the definition

At := inf {Ar : t < r, r ∈ Q}.

Since the filtration is right-continuous, At will be G-measurable for all t and it is easy to see that the trajectories of A are increasing and right-continuous. From the first condition it is obvious that one may take A0 = 0, that is, A ∈ V+. By the argument above At is a version of dρt/dP for all t. The processes of type χ([0, t] × H) form a π-system generating the σ-algebra of the product-measurable
210
sets. This π-system is contained in the λ-system of bounded functions f for which

E(∫₀^∞ f dA) = ∫_{R+×Ω} f dµ.

By the Monotone Class Theorem and by the Monotone Convergence Theorem for every B ∈ B(R+) × G

E(∫₀^∞ χB dA) = ∫_{R+×Ω} χB dµ = µ(B).
As µ is σ-finite, for every A there is just one µ associated with A. From the construction it is clear that if B ∈ V+ is another right-continuous representation of µ then B is a modification of A. As both A and B are right-continuous they are indistinguishable.

The main non-trivial question is the following: when is the generalized Radon–Nikodym derivative A predictable? To answer the question we need the following simple observation:

Lemma 3.50 Let F ⊆ G be two σ-algebras and let ξ ∈ L1(Ω, G, P). Assume that F contains the measure-zero sets of G. If for all F ∈ G

E(ξ · χF) = E(ξ · E(χF | F))    (3.12)

then ξ is F-measurable.

Proof. We prove that

ξ̂ := E(ξ | F) a.s.= ξ.

If F ∈ G and h := χF then by (3.12)

E(ξ · h) := E(ξ · χF) = E(ξ · E(χF | F)) =
= E(E(ξ · E(χF | F) | F)) =
= E(E(ξ | F) · E(χF | F)) =
= E(E(ξ̂ · χF | F)) = E(ξ̂ · χF) := E(ξ̂ · h).

As ξ ∈ L1(Ω), by the usual simple density argument, using that every h ∈ L∞(Ω, G, P) is almost surely a uniform limit of step functions, the equation
211
can be extended to all h ∈ L∞ (Ω, G, P). Therefore if h sign(ξ − 0 ξ) then
E ξ − 0 ξ = E ξ − 0 ξ sign ξ − 0 ξ = 0. a.s. ξ. By the definition of the conditional expectation 0 ξ is Hence, as we said ξ = 0 F-measurable. F contains all the measure-zero sets of G and therefore ξ is also F-measurable.
Proposition 3.51 Assume that the measure µV is generated by a process V ∈ V+. Assume also that whenever

pX = pY

then

µV(X) := E(∫₀^∞ X dV) = E(∫₀^∞ Y dV) =: µV(Y).

Then V is predictable.

Proof. We divide the proof into several steps.

1. First we prove that V(τ) is Fτ−-measurable for every predictable stopping time τ. By the lemma it is sufficient to prove that

E(χH · V(τ)) = E(E(χH | Fτ−) · V(τ)),
whenever H ∈ G, which is the same as

µV(χH · χ([0, τ])) = µV(E(χH | Fτ−) · χ([0, τ])).

To prove this equation one should prove, by the assumption of the proposition, that the predictable projections of the two associated processes in the argument of µV are indistinguishable:

p(χH · χ([0, τ])) = p(E(χH | Fτ−) · χ([0, τ])).

Let M(t) := E(χH | Ft). By the Predictable Optional Sampling Theorem

M−^τ(t) = M((τ ∧ t)−) = E(M(∞) | F(τ∧t)−) =
= E(χH | F(τ∧t)−) = E(E(χH | Fτ−) | F(t∧τ)−) =
= p(E(χH | Fτ−))(t ∧ τ),

that is

M−^τ = (p(E(χH | Fτ−)))^τ.    (3.13)
212
χ([0, τ]) is predictable and46 p(χH) = M−, hence47

p(χH · χ([0, τ])) = p(χH) · χ([0, τ]) = M− · χ([0, τ]) =
= M−^τ · χ([0, τ]) = (p(E(χH | Fτ−)))^τ · χ([0, τ]) =
= p(E(χH | Fτ−)) · χ([0, τ]) =
= p(E(χH | Fτ−) · χ([0, τ])),

where in the middle we used (3.13). This is what we wanted, therefore V(τ) is Fτ−-measurable.

2. Since V(τ) is Fτ−-measurable, from the definition of the predictable projection on {τ < ∞}

pV(τ) a.s.= E(V(τ) | Fτ−) a.s.= V(τ)    (3.14)
for every predictable stopping time τ.

3. As a special case we get that pV is a modification of V. As a next step we prove that V and pV are indistinguishable. Let V = Vc + Vd, where Vc is continuous and Vd := Σ ∆V is the jump part of V. As Vc and Vd are non-negative and as the predictable projection for non-negative processes is additive:

pV = p(Vc + Vd) = p(Vc) + p(Vd) = Vc + p(Vd).

Since V is regular, the jumps of V form a thin set, and as V is right-regular one may write

Vd = Σ ∆V = Σ_k ∆V(σk) χ([σk, ∞)),

where the σk are either predictable or totally inaccessible48. The terms in the sum are non-negative, so by the Monotone Convergence Theorem for the predictable projection49

p(Vd) = Σ_k p(∆V(σk) χ([σk, ∞))).

If σk is predictable, then χ([σk, ∞)) is predictable and by (3.14) in the previous point of the proof50

p(∆V(σk) · χ([σk, ∞))) = p(∆V(σk)) · χ([σk, ∞)) =
= p((V − V−)(σk)) · χ([σk, ∞)) =
= ∆V(σk) · χ([σk, ∞))

46 See: Corollary 3.43, page 206.
47 See: (3.7), page 201.
48 See: Proposition 3.19, page 188.
49 See: (3.8), page 201.
50 See: (3.7), page 201.
213
almost surely. If σk is totally inaccessible, then

E(∆V(σk)) = ∫_Ω ∫₀^∞ χ([σk]) dV dP =: µV(χ([σk])).

Since σk is totally inaccessible, χ([σk])(τ) a.s.= 0 for any predictable stopping time τ. So p(χ([σk])) = 0 = p0. Hence by the assumption of the proposition E(∆V(σk)) = 0. As V is increasing, V(σk) a.s.= V(σk−), that is, if σk is totally inaccessible then ∆V(σk) a.s.= 0. From this it is now obvious that V and the predictable process pV are indistinguishable.

4. As a final step it is sufficient to prove that if X is product-measurable and indistinguishable from the zero process, then X is predictable. Let P(N) = 0 and let X(ω) = 0 if ω ∉ N. Let us first assume that X = χC, where C := (t1, t2] × A with A ⊆ N. As Ft is complete,

σt(ω) := t if ω ∈ A,  σt(ω) := 0 if ω ∉ A,

is a stopping time for all t. This implies that

C := {(t, ω) : t1 < t ≤ t2, ω ∈ A} = {(t, ω) : σt1(ω) < t ≤ σt2(ω)} =: (σt1, σt2] ∈ P.

From this, using that X is product-measurable, the predictability of X is already immediate.

3.2.2
Predictable Compensator of locally integrable processes
Now we are ready to prove the ‘home edition’ of the Doob–Meyer decomposition:

Theorem 3.52 (Existence of Predictable Compensators) If A ∈ A+loc then there is a predictable process Ap ∈ A+loc, unique up to indistinguishability, which satisfies each of the following three equivalent properties51:

1. A − Ap ∈ L.

51 The interpretation of these equivalent properties is the following. If A is a process of some cumulative losses then Ap is the cumulative insurance fee which one would have to pay to an insurance company to cover the risk in A. A − Ap ∈ L means that to make the contract fair the net gain of the insurance company should contain no systematic trend. By the second condition, whatever stopping strategy one of the parties follows, on average nobody gains. By the third condition, if H is the number of insurance contracts then on average the net payout of the company is the same as the amount of fees paid by the clients. Of course one can decide about the number of contracts and about the size of the fee before the losses represented by A occur, so H and Ap should be predictable.
214
2. E(A(τ)) = E(Ap(τ)) for any stopping time52 τ.
3. For all non-negative, predictable processes H

E(∫₀^∞ H dA) = E(∫₀^∞ H dAp),    (3.15)
where the integrals on both sides are pathwise Lebesgue–Stieltjes integrals.

Proof. We divide the proof into several steps.

1. First we prove that the three conditions above are equivalent. Let A − Ap ∈ L and let (τn) be a joint localizing sequence of A, Ap and A − Ap. As the spaces A+loc and L are closed under stopping one can find such a sequence. (A − Ap)^{τn} ∈ M, hence by the Optional Sampling Theorem

E(A(τ ∧ τn)) = E(Ap(τ ∧ τn)).

A and Ap are increasing processes by the definition of A+loc, hence by the Monotone Convergence Theorem E(A(τ)) = E(Ap(τ)). If H := χ([0, τ]) and X ∈ V+ is arbitrary, then ∫₀^∞ H dX = X(τ); hence if H := χ([0, τ]) then the third condition follows from the second one. If H is the characteristic function of some set {0} × F, F ∈ F0, then (3.15) obviously holds. The σ-algebra generated by the intervals [0, τ] and by the sets {0} × F is the set of predictable processes, hence the general case follows from the Monotone Class Theorem and from the Monotone Convergence Theorem, since the processes of type χ([0, τ]) form a π-system53 and the set of processes H satisfying (3.15) is trivially a λ-system. We can reverse the last argument, so if the third condition holds, then

E(A(θ)) = E(∫₀^∞ χ([0, θ]) dA) = E(∫₀^∞ χ([0, θ]) dAp) = E(Ap(θ))

for any stopping time θ. Of course, it can happen that both expected values are infinite. If (τn) is a joint localizing sequence for A and Ap, then E(A(τ ∧ τn)) = E(Ap(τ ∧ τn)) for every stopping time τ, where by the localization the two expected values are finite. Hence (A − Ap)^{τn} ∈ M, that is, A − Ap is a local martingale.

52 Observe that since A and Ap are increasing, A(∞) and Ap(∞) are well-defined although they can be +∞.
53 χ([0, τ]) χ([0, σ]) = χ([0, σ ∧ τ]) and σ ∧ τ is also a stopping time.
215
2. We prove that Ap is unique. If Ap1, Ap2 ∈ Aloc are predictable processes and A − Api are local martingales for i = 1, 2, then Ap1 − Ap2 ∈ V is a predictable local martingale. As every predictable local martingale is continuous54, Ap1 − Ap2 ∈ V is a continuous local martingale. Hence by Fisk's theorem55 Ap1 − Ap2 = Ap1(0) − Ap2(0) = 0.

3. Finally we prove the existence of Ap. Let µA be the measure generated by A, that is, if X is a product-measurable set then let µA(X) := E(∫₀^∞ χ(X) dA). On the product-measurable sets let us define the set function

µ(X) := µA(p(χ(X))) := E(∫₀^∞ p(χ(X)) dA).

Observe that since p(χ(X)) is well-defined, the set function µ is also well-defined. If X1 and X2 are disjoint then by the additivity of the predictable projection

µ(X1 ∪ X2) := µA(p(χ(X1 ∪ X2))) := E(∫₀^∞ p(χ(X1 ∪ X2)) dA) =
= E(∫₀^∞ p(χ(X1) + χ(X2)) dA) =
= E(∫₀^∞ p(χ(X1)) dA) + E(∫₀^∞ p(χ(X2)) dA) =:
=: µA(p(χ(X1))) + µA(p(χ(X2))) := µ(X1) + µ(X2),

so µ is additive. It is clear from the Monotone Convergence Theorem for the predictable projection that µ is σ-additive. Hence µ is a measure. A ∈ A+loc, therefore µA, hence µ, is σ-finite. If X is evanescent, it is predictable56, hence

µ(X) := µA(p(χ(X))) = µA(χ(X)) := E(∫₀^∞ χ(X) dA) = E(∫₀^∞ 0 dA) = 0.

Hence µ is absolutely continuous. Therefore by the generalized Radon–Nikodym theorem there is an Ap ∈ A+loc for which µ = µAp. That is, for all predictable

54 See: Corollary 3.40, page 205.
55 See: Theorem 2.11, page 117.
56 See: step 4 in the proof of Proposition 3.51, page 211.
216
sets X

E(∫₀^∞ χ(X) dA) =: µ(X) = µAp(X) := E(∫₀^∞ χ(X) dAp).

From Proposition 3.51 it is clear that Ap is predictable. Hence Ap is the increasing, predictable compensator of A.

Corollary 3.53 If A ∈ Aloc then there is a predictable process Ap ∈ Aloc for which A − Ap is a local martingale. If Ap1 and Ap2 are two such processes then Ap1 and Ap2 are indistinguishable.

Proof. As A ∈ Aloc ⊆ V, the process Var(A) is well-defined and Var(A) ∈ A+loc. By Jordan's decomposition B := (A + Var(A))/2 ∈ A+loc and C := (Var(A) − A)/2 ∈ A+loc. The process Ap := Bp − Cp is predictable and A − Ap ∈ L. The proof of the uniqueness of Ap is the same as in the previous statement.

Let us remark that the condition A ∈ Aloc is in some sense necessary. If A ∈ V and there is an Ap ∈ V such that A − Ap ∈ L, then A − Ap ∈ L ∩ V, hence57 A − Ap ∈ Aloc. As Ap is predictable and right-regular it is locally bounded58, so A ∈ Aloc.

Example 3.54 Predictable compensator of compound Poisson processes.
Let X be a compound Poisson process. Assume that the expected value M of the jump distribution is finite. If N is the Poisson process with parameter λ describing the number of the jumps of X then

E(X(t)) = Σ_{k=0}^∞ E(X(t) | N(t) = k) P(N(t) = k) = Σ_{k=0}^∞ kM · (λt)^k exp(−λt) / k! = λtM.

X has independent increments, so it is very easy to see that X(t) − λMt is a martingale59. The process λMt is continuous, hence it is predictable, so Xp(t) = λMt. If the distribution of the jumps does not have an expected value then X ∉ Aloc, hence X does not have a compensator.

57 See: Proposition 3.4, page 181.
58 See: Proposition 3.35, page 200.
59 Every process with independent increments and zero expected value is a martingale.
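The series computed in Example 3.54 can be checked numerically. The sketch below sums Σ_k kM (λt)^k e^{−λt}/k! using the Poisson recursion p_k = p_{k−1}·λt/k (to avoid huge factorials) and compares it with the compensator value λMt; the parameter values are arbitrary illustrative choices.

```python
import math

def compound_poisson_mean(lam, M, t, terms=200):
    """Partial sum of E(X(t)) = sum_k E(X(t)|N(t)=k) P(N(t)=k) = sum_k k*M*p_k."""
    total, pk = 0.0, math.exp(-lam * t)  # pk starts at P(N(t)=0)
    for k in range(1, terms):
        pk *= lam * t / k                # Poisson weights: p_k = p_{k-1}*lam*t/k
        total += k * M * pk
    return total

lam, M, t = 1.7, 0.3, 2.5
print(compound_poisson_mean(lam, M, t))  # ≈ lam * M * t = 1.275
```

With λt = 4.25 the tail beyond 200 terms is negligible, so the partial sum agrees with λMt to machine precision, which is exactly the statement Xp(t) = λMt in expectation.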
3.2.3
217
Properties of the Predictable Compensator
Let us summarize the most important properties of Ap. Let us remark that if A ∈ Aloc and H is locally bounded then the frequently used condition H • A ∈ Aloc holds.

1. If A ∈ Aloc and A is predictable then60 Ap = A.
2. If A ∈ Aloc then A is a local martingale if and only if Ap = 0.
3. (A + B)p = Ap + Bp, (cA)p = cAp.
4. If A ∈ A then Ap ∈ A and A − Ap ∈ M.

Let A± := (1/2)(A ± Var(A)). Then Ap = (A+)p − (A−)p. Obviously A± ∈ A+. By definition (A±)p ∈ A+loc. But as for τ = ∞ E(A±(τ)) = E((A±)p(τ)), we get (A±)p ∈ A+. Hence if A ∈ A then Ap ∈ A. Therefore A − Ap is in A. Hence A − Ap is a class D local martingale, so A − Ap ∈ M.

5. If A ∈ Aloc and τ is a stopping time then (A^τ)p = (Ap)^τ.

A − Ap ∈ L, hence (A − Ap)^τ = A^τ − (Ap)^τ ∈ L. Truncated predictable processes are predictable61, hence, as (A^τ)p is unique, (A^τ)p = (Ap)^τ.

6. If A ∈ Aloc then

∆(Ap) = p(∆A).    (3.16)

∆(Ap) := Ap − (Ap)− is predictable, hence62

p(∆(Ap)) = ∆(Ap).

If L ∈ L then p(∆L) = 0 and, as p(∆L) is a finite predictable projection, by the linearity of the predictable projection63

0 = p(∆L) = p(∆(A − Ap)) = p(∆A − ∆(Ap)) = p(∆A) − p(∆(Ap)) = p(∆A) − ∆(Ap),

which is exactly (3.16).

7. If H is predictable, A ∈ Aloc and (H • A)(t) := ∫₀^t H dA ∈ Aloc then

(H • A)p = H • Ap.    (3.17)

The predictable compensator is by definition a member of the space Aloc, hence under the present conditions H • Ap ∈ Aloc.

60 0 ∈ L.
61 See: Proposition 1.39, page 23.
62 See: Proposition 3.37, page 201.
63 See: Corollary 3.42, page 206.
218
Let B − C be the Jordan decomposition of the integrator A. By the definition of the integral with respect to a signed measure H+ • B ∈ A+loc. The integrator Bp ∈ A+loc is predictable, and if H := χ([0, τ]), then the integral process

(H+ • Bp)(t) := ∫₀^t H+ dBp = Bp(τ ∧ t) − Bp(0) = Bp(τ ∧ t) = (Bp)^τ(t)

is predictable. The processes of type χ([0, τ]) form a π-system which generates the predictable sets. The set of bounded processes H for which H • Bp is predictable is a λ-system. By the Monotone Class Theorem H • Bp is predictable for every bounded predictable process H. By the Monotone Convergence Theorem H+ • Bp is predictable for every predictable, non-negative process H+. By (3.15), if H+ • B ∈ A+loc then H+ • Bp ∈ A+loc. If G is a non-negative, predictable process then by (3.15)

E(∫₀^∞ G d(H+ • B)) = E(∫₀^∞ GH+ dB) = E(∫₀^∞ GH+ dBp) = E(∫₀^∞ G d(H+ • Bp)),

hence, as H+ • Bp ∈ A+loc and the process is predictable,

(H+ • B)p = H+ • Bp.

The general case is evident from the definition of the integration with respect to a signed measure and from the additivity of the compensator operator64.

8. If H is predictable, A ∈ Aloc and H • A ∈ Aloc then the integral process H • (A − Ap) is a local martingale. Indeed

(H • (A − Ap))p = H • (A − Ap)p = H • (Ap − (Ap)p) = H • 0 = 0,

hence by 2. above H • (A − Ap) ∈ L.

64 Let us remark that by this property if H and A ∈ Aloc are predictable and ∫₀^t H dA ∈ Aloc, then the integral process ∫₀^t H dA is predictable. For example, if A ∈ Aloc, H is locally bounded and A and H are predictable then the integral process ∫₀^t H dA is predictable.
THE FUNDAMENTAL THEOREM OF LOCAL MARTINGALES
219
9. If H is predictable, V ∈ V ∩ L and H • V ∈ Aloc then the integral process H • V is a local martingale65. By the assumptions66 V ∈ Aloc. By 7., and using that V ∈ L and hence Vp = 0,

(H • V)p = H • Vp = H • 0 = 0,

hence by 2. H • V ∈ L.

Example 3.55 If the integrand is not predictable and the integrator is a discontinuous local martingale then it is possible that the stochastic integral is not a local martingale.

In the theory of stochastic integration it is very important that if the integrand is predictable and locally bounded and the integrator is a local martingale then the integral is also a local martingale. By 9. this holds if the integrator is in V ∩ L. If the integrand is not predictable then the integral process is not necessarily a local martingale67. Let L(t) := π(t) − t be a compensated Poisson process. Let us denote by τ the time of the first jump of π. If H := −χ(t < τ) then the trajectories of H are right-regular. The process

(H • L)(t) = −∫₀^t χ(s < τ) d(π(s) − s) = t ∧ τ

is not a local martingale: the trajectories are continuous and increasing, hence if it were a local martingale then by Fisk's theorem it would be constant, which is impossible as τ has an exponential distribution.

Example 3.56 Right-regular, adapted process which is not predictable.

In the previous example χ(t < τ) = χ([0, τ)) is right-continuous and adapted, but it cannot be predictable by Property 8.
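The drift in Example 3.55 can be checked numerically: with τ exponentially distributed with parameter 1, the process (H • L)(t) = t ∧ τ has E(t ∧ τ) = ∫₀^t P(τ > s) ds = 1 − e^{−t} > 0 for t > 0, so the integral increases in mean and cannot be a martingale started at 0. The sketch below evaluates the integral by a midpoint rule; the step count is an arbitrary choice.

```python
import math

def expected_min(t, steps=10_000):
    """E(t ∧ tau) for tau ~ Exp(1), via the survival-function formula
    E(t ∧ tau) = ∫_0^t e^{-s} ds, evaluated with a midpoint rule."""
    h = t / steps
    return sum(math.exp(-(i + 0.5) * h) * h for i in range(steps))

t = 2.0
print(expected_min(t), 1 - math.exp(-t))  # both ≈ 0.8647
```

The strictly positive mean confirms the pathwise observation in the text: t ∧ τ is increasing, so by Fisk's theorem it could only be a local martingale if it were constant.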
3.3
The Fundamental Theorem of Local Martingales
In the previous chapter we defined the semimartingales as processes X = X(0) + H + V where V ∈ V and H ∈ H²loc. In this chapter we said that X was a semimartingale if X = X(0) + L + V, where V ∈ V and L ∈ L. Now we are ready to prove that the two definitions are equivalent.

65 See: Example 3.55, page 219.
66 See: Proposition 3.4, page 181.
67 Intuitively, if in a game one is not forced to decide before the jumps of the gain process, then it is possible to make a systematic profit from the jumps.
220
Theorem 3.57 (Fundamental Theorem of Local Martingales) Every local martingale L has a decomposition

L = L(0) + L′ + L′′,  L′, L′′ ∈ L,

where L′ ∈ Aloc and L′′ is locally bounded, hence L′′ ∈ H²loc.

Proof. To make the notation simple let us assume that L(0) = 0. Fix a b > 0 and let (ρn) be a localizing sequence of L. The trajectories of L are regular, hence the number of jumps which have absolute value larger than b, the number of the ‘big jumps’, is finite on every finite interval. Hence one can define the process A consisting of the ‘big jumps’ of L:

A(t) := Σ_{s≤t} ∆L(s) χ(|∆L(s)| > b).

Evidently the trajectories of A have finite variation on every finite interval. In the definition of A the jumps at t are included in A(t), hence A is right-regular and adapted. Therefore A ∈ V. Let us introduce the stopping times

τn := inf{t : Var(A)(t) > n} ∧ inf{t : |L(t)| > n} ∧ ρn.

Obviously

Var(A)(τn) ≤ n + |∆L(τn)| ≤ n + |L(τn)| + |L(τn−)| ≤ 2n + |L(τn)|.

As L^{ρn} is a uniformly integrable martingale, L(τn) is integrable. Hence Var(A)(τn) is integrable, so A ∈ Aloc. As A ∈ Aloc we can take the compensator Ap of A. Let us define L′ := A − Ap ∈ Aloc and let L′′ := L − L′. By the definition of Ap the processes L′ and L′′ are local martingales. We are going to show that the process U := L − A is locally bounded. As U is right-continuous,

σn := inf{t : |U(t)| > n}

is a stopping time. By the definition of σn obviously |U(σn−)| ≤ n. The size of the jumps of U is bounded by b, so

|U(σn)| ≤ |U(σn−)| + |∆U(σn)| ≤ n + b,
221
hence U is really locally bounded. Ap is right-continuous and predictable, hence it is locally bounded68, so L′′ := L − (A − Ap) = U − Ap is also locally bounded.

Example 3.58 The decomposition is not unique.

Let ξ be an integrable but not square-integrable random variable. If

Ft := {∅, Ω} if t < 1,  Ft := A if t ≥ 1,

then

L(t) := E(ξ | Ft) = E(ξ) if t < 1,  ξ if t ≥ 1,

is a martingale which is not in H²loc. If ξ is symmetric then L ∈ L. If η := ξχ(|ξ| ≤ 1) then

L′′(t) := E(η | Ft) = E(η) = 0 if t < 1,  η if t ≥ 1,

is in L and it is bounded, and L′ := L − L′′ has integrable variation. Observe that L′ := 0, L′′ := L is also a good decomposition, which shows that the decomposition is not unique.

Corollary 3.59 Every local martingale L has an H1-localization.

Proof. Let L = L(0) + L′ + L′′ be a decomposition guaranteed by the Fundamental Theorem. L′′ is locally bounded, hence L′′ ∈ H¹loc. L′ ∈ Aloc ∩ L, and if (τn) is a localizing sequence then as

sup_t |(L′)^{τn}(t)| ≤ Var(L′)^{τn}(∞) ∈ L1(Ω),

trivially L′ ∈ H¹loc.

Corollary 3.60 If X is a local martingale and Y is a locally bounded predictable process, then the stochastic integral Y • X is well-defined and Y • X ∈ L. If Y is left-regular, then for any t the random variable (Y • X)(t) is the Itô–Stieltjes integral ∫₀^t Y dX of Y with respect to X.

68 See: Proposition 3.35, page 200.
222
Proof. By the Fundamental Theorem X is a semimartingale in the sense of the previous chapter. Y is locally bounded, hence Y • X is well-defined69. One should only prove the relation Y • X ∈ L. If X = X(0) + V + H, where H ∈ H²loc, then Y • H ∈ H²loc ⊆ L. V ∈ L ∩ V, hence by the last property of the predictable compensator Y • V ∈ L. By definition Y • X = Y • H + Y • V, so Y • X ∈ L.
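For a left-regular integrand and a pure-jump integrator, the Itô–Stieltjes integral in Corollary 3.60 reduces pathwise to a Lebesgue–Stieltjes sum over the jumps: ∫₀^t Y dX = Σ_{τi ≤ t} Y(τi) ∆X(τi). The sketch below evaluates one such path; the jump times and sizes are arbitrary illustrative data.

```python
# One fixed path of a pure-jump integrator X: (time, jump size) pairs.
jumps = [(0.5, 2.0), (1.25, -1.0), (3.0, 0.5)]

def stieltjes_integral(Y, t):
    """Pathwise Lebesgue-Stieltjes integral of Y against X up to time t:
    the sum of Y evaluated at the jump times, weighted by the jump sizes."""
    return sum(Y(s) * dx for s, dx in jumps if s <= t)

Y = lambda s: s  # a continuous (hence left-regular) integrand
print(stieltjes_integral(Y, 2.0))  # 0.5*2.0 + 1.25*(-1.0) = -0.25
```

For paths of finite variation this pathwise computation agrees with the stochastic integral; the content of Corollary 3.60 is that the same value is produced by the general (probabilistic) construction.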
3.4
Quadratic Variation
Perhaps the most important consequence of the Fundamental Theorem is the following:

Corollary 3.61 If X and Y are arbitrary semimartingales70 then [X, Y] exists and

XY − X(0)Y(0) = X− • Y + Y− • X + [X, Y].    (3.18)

The jump process of the quadratic co-variation [X, Y] is

∆[X, Y] = ∆X∆Y.    (3.19)

If X and Y are local martingales then XY − [X, Y] is a local martingale, and [X, Y] is the only process from V for which XY − X(0)Y(0) − [X, Y] is a local martingale and ∆[X, Y] = ∆X∆Y.

Proof. By the Fundamental Theorem X and Y are semimartingales in the sense of the previous chapter. Recall that if X and Y are semimartingales in the sense of the previous chapter then the stochastic integrals X− • Y and Y− • X exist and (3.18) and (3.19) hold71. If X and Y are local martingales then, as X− and Y− are locally bounded, X− • Y and Y− • X are local martingales. If XY − X(0)Y(0) − A ∈ L for some process A ∈ V and ∆[X, Y] = ∆A then [X, Y] − A ∈ L ∩ V is continuous, so by Fisk's theorem [X, Y] − A is constant.

Theorem 3.62 (Fundamental Properties of the Quadratic Variation) If L is a local martingale then:

1. the quadratic variation [L] exists,
2. L² − [L] is a local martingale,
3. [L] is a right-regular increasing process with

[L]^{1/2} ∈ A+loc.    (3.20)

69 See the discussion in 2.4.3.
70 Of course by the new definition.
71 See: Proposition 2.92, page 178.
223
Proof. Let (τn) be a localizing sequence of L. The existence and the right-regularity of [L] are obvious from the previous statements. For every n

σn := inf{t : [L](t) > n} ∧ inf{t : |L|(t) > n} ∧ τn

is a stopping time. As [L] and |L| are right-regular, obviously σn ↗ ∞. As L has limits from the left,

[L]^{1/2}(σn) ≤ √(n + ∆[L](σn)) ≤ √n + √(∆[L](σn)) = √n + |∆L|(σn) ≤ √n + |L(σn)| + |L(σn−)| ≤ √n + |L(σn)| + n.

By the Optional Sampling Theorem the right-hand side is integrable, hence [L]^{1/2} ∈ A+loc.

Example 3.63 Process with finite quadratic variation which is not a semimartingale.
Let X be the right-regular step process over [0, 1) with X(1 − 1/n) := Σ_{k=1}^n (−1)^k/k. Obviously Var(X) = Σ_{n=1}^∞ 1/n = ∞, but [X] = Σ_{n=1}^∞ 1/n² < ∞. As every deterministic semimartingale has finite variation72, X cannot be a semimartingale.

Proposition 3.64 (Characterization of the locally square-integrable martingales) Let L ∈ L. The following statements are equivalent:

1. L ∈ H²loc.
2. [L] ∈ A+loc.
3. There is a predictable process in A+loc, denoted by ⟨L⟩, for which L² − ⟨L⟩ is a local martingale.
4. sup_{s≤t} |L(s)|² ∈ A+loc.
Proof. We show that each statement implies the following one:

1. L(0) = 0, therefore L ∈ H² if and only if E([L](∞)) is finite73, hence L ∈ H²loc if and only if there is a localizing sequence (τn) for which E([L^{τn}](∞)) < ∞ for all n.

2. By the elementary properties of the quadratic variation L² − [L] ∈ L. As [L] ∈ A+loc one can define ⟨L⟩ := [L]p, and of course [L] − [L]p := [L] − ⟨L⟩ ∈ L. Hence L² − ⟨L⟩ is a local martingale.

72 See: Theorem 7.83, page 524. Step 4. of the proof.
73 See: Proposition 2.84, page 170.
224
3. By 3. in the proposition L2 − L ∈ L hence74 Y (t) sups≤t L2 − L + + ∈ A+ loc . L ∈ Aloc and it is increasing so Z (t) sups≤t |L| ∈ Aloc . Obviously 2 sups≤t |L (s)| ≤ Y (t) + Z (t) so 4. follows from 3. 2
4. If (τ n ) is a localizing sequence of U (t) sups≤t |L (s)| then
2 E Lτ n (t) ≤ E (U (τ n )) < ∞, 2 . hence L ∈ Hloc p
2 Corollary 3.65 If H ∈ Hloc then H = [H] . 2 then there is a predictable process with finite Corollary 3.66 If M, N ∈ Hloc variation on finite intervals denoted by M, N such that M N −M, N is a local martingale. p
\[
\langle M, N\rangle = [M, N]^p.
\]

Proof. As M + N, M − N ∈ H²_loc, one can define

\[
\langle M, N\rangle := \frac{1}{4}\left(\langle M+N\rangle - \langle M-N\rangle\right).
\]

Definition 3.67 If M, N ∈ H²_loc then ⟨M, N⟩ is called the predictable quadratic co-variation of M and N. If M ∈ H²_loc then ⟨M⟩ := ⟨M, M⟩ is called the predictable quadratic variation of M.
⁷⁴ See: Example 3.3, page 181.
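Example 3.63 above can be checked numerically. The following sketch is not part of the text (plain Python, with the two series taken in the simplified form Σ1/n and Σ1/n² used by the example); it contrasts the divergent variation series with the convergent quadratic-variation series:

```python
# Numerical illustration of Example 3.63: the step process with values
# X(1 - 1/n) = (-1)^n / n has total variation controlled by the divergent
# harmonic series, while its quadratic variation is the convergent series
# sum 1/n^2 = pi^2/6.
import math

N = 100_000
variation = sum(1.0 / n for n in range(1, N + 1))                 # grows like log N
quadratic_variation = sum(1.0 / n ** 2 for n in range(1, N + 1))  # converges

print(f"Var partial sum:  {variation:.4f}")
print(f"[X] partial sum:  {quadratic_variation:.6f}")
print(f"pi^2/6          = {math.pi ** 2 / 6:.6f}")
```

The first partial sum keeps growing without bound as N increases, while the second stays below π²/6, which is exactly the dichotomy the example exploits.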
4 GENERAL THEORY OF STOCHASTIC INTEGRATION

In this chapter we discuss the general theory of stochastic integration. We shall assume that the integrators are semimartingales, but we shall not assume that the integrands are locally bounded. In the first part of the chapter we shall prove that every local martingale is the sum of a continuous and a purely discontinuous local martingale¹. One can think of purely discontinuous local martingales as sums of continuously compensated single jumps². The quadratic variation of a purely discontinuous local martingale is the sum of the squares of the jumps of the local martingale³. This decomposition is unique, that is, for every local martingale L there is just one continuous L^c ∈ L and one purely discontinuous local martingale L^d ∈ L for which L = L(0) + L^c + L^d. In the second part of the chapter we shall present the integration theory for general local martingales. Our starting point is the decomposition L = L(0) + L^c + L^d. By the second chapter the integral is well-defined when the integrator is a continuous local martingale, so we need to present the integration theory only when the integrator is purely discontinuous.
4.1
Purely Discontinuous Local Martingales
If some process V has finite variation then for every outcome the sum ΣΔV is absolutely convergent. Therefore it is finite, so one can easily define the decomposition

\[
V = \left(V - \sum \Delta V\right) + \sum \Delta V := V^c + V^d,
\]

where V^c is a continuous process with finite variation and V^d is a pure jump process, also with finite variation. The trajectories of local martingales are

¹ See: Definition 4.5, page 228.
² See: Proposition 4.30, page 243.
³ See: Theorem 4.33, page 244.
right-regular, hence one can easily define the jump process ΔL of a local martingale L. Unfortunately, if L is a local martingale then the process

\[
\sum \Delta L\,(t) := \sum_{s \le t} \Delta L(s)
\]

formed from the jumps of L is not necessarily finite⁴ and, which is more important, generally it is not a local martingale. For example, if L is a compensated Poisson process then A := ΣΔL is increasing. Hence it is not a local martingale. In which sense can one define the continuous and the discontinuous part of a local martingale? Assume that a sequence of stopping times (σ_n) covers the jumps of L. First let us define the process

\[
A_1 := \Delta L(\sigma_1)\,\chi([\sigma_1, \infty)).
\]

A_1 is the 'first' jump of L. Obviously⁵

\[
|A_1| \le [L]^{1/2} \in \mathcal{A}^+_{loc}.
\]

Let L_1 := A_1 − A_1^p be the compensated first jump of L. As we know, ΔA_1^p = ᵖ(ΔA_1). Arguing a bit heuristically⁶: as ᵖ(ΔL) = 0, the jumps of local martingales are unpredictable, hence ᵖ(ΔA_1) = 0, so ΔA_1^p = 0, that is, A_1^p is continuous. Let L¹ := L − L_1. As L and L_1 are local martingales, L¹ is also a local martingale. Since A_1^p is continuous, we have deleted from L just the 'first' jump of L. Then let
\[
A_2 := \Delta L(\sigma_2)\,\chi([\sigma_2, \infty)), \qquad L_2 := A_2 - A_2^p, \qquad L^2 := L^1 - L_2, \quad \text{etc.}
\]

Of course, at the moment we do not know whether the sum of the compensated jumps Σ_n L_n is convergent or not. If L^d := Σ_n L_n exists, then it is reasonable to call L^c := L − L^d the continuous part of L, and L^d the discontinuous part of L. If U is an arbitrary continuous local martingale then, as the trajectories of L_n := A_n − A_n^p have finite variation, [U, L_n] = 0 for all n. Hence by the integration by parts formula⁷, U·L_n ∈ L. Assume that the compensators A_n^p are continuous and

⁴ But Σ(ΔL(s))² is finite, as [L] < ∞. In some sense this is the executive summary of the theory of discontinuous local martingales. See also: (1.22) on page 38.
⁵ See: Proposition 3.62, page 222.
⁶ See: Proposition 4.28, page 240.
⁷ See: Corollary 3.61, page 222, Corollary 3.60, page 221.
the jump times [σ_i] and [σ_j] are disjoint if i ≠ j. In this case⁸

\[
[L_i, L_j] = \left[A_i - A_i^p,\; A_j - A_j^p\right] = [A_i, A_j] = 0.
\]

Definition 4.1 (Strong orthogonality) We say that the local martingales M and N are strongly orthogonal if [M, N] = 0.

4.1.1
Orthogonality of local martingales
Recall that if

\[
N, M \in \mathcal{H}^2_0 := \left\{X \in \mathcal{H}^2 : X(0) = 0\right\}
\]

and M and N are strongly orthogonal, then NM = NM − [N, M] is a uniformly integrable martingale⁹. Therefore

\[
(N, M)_{\mathcal{H}^2} := \mathbf{E}(N(\infty)M(\infty)) = \mathbf{E}(N(0)M(0)) = 0.
\]

This means that if N, M ∈ H²_0 and [N, M] = 0 then N and M are orthogonal in the Hilbert space H²_0. In view of these observations the next definition looks very promising:

Definition 4.2 (Orthogonality) We say that the local martingales M and N are orthogonal if the product MN is a local martingale.

Example 4.3 It is possible for some local martingales¹⁰ M and N to be orthogonal while nevertheless [N, M] ≠ 0.
Let π be a Poisson process and let (τ_n) be the sequence of stopping times describing the jumps. Let (ξ_n) be a sequence of independent and identically distributed random variables which are independent of (τ_n). If E(ξ_n) = 0 for all n then the compound Poisson process

\[
M(t) := \sum_n \xi_n\,\chi(\tau_n \le t)
\]

is a martingale. If (η_n) is a similar sequence then

\[
N(t) := \sum_n \eta_n\,\chi(\tau_n \le t)
\]

⁸ See: Example 2.26, page 129.
⁹ See: Proposition 2.84, page 170.
¹⁰ Observe that M and N in the example are purely discontinuous.
is also a martingale. If the sequences (ξ_n) and (η_n) are independent and ζ_n := ξ_n η_n, then E(ζ_n) := E(η_n ξ_n) = E(η_n)E(ξ_n) = 0, therefore the compound Poisson process

\[
[M, N](t) = \sum_n \xi_n \eta_n\,\chi(\tau_n \le t) = \sum_n \zeta_n\,\chi(\tau_n \le t)
\]

is also a martingale, so M and N are orthogonal. Obviously [M, N] ≠ 0.

Proposition 4.4 The local martingale L is orthogonal to itself if and only if, up to indistinguishability, L is a constant¹¹.

Proof. By Proposition 2.48, L and L² are local martingales if and only if L is a constant.

Definition 4.5 A local martingale L is purely discontinuous if L(0) = 0 and L is orthogonal to every continuous local martingale.

It is obvious from the definition that the purely discontinuous local martingales form a linear subspace of the local martingales. We show that in some sense it is the 'orthocomplement' of the subspace of the continuous local martingales.

Corollary 4.6 If L is a continuous, purely discontinuous local martingale then L = 0.

Proof. By the assumption of the proposition L is orthogonal to itself, hence L is constant. For every purely discontinuous local martingale by definition L(0) = 0, so L ≡ 0.

Corollary 4.7 If M and N are purely discontinuous local martingales and ΔM = ΔN, then M and N are indistinguishable.

Proof. L := M − N is a purely discontinuous, continuous local martingale. Therefore L = 0.

Proposition 4.8 Local martingales M and N are orthogonal if and only if [M, N] is a local martingale.

Proof. As MN − [M, N] is always a local martingale¹², MN is a local martingale if and only if [M, N] ∈ L.

¹¹ Of course L is constant in time, that is, L(t) = L(0) for all t.
¹² See: Corollary 3.61, page 222.
Example 4.9 Orthogonality in H²_loc.
Recall¹³ that if M, N ∈ H²_loc then [M, N] ∈ A_loc, and in this case one can define the predictable quadratic co-variation ⟨M, N⟩ := [M, N]^p. If M and N are orthogonal then

\[
\langle M, N\rangle = \left(\langle M, N\rangle - [M, N]\right) + [M, N] \in \mathcal{L}.
\]

Hence ⟨M, N⟩ ∈ V ∩ L. As ⟨M, N⟩ is a predictable local martingale it is continuous¹⁴, so by Fisk's theorem

\[
\langle M, N\rangle = 0.
\]

On the other hand, if ⟨M, N⟩ = 0 then

\[
[M, N] = [M, N] - \langle M, N\rangle = [M, N] - [M, N]^p \in \mathcal{L},
\]

so M and N are orthogonal. Hence we proved that if N, M ∈ H²_loc then M and N are orthogonal if and only if ⟨M, N⟩ = 0.
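A numerical sketch of Examples 4.3 and 4.9 (plain Python, not from the text; the jump times and the ±1 marks ξ_n, η_n are illustrative choices): with an odd number of jumps, the covariation [M, N](t) = Σ_{τ_n ≤ t} ξ_n η_n is an odd sum of ±1 values, hence necessarily nonzero, even though E(ξ_n η_n) = 0 makes the processes orthogonal.

```python
# Sketch of Example 4.3: M(t) = sum_n xi_n 1{tau_n <= t} and
# N(t) = sum_n eta_n 1{tau_n <= t} with independent mean-zero +-1 marks.
# M and N are orthogonal local martingales, yet the covariation
# [M, N](t) = sum of the products of the common jumps is not zero.
import random

random.seed(0)
n_jumps = 7                      # odd on purpose: an odd sum of +-1's is never zero
taus = sorted(random.uniform(0.0, 1.0) for _ in range(n_jumps))
xi = [random.choice([-1, 1]) for _ in range(n_jumps)]
eta = [random.choice([-1, 1]) for _ in range(n_jumps)]

def M(t): return sum(x for x, tau in zip(xi, taus) if tau <= t)
def N(t): return sum(e for e, tau in zip(eta, taus) if tau <= t)

# both paths are pure-jump with finite variation, so the covariation is the
# sum of the products of the simultaneous jumps:
bracket = sum(x * e for x, e in zip(xi, eta))   # [M, N](1)
print("[M, N](1) =", bracket)
```

Since the seven summands are each ±1, the bracket is an odd integer, so [M, N](1) ≠ 0 for every realization of the marks.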
Corollary 4.10 If M is continuous and N is purely discontinuous then [N, M] = 0. N ∈ L is a purely discontinuous local martingale if and only if [M, N] = 0 for every continuous local martingale M. Therefore N ∈ L is a purely discontinuous local martingale if and only if N is strongly orthogonal to every continuous local martingale M.

Proof. As M is continuous, Δ[N, M] = ΔMΔN = 0. As M and N are orthogonal, [N, M] is a continuous local martingale which has finite variation, so by Fisk's theorem it is zero. The rest follows from this.

Theorem 4.11 (Generalized Fisk's theorem) If L ∈ V ∩ L then L is purely discontinuous.

Proof. If L ∈ V then [M, L] = 0 for every continuous local martingale M. Hence 0 = [M, L] ∈ L, which means that M and L are strongly orthogonal. Therefore L is purely discontinuous.

Example 4.12 If A ∈ A_loc, then L := A − A^p ∈ L ∩ V is a purely discontinuous local martingale.
¹³ See: Corollary 3.66, page 224.
¹⁴ See: Corollary 3.40, page 205.
Proposition 4.13 (Orthogonality and localization) The orthogonality of local martingales has the following properties:

1. If τ and σ are stopping times and M and N are orthogonal local martingales, then the stopped processes M^τ and N^σ are also orthogonal.
2. If M and N are local martingales and (τ_n) is a localizing sequence of N, then M and N are orthogonal if and only if M and the stopped processes N^{τ_n} are orthogonal for all n.

Proof. If N and M are orthogonal, then [M, N] is a local martingale. [M^τ, N^σ] = [M, N]^{σ∧τ} is also a local martingale, hence M^τ and N^σ are orthogonal. If M and N^{τ_n} are orthogonal, then [M, N^{τ_n}] = [M, N]^{τ_n} ∈ L for all n. Obviously L_loc = L, so [M, N] ∈ L. Hence M and N are orthogonal.

Every continuous local martingale is locally bounded, hence:

Corollary 4.14 If M is a local martingale then M is purely discontinuous if and only if one of the next statements holds:

1. M is orthogonal to every square-integrable continuous martingale.
2. M is orthogonal to every bounded continuous martingale.

Recall that the space of square-integrable martingales H² is a Hilbert space with the scalar product

\[
(M, N) := (M, N)_{\mathcal{H}^2} := \mathbf{E}(M(\infty)N(\infty)), \qquad M, N \in \mathcal{H}^2.
\]

By Doob's inequality, if M ∈ H² then sup_t |M(t)| ∈ L²(Ω), hence

\[
|MN(t)| \le \sup_t |M(t)| \cdot \sup_t |N(t)| \in L^1(\Omega),
\]

so MN is in class D, as it is dominated by an integrable variable.

Proposition 4.15 (H²-orthogonality) If M ∈ H² and N ∈ H²_0, then the following statements are equivalent:

1. M and N as local martingales are orthogonal.
2. [M, N] is a uniformly integrable martingale.
3. (M^τ, N)_{H²} = 0, that is, M^τ and N as elements of the Hilbert space H² are orthogonal, for every stopping time τ.

Proof. Recall that by definition N ∈ H²_0 if N ∈ H² and N(0) = 0.
1. By the Kunita–Watanabe inequality¹⁵,

\[
|[M, N]| \le \sqrt{[M]}\sqrt{[N]}.
\]

As M, N ∈ H², both¹⁶ [M] = [M − M(0)] and [N] are in A⁺, therefore [M, N] ∈ A, so [M, N] has integrable variation. Hence it is a uniformly integrable local martingale, that is, [M, N] is a uniformly integrable martingale.

2. Assume that [M, N] is a uniformly integrable martingale. In this case [M^τ, N] = [M, N]^τ is also a uniformly integrable martingale, so it is sufficient to prove that if M ∈ H², N ∈ H²_0 and [M, N] is a uniformly integrable martingale, then (M, N)_{H²} = 0. As M, N ∈ H²,

\[
|MN(t)| \le \sup_s |M(s)| \cdot \sup_s |N(s)| \in L^1(\Omega).
\]

This implies that MN − [M, N] is a uniformly integrable martingale. As N(0) = 0,

\[
(M, N)_{\mathcal{H}^2} := \mathbf{E}(MN(\infty)) = \mathbf{E}(MN(\infty)) - \mathbf{E}([M, N](0)) = \mathbf{E}(MN(\infty)) - \mathbf{E}([M, N](\infty)) = \mathbf{E}((MN - [M, N])(\infty)) = \mathbf{E}((MN - [M, N])(0)) = \mathbf{E}(M(0)N(0)) = 0,
\]

hence (M, N)_{H²} = 0.

3. Assume that the third condition holds. If τ is an arbitrary stopping time, then by the Optional Sampling Theorem

\[
\mathbf{E}(M(\tau)N(\tau)) = \mathbf{E}(M(\tau)\,\mathbf{E}(N(\infty) \mid \mathcal{F}_\tau)) = \mathbf{E}(\mathbf{E}(M(\tau)N(\infty) \mid \mathcal{F}_\tau)) = \mathbf{E}(M(\tau)N(\infty)) = \mathbf{E}(M^\tau(\infty)N(\infty)) = (M^\tau, N)_{\mathcal{H}^2} = 0,
\]

hence MN is a martingale.

Example 4.16 The assumption in the third property about all possible stoppings of M is important.
(2.17), page 137. Proposition 2.84, page 170.
232
GENERAL THEORY OF STOCHASTIC INTEGRATION
sense, but they are not orthogonal as local martingales. Perhaps the simplest counterexample is the following: Let M ∈ H02 and let ξ be an F0 -measurable random variable with P(ξ = 1) = P(ξ = −1) = 1/2. Let us assume that ξ is independent of M . As ξ is F0 -measurable N ξM is also in H02 . Since ξ and M are independent (M, N )H2 E(M (∞)N (∞)) E(ξM 2 (∞)) = E(ξ)E(M 2 (∞)) = 0, hence M and N are orthogonal in H02 . On the other hand, unless M = 0, M N is not a martingale. As ξ is F0 -measurable E(M N (t) | F0 ) = ξE(M 2 (t) | F0 ) = 0 = M N (0). 4.1.2
Decomposition of local martingales
If Hc2 denotes the continuous elements of H2 then by Doob’s inequality Hc2 is a closed subspace of H2 . It is not too surprising that the following proposition holds: Proposition 4.17 If Hd2 denotes the set of purely discontinuous elements of H2 then Hd2 is the orthogonal complement of Hc2 . Proof. Let N ∈ Hd2 and M ∈ Hc2 . By the definition of purely discontinuous local martingales N (0) = 0. M and N are orthogonal local martingales hence by the previous proposition (M, N )H2 = 0. On the other hand let us assume that N ∈ H2 is orthogonal to the subspace Hc2 . The constant process M (t) N (0) is a continuous martingale, hence
E N 2 (0) = E (M (∞) N (0)) = E (M (∞) E (N (∞) | F0 )) = = E (M (∞) N (∞)) (M, N )H2 = 0, hence N (0) = 0. If M ∈ Hc2 , then M τ ∈ Hc2 for any stopping time τ , hence (M τ , N ) = 0. Again by the previous proposition N and M are orthogonal local martingales. By Corollary 4.14 N is purely discontinuous. Corollary 4.18 Every M ∈ H2 has a unique decomposition M = M (0) + M c + M d ,
where
M c ∈ Hc2 , M d ∈ Hd2 .
(4.1)
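In Hilbert-space terms, Corollary 4.18 is simply the orthogonal decomposition of M − M(0) with respect to the closed subspace H²_c; the following restatement (a gloss on the corollary, not an additional claim) records the accompanying Pythagorean identity:

```latex
M = M(0) + \underbrace{P_{\mathcal{H}^2_c}\bigl(M - M(0)\bigr)}_{M^c}
         + \underbrace{\bigl(I - P_{\mathcal{H}^2_c}\bigr)\bigl(M - M(0)\bigr)}_{M^d},
\qquad
\bigl\| M - M(0) \bigr\|_{\mathcal{H}^2}^2
  = \| M^c \|_{\mathcal{H}^2}^2 + \| M^d \|_{\mathcal{H}^2}^2 .
```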
Theorem 4.19 (Continuous and purely discontinuous parts of local martingales) Every local martingale L has a decomposition

\[
L = L(0) + L^c + L^d, \qquad L^c, L^d \in \mathcal{L},
\]

where L^c is continuous and L^d is purely discontinuous. If L = L(0) + L^c_1 + L^d_1 and L = L(0) + L^c_2 + L^d_2 are two such decompositions, then L^c_1, L^c_2 and also L^d_1, L^d_2 are indistinguishable.

Proof. If L^c_i, L^d_i, i = 1, 2 are two decompositions of L, then L^c_1 − L^c_2 = L^d_2 − L^d_1, which means that L^c_1 − L^c_2 is purely discontinuous and continuous. Hence¹⁷ L^c_1 − L^c_2 = 0, so L^d_2 − L^d_1 = 0 as well.

1. Let us take the decomposition L = L(0) + L′ + L″ of the Fundamental Theorem, where L′ ∈ H²_loc and L″ ∈ V. As L″ ∈ V, by the generalized Fisk's theorem¹⁸ L″ is purely discontinuous. We may therefore assume that L ∈ H²_loc.

2. Let (τ_n) be an H²-localizing sequence of L, and let L^c_k and L^d_k be the decomposition (4.1) of L^{τ_k} in H². Of course

\[
L^{\tau_k} = \left(L^{\tau_{k+1}}\right)^{\tau_k} = \left(L^c_{k+1}\right)^{\tau_k} + \left(L^d_{k+1}\right)^{\tau_k}.
\]

Obviously (L^c_{k+1})^{τ_k} ∈ H²_c. L^d_{k+1} is orthogonal to every continuous local martingale, hence (L^d_{k+1})^{τ_k} is also orthogonal to every continuous local martingale¹⁹, hence (L^d_{k+1})^{τ_k} is purely discontinuous. As the decomposition (4.1) is unique,

\[
\left(L^d_{k+1}\right)^{\tau_k} = L^d_k, \qquad \left(L^c_{k+1}\right)^{\tau_k} = L^c_k,
\]

so

\[
(L^c)^{\tau_k} := L^c_k \quad \text{and} \quad \left(L^d\right)^{\tau_k} := L^d_k
\]

unambiguously defines the local martingales L^c and L^d.

3. L^c is trivially a continuous local martingale. We show that L^d is purely discontinuous. To prove it, it is sufficient to show²⁰ that L^d is orthogonal to every continuous martingale U ∈ H². (L^d)^{τ_n} ∈ H²_d, hence (L^d)^{τ_n} and U are orthogonal. Hence U and L^d are also orthogonal²¹.

Example 4.20 Purely discontinuous local martingale²² which is not in V.
¹⁷ See: Corollary 4.6, page 228.
¹⁸ See: Corollary 4.11, page 229.
¹⁹ See: Proposition 4.13, page 229.
²⁰ See: Corollary 4.14, page 230.
²¹ See: Corollary 4.14, page 230.
²² See: Example 7.35, page 484.
Let (N_i) be a sequence of independent Poisson processes with λ = 1. For any t, the compensated Poisson processes M_i(t) := N_i(t) − λt = N_i(t) − t on the finite time horizon [0, t] are in H²_0. As they are independent, they almost surely do not have common jumps²³. Obviously [M_i, M_j] = ΣΔM_iΔM_j, therefore if i ≠ j then M_i and M_j are orthogonal local martingales, so they are orthogonal in H²_0. As Σ_i 1/i² < ∞, the sequence

\[
M := \sum_{i=1}^{\infty} \frac{1}{i} M_i
\]

is convergent in the Hilbert space H²_0. Every M_i is in V, so they are purely discontinuous. Therefore M is also purely discontinuous. The variables N_i(t) − t are independent and they have zero expected value. So for any t the sequence

\[
R_n := \sum_{i=1}^{n} \frac{N_i(t) - t}{i}
\]

is a discrete-time martingale. Obviously (R_n) is bounded in L²(Ω), so by the Martingale Convergence Theorem it is convergent almost surely. As Σ_i 1/i = ∞, obviously

\[
\mathrm{Var}(M)(t) \ge \sum_{s \le t} |\Delta M(s)| = \sum_{i=1}^{\infty} \frac{N_i(t)}{i} = \infty.
\]

4.1.3
Decomposition of semimartingales
The decomposition theorem just proved can be transferred to semimartingales.

Theorem 4.21 (Continuous part of semimartingales) If S ∈ S then there is a continuous local martingale L^c such that, for any decomposition of S,

\[
S = S(0) + V + L, \qquad V \in \mathcal{V},\; L \in \mathcal{L},
\]

L^c is the continuous part of L. If L^c_1 and L^c_2 are two such local martingales then L^c_1 and L^c_2 are indistinguishable.

Proof. One should prove only the uniqueness of L^c; the other part of the theorem is trivial. If S(0) + V_1 + L^c_1 + L^d_1 = S(0) + V_2 + L^c_2 + L^d_2

²³ See: Proposition 7.13, page 471.
then

\[
L^c_1 - L^c_2 = V_2 - V_1 + L^d_2 - L^d_1. \tag{4.2}
\]

V_2 − V_1 ∈ V ∩ L, hence V_2 − V_1 is purely discontinuous²⁴; hence the right side of (4.2) is purely discontinuous, the left side is continuous, hence L^c_1 − L^c_2 = 0.

Example 4.22 If S is a semimartingale then the continuous part of S as a 'true semimartingale' is not 'well-defined'.
Let us take our usual counter-example, the Poisson process. If π is a Poisson process and π(t) = λt + (π(t) − λt), then the continuous part of the local martingale part L(t) := π(t) − λt is zero. If π(t) = π(t) + 0, then the continuous part of the local martingale part L = 0 is again zero. In the first case the continuous part of the finite variation part V(t) = λt ∈ V is λt, but in the second case the continuous part of the finite variation part π ∈ V is zero. What is the continuous part of the semimartingale π? In Itô's formula²⁵ we shall use the notation S^c. Let us fix the definition of S^c in the following way:

Definition 4.23 If S is a semimartingale then S^c denotes the continuous part of the local martingale part of S.

Example 4.24 If S := π is a Poisson process then π^c = 0.
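A small simulation may clarify Examples 4.22–4.24 (plain Python sketch, not from the text; the rate λ and the horizon T are arbitrary choices): the two decompositions of a Poisson path have different finite variation parts, yet in both of them the continuous part of the local martingale part — the process S^c of Definition 4.23 — is zero.

```python
# Sketch of Examples 4.22-4.24: pi(t) = lam*t + (pi(t) - lam*t) and
# pi(t) = pi(t) + 0 are both semimartingale decompositions of a Poisson
# process. The FV parts differ (lam*t vs pi itself), but the continuous
# part of the martingale part is 0 in both cases, so pi^c = 0.
import random

random.seed(1)
lam, T = 2.0, 5.0

# jump times of a rate-lam Poisson process on [0, T]
jump_times, t = [], 0.0
while True:
    t += random.expovariate(lam)
    if t > T:
        break
    jump_times.append(t)

def pi(t):  # the Poisson path itself
    return sum(1 for s in jump_times if s <= t)

fv_part_1 = lambda t: lam * t                     # decomposition 1: V(t) = lam*t
fv_part_2 = pi                                    # decomposition 2: V = pi itself
martingale_part_1 = lambda t: pi(t) - lam * t     # compensated Poisson: purely discontinuous
martingale_part_2 = lambda t: 0.0                 # the zero martingale

S_c = 0.0   # continuous part of the martingale part, identical in both cases
print("pi(T) =", pi(T), " pi^c(T) =", S_c)
```

The point of the sketch is only bookkeeping: the ambiguity sits entirely in the finite variation parts, never in S^c.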
4.2
Purely Discontinuous Local Martingales and Compensated Jumps
During the construction of stochastic integrals with respect to local martingales we shall need the next inequality:

Theorem 4.25 (Davis' inequality) There are positive constants c and C such that for every local martingale L ∈ L and for any stopping time τ

\[
c \cdot \mathbf{E}\left(\sqrt{[L](\tau)}\right) \le \mathbf{E}\left(\sup_{t \le \tau} |L(t)|\right) \le C \cdot \mathbf{E}\left(\sqrt{[L](\tau)}\right).
\]

²⁴ See: Theorem 4.11, page 229.
²⁵ More precisely: in Itô's formula one uses only the quadratic variation of the continuous part of the semimartingales, which is independent of the decomposition. See: Corollary 4.36, page 246.
The proof of the inequality is a lengthy calculation which we shall present at the end of this chapter as a separate section. The most important application of Davis' inequality is the following theorem.

Theorem 4.26 (Convergence of strongly orthogonal series) Let us assume that (L_n) ⊆ L and that L_i and L_j are strongly orthogonal whenever i ≠ j, that is,

\[
[L_i, L_j] = 0, \qquad i \ne j.
\]

If √(Σ_{n=1}^∞ [L_n]) ∈ A⁺_loc, then there is an L ∈ L such that, on every compact interval, in the topology of uniform convergence in probability

\[
L = \sum_{n=1}^{\infty} L_n.
\]

If √(Σ_{n=1}^∞ [L_n]) ∈ A⁺, then L is a uniformly integrable martingale and the convergence holds in the topology of uniform convergence in L¹(Ω), that is,

\[
\lim_{m \to \infty} \mathbf{E}\left(\sup_t \left|L(t) - \sum_{n=1}^{m} L_n(t)\right|\right) = 0.
\]
Proof. First let us assume that √(Σ_{n=1}^∞ [L_n]) ∈ A⁺. By Davis' inequality and by the assumption [L_i, L_j] = 0, if m > n then

\[
\mathbf{E}\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) \le C \cdot \mathbf{E}\left(\sqrt{\left[\sum_{i=n}^{m} L_i\right](\infty)}\right) = C \cdot \mathbf{E}\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right).
\]

As √(Σ_{n=1}^∞ [L_n]) ∈ A⁺, by the Dominated Convergence Theorem

\[
\lim_{n,m \to \infty} \mathbf{E}\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right) = 0,
\]

which implies that

\[
\lim_{n,m \to \infty} \mathbf{E}\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) = 0.
\]
As L¹(Ω) is complete, sup_t |Σ_{i=1}^m L_i(t)| is convergent in L¹(Ω). From the convergence in L¹(Ω) one has a subsequence which is almost surely convergent, therefore there is a process L such that for almost all ω

\[
\lim_{k \to \infty} \sup_t \left|\sum_{i=1}^{n_k} L_i(t, \omega) - L(t, \omega)\right| = 0.
\]

L is obviously right-regular and, of course, Σ_{i=1}^n L_i converges to L uniformly in L¹(Ω), that is,

\[
\lim_{n \to \infty} \mathbf{E}\left(\sup_t \left|\sum_{i=1}^{n} L_i(t) - L(t)\right|\right) = 0.
\]

Again by Davis' inequality,

\[
\mathbf{E}\left(\sup_t |L_i(t)|\right) \le C \cdot \mathbf{E}\left([L_i]^{1/2}(\infty)\right) < \infty,
\]

hence L_i is a class D local martingale, hence it is a martingale. From the convergence in L¹(Ω) it follows that L := Σ_{i=1}^∞ L_i is also a martingale.

\[
\mathbf{E}\left(\sup_t |L(t)|\right) \le \mathbf{E}\left(\sup_t \left|\sum_{i=1}^{n} L_i(t)\right|\right) + \mathbf{E}\left(\sup_t \left|L - \sum_{i=1}^{n} L_i\right|(t)\right) < \infty,
\]

hence the limit L is in D, that is, L is a uniformly integrable martingale.

Now let us assume that √(Σ_{n=1}^∞ [L_n]) ∈ A⁺_loc. In this case there is a localizing sequence (τ_k) for which

\[
\sqrt{\sum_{n=1}^{\infty} \left[L_n^{\tau_k}\right]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+.
\]
Observe that (τ_k) is a common localizing sequence for all L_n, that is, √([L_n^{τ_k}]) ∈ A⁺ for all n. Observe also that, by Davis' inequality, L_n^{τ_k} ∈ M for every n and k. By the first part of the proof, for every k there is an L^{(k)} ∈ M such that Σ_{n=1}^∞ L_n^{τ_k} = L^{(k)}. Obviously (L^{(k+1)})^{τ_k} = L^{(k)}, so one can define an L ∈ L for which L^{τ_k} = L^{(k)}. Let us fix an ε and a δ. As τ_k ↗ ∞, for every t < ∞ there is
an n such that P(τ_k ≤ t) ≤ δ/2 whenever k ≥ n. In the usual way, for k ≥ n,

\[
\mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le \mathbf{P}(\tau_k \le t) + \mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\; \tau_k > t\right).
\]

The first probability is smaller than δ/2; the second probability is

\[
\mathbf{P}\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\; \tau_k > t\right),
\]

which is smaller than

\[
\mathbf{P}\left(\sup_{s}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right).
\]

As L_n^{τ_k} → L^{τ_k} uniformly in L¹(Ω), by Markov's inequality

\[
\mathbf{P}\left(\sup_{s}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right) \to 0,
\]

from which one can easily show that for n large enough

\[
\mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) < \delta,
\]
that is, Σ_{k=1}^n L_k → L in the ucp sense, which means that on every compact interval, in the topology of uniform convergence in probability,

\[
\lim_{n \to \infty} \sum_{k=1}^{n} L_k = \sum_{k=1}^{\infty} L_k = L.
\]
Theorem 4.27 (Parseval's identity) Under the conditions of the theorem above, for every t

\[
\lim_{n \to \infty} \left[L - \sum_{k=1}^{n} L_k\right](t) = 0 \tag{4.3}
\]
and

\[
[L](t) \overset{\text{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t), \tag{4.4}
\]

where in both cases the convergence holds in probability.

Proof. By Davis' inequality,

\[
\mathbf{E}\left(\sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}\right) \le \frac{1}{c} \cdot \mathbf{E}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right|\right).
\]
If √(Σ_{n=1}^∞ [L_n]) ∈ A⁺, then by the theorem just proved

\[
\lim_{m \to \infty} \mathbf{E}\left(\sup_{s \le t}\left|L(s) - \sum_{n=1}^{m} L_n(s)\right|\right) = 0.
\]

By Markov's inequality, convergence in L¹(Ω) implies convergence in probability; therefore if √(Σ_{n=1}^∞ [L_n]) ∈ A⁺ then (4.3) holds. Let √(Σ_{n=1}^∞ [L_n]) ∈ A⁺_loc and let (τ_k) be a localizing sequence of √(Σ_{n=1}^∞ [L_n]). Let us fix an ε and a δ. As τ_k ↗ ∞, for every t < ∞ there is a q such that P(τ_k ≤ t) ≤ δ/2 whenever k ≥ q. In the usual way, for k ≥ q,

\[
\mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le \mathbf{P}(\tau_k \le t) + \mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\; \tau_k > t\right).
\]

Obviously

\[
\mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\; \tau_k > t\right) = \mathbf{P}\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\; \tau_k > t\right) \le \mathbf{P}\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right).
\]
By the stopping rule of the quadratic variation,

\[
\sqrt{\sum_{n=1}^{\infty} \left[L_n^{\tau_k}\right]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+,
\]

so by the first part of the proof, if n is large enough,

\[
\mathbf{P}\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le \frac{\delta}{2} + \frac{\delta}{2},
\]

that is, (4.3) holds in the general case. By the Kunita–Watanabe inequality²⁶,

\[
\left|\sqrt{[L](t)} - \sqrt{\left[\sum_{k=1}^{n} L_k\right](t)}\right| \le \sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}.
\]

This implies that

\[
[L](t) = \lim_{n \to \infty} \left[\sum_{k=1}^{n} L_k\right](t) = \lim_{n \to \infty} \sum_{k=1}^{n} [L_k](t) = \sum_{k=1}^{\infty} [L_k](t),
\]

where the convergences hold in probability.

4.2.1
Construction of purely discontinuous local martingales
The cornerstone of the construction of the general stochastic integral is the next proposition:

Proposition 4.28 Let H be a progressively measurable process. There is one and only one purely discontinuous local martingale L ∈ L for which ΔL = H if and only if

1. the set {H ≠ 0} is thin,
2. ᵖH = 0 and
3. √(ΣH²) ∈ A⁺_loc.

Proof. By the definition of thin sets, for every ω there exists just a countable number of points where the trajectory H(ω) is not zero. Hence the sum

\[
\sum H^2\,(t) := \sum_{s \le t} H^2(s)
\]

is meaningful. Observe that from the condition √(ΣH²) ∈ A⁺_loc it implicitly follows that H(0) = 0.

²⁶ See: Corollary 2.36, page 137.
1. The uniqueness of L is obvious, as if purely discontinuous local martingales have the same jumps then they are indistinguishable²⁷.

2. If H := ΔL for some L ∈ L, then ᵖH = ᵖ(ΔL) = 0, and as (ΔL)² = Δ[L] and [L] is increasing,

\[
\sum H^2 = \sum (\Delta L)^2 \le \sum (\Delta L)^2 + [L]^c = [L].
\]
Since28
"
Nm H (ρm ) χ ([ρm , ∞)) . It is worth emphasizing that it is possible that ∪m [ρm ] = {H = 0}. That is, the inclusion {H = 0} ⊆ ∪m [ρm ] can be proper, but ∪m {∆Nm = 0} = {H = 0} . Nm is right-regular, H is progressively measurable, hence the stopped variables " H 2 ∈ A+ H (ρm ) are Fρm -measurable and so Nm is adapted. As loc |Nm | ≤
)
H 2 ∈ A+ loc
for every m, hence Nm has locally integrable variation, so it has a compensator p . Nm p 4. We show that Nm is continuous. If ρm is predictable then the graph [ρm ] of ρm is a predictable set30 so using property 6. of the predictable 27 See:
Corollary 4.7, page 228. (3.20) line, page 222. 29 See: Proposition 3.22, page 189. 30 See: Corollary 3.34, page 199. 28 See:
242
GENERAL THEORY OF STOCHASTIC INTEGRATION
compensator31 up to indistinguishability p ∆ (Nm )=
p
(∆Nm )
p
(H (ρm ) χ ([ρm ])) =
p
(Hχ ([ρm ])) =
= (p H) χ ([ρm ]) = 0 · χ ([ρm ]) = 0. p Hence Nm is continuous. Let ρm be totally inaccessible. As above
p ∆ (Nm )=
p
(∆Nm ) =
p
(Hχ ([ρm ])) .
ρm is totally inaccessible and therefore P (ρm = σ) = 0 for every predictable stopping time σ, hence if σ is predictable then p
0 (Hχ ([ρ ]) (σ) | Fσ− ) = (Hχ ([ρm ])) (σ) E m 0 (0 | Fσ− ) = 0. =E
p By the definition of the predictable projection ∆ (Nm ) = 0. p 5. Let Lm Nm − Nm ∈ L be the compensated jumps. As the compensators are continuous and have finite variation if i = j then [Li , Lj ] = [Ni , Nj ] = 0, and
)
[Lk ] =
)
[Nk ] =
)
H 2 ∈ A+ loc .
Hence32 there is an L ∈ L for which L = k Lk . As the convergence is uniform in probability there is a sequence for which the convergence is almost surely uniform. Hence up to indistinguishability ∆L = ∆
Lk = ∆Lk = H.
Observe that in the last step we have used the fact that {H = 0} = ∪m {∆Nm = 0} = ∪m {∆Lm = 0} . 6. Let us prove that L is purely discontinuous. Let M be a continuous local martingale. Obviously [Lk , M ] = 0. Therefore by the inequality of Kunita and 31 See: 32 See:
page 217. Theorem 4.26, page 236.
LOCAL MARTINGALES AND COMPENSATED JUMPS
243
Watanabe33 and by (4.3) # $ # $ n n Lk + M, Lk = |[M, L]| ≤ M, L − k=1 k=1 * # $ $ +# n n " + , = M, L − Lk ≤ [M ] Lk → 0 L− k=1
k=1
which implies that [M, L] = 0, that is M and L are orthogonal. Hence L is purely discontinuous. Definition 4.29 The following definitions are useful: 1. We say that process X is a single jump if there is a stopping time ρ and an Fρ -measurable random variable ξ such that X = ξχ ([ρ, ∞)). 2. We say that process X is a compensated single jump if there is a single jump Y for which X = Y − Y p . 3. We say that the X is a continuously compensated single jump if Y p in 2. is continuous. Proposition 4.30 (The structure of purely discontinuous local martingales) If L ∈ L is a purely discontinuous local martingale then in the topology of uniform convergence in probability on compact intervals L
∞
Lk ,
k=1
where for all k: 1. Lk ∈ L is a continuously compensated single jump, 2. the jumps of Lk are jumps of L. 3. If i = j then [Li , Lj ] = 0 that is Li and Lj are strongly orthogonal, 2
4. [Lk ] = (∆L (ρk )) χ ([ρk , ∞)), where ρk denotes the stopping time of Lk . & ' 5. If i = j then the graphs [ρi ] and ρj are disjoint. " If [L] ∈ A+ then the convergence holds in the topology of uniform convergence in L1 (Ω). Proof. It is sufficient to remark, that if L ∈ L is purely discontinuous then the jump process of L satisfies the conditions of the above proposition34 . 33 See: 34 See:
Corollary 2.36, page 137. Proposition 4.28, page 240.
4.2.2
Quadratic variation of purely discontinuous local martingales
In this subsection we return to the investigation of the quadratic variation.

Definition 4.31 We say that M is a pure quadratic jump process if

\[
[M] = \sum (\Delta M)^2. \tag{4.5}
\]
Example 4.32 Every V ∈ V is a pure quadratic jump process³⁵.

By (2.14),

\[
[V, V] = \sum \Delta V\,\Delta V = \sum (\Delta V)^2.
\]
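Example 4.32 can be illustrated numerically (plain Python, jump times and sizes chosen arbitrarily): for a finite-variation step path, the sum of squared increments over a partition that separates the jumps is exactly Σ(ΔV)².

```python
# Illustration of Example 4.32: for a pure-jump finite variation path V
# the quadratic variation equals the sum of the squared jumps.
jumps = {0.2: 1.5, 0.5: -0.7, 0.8: 0.3}   # jump time -> jump size (illustrative)

def V(t):
    return sum(size for s, size in jumps.items() if s <= t)

def discrete_qv(n):
    """Sum of squared increments of V over the partition {k/n : k = 0..n}."""
    pts = [k / n for k in range(n + 1)]
    return sum((V(pts[k + 1]) - V(pts[k])) ** 2 for k in range(n))

exact = sum(size ** 2 for size in jumps.values())   # sum (Delta V)^2
print(discrete_qv(10), discrete_qv(1000), exact)
```

Once the mesh is fine enough that each interval contains at most one jump, each increment equals one jump, so the discrete sums agree with Σ(ΔV)² up to floating-point error.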
Theorem 4.33 (Quadratic variation of purely discontinuous local martingales) A local martingale L ∈ L is a pure quadratic jump process if and only if it is purely discontinuous.

Proof. Let L ∈ L.

1. If L is purely discontinuous, then by the structure of purely discontinuous local martingales³⁶, L = Σ_k L_k, where

\[
[L_k, L_j] = \begin{cases} 0 & \text{if } k \ne j, \\ (\Delta L(\rho_k))^2\,\chi([\rho_k, \infty)) & \text{if } k = j. \end{cases}
\]

By Parseval's identity (4.4), for every t,

\[
[L](t) \overset{\text{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t) = \sum_{s \le t} (\Delta L)^2(s).
\]

As both sides of the equation are right-regular, [L] and Σ(ΔL)² are indistinguishable.

2. If L is a pure quadratic jump process, then

\[
[L] = \sum_{s \le t} (\Delta L)^2.
\]

³⁵ See: Proposition 2.33, page 134.
³⁶ See: Proposition 4.30, page 243.
Let L = L^c + L^d be the decomposition of L ∈ L. As L^c is continuous³⁷,

\[
[L] = \left[L^c + L^d\right] = [L^c] + 2\left[L^c, L^d\right] + \left[L^d\right] = [L^c] + \left[L^d\right].
\]

By the part of the theorem already proved,

\[
\left[L^d\right] = \sum \left(\Delta L^d\right)^2 = \sum \left(\Delta L^d + \Delta L^c\right)^2 = \sum (\Delta L)^2.
\]

Hence [L^c] = 0, therefore L^c = 0 and so L = L^d.

Corollary 4.34 If X is a purely discontinuous local martingale, then for every local martingale Y

\[
[X, Y] = \sum \Delta X\,\Delta Y. \tag{4.6}
\]

Proof. Obviously

\[
[X, Y] = \left[X, Y^c + Y^d\right] = [X, Y^c] + \left[X, Y^d\right].
\]

By the definition of the orthogonality, [X, Y^c] is a local martingale. Δ[X, Y^c] = ΔXΔY^c = 0, hence [X, Y^c] is continuous. [X, Y^c] ∈ V ∩ L, so by Fisk's theorem [X, Y^c] = 0. As the purely discontinuous local martingales form a linear space,

\[
\left[X, Y^d\right] = \frac{1}{4}\left(\left[X + Y^d\right] - \left[X - Y^d\right]\right) = \frac{1}{4}\left(\sum \left(\Delta X + \Delta Y^d\right)^2 - \sum \left(\Delta X - \Delta Y^d\right)^2\right) = \sum \Delta X\,\Delta Y^d = \sum \Delta X\left(\Delta Y^d + \Delta Y^c\right) = \sum \Delta X\,\Delta Y.
\]
Proposition 4.35 (Quadratic variation of semimartingales) For every semimartingale X,

\[
[X] = [X^c] + \sum (\Delta X)^2, \tag{4.7}
\]

where, as before³⁸, X^c denotes the continuous part of the local martingale part of X. More generally, if X and Y are semimartingales then

\[
[X, Y] = [X^c, Y^c] + \sum \Delta X\,\Delta Y. \tag{4.8}
\]

³⁷ See: Corollary 4.10, page 229.
³⁸ See: Definition 4.23, page 235.
Proof. Recall39 that every semimartingale X has a decomposition, X = X (0) + X c + H + V, where X c is a continuous local martingale, V ∈ V and H is a purely discontinuous local martingale. By simple calculation [X] = [X c ] + [V ] + [H] + + 2 [X c , H] + 2 [X c , V ] + 2 [H, V ] . As X c is continuous and V has finite variation so [X c , V ] = 0. H is purely discontinuous and X c is continuous, hence by (4.6) [X c , H] = 0. Therefore [X] = [X c ] + [V ] + [H] + 2 [H, V ] . Every process with finite variation is a pure quadratic jump process so [V ] =
2
(∆V ) .
H is purely discontinuous, hence it is also a pure quadratic jump process, so [H] =
2
(∆H) .
As V has finite variation so by (2.14) [H, V ] =
∆H∆V.
Therefore [V ] + [H] + 2 [H, V ] =
2
(∆H + ∆V ) =
2
(∆X) ,
so (4.7) holds. The proof of the general case is similar.

Corollary 4.36 If X is a semimartingale then $[X^c] = [X]^c$. More generally, if X and Y are semimartingales then $[X^c, Y^c] = [X, Y]^c$.
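As a concrete illustration of (4.7) — my own example, not taken from the text — consider a semimartingale built from a Brownian motion and an independent Poisson process; the computation uses only (4.7) and the fact that a compensated Poisson process is a purely discontinuous local martingale.

```latex
% Hypothetical example: X = W + (N - \lambda t), with W a standard Brownian
% motion and N an independent Poisson process with intensity \lambda. The
% local martingale part of X is W + (N - \lambda t); its continuous part is
% X^c = W, so [X^c](t) = t. The jumps of X are the unit jumps of N, hence
\begin{align*}
  [X](t) &= [X^c](t) + \sum_{s \le t} (\Delta X(s))^2 \\
         &= t + \sum_{s \le t} (\Delta N(s))^2 = t + N(t).
\end{align*}
```

In particular $[X]$ is itself a semimartingale, with continuous part $t$ and pure jump part $N$.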
4.3
Stochastic Integration With Respect To Local Martingales
Recall that so far we have defined the stochastic integral with respect to local martingales only when the integrator Y was locally square-integrable. In fact, in this case the construction of the stochastic integral is nearly the same as the construction when the integrator is a continuous local martingale. The only

³⁹ See: Theorem 4.19, page 232.
difference is that when $Y \in \mathcal{H}^2_{loc}$ then one can integrate only predictable processes and one has to consider the condition for the jumps of the integral $\Delta(X \bullet Y) = X \Delta Y$ as well. Recall that if $Y \in \mathcal{H}^2_{loc}$ then a predictable process X is integrable if and only if
\[ X \in L^2_{loc}(Y) := \left\{ Z : Z^2 \bullet [Y] \in \mathcal{A}^+_{loc} \right\}. \]
In this case $X \bullet Y \in \mathcal{H}^2_{loc}$. Observe that the condition $X \in L^2_{loc}(Y)$ is very natural. If M is a local martingale then $M \in \mathcal{H}^2_{loc}$ if and only if⁴⁰ $[M] \in \mathcal{A}^+_{loc}$. As $[X \bullet Y] = X^2 \bullet [Y]$, obviously $X \bullet Y \in \mathcal{H}^2_{loc}$ if and only if $X \in L^2_{loc}(Y)$. As $\Delta(X \bullet Y) = X \Delta Y$, if Y is continuous then $X \bullet Y$ is also continuous.
Let $Y = Y(0) + Y^c + Y^d$ be the decomposition of Y into continuous and purely discontinuous local martingales. As $[Y] \in \mathcal{A}^+_{loc}$ and as
\[ [Y] = [Y^c] + \left[Y^d\right] \tag{4.9} \]
it is obvious that $[Y^c], [Y^d] \in \mathcal{A}^+_{loc}$. This immediately implies that $Y^c$ and $Y^d$ are in $\mathcal{H}^2_{loc}$. From (4.9) it is also clear that $X \in L^2_{loc}(Y)$ if and only if $X \in L^2_{loc}(Y^c)$ and $X \in L^2_{loc}(Y^d)$. This implies that $X \bullet Y^c$ and $X \bullet Y^d$ exist and obviously
\[ X \bullet Y = X \bullet Y^c + X \bullet Y^d. \]
By the construction $X \bullet Y^c$ is continuous. Observe that $X \bullet Y^d$ is a purely discontinuous local martingale, as for any continuous local martingale L
\[ \left[X \bullet Y^d, L\right] = X \bullet \left[Y^d, L\right] = X \bullet 0 = 0, \]
that is $X \bullet Y^d$ is strongly orthogonal to every continuous local martingale.
The goal of this section is to extend the integration to the case when the integrator is an arbitrary local martingale. To do this one should define the stochastic integral for every purely discontinuous local martingale. From the integration procedure we expect the following properties:
1. If $L \in \mathcal{L}$ is purely discontinuous then $X \bullet L \in \mathcal{L}$ should also be purely discontinuous.
2. Purely discontinuous local martingales are uniquely determined by their jumps⁴¹, hence it is sufficient to prescribe the jumps of $X \bullet L$: it is very natural to ask that the formula $\Delta(X \bullet L) = X \Delta L$ should hold.

⁴⁰ See: Proposition 3.64, page 223.
⁴¹ See: Corollary 4.7, page 228.
3. We have proved⁴² that $[L]^{1/2} \in \mathcal{A}^+_{loc}$ for any local martingale L, therefore if $X \bullet L$ is a purely discontinuous local martingale then the expression
\[ [X \bullet L]^{1/2} = \left( \sum (X \Delta L)^2 \right)^{1/2} \]
should have locally integrable variation.
4. If $L \in \mathcal{L}$ then $^p(\Delta L) = 0$. By the jump condition, if X is predictable then
\[ {}^p(\Delta(X \bullet L)) = {}^p(X \cdot \Delta L) = X \cdot \left({}^p(\Delta L)\right) = X \cdot 0 = 0, \]
from which one can expect that one can guarantee only for predictable integrands X that $X \bullet L \in \mathcal{L}$ and $\Delta(X \bullet L) = X \Delta L$.

4.3.1 Definition of stochastic integration
Assume that $L \in \mathcal{L}$ is a purely discontinuous local martingale. As L is a local martingale, $^p(\Delta L)$ is finite and $^p(\Delta L) = 0$. If H is a predictable real-valued process then, as $^p(\Delta L)$ is finite⁴³,
\[ {}^p(H \Delta L) = H\left({}^p(\Delta L)\right) = 0, \]
hence if
\[ \sqrt{\sum H^2 (\Delta L)^2} \in \mathcal{A}^+_{loc}, \]
then there is one and only one purely discontinuous local martingale⁴⁴, denoted by $H \bullet L$, for which $\Delta(H \bullet L) = H \Delta L$. If one expects the properties
\[ H \Delta L = \Delta(H \bullet L) \quad\text{and}\quad (H \bullet L)^d = H \bullet L^d \]
from the stochastic integral $H \bullet L$ then this definition is the only possible one for $H \bullet L$.

Definition 4.37 If L is a purely discontinuous local martingale then $H \bullet L$ is the stochastic integral of H with respect to L.

Definition 4.38 If $L = L(0) + L^c + L^d$ is a local martingale and H is a predictable process for which $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{loc}$ then $H \bullet L := H \bullet L^c + H \bullet L^d$. $H \bullet L$ is the stochastic integral of H with respect to L.

⁴² See: (3.20), page 222.
⁴³ See: Proposition 3.37, page 201.
⁴⁴ See: Proposition 4.28, page 240.
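Definition 4.38 can be made concrete on a finite-activity example (mine, not from the text; the process below is a standard compensated compound Poisson martingale): when L has finitely many jumps on compact intervals, the purely discontinuous integral is literally a sum over the jump times.

```latex
% Hypothetical illustration. Let
%   L(t) = \sum_{\tau_k \le t} \xi_k - \lambda m t,  where  m = E(\xi_k),
% be a compensated compound Poisson process: a purely discontinuous local
% martingale of finite variation with jumps \Delta L(\tau_k) = \xi_k.
% For a locally bounded predictable H the pathwise Stieltjes integral
\[
  (H \bullet L)(t) = \sum_{\tau_k \le t} H(\tau_k)\,\xi_k
                     - \lambda m \int_0^t H(s)\,ds
\]
% has jumps \Delta(H \bullet L)(\tau_k) = H(\tau_k)\,\xi_k = H \Delta L, and
\[
  \sqrt{H^2 \bullet [L]}(t)
    = \Bigl(\sum_{\tau_k \le t} H^2(\tau_k)\,\xi_k^2\Bigr)^{1/2}
    \in \mathcal{A}^+_{loc},
\]
% so H \bullet L is exactly the process produced by Definition 4.38.
```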
Example 4.39 If $X \in \mathcal{V}$ is predictable⁴⁵ and L is a local martingale then $\Delta X \bullet L = \sum \Delta X \Delta L$.

1. The trajectories of L are right-regular, therefore they are bounded on finite intervals⁴⁶. As $X \in \mathcal{V}$, obviously $\Delta L \bullet X$ exists and $\sum \Delta X \Delta L = \Delta L \bullet X$. X is predictable and right-regular, therefore it is locally bounded⁴⁷. As Var(X) is also predictable and right-regular, it is also locally bounded.
2. $|\Delta X| \le \mathrm{Var}(X)$, which implies that $\Delta X \bullet L$ is well-defined. Let $L = L(0) + L^c + L^d$ be the decomposition of L. For any local martingale N, $\Delta X \bullet [L^c, N] = 0$, hence $\Delta X \bullet L^c = 0$. Therefore one can assume that L is purely discontinuous.
\[ \sum |\Delta X \Delta L| \le \sqrt{\sum (\Delta X)^2}\,\sqrt{\sum (\Delta L)^2} \le \sqrt{[X]}\,\sqrt{[L]} < \infty. \]
Obviously $\Delta\left(\sum \Delta X \Delta L\right) = \Delta X \Delta L$. As $\sum \Delta X \Delta L$ has finite variation, if it is a local martingale then it is a purely discontinuous local martingale. Therefore we should prove that $\sum \Delta X \Delta L$ is a local martingale, that is, that $\Delta L \bullet X$ is a local martingale.
3. With localization one can assume that X and Var(X) are bounded. As X and Var(X) are bounded,
\[ |\Delta L| \bullet \mathrm{Var}(X) = \sum |\Delta X||\Delta L| \le \sqrt{\sum (\Delta X)^2}\,\sqrt{[L]} \le \sqrt{\sup |X| \cdot \mathrm{Var}(X)}\,\sqrt{[L]} \in \mathcal{A}^+_{loc}. \]
Hence with further localization we can assume that $\Delta L \bullet X \in \mathcal{A}$. If τ is a stopping time then
\[ E\left((\Delta L \bullet X)(\tau)\right) = E\left((\Delta L \bullet X^\tau)(\infty)\right). \]
As $X^\tau$ is also predictable⁴⁸, one should prove that if $\Delta L \bullet X \in \mathcal{A}$ and X is predictable, then $E\left((\Delta L \bullet X)(\infty)\right) = 0$. By Dellacherie's formula⁴⁹, using that L is a local martingale, hence $^p(\Delta L) = 0$,
\[ E\left((\Delta L \bullet X)(\infty)\right) = E\left(\left({}^p(\Delta L) \bullet X\right)(\infty)\right) = 0. \]
That is $\Delta L \bullet X = \sum \Delta X \Delta L$ is a local martingale.

⁴⁵ If X is not predictable then ΔX is also not predictable, so $\Delta X \bullet L$ is undefined.
⁴⁶ See: Proposition 1.6, page 5.
⁴⁷ See: Proposition 3.35, page 200.
⁴⁸ See: Proposition 1.39, page 23.
⁴⁹ See: Proposition 5.9, page 301.

4.3.2 Properties of stochastic integration
Let us discuss the properties of stochastic integration with respect to local martingales:

1. If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{loc}$ then the definition is meaningful and $H \bullet L \in \mathcal{L}$. Specifically, every locally bounded predictable process is integrable⁵⁰. For any local martingale L
\[ [L] = [L^c] + \sum (\Delta L)^2. \tag{4.10} \]
The integral $H^2 \bullet [L^c]$ is finite, hence the integral $H \bullet L^c$ exists⁵¹. By (4.10)
\[ \sqrt{H^2 \bullet [L^d]} = \sqrt{\sum H^2 (\Delta L)^2} = \sqrt{\sum (H \Delta L)^2} \in \mathcal{A}_{loc}, \]
hence $H \bullet L^d$ is also meaningful. Both integrals are local martingales, hence the sum $H \bullet L := H \bullet L^c + H \bullet L^d$ is also a local martingale. The second observation easily follows from the relation $\sqrt{[L]} \in \mathcal{A}^+_{loc}$.
2. $H \Delta L = \Delta(H \bullet L)$.
3. $(H \bullet L)^c = H \bullet L^c$ and $(H \bullet L)^d = H \bullet L^d$.
4. $[H \bullet L] = H^2 \bullet [L]$.
\[ [H \bullet L] = [(H \bullet L)^c] + \sum (\Delta(H \bullet L))^2 = H^2 \bullet [L^c] + \sum (H \Delta L)^2 = H^2 \bullet [L^c] + H^2 \bullet \left[L^d\right] = H^2 \bullet [L]. \]
5. $H \bullet L$ is the only process in $\mathcal{L}$ for which $[H \bullet L, N] = H \bullet [L, N]$ holds for every $N \in \mathcal{L}$. By the inequality of Kunita and Watanabe
\[ |H| \bullet \mathrm{Var}([L, N]) \le \sqrt{H^2 \bullet [L]}\,\sqrt{[N]}, \]

⁵⁰ $\sqrt{[M]} \in \mathcal{A}^+_{loc}$ for any local martingale M, hence the present construction of $H \bullet L$ is maximal in H, that is if one wants to extend the definition of the stochastic integral to a broader class of integrands H, then $H \bullet L$ will not necessarily be a local martingale.
⁵¹ See: Corollary 2.67, page 158.
hence the integral $H \bullet [L, N]$ is meaningful. Therefore
\[ [H \bullet L, N] = [(H \bullet L)^c, N^c] + \left[(H \bullet L)^d, N^d\right] = [H \bullet L^c, N^c] + \sum H \Delta L \Delta N = H \bullet [L^c, N^c] + H \bullet \left[L^d, N^d\right] = H \bullet \left( [L^c, N^c] + \left[L^d, N^d\right] \right) = H \bullet [L, N]. \]
If $H \bullet [L, N] = [Y, N]$ for some local martingale Y, then $[Y - H \bullet L, N] = 0$. Hence if $N := Y - H \bullet L$ then $[Y - H \bullet L] = 0$. $Y - H \bullet L$ is a local martingale, therefore⁵² $Y - H \bullet L = 0$.
6. If τ is an arbitrary stopping time and $H \bullet L$ exists then
\[ H \bullet L^\tau = (H \bullet L)^\tau = (\chi([0, \tau])H) \bullet L. \]
If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}_{loc}$, then trivially
\[ \sqrt{H^2 \bullet [L^\tau]} = \sqrt{\left(\chi([0, \tau])H^2\right) \bullet [L]} \in \mathcal{A}_{loc}, \]
so the integrals above exist. By the stopping rule of the quadratic variation, if $N \in \mathcal{L}$
\[ [(H \bullet L)^\tau, N] = [H \bullet L, N]^\tau = (H \bullet [L, N])^\tau = H \bullet [L, N]^\tau = H \bullet [L^\tau, N] = [H \bullet L^\tau, N], \]
hence by the bilinearity of the quadratic variation
\[ [(H \bullet L)^\tau - H \bullet L^\tau, N] = 0, \qquad N \in \mathcal{L}, \]
from which $(H \bullet L)^\tau = H \bullet L^\tau$. For arbitrary $N \in \mathcal{L}$
\[ [H \bullet L^\tau, N] = H \bullet [L^\tau, N] = H \bullet [L, N]^\tau = (\chi([0, \tau])H) \bullet [L, N] = [(\chi([0, \tau])H) \bullet L, N], \]
hence again $H \bullet L^\tau = (\chi([0, \tau])H) \bullet L$ from Property 5.

⁵² See: Proposition 2.82, page 170.
7. The integral is linear in the integrand. By elementary calculation
\[ \sqrt{(H_1 + H_2)^2 \bullet [L]} \le \sqrt{H_1^2 \bullet [L]} + \sqrt{H_2^2 \bullet [L]}, \]
hence if $H_1 \bullet L$ and $H_2 \bullet L$ exist then the integral $(H_1 + H_2) \bullet L$ also exists. When the integrator is continuous the integral is linear. The linearity of the purely discontinuous part is a simple consequence of the relation
\[ (H_1 + H_2)\Delta L = H_1 \Delta L + H_2 \Delta L. \]
The proof of the homogeneity is analogous.
8. The integral is linear in the integrator. By the inequality of Kunita and Watanabe⁵³
\[ [L_1 + L_2] \le 2([L_1] + [L_2]), \]
hence if the integrals $H \bullet L_1$ and $H \bullet L_2$ exist then $H \bullet (L_1 + L_2)$ also exists. The decomposition of the local martingales into continuous and purely discontinuous martingales is unique, so $(L_1 + L_2)^c = L_1^c + L_2^c$ and $(L_1 + L_2)^d = L_1^d + L_2^d$. For continuous local martingales we have already proved the linearity; the linearity of the purely discontinuous part is evident from the relation $\Delta(L_1 + L_2) = \Delta L_1 + \Delta L_2$.
9. If $H := \sum_i \xi_i \chi((\tau_i, \tau_{i+1}])$ is an adapted simple process then
\[ (H \bullet L)(t) = \sum_i \xi_i \left( L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t) \right). \tag{4.11} \]
By the linearity it is sufficient to calculate the integral just for one jump. For the continuous part we have already deduced the formula. For the discontinuous part it is sufficient to remark that if $\xi_i$ is $\mathcal{F}_{\tau_i}$-measurable and L is a purely discontinuous local martingale then $\xi_i(L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t))$ is a purely discontinuous local martingale⁵⁴, with jumps $\xi_i \chi((\tau_i, \tau_{i+1}]) \Delta L$.
10. Assume that the integral $H \bullet L$ exists. The integral $K \bullet (H \bullet L)$ exists if and only if the integral $(KH) \bullet L$ exists. In this case
\[ (KH) \bullet L = K \bullet (H \bullet L). \]
Let us remark that, as the integrals are pathwise integrals with respect to processes with finite variation,
\[ \sqrt{K^2 \bullet (H^2 \bullet [L])} = \sqrt{(KH)^2 \bullet [L]}. \]

⁵³ See: Corollary 2.36, page 137.
⁵⁴ The space of purely discontinuous local martingales is closed under stopping.
$K \bullet (H \bullet L)$ exists if and only if
\[ \sqrt{K^2 \bullet [H \bullet L]} = \sqrt{K^2 \bullet (H^2 \bullet [L])} = \sqrt{(KH)^2 \bullet [L]} \in \mathcal{A}^+_{loc}, \]
from which the first part is evident. If N is an arbitrary local martingale then
\[ [K \bullet (H \bullet L), N] = K \bullet [H \bullet L, N] = KH \bullet [L, N] = [KH \bullet L, N], \]
from which the second part is evident.
11. If τ is an arbitrary stopping time then
\[ H \bullet L^\tau = (\chi([0, \tau])H) \bullet L = (H \bullet L)^\tau. \]
If N is an arbitrary local martingale, then
\[ [H \bullet L^\tau, N] = H \bullet [L^\tau, N] = H \bullet [L, N]^\tau = H\chi([0, \tau]) \bullet [L, N] = [H\chi([0, \tau]) \bullet L, N] = (H \bullet [L, N])^\tau = [H \bullet L, N]^\tau = [(H \bullet L)^\tau, N], \]
from which the property is evident.
2 τ E sup |((Hn − H∞ ) • L ) (t)| ≤ C · E (Hn − H∞ ) • [L] (∞) .
τ
t
"
τ H 2 • [L] m (∞) < ∞, hence by There is a localizing sequence (τ m ), that E the classical Dominated Convergence Theorem E
) 2 τ (Hn − H∞ ) • [L] m (∞) → 0
254
GENERAL THEORY OF STOCHASTIC INTEGRATION
hence L
sup |((Hn − H∞ ) • Lτ m ) (t)| →1 0, t
from which as in the continuous case55 one can guarantee on every compact interval the uniform convergence in probability. 13. The definition of the integral is unambiguous that is if L ∈ V ∩ L then the two possible concepts of integration give the same result. It is trivial from Proposition 2.89. 14. If X is left-continuous and locally bounded then (X • L) (t) is an Itˆ o– Stieltjes integral for every t where the convergence of the approximating sums is uniform in probability on every compact interval. The approximating partitions can be random as well. The proof is the same as in the continuous case56 .
4.4
Stochastic Integration With Respect To Semimartingales
Recall the definition of stochastic integration with respect to semimartingales: Definition 4.40 If semimartingale X has a decomposition X = X (0) + L + V,
V ∈ V, L ∈ L
for which the integrals H • L and H • V exist then H • X H • L + H • V. By Proposition 2.89 the next statement is trivial57 : Proposition 4.41 For predictable integrands the definition is unambiguous, that is the integral is independent of the decomposition of the integrator. Proposition 4.42 If X and Y are arbitrary semimartingales and the integrals U • X and V • Y exist, then [U • X, V • Y ] = U V • [X, Y ] . Proof. Let XL + XV , and YL + YV be the decomposition of X and Y . [U • X, V • Y ] = [U • XL , V • YL ] + [U • XL , V • YV ] + + [U • XV , V • YL ] + [U • XV , V • YV ] . 55 See:
Proposition 2.74. page 162. Proposition 2.77, page 166. 57 See: Subsection 2.4.3, page 176. 56 See:
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
255
For integrals with respect to local martingales [U • XL , V • YL ] = U V • [XL , YL ] . In the three other expressions one factor has finite variation, hence the quadratic variation is the sum of the products of the jumps58 . For example [U • XL , V • YV ] =
∆ (U • XL ) ∆ (V • YV ) =
(U ∆XL ) (V ∆YV ) .
On the other hand for the same reason U V • [XL , YV ] = U V •
∆XL ∆YV
=
U V ∆XL ∆YV ,
hence [U • XL , V • YV ] = U V • [XL , YV ] . One can finish the proof with the same calculation for the other tags. Observe that the existence of the integral H • X means that for some decomposition X = X (0) + L + V one can define the integral and the existence of the integral does not mean that in every decomposition of X the two integrals are meaningful. Observe also that with the definition we extended the class of integrable processes even for local martingales. It is possible that the integral H • L as an integral with respect to the local martingale L does not exist, but L has a decomposition L = L (0) + M + V, M ∈ L, V ∈ V for which H is integrable with respect to M and V . Of course in this general case we cannot guarantee that59 H • L ∈ L. Example 4.43 If the integrand is not locally bounded then the stochastic integral with respect to a local martingales is not necessarily a local martingale.
Let M be a compound Poisson process, where P (ξ k = ±1) = 1/2 for the jumps ξ k . M is a martingale and the trajectories of M are not continuous. Let τ 1 be the time of the first jump of M and let X (t, ω) 58 See: 59 See:
line (2.14), page 134. Example 4.43, page 255.
1 · χ ((0, τ 1 (ω)]) . t
256
GENERAL THEORY OF STOCHASTIC INTEGRATION
X is predictable but it is not locally bounded. As the trajectories of M have finite variation the pathwise stochastic integral 1 χ ((0, τ 1 (ω)]) dM (s, ω) = L (t, ω) (X • M ) (t, ω) = (0,t] s 0 if t < τ 1 (ω) = ξ 1 (ω) /τ 1 (ω) if τ 1 (ω) ≤ t is meaningful. We prove that L is not a local martingale. If (ρk ) would be a localization of L then Lρ1 was a uniformly integrable martingale. Hence for the stopping time σ ρ1 ∧ t E (L (σ)) E (L (ρ1 ∧ t)) = E (Lρ1 (t)) = E (L (0)) = 0. Therefore it is sufficient to prove that for any finite stopping time σ = 0 E (|L (σ)|) = ∞.
(4.12)
Let σ be a finite stopping time with respect to the filtration F generated by M . 1 1 E (|L (σ)|) = χ (τ 1 ≤ σ) dP ≥ χ (τ 1 ≤ σ ∧ τ 1 ) dP. τ τ Ω 1 Ω 1 Hence to prove (4.12) one can assume that σ ≤ τ 1 . In this case σ is Fτ 1 measurable. Hence it is independent of the variables (ξ n ). So one can assume that σ is a stopping time for the filtration generated by the point process part of M . By the formula of the representation of stopping times of point processes60 σ = ϕ0 χ (σ < τ 1 ) +
∞
χ (τ n ≤ σ < τ n+1 ) ϕn (τ 0 , . . . , τ n )
n=1
∞
χ (τ n ≤ σ < τ n+1 ) ϕn (τ 0 , . . . , τ n ) =
n=0
= ϕ0 χ (σ < τ 1 ) + χ (σ ≥ τ 1 ) ϕ1 (τ 1 ) . From this {τ 1 ≤ ϕ0 } ⊆ {τ 1 ≤ σ}. If ϕ0 > 0 then using that τ 1 has an exponential distribution 1 1 E (|L (σ)|) = χ (τ 1 ≤ σ) dP ≥ χ (τ 1 ≤ ϕ0 ) dP = τ τ Ω 1 Ω 1 ϕ0 1 = λ exp (−λx) dx = ∞. x 0 60 See:
Proposition C.6, page 581.
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
257
σ = 0 and F0 = {∅, Ω}, therefore {σ ≤ 0} = ∅. Hence σ > 0, so if ϕ0 = 0 then σ ≥ τ 1 . Hence again 1 1 χ (τ 1 ≤ σ) dP = dP = ∞. E (|L (σ)|) = Ω τ1 Ω τ1 By the definition of the integral it is clear that if a process H is integrable with respect to semimartingales X1 and X2 then H is integrable with respect to aX1 + bX2 for every constants a, b and H • (aX1 + bX2 ) = a (H • X1 ) + b (H • X2 ) . Observe that by the above definitions the other additivity of the integral, that is the relation (H1 + H2 ) • X = H1 • X + H2 • X is not clear. Our direct goal in the following two subsections is to prove this additivity property of the integral. 4.4.1
Integration with respect to special semimartingales
Recall that by definition S is a special semimartingale if it has a decomposition S = S (0) + V + L,
V ∈ V, L ∈ L
(4.13)
where V is predictable. Theorem 4.44 (Characterization of special semimartingales) Let S be a semimartingale. The next statements are equivalent: 1. S is a special semimartingale, i.e. there is a decomposition (4.13) where V is predictable. 2. There is a decomposition (4.13), where V ∈ Aloc . 3. For all decompositions (4.13) V ∈ Aloc . 4. S ∗ (t) sups≤t |S (s) − S (0)| ∈ A+ loc . Proof. We prove the equivalence of the statements backwards. 1. Let us assume that the last statement holds, and let S = S (0) + V + L be a decomposition of S. Let L∗ (t) sups≤t |L (s)|. L∗ is in61 A+ loc , hence from the assumption of the fourth statement V ∗ (t) sup |V (s)| ≤ S ∗ (t) + L∗ (t) ∈ A+ loc . s≤t
61 See:
Example 3.3, page 181.
258
GENERAL THEORY OF STOCHASTIC INTEGRATION
The process Var (V )− is increasing and continuous from the left, hence it is locally bounded, hence Var (A)− ∈ A+ loc . As Var (V ) ≤ Var (V )− + ∆ (Var (V )) ≤ Var (V )− + 2V ∗ Var (V ) ∈ A+ loc , hence the third condition holds. 2. From the third condition the second one follows trivially. 3. If V ∈ Aloc in the decomposition S = S (0)+V +L, then V p , the predictable compensator of V , exists. V − V p is a local martingale, hence S = S (0) + V p + (V − V p + L) is a decomposition where V p ∈ V is predictable, so S is a special semimartingale. 4. Let us assume that S (0) = 0 so S = V + L. If V ∗ (t) sups≤t |V (s)|, then as V ∗ ≤ Var (V ) S ∗ ≤ V ∗ + L∗ ≤ Var (V ) + L∗ . L∗ ∈ A+ loc , so it is sufficient to prove that if V ∈ V is predictable then Var (V ) ∈ A+ loc . It is sufficient to prove that Var (V ) is locally bounded. V is continuous from the right, hence when one calculates Var (V ) it suffices to use the partitions with dyadic rationals and hence if V is predictable then Var (V ) is also predictable. Var (V ) is right-continuous and predictable hence it is locally bounded62 . Example 4.45 X ∈ V is a special semimartingale if and only if X ∈ Aloc . A compound Poisson process is a special semimartingale if and only if the expected value of the distribution of the jumps is finite.
The first remark is evident from the theorem. Recall, that a compound Poisson process has locally integrable variation if and only if the distribution of the jumps has finite expected value63 . Example 4.46 If a semimartingale S is locally bounded then S is a special semimartingale.
Example 4.47 If a semimartingale S has bounded jumps then S is a special semimartingale64 .
62 See:
Proposition 3.35, page 200. Example 3.2, page 180. 64 See: Proposition 1.152, page 107. 63 See:
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
259
Example 4.48 Decomposition of continuous semimartingales.
Recall that by definition S is a continuous semimartingale if S has a decomposition S = S (0) + V + L, where V ∈ V, L ∈ L and V and L are continuous65 . Let S now be a semimartingale and let us assume that S is continuous. As S is continuous it is locally bounded, so S is a special semimartingale. By the just proved proposition S has a decomposition S (0) + V + L, where V ∈ V is predictable and L ∈ L. As S is continuous L is also predictable, hence it is continuous66 . This implies that V is also continuous. This means that S is a continuous semimartingale. The stochastic integral X • Y is always a semimartingale. One can ask: when is it a special semimartingale? Theorem 4.49 (Integration with respect to special semimartingales) Let X be a special semimartingale. Assume that for a predictable process H the integral H • X exists. Let X X (0) + A + L be the canonical decomposition of X. H • X is a special semimartingale if and only if the integrals H • A and H • L exist and H • L is a local martingale. In this case the canonical decomposition of H • X is exactly H • A + H • L. Proof. Let us first remark that if U and W are predictable and W ∈ V and the integral U • W exists then it is predictable. This is obviously true if U χ ((s, t]) χF ,
F ∈ Fs
as67
U • W = χF W t − W s = (χF χ ((s, ∞))) W t − W s . The general case follows from the Monotone Class Theorem. Assume that the integral68 Z H •X H •V +H •M exists and it is a special semimartingale. Let Z B + N be the canonical decomposition of Z. B ∈ Aloc and B is predictable. χ (|H| ≤ n) is bounded and predictable, hence the integral χ (|H| ≤ n) • Z χ (|H| ≤ n) • B + χ (|H| ≤ n) • N 65 See:
Definition 2.18, page 124. 3.40, page 205. 67 See: Proposition 1.39, page 23. 68 With some decomposition X = X (0) + V + M. 66 See:
260
GENERAL THEORY OF STOCHASTIC INTEGRATION
exists. χ (|H| ≤ n) is bounded, B ∈ Aloc hence χ (|H| ≤ n) • B ∈ Aloc . As χ (|H| ≤ n) and B are predictable χ (|H| ≤ n) • B is also predictable. Let Hn Hχ (|H| ≤ n). Hn is bounded and predictable hence the integral Hn • X Hn • A + Hn • L is meaningful. Hn • A ∈ Aloc and Hn • A is predictable and Hn • L ∈ L so Hn • X is a special semimartingale and Hn • A + Hn • L its canonical decomposition. By the associativity rule of the integration with respect to local martingales and processes with finite variation, and by the linearity in the integrator χ (|H| ≤ n) • Z χ (|H| ≤ n) • (H • X) χ (|H| ≤ n) • (H • V + H • M ) = = χ (|H| ≤ n) • (H • V ) + χ (|H| ≤ n) • (H • M ) = = (χ (|H| ≤ n) H) • V + (χ (|H| ≤ n) H) • M (χ (|H| ≤ n) H) • X Hn • X = Hn • A + Hn • L. The canonical decomposition of special semimartingales is unique, hence χ (|H| ≤ n) • B = Hn • A,
χ (|H| ≤ n) • N = Hn • L.
As we have seen χ (|H| ≤ n) H 2 • [L] Hn2 • [L] = [Hn • L] = [χ (|H| ≤ n) • N ] = = χ (|H| ≤ n) • [N ] ≤ [N ] . " " [N ] ∈ A+ H 2 • [L] ∈ A+ loc , so by the Monotone Convergence Theorem loc and therefore the integral H • L ∈ L exists, and by the Dominated Convergence Theorem N = H • L. Similarly, H • A exists, it is in Aloc and H • A = B. If H and A are predictable then H • A is predictable hence the other implication is evident. Corollary 4.50 Let L be a local martingale and let us assume that the integral H • L exists. H • L is a local martingale if and only if sups≤t |(H • L) (s)| is locally integrable, that is sup |(H • L) (s)| ∈ A+ loc . s≤t
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
261
Proof. As sups≤t |M (s)| is locally integrable69 for every local martingale M ∈ L one should only prove that if sups≤t |(H • L) (s)| is locally integrable then H • L is a local martingale. X L is a special semimartingale with canonical decomposition X = L + 0. Hence H • L is a local martingale if and only if Y H • L is a special semimartingale. But as Y (0) = 0, the process Y is a special semimartingale70 if and only if sups≤t |Y (s)| ∈ A+ loc . 4.4.2
Linearity of the stochastic integral
The most important property of every integral is the linearity in the integrand. Now we are ready to prove this important property: Theorem 4.51 (Additivity of stochastic integration) Let X be an arbitrary semimartingale. If H1 and H2 are predictable processes and the integrals H1 • X and H2 • X exist, then for arbitrary constants a and b the integral (aH1 + bH2 ) • X exists and (aH1 + bH2 ) • X = a (H1 • X) + b (H2 • X) .
(4.14)
Proof. Let B {|∆X| > 1, |∆ (H1 • X)| > 1, |∆ (H2 • X)| > 1} be the set of the ‘big jumps’. Observe that ∆ (Hi • X) ∆ (Hi • Vi + Hi • Li ) = = ∆ (Hi • Vi ) + ∆ (Hi • Li ) = = Hi ∆Vi + Hi ∆Li = Hi ∆X, so B = {|∆X| > 1, |H1 ∆X| > 1, |H2 ∆X| > 1} . Obviously for an arbitrary ω the section B (ω) does not have an accumulation point. Let us separate the ‘big jumps’ from X. That is let X
∆XχB ,
X X − X.
∈ V and the integrals Hk • X Observe that, by the simple structure of B, X are simple sums, so they exist. By the construction of the stochastic integral 69 See: 70 See:
Example 3.3, page 181. Theorem 4.44, page 257.
262
GENERAL THEORY OF STOCHASTIC INTEGRATION
Hk • X also exists71 . As the jumps of the X are bounded, X is a special semimartingale72 .
= ∆ Hk • X = Hk ∆X = Hk ∆ X − X = Hk ∆XχB c , hence the jumps of Hk • X are also bounded and therefore the processes Hk • X are also special semimartingales. Let X = X (0) + A + L be the canonical decomposition of X. By the previous theorem integrals Hk • A and Hk • L also exist. The integration with respect to local martingales and with respect to processes with finite variation is additive, hence (H1 + H2 ) • A = H1 • A + H2 • A, (H1 + H2 ) • L = H1 • L + H2 • L, which of course means that the integrals on the left-hand side exist. The integrals are ordinary sums, hence Hk • X = H1 • X + H2 • X. (H1 + H2 ) • X Adding up these three lines above and using that the integral is additive in the integrator we get (4.14). The homogeneity of the integral is obvious by the definition of the integral. 4.4.3
The associativity rule
Like additivity, the associativity rule is also not directly evident from the definition of the stochastic integral. Theorem 4.52 (Associativity rule) Let X be an arbitrary semimartingale and let us assume that the integral H • X exists. The integral K • (H • X) exists if and only if the integral (KH) • X exists. In this case K • (H • X) = (KH) • X. • L ≤ H 2 • [L] and Var V ≤ Var (V )! 72 See: Example 4.47, page 258. 71 H 2
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
263
Proof. Assume that K is integrable with respect to the semimartingale Y H • X. Let B be again the set of the ‘big jumps’, that is B {|∆X| > 1, |∆Y | > 1, |∆ (K • Y )| > 1} . As in the previous subsection for every ω the section B (ω) is a discrete set. Let us define the processes X Y
χB ∆X,
X X − X,
χB ∆Y,
Y Y − Y .
Using the formula for the jumps of the integrals and the additivity of the integral in the integrator = H • X. Y Y − Y = H • X − H • X As the jumps of X are bounded, X is a special semimartingale. Let X = X (0) + A + L be the canonical decomposition of X. By the same reason Y is also a special semimartingale and as we saw above the canonical decomposition of Y is Y = H • X = H • A + H • L. The integral K • Y on any finite interval is a finite sum, hence if K • Y exists then K • Y also exists.
∆ K • Y = K∆Y = K∆Y χB c . The jumps of K • Y are bounded so K • Y is also a special semimartingale. Therefore the integrals K • (H • A) and K • (H • L) exist and K • (H • L) is a local martingale. By the associativity rule for local martingales and for processes with finite variation K • (H • A) = (KH) • A, K • (H • L) = (KH) • L.
Adding up the corresponding lines,
\[ K \bullet Y = K \bullet \widetilde{Y} + K \bullet \check{Y} = K \bullet (H \bullet A + H \bullet L) + K \bullet \left(H \bullet \check{X}\right) = (KH) \bullet A + (KH) \bullet L + (KH) \bullet \check{X} = (KH) \bullet \widetilde{X} + (KH) \bullet \check{X} = (KH) \bullet X. \]
The proof of the reverse implication is similar. Assume that the integrals $Y := H \bullet X$ and $(KH) \bullet X$ exist, and let
\[ B := \{|\Delta X| > 1,\ |\Delta Y| > 1,\ |\Delta((KH) \bullet X)| > 1\}. \]
In this case
\[ H \bullet \widetilde{X} = H \bullet A + H \bullet L, \qquad (KH) \bullet \widetilde{X} = (KH) \bullet A + (KH) \bullet L = K \bullet (H \bullet A) + K \bullet (H \bullet L), \]
where of course the integrals exist. $(KH) \bullet \check{X}$ is again a simple sum, therefore
\[ (KH) \bullet X = (KH) \bullet \widetilde{X} + (KH) \bullet \check{X} = K \bullet (H \bullet A) + K \bullet (H \bullet L) + K \bullet \left(H \bullet \check{X}\right) = K \bullet \left(H \bullet A + H \bullet L + H \bullet \check{X}\right) = K \bullet \left(H \bullet \left(A + L + \check{X}\right)\right) = K \bullet (H \bullet X). \]
4.4.4
Change of measure
In this subsection we discuss the behaviour of the stochastic integral when we change the measure on the underlying probability space.

Definition 4.53 Let P and Q be two probability measures on a measure space $(\Omega, \mathcal{A})$. Let us fix a filtration F. If Q is absolutely continuous with respect to P on the measure space $(\Omega, \mathcal{F}_t)$ for every t, then we say that Q is locally absolutely continuous with respect to P. In this case we shall use the notation $Q \overset{loc}{\ll} P$.
If $Q \overset{loc}{\ll} P$ then one can define the Radon–Nikodym derivatives
\[ \Lambda(t) := \frac{dQ(t)}{dP(t)}, \]
where $Q(t)$ is the restriction of Q and $P(t)$ is the restriction of P to $\mathcal{F}_t$. If $s < t$ and $F \in \mathcal{F}_s$ then
\[ \int_F \Lambda(t)\,dP = \int_F \frac{dQ(t)}{dP(t)}\,dP = Q(t)(F) = Q(s)(F) = \int_F \frac{dQ(s)}{dP(s)}\,dP = \int_F \Lambda(s)\,dP. \]
If filtration F satisfies the usual conditions then the process Λ has a modification which is a martingale. As $\Lambda(t)$ is defined only up to a set with measure zero, one can assume that the Radon–Nikodym process Λ is a martingale.
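A concrete density process — a standard example, supplied here for illustration and not taken from the text — shows how Λ looks in the simplest Gaussian setting:

```latex
% Hypothetical illustration. Let W be a standard Brownian motion under P,
% let F be its (augmented) filtration, and for a fixed \theta define Q on
% each \mathcal{F}_t by
\[
  \Lambda(t) := \frac{dQ(t)}{dP(t)}
             = \exp\!\Bigl(\theta W(t) - \tfrac{1}{2}\theta^2 t\Bigr).
\]
% \Lambda is a positive martingale with E(\Lambda(t)) = 1, so the measures
% Q(t) are consistent and Q \overset{loc}{\ll} P. On the infinite horizon,
% however, \Lambda(t) \to 0 almost surely under P when \theta \neq 0, so Q
% and P are mutually singular on \mathcal{F}_\infty: local absolute
% continuity does not imply absolute continuity.
```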
Lemma 4.54 If $Q \overset{loc}{\ll} P$ and σ is a bounded stopping time then $\Lambda(\sigma)$ is the Radon–Nikodym derivative $dQ/dP$ on the σ-algebra $\mathcal{F}_\sigma$. If Λ is uniformly integrable then this is true for any stopping time σ.

Proof. If σ is a bounded stopping time and $\sigma \le t$ then by the Optional Sampling Theorem, since Λ is a martingale,
\[ \Lambda(\sigma) = E(\Lambda(t) \mid \mathcal{F}_\sigma). \]
That is, if $F \in \mathcal{F}_\sigma \subseteq \mathcal{F}_t$ then
\[ \int_F \Lambda(\sigma)\,dP = \int_F \Lambda(t)\,dP = Q(t)(F) = Q(F). \]

As Λ is not always a uniformly integrable martingale⁷³, the lemma is not valid for arbitrary stopping time σ. Since Λ is non-negative, $\Lambda(t) \overset{a.s.}{\to} \Lambda(\infty)$, where $\Lambda(\infty) \ge 0$ is an integrable⁷⁴ variable. By Fatou's lemma
\[ \Lambda(t) = E(\Lambda(N) \mid \mathcal{F}_t) = \liminf_{N \to \infty} E(\Lambda(N) \mid \mathcal{F}_t) \ge E\left( \liminf_{N \to \infty} \Lambda(N) \mid \mathcal{F}_t \right) = E(\Lambda(\infty) \mid \mathcal{F}_t). \]
Hence the extended process is a non-negative, integrable supermartingale on $[0, \infty]$. By the Optional Sampling Theorem for Submartingales⁷⁵, if $\sigma \le \tau$ are
Example 6.34, page 384. Corollary 1.66, page 40. 75 See: Proposition 1.88, page 54. 74 See:
266
GENERAL THEORY OF STOCHASTIC INTEGRATION
arbitrary stopping times then Λ (σ) ≥ E (Λ (τ ) | Fσ ) .
(4.15)
Let us introduce the stopping time τ inf {t : Λ (t) = 0} . Let L be a local martingale and let U ∆L (τ ) χ ([τ , ∞)) . As L is a local martingale U ∈ Aloc . So U has a compensator U p . With this notation we have the following theorem: loc
Proposition 4.55 Let Q P. If Λ (t)
dQ (t) dP (t)
then Λ−1 is meaningful and right-regular76 under Q. If L is a local martingale under measure P then the integral Λ−1 • [L, Λ] has finite variation on compact intervals under Q and 0 L − Λ−1 • [L, Λ] + U p L is a local martingale77 under measure Q. Proof. We divide the proof into several steps. 1. First we show that Λ > 0 almost surely under Q. Let τ inf {t : Λ (t) = 0} . Λ is right-continuous so if τ (ω) < ∞ then Λ (τ (ω) , ω) = 0. If 0 ≤ q ∈ Q then τ + q ≥ τ . Hence by (4.15) Λ (τ ) χ (τ < ∞) ≥ χ (τ < ∞) · E (Λ (τ + q) | Fτ ) = = E (Λ (τ + q) χ (τ < ∞) | Fτ ) . Taking expected value 0 ≥ E (Λ (τ + q) χ (τ < ∞)) ≥ 0. 76 That
a.s.
is Λ−1 is almost surely finite and right-regular with respect to Q, that is Λ > 0 a.s with respect to Q. In this case Λ−1 = Λ under Q. See: (4.18). 77 More precisely L is indistinguishable from a local martingale under Q.
STOCHASTIC INTEGRATION WITH RESPECT TO SEMIMARTINGALES
267
a.s.
Hence Λ (τ + q) = 0 on the set {τ < ∞} for any q ∈ Q. As Λ is right-continuous, outside a set with P-measure-zero if τ (ω) ≤ t < ∞ then Λ (t, ω) = 0. Q (t) ({Λ (t) = 0}) = {Λ(t)=0}
dQ (t) dP = dP
Λ (t) dP = 0, {Λ(t)=0}
so Λ (t) > 0 almost surely with respect to Q (t). Q (Λ (t) = 0 for some t) = Q (τ < ∞) = Q (∪n Λ (n) = 0) ≤ ≤
∞
Q (Λ (n) = 0) =
n=1
∞
Q (n) (Λ (n) = 0) = 0.
n=1
Hence Λ⁻¹ is meaningful and Λ⁻¹ > 0 almost surely under Q. We prove that Λ_− is also almost surely positive with respect to Q. Let

ρ := inf{t : Λ_−(t) = 0},    ρ_n := inf{t : Λ(t) ≤ 1/n}.

As Λ is right-regular, Λ(ρ_n) ≤ 1/n. Obviously on the set {ρ < ∞}

lim_{n→∞} Λ(ρ_n) = Λ(ρ−) = 0.

By (4.15), for any positive rational number q,

Λ(ρ_n)χ(ρ_n < ∞) ≥ E(Λ(ρ_n + q)χ(ρ_n < ∞) | F_{ρ_n}).

Taking expected values,

1/n ≥ E(Λ(ρ_n + q)χ(ρ_n < ∞)) ≥ 0.

By Fatou's lemma E(Λ((ρ + q)−)χ(ρ < ∞)) = 0. Hence for every q ≥ 0, almost surely,

Λ((ρ + q)−)χ(ρ < ∞) = 0.    (4.16)
Hence outside a set with P-measure zero, if ρ(ω) ≤ t < ∞ then Λ_−(t, ω) = 0, and if ρ(ω) < t < ∞ then Λ(t, ω) = 0. Therefore τ(ω) ≤ ρ(ω), and

Q(t)({Λ_−(t) = 0}) ≤ Q(t)({ρ ≤ t}) = ∫_{ρ≤t} Λ(t) dP ≤ ∫_{τ≤t} Λ(t) dP = 0.

With the same argument as above one can easily prove that Q(Λ_−(t) = 0 for some t) = 0. If for some ω the trajectories Λ(ω) and Λ_−(ω) are positive, then, as Λ(ω) is right-regular, Λ⁻¹(ω) is also right-regular. Therefore it is bounded on any finite interval (see Proposition 1.6, page 5). Hence if V ∈ V then Λ⁻¹ • V is well-defined and Λ⁻¹ • V ∈ V under Q.

2. Assume that for some right-regular, adapted process N the product NΛ is a local martingale under P. We show that N is a local martingale under Q. Let σ be a stopping time and let us assume that the truncated process (ΛN)^σ is a martingale under P (see Lemma 4.54, page 265). If F ∈ F_{σ∧t} and r ≥ t, then

∫_F N^σ(t) dQ = ∫_F N^σ(t)Λ^σ(t) dP = ∫_F N^σ(r)Λ^σ(r) dP = ∫_F N^σ(r) dQ.

Hence N^σ is a martingale under Q with respect to the filtration (F_{σ∧t})_t. We show that it is a martingale under Q with respect to the filtration F. Let ρ be a bounded stopping time under F. We show that τ := ρ ∧ σ is a stopping time under (F_{σ∧t})_t. One should show that {ρ ∧ σ ≤ t} ∈ F_{σ∧t}. By definition this means that {ρ ∧ σ ≤ t} ∩ {σ ∧ t ≤ r} ∈ F_r. If t ≤ r then this is true, as ρ ∧ σ and σ ∧ t are stopping times. If t > r then the set above is {σ ≤ r} ∈ F_r. By the Optional Sampling Theorem, using that τ := ρ ∧ σ is a stopping time under (F_{σ∧t})_t and N^σ is a Q-martingale under this filtration,

∫_Ω N^σ(0) dQ = ∫_Ω N^σ(τ) dQ = ∫_Ω N^σ(ρ) dQ.
This implies that N^σ is a martingale under Q. Hence N is a local martingale under Q.

3. To simplify the notation let L(0) = 0, from which L̂(0) = 0. Integrating by parts,

LΛ = L_− • Λ + Λ_− • L + [L, Λ].    (4.17)
Λ and L are local martingales under P, so the stochastic integrals on the right-hand side are local martingales under P. Let

a^⊖ := a⁻¹ if a > 0,  a^⊖ := 0 if a = 0,    (4.18)

and let

A := Λ^⊖ • [L, Λ].    (4.19)
A is almost surely finite under Q, as Λ > 0 and Λ_− > 0 almost surely under Q. But we are now defining A under P, and with positive probability Λ^⊖ can be unbounded on some finite intervals under P. Hence we do not know that A is well-defined under P. To solve this problem let us observe that (ρ_n) in (4.16) is a localizing sequence under Q and one can localize L̂: it is sufficient to prove that L̂^{ρ_n} is a local martingale under Q for every n, and for L^{ρ_n} the integral in (4.19) is well-defined. So one can assume that A is finite. Again integrating by parts, noting that Λ is right-continuous,

ΛA = A_− • Λ + Λ_− • A + [A, Λ] =
= A_− • Λ + Λ_− • A + Σ ∆A∆Λ =
= A_− • Λ + Λ_− • A + ∆Λ • A =
= A_− • Λ + Λ • A =
= A_− • Λ + ΛΛ^⊖ • [L, Λ] =
= A_− • Λ + χ(Λ > 0) • [L, Λ].

Finally (see Example 4.39, page 249),

ΛU^p = U^p_− • Λ + Λ_− • U^p + [U^p, Λ] =
= U^p_− • Λ + Λ_− • U^p + Σ ∆U^p∆Λ =
= U^p_− • Λ + Λ_− • U^p + ∆U^p • Λ =
= U^p • Λ + Λ_− • U^p =
= U^p • Λ + Λ_− • U^p ± Λ_− • U =
= U^p • Λ − Λ_− • (U − U^p) + Λ_− • U.

The stochastic integrals with respect to local martingales are local martingales and the sum of local martingales is a local martingale, so

ΛL̂ = ΛL − ΛA + ΛU^p = local martingale + [L, Λ] − χ(Λ > 0) • [L, Λ] + Λ_− • U.

Observe that the sum of the last two terms is

χ(Λ = 0) • [L, Λ] + Λ_− • U =
= χ([τ, ∞)) • [L, Λ] + Λ_−(τ)∆L(τ)χ(t ≥ τ) =
= ∆L(τ)∆Λ(τ)χ(t ≥ τ) + Λ_−(τ)∆L(τ)χ(t ≥ τ) =
= ∆L(τ)Λ(τ)χ(t ≥ τ) = 0,

where we have used that [L, Λ] is constant on {t ≥ τ} (see Corollary 2.49, page 145) and that Λ(τ) = Λ_−(τ) + ∆Λ(τ) = 0 on {τ < ∞}. Hence ΛL̂ is a local martingale under P. So by the second part of the proof L̂ is a local martingale under Q.
Corollary 4.56 Let Q ≪_loc P and let P ≪_loc Q, that is, assume that Q ∼_loc P. If

Λ(t) := dQ(t)/dP(t),

then Λ > 0. If L is a local martingale under measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under Q and

L̂ := L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.
Corollary 4.57 Let Q ≪_loc P. If

Λ(t) := dQ(t)/dP(t)

and L is a continuous local martingale under measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under measure Q and

L̂ := L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.
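It is worth seeing how Corollary 4.57 contains the classical Girsanov formula. The following derivation is an illustrative sketch, not part of the surrounding argument; it assumes a Wiener process w under P, a fixed real parameter θ, and a finite horizon [0, T]:

```latex
% Exponential density process: d\Lambda = \theta\Lambda\,dw, \Lambda(0)=1
\Lambda(t) = \exp\Bigl(\theta w(t) - \tfrac{\theta^{2}}{2}\,t\Bigr),
\qquad d\Lambda = \theta\Lambda\, dw .
% Hence the quadratic co-variation and the correction term are
[w,\Lambda](t) = \int_0^t \theta\Lambda(s)\, ds,
\qquad
\bigl(\Lambda^{-1}\bullet[w,\Lambda]\bigr)(t)
  = \int_0^t \Lambda^{-1}(s)\,\theta\Lambda(s)\, ds = \theta t .
% By Corollary 4.57,
\widehat{w} := w - \Lambda^{-1}\bullet[w,\Lambda] = w - \theta t
% is a local martingale under Q; since [\widehat{w}](t)=[w](t)=t,
% Lévy's theorem shows that \widehat{w} is a Wiener process under Q.
```

The same computation with a general exponential martingale shows why the term Λ⁻¹ • [L, Λ] may be viewed as the drift correction of the change of measure.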
If V ∈ V under P and Q ≪_loc P, then obviously V ∈ V under Q. Hence the proof of the following observation is trivial:

Corollary 4.58 If X is a semimartingale under P and Q ≪_loc P, then X is a semimartingale under Q.

Let V ∈ V and assume that the integral H • V exists under measure P. By definition this means that the pathwise integrals (H • V)(ω) exist almost surely under P. If Q ≪_loc P, then the integral H • V exists under the measure Q as well, and the values of the two processes are almost surely the same under Q. It is not too surprising that this is true for any semimartingale.

Proposition 4.59 Let X be an arbitrary semimartingale and let H be a predictable process. Assume that the integral H • X exists under measure P. If Q ≪_loc P, then the integral H • X exists under measure Q as well, and the two integral processes are indistinguishable under measure Q.

Proof. By the remark above it is obviously sufficient to prove the proposition when X ∈ L under P. It is also sufficient to prove that for every T > 0 the two integrals exist on the interval [0, T] and that they are almost surely equal.

1. Let X = X^c + X^d be the decomposition of X into continuous and purely discontinuous local martingales. As the time horizon is finite, Λ is a uniformly integrable martingale. Recall that if L is a local martingale under the measure P then

L̂ := L − Λ⁻¹ • [L, Λ] + U^p    (4.20)
is a local martingale under measure Q, and if L is continuous then U^p can be dropped. Now

X̂ := X − Λ⁻¹ • [X, Λ] + U^p =
= X^c + X^d − Λ⁻¹ • [X^c + X^d, Λ] + U^p =
= (X^c − Λ⁻¹ • [X^c, Λ]) + (X^d − Λ⁻¹ • [X^d, Λ] + U^p).

By (4.20) the processes

X̂^c := X^c − Λ⁻¹ • [X^c, Λ]  and  X̂^d := X^d − Λ⁻¹ • [X^d, Λ] + U^p

are local martingales under measure Q. X^c is continuous, hence the quadratic co-variation [X^c, Λ] is also continuous (see line (3.19), page 222). Hence X̂^c is continuous. If W and V
are pure quadratic jump processes, then

[W + V] = [W] + 2[W, V] + [V] =
= Σ(∆W)² + 2Σ∆W∆V + Σ(∆V)² =
= Σ(∆(W + V))²,

hence W + V is also a pure quadratic jump process. Processes with finite variation are pure quadratic jump processes (see line (2.14), page 134), hence X̂^d is a pure quadratic jump process under P. Under the change of measure the quadratic variation does not change, hence X̂^d is a pure quadratic jump process under Q. Hence X̂^d is a purely discontinuous local martingale under Q. We want to show that H • X̂ exists under Q; this means that H • X̂ exists on (0, t] for every t. To prove this one need only prove that the integrals H • X̂^c and H • X̂^d exist under Q.

2. X̂^c is a continuous local martingale, hence H • X̂^c exists under Q if and only if Q(H² • [X̂^c] < ∞) = 1. H • X exists under P, therefore P(H² • [X^c] < ∞) = 1. As Λ⁻¹ • [X^c, Λ] is continuous, by a quick calculation

[X̂^c] = [X^c − Λ⁻¹ • [X^c, Λ]] = [X^c].

Therefore

P(H² • [X̂^c] < ∞) = P(H² • [X^c] < ∞) = 1,

that is, P(H² • [X̂^c] = ∞) = 0. Q ≪ P on F_t, so Q((H² • [X̂^c])(t) = ∞) = 0 for every t, that is, Q(H² • [X̂^c] = ∞) = 0. So H • X̂^c exists under Q.

3. X̂^d is purely discontinuous, hence H • X̂^d exists under Q if and only if

Z := √(H² • [X̂^d]) ∈ A⁺_loc

under measure Q. Z is obviously increasing, so we need only prove that Z ∈ A⁺_loc.

4. Let us prove the following general observation: if Λ is a non-negative martingale, τ is an arbitrary stopping time and Λ ≤ c on [0, τ), then E(χ(τ > 0)Λ(τ)) ≤ c.
Let M := Λ^τ. By Lévy's theorem M(t−) = E(M(t) | F_{t−}). Hence, as {τ < t} ∈ F_{t−},

E(Λ(t ∧ τ)) := E(M(t)) = E(M(t−)) =
= E(M(t−)χ(τ ≥ t)) + E(M(t−)χ(τ < t)) =
= E(Λ(t−)χ(τ ≥ t)) + E(M(t)χ(τ < t)) =
= E(Λ(t−)χ(τ ≥ t)) + E(Λ(τ)χ(τ < t)).

Hence, as Λ ≥ 0,

E(Λ(τ)χ(τ < t)) ≤ E(Λ(t ∧ τ)) < ∞.

So by the Optional Sampling Theorem for non-negative martingales (see Corollary 1.87, page 54),

E(Λ(τ)χ(τ ≥ t)) = E(Λ(τ)) − E(Λ(τ)χ(τ < t)) ≤
≤ E(Λ(τ ∧ t)) − E(Λ(τ)χ(τ < t)) =
= E(Λ(t−)χ(τ ≥ t)) ≤ c.

If t ↓ 0, then by Fatou's lemma E(χ(τ > 0)Λ(τ)) ≤ c.

5. Z_− is locally bounded. Let (ρ_n) be a localizing sequence of Z_−, with Z_− ≤ k_n on [0, ρ_n]. Let

τ_n := inf{s : Λ(s) > n} ∧ ρ_n ∧ n.    (4.21)

τ_n is a bounded stopping time and if s < τ_n(ω) then Λ(s, ω) ≤ n. Hence, using the estimate just proved,

E_Q(Z(τ_n−)) = E(Z(τ_n−) · dQ/dP) = E(E(Z(τ_n−) · dQ/dP | F_{τ_n})) =
= E(Z(τ_n−) · E(dQ/dP | F_{τ_n})) = E(Z(τ_n−)Λ(τ_n)) ≤
≤ k_n · E(Λ(τ_n)) = k_n · E(χ(τ_n > 0)Λ(τ_n) + χ(τ_n = 0)Λ(τ_n)) ≤
≤ k_n · (n + E(Λ(0))) < ∞.
6. We show that ∆U^p = 0. The stopping time τ can be covered by its predictable and totally inaccessible parts, so one can assume that τ is either totally inaccessible or predictable. If τ is predictable then χ([τ]) is predictable, therefore

∆(U^p) = ^p(∆U) = ^p(∆X(τ)χ([τ])) = ^p(∆X · χ([τ])) = (^p∆X) · χ([τ]) = 0 · χ([τ]) = 0.

If τ is totally inaccessible, then P(τ = σ) = 0 for every predictable stopping time σ, hence

^p(∆Xχ([τ]))(σ) := E((∆Xχ([τ]))(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆U^p = ^p(∆Xχ([τ])) = 0. Therefore in both cases ∆U^p = 0.

7. X̂^d is purely discontinuous, hence [X̂^d] = Σ(∆X̂^d)² and

∆X̂^d = ∆X^d − Λ^⊖∆[X^d, Λ] + ∆U^p.

Since ∆U^p = 0,

∆X̂^d = ∆X^d − Λ^⊖ · ∆X^d∆Λ =
= ∆X^d(1 − Λ^⊖ · ∆Λ) =
= ∆X^d(χ(Λ = 0) + Λ^⊖ · Λ_−).

As H • X exists under P, √(H² • [X^d]) ∈ A_loc under P. One can assume that τ_n localizes √(H² • [X^d]) in (4.21). Therefore one may assume that

E(|∆X^d(τ_n)H(τ_n)|) ≤ E(√((H² • [X^d])(τ_n))) < ∞.

Using this,

E_Q(∆√(H² • [X̂^d])(τ_n)) ≤ E(√(H²(∆X̂^d)²)(τ_n) · dQ/dP) =
= E(|H∆X̂^d|(τ_n)Λ(τ_n)) =
= E(|H∆X^d|(τ_n)(χ(Λ = 0) + Λ^⊖Λ_−)(τ_n)Λ(τ_n)) =
= E(|H∆X^d|(τ_n)(Λ^⊖Λ_−)(τ_n)Λ(τ_n)) ≤
≤ E(|H(τ_n)∆X^d(τ_n)|Λ(τ_n−)) ≤ n · E(|∆X^d(τ_n)H(τ_n)|) < ∞.

8. As √(x + y) ≤ √x + √y,

E_Q(Z(τ_n)) := E_Q(√((H² • [X̂^d])(τ_n))) ≤ E_Q(Z(τ_n−)) + E_Q(∆√(H² • [X̂^d])(τ_n)) < ∞.
Therefore Z ∈ A_loc under measure Q.

9. Let us consider the decomposition

X = X̂ + A − U^p,  where A := Λ⁻¹ • [X, Λ],

and let us assume that the integral H • X exists under measure P. As the integral H • X̂ exists under Q, one should prove that the Lebesgue–Stieltjes integrals H • A and H • U^p also exist. By the inequality of Kunita and Watanabe

∫₀ᵀ |H| dVar(A) = ∫₀ᵀ |H|Λ^⊖ dVar([X, Λ]) ≤
≤ √(∫₀ᵀ |H|²Λ^⊖ d[X]) · √(∫₀ᵀ Λ^⊖ d[Λ]) =
= √(∫₀ᵀ Λ^⊖ d(|H|² • [X])) · √(∫₀ᵀ Λ^⊖ d[Λ]).

Λ > 0 and Λ_− > 0 almost surely under Q, that is, almost all trajectories of Λ and Λ_− are positive (see Proposition 4.55, page 266), hence Λ^⊖ has regular trajectories almost surely under Q. Hence almost surely the trajectories of Λ^⊖ are bounded on every finite interval, therefore the expression ∫₀ᵀ Λ^⊖ d[Λ] is finite. Similarly, as H • X exists, R := |H|² • [X] ∈ V, hence Λ^⊖ • R is finite under Q. That is, for every trajectory ∫₀ᵀ |H| dVar(A) < ∞, hence H • A exists under Q. Let σ be a stopping time in a localizing sequence of √(H² • [X]).

E((|H| • U^p)(σ)) = E((|H| • U)(σ)) ≤ E(√((|H|² • [X])(σ))) < ∞.

Hence |H| • U^p is almost surely finite under P, so it is almost surely finite under Q. Therefore the integral H • X exists under Q.

10. Let us denote by (P)H • X and by (Q)H • X the value of H • X under P and under Q respectively. Let us denote by H the set of processes H for
which (P)H • X and (Q)H • X are indistinguishable under Q. From the Dominated Convergence Theorem and from the linearity of the stochastic integral it is obvious that H is a λ-system which contains the π-system of the elementary processes. From the Monotone Class Theorem it is clear that H contains all the bounded predictable processes.

11. If H_n := Hχ(|H| ≤ n), then H_n is bounded. Hence the value of the integral (P)H_n • X is Q-almost surely equal to the integral (Q)H_n • X. As H • X exists under P and under Q, by the Dominated Convergence Theorem, uniformly in probability on compact intervals, (P)H_n • X → (P)H • X and (Q)H_n • X → (Q)H • X. Stochastic convergence under P implies stochastic convergence under Q, since a sequence is stochastically convergent if and only if every subsequence of the sequence has a further subsequence which is almost surely convergent to the same fixed random variable. Hence (P)H • X = (Q)H • X almost surely under Q.

Let us prove some consequences of the proposition. During the construction of the stochastic integral we emphasized that we cannot define the integral pathwise. But this does not mean that the integral is not determined by the trajectories of the integrator and the integrand.

Corollary 4.60 Let X and X′ be semimartingales. Assume that for the predictable processes H and H′ the integrals H • X and H′ • X′ exist. If

A := {ω : H(ω) = H′(ω)} ∩ {ω : X(ω) = X′(ω)},

then the processes H • X and H′ • X′ are indistinguishable on A.

Proof. One may assume that P(A) > 0. Define the measure

Q(B) := P(A ∩ B)/P(A).

Obviously Q ≪ P. The processes H, H′ and X, X′ are indistinguishable under Q. Hence the processes (Q)H • X and (Q)H′ • X′ are indistinguishable under Q. By the proposition above, under Q, up to indistinguishability,

(P)H • X = (Q)H • X = (Q)H′ • X′ = (P)H′ • X′,

which means that (P)H • X = (P)H′ • X′ on A.

The proof of the following corollary is similar:

Corollary 4.61 Let X be a semimartingale and assume that the integral H • X exists. If on a set B the trajectories of X have finite variation, then almost surely on B the trajectories of H • X are equal to the pathwise integrals of H with respect to X.
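The content of Corollary 4.61 can be made concrete in the simplest discontinuous case. The following computation is an added illustration, assuming N is a Poisson process with intensity λ and H is a bounded predictable process:

```latex
% X := N - \lambda t is a martingale whose trajectories have finite variation:
X(t) = N(t) - \lambda t, \qquad \operatorname{Var}(X)(t) = N(t) + \lambda t < \infty .
% By Corollary 4.61 the stochastic integral therefore coincides, trajectory by
% trajectory, with the Lebesgue--Stieltjes integral:
(H \bullet X)(t) = \sum_{s \le t} H(s)\,\Delta N(s) - \lambda \int_0^t H(s)\, ds .
```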
4.5 The Proof of Davis' Inequality
In this section we prove the following inequality:

Theorem 4.62 (Davis' inequality) There are positive constants c and C such that for any local martingale L ∈ L and for any stopping time τ

c · E(√([L](τ))) ≤ E(sup_{t≤τ} |L(t)|) ≤ C · E(√([L](τ))).

Example 4.63 In the inequality one cannot write |L(τ)| in the place of sup_{t≤τ} |L(t)|.
If w is a Wiener process and τ := inf{t : w(t) = 1}, then L := w^τ is a martingale. E(L(t)) = 0 for every t, hence

‖L(t)‖₁ = E(|L(t)|) = 2E(L⁺(t)) ≤ 2.

On the other hand, if t → ∞,

‖√([L](t))‖₁ = E(√(τ ∧ t)) → E(√τ).

The density function of τ is (see (1.58) on page 83)

f(x) = (1/√(2πx³)) exp(−1/(2x)),  x > 0,

hence the expected value of √τ is

E(√τ) = ∫₀^∞ √x · (1/√(2πx³)) exp(−1/(2x)) dx =
= (1/√(2π)) ∫₀^∞ (1/x) exp(−1/(2x)) dx =
= (1/√(2π)) ∫₀^∞ (1/u) exp(−u/2) du = ∞.
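It is instructive to check directly that the left-hand side of the inequality is also infinite in this example. The following computation is an added verification, using the standard ruin probability of the Wiener process:

```latex
% Let m := \inf_t w^{\tau}(t). For a \ge 0 the ruin probability gives
P(m \le -a) = \frac{1}{1+a},
% hence
E\Bigl(\sup_t |L(t)|\Bigr) \ge E(-m)
  = \int_0^{\infty} P(m \le -a)\, da
  = \int_0^{\infty} \frac{da}{1+a} = \infty ,
% consistently with Davis' inequality, since
E\bigl(\sqrt{[L](\infty)}\bigr) = E(\sqrt{\tau}) = \infty .
```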
If σ is an arbitrary stopping time, then in place of L one can write L^σ in the inequality. On the other hand, if for some localizing sequence σ_n ↗ ∞ the inequality is true for all L^{σ_n}, then by the Monotone Convergence Theorem it is true for L as well. By the Fundamental Theorem of Local Martingales L ∈ L has a decomposition L = H + A where H ∈ H²_loc and A ∈ A_loc. With localization one can assume that H ∈ H² and A ∈ A. L_− is left-regular, hence it is locally
bounded, so with further localization of the inequality one can assume that L_− is bounded, say |L_−| ≤ k. It suffices to prove the inequality on any finite time horizon [0, T]. It is sufficient to prove the inequality for finite, discrete-time horizons: if (t_k^{(n)}) is an infinitesimal sequence of partitions of [0, T], then trivially

E(sup_{t_k^{(n)} ≤ T} |L(t_k^{(n)})|) ↗ E(sup_{t≤T} |L(t)|).
[L]
(t)
2 ( (n) (n) L tk ∧ t − L tk−1 ∧ t = k
= L2 (t) − 2
(
(n) (n) (n) L tk−1 ∧ t L tk ∧ t − L tk−1 ∧ t .
k
If Yn (t)
(n) (n) (n) L tk−1 ∧ t χ tk−1 ∧ t, tk ∧ t ,
k
then the sum in the above expression is (Yn • L) (t). Obviously Yn → L− and |Yn (t)| ≤ sup |L− (s)| ≤ k. s≤t
Repeating the proof of the Dominated Convergence Theorem we prove that for all t (Yn • L) (t) → (L− • L) (t) in L1 (Ω). As (Yn ) is uniformly bounded, by Itˆ o’s isometry the convergence Yn • H → L− • H holds in H2 and therefore L2
(Yn • H) (t) → (L− • H) (t). Obviously |(Yn • A) (t) − (L− • A) (t)| ≤ 2k · Var (A) (t) . As A ∈ A by the classical Dominated Convergence Theorem L1
(Yn • A) (t) → (L− • A) (t) .
THE PROOF OF DAVIS’ INEQUALITY
279
Therefore, as we said, L1
(Yn • A) (T ) → (L− • A) (T ) . (n)
L1
Hence [L] (T ) → [L] (T ) , so by Jensen’s inequality ) ) "
" (n) E ≤ E [L](n) (T ) − [L] (T ) ≤ [L] (T ) − E [L] (T ) % (n) ≤E [L] (T ) − [L] (T ) ≤
%
(n) ≤ E [L] (T ) − [L] (T ) → 0. This means that if the inequality holds in discrete-time then it is true in continuous-time. 4.5.1
Discrete-time Davis’ inequality
Up to the end of this section we assume that if M is a martingale then M(0) = 0.

Definition 4.64 Let us first introduce some notation. For any sequence M := (M_n) let

∆M_n := M_n − M_{n−1}.

If M = (M_n) is a discrete-time martingale then (∆M_n) is the martingale difference of M. For any n let

[M]_n := Σ_{k=1}^n (∆M_k)² = Σ_{k=1}^n (M_k − M_{k−1})²,    M_n* := sup_{k≤n} |M_k|.

If n is the maximal element in the parameter set, or n = ∞, then we drop the subscript n. With this notation the discrete-time Davis' inequality has the following form:

Theorem 4.65 (Discrete-time Davis' inequality) There are positive constants c and C such that for every discrete-time martingale M for which M(0) = 0

c · E(√([M])) ≤ E(M*) ≤ C · E(√([M])).
The proof of the discrete-time Davis' inequality is a simple but lengthy (and boring) calculation. Let us first prove two lemmas:

Lemma 4.66 Let M := (M_n, F_n) be a martingale and let V := (V_n, F_{n−1}) be a predictable sequence, that is, V_n is F_{n−1}-measurable, for which |∆M_n| := |M_n − M_{n−1}| ≤ V_n. If λ > 0 and 0 < δ < β − 1, then

P(M* > βλ, √([M]) ∨ V* ≤ δλ) ≤ (2δ²/(β − δ − 1)²) · P(M* > λ),

P(√([M]) > βλ, M* ∨ V* ≤ δλ) ≤ (9δ²/(β² − δ² − 1)) · P(√([M]) > λ).
Proof. The proofs of the two inequalities are similar.

1. Let us introduce the stopping times

µ := inf{n : |M_n| > λ},  ν := inf{n : |M_n| > βλ},  σ := inf{n : √([M]_n) ∨ V_{n+1} > δλ}.

For every j

F_j := {µ < j ≤ ν ∧ σ} = {µ < j} ∩ {ν ∧ σ < j}^c ∈ F_{j−1},

hence if

H_n := Σ_{j=1}^n ∆M_j χ_{F_j},

then

E(H_n | F_{n−1}) = Σ_{j=1}^{n−1} ∆M_j χ_{F_j} + E(∆M_n χ_{F_n} | F_{n−1}) =
= Σ_{j=1}^{n−1} ∆M_j χ_{F_j} + χ_{F_n} E(∆M_n | F_{n−1}) =
= Σ_{j=1}^{n−1} ∆M_j χ_{F_j} = H_{n−1},

therefore (H_n) is a martingale. By the assumptions of the lemma |∆M_j| ≤ V_j, hence by the definition of σ

[H]_n ≤ [M]_σ = ([M]_{σ−1} + (∆M_σ)²)χ(σ < ∞) + [M]_σ χ(σ = ∞) ≤
≤ ([M]_{σ−1} + V_σ²)χ(σ < ∞) + [M]_σ χ(σ = ∞) ≤
≤ 2δ²λ².

{M* ≤ λ} = {µ = ∞}, hence on this set H = 0, so [H] = 0. Therefore

E([H]) = E([H]χ(M* > λ) + [H]χ(M* ≤ λ)) = E([H]χ(M* > λ)) ≤ 2δ²λ² P(M* > λ).

Observe that

F_j ∩ {ν < ∞, σ = ∞} = {µ < j ≤ ν} ∩ {ν < ∞, σ = ∞},

hence on the set {ν < ∞, σ = ∞}

H_n = M_{ν∧n} − M_{µ∧n}.

On {ν < ∞} obviously sup_n |M_{ν∧n}| ≥ λβ. On {σ = ∞} by definition V* ≤ δλ, hence

|M_µ| = |M_{µ−1} + ∆M_µ| ≤ λ + δλ.

This implies that on the set {ν < ∞, σ = ∞}

H* = sup_n |M_{ν∧n} − M_{µ∧n}| > λβ − λ(δ + 1) = λ(β − (1 + δ)).
By Doob’s inequality90 using the definition of ν and σ
" P1 P M ∗ > βλ, [M ] ∨ V ∗ ≤ δλ = = P (ν < ∞, σ = ∞) ≤
2 E H∞
∗
≤ P (H > λ (β − (1 + δ))) ≤ ≤ ≤
E ([H]) λ (β − 1 − δ) 2
2
2
λ2 (β − 1 − δ)
≤
≤
2δ 2 λ2 P (M ∗ > λ) 2
λ2 (β − (1 + δ))
=
2δ 2 (β − 1 − δ)
2 P (M
∗
> λ) ,
which is the first inequality. 2. Analogously, let us introduce the stopping times ) µ inf n : [M ]n > λ ,
) ν inf n : [M ]n > βλ ,
σ inf {n : Mn∗ ∨ Vn+1 > δλ} . Again for all j let Fj {µ < j ≤ ν ∧ σ } . As Fj ∈ Fj−1 Gn
n
∆Mj χFj
j=1
is again a martingale. If µ ≥ σ then G∗ = 0. Hence if σ < ∞ then G∗ = G∗ χ (µ < σ ) ≤
≤ Mµ∗ + Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ χ (µ < σ ) =
= Mσ∗ −1 + Mσ∗ −1 + ∆Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ −1 + Vσ χ (µ < σ ) ≤ ≤ δλ + δλ + δλ = 3δλ. 90 See:
line (1.14), page 33.
If σ′ = ∞ then of course σ′ − 1 is meaningless, but in this case obviously

(M_{µ′}* + M*)χ(µ′ < σ′) ≤ 2δλ,

so in this case the inequality G* ≤ 3δλ still holds. On the set {√([M]) ≤ λ} = {µ′ = ∞} obviously G* = 0. Hence

E((G*)²) = E((G*)²χ(√([M]) > λ) + (G*)²χ(√([M]) ≤ λ)) =
= E((G*)²χ(√([M]) > λ)) ≤ 9δ²λ² P(√([M]) > λ).

On the set {ν′ < ∞, σ′ = ∞}

[G]_n = [M]_{ν′∧n} − [M]_{µ′∧n}.

By this, using that ν′ < ∞ and σ′ = ∞,

[G] > (βλ)² − [M]_{µ′−1} − (∆M_{µ′})² ≥
≥ (βλ)² − λ² − (V_{µ′})² ≥
≥ (βλ)² − (1 + δ²)λ².

By Markov's inequality and by the energy identity (if E(G∞²) = ∞ then the inequality is true, otherwise one can use Proposition 1.58 on page 35),

P₂ := P(√([M]) > βλ, M* ∨ V* ≤ δλ) = P(ν′ < ∞, σ′ = ∞) ≤
≤ P([G] > λ²(β² − (1 + δ²))) ≤
≤ E([G])/(λ²(β² − (1 + δ²))) = E(G∞²)/(λ²(β² − (1 + δ²))) ≤
≤ E((G*)²)/(λ²(β² − (1 + δ²))) ≤
≤ (9δ²/(β² − (1 + δ²))) · P(√([M]) > λ).
Lemma 4.67 Let M (Mn , Fn ) be a martingale and let assume that M0 = 0. If dj ∆Mj Mj − Mj−1 ,
aj dj χ |dj | ≤ 2d∗j−1 − E dj χ |dj | ≤ 2d∗j−1 | Fj−1 ,
bj dj χ |dj | > 2d∗j−1 − E dj χ |dj | > 2d∗j−1 | Fj−1 E G2 = ∞ then the inequality is true, otherwise one can use Proposition 1.58 on page 35. 91 If
284
GENERAL THEORY OF STOCHASTIC INTEGRATION
then the sequences Gn
n
aj
and
j=1
Hn
n
bj ,
j=1
are F-martingales, M = G + H and |aj | ≤ 4d∗j−1 , ∞
(4.22)
dj χ |dj | > 2d∗j−1 ≤ 2d∗ ,
(4.23)
j=1 ∞
E (|bj |) ≤ 4E (d∗ ) .
(4.24)
j=1
Proof. As M0 = 0 n
dj
j=1
n
∆Mj = Mn − M0 = Mn .
j=1
One should only prove the three inequalities, since from this identity the other parts of the lemma are obvious92 . 1. (4.22) is evident. / . 2. |dj | + 2d∗j−1 ≤ 2 |dj | on |dj | > 2d∗j−1 , hence ∞ ∞
dj χ |dj | > 2d∗j−1 ≤ 2 |dj | − 2d∗j−1 χ |dj | > 2d∗j−1 ≤ j=1
j=1
≤2
∞
d∗j − d∗j−1 = 2d∗ ,
j=1
which is exactly (4.23). 3. ∞ j=1
E (|bj |) ≤
∞
E |dj | χ |dj | > 2d∗j−1 +
j=1
+
∞
E E dj χ |dj | > 2d∗j−1 | Fj−1 .
j=1 92 For any sequence (ξ , F ) E (ξ | F n n−1 ) = 0 if and only if (ξ n , Fn ) n n difference sequence.
is a martingale
THE PROOF OF DAVIS’ INEQUALITY
285
If in the second sum we bring the absolute value into the conditional expectation, then ∞ ∞
E (|bj |) ≤ 2E |dj | χ |dj | > 2d∗j−1 . j=1
j=1
By (4.23) the expression in the conditional expectation is not larger than 2d∗ , from which (4.24) is evident. The proof of the discrete-time Davis’ inequality: Let M = H + G be n the decomposition of the previous lemma. Gn j=1 aj is a martingale, |aj | ≤ 4d∗j−1 , hence by the first lemma, if λ > 0 and 0 < δ < β − 1, then
" P G∗ > βλ, [G] ∨ 4d∗ ≤ δλ ≤ P
"
[G] > βλ, G∗ ∨ 4d∗ ≤ δλ ≤
2δ 2 (β − δ − 1)
2 P (G
∗
> λ) ,
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Hence for any λ > 0 P (G∗ > βλ) ≤ P +
"
[G] > δλ + P (4d∗ > δλ) + 2δ 2
(β − δ − 1)
2 P (G
∗
> λ) ,
and P
"
[G] > βλ ≤ P (G∗ > δλ) + P (4d∗ > δλ) + +
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Integrating w.r.t. λ and using that if ξ ≥ 0 then ∞ ∞ E (ξ) = 1 − F (x)dx = P(ξ > x)dx, 0
0
one has that ∗
E (G ) ≤ β
E
"
[G] +
δ +
2δ 2 (β − δ − 1)
4E (d∗ ) + δ 2 E (G
∗
),
286
GENERAL THEORY OF STOCHASTIC INTEGRATION
and E
"
[G]
β
≤
E (G∗ ) 4E (d∗ ) + + δ δ "
9δ 2 + 2 E [G] . β − δ2 − 1
For the stopped martingale Gn the expected values in the inequalities are finite, hence one can reorder the inequalities
2
1 2δ − 2 β (β − δ − 1)
E (G∗n ) ≤
E
"
[G]n
δ
+
4E (∆Mn∗ ) . δ
and
1 9δ 2 − 2 β β − δ2 − 1
E
"
E (G∗ ) 4E (∆M ∗ ) n n + . [G]n < δ δ
If δ is small enough then the constants on the left-hand side are positive, hence we can divide by them. Hence if n ∞ then by the Monotone Convergence Theorem "
∗ E (G∗ ) ≤ A1 E [G] + A2 E (∆M ) , "
∗ E [G] ≤ B1 E (G∗ ) + B2 E (∆M ) . By the second lemma E (M ∗ ) ≤ E (G∗ + H ∗ ) ≤ ≤ E (G∗ ) + E (|bj |) ≤ E (G∗ ) + 4E (d∗ ) ≤ "
j
∗ ∗ ≤ A1 E [G] + A2 E (∆M ) + 4E (∆M ) , "
" "
E [M ] ≤ E [G] + [H] ≤ " "
≤E [G] + E (|bj |) ≤ E [G] + 4E (d∗ ) ≤ j
∗ ∗ ≤ B1 E (G ) + B2 E (∆M ) + 4E (∆M ) . ∗
THE PROOF OF DAVIS’ INEQUALITY
287
As G = M − H by the second lemma again E (G∗ ) ≤ E (M ∗ ) + E (H ∗ ) ≤ E (M ∗ ) +
∗ ≤ E (M ∗ ) + 4E (∆M )
∞
E (|bj |) ≤
j=1
and E
"
∞
"
"
"
[G] ≤ E [M ] + E [H] ≤ E [M ] + E (|bj |) ≤
"
∗ ≤E [M ] + 4E (∆M ) .
j=1
From this with simple calculation E (M ∗ ) ≤ A1 E
"
"
∗ [M ] + A3 E (∆M ) ≤ A · E [M ] ,
and E
"
∗ [M ] ≤ B1 E (M ∗ ) + B3 E (∆M ) ≤ B · E (M ∗ ) ,
from which Davis’ inequality already follows, trivially. 4.5.2
Burkholder’s inequality
One can extend Davis’ inequality in such a way that instead of the L1 (Ω)-norm one can write the Lp (Ω)-norm for every p ≥ 1. Theorem 4.68 (Burkholder’s inequality) For any p > 1 there are constants cp and Cp , such that for every local martingale L ∈ L and for every stopping time τ " " cp [L] (τ ) ≤ sup |L (t)| ≤ Cp [L] (τ ) . p
t≤τ
p
p
During the proof of the inequality we shall use the next result: Lemma 4.69 Let A be a right-regular, non-negative, increasing, adapted process and let ξ be a non-negative random variable. Assume that almost surely for every t E (A (∞) − A (t) | Ft ) ≤ E (ξ | Ft )
(4.25)
288
GENERAL THEORY OF STOCHASTIC INTEGRATION
and ∆A (t) ≤ ξ. Then for every p ≥ 1 A (∞)p ≤ 2p ξp .
(4.26)
Proof. A is increasing, so for every n χ (A (t) ≥ n) (A (∞) − A (t)) = (A ∧ n) (∞) − (A ∧ n) (t) . So if (4.26) holds for some A then it holds for A ∧ n. Hence one can assume that A is bounded, since otherwise we can replace A with A ∧ n and in (4.26) one can take n ∞. If ξ is not integrable then the inequality trivially holds. Hence one can assume that ξ is integrable. 1. As ξ is integrable E (ξ | Ft ) is a uniformly integrable martingale. As A is bounded E (A (∞) − A (t) − ξ | Ft ) = E (A (∞) | Ft ) − E (ξ | Ft ) − A (t) is a uniformly integrable, non-positive supermartingale. By the Optional Sampling Theorem for every stopping time τ E (A (∞) − A (τ ) | Fτ ) ≤ E (ξ | Fτ ) .
(4.27)
Let x > 0 and let τ x inf {t : A (t) ≥ x} . Obviously A (τ x −) ≤ x. By (4.27) E ((A (∞) − x) χ (x < A (∞))) ≤ E ((A (∞) − x) χ (τ x < ∞)) = ≤ E ((A (∞) − A (τ x −)) χ (τ x < ∞)) = = E ((A (∞) − A (τ x )) χ (τ x < ∞)) + E (∆A (τ x ) χ (τ x < ∞)) ≤ ≤ E (ξχ (τ x < ∞)) + E (ξχ (τ x < ∞)) ≤ ≤ 2E (ξχ (x ≤ A (∞))) .
THE PROOF OF DAVIS’ INEQUALITY
2. With inequality
simple
calculation
using
Fubini’s
theorem
and
289
H¨ older’s
p
A (∞)p E (Ap (∞)) = pE (Ap (∞)) − (p − 1) E (Ap (∞)) = A(∞) p−2 = p (p − 1) E A (∞) x dx 0
− p (p − 1) E
p−1
x
=
A(∞)
(A (∞) − x) x
= p (p − 1) E
dx
0
= p (p − 1)
A(∞)
p−2
dx
=
0 ∞
E ((A (∞) − x) χ (x < A (∞))) xp−2 dx ≤
0
∞
≤ 2p (p − 1)
E (ξχ (x ≤ A (∞))) xp−2 dx =
0
A(∞)
= 2p (p − 1) E
p−2
ξx
dx
= 2p · E ξAp−1 (∞) ≤
0 p−1
≤ 2p · ξp A (∞)p
. p−1
If A (∞)p > 0 then we can divide both sides by A (∞)p inequality trivially holds.
, otherwise the
Proof of Burkholder’s inequality: Let L be a local martingale. Let B ∈ Ft and let N χB (L − Lt ). N is a local martingale so by Davis’ inequality c·E
"
"
[N ] (∞) ≤ E sup |N (s)| ≤ C · E [N ] (∞) , s
which immediately implies that c·E
"
[L − Lt ] (∞) | Ft ≤ E sup |L − Ls | | Ft ≤ s
≤C ·E
"
[L − Lt ] (∞) | Ft .
Let L∗ (t) sups≤t |L (s)|. Since " " " " " [L] (∞) − [L] (t) ≤ [L] (∞) − [L] (t) = [L − Lt ] (∞) ≤ [L] (∞)
290
GENERAL THEORY OF STOCHASTIC INTEGRATION
and L∗ (∞) − L∗ (s) ≤ sup L − Lt (s) ≤ 2L∗ (∞) s
if A (t)
" [L] (t)
and ξ c−1 2 · L• (∞)
or if A (t) L∗ (t)
and ξ C
" [L] (∞)
then estimation (4.25) in the lemma holds. Without loss of generality one can assume that the constants in the definition of ξ are larger than one. Since for every constant k ≥ 1 " " ∆L∗ ≤ |∆L| = ∆ [L] ≤ k · [L] (∞) " " ∆ [L] ≤ ∆ [L] = |∆L| ≤ k · 2L∗ (∞) in both cases we get that ∆A ≤ ξ. Hence A (∞)p ≤ 2p ξp which is just the two sides of Burkholder’s inequality. p/2
p Corollary 4.70 If L ∈ L and p ≥ 1 then L ∈ Hloc if and only if [L]
∈ Aloc .
Corollary 4.71 If M is a local martingale and for some p ≥ 1 for every sequence of infinitesimal partitions of the interval [0, t] (n)
[M ]
Lp
(t) → [M ] (t) ,
then M ∗ (t) sup |M (s)| ∈ Lp (Ω) s≤t
that is M ∈ Hp on the interval [0, t]. (n)
Proof. Let (Mn ) be a discrete-time of M . If [M ] (t) is con approximation (n) p vergent in L (Ω), then K supn [M ] (t) < ∞. By the Davis–Burkholder p
inequality and by Jensen’s inequality ) % (n) sup |Mn | (s) ≤ Cp [M ](n) (t) ≤ Cp ] (t) [M ≤ L < ∞. s≤t
p
p
p
THE PROOF OF DAVIS’ INEQUALITY
291
For a subsequence sup |Mn | sup |M | , hence by the Monotone Convergence Theorem M ∗ (t)p ≤ L < ∞. Corollary 4.72 If q ≥ 1 and L ∈ Hq is purely discontinuous then L is the Hq -sum of its compensated jumps. Proof. Let us denote by (ρk ) the stopping times exhausting the jumps of L. Let L ∈ Hq be purely discontinuous and let L = Lk where Nk H (ρk ) χ ([ρk , ∞)) and Lk N − Nkp are the the compensated jumps of L. Recall that the convergence holds in the topology of uniform convergence in probability93 . L ∈ Hq so q/2 by Burkholder’s inequality [L] ∈ A and as the compensator Nkp is continuous q/2
[Lk ]
q
(∞) = (∆L (ρk )) ≤ q/2
≤ [L]
2
q/2
(∆L) (∞)
≤
(∞) ∈ L1 (Ω) .
This implies that Lk ∈ Hq . Hq is a vector space hence Yn n > m then ≤ sup Yn − Ym Hq |Y (t) − Y (t)| n m t
n k=1
Lk ∈ Hq . If
q
" ≤ Cp [Yn − Ym ] (∞) = q 2 = Cp (∆L) (s)χ (B \B ) n m , s q
where Bn ∪nk=1 [ρk ].
)
2
(∆L) is in Lq (Ω). Therefore if n, m → ∞ then Yn − Ym Hq → 0.
So (Yn ) is convergent in Hq . Convergence in Hq implies uniform convergence in Hq
probability so obviously Yn → L.
93 See:
Proposition 4.30, page 243.
5 SOME OTHER THEOREMS In this chapter we shall discuss some further theorems from the general theory of stochastic processes. First we shall prove the so-called Doob–Meyer decomposition. By the Doob–Meyer decomposition every integrable submartingale is a semimartingale. We shall also prove the theorem of Bichteler and Dellacherie, which states that the semimartingales are the only ‘good integrators’.
5.1
The Doob–Meyer Decomposition
If A ∈ A+ and M ∈ M then X A + M is a class D submartingale. Since if τ is a finite valued stopping time then |A (τ )| = |A (τ ) − A (0)| ≤ Var (A) (∞) ∈ L1 (Ω) ,
(5.1)
hence the set {X (τ ) : τ < ∞ is a stopping time} is uniformly integrable. The central observation of the stochastic analysis is that the reverse implication is also true: Theorem 5.1 (Doob–Meyer decomposition) If a submartingale X is in class D then X has a decomposition X = X (0) + M + A, where A ∈ A+ , M ∈ M and A is predictable. Up to indistinguishability this decomposition is unique. 5.1.1
The proof of the theorem
We divide the proof into several steps. The proof of the uniqueness is simple. If X (0) + M1 + A1 = X (0) + M2 + A2 292
THE DOOB–MEYER DECOMPOSITION
293
are two decompositions of X then M1 − M2 = A2 − A1 . A2 − A1 is a predictable martingale, hence it is continuous1 . As A2 − A1 has finite variation by Fisk’s theorem2 A1 = A2 , hence M1 = M2 . The proof of the existence is a bit more complicated. Definition 5.2 We say that a supermartingale P is a potential 3 , if 1. P is non-negative and 2. limt→∞ E (P (t)) = 0. Proposition 5.3 (Riesz’s decomposition) If X is a class D submartingale then X has a decomposition X = X (0) + M − P
(5.2)
where P is a class D potential and M is a uniformly integrable martingale. Up to indistinguishability this decomposition is unique.

Proof. As X is in class D the set {X(t) : t ≥ 0} is uniformly integrable, hence it is bounded in L1(Ω):

sup_t E(X⁺(t)) ≤ sup_t E(|X(t)|) < K.

By the submartingale convergence theorem⁴ the limit lim_{t→∞} X(t) = X(∞) ∈ L1(Ω) exists. Let us define the variables M(t) := E(X(∞) | F_t). As the filtration satisfies the usual conditions, M has a version which is a uniformly integrable martingale. The process P := M − X is in class D since it is the difference of two processes of class D. By the submartingale property

P(s) := M(s) − X(s) ≥ E(M(t) | F_s) − E(X(t) | F_s) = E(M(t) − X(t) | F_s).

If t → ∞, then M(t) − X(t) → 0 almost surely, and as (M(t) − X(t))_t is uniformly integrable the convergence holds in L1(Ω) as well. By the L1(Ω)-continuity of the conditional expectation the right-hand side of the inequality goes to zero almost surely, that is, P(s) ≥ 0. Since

E(P(s)) = E(M(s)) − E(X(s)) → E(M(∞)) − E(X(∞)) = 0,

P is a potential.

Assume that the decomposition is not unique. Let P_i, M_i, i = 1, 2 give two decompositions of X. In this case

(P_1 − P_2)(t) = M_1(t) − M_2(t) = E(M_1(∞) − M_2(∞) | F_t).

¹ See: Corollary 3.40, page 205.
² See: Theorem 2.11, page 117.
³ Recall that the expected value of a supermartingale is decreasing.
⁴ See: Corollary 1.72, page 44.
By the definition of the potential P_i(t) → 0 in L1(Ω). Hence if t → ∞, then

0 = E(M_1(∞) − M_2(∞) | F_∞) = M_1(∞) − M_2(∞),

hence M_1 = M_2, so P_1 = P_2.

It is sufficient to prove the Doob–Meyer decomposition for the potential part of the submartingale. One should prove that if P is a class D potential, then there is one and only one N ∈ M and a predictable process A ∈ A⁺ for which P = N − A. If this holds, then substituting −P = −N + A into line (5.2) we get the needed decomposition of X. From the definition of the potential

E(A(t)) = E(N(t)) − E(P(t)) ≤ E(N(∞)).

A ∈ A⁺, so A is increasing: 0 = A(0) ≤ A(t) ↑ A(∞), where E(A(∞)) < ∞. Hence by the Monotone Convergence Theorem A(t) → A(∞) in L1(Ω). By the definition of the potential P(t) → P(∞) = 0 in L1(Ω), hence A(∞) = N(∞). So to prove the theorem it is sufficient to prove that there is a predictable process A ∈ A⁺ and an N ∈ M such that

P(t) + A(t) = N(t) = E(N(∞) | F_t) = E(A(∞) | F_t),

which holds if there is an A ∈ A⁺ such that

P(t) = E(A(∞) − A(t) | F_t).

By the definition of the conditional expectation this is equivalent to

E(χ_F (A(∞) − A(t))) = E(χ_F P(t)) = E(χ_F (P(t) − P(∞))),    F ∈ F_t.
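At this point the theorem has been reduced to constructing the compensator A. In discrete time the analogous construction is completely elementary (the Doob decomposition): A_n := Σ_{k≤n} E(X_k − X_{k−1} | F_{k−1}) is predictable and increasing, and M := X − X(0) − A is a martingale. The following sketch (a hypothetical Python illustration, not part of the book's argument) checks this for the submartingale X_n = S_n², where S is a simple symmetric random walk and the compensator is A_n = n:

```python
import random

random.seed(0)

def doob_decomposition(cond_means, X):
    """Discrete Doob decomposition: A_n = sum_{k<=n} E(X_k - X_{k-1} | F_{k-1})
    (predictable and increasing for a submartingale), M = X - X_0 - A."""
    A = [0.0]
    for k in range(1, len(X)):
        A.append(A[-1] + cond_means[k - 1])
    M = [x - X[0] - a for x, a in zip(X, A)]
    return A, M

# Submartingale X_n = S_n^2, S a simple symmetric random walk:
# E(X_k - X_{k-1} | F_{k-1}) = E(2*S_{k-1}*eps_k + 1 | F_{k-1}) = 1,
# so the compensator is A_n = n and M_n = S_n^2 - n.
N = 10
S = [0]
for _ in range(N):
    S.append(S[-1] + random.choice([-1, 1]))
X = [s * s for s in S]

A, M = doob_decomposition([1.0] * N, X)
assert A == [float(n) for n in range(N + 1)]        # A_n = n: predictable, increasing
assert all(M[n] == S[n] * S[n] - n for n in range(N + 1))  # M_n = S_n^2 - n
print(A[-1], M[-1])
```

The continuous-time proof below encodes exactly this telescoping definition of A through the Doléans-type measure.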
Observe that S := −P is a submartingale and S(∞) = 0, hence the previous line is equivalent to

E(χ_F (A(∞) − A(t))) = E(χ_F (S(∞) − S(t))),    F ∈ F_t.    (5.3)

For an arbitrary process X, on the set of predictable rectangles

(s, t] × F,    F ∈ F_s,

let us define the set function

μ_X((s, t] × F) := E(χ_F (X(t) − X(s))).

Recall⁵ that the predictable rectangles and the sets {0} × F, F ∈ F_0 generate the σ-algebra of the predictable sets P. Let

μ_X({0} × F) := 0,    F ∈ F_0.

Definition 5.4 If the set function μ_X has a unique extension to the σ-algebra P which is a measure on P then μ_X is called⁶ the Doléans-type measure of X.

Observe that the sets in (5.3) are in the σ-algebra generated by the predictable rectangles. Hence to prove the Doob–Meyer decomposition one should prove the following:

Proposition 5.5 If S ∈ D is a submartingale then the measure μ_S of S on the predictable sets is generated by a predictable process A ∈ A⁺, that is, there is a predictable process A ∈ A⁺ such that

μ_A(Y) = μ_S(Y),    Y ∈ P.    (5.4)
As a first step we prove that μ_S is really a measure on P.

Proposition 5.6 If S is a class D submartingale then the Doléans-type measure μ_S of S can be extended from the semi-algebra of the predictable rectangles to the σ-algebra of the predictable sets.

Proof. Denote by C the semi-algebra of the predictable rectangles. We want to use Carathéodory's extension theorem, so we should prove that μ_S is a measure on C. As S is a submartingale, μ_S is non-negative. μ_S is trivially additive, hence μ_S is monotone on C. For all C ∈ C, using that μ_S is monotone and (0, ∞] × Ω ∈ C,

μ_S(C) ≤ μ_S([0, ∞] × Ω) = μ_S({0} × Ω) + μ_S((0, ∞] × Ω) = μ_S((0, ∞] × Ω) := E(S(∞) − S(0)) ≤ E(|S(∞)|) + E(|S(0)|) < ∞.

Observe that in the last line we used that S is uniformly integrable and therefore S(∞) and S(0) are integrable. As μ_S is finite it is sufficient to prove that whenever C_n ∈ C and C_n ↓ ∅, then μ_S(C_n) ↓ 0. Let ε > 0 be arbitrary. If (s, t] × F ∈ C then

(s + 1/n, t] × F ⊆ [s + 1/n, t] × F ⊆ (s, t] × F.

S is a submartingale, so for every F ∈ F_s

E(χ_F [S(s + 1/n) − S(s)]) ≥ 0,    E(χ_{F^c} [S(s + 1/n) − S(s)]) ≥ 0.

S is uniformly integrable, hence for the sum of the two sequences above

lim_{n→∞} E(S(s + 1/n) − S(s)) = E(lim_{n→∞} S(s + 1/n) − S(s)) = E(S(s+) − S(s)) = 0,

hence

lim_{n→∞} E(χ_F [S(s + 1/n) − S(s)]) = 0,

so

lim_{n→∞} μ_S((s + 1/n, t] × F) := lim_{n→∞} E(χ_F [S(t) − S(s + 1/n)]) = E(χ_F [S(t) − S(s)]) := μ_S((s, t] × F).

Hence for every C_n ∈ C there are sets K_n and B_n ∈ C such that B_n ⊆ K_n ⊆ C_n, for all ω the sections K_n(ω) of K_n are compact, and

μ_S(C_n) < μ_S(B_n) + ε2^{−n}.    (5.5)

⁵ See: Corollary 1.44, page 26.
⁶ See: Definition 2.56, page 151.
Let us introduce the decreasing sequence L_n := ∩_{k≤n} B_k. C is a semi-algebra, hence L_n ∈ C for every n. Let L̄_n and B̄_n be the sets in which we close the time intervals of L_n and B_n. Then

L̄_n ⊆ B̄_n ⊆ K_n ⊆ C_n ↓ ∅.

Whenever the infimum below is finite, let

γ_n(ω) := inf{t : (t, ω) ∈ L̄_n} = min{t : (t, ω) ∈ L̄_n}.

We prove that γ_n(ω) ↑ ∞ for all ω. Otherwise γ_n(ω) ≤ K for some ω and some K < ∞, and (γ_n(ω), ω) ∈ L̄_n. The sets [0, K] ∩ L̄_n(ω) are compact and γ_n(ω) ∈ [0, K] ∩ L̄_n(ω) for all n, hence their intersection is non-empty. Let γ_∞ be in the intersection. Then (γ_∞, ω) ∈ L̄_n for all n, so (γ_∞, ω) ∈ ∩_n L̄_n ⊆ ∩_n C_n = ∅, which is impossible.

Let S = S(0) + M − P be the decomposition of S, where P is the potential part of S. As M is uniformly integrable, E(M(∞)) = E(M(γ_n)). Therefore

μ_S(L_n) ≤ E(S(∞) − S(γ_n)) = E(P(γ_n)).

As P is in class D, (P(γ_n ∧ t))_n is uniformly integrable for every t, so as γ_n ↑ ∞

lim_{n→∞} E(P(γ_n ∧ t)) = E(P(t)).

Using that P is a supermartingale,

lim sup_{n→∞} E(P(γ_n)) ≤ lim sup_{n→∞} E(P(γ_n ∧ t)) = E(P(t)).

As lim_{t→∞} E(P(t)) = 0, obviously μ_S(L_n) → 0. By (5.5)

μ_S(C_n \ L_n) := μ_S(C_n ∩ (∩_{k≤n} B_k)^c) = μ_S(C_n ∩ (∪_{k≤n} B_k^c)) ≤ Σ_{k=1}^n μ_S(C_n \ B_k) ≤ Σ_{k=1}^n μ_S(C_k \ B_k) ≤ ε,
hence

lim sup_{n→∞} μ_S(C_n) ≤ lim sup_{n→∞} μ_S(C_n \ L_n) + lim sup_{n→∞} μ_S(L_n) ≤ ε.

As ε > 0 was arbitrary, μ_S(C_n) ↓ 0, which proves the proposition.
Now we can finish the proof of the Doob–Meyer decomposition. Let us recall that by (5.4) one should prove that there is a predictable process A such that

μ_A(Y) = μ_S(Y),    Y ∈ P.    (5.6)

To construct A let us extend μ_S from P to the product measurable subsets of R₊ × Ω with the definition

μ(Y) := μ_S(ᵖY) := ∫_{R₊×Ω} ᵖχ_Y dμ_S.    (5.7)

Observe that as ᵖχ_Y is well-defined, the set function μ(Y) is also well-defined. If Y_1 and Y_2 are disjoint then by the additivity of the predictable projection

μ(Y_1 ∪ Y_2) := μ_S(ᵖ(Y_1 ∪ Y_2)) = ∫_{R₊×Ω} ᵖχ_{Y_1∪Y_2} dμ_S = ∫_{R₊×Ω} (ᵖχ_{Y_1} + ᵖχ_{Y_2}) dμ_S = μ_S(ᵖY_1) + μ_S(ᵖY_2) := μ(Y_1) + μ(Y_2),

so μ is additive. It is clear from the Monotone Convergence Theorem for the predictable projection that μ is σ-additive, hence μ is a measure. μ is absolutely continuous: if Y ⊆ R₊ × Ω is a negligible set, then there is a set N ⊆ Ω with probability zero such that Y can be covered by the random intervals [0, τ_n], where

τ_n(ω) := n if ω ∈ N, and τ_n(ω) := 0 if ω ∉ N.

As P(N) = 0 and as the usual conditions hold, τ_n is a stopping time for every n. Hence the intervals [0, τ_n] are predictable, and their Doléans measure is obviously zero. So

μ(Y) ≤ Σ_n μ([0, τ_n]) = Σ_n μ_S([0, τ_n]) = 0.
By the generalized Radon–Nikodym theorem⁷ we can represent μ with a predictable⁸ process A ∈ A⁺. Hence for all predictable Y

μ_A(Y) = μ(Y) := μ_S(ᵖY) = μ_S(Y),

therefore for this A (5.6) holds.

⁷ See: Proposition 3.49, page 208.
⁸ See: Proposition 3.51, page 211.

5.1.2 Dellacherie's formulas and the natural processes
In some applications of the Doob–Meyer decomposition it is more convenient to assume that in the decomposition the increasing process A is natural.

Definition 5.7 We say that a process V ∈ V is natural if for every non-negative, bounded martingale N

E(∫_0^t N dV) = E(∫_0^t N− dV).    (5.8)

Recall that for local martingales ᵖN = N−, hence (5.8) can be written as

E(∫_0^t N dV) = E(∫_0^t ᵖN dV).

Proposition 5.8 (Dellacherie's formula) If V ∈ A⁺ is natural then for every non-negative, product measurable process X

E(∫_0^∞ X dV) = E(∫_0^∞ ᵖX dV),    (5.9)
where the two sides exist or do not exist at the same time.

Proof. If η is a non-negative, bounded random variable and X := η·χ((s, t]), then for any partition (t_k^(n))_k of [s, t]

E(∫_0^∞ X dV) = E(η[V(t) − V(s)]) = E(η Σ_k [V(t_k^(n)) − V(t_{k−1}^(n))]) =
= Σ_k E(E(η[V(t_k^(n)) − V(t_{k−1}^(n))] | F_{t_k^(n)})) =
= E(Σ_k E(η | F_{t_k^(n)}) [V(t_k^(n)) − V(t_{k−1}^(n))]) := E(Σ_k M(t_k^(n)) [V(t_k^(n)) − V(t_{k−1}^(n))]).
By our general assumption the filtration satisfies the usual conditions, so M(t) := E(η | F_t) has a version which is a bounded, non-negative martingale. If

max_k (t_k^(n) − t_{k−1}^(n)) → 0,

then, using that M, as every martingale, is right-continuous,

M_n := Σ_k M(t_k^(n)) χ((t_{k−1}^(n), t_k^(n)]) → M.
η is bounded and V ∈ A⁺, hence the sum behind the expected value is dominated by an integrable variable, so by the Dominated Convergence Theorem

E(∫_0^∞ X dV) = lim_{n→∞} E(Σ_k M(t_k^(n)) [V(t_k^(n)) − V(t_{k−1}^(n))]) = E(lim_{n→∞} Σ_k M(t_k^(n)) [V(t_k^(n)) − V(t_{k−1}^(n))]) = E(lim_{n→∞} ∫_s^t M_n dV) = E(∫_s^t lim_{n→∞} M_n dV) = E(∫_s^t M dV).

Remember that if X := η·χ_I then⁹

ᵖX := ᵖ(η·χ_I) = M−·χ_I.

⁹ See: Corollary 3.43, page 206.

Using that V is natural,

E(∫_0^∞ X dV) = E(∫_s^t M dV) = E(∫_s^t M− dV) = E(∫_0^∞ M− χ((s, t]) dV) = E(∫_0^∞ ᵖX dV).

Hence for this special X (5.9) holds. These processes form a π-system. The bounded processes for which (5.9) is true form a λ-system, hence by the Monotone
Class Theorem one can extend (5.9) to the bounded processes which are measurable with respect to the σ-algebra generated by the processes X := η·χ((s, t]); hence (5.9) is true if X is a bounded product measurable process. To prove the proposition it is sufficient to apply the Monotone Convergence Theorem.

Proposition 5.9 (Dellacherie's formula) If A ∈ V and A is predictable then for any non-negative, product measurable process X

E(∫_0^∞ X dA) = E(∫_0^∞ ᵖX dA),

where the two sides exist or do not exist at the same time.

Proof. If A is predictable then Var(A) is also predictable, therefore we can assume that A is increasing. In this case the expressions in the expectations exist and they are non-negative. Define the process

σ(t, ω) := inf{s : A(s, ω) ≥ t}.

As A is increasing, σ(t, ω) is increasing and right-continuous in t for any fixed ω. As the usual conditions hold, σ_t, as a function of ω, is a stopping time for any fixed t. Observe that as A is right-continuous

[σ_t] ⊆ {A ≥ t},

so as A is predictable

Graph(σ_t) = [σ_t] = [0, σ_t] ∩ {A ≥ t} ∈ P,

hence σ_t is a predictable stopping time¹⁰. By the definition of the predictable projection

E(X(σ_t) χ(σ_t < ∞)) = E(ᵖX(σ_t) χ(σ_t < ∞)).

Let us remark that for every non-negative Borel measurable function f

∫_0^∞ f(u) dA(u) = ∫_0^∞ f(σ_t) χ(σ_t < ∞) dt.    (5.10)

To see this let us remark that A is right-continuous and increasing, hence

{t ≤ A(v)} = {σ_t ≤ v}.

So if f := χ([0, v]), then as A(0) = 0

∫_0^∞ f dA = A(v) = ∫_0^∞ χ(t ≤ A(v)) dt = ∫_0^∞ χ(σ_t ≤ v) dt = ∫_0^∞ f(σ_t) χ(σ_t < ∞) dt.

One can prove the general case in the usual way. As σ_t is predictable and as σ(t, ω) is product measurable, by Fubini's theorem

E(∫_0^∞ X dA) = E(∫_0^∞ X(σ_t) χ(σ_t < ∞) dt) = ∫_0^∞ E(X(σ_t) χ(σ_t < ∞)) dt = ∫_0^∞ E(ᵖX(σ_t) χ(σ_t < ∞)) dt = E(∫_0^∞ ᵖX dA).

¹⁰ See: Corollary 3.34, page 199.
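The time-change identity (5.10) is easy to test numerically for a concrete right-continuous increasing A. The sketch below (a hypothetical example, not from the book) takes a pure-jump A with A(0) = 0, computes σ_t = inf{s : A(s) ≥ t} explicitly from the jump structure, and compares the two sides for f = χ([0, v]):

```python
# A pure-jump, right-continuous increasing A with A(0) = 0:
# a jump of size 2 at u = 1 and a jump of size 3 at u = 3.
def A(u):
    return (2.0 if u >= 1 else 0.0) + (3.0 if u >= 3 else 0.0)

def sigma(t):
    # sigma_t = inf{s : A(s) >= t}, read off the jump structure of A
    if t <= 0:
        return 0.0
    if t <= 2:
        return 1.0
    if t <= 5:
        return 3.0
    return float("inf")

# Identity (5.10) with f = chi([0, v]): the left side is A(v), the right
# side is the Lebesgue measure of {t : sigma_t <= v}.
v = 2.0
lhs = A(v)
n, T = 60000, 6.0   # t-grid on (0, T]; sigma_t = infinity for t > 5
rhs = sum(T / n for i in range(1, n + 1) if sigma(T * i / n) <= v)
assert abs(lhs - rhs) < 1e-6
print(lhs, rhs)
```

The key step {t ≤ A(v)} = {σ_t ≤ v} is exactly what the grid sum evaluates: the set of t with σ_t ≤ 2 is the interval (0, A(2)] = (0, 2].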
Theorem 5.10 (Doléans) A process V ∈ A⁺ is natural if and only if V is predictable.

Proof. If V is natural, then by the first formula of Dellacherie if ᵖX = ᵖY, then μ_V(X) = μ_V(Y), hence by the uniqueness of the representation of μ_V the process V is predictable¹¹. To see the other implication assume that V is predictable. By the second formula of Dellacherie, for every product measurable process X

E(∫_0^∞ X dV) = E(∫_0^∞ ᵖX dV).
If N is a local martingale then¹² ᵖN = N−, hence V is natural.

Dellacherie's formulas have an interesting consequence. When the integrator is a continuous local martingale, the stochastic integral is meaningful whenever the integrand is progressively measurable. By Dellacherie's formulas even in this case the set of all possible integral processes is the same as the set of integral processes when the integrands are just predictable. Assume first that X ∈ L2(M). By Jensen's inequality (ᵖX)² ≤ ᵖ(X²), hence by the second Dellacherie formula ᵖX ∈ L2(M). [M, N] is continuous, hence it is predictable, so, also by Dellacherie's formula,

E(∫_0^∞ X d[M, N]) = E(∫_0^∞ ᵖX d[M, N]).

Hence during the definition of the stochastic integral the linear functionals

N → E(∫_0^∞ X d[M, N])    and    N → E(∫_0^∞ ᵖX d[M, N])

¹¹ See: Proposition 3.51, page 211.
¹² See: Proposition 3.38, page 204.
coincide. Hence X • M = ᵖX • M, and with localization: if X ∈ L2_loc(M) then ᵖX ∈ L2_loc(M) and X • M = ᵖX • M.

5.1.3 The sub-, super- and quasi-martingales are semimartingales
The main problem with the definition of the semimartingales is that it is very formal. An important consequence of the Doob–Meyer decomposition is that we can show some nontrivial examples of semimartingales. The most important direct application of the Doob–Meyer decomposition is the following:

Proposition 5.11 Every integrable¹³ sub- and supermartingale X is a semimartingale.

Proof. Let X be an integrable submartingale. To make the notation simple we shall assume that X(0) = 0.

1. Let τ be an arbitrary stopping time. We prove that, as in the case of martingales, X^τ is also a submartingale. Let s < t and A ∈ F_s. Let us define the bounded stopping time

σ := (τ ∧ t)χ_{A^c} + (τ ∧ s)χ_A.

As X is integrable one can use the Optional Sampling Theorem, hence as σ ≤ τ ∧ t

E(X(σ)) := E(X(τ ∧ t)χ_{A^c} + X(τ ∧ s)χ_A) ≤ E(X(τ ∧ t)) = E(X^τ(t)χ_{A^c} + X^τ(t)χ_A),

therefore

E(X^τ(s)χ_A) ≤ E(X^τ(t)χ_A),

which means that X^τ(s) ≤ E(X^τ(t) | F_s), that is, X^τ is a submartingale.

2. If the submartingale X is in class D then by the Doob–Meyer decomposition X is a semimartingale. One should prove that there is a localizing sequence (τ_n) for which X^{τ_n} is in class D for all n; then, as the Doob–Meyer decomposition is unique, the decomposition L_{n+1} + V_{n+1} of X^{τ_{n+1}} on the interval [0, τ_n] is indistinguishable from the decomposition L_n + V_n of X^{τ_n}. From this it is clear that X has the decomposition

L + V := lim_n L_n + lim_n V_n,

where L is a local martingale and V has finite variation.

3. Let us define the bounded stopping times

τ_n := inf{t : |X(t)| > n} ∧ n.

As X is integrable, by the Optional Sampling Theorem X(τ_n) ∈ L1(Ω). For all t

|X^{τ_n}(t)| ≤ n + |X(τ_n)| ∈ L1(Ω),

hence X^{τ_n} is a class D submartingale. Obviously τ_n ≤ τ_{n+1}. Assume that for some ω the sequence (τ_n(ω)) is bounded. In this case τ_n(ω) ↑ τ_∞(ω) < ∞, so there is an N such that if n ≥ N then τ_n(ω) < n. Hence |X(τ_n(ω))| ≥ n by the definition of τ_n, therefore the sequence (X(τ_n(ω))) is not convergent, which is a contradiction as, by the right-regularity of the submartingales, X has a finite left limit at τ_∞(ω).

The semimartingales form a linear space, therefore if X := Y − Z, where Y and Z are integrable, non-negative supermartingales, then X is also a semimartingale. Let us extend X to t = ∞: by definition let X(∞) := Y(∞) := Z(∞) := 0. As Y and Z are non-negative, after this extension they remain supermartingales¹⁴. Hence one can assume that Y, Z and X are defined on [0, ∞]. Let

∆ : 0 = t_0 < t_1 < ... < t_n < t_{n+1} = ∞    (5.11)

be an arbitrary subdivision of [0, ∞]. Let us define the expression

sup_∆ E(Σ_{i=0}^n |E(X(t_i) − X(t_{i+1}) | F_{t_i})|),    (5.12)

¹³ That is, X(t) is integrable for every t.
¹⁴ Observe that we used the non-negativity assumption.
where one should calculate the supremum over all possible subdivisions (5.11). Then

E(Σ_i |E(X(t_i) − X(t_{i+1}) | F_{t_i})|) ≤ E(Σ_i |E(Y(t_i) − Y(t_{i+1}) | F_{t_i})|) + E(Σ_i |E(Z(t_i) − Z(t_{i+1}) | F_{t_i})|).

Y is a supermartingale, hence

E(Y(t_i) − Y(t_{i+1}) | F_{t_i}) = Y(t_i) − E(Y(t_{i+1}) | F_{t_i}) ≥ 0,

therefore one can drop the absolute value. By the simple properties of the conditional expectation, using the assumption that Y is integrable,

E(Σ_{i=0}^n |E(Y(t_i) − Y(t_{i+1}) | F_{t_i})|) = E(Y(0)) − E(Y(∞)) = E(Y(0)) < ∞.
Applying the same to Z one can easily see that if X has the just mentioned decomposition then the supremum (5.12) is finite.

Definition 5.12 We say that the integrable¹⁵, adapted, right-regular process X is a quasi-martingale if the supremum in (5.12) is finite.

Proposition 5.13 (Rao) An integrable, right-regular process X defined on R₊ is a quasi-martingale if and only if it has a decomposition

X = Y − Z,

where Y and Z are non-negative supermartingales.

Proof. We have already proved one implication. We should only show that every quasi-martingale has the mentioned decomposition. X is defined on R₊, hence as above we shall assume that X(∞) := 0. Let us fix an s. For any subdivision ∆ : t_0 = s < t_1 < t_2 < ... of [s, ∞] let us define the two variables

C_∆^±(s) := E(Σ_i (E(X(t_i) − X(t_{i+1}) | F_{t_i}))^± | F_s).

¹⁵ That is, X(t) is integrable for every t.
The variables C_∆^±(s) are F_s-measurable. Let (∆_n) be an infinitesimal¹⁶ sequence of partitions of [s, ∞], and let us assume that ∆_n ⊆ ∆_{n+1}, that is, that we get ∆_{n+1} by adding further points to ∆_n. We shall prove that the sequences C_{∆_n}^±(s) are almost surely convergent and the limits are almost surely finite. First we prove that if the partition ∆′ is finer than ∆, then

C_∆^±(s) ≤ C_{∆′}^±(s),    (5.13)

which will imply the convergence. By the quasi-martingale property the set of variables C_∆^±(s) is bounded in L1(Ω). From the Monotone Convergence Theorem it is obvious that C_{∆_n}^±(s) ↑ ∞ cannot hold on a set which has positive measure. To prove (5.13) let us assume that the new point t is between t_i and t_{i+1}. Let us introduce the variables

ξ := E(X(t_i) − X(t) | F_{t_i}),    η := E(X(t) − X(t_{i+1}) | F_t),    ζ := E(X(t_i) − X(t_{i+1}) | F_{t_i}).

As ζ = ξ + E(η | F_{t_i}), by Jensen's inequality

ζ⁺ ≤ ξ⁺ + (E(η | F_{t_i}))⁺ ≤ ξ⁺ + E(η⁺ | F_{t_i}),

hence

E(ζ⁺ | F_s) ≤ E(ξ⁺ | F_s) + E(η⁺ | F_s),

from which the inequality (5.13) is trivial. Let us introduce the variables

C^±(s) := lim_{n→∞} C_{∆_n}^±(s).
Obviously C^±(s) is integrable and F_s-measurable. Let us observe that the variables C_{∆_n}^±(s) are defined up to a measure-zero set, hence the variables C^±(s) are also defined up to a measure-zero set. For arbitrary partitions ∆_n := (t_i^(n)), as X(∞) := 0 and as X is adapted,

C_{∆_n}^+(s) − C_{∆_n}^−(s) = E(Σ_i E(X(t_i^(n)) − X(t_{i+1}^(n)) | F_{t_i^(n)}) | F_s) = Σ_i E(X(t_i^(n)) − X(t_{i+1}^(n)) | F_s) = E(X(s) | F_s) − E(X(∞) | F_s) = X(s) almost surely.

¹⁶ As the length of [s, ∞] is infinite, this property means that we map [0, ∞] order-preservingly onto [0, 1] and require that (∆_n)_n be infinitesimal on [0, 1].
This remains valid after we take the limit, hence for all s, almost surely,

C⁺(s) − C⁻(s) = X(s).    (5.14)

Let us assume that t is in ∆_n for all n. As s < t,

E(C_{∆_n}^±(t) | F_s) = E(Σ_{t_i^(n) ≥ t} (E(X(t_i^(n)) − X(t_{i+1}^(n)) | F_{t_i^(n)}))^± | F_s) ≤ E(Σ_i (E(X(t_i^(n)) − X(t_{i+1}^(n)) | F_{t_i^(n)}))^± | F_s) = C_{∆_n}^±(s),

from which, taking the limit and using the Monotone Convergence Theorem for the conditional expectation,

E(C^±(t) | F_s) ≤ C^±(s).    (5.15)

Let (∆_n) be an infinitesimal sequence of partitions of [0, ∞]. Let S be the union of the points in (∆_n). Obviously S is dense in R₊. By the above, C^± are supermartingales on S. As S is countable, on S one can define the trajectories of C^± up to a measure-zero set. By the supermartingale property, except on a measure-zero set N, for every t the limit

D^±(t, ω) := C^±(t+, ω) := lim_{s↓t, s∈S} C^±(s, ω)

exists and D^±(t) is right-regular. X is also right-regular, hence from (5.14), on N^c, for every t ≥ 0

D⁺(t) − D⁻(t) = X(t).

D^±(t) is F_{t+1/n}-measurable for all n, hence D^±(t) is F_{t+}-measurable. As F satisfies the usual conditions, D^±(t) is F_t-measurable, that is, the processes D^± are adapted. If s_n ↓ t and s_n ∈ S, then the sequence (C^±(s_n)) is a reversed supermartingale. Hence for the L1(Ω) convergence of (C^±(s_n)) it is necessary and sufficient that the sequence be bounded in L1(Ω). By the supermartingale property, as (s_n) is decreasing, the expected value of (C^±(s_n))_n is increasing. By the quasi-martingale property the variables C^±(0) are integrable, hence by the non-negativity the sequences (C^±(s_n)) are bounded in L1(Ω). Hence they are convergent in L1(Ω). From this, D^±(t) is integrable for all t. The conditional expectation is continuous in L1(Ω), therefore one can take the limit in (5.15) into the conditional expectation. Hence the processes D^± are integrable supermartingales on R₊.

Corollary 5.14 Every quasi-martingale is a semimartingale.
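In discrete time the quantity (5.12) is simply the expected total conditional drift, so a quasi-martingale is a process whose drift has integrable total variation. A small sketch (a hypothetical example, not from the book): for X_n = S_n + bn, with S a symmetric random walk, E(X_{i+1} − X_i | F_i) = b, so the mean variation over {0, ..., N} is |b|N:

```python
import random

random.seed(1)

# X_n = S_n + b*n with S a symmetric random walk: a quasi-martingale.
# The discrete analogue of (5.12) is E( sum_i |E(X_{i+1} - X_i | F_i)| ) = |b|*N.
b, N, n_paths = 0.3, 20, 20000
paths = []
for _ in range(n_paths):
    x, path = 0.0, [0.0]
    for _ in range(N):
        x += random.choice([-1.0, 1.0]) + b
        path.append(x)
    paths.append(path)

var_X = 0.0
for i in range(N):
    # with i.i.d. increments, E(X_{i+1} - X_i | F_i) is the constant b,
    # estimated here by the average of the i-th increment across paths
    drift = sum(p[i + 1] - p[i] for p in paths) / n_paths
    var_X += abs(drift)

assert abs(var_X - abs(b) * N) < 0.2   # mean variation close to |b|*N = 6
print(var_X)
```

Refining the time grid cannot decrease this quantity, which is the monotonicity (5.13) exploited in Rao's proof.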
5.2 Semimartingales as Good Integrators
The definition of the semimartingales is quite artificial. In this section we present an important characterization of the semimartingales. We shall prove that the only class of integrators for which one can define a stochastic integral with reasonable properties is the class of the semimartingales. Recall the following definition:

Definition 5.15 A process E is a predictable step process if

E = Σ_{i=0}^n ξ_i χ((t_i, t_{i+1}]),

where 0 = t_0 < t_1 < ... < t_{n+1} and the ξ_i are F_{t_i}-measurable random variables.

If X is an arbitrary process then the only reasonable definition of the stochastic integral E • X is¹⁷

(E • X)(t) = Σ_i ξ_i (X(t_{i+1} ∧ t) − X(t_i ∧ t)).

For an arbitrary stochastic process X this definition obviously makes the integral linear over the linear space of the predictable step processes. On the other hand, it is reasonable to say that a linear mapping is an integral only if the correspondence has some continuity property. Let us endow the predictable step processes with the topology of uniform convergence in (t, ω), and the random variables with the topology of stochastic convergence.

Definition 5.16 We say that a process X is a good integrator if for every t the correspondence E → (E • X)(t) is a continuous, linear mapping from the space of predictable step processes to the set of random variables.

Observe that the required continuity property is very weak, as on the domain of definition we have a very strong topology and on the image space a very weak one. As the integral is linear, it is continuous if and only if it is continuous at E = 0. This means that if a sequence of step processes converges uniformly to zero, then for any t the integral on the interval (0, t] converges stochastically to zero.

¹⁷ See: Theorem 2.88, page 174, line (4.11), page 252. Recall that by definition (E • X)(t) is the integral on (0, t].
309
Theorem 5.17 (Bichteler–Dellacherie) An adapted, right-regular process X is a semimartingale if and only if it is a good integrator. Proof. If X is a semimartingale, then by the Dominated Convergence Theorem it is obviously a good integrator18 . Hence we have to prove only the other direction. We split the proof into several steps. 1. As a first step let us separate the ‘big jumps’ of X, that is let us separate from X the jumps of X which are larger than one. By the assumptions of the theorem the trajectories of X are regular so the ‘big jumps’ do not have an accumulation point. Hence the decomposition is meaningful. From this trivially follows that the process
∆Xχ (|∆X| ≥ 1)
has finite variations. As the continuity property of the good integrators holds for processes with finite variation Y X − ∆Xχ (|∆X| ≥ 1) is also a good integrator. If we prove that Y is a semimartingale, then we obviously prove that X is a semimartingale as well. Y does not contain ‘big jumps hence if it is a semimartingale, then it is a special semimartingale19 . Therefore the decomposition of Y is unique20 . As the decomposition is unique it is sufficient to prove that Y is a semimartingale on every interval [0, t]. 2. As we have already seen21 if probability measures P and Q are equivalent, that is the measure-zero sets under P and Q are the same, then X is a semimartingale under P if and only if it is a semimartingale under Q. Therefore it is sufficient to prove that if X is a good integrator under P then one can find a probability measure Q which is equivalent to P and X is a semimartingale under Q. Observe that a sequence of random variables is stochastically convergent to some random variable if and only if any subsequence of the original sequence has another subsequence which is almost surely convergent to the same function. Therefore the stochastic convergence depends only on the collection of measurezero sets, which is not changing during the equivalent change of measure. From this it is obvious that the class of good integrators is not changing under the equivalent change of measure. 3. Let us fix an interval [0, t]. As the trajectories of X are regular the trajectories are bounded on any finite interval. Hence η sups≤t |X (s)| < ∞. Again by the regularity of the trajectories it is sufficient to calculate the supremum over the rational points s ≤ t. Therefore η is a random variable. Let Am {m ≤ η < m + 1} and ζ m 2−m χAm . ζ is evidently bounded, and as 18 See:
Lemma 2.12, page 118. Example 4.47, page 258. 20 See: Corollary 3.41, page 205. 21 See: Corollary 4.58, page 271. 19 See:
310
SOME OTHER THEOREMS
η is finite ζ is trivially positive. As E (ηζ) =
E η2−m χ (m ≤ η < m + 1) ≤ (m + 1) 2−m
m
m
it is obvious that ηζ is integrable under P. 1 R (A) E (ζ)
ζdP A
is a probability measure and as ζ is positive it is equivalent to P. For every s ≤ t
|X (s)| dR ≤
Ω
ηdR = Ω
1 E (ζ)
ηζdP < ∞, Ω
therefore X (s) is integrable under R for all s. To make the notation simple we assume that X (s) are already integrable under P for all s ∈ [0, t]. 4. Let us define the set B {(E • X) (t) : |E| ≤ 1, E ∈ E} ,
(5.16)
where E is the set of predictable step processes over [0, t]. Using the continuity property of the good integrators we prove that B is stochastically bounded, that is for every ε > 0 there is a number k, such that P (|η| ≥ k) < ε for all η ∈ B. If it was not true then there were an ε > 0, a sequence of step processes |En | ≤ 1 and kn ∞, such that P
(En • X) (t) ≥1 kn
≥ ε.
The sequence (En /kn ) is uniformly converging to zero, hence by the continuity property of the good integrators (En • X) (t) = kn
En P • X (t) → 0, kn
which is, by the indirect assumption, is not true. 5. As a last step of the proof in the next point we shall prove that for every non-empty, stochastically bounded, convex subset B of L1 there is a probability measure Q which is equivalent to P and for which
βdQ : β ∈ B
sup Ω
c < ∞.
(5.17)
SEMIMARTINGALES AS GOOD INTEGRATORS
311
From this the theorem follows as for every partition of [0, t] 0 = t0 < t1 < . . . < tn+1 = t if22
ξ i sgn EQ (X (ti+1 ) − X (ti ) | Fti ) , and E
ξ i χ ((ti , ti+1 ])
i
then as |E| ≤ 1 (E • X) (t) ∈ B, therefore Q
c ≥ E ((E • X) (t)) =
n
EQ (ξ i [X (ti+1 ) − X (ti )]) =
i=0
=
n
EQ EQ (ξ i [X (ti+1 ) − X (ti )] | Fti ) =
i=0
=
n
EQ ξ i EQ (X (ti+1 ) − X (ti ) | Fti ) =
i=0
n Q E (X (ti ) − X (ti+1 ) | Ft ) . =E i Q
i=0
Hence X is a quasi-martingale under Q. Therefore23 it is a semimartingale under Q. 6. Let B ⊆ L1 (Ω) be a non-empty stochastically bounded convex convex set24 . We prove the existence of the equivalent measure Q in (5.17) with the Hahn– ∞ Banach theorem. Let L∞ + denote the set of non-negative functions in L . H
ζ ∈ L∞ + : sup
βζdP : β ∈ B
<∞ .
Ω 22 Of
EQ
course denotes the expected value under Q. Corollary 5.14, page 307. 24 One can assume that B is stochastically bounded above that is for every ε > 0 there is a k (ε) such that P (B ≥ k (ε)) ≤ ε. 23 See:
312
SOME OTHER THEOREMS
It is sufficient to prove that H contains a strictly positive function ζ 0 , since in this case 1 Q (A) ζ dP E (ζ 0 ) A 0 is an equivalent probability measure for which (5.17) holds. Let G be the set of points of positivity of the functions in H. The set G is closed under the countable union: if ζ n ∈ H, and
βζ n dP : β ∈ B
sup
≤ cn
Ω
cn ≥ 1 then n
2−n ζ ∈ H. cn ζ n ∞ n
Using the lattice property of G in the usual way one can prove that G contains a set D which has maximal measure, that is P (G) ≤ P (D) for all G ∈ G. Of course to D there is a ζ D ∈ H. We should prove that P (D) = 1, hence in this case ζ D ∈ H, as an equivalence class, it is strictly positive. Let us denote by C the complement of D. We shall prove that P (C) = 0. As an indirect assumption let us assume that P (C) ε > 0.
(5.18)
As B is stochastically bounded to our ε > 0 in (5.18) there is a k, such that P (β ≥ k) ≤ ε/2 for all random variable β ∈ B. From this θ 2kχC ∈ / B. Of course, if ϑ ≥ 0, then P (θ + ϑ ≥ k) ≥ ε hence θ + ϑ ∈ / B, that is θ ∈ / B − L1+ . We can prove a bit more: θ is not even in the closure in L1 (Ω) of the convex25 set B − L1+ . That is
θ∈ / cl B − L1+ . P
If γ n β n − ϑn → θ in L1 (Ω), then γ n → θ, but if δ is small enough, then as ϑn ≥ 0 P (|γ n − θ| > δ) P (|β n − ϑn − θ| > δ) ≥ ≥ P ({β n < k} ∩ {θ ≥ 2k}) = = P ({β n < k} ∩ C) = P (C\ {β n ≥ k}) ≥ 25 The
B is conves hence B − L1+ is also convex.
ε , 2
SEMIMARTINGALES AS GOOD INTEGRATORS
313
which is impossible. By the Hahn–Banach theorem26 there is a ζ = 0 ∈ L∞ (Ω) , such that
(β − ϑ) ζdP <
θζdP,
β ∈ B, ϑ ∈ L1+ .
(5.19)
Ω
Ω
Observe that ζ ≥ 0 as if ζ was negative with positive probability then the lefthand side of (5.19) for some ϑ ∈ L1+ would be greater than the fix value on the right-hand side. As ζ = 0 obviously ζ is positive on some subset U ⊆ C with positive measure. Taking ϑ = 0
θζdP c < ∞,
βζdP < Ω
β∈B
Ω
that is ζ ∈ H. Extending the set D with the support of ζ as U ⊆ C = Dc one can get a set in G which has larger measure than D. This contradicts to the definition of D. Theorem 5.18 (Stricker) Let X be a semimartingale under a filtration F and let Gt ⊆ Ft for all t for some filtration G. If X is adapted to G then X is a semimartingale under G as well. Proof. The set of step processes under G under F.
are also step processes
Example 5.19 The theorem of Stricker is not valid for local martingales.
Let us remark that the above property holds for martingales as well. The problem with the local martingales comes from the fact that when one shrinks the filtration the set of stopping times can also shrink. Let η be a symmetric random variable which does not have an expected value. Let us assume that the density function of η is continuous and strictly positive. Let X (t)
0 η
if t < 1 . if t ≥ 1
Let the filtration Ft
σ (|η|) σ (η)
if t < 1 . if t ≥ 1
26 Using that L∞ is the dual of L1 and that every convex closed set can be strictly separated from any point of its complement.
314
SOME OTHER THEOREMS
The τ n (ω)
0 if |η| ≥ n ∞ if |η| < n
is stopping time under F, and as η is symmetric X τ n is a martingale. The filtration generated by X is Gt
{0, Ω} if t < 1 . σ (η) if t ≥ 1
The τ n is not a stopping time under G, as by the assumptions about the density function of η {τ n ≤ 0} = {τ n = 0} = {|η| ≥ n} ∈ / {0, Ω} = G0 . Let τ be a stopping time for the G. If on a set of positive measure τ ≥ 1, then almost surely τ ≥ 1, therefore X τ (1) = X (1) = η is not integrable27 , so X is not a local martingale under G.
5.3
Integration of Adapted Product Measurable Processes
Let M be a continuous local martingale. If the processes X and Y are almost everywhere equal under the Dol´eans measure28 αM then by the definition of L2loc (M ) they belong to the same equivalence class. Hence if the integrals X • M and Y • M exist they are indistinguishable. Using this in certain cases we can extend the integration from progressively measurable processes to adapted product measurable functions. Proposition 5.20 If M is continuous local martingale and αM λ × P then one can define the stochastic integral X •M for every adapted product measurable process X. Proof. The proposition directly follows from the next proposition. Proposition 5.21 Every adapted product measurable process X is λ × P 0 equivalent to a progressively measurable process X. Proof. We divide the proof into several steps. One can assume that X is defined on an interval [a, b) and X is bounded. 27 The G does not satisfy the usual conditions but adding the measure-zero sets is not solving the problem. 28 See: Definition 2.56, page 151.
1. Introduce the functions φ_n : R → R,

φ_n(t) := (k − 1)/2^n    if t ∈ ((k − 1)/2^n, k/2^n].

As a first step we prove that for any s ≥ 0

t − 1/2^n ≤ φ_n(t − s) + s < t.        (5.20)

As s ≥ 0, there is an integer m ≥ 0 such that

m/2^n ≤ s < (m + 1)/2^n.

Hence, if t ∈ ((k − 1)/2^n, k/2^n], then either

t − s ∈ ((k − m − 2)/2^n, (k − m − 1)/2^n]    or    t − s ∈ ((k − m − 1)/2^n, (k − m)/2^n],

so φ_n(t − s) = (k − m − 2)/2^n in the first case and φ_n(t − s) = (k − m − 1)/2^n in the second case. By a simple calculation, in the first case

t − 1/2^n ≤ (k − m − 2)/2^n + s = φ_n(t − s) + s < t,

and in the second case

t − 1/2^n ≤ (k − m − 1)/2^n + s = φ_n(t − s) + s < t.
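The dyadic rounding function and the sandwich inequality (5.20) can be checked mechanically. The sketch below uses the closed form φ_n(t) = (⌈t·2^n⌉ − 1)/2^n, which reproduces the piecewise definition above:

```python
import math
import random

def phi(t, n):
    # phi_n(t) = (k-1)/2^n for t in ((k-1)/2^n, k/2^n]
    return (math.ceil(t * 2**n) - 1) / 2**n

random.seed(1)
n = 6
for _ in range(1000):
    t = random.uniform(0.0, 10.0)
    s = random.uniform(0.0, t)       # 0 <= s and t - s > 0
    v = phi(t - s, n) + s
    # the sandwich (5.20): t - 1/2^n <= phi_n(t - s) + s < t
    assert t - 2**-n - 1e-12 <= v < t
```

The small tolerance only guards against floating-point rounding; the exact inequality is what the proof establishes.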
2. Fix an s ≥ 0, and define the stochastic processes

X_n(s, t, ω) := X(φ_n(t − s) + s, ω).

As X is perhaps defined only on the interval [a, b), it can happen that X_n(s, t, ω) is not defined, so to keep the notation as simple as possible we set X_n(s, t, ω) := 0 whenever φ_n(t − s) + s < a. As φ_n(t − s) + s < t and X(t, ω) is F_t-adapted, if t ∈ [a, b] then X_n(s, t, ω) is also F_t-adapted. For fixed s and ω the function t → X_n(s, t, ω) is a left-continuous step function on [a, b], so X_n(s, t, ω) is progressively measurable for every s. If s₁ < s₂ are two points in the interval [i/2^n, (i + 1)/2^n) for some integer i ≥ 0, then φ_n(t − s₁) + s₁ ≠ φ_n(t − s₂) + s₂, since otherwise

1/2^n > s₂ − s₁ = φ_n(t − s₁) − φ_n(t − s₂) > 0,

which is impossible as every value of φ_n is of the form k/2^n. Using (5.20) one can easily prove that s → φ_n(t − s) + s is an injective map of [i/2^n, (i + 1)/2^n) into [t − 1/2^n, t). Using the definition of this map^{29}, for all t

∫_{i/2^n}^{(i+1)/2^n} |X_n(s, t) − X(t)| ds = ∫_{i/2^n}^{(i+1)/2^n} |X(φ_n(t − s) + s) − X(t)| ds ≤
≤ ∫_{t−1/2^n}^{t} |X(u) − X(t)| du = ∫_0^{1/2^n} |X(t − h) − X(t)| dh.

^{29} The function y := φ_n(t − x) + x is, piecewise, a translation x → x + const, with a finite number of jumps in the interval.
Using this, if

I := E( ∫_a^b ∫_0^1 |X_n(s, t, ω) − X(t, ω)| ds dt ),

then by Fubini's theorem, using that X is product measurable,

I = E( Σ_{i=0}^{2^n−1} ∫_a^b ∫_{i/2^n}^{(i+1)/2^n} |X_n(s, t, ω) − X(t, ω)| ds dt ) ≤
≤ 2^n E( ∫_a^b ∫_0^{1/2^n} |X(t − h, ω) − X(t, ω)| dh dt ) =
= 2^n ∫_0^{1/2^n} E( ∫_a^b |X(t − h, ω) − X(t, ω)| dt ) dh ≤
≤ (2^n/2^n) · max_{0≤h≤2^{−n}} E( ∫_a^b |X(t − h, ω) − X(t, ω)| dt ).        (5.21)
3. Let f be an integrable function over R with respect to λ. We show that

lim_{h→0} ∫_R |f(t − h) − f(t)| dt = 0.        (5.22)

The relation is trivial if f is continuous. Let T_h f(t) := f(t − h). Obviously ‖T_h f‖₁ = ‖f‖₁. As λ is regular, the continuous functions are dense in L¹(R, λ). If g is continuous and ‖f − g‖₁ < ε then

‖T_h f − f‖₁ ≤ ‖T_h f − T_h g‖₁ + ‖T_h g − g‖₁ + ‖g − f‖₁ ≤ 2ε + ‖T_h g − g‖₁,

from which (5.22) is obvious.

4. X is bounded, hence if n → ∞ then, by the result just proved and by the Dominated Convergence Theorem, using that [a, b] is finite, the last expression in (5.21) goes to zero. Hence X_n(s, t, ω) → X(t, ω) in the space L¹([0, 1] × [a, b] × Ω). For a subsequence, X_{n_k}(s, t, ω) → X(t, ω) almost surely. Let H be the set of points (s, t, ω) where the convergence holds. By Fubini's theorem, on a subset of the product space which has full measure, almost all the sections parallel with the coordinate axes have full measure. Hence for almost all s ∈ [0, 1] the measure of H_s := {(t, ω) : (s, t, ω) ∈ H} is equal to the measure of [a, b] × Ω. Hence there is a point s for which this holds, that is, there is an s such that for almost all (t, ω)

lim_{k→∞} X_{n_k}(s, t, ω) = X(t, ω).

If X̂ := lim inf_{k→∞} X_{n_k}(s), then X̂ is progressively measurable. On the other hand, X̂(t) = X(t) for almost all^{30} t.
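Step 3, the L¹-continuity of translation, is easy to see numerically. A sketch with f = χ_{[0,1]}, for which ∫|f(t − h) − f(t)| dt = 2h exactly, so the integral visibly tends to 0 with h:

```python
import numpy as np

# grid for f = indicator of [0,1]; the L1 distance between f and its
# h-translate T_h f equals 2h exactly for 0 < h < 1
t = np.linspace(-1.0, 3.0, 400_001)
dt = t[1] - t[0]
f = ((t >= 0.0) & (t <= 1.0)).astype(float)

for h_steps in (1000, 100, 10):
    h = h_steps * dt
    shifted = np.roll(f, h_steps)           # f(t - h) on the grid
    val = np.sum(np.abs(shifted - f)) * dt  # ~ integral of |T_h f - f|
    assert abs(val - 2.0 * h) < 1e-6        # -> 0 as h -> 0
```

For a general integrable f the proof in the text replaces this exact computation by approximation with continuous functions.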
Corollary 5.22 Let w be a Wiener process. In the definition of the stochastic integral X • w the integrand X can be an arbitrary adapted product measurable process for which

P( ∫_0^t X²(s) ds < ∞ ) = 1.

Proof. Let X be an adapted product measurable process and let X̂ be the progressively measurable process in the previous proposition. X = X̂ almost surely with respect to λ × P, so by Fubini's theorem almost surely

∫_0^t (X − X̂)² ds = 0

for almost all t. As the integral is continuous in t, the relation is valid for every t. Let N be a continuous local martingale. To construct the integral X • w one should show that X • [w, N] is adapted. By the Kunita–Watanabe inequality

E( ( ∫_0^t (X − X̂) d[w, N] )² ) ≤ E( [N](t) ∫_0^t (X − X̂)² ds ) = 0.

So almost surely

∫_0^t X d[w, N] = ∫_0^t X̂ d[w, N]

for every t. But as X̂ is progressively measurable, the integral on the right-hand side is adapted. As the filtration contains the measure-zero sets, the integral on the left-hand side is also adapted.

^{30} And not for all t.
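The non-anticipating character of the integral X • w defined here can be sketched with left-endpoint Riemann sums. These satisfy an exact algebraic identity whose quadratic term Σ(∆w)² approximates t, recovering the Itô formula ∫_0^T w dw = (w(T)² − T)/2:

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(T / N), N)        # Wiener increments
W = np.concatenate(([0.0], np.cumsum(dW)))     # the sampled path

# left-endpoint (adapted, non-anticipating) Riemann sum for int w dw
ito = np.sum(W[:-1] * dW)

# exact identity: sum W_i dW_i = (W_T^2 - sum dW_i^2) / 2,
# and sum dW_i^2 ~ T, the quadratic variation of w on [0, T]
assert np.isclose(ito, (W[-1] ** 2 - np.sum(dW ** 2)) / 2.0)
assert abs(np.sum(dW ** 2) - T) < 0.05
```

Evaluating the integrand at the right endpoints instead would shift the sum by Σ(∆w)² ≈ T, which is why the choice of sampling point matters for stochastic, unlike classical, integration.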
5.4
Theorem of Fubini for Stochastic Integrals
In this section we discuss a generalization of Fubini's theorem to stochastic integrals. Let X be a semimartingale on a stochastic base (Ω, A, P, F) and let (C, C, µ) be an arbitrary finite measure space. In this generalization we want to interchange a stochastic and a classical abstract integral:

∫_C (H • X)(c) dµ(c) = ( ∫_C H(c) dµ(c) ) • X.

As in the classical case, the discussion is built on the Monotone Class Theorem and on the Dominated Convergence Theorems: first, with the Monotone Class Theorem, one proves the theorem in the bounded case; then one generalizes it to unbounded integrands. Let us briefly discuss the general properties of the expressions in the equality. Assume that H is bounded. As H is bounded and µ is finite, by the classical theorem of Fubini the classical parametric integral ∫_C H(c, t, ω) dµ(c) is bounded and measurable. Hence the double integral on the right-hand side obviously exists. On the other hand, on the left-hand side the parametric stochastic integral (H • X)(c) is not necessarily bounded. Classical integration is a monotone, order-preserving operation, but stochastic integration is not! As (H • X)(c) is not necessarily bounded, it is not obvious for a fixed t and ω why the classical integral ∫_C (H • X)(c) dµ(c) should exist. The same problem arises when one wants to use the Dominated Convergence Theorem: on the right-hand side, as the classical integral is order preserving, one can easily apply the Dominated Convergence Theorem, but on the left-hand side we cannot use it directly. Recall that in the general case we have defined the stochastic integral only for predictable processes. On the other hand, if the integrator is continuous then the set of possible integrands is a subset of the progressively measurable processes. If the integrator is a Wiener process then one can assume that the integrand is adapted and product measurable. In this section we shall denote by G the σ-algebra of subsets of R₊ × Ω with respect to which the integrands of the stochastic integrals with respect to X are measurable. As a generalization of the classical approach, first we discuss the measurability of the parametric stochastic integrals.

Proposition 5.23 (Measurability of the parametric integrals) Let X be a semimartingale and let (C, C) be an arbitrary measurable space. If H(c, t, ω) is measurable with respect to the product σ-algebra C × G and H(c) is integrable with
respect to X for every c, then the parametric stochastic integral c → H(c) • X has a version, denoted by (H • X)(c), which is measurable with respect to the product σ-algebra C × B(R₊) × A. This means that there is a function Z(c, t, ω) := (H • X)(c, t, ω) which is:

1. measurable with respect to the product σ-algebra C × B(R₊) × A,
2. Z(c) is almost surely right-regular and adapted in t for any fixed c,
3. the function (t, ω) → Z(c, t, ω) is indistinguishable from the stochastic integral^{31} H(c) • X for any fixed c.

Proof. The proof is nearly the same as the proof of the classical theorem of Fubini. Let S be the set of bounded processes on the product C × R₊ × Ω for which the proposition holds. For an arbitrary set B ⊆ C, as the stochastic integration is homogeneous, χ_B(H • X) = (χ_B H) • X, where for every c ∈ C the two sides are indistinguishable. Of course, if B ∈ C then for any fixed version of H • X the function Z(c, t, ω) := χ_B(c)(H • X) satisfies the proposition. If H(c, t, ω) = H₁(c) H₂(t, ω), where H₁ is a C-measurable step function and H₂ is G-measurable, then by the linearity^{32} of the stochastic integral the product H₁(H₂ • X) satisfies the proposition. S is obviously a vector space. One should prove that S is a λ-system, that is, that S is closed under monotone convergence; in this case, using the Monotone Class Theorem, we obtain the property in the proposition for every bounded (C × G)-measurable process. To prove this monotone class property one needs the following lemma:

Lemma 5.24 Let (Z_n(c, t, ω))_n be a sequence of right-regular, (C × B(R₊) × A)-measurable functions. If for all c the sequence (Z_n(c))_n is uniformly convergent on compacts in probability, then there is a function Z which is

1. (C × B(R₊) × A)-measurable,
2. Z(c) is almost surely right-regular in t for every c,
3. Z_n(c) →^{ucp} Z(c) for every c.

Proof. Let us denote by D[0, ∞) the space of right-regular functions. D[0, ∞) is a complete metric space with the topology of uniform convergence on compacts. Let us denote by d the metric of D[0, ∞). As the functions in D[0, ∞) are right-regular, for every interval [0, s] one can calculate the supremum in the seminorm sup_{t≤s} |f(t)| over the rational numbers of [0, s] alone; hence the product measurability of the functions Z_n implies the measurability of the value of the

^{31} Observe that the stochastic integral is defined up to indistinguishability.
^{32} Recall that the equality (aH) • X = a(H • X) means that the two sides are indistinguishable, hence we can apply the linearity only for finite valued processes H₁.
semi-norms sup_{t≤s} |Z_n(t)|; therefore d(Z_i(c, ω), Z_j(c, ω)) is measurable in (c, ω) for all i and j. For all c let n₀(c) := 1. By induction define the sequence

n_k(c) := inf{ m > n_{k−1}(c) : sup_{i,j≥m} P( d(Z_i(c), Z_j(c)) > 2^{−k} ) < 2^{−k} }.

As we observed, the real valued functions d(Z_i(c, ω), Z_j(c, ω)) are measurable in (c, ω); therefore by Fubini's theorem the probability in the formula depends on c in a measurable way. Hence n_k is a measurable function of c. Let us define the 'stopped variables'

Y_k(c, t, ω) := Z_{n_k(c)}(c, t, ω).

For every open set G

{Y_k ∈ G} = ∪_p {n_k = p, Z_p ∈ G},

therefore Y_k is also product measurable. For all c

Σ_k sup_{i,j≥k} P( d(Y_i(c), Y_j(c)) > 2^{−k} ) ≤ Σ_k 2^{−k} < ∞,

hence for every c, by the Borel–Cantelli lemma, if the indexes i, j are big enough then, except on a measure-zero set ω ∈ N(c),

d(Y_i(c, ω), Y_j(c, ω)) ≤ 2^{−k}.

D[0, ∞) is complete, hence (Y_i(c, ω)) is almost surely convergent in D[0, ∞) for all c. The function

Z(c, t, ω) := lim_i Y_i(c, t, ω) if the limit exists, and Z(c, t, ω) := 0 otherwise,

is product measurable and Z is right-regular almost surely for all c. For an arbitrary c, (Y_i(c) − Z(c)) is a subsequence of (Z_n(c) − Z(c)); therefore it is stochastically convergent in D[0, ∞). The measure is finite, therefore for metric space valued random variables almost sure convergence implies stochastic convergence. Hence Z(c, ω) is the limit of the sequence (Z_n(c, ω)) for almost all ω.

Returning to the proof of the proposition, let us assume that H_n ∈ S and 0 ≤ H_n ↑ H, where H is bounded. By the Dominated Convergence Theorem H_n(c) • X →^{ucp} H • X for every c. Hence by the lemma H • X has a (C × B(R₊) × A)-measurable version. That is, H ∈ S. Hence the proposition is valid for bounded processes. If H is not bounded, then let H_n := Hχ(|H| ≤ n). The processes H_n are also (C × G)-measurable, and of course they are bounded. Therefore the processes H_n • X have the stated version. By the Dominated Convergence
Theorem H_n(c) • X →^{ucp} H(c) • X for every c. By the lemma this means that H(c) • X also has a measurable version.

Theorem 5.25 (Fubini's theorem for bounded integrands) Let X be a semimartingale and let (C, C, µ) be an arbitrary finite measure space. Let H(c, t, ω) be a function measurable with respect to the product σ-algebra C × G. Let us denote by (H • X)(c) the product measurable version of the parametric integral c → H(c) • X. If H(c, t, ω) is bounded, then

∫_C (H • X)(c) dµ(c) = ( ∫_C H(c) dµ(c) ) • X,        (5.23)

that is, the integral of the parametric stochastic integral on the left side is indistinguishable from the stochastic integral on the right side.

Proof. It is not a big surprise that the proof is built on the Monotone Class Theorem again.

1. By the Fundamental Theorem of Local Martingales the semimartingale X has a decomposition X(0) + V + L, where V ∈ V and L ∈ H²_loc. For V ∈ V one can prove the equality by the classical theorem of Fubini, hence one can assume that X ∈ H²_loc. One can easily localize the right side of (5.23); on the left side one can interchange the localization and the integration with respect to c, therefore one can assume that X(0) = 0 and X ∈ H². Therefore^{33} we can assume that E([X](∞)) < ∞.

2. Let us denote by S the set of bounded, (C × G)-measurable processes for which the theorem holds. If H := H₁(c) H₂(t, ω), where H₁ is a C-measurable step function, H₂ is G-measurable, and H₁ and H₂ are bounded functions, then, arguing as in the previous proposition,
∫_C H • X dµ = ∫_C (H₁(c) H₂) • X dµ(c) = ∫_C ( (Σ_i α_i χ_{B_i}) H₂ ) • X dµ(c) =
= Σ_i α_i ∫_C χ_{B_i}(H₂ • X) dµ(c) = ( ∫_C H₁(c) dµ(c) ) (H₂ • X) =
= ( ∫_C H₁(c) dµ(c) H₂ ) • X = ( ∫_C H dµ ) • X,

so H ∈ S.

^{33} See: Proposition 3.64, page 223.
3. By the Monotone Class Theorem, one should prove that S is a λ-system. Let H_n ∈ S and let 0 ≤ H_n ↑ H, where H is bounded. We prove that one can take the limit in the equation

∫_C (H_n • X) dµ = ( ∫_C H_n dµ ) • X.

As H is bounded and µ is finite, on the right-hand side the integrands are uniformly bounded, so one can apply the classical and the stochastic Dominated Convergence Theorems; hence on the right-hand side

( ∫_C H_n dµ ) • X →^{ucp} ( ∫_C H dµ ) • X.        (5.24)
4. Introduce the notations Z_n := H_n • X and Z := H • X. One should prove that the left-hand side is also convergent, that is,

δ := sup_t | ∫_C Z_n(c) dµ(c) − ∫_C Z(c) dµ(c) | →^P 0.

By the inequalities of Cauchy–Schwarz and Doob

E(δ) ≤ E( ∫_C sup_t |Z_n(c) − Z(c)| dµ(c) ) ≤
≤ √(µ(C)) · √( E( ∫_C sup_t |Z_n(c) − Z(c)|² dµ(c) ) ) =
= √(µ(C)) · √( ∫_C E( sup_t |Z_n(c) − Z(c)|² ) dµ(c) ) ≤
≤ √(µ(C)) · √( 4 ∫_C E( (Z_n(c, ∞) − Z(c, ∞))² ) dµ(c) ).

By Itô's isometry^{34} the last integral is

∫_C E( ∫_0^∞ (H_n − H)² d[X] ) dµ.        (5.25)

As µ and E([X](∞)) are finite, and as the integrand is bounded and H_n → H, by the classical Dominated Convergence Theorem (5.25) goes to zero.

^{34} See: Proposition 2.64, page 156.
So E(δ) → 0, that is

∫_C (H_n • X) dµ = ∫_C Z_n dµ →^{ucp} ∫_C Z dµ = ∫_C (H • X) dµ.        (5.26)

In particular, almost surely

sup_t ∫_C |Z_n(c) − Z(c)| dµ(c) < ∞.

The expression

( ∫_C H_n(c) dµ(c) ) • X = ∫_C (H_n(c) • X) dµ(c) = ∫_C Z_n dµ

is meaningful, therefore for all t and for almost all outcomes ω

∫_C |(H(c) • X)(t, ω)| dµ(c) = ∫_C |Z(c, t, ω)| dµ(c) < ∞.

Hence the left-hand side of (5.23) is meaningful for H as well. By (5.24) the right-hand side is also convergent, hence from (5.26)

( ∫_C H dµ ) • X = lim_{n→∞} ( ∫_C H_n dµ ) • X = lim_{n→∞} ∫_C (H_n • X) dµ = ∫_C (H • X) dµ.
The just proved stochastic generalization of Fubini's theorem is sufficient for most applications. On the other hand, one can still be interested in the unbounded case:

Theorem 5.26 (Fubini's theorem for unbounded integrands) Let X be a semimartingale and let (C, C, µ) be a finite measure space. Let H(c, t, ω) be a (C × G)-measurable process, and assume that the expression

‖H(t, ω)‖₂ := √( ∫_C H²(c, t, ω) dµ(c) ) < ∞        (5.27)

is integrable with respect to X. Under these conditions, for µ-almost all c the stochastic integral H(c) • X exists, and if (H • X)(c) denotes the measurable version of this parametric integral, then

∫_C (H • X)(c) dµ(c) = ( ∫_C H(c) dµ(c) ) • X.        (5.28)
Proof. If in place of H one puts H_n := Hχ(|H| ≤ n), then the equality holds by the previous theorem. As in the proof of the classical Fubini theorem, one should take the limit on both sides of the truncated equality.

1. Let us first investigate the right-hand side of the equality. By the Cauchy–Schwarz inequality

∫_C |H(c, t, ω)| dµ(c) ≤ √(µ(C)) · √( ∫_C H²(c, t, ω) dµ(c) ).        (5.29)

By the assumptions µ is finite and H(c, t, ω), as a function of c, is in the space L²(µ) ⊆ L¹(µ); hence by the Dominated Convergence Theorem, for all (t, ω),

∫_C H_n(c, t, ω) dµ(c) → ∫_C H(c, t, ω) dµ(c).

By the just proved inequality (5.29) the processes ∫_C H dµ and ∫_C |H| dµ are integrable with respect to X, hence by the Dominated Convergence Theorem for stochastic integrals ( ∫_C H_n dµ ) • X →^{ucp} ( ∫_C H dµ ) • X. This means that one can take the limit on the right-hand side of the equation.

2. Now let us investigate the left-hand side. We first prove that for almost all c the integral H(c) • X exists. Let X = X(0) + V + L, where V ∈ V and L ∈ L, be a decomposition of X for which the integral ‖H‖₂ • X exists. One can assume that V ∈ V⁺. Using (5.29) and, for every trajectory, the theorem of Fubini,

∫_C ∫_0^t |H| dV dµ = ∫_0^t ∫_C |H| dµ dV = ∫_0^t ‖H‖₁ dV ≤ √(µ(C)) ∫_0^t ‖H‖₂ dV < ∞.

Therefore for any t, for almost every^{35} c, the integral ∫_0^t H(c) dV is finite. Of course, if the integral exists for every rational t then it exists for every t; therefore, unifying the measure-zero sets, it is easy to show that for almost all c the integral H(c) • V is meaningful. Recall that a process G is integrable with respect to the local martingale L if and only if √(G² • [L]) ∈ A⁺_loc. This means that ‖H‖₂ := √( ∫_C H²(c) dµ(c) ) is integrable if and only if there is a localizing sequence (τ_n)

^{35} Of course with respect to µ.
for which the expected value of

√( ∫_0^{τ_n} ∫_C H²(c) dµ(c) d[L] ) = √( ∫_C ∫_0^{τ_n} H²(c) d[L] dµ(c) )

is finite. By Jensen's inequality

√( (1/µ(C)) ∫_C ∫_0^{τ_n} H²(c) d[L] dµ(c) ) ≥ (1/µ(C)) ∫_C √( ∫_0^{τ_n} H²(c) d[L] ) dµ(c).

Therefore by Fubini's theorem

E( ∫_C √( ∫_0^{τ_n} H²(c) d[L] ) dµ(c) ) = ∫_C E( √( ∫_0^{τ_n} H²(c) d[L] ) ) dµ(c) < ∞.

Hence, except on a set C_n with µ(C_n) = 0, the expected value of

√( ∫_0^{τ_n} H²(c) d[L] )

is finite. Unifying the measure-zero sets C_n one can easily see that √( H²(c) • [L] ) ∈ A⁺_loc for almost all^{36} c, that is, for almost all c the integral H(c) • L exists.

3. If the integral H(c) • X exists, then H_n(c) • X →^{ucp} H(c) • X. Unfortunately, as we mentioned above, the inequality |H_n(c)| ≤ |H(c)| does not imply the inequality |H_n(c) • X| ≤ |H(c) • X|, and we do not know whether H(c) • X is µ-integrable, hence one cannot use the classical Dominated Convergence Theorem for the outer integral with respect to µ. Therefore, as in the proof of the previous theorem, we prove the convergence of the left side by direct estimation. As by the classical Fubini theorem the theorem is obviously valid if the integrator has finite variation, one can assume that X ∈ L.

4. Let s ≥ 0. As in the previous proof introduce the variable

δ_n := sup_{t≤s} | ∫_C ((H_n(c) − H(c)) • X) dµ(c) |.

^{36} Of course with respect to µ.
By Davis' inequality

E(δ_n) ≤ E( ∫_C sup_{t≤s} |(H_n(c) − H(c)) • X| dµ(c) ) =        (5.30)
= ∫_C E( sup_{t≤s} |(H_n(c) − H(c)) • X| ) dµ(c) ≤
≤ K ∫_C E( √( [(H_n(c) − H(c)) • X](s) ) ) dµ =
= K ∫_C E( √( (H_n(c) − H(c))² • [X](s) ) ) dµ =
= µ(C) K E( ∫_C √( (H_n(c) − H(c))² • [X](s) ) dµ/µ(C) ) ≤
≤ µ(C) K E( √( ∫_C (H_n(c) − H(c))² • [X](s) dµ/µ(C) ) ) =
= √(µ(C)) K E( √( ( ∫_C (H_n(c) − H(c))² dµ ) • [X](s) ) ).

∫_C H² dµ is integrable with respect to X, therefore

( ∫_C (H_n(c) − H(c))² dµ ) • [X] ≤ ( ∫_C H² dµ ) • [X] ∈ A⁺_loc.

Let (τ_m) be a localizing sequence. With localization one can assume that the last expected value is finite, that is,

E( √( ( ∫_C H² dµ ) • [X^{τ_m}] ) ) < ∞.

Applying the estimate (5.30) to X^{τ_m} and writing δ_n^{(m)} instead of δ_n, by the classical Dominated Convergence Theorem E(δ_n^{(m)}) → 0. Hence if m is sufficiently large then

P(δ_n > ε) ≤ P(δ_n > ε, τ_m > s) + P(τ_m ≤ s) ≤ P(δ_n^{(m)} > ε) + P(τ_m ≤ s).

Therefore δ_n → 0 in probability. From this point the proof of the theorem is the same as the proof of the previous one.
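The interchange asserted by the theorem can be sanity-checked in a discretized model. This is a sketch only, not the theorem itself: with finite sums the two iterated integrals agree by mere commutation of summation, and the theorem is precisely the statement that this survives the passage to the Itô limit. Here the integrator is a Wiener path on [0, 1], C = [0, 1] with µ ≈ Lebesgue measure on an equally weighted grid, and H(c, s) := e^{cs} is a hypothetical smooth integrand:

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 2_000, 50
t = np.linspace(0.0, 1.0, N + 1)
dW = rng.normal(0.0, np.sqrt(1.0 / N), N)   # Wiener increments on [0, 1]

c = np.linspace(0.0, 1.0, M)                # grid on C = [0, 1]
H = np.exp(np.outer(c, t[:-1]))             # H(c, s) = exp(c*s), left points

# left side: for each c compute the stochastic integral, then average in c
left = np.mean(H @ dW)
# right side: integrate H over c first, then against the same increments
right = np.sum(np.mean(H, axis=0) * dW)

assert np.isclose(left, right)
```

Both sides are built from the same increments `dW`, so the agreement is exact up to floating-point rounding; all measurability work in the section goes into justifying that both sides converge to the same limit as the meshes refine.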
Corollary 5.27 (Fubini's theorem for local martingales) Let (C, C, µ) be a finite measure space. If L is a local martingale, H(c, t, ω) is a (C × P)-measurable function and

∫_0^t ∫_C H²(c, s) dµ(c) d[L](s) ∈ A⁺_loc,

then

∫_C ∫_0^t H(c, s) dL(s) dµ(c) = ∫_0^t ∫_C H(c, s) dµ(c) dL(s).        (5.31)

If L is a continuous local martingale, H is a (C × R)-measurable process and

P( ∫_0^t ∫_C H²(c, s) dµ(c) d[L](s) < ∞ ) = 1,

then (5.31) holds.

Corollary 5.28 (Fubini's theorem for Wiener processes) Let (C, C, µ) be a finite measure space. If w is a Wiener process, H(c, t, ω) is an adapted, product measurable process and

P( ∫_0^t ∫_C H²(c, s) dµ(c) ds < ∞ ) = 1,

then

∫_C ∫_0^t H(c, s) dw(s) dµ(c) = ∫_0^t ∫_C H(c, s) dµ(c) dw(s).

5.5

Martingale Representation
Let H₀^p denote the space of H^p martingales which are zero at time zero. Recall that by definition martingales M and N are orthogonal if their product MN is a local martingale. This is equivalent to the condition that the quadratic variation [M, N] is a local martingale. This implies that if M and N are orthogonal then M^τ and N are also orthogonal for every stopping time τ. The topology in the spaces H₀^p is given by the norm ‖sup_t |M(t)|‖_p. The basic message of the Burkholder–Davis inequality is that this norm is equivalent to the norm

‖M‖_{H₀^p} := ‖ √([M](∞)) ‖_p.        (5.32)
In this section we shall use this norm. Observe that if p ≥ 1 then H0p is a Banach space.
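The equivalence of the two norms can be glimpsed in a Monte Carlo sketch with a symmetric ±1 random walk, a martingale whose quadratic variation is deterministic, so ‖√([M](∞))‖₁ is exact while E(sup_t |M(t)|) is estimated by simulation; the assumption here is only that the universal Burkholder–Davis constants keep the ratio of the two norms in a moderate range:

```python
import numpy as np

rng = np.random.default_rng(5)
incr = rng.choice([-1.0, 1.0], size=(20_000, 400))  # +-1 martingale steps
M = np.cumsum(incr, axis=1)

sup_norm = np.mean(np.max(np.abs(M), axis=1))       # ~ E(sup_t |M(t)|), p = 1
qv_norm = np.mean(np.sqrt(np.sum(incr ** 2, axis=1)))  # E(sqrt([M](inf))) = 20

# Burkholder-Davis: the norms are equivalent up to universal constants
ratio = sup_norm / qv_norm
assert 0.2 < ratio < 5.0
```

Here [M](∞) = Σ(±1)² = 400 on every path, so the second norm is exactly 20, and the simulated ratio sits near the Brownian value E(sup_{[0,1]}|B|) ≈ 1.25.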
Definition 5.29 Let 1 ≤ p < ∞. We say that a closed linear subspace X of H₀^p is stable if it is stable under truncation, that is, if X ∈ X then X^τ ∈ X for every stopping time τ. If X is a subset of H₀^p then we shall denote by stable_p(X) the smallest closed linear subspace of H₀^p which is closed under truncation and contains X.

Obviously H₀^p is a stable subspace. The intersection of stable subspaces is also stable, hence stable_p(X) is meaningful for every X ⊆ H₀^p. To make the notation as simple as possible, if the subscript p is not important we shall drop it and instead of stable_p(X) we shall simply write stable(X).

Lemma 5.30 Let 1 ≤ p < ∞ and let X ⊆ H₀^p. Let N be a bounded martingale. If N is orthogonal to X then N is orthogonal to stable(X).

Proof. Let us denote by Y the set of H₀^p-martingales which are orthogonal to N. Of course X ⊆ Y, so it is sufficient to prove that Y is a stable subspace of H₀^p. As we remarked, Y is closed under stopping. Let M_n ∈ Y and let M_n → M_∞ in H₀^p. As N is bounded, M_nN is a local martingale which is in class D, hence it is a uniformly integrable martingale. So E((M_nN)(τ)) = 0 for every stopping time τ. Let k < ∞ be an upper bound of N. Then

|E((M_∞N)(τ))| = |E((M_∞N)(τ)) − E((M_nN)(τ))| ≤
≤ E(|((M_∞ − M_n)N)(τ)|) ≤ k · E(|(M_∞ − M_n)(τ)|) ≤
≤ k · E( √([M_∞ − M_n](∞)) ) ≤ k · ‖M_∞ − M_n‖_{H₀^p} → 0.

So M_∞N is also a martingale. Hence Y := {X ∈ H₀^p : X ⊥ N} is closed in H₀^p.

Definition 5.31 Let 1 ≤ p < ∞. We say that the subset X ⊆ H₀^p has the Martingale Representation Property if H₀^p = stable(X).

Recall that we have fixed a stochastic base (Ω, A, P, F).

Definition 5.32 Let 1 ≤ p < ∞. Let us say that a probability measure Q on (Ω, A) is an H₀^p-measure of the subset X ⊆ H₀^p if

1. Q ≪ P,
2. Q = P on F₀,
3. if M ∈ X then M is in H₀^p under Q as well.

M_p(X) will denote the set of H₀^p-measures of X.
Lemma 5.33 M_p(X) is always convex.

Proof. If Q₁, Q₂ ∈ M_p(X), 0 ≤ λ ≤ 1 and Q_λ := λQ₁ + (1 − λ)Q₂, then for every M ∈ X

E^{Q_λ}( sup_t |M(t)|^p ) = λ E^{Q₁}( sup_t |M(t)|^p ) + (1 − λ) E^{Q₂}( sup_t |M(t)|^p ) < ∞.

If F ∈ F_s and t > s then by the martingale property under Q₁ and Q₂

∫_F M(t) dQ_λ = λ ∫_F M(t) dQ₁ + (1 − λ) ∫_F M(t) dQ₂ =
= λ ∫_F M(s) dQ₁ + (1 − λ) ∫_F M(s) dQ₂ = ∫_F M(s) dQ_λ.

Hence M is in H₀^p under Q_λ.

Definition 5.34 If C is a convex set and x ∈ C, then we say that x is an extremal point of C if whenever u, v ∈ C and x = λu + (1 − λ)v for some 0 ≤ λ ≤ 1, then x = u or x = v.

Proposition 5.35 Let 1 ≤ p < ∞ and let X ⊆ H₀^p. If X has the Martingale Representation Property then P is an extremal point of M_p(X).

Proof. Assume that P = λQ + (1 − λ)R, where 0 ≤ λ ≤ 1 and Q, R ∈ M_p(X). As R ≥ 0, obviously Q ≪ P, so one can define the Radon–Nikodym derivative L(∞) := dQ/dP ∈ L¹(Ω, P, F_∞). Define the martingale

L(t) := E(L(∞) | F_t).

From the definition of the conditional expectation

∫_F L(t) dP = ∫_F L(∞) dP = Q(F),    F ∈ F_t,

so L(t) is the Radon–Nikodym derivative of Q with respect to P on the measure space (Ω, F_t). Let X ∈ X. If s < t and F ∈ F_s then, as X is a
martingale under Q,

∫_F X(t) L(t) dP = ∫_F X(t) (dQ/dP) dP = ∫_F X(t) dQ = ∫_F X(s) dQ = ∫_F X(s) L(s) dP,

so XL is a martingale under P. Obviously Q ≤ P/λ, so 0 ≤ L ≤ 1/λ. Hence L is uniformly bounded. L(0) is bounded and F₀-measurable, so X · L(0) is a martingale. This implies that X · (L − L(0)) is also a martingale under P, that is, X and L − L(0) are orthogonal as local martingales. That is, L − L(0) is orthogonal to X. Hence by the previous lemma L − L(0) is orthogonal to stable(X). As X has the Martingale Representation Property, L − L(0) is orthogonal to H₀^p. As L − L(0) is bounded, L − L(0) ∈ H₀^p. But this means^{37} that L − L(0) = 0. By definition Q and P are equal on F₀, hence L(∞) = L(0) = 1. Hence P = Q.

Now we want to prove the converse statement for p = 1. Let P be an extremal point of M_p(X) and assume that X does not have the Martingale Representation Property, that is, stable(X) ≠ H₀^p. As stable(X) is a closed linear space, by the Hahn–Banach theorem there is a non-zero linear functional L for which

L(stable(X)) = 0.        (5.33)

Assume temporarily that L has the following representation: there is a locally bounded local martingale N such that

L(M) = E([M, N](∞)),    M ∈ H₀^p.        (5.34)

stable(X) is closed under truncation, hence for every stopping time τ

E([M, N^τ](∞)) = E([M, N]^τ(∞)) = E([M^τ, N](∞)) = L(M^τ) = 0

whenever M ∈ stable(X). Hence instead of N we can use N^τ. As N is locally bounded we can assume that N is a uniformly bounded martingale. Instead of N we can also write N − N(0), so one can assume that N(0) = 0. Let |N| ≤ c. If

dQ := (1 − N(∞)/(2c)) dP,    dR := (1 + N(∞)/(2c)) dP,

then Q and R are non-negative measures. As N is a bounded martingale,

E(N(∞)) = E(N(0)) = E(0) = 0,

^{37} See: Proposition 4.4, page 228.
so Q and R are probability measures and obviously P = (Q + R)/2. If X ∈ X then

∫_Ω sup_s |X(s)|^p dQ = ∫_Ω sup_s |X(s)|^p (1 − N(∞)/(2c)) dP ≤ 2 ∫_Ω sup_s |X(s)|^p dP < ∞.

If s < t and F ∈ F_s then

∫_F X(t) dQ = ∫_F X(t) (1 − N(∞)/(2c)) dP =
= ∫_F X(t) dP − (1/(2c)) ∫_F X(t) N(∞) dP =
= ∫_F X(s) dP − (1/(2c)) ∫_F X(t) N(∞) dP.

As F ∈ F_s,

σ(ω) := s if ω ∈ F,    σ(ω) := ∞ if ω ∉ F

is a stopping time. As s ≤ t,

τ(ω) := t if ω ∈ F,    τ(ω) := ∞ if ω ∉ F

is also a stopping time. Hence X^τ, X^σ ∈ stable(X), so

X^τ − X^σ = (X^t − X^s) χ_F ∈ stable(X).        (5.35)

Obviously H₀^p ⊆ H₀¹ if p ≥ 1, so

|MN| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

As N is bounded, obviously^{38} N ∈ H₀^q. Hence by the Kunita–Watanabe inequality, using also Hölder's inequality,

|[M, N]| ≤ √([M](∞)) · √([N](∞)) ∈ L¹(Ω).

^{38} Recall the definition of the H^p spaces! See: (5.32) on page 328. Implicitly we have used the Burkholder–Davis inequality.
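In a discrete skeleton the Kunita–Watanabe bound used above is just the Cauchy–Schwarz inequality for the increments, since there [M, N] = Σ ∆M ∆N, [M] = Σ(∆M)², [N] = Σ(∆N)². A sketch with hypothetical increment sequences standing in for ∆M and ∆N:

```python
import numpy as np

rng = np.random.default_rng(3)
dM = rng.normal(size=1000)   # increments of two (hypothetical) martingales
dN = rng.normal(size=1000)

bracket_MN = np.sum(dM * dN)   # discrete analogue of [M, N](inf)
bracket_M = np.sum(dM ** 2)    # [M](inf)
bracket_N = np.sum(dN ** 2)    # [N](inf)

# Kunita-Watanabe / Cauchy-Schwarz: |[M, N]| <= sqrt([M]) * sqrt([N])
assert abs(bracket_MN) <= np.sqrt(bracket_M) * np.sqrt(bracket_N)
```

The continuous-time inequality extends this pathwise bound to general quadratic covariations.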
By this, MN − [M, N] is a class D local martingale, hence it is a uniformly integrable martingale^{39}. Hence

E(M(∞)N(∞)) = E(M(∞)N(∞)) − L(M) =
= E(M(∞)N(∞) − [M, N](∞)) =
= E(M(0)N(0) − [M, N](0)) = 0,

so by (5.35)

E(N(∞) χ_F X^t(∞)) = E(N(∞) χ_F X^s(∞)).

Therefore

∫_F X(t) N(∞) dP = ∫_F X(s) N(∞) dP.

Hence X is a martingale under Q. This implies that Q ∈ M_p(X). In a similar way R ∈ M_p(X), which is a contradiction. So one should only prove that if stable(X) ≠ H₀^p then there is a locally bounded local martingale N for which (5.33) and (5.34) hold. It is easy to see that if p > 1 then the dual of H₀^p is H₀^q, where of course 1/p + 1/q = 1. The H₀^q martingales are not locally bounded^{40}, so the argument above is not valid if p > 1. Assume that p = 1.

Proposition 5.36 If L is a continuous linear functional over H₀¹, then (5.34) holds, that is, for some locally bounded local martingale N

L(M) = E([M, N](∞)),    M ∈ H₀¹.

Proof. Obviously H₀² ⊆ H₀¹ and ‖M‖_{H₀¹} ≤ ‖M‖_{H₀²}, so if c := ‖L‖ then

|L(M)| ≤ c ‖M‖_{H₀¹} ≤ c ‖M‖_{H₀²},

so L is a continuous linear functional over H₀².

1. H₀² is a Hilbert space, so for some N ∈ H₀²

L(M) = E(M(∞)N(∞)),    M ∈ H₀².

Let M ∈ H₀². From the Kunita–Watanabe inequality^{41}

|[M, N]| ≤ √([M]) √([N]) ≤ √([M](∞)) √([N](∞)) ∈ L¹(Ω).

^{39} See: Example 1.144, page 102.
^{40} One can easily modify Example 1.138, on page 96, to construct a counter-example.
^{41} Observe that we used again that the two definitions of the H₀² spaces are equivalent.
Also, as M, N ∈ H₀²,

|(MN)(t)| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

Therefore MN − [M, N] has an integrable majorant, so it is a local martingale from class D. Therefore it is a uniformly integrable martingale. This implies that for some N ∈ H₀²

L(M) = E(M(∞)N(∞)) = E([M, N](∞)),    M ∈ H₀².        (5.36)

2. Now we prove that for almost all trajectories |∆N| ≤ 2c. Let

τ := inf{ t : |∆N| > 2c }.

As N(0) = 0 and N is right-continuous, τ > 0. If τ(ω) < ∞ then |∆N(τ)|(ω) > 2c. Hence we should prove that P(|∆N(τ)| > 2c) = 0. Every stopping time can be covered by a countable number of totally inaccessible or predictable stopping times, hence one can assume that τ is either predictable or totally inaccessible. If P(|∆N(τ)| > 2c) > 0 then let

ξ := sgn(∆N(τ)) χ(|∆N(τ)| > 2c) / P(|∆N(τ)| > 2c).

S := ξχ([τ, ∞)) is adapted, right-continuous and it has integrable variation. Let M := S − S^p. If τ is predictable then the graph [τ] is a predictable set, hence

∆(S^p) = ^p(∆S) = ^p(ξχ([τ])) = (^pξ) χ([τ]),

where ^p(ξ) is the predictable projection of the constant process U(t) ≡ ξ. By the definition of the predictable projection

^p(ξ)(τ) = E(ξ | F_{τ−}).

If τ is totally inaccessible then P(τ = σ) = 0 for every predictable stopping time σ. Hence

^p(∆M)(σ) = ^p(ξχ([τ]))(σ) = E(ξχ([τ])(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆(S^p) = ^p(∆S) = 0. Therefore in both cases S^p has at most one jump, which can occur only at τ. This implies that M has finite variation and it has just one jump, which occurs at τ. As we have seen,

∆M(τ) = ξ − E(ξ | F_{τ−}) if τ is predictable,    ∆M(τ) = ξ if τ is totally inaccessible.
Obviously

‖M‖_{H₀¹} := E( √([M](∞)) ) = E( √((∆M)²(τ)) ) = E(|∆M(τ)|) ≤
≤ E(|ξ|) + E(|E(ξ | F_{τ−})|) ≤ 2 E(|ξ|) = 2.

∫_0^t M_− dM is a local martingale with localizing sequence (ρ_n). By the integration by parts formula and by Fatou's lemma

E(M²(t)) = E( lim_{n→∞} M²(t ∧ ρ_n) ) ≤ lim sup_{n→∞} E(M²(t ∧ ρ_n)) =
= lim sup_{n→∞} E([M](t ∧ ρ_n)) ≤ E([M](t)) ≤ E([M](∞)) = E((∆M(τ))²) < ∞.

Hence M ∈ H₀². If τ is totally inaccessible then

L(M) = E([M, N](∞)) = E(∆M(τ) ∆N(τ)) = E(ξ ∆N(τ)) =
= E( |∆N(τ)| χ(|∆N(τ)| > 2c) ) / P(|∆N(τ)| > 2c) >
> 2c · E( χ(|∆N(τ)| > 2c) ) / P(|∆N(τ)| > 2c) = 2c ≥ c ‖M‖_{H₀¹},

which is impossible. If τ is predictable then

E(∆M(τ) ∆N(τ)) = E(ξ ∆N(τ)) − E( E(ξ | F_{τ−}) ∆N(τ) ).

N is a martingale, therefore ^p(∆N) = 0, so

E( E(ξ | F_{τ−}) ∆N(τ) ) = E( E(ξ | F_{τ−}) E(∆N(τ) | F_{τ−}) ) = 0,

and we can get the same contradiction as above. This implies that |∆N| ≤ 2c. Therefore N is locally bounded.

3. To finish the proof we should show that the identity in the proposition holds not only in H₀² but in H₀¹ as well. To do this we should prove that H₀² is dense in H₀¹ and that E([M, N](∞)) is a continuous linear functional on H₀¹. Because these statements have some general importance we shall present them as separate lemmas.
Lemma 5.37 H² is dense in H¹.

Proof. If M ∈ H¹ then M = M^c + M^d, where M^c is the continuous part and M^d is the purely discontinuous part of M. As

[M] = [M^c] + [M^d],

from (5.32) it is obvious that M^c, M^d ∈ H¹.

1. M^c is locally bounded, so there is a localizing sequence (τ_n) such that (M^c)^{τ_n} ∈ H² for all n. Observe that if (τ_n) is a localizing sequence then by the Dominated Convergence Theorem ‖M^{τ_n} − M‖_{H¹} → 0 for every M ∈ H¹.

2. For the purely discontinuous part, M^d = Σ_{k=1}^∞ L_k, where the L_k are continuously compensated single jumps of M. Recall^{42} that the series Σ L_k converges in H¹. Therefore it is sufficient to prove the lemma when M := S − S^p is a continuously compensated single jump. Let τ be the jump-time of M, that is, let S := ∆M(τ) χ([τ, ∞)). Let

ξ_k := ∆M(τ) χ(|∆M(τ)| ≤ k).

Let S_k := ξ_k χ([τ, ∞)) and M_k := S_k − S_k^p. By the construction of the L_k the stopping time τ is either predictable or totally inaccessible. In the same way as in the proof of the proposition just above one can easily prove that M_k has just one jump, which occurs at τ. Also, as in the previous proof, one can easily prove that M_k ∈ H². Now

‖M − M_k‖_{H¹} = ‖∆M(τ) − ∆M_k(τ)‖₁.

If τ is totally inaccessible then, as ∆M(τ) is integrable,

‖∆M(τ) − ∆M_k(τ)‖₁ = ‖∆M(τ) χ(|∆M(τ)| > k)‖₁ → 0.

If τ is predictable then we also have the component

‖E( ∆M(τ) χ(|∆M(τ)| > k) | F_{τ−} )‖₁.

But if k → ∞ then, in L¹(Ω),

lim_{k→∞} E( ∆M(τ) χ(|∆M(τ)| > k) | F_{τ−} ) = 0,

from which the lemma is obvious.

^{42} See: Theorem 4.26, page 236 and Proposition 4.30, page 243.
MARTINGALE REPRESENTATION
Our next goal is to prove that E([M,N](∞)) in (5.36) is a continuous linear functional over H₀¹. To do this we need two lemmas. As a first step we prove the following observation:
Lemma 5.38 If for some N ∈ H₀²

E(|[M,N](∞)|) ≤ c·‖M‖_{H₀¹},  M ∈ H₀²,

then

‖ sup_τ E((N(∞) − N(τ−))² | F_τ) ‖_∞ < ∞,

where the supremum is taken over all possible stopping times.
Proof. Let τ be a stopping time and let M ≜ N − N^τ. Obviously M ∈ H². As [M, N^τ] = [N]^τ − [N]^τ = 0,

E([M](∞)) = E([M,M](∞)) = E([M,N](∞)) ≤ c·‖M‖_{H¹} =
= c·E(√([M](∞))) = c·E(√([M](∞))·χ(τ < ∞)) ≤
≤ c·√(E([M](∞)))·√(P(τ < ∞)).

Hence

E([N](∞) − [N](τ)) = E([M](∞)) ≤ c²·P(τ < ∞).

If F ∈ F_τ and one applies the inequality for the stopping time

τ_F(ω) ≜ τ(ω) if ω ∈ F,  ∞ if ω ∉ F,

then we get the inequality

∫_F ([N](∞) − [N](τ)) dP = E([N](∞) − [N](τ_F)) ≤ c²·P(F).

From this E([N](∞) − [N](τ) | F_τ) ≤ c². M² − [M] is a uniformly integrable martingale, hence almost surely

E((N(∞) − N(τ))² | F_τ) = E(M²(∞) | F_τ) = E([M](∞) | F_τ) =
= E([N](∞) − [N](τ) | F_τ) ≤ c².
During the proof of the proposition we proved that the jumps of N are bounded, so

E((N(∞) − N(τ−))² | F_τ) = E((N(∞) − N(τ) + ∆N(τ))² | F_τ) =
= E((N(∞) − N(τ))² + (∆N(τ))² | F_τ) ≤ a,

where a is independent of τ. (The cross term vanishes since ∆N(τ) is F_τ-measurable and E(N(∞) − N(τ) | F_τ) = 0.) From this the lemma follows.
Finally we prove the next inequality:
Lemma 5.39 (Fefferman's inequality) If M ∈ H₀¹ and N ∈ H₀² then

E(|[M,N](∞)|) ≤ √2 · ‖M‖_{H₀¹} · √( ‖ sup_τ E((N(∞) − N(τ−))² | F_τ) ‖_∞ ),

where the supremum is taken over all stopping times.
Proof. For a ≥ 0 let

a^⊖ ≜ 0 if a = 0,  a^{−1} if a > 0.
From the Kunita–Watanabe inequality

∫₀^∞ 1 dVar([M,N]) ≤ √( ∫₀^∞ (√[M] + √[M]₋)^⊖ d[M] ) · √( ∫₀^∞ (√[M] + √[M]₋) d[N] ).

Therefore by the Cauchy–Schwarz inequality, and using that √[M]₋ ≤ √[M],

(E(|[M,N](∞)|))² ≤ E( ∫₀^∞ (√[M] + √[M]₋)^⊖ d[M] ) · E( ∫₀^∞ 2√[M] d[N] ).

Let

a = t₀⁽ⁿ⁾ < t₁⁽ⁿ⁾ < … < t_n⁽ⁿ⁾ = b
be an infinitesimal sequence of partitions of [a,b]. Let f > 0 be a right-regular function with bounded variation on [a,b]. Then

√f(b) − √f(a) = Σᵢ (√f(tᵢ⁽ⁿ⁾) − √f(tᵢ₋₁⁽ⁿ⁾)) = Σᵢ (f(tᵢ⁽ⁿ⁾) − f(tᵢ₋₁⁽ⁿ⁾)) / (√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)).

f generates a finite measure on [a,b]. As f is right-regular and positive, 1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)) is bounded and for every t ∈ (tᵢ₋₁⁽ⁿ⁾, tᵢ⁽ⁿ⁾]

1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)) → 1/(√f(t) + √f(t−)).

So by the Dominated Convergence Theorem it is easy to see that if n → ∞ then

√f(b) − √f(a) = ∫ₐᵇ df(t) / (√f(t) + √f(t−)).

With the Monotone Convergence Theorem one can easily prove that if f is a right-regular, non-negative, increasing function then⁴³

√f(∞) − √f(0) = ∫₀^∞ df(t) / (√f(t) + √f(t−)).

Using this

E( ∫₀^∞ (√[M] + √[M]₋)^⊖ d[M] ) = E(√([M](∞))) = ‖M‖_{H₀¹}.

⁴³ See: Example 6.50, page 400.
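The square-root identity above can be sanity-checked numerically. The following Python sketch (illustrative only, not from the text; the values of the jumps are made up) verifies, for a pure-jump increasing right-continuous f, that √f(b) − √f(a) equals the integral of df/(√f + √f₋), which here reduces to an exact sum over the jumps:

```python
# pure-jump, right-continuous increasing f: check
#   sqrt(f(b)) - sqrt(f(a)) = int_a^b df / (sqrt(f) + sqrt(f-))
f0 = 1.0                       # f(a), assumed positive
jumps = [0.7, 2.3, 0.1, 5.0, 0.04]   # illustrative jump sizes

direct = (f0 + sum(jumps)) ** 0.5 - f0 ** 0.5

level, integral = f0, 0.0
for j in jumps:
    new = level + j
    # contribution of one jump: df / (sqrt(f(t)) + sqrt(f(t-)))
    integral += j / (new ** 0.5 + level ** 0.5)
    level = new

assert abs(direct - integral) < 1e-12
```

Each summand is exactly √f(t) − √f(t−), so the sum telescopes; the numerical check only confirms the floating-point arithmetic.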
Let us estimate the second integral. Integrating by parts,

E( ∫₀^∞ √[M] d[N] ) = E( √([M](∞))·[N](∞) − ∫₀^∞ [N]₋ d√[M] ) =
= E( ∫₀^∞ ([N](∞) − [N]₋) d√[M] ).

It is easy to see that⁴⁴

E( ∫₀^∞ [N](∞) d√[M] ) = E( [N](∞)·√([M](∞)) ) =
= E( [N](∞) Σₖ (√([M](sₖ)) − √([M](sₖ₋₁))) ) =
= E( Σₖ E([N](∞) | F_{sₖ})·(√([M](sₖ)) − √([M](sₖ₋₁))) ) =
= E( ∫₀^∞ E([N](∞) | F_s) d√[M](s) ).

So if

k² ≜ ‖ sup_τ E((N(∞) − N(τ−))² | F_τ) ‖_∞

then

E( ∫₀^∞ √[M] d[N] ) = E( ∫₀^∞ (E([N](∞) | F_s) − [N](s−)) d√[M](s) ) =
= E( ∫₀^∞ (E([N](∞) | F_s) − [N](s) + ∆[N](s)) d√[M](s) ) =
= E( ∫₀^∞ E((N(∞) − N(s))² + (∆N(s))² | F_s) d√[M](s) ) =
= E( ∫₀^∞ E((N(∞) − N(s−))² | F_s) d√[M](s) ) ≤
≤ k²·E(√([M](∞))) = k²·‖M‖_{H₀¹}.

So

(E(|[M,N](∞)|))² ≤ 2·k²·‖M‖²_{H₀¹},

⁴⁴ First one should assume that [N](∞) is bounded, and we should use that √([M](∞)) is integrable. Then with the Monotone Convergence Theorem one can drop the assumption that [N](∞) is bounded.
which proves the inequality.
Definition 5.40 N is a BMO martingale if N ∈ H² and

‖ sup_τ E((N(∞) − N(τ−))² | F_τ) ‖_∞ < ∞.
Corollary 5.41 The BMO martingales are locally bounded.
Corollary 5.42 (Dual of H₀¹) L is a continuous linear functional over H₀¹ if and only if for some BMO martingale N

L(M) = E([M,N](∞)).

The dual of the Banach space H₀¹ is the space of BMO martingales.
Let us return to the Martingale Representation Problem. We proved the following statement:
Theorem 5.43 (Jacod–Yor) The set X ⊆ H₀¹ has the Martingale Representation Property if and only if the underlying probability measure P is an extremal point of M¹(X).
Proposition 5.44 Let 1 ≤ p < ∞ and let X be a closed linear subspace of H₀ᵖ. The following properties are equivalent:
1. If M ∈ X and H•M ∈ H₀ᵖ for some predictable process H then H•M ∈ X.
2. If M ∈ X and H is a bounded and predictable process then H•M ∈ X.
3. X is stable under truncation, that is, if M ∈ X and τ is an arbitrary stopping time then M^τ ∈ X.
4. If M ∈ X, s ≤ t ≤ ∞ and F ∈ F_s then (Mᵗ − Mˢ)χ_F ∈ X.
Proof. Let H be a bounded predictable process and let |H| ≤ c.
[H•M](∞) = (H²•[M])(∞) ≤ c²·[M](∞),

so if M ∈ H₀ᵖ then H•M ∈ H₀ᵖ, and the implication 1. ⇒ 2. is obvious. If τ is an arbitrary stopping time then

χ([0,τ])•M = 1•M^τ = M^τ − M(0) = M^τ,

hence 2. implies 3. If F ∈ F_s then

τ(ω) ≜ s if ω ∈ F,  ∞ if ω ∉ F

is a stopping time. If 3. holds then M^τ ∈ X. As s ≤ t,

σ(ω) ≜ t if ω ∈ F,  ∞ if ω ∉ F

is also a stopping time, hence M^σ ∈ X. As X is a linear space, M^σ − M^τ ∈ X. But obviously M^σ − M^τ = (Mᵗ − Mˢ)χ_F, hence 3. implies 4. Now let

H = Σᵢ χ_{Fᵢ} χ((tᵢ, tᵢ₊₁]),  (5.37)

where Fᵢ ∈ F_{tᵢ}. Obviously

(H•M)(t) = Σᵢ χ_{Fᵢ}(M(t ∧ tᵢ₊₁) − M(t ∧ tᵢ)),

and by 4. H•M ∈ X.

‖Hₙ•M − H•M‖_{H₀ᵖ} = ‖(Hₙ − H)•M‖_{H₀ᵖ} =
= ‖√([(Hₙ − H)•M](∞))‖_p = ‖√(((Hₙ − H)²•[M])(∞))‖_p.

M ∈ H₀ᵖ, so ‖√([M](∞))‖_p < ∞. Therefore if Hₙ → H is a uniformly bounded sequence of predictable processes then from the Dominated Convergence Theorem it is obvious that

‖Hₙ•M − H•M‖_{H₀ᵖ} = ‖√(((Hₙ − H)²•[M])(∞))‖_p → 0.
X is closed, so if Hₙ•M ∈ X for all n then H•M ∈ X as well. Using this property and 4., with the Monotone Class Theorem one can easily show that if H is a bounded predictable process then H•M ∈ X. If H•M ∈ H₀ᵖ for some predictable process H then

‖√((H²•[M])(∞))‖_p < ∞.

From this, as above, it is easy to show that in H₀ᵖ

(Hχ(|H| ≤ n))•M → H•M,

so H•M ∈ X.
Proposition 5.45 If 1 ≤ p < ∞ and M ∈ H₀ᵖ then the set C ≜ {X ∈ H₀ᵖ : X = H•M} is closed in H₀ᵖ.
Proof. It is easy to see that the set of predictable processes H for which⁴⁵

‖H‖_{Lᵖ(M)} ≜ ‖√((H²•[M])(∞))‖_p < ∞  (5.38)

is a linear space. In the usual way, as in the classical theory of Lᵖ-spaces⁴⁶, one can prove that if H₁ ∼ H₂ whenever ‖H₁ − H₂‖_{Lᵖ(M)} = 0 then the set of equivalence classes, denoted by Lᵖ(M), is a Banach space. Let Xₙ ∈ C and assume that Xₙ → X in H₀ᵖ. Let Xₙ = Hₙ•M.

‖Xₙ‖_{H₀ᵖ} = ‖√([Xₙ](∞))‖_p = ‖√((Hₙ²•[M])(∞))‖_p = ‖Hₙ‖_{Lᵖ(M)}.

This implies that (Hₙ) is a Cauchy sequence in Lᵖ(M), so it is convergent: Hₙ → H in Lᵖ(M) for some H, and Hₙ•M → H•M. Therefore X = H•M, so C is closed.
Proposition 5.46 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁷ as local martingales, that is, [Mᵢ,Mⱼ] = 0 whenever i ≠ j. In this case

stable(M₁, M₂, …, Mₙ) = { Σᵢ₌₁ⁿ Hᵢ•Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ) }.

That is, the stable subspace generated by a finite set of strongly orthogonal H₀ᵖ-martingales is the linear subspace generated by the stochastic integrals Hᵢ•Mᵢ, Hᵢ ∈ Lᵖ(Mᵢ).
⁴⁵ See: Definition 2.57, page 151.
⁴⁶ See: [80], Theorem 3.11, page 69.
⁴⁷ See: Definition 4.1, page 227.
Proof. Recall that, as in the previous proposition, Lᵖ(M) is the set of equivalence classes of progressively measurable processes for which (5.38) holds. Let I denote the linear space on the right side of the equality. By Proposition 5.44, for all i

Hᵢ•Mᵢ ∈ stable(Mᵢ) ⊆ stable(X),

hence I ⊆ stable(X). From the stopping rule for stochastic integrals, I is closed under stopping. Mᵢ(0) = 0 and Mᵢ = 1•Mᵢ, so Mᵢ ∈ I for all i. By strong orthogonality [Σᵢ Hᵢ•Mᵢ] = Σᵢ Hᵢ²•[Mᵢ], so

‖√([Σᵢ₌₁ⁿ Hᵢ•Mᵢ](∞))‖_p = ‖√(Σᵢ₌₁ⁿ (Hᵢ²•[Mᵢ])(∞))‖_p ≤ Σᵢ₌₁ⁿ ‖√((Hᵢ²•[Mᵢ])(∞))‖_p.

From Jensen's inequality it is also easy to show that, pointwise,

(1/√n)·Σᵢ₌₁ⁿ √(Hᵢ²•[Mᵢ]) ≤ √(Σᵢ₌₁ⁿ Hᵢ²•[Mᵢ]).

This means that the norms ‖√([Σᵢ₌₁ⁿ Hᵢ•Mᵢ](∞))‖_p and Σᵢ₌₁ⁿ ‖√((Hᵢ²•[Mᵢ])(∞))‖_p are equivalent. In a similar way as in the previous proposition one can show that I is a closed linear subspace of H₀ᵖ. Therefore stable(M₁, …, Mₙ) ⊆ I.
Example 5.47 The assumption about orthogonality is important.
Let w₁ and w₂ be independent Wiener processes. Let J(t) ≜ t. If

M₁ ≜ w₁,  M₂ ≜ (1 − J)•w₁ + J•w₂,

then

[M₁, M₂] = [w₁, (1 − J)•w₁ + J•w₂] = (1 − J)•[w₁] = (1 − J)•J,

which is not a local martingale. So the conditions of the above proposition do not hold. We show that

I ≜ { Σᵢ₌₁² Hᵢ•Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ) }

is not a closed set in H₀ᵖ. Let ε > 0. Obviously

H₁⁽ᵋ⁾ ≜ (J − 1 + ε)/(J + ε),  H₂⁽ᵋ⁾ ≜ 1/(J + ε)

are bounded predictable processes.

X_ε ≜ H₁⁽ᵋ⁾•M₁ + H₂⁽ᵋ⁾•M₂ =
= (J − 1 + ε)/(J + ε)•w₁ + (1 − J)/(J + ε)•w₁ + J/(J + ε)•w₂ =
= ε/(J + ε)•w₁ + w₂ − ε/(J + ε)•w₂.

As w₁ and w₂ are independent,

[ ε/(J + ε)•w₁ − ε/(J + ε)•w₂ ](t) = 2∫₀ᵗ (ε/(s + ε))² ds → 0,

so X_ε → w₂ in H₀ᵖ. Assume that for some H₁ and H₂

w₂ = H₁•M₁ + H₂•M₂ = H₁•w₁ + H₂(1 − J)•w₁ + H₂J•w₂.

Reordering,

(H₁ + H₂(1 − J))•w₁ = (1 − H₂J)•w₂.

From this

[(1 − H₂J)•w₂] = [(H₁ + H₂(1 − J))•w₁, (1 − H₂J)•w₂] =
= (H₁ + H₂(1 − J))(1 − H₂J)•[w₁, w₂] = 0,
so (H₁ + H₂(1 − J))•w₁ = (1 − H₂J)•w₂ = 0. This implies that 1 − H₂J = H₁ + H₂(1 − J) = 0, that is, H₂ = 1/J and H₁ = 1 − 1/J. But as

∫₀ᵗ (1 − 1/s)² ds = +∞,

H₁ = 1 − 1/J ∉ Lᵖ(w₁).
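The convergence X_ε → w₂ used above rests on the elementary integral 2∫₀ᵗ (ε/(s+ε))² ds = 2(ε − ε²/(t+ε)) → 0 as ε → 0. A small Python sketch (illustrative, not from the text) compares the closed form with a crude Riemann sum:

```python
# check: 2 * int_0^t (eps/(s+eps))^2 ds = 2*(eps - eps**2/(t+eps)),
# and that this quantity vanishes as eps -> 0
t = 1.0
for eps in (0.1, 0.01, 0.001):
    closed = 2.0 * (eps - eps ** 2 / (t + eps))
    n = 200_000
    # right-endpoint Riemann sum of the integrand
    riemann = sum(2.0 * (eps / (k * t / n + eps)) ** 2 * (t / n)
                  for k in range(1, n + 1))
    assert abs(closed - riemann) < 1e-3

# for eps = 0.001 the squared H^2-distance is already below 0.01
assert 2.0 * (0.001 - 0.001 ** 2 / (t + 0.001)) < 0.01
```

The antiderivative of ε²/(s+ε)² is −ε²/(s+ε), which gives the closed form directly.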
Definition 5.48 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. We say that (Mᵢ)ᵢ₌₁ⁿ has the Integral Representation Property if for every M ∈ H₀ᵖ

M = Σᵢ₌₁ⁿ Hᵢ•Mᵢ,  Hᵢ ∈ Lᵖ(Mᵢ).

The main result about integral representation is an easy consequence of the Jacod–Yor theorem and the previous proposition:
Theorem 5.49 (Jacod–Yor) Let 1 ≤ p < ∞ and let X ≜ (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁸ as local martingales, that is, [Mᵢ,Mⱼ] = 0 whenever i ≠ j. If these assumptions hold then X has the Integral Representation Property in H₀ᵖ if and only if P is an extremal point of Mᵖ(X).
Proof. If X has the Integral Representation Property then⁴⁹ stable(X) = H₀ᵖ, so P is an extremal point of Mᵖ(X). Assume that X does not have the Integral Representation Property. This means that stableᵖ(X) ≠ H₀ᵖ. We show that in this case stable¹(X) ≠ H₀¹ as well: if stable¹(X) = H₀¹ then for every M ∈ H₀ᵖ ⊆ H₀¹

M = Σᵢ₌₁ⁿ Hᵢ•Mᵢ,  Hᵢ ∈ L¹(Mᵢ).

⁴⁸ See: Definition 4.1, page 227.
⁴⁹ See: Proposition 5.35, page 330.
But by the strong orthogonality assumption, for every i

[M](∞) = [Σᵢ₌₁ⁿ Hᵢ•Mᵢ](∞) = Σᵢ₌₁ⁿ (Hᵢ²•[Mᵢ])(∞) ≥ (Hᵢ²•[Mᵢ])(∞).

√([M](∞)) ∈ Lᵖ(Ω), so √((Hᵢ²•[Mᵢ])(∞)) ∈ Lᵖ(Ω). Hence Hᵢ ∈ Lᵖ(Mᵢ) for every i, which is impossible, as X does not have the Integral Representation Property in H₀ᵖ. Hence

stableᵖ(X) ⊆ stable¹(X) ≠ H₀¹.

By the Hahn–Banach theorem there is a non-zero continuous linear functional L ∈ (H₀¹)* such that L(stable¹(X)) = 0. This implies that L(stableᵖ(X)) = 0. L is of course given by a BMO martingale, so it is locally bounded. As we have remarked, one can assume that L is bounded. As we already discussed, in this case P is not an extremal point of Mᵖ(X).
The most important example is the following:
Example 5.50 If X ≜ (wₖ)ₖ₌₁ⁿ are independent Wiener processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.
On any finite interval⁵⁰ wₖ ∈ H₀¹. We show that M¹(X) = {P}. If Q ∈ M¹(X) then wₖ is a continuous local martingale under Q for every k. Obviously [wₖ, wⱼ](t) = δ_{kj}·t. By Lévy's characterization theorem⁵¹, X ≜ (w₁, w₂, …, wₙ) is an n-dimensional Wiener process under Q as well. This implies that

∫_Ω f(X) dP = ∫_Ω f(X) dQ

for every F_∞-measurable bounded function f. As F is the filtration generated by X, this implies that P(F) = Q(F) for every F ∈ F_∞, so P = Q.
Example 5.51 If X ≜ (πₖ)ₖ₌₁ⁿ are independent compensated Poisson processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.
⁵⁰ On the finite interval [0, s], ‖w‖_{H¹} = E(√([w](s))) = √s. See: Example 1.124, page 87.
⁵¹ See: Theorem 6.13, page 368.
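The quadratic co-variation facts behind the Lévy characterization, [wₖ](t) = t and [wₖ, wⱼ] = 0 for k ≠ j, can be illustrated on a discretized pair of independent paths. The following Python sketch (illustrative simulation, not from the text; step count and seed are arbitrary) sums products of increments:

```python
import random

random.seed(1)
N, t = 100_000, 1.0
dt = t / N
# increments of two independent discretized Wiener processes
dw1 = [random.gauss(0.0, dt ** 0.5) for _ in range(N)]
dw2 = [random.gauss(0.0, dt ** 0.5) for _ in range(N)]

quad = sum(x * x for x in dw1)                # approximates [w1](t) = t
cross = sum(x * y for x, y in zip(dw1, dw2))  # approximates [w1, w2](t) = 0

assert abs(quad - t) < 0.05
assert abs(cross) < 0.05
```

The standard deviations of the two sums are of order 1/√N, so the tolerances above leave a wide margin.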
On any finite interval πₖ ∈ H₀¹. If two Poisson processes are independent then they have no common jumps⁵², so [πₖ, πⱼ] = 0. So we can apply the Jacod–Yor theorem. We shall prove that again M¹(X) = {P}. If X is a compensated Poisson process, then a.s.

[X](t) − λt = X(t).  (5.39)

Of course this identity holds under any probability measure Q ≪ P. As in the previous example, one should show that if X is a local martingale then (5.39) implies that X is a compensated Poisson process with parameter λ. Let us assume that for some process X under some measure (5.39) holds. In this case obviously (∆X)² = ∆X, that is, if ∆X ≠ 0 then ∆X = 1. [X] has finite variation, hence X also has finite variation, so X ∈ V ∩ L. Hence X is purely discontinuous, that is, X is a quadratic jump process: [X] = Σ(∆X)². The size of the jumps is constant, so as [X] is finite for every trajectory there is just a finite number of jumps on every finite interval. Let N(t) denote the number of jumps in the interval [0,t].

N(t) − λt = [X](t) − λt = X(t).  (5.40)

As X is a local martingale, this means that the compensator of N is λt. N is a counting process, so

exp(itN(u)) − 1 = Σ_{s≤u} (exp(itN(s)) − exp(itN(s−))) =
= Σ_{s≤u} (exp(it(N(s−) + 1)) − exp(itN(s−)))·∆N(s) =
= (exp(it) − 1) Σ_{s≤u} exp(itN(s−))·(N(s) − N(s−)) =
= (exp(it) − 1) ∫₀ᵘ exp(itN(s−)) dN(s).

Taking expected value and using elementary properties of the compensator, and that on every finite interval N has only a finite number of jumps,

φᵤ(t) ≜ E(exp(itN(u))) = 1 + (exp(it) − 1)·E( ∫₀ᵘ exp(itN(s−)) dN(s) ) =
= 1 + (exp(it) − 1)·E( ∫₀ᵘ exp(itN(s−)) dNᵖ(s) ) =
= 1 + λ(exp(it) − 1)·E( ∫₀ᵘ exp(itN(s−)) ds ) =
= 1 + λ(exp(it) − 1)·E( ∫₀ᵘ exp(itN(s)) ds ) =
= 1 + λ(exp(it) − 1)·∫₀ᵘ φₛ(t) ds,

where φᵤ(t) is the Fourier transform of N(u). Differentiating both sides with respect to u,

dφᵤ(t)/du = λ(exp(it) − 1)·φᵤ(t).

The solution of this equation is

φᵤ(t) = exp(λu(exp(it) − 1)).

Hence N(u) has a Poisson distribution with parameter λu. By (5.40) X is a compensated Poisson process with parameter λ. Finally recall that Poisson processes are independent if and only if⁵³ they have no common jumps. This means that under Q the processes πₖ remain independent Poisson processes.
⁵² See: Proposition 7.13, page 471.
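The solution φᵤ(t) = exp(λu(exp(it) − 1)) of the differential equation is indeed the characteristic function of a Poisson(λu) random variable. A short Python sketch (illustrative only; λ, u and t are arbitrary test values) compares it with the direct sum Σₙ e^{itn}·P(N(u) = n):

```python
import cmath
import math

lam, u, t = 2.5, 1.5, 0.7
mean = lam * u

# solution of  d/du phi_u(t) = lam*(e^{it} - 1)*phi_u(t),  phi_0(t) = 1
phi_ode = cmath.exp(mean * (cmath.exp(1j * t) - 1.0))

# characteristic function of Poisson(mean), probabilities built iteratively
phi_sum, term = 0.0 + 0.0j, math.exp(-mean)   # term = P(N = 0)
for n in range(200):
    phi_sum += cmath.exp(1j * t * n) * term
    term *= mean / (n + 1)                    # P(N = n+1) from P(N = n)

assert abs(phi_ode - phi_sum) < 1e-9
```

The truncation at n = 200 is far into the tail of a Poisson(3.75) distribution, so the two values agree to floating-point accuracy.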
Let ((w₁, w₂), G) be a two-dimensional Wiener process. Let X ≜ w₁•w₂ and let F be the filtration generated by X. Evidently F_t ⊆ G_t. X is obviously a local martingale under G.

E([X](T)) = E( ∫₀ᵀ w₁² d[w₂] ) = ∫₀ᵀ E(w₁²(t)) dt < ∞,

so on every finite interval X is in H₀². Hence X is a G-martingale. As X is F-adapted, one can easily show that X is an F-martingale. The quadratic variation [X] is F-adapted:

[X](t) = ∫₀ᵗ w₁² d[w₂] = ∫₀ᵗ w₁²(s) ds,

therefore the derivative of [X] is w₁². This implies that w₁² is also F-adapted. As [w₁] is deterministic,

Z ≜ ½(w₁² − [w₁]) = w₁•w₁

is also F-adapted. Z is an F-martingale: if s < t then, using that Z is a G-martingale⁵⁴,

E(Z(t) | F_s) = E(E(Z(t) | G_s) | F_s) = E(Z(s) | F_s) = Z(s).

If X had the Integral Representation Property then for some Y

Z = Y•X ≜ Y•(w₁•w₂) = Yw₁•w₂.

As w₁ and w₂ are independent, [w₁, w₂] = 0. So

0 < [Z] = [Z, Z] = [w₁•w₁, Y•X] = [w₁•w₁, Yw₁•w₂] = Yw₁²•[w₁, w₂] = 0,

which is impossible.
⁵³ See: Proposition 7.11, page 469 and Proposition 7.13, page 471.
⁵⁴ w₁ is in H₀².
6 ITÔ'S FORMULA
Itô's formula is the most important relation of stochastic analysis. The formula is a stochastic generalization of the Fundamental Theorem of Calculus. Recall that for an arbitrary process X, for an arbitrary differentiable function f and for an arbitrary partition (tₖ⁽ⁿ⁾) of an interval [0,t],

f(X(t)) − f(X(0)) = Σₖ (f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾))) = Σₖ f′(ξₖ⁽ⁿ⁾)·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),  (6.1)

where ξₖ⁽ⁿ⁾ ∈ (X(tₖ₋₁⁽ⁿ⁾), X(tₖ⁽ⁿ⁾)). If X is continuous then by the intermediate value theorem ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾), where τₖ⁽ⁿ⁾ ∈ (tₖ₋₁⁽ⁿ⁾, tₖ⁽ⁿ⁾). If X has finite variation then if n ↗ ∞ the sum on the right-hand side will be convergent and one can easily get the Fundamental Theorem of Calculus:

f(X(t)) − f(X(0)) = ∫₀ᵗ f′(X(s)) dX(s).

On the other hand, if X is a local martingale then the telescopic sum on the right-hand side of (6.1) does not necessarily converge to the stochastic integral ∫₀ᵗ f′(X(s)) dX(s), as one cannot guarantee the convergence unless τₖ⁽ⁿ⁾ = tₖ₋₁⁽ⁿ⁾. If we make a second-order approximation

f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) = f′(X(tₖ₋₁⁽ⁿ⁾))·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)) + ½f″(ξₖ⁽ⁿ⁾)·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))²

then the sum of the first-order terms

Iₙ ≜ Σₖ f′(X(tₖ₋₁⁽ⁿ⁾))·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))
is an approximating sum of the Itô–Stieltjes integral ∫₀ᵗ f′(X(s)) dX(s). Of course the sum of the second-order terms is also convergent; the only question is what is the limit. As

Σₖ (X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))² ≈ [X](t),

one can guess that the limit is

½ ∫₀ᵗ f″(X(s)) d[X](s).
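For f(x) = x² the second-order decomposition above is exact already in discrete time, which makes the guess easy to see numerically. The following Python sketch (illustrative, not from the text; path length and seed are arbitrary) simulates a Brownian path and splits the telescoping sum into its first- and second-order parts:

```python
import random

random.seed(0)
N, t = 10_000, 1.0
dt = t / N

# discretized Brownian path
W = [0.0]
for _ in range(N):
    W.append(W[-1] + random.gauss(0.0, dt ** 0.5))

# W(t)^2 - W(0)^2 = sum 2*W(t_{k-1})*dW_k + sum (dW_k)^2, exactly
first = sum(2.0 * W[k - 1] * (W[k] - W[k - 1]) for k in range(1, N + 1))
second = sum((W[k] - W[k - 1]) ** 2 for k in range(1, N + 1))

assert abs((first + second) - W[N] ** 2) < 1e-9   # exact telescoping identity
assert abs(second - t) < 0.1                      # second-order sum ~ [W](t) = t
```

The first sum approximates the Itô integral ∫₀ᵗ 2W dW and the second concentrates around the quadratic variation t, in line with the ½∫f″ d[X] term (here f″ ≡ 2).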
This is true if X is continuous, as in this case again ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾) and the second-order term is 'close' to the Stieltjes-type approximating sum

½ Σₖ f″(X(τₖ⁽ⁿ⁾))·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))².

The argument just introduced is 'nearly valid' even if X is discontinuous. In this case the first-order term is again an Itô–Stieltjes type approximating sum, it is convergent again in the Itô–Stieltjes sense, and the limit is¹

∫₀ᵗ f′(X(s)) dX(s) = ∫₀ᵗ f′(X₋(s)) dX(s).

The main difference is that in this case one cannot apply the intermediate value theorem to the second-order term. Therefore the second-order term is not a simple Stieltjes-type approximating sum. If we take only the 'continuous' subintervals, then one gets a Stieltjes-type approximating sum and the limit is

½ ∫₀ᵗ f″(X₋(s)) d[Xᶜ].

For the remaining terms one can only apply the approximation

½ f″(ξₖ⁽ⁿ⁾)·(∆X(tₖ⁽ⁿ⁾))² = f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) − f′(X(tₖ₋₁⁽ⁿ⁾))·(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),

which converges to

f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s),

¹ See: Theorem 2.21, page 125. The second integral is convergent in the general sense as well.
so in the limit the second-order term is

½ ∫₀ᵗ f″(X₋(s)) d[Xᶜ] + Σ_{0<s≤t} (f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s)).

6.1
0<s≤t
Itˆ o’s Formula for Continuous Semimartingales
Recall that for continuous semimartingales one has the following integration by parts formula2 : Proposition 6.1 If X and Y are continuous semimartingales then for every t
t
X (t) Y (t) − X (0) Y (0) =
XdY + 0
t
Y dX + [X, Y ] (t) .
(6.2)
0
Theorem 6.2 (Itˆ o’s formula) Let U be an open subset of Rn . If the elements of the vector X (X1 , X2 , . . . , Xn ) are continuous semimartingales, X (t) ∈ U for every t and f ∈ C 2 (U ), then f (X (t)) − f (X (0)) =
n
t
∂f (X) dXk + ∂x k k=1 0 1 t ∂2f (X) d [Xi , Xj ] . + 2 i,j 0 ∂xi ∂xj
(6.3)
Proof. We divide the proof into several steps. 1. As a first step we prove the theorem for polynomials. If f ≡ c, where c is a constant, then the theorem is trivial. It is sufficient to prove that if the identity is valid for a polynomial f then it is true for the polynomial g xl f as well. Assume, that f (X) = f (X (0)) +
∂f 1 ∂2f (X) • Xk + (X) • [Xi, Xj ] . ∂xk 2 i,j ∂xi ∂xj k
By (6.2) g (X) Xl f (X) = = g (X (0)) + Xl • f (X) + f (X) • Xl + [Xl , f (X)] = 2 See:
Proposition 2.28, page 129.
354
ˆ FORMULA ITO’s
∂f = g (X (0)) + Xl • f (X (0)) + Xl • (X) • Xk + ∂xk k ∂2f 1 (X) • [Xi, Xj ] + + Xl • 2 i,j ∂xi ∂xj + f (X) • Xl + [Xl , f (X)] . Now Xl • f (X (0)) = 0, and by the associativity rule for stochastic integrals3 g (X) = g (X (0)) +
Xl
k
∂f (X) • Xk + ∂xk
∂2f 1 Xl (X) • [Xi, Xj ] + + 2 i,j ∂xi ∂xj + f (X) • Xl + [Xl , f (X)] . By the product rule of differentiation ∂f xl ∂x ∂g k = ∂xk xl ∂f + f ∂xl
if k = l .
(6.4)
if k = l
Substituting it in the formula above, g (X) = g (X (0)) +
∂g (X) • Xk + ∂xk k
∂2f 1 Xl (X) • [Xi, Xj ] + [Xl , f (X)] . + 2 i,j ∂xi ∂xj The second partial derivatives of g are ∂2f x l ∂xi ∂xj ∂2f ∂f x + l ∂xl ∂xj ∂xj ∂2g = ∂xi ∂xj ∂f ∂2f xl /+ ∂xi ∂xl ∂xi 2 ∂ f ∂f xl +2 ∂ 2 xl ∂xl 3 See:
Proposition 2.71, page 160.
if i, j = l if i = l, j = l , if i = l, j = l if i = j = l
(6.5)
that is, the second-derivative matrices of g and xₗ·f differ only in column l and in row l. It is therefore sufficient to prove that

[Xₗ, f(X)] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X)•[Xₗ, Xⱼ].

By the induction hypothesis f(X) is a semimartingale. As Xₗ is continuous, the quadratic co-variation of Xₗ with the bounded variation part of f(X) is zero. The quadratic co-variation with the stochastic integral part is

[Xₗ, Σₖ₌₁ⁿ (∂f/∂xₖ)(X)•Xₖ] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X)•[Xₗ, Xⱼ].

This means that the theorem is valid for polynomials.
2. Let us prove that one can localize the expression. That is, it is sufficient to prove the theorem for X^{τ_n}, where (τ_n) is some localizing sequence of X. Let τ be an arbitrary stopping time. The integrals in the second line are integrals taken by trajectory, hence obviously

(∂²f/∂xᵢ∂xⱼ)(X^τ)•[Xᵢ^τ, Xⱼ^τ] = (∂²f/∂xᵢ∂xⱼ)(X^τ)•[Xᵢ, Xⱼ]^τ = (∂²f/∂xᵢ∂xⱼ)(X)χ([0,τ])•[Xᵢ, Xⱼ].

In a similar way, using the stopping rule for stochastic integrals,

(∂f/∂xₖ)(X^τ)•Xₖ^τ = (∂f/∂xₖ)(X^τ)χ([0,τ])•Xₖ = (∂f/∂xₖ)(X)χ([0,τ])•Xₖ.

Assume that the theorem is valid for the truncated processes X^{τ_n}. f ∈ C²(U), hence the trajectories of ∂f/∂xₖ(X) and ∂²f/∂xᵢ∂xⱼ(X) are continuous and therefore they are integrable. Evidently the integrands above are dominated by these common integrable processes. If τ_n → ∞, then χ([0,τ_n]) → 1. Applying the Dominated Convergence Theorem on both sides and using that f(X^{τ_n}) → f(X), one can easily prove the equality.
3. As X is continuous it is locally bounded. Let (τ_n) be a localizing sequence for which the images of the stopped processes X^{τ_n} are bounded. Let K ⊆ U be a compact set which contains the image of X^{τ_n}. One can prove that there is a sequence of polynomials (p_n) such that in the topology of C²(K) one has p_n|_K → f|_K. By the definition of the topology of C², all the derivatives
are uniformly convergent. As the formula is valid for every polynomial, by the Dominated Convergence Theorem it is valid for the function f ∈ C²(U) as well.
Proposition 6.3 If the semimartingale Xₗ has finite variation, then it is sufficient to assume that the partial derivative ∂f/∂xₗ exists and is continuous. In this case in formula (6.3) one can drop the second-order terms with index l.
Proof. If Xₗ has finite variation then, as Xᵢ is continuous, [Xₗ, Xᵢ] = 0. If f is a polynomial, then the second-order terms with index l are zero, and in the approximation we do not need the second-order terms with index l.
Corollary 6.4 (Time-dependent Itô formula) If the elements of the vector X ≜ (X₁, X₂, …, Xₙ) are continuous semimartingales, the image space of X is part of an open subset U ⊆ Rⁿ and f ∈ C²(R₊ × U), then⁴

f(t, X(t)) = f(0, X(0)) + ∫₀ᵗ (∂f/∂s)(s, X(s)) ds + Σᵢ₌₁ⁿ ∫₀ᵗ (∂f/∂xᵢ)(s, X(s)) dXᵢ(s) +
+ ½ Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ∫₀ᵗ (∂²f/∂xᵢ∂xⱼ)(s, X(s)) d[Xᵢ, Xⱼ](s).

If X and Y are real-valued semimartingales then we can define the object Z ≜ X + iY, which one can call a complex semimartingale. Let f : C → C be a holomorphic function. f(z) has the representation u(x,y) + iv(x,y), where u and v are differentiable functions. Recall the Cauchy–Riemann equations

∂u/∂x = ∂v/∂y and ∂u/∂y = −∂v/∂x.

If Z is a complex semimartingale then f(Z) = u(X,Y) + iv(X,Y).
⁴ It is sufficient to assume that f is continuously differentiable in the time parameter.
One can apply Itô's formula to u and to v:

u(X(t), Y(t)) = u(X(0), Y(0)) + ∫₀ᵗ (∂u/∂x)(X,Y) dX + ∫₀ᵗ (∂u/∂y)(X,Y) dY +
+ ½ ∫₀ᵗ (∂²u/∂x²)(X,Y) d[X,X] + ½ ∫₀ᵗ (∂²u/∂y²)(X,Y) d[Y,Y] + ∫₀ᵗ (∂²u/∂x∂y)(X,Y) d[X,Y]

and

v(X(t), Y(t)) = v(X(0), Y(0)) + ∫₀ᵗ (∂v/∂x)(X,Y) dX + ∫₀ᵗ (∂v/∂y)(X,Y) dY +
+ ½ ∫₀ᵗ (∂²v/∂x²)(X,Y) d[X,X] + ½ ∫₀ᵗ (∂²v/∂y²)(X,Y) d[Y,Y] + ∫₀ᵗ (∂²v/∂x∂y)(X,Y) d[X,Y].

The sum of the first-order terms is

∫₀ᵗ (∂u/∂x + i·∂v/∂x)(X,Y) dX + ∫₀ᵗ (∂u/∂y + i·∂v/∂y)(X,Y) dY.

As u_x + iv_x = v_y − iu_y = f′, this sum is

∫₀ᵗ f′(Z) dX + ∫₀ᵗ f′(Z) d(iY) = ∫₀ᵗ f′(Z) dZ.
Let us calculate the second-order terms. By the Cauchy–Riemann equations,

∫₀ᵗ (∂²u/∂y² + i·∂²v/∂y²) d[Y,Y] = −∫₀ᵗ (∂²u/∂x² + i·∂²v/∂x²) d[Y,Y] = ∫₀ᵗ (∂²u/∂x² + i·∂²v/∂x²) d[iY, iY],

and

∫₀ᵗ (∂²u/∂x∂y + i·∂²v/∂x∂y) d[X,Y] = ∫₀ᵗ i·(∂²u/∂x² + i·∂²v/∂x²) d[X,Y] = ∫₀ᵗ (∂²u/∂x² + i·∂²v/∂x²) d[X, iY],

while the d[X,X] term is unchanged. Also, by definition, [Z] ≜ [X] + 2i[X,Y] − [Y]. Therefore the second-order term is

½ ∫₀ᵗ (∂²u/∂x² + i·∂²v/∂x²) d[Z] = ½ ∫₀ᵗ f″(Z) d[Z].
Corollary 6.5 (Itô's formula for holomorphic functions) If f(t,z) is continuously differentiable in t and holomorphic in z, and Z is a continuous complex semimartingale, then

f(t, Z(t)) = f(0, Z(0)) + ∫₀ᵗ (∂f/∂s)(s, Z(s)) ds + ∫₀ᵗ (∂f/∂z)(s, Z(s)) dZ(s) + ½ ∫₀ᵗ (∂²f/∂z²)(s, Z(s)) d[Z](s).

Example 6.6 If Z ≜ w₁ + iw₂ is a planar Brownian motion and f is an entire function, then f(Z) is a complex local martingale and

f(Z(t)) = f(Z(0)) + ∫₀ᵗ f′(Z) dZ.

As [w₁, w₂] = 0 and [w₁](t) = [w₂](t) = t, obviously [Z] = 0.
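The vanishing of [Z] for planar Brownian motion can be seen in discrete time: Σ(∆Z)² = Σ(∆X² − ∆Y²) + 2iΣ∆X∆Y, and both sums concentrate near 0. The following Python sketch (illustrative simulation, not from the text) also checks the telescoping identity for the entire function f(z) = z²:

```python
import random

random.seed(3)
N, t = 20_000, 1.0
dt = t / N
# discretized planar Brownian motion as a complex path
Z = [0j]
for _ in range(N):
    Z.append(Z[-1] + complex(random.gauss(0.0, dt ** 0.5),
                             random.gauss(0.0, dt ** 0.5)))

bracket = sum((Z[k] - Z[k - 1]) ** 2 for k in range(1, N + 1))      # ~ [Z](t) = 0
ito_sum = sum(2.0 * Z[k - 1] * (Z[k] - Z[k - 1]) for k in range(1, N + 1))

assert abs(Z[N] ** 2 - (ito_sum + bracket)) < 1e-9   # exact for f(z) = z^2
assert abs(bracket) < 0.2                            # second-order term vanishes
```

So for f(z) = z² only the stochastic-integral term survives in the limit, in agreement with f(Z(t)) = f(Z(0)) + ∫₀ᵗ f′(Z) dZ.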
SOME APPLICATIONS OF THE FORMULA
6.2
359
Some Applications of the Formula
In this section we present some famous and important applications of the formula. 6.2.1
Zeros of Wiener processes
As a first application let us investigate some important properties of the multidimensional Wiener processes. By definition assume that the coordinates of a d-dimensional Wiener process w are independent one-dimensional Wiener processes. To simplify the notation we say that a stochastic process w is a d-dimensional Wiener process starting from some point x ∈ Rd if it has the rep where w is an ordinary d-dimensional Wiener process, resentation w = x + w, obviously starting from the origin. In the same way if x is an F0 -measurable random vector then one can talk about a Wiener process starting from x. Assume that w starts from some vector x. Let5 ϑ inf {w (t) : t ≥ 0} . What is the distribution of ϑ? Theorem 6.7 (Return of a Wiener process to the origin) Every d-dimensional Wiener process w starting from some vector x = 0 satisfies the following6 : 1. If d ≥ 2 then for almost every outcome ω the trajectory w(ω) is never zero, that is P (w (t) = 0, ∀t > 0) = 1. 2. If d = 2 then P (ϑ = 0) = 1, that is, w is almost surely never zero, but it hits every neighborhood of the origin almost surely. 3. If d = 2 then the trajectories of w are almost surely dense in R2 . 4. If d ≥ 3 and w (0) = x = 0 then P (ϑ ≤ r) = 5 In
this section x denotes the norm
6 See:
Corollary B.8. page 565.
r x
d−2 ,
k
x2k .
if
0 ≤ r ≤ x .
360
ˆ FORMULA ITO’s
Proof. Assume that the twice continuously differentiable function f defined on an open set U ⊆ Rd satisfies the Laplace equation d ∂2f k=1
∂x2i
= 0,
f ∈ C 2 (U ) .
(6.6)
Let τ be a stopping time. If a d-dimensional Wiener process w starting from an x remains in U then by Itˆ o’s formula
f (wτ ) − f (w (0)) =
d ∂f (wτ ) • wkτ + ∂xk
k=1
+
' & 1 ∂2f (wτ ) • wiτ , wjτ . 2 i,j ∂xi ∂xj
& ' If i = j then7 wiτ , wjτ = 0τ = 0. Hence as [wiτ ] (s) = s ∧ τ (6.7) f (wτ (t)) − f (x) = f (wτ (t)) − f (w (0)) = d t ∂f 1 τ (wτ ) dwkτ + (∆f )(wτ (s))ds = = 2 0 0 ∂xk k=1
=
d k=1
0
t
∂f (wτ ) dwkτ . ∂xk
Assume that τ < ∞ and w is bounded on the random interval [0, τ ]. In this case the integrands in (6.7) are bounded. As on any finite interval wτ is squareintegrable the stochastic integrals are martingales8 . Hence for every point of time t<∞ E (f (wτ (t))) = E (f (w (t ∧ τ ))) = E (f (w (0))) = E (f (x)) . By the assumption w is bounded on [0, τ ], therefore by the Dominated Convergence Theorem one can take the limit t → ∞. Hence we get the so-called Dynkin’s formula E (f (w (τ ))) = E (f (x)) . 7 See: 8 See:
Example 2.46, page 144. Proposition 2.59, page 152.
(6.8)
SOME APPLICATIONS OF THE FORMULA
361
Using Dynkin’s formula one can deduce the theorem with some direct calculation: 1. With a simple calculation one can show that the function9 f (u)
log u 2−d u
if d = 2 if d ≥ 3
(6.9)
satisfies the Laplace equation (6.6) on the open set U Rd \ {0}. 2. Assume10 that x lies in between the radii 0 < r < R < ∞. The trajectories of Wiener processes are continuous and almost surely unbounded on the half-line t ≥ 0, hence w almost surely leaves the d-dimensional ring B {u : r ≤ u ≤ R} . The only question is whether it leaves the ring first at the outer or at the inner boundary of B. 3. Assume that d ≥ 3. Let11 τ ∂B inf {t : w (t) ∈ ∂B} . By Dynkin’s formula E (f (w (τ ∂B ))) = E (f (x)) = f (x) .
(6.10)
Substituting expression (6.9) for f in (6.10) 2−d
r2−d P (w (τ ∂B ) = r) + R2−d (1 − P (w (τ ∂B ) = r)) = x
,
that is 2−d
P (w (τ ∂B ) = r) =
− R2−d x . r2−d − R2−d
If R → ∞ then using that R2−d converges to zero if R → ∞, the limit of the d−2 right-hand side is (r/ x) . This is the probability that w intersects the ball with radius r ≤ x. 2−d
≡ 1, and one cannot use the previous calcu4. If d = 2 then u lation. In this case f (u) log u. Using the fact that in this case 9 Observe
that if x → ∞ then f behaves differently if n = 2 and if n ≥ 3. that x can be random. It is sufficient to assume, that x is deterministic. 11 See: Example 1.32, page 17. 10 Observe
362
ˆ FORMULA ITO’s
limu→∞ f (u) = ∞ lim P (w (τ ∂B ) = r) = lim
R→∞
R→∞
log x − log R = 1. log r − log R
Hence with probability one w intersects the ball with radius r. For some fixed R if r 0 then log x − log R = 0. r0 log r − log R lim
This means that w starting from some x = 0 reaches the point 0 with probability zero before it leaves the ball with radius R. It is valid for every R so if x = 0 then w can be exactly zero only with probability zero. 5. Assume that x = 0. Let ε > 0. P inf w (t) > 0 = P inf w (t) > 0 | w (ε) = y dρ (y) , t≥ε
t≥ε
Rd
where ρ is the distribution of w (ε). Let us calculate the conditional probability. As w has stationary and independent increments
P inf w (t) > 0 | w (ε) = y
=
t≥ε
= P inf w (t) − w (ε) + w (ε) > 0 | w (ε) = y t≥ε
=
= P inf w (t) − w (ε) + y > 0 = t≥ε
=P
inf w (u) + y > 0
u≥0
=P
=
inf wy (u) > 0 ,
u≥0
where wy is the Wiener process starting from the point y. By the formula already proved for x = 0 in 3. and 4. above P inf w (t) > 0 = t≥ε
Rd
P inf wy (t) > 0 dρ (y) =
=
Rd \{0}
=
t≥0
y
P inf w (t) > 0 dρ (y) = t≥0
1dρ (y) = 1. Rd \{0}
SOME APPLICATIONS OF THE FORMULA
363
If ε → 0 then P (w (t) > 0, ∀t > 0) = lim P inf w (t) > 0 = 1. t≥ε
ε0
This means that with probability one w does not return to the origin. Hence we have proved the theorem for all initial vectors x ∈ ℝ^d.

6. Instead of balls around the origin one can take any ball. If we take the balls with rational centers and rational radii then the two-dimensional Wiener process with probability one intersects all of them. Therefore the trajectories of the two-dimensional Wiener processes are dense in ℝ².

In the same way one can prove the following:

Corollary 6.8 Let d ≥ 3 and let w be a d-dimensional Wiener process starting from some random vector x. If x is deterministic then

$$\mathbf{P}(\vartheta\le r)=\left(\frac{r}{\|x\|}\right)^{d-2},\qquad\text{if }0\le r\le\|x\|.$$
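Behind Corollary 6.8 lies the harmonicity of u ↦ ‖u‖^{2−d} away from the origin (and of u ↦ log‖u‖ when d = 2). A quick numerical sanity check of this fact — a sketch only, using a seven-point finite-difference Laplacian in d = 3 at an arbitrarily chosen point and step sizes:

```python
# Finite-difference check that f(u) = |u|^(2-d) is harmonic for d = 3
# away from the origin: the discrete Laplacian of f should vanish as h -> 0.

def f(x, y, z):
    return (x * x + y * y + z * z) ** -0.5  # |u|^{2-d} with d = 3

def laplacian(g, p, h):
    x, y, z = p
    return (g(x + h, y, z) + g(x - h, y, z)
            + g(x, y + h, z) + g(x, y - h, z)
            + g(x, y, z + h) + g(x, y, z - h)
            - 6 * g(x, y, z)) / h ** 2

p = (0.7, -0.4, 1.1)          # an arbitrary point away from the origin
err_coarse = abs(laplacian(f, p, 1e-2))
err_fine = abs(laplacian(f, p, 1e-3))
print(err_coarse, err_fine)   # both tiny; the error shrinks with h
```

The discretization error is of order h², so refining the grid by a factor of ten shrinks the residual by roughly a hundred.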
Corollary 6.9 If d ≥ 3 and w is a d-dimensional Wiener process then lim_{t→∞} ‖w(t)‖ = ∞ almost surely.

Proof. Let r > 0 be arbitrary and for any a ≥ r let τ_a ≜ inf{t : ‖w(t)‖ ≥ a}. As almost surely¹²

$$\limsup_{t\to\infty}\|w(t)\|=\infty,$$

obviously τ_a < ∞ almost surely. By the strong Markov property of w

$$w^*(t)\triangleq(w(t+\tau_a)-w(\tau_a))+w(\tau_a),\qquad t\ge0,$$

is a Wiener process starting from the random point w(τ_a) ∈ {‖u‖ = a}. Since d ≥ 3

$$\mathbf{P}(\exists t\ge\tau_a,\ \|w(t)\|\le r)=\mathbf{P}(\exists t\ge0,\ \|w^*(t)\|\le r)=\left(\frac{r}{a}\right)^{d-2}.$$

12 See: Proposition B.7, page 564.
If a ↑ ∞ then this probability goes to zero. Let a_n ↑ ∞. The probability that w(t) returns to the ball {‖u‖ ≤ r} after infinitely many τ_{a_n} is zero. Hence with probability one for any ω there is an n ≜ n(ω) such that

$$w(t,\omega)\notin\{\|u\|\le r\},\qquad t\ge\tau_{a_n}(\omega).$$

That is, with probability one¹³ if t ↑ ∞ then ‖w(t,ω)‖ → ∞.

Example 6.10 Hitting times of open and closed sets in higher dimensions¹⁴.
1. Let B(x₀, r) ≜ {x ∈ ℝ^d : ‖x − x₀‖ < r}. Let x₀ ≠ 0, x₀ ∈ B(0, 1), and let

$$f(x)\triangleq g(\|x-x_0\|)\triangleq\begin{cases}\log\|x-x_0\| & \text{if } d=2,\\ \|x-x_0\|^{2-d} & \text{if } d\ge3.\end{cases}$$

Obviously f satisfies the Laplace equation (6.6) on ℝ^d \ {x₀}. If B(x₀, r) ⊆ B(0, 1) and B ≜ B(0, 1) \ cl(B(x₀, r)) then f is bounded on B. Let w be a d-dimensional Wiener process and let τ ≜ inf{t : w(t) ∈ ∂B(0, 1)}. As lim sup_t ‖w(t)‖ = ∞, obviously¹⁵ almost surely τ < ∞. By Itô's formula X ≜ f(w^τ) is a bounded local martingale on B, therefore X is a uniformly integrable martingale¹⁶. Hence if ρ ≜ inf{t : w(t) ∈ ∂B}, then

$$\mathbf{E}(X(\rho))=\mathbf{E}(X(0))=f(0).$$

If

$$\rho_1\triangleq\inf\{t:w(t)\in\partial B(0,1)\},\qquad\rho_2\triangleq\inf\{t:w(t)\in\partial B(x_0,r)\},$$

13 Take r = 1, 2, . . ..
14 See: Corollary B.12, page 566.
15 See: Proposition B.7, page 564.
16 See: Corollary 1.145, page 103.
then as ρ = ρ₁ ∧ ρ₂

$$f(0)=\mathbf{E}(X(\rho))=\mathbf{E}(X(\rho_1)\chi(\rho_1\le\rho_2))+\mathbf{E}(X(\rho_2)\chi(\rho_2<\rho_1)).$$

Obviously

$$\mathbf{E}(X(\rho_2)\chi(\rho_2<\rho_1))=g(r)\cdot\mathbf{P}(\rho_2<\rho_1)$$

and for some k, |E(X(ρ₁)χ(ρ₁ ≤ ρ₂))| ≤ k for all r > 0. This implies that for any 0 < r < 1

$$\mathbf{P}(\rho_2<\rho_1)=\frac{g(\|x_0\|)-\mathbf{E}(X(\rho_1)\chi(\rho_1\le\rho_2))}{g(r)}\le\frac{|g(\|x_0\|)|+k}{|g(r)|}.$$

If r ↓ 0, then the right-hand side goes to zero, so for any ε > 0 there is an r > 0 such that P(ρ₂ < ρ₁) < ε.

2. Let (qᵢ) be the non-zero rational points of B(0, 1) and for any i let rᵢ > 0 be such that

$$\mathbf{P}\left(\rho_2^{(i)}<\rho_1\right)<2^{-(i+1)},$$

where of course

$$\rho_2^{(i)}\triangleq\inf\{t:w(t)\in\partial B(q_i,r_i)\}.$$

Let G ≜ ∪ᵢ B(qᵢ, rᵢ). Obviously G is open and

$$\tau_{\mathrm{cl}(G)}\triangleq\inf\{t:w(t)\in\mathrm{cl}(G)\}=\inf\{t:w(t)\in\mathrm{cl}(B(0,1))\}=0.$$

On the other hand obviously ρ₁ > 0 and if τ_G ≜ inf{t : w(t) ∈ G} then

$$\mathbf{P}(\tau_G\ge\rho_1)=1-\mathbf{P}(\tau_G<\rho_1)\ge1-\sum_i\mathbf{P}\left(\rho_2^{(i)}<\rho_1\right)\ge1-\sum_i2^{-(i+1)}\ge\frac12.$$

Therefore τ_{cl(G)} and τ_G are not almost surely equal.
6.2.2 Continuous Lévy processes

Let X be a continuous Lévy process. Since X is continuous all the moments of X are finite¹⁷. Hence X(t) has an expected value for every t. Observe that, as on any finite interval the second moments are bounded, X is uniformly integrable on these intervals. Therefore E(X(t)) is continuous in t, hence E(X(t)) = tE(X(1)). Therefore if m denotes the expected value of X(1) then X(t) − t·m is a martingale. This means that X is a continuous semimartingale. To simplify the notation assume that m = 0. By the definition of the quadratic variation [X] is also a continuous Lévy process. This again implies that Y(t) ≜ [X](t) − E([X](t)) is a martingale. As Y obviously has finite variation, by Fisk's theorem¹⁸ it is constant. So [X](t) = E([X](t)) = a·t. By Itô's formula

$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}{2}\int_0^t\exp(iuX(s))\,d[X](s).$$

exp(iuX(t)) is bounded and the quadratic variation of X is deterministic, therefore by the characterization of H²-martingales¹⁹ the stochastic integral is a martingale. Taking expected value on both sides

$$\mathbf{E}(\exp(iuX(t)))-1=-\frac{u^2}{2}\mathbf{E}\left(\int_0^t\exp(iuX(s))\,d[X](s)\right)=-\frac{u^2}{2}\mathbf{E}\left(\int_0^t\exp(iuX(s))\,d(as)\right)=-a\frac{u^2}{2}\int_0^t\mathbf{E}(\exp(iuX(s)))\,ds.$$

If φ(u, t) ≜ E(exp(iuX(t))) then

$$\varphi(u,t)-1=-a\frac{u^2}{2}\int_0^t\varphi(u,s)\,ds.$$

Differentiating with respect to t,

$$\frac{d\varphi(u,t)}{dt}=-a\frac{u^2}{2}\cdot\varphi(u,t).$$

Solving the differential equation,

$$\varphi(u,t)=\exp\left(-a\frac{u^2}{2}t\right)$$

17 See: Proposition 1.111, page 74.
18 See: Theorem 2.11, page 117.
19 See: Proposition 2.53, page 148.

for every u. By the formula of the Fourier transform for the normal distribution X(t) ≅ N(0, √(at)). Hence X/√a is a Wiener process. In general m is not zero, hence we have proved the next proposition:

Theorem 6.11 Every continuous Lévy process is a linear combination of a Wiener process and a linear trend.

One can extend the theorem to processes with independent increments:

Theorem 6.12 Every continuous process with independent increments is a Gaussian process, that is, for every t₁, t₂, ..., tₙ the vector (X(t₁), X(t₂), ..., X(tₙ)) has Gaussian distribution.

Proof. If X has independent increments then Z(t) ≜ X(t + s) − X(s) also has independent increments for every s. Therefore it is easy to prove that it is sufficient to show that X(t) has a Gaussian distribution for every t. By the continuity of X all the moments of X are bounded on every finite interval²⁰. Therefore the expected value E(X(t)) is finite for every t. As X is bounded in L²(Ω) on every finite interval it is uniformly integrable on any finite interval, so E(X(t)) is continuous. Hence it is easy to see that Y(t) ≜ X(t) − E(X(t)) is a continuous martingale. Therefore one may assume that X is a continuous martingale. As X has independent increments [X] also has independent increments, so U(t) ≜ [X](t) − E([X](t)) is again a continuous martingale. As [X] is increasing U has finite variation. So by Fisk's theorem almost surely U ≡ 0. Therefore one can assume that [X] is deterministic. By Itô's formula

$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}{2}\int_0^t\exp(iuX(s))\,d[X](s).$$
exp(iuX) is bounded and on any finite interval X ∈ H², therefore the stochastic integral is a martingale²¹. Taking expected value

$$\mathbf{E}(\exp(iuX(t)))-1=-\frac{u^2}{2}\cdot\mathbf{E}\left(\int_0^t\exp(iuX(s))\,d[X](s)\right).\qquad(6.11)$$

20 See: Proposition 1.114, page 78.
21 See: Proposition 2.24, page 128.
The quadratic variation is deterministic so one can change the order of the integration:

$$\mathbf{E}(\exp(iuX(t)))-1=-\frac{u^2}{2}\cdot\int_0^t\mathbf{E}(\exp(iuX(s)))\,d[X](s).$$

If φ(u, t) ≜ E(exp(iuX(t))), then φ satisfies the integral equation

$$\varphi(u,t)-1=-\frac{u^2}{2}\cdot\int_0^t\varphi(u,s)\,d[X](s).\qquad(6.12)$$

If

$$\varphi(u,t)\triangleq\exp\left(-\frac{u^2}{2}[X](t)\right)\qquad(6.13)$$

then, as [X] is deterministic with finite variation, φ satisfies²² (6.12). One can easily prove²³ that (6.13) is the only solution of (6.12). Therefore X(t) has a Gaussian distribution for every t.

6.2.3 Lévy's characterization of Wiener processes
The characterization theorem of Lévy is similar to the proposition just proved: it characterizes Wiener processes among the continuous local martingales. If X ∈ 𝓛 and if [X](t) = t then by the same argument²⁴ as above one can prove that X(t) ≅ N(0, √t). As X(t + s) − X(s) ∈ 𝓛 for every s, the increments of X are also Gaussian. As X(u) − X(v) ≅ N(0, √(u − v)) it is easy to prove that the increments of X are uncorrelated. As X has Gaussian increments the increments are independent. Therefore by the same argument as above one can prove that X is a Wiener process with respect to its own filtration²⁵. Our goal is to prove that X is a Wiener process with respect to the original filtration²⁶.

Theorem 6.13 (Lévy's characterization of Wiener processes) Let us fix a filtration F. If the n-dimensional continuous process X ≜ (X₁, X₂, ..., Xₙ) is zero at t = 0 then the next three statements are equivalent:

1. X is an n-dimensional Wiener process with respect to F.
2. X is a local martingale with respect to F and [Xᵢ, Xⱼ](t) = δᵢⱼt.

22 See: (6.32), page 398.
23 See: (6.48), page 416.
24 Of course X ∈ H²_loc and not X ∈ H², so one can first localize X and then take the limit in (6.11); otherwise the argument is nearly the same.
25 See: Definition B.1, page 559.
26 See: Definition B.4, page 561.
3. Whenever fₖ ∈ L²(ℝ₊, λ), where λ is Lebesgue's measure, then

$$\mathcal{E}(i(f\bullet X))(t)\triangleq\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k+\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$

will be a complex martingale with respect to F.

In particular, if X is a continuous local martingale and Y(t) ≜ X²(t) − t is a continuous local martingale then X is a Wiener process.

Proof. Let us show that each statement implies the next one.

1. The implication 1. ⇒ 2. follows²⁷ from the relation [w](t) = t.

2. The proof of the implication 2. ⇒ 3. is the following: using Itô's formula, with a simple calculation one can show that ℰ(if • X) is a local martingale. As fₖ ∈ L²(ℝ₊, λ),

$$\mathcal{E}(i(f\bullet X))(t)=\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k\right)\exp\left(\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$

is uniformly bounded, hence it is a local martingale in class D. Hence ℰ(if • X) is a martingale²⁸.
n
t
E (if • X) = 0 is a martingale, hence if s < t < r then
−1 1 = E E (if • X) (t) (E (if • X) (s)) | Fs = 1 2 = E exp i (u, X (t) − X (s)) + u2 (t − s) | Fs , 2 therefore 1 2 E (exp (i (u, X (t) − X (s))) | Fs ) = exp − u2 (t − s) , 2 27 See: 28 See:
Example 2.27. page 129, Example 2.46, page 144. Proposition 1.144, page 102.
ˆ FORMULA ITO’s
370
which means that for any set F ∈ Fs F
1 2 exp (i (u, X (t) − X (s))) dP = P (F ) · exp − u2 (t − s) . 2
√ If F = Ω then this implies that the distribution of Xi (t)−Xi (s) is N 0, t − s . Therefore
exp (i (u, X (t) − X (s))) dP = P (F ) ·
exp (i (u, X (t) − X (s))) dP Ω
F
Since this equality holds for every trigonometric polynomial, by the Monotone Class Theorem for every B ∈ Rn
=
P ({X (t) − X (s) ∈ B} ∩ F ) = χB (X (t) − X (s)) dP = P (F ) χB (X (t) − X (s)) dP = Ω
F
= P ({X (t) − X (s) ∈ B}) · P (F ) . Hence the increment X (t) − X (s) is independent of the σ-algebra Fs . So X is a Wiener process.
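The relation [w](t) = t that drives the theorem can be illustrated numerically: for a discretized Wiener path the sum of squared increments over [0, 1] approaches 1. A sketch only — the step count and random seed are arbitrary choices:

```python
import numpy as np

# Discretized check of [w](t) = t: the sum of squared increments of a
# simulated Wiener path over [0, 1] should be close to 1.
rng = np.random.default_rng(0)
n = 100_000
dt = 1.0 / n
dw = rng.normal(0.0, np.sqrt(dt), size=n)  # Wiener increments
qv = np.sum(dw ** 2)                       # discrete quadratic variation
print(qv)  # close to t = 1
```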
Example 6.14 For every Wiener process w the integral sgn(w) • w is a Wiener process.

The process is a continuous local martingale. The quadratic variation of sgn(w) • w is

$$\int_0^t(\mathrm{sgn}(w))^2\,d[w]=\int_0^t(\mathrm{sgn}(w(s)))^2\,ds=t,$$

so by Lévy's theorem it is a Wiener process.
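A numerical sketch of Example 6.14: discretize X = sgn(w) • w over many paths and check that X(1) has mean near 0 and variance near [X](1) = 1. Path count, step count and seed are arbitrary choices, and the convention sgn(0) ≜ 1 is harmless in the limit:

```python
import numpy as np

# Simulate X = sgn(w) . w for many paths and compare Var(X(1)) with t = 1.
rng = np.random.default_rng(1)
paths, n = 20_000, 500
dt = 1.0 / n
dw = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
w = np.cumsum(dw, axis=1)
# integrand evaluated at the left endpoint, as in the Ito sums
sgn = np.sign(np.concatenate([np.zeros((paths, 1)), w[:, :-1]], axis=1))
sgn[sgn == 0] = 1.0            # convention at w = 0 (a null set in the limit)
x1 = np.sum(sgn * dw, axis=1)  # discrete stochastic integral at t = 1
print(x1.mean(), x1.var())     # mean near 0, variance near 1
```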
Example 6.15 The reflected Wiener process is also a Wiener process.

Let w be a Wiener process and let τ be a stopping time. Define the reflected process

$$\widetilde w(t,\omega)\triangleq\begin{cases}w(t,\omega) & \text{if } t\le\tau(\omega),\\ 2w(\tau(\omega),\omega)-w(t,\omega) & \text{if } t>\tau(\omega)\end{cases}=(2w^\tau-w)(t,\omega).$$

Obviously w̃(0) = 0, and the trajectories of w̃ are continuous. It is also obvious that

$$[\widetilde w]=[2w^\tau-w]=[2w^\tau]-2[2w^\tau,w]+[w]=4[w]^\tau-4[w]^\tau+[w]=[w].$$

As w^τ is a martingale and the sum of local martingales is again a local martingale, w̃ is a continuous local martingale, so by Lévy's theorem it is a Wiener process.

Let us discuss an interesting relation between exponential martingales and the quadratic variation:

Proposition 6.16 Let X and A be continuous adapted processes on the half-line t ≥ 0. If X(0) = 0 then the next statements are equivalent:

1. A has finite variation and for every α ∈ ℂ the process Y_α ≜ exp(αX − α²A/2) is a local martingale,
2. [X] = A, and X is a local martingale.
Proof. We prove that each statement implies the other one.

1. Assume that Y_α is a local martingale and let (σₙ) be a localizing sequence of Y_α. Let

$$\tau_n\triangleq\inf\{t:|X(t)|\ge n\}\wedge\inf\{t:|A(t)|\ge n\}\wedge\sigma_n.$$

Y_α^{τₙ} is a martingale and obviously

$$\left|Y_\alpha^{\tau_n}\right|\le\exp\left(|\alpha|n+\frac{|\alpha|^2}{2}n\right),$$

$$\left|\frac{d}{d\alpha}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|X^{\tau_n}-\alpha A^{\tau_n}\right|,\qquad\left|\frac{d^2}{d\alpha^2}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|\left(X^{\tau_n}-\alpha A^{\tau_n}\right)^2-A^{\tau_n}\right|.$$

It is easy to see that if α is in a bounded neighbourhood of the origin then the expressions on the right-hand side are bounded. Hence in the next calculation one can differentiate under the integral sign at α = 0.
If α = 0 then

$$\frac{d}{d\alpha}Y_\alpha^{\tau_n}=X^{\tau_n},$$

hence for any F ∈ 𝓕ₛ

$$\begin{aligned}\int_F\mathbf{E}(X^{\tau_n}(t)\mid\mathcal{F}_s)\,d\mathbf{P}&=\int_F\mathbf{E}\left(\frac{d}{d\alpha}Y_\alpha^{\tau_n}(t)\,\Big|\,\mathcal{F}_s\right)d\mathbf{P}=\int_F\frac{d}{d\alpha}Y_\alpha^{\tau_n}(t)\,d\mathbf{P}=\frac{d}{d\alpha}\int_FY_\alpha^{\tau_n}(t)\,d\mathbf{P}=\\&=\frac{d}{d\alpha}\int_FY_\alpha^{\tau_n}(s)\,d\mathbf{P}=\int_F\frac{d}{d\alpha}Y_\alpha^{\tau_n}(s)\,d\mathbf{P}=\int_FX^{\tau_n}(s)\,d\mathbf{P},\end{aligned}$$

therefore almost surely

$$\mathbf{E}(X^{\tau_n}(t)\mid\mathcal{F}_s)=X^{\tau_n}(s).$$

Therefore X^{τₙ} is a martingale. Hence X is a local martingale. In a similar way, using that at α = 0

$$\frac{d^2Y_\alpha^{\tau_n}}{d\alpha^2}=(X^{\tau_n})^2-A^{\tau_n},$$

one can prove that (X^{τₙ})² − A^{τₙ} is a martingale. This implies²⁹ that A is increasing and [X] = A.

2. The implication 2. ⇒ 1. is an easy consequence of Itô's formula. As the quadratic variation of a continuous semimartingale is equal to the quadratic variation of its local martingale part, if Z ≜ αX − α²A/2, then Y_α = exp(Z) and

$$\begin{aligned}Y_\alpha-Y_\alpha(0)&=Y_\alpha\bullet Z+\frac12Y_\alpha\bullet[Z]=Y_\alpha\bullet\left(\alpha X-\alpha^2\frac A2\right)+\frac12Y_\alpha\bullet[\alpha X]=\\&=\alpha Y_\alpha\bullet X-\frac{\alpha^2}{2}Y_\alpha\bullet[X]+\frac{\alpha^2}{2}Y_\alpha\bullet[X]=\alpha Y_\alpha\bullet X,\end{aligned}$$

which is, as a stochastic integral with respect to a continuous local martingale, a local martingale.

29 See: Proposition 2.40, page 141.

6.2.4 Integral representation theorems for Wiener processes
In this subsection we return to the Integral Representation Problem. Let w be a Wiener process and let F be the filtration generated by w. Let L be a local martingale with respect to F. Let us assume that L(0) = 0. Every local martingale has an H¹-localization³⁰. By the integral representation property of Wiener processes³¹, L^{τₙ} = H • w on any finite interval. Hence

$$[L]^{\tau_n}=[L^{\tau_n}]=[H\bullet w]=H^2\bullet[w].$$

As [w](t) = t it is obvious that [L] is continuous. Therefore L is continuous. So L ∈ H²_loc and one can assume that L^{τₙ} ∈ H². This implies that H ∈ L²(w). By Itô's isometry³² H is unique in L²(w). Hence L = H • w for some H ∈ L²_loc(w).

Proposition 6.17 If w is a Wiener process and L is a local martingale with respect to the filtration generated by w then L is continuous and L = L(0) + H • w with some H ∈ L²_loc(w).

Our next statement is an easy consequence of Lévy's characterization theorem.

Proposition 6.18 (Doob) Let M be a continuous local martingale on a stochastic base (Ω, 𝓐, P, F). If the quadratic variation of M has the representation

$$[M](t,\omega)=\int_0^t\alpha^2(s,\omega)\,ds,\qquad(6.14)$$

30 See: Corollary 3.59, page 221.
31 See: Example 5.50, page 347.
32 See: Proposition 2.64, page 156.
where α(t, ω) > 0 and α is an adapted and product measurable process, then there is a Wiener process w on (Ω, 𝓐, P, F) for which

$$M(t)=M(0)+\int_0^t\alpha(s)\,dw(s).$$

Proof. One can explicitly construct the Wiener process w:

$$w\triangleq\frac1\alpha\bullet M.\qquad(6.15)$$

First we prove that the integral exists. [M] ≪ λ, so if α_M is the Doléans measure of M then α_M ≪ λ × P. Therefore the stochastic integrals are defined among adapted product measurable processes³³.

$$\int_0^t\frac1{\alpha^2}\,d[M]=\int_0^t\frac1{\alpha^2}\alpha^2\,ds=t<\infty.$$

Hence 1/α ∈ L²_loc(M). So 1/α is integrable with respect to M. That is, the integral (6.15) exists. As M is continuous, w is a continuous local martingale. By (6.14)

$$[w](t)=\left[\frac1\alpha\bullet M\right](t)=\frac1{\alpha^2}\bullet[M](t)=t.$$
Therefore by Lévy's theorem w is a Wiener process. By (6.14) α is integrable with respect to w, therefore

$$\alpha\bullet w=\alpha\bullet\left(\frac1\alpha\bullet M\right)=\alpha\frac1\alpha\bullet M=1\bullet M=M-M(0).$$

Hence the proposition holds.

Corollary 6.19 Let M be a continuous local martingale on a stochastic base (Ω, 𝓐, P, F). If [M] ≪ λ then there is an extension (Ω̃, 𝓐̃, P̃, F̃) of (Ω, 𝓐, P, F) and a Wiener process w on the extended base space such that

$$M(t)=M(0)+\int_0^t\sqrt{\frac{d[M]}{d\lambda}}\,dw(s).$$
Proof. Let w′ be an arbitrary Wiener process on some stochastic base (Ω′, 𝓐′, P′, F′). Let the new stochastic base be the product of (Ω, 𝓐, P, F) and (Ω′, 𝓐′, P′, F′). Obviously w′ is independent of 𝓐. Let us define α by [M](t) ≜ ∫₀ᵗ α²(s) ds. That is, let

$$\alpha\triangleq\sqrt{\frac{d[M]}{d\lambda}}.$$

The process

$$w(t)\triangleq\int_0^t\frac1\alpha\chi(\alpha>0)\,dM+\int_0^t\chi(\alpha=0)\,dw'$$

is a continuous local martingale. The quadratic co-variation of independent local martingales is zero³⁴, so [M, w′] = 0. Therefore

$$[w](t)=\int_0^t\chi(\alpha>0)\,ds+\int_0^t\chi(\alpha=0)\,ds=t.$$

Hence by Lévy's theorem w is a Wiener process.

$$\alpha\bullet w=\alpha\bullet\left(\frac1\alpha\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet w'\right)=\chi(\alpha>0)\bullet M.$$

On the other hand [χ(α = 0) • M] = χ(α = 0) • [M] = 0, hence χ(α = 0) • M = 0. So

$$\alpha\bullet w=\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet M=1\bullet M=M-M(0).$$

33 See: Proposition 5.20, page 314.
6.2.5 Bessel processes

As an application of Lévy's theorem let us investigate the Bessel processes. Let w ≜ (w₁, w₂, ..., w_d) be a d-dimensional Wiener process. Define the Bessel process

$$R\triangleq\|w\|\triangleq\|w\|_2\triangleq\sqrt{\sum_{k=1}^dw_k^2}.$$

We assume that w starts at x ∈ ℝ^d, that is R(0) = ‖x‖. If it is necessary we shall explicitly indicate the initial value x.

34 See: Example 2.46, page 144.

Evidently the distribution of R
depends on x only through the size of r ≜ ‖x‖: if ‖x‖ = ‖y‖ then Qx = y for some orthonormal transformation Q. It is easy to show that Qw is also a Wiener process and Qw starts at y. Obviously R_x(w) = R_y(Qw).

Proposition 6.20 If d ≥ 2 and r ≥ 0, and if we start w from some point x ∈ ℝ^d with r = ‖x‖, then R ≜ ‖w‖ satisfies the integral equation

$$R(t)=r+\int_0^t\frac{d-1}{2R(s)}\,ds+B(t),\qquad0\le t<\infty,\qquad(6.16)$$

where B is a Wiener process and

$$B\triangleq\sum_kB^{(k)},\qquad B^{(k)}(s)\triangleq\int_0^s\frac{w_k}{R}\,dw_k.\qquad(6.17)$$

Put another way, R ≜ ‖w‖ satisfies the stochastic differential equation

$$dR=\frac{d-1}{2R}\,dt+dB.$$
Proof. First observe that the expression in (6.16) is meaningful: as d ≥ 2, the R(s) in the denominator is almost surely non-zero for every s ≥ 0. As the integral in (6.16) is taken by trajectories, it is also meaningful. On the other hand

$$\int_0^t\left(\frac{w_k}{R}\right)^2d[w_k]=\int_0^t\left(\frac{w_k}{R}\right)^2d\lambda\le\int_0^t1\,d\lambda=t,$$

hence the integrands in (6.17) are in L²(w_k) on every finite interval. Therefore the stochastic integrals B^{(k)} are also meaningful.

1. By the formula for the quadratic co-variation of stochastic integrals

$$\left[B^{(k)},B^{(l)}\right](t)=\int_0^t\frac{w_kw_l}{R^2}\,d[w_k,w_l]=\delta_{kl}\int_0^t\frac{w_kw_l}{R^2}\,d\lambda,$$

therefore

$$[B](t)=\sum_k\left[B^{(k)}\right](t)=\int_0^t\frac{\sum_kw_k^2}{R^2}\,d\lambda=\int_0^t1\,d\lambda=t.$$

The sum of local martingales is again a local martingale. Therefore by the characterization theorem of Lévy, B is a Wiener process.
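The key identity [B](t) = t in step 1 can be checked numerically: discretize B = Σ_k (w_k/R) • w_k along a simulated d-dimensional Wiener path and compute the discrete quadratic variation. A sketch only — dimension, starting point, step count and seed are arbitrary choices (the path is started away from 0 so R > 0 throughout):

```python
import numpy as np

# Discretized B = sum_k (w_k / R) . w_k for a 3-dimensional Wiener path;
# its quadratic variation over [0, 1] should be close to 1.
rng = np.random.default_rng(2)
d, n = 3, 200_000
dt = 1.0 / n
dw = rng.normal(0.0, np.sqrt(dt), size=(n, d))
start = np.ones((1, d))                        # x = (1, 1, 1)
w = np.vstack([start, start + np.cumsum(dw, axis=0)])
r = np.linalg.norm(w[:-1], axis=1)             # left-endpoint values of R
integrand = w[:-1] / r[:, None]                # unit vectors w / R
db = np.sum(integrand * dw, axis=1)            # increments of B
qv = np.sum(db ** 2)
print(qv)  # close to t = 1
```

Note that the integrand has unit norm at every step, which is exactly why the quadratic variation accumulates at unit rate.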
2. The proof of (6.16) uses the integration by parts formula:

$$R^2(t)-R^2(0)=\sum_k\left(2w_k\bullet w_k+[w_k]\right)(t)=2\sum_k\int_0^tw_k\,dw_k+t\cdot d.\qquad(6.18)$$

The multi-dimensional Wiener processes are almost surely never zero³⁵, therefore almost surely R² > 0. Hence one can use Itô's formula with √x:

$$R-r=\frac1{2\sqrt{R^2}}\bullet R^2-\frac18\frac1{\left(R^2\right)^{3/2}}\bullet\left[R^2\right]=\sum_k\frac{w_k}{R}\bullet w_k+\frac d{2R}\bullet\lambda-\frac18\frac4{R^3}\sum_kw_k^2\bullet\lambda=\sum_k\frac{w_k}{R}\bullet w_k+\frac{d-1}{2R}\bullet\lambda.\qquad(6.19)$$

6.3 Change of measure for continuous semimartingales
The class of semimartingales is remarkably stable under a lot of operations. For example, by Itô's formula a C² transform of a semimartingale is again a semimartingale. Later we shall show that convex transforms of semimartingales are also semimartingales. In this section we return to the discussion of the operation of equivalent changes of measure.

6.3.1 Locally absolutely continuous change of measure
If a measure Q is absolutely continuous with respect to P then one can define the Radon–Nikodym derivative dQ/dP. If a filtration F satisfies the usual conditions then the process

$$\Lambda(t)\triangleq\mathbf{E}\left(\frac{d\mathbf{Q}}{d\mathbf{P}}\,\Big|\,\mathcal{F}_t\right)$$

is a martingale and, as

$$\int_F\Lambda(t)\,d\mathbf{P}=\int_F\frac{d\mathbf{Q}}{d\mathbf{P}}\,d\mathbf{P}=\mathbf{Q}(F),\qquad F\in\mathcal{F}_t,$$

35 Let us remark that this is a critical observation, as here we used the assumption that n ≥ 2. If n = 1, then one cannot use Itô's formula, as in this case one can only assume that R² ≥ 0 and the function √x for x ≥ 0 is not a C² function. If we formally still apply the formula, then we get the relation R = sgn(w) • w. By Example 6.14 this expression is a Wiener process. The left-hand side is non-negative, hence the two sides cannot be equal.
Λ(t) is the Radon–Nikodym derivative of Q on (Ω, 𝓕ₜ, P). On the other hand let Q(t) be the restriction of Q and let P(t) be that of P to 𝓕ₜ. If Q(t) is absolutely continuous with respect to P(t) then one can define the derivative

$$\Lambda(t)\triangleq\frac{d\mathbf{Q}(t)}{d\mathbf{P}(t)}.$$

If F ∈ 𝓕ₛ ⊆ 𝓕ₜ then

$$\int_F\Lambda(t)\,d\mathbf{P}=\int_F\frac{d\mathbf{Q}(t)}{d\mathbf{P}(t)}\,d\mathbf{P}=\mathbf{Q}(F)=\int_F\frac{d\mathbf{Q}(s)}{d\mathbf{P}(s)}\,d\mathbf{P}=\int_F\Lambda(s)\,d\mathbf{P},$$

hence Λ is a martingale. Of course Λ is not necessarily uniformly integrable, so it can happen that there is no ξ for which Λ(t) = E(ξ | 𝓕ₜ). To put it another way, it can happen that Q ≪ P on 𝓕ₜ for every t, but Q is not absolutely continuous on the σ-algebra 𝓕_∞ = σ(∪ₜ𝓕ₜ). So the derivative dQ/dP need not exist. Recall the following definition:

Definition 6.21 We say that a measure Q is locally absolutely continuous with respect to a measure P if Q(t) ≪ P(t) for every t, where Q(t) is the restriction of Q and P(t) is the restriction of P to 𝓕ₜ. We shall denote this relation by Q ≪_loc P. If Q ≪_loc P and P ≪_loc Q then we shall say that P and Q are locally equivalent. We shall denote this by P ∼_loc Q.
Definition 6.22 If Q ≪_loc P then the right-regular version of

$$\Lambda(t)\triangleq\frac{d\mathbf{Q}(t)}{d\mathbf{P}(t)}$$

is called the Radon–Nikodym process of P and Q.

6.3.2 Semimartingales and change of measure

We have already proved the following important observations³⁶:

Proposition 6.23 (Invariance of semimartingales) If Q ≪_loc P then every semimartingale under P is a semimartingale under Q.

Proposition 6.24 (Integration and change of measure) Let X be an arbitrary semimartingale and assume that the integral H • X exists under the measure P. If Q ≪_loc P then H • X exists under Q as well. Under the measure Q the two processes, the integral under P and the integral under Q, are indistinguishable.

36 See: Proposition 4.55, page 266; Corollary 4.58, page 271; Proposition 4.59, page 271.
Proposition 6.25 (Transformation of local martingales) Let Q ≪_loc P and let Λ be the Radon–Nikodym process of P and Q. If L is a continuous local martingale under the measure P then under the measure Q:

1. Λ⁻¹ is well defined,
2. the integral Λ⁻¹ • [L, Λ] exists and has finite variation on compact intervals,
3. the expression

$$\widetilde L\triangleq L-\Lambda^{-1}\bullet[L,\Lambda]\qquad(6.20)$$

is a local martingale.
Corollary 6.26 If Q ∼_loc P then Λ > 0 and Λ⁻¹ is a martingale under Q.

Proof. One only needs to prove that Λ⁻¹ is a martingale under Q. If F ∈ 𝓕ₛ and t > s then

$$\int_F\frac1{\Lambda(t)}\,d\mathbf{Q}=\int_F\frac1{\Lambda(t)}\Lambda(t)\,d\mathbf{P}=\mathbf{P}(F)=\int_F\frac1{\Lambda(s)}\Lambda(s)\,d\mathbf{P}=\int_F\frac1{\Lambda(s)}\,d\mathbf{Q}.$$
Corollary 6.27 If Q ≪_loc P and X and Y are semimartingales then [X, Y] calculated under Q is indistinguishable under Q from [X, Y] calculated under P. If L is a local martingale and N is a continuous semimartingale then

$$[L,N]=\left[\widetilde L,N\right],$$

where L̃ is as in (6.20).

Proof. As

$$[X,Y]\triangleq XY-X(0)Y(0)-Y_-\bullet X-X_-\bullet Y,$$

the first statement is obvious from Proposition 6.24. Λ⁻¹ • [L, Λ] ∈ 𝓥 and N is continuous, so

$$\left[\widetilde L,N\right]=\left[L-\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N]-\left[\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N].$$

Definition 6.28 L̃ in (6.20) is called the Girsanov transform of L.
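The Girsanov transform can be checked numerically in its simplest concrete case, the Wiener case treated in Section 6.3.4 below: with Λ(s) = exp(µw(s) − µ²s/2) as the density, the transformed process w̃(t) = w(t) − µt should have standard Gaussian moments under the new measure. A sketch via importance weighting — µ, the horizon, sample size and seed are all arbitrary choices:

```python
import numpy as np

# Under dQ = Lambda(s) dP with Lambda(s) = exp(mu*w(s) - mu^2*s/2),
# the transform w~(t) = w(t) - mu*t should satisfy E_Q[w~(s)] = 0 and
# E_Q[w~(s)^2] = s. Check by reweighting a plain simulation under P.
rng = np.random.default_rng(3)
mu, s, paths = 0.5, 1.0, 200_000
ws = rng.normal(0.0, np.sqrt(s), size=paths)   # w(s) under P
lam = np.exp(mu * ws - 0.5 * mu ** 2 * s)      # density dQ/dP
wt = ws - mu * s                               # Girsanov transform at time s
m1 = np.mean(lam * wt)         # E_Q[w~(s)], should be near 0
m2 = np.mean(lam * wt ** 2)    # E_Q[w~(s)^2], should be near s = 1
print(m1, m2)
```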
6.3.3 Change of measure for continuous semimartingales

If L is a continuous local martingale then from Itô's formula it is trivial that the exponential martingale

$$\mathcal{E}(L)\triangleq\exp\left(L-\frac12[L]\right)$$

is a positive local martingale.

Proposition 6.29 (Logarithm of local martingales) If Λ is a positive and continuous local martingale then

$$L\triangleq\mathrm{Log}(\Lambda)\triangleq\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda$$

is the only continuous local martingale for which Λ = E(L) ≜ exp(L − ½[L]). Moreover

$$\log\Lambda=L-\frac12[L]=\mathrm{Log}(\Lambda)-\frac12[\mathrm{Log}(\Lambda)].$$
Proof. If Λ = E (L1 ) = E (L2 ) , then as Λ > 0 1=
Λ 1 1 = exp L1 − L2 − [L1 ] + [L2 ] , Λ 2 2
that is L1 − L2 = 12 ([L1 ] − [L2 ]). Hence the continuous local martingale L1 − L2 has bounded variation and it is constant. Evidently L1 (0) = L2 (0) , therefore o’s formula L1 = L2 . As Λ > 0 the expression log Λ is meaningful. By Itˆ 1 1 • [Λ] 2 Λ2 1 1 1 • [Λ] = L − [L] . L− 2 Λ2 2
log Λ = log Λ (0) + Λ−1 • Λ −
Therefore
1 Λ = exp (log Λ) = exp L − [L] E (L) . 2 Proposition 6.30 (Logarithmic transformation of local martingales) loc
Assume that P ∼ Q and let Λ (t)
dQ (t) dP
CHANGE OF MEASURE FOR CONTINUOUS SEMIMARTINGALES
381
be continuous. If Λ = E (L), that is L = Log (Λ) then dP (t) = dQ
−1
dQ −1 0 (t) . (t) = (E (L) (t)) = E −L dP
If M is a local martingale under measure P then : = M − [M, L] = M − [M, Log (Λ)] M
(6.21)
is a local martingale under measure Q. loc
Proof. Λ > 0 as P ∼ Q. & ' [M, L] [M, Log (Λ)] M, log Λ (0) + Λ−1 • Λ = & ' = M, Λ−1 • Λ = Λ−1 • [M, Λ] . : M − Λ−1 • [M, Λ] = M − [M, L] . M
1 ( 0 0 0 0 −L, −L = E −L exp −L − 2 1 = exp −L + [L, L] − [L, L] = 2 1 −1 = exp − L − [L, L] = (E (L)) . 2 Proposition 6.31 (Girsanov’s formula) If M and L ∈ L are continuous local martingales and the process
1 Λ E (L) exp L − [L] 2 is a martingale on the finite or infinite interval [0, s] then under the measure Q (A)
Λ (s) dP. A
the process : M − [L, M ] = M − 1 • [Λ, M ] M Λ is a continuous local martingale on [0, s].
(6.22)
382
ˆ FORMULA ITO’s
Proof. L (0) = 0, therefore Λ (0) = 1. Λ is a martingale on [0, s] so Λ (s) dP = 1.
Q (Ω) = Ω
Hence Q is also a probability measure. Λ (t) = E (Λ (s) | Ft ) E
dQ | Ft , dP
that is if F ∈ Ft then
Λ (t) dP =
F
F
dQ dP = Q (F ) , dP
so Λ (t) = dQ (t) /dP (t) on Ft . The other parts of the proposition are obvious from Proposition 6.30. 6.3.4
Girsanov’s formula for Wiener processes loc
Let w be a Wiener process under measure P. If Q P then w is a continuous semimartingale37 under Q. Let M + V be its decomposition under Q. M is a continuous local martingale and M (0) = 0. The quadratic variation of M under Q is38 [M ] (t) = [M + V ] (t) = [w] (t) = t. By L´evy’s theorem39 M is therefore a Wiener process under the measure Q. By (6.20) w 0 w − Λ−1 • [w, Λ] is a continuous local martingale. As Λ−1 • [w, Λ] has finite variation by Fisk’s theorem M = w. 0 If F is the augmented filtration of w then by the integral loc representation property of the Wiener processes Λ is continuous40 . If Q ∼ P then Λ > 0 hence for some L 1 Λ E (L) exp L − [L] . 2
37 See:
Proposition 6.23, page 378. Example 2.26, page 129. 39 See: Theorem 6.13, page 368. 40 See: Proposition 6.17, page 373. 38 See:
CHANGE OF MEASURE FOR CONTINUOUS SEMIMARTINGALES
383
Therefore by Proposition 6.30 M =w 0 = w − [w, L] . If F is the augmented filtration of w then F0 is the trivial σ-algebra, so Λ (0) = 1, hence L (0) = 0. Again by the integral representation theorem there exists an X ∈ L2loc (w) L = L (0) + X • w = X • w,
X ∈ L2loc (w) .
Hence M =w 0 = w − [w, L] = w − [w, X • w] = = w − X • [w] . loc
Hence if P ∼ Q then there is an X ∈ L2loc (w) such that 1 t 2 X (s) ds 2 0 0 1 2 exp X • w − X • [w] (t) E (X • w) 2
Λ (t) exp
t
X (s) dw (s) −
(6.23)
and w 0 (t) w (t) −
t
X (s) ds,
X ∈ L2loc (w)
(6.24)
0
is a Wiener process under Q. On the other hand, let X ∈ L2loc (w, [0, s]). Assume that Λ in (6.23) is a martingale on [0, s]. Define the measure Q by dQ/dP Λ (s). Obviously the process in (6.24) is a Wiener process under Q. Theorem 6.32 (Girsanov formula for Wiener processes) Let w be a Wiener process under measure P and let F be the augmented filtration of w. Girsanov’s transform w 0 of w has the following properties: loc
1. If Q P then the Girsanov transform of w is a Wiener process under measure Q. loc 2. If Q ∼ P then the Girsanov transform of w has the representation (6.24). 3. If X ∈ L2loc (w) and the process Λ in line (6.23) is a martingale over the segment [0, s] then the process w 0 in (6.24) is a Wiener process over [0, s] under the measure Q where dQ/dP Λ (s). Example 6.33 Even on finite intervals Λ E (X • w) is not always a martingale.
384
ˆ FORMULA ITO’s
. / Let u = 1 and let τ inf t : w2 (t) = 1 − t . If t = 0 then almost surely w2 (t, ω) < 1 − t, and if t = 1 then almost surely w2 (t, ω) > 1 − t. So by the intermediate value theorem P (0 < τ < 1) = 1. If X (t)
−2w (t) χ (τ ≥ t) 2
(1 − t)
,
then as τ < 1
1
X 2 d [w] = 4 0
0
τ
w2 (t) (1 − t)
4 dt ≤ 4
0
τ
2
(1 − t)
4 dt
(1 − t)
< ∞.
Hence X ∈ L2loc (w, [0, 1]). By Itˆ o’s formula, if t < 1 then w2 (t) 2
(1 − t) From this I
t
2w2 (s)
=
3 ds + (1 − s)
0
t
0
2w (s)
2 dw (s) + (1 − s)
0
t
1
2 ds.
(1 − s)
τ 1 2 X • [w] = 2 0 0 τ τ τ 2 2 w (τ ) 2w (s) 1 2w2 (s) =− + ds + ds − 2 3 2 4 ds = (1 − τ ) 0 (1 − s) 0 (1 − s) 0 (1 − s) τ 1 1 1 1 2 + 2w (s) + =− 3 − 4 2 ds ≤ 1−τ (1 − s) (1 − s) (1 − s) 0 τ 1 1 + ≤− 2 ds = −1, 1−τ (1 − s) 0 1
Xdw −
1 2
1
τ
X 2 ds = (X • w) −
Therefore Λ (1) = exp (I) ≤ 1/e. Hence E (Λ (1)) = E (exp (I)) ≤
1 < 1 = E (Λ (0)) , e
so Λ is not a martingale. Example 6.34 If w (t) w (t) − µ · t then there is no probability measure Q P on F∞ for which w is a Wiener process under Q.
Let µ = 0 and let A
w 0 (t) w (t) = 0 = lim =µ . t→∞ t→∞ t t lim
CHANGE OF MEASURE FOR CONTINUOUS SEMIMARTINGALES
385
If w 0 is a Wiener process under Q then by the law of large numbers, 1 = Q (A) = P (A) = 0. Therefore Q is not absolutely continuous with respect to P on F∞ . Observe that the martingale 1 Λ (t) = exp µw (t) − µ2 t 2 is not uniformly integrable. Therefore if s = ∞ then Λ is not a martingale on [0, s]. Let us discuss the underlying measure-theoretic problem. Definition 6.35 Let (Ω, F) be a filtered space. We say that the probability spaces (Ω, Ft , Pt ) are consistent, if for any s < t the restriction of Pt to Fs is Ps . The filtered space (Ω, F) is a Kolmogorov type filtered space if whenever (Ω, Ft , Pt ) are consistent probability spaces for 0 ≤ t < ∞, then there is a probability measure P on F∞ σ (Ft : t ≥ 0) such that every Pt is a restriction of P to Ft . Example 6.36 The space C ([0, ∞)) with its natural filtration is a Kolmogorov-type filtered space.
One can identify the σ-algebra Ft with the Borel sets of C ([0, t]). Let C ∪t≥0 Ft . If we have a consistent stream of probability spaces over F, then one can define a set function P (C) Pt (C) on C. C ([0, t]) is a complete, separable metric space so P is compact regular on C, hence P is σ-additive on C. By Carath´eodory’s theorem one can extend P to σ (C) = B (C [0, ∞)) = F∞ . Observe that in Example 6.34 Λ is a martingale so the measure spaces (Ω, Ft , Qt ) are consistent. If we use the canonical representation, that is Ω = C ([0, ∞)) , then there is a probability measure Q on Ω such that Q (t) is a restriction of Q for every t. Obviously w 0 is a Wiener process under Q with respect to the natural filtration F Ω . Recall that by the previous example Q cannot be absolutely continuous with respect to P. The P-measure of set A is zero so A and all of its subsets are in the augmented filtration F P . As Q (A) = 1 obviously w 0 cannot be a Wiener process under F P . If the measures P and Q are not equivalent then the augmented filtrations can be different! Hence with the change of the measure one should also change the filtration. Of course one should augment the natural filtration F Ω because F Ω does not satisfy the usual conditions. There is a simple method to solve this problem. Observe that on every FtΩ the two measures P and Q are equivalent. It is very natural to assume that we augment
386
ˆ FORMULA ITO’s
Ω FtΩ not with every measure-zero set of F∞ but only with the measure-zero sets Ω of the σ-algebras Ft for t ≥ 0. It is not difficult to see that this filtration is right-continuous and most of the results of the stochastic analysis remain valid with this augmented filtration.
There is nothing special in the problem above. Let us show a similar elementary example. Example 6.37 The filtration generated by the dyadic rational numbers.
Let $(\Omega,\mathcal{A},P)$ be the interval $[0,1]$ with Lebesgue's measure $\lambda$ as the probability $P$. We change the filtration only at the points $t=0,1,2,\ldots$: if $n\le t<n+1$ then $\mathcal{F}_t\triangleq\mathcal{F}_n$. Obviously $\mathcal{F}$ is right-continuous. Let $\mathcal{F}_n$ be the σ-algebra generated by the finitely many intervals $[k2^{-n},(k+1)2^{-n}]$, $k=0,1,\ldots,2^n-1$. Observe that, as the intervals are closed, $\mathcal{F}_n$ contains all the dyadic rational points $0<k2^{-n}<1$. It is also worth noting that $\{0\},\{1\}\notin\mathcal{F}_t$. It is also clear that the sets of dyadic rational points $0<k2^{-n}<1$ form the only measure-zero subsets of $\mathcal{F}_n$. This implies that if $P_t$ is the restriction of $P$ to $\mathcal{F}_t$, then $(\Omega,\mathcal{F}_t,P_t)$ is complete. $\mathcal{F}_\infty\triangleq\sigma(\mathcal{F}_t,t\ge0)$ is the σ-algebra generated by the intervals with dyadic rational endpoints, so $\mathcal{F}_\infty$ is the Borel σ-algebra of $[0,1]$. $\mathcal{B}([0,1])$ is not complete under Lebesgue's measure. If we complete it, the new measure space is the set of Lebesgue measurable subsets of $[0,1]$. In the completed space the number of measure-zero sets is $2^{\mathfrak{c}}$, where $\mathfrak{c}$ denotes the cardinality of the continuum. If we augment $\mathcal{F}_\infty$ only with the measure-zero sets of the σ-algebras $\mathcal{F}_t$ then $\mathcal{F}_\infty$ does not change: the cardinality of $\mathcal{B}([0,1])$ is just $\mathfrak{c}$! Let $Q$ be Dirac's measure $\delta_0$. If $t<\infty$, then the set $\{0\}$ is not in $\mathcal{F}_t$, so if $A\in\mathcal{F}_t$ and $P_t(A)=0$, then $Q(A)=0$; that is, $Q_t$ is absolutely continuous with respect to $P_t$ for every $t<\infty$, that is $Q\overset{\mathrm{loc}}{\ll}P$. Obviously $Q\ll P$ does not hold.

6.3.5 Kazamaki–Novikov criteria
From Itô's formula it is clear that if $L$ is a continuous local martingale then $\mathcal{E}(L)$ is also a local martingale. It is very natural to ask when $\mathcal{E}(L)$ will be a true martingale on some $[0,T]$. As $\mathcal{E}(L)\ge0$, from Fatou's lemma it is clear that it is a supermartingale: if $t>s$ and $(\tau_n)$ is a localizing sequence, then

$$E\left(\mathcal{E}(L)(t)\mid\mathcal{F}_s\right)=E\left(\lim_{n\to\infty}\mathcal{E}\left(L^{\tau_n}\right)(t)\mid\mathcal{F}_s\right)\le\liminf_{n\to\infty}\mathcal{E}\left(L^{\tau_n}\right)(s)=\mathcal{E}(L)(s).$$

Hence, taking expected value on both sides,

$$E\left(\mathcal{E}(L)(t)\right)\le E\left(\mathcal{E}(L)(s)\right),\qquad t\ge s.$$
If $L(0)=0$ then $\mathcal{E}(L)(0)=1$, and in this case $\mathcal{E}(L)$ is a martingale on some $[0,t]$ if and only if $E(\mathcal{E}(L)(t))=1$. Let us first mention a simple but very frequently used condition:

Proposition 6.38 If $X$ is constant and $w$ is a Wiener process then $\Lambda\triangleq\mathcal{E}(X\bullet w)$ is a martingale on any finite interval $[0,t]$. A bit more generally: if $X$ and $w$ are independent then $\Lambda\triangleq\mathcal{E}(X\bullet w)$ is a martingale on any finite interval $[0,t]$.

Proof. The first part of the proposition trivially follows from the formula for the expected value of the lognormal distribution. Under the second condition one can assume that

$$(\Omega,\mathcal{A},P)=(\Omega_1,\mathcal{A}_1,P_1)\times(\Omega_2,\mathcal{A}_2,P_2).$$

$X$ depends only on $\omega_2$, hence for every fixed $\omega_2$ the inner integrand below is a martingale on $\Omega_1$, so

$$E(\Lambda(t))=\int_{\Omega_1\times\Omega_2}\Lambda(t)\,d\left(P_1\times P_2\right)=\int_{\Omega_2}\int_{\Omega_1}\exp\left(\int_0^tX(\omega_2)\,dw(\omega_1)-\frac12\int_0^tX^2(\omega_2)\,d\lambda\right)dP_1\,dP_2=\int_{\Omega_2}1\,dP_2=1.$$
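As a quick numerical sanity check of the lognormal-expectation argument in the first part of the proposition, one can estimate $E(\exp(\sigma w(t)-\sigma^2t/2))$ by Monte Carlo for a constant integrand $X\equiv\sigma$. This is only an illustrative sketch; the helper name `exp_martingale_mean` and all parameter values are our own, not from the text.

```python
import math
import random

random.seed(7)

def exp_martingale_mean(sigma, t, n_paths=100_000):
    """Monte Carlo estimate of E[exp(sigma*w(t) - sigma**2*t/2)] for a
    Wiener process w; for any constant sigma the exact value is 1."""
    total = 0.0
    for _ in range(n_paths):
        w_t = random.gauss(0.0, math.sqrt(t))   # w(t) ~ N(0, t)
        total += math.exp(sigma * w_t - 0.5 * sigma * sigma * t)
    return total / n_paths

print(exp_martingale_mean(1.0, 1.0))  # close to 1
```

Since $\sigma w(t)-\sigma^2t/2$ is exactly Gaussian here, the estimator's mean is exactly $1$ and only Monte Carlo noise remains.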
The next condition is more general:

Proposition 6.39 (Kazamaki's criterion) If for a continuous local martingale $L\in\mathcal{L}$

$$\sup_{\tau\le T}E\left(\exp\left(\frac12L(\tau)\right)\right)<\infty,\qquad(6.25)$$

where the supremum is taken over all stopping times $\tau$ for which $\tau\le T$, then $\mathcal{E}(L)$ is a uniformly integrable martingale on $[0,T]$. In the case $T=\infty$ it is also sufficient to assume that the supremum in (6.25) is finite over just the bounded stopping times.
Proof. Observe that if $\tau$ is an arbitrary stopping time and (6.25) holds with bound $k$ for bounded stopping times, then by Fatou's lemma

$$E\left(\exp\left(\frac12L(\tau)\right)\right)=E\left(\lim_{n\to\infty}\exp\left(\frac12L(\tau\wedge n)\right)\chi(\tau<\infty)\right)\le\liminf_{n\to\infty}E\left(\exp\left(\frac12L(\tau\wedge n)\right)\right)\le k.$$
1. Let $p>1$ and assume that

$$\sup_{\tau\le T}E\exp\left(\frac{\sqrt p}{2\left(\sqrt p-1\right)}L(\tau)\right)\triangleq k<\infty,\qquad(6.26)$$

where the supremum is taken over all bounded stopping times $\tau\le T$. We show that $\mathcal{E}(L)(\tau)$ is bounded in $L^q(\Omega)$, where $1/p+1/q=1$. The $L^q(\Omega)$-bounded sets are uniformly integrable, hence if (6.26) holds then $\mathcal{E}(L)$ is a uniformly integrable martingale. Let

$$r\triangleq\frac{\sqrt p+1}{\sqrt p-1}.$$

Let $s$ be the conjugate exponent of $r$. By simple calculation

$$s=\frac{\sqrt p+1}2.$$

Obviously

$$\mathcal{E}(L)^q=\exp\left(\sqrt{\frac qr}L-\frac q2[L]\right)\exp\left(\left(q-\sqrt{\frac qr}\right)L\right).$$

By Hölder's inequality

$$E\left(\mathcal{E}(L)^q(\tau)\right)\le E\left(\mathcal{E}\left(\sqrt{rq}L\right)(\tau)\right)^{1/r}E\left(\exp\left(s\left(q-\sqrt{\frac qr}\right)L(\tau)\right)\right)^{1/s}.$$

$\mathcal{E}\left(\sqrt{rq}L\right)$ is a non-negative local martingale, so it is a supermartingale. Hence by the Optional Sampling Theorem (Proposition 1.88, page 54) the first factor of the product cannot be larger than $1$. Since

$$s\left(q-\sqrt{\frac qr}\right)=\frac{\sqrt p}{2\left(\sqrt p-1\right)},$$

we get

$$E\left(\mathcal{E}(L)^q(\tau)\right)\le E\left(\exp\left(\frac{\sqrt p}{2\left(\sqrt p-1\right)}L(\tau)\right)\right)^{1/s}\le k^{1/s}.$$
2. As

$$\exp(x)\le\exp\left(x^+\right)\le\exp(x)+1,$$

one has

$$E\exp\left(\frac12L(\tau)\right)\le E\exp\left(\frac12L^+(\tau)\right)\le E\exp\left(\frac12L(\tau)\right)+1,$$

from which it is obvious that

$$\sup_{\tau\le T}E\exp\left(\frac12L^+(\tau)\right)<\infty$$

is equivalent to (6.25).

3. Let $0<a<1$ and assume that

$$1<\frac{\sqrt p}{\sqrt p-1}<\frac1a.$$

Applying the part of the theorem already proved to $aL$,

$$E\exp\left(\frac{a\sqrt p}{2\left(\sqrt p-1\right)}L^+(\tau)\right)\le E\exp\left(\frac12L^+(\tau)\right)\le k+1<\infty,$$

so $\mathcal{E}(aL)$ is a uniformly integrable martingale. Also

$$\mathcal{E}(aL)=\exp\left(aL-\frac12[aL]\right)=\exp\left(aL-\frac{a^2}2[L]\right)=\exp\left(a(1-a)L\right)\exp\left(a^2L-\frac{a^2}2[L]\right)=\exp\left(a(1-a)L\right)\mathcal{E}(L)^{a^2}.$$

Observe that, as $\mathcal{E}(aL)$ and $\mathcal{E}(L)$ are non-negative supermartingales, one can extend these relations continuously to $T=\infty$. As $L(0)=0$ and as $\mathcal{E}(aL)$ is a
uniformly integrable martingale, $E(\mathcal{E}(aL)(T))=1$. By Hölder's inequality

$$1=E\left(\mathcal{E}(aL)(T)\right)=E\left(\mathcal{E}(L)(T)^{a^2}\exp\left(a(1-a)L(T)\right)\right)\le$$
$$\le\left(E\left(\mathcal{E}(L)(T)\right)\right)^{a^2}\left(E\exp\left(\frac{a(1-a)}{1-a^2}L(T)\right)\right)^{1-a^2}=\left(E\left(\mathcal{E}(L)(T)\right)\right)^{a^2}\left(E\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2}.$$

From this $L(T)$ is not everywhere $-\infty$. The function $x^y$ is continuous on the set $x>0$, hence by the Dominated Convergence Theorem

$$\lim_{a\nearrow1}\left(E\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2}=\left(E\exp\left(\frac12L(T)\right)\right)^0=1.$$

Therefore $1\le E(\mathcal{E}(L)(T))$, from which, by the supermartingale property of $\mathcal{E}(L)$, the proposition is obvious.
Corollary 6.40 If $L$ is a continuous local martingale and $\exp\left(\frac12L\right)$ is a uniformly integrable submartingale then $\mathcal{E}(L)$ is a uniformly integrable martingale.

Proof. By the uniform integrability one can take $\exp\left(\frac12L\right)$ on the closed interval $[0,T]$. By the Optional Sampling Theorem for integrable submartingales (Proposition 1.88, page 54), if $\tau\le T$ then

$$\exp\left(\frac12L(\tau)\right)\le E\left(\exp\left(\frac12L(T)\right)\mid\mathcal{F}_\tau\right),$$

from which (6.25) holds.

Corollary 6.41 If $L$ is a uniformly integrable continuous martingale and $E\exp\left(\frac12L(T)\right)<\infty$ then $\mathcal{E}(L)$ is a uniformly integrable martingale.
Proof. As $L$ is uniformly integrable, $L(T)$ is meaningful. A convex function of a martingale is a submartingale:

$$\exp\left(\frac12L(t)\right)\le E\left(\exp\left(\frac12L(T)\right)\mid\mathcal{F}_t\right).$$

Taking the expected value on both sides, it is clear that $\exp\left(\frac12L\right)$ is an integrable submartingale. By the Optional Sampling Theorem for submartingales $\exp\left(\frac12L(\tau)\right)$ is integrable for every $\tau$, and (6.25) holds.

Corollary 6.42 (Novikov's criterion) If $L\in\mathcal{L}$ is a continuous local martingale on some finite or infinite interval $[0,T]$,

$$E\exp\left(\frac12[L](T)\right)<\infty,\qquad(6.27)$$

and $\Lambda\triangleq\mathcal{E}(L)$, then $E(\Lambda(T))=E(\Lambda(0))=1$ and $\Lambda$ is a uniformly integrable martingale on $[0,T]$.

Proof. $\mathcal{E}(L)$ is a non-negative local martingale, hence it is a supermartingale. By the Optional Sampling Theorem (Proposition 1.88, page 54), for any bounded stopping time $\tau$

$$E\left(\mathcal{E}(L)(\tau)\right)\le E\left(\mathcal{E}(L)(0)\right)=1.$$

By the Cauchy–Schwarz inequality

$$E\exp\left(\frac12L(\tau)\right)=E\left(\exp\left(\frac12L(\tau)-\frac14[L](\tau)\right)\exp\left(\frac14[L](\tau)\right)\right)\le$$
$$\le\sqrt{E\left(\mathcal{E}(L)(\tau)\right)}\sqrt{E\exp\left(\frac12[L](\tau)\right)}\le\sqrt{E\exp\left(\frac12[L](\tau)\right)}\le\sqrt{E\exp\left(\frac12[L](T)\right)}<\infty.$$

Hence Kazamaki's criterion holds.
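A small numerical illustration of Novikov's criterion, for a deterministic integrand (our own choice, $X(t)=\sin t$, with an Euler discretization): the quadratic variation $[L](T)=\int_0^T\sin^2t\,dt$ is a constant, so the Novikov quantity is trivially finite, and the mean of $\mathcal{E}(L)(T)$ should be $1$. All names and parameters below are illustrative, not from the text.

```python
import math
import random

random.seed(1)

def novikov_check(T=1.0, n_steps=100, n_paths=10_000):
    """For L = X . w with deterministic X(t) = sin(t): [L](T) is the
    constant int_0^T sin(t)^2 dt, so E[exp([L](T)/2)] is finite (Novikov
    holds) and the Monte Carlo mean of E(L)(T) should be close to 1."""
    dt = T / n_steps
    qv = sum(math.sin(i * dt) ** 2 * dt for i in range(n_steps))  # ~ [L](T)
    novikov = math.exp(0.5 * qv)          # deterministic, hence finite
    total = 0.0
    for _ in range(n_paths):
        stoch = sum(math.sin(i * dt) * random.gauss(0.0, math.sqrt(dt))
                    for i in range(n_steps))
        total += math.exp(stoch - 0.5 * qv)
    return novikov, total / n_paths

nov, mean = novikov_check()
print(nov, mean)  # finite Novikov bound; mean close to 1
```

Because the discrete stochastic sum is exactly Gaussian with variance equal to the discrete quadratic variation, the discretized mean is exactly $1$; only sampling noise remains.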
Corollary 6.43 If $L\triangleq X\bullet w$, $T$ is finite and for some $\delta>0$

$$\sup_{t\le T}E\exp\left(\delta X^2(t)\right)<\infty\qquad(6.28)$$

then

$$\Lambda(t)\triangleq\exp\left(\int_0^tX\,dw-\frac12\int_0^tX^2\,d\lambda\right)$$

is a martingale on $[0,T]$.

Proof. Let $L\triangleq X\bullet w$. By Jensen's inequality

$$\exp\left(\frac12[L](T)\right)=\exp\left(\int_0^T\frac{TX^2(t)}2\,\frac{dt}T\right)\le\frac1T\int_0^T\exp\left(\frac{TX^2(t)}2\right)dt.$$

If $T/2\le\delta$ then we can continue the estimation:

$$E\exp\left(\frac12[L](T)\right)\le\frac1T\int_0^TE\exp\left(\frac{TX^2(t)}2\right)dt\le\sup_{t\le T}E\exp\left(\delta X^2(t)\right)<\infty$$

by condition (6.28), so Novikov's criterion holds; hence $E(\Lambda(T))=1$. In the general case let $(t_k)_{k=0}^n$ be a partition of $[0,T]$ and assume that the size of the intervals $[t_{k-1},t_k]$ is smaller than $2\delta$. If

$$\Lambda_k\triangleq\exp\left(\int_{t_k}^{t_{k+1}}X(s)\,dw(s)-\frac12\int_{t_k}^{t_{k+1}}X^2(s)\,ds\right)$$

then $\Lambda=\prod_k\Lambda_k$, $E(\Lambda_k)=1$ and $E(\Lambda_k\mid\mathcal{F}_{t_k})=1$ a.s. Hence

$$E(\Lambda(T))=E\left(E\left(\Lambda(T)\mid\mathcal{F}_{t_{n-1}}\right)\right)=E\left(E\left(\Lambda_{n-1}\Lambda(t_{n-1})\mid\mathcal{F}_{t_{n-1}}\right)\right)=$$
$$=E\left(\Lambda(t_{n-1})E\left(\Lambda_{n-1}\mid\mathcal{F}_{t_{n-1}}\right)\right)=E\left(\Lambda(t_{n-1})\right)=\cdots=E\left(\Lambda(t_1)\right)=1.$$
Corollary 6.44 If $X$ is a Gaussian process, $T$ is finite and

$$\sup_{t\le T}D\left(X(t)\right)<\infty,$$

then $\Lambda=\mathcal{E}(X\bullet w)$ is a martingale on $[0,T]$.

If $\mu_t$ and $\sigma_t$ denote the expected value and the standard deviation of $X(t)$ then

$$E\exp\left(\delta X^2(t)\right)=\frac1{\sigma_t\sqrt{2\pi}}\int_{\mathbb{R}}\exp\left(\delta x^2\right)\exp\left(-\frac12\left(\frac{x-\mu_t}{\sigma_t}\right)^2\right)dx=\frac{\exp\left(\delta\mu_t^2/\left(1-2\delta\sigma_t^2\right)\right)}{\sqrt{1-2\delta\sigma_t^2}}.$$

If $\delta<1/\left(2\sup_{t\le T}D^2\left(X(t)\right)\right)$ then $E\exp\left(\delta X^2(t)\right)$ is bounded.
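The closed form for $E\exp(\delta X^2)$ with $X\sim N(\mu,\sigma^2)$ can be checked by Monte Carlo; this is only a sketch with our own function names and illustrative parameter values.

```python
import math
import random

random.seed(3)

def mc_moment(delta, mu, sigma, n=100_000):
    """Monte Carlo estimate of E[exp(delta * X**2)], X ~ N(mu, sigma^2)."""
    return sum(math.exp(delta * random.gauss(mu, sigma) ** 2)
               for _ in range(n)) / n

def closed_form(delta, mu, sigma):
    """exp(delta*mu^2/(1 - 2*delta*sigma^2)) / sqrt(1 - 2*delta*sigma^2),
    valid only for delta < 1/(2*sigma^2)."""
    d = 1.0 - 2.0 * delta * sigma ** 2
    return math.exp(delta * mu ** 2 / d) / math.sqrt(d)

print(mc_moment(0.1, 0.5, 1.0), closed_form(0.1, 0.5, 1.0))
```

The two printed values should agree up to Monte Carlo noise; for $\delta\ge1/(2\sigma^2)$ the expectation is infinite and the closed form breaks down, matching the condition in the corollary.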
Example 6.45 Novikov's criterion is an elegant but not a too strong condition.

Let $\tau$ be a stopping time. If $L$ is a continuous local martingale, then $L^\tau$ is also a continuous local martingale, and

$$\mathcal{E}\left(L^\tau\right)=\exp\left(L^\tau-\frac12\left[L^\tau\right]\right)=\mathcal{E}(L)^\tau,$$

so one could write any stopping time $\tau\le T$ in (6.27) instead of $T$. If for a stopping time $\tau$

$$E\exp\left(\frac12\tau\right)<\infty\qquad(6.29)$$

and $w$ is a Wiener process, then Novikov's condition holds, as for Wiener processes $[w](\tau)=\tau$. Hence if (6.29) holds then

$$E\left(\exp\left(w(\tau)-\frac12\tau\right)\right)=1.$$

Perhaps the simplest stopping times for Wiener processes are first passage times. Let $\tau\triangleq\inf\{t:w(t)=1\}$. Observe that condition (6.29) does not hold for $\tau$ (see (1.58), page 83). It is well-known (see Example 1.118, page 82) that the Laplace transform of $\tau$ is

$$l(s)=\exp\left(-\sqrt{2s}\right).$$

Using this,

$$E\left(\exp\left(w(\tau)-\frac12\tau\right)\right)=E\left(\exp\left(1-\frac12\tau\right)\right)=e\cdot E\left(\exp\left(-\frac12\tau\right)\right)=e\cdot l\left(\frac12\right)=1.$$

Hence if $L=w^\tau$ then $\mathcal{E}(L)$ is a martingale on $[0,\infty]$. Perhaps the main weakness of Novikov's condition is that it depends only on $[L]$, so it holds for $L$ if and only if it holds for $-L$ as well. In our case $L\triangleq w^\tau$ and as

$$\mathcal{E}(L)(t)=\exp\left(w^\tau(t)-\frac12(\tau\wedge t)\right)\le\exp\left(w^\tau(t)\right)\le e,$$

$\mathcal{E}(L)$ is bounded, hence it is a uniformly integrable martingale. However $\mathcal{E}(-L)$ is not uniformly integrable, as

$$E\left(\mathcal{E}(-L)(\infty)\right)=E\left(\exp\left(-L(\infty)-\frac12\tau\right)\right)=E\left(\exp\left(-1-\frac12\tau\right)\right)=e^{-1}E\left(\exp\left(-\frac12\tau\right)\right)=e^{-2}<1.$$
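The two expectations above can be checked numerically using the well-known fact that the first passage time $\tau=\inf\{t:w(t)=1\}$ has the same law as $1/Z^2$ for a standard normal $Z$ (the Lévy distribution, consistent with $l(s)=\exp(-\sqrt{2s})$). The script below is an illustrative sketch with our own variable names.

```python
import math
import random

random.seed(5)

# tau = inf{t : w(t) = 1} has the same law as 1/Z**2, Z ~ N(0, 1).
n = 200_000
acc = 0.0
for _ in range(n):
    z = random.gauss(0.0, 1.0)
    acc += math.exp(-0.5 / (z * z))     # exp(-tau/2) with tau = 1/Z^2
laplace_half = acc / n                  # estimates l(1/2) = exp(-1)
print(math.e * laplace_half)            # E[exp(w(tau) - tau/2)], should be 1
print(laplace_half / math.e)            # E[exp(-w(tau) - tau/2)], should be exp(-2)
```

This makes the asymmetry concrete: $\mathcal{E}(w^\tau)$ integrates to $1$ at infinity while $\mathcal{E}(-w^\tau)$ integrates only to $e^{-2}$.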
6.4 Itô's Formula for Non-Continuous Semimartingales

In this section we discuss the generalization of Itô's formula to non-continuous semimartingales.

Theorem 6.46 (Itô's formula) If the coordinates of the vector $X\triangleq(X_1,X_2,\ldots,X_d)$ are semimartingales and $f\in C^2\left(\mathbb{R}^d\right)$ then

$$f(X(t))-f(X(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(X_-)\,dX_k+\frac12\sum_{i,j}\int_0^t\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\,d\,[X_i,X_j]^c+$$
$$+\sum_{0<s\le t}\left(f(X(s))-f(X(s-))-\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s)\right).\qquad(6.30)$$

If $X(t)\in U$ for all $t$, where $U$ is an open subset of $\mathbb{R}^d$, then it is enough to assume that $f\in C^2(U)$.
Recall (Definition 4.23, page 235) that if $X$ is a semimartingale then by definition $X^c$ is the continuous part of the local martingale part of $X$. If $X$ and $Y$ are arbitrary semimartingales then $[X^c,Y^c]=[X,Y]^c$. So (6.30) can be written as

$$f(X(t))-f(X(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(X_-)\,dX_k+\frac12\sum_{i,j}\int_0^t\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\,d\left[X_i^c,X_j^c\right]+$$
$$+\sum_{0<s\le t}\left(f(X(s))-f(X(s-))-\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s)\right).$$

The jumps of the stochastic integral part of the formula are

$$\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s).$$

As $X$ does not necessarily have finite variation, the sums

$$\sum_{0<s\le t}\left(f(X(s))-f(X(s-))\right)\qquad\text{and}\qquad\sum_{0<s\le t}\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s)$$

are generally not finite, so one cannot write Itô's formula as

$$f(X(t))-f(X(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(X_-)\,dX_k^c+\frac12\sum_{i,j}\int_0^t\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\,d\,[X_i,X_j]^c+\sum_{0<s\le t}\left(f(X(s))-f(X(s-))\right).$$

On the other hand, as $X_-$ is locally bounded and $f''$ is continuous, after localization there is a constant $K$ such that

$$\sum_{0<s\le t}\sum_{i,j}\left|\frac{\partial^2f}{\partial x_i\partial x_j}(X(s-))\right|\left|\Delta X_i(s)\Delta X_j(s)\right|\le K\sum_{0<s\le t}\sum_{i,j}\left|\Delta X_i(s)\Delta X_j(s)\right|\le Kd\sum_{i=1}^d[X_i](t)<\infty,$$

using $\left|\Delta X_i\Delta X_j\right|\le\frac12\left(\left(\Delta X_i\right)^2+\left(\Delta X_j\right)^2\right)$. Hence the series

$$\sum_{0<s\le t}\left(f(X(s))-f(X(s-))-\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s)\right)$$

and

$$\sum_{0<s\le t}\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(X(s-))\,\Delta X_i(s)\Delta X_j(s)$$

are absolutely convergent, and one can write the formula in the following way:

Theorem 6.47 If the coordinates of the vector $X\triangleq(X_1,X_2,\ldots,X_d)$ are semimartingales and $f\in C^2\left(\mathbb{R}^d\right)$ then

$$f(X(t))-f(X(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(X_-)\,dX_k+\frac12\sum_{i,j}\int_0^t\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\,d\,[X_i,X_j]+\sum_{0<s\le t}R(s),\qquad(6.31)$$

where $R(s)$ is the `third-order remainder of the approximation of the jumps':

$$R(s)\triangleq f(X(s))-f(X(s-))-\sum_{k=1}^d\frac{\partial f}{\partial x_k}(X(s-))\,\Delta X_k(s)-\frac12\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(X(s-))\,\Delta X_i(s)\Delta X_j(s).$$

If $X(t)\in U$ for all $t$, where $U$ is an open subset of $\mathbb{R}^d$, then it is enough to assume that $f\in C^2(U)$. As in the continuous case, one can reformulate the theorem for holomorphic functions.
Corollary 6.48 (Itô's formula for holomorphic functions) If $f$ is a holomorphic function and $Z$ is a complex semimartingale then

$$f(Z(t))=f(Z(0))+\int_0^tf'(Z_-)\,dZ+\frac12\int_0^tf''(Z_-)\,d\,[Z]^c+\sum_{0<s\le t}\left(f(Z(s))-f(Z(s-))-f'(Z(s-))\,\Delta Z(s)\right).$$

Proof. One has to calculate only the jump part; the calculation of the other terms is the same as in the continuous case. If $Z\triangleq X+iY$ and $f\triangleq u+iv$ then the jump part of the real part is the sum over the jumps of the expressions

$$u(X(s),Y(s))-u(X(s-),Y(s-))-\frac{\partial u}{\partial x}(X(s-),Y(s-))\,\Delta X(s)-\frac{\partial u}{\partial y}(X(s-),Y(s-))\,\Delta Y(s),$$

and similarly the jump part of the imaginary part is the sum of the expressions

$$v(X(s),Y(s))-v(X(s-),Y(s-))-\frac{\partial v}{\partial x}(X(s-),Y(s-))\,\Delta X(s)-\frac{\partial v}{\partial y}(X(s-),Y(s-))\,\Delta Y(s).$$

Adding them up and using the Cauchy–Riemann equations in the form

$$f'=\frac{\partial u}{\partial x}+i\frac{\partial v}{\partial x}=\frac{\partial v}{\partial y}-i\frac{\partial u}{\partial y},$$

one can easily get that the jump part is the sum of the expressions

$$f(Z(s))-f(Z(s-))-f'(Z(s-))\,\Delta Z(s).$$
Example 6.49 Itô's formula and the integration by parts formula.

Itô's formula is a generalization, and also a consequence, of the integration by parts formula. If $f(x,y)\triangleq xy$ and $X$ and $Y$ are semimartingales then by Itô's formula

$$XY-X(0)Y(0)=X_-\bullet Y+Y_-\bullet X+\frac12[X,Y]^c+\frac12[Y,X]^c+\sum\left(XY-X_-Y_--Y_-\Delta X-X_-\Delta Y\right).$$

The expression after the sum is $\Delta X\Delta Y$, so

$$XY-X(0)Y(0)=X_-\bullet Y+Y_-\bullet X+[X,Y]^c+\sum\Delta X\Delta Y,$$

that is, Itô's formula reduces to the integration by parts formula

$$XY-X(0)Y(0)=X_-\bullet Y+Y_-\bullet X+[X,Y].$$
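For discrete-time (pure-jump) paths the integration by parts formula is an exact summation-by-parts identity, which can be verified directly; the script below is our own illustrative check, not part of the book.

```python
import random

random.seed(11)

# Discrete-time (pure-jump) paths X_k, Y_k: the formula
#   XY - X(0)Y(0) = X_- . Y + Y_- . X + [X, Y]
# becomes an exact summation-by-parts identity.
n = 50
X = [random.uniform(-1.0, 1.0) for _ in range(n + 1)]
Y = [random.uniform(-1.0, 1.0) for _ in range(n + 1)]
dX = [X[k] - X[k - 1] for k in range(1, n + 1)]
dY = [Y[k] - Y[k - 1] for k in range(1, n + 1)]

lhs = X[n] * Y[n] - X[0] * Y[0]
rhs = (sum(X[k] * dY[k] for k in range(n))      # X_- . Y
       + sum(Y[k] * dX[k] for k in range(n))    # Y_- . X
       + sum(dX[k] * dY[k] for k in range(n)))  # [X, Y] = sum dX dY
print(abs(lhs - rhs))  # rounding-error size: the identity is exact
```

Each step uses $X_{k-1}\Delta Y_k+Y_{k-1}\Delta X_k+\Delta X_k\Delta Y_k=X_kY_k-X_{k-1}Y_{k-1}$, so the sum telescopes.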
6.4.1 Itô's formula for processes with finite variation
1. Let $f$ be a continuously differentiable function. First assume that $X$ is a continuous process with finite variation. In this case Itô's formula has the following simple form:

$$f(X(t))-f(X(0))=\int_0^tf'(X)\,dX.\qquad(6.32)$$

In this special case one can prove the formula in the following way: let $\left(t_k^{(n)}\right)$ be an arbitrary infinitesimal sequence of partitions of the interval $[0,t]$. By the mean value theorem and by the intermediate value theorem

$$f(X(t))-f(X(0))=\sum_k\left(f\left(X\left(t_k^{(n)}\right)\right)-f\left(X\left(t_{k-1}^{(n)}\right)\right)\right)=\sum_kf'\left(\xi_k^{(n)}\right)\left(X\left(t_k^{(n)}\right)-X\left(t_{k-1}^{(n)}\right)\right)=$$
$$=\sum_kf'\left(X\left(\tau_k^{(n)}\right)\right)\left(X\left(t_k^{(n)}\right)-X\left(t_{k-1}^{(n)}\right)\right),$$

where $\tau_k^{(n)}\in\left[t_{k-1}^{(n)},t_k^{(n)}\right]$. As $f'$ and $X$ are continuous, $f'(X(s))$ is continuous. Hence $f'(X(s))$ is Stieltjes integrable with respect to $X$. Therefore if $n\to\infty$ then the right-hand side converges to $\int_0^tf'(X(s))\,dX(s)$.

2. Now assume that $X$ is a right-regular simple jump process with finite variation. Recall that in this case $X(t)-X(0)=\int_0^t\Delta X\,d\mu$, where $\mu$ is the counting measure of the jumps. Obviously $[X^c]=0$,

$$\int_0^tf'(X_-)\,dX=\sum_{0<s\le t}f'(X(s-))\,\Delta X(s),$$

and the sum is obviously finite. In this case Itô's formula is

$$f(X(t))-f(X(0))=\sum_{0<s\le t}\left(f(X(s))-f(X(s-))\right).\qquad(6.33)$$

One can prove this identity directly: if all the jumps of $X$ are bigger than a given $\varepsilon>0$ then $X$ has just finitely many jumps on the interval $[0,t]$, and between the jumps $X$ is constant. In this case (6.33) is a simple telescoping sum, therefore (6.33) holds. If $X^{(\varepsilon)}$ denotes the process built from the jumps with $|\Delta X|>\varepsilon$, then (6.33) is valid for $X^{(\varepsilon)}$. As

$$\lim_{\varepsilon\searrow0}\sum_{0<s\le t}|\Delta X(s)|\,\chi\left(|\Delta X(s)|\le\varepsilon\right)=0,$$

by the Dominated Convergence Theorem $X^{(\varepsilon)}(t\pm0)\to X(t\pm0)$ for every $t$. $f$ is continuous, hence

$$f\left(X^{(\varepsilon)}(t)\right)-f\left(X^{(\varepsilon)}(0)\right)\to f(X(t))-f(X(0)).$$

If $f$ is continuously differentiable then on any compact interval containing the range of $X$

$$|f(x)-f(y)|\le K|x-y|.$$

So

$$\sum_{0<s\le t}|\Delta f(X(s))|\le\sum_{0<s\le t}K|\Delta X(s)|<\infty,$$

hence $\Delta f(X)=f(X)-f(X_-)$ is also integrable with respect to $\mu$. By the Dominated Convergence Theorem

$$\sum_{0<s\le t}\left(f\left(X^{(\varepsilon)}(s)\right)-f\left(X^{(\varepsilon)}(s-)\right)\right)\to\sum_{0<s\le t}\left(f(X(s))-f(X(s-))\right).$$
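The telescoping structure of (6.33) can be seen in a direct check for a simple jump process; the concrete jump sizes and the choice $f=\exp$ below are our own illustration.

```python
import math

# A simple jump process: piecewise constant with jump sizes dx.  For such
# X, formula (6.33) says f(X(t)) - f(X(0)) equals the sum of the jumps of
# f(X) -- a telescoping sum.  Any continuous f works; exp is used here.
f = math.exp
x0 = 0.0
jumps = [1.0, -0.4, 0.7, 0.25, -1.3]

lhs = f(x0 + sum(jumps)) - f(x0)      # f(X(t)) - f(X(0))
rhs, x = 0.0, x0
for dx in jumps:
    rhs += f(x + dx) - f(x)           # f(X(s)) - f(X(s-))
    x += dx
print(abs(lhs - rhs))  # rounding-error size
```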
3. Finally, assume that $X$ is an arbitrary right-regular process with finite variation. The continuous part of the semimartingale $X$ is the continuous part of the local martingale part, so $X^c=0$, hence in this case $[X^c,X^c]=0$. If $X\in\mathcal{V}$ then Itô's formula is

$$f(X(t))-f(X(0))=\int_0^tf'(X_-)\,dX+\sum_{0<s\le t}\left(f(X(s))-f(X(s-))-f'(X(s-))\,\Delta X(s)\right).$$

Denote by

$$X=\left(X-\sum\Delta X\right)+\sum\Delta X\triangleq X^c+X^d$$

the decomposition of $X$. Here $X^c$ denotes the continuous part of $X$ (of course the symbol $X^c$ now has a double meaning; hopefully the reader is not confused by the notation). As $X$ has finite variation the decomposition is well-defined. $X^c$ is a continuous and $X^d$ is a jump process. Both of them have finite variation. Of course

$$\int_0^tf'(X_-)\,dX^d=\sum_{0<s\le t}f'(X(s-))\,\Delta X(s),$$

which is finite again. Hence Itô's formula simplifies to

$$f(X(t))-f(X(0))=\int_0^tf'(X_-)\,dX^c+\sum_{0<s\le t}\left(f(X(s))-f(X(s-))\right).\qquad(6.34)$$
Example 6.50 If $X\in\mathcal{V}^+$ and $a<b$ then (see Lemma 5.39, page 338)

$$\sqrt{X(b)}-\sqrt{X(a)}=\int_a^b\frac1{\sqrt{X(s)}+\sqrt{X(s-)}}\,dX(s).\qquad(6.35)$$

Assume that $X$ is positive. By (6.34)

$$\sqrt{X(b)}-\sqrt{X(a)}=\int_a^b\frac1{2\sqrt{X_-}}\,dX^c+\sum_{a<s\le b}\left(\sqrt{X(s)}-\sqrt{X(s-)}\right).$$

As $X^c$ is continuous and the number of the jumps of $X$ is at most countable,

$$\int_a^b\frac1{2\sqrt{X_-}}\,dX^c=\int_a^b\frac1{\sqrt X+\sqrt{X_-}}\,dX^c.$$

On the other hand

$$\sum_{a<s\le b}\left(\sqrt{X(s)}-\sqrt{X(s-)}\right)=\sum_{a<s\le b}\frac{X(s)-X(s-)}{\sqrt{X(s)}+\sqrt{X(s-)}}=\int_a^b\frac1{\sqrt{X(s)}+\sqrt{X(s-)}}\,dX^d.$$

Adding up we get (6.35). If $X=0$ on some interval then both sides of (6.35) are, by definition, zero on that interval.
6.4.2 The proof of Itô's formula

Itô's formula has many proofs. One can prove the general formula by the same method we used in the continuous case: using the integration by parts formula (Example 6.49, page 397) one can first show the formula for polynomials by induction, then using approximation one can show the general case. We show (6.31). For $f\equiv c$ or $f(x)=x$ the formula is trivial. If $f(x)=x^2$ then from Taylor's formula it is clear that $R(s)=0$, and Itô's formula is just the integration by parts formula. Now let $f(x)=x^3$. By the integration by parts formula again

$$f(X)-f(X(0))\triangleq X^3-X^3(0)=X_-^2\bullet X+X_-\bullet X^2+\left[X,X^2\right].$$

Since $X^2-X^2(0)=2X_-\bullet X+[X]$,

$$X_-\bullet X^2=2X_-^2\bullet X+X_-\bullet[X],\qquad\left[X,X^2\right]=\left[X,2X_-\bullet X+[X]\right]=2X_-\bullet[X]+[X,[X]],$$

so

$$X^3-X^3(0)=3X_-^2\bullet X+3X_-\bullet[X]+[X,[X]]\triangleq f'(X_-)\bullet X+\frac12f''(X_-)\bullet[X]+[X,[X]].$$

Now as $[X]\in\mathcal{V}$,

$$[X,[X]]=\sum\Delta X\,\Delta[X]=\sum(\Delta X)^3.$$

From Taylor's formula it is easy to see that if $f(x)=x^3$ then $R(s)=(\Delta X(s))^3$. So Itô's formula is valid if $f(x)=x^3$. Now let $f(x)$ be a polynomial of $d$ variables
and assume that the formula is valid for $f(X)$:

$$f(X)-f(X(0))=\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X_-)\bullet X_i+\frac12\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\bullet[X_i,X_j]+\sum R(s).$$

Let $g(x)\triangleq x_lf(x)$. Integrating by parts,

$$g(X)-g(X(0))=f(X_-)\bullet X_l+X_{l-}\bullet f(X)+[X_l,f(X)].$$

Using the formula for the quadratic co-variation of stochastic integrals,

$$[X_l,f(X)]=\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X_-)\bullet[X_i,X_l]+\frac12\sum\Delta X_l\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\,\Delta X_i\Delta X_j+\sum\Delta X_l(s)\,R(s).$$

Using the definition of $R(s)$,

$$[X_l,f(X)]=\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X_-)\bullet[X_i,X_l]+\sum\Delta X_l(s)\left(f(X(s))-f(X(s-))-\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s)\right).$$

Using Itô's formula for $f(X)$,

$$X_{l-}\bullet f(X)=\sum_{i=1}^dX_{l-}\frac{\partial f}{\partial x_i}(X_-)\bullet X_i+\frac12\sum_{i,j}X_{l-}\frac{\partial^2f}{\partial x_i\partial x_j}(X_-)\bullet[X_i,X_j]+\sum X_l(s-)\,R(s).$$

Using the same calculation as in the continuous case with (6.4) and (6.5), one can easily get the first part of the formula for $g$:

$$g(X)-g(X(0))=\sum_{i=1}^d\frac{\partial g}{\partial x_i}(X_-)\bullet X_i+\frac12\sum_{i,j}\frac{\partial^2g}{\partial x_i\partial x_j}(X_-)\bullet[X_i,X_j]+\text{the jump part}.$$

We should finally calculate the value of the jump part:

$$\sum\left(\Delta X_l(s)\left(f(X(s))-f(X(s-))-\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s)\right)+X_l(s-)\,R(s)\right).$$

As $X_l(s)=X_l(s-)+\Delta X_l(s)$, by the definition of $R(s)$ this equals

$$\sum\left(X_l(s)\left(f(X(s))-f(X(s-))-\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s)\right)-X_l(s-)\,\frac12\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(X(s-))\,\Delta X_i(s)\Delta X_j(s)\right).\qquad(6.36)$$

Observe that the expression after the sum in the first line is

$$g(X(s))-g(X(s-))-f(X(s-))\,\Delta X_l(s)-X_l(s-)\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s)-\Delta X_l(s)\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s).$$

Using (6.4) it is just

$$g(X(s))-g(X(s-))-\sum_{k=1}^d\frac{\partial g}{\partial x_k}(X(s-))\,\Delta X_k(s)-\Delta X_l(s)\sum_{i=1}^d\frac{\partial f}{\partial x_i}(X(s-))\,\Delta X_i(s).\qquad(6.37)$$

Adding the sum over $s$ of the last term of (6.37) to the second line of (6.36) and using (6.5) we get the term

$$-\frac12\sum_{i,j}\frac{\partial^2g}{\partial x_i\partial x_j}(X(s-))\,\Delta X_i(s)\Delta X_j(s).$$

Hence Itô's formula is valid for polynomials. From this point the proof of the general case is the same as that of the continuous case (one should use the fact that $X_-$ is locally bounded).

A very natural approach to prove Itô's formula is to use Taylor approximation. To make the proof more interesting, let us first introduce the following concepts. Recall that a measure is locally finite if the measure of every compact set is finite, and that $\mu_n\to\mu$ in the vague topology if $\mu_n((0,t])\to\mu((0,t])$ for every point $t$ which is a point of continuity of the limit $\mu$ (as the points of continuity are dense, the limit is unique). Let $(0,t]$ be an arbitrary interval and let $r>t$ be a point of continuity of $\mu$. Then

$$\limsup_{n\to\infty}\mu_n((0,t])\le\limsup_{n\to\infty}\mu_n((0,r])=\mu((0,r]).$$

Since the points of continuity of $\mu$ are dense in $\mathbb{R}_+$ and as $t\mapsto\mu((0,t])$ is right-continuous,

$$\limsup_{n\to\infty}\mu_n((0,t])\le\mu((0,t])\qquad(6.38)$$
for every $t\ge0$. Also recall that $\mu^c$ denotes the continuous part of the increasing function $t\mapsto\mu((0,t])$.

Definition 6.51 Let $(\Delta_n)$ be an infinitesimal sequence of partitions, that is, on any finite interval $\max_k\left(t_{k+1}^{(n)}-t_k^{(n)}\right)\to0$:

$$\Delta_n:0=t_0^{(n)}<t_1^{(n)}<\ldots<t_{k_n}^{(n)}=\infty.$$

1. We say that a right-regular function $f$ on $[0,\infty)$ has finite quadratic variation with respect to $(\Delta_n)$ if the sequence of point measures

$$\mu_n\triangleq\sum_{t_i^{(n)}\in\Delta_n}\left(f\left(t_{i+1}^{(n)}\right)-f\left(t_i^{(n)}\right)\right)^2\delta\left(t_i^{(n)}\right)\qquad(6.39)$$

(where $\delta(a)$ is Dirac's measure concentrated at the point $a$) converges in the vague topology to a locally finite measure $\mu$, where $\mu$ has the decomposition

$$\mu((0,t])=\mu^c((0,t])+\sum_{s\le t}(\Delta f(s))^2.$$

We shall denote $\mu((0,t])$ by $[f](t)\triangleq[f,f](t)$.

2. We say that right-regular functions $f$ and $g$ on $[0,\infty)$ have finite quadratic co-variation with respect to $(\Delta_n)$ if $[f]$, $[g]$ and $[f+g]$ exist. In this case

$$[f,g]\triangleq\frac12\left([f+g]-[f]-[g]\right).$$

3. A function $g$ is $(\Delta_n)$-integrable with respect to some function $G$ if the limit

$$\lim_{n\to\infty}\sum_{t_i^{(n)}\le t}g\left(t_i^{(n)}\right)\left(G\left(t_{i+1}^{(n)}\right)-G\left(t_i^{(n)}\right)\right)$$

is finite for every $t\ge0$. We shall denote this $(\Delta_n)$-integral by

$$\int_0^tg(s-)\,dG(s).$$
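The point measures (6.39) can be computed for a simulated Brownian path to see the quadratic variation along refining dyadic partitions; the grid sizes and seed below are our own illustrative choices.

```python
import math
import random

random.seed(2)

# Simulate a Brownian path on [0, 1] on a fine grid and compute the
# point-measure mass mu_n((0, 1]) = sum (w(t_{i+1}) - w(t_i))^2 along
# coarser dyadic partitions; for a Wiener path, [w](1) = 1.
N = 1 << 16                       # finest grid: 2**16 steps
dt = 1.0 / N
w = [0.0]
for _ in range(N):
    w.append(w[-1] + random.gauss(0.0, math.sqrt(dt)))

for level in (4, 8, 12, 16):      # partition mesh 2**-level
    step = N >> level             # fine-grid points per partition interval
    qv = sum((w[i + step] - w[i]) ** 2 for i in range(0, N, step))
    print(level, qv)              # qv approaches [w](1) = 1
```

For Brownian motion $\mu^c((0,t])=t$ and there is no jump part, so the printed values fluctuate around $1$ with shrinking variance as the mesh decreases.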
Theorem 6.52 (Föllmer) Let $F\in C^2\left(\mathbb{R}^d\right)$ and let $(\Delta_n)$ be an infinitesimal sequence of partitions of $[0,\infty)$. If $f\triangleq(f_k)_{k=1}^d$ are right-regular functions on $\mathbb{R}_+$ with finite quadratic variation and co-variation with respect to $(\Delta_n)$ then for every $t>0$

$$F(f(t))-F(f(0))=\int_0^t\left\langle\frac{\partial F}{\partial x}(f(s-)),df(s)\right\rangle+\frac12\sum_{i,j}\int_0^t\frac{\partial^2F}{\partial x_i\partial x_j}(f(s-))\,d\,[f_i,f_j](s)-$$
$$-\frac12\sum_{s\le t}\sum_{i,j}\frac{\partial^2F}{\partial x_i\partial x_j}(f(s-))\,\Delta f_i(s)\Delta f_j(s)+\sum_{s\le t}\left(F(f(s))-F(f(s-))-\sum_{i=1}^d\frac{\partial F}{\partial x_i}(f(s-))\,\Delta f_i(s)\right),$$

where

$$\int_0^t\left\langle\frac{\partial F}{\partial x}(f(s-)),df(s)\right\rangle\triangleq\lim_{n\to\infty}\sum_{t_i^{(n)}\le t}\left\langle\frac{\partial F}{\partial x}\left(f\left(t_i^{(n)}\right)\right),f\left(t_{i+1}^{(n)}\right)-f\left(t_i^{(n)}\right)\right\rangle,$$

$$\frac{\partial F}{\partial x}\triangleq\left(\frac{\partial F}{\partial x_1},\frac{\partial F}{\partial x_2},\ldots,\frac{\partial F}{\partial x_d}\right)$$

denotes the gradient vector of $F$, and all the other integrals are $(\Delta_n)$-integrals.

If the coordinates of the vector $X\triangleq(X_1,X_2,\ldots,X_d)$ are semimartingales, then the quadratic variations and co-variations exist and the approximating sums converge uniformly on compact sets in probability. This implies that for some subsequence they converge uniformly, almost surely. Also, for semimartingales the stochastic integrals

$$\int_0^t\frac{\partial F}{\partial x_k}(X(s-))\,dX_k(s)$$

exist and, by the Dominated Convergence Theorem, uniformly on compact intervals in probability,

$$\int_0^t\frac{\partial F}{\partial x_k}(X(s-))\,dX_k(s)=\lim_{n\to\infty}\sum_{t_i^{(n)}\le t}\frac{\partial F}{\partial x_k}\left(X\left(t_i^{(n)}\right)\right)\left(X_k\left(t_{i+1}^{(n)}\right)-X_k\left(t_i^{(n)}\right)\right),$$

therefore Föllmer's theorem implies Itô's formula.

Proof. Fix $t>0$. To simplify the notation we drop the superscript $n$.

1. If the first point in $\Delta_n$ which is larger than $t$ is $t_{k_n}$ then $t_{k_n}\searrow t$. As $f$ is right-continuous,

$$F(f(t))-F(f(0))=\lim_{n\to\infty}\left(F\left(f\left(t_{k_n}\right)\right)-F(f(0))\right)=\lim_{n\to\infty}\sum_i\left(F(f(t_{i+1}))-F(f(t_i))\right).$$
To simplify the notation further we drop all the points from $\Delta_n$ which are larger than $t_{k_n}$. By Taylor's formula

$$F(f(t_{i+1}))-F(f(t_i))=\sum_{k=1}^d\frac{\partial F}{\partial x_k}(f(t_i))\left(f_k(t_{i+1})-f_k(t_i)\right)+$$
$$+\frac12\sum_{k,l}\frac{\partial^2F}{\partial x_k\partial x_l}(f(t_i))\left(f_k(t_{i+1})-f_k(t_i)\right)\left(f_l(t_{i+1})-f_l(t_i)\right)+r\left(f(t_i),f(t_{i+1})\right),$$

where

$$|r(a,b)|\le\varphi\left(\|b-a\|\right)\|b-a\|^2.$$

As $F$ is twice continuously differentiable one may assume that $\varphi$ is increasing and $\lim_{c\searrow0}\varphi(c)=0$.

2. Given $\varepsilon>0$ we split the set of jumps of $f$ into two classes: $C_1$ is a finite set and $C_2$ is the set of jumps for which

$$\sum_{s\in C_2,s\le t}\sum_{k=1}^d\left|\Delta f_k(s)\right|^2\le\varepsilon^2.$$

As $f$ has quadratic variation and co-variation this separation is possible. Since $C_1$ is finite and as $f$ is right-regular, if $\sum^{(1)}$ denotes the sum over the sub-intervals which contain a point from $C_1$ then

$$\lim_{n\to\infty}\sum{}^{(1)}\left(F(f(t_{i+1}))-F(f(t_i))\right)=\sum_{s\in C_1}\left(F(f(s))-F(f(s-))\right).\qquad(6.40)$$

Let $F'$ denote the first derivative and $F''$ the second derivative of $F$. Adding up the increments of the other intervals (with $\sum^{(2)}$ denoting the sum over the remaining sub-intervals),

$$\sum{}^{(2)}\left(F(f(t_{i+1}))-F(f(t_i))\right)=\sum\left(F'(f(t_i))\left(f(t_{i+1})-f(t_i)\right)+\frac12F''(f(t_i))\left(f(t_{i+1})-f(t_i)\right)^2\right)-$$
$$-\sum{}^{(1)}\left(F'(f(t_i))\left(f(t_{i+1})-f(t_i)\right)+\frac12F''(f(t_i))\left(f(t_{i+1})-f(t_i)\right)^2\right)+\sum{}^{(2)}r\left(f(t_i),f(t_{i+1})\right).$$
ˆ FORMULA ITO’s
As C1 is finite the expression in the third line goes to (1)
1 F (f (s−)) ∆f (s) + F (f (s−)) (∆f (s)) . 2
(6.41)
One can estimate the last expression as 2 ≤ ϕ max r (f (t ) , f (t )) f (t ) − f (t ) f (ti+1 ) − f (ti ) i i+1 i+1 i (2) (2) (2) therefore, using (6.38), lim sup r (f (ti ) , f (ti+1 )) ≤ k→∞ (2) 2 ≤ ϕ (ε+) lim sup f (ti ) − f (ti+1 ) ≤ n→∞
≤ ϕ (ε+) lim sup n→∞
d
ti ≤t
µ(k) n ((0, t]) ≤ ϕ (ε+)
k=1
d
[fk ] (t) .
k=1
If ε 0 then this expression goes to zero and the difference of (6.40) and (6.41) goes to s≤t
1 F (f (s)) − F (f (s−)) − F (f (s−)) (∆f (s)) − F (f (s−)) (∆f (s)) 2
3. Let $G$ now be a continuous function. We show that if $f$ is one of the functions $f_k$ or $f_k+f_l$ then

$$\lim_{n\to\infty}\sum G(f(t_i))\left(f(t_{i+1})-f(t_i)\right)^2=\int_0^tG(f(s-))\,d\,[f](s).$$

Using the definition of the measures related to the quadratic variation, this means that

$$\lim_{n\to\infty}\int_0^tG(f)\,d\mu_n=\int_0^tG(f(s-))\,d\mu(s),\qquad(6.42)$$

where the integrals are usual Lebesgue–Stieltjes integrals. Let $\varepsilon>0$ and let

$$h(u)\triangleq\sum_{s\in C_1,s\le u}\Delta f(s).$$

Let $\mu_n^{(C_1)}$ be the point measure like (6.39) based on $h$. As $C_1$ is a finite set it is easy to see that the sequence of point measures $\mu_n^{(C_1)}$ converges to the point measure

$$\mu^{(C_1)}\triangleq\sum_{s\in C_1}(\Delta f(s))^2\delta(s).$$

As $C_1$ is finite it is also easy to see that

$$\lim_{n\to\infty}\int_0^tG(f(s))\,d\mu_n^{(C_1)}(s)=\int_0^tG(f(s-))\,d\mu^{(C_1)}(s).\qquad(6.43)$$
Let $g\triangleq f-h$. As $f=h+g$, obviously

$$\sum_{t_i\le u}\left(f(t_{i+1})-f(t_i)\right)^2=\sum_{t_i\le u}\left(h(t_{i+1})-h(t_i)\right)^2+\sum_{t_i\le u}\left(g(t_{i+1})-g(t_i)\right)^2+$$
$$+2\sum_{t_i\le u}\left(g(t_{i+1})-g(t_i)\right)\left(h(t_{i+1})-h(t_i)\right).$$

$C_1$ has only a finite number of points, and if $h$ is not continuous at some point $s$ then $g$ is continuous at $s$. Hence the third term goes to zero. Therefore $\mu_n-\mu_n^{(C_1)}$ converges vaguely to $\mu-\mu^{(C_1)}$. Now

$$\left|\int_0^tG(f(s))\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)-\int_0^tG(f(s-))\,d\left(\mu-\mu^{(C_1)}\right)(s)\right|\le$$
$$\le\left|\int_0^tG(f(s))\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)-\int_0^tG(f(s))\,d\left(\mu-\mu^{(C_1)}\right)(s)\right|+$$
$$+\left|\int_0^t\left(G(f(s))-G(f(s-))\right)d\left(\mu-\mu^{(C_1)}\right)(s)\right|.$$

The total size of the atoms of the measure $\mu-\mu^{(C_1)}$ is smaller than $\varepsilon^2$. The function $G(f)$ is continuous at the points of continuity of $\mu-\mu^{(C_1)}$, so one can estimate the second term by

$$\left|\int_0^t\left(G(f(s))-G(f(s-))\right)d\left(\mu-\mu^{(C_1)}\right)(s)\right|\le2\varepsilon^2\sup_{s\le t}|G(f(s))|.$$

Recall that $f$ is bounded (Proposition 1.6, page 5), and therefore $\sup_{s\le t}|G(f(s))|<\infty$. Obviously $\left(\mu-\mu^{(C_1)}\right)(C_1)=0$. Hence there are finitely many open intervals which cover the points of $C_1$ with total measure smaller than $\varepsilon$. Let $O$ be the union of these intervals. As the points of continuity are dense, one may assume that the points of the boundary of $O$ are points of continuity of $\mu-\mu^{(C_1)}$. By the vague convergence one can assume that for $n$ sufficiently large $\left(\mu_n-\mu_n^{(C_1)}\right)(O)<\varepsilon$. If one deletes $O$ from $[0,t]$, then on the compact set $[0,t]\setminus O$ the jumps of $f$ are smaller than $\varepsilon$. $G$ is uniformly continuous on the bounded range of $f$ (Proposition 1.7, page 6), so there is a $\delta$ such that if $s_1,s_2\in[0,t]\setminus O$ and $|s_1-s_2|<\delta$ then $|G(f(s_1))-G(f(s_2))|<2\varepsilon$. This means that there is a step function $H$ such that $|H(s)-G(f(s))|<2\varepsilon$ on $[0,t]\setminus O$. One may also assume that the points of discontinuity of the step function $H$ are points of continuity of the measure $\mu-\mu^{(C_1)}$. Then

$$\limsup_{n\to\infty}\left|\int_0^tG(f(s))\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)-\int_0^tG(f(s))\,d\left(\mu-\mu^{(C_1)}\right)(s)\right|\le$$
$$\le2\varepsilon\sup_{s\le t}|G(f(s))|+2\varepsilon\limsup_{n\to\infty}\left(\left(\mu_n-\mu_n^{(C_1)}\right)([0,t])+\left(\mu-\mu^{(C_1)}\right)([0,t])\right)+$$
$$+\limsup_{n\to\infty}\left|\int_0^tH(s)\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)-\int_0^tH(s)\,d\left(\mu-\mu^{(C_1)}\right)(s)\right|.$$

Since the last expression, by the vague convergence, goes to zero, for some $k$ independent of $\varepsilon$

$$\limsup_{n\to\infty}\left|\int_0^tG(f(s))\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)-\int_0^tG(f(s))\,d\left(\mu-\mu^{(C_1)}\right)(s)\right|\le\varepsilon k.$$
As $\varepsilon$ is arbitrary,

$$\lim_{n\to\infty}\int_0^tG(f(s))\,d\left(\mu_n-\mu_n^{(C_1)}\right)(s)=\int_0^tG(f(s-))\,d\left(\mu-\mu^{(C_1)}\right)(s).$$

Using (6.43) one can easily show (6.42).

4. Applying this observation and the definition of the co-variation, one gets the convergence of

$$\sum F''(f(t_i))\left(f(t_{i+1})-f(t_i)\right)^2\triangleq\sum_{k,l}\sum\frac{\partial^2F}{\partial x_k\partial x_l}(f(t_i))\left(f_k(t_{i+1})-f_k(t_i)\right)\left(f_l(t_{i+1})-f_l(t_i)\right)$$

to the sum of integrals

$$\sum_{k,l}\int_0^t\frac{\partial^2F}{\partial x_k\partial x_l}(f(s-))\,d\,[f_k,f_l](s).$$

5. As all the other terms converge,

$$\sum_i\left\langle\frac{\partial F}{\partial x}(f(t_i)),f(t_{i+1})-f(t_i)\right\rangle$$

also converges, and its limit, by definition, is

$$\int_0^t\left\langle\frac{\partial F}{\partial x}(f(s-)),df(s)\right\rangle,$$

which proves the formula.

6.4.3 Exponential semimartingales
As an application of the general Itô formula let us discuss exponential semimartingales. Let $Z$ be an arbitrary complex semimartingale, that is, let $Z\triangleq X+iY$, where $X$ and $Y$ are real-valued semimartingales. Let us investigate the stochastic integral equation

$$\mathcal{E}=1+\mathcal{E}_-\bullet Z.\qquad(6.44)$$

Definition 6.53 The equation (6.44) is called the Doléans equation. The simplest version of the equation is when $Z(s)\equiv s$:

$$\mathcal{E}(t)=1+\int_0^t\mathcal{E}(s-)\,ds=1+\int_0^t\mathcal{E}(s)\,ds,$$

which characterizes the exponential function $\mathcal{E}(t)=\exp(t)$. This explains the next definition:

Definition 6.54 The solution of (6.44), denoted by $\mathcal{E}(Z)$, is called the exponential semimartingale of $Z$.

Proposition 6.55 (Yor's formula) If $X$ and $Y$ are arbitrary semimartingales then

$$\mathcal{E}(X)\mathcal{E}(Y)=\mathcal{E}\left(X+Y+[X,Y]\right).$$

Proof. By the formula for the quadratic variation of stochastic integrals

$$[\mathcal{E}(X),\mathcal{E}(Y)]=\left[1+\mathcal{E}(X)_-\bullet X,1+\mathcal{E}(Y)_-\bullet Y\right]=\mathcal{E}(X)_-\mathcal{E}(Y)_-\bullet[X,Y].$$

Integrating by parts,

$$\mathcal{E}(X)\mathcal{E}(Y)-1=\mathcal{E}(X)_-\bullet\mathcal{E}(Y)+\mathcal{E}(Y)_-\bullet\mathcal{E}(X)+[\mathcal{E}(X),\mathcal{E}(Y)]=\mathcal{E}(X)_-\mathcal{E}(Y)_-\bullet\left(Y+X+[X,Y]\right),$$

from which, by the definition of the operator $\mathcal{E}$, Yor's formula is evident.
In the definition of E(Z) and during the proof of Yor’s formula we have implicitly used the following theorem: Theorem 6.56 (Solution of Dol´ eans’ equation) Let Z be an arbitrary complex semimartingale. 1. There is a process E which satisfies the integral equation (6.44). 2. If E1 and E2 are two solutions of (6.44) then E1 and E2 are indistinguishable. 3. If τ inf {t : ∆Z = −1} then E (Z) = 0 on [0, τ ), E (Z)− = 0 on [0, τ ] and E (Z) = 0 on [τ , ∞). 4. E (Z) is a semimartingale. 5. If Z has finite variation then E (Z) has finite variation. 6. If Z is a local martingale then E (Z) is a local martingale. 7. E has the following representation: 1 c (6.45) E E (Z) = exp Z − Z (0) − [Z] × 2 ! × (1 + ∆Z) exp (−∆Z) , where the product in the formula is absolutely convergent.
Proof. The proof of the theorem is a direct and simple, but lengthy calculation. We divide the proof into several steps. variation of semimartingales is finite. Hence the sum 1. The quadratic 2 |∆Z (s)| is convergent. Therefore on the interval [0, t] there are just finitely s≤t many moments when |∆Z| > 1/2. If |u| ≤ 1/2, then 2
|ln (1 + u) − u| ≤ C |u| , hence ln
!
|1 + ∆Z| |exp (−∆Z)| = (ln (|1 + ∆Z|) − |∆Z|) ≤ ≤ |ln (1 + |∆Z|) − |∆Z|| ≤ 2 ≤C |∆Z| < ∞.
Therefore the product V (t)
!
(1 + ∆Z (s)) exp (−∆Z (s))
s≤t
is absolutely convergent. Separating the real and the imaginary parts and taking logarithm, one can immediately see that V is a right-regular process with finite variation. By the definition of the product operation obviously56 V (0)
!
(1 + ∆Z (s)) = 1 + ∆Z (0) = 1.
s≤0
2. Let us denote by U the expression in the exponent of E (Z): U (t) Z − Z (0) −
1 c [Z ] . 2
With this notation E E (Z) V exp (U ) . By Itˆo’s formula for complex semimartingales, using that E (0) = 1, c and that V has finite variation, the co-variation [U, V ] = [U c , V c ] and 56 See:
(1.1) on page 4.
[V]^c = [V^c] are zero, and hence

E = 1 + E_− • U + exp(U_−) • V + ½ E_− • [U]^c + Σ (∆E − V_− exp(U_−) ∆U − exp(U_−) ∆V).

V is a pure jump process and therefore

A ≜ exp(U_−) • V = Σ exp(U_−) ∆V.

As ∆U = ∆Z,

∆E ≜ E − E_− ≜ exp(U) V − exp(U_−) V_− =
= exp(U_− + ∆U) V_− (1 + ∆Z) exp(−∆Z) − exp(U_−) V_− =
= exp(U_− + ∆U) exp(−∆U) V_− (1 + ∆U) − exp(U_−) V_− =
= exp(U_−) V_− ∆U ≜ E_− ∆U.

Substituting the expressions A and ∆E,

A + Σ (∆E − E_− ∆U − exp(U_−) ∆V) = 0.

Obviously

[U]^c = [Z − Z(0) − ½[Z]^c]^c = [Z^c] = [Z]^c,

and therefore

E = 1 + E_− • U + ½ E_− • [U]^c = 1 + E_− • (Z − Z(0) − ½[Z]^c) + ½ E_− • [Z]^c = 1 + E_− • (Z − Z(0)) = 1 + E_− • Z,

hence E satisfies (6.44).

3. One has to prove that the solution is unique. Let Y be an arbitrary solution of (6.44). The stochastic integrals are semimartingales, so Y is a semimartingale. By Itô's formula H ≜ Y · exp(−U) is also a semimartingale. Applying the
multidimensional complex Itô formula for the complex function z₁ · exp(−z₂),

H = 1 − H_− • U + exp(−U_−) • Y + ½ H_− • [U]^c − exp(−U_−) • [U, Y]^c + Σ (∆H + H_− ∆U − exp(−U_−) ∆Y).

Y is a solution of the Doléans equation, so

exp(−U_−) • Y = exp(−U_−) Y_− • Z ≜ H_− • Z.

Also

[U, Y]^c = [U, Y_− • Z]^c = Y_− • [U, Z]^c = Y_− • [Z − Z(0) − ½[Z]^c, Z]^c = Y_− • [Z]^c,

so

exp(−U_−) • [U, Y]^c = H_− • [Z]^c.

Adding up these terms and using that [U]^c = [Z]^c,

H_− • (Z + ½[U]^c − [Z]^c) = H_− • (Z − Z(0) − ½[Z]^c) = H_− • U,

hence

H = 1 + Σ (∆H + H_− ∆U − exp(−U_−) ∆Y).

Y is a solution of (6.44), so ∆Y = Y_− ∆Z = Y_− ∆U. Hence

H = 1 + Σ (∆H + H_− ∆U − exp(−U_−) Y_− ∆U) = 1 + Σ (∆H + H_− ∆U − H_− ∆U) = 1 + Σ ∆H.   (6.46)
On the other hand, using (6.46) again,

∆H ≜ H − H_− ≜ Y exp(−U) − H_− =
= exp(−U_− − ∆U)(Y_− + ∆Y) − H_− =
= exp(−U_− − ∆U) Y_− (1 + ∆Z) − H_− =
= exp(−U_−) Y_− exp(−∆U)(1 + ∆Z) − H_− =
= H_− (exp(−∆Z)(1 + ∆Z) − 1),

so

H = 1 + H_− • R,   (6.47)

where

R ≜ Σ (exp(−∆Z)(1 + ∆Z) − 1).

For some constant C, if |x| ≤ 1/2, then

|exp(−x)(1 + x) − 1| ≤ Cx².

Z is a semimartingale, so Σ(∆Z)² < ∞, and therefore R is a complex process with finite variation.

4. Let us prove the following simple general observation: if v is a right-regular function with finite variation, then the only right-regular function f for which
f(h) = ∫₀ʰ f(s−) dv(s),   h ≥ 0,   (6.48)

is f ≡ 0. Let s ≜ inf{t : f(t) ≠ 0}. Obviously f = 0 on the interval [0, s). Hence by the integral equation (6.48)

f(s) = ∫₀ˢ f(t−) dv(t) = ∫₀ˢ 0 dv = 0.
If s < ∞ then, as v is right-regular, there is a t > s such that Var(v(t)) − Var(v(s)) ≤ 1/2. If t ≥ u > s, then

|f(u)| = |f(s) + ∫ₛᵘ f_− dv| = |∫ₛᵘ f_− dv| ≤ Var(v, s, u) · sup_{s≤u≤t} |f(u)| ≤ ½ sup_{s≤u≤t} |f(u)|,

and therefore

sup_{s<u≤t} |f(u)| ≤ ½ sup_{s<u≤t} |f(u)|.

As f is regular, f is bounded on every finite interval, so sup_{s<u≤t} |f(u)| = 0. This contradicts the definition of s, hence s = ∞ and f ≡ 0.

5. V is a pure jump process with ∆V = V_− (exp(−∆Z)(1 + ∆Z) − 1), so summing up the jumps

V = 1 + V_− • R.   (6.49)

If G ≜ H − V then, subtracting equations (6.47) and (6.49), G = G_− • R. R is right-regular and has finite variation. Therefore G ≡ 0, so H ≡ V. Hence

Y ≜ H exp(U) = V exp(U) = E(Z),

so E(Z) is the only solution of Doléans' equation.

6. As we have already mentioned, by Itô's formula E(Z) is a semimartingale. By (6.44) and by the basic properties of stochastic integrals, if Z is a local martingale then E(Z) is also a local martingale; if Z has finite variation then E(Z) also has finite variation. From (6.45) the other parts of the theorem are obvious.
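In discrete time the construction in the proof can be made concrete: for a pure-jump path, equation (6.44) reduces to the recursion E(k+1) = E(k)(1 + ∆Z(k+1)), whose solution is exactly the product in (6.45) without the exponential factor. A minimal numerical sketch (the jump sizes are hypothetical, chosen only for illustration):

```python
# Discrete sketch of Doléans' equation E = 1 + E_- . Z: for a pure-jump path the
# integral equation becomes the recursion E(k+1) = E(k) + E(k)*dZ[k], and its
# solution is the running product of (1 + dZ[k]), the discrete analogue of (6.45).
def doleans_recursion(dZ):
    E = [1.0]
    for dz in dZ:
        E.append(E[-1] + E[-1] * dz)     # E(k+1) = E(k) * (1 + dZ[k])
    return E

def doleans_product(dZ):
    E, p = [1.0], 1.0
    for dz in dZ:
        p *= 1.0 + dz
        E.append(p)
    return E

dZ = [0.1, -0.3, 0.25, -1.0, 0.4]        # hypothetical jump sizes; one jump equals -1
a, b = doleans_recursion(dZ), doleans_product(dZ)
assert all(abs(x - y) < 1e-12 for x, y in zip(a, b))
# after the jump of size -1 (where 1 + dZ = 0) the solution stays at 0
assert a[4] == 0.0 and a[5] == 0.0
```

The absorption after the jump of size −1 mirrors item 3 of the theorem: once ∆Z = −1 occurs at τ, the exponential E(Z) vanishes on [τ, ∞).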
6.5 Itô's Formula For Convex Functions
In this section we present some generalizations of Itô's formula. One particular deficiency of the formula is that one can use it only with smooth functions. But some very important functions like |x| or x± are non-differentiable, so, e.g., with the smooth version of Itô's formula we cannot prove that the absolute value of a semimartingale is again a semimartingale⁵⁷. As we shall see, the key property of the function |x| is that it is convex. The main result of this section is that the class of semimartingales is closed under transformation by convex functions.

57. See: Theorem 6.65, page 422.
6.5.1 Derivative of convex functions
We shall use the next elementary, but important observation very often:

Theorem 6.57 (Fundamental Theorem of Calculus for Convex Functions) Let f be a continuous convex function defined on some finite or infinite interval [a, b]. If f′₊ and f′₋ denote the right and the left derivatives of f, then

f(b) − f(a) = ∫ₐᵇ f′₊(x) dx = ∫ₐᵇ f′₋(x) dx.   (6.50)
Proof. f is convex, so for an arbitrary x ∈ (a, b)

h ↦ (f(x + h) − f(x))/h   (6.51)

is meaningful and increasing in some neighborhood of h = 0. So the derivatives f′₊(x) and f′₋(x) exist, and it is not difficult to show that they are increasing. Every monotone function is Riemann integrable on any finite interval [c, d]: if g is monotone then the difference of the upper and the lower approximating sums is bounded by

|g(d) − g(c)| · maxₙ (xₙ − xₙ₋₁) → 0.

Let (xₙ) be a partition of some [c, d] ⊆ (a, b). As (6.51) is increasing,

f′₋(xₙ₋₁) ≤ f′₊(xₙ₋₁) ≤ (f(xₙ) − f(xₙ₋₁))/(xₙ − xₙ₋₁) ≤ f′₋(xₙ) ≤ f′₊(xₙ).

Multiplying by xₙ − xₙ₋₁ and adding up for all n,

Σₙ f′₊(xₙ₋₁)(xₙ − xₙ₋₁) ≤ f(d) − f(c) ≤ Σₙ f′₊(xₙ)(xₙ − xₙ₋₁),

and similarly for f′₋. The expression on the left is the lower approximating sum, the expression on the right is the upper approximating sum. As the Riemann integral exists on an arbitrary compact interval [c, d] ⊆ (a, b), (6.50) holds on [c, d]. One can get the general formula for [a, b] by taking limits on both sides⁵⁸.

58. Of course on infinite intervals it is possible that the integral is not finite, but in this case f(b) − f(a) is also infinite.
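For a concrete check of (6.50), take f(x) = |x|: its right derivative is the step function equal to −1 on (−∞, 0) and 1 on [0, ∞), and its integral over [−1, 2] must equal |2| − |−1| = 1. A small sketch (the grid is chosen so that no midpoint hits the kink at 0; Python is used only for illustration):

```python
# Midpoint-rule check of the Fundamental Theorem of Calculus for the convex
# function f(x) = |x|: the right derivative f'_+ is a bounded increasing step
# function, hence Riemann integrable, and its integral recovers f(b) - f(a).
f = abs
def f_plus(x):                      # right derivative of |x|
    return 1.0 if x >= 0 else -1.0

a, b, n = -1.0, 2.0, 300
h = (b - a) / n
integral = sum(f_plus(a + (k + 0.5) * h) * h for k in range(n))
assert abs(integral - (f(b) - f(a))) < 1e-9   # both sides equal 1
```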
Lemma 6.58 If f is a convex function on an open interval⁵⁹ I ⊆ R then the left derivative f′₋ is left-continuous and the right derivative f′₊ is right-continuous.

Proof. Let us take for example the function f′₋. One should show that

lim_{x↗w} f′₋(x) = f′₋(w).   (6.52)

As f is convex,

f′₋(x) = sup_{z<x} (f(z) − f(x))/(z − x).

Let y < x < w. Since I is open, f is continuous on I, so

lim_{x↗w} f′₋(x) = lim_{x↗w} sup_{z<x} (f(z) − f(x))/(z − x) ≥ lim_{x↗w} (f(y) − f(x))/(y − x) = (f(y) − f(w))/(y − w).

Taking the supremum on the right-hand side,

lim_{x↗w} f′₋(x) ≥ sup_{y<w} (f(y) − f(w))/(y − w) = f′₋(w).

On the other hand, since f′₋ is increasing, lim_{x↗w} f′₋(x) ≤ f′₋(w). Hence (6.52) holds.

Definition 6.59 If f is a convex function then f′ will denote the left-hand side derivative of f, that is, f′ ≜ f′₋.

Definition 6.60 sign(x) is the left-hand side derivative of the convex function |x|, that is⁶⁰,

sign(x) ≜ 1 if x > 0,   sign(x) ≜ −1 if x ≤ 0.

Definition 6.61 Let I be an open interval. We say that the function g : I → R is a generalized derivative⁶¹ of the function h : I → R if for arbitrary⁶² φ ∈ C_c^∞(I) the integration by parts formula holds, that is, if φ ∈ C_c^∞(I) then

∫_I hφ′ dλ = [hφ]_I − ∫_I gφ dλ = − ∫_I gφ dλ.

59. As I is open, f is continuous in I. See: (6.51) above.
60. Observe that it is a bit of an unusual definition of the sign function.
61. Recall that we are dealing with three different definitions of the derivative.
62. C_c^∞(U) is the set of continuously differentiable functions with compact support.

It is not too surprising that if f is a convex function on an open interval I then f′₋ and f′₊ are generalized derivatives of f: the support of φ is in I and f
ˆ FORMULA ITO’s
is continuous, hence it is bounded on the support of φ. For example, by the Dominated Convergence Theorem
f φ dλ =
I
φ (x + h) − φ (x) dx = h→0 h
f (x) lim I
= lim
h→0
f (x) I
−
= lim
h→0
I
w→0
f (x − h) − f (x) φ (x) dx = −h
−
= lim
I
φ (x + h) − φ (x) dx = h
f (x + w) − f (x) φ (x) dx = − w
I
f± φdλ.
or f+ as the generalOne can think about the generalized derivative of f f− ized second derivative of f . For convex functions the generalized second derivative f is generally not a function: it is the measure µ generated by the left-continuous or by the right-continuous function f+ . By the integration increasing function f− ∞ by parts formula, using that φ ∈ Cc (I), hence it is zero around the endpoints of I f± φ dλ = f± dφ = − φdf± − φdµ. I
I
I
I
Example 6.62 The generalized second derivative of function |x| is 2δ 0 . The generalized second derivative of functions x± are δ 0 .
f (x) |x| is a convex function and f− (x) = sign (x)
1 −1
if x > 0 , if x ≤ 0
and the measure generated by f− is 2δ 0 . The proof of the other relation is similar.
Lemma 6.63 Let the measure µ f be the generalized second derivative of a convex function f . If the support of µ is a subset of a compact interval [a, b], then with some constants α and β f (x) = αx + β +
1 2
1 = αx + β + 2
b
|x − t| dµ (t) = a
R
|x − t| dµ (t) .
(6.53)
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
421
Proof. For convex functions one can use the Fundamental Theorem of Calculus. If x ∈ [a, b] , then integrating by parts63 f (x) − f (a) = [a,x)
f− (t) dt =
x
a
(x) − af+ (a) − = xf+ = xf+ (a) − af+ (a) +
f+ (t) dt =
x
a x
tdf+ (t) = x − tdf+ (t) ,
a
that is with some constants α1 , β 1
x
|x − t| dµ (t) .
f (x) = α1 x + β 1 + a
With the same calculation from the other side
b
|x − t| dµ (t) .
f (x) = α2 x + β 2 + x
Adding up these identities and dividing by two one gets the representation (6.53). Example 6.64 Representation of function f (x) = x+ .
The generalized second derivative of f (x) = x+ is δ 0 , so 1 2
R
|x − t| dµ (t) =
1 |x| 2
and x+ =
1 1 x + |x| , 2 2
that is in the representation α = 1/2 and β = 0. 63 f −
f+
(b) − f (a). f is right continuous so µ ((a, b]) = is left continuous so µ ([a, b)) = f− − + b (b) − f+ (a). Recall that in this book a hdµ (a,b] hdµ.
422
ˆ FORMULA ITO’s
6.5.2
Definition of local times
The most important result of the present section is the following: Theorem 6.65 If f : R → R is a convex function, X is a semimartingale then f (X) is also a semimartingale. f (X) has the following decomposition:
t
f (X (t)) − f (X (0)) =
f (X (s−)) dX (s) + A (t) ,
(6.54)
0
where A ∈ V + . For the jumps of A ∆A = f (X) − f (X− ) − f (X− ) ∆X. Proof. f f− is increasing, hence it is Borel measurable. X− is locally bounded therefore f (X− ) is predictable and locally bounded. This means that the stochastic integral in (6.54) exists. The main idea of the proof is that one can approximate the convex function f by C 2 functions and for the approximating o’s formula. C 2 functions one can use Itˆ 1. Let g ∈ Cc∞ ((−∞, 0]) be a non-negative function for which R gdλ = 1. For every n let
y
f x+ g (y) dy = n R = n f (z) g (n (z − x)) dz.
fn (x)
R
f is convex on R, hence it is continuous, so on every finite interval it is bounded. Therefore by the last formula one can differentiate under the integral sign: fn (x) = −n2
R
f (z) g (n (z − x)) dz
and obviously fn ∈ C ∞ . By the just proved version of the Fundamental Theorem of Calculus64 f (t) = f (s) + s 64 See:
Theorem 6.57, page 418.
t
f (v) dv.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
423
If we integrate by parts with h(z) ≜ n g(n(z − x)), then

0 = [hf]_{−∞}^{∞} = ∫_R h df + ∫_R f dh = ∫_R h f′ dλ + ∫_R f h′ dλ.

Therefore

f′ₙ(x) = ∫_R h f′ dλ = n ∫_R f′(z) g(n(z − x)) dz = ∫_R f′(x + y/n) g(y) dy.   (6.55)

f′ is increasing, hence f′ is locally bounded. So, if n → ∞, one may take the limit under the integral sign. The support of g is in the set of non-positive numbers, so if n → ∞ then the limit of the integrand is f′(x − 0) g(y):

lim_{n→∞} f′ₙ(x) = lim_{n→∞} ∫_R f′(x + y/n) g(y) dy = f′(x − 0) ∫_R g dλ = f′(x − 0) = f′₋(x − 0) = f′₋(x) ≜ f′(x).
2. Let us apply Itô's formula to the functions fₙ ∈ C²:

fₙ(X(t)) − fₙ(X(0)) = ∫₀ᵗ f′ₙ(X₋) dX + Aₙ(t),   (6.56)

where

Aₙ(t) ≜ Σ_{0<s≤t} (fₙ(X(s)) − fₙ(X(s−)) − f′ₙ(X(s−)) ∆X(s)) + ½ ∫₀ᵗ f″ₙ(X(s−)) d[X^c](s).

fₙ is also convex, so Aₙ is increasing. fₙ is continuous, so the left-hand side of (6.56) is right-continuous. The integral on the right-hand side is, by the definition of the stochastic integrals, right-continuous. Hence the function Aₙ is right-continuous, and the formula for the jumps of A in the statement of the theorem is trivially valid.

3. If n → ∞, then the limit on the left side is f(X(t)) − f(X(0)).
f′ is increasing, hence if y ≤ 0 then f′(x + y/n) ≤ f′(x), and therefore by (6.55) one has that f′₁ ≤ f′ₙ ≤ f′. Assume that X(0) = 0. As X₋ is locally bounded and as the derivative of a convex function is increasing, f′(X₋) and f′₁(X₋) are locally bounded, hence the sequence (f′ₙ(X₋)) is dominated by a locally bounded process. Therefore if n → ∞ then, uniformly on compact intervals in probability,

lim_{n→∞} f′ₙ(X₋) • X = f′(X₋) • X.

From this it follows that Aₙ is stochastically convergent, and the limit is increasing.

4. We show that A is right-continuous and that the condition for the jumps holds. It is sufficient to show that the convergence Aₙ → A on every compact interval is uniform in probability. For this one should prove that fₙ(X) → f(X) uniformly on compacts in probability. If n → ∞, then the convergence fₙ → f is uniform on every compact interval⁶⁵. Let (τₘ) be a localizing sequence of X₋. Obviously one can localize the line (6.54), so it is sufficient to prove the relation for the truncated processes X^{τₘ}:

fₙ(X^{τₘ}) − fₙ(X(0)) = fₙ(X₋^{τₘ}) − fₙ(X(0)) + ∆fₙ(X(τₘ)).

If n → ∞, then as X₋^{τₘ} is bounded, by the uniform convergence on compacts of (fₙ), the convergence of the fₙ(X₋^{τₘ}) is uniform for every trajectory. The convergence of ∆fₙ(X(τₘ)) is a convergence of random variables, hence the convergence

fₙ(X^{τₘ}) → f(X^{τₘ})

is uniform on any compact interval.

5. If X(0) ≠ 0 then

f(X) = f(X − X(0) + X(0)) ≜ g(X − X(0), X(0)).

One can approximate g by

gₙ(X − X(0), X(0)) ≜ fₙ(X − X(0) + X(0)).

Using the multi-dimensional Itô formula one can prove the theorem as in the case X(0) = 0. Our next goal is to investigate the properties of A.

65. It is generally true, see [35], page 105, that for convex functions pointwise convergence implies uniform convergence. Now, using the definition of fₙ, one can directly prove the uniform convergence.
Corollary 6.66 For any a,

|X(t) − a| = |X(0) − a| + ∫₀ᵗ sign(X(s−) − a) dX(s) + Aₐ(t).   (6.57)

Definition 6.67 For arbitrary a the continuous part of Aₐ in (6.57), that is, the expression

L(a, t) ≜ Aₐ(t) − Σ_{0<s≤t} (∆|X(s) − a| − sign(X(s−) − a) ∆X(s)),

is called the local time of X at point a.

By the construction of local times L(a, t) is continuous in the time parameter t. On the other hand, at the moment we cannot say anything about the spatial parameter a. The reason for this is that during the construction of A we used Itô's formula, therefore for each value of a the two sides of (6.57) are just indistinguishable and they are not the same. The number of possible values of the parameter a is not countable, so it is not clear how one can unify the exceptional zero sets. To say it in another way, the main problem is that for any fixed a the stochastic integral part is defined only up to indistinguishability. By Fubini's theorem for stochastic integrals⁶⁶ one can assume that the parametric stochastic integral in (6.57) is product measurable. This implies the following:

Proposition 6.68 The local time L(a, t, ω) has a version which is product measurable⁶⁷ in (a, t, ω).

From now on we shall assume that L is product measurable.

Example 6.69 If X is a semimartingale, then

(X(t) − a)⁺ − (X(0) − a)⁺ = ∫₀ᵗ χ(X₋ > a) dX +
+ Σ_{0<s≤t} χ(X(s−) > a)(X(s) − a)⁻ + Σ_{0<s≤t} χ(X(s−) ≤ a)(X(s) − a)⁺ + ½ L(a, t),

66. See: Proposition 5.23, page 319. Observe that the integrand is uniformly bounded.
67. Later we show that for continuous local martingales the local time L(a, t, ω) has a version which is continuous in (a, t).
or

(X(t) − a)⁻ − (X(0) − a)⁻ = − ∫₀ᵗ χ(X₋ ≤ a) dX +
+ Σ_{0<s≤t} χ(X(s−) > a)(X(s) − a)⁻ + Σ_{0<s≤t} χ(X(s−) ≤ a)(X(s) − a)⁺ + ½ L(a, t).

These formulas are called Tanaka's formulas.
Let us apply the generalization of Itô's formula (6.54) for the convex functions f(x) ≜ (x − a)⁺ and g(x) ≜ (x − a)⁻:

f(X(t)) = f(X(0)) + ∫₀ᵗ f′(X₋) dX + A⁽⁺⁾(t),
g(X(t)) = g(X(0)) + ∫₀ᵗ g′(X₋) dX + A⁽⁻⁾(t).

Subtracting the two lines above and using that

f′(x) = χ(x > a),   g′(x) = −χ(x ≤ a),

one gets

X(t) − X(0) = ∫₀ᵗ 1 dX + A⁽⁺⁾(t) − A⁽⁻⁾(t).

This implies that A⁽⁺⁾(t) = A⁽⁻⁾(t). If

B⁽⁺⁾(t) ≜ A⁽⁺⁾(t) − Σ_{0<s≤t} (f(X(s)) − f(X(s−)) − f′(X(s−)) ∆X(s)),
B⁽⁻⁾(t) ≜ A⁽⁻⁾(t) − Σ_{0<s≤t} (g(X(s)) − g(X(s−)) − g′(X(s−)) ∆X(s)),

then by the definition of the local time B⁽⁺⁾(t) + B⁽⁻⁾(t) = L(a, t). As the difference of the sums above is zero, B⁽⁺⁾(t) = B⁽⁻⁾(t), hence B⁽⁺⁾ = B⁽⁻⁾ = ½L(a), so the formula is valid.
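Along a piecewise constant path Tanaka's decomposition (6.57) can be traced step by step: each increment of Aₐ is the convexity defect ∆|X − a| − sign(X₋ − a)∆X, which is always non-negative (the same |ab| − ab ≥ 0 computation reappears in the proof of Theorem 6.72). A sketch with a hypothetical jump path (Python, illustration only):

```python
# Tanaka decomposition along a pure-jump path: |X(t) - a| equals |X(0) - a| plus
# the sum of sign(X_- - a) * dX plus an increasing compensator A.
def sign(x):                  # left derivative of |x|: sign(0) = -1 (Definition 6.60)
    return 1.0 if x > 0 else -1.0

a = 0.0
X = [1.0, 0.5, -0.5, -0.2, 0.7, 0.0]   # hypothetical path values
A, integral = 0.0, 0.0
for prev, cur in zip(X, X[1:]):
    dA = (abs(cur - a) - abs(prev - a)) - sign(prev - a) * (cur - prev)
    assert dA >= 0.0                   # each convexity defect is non-negative
    A += dA
    integral += sign(prev - a) * (cur - prev)
# the decomposition reproduces |X(t) - a| exactly (telescoping sum)
assert abs(abs(X[-1] - a) - (abs(X[0] - a) + integral + A)) < 1e-12
```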
For any process X one can introduce the occupation times measure

µₜ(B) ≜ λ(s ≤ t : X(s) ∈ B).

Later we shall see⁶⁸ that for Wiener processes the local time L(a, t) is the density function of µₜ. With the usual interpretation of density functions, for Wiener processes one can think about L(a, t) da as the time during the time interval [0, t] a Wiener process spends infinitely close to a.

Example 6.70 The Green function and the local time of Wiener processes.

Let w⁽ˣ⁾ be a Wiener process starting from point x. Let x ∈ I ≜ (a, b) be a bounded interval, and let τ_I⁽ˣ⁾ be the exit time of w⁽ˣ⁾ from I. Let us calculate the expected value E(L⁽ˣ⁾(y, τ_I⁽ˣ⁾)). By the definition of local time,

|w⁽ˣ⁾(t) − y| = |w⁽ˣ⁾(0) − y| + ∫₀ᵗ sign(w⁽ˣ⁾ − y) dw⁽ˣ⁾ + L⁽ˣ⁾(y, t).

If we truncate w⁽ˣ⁾ by τ_I⁽ˣ⁾ then the truncated process is bounded. If we truncate both sides with τ_I⁽ˣ⁾ then the truncated integrator is in H². By Itô's isometry the integral is also in H². Therefore the stochastic integral is a uniformly integrable martingale. By the Optional Sampling Theorem the expected value of the stochastic integral is zero, so

E(|w⁽ˣ⁾(τ_I⁽ˣ⁾) − y|) = |x − y| + E(L⁽ˣ⁾(y, τ_I⁽ˣ⁾)).

w⁽ˣ⁾ leaves the bounded set [a, b] almost surely, so

E(|w⁽ˣ⁾(τ_I⁽ˣ⁾) − y|) = |a − y| P(w⁽ˣ⁾(τ_I⁽ˣ⁾) = a) + |b − y| P(w⁽ˣ⁾(τ_I⁽ˣ⁾) = b).

With the Optional Sampling Theorem one can easily calculate the probabilities⁶⁹. Obviously

P(w⁽ˣ⁾(τ_I⁽ˣ⁾) = a) + P(w⁽ˣ⁾(τ_I⁽ˣ⁾) = b) = 1,

68. See: Corollary 6.75, page 435.
69. See: Example 1.116, page 81.
ˆ FORMULA ITO’s
and
(x) x = E w(x) (0) = E w(x) τ I
(x) (x) = aP w τ I = a + bP w τ I =b . Solving the equations
b−x (x) , P w τI =a = b−a
x−a (x) P w τI . =b = b−a
Substituting back
x−a b−x (x) + |b − y| − |x − y| . = |a − y| E L(x) y, τ I b−a b−a With elementary calculation
(x)
E L
(x) y, τ I
2 = b−a
(x − a) (b − y) if a ≤ x ≤ y ≤ b . (y − a) (b − x) if a ≤ y ≤ x ≤ b
If we introduce the so-called Green function 1 (x − a) (b − y) if a ≤ x ≤ y ≤ b GI (x, y) (y − a) (b − x) if a ≤ y ≤ x ≤ b b−a then
(x) E L(x) y, τ I = 2GI (x, y) .
Example 6.71 If 0 < a < b then before reaching point b a Wiener process starting from x = 0 on average spends 2 (b − a) da time units in the da neighbourhood point a.
Let w be a Wiener process and let 0 < a < b. Let us denote by τ b the first passage time of point b. Using the interpretation of the local times one should calculate the expected value E (L (a, τ b )). Using the same method as in the previous example = |a| + E
|b − a| = E (|w (τ b ) − a|) = τb sign (w (s) − a) dw (s) + E (L (a, τ b )) .
0
Observe that now wτ b is not bounded, so it is not in H2 so the stochastic integral is not a uniformly integrable martingale. If c < 0 < a < b, then as in the previous
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
429
example E (L (a, τ b ∧ τ c )) = 2G(c,b) (0, a) . If c −∞, then the limit on the right-hand side is 2 (b − a). On the left-hand side τ b ∧τ c τ b and as t → L (a, t) is increasing and continuous by the Monotone Convergence Theorem E (L (a, τ b )) = 2 (b − a) . 6.5.3
Meyer–Itˆ o formula
Theorem 6.72 Let X be a semimartingale. If L (a) is the local time of X at point a then for almost all outcome ω the support of the measure generated by the increasing function t → L (a, t, ω) is in the set {s : X (s−, ω) = X (s, ω) = a} . Proof. By the definition of local times L (a, t, ω) is continuous in time parameter t. This implies that the measure of every single point, with respect to the measure generated by L (a, t, ω), is zero. For every trajectory the number of the jumps of X is maximum countable, so it is sufficient to prove that the support of the measure generated by L (a, t, ω) is a subset of {s : X (s−, ω) = a} for almost all outcome ω. As convex functions of semimartingales are semimartingales Y |X − a| is a semimartingale. Y 2 = Y 2 (0) + 2Y− • Y + [Y ] . Z X − a is also a semimartingale. Y 2 = Z 2 = Y 2 (0) + 2Z− • Z + [Z] . Obviously [Z] = [Y ], therefore Y− • Y = Z− • Z. As Y = |Z|
t
sign (Z− ) dZ + Aa (t) .
Y (t) = Y (0) + 0
By the associativity rule
t
Y− dY = 0
t
0
t
Y− dAa .
Y− sign (Z− ) dZ + 0
430
ˆ FORMULA ITO’s
By the definition of sign Y− sign (Z− ) |Z− | sign (Z− ) = Z− .
(6.58)
Therefore
t
t
Z− dZ =
Y− dY =
0
0
t
0
t
Y− dAa .
Z− dZ + 0
Hence, by the definition of L (a, t, ω)
t
Y− dAa =
0=
(6.59)
0
t
Y− dLa +
[4pt] = 0
Y (s−) (∆ |Z (s)| − sign (Z (s−)) ∆Z (t)) .
0<s≤t
Observe that by (6.58) the expression after the sum is finite and has the form |a| (|b| − |a|) − a (b − a) = |a| |b| − a2 − ab + a2 = = |ab| − ab ≥ 0. t La is increasing, therefore the integral 0 Y− dLa is non-negative. This implies that the sum and the integral in (6.59) are zero. But as the integral is zero the support of the measure generated by La is part of the set {Y (s−) = 0} {|X (s−) − a| = 0} = {X (s−) = a} .
Example 6.73 If L is the local time of a Wiener process and τ b is the first passage time of a point b and 0 ≤ a < b then L (a, τ b ) has an exponential distribution with parameter70 λ (2 (b − a))−1 .
We show that the Laplace transform of the random variable L (a, τ b ) is l (s) E (exp (−s · L (a, τ b ))) = 70 See:
Example 6.71, page 428.
1 . 1 + 2s · (b − a)
(6.60)
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
431
As the Laplace transform of an exponentially distributed random variable is 1 1 + s/λ this implies the statement. 1. The main idea of the proof is to show that X (t)
1 + + s · (w (t) − a) exp (−s · L (a, t)) 2
is a local martingale. As Xτb =
1 + + s · (wτ b − a) exp (−s · Lτ b (a)) 2
(6.61)
is bounded, X τ b is a bounded local martingale. Hence (6.61) is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem as 0 ≤ a < b 1 =E 2
1 + exp (−s · L (a, 0)) = + s · (w (0) − a) 2
1 + + s · (w (τ b ) − a) exp (−s · L (a, τ b )) = 2
=E =
1 + s · (b − a) l (s) , 2
from which (6.60) is trivial. 2. Let us return to process X. Let U (t)
1 + + s · (w (t) − a) , 2
V (t) exp (−sL (a, t)) .
Integrating by parts
t
U dV +
X (t) = U (t) V (t) = X (0) + 0
t
V dU + [U, V ] . 0
U is continuous, V has finite variation so [U, V ] = 0. By the previous theorem the support of the measure generated by V is in {w = a}, so
t
U dV = 0
1 1 + + s · (a − a) (V (t) − V (0)) = (V (t) − 1) . 2 2
432
ˆ FORMULA ITO’s
By Tanaka’s formula 1 U (t) H (t) + s · L (a, t) , 2 where H isa continuous local martingale. V is continuous so it is locally bounded t so Z (t) 0 V dH is a local martingale. On the other hand, by the Fundamental Theorem of Calculus71
t
Vd 0
1 s t s·L = exp (−s · L (a, u)) L (a, du) = 2 2 0 t s exp (−s · L (a, u)) = = 2 −s 0 1 1 = − (exp (−s · L (a, u)) − 1) = − (V (t) − 1) . 2 2
Hence X (t) = X (0) + Z (t) +
1 1 (V (t) − 1) − (V (t) − 1) = 2 2
= X(0) + Z(t),

that is, X is a local martingale.

Theorem 6.74 (Meyer–Itô formula) Let X be a semimartingale and let f be a convex function. If f′ ≜ f′₋ denotes the left derivative of f, µ is the second generalized derivative of f, and L is the local time of X, then

f(X(t)) − f(X(0)) = ∫₀ᵗ f′(X₋) dX +
+ Σ_{0<s≤t} (f(X(s)) − f(X(s−)) − f′(X(s−)) ∆X(s)) + ½ ∫_R L(a, t) dµ(a).   (6.62)
Proof. Recall that the second generalized derivative of |x| is 2δ₀. So if f(x) = |x|, then the theorem reduces to the definition of local times.

1. Let us first assume that the support of µ is compact. In this case the representation (6.53) holds. If f(x) = αx + β then the theorem is trivially true,

71. See: (6.32), page 398. Or, if one likes, by Itô's formula.
433
therefore one can assume that 1 f (x) = 2
R
|x − a| dµ (a) .
With the Dominated Convergence Theorem one can differentiate under the integral sign f (x) f− (x) =
1 2
R
sign (x − a) dµ (a) .
If J (a, t)
(|X (s) − a| − |X (s−) − a| − sign (X (s−) − a) ∆X (s)) ,
0<s≤t
then by the Monotone Convergence Theorem 1 2
=
J (a, t) dµ (a) = R
1 (|X (s) − a| − |X (s−) − a| − sign (X (s−) − a) ∆X (s)) dµ (a) 2 0<s≤t R = (f (X (s)) − f (X (s−)) − f (X (s−)) ∆X (s)) . 0<s≤t
Similarly if H (a, t) |X (t) − a| − |X (0) − a| , then f (X (t)) − f (X (0)) =
1 2
H (a, t) dµ (a) . R
Let Z (a, t)
t
sign (X (s−) − a) dX (s) 0
434
ˆ FORMULA ITO’s
and let us take a B (R) × B (R+ ) × A measurable version of this parametric integral72 . By Fubini’s theorem for stochastic integrals73 1 2
t
Z (a, t) dµ (a) = R
0
=
t
1 2
R
sign (X (s−) − a) dµ (a) dX (s) =
f (X (s−)) dX (s) .
0
By the definition of local times L = H − J − Z, that is 1 1 1 1 H = J + Z + L. 2 2 2 2 Integrating by µ and using the already proved formulas one can easily prove the theorem. 2. Let us take the general case and let x ≤ −n f (−n) + f (−n) (x + n) if f (x) if −n < x < n . fn (x) f (n) + f (n) (x − n) if x≥n fn is also convex. Let µn be the generalized second derivative of fn . Obviously the support of µn is in [−n, n] and the measure µn is finite. Hence we can use the already proved part of the theorem. Let τ n inf {t : |X (t)| ≥ n} , and let us consider the stopped processes X τ m . By the already proved part of the theorem fn (X τ n (t)) − fn (X τ n (0)) = t fn (X τ n (s−)) dX τ n (s) + = 0
+
(∆fn (X τ n (s)) − fn (X τ n (s−)) ∆X τ n (s)) +
0<s≤t
1 + 2
R
Ln (a, t) dµn (a) ,
where obviously Ln (a) denotes the local time of X τ n . Observe that |X τ n | ≤ n on [0, τ n ). Therefore on [0, τ n ) one can write f instead fn . The support of the measure generated by Ln (a) is in the set {X τ n (s−) = a} , that is if |a| ≥ n, then 72 See: 73 See:
Proposition 5.23, page 319. Theorem 5.25, page 322.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
435
Ln (a, t) = 0 for all t. The measure µ and µn are equal on the interval [−n, n] so in the integral containing the local time one can write µ instead of µn . That is R
Ln (a, t) dµn (a) =
R
Ln (a, t) dµ (a) .
From the definition of the local time it is evident that the local time of X τ n is Lτ n . Hence if t ≤ τ n , then Ln (a, t) dµn (a) = Ln (a, t) dµ (a) = Lτ n (a, t) dµ (a) = R
R
R
=
L (a, t) dµ (a) . R
If n → ∞, then τₙ ↗ ∞, and the theorem holds in the general case as well.

Corollary 6.75 (Occupation Times Formula) If X is a semimartingale and L is the local time of X, then for every bounded Borel measurable function g : R → R and for all t, for almost all outcomes,

∫_R L(a, t) g(a) da = ∫₀ᵗ g(X(s−)) d[X^c](s).   (6.63)
The identity is meaningful, and it is also valid, if g is a non-negative Borel measurable function.

Proof. Let f be convex and let f ∈ C². In this case one can use Itô's formula. Comparing Itô's formula with (6.62),

∫_R L(a, t, ω) f″(a) da = ∫₀ᵗ f″(X(s−)) d[X^c].

Of course instead of f″ one can write any non-negative continuous function g. By the Monotone Class Theorem the identity is valid for every bounded Borel measurable function. With the Monotone Convergence Theorem one can extend the identity to non-negative Borel measurable functions.

Let X = w be a Wiener process. In this case [X](s) = s, and by (6.63) for every Borel measurable set B,

∫_B L(a, t) da = ∫₀ᵗ χ(w(s) ∈ B) ds = λ(s ≤ t : w(s) ∈ B).

The last expression gives the time w is in the set B. For fixed t this occupation time is a measure on the real line, and L(a, t) is the Radon–Nikodym derivative
of this occupation time measure. By the interpretation of the density functions, L(a, t) da is the time w spends around a during the time interval [0, t].

Corollary 6.76 If X is a semimartingale and L is the local time of X, then

[X^c](t) = ∫_R L(a, t) da.

Corollary 6.77 (Meyer–Tanaka formula) If X is a continuous semimartingale and L denotes the local time of X, then⁷⁴

|X| = |X(0)| + sign(X) • X + L(0).

By Itô's formula and by the Meyer–Itô formula the class of semimartingales is closed under a quite broad class of transformations. That is why the next example is interesting.

Example 6.78 If X ≠ 0 is a continuous local martingale, X(0) = 0 and 0 < α < 1, then |X|^α is not a semimartingale.
1. The example is a bit surprising because |X| is a semimartingale, and by the Meyer–Itô formula a concave function of a semimartingale is again a semimartingale. But recall that in Theorem 6.65 the domain of definition of f is the whole real line, or at least an open convex set containing the range of X. Now this is not true. Let us also observe that the function |x|^α is not concave on the whole line.

2. Let L be the local time of X. Assume that L(0) ≡ 0. By the Meyer–Tanaka formula

|X| = sign(X) • X + L(0) = sign(X) • X.

On the right-hand side the integral is a local martingale, hence |X| is a non-negative local martingale, so by Fatou's lemma it is a supermartingale⁷⁵. As |X(0)| = 0,

0 = E(|X(0)|) ≥ E(|X(t)|),

which implies that if L(0) ≡ 0 then |X| = 0.

3. Now we prove that if Y ≜ |X|^α is a semimartingale then L(0) ≡ 0. With localization one can assume that X ∈ H₀². The support of L(0) is in {X(s) = 0},

74. Obviously L(0) denotes the process t → L(0, t).
75. See: page 386.
so by the Meyer–Tanaka formula
t
L (0, t) =
χ (X (s) = 0) dL (0, s) =
0
t
1dL = 0
t
t
χ (X (s) = 0) d |X (s)| −
=
χ (X (s) = 0) sign (X (s)) dX (s) .
0
0
Let us first investigate the second integral
t
Z (t)
χ (X (s) = 0) sign (X (s)) dX (s) . 0
By Itˆo’s isometry and by (6.63)
E Z 2 (t) = E
0
=E
χ (X (s) = 0) d [X] (s) =
t
χ ({0}) (a) L (a, t) da R
=E
L (a, t) da
=0
{0}
hence Z = 0. Now let us calculate the first integral
t
χ (X (s) = 0) d |X (s)| . 0
0 < α < 1 so β 1/α > 1. If β ≥ 2 then by Itˆ o’ formula for C 2 functions |X| = Y β = βY β−1 • Y +
β (β − 1) β−2 • [Y ] . Y 2
Using that {X (s) = 0} = {Y (s) = 0}
t
χ (Y (s) = 0) d |X (s)| =
I (t) 0
t
χ (Y (s) = 0) Y β−1 dY +
=β 0
β (β − 1) + 2
t
χ (Y (s) = 0) Y β−2 d [Y ] . 0
The integrand in the first integral is zero, so the integral is zero. If β > 2 then the integrand in the second integral is also zero, so the second integral is zero again. If β = 2, then using (6.63)
t
χ (Y (s) = 0) d [Y ] = 0
L (a, t) χ ({0}) da =
R
L (a, t) da = 0. {0}
438
ˆ FORMULA ITO’s
Let 2 > β > 1. The function g (x)
xβ 0
if x > 0 if x ≤ 0
is a convex function on R. Hence by Itˆo’s formula for convex functions 1 |X| = g (Y ) = Y β = g (Y ) • Y + H (a) dµ (a) , 2 R where H is the local time of Y . In this case again
t
χ (X = 0) g (Y ) dY =
0
t
χ (X = 0) βY β−1 dY = 0. 0
Let us calculate the integral
t
χ (Y (s) = 0) d
H (a, s) dµ (a) .
(6.64)
R
0
µ is defined by the increasing function x βxβ−1 if x > 0 g− (x) h (t) dt, = 0 if x ≤ 0 −∞ where h (x)
β (β − 1) xβ−2 0
H is the local time of Y so H (a, s) dµ (a) = R
0
if x > 0 . if x ≤ 0
∞
s
H (a, s) h (a) da =
h (Y ) d [Y ] , 0
therefore (6.64) is
t
χ (Y (s) = 0) h (Y ) d [Y ] = 0. 0
This means that if Y is a semimartingale then L(0) ≡ 0, hence X = 0.

6.5.4 Local times of continuous semimartingales
Observe that for every a the local time L(a, t, ω) is defined only up to indistinguishability. This means that for every a one can modify L(a, t, ω) on a set with probability zero. The local time is always continuous in the parameter t, so one can think about L as a C([0, ∞))-valued stochastic process: (a, ω) → L(a, ω),
where L(a, ω) denotes the trajectory of L in t. As this function-valued process is defined only almost surely, one can use any of its modifications as local time. In this subsection we prove that, under some restrictions on the semimartingale X, the process L(a, t, ω) has a version which is right-regular in a. To do this we shall use the next result:

Proposition 6.79 (Kolmogorov's criterion) Let I be an interval in R and let X be a Banach space valued stochastic process on I. If for some positive constants a, b and c,

E(‖X(u) − X(v)‖^a) ≤ c|u − v|^{1+b},

then X has a continuous modification.

Proposition 6.80 If X is a continuous local martingale then the local time L(a, t, ω) of X has a modification which is continuous in (a, t).

Proof. One can localize the proposition, since if L is the local time of X and τ is a stopping time, then the local time of X^τ is L^τ. Therefore one can assume that X − X(0) ∈ H₀². By definition,

L(a, t) = |X(t) − a| − |X(0) − a| − ∫₀ᵗ sign(X(s) − a) dX(s).

Let us introduce the notation⁷⁶

M̃(a, u) ≜ ∫₀ᵘ sign(X(s) − a) dX^c(s).
: has a continuous version. We want to apply It is sufficient to show that M Kolmogorov’s criterion. C ([0, t]) is a Banach space for arbitrary fix t. Obviously if a function g : I → C ([0, t]) is continuous then it defines a continuous function over I × [0, t]. We show that for all t 4 : : (b) (a) − M E M
C([0,t])
4 : : E sup M (a, s) − M (b, s) ≤
(6.65)
s≤t
2
≤ k · |a − b| . 76 Of course now instead of X c one can write X. But later we shall re-use this part of the proof in a bit different situation.
By Burkholder's and Jensen's inequalities, using the Occupation Times Formula,
$$E\left(\sup_{s\le t}\left|\widetilde{M}(a,s)-\widetilde{M}(b,s)\right|^{4}\right) \le c\cdot E\left(\left[\widetilde{M}(a)-\widetilde{M}(b)\right]^{2}(t)\right) = \tag{6.66}$$
$$= c\cdot E\left(\left(4\int_0^t \chi(a<X(s)\le b)\,d[X^{c}](s)\right)^{2}\right) = 16c\cdot E\left(\left(\int_a^b L(x,t)\,dx\right)^{2}\right) =$$
$$= 16c\,(b-a)^{2}\,E\left(\left(\frac{1}{b-a}\int_a^b L(x,t)\,dx\right)^{2}\right) \le 16c\,(b-a)^{2}\,E\left(\frac{1}{b-a}\int_a^b L^{2}(x,t)\,dx\right).$$
Changing the integrals by Fubini's theorem one can estimate the last line with the following expression:
$$16c\,(b-a)^{2}\,\sup_x E\left(L^{2}(x,t)\right). \tag{6.67}$$
Using the definition of the local times and the elementary inequalities
$$\bigl||X(t)-a|-|X(0)-a|\bigr| \le |X(t)-X(0)|, \qquad (z_1-z_2)^{2} \le 2\left(z_1^{2}+z_2^{2}\right),$$
$$L^{2}(x,t) \le 2\,(X(t)-X(0))^{2} + 2\left(\int_0^t \operatorname{sign}(X(s)-x)\,dX(s)\right)^{2}.$$
One can estimate the expected value in (6.67) by
$$2\,\|X-X(0)\|_{H^2}^{2} + 2\,\|\operatorname{sign}(X-x)\bullet X\|_{H^2}^{2}.$$
By Itô's isometry
$$\|\operatorname{sign}(X-x)\bullet X\|_{H^2}^{2} = E\left(\int_0^{\infty} 1\,d[X]\right) = \|1\bullet X\|_{H^2}^{2} = \|X-X(0)\|_{H^2}^{2},$$
so the estimate of E(L²(x,t)) is independent of x. So by (6.67) inequality (6.65) follows.

Definition 6.81 If X is a continuous local martingale then L(a,t,ω) denotes the version which is continuous in (a,t).

Corollary 6.82 If X is a continuous local martingale then almost surely for every value of the parameters a and t
$$L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi(a-\varepsilon < X(s) < a+\varepsilon)\,d[X](s). \tag{6.68}$$
Proof. By the occupation times formula, for any interval I,
$$\frac{1}{\lambda(I)}\int_0^t \chi_I(X(s))\,d[X](s) = \frac{1}{\lambda(I)}\int_{\mathbb{R}} L(a,t)\,\chi_I(a)\,da = \frac{1}{\lambda(I)}\int_I L(a,t)\,da.$$
L is continuous in a, hence if a₀ ∈ I and λ(I) → 0 then
$$\frac{1}{\lambda(I)}\int_I L(a,t)\,da \to L(a_0,t),$$
from which (6.68) is evident.

Corollary 6.83 If w is a Wiener process then the occupation time measure µ_t(B) := λ(s ≤ t : w(s) ∈ B) almost surely has a differentiable distribution function, and the derivative of this function is L(a,t).

Definition 6.84 A semimartingale X satisfies the so-called hypothesis A if for every t almost surely
$$\sum_{0<s\le t}|\Delta X(s)| < \infty.$$

Proposition 6.85 If a semimartingale X satisfies hypothesis A then the local time L(a,t,ω) has a B(R) × P-measurable modification which is almost surely continuous in t and right-regular in a.
Proof. If X satisfies hypothesis A then the process ∆X has finite variation. In this case X − ∆X is meaningful and it is a continuous semimartingale. Let J := ∆X. As Y := X − J is a continuous semimartingale it has a unique decomposition M + V, where M is a continuous local martingale and V is a continuous process with finite variation. By the definition of local times
$$|X(t)-a| = |X(0)-a| + \int_0^t \operatorname{sign}(X(s-)-a)\,dX(s) +$$
$$+ \sum_{0<s\le t}\bigl(\Delta|X(s)-a| - \operatorname{sign}(X(s-)-a)\,\Delta X(s)\bigr) + L(a,t).$$
For every s, by the triangle inequality,
$$\bigl|\Delta|X(s)-a|\bigr| \le |\Delta X(s)|. \tag{6.69}$$
Therefore by hypothesis A the sums
$$\sum_{0<s\le t}\operatorname{sign}(X(s-)-a)\,\Delta X(s) \qquad\text{and}\qquad \sum_{0<s\le t}\Delta|X(s)-a|$$
are finite. Hence one can separate the terms in
$$\sum_{0<s\le t}\bigl(\Delta|X(s)-a| - \operatorname{sign}(X(s-)-a)\,\Delta X(s)\bigr). \tag{6.70}$$
For every semimartingale Z let
$$\widetilde{Z}(a,t) := \int_0^t \operatorname{sign}(X(s-)-a)\,dZ(s).$$
Observe that the second term of the sum (6.70) is −J̃(a,t). Using the decomposition X = M + V + J,
$$|X(t)-a| = |X(0)-a| + \widetilde{M}(a,t) + \widetilde{V}(a,t) + \widetilde{J}(a,t) - \widetilde{J}(a,t) + \sum_{0<s\le t}\Delta|X(s)-a| + L(a,t),$$
that is,
$$L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{M}(a,t) - \widetilde{V}(a,t) - \sum_{0<s\le t}\Delta|X(s)-a|. \tag{6.71}$$
By (6.69) and by hypothesis A, ∆|X(s) − a| is continuous in a and it is dominated by an integrable variable with respect to the counting measure. By the Dominated Convergence Theorem
$$\lim_{u\to a}\sum_{0<s\le t}\Delta|X(s)-u| = \sum_{0<s\le t}\Delta|X(s)-a|,$$
so the sum is continuous with respect to a. One should show that the proposition is valid for M̃(a,t) and Ṽ(a,t). V has finite variation on any finite interval, and the bounded function sign(X(s−) − u) is right-regular with respect to u. By the Dominated Convergence Theorem Ṽ is right-regular with respect to a. Finally let us consider M̃. The continuous part of the semimartingale X is X^c = M, so repeating the proof of the previous proposition one can easily prove that M̃(a,t) has a continuous version.

Corollary 6.86 If a semimartingale X satisfies hypothesis A and if M + V is the decomposition of X − ∆X then
$$\Delta L(a,t) := L(a,t) - L(a-,t) = 2\int_0^t \chi(X(s-)=a)\,dV(s) = 2\int_0^t \chi(X(s)=a)\,dV(s).$$

Proof. By the proof of the previous proposition only Ṽ(a,t) is not continuous, so
$$\Delta L(a) = -\Delta\widetilde{V}(a) = -\int_0^t \bigl(\operatorname{sign}(X(s-)-a) - \operatorname{sign}(X(s-)-a-)\bigr)\,dV(s) = 2\int_0^t \chi(X(s-)=a)\,dV(s).$$
V is continuous and X(s−) = X(s) outside a countable number of points s, so
$$2\int_0^t \chi(X(s-)=a)\,dV(s) = 2\int_0^t \chi(X(s)=a)\,dV(s).$$
Example 6.87 Even for continuous semimartingales the local time can be discontinuous.

1. Let w be a Wiener process and let X := |w|. As the support of the measure generated by L(a) is in the set {X = a}, if a < 0 then L(a,t) = 0. Let a = 0. L is right-continuous in the parameter a, therefore using the occupation times formula
$$L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^{\varepsilon} L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{\mathbb{R}}\chi(0\le a<\varepsilon)\,L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi(|w|<\varepsilon)\,d[|w|].$$
By Tanaka's formula |w| = sign(w) • w + L_w(0). L_w(0) is continuous and increasing, so [|w|] = [sign(w) • w] = [w]. Hence, using again that L_w is continuous,
$$L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi(-\varepsilon<w<\varepsilon)\,d[w] = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{-\varepsilon}^{\varepsilon} L_w(a)\,da = 2L_w(0) > 0.$$
This implies that the local time L(a,t) is not left-continuous in the parameter a.

2. On the other hand it is interesting to discuss the case a > 0. Again by the right-continuity
$$L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi(|w|\in[a,a+\varepsilon))(s)\,ds =$$
$$= \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi(w\in[a,a+\varepsilon))(s)\,ds + \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi(-w\in[a,a+\varepsilon))(s)\,ds.$$
The first limit is L_w(a,t) and the second is L_w(−a,t). Hence
$$L(a,t) = L_w(-a,t) + L_w(a,t).$$
This expression is continuous on the set a ≥ 0.
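The two descriptions of the local time used in this example — the Tanaka integral and the occupation-time window — can be estimated on a discretized Wiener path and compared. The sketch below is only illustrative; the step count, the window ε and the tolerance are ad hoc choices, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 200_000, 1.0
dt = t / n
dw = rng.normal(0.0, np.sqrt(dt), n)
w = np.concatenate(([0.0], np.cumsum(dw)))

# Tanaka's formula: L_w(0, t) = |w(t)| - |w(0)| - int_0^t sign(w) dw,
# with the stochastic integral approximated by a discrete sum.
tanaka = abs(w[-1]) - np.sum(np.sign(w[:-1]) * dw)

# Occupation-time approximation (6.68): L_w(0, t) ~ (1/2e) int_0^t chi(|w| < e) d[w],
# using d[w](s) = ds for a Wiener process.
eps = 0.02
occupation = np.sum(np.abs(w[:-1]) < eps) * dt / (2 * eps)

print(tanaka, occupation)  # two estimates of the same random quantity L_w(0, 1)
```

Both quantities estimate the same (random) local time of the simulated path, so they should agree up to discretization error.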
6.5.5 Local time of Wiener processes

In this subsection we shall investigate the local times of Wiener processes.

Definition 6.88 If w is a Wiener process then L denotes the local time of w at the point a = 0, that is, L := L_w(0). We shall very often refer to L as the local time of w.

Example 6.89 Tanaka's formula for Wiener processes.

If w is a Wiener process and L := L_w(0) is the local time of w then by Tanaka's formula
$$|w| = \operatorname{sign}(w)\bullet w + L =: \beta + L. \tag{6.72}$$
sign(w) • w is a continuous local martingale with quadratic variation (sign(w))² • [w] = [w]. By Lévy's characterization theorem⁷⁷ β := sign(w) • w is also a Wiener process. Our goal is to describe the distribution of L. To do this we shall need the next simple lemma:

Lemma 6.90 (Skorohod) If y is a continuous function defined on R₊ and y(0) ≥ 0 then there are functions on R₊, denoted by z and a, for which:
1. z = y + a,
2. z is non-negative,
3. a is increasing, continuous and a(0) = 0, and the support of the measure generated by a is in the set {z = 0}.
The functions a and z are unique and
$$a(t) = \sup_{s\le t} y^{-}(s) := \sup_{s\le t}\max(-y(s),0). \tag{6.73}$$

Proof. First we show that the decomposition is unique. Let (a₁,z₁) and (a₂,z₂) be two decompositions satisfying the conditions of the lemma. Then
$$y = z_1 - a_1 = z_2 - a_2,$$

⁷⁷ See: Theorem 6.13, page 368.
so z₁ − z₂ = a₁ − a₂. As a₁ and a₂ are increasing, z₁ − z₂ and a₁ − a₂ have finite variation. Integrating by parts,
$$0 \le (z_1-z_2)^{2}(t) = 2\int_0^t z_1(s)-z_2(s)\,d(z_1-z_2)(s) = 2\int_0^t z_1(s)-z_2(s)\,d(a_1-a_2)(s).$$
By the assumption about the support of the measures generated by the functions a₁ and a₂, and as z₁ ≥ 0 and z₂ ≥ 0, the last integral is
$$-2\int_0^t z_1(s)\,da_2 - 2\int_0^t z_2\,da_1 \le 0.$$
Hence z₁ = z₂. As a second step we show that a in (6.73) and z := y + a satisfy the conditions of the lemma. a is trivially increasing. By the assumptions y is continuous, hence y⁻ is also continuous. It is easy to show that a is continuous. For every t
$$z(t) := y(t) + a(t) \ge y(t) + y^{-}(t) = y^{+}(t) \ge 0.$$
One should prove that the support of the measure generated by a is in the set {z = 0}, that is,
$$\int_{\mathbb{R}_+}\chi(z>0)\,da = \lim_{n\to\infty}\int_{\mathbb{R}_+}\chi\left(z>\frac{1}{n}\right)da = 0.$$
This means that one should prove that for every ε > 0
$$\int_{\mathbb{R}_+}\chi(z>\varepsilon)\,da = 0.$$
z is continuous, hence for every ε > 0 the set {z > ε} is open, hence {z > ε} is a union of a countable number of open intervals. Let (u,v) be one of these intervals. It is sufficient to prove that a(v) = a(u). If s ∈ (u,v) then
$$-y(s) = a(s) - z(s) \le a(v) - \varepsilon.$$
From this
$$a(v) = \max\left(a(u),\ \sup_{u\le s\le v} y^{-}(s)\right) \le \max\left(a(u),\,a(v)-\varepsilon\right).$$
This can happen only if a(v) ≤ a(u), that is, a(v) = a(u).
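Formula (6.73) translates directly into code: the increasing part a is the running maximum of y⁻, and z = y + a. The following sketch (the random input and array sizes are arbitrary choices) checks the three properties of the lemma on a discretized path:

```python
import numpy as np

def skorohod(y):
    """Skorohod decomposition of a path y with y[0] >= 0:
    a(t) = sup_{s<=t} y^-(s) and z = y + a, cf. (6.73)."""
    a = np.maximum.accumulate(np.maximum(-y, 0.0))
    return y + a, a

rng = np.random.default_rng(1)
y = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, 0.01, 10_000))))
z, a = skorohod(y)

assert z.min() >= 0.0                         # z is non-negative
assert np.all(np.diff(a) >= 0.0)              # a is increasing
grow = np.diff(a) > 0                         # steps where a increases ...
assert z[1:][grow].max(initial=0.0) < 1e-12   # ... happen only where z = 0
```

The last assertion is the discrete analogue of the support condition: whenever a grows, the new value of a is exactly −y at that point, so z vanishes there.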
Proposition 6.91 The distribution of L(t,ω) := L(0,t,ω) is the same as the distribution of the maximum of a Wiener process on the interval [0,t]. Hence the density function of L(t) is
$$f_t(x) = \frac{2}{\sqrt{2\pi t}}\exp\left(-\frac{x^{2}}{2t}\right), \qquad x>0.$$

Proof. By Tanaka's formula |w| = β + L, where β is a Wiener process and the two sides are equal up to indistinguishability. The support of the measure generated by L is in the set {|w| = 0}. Hence by Skorohod's lemma
$$L(t) = \sup_{s\le t}\beta^{-}(s) = \sup_{s\le t}(-\beta(s)) =: S_{-\beta}(t) \quad\text{a.s.}, \tag{6.74}$$
from which, by the symmetry of the Wiener process, the proposition is evident⁷⁸.

Proposition 6.92 The augmented filtration generated by β := sign(w) • w is the same as the augmented filtration generated by |w|.

Proof. Let F^β and F^{|w|} be the augmented filtrations generated by β and by |w|. By (6.74) L is adapted with respect to F^β. By Tanaka's formula |w| is F^β-adapted. Hence F^{|w|} ⊆ F^β. On the other hand, for Wiener processes L(a,t) is almost surely continuous in a, so by (6.68) and by the occupation times formula
$$L(t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{-\varepsilon}^{\varepsilon}L(a,t,\omega)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{\mathbb{R}}L(a,t,\omega)\,\chi((-\varepsilon,\varepsilon))(a)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi(|w(s)|<\varepsilon)\,ds.$$
Hence L is F^{|w|}-adapted. Therefore β is F^{|w|}-adapted, so F^β ⊆ F^{|w|}.

Proposition 6.93 If L(a,∞,ω) denotes the limit lim_{t→∞} L(a,t,ω), then for every a
$$P(L(a,\infty)=\infty) = 1.$$

⁷⁸ See: Example 1.123, page 87 and Proposition B.7, page 564.
Proof. By definition
$$|w(t)-a| = |a| + \beta(t) + L(a,t),$$
where β := sign(w − a) • w. By Lévy's theorem β is a Wiener process. Again by Skorohod's lemma
$$L(a,t) = \sup_{s\le t}\bigl(\beta(s)+|a|\bigr)^{-}.$$
Hence P(L(a,∞) = ∞) = 1. Finally we show that for Wiener processes the support of the measure generated by t → L(t,ω) is not only almost surely in the set Z(ω) := {t : w(t,ω) = 0} but the two sets are almost surely equal.

Proposition 6.94 For almost all outcomes ω the set Z(ω) is closed and has an empty interior.

Proof. The trajectories of Wiener processes are continuous, which immediately implies that Z(ω) is closed. We show that almost surely the Lebesgue measure of Z(ω) is zero. This implies that Z(ω) does not contain a segment with positive length. By Fubini's theorem, using that for every t > 0 the value of a Wiener process has a non-degenerate Gaussian distribution, so that P(w(t) = 0) = 0 for every t > 0,
$$E(\lambda(Z(\omega))) = E\left(\int_0^{\infty}\chi(Z(\omega))(t)\,dt\right) = \int_0^{\infty}E(\chi(Z(\omega))(t))\,dt = 0,$$
hence λ(Z(ω)) = 0 almost surely.

Definition 6.95 If w is a Wiener process then the intervals in the open set Z^c(ω) = {|w(ω)| > 0} are called the excursion intervals of w.
For every t let
$$\sigma_t(\omega) := \inf\{s>0 : L(s)\ge t\}, \qquad \rho_t(\omega) := \inf\{s>0 : L(s)>t\}.$$
σ_t and ρ_t are obviously stopping times. [σ_t, ρ_t] is the largest closed interval where L is constantly t. Let
$$O(\omega) := \bigcup_t \bigl(\sigma_t(\omega), \rho_t(\omega)\bigr).$$
O(ω) is an open set in R, so by the structure of the open sets of the real line O(ω) is the union of at most countably many disjoint intervals. As L is increasing it is easy to see that if t₁ ≠ t₂ then
$$\bigl(\sigma_{t_1}(\omega), \rho_{t_1}(\omega)\bigr) \cap \bigl(\sigma_{t_2}(\omega), \rho_{t_2}(\omega)\bigr) = \emptyset.$$
Hence O(ω) is the union of at most countably many of the intervals (σ_t(ω), ρ_t(ω)). Obviously O(ω) is the union of the at most countably many intervals where L is constant.

Proposition 6.96 If w is a Wiener process and L is the local time of w at zero, then almost surely O(ω) is the union of the excursion intervals of w, that is,
$$O(\omega) \overset{\text{a.s.}}{=} \{|w(\omega)|>0\} = Z^{c}(\omega).$$

Proof. The proof uses several interesting properties of Wiener processes.
1. Observe that with probability one the maxima of a Wiener process β on any two disjoint compact intervals are different: if a < b < c < d < ∞ then, by the definition of the conditional expectation, using the independence of the increments,
$$P\left(\sup_{a\le t\le b}\beta(t) \ne \sup_{c\le t\le d}\beta(t)\right) = P\left(\sup_{a\le t\le b}(\beta(t)-\beta(b)) + \beta(b) \ne \sup_{c\le t\le d}(\beta(t)-\beta(c)) + \beta(c)\right) =$$
$$= P\left(\beta(c)-\beta(b) \ne \sup_{a\le t\le b}(\beta(t)-\beta(b)) - \sup_{c\le t\le d}(\beta(t)-\beta(c))\right) =$$
$$= \int_{\mathbb{R}}\int_{\mathbb{R}} P(\beta(c)-\beta(b) \ne x-y)\,dF(x)\,dG(y) = \int_{\mathbb{R}}\int_{\mathbb{R}} 1\,dF(x)\,dG(y) = 1.$$
Unifying the measure-zero sets one can prove the same result for every interval with rational endpoints.
2. This implies that with probability one every local maximum of a Wiener process has a different value.
3. By Tanaka's formula
$$|w| = L - \beta \tag{6.75}$$
for some Wiener process β. Recall that by Skorohod's lemma⁷⁹ L is the running maximum of β. This and (6.75) imply that L is constant on any interval⁸⁰ where |w| > 0. As with probability one the local maxima of β are different, on the flat segments of L with probability one w is not zero. Hence the excursion intervals of w and the flat parts of L are almost surely equal.

Proposition 6.97 Let w be a Wiener process. For almost all ω the following three sets are equal⁸¹:
1. the set of zeros of w;
2. the complement of O(ω);
3. the support of the measure generated by the local time L(ω).

Proof. Let S(ω) denote the support of the measure generated by L(ω). By definition S(ω) is the complement of the largest open set G(ω) with L(G(ω)) = 0. L is constant on the components of O, so L(O) = 0, that is, O(ω) ⊆ G(ω). Hence
$$S(\omega) := G^{c}(\omega) \subseteq O^{c}(\omega).$$
Let I be an open interval with I ∩ O(ω) ≠ ∅. If s₁ < s₂ are in I then L(s₁,ω) = L(s₂,ω) is impossible, so the measure of I with respect to L(ω) is positive; hence O(ω) is the maximal open set with zero measure, that is, O(ω) = G(ω). Hence the equivalence of the last two sets is evident. By the previous proposition (Z(ω))^c = O(ω) = S^c(ω), so Z(ω) = S(ω).

6.5.6 Ray–Knight theorem

Let b be an arbitrary number and let τ_b be the hitting time of b. On [0,b] one can define the process
$$Z(a,\omega) := L(b-a, \tau_b(\omega), \omega), \qquad a\in[0,b]. \tag{6.76}$$
If a > 0 then Z(a) has an exponential distribution⁸² with parameter λ := 1/(2a). In this subsection we try to find some deep reason for this surprising result. Let us first prove some lemmas.

Lemma 6.98 Let Z := (Z_a) be the filtration generated by (6.76). If ξ ∈ L²(Ω, Z_a, P), then ξ has the following representation:
$$\xi = E(\xi) + \int_0^{\infty} H\cdot\chi(b\ge w > b-a)\,dw. \tag{6.77}$$
In the representation H is a predictable process and
$$E\left(\int_0^{\infty} H^{2}\,\chi(b\ge w>b-a)\,d[w]\right) < \infty.$$

⁷⁹ See: Proposition 6.91, page 447. ⁸⁰ See: Proposition 6.97, page 450. ⁸¹ See: Example 7.43, page 494.
Proof. Let us emphasize that the predictability of H means that H is predictable with respect to the filtration F generated by the underlying Wiener process.
1. Let U be the set of random variables ξ with representation (6.77). χ(b ≥ w > b−a) is a left-regular process, so the processes
$$H\cdot\chi(b\ge w>b-a), \qquad H\in L^{2}(w),$$
form a closed subset of L²(w). From Itô's isometry it is clear that the random variables satisfying (6.77) form a closed subset of L²(Ω, F_∞, P). Obviously Z_a ⊆ F_∞, and so the set of variables with the given property is a closed subspace of L²(Ω, Z_a, P).
2. Let
$$\eta_g := \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right), \qquad g\in C_c^{1}([0,a]),$$
where C_c¹([0,a]) denotes the set of continuously differentiable functions which are zero outside [0,a]. Z is continuous, so the σ-algebra generated by the variables η_g is equal to Z_a. Let
$$U(t) := \exp\left(-\int_0^t g(b-w(s))\,ds\right) =: \exp(-K(t)).$$

⁸² See: Example 6.73, page 430.
g is bounded so U is bounded. By the Occupation Times Formula
$$\eta_g := \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right) = \exp\left(-\int_0^a g(s)\,L(b-s,\tau_b)\,ds\right) = \exp\left(-\int_{b-a}^{b} g(b-v)\,L(v,\tau_b)\,dv\right) =$$
$$= \exp\left(-\int_{\mathbb{R}} g(b-v)\,L(v,\tau_b)\,dv\right) = \exp\left(-\int_0^{\tau_b} g(b-w(v))\,dv\right) = U(\tau_b).$$
Let f ∈ C², M := f(w)exp(−K) =: f(w)·U. K is continuously differentiable, so it has finite variation, so by Itô's formula
$$M - M(0) = f'(w)\,U\bullet w - f(w)\,U\bullet K + \frac{1}{2}\,U f''(w)\bullet[w].$$
Let f be zero on (−∞, b−a], f(b) = 1 and f''(x) = 2g(b−x)f(x). The third integral is
$$\frac{1}{2}\,U f''(w)\bullet[w] = U\,g(b-w)\,f(w)\bullet[w] = U\,f(w)\bullet K,$$
hence the second and the third integrals are the same. Hence M − M(0) = f'(w)U • w. As f'(x) = f'(x)χ(x > b−a),
$$\eta_g = U(\tau_b) = \frac{M(\tau_b)}{f(b)} = M(\tau_b) = M(0) + \int_0^{\tau_b} U(s)\,f'(w(s))\,dw(s) =$$
$$= M(0) + \int_0^{\tau_b} U(s)\,f'(w(s))\,\chi(w(s)>b-a)\,dw(s) =: E(\eta_g) + \int_0^{\tau_b} H\,\chi(w>b-a)\,dw.$$
So for η_g the representation (6.77) is valid. As the variables η_g generate Z_a and the set of variables for which (6.77) is valid is closed, the lemma holds.

Lemma 6.99 If the filtration is given by Z then Z(a) − 2a is a continuous martingale on [0,b].

Proof. Obviously Z(a) − 2a is continuous in a. By Tanaka's formula
$$\bigl(w(t)-(b-a)\bigr)^{+} = \int_0^t \chi(w(s)>b-a)\,dw(s) + \frac{1}{2}L(b-a,t).$$
If t = τ_b, then
$$Z(a) - 2a := L(b-a,\tau_b) - 2a = -2\int_0^{\tau_b}\chi(w(s)>b-a)\,dw(s) = -2\int_0^{\infty}\chi(b\ge w(s)>b-a)\,dw(s).$$
From this Z(a) is integrable and its expected value is 2a. If u < v, then for every Z_u-measurable bounded variable ξ, by the previous lemma and by Itô's isometry,
$$E\bigl((Z(v)-2v)\,\xi\bigr) = -2\,E\left(\int_0^{\infty}\chi(b\ge w>b-v)\,dw\cdot\int_0^{\infty}H\,\chi(b\ge w>b-u)\,dw\right) =$$
$$= -2\,E\left(\int_0^{\infty}\chi(b\ge w(s)>b-v)\,H\,\chi(b\ge w(s)>b-u)\,ds\right) =$$
$$= -2\,E\left(\int_0^{\infty}H\,\chi(b\ge w(s)>b-u)\,ds\right) = E\bigl((Z(u)-2u)\,\xi\bigr).$$
Hence Z(a) − 2a is a martingale.

Lemma 6.100 If X is a continuous local martingale and σ ≥ 0 is a random variable, then the quadratic variation of the stochastic process L_σ(a,ω) := L(a,σ(ω),ω) is finite. If u < v then the quadratic variation of L_σ on the interval [u,v] is
$$[L_\sigma]_u^v \overset{\text{a.s.}}{=} 4\int_u^v L(a,\sigma)\,da.$$

Proof. Of course, by definition, the random variable ξ is the quadratic variation of L_σ on the interval [u,v] if for an arbitrary infinitesimal partition (a_k^{(n)})_{k,n} of [u,v], as n → ∞,
$$\sum_k \left(L_\sigma\bigl(a_k^{(n)}\bigr) - L_\sigma\bigl(a_{k-1}^{(n)}\bigr)\right)^{2} \overset{P}{\to} \xi.$$
1. Let us fix t. Let
$$\widetilde{X}(a) := \int_0^t \operatorname{sign}(X(s)-a)\,dX(s).$$
By the definition of local times
$$L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{X}(a,t).$$
Let us remark that if f is a continuous and g is a Lipschitz continuous function then
$$|[f,g]| \le \limsup_{n\to\infty}\max_k\left|f\bigl(a_k^{(n)}\bigr)-f\bigl(a_{k-1}^{(n)}\bigr)\right|\sum_k\left|g\bigl(a_k^{(n)}\bigr)-g\bigl(a_{k-1}^{(n)}\bigr)\right| \le$$
$$\le \limsup_{n\to\infty}\max_k\left|f\bigl(a_k^{(n)}\bigr)-f\bigl(a_{k-1}^{(n)}\bigr)\right|\cdot K\sum_k\left|a_k^{(n)}-a_{k-1}^{(n)}\right| = 0.$$
The process F_σ(a) := |X(σ)−a| − |X(0)−a| is obviously Lipschitz continuous in the parameter a. X is a continuous local martingale, so X̃ is continuous⁸³ in a, so for every outcome
$$\bigl[F_\sigma, \widetilde{X}_\sigma\bigr] = 0 \qquad\text{and}\qquad [F_\sigma] = 0.$$
Therefore
$$[L_\sigma] = \bigl[F_\sigma - \widetilde{X}_\sigma\bigr] = \bigl[\widetilde{X}_\sigma\bigr].$$
2. By Itô's formula
$$\left(\widetilde{X}\bigl(a_k^{(n)}\bigr) - \widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right)^{2} = 2\left(\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right)\bullet\left(\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right) + \left[\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right].$$
By the Occupation Times Formula, for every t almost surely,
$$\left[\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right] = \left[\bigl(\operatorname{sign}(X-a_k^{(n)})-\operatorname{sign}(X-a_{k-1}^{(n)})\bigr)\bullet X\right] =$$
$$= \left[-2\chi\bigl(a_{k-1}^{(n)}<X\le a_k^{(n)}\bigr)\bullet X\right] = 4\chi\bigl(a_{k-1}^{(n)}<X\le a_k^{(n)}\bigr)\bullet[X] = 4\int_{a_{k-1}^{(n)}}^{a_k^{(n)}}L(a)\,da.$$
Hence almost surely
$$\sum_k\left[\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right](\sigma) = 4\int_u^v L(a,\sigma)\,da = 4\int_u^v L_\sigma(a)\,da.$$

⁸³ See: Proposition 6.80, page 439.
3. Finally we should calculate the limit of the sum of the first terms. The sum of the stochastic integrals is
$$-2\sum_k\left(\widetilde{X}\bigl(a_k^{(n)}\bigr)-\widetilde{X}\bigl(a_{k-1}^{(n)}\bigr)\right)\chi\bigl(a_{k-1}^{(n)}<X\le a_k^{(n)}\bigr)\bullet X.$$
As X̃ is continuous, if n → ∞ the integrand goes to zero. The integrand is locally bounded, so the stochastic integral goes to zero uniformly on compact intervals in probability.

Theorem 6.101 (Ray–Knight) There is a Wiener process β with respect to the filtration Z such that Z(a) := L(b−a, τ_b) satisfies the equation
$$Z(a) - 2a = 2\int_0^a \sqrt{Z}\,d\beta, \qquad a\in[0,b]. \tag{6.78}$$
Proof. L(u,t) is positive for every t > 0, so Z(a) > 0. The quadratic variation of Z(a) − 2a is 4∫₀ᵃ Z(s)ds. By Doob's representation theorem⁸⁴ there is a Wiener process β with respect to the filtration generated by Z for which (6.78) is valid. Z(a) is a continuous semimartingale. By Itô's formula
$$\exp(-sZ(a)) - 1 = \int_0^a \exp(-sZ)\,d(-sZ) + \frac{1}{2}\int_0^a \exp(-sZ)\,d[-sZ].$$
Y(u) := Z(u) − 2u is a martingale and Z ≥ 0, so exp(−sZ) ≤ 1 and
$$E\left(\int_0^a \bigl(\exp(-sZ)\bigr)^{2}\,d[-sZ]\right) \le E\left(\int_0^a d[-sZ]\right) = 4s^{2}\,E\left(\int_0^a Z(s)\,ds\right) = 4s^{2}\int_0^a E(Z(s))\,ds = 8s^{2}\int_0^a s\,ds < \infty.$$
Hence the integral
$$\int_0^a \exp(-sZ(u))\,d\bigl(-s(Z(u)-2u)\bigr)$$

⁸⁴ See: Proposition 6.18, page 373.
is a martingale. Let
$$L(a,s) := E\bigl(\exp(-sZ(a))\bigr).$$
Taking expected values on both sides of Itô's formula and using the martingale property of the above integral,
$$L(a,s) - 1 = E\left(\int_0^a \exp(-sZ(u))\,d(-2su)\right) + \frac{1}{2}\,E\left(\int_0^a \exp(-sZ)\,d[-sZ]\right).$$
Let us calculate the second integral. Using (6.78),
$$\frac{1}{2}\,E\left(\int_0^a \exp(-sZ(u))\,d[-sZ]\right) = 2s^{2}\,E\left(\int_0^a \exp(-sZ(u))\,Z(u)\,du\right) =$$
$$= 2s^{2}\int_0^a E\bigl(\exp(-sZ(u))\,Z(u)\bigr)\,du = -2s^{2}\int_0^a E\left(\frac{d}{ds}\exp(-sZ(u))\right)du.$$
Changing the expected value and differentiating by a,
$$\frac{\partial L}{\partial a} = -2s\,L(a,s) - 2s^{2}\,E\left(\frac{d}{ds}\exp(-sZ(a))\right).$$
For Laplace transforms one can interchange the differentiation and the integration, so
$$\frac{\partial L}{\partial a} = -2s\,L(a,s) - 2s^{2}\,\frac{\partial L}{\partial s}, \qquad L(a,0)=1.$$
With a direct calculation one can easily verify that
$$L(a,s) = \frac{1}{1+2sa}$$
satisfies the equation. The Laplace transform L(a,s) is necessarily analytic, so by the Cauchy–Kovalevskaya theorem 1/(1+2sa) is the unique solution of the equation. This implies that Z(a) has an exponential distribution with parameter λ = 1/(2a).
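The "direct calculation" is worth writing out. Note that for the stated solution the sign of the 2s²-term in the equation must be negative, consistent with the derivation of ∂L/∂a above:

```latex
% Check that L(a,s) = 1/(1+2sa) solves dL/da = -2sL - 2s^2 dL/ds, L(a,0)=1:
\[
\frac{\partial L}{\partial a} = \frac{-2s}{(1+2sa)^2}, \qquad
\frac{\partial L}{\partial s} = \frac{-2a}{(1+2sa)^2},
\]
\[
-2sL - 2s^2\frac{\partial L}{\partial s}
  = \frac{-2s(1+2sa) + 4as^2}{(1+2sa)^2}
  = \frac{-2s}{(1+2sa)^2}
  = \frac{\partial L}{\partial a},
\qquad L(a,0) = 1.
\]
% Moreover 1/(1+2sa) is exactly the Laplace transform of the exponential
% distribution with parameter 1/(2a):
\[
\int_0^\infty e^{-sx}\,\frac{1}{2a}\,e^{-x/(2a)}\,dx
  = \frac{1}{2a}\cdot\frac{1}{s+1/(2a)}
  = \frac{1}{1+2sa}.
\]
```

The second identity is what identifies the law of Z(a) as exponential with mean 2a once uniqueness of the analytic solution is known.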
6.5.7 Theorem of Dvoretzky, Erdős and Kakutani
First let us introduce some definitions:

Definition 6.102 Let f be a real valued function on an interval I ⊆ R.
1. We say that t is a point of increase of f if there is a δ > 0 such that f(s) ≤ f(t) ≤ f(u) whenever s,u ∈ I ∩ (t−δ, t+δ) and s < t < u.
2. We say that t is a point of strict increase of f if there is a δ > 0 such that f(s) < f(t) < f(u) whenever s,u ∈ I ∩ (t−δ, t+δ) and s < t < u.

A striking feature of Wiener processes is the following observation:

Theorem 6.103 (Dvoretzky–Erdős–Kakutani) Almost surely the trajectories of Wiener processes do not have a point of increase.

Proof. Let w be a Wiener process.
1. One should show that
$$P(\{\omega : w(\omega)\ \text{has a point of increase}\}) = 0.$$
Obviously it is sufficient to prove that for an arbitrary v > 0
$$P(\{\omega : w(\omega)\ \text{has a point of increase in}\ [0,v]\}) = 0.$$
By Girsanov's theorem there is a probability measure Q ∼ P on (Ω, F_v) such that w̃(t) := w(t) + t is a Wiener process on [0,v] under Q. Every point of increase of w is a point of strict increase of w̃. Therefore it is sufficient to prove that
$$P(\{\omega : w(\omega)\ \text{has a point of strict increase in}\ [0,v]\}) = 0.$$
Of course this is the same as
$$P(\{\omega : w(\omega)\ \text{has a point of strict increase}\}) = 0.$$
To prove this it is sufficient to show that P(Ω_{p,q}) = 0 for all rational numbers p and q, where
$$\Omega_{p,q} := \bigl\{\omega : \exists t\ \text{such that}\ w(s,\omega) < w(t,\omega) < w(u,\omega)\ \text{for every}\ s,u\in(p,q),\ s<t<u\bigr\}.$$
Using the strong Markov property of w one can assume that p = 0.
2. Let L be the local time of w. We show that for every b almost surely
$$Z(a) := L(b-a,\tau_b(\omega),\omega) > 0, \qquad \forall a\in(0,b].$$
As we know⁸⁵, if a > 0 then Z(a) has an exponential distribution, so it is almost surely positive for every fixed a ∈ (0,b]. Z(a) is continuous, so if Ω_n is the set of outcomes ω for which Z(a,ω) ≥ 1/n for every rational a, then Z(a,ω) ≥ 1/n for every a ∈ (0,b]. If Ω' := ∪_n Ω_n then P(Ω') = 1, and if ω ∈ Ω' then Z(a,ω) > 0 for every a ∈ (0,b].
3. Now it is obvious that there is an Ω* with P(Ω*) = 1 such that whenever ω ∈ Ω*:
a. L(a,t,ω) is continuous in (a,t);
b. the support of L(a,ω) is {w(ω) = a} for every rational number a;
c. Z(a) := L(b−a, τ_b(ω), ω) > 0 whenever 0 < a ≤ b, for every rational number b.
4. Let ω ∈ Ω* and let ω ∈ Ω_{p,q} = Ω_{0,q}. This means that for some t
$$w(s,\omega) < w(t,\omega) < w(u,\omega), \qquad 0\le s<t<u\le q. \tag{6.79}$$
Let us fix a rational number w(t,ω) < b < w(q,ω). Let (b_n) be a sequence of rational numbers for which b_n ↗ w(t,ω). As w(t,ω) < b and b is rational, by c.
$$L(w(t,\omega),\tau_b(\omega),\omega) = L(b-(b-w(t,\omega)),\tau_b(\omega),\omega) > 0.$$
L is continuous, so the measure of every single point is zero, so by b. obviously L(b_n, τ_{b_n}, ω) = 0. So
$$L(w(t,\omega),\tau_b(\omega),\omega) = L(w(t,\omega),\tau_b(\omega),\omega) - L(b_n,\tau_b(\omega),\omega) +$$
$$+ L(b_n,\tau_b(\omega),\omega) - L(b_n,t,\omega) + L(b_n,t,\omega) - L(b_n,\tau_{b_n},\omega).$$
By the construction, as t is a point of increase,
$$b_n < w(t,\omega) < w(a,\omega) < b, \qquad a\in(t,\tau_b).$$
By b. the support of the measure generated by L(b_n,ω) is {w(ω) = b_n}. Hence the second line in the above estimation is zero. t is a point of increase, so by (6.79) if n → ∞ then τ_{b_n} → t. Therefore, using a.,
$$0 < L(w(t,\omega),\tau_b(\omega),\omega) = L(w(t,\omega),\tau_b(\omega),\omega) - L(w(t,\omega),\tau_b(\omega),\omega) + L(w(t,\omega),t,\omega) - L(w(t,\omega),t,\omega) = 0.$$
This is a contradiction, so if ω ∈ Ω* then ω ∉ Ω_{p,q}. Hence P(Ω_{p,q}) = 0.

⁸⁵ See: Example 6.73, page 430.
7 PROCESSES WITH INDEPENDENT INCREMENTS

In this chapter we discuss the classical theory of processes with independent increments. In the first section we return to the theory of Lévy processes. The increments of Lévy processes are not only independent but they are also stationary. Lévy processes are semimartingales, but the same is not true for all processes with independent increments. In the second part of the chapter we show the generalization of the Lévy–Khintchine formula to processes with just independent increments. The main difference between the theory of Lévy processes and the more general theory of processes with independent increments is that every Lévy process is continuous in probability. This property does not hold for the more general class. This implies that processes with independent increments can have jumps with positive probability.

7.1 Lévy processes

In this section we briefly return to the theory of Lévy processes. The theory of Lévy processes is much simpler than the more general theory of processes with independent increments. Recall that Lévy processes have stationary and independent increments. The main consequence of these assumptions is that if ϕ_t(u) denotes the Fourier transform of X(t) then for every u
$$\varphi_{t+s}(u) = \varphi_t(u)\,\varphi_s(u), \tag{7.1}$$
so ϕ_t(u) for every u satisfies Cauchy's functional equation¹. As the Fourier transforms of distributions are always bounded, the solutions of equation (7.1) have the form
$$\varphi_t(u) = \exp(t\phi(u)) \tag{7.2}$$

¹ See: line (1.40), page 62.
for some φ. One of our main goals is to find the proper form² of φ(u). Representation (7.2) has two very important consequences:
1. ϕ_t(u) ≠ 0 for every u and t,
2. ϕ_t(u) is continuous in t.
As ϕ_t is continuous in t, if t_n ↗ t, then ϕ_{t_n}(u) → ϕ_t(u) for every u. Hence X(t_n) − X(t) → 0 weakly, that is, X(t_n) − X(t) → 0 in probability. Hence for some subsequence X(t_{n_k}) → X(t) almost surely. Therefore X(t−) = X(t) almost surely. Hence if X is a Lévy process then it is continuous in probability and, as a consequence of this continuity, for every moment of time t the probability of a jump at t is zero, that is, P(∆X(t) ≠ 0) = 0 for every t. As ϕ_t(u) ≠ 0 for every u one can define the exponential martingale
$$Z_t(u,\omega) := \frac{\exp(iuX(t,\omega))}{\varphi_t(u)}. \tag{7.3}$$
Recall that, applying the Optional Sampling Theorem to (7.3), one can prove that every Lévy process is a strong Markov process³.

7.1.1 Poisson processes
Let us recall that a Lévy process X is a Poisson process if its trajectories are increasing and the image of the trajectories is almost surely the set of integers {0,1,2,...}. One should emphasize that all the non-negative integers have to be in the image of the trajectories, so Poisson processes do not have jumps which are larger than one. To put it another way: Poisson processes are the Lévy-type counting processes.

Definition 7.1 A process is a counting process if its image space is the set of integers {0,1,...}. X is a Poisson process with respect to a filtration F if it is a counting Lévy process with respect to the filtration F.

Since the values of the process are integers and as the trajectories are right-regular, there is always a positive amount of time between the jumps. That is, if X(t,ω) = k then X(t+u,ω) = k whenever 0 ≤ u ≤ δ for some δ(t,ω) > 0. As the trajectories are defined for every t ≥ 0 and the values of the trajectories are finite at every t, the jumps of the process cannot accumulate. Let
$$\tau_1(\omega) := \inf\{t : X(t,\omega)=1\} = \inf\{t : X(t,\omega)>0\} < \infty.$$
is the famous L´evy–Khintchine formula. Proposition 1.109, page 70.
462
PROCESSES WITH INDEPENDENT INCREMENTS
τ 1 is obviously a stopping time. We show that τ 1 is exponentially distributed: if u, v ≥ 0 then P (τ 1 > u + v) = P (X (u + v) = 0) = = P (X (u) = 0, X (u + v) − X (u) = 0) = = P (X (u) = 0) · P (X (u + v) − X (u) = 0) = = P (X (u) = 0) · P (X (v) = 0) , hence if f (t) P (τ 1 > t) then f (u + v) = f (u) · f (v) ,
u, v ≥ 0.
f ≡ 0 and f ≡ 1 cannot be solutions as X cannot be a non-trivial L´evy process4 , so for some 0 < λ < ∞ P (τ 1 > t) = P (X (t) = 0) = exp (−λt) . By the strong Markov property of L´evy processes5 the distribution of X1∗ (t) X (τ 1 + t) − X (τ 1 ) is the same as the distribution of X (t) so if τ 2 (ω) inf {t: X (t + τ 1 (ω) , ω) = 2} = inf {t: X1∗ (t, ω) > 0} < ∞ then τ 1 and τ 2 are independent and they have the same distribution6 . Proposition 7.2 If λ denotes the common parameter, then for every t ≥ 0 n+1 n n (λt) P exp (−λt) . τk > t ≥ τ k = P (X (t) = n) = n! k=1
k=1
Proof. Recall that a non-negative variable has gamma distribution Γ (a, λ) if the density function of the distribution is fa,λ (x)
λa a−1 x exp (−λx) , Γ (a)
x > 0.
random First we show that if ξ i are independent n nvariables with distribution Γ (ai , λ) , then the distribution of i=1 ξ i is Γ ( i=1 ai , λ). It is sufficient to 4 If f ≡ 1 then τ = ∞, hence X ≡ 0 and the image of trajectories is {0} only and not the 1 set of integers. 5 See: Proposition 1.109, page 70. 6 Let us recall that τ ∗ 1 is Fτ 1 -measurable and by the strong Markov property X1 is independent of Fτ 1 . See Proposition 1.109, page 70.
´ LEVY PROCESSES
463
show the calculation for two variables. If the distribution of ξ 1 is Γ(a, λ), and the distribution of ξ 2 is Γ(b, λ), and if they are independent, then the density function of ξ 1 + ξ 2 is the convolution of the density functions of ξ 1 and ξ 2 h (x)
∞
−∞
x
= 0
=
fa,λ (x − t) fb,λ (t) dt = a−1
λa (x − t) Γ (a)
exp (−λ (x − t))
λa+b exp (−λx) Γ (a) Γ (b)
λa+b exp (−λx) = Γ (a) Γ (b) =
x
λb tb−1 exp (−λt) dt = Γ (b)
a−1 b−1
(x − t)
t
dt =
0
1
a−1
(x − xz)
b−1
(xz)
xdz =
0
λa+b exp (−λx) xa+b−1 Γ (a) Γ (b)
1
a−1
(1 − z)
z b−1 dz =
0
a+b
=
λ exp (−λx) xa+b−1 . Γ (a + b)
Hence the distribution of ξ 1 + ξ 2 is Γ (a + b, λ). The density function of Γ (1, λ) is λ1 1−1 x exp (−λx) = λ exp (−λx) , Γ (1)
x > 0,
so Γ (1, λ) is the exponential distribution with parameter λ. If σ m then σ m has gamma distribution Γ (m, λ) .
m k=1
P (X (t) < n + 1) =
∞
λn+1 xn exp (−λx) dx = Γ (n + 1) t ∞ ∞ n λn xn−1 (λx) exp (−λx) exp (−λx) dx = + n = − Γ (n + 1) Γ (n + 1) t t = P (σ n+1 > t) =
n
=
(λt) exp (−λt) + P (X (t) < n) . n!
Hence n
P (X (t) = n) = P (X (t) < n + 1) − P (X (t) < n) =
(λt) exp (−λt) . n!
τk
464
PROCESSES WITH INDEPENDENT INCREMENTS
7.1.2 Compound Poisson processes generated by the jumps
Let X now be a Lévy process and let Λ be a Borel measurable set. Define

τ_1(ω) := inf{t : ΔX(t, ω) ∈ Λ}.

Since (Ω, A, P, F) satisfies the usual conditions, τ_1 is a stopping time⁷. As τ_1 is measurable,

P(τ_1 > t) = P(ΔX(u) ∉ Λ, ∀u ∈ [0, t])

is meaningful. Assume that the closure of Λ, denoted by cl(Λ), does not contain the point 0, that is, Λ is in the complement of a ball with some positive radius r > 0. As X is right-continuous and X(0) = 0, obviously 0 < τ_1 ≤ ∞. In a similar way as in the previous subsection, using that the jumps in Λ cannot accumulate⁸,

P(τ_1 > t_1 + t_2) =
 = P(ΔX(u) ∉ Λ, u ∈ (0, t_1 + t_2]) =
 = P(ΔX(u) ∉ Λ, u ∈ (0, t_1]) · P(ΔX(u) ∉ Λ, u ∈ (t_1, t_1 + t_2]) =
 = P(ΔX(u) ∉ Λ, u ∈ (0, t_1]) · P(ΔX(u) ∉ Λ, u ∈ (0, t_2]) =
 = P(τ_1 > t_1) · P(τ_1 > t_2).

So τ_1 has an exponential distribution. Let us observe that now we cannot guarantee that λ > 0, as τ_1 ≡ ∞ is possible. Let us assume that τ_1 < ∞. Let X*(t) := X(τ_1 + t) − X(τ_1) and let

τ_2 := inf{t : ΔX*(t) ∈ Λ},

etc. If τ_1 < ∞ then τ_k < ∞ for all k. Let σ_n := Σ_{k=1}^n τ_k. As 0 ∉ cl(Λ) and as X has limits from the left, the almost surely⁹ strictly increasing sequence (σ_n) almost surely cannot have a finite accumulation point. So almost surely σ_n ↗ ∞. As on every trajectory the number of jumps is at most countable, one can define the

⁷ See: Corollary 1.29, page 16, Example 1.32, page 17.
⁸ As 0 ∉ cl(Λ) all the jumps are larger than some r > 0. τ_1 is a stopping time so the sets below are measurable.
⁹ The trajectories of a Poisson process are just almost surely nice. For example, with probability zero N(ω) ≡ 0 is possible.
process N^Λ which counts the jumps of X with ΔX ∈ Λ:

N^Λ(t) := Σ_{0<s≤t} χ_Λ(ΔX(s)) = Σ_{n=1}^∞ χ{σ_n ≤ t}.   (7.4)
N^Λ(t) − N^Λ(s) is the number of jumps in Λ during the time interval (s, t], so it is evidently measurable with respect to the σ-algebra generated by the increments of X. Hence¹⁰ N^Λ(t) − N^Λ(s) is independent of the σ-algebra F_s. So N^Λ has independent increments. It is also easy to prove that the distribution of N^Λ(t) − N^Λ(s) is the same as the distribution of N^Λ(t − s). It is trivial from the definition that N^Λ is a right-regular counting process. Hence N^Λ is a counting Lévy process. Therefore we have proved the following:

Lemma 7.3 If 0 ∉ cl(Λ) then N^Λ is a Poisson process.

Definition 7.4 A stopping time σ is a jump time of a process X if ΔX(σ) ≠ 0 almost surely.

Example 7.5 The jump times of Lévy processes are totally inaccessible.
Let τ be a predictable stopping time and let P(ΔX(τ) ≠ 0) > 0. We can assume that P(|ΔX(τ)| ≥ ε) > 0 for some ε > 0. If Λ := {|x| ≥ ε} and if (σ_n) are the stopping times of the Poisson process N^Λ, then P(σ_n = τ) > 0 for some n. But this is impossible as σ_n is totally inaccessible¹¹ for every n. Therefore if τ is predictable then P(ΔX(τ) ≠ 0) = 0.

With N^Λ one can define the process

J^Λ(t, ω) := Σ_{0<s≤t} ΔX(s, ω) χ_Λ(ΔX(s, ω)) =   (7.5)
 = Σ_{n=1}^{N^Λ(t)} ΔX(σ_n) = Σ_{n=1}^∞ ΔX(σ_n) χ{σ_n ≤ t}.
Lemma 7.6 If 0 ∉ cl(Λ) then J^Λ is a compound Poisson process, that is:
1. J^Λ(0) = 0.
2. J^Λ has countably many jumps.
3. After every jump J^Λ has an exponentially distributed waiting time. After this waiting time J^Λ jumps again. The times between the jumps are independent and they have the same distribution.

¹⁰ See: Proposition 1.97, page 61.
¹¹ See: Example 3.7, page 183.
4. The sizes of the jumps are independent of the waiting times up to the jumps.
5. The sizes of the jumps have the same distribution and they are independent random variables.

Proof. If η_n := ΔX(σ_n), then by the strong Markov property the variables (η_n) are independent and they have the same distribution. One need only prove that (σ_n) and (η_n) are independent. Let τ_n := σ_n − σ_{n−1}.

1. If s > t, then

{η_1 < a, σ_1 > s} = {σ_1 > t} ∩ {η_1^{(t)} < a, σ_1^{(t)} > s − t},

where η_1^{(t)} and σ_1^{(t)} are the size and the time of the first jump of X*(u) := X(u + t) − X(t). As σ_1 is a stopping time, {σ_1 > t} ∈ F_t. Hence by the strong Markov property {σ_1 > t} is independent of {η_1^{(t)} < a, σ_1^{(t)} > s − t}. Hence again by the strong Markov property

P(η_1 < a, σ_1 > s) = P({σ_1 > t} ∩ {η_1^{(t)} < a, σ_1^{(t)} > s − t}) =
 = P(σ_1 > t) P(η_1^{(t)} < a, σ_1^{(t)} > s − t) =
 = P(σ_1 > t) P(η_1 < a, σ_1 > s − t).

If s = t then, using that 0 ∉ cl(Λ) and therefore P(σ_1 > 0) = 1,

P(η_1 < a, σ_1 > t) = P(σ_1 > t) P(η_1 < a, σ_1 > 0) =
 = P(σ_1 > t) · P(η_1 < a).

Hence σ_1 = τ_1 and η_1 are independent. In a similar way, using the strong Markov property again, one can prove that τ_n is independent of η_n for every n.

3. By the strong Markov property (η_N, τ_N) is independent of F_{σ_{N−1}}. Hence

E(exp(i Σ_{m=1}^N u_m η_m + i Σ_{n=1}^N v_n τ_n)) =
 = E(E(exp(i Σ_{m=1}^N u_m η_m + i Σ_{n=1}^N v_n τ_n) | F_{σ_{N−1}})) =
 = E(exp(i Σ_{m=1}^{N−1} u_m η_m + i Σ_{n=1}^{N−1} v_n τ_n) E(exp(iu_N η_N + iv_N τ_N) | F_{σ_{N−1}})) =
 = E(exp(i Σ_{m=1}^{N−1} u_m η_m + i Σ_{n=1}^{N−1} v_n τ_n)) · E(exp(iu_N η_N + iv_N τ_N)) =
 = E(exp(i Σ_{m=1}^{N−1} u_m η_m + i Σ_{n=1}^{N−1} v_n τ_n)) · E(exp(iu_N η_N)) · E(exp(iv_N τ_N)) =
 = ··· = Π_{m=1}^N E(exp(iu_m η_m)) · Π_{m=1}^N E(exp(iv_m τ_m)).
This implies¹² that the σ-algebras generated by (η_m) and (τ_n) are independent. Hence (η_m) and (σ_n) are also independent.

Lemma 7.7 The Fourier transform of J^Λ(s) is

E(exp(iu · J^Λ(s))) = exp(λs ∫_R (exp(iux) − 1) dF(x)),

where λ is the parameter of the Poisson part and F is the common distribution function of the jumps.

Proof. Let G be the distribution function of N^Λ(s).

φ(u) := E(exp(iu · Σ_{k=1}^{N^Λ(s)} ΔX(σ_k))) =
 = ∫_R E(exp(iu · Σ_{k=1}^{N^Λ(s)} ΔX(σ_k)) | N^Λ(s) = n) dG(n).

N^Λ(s) has a Poisson distribution. As N^Λ(s) and the variables (ΔX(σ_k)) are independent, one can substitute and drop the condition N^Λ(s) = n:

φ(u) = Σ_{n=0}^∞ E(exp(iu · Σ_{k=1}^n ΔX(σ_k))) ((λs)^n/n!) exp(−λs) =
 = Σ_{n=0}^∞ (∫_R exp(iux) dF(x))^n ((λs)^n/n!) exp(−λs) =
 = exp(λs ∫_R (exp(iux) − 1) dF(x)).

¹² See: Lemma 1.96, page 60.
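Lemma 7.7 can be checked numerically: simulate J^Λ(s) as a Poisson number of i.i.d. jumps and compare the empirical characteristic function with exp(λs ∫(e^{iux} − 1) dF(x)). The normal jump law and all parameter values below are our own illustrative choices:

```python
import cmath
import random

rng = random.Random(1)
lam, s, u = 3.0, 1.0, 0.7
mu, sd = 0.5, 1.2            # jump sizes ~ N(mu, sd^2): our choice of F

def sample_J(rng):
    """One realization of the compound Poisson sum J(s)."""
    time, total = 0.0, 0.0
    while True:
        time += rng.expovariate(lam)   # exponential waiting times
        if time > s:
            return total
        total += rng.gauss(mu, sd)

trials = 100_000
emp = sum(cmath.exp(1j * u * sample_J(rng)) for _ in range(trials)) / trials

# Lemma 7.7: E exp(iu J(s)) = exp(lam*s*(phi_F(u) - 1)), phi_F the cf of F
phi_F = cmath.exp(1j * u * mu - (u * sd) ** 2 / 2)
theory = cmath.exp(lam * s * (phi_F - 1))
print(abs(emp - theory))
```

The printed discrepancy is of the order of the Monte Carlo error, a few thousandths for this sample size.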
Lemma 7.8 If X is a Lévy process with respect to some filtration F and 0 ∉ cl(Λ), then J^Λ and X − J^Λ are also Lévy processes with respect to F.

Proof. First recall¹³ that if X is a Lévy process then the σ-algebra G_t generated by the increments

X(u) − X(v),   u ≥ v ≥ t,

is independent of F_t for all t. Observe that for all t the increments of J^Λ and X − J^Λ of this type are G_t-measurable. So these processes have independent increments with respect to F. From the strong Markov property it is clear that the increments of these processes are stationary. As J^Λ obviously has right-regular trajectories, the processes in the lemma are Lévy processes as well.

Lemma 7.9 If X is a Lévy process, Λ is a Borel measurable set and 0 ∉ cl(Λ),
then the variables J^Λ(t) and (X − J^Λ)(t) are independent for every t ≥ 0.

Proof. Let us fix a t. To prove the independence of the variables J^Λ(t) and X(t) − J^Λ(t) it is sufficient to prove¹⁴ that

φ(u, v) := E(exp(i[u · J^Λ(t) + v · (X(t) − J^Λ(t))])) =
 = E(exp(iu · J^Λ(t))) · E(exp(iv · (X(t) − J^Λ(t)))).   (7.6)

Let us emphasize that as 0 ∉ cl(Λ), on every finite interval the number of jumps in Λ is finite, so J^Λ has trajectories with finite variation. That is, J^Λ ∈ V. Let

M(s, ω, u) := exp(iu · J^Λ(s, ω)) / E(exp(iu · J^Λ(s, ω))),
N(s, ω, v) := exp(iv · [X(s, ω) − J^Λ(s, ω)]) / E(exp(iv · [X(s, ω) − J^Λ(s, ω)]))

be the exponential martingales of J^Λ and X − J^Λ. The Fourier transforms in the denominators are never zero and they are continuous, hence the expressions are meaningful and the jumps of these processes are the jumps of the numerators. Integrating by parts,

M(t)N(t) − M(0)N(0) = ∫_0^t M_− dN + ∫_0^t N_− dM + [M, N](t).

¹³ See: Proposition 1.97, page 61.
¹⁴ See: Lemma 1.96, page 60.
The Fourier transforms in the denominators are never zero and they are continuous, so their absolute values have a positive minimum on the compact interval [0, t]. The numerators are bounded, so the integrators are bounded on any finite interval. Hence the stochastic integrals above are real martingales¹⁵. So their expected value is zero. We show that [M, N] = 0. As J^Λ(t) has a compound Poisson distribution one can explicitly write down its Fourier transform:

E(exp(iu · J^Λ(s))) = exp(λs ∫_R (exp(iux) − 1) dF(x)) := exp(s · φ(u)).

As J^Λ ∈ V obviously M ∈ V. So M is purely discontinuous. Hence¹⁶

[M, N] = Σ ΔM ΔN.

J^Λ and X − J^Λ do not have common jumps, therefore

[M, N](t) = Σ_{0<s≤t} ΔM(s) ΔN(s) = 0.

Hence E(M(t)N(t)) = E(M(0)N(0)) = 1, from which (7.6) trivially holds.

If N_1 and N_2 are Poisson processes and N_1 and N_2 do not have common jumps, then

[N_1, N_2] = Σ ΔN_1 ΔN_2 = 0.

Using this one can prove in a similar way as above the following observation:

Lemma 7.10 If N_1 and N_2 are Poisson processes with respect to some filtration F and N_1 and N_2 do not have common jumps almost surely, then N_1(t) and N_2(t) are independent for every t.

Proposition 7.11 If (N_i) are finitely many Poisson processes with respect to some filtration, then they do not have common jumps almost surely if and only if the variables (N_i(t)) are independent¹⁷ for every t.

¹⁵ See: Proposition 2.24, page 128.
¹⁶ See: Corollary 4.34, page 245.
¹⁷ See: Example 2.29, page 130.
Proof. If the values of the Poisson processes are independent, then the same is true for the compensated Poisson processes. By the independence, on every finite time interval the compensated Poisson processes are orthogonal in the Hilbert space H_0². Hence they are orthogonal as local martingales¹⁸. Therefore their quadratic co-variation is a uniformly integrable martingale¹⁹. This implies that the expected value of the quadratic co-variation

[N_1, N_2] = Σ ΔN_1 ΔN_2

is zero. As ΔN_1 ΔN_2 ≥ 0, the quadratic co-variation is almost surely zero. Hence the two processes do not have common jumps almost surely. The proof of the other part of the proposition is clear from the previous lemma.

Theorem 7.12 (Decomposition of Lévy processes) If X is a Lévy process, Λ is a Borel measurable set and 0 ∉ cl(Λ), then J^Λ and X − J^Λ are independent Lévy processes.

Proof. Recall that by definition two processes are independent if they are independent as sets of random variables. As we proved²⁰, J^Λ(t) and (X − J^Λ)(t) are independent for every t. From the Markov property it is clear that if h > 0 then the increments

J^Λ(t + h) − J^Λ(t) and (X − J^Λ)(t + h) − (X − J^Λ)(t)

are also independent. Let (t_k) be a time sequence. Let (α_k) denote the corresponding increments of J^Λ and let (β_k) denote the corresponding increments of X − J^Λ. Let G_t be the σ-algebra generated by the increments of X after t. Observe that α_k and β_k are G_{t_k}-measurable. Hence the linear combination u_k α_k + v_k β_k is also G_{t_k}-measurable. So u_k α_k + v_k β_k is independent²¹ of F_{t_k}. Using these one can easily decompose the joint Fourier transform:

φ(u, v) := E(exp(Σ_{k=1}^n (iu_k α_k + iv_k β_k))) = E(exp(Σ_{k=1}^n i(u_k α_k + v_k β_k))) =
 = E(E(exp(Σ_{k=1}^n i(u_k α_k + v_k β_k)) | F_{t_{n−1}})) =
 = E(exp(Σ_{k=1}^{n−1} i(u_k α_k + v_k β_k)) E(exp(i(u_n α_n + v_n β_n)))) =
 = ··· = Π_{k=1}^n E(exp(i(u_k α_k + v_k β_k))) =
 = Π_{k=1}^n (E(exp(iu_k α_k)) · E(exp(iv_k β_k))) = φ_1(u) · φ_2(v).

¹⁸ See: Proposition 4.15, page 230.
¹⁹ See: Proposition 2.84, page 170.
²⁰ See: Lemma 7.9, page 468.
²¹ See: Proposition 1.97, page 61.
This means that the sets of variables (α_k) and (β_k) are independent. Hence the σ-algebras generated by the increments, that is, by the processes, are independent. Therefore the processes X − J^Λ and J^Λ are independent.

With nearly the same method one can prove the following proposition.

Proposition 7.13 If (N_i) are finitely many Poisson processes with respect to some common filtration, then they do not have common jumps almost surely if and only if the processes are independent.

Proof. Let F be the common filtration of N_1 and N_2 and let U and V be the exponential martingales of N_1 and N_2. As N_1 and N_2 do not have common jumps, the quadratic co-variation of U and V is zero. Hence they are orthogonal. That is, UV is a local martingale with respect to F. On every finite interval U, V ∈ H², therefore

|UV(t)| ≤ sup_s |U(s)| · sup_s |V(s)| ∈ L¹(Ω).

Hence UV is a martingale. Therefore

E(UV(t_k) | F_{t_{k−1}}) = UV(t_{k−1}).

If we use the notation of the proof of the previous proposition, then with a simple calculation one can write this as

E(exp(i(u_k α_k + v_k β_k)) | F_{t_{k−1}}) = E(exp(iu_k α_k)) · E(exp(iv_k β_k)).

From this the proof of the proposition is obvious.

Corollary 7.14 If (N_i) are countably many independent Poisson processes, then they do not have common jumps almost surely.

Proof. Let N_1 and N_2 be independent Poisson processes and let F^(1) and F^(2) be the filtrations generated by the processes. Let U and V be the exponential
PROCESSES WITH INDEPENDENT INCREMENTS
martingales of N1 and N2 . U and V are martingales with respect to filtrations F (1) and F (2) . Let F be the filtration generated by the two processes N1 and N2 . Using the independence of N1 and N2 we show that U and V are martingales (1) (2) with respect to F as well. If F1 ∈ Fs and F2 ∈ Fs where s < t then F1 ∩F2
U (t) dP = E χF1 χF2 U (t) = E χF2 E χF1 U (t) =
= E χF2 E χF1 U (s) = E χF2 χF1 U (s) = U (s) dP. = F1 ∩F2
With the Monotone Class Theorem one can prove that the equality holds for every F ∈ σ F1 ∩ F2 : F1 ∈ Fs(1) , F2 ∈ Fs(2) = Fs , that is E (U (t) | Fs ) = U (s). Hence U is a martingale with respect to F.
Example 7.15 Poisson processes without common jumps which are not independent.
Let (σ k ) be the jump times generating some Poisson process. Obviously variables (2σ k ) also generate a Poisson process. As the probability that two independent continuous random variable is equal is zero the jump times of the two processes are almost surely never equal. But as they generate the same non-trivial σ-algebra they are obviously not independent. Proposition 7.16 If X is a L´evy process and (Λk ) are finitely many
disjoint Borel measurable sets with 0 ∈ / cl (Λk) for all k, then processes N Λk are independent. The same is true for J Λk . Proof. It is sufficient to show the second part of the proposition. If X J ∪i=1 Λk n then J ∪i=2 Λk = X − J Λ1 and J Λ1 are independent. From this the proposition is obvious. n
7.1.3
Spectral measure of L´ evy processes
First let us prove a very simple identity.
´ LEVY PROCESSES
473
Definition 7.17 Let (X, A) and (Y, B) be measurable spaces. A function µ : X × B → [0, ∞] is a random measure if: 1. for every B ∈ B the function x → µ (x, B) is A-measurable, 2. for every x ∈ X the set function B → µ (x, B) is a measure on (Y, B). Proposition 7.18 Let (X, A) and (Y, B) be measurable spaces and let µ : X × B → [0, ∞] be a random measure. If ρ is a measure on (X, A) and ν (B)
µ (x, B) dρ (x) , X
then ν is a measure on (Y, B). If f is a measurable function on (Y, B) then
f (y) µ (x, dy) dρ (x) ,
f (y) dν (y) = Y
X
Y
whenever the integral on the left-hand side
f dν is meaningful.
Y
Proof. ν is non-negative and if (Bn ) are disjoint sets then by the Monotone Convergence Theorem ν (∪n Bn )
µ (x, ∪n Bn ) dρ (x) =
X
=
n
X
µ (x, Bn ) dρ (x)
X
µ (x, Bn ) dρ (x) =
n
ν (Bn ) ,
n
so ν is really a measure. If f = χB , B ∈ B, then
f (y) dν (y) = ν (B)
Y
=
µ (x, B) dρ (x) = X
χB (y) µ (x, dy) dρ (x) = X
Y
X
Y
=
f (y) µ (x, dy) dρ (x) .
In the usual way, using the linearity of the integration and the Monotone Convergence Theorem the formula can be extended to non-negative measurable functions. If f is non-negative and Y f dν is finite then almost surely w.r.t. ρ
PROCESSES WITH INDEPENDENT INCREMENTS
474
the inner integral is also finite. Let f = f + − f − and assume that the integral of f − w.r.t. ν is finite. In this case, as we remarked, the integral Y f − (y) µ (x, dy) is finite for almost all x and the integral
f (y) µ (x, dy) −
f (y) µ (x, dy) = Y
+
Y
f − (y) µ (x, dy)
Y
is almost surely meaningful. The integral of the second part with respect to ρ is finite, hence
f dν Y
f dν − +
Y
f − dν =
Y
f + (y) µ (x, dy) dρ (x) −
= X
Y
X
f (y) µ (x, dy) −
X
Y
−
f (y) µ (x, dy) dρ (x)
+
=
f − (y) µ (x, dy) dρ (x) =
Y
Y
f (y) µ (x, dy) dρ (x) . X
Y
Let us fix a moment t. For an arbitrary ω define the counting measure supported by the jumps of s → X (s, ω) in [0, t]. Denote this random measure by µX (t, ω, Λ) = µX t (ω, Λ). That is µX t (ω, Λ)
χΛ (∆X (s, ω)) = N Λ (t, ω) .
(7.7)
0<s≤t
In general the process X is fixed so in order to simplify the notation as much as possible we shall drop the superscript X and instead of µX we shall simply write µ. If 0 ∈ / cl (Λ) then by (7.7) µt (ω, Λ) is measurable in ω. Obviously if Λ ⊆ R \ {0} then c
µ (t, ω, Λ) = lim µ (t, ω, Λ ∩ [−1/n, 1/n] ) , n→∞
so µt (ω, Λ) is also measurable in ω for any Borel measurable subset Λ of R \ {0}. This implies that µt (ω, Λ) is a random measure over R \ {0}. Hence Λ → ν t (Λ) E (µt (Λ))
µt (ω, Λ) dP (ω) ,
Λ ∈ B (R \ {0})
Ω
is a measure on (R \ {0} , B (R \ {0})). If 0 ∈ / cl (Λ) then ν t (Λ) is the expected value of a Poisson process at a fixed time, therefore ν t (Λ) < ∞. Therefore ν t is σ-finite for every t.
´ LEVY PROCESSES
475
Definition 7.19 The measures ν t (Λ) E (µt (Λ)) ,
Λ ∈ B (R \ {0})
are called the spectral measures of X. To simplify the notation let ν ν 1 . Lemma 7.20 ν t (Λ) = t · ν 1 (Λ) t · ν (Λ). Proof. If 0 ∈ / cl (Λ) then N Λ is a Poisson process. In this case
ν t (Λ) E N Λ (t) = t · E N Λ (1) tν (Λ) . In the general case by the Monotone Convergence Theorem
c ν t (Λ) = E lim µt (Λ ∩ [−1/n, 1/n] ) = n→∞
c
= lim E (µt (Λ ∩ [−1/n, 1/n] )) = n→∞
c
= lim t · ν (Λ ∩ [−1/n, 1/n] ) = t · ν (Λ) . n→∞
Proposition 7.21 (L1 -identity) If X is a L´evy process then for every Borel measurable function f : R \ {0} → R E f dµt = E f (∆X (s)) χ (∆X (s) = 0) R\{0}
0<s≤t
= R\{0}
whenever the integral
R\{0}
f dν t = t
f dν,
(7.8)
R\{0}
f dν is meaningful.
Proof. As µt (ω, Λ) is a counting measure for ever Borel measurable function f f (x) µt (ω, dx) = f (∆X (s, ω)) χ (∆X (s) = 0) . R\{0}
0<s≤t
The other parts of (7.8) are direct consequences of the previous proposition. Corollary 7.22 Let X be a L´evy process. If 0 ∈ / cl (Λ) and Λ xdν (x) is finite then
J Λ (t) − E J Λ (t) = J Λ (t) − t xdν (x) (7.9) Λ
is a martingale. In particular if Λ is bounded and 0 ∈ / cl (Λ) then (7.9) is a martingale.
476
PROCESSES WITH INDEPENDENT INCREMENTS
Proof. As Λ xdν (x) R\{0} xχΛ (x) dν (x) is finite by the L1 -identity with f (x) xχΛ (x)
E J Λ (t) E
∆X (s) χΛ (∆X (s)) =
0≤s≤t
=t R\{0}
xχΛ (x) dν (x) = t
xdν (x) . Λ
/ cl (Λ) the jumps X is a L´evy process so J Λ has independent increments. As 0 ∈ Λ has right-regular trajectories. This implies that in Λ cannot accumulate. So J J Λ (t) − E J Λ (t) is a martingale. Let P denote the σ-algebra of the predictable sets. By the martingale property of the compensated jumps it is clear that if 0 ∈ / cl (Λ), F ∈ Fs and s < t then
µ (t, ω, Λ) − t · ν (Λ) dP (ω) = F
µ (s, ω, Λ) − s · ν (Λ) dP (ω) . F
This means that as ν (Λ) < ∞
µ (t, ω, Λ) − µ (s, ω, Λ) dP (ω) = F
(t − s) · ν (Λ) dP (ω) , F
that is if H (u, ω, e) χΛ (e) χF (ω) χ(s,t] (u) then
∞
Hµ (du, ω, de)
E 0
R\{0}
∞
=E
Hdν (e) du .
0
R\{0}
The meaning of the left-hand side is the following. For every ω let µ (ω, D) denote22 the counting measure of the jumps of X, that is if D ∈ B (R+ ) × B (R \ {0}) then let µ (ω, D) be the number of jumps in D. First we integrate by this measure and then, if it is meaningful, we take the expected value. If the time interval is finite and we restrict µ to a set with ν (Λ) < ∞ then the set of bounded processes for which the formula is valid is a linear space. From this in the usual way, using the Monotone Class Theorem and the Monotone 22 See:
Definition 7.44, page 496.
´ LEVY PROCESSES
477
Convergence Theorem, one can prove the following: Proposition 7.23 (General L1 -identity) If H ≥ 0 is measurable with respect to P × B (R \ {0}) then
∞
E
H (u, ω, e) µ (du, ω, de) 0
∞
=E
R\{0}
H (u, ω, e) dν (e) du . R\{0}
0
Example 7.24 The L´evy–Khintchine formula for compound Poisson processes.
Let X be a L´evy process and let 0 ∈ / cl (Λ). Let J Λ be the compound Poisson process of the jumps of X. The Fourier transform of J Λ (s) is23
exp λs R
(exp (iux) − 1) dF (x)
,
where F is the common distribution function of the jumps, and λ is the parameter of the underlying Poisson process. What is the relation between F and ν? If B ∈ B (R\ {0}) and τ is the time of the first jump in Λ then by the general L1 -identity using that χ ([0, τ ]) is predictable F (B) = P (∆X (τ ) ∈ B ∩ Λ) = E (χB∩Λ (∆X (τ ))) = ∞ =E χ ([0, τ ]) µ (du, B ∩ Λ) = 0
∞
=E
χB∩Λ (e) χ ([0, τ ]) µ (du, de)
R\{0}
0 ∞
χB∩Λ (e) χ ([0, τ ]) dν (e) du
=E 0
R\{0}
= ν (B ∩ Λ) E
∞
χ ([0, τ ]) du
=
0
= ν (B ∩ Λ) E (τ ) =
ν (B ∩ Λ) . λ
That is the Fourier transform of J Λ (s) is (exp (iux) − 1) dν (x) . exp s Λ 23 See:
Lemma 7.7, page 467.
=
As 0 ∈ / cl (Λn ) the variable ω → µ (t, ω, Λn ) = N Λn (t, ω) has a Poisson distribution. The Fourier transform of this variable is exp (tν (Λn ) (exp (iu) − 1)) . The convergence for every ω implies the weak convergence, so if ν (Λ) < ∞, then as ν (Λ) = limn→∞ ν (Λn ) the Fourier transform of ω → µ (t, ω, Λ) is exp (tν (Λ) (exp (iu) − 1)) . Hence it has a Poisson distribution. If the sets Λk =
∪n Λ(k) n
∪n
c 1 1 Λk ∩ − , n n
(k)
are disjoint then the sets Λn are also disjoint for every n. Hence the variables
µ t, ω, Λ(k) n are independent. The limit of independent variables is independent, so if the sets (Λk ) are disjoint, then the variables µ (t, ω, Λk ) are independent.
Definition 7.27 Let H be a Hilbert space, let (C, C, ν) be a measure space and let S ⊆ C denote the subsets of C with finite measure. π : S → H is a vector measure with control measure ν if for every S ∈ S:
1. π(S) ∈ H is defined,
2. ‖π(S)‖²_H = ν(S),
3. if S_1 and S_2 are disjoint sets in S, then the vectors π(S_1) and π(S_2) are orthogonal.

We say that a function f : C → R is integrable with respect to π if there is a sequence of finite-valued step functions (s_n) = (Σ_k c_{nk} χ_{C_{nk}}) with:
1. s_n → f in L²(ν), and
2. I_n := Σ_k c_{nk} π(C_{nk}) is a Cauchy sequence in H.

If I := lim_{n→∞} I_n, then we shall call this limit I the integral of f with respect to π. We shall denote this integral by ∫_C f(x) dπ(x) or simply ∫_C f dπ.

Proposition 7.28 If f ∈ L²(C, C, ν) and π is a vector measure with control measure (C, C, ν), then f is integrable with respect to π and

‖∫_C f dπ‖_H = ‖f‖_2 := (∫_C f² dν)^{1/2}.   (7.10)

Proof. Let s := Σ_k c_k · χ_{C_k}, where the C_k are disjoint and in S. By conditions 2. and 3.

‖∫_C s dπ‖²_H = ‖Σ_k c_k · π(C_k)‖²_H = Σ_k c_k² · ‖π(C_k)‖²_H = Σ_k c_k² · ν(C_k) = ∫_C s² dν = ‖s‖²_2.   (7.11)

As the step functions are dense in L², there is a sequence s_n := Σ_k c_{nk} χ_{C_{nk}} with s_n → f in L²(ν). From (7.11), I_n := Σ_k c_{nk} π(C_{nk}) is a Cauchy sequence in H. From this the proposition is obvious.

Corollary 7.29 If f ∈ L²(C, C, ν) and π is a vector measure with control measure ν, then the value of the vector integral ∫_C f dπ is independent of the approximating sequence (s_n).

Proposition 7.30 If X is a Lévy process and H := L²(Ω), then for every t ≥ 0

π_t(Λ) := N^Λ(t) − ν_t(Λ) = N^Λ(t) − t · ν(Λ)   (7.12)
PROCESSES WITH INDEPENDENT INCREMENTS
is a a Hilbert space valued vector measure over (R \ {0} , B (R \ {0}) , ν t ). The same is true if H H02 on the time interval [0, t] and (π (Λ)) (s) N Λ (s) − s · ν (Λ) ,
s ≤ t < ∞.
Proof. As we have already proved, if Λ ⊆ R \ {0} and ν t (Λ) < ∞ then the Fourier transform of N Λ (t) is exp (ν t (Λ) (exp (iu) − 1)) . Hence if ν t (Λ) < ∞, then N Λ (t) has a Poisson distribution with parameter 2 ν t (Λ). This implies that the expected value of (7.12) is zero and π t H = ν t (Λ). Λ1 As we have also proved that if Λ1 ∩ Λ2 = ∅ then N (t) and N Λ2 (t) are independent24 . So (π t (Λ1 ) , π t (Λ2 )) π t (Λ1 ) π t (Λ2 ) dP = 0. Ω
7.1.4
Decomposition of L´ evy processes
Now we are ready to prove that L´evy processes are semimartingales. Proposition 7.31 If X is a L´evy process then: 1. X is a semimartingale, 2. X has a decomposition X =V +M where: 3. V and M are independent L´evy processes, 4. M is a martingale with bounded jumps and on every finite interval M ∈ H02 , 5. V ∈ V, that is on every finite interval the trajectories of V have finite variation. Proof. If Λ {|x| ≥ 1} then the jumps of Y X − J Λ are bounded. Y is a L´evy process with bounded jumps25 . This implies that Y (t) has an expected value26 for every t. Therefore M (t) Y (t) − E (Y (t)) =
= X (t) − J Λ (t) − t · E X (1) − J Λ (1) X (t) − J Λ (t) − t · γ
24 See:
Proposition 7.16. page 472. Lemma 7.8, page 468. 26 See: Proposition 1.111, page 74. 25 See:
´ LEVY PROCESSES
481
is a L´evy process with zero expected value. Hence M is a martingale. The martingale M has finite moments, so on any finite interval M is in H02 . Therefore M satisfies 4. Obviously V (t) J Λ (t) + E (Y (t)) J Λ (t) + γ · t satisfies 5. As X − J Λ and J Λ are independent27 the proposition holds.
Corollary 7.32 The spectral measure ν has the following properties x2 dν (x) < ∞.
ν (|x| ≥ 1) < ∞, 0<|x|<1
That is
R\{0}
1 ∧ x2 dν (x) < ∞.
(7.13)
Proof. M ∈ H02 therefore M 2 −[M ] is a martingale28 . Let Λ {0 < |x| < 1}. By the L1 -identity (7.8)
2
x dν (x) = E
t
2
x dµt (x)
Λ
= E
Λ
= E
(∆X (s)) χ (∆X (s) ∈ Λ) = 2
s≤t
2
(∆M (s))
≤ E ([M ] (t)) = E M 2 (t) < ∞.
s≤t
The other relations are obvious. Every element of H02 has a unique decomposition into a sum of a continuous and of a purely discontinuous martingale29 . For L´evy processes one can prove a bit more: Proposition 7.33 The martingale M in the decomposition X = V + M has a decomposition M = Mc + Md 27 See:
Theorem 7.12, page 470. Proposition 2.84, page 170. 29 See: Corollary 4.18, page 232. 28 See:
(7.14)
482
PROCESSES WITH INDEPENDENT INCREMENTS
where 1. M c and M d are independent L´evy processes, 2. the trajectories of M c are almost surely continuous, 3. M d is purely discontinuous and for every t, in L2 (Ω)-convergence
M d (t) = lim
ε0
x d (µt (x) − ν t (x)) . (7.15)
x dπ t (x) = lim
ε0
ε≤|x|<1
ε≤|x|<1
Proof. Let Λk
1 1 ≤ |x| < k+1 k
Λn,m
,
1 1 ≤ |x| < m+1 n
.
1. On any finite interval the processes
M Λk (t) J Λk (t) − E J Λk (t) are independent H02 martingales. If U ∈ H02 then U 2 − [U ] is a martingale30 so
E U 2 (t) = E ([U ] (t)) . Hence by Doob’s inequality and by the L1 -identity (7.8) if n < m
2 2 2 Mn − Mm H2 ≤ 4 Mn (t) − Mm (t)2 = 4E [Mn − Mm ] (t) = 2 = 4E (∆M (s)) χ (∆M (s) ∈ Λn,m ) = s≤t
2
= 4E
x dµt (x) Λn,m
x2 dν (x) .
= 4t Λn,m
By (7.13) and by the Dominated Convergence Theorem (Mn ) is a Cauchy sequence in H02 . As H02 is a Hilbert space there is a martingale M∞ ∈ H02 with Mn → M∞ . As the processes M Λk are independent their sum Mn is a L´evy process. By the convergence of the trajectories this implies that M∞ is also a L´evy process. M Λk is a purely discontinuous martingale. The set of purely discontinuous martingales is closed linear subspace in H02 , so M d is purely discontinuous. Let M M c + M d be the decomposition of M . By the uniform convergence of the trajectories ∆M∞ = ∆M . As ∆M d = ∆M obviously 30 See:
Proposition 2.84, page 170.
´ LEVY PROCESSES
483
∆M d = ∆M∞ . As M d and M∞ are purely discontinuous they are equal31 . As M − Mn and Mn are independent by the uniform convergence of the trajectories M c = M − M d and M d are also independent. This also implies that M c is a L´evy-process. 2. 0 ∈ / cl (Λk ) so ν (Λk ) < ∞, and Λk is bounded. Hence Λk xdν (x) is meaningful. By the L1 -identity (7.8)
M Λk (t) J Λk (t) − E J Λk (t) =
= χ (∆X (s) ∈ Λk ) ∆X (s) − E J Λk (t) = s≤t
= R\{0}
χΛk (x) x dµt (x) − E
Λk
R\{0}
χΛk (x) x dµt (x)
=
x dµt (x) −
=
(7.16)
x d (µt − ν t ) (x) .
xdν t (x) = Λk
Λk
For every t, in L2 (Ω)-convergence d
M (t) =
M
k
Λk
(t) =
k
x d (µt (x) − ν t (x)) ,
= lim
k→∞
x d (µt (x) − ν t (x)) =
Λk
1/k≤|x|<1
and from which (7.15) is evident. Corollary 7.34 One can also think about (7.15) as a H02 -valued vector integral d
M (t) =
x dπ t (x) . 0<|x|<1
Proof. Let Λ {0 < |x| < 1} and let f (x) x. As Λ x2 dν (x) < ∞ obviously f ∈ L2 (Λ, ν). This implies that the vector integral Λ f dπ t Λ x dπ t (x) is meaningful. Since f ∈ L2 (Λ, ν t ), by the Dominated Convergence Theorem - f 2 (x) dν t (x) = 0.
lim
k→∞ 31 See:
Corollary 4.7, page 228.
0<|x|<1/k
484
PROCESSES WITH INDEPENDENT INCREMENTS
From (7.10) using that the vector integral is obviously additive x dπ t (x) − x dπ t (x) = 1/k≤|x|<1 0<|x|<1 2 - = x dπ t (x) = x2 dν t (x) → 0. 0<|x|<1/k 0<|x|<1/k 2
Hence in L2 (Ω)
lim
k∞
x dπ t (x) = 1/k≤|x|<1
x dπ t (x) . 0<|x|<1
That is d = lim M (t) − x dπ x dπ (x) − x dπ =0 t t t k∞ Λ 1/k≤|x|<1 Λ 2 2
Example 7.35 For some L´evy process the pathwise integral 0<|x|<1
xd (µt − ν t ) (x)
is meaningless32 .
It is natural to ask whether one can define the random signed measure ρ (t, ω, Λ) µt (ω, Λ) − ν t (Λ) ,
Λ ∈ B (R\ {0})
and whether one can express the limit in (7.15) as an ordinary pathwise integral over {0 < |x| < 1} with respect to ρ. Recall that if ρ is a signed measure then by definition
f dρ
A
f dρ+ −
A
f dρ−
A
and of course we assume that the integral on the left is meaningful if both integrals on the right are meaningful and one does not get an expressions33 of 32 See:
Example 4.20, page 233. course one does not have this problem if ν is a finite measure. But in general ν is just σ-finite, so ν (Λ) = ∞ is possible. 33 Of
´ LEVY PROCESSES
485
the type ∞ − ∞. Therefore it is sufficient to show that for some L´evy process M lim x d (µt (x) − ν t (x)) ε0
ε<|x|<1
is finite, but
x dν t (x) = 0<|x|<1
x dµt (x) = 0<|x|<1
=
χ (|∆M | < 1) ∆M (s) ≡ ∞.
s≤t
Let (Ni ) be a sequence of independent Poisson processes with λ = 1. For any t the compensated Poisson processes on the finite time horizon [0, t] Mi (t) Ni (t) − λt = Ni (t) − t are in H02 . As they are independent, they are also orthogonal in H02 . As ∞ the sequence M
∞ 1 i=1
i
i
1/i2 <
Mi
is convergent in the Hilbert space H02 . As the processes Ni are independent they almost surely do not have common jumps34 , so all the jumps are not larger than one and obviously they are non-negative. By Fubini’s theorem and by the Monotone Convergence Theorem
1
x dµt = 0
∆M (s) =
∞ ∆Ni (s) i=1 s≤t
i
s≤t i=1
s≤t
=
∞ ∆Mi (s)
i
=
∞ Ni (t) i=1
i
=
∞ ∆Mi (s) i=1 s≤t
i
=
.
The variables Ni (t) − t are independent, they have zero expected value. So for any t the sequence Rn
n Ni (t) − t i=1
i
is a discrete time martingale. Obviously (Rn ) is bounded in L2 (Ω) so by the Martingale Convergence Theorem it is convergent almost surely. As i 1/i = ∞ 34 See:
Corollary 7.14, page 471.
486
PROCESSES WITH INDEPENDENT INCREMENTS
obviously
∆M (s) =
∞ Ni (t) i=1
s≤t
i
= ∞.
The spectral measure ν of M is ν {1/i} 1. Therefore ν ((0, 1]) = ∞ and
1
x dν (x) = 0
∞ 1 i=1
i
=∞
but
1
x2 dν (x) = 0
∞ 1 < ∞. 2 i i=1
From Kolmogorov’s zero–one law it is also clear, that if t > 0 then µt (ω, (0, 1)) = ∞ almost surely, which implies that the signed measure ρt is almost surely meaningless. 7.1.5
L´ evy–Khintchine formula for L´ evy processes
1. If 0 ∈ / cl (Λ) then N Λ (t) has a Poisson distribution with parameter λ tν (Λ). So its Fourier transform is
ϕt (u) E exp iuN Λ (t) = exp (tν (Λ) (exp (iu) − 1)) . 2. If 0 ∈ / cl (Λ), e.g. if Λ {|x| ≥ 1}, then by the general L1 -identity the Fourier transform of J Λ is35 exp t (exp (iux) − 1) dν (x) . Λ
3. If ν (Λ) is finite and if Λ is bounded then Λ x dν (x) is finite. So in this case the Fourier transform of J Λ (t) − t x dν (x) (7.17) Λ
is exp t (exp (iux) − 1 − iux) dν (x) . Λ 35 See:
Example 7.24, page 477.
4. By (7.15) M^d is a limit in L² of processes in (7.17). L²-convergence implies weak convergence, hence the Fourier transform of M^d is
$$\lim_{\varepsilon\searrow 0}\exp\left(t\cdot\phi_\varepsilon(u)\right) = \exp\left(\lim_{\varepsilon\searrow 0} t\cdot\phi_\varepsilon(u)\right),$$
where
$$\phi_\varepsilon(u) \doteq \int_{\varepsilon<|x|<1}\exp(iux)-1-iux\,d\nu(x).$$
On any bounded interval |exp(iux) − 1 − iux| ≤ kx². By (7.13) f(x) ≐ x² is ν-integrable on 0 < |x| < 1, hence by the Dominated Convergence Theorem
$$\lim_{\varepsilon\searrow 0}\int_{\varepsilon<|x|<1}\exp(iux)-1-iux\,d\nu(x) = \int_{0<|x|<1}\exp(iux)-1-iux\,d\nu(x).$$
5. This implies that the Fourier transform of the jump part of X is
$$\varphi_t(u) = \exp\left(t\cdot\phi(u)\right), \tag{7.18}$$
where
$$\phi(u) \doteq \int_{|x|\ge 1}\exp(iux)-1\,d\nu(x) + \int_{0<|x|<1}\exp(iux)-1-iux\,d\nu(x) = \int_{\mathbb R\setminus\{0\}}\exp(iux)-1-iux\,\chi\left(|x|<1\right)d\nu(x).$$
6. As we know, every continuous Lévy process is a linear combination of a Wiener process and a linear trend³⁶. As M^c is a martingale, M^c = σw where w is a Wiener process. Recall that if X is a Lévy process then
$$X(t) = J^{\{|x|\ge 1\}}(t) + \gamma t + M^d(t) + M^c(t),$$

³⁶ See: Theorem 6.11, page 367.
where γ is the expected value of the small jumps. So we have proved the next famous theorem.

Theorem 7.36 (Lévy–Khintchine formula) Let X be a Lévy process. If φ_t(u) = exp(tφ(u)) is the Fourier transform of X(t) then
$$\phi(u) \doteq iu\gamma - \frac{\sigma^2u^2}{2} + \int_{\mathbb R\setminus\{0\}}\exp(iux)-1-iux\cdot\chi\left(|x|<1\right)d\nu(x), \tag{7.19}$$
where ν is the spectral measure of X and
$$\int_{\mathbb R\setminus\{0\}} x^2\wedge 1\,d\nu(x) < \infty. \tag{7.20}$$

Proof. It is sufficient to remark that if w is a Wiener process then its characteristic function at time t is
$$\exp\left(-t\frac{u^2}{2}\right),$$
and if X is a linear trend, that is X(t) = γt, then the characteristic function of X(t) is exp(iγut).
Definition 7.37 The triplet (γ, σ, ν) is called the characteristics of the L´evy process X. Corollary 7.38 If X is a non-negative L´evy process then the Laplace transform of X (t) has the representation Lt (s) = exp t −βs +
∞
exp (−sx) − 1dν (x) ,
s≥0
0
where β ≥ 0 and ν is the spectral measure of X. In this case ∞ ν ((−∞, 0)) = 0, x ∧ 1dν t (x) < ∞. 0
Proof. As X (t) ≥ 0 its Laplace transform Lt (s) E (exp (−sX (t))) ,
s≥0
(7.21)
´ LEVY PROCESSES
489
is finite. As X ≥ 0 and as X has stationary increments almost surely the trajectories of X are increasing. Hence
∞
x ∧ 1dν t (x) = E
0
∆X (s) ∧ 1 ≤ E
X − J {x≥1} (t) < ∞
s≤t
as the jumps of X − J {x≥1} are bounded. From this (7.21) is clear. By (7.21) one can separate the integral in (7.19) as
∞
1
exp (iux) − 1dν (x) − iu
0
xdν (x) 0
and we can join the second term into the constant. It is easy to see that
∞
exp (−sx) − 1dν (x)
0
is finite for every s ≥ 0 so one can extend the Fourier transform analytically to the complex half-plane {−s + iu, s ≥ 0}. That is one can put −s on the place of iu and the Laplace transform of X (t) has the representation ∞ σ 2 s2 exp t −βs + + exp (−sx) − 1dν (x) . 2 0 As X is increasing in t the Laplace transform is decreasing for every s. But it can happen only if σ = 0. Hence Lt (s) has the stated representation.
7.1.6
Construction of L´ evy processes
Let us assume that (γ, σ, ν) satisfies (7.20). We want to construct37 a L´evy process X with characteristics (γ, σ, ν). To do this one needs to construct a right-continuous process with Fourier transform satisfying the L´evy–Khintchine formula with the given triplet. 1. As a first step one should construct a Wiener process38 w. If X c (t) γt + σw (t) then the Fourier transform of X c (t) is
σ 2 u2 exp t · iuγ − 2
.
37 One can easily see that the probability space (Ω, A, P) should be rich enough to carry a countable number of independent random variables. One can assume that (Ω, A, P) = ([0, 1] , B, λ). 38 See: Theorem B.13, page 567.
490
PROCESSES WITH INDEPENDENT INCREMENTS
This means that it will be sufficient to construct a L´evy process X d with Fourier transform exp t R\{0}
exp (iux) − 1 − iux · χ (|x| < 1) dν (x) .
2. As a second step let us construct a random Poisson measure with control measure ν. First assume that ν is finite. Let (ξ k ) be a sequence of independent random variables with distribution F (B) ν (B) /ν (R\ {0}) and let N be a Poisson process independent from the sequence (ξ k ) with parameter N λ = ν (R\ {0}). If U k=1 ξ k then its Fourier transform is = exp t exp (iux) − 1dv (x) . ϕt (u) = exp t λ exp (iux) − 1dF (x) R
R
As we have seen39 the random measure generated by the jumps of U is a random Poisson measure with control measure ν. 3. If 0 ∈ / cl (Λ) then for some ε > 0 ε2 1 ν (Λ) = 2 ν (Λ) ≤ 2 ε ε
x2 dν (x) < ∞, Λ
and by the previous step one can easily construct a process J Λ with Fourier transform exp t exp (iux) − 1dν (x) . Λ
As a special case one can construct a L´evy process with Fourier transform exp t
exp (iux) − 1dν (x) .
|x|≥1
4. The only problem is the convergence of the compensated sum
J
Λk
(t) − t
xdν (x) M Λk (t)
Λk
k
k
where Λk 39 See:
Example 7.24, page 477.
1 1 < |x| ≤ k+1 k
.
´ LEVY PROCESSES
491
By Doob’s inequality, if β < α then40 2 2 sup M (α,1) (s) − M (β,1) (s) ≤ 4 M (α,1) (t) − M (β,1) (t) = 0≤s≤t 2 2 α 2 = 4 M (β,α] (t) = 4t x2 dν (x) . 2
β
x2 dν (x) < ∞ on every finite interval [0, t] the processes M (1/n,1)
form a Cauchy sequence in H02 . So on every finite interval M (1/n,1) has a limit in H02 . The limit is a L´evy process with Fourier transform ϕt (u) = exp (tφ (u)) where φ is the function in (7.19). As
0<|x|<1
Example 7.39 Symmetric stable processes.
Perhaps the simplest construction of a non-trivial L´evy process is the following: assume that ν is symmetric, that is if A ⊆ R+ \ {0} then ν (−A) = ν (A). Let us also assume that ν ((x, ∞)) x−α with some α > 0. In this case ν ((a, b]) = a−α − b−α . By (7.20)
1
x2 dν (x) = 0
1
αx−α−1 x2 dx = α
0
1
x−α+1 dx < ∞.
0
This happens only if −α + 1 > −1, that is if α < 2. One can prove41 that if α (γ, σ, ν) = (0, 0, ν) then the Fourier transform is ϕt (u) = exp (−t · (c |u| )), that is the distribution of the increments is α-stable. 7.1.7
Uniqueness of the representation
Sometimes the L´evy–Khintchine formula is written in a different way. Instead of the representation of φ (u) above one can write σ 2 u2 + φ (u) iuγ − 2
40 See: 41 See:
Corollary 7.32, page 481. [23].
R\{0}
iux exp (iux) − 1 − 1 + x2
dν (x)
PROCESSES WITH INDEPENDENT INCREMENTS
492
since the difference of the two integrals is iu∆γ, where 1 x χ (|x| < 1) − ∆γ dν (x) ≤ 2 R\{0} 1+x 1dν (x) < ∞. ≤ x2 dν (x) + |x|≥1
0<|x|<1
Since
x2 x2 ≤ min 1, x2 ≤ 2 · 2 1+x 1 + x2
min 1, x2 is ν-integrable if and only if ρ (A) A
x2 dν, 1 + x2
A ∈ B (R\ {0})
is a finite measure. Definition 7.40 The kernel function
iux exp (iux) − 1 − H (u, x) 1 + x2 2 −u /2
1 + x2 if x = 0 x2 if x = 0
is called the L´evy–Khintchine kernel. The L´evy–Khintchine kernel has some useful properties. For a fixed u exp (iux) − 1 −
iux 1 + x2
1 + x2 x2
(7.22)
is obviously bounded in x outside any neighborhood of x = 0.
exp (iux) − 1 iux 1 + x2 = − 2 lim H (u, x) = lim 2 2 x→0 x→0 x x (1 + x ) 1 exp (iux) − 1 − iux 1 = lim = − − iux x→0 x2 x2 (1 + x2 ) x2
exp (iux) − 1 − iux u2 = − x→0 x2 2
= lim
so x → H (u, x) is continuous in x = 0 for every u. Therefore H is bounded and it is continuous in x on R. This implies that if ρ is a finite measure then for every
´ LEVY PROCESSES
u one can define the integral 1 + x2 iux H (u, x) dρ (x) = dρ (x) . exp (iux) − 1 − 2 1+x x2 R R
493
(7.23)
We can also assume that ρ ({0}) 0. If 2 σ if 0 ∈ A δ (A) 0 if 0 ∈ /A then σ 2 u2 = − 2
H (u, x) dδ (x) , R
which is a very useful relation as this implies that if π ρ + δ and σ 2 u2 iux + exp (iux) − 1 − dν (x) φ (u) iuγ − 2 1 + x2 R\{0} then
φ (u) = iuγ +
H (u, x) dπ (x) . R
Obviously π is a finite measure on R. If π is any finite measure and σ 2 π ({0}) and ν (A) A
1 + x2 dπ (x) , x2
A ∈ B (R\ {0}) ,
then (γ, σ, ν) with an arbitrary γ forms a triplet which satisfies (7.20). Theorem 7.41 The representation (7.19) is unique. The same is true for the representation φ (u) = iuγ + H (u, x) dπ (x) . R
That is, if π 1 and π 2 are finite measures and iuγ 1 + H (u, x) dπ 1 (x) = iuγ 2 + H (u, x) dπ 2 (x) R
for all u then γ 1 = γ 2 and π 1 = π 2 .
R
494
PROCESSES WITH INDEPENDENT INCREMENTS
The theorem is an easy consequence of a more general statement which we shall prove later42 . Corollary 7.42 (Uniqueness) The Fourier transform of a L´evy process uniquely determines the triplet (γ, σ, ν). Proof : Let ϕ (t, u) = exp (tφ (u)) be the Fourier transform of some L´evy process. It is sufficient to prove that if exp (φ1 (u)) = exp (φ2 (u)) and φ1 (u) and φ2 (u) are continuous with φ1 (0) = φ2 (0) = 0 then φ1 (u) = φ2 (u). Obviously φ1 (u) = φ2 (u) + i2πk (u) where k (u) is always an integer number. As k (u) is always an integer it can be continuous if and only if 0 = k (0) = k (u). Example 7.43 L´evy’s representation of local time L (0).
1. Let S be the process of the first passage times43 of some √ Wiener process. If a ≥ 0 then the Laplace transform of S (a) is44 exp (−a s). With simple calculation ∞ ∞ exp (−sx) − 1 exp (−u) − 1 3/2 du = dx = s 3/2 s x u3/2 0 0 ∞ √ exp (−u) − 1 = s du. u3/2 0 The value of the integral is 0
∞
exp (−u) − 1 du = − u3/2
∞
1
=− 0
=−
0 1
=− 0
0
1 u1/2
0
1
∞
1
exp (−ux) dxdu =
u1/2
0
1
∞
exp (−ux) dudx =
√
x dy exp (−y) √ dx = 1/2 x y 0 √ 1 1 √ Γ dx = −2 π 2 x
Hence the Laplace transform of S (a) has the representation
1 exp a √ 2π 42 See:
Theorem 7.85, page 530. Example 1.126, page 90. 44 See: Example 1.118, page 82. 43 See:
0
∞
exp (−sx) − 1 dx . x3/2
´ LEVY PROCESSES
495
As the representation of the Fourier transform with the characteristics is unique, the same is true for the Laplace transform. Hence the spectral measure of S is 1 ν (Λ) = √ 2π
1
Λ
x3/2
dx.
Fix a t and let N (s) be the number of jumps of S in (0, t] which are bigger than 1/s2 . Obviously N (s) has a Poisson distribution with expected value tν
%
1 ,∞ s2
=t
2 s π
(7.24)
Hence N is a Poisson process. So by the law of large numbers for Poisson processes45 almost surely % t
√ 2 N (s) = lim εN Λ(ε) (t) . = lim ε0 π s→∞ s
(7.25)
where Λ (ε) {ε > 0} and N Λ(ε) (t) is the number of jumps of S bigger than ε during the time interval (0, t]. Unifying the measure-zero sets one can assume that (7.25) holds with probability one for every t. 2 Recall that the flat parts of L (0) and the excursion intervals of w are almost surely equal46 . 3. Now let S be the first passage time of β. By (7.24) % Lt (0) = lim
ε0
επ Λ(ε) N (Lt (0)) . 2
(7.26)
N Λ(ε) (Lt (0)) is the number of jumps of S during the time interval (0, Lt (0)] =
0, max β (s) s≤t
which are bigger than ε. The jumps of S (a) are exactly the flat parts of the maximum process of the corresponding Wiener process. Hence N Λ(ε) (Lt (0)) is exactly the number of flat parts of maxs β (s) which are bigger than ε and started during the time interval (0, t]. Hence N Λ(ε) (Lt (0)) is the number of excursion intervals of w which are bigger than ε > 0 and started before t. If we denote this 45 The proof is nearly the same as for the Wiener processes. See: B.9, page 565. If we assume that λ = 1 then instead w one can write in the proof the compensated Poisson process. 46 See: Proposition 6.96, page 449.
PROCESSES WITH INDEPENDENT INCREMENTS
496
number by E (ε, t) then by (7.26) almost surely for every t % Lt (0) = lim
ε0
επ · E (ε, t) . 2
This relation is the so called L´evy’s representation of L (0). Observe that instead of the above definition we can define E (ε, t) as the number of excursions bigger than ε during the time interval (0, t] as the two numbers differ only by maximum one and the difference goes to zero when ε 0.
7.2
Predictable Compensators of Random Measures
The key of the L´evy–Khintchine formula is the decomposition of L´evy processes into three parts. The most noticeable object in the decomposition is the spectral measure of the process. Let us fix a measurable space (E, E). One can think about E as the space of the possible jumps of some n-dimensional process X. Hence in this chapter E Rn \ {0} and E is the set of Borel sets of E. To simplify the notation if x and y are n-dimensional vectors then yx will denote the scalar product of x and y. The main result of this section is that in some sense one can define a spectral measure for every right-regular process. Of course it is not clear how one can define the spectral measure in this general setting. The main idea is that the spectral measure of a right regular process is a measure which satisfies the general L1 -identity47 . Observe that the spectral measure of a L´evy process is deterministic. So if H (t, ω, e) is a non-negative function of three variables and it is measurable with respect to P × B (R \ {0}) then by Fubini’s theorem H (t, ω, e) dν t (e) = t H (t, ω, e) dν (e) R\{0}
R\{0}
is P-measurable. Definition 7.44 Let X be a right-regular stochastic process with jumps in E Rn \ {0} and let µX be the random counting measure of the jumps of the trajectories of X: If D ∈ B (R+ ) × E then let µX (ω, D) the number of jumps of X in D for trajectory X (ω) .
(7.27)
The main result of this section is the following48 : Theorem 7.45 (Existence of predictable compensator) If X is a rightregular process on Rn and µX is the random counting measure in (7.27) then µX 47 See: 48 See:
Proposition 7.23, page 477. Corollary 7.63, page 508.
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
497
has a predictable compensator ν (ω, D) ,
D ∈ B (R+ ) × E.
By definition this means that ν (ω, D) is a random measure over B (R+ ) × E and for every non-negative, (P × E)-measurable function H on R+ × Ω × E 1. the parametric integral I (t, ω)
H (s, ω, e) ν (ds, ω, de) (0,t]×E
is P-measurable and
2. E (0,∞)×E HdµX E (I (∞)) = E (0,∞)×E Hdν . If ν 1 and ν 2 are two predictable compensators of µX then they are almost surely equal. Definition 7.46 Let µX be the counting measure generated by the jumps of a right-regular process X. The predictable compensator ν of µX is called the spectral measure of X. 7.2.1
Measurable random measures
Let us start with some definitions: R+ × Ω × E. Definition 7.47 To make the notation simple let Ω Definition 7.48 In the following under random measure we shall always mean some random measure defined on the measurable space (R+ × E, B (R+ ) × E) ,
(7.28)
that is if µ is a random measure then µ (ω) is a measure on (7.28) for every ω. Definition 7.49 Let (R+ × Ω, Z) be some measurable space. To make the ter → R is measurable with respect minology as simple as possible if a function f : Ω to the product σ-algebra Z Z × E then we say that function f is Z-measurable. Therefore one can talk about product measurable, adapted, predictable etc. functions on Ω. that is if f is measurable with respect to the If f is product measurable on Ω σ-algebra B (R+ ) × A × E then by Fubini’s theorem (t, e) → f (t, ω, e) is measurable with respect to B (R+ ) × E for every fixed ω.
498
PROCESSES WITH INDEPENDENT INCREMENTS
and µ is a random measure Definition 7.50 If f is product measurable on Ω then one can define the pathwise stochastic integral
fω (s, e) dµω (s, e)
(f • µ) (t, ω) (0,t]×E
f (s, ω, e) µ (ds, ω, de) . (0,t]×E
More generally if f ≥ 0 then sometimes49 f • µ will denote the random measure f dµ that is the random measure B→
fω (s, e) dµω (s, e) ,
B ∈ B (R+ ) × E.
B
Example 7.51 If X is an Rn -valued right-regular stochastic process and f : Rn → R is Borel measurable then f • µX (t, ω) = f (∆X (s, ω)) . 0<s≤t
If the stochastic process (f • µ) is finite for every (t, ω) then it has right-regular trajectories with finite variation. In the general case we do not know anything about the measurability properties of µ. Therefore we do not know anything about the measurability of the process (f • µ). Definition 7.52 One can define the measurability and integrability properties of a random measure µ via the properties of the integrals f • µ: 1. A random measure µ is Z-measurable if f • µ is a Z-measurable stochastic → R+ . process on R+ × Ω for every predictable, non-negative function f : Ω 2. We say that µ is a finite random measure if the expected value of ω → µ (ω, R+ × E) is finite. 3. We shall denote the set of adapted, finite random measures by A+ . 4. A random measure µ is σ-finite if there is a sequence of predictable sets and ∪n Zn = Ω, with χZ • µ is finite for all n. (Zn ) with Zn ⊆ Ω n + 5. µ ∈ Aloc if µ is σ-finite and adapted. Example 7.53 If X is an arbitrary adapted right-regular process then µX ∈ A+ loc . 49 In most cases f • µ is a stochastic process and not a random measure. Using standard measure theory one can easily prove that for every f, g ≥ 0 product measurable functions (gf ) • µ = g • (f • µ) , where of course (f • µ) is a random measure and not a stochastic process.
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
499
By definition this means that µX is adapted and it is σ-finite. If s < t, F ∈ Fs , Λ ∈ E and 0 ∈ / cl (Λ) then it is easy so see that χ ((s, t] × F × Λ) • µX is an adapted process. In the usual way with the Monotone Class Theorem and with the Monotone Convergence Theorem it is easy to prove that H • µX is adapted for every non-negative predictable H. The second property is a direct and easy consequences of the right-regularity of X: for every m the jumps of X larger than 1/m cannot have an accumulation point. So for every m one can (m) define the sequence of stopping times (τ n ) covering the jumps of X which are larger than 1/m. If ( Pnm 0, τ (m) × {x ≥ 1/m} ∈ P × E, n then E
χPnm • µX (∞) ≤ n < ∞,
hence µX is σ-finite. + Lemma 7.54 Let µ be a random measure. µ ∈ A+ loc if and only if V • µ ∈ A for some positive, predictable function V .
Lemma 7.55 If µ (ω, B) < ∞ for every ω and B ∈ B (R+ ) × E then µ is a predictable random measure if and only if (t, ω) → µ (ω, (0, t] × Λ)
(7.29)
is a predictable function for every Λ ∈ E. The same is true for adapted random measures. Proof. Assume that (7.29) is predictable. If H χ ((s, t]) · χF · χΛ , where Λ ∈ E and F ∈ Fs then (H • µ) (u, ω) = µ (ω, (0, u] ∩ (s, t] × Λ) · χF (ω) χ ((s, ∞)) (u) which is a predictable process. As µ is finite the set of bounded processes H for which H • µ is predictable is a linear space. The lemma follows from the Monotone Class Theorem and from the Monotone Convergence Theorem. The other direction is obvious. The same argument is valid for adapted random measures.
500
PROCESSES WITH INDEPENDENT INCREMENTS
Example 7.56 Predictable random measures generated by predictable kernels.
Let K (s, ω, Λ) be a predictable kernel on E. By definition this means that K (s, ω, Λ) is a P-measurable process for every fixed Λ ∈ E and K is a measure on (E, E) for every fixed (s, ω). Let assume that K is bounded. Let A ∈ A+ and let assume that A is predictable. Let ∞ χC (s, e) K (s, ω, de) A (ds, ω) , C ∈ B (R+ ) × E. α (ω, C) 0
E
If C is a measurable rectangle then the inner integral is product measurable. Let L be the set of bounded functions for which this property holds. As K is bounded L is a λ-system. With the Monotone Class Theorem one can easily prove that the inner integral is product measurable for every product measurable set. Hence by Fubini’s theorem α is well-defined. We show that α is a predictable random measure. If C = (0, t] × Λ then α (ω, C) = K (t, ω, Λ) (A (t, ω) − A (0, ω)) which is finite and predictable. Hence by the previous lemma α is a predictable random measure. Lemma 7.57 If µ ∈ A+ loc then H • µ is adapted for every non-negative progressively measurable process H. Proof. As µ ∈ A+ loc there is a predictable process V V • µ ∈ A+ . This implies that the random measure ρ (A) V dµ, A ∈ B (R+ ) × E
> 0 for which
A
is finite and adapted. As we are integrating by trajectories it is easy too see that H • µ = HV −1 V • µ = HV −1 • ρ. Hence one can assume that µ is finite. Let us fix a t. If I (a, b] ⊆ (0, t] and F ∈ Ft then for every Λ ∈ E (χI χF χΛ • µ) (t, ω) = µ (ω, (a, b] × Λ) χF which is Ft -measurable. Let L be the set of bounded processes H for which (H • µ) (t) is Ft -adapted. As µ ((0, t] × E) < ∞ obviously L is a linear space and it is obviously a λ-system. As the processes χI χF χΛ form a π-system by the Monotone Class Theorem L contains the bounded processes measurable with respect to the product σ-algebra B ((0, t])
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
501
× Ft × E. Therefore if H ≥ 0 is progressively measurable then H • µ is Ft -measurable for every t. and let µ be a random Definition 7.58 Let f be a measurable function on Ω measure. 1. f is integrable with respect to µ if |f | • µ ∈ A+ . 2. f is locally integrable with respect to µ if |f | • µ ∈ A+ loc . Z, µ) L1 (Ω, Z, µ) denotes the set of Z-measurable Definition 7.59 L1 (Ω, 1 Z, µ) L1 (Ω, Z, µ) denotes the set of µ-integrable functions. Lloc (Ω, loc Z-measurable functions locally integrable with respect to µ. 7.2.2
Existence of predictable compensator
Let X be a right-regular process and let f be a non-negative deterministic function. Y f • µX = f (∆X) is an increasing, right-regular stochastic process. Obviously in general Y is not predictable, but if Y ∈ Aloc then Y has a predictable compensator Y p . It is not a great surprise that one can generalize this theorem to random measures. One can call this generalization the Extended Edition of the Doob–Meyer decomposition. Definition 7.60 Let µ be a random measure. A predictable random measure µp is the predictable compensator of µ if one of the next conditions holds:
P, µ then H ∈ L1 Ω, P, µp and ∈ L1 Ω, 1. if H loc loc •µ−H • µp ∈ L. H
(7.30)
is a non-negative and predictable function on Ω then 2. if H E
Hdµ
E
(0,∞)×E
=E
• µ (∞) = H
(7.31)
p H • µ (∞) = E
p Hdµ
.
(0,∞)×E
Let assume that µ is finite and µ (ω, (0, t] × Λ) is adapted. Let Λ ∈ E and let A (t, ω) µ (ω, (0, t] × Λ) . As µ is a finite random measure A is increasing and right-regular in t. As A ∈ A+ it has a predictable compensator Ap . In the L´evy process case the spectral measure ν t (Λ) is the expected value of the number of the jumps in Λ during the
502
PROCESSES WITH INDEPENDENT INCREMENTS
time period (0, t]. By the elementary properties of the predictable compensator of the locally integrable processes E (µ ((0, t] × Λ)) E (A (t)) = E (Ap (t)) . More generally if for some D ∈ P A (t, ω) (χD • µ) (t, ω)
χD dµ, (0,t]×E
then A ∈ A+ . Let Ap denotes the predictable compensator of A. As µp is a predictable measure and χD is a predictable process, by the definition of the predictability of the random measures χD • µp is a predictable, right-continuous, increasing process. By (7.30) χD • µp is the predictable compensator of χD • µ Therefore for any D ∈ P. p
χD • µp = Ap = (χD • µ) ,
D ∈ P.
Hence its is quite natural to try the definition p
µp (ω, D) = (χD • µp ) (∞, ω) (χD • µ) (∞, ω) ,
D ∈ B (R+ ) × E.
This expression is nearly a random measure. The main problem with this defp inition is that the predictable compensator Ap (χD • µ) is defined up to an event with zero probability. So for every D one can define µp (ω, D) only up to an event with zero probability. The situation is very similar to the situation one has during the construction of the conditional probabilities. Do we have a regular version? The answer depends on the topological properties of E. It is not a great surprise that during the proof of the existence of predictable compensators we use the existence of regular conditional distributions on (E, E). The technical details are in the next lemma: Lemma 7.61 If µ is a finite and adapted measure then there is a predictable measure µp such that E
Hdµ
=E
(0,∞)×E
p Hdµ
(0,∞)×E
≥ 0 predictable process on Ω. for every H Proof. As the main problem is some micro surgery on the level of measure-zero P × E. sets, it is not a great surprise, that the proof is quite technical. Let P
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
503
P) the finite measure Let us define on the measurable space (Ω,
E χ • µ (∞) E P D D
(0,∞)×E
∈ P. D
χD dµ ,
As µ is adapted χD • µ (∞) is measurable50 . Hence P is well-defined. Observe that P is a natural generalization of the Dol´eans measure. As µ is finite P is a finite measure. Therefore one can assume that P is a probability measure P). Let us denote by E(· | F) the conditional on the measurable space (Ω, One can expectation operator generated by P and by some σ-algebra F ⊆ P. P × E with the mapping embed σ-algebra P into P D → D × E,
D ∈ P.
0 Let ξ(t, ω, e) e. ξ is an E-valued random variable Denote this σ-algebra by P. P). E is a complete separable metric space so the conditional distribution on (Ω, 0 has a regular version p ( E(χ (ξ ∈ Λ) | P) ω , Λ) p (t, ω, e, Λ). By the definition 0 of the conditional expectation p is P-measurable. So p is constant in e. Hence we can drop the third component and we can denote the conditional distribution
0 E χ (ξ ∈ Λ) | P 0 , P Λ | P
Λ ∈ E.
0 as p (t, ω, Λ). Let H (t, ω) be a P-measurable, therefore P-measurable process Hχ (ξ ∈ Λ) then almost surely by P and let Λ ∈ E. If H
|P 0 E Hχ (ξ ∈ Λ) | P 0 =H ·E χ (ξ ∈ Λ) | P 0 H · P Λ | P 0 . H E p is a regular version of the conditional distribution so
0 = P Λ | P dp = χΛ (e) dp (e) , Λ
Λ ∈ E.
E
Therefore
|P 0 =H H (e) . χΛ (e) dp (e) = HχΛ (e) dp (e) E Hdp E 50 See:
Lemma 7.57, page 500.
E
E
PROCESSES WITH INDEPENDENT INCREMENTS
504
In the usual way one can extend this identity to every P-measurable function Hence the disintegration formula H.
H |P 0 = (e) E Hdp E
on Ω. Let is valid P-almost surely for every predictable function H P (A) P (A × E) = E
χA×E dµ ,
A ∈ P.
(0,∞)×E
| P) 0 is P0 By the definition of the conditional expectation using that E(χ D measurable, hence it does not depend on e
=E χ = χ | P 0 dP = E P D D D = R+ ×Ω
0 E χD | P dP =
= (0,∞)×Ω
E
Ω
R+ ×Ω
E
(7.32)
χD dp (e) dP =
χD dp (e) dP.
If D ∈ P then by the elementary properties of the predictable compensator P (D) P (D × E) E
χD×E
• µ (∞) E
=E
χD χE dµ
χD×E dµ
=
(0,∞)×E
= E ((χD • (χE • µ)) (∞)) =
(0,∞)×E p
= E ((χD • (χE • µ) ) (∞)) E ((χD • Ap ) (∞)) , where A (t, ω) (χE • µ) (t, ω) = µ (ω, (0, t] × E) . As µ is adapted A is adapted. µ is finite so A has integrable variation. Therefore Y Ap exists. Hence if D ∈ P then P (D) = E 0
∞
χD (t, ω) Y (dt, ω) .
(7.33)
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
505
In the usual way one can extend (7.33) to every predictable process. So if H is a predictable process then
H (t, ω) dP (t, ω) =
(0,∞)×Ω
∞
H (t, ω) Y (dt, ω) dP (ω) . Ω
0
0 = χ | P H (t, ω) E χ dp is predictable, hence by (7.32) D E D E
=
∞
= Ω
0
E
χD • µ (∞) P D χD dp (e) dP (t, ω) =
(0,∞)×Ω
(7.34)
E
χD p (t, ω, de) Y (dt, ω) dP (ω) E χD • µp (∞) ,
where µ (ω, C) p
p (t, ω, de) Y (dt, ω) ,
C ∈ B (R+ ) × E.
C
Y Ap is predictable p (t, ω, Λ) is a predictable kernel so by the above example51 µp is a predictable random measure. From (7.34) the lemma is obvious. Theorem 7.62 (Predictable compensator of random measures) If µ ∈ A+ loc then: 1. µ has a predictable compensator denoted by µp and 2. µp ∈ A+ loc . If µp1 and µp2 are predictable compensators of some µ then almost surely µp1 = µp2 . That is the predictable compensator is unique up to indistinguishability. Proof. By definition µ ∈ A+ loc if µ is adapted and σ-finite. The second property means that V • µ ∈ A+ for some positive predictable process V . 1. Let us prove that the first condition in the definition of the predictable com be a non-negative predictable function pensator implies the second one. Let H 1 P, µ) that is H •µ ∈ A+ . This means on Ω. Let us first assume that H ∈ Lloc (Ω, loc that the process H • µ has a predictable compensator (H • µ)p . µp is predictable, • µp is a predictable, increasing process. By which by definition means that H the first condition (7.30) and by the definition of the predictable compensator • µp . By the basic properties of the • µ)p = H of locally integrable processes (H 51 See:
Example 7.56, page 500.
506
PROCESSES WITH INDEPENDENT INCREMENTS
predictable compensator of locally integrable processes52 E
p
• µ (∞) = E H • µ (∞) = E H • µp (∞) , H
be an arbitrary non-negative, which is just the second condition (7.31). Now let H predictable function on Ω. Let V > 0 be a predictable function with V • µ ∈ A+ . n Hχ( H ≤ nV ) is also predictable and as H n • µ ≤ nV • µ H n satisfies n is integrable. Hence by the just proved part of the observation H H (7.31). Using Monotone Convergence Theorem one can show that H also satisfies (7.31). ∈ L1 (Ω, P, µ). This means 2. Now we show that (7.31) implies (7.30). Let H loc + • µ. We are that |H| • µ ∈ Aloc . Let (τ n ) be a localizing sequence of process |H| integrating by trajectories so trivially
τ n ∈ A+ . •µ H χ ([0, τ n ]) • µ = H From this, using condition (7.31)
τ n = H χ ([0, τ n ]) • µp ∈ A+ . H • µp ∈ L1 (Ω, P, µp ). Let τ be a stopping time. As by (7.31) Hence H loc E
± χ ([0, τ n ∧ τ ]) • µp (∞) <∞ ± χ ([0, τ n ∧ τ ]) • µ (∞) =E H H
the expected value of the stopped variable
([0, τ n ]) • µ − Hχ ([0, τ n ]) • µp (τ ) Hχ
is zero. Hence ([0, τ n ]) • µp ([0, τ n ]) • µ − Hχ Hχ •µ− H •µp is a local martingale. So condition is a martingale. This means that H (7.31) implies (7.30). 52 See:
Theorem 3.52, page 213.
PREDICTABLE COMPENSATORS OF RANDOM MEASURES
507
+ 3. As µ ∈ A+ loc it is σ-finite. This means that V •µ ∈ A with some predictable As the second condition of the definition holds obviously function V > 0 on Ω. p + p V • µ ∈ A , so µ is also σ-finite with respect to P. 4. Let us show that µp is unique. Assume that µp1 and µp2 are two predictable compensators of µ. As E is the Borel σ-algebra of a separable metric space it has a countable base (Λn ). Obviously V χΛn is (P × E)-measurable so it is a Obviously predictable function on Ω.
P, µ = L1 Ω, P, µp = L1 Ω, P, µp . V χΛn ∈ L1loc Ω, loc loc 1 2 So by the first condition V χΛn • µp1 − V χΛn • µp2 ∈ L ∩ V is predictable. Hence by Fisk’s theorem53 it is almost surely zero. Let N be the union of the exceptional sets. As (Λn ) is countable P (N ) = 0. V χ ([0, t] × Λn ) • µp1 (ω) = V χ ([0, t] × Λn ) • µp2 (ω) ,
ω∈ / N.
By the Monotone Class Theorem for every non-negative and (B (R+ ) × E)measurable function H V H • µp1 (ω) = V H • µp2 (ω) ,
ω∈ / N.
Hence if ω ∈ / N then µp1 (ω, A) = µp2 (ω, A) for every set A ∈ B (R+ )×E. Therefore the predictable compensator is unique. 5. Let µV (B) V χB dµ, B ∈ B (R+ ) × E. B
≥ 0 is predictable, then µV is an adapted, finite measure. So by the lemma, if H E
−1 • µV (∞) = E HV −1 • µp (∞) . HV V
We are integrating by trajectories, so E
53 See:
• V V −1 µ (∞) = • µ (∞) = E H H
−1 • (V • µ) (∞) = E HV
Theorem 2.11, page 117.
508
PROCESSES WITH INDEPENDENT INCREMENTS
E =E
−1 • µV (∞) = E HV −1 • µp (∞) = HV V
• V −1 • µp (∞) . H V
V −1 is predictable, therefore V −1 • µpV = V −1 dµpV is also predictable. Hence µ has a predictable compensator. Corollary 7.63 Theorem 7.45 is true, that is if X is a right-regular process on Rn then the counting measure µX in (7.27) has a predictable compensator ν.
7.3
Characteristics of Semimartingales
The characteristics of semimartingales are generalizations of the characteristics of L´evy processes. Let X be an n-dimensional semimartingale. Let us fix the so-called truncating function h. h can be any bounded function Rn → Rn with compact support and with the property that h (x) = x in some neighborhood of the origin. The simplest example of truncating function is h (x) xχ (x < 1). To make the notation as simple as possible we shall use h (x) xχ (x < 1) as truncating function, but the specific form of h does not really matter. If the jumps of X are small that is if ∆X is in the neighbourhood related to h then ∆X − h (∆X) = ∆X − ∆X = 0. Let 0 (h) (t) X
(∆X (s) − h (∆X (s)))
s≤t
0 (h) . X (h) X − X 0 (h) is the process of the big jumps of X and X (h) is the process from which X 0 (h) we deleted the big jumps of X. The big jumps cannot accumulate therefore X has bounded variation on finite intervals. As ∆ (X (h)) = h (∆X) and as h has compact support, X (h) is a semimartingale with bounded jumps, that is X (h) is a special semimartingale54 . As X (h) is a special semimartingale it has a unique decomposition X (h) = X (0) + B (h) + L (h)
(7.35)
where L(h) is a local martingale and B(h) is a predictable process with finite variation.
54 See: Example 4.46, page 258.
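The splitting X = X(h) + X̌(h) can be illustrated numerically. A minimal sketch, with h(x) = xχ(|x| < 1) in one dimension; the Cauchy jump sample and all names are illustrative, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    # the truncating function h(x) = x * chi(|x| < 1) used in the text
    return np.where(np.abs(x) < 1.0, x, 0.0)

# hypothetical jump sizes of one trajectory (heavy tails give many big jumps)
jumps = rng.standard_cauchy(1000)

# increments of the big-jump process X-check(h): dX - h(dX) vanishes for |dX| < 1
big_increments = jumps - h(jumps)

# X(h) keeps exactly the truncated jumps h(dX), which are bounded by 1
small_jumps = h(jumps)
```

The sum of `big_increments` over s ≤ t reproduces X̌(h)(t), and all jumps of X(h) stay in the unit ball, which is why X(h) is special.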
Definition 7.64 If X is a semimartingale and h is a truncating function then we call the triplet (B, C, ν) the characteristics of X under h where:
1. B ≝ B(h) is the predictable process with finite variation in line (7.35),
2. C = (C_ij), where C_ij ≝ ⟨X_i^c, X_j^c⟩ = [X_i, X_j]^c, where X_i^c is the continuous part⁵⁵ of the semimartingale X_i,
3. ν is the spectral measure of X, that is, ν is the predictable compensator of µ^X.
Example 7.65 Characteristics of Lévy processes.
Every Lévy process X has a representation

X(t) = M^c(t) + M^d(t) + X̌(h)(t) + γt,

where γt is the expected value of the small jumps. Obviously

X(h) ≝ X − X̌(h) = M^c(t) + M^d(t) + γt

is a special semimartingale, with B(h)(t) = γt. As we have seen, M^c = σw, where w is a Wiener process. Hence C(t) = [σw](t) = σ²t. Let ρ be the predictable compensator of µ^X. For every Λ with 0 ∉ cl(Λ)

χ_Λ • µ^X − χ_Λ • ρ ∈ L,  χ_Λ • µ^X − tν(Λ) = N^Λ(t) − tν(Λ) ∈ L.

From this χ_Λ • ρ − tν(Λ) ∈ L ∩ V. As χ_Λ • ρ − tν(Λ) is predictable

ρ((0, t] × Λ) = ∫_{(0,t]×Λ} dρ = tν(Λ).

As the predictable compensator is unique, (µ^X)^p = tν. So the semimartingale characteristics of X is the triplet (γt, σ²t, tν).
Proposition 7.66 If X is a semimartingale and (B, C, ν) is the characteristics of X then:
1. (‖x‖² ∧ 1) • ν ∈ A^+_loc, and
2. X is a special semimartingale if and only if

(‖x‖² ∧ ‖x‖) • ν ∈ A^+_loc.

55 See: Theorem 4.21, page 234.
In this case if X = X(0) + A + L is the canonical decomposition of X and if h denotes the truncating function of the characteristics then

A = B + (x − h(x)) • ν.

Proof. One can prove the proposition in several steps:
1. For every semimartingale the quadratic variation is finite on any finite interval, so Σ‖∆X‖² = ‖x‖² • µ^X ∈ V⁺. The right-regular process Σ(‖∆X‖² ∧ 1) has bounded jumps, therefore it is locally bounded⁵⁶. As it is increasing

(‖x‖² ∧ 1) • µ^X = Σ(‖∆X‖² ∧ 1) ∈ A^+_loc.

‖x‖² ∧ 1 is a deterministic, non-negative function so it is predictable, therefore by (7.31), using that χ([0, τ_n]) is also predictable,

E(((‖x‖² ∧ 1) • ν)(τ_n)) = E(((‖x‖² ∧ 1)χ([0, τ_n]) • ν)(∞)) =
= E(((‖x‖² ∧ 1)χ([0, τ_n]) • µ^X)(∞)) =
= E(((‖x‖² ∧ 1) • µ^X)(τ_n)) < ∞.

Hence

(‖x‖² ∧ 1) • ν ∈ A^+_loc,

which proves the first statement.
2. X(h) ≝ X − X̌(h) is always a special semimartingale⁵⁷. Therefore X is a special semimartingale if and only if the process of big jumps X̌(h) is a special semimartingale. On the other hand

X̌(h) = Σ(∆X − h(∆X)) = (x − h(x)) • µ^X ∈ V.

Therefore⁵⁸ X̌(h) is a special semimartingale if and only if

X̌(h) = (x − h(x)) • µ^X ∈ A_loc.

56 See: Proposition 1.152, page 107.
57 See: Example 4.47, page 258.
58 See: Theorem 4.44, page 257.
‖x − h(x)‖ is deterministic so it is predictable. Hence, using (7.31) on the positive and on the negative parts of the coordinates, one can prove that X is a special semimartingale if and only if

‖x − h(x)‖ • ν ∈ A^+_loc.  (7.36)

Assume that h(x) ≝ xχ(‖x‖ < 1). Then

‖x − h(x)‖ ≤ ‖x‖² ∧ ‖x‖ ≤ ‖x − h(x)‖ + ‖x‖² ∧ 1.

By the first part of the proposition (‖x‖² ∧ 1) • ν ∈ A^+_loc for every semimartingale, which implies that the second statement holds under the truncating function h(x) ≝ xχ(‖x‖ < 1). One can prove the general case in a similar way.
3. If X has the canonical decomposition X = X(0) + A + L and X(h) = X(0) + B + N then

X̌(h) = X − X(h) = A − B + L − N.

L − N ∈ L and A − B is predictable. This implies⁵⁹ that X̌(h) ∈ A_loc and the predictable compensator of X̌(h) = (x − h(x)) • µ^X ∈ A_loc is A − B. Hence

A − B = ((x − h(x)) • µ^X)^p = (x − h(x)) • (µ^X)^p ≝ (x − h(x)) • ν,

from which the second part of the second statement is evident.
Example 7.67 A compound Poisson process X is a special semimartingale if and only if the distribution of the absolute value of the size of the jumps has finite expected value.
As⁶⁰ ν = λtF, where F is the distribution function of the jumps,

((‖x‖² ∧ ‖x‖) • ν)(t, ω) = λt ∫_{Rⁿ} ‖x‖² ∧ ‖x‖ dF(x).

X is a special semimartingale if and only if (‖x‖² ∧ ‖x‖) • ν ∈ A^+_loc. ν is deterministic so X is a special semimartingale if and only if

∫_{Rⁿ} ‖x‖² ∧ ‖x‖ dF(x) < ∞,

which happens if and only if ∫_{Rⁿ} ‖x‖ dF(x) is finite.
59 See: Proposition 3.35, page 200.
60 See: Example 7.24, page 477.
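The integrability criterion can be probed numerically for concrete jump distributions; since |x|² ∧ |x| is bounded on the unit ball, only the tail part ∫_{|x|≥1} |x| dF decides. A sketch comparing a Gaussian and a Cauchy jump law in one dimension (both choices are illustrative):

```python
import numpy as np

def trapezoid(y, x):
    # plain trapezoid rule, to avoid depending on a specific numpy version
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

def tail_mass(density, M, n=200_000):
    # quadrature for the truncated tail integral ∫_{1<=|x|<=M} |x| density(x) dx
    x = np.linspace(1.0, M, n)
    return 2.0 * trapezoid(x * density(x), x)

gauss  = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
cauchy = lambda x: 1.0 / (np.pi * (1.0 + x**2))

# Gaussian jump law: the integral stabilises, the process is special;
# Cauchy jump law: it grows like (2/pi) log M, so E|jump| = infinity
for M in (10.0, 100.0, 1000.0):
    print(M, tail_mass(gauss, M), tail_mass(cauchy, M))
```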
Corollary 7.68 Let X be a semimartingale. X is a local martingale if and only if

‖x − xχ(‖x‖ < 1)‖ • ν ≝ ‖x − h(x)‖ • ν ∈ A^+_loc

and

0 = B + (x − xχ(‖x‖ < 1)) • ν ≝ B + (x − h(x)) • ν.

Proof. Every local martingale is a special semimartingale with canonical decomposition X = X(0) + 0 + L where L ∈ L. During the proof of the previous proposition⁶¹ we have seen that X is a special semimartingale if and only if ‖x − h(x)‖ • ν ∈ A^+_loc. The second condition is equivalent to the assumption that in the canonical decomposition the finite variation part is zero.
Example 7.69 Symmetric stable processes.
Recall⁶² that a Lévy process X is a symmetric stable process if its characteristics is (0, 0, tν), where ν(dx) = α|x|^{−α−1} dx and 0 < α < 2. By the just proved result X is a local martingale if and only if

(‖x − h(x)‖ • ν)(t, ω) = 2t ∫₁^∞ x dν(x) < ∞.

The integral is finite if and only if α > 1. As we shall prove, every local martingale with independent increments is a martingale⁶³. So X is a martingale if and only if α > 1. If α ≤ 1 then X is a semimartingale⁶⁴, but it is not even a special semimartingale.
61 See: line (7.36), page 511.
62 See: Example 7.39, page 491.
63 See: Theorem 7.97, page 545.
64 Every Lévy process is a semimartingale.
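Assuming the spectral measure has the density α|x|^{−α−1} (the normalization is immaterial here), the tail integral ∫₁^∞ x ν(dx) can be checked numerically: it stabilizes for α > 1 and grows without bound for α ≤ 1. A sketch:

```python
import numpy as np

def tail_integral(alpha, M, n=400_000):
    # ∫_1^M x * alpha * x^{-alpha-1} dx = alpha * ∫_1^M x^{-alpha} dx (trapezoid rule)
    x = np.linspace(1.0, M, n)
    y = alpha * x ** (-alpha)
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# alpha > 1: converges to alpha/(alpha - 1) as M -> infinity
# alpha <= 1: grows without bound as M increases
for alpha in (0.5, 1.5):
    print(alpha, tail_integral(alpha, 1e2), tail_integral(alpha, 1e4))
```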
7.4 Lévy–Khintchine Formula for Semimartingales with Independent Increments
In this section we prove the generalization of the Lévy–Khintchine formula. Recall that if X is a Lévy process then ϕ(u, t) = exp(tφ(u)) where

φ(u) ≝ iuγ − σ²u²/2 + ∫_{R\{0}} (exp(iux) − 1 − iuxχ(|x| < 1)) dν(x).

As Ψ(u, t) ≝ tφ(u) is a continuous process with finite variation it is also clear that

ϕ(u, t) = E(Ψ(u, t)).  (7.37)
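Since Ψ(u, t) = tφ(u) is continuous, deterministic and of finite variation, the Doléans exponential in (7.37) is simply exp(tφ(u)). For a Brownian motion with drift (no jump part) this can be checked by Monte Carlo; a sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
gamma, sigma, t, u = 0.3, 1.2, 2.0, 0.7

# Levy exponent of X(t) = gamma*t + sigma*w(t), with no jump part:
# phi(u) = iu*gamma - sigma^2 u^2 / 2
phi = 1j * u * gamma - sigma**2 * u**2 / 2
theoretical = np.exp(t * phi)

# Monte Carlo estimate of the Fourier transform E exp(iu X(t))
X_t = gamma * t + sigma * np.sqrt(t) * rng.normal(size=500_000)
empirical = np.mean(np.exp(1j * u * X_t))
print(empirical, theoretical)
```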
Let X be a semimartingale. Using the characteristics of X one can define the exponent Ψ(u, t) in a very straightforward⁶⁵ way. Our goal is to show that (7.37) is true for semimartingales with independent increments⁶⁶. There are two major steps in the proof. The first one, and perhaps the more difficult one, is to show that if X is a semimartingale with independent increments then Ψ(u, t) is deterministic. As the other major step, with Itô's formula we shall prove that if X is a semimartingale then

Y(u) ≝ exp(iuX) − exp(iuX−) • Ψ(u)

is a local martingale for every u. If Ψ is deterministic then Y(u) is bounded on any finite interval so it is a martingale. Using Fubini's theorem one can easily show that

E(exp(iuX(t))) − E(exp(iuX(t)−)) • Ψ(u) = 1.

That is, for every u

ϕ(u) − ϕ−(u) • Ψ(u) = 1.

By definition this means that (7.37) holds.
7.4.1 Examples: probability of jumps of processes with independent increments
As we have seen, every Lévy process is continuous in probability. This implies that the probability of a jump of a Lévy process at every moment of time is zero. This property does not hold for processes with independent increments. Perhaps this is the most remarkable property of the class of processes with independent increments. To correctly fix the ideas of the reader, in this subsection we show some examples. Later we shall prove that for processes with independent increments the spectral measure ν ≝ (µ^X)^p has a deterministic version⁶⁷. We shall use this fact several times in the examples of this subsection.
Example 7.70 If X is an arbitrary right-regular process and if ν is the spectral measure of X then for every Λ ∈ E ≝ B(Rⁿ \ {0}), a.s.

ν({t} × Λ) = P(∆X(t) ∈ Λ | F_{t−}).

If X has independent increments then

ν({t} × Λ) = P(∆X(t) ∈ Λ).

A process with independent increments has a jump with positive probability at time t if and only if ν({t} × (Rⁿ \ {0})) > 0.
Let H ≝ χ({t} × Λ). H is deterministic so it is predictable and obviously H • µ^X ∈ A_loc. By (7.30) H • ν = (H • µ^X)^p. By the formula for the jumps of predictable compensators, almost surely

ν({t} × Λ) = (∆(H • ν))(t) = (∆(H • µ^X)^p)(t) =
= (^p∆(H • µ^X))(t) = E(∆(H • µ^X)(t) | F_{t−}) =
= P(∆X(t) ∈ Λ | F_{t−}).

If X has independent increments then ∆X(t) is independent of F_{t−1/n} for any n. Hence it is independent of F_{t−}. So in this case a.s.

ν({t} × Λ) = ν(ω, {t} × Λ) = P(∆X(t) ∈ Λ).

Definition 7.71 Let J ≝ {t: ν({t} × (Rⁿ \ {0})) > 0}.
Example 7.72 Processes with independent increments which are not continuous in probability.
65 See: Definition 7.76, page 518.
66 See: Definition 1.93, page 58.
67 See: Corollary 7.88, page 532.
1. Perhaps the simplest example is the following. Let ξ be an arbitrary random variable. Let

X(t) ≝ { 0 if t < 1;  ξ if t ≥ 1 }.

It is easy to see that X is a process with independent increments. If ξ ≠ 0 then X is not continuous in probability and J = {1}. Let F be the distribution of ξ. The only non-zero part of the spectral measure of X is

ν({1} × Λ) = F(Λ),  Λ ∈ B(R \ {0}).

Obviously the Fourier transform of X(t) is

ϕ(t, u) = { 1 if t < 1;  ∫_R exp(iux) dF(x) if t ≥ 1 }.

Obviously

ν({1} × (R \ {0})) = P(ξ ≠ 0) = 1 − P(ξ = 0).

From this⁶⁸

∫_R exp(iux) dF(x) = 1 · P(ξ = 0) + ∫_{R\{0}} exp(iux) ν({1} × dx) =
= 1 + ∫_{R\{0}} (exp(iux) − 1) ν({1} × dx).
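The jump formula ϕ(t, u) = ∫ exp(iux) dF(x) for t ≥ 1 can be verified by Monte Carlo for a concrete ξ; here ξ is taken uniform on [−1, 1] (an illustrative choice), so the closed form is sin u / u. A sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
u = 2.0
xi = rng.uniform(-1.0, 1.0, size=400_000)   # the single jump, placed at t = 1

# for t >= 1 the Fourier transform is E exp(iu*xi);
# for xi uniform on [-1, 1] this equals sin(u)/u
empirical = np.mean(np.exp(1j * u * xi))
closed_form = np.sin(u) / u
print(empirical, closed_form)
```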
2. Assume that ξ has uniform distribution over [−1, 1]. In this case

∫_R exp(iux) dF(x) = (1/2) ∫_{−1}^{1} cos(ux) dx = sin u / u.

For certain values of u the Fourier transform ϕ(t, u) is never zero, but for certain u at t = 1 it jumps to zero.
3. To make the example a bit more complicated, let (ξ_k) be a sequence of independent random variables and let t_k ↗ ∞. Let

X(t) ≝ Σ_{t_k ≤ t} ξ_k.  (7.38)

68 See: Corollary 7.91, page 535.
It is easy to see again that X is a process with independent increments and J = {(t_k)}. The Fourier transform of X is⁶⁹

ϕ_X(t, u) = ∏_{t_k ≤ t} ∫_R exp(iux) dF_k(x) =
= ∏_{t_k ≤ t} (1 + ∫_{R\{0}} (exp(iux) − 1) ν({t_k} × dx)).

4. Let B(t) be a deterministic, right-regular function. Obviously it is a process with independent increments. Its Fourier transform is ϕ_B(t, u) = exp(iuB(t)).
5. Let us now investigate the process V ≝ X + B, where X is the process in line (7.38). As X and B are independent, the Fourier transform of V is ϕ_B ϕ_X. But let us observe that the spectral measure of V is different from the spectral measure of X, as the jumps of B introduce some new jumps for V. Therefore

ϕ_V(t, u) = exp(iuB(t)) ×
× ∏_{t_k ≤ t} exp(−iu∆B(t_k)) (1 + ∫_{R\{0}} (exp(iux) − 1) ν({t_k} × dx)),

where of course ν denotes the spectral measure of V. Which one can write as

exp(iuB(t)) ×
× ∏_{0 < r ≤ t} exp(−iu∆B(r)) (1 + ∫_{R\{0}} (exp(iux) − 1) ν({r} × dx)),

where r can be any jump-time of B or one of the points t_k.
Example 7.73 If X is an adapted, right-regular process and if ν is the spectral measure of X then for every predictable stopping time τ and for every set Λ ∈ E ≝ B(Rⁿ \ {0}), on the set {τ < ∞}, a.s.
ν([τ] × Λ) = P(∆X(τ) ∈ Λ | F_{τ−}) ≝ E(χ(∆X(τ) ∈ Λ) | F_{τ−}),

that is, ν(ω, {t} × Λ) is a version of ^p(χ(∆X ∈ Λ)) for every Λ ∈ E ≝ B(Rⁿ \ {0}).
If V(t, ω) ≝ χ({t} × Λ) then V is predictable. As

ν(ω, {t} × Λ) = (∆(V • ν))(t, ω),

69 See: Theorem 7.90, page 534.
and V • ν is predictable, ν(ω, {t} × Λ) is a predictable process. Let H ≝ χ([τ] × Λ). As τ is a predictable stopping time, [τ] is a predictable subset⁷⁰. So [τ] × Λ ∈ P and H is predictable. Obviously H • µ^X ∈ A_loc. By (7.30) H • ν = (H • µ^X)^p. By the formula for the jumps of predictable compensators, almost surely on the set {τ < ∞}

ν([τ] × Λ) = (∆(H • ν))(τ) = (∆(H • µ^X)^p)(τ) =
= (^p∆(H • µ^X))(τ) = E(∆(H • µ^X)(τ) | F_{τ−}) =
= P(∆X(τ) ∈ Λ | F_{τ−}).
Example 7.74 If X is an adapted, right-regular process and if ν is the spectral measure of X then every jump time of X is totally inaccessible if and only if ν(ω, {t} × (Rⁿ \ {0})) ≡ 0 up to indistinguishability.
If τ is a predictable jump time then

P(∆X(τ) ≠ 0) = E(ν([τ] × (Rⁿ \ {0}))) = 0.

On the other hand, if X does not have a predictable jump time then ^p(χ(∆X ≠ 0)) = 0. As the predictable projection is unique, ν(ω, {t} × E) = 0 up to indistinguishability.
Example 7.75 Jump times of a process with independent increments are totally inaccessible if and only if the process is continuous in probability.
1. Let τ be a predictable stopping time. From the just proved proposition, using that ν has a deterministic version⁷¹,

P(∆X(τ) ≠ 0) = E(ν([τ] × E)) = ∫_Ω ν(ω, {τ(ω)} × E) dP(ω) =
= ∫_Ω ν({τ(ω)} × E) dP(ω) =
= ∫_Ω P(∆X(τ(ω)) ≠ 0) dP(ω) =
= ∫_Ω 0 dP(ω) = 0,

70 See: Corollary 3.34, page 199.
71 See: Corollary 7.88, page 532.
where in the last line we used that X is continuous in probability, so P(∆X(t) ≠ 0) = 0 for every t. Therefore P(∆X(τ) ≠ 0) = 0, which implies that τ, or any part of τ, is not a jump time of X.
2. On the other hand, if X is not continuous in probability then P(∆X(t) ≠ 0) > 0 for some t. Obviously τ ≡ t is a predictable stopping time.
7.4.2 Predictable cumulants
Definition 7.76 If (B, C, ν) is a characteristic of some semimartingale X then, as in the Lévy–Khintchine formula, let us introduce the exponent

Ψ(u, t) ≝ iuB(t) − (1/2) uC(t)u + (L(u, x) • ν(x))(t)  (7.39)

where

L(u, x) ≝ exp(iux) − 1 − iuh(x)

is the so-called Lévy kernel. We shall call Ψ the predictable cumulant⁷² of X. Observe that L is deterministic and

|L(u, x)| ≤ k(u) · (‖x‖² ∧ 1).  (7.40)
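The bound (7.40) can be probed numerically in one dimension with h(x) = xχ(|x| < 1). As x → 0 the ratio |L(u, x)|/x² tends to u²/2, and for the value u = 3 used below the grid maximum of |L(u, x)|/(x² ∧ 1) is about u²/2, so that constant serves as k(u) here (for small u the region |x| ≥ 1 dominates instead). A sketch:

```python
import numpy as np

def levy_kernel(u, x):
    # L(u, x) = exp(iux) - 1 - iu*h(x) with h(x) = x * chi(|x| < 1)
    return np.exp(1j * u * x) - 1 - 1j * u * np.where(np.abs(x) < 1.0, x, 0.0)

u = 3.0
x = np.linspace(-10.0, 10.0, 200_001)
bound = np.minimum(x**2, 1.0)

# smallest admissible constant on this grid; near 0 the ratio tends to u^2/2
ratio = np.abs(levy_kernel(u, x)) / np.maximum(bound, 1e-300)
k_u = ratio.max()
print(k_u, u**2 / 2)
```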
Therefore the integral in (7.39) exists⁷³ and L • ν ∈ A_loc. As uC(t)u is a continuous increasing process and as every right-regular predictable process is locally bounded⁷⁴, it is clear from the definition that Ψ(u, t) ∈ A_loc for every u. First we prove an important technical observation:
Lemma 7.77 X is an n-dimensional semimartingale if and only if exp(iuX) is a semimartingale for every u.
Proof. If X is a semimartingale, then by Itô's formula exp(iuX) is a semimartingale. On the other hand, assume that exp(iuX) is a semimartingale for every u. This implies that sin(uX_j) is a semimartingale for every u and for every coordinate X_j of X. Let f ∈ C²(R) be such that f(sin x) = x on the set |x| ≤ 1 < π/2. Let us introduce the stopping times

τ_n ≝ inf{t: |X_j(t)| > n} = inf{t: |X_j(t)/n| > 1}.

|X_j(t)/n| ≤ 1 on [0, τ_n), so on this random interval

X_j = n (X_j/n) = n f(sin(X_j/n)).

By Itô's formula the right-hand side is always a semimartingale. Therefore by the next lemma X_j is a semimartingale.
Lemma 7.78 Let (τ_n) be a localizing sequence and let (Y_n) be a sequence of semimartingales. If X = Y_n on [0, τ_n) for every n then X is a semimartingale.
Proof. To make the notation simple let τ_0 = 0. As τ_n ↗ ∞, for every t

X(t) = lim_{n→∞} Y_n(t).

72 As ν and B are predictable and C is continuous, Ψ is predictable.
73 See: Proposition 7.66, page 509.
74 See: Proposition 3.35, page 200.
Hence X is adapted and it is obviously right-regular. If

Z_n ≝ Y_n^{τ_n} + (X(τ_n) − Y_n(τ_n)) χ([τ_n, ∞)),

then X and Z_n are equal on [0, τ_n]. As X is adapted and right-regular, the second component is adapted, hence it is in V. The first expression is a stopped semimartingale, so the sum, Z_n, is a semimartingale. Let Z_n = X(0) + L_n + V_n be a decomposition of Z_n. Then X = X(0) + L + V, where

L ≝ Σ_n L_n χ((τ_{n−1}, τ_n]) and V ≝ X − X(0) − L.

L^{τ_n} = Σ_{p=1}^{n} L_p χ((τ_{p−1}, τ_p]) = Σ_{p=1}^{n} (L_p^{τ_p} − L_p^{τ_{p−1}}) ∈ M,

so L ∈ L. The proof of V ∈ V is similar.
Proposition 7.79 (Characterization of predictable cumulants) Let X be an n-dimensional right-regular process. The next statements are equivalent:
1. X is a semimartingale and Ψ is the predictable cumulant of X.
2. exp(iuX) − exp(iuX−) • Ψ(u) is a complex valued local martingale for every u.
Proof. The main part of the proof is an application of Itô's formula.
1. Assume that the first statement holds. Using the definition of the characteristics, X has a decomposition

X = X(0) + B + L + Σ_{s≤·} (∆X − h(∆X)),

where L is the local martingale part of the special semimartingale X(h). Let f ∈ C²(Rⁿ). By Itô's formula

f(X) − f(X(0)) = Σ_{j=1}^{n} (∂f/∂x_j)(X−) • B_j +
+ Σ_{j=1}^{n} (∂f/∂x_j)(X−) • L_j +
+ Σ_{j=1}^{n} (∂f/∂x_j)(X−) • Σ_{s≤·} (∆X_j − h_j(∆X)) +
+ (1/2) Σ_{j=1}^{n} Σ_{k=1}^{n} (∂²f/∂x_j∂x_k)(X−) • ⟨X_j^c, X_k^c⟩ +
+ Σ_{s≤·} (f(X) − f(X−) − Σ_{j=1}^{n} (∂f/∂x_j)(X−) ∆X_j).

One can write the third line as

Σ_{s≤·} Σ_{j=1}^{n} (∂f/∂x_j)(X−) (∆X_j − h_j(∆X)).

Let us introduce the predictable process

H(t, ω, e) ≝ Σ_{j=1}^{n} (∂f/∂x_j)(X(t−)) (e_j − h_j(e)) +
+ f(X(t−) + e) − f(X(t−)) − Σ_{j=1}^{n} (∂f/∂x_j)(X(t−)) e_j =
= f(X(t−) + e) − f(X(t−)) − Σ_{j=1}^{n} (∂f/∂x_j)(X(t−)) h_j(e).
With this notation

f(X) − f(X(0)) = Σ_{j=1}^{n} (∂f/∂x_j)(X−) • B_j +
+ (1/2) Σ_{j=1}^{n} Σ_{k=1}^{n} (∂²f/∂x_j∂x_k)(X−) • ⟨X_j^c, X_k^c⟩ +
+ H • µ^X +
+ Σ_{j=1}^{n} (∂f/∂x_j)(X−) • L_j.

Let us assume that f is bounded. In this case the left-hand side is a bounded semimartingale, hence it is a special semimartingale. The first and the second expressions on the right-hand side are obviously predictable and have finite variation. The fourth expression is a local martingale. This implies that the third expression on the right-hand side is also a special semimartingale. Hence⁷⁵ H • µ^X ∈ A_loc. H is predictable so by definition H ∈ L¹_loc(Ω, P̃, µ^X). Therefore⁷⁶, by the elementary properties of the predictable compensator of µ^X,

H • µ^X − H • ν ∈ L.

75 See: Theorem 4.44, page 257.
76 See: line (7.30), page 501.

2. Let f(x) ≝ exp(iux). Then

∂f/∂x_j = iu_j f,  ∂²f/∂x_j∂x_k = −u_j u_k f.

In this case

H(t, ω, e) ≝ exp(iu(X(t−))) exp(iue) −
− exp(iuX(t−)) − Σ_{j=1}^{n} iu_j exp(iuX(t−)) h_j(e),

that is,

H(t, ω, e) = exp(iu(X(t−))) · (exp(iue) − 1 − iuh(e)).

Hence

f(X) − f(X(0)) − Σ_{j=1}^{n} iu_j f(X−) • B_j + (1/2) Σ_{j=1}^{n} Σ_{k=1}^{n} u_j u_k f(X−) • C_jk − H • ν

is a local martingale. One can write the last three expressions as

f(X−) • (iuB − (1/2) uCu + (exp(iux) − 1 − iuh(x)) • ν(x)) ≝ f(X−) • Ψ(u),

hence

exp(iuX) − exp(iuX−) • Ψ(u)
is a local martingale.
3. Assume that the second statement holds. First we prove that X is a semimartingale. exp(iuX−) • Ψ(u) has finite variation for every u, as the integrand is bounded and Ψ(u, t) has finite variation in t. By the assumption exp(iuX) − exp(iuX−) • Ψ(u) is a local martingale. Therefore exp(iuX) is a semimartingale for every u. Hence by the lemma above X is a semimartingale.
4. Finally we prove that the predictable cumulant of X is Ψ. By the already proved part of the proposition, if Ψ̃ denotes the predictable cumulant of X then

exp(iuX) − exp(iuX−) • Ψ̃(u)

is a local martingale. Hence
Y(u) ≝ exp(iuX−) • Ψ(u) − exp(iuX−) • Ψ̃(u) = exp(iuX−) • (Ψ(u) − Ψ̃(u))

is also a local martingale. Y has finite variation on any finite interval and, as Ψ(u) − Ψ̃(u) is predictable, it is also predictable⁷⁷. Therefore by Fisk's theorem Y(u) with probability one⁷⁸ is zero for every u. Therefore

0 = exp(−iuX−) • Y(u) =
= exp(−iuX−) • (exp(iuX−) • (Ψ(u) − Ψ̃(u))) =
= 1 • (Ψ(u) − Ψ̃(u)) = Ψ(u) − Ψ̃(u).

So with probability one Ψ̃(u, t, ω) = Ψ(u, t, ω) in t for every u. The expressions are continuous in u, hence one can unify the zero sets. So Ψ̃(u) = Ψ(u) with probability one for every u.
77 See: Example 7.56, page 500.
78 See: Corollary 3.40, page 205.
From the last part of the proof of the proposition the next statement is trivial:
Corollary 7.80 Let Φ(u, t, ω) be predictable, continuous in u and right-continuous with finite variation in t. If for every u

exp(iuX) − exp(iuX−) • Φ(u)

is a local martingale then X is a semimartingale and Φ − Φ(0) is the predictable cumulant of X.
Corollary 7.81 If Ψ is the predictable cumulant of a semimartingale X and τ is a stopping time then the predictable cumulant of X^τ is Ψ^τ.
Proof. For every u

(exp(iuX) − exp(iuX−) • Ψ(u))^τ = exp(iuX^τ) − exp(iuX−^τ) • Ψ^τ(u)

is a local martingale.
7.4.3 Semimartingales with independent increments
Every Lévy process is a semimartingale. This is not true for processes with independent increments.
Proposition 7.82 A deterministic right-regular process S is a semimartingale if and only if it has finite variation on any finite interval [0, t].
Proof. If S has the stated properties then S is obviously a semimartingale. Now let S be a deterministic semimartingale. As S is a semimartingale one can define the continuous linear functional⁷⁹ f ↦ (f • S)(t) on C([0, t]). By the Riesz representation theorem⁸⁰ there is a function V with finite variation such that

(f • S)(t) = ∫₀ᵗ f dV,  f ∈ C([0, t]).

From the Dominated Convergence Theorem it is clear that S − S(0) = V − V(0) on [0, t], so S has finite variation. As every right-regular, deterministic function starting from the origin is a process with independent increments, there are processes with independent increments which are not semimartingales. When is a process with independent increments a semimartingale?
79 As f • S is also an Itô–Stieltjes integral, it is deterministic.
80 See: [80].
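By Proposition 7.82, a continuous deterministic function of unbounded variation is a process with independent increments that is not a semimartingale. A numerical sketch (the function t·cos(π/t) is a standard example of this kind, not taken from the text): along the points 1/k it oscillates by roughly 2/k per step, so its variation grows like the harmonic series.

```python
import numpy as np

def variation(f, ts):
    # total variation of f along the partition ts
    return np.sum(np.abs(np.diff(f(ts))))

# f(t) = t*cos(pi/t), extended by f(0) = 0, is continuous on [0, 1]
# but has infinite variation near the origin
f = lambda t: t * np.cos(np.pi / t)

for n in (100, 10_000):
    ts = 1.0 / np.arange(n, 0, -1.0)   # increasing partition 1/n, ..., 1
    print(n, variation(f, ts))
```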
Theorem 7.83 (Characterization of semimartingales with independent increments) An n-dimensional process X with independent increments is a semimartingale if and only if the Fourier transform of X

ϕ(u, t) ≝ E(exp(iuX(t)))

has finite variation in the variable t on every finite interval, for every u.
Proof. Observe that by definition X is right-regular. Therefore by the Dominated Convergence Theorem ϕ is also right-regular in t.
1. Let us fix the parameter u. As the increments are not stationary⁸¹, it can happen that ϕ(u, t) = 0 for some t. Let

t_0(u) ≝ inf{t: ϕ(u, t) = 0}.

ϕ(u, 0) = 1 and as ϕ(u, t) is right-regular in t, obviously t_0(u) > 0. Obviously |ϕ(u, t)| is positive on [0, t_0(u)). We show that t → |ϕ(u, t)| is decreasing on R₊ and it is zero on [t_0(u), ∞). Let

h(u, s, t) ≝ E(exp(iu(X(t) − X(s)))).

X has independent increments, so if s < t then

ϕ(u, t) = ϕ(u, s) h(u, s, t).  (7.41)

|h(u, s, t)| ≤ 1, therefore, as we said, |ϕ| is decreasing. By the right-regularity ϕ(u, t_0(u)) = 0. So as |ϕ| ≥ 0 and as it is decreasing, ϕ is zero on the interval [t_0(u), ∞). As ϕ(u, t) is right-regular in t, if t_0(u) < ∞ then ϕ(u, t_0(u)−) is well-defined. We show that it is not zero. By (7.41) if s < t_0(u) then

ϕ(u, t_0(u)−) = ϕ(u, s) h(u, s, t_0(u)−).

ϕ(u, s) ≠ 0 by the definition of t_0(u). So if ϕ(u, t_0(u)−) = 0 then

h(u, s, t_0(u)−) = 0

81 See: Proposition 1.99, page 63.
for every s < t_0(u). But

0 = lim_{s↗t_0(u)} h(u, s, t_0(u)−) =
= lim_{s↗t_0(u)} E(exp(iuX(t_0(u)−) − iuX(s))) =
= E(exp(0)) = 1,

which is impossible. Therefore ϕ(u, t_0(u)−) ≠ 0.
2. Let

Z(u, t) ≝ { exp(iuX(t))/ϕ(u, t) if t < t_0(u);  exp(iuX(t_0(u)−))/ϕ(u, t_0(u)−) if t ≥ t_0(u) }.

X has independent increments, so Z(u) is a martingale on t < t_0(u). As |ϕ(u, t_0(u)−)| > 0, in the next calculation one can use the Dominated Convergence Theorem:

E(Z(u, t_0(u)) | F_s) = E( lim_{t↗t_0(u)} Z(u, t) | F_s ) =
= lim_{t↗t_0(u)} E(Z(u, t) | F_s) = Z(u, s)
for every s < t_0(u). So Z is a martingale on R₊.
3. By Itô's formula

ϕ(u, t) = exp(iuX(t)) χ(t < t_0(u)) / Z(u, t)

is also a semimartingale. Hence ϕ(u, t) is a deterministic semimartingale, so by the just proved proposition it has finite variation.
4. Now we prove the other implication. Assume that ϕ(u, s) has finite variation on every finite interval [0, t]. One should show that X is a semimartingale. Let us fix a t. If u → 0 then ϕ(u, t) → 1, so there is a b > 0 such that if |u| ≤ b then |ϕ(u, t)| > 0. By the first part of the proof |ϕ| is decreasing, so if s ≤ t and |u| ≤ b then

exp(iuX(s)) = Z(u, s) · ϕ(u, s).

Z(u, s) is a martingale, ϕ(u, s) has finite variation, so by Itô's formula the stopped process exp(iuX^t) is a semimartingale. If |u| > b then for some m large enough |u/m| ≤ b, and

exp(iuX) = (exp(i(u/m)X))^m.
Therefore by Itô's formula the stopped process exp(iuX^t) is again a semimartingale for every u. Hence X^t is a semimartingale⁸² for every t. Using the trivial localization property of semimartingales⁸³ it is easy to show that X is a semimartingale.
Theorem 7.84 (Predictable cumulants and independent increments) Let X be an n-dimensional semimartingale and let Ψ(u, t, ω) be the predictable cumulant of X. If X has independent increments then Ψ is deterministic and

ϕ = E(Ψ).  (7.42)

Proof. Let us fix an u. Let again

t_0(u) ≝ inf{t: ϕ(u, t) = 0}.

ϕ(u, t) is right-continuous and ϕ(u, 0) = 1, therefore t_0(u) > 0.
1. Let

U ≝ { X(t) if t < t_0(u);  X(t_0(u)−) if t ≥ t_0(u) }.

Let γ ≝ ϕ_U(u, t) ≝ E(exp(iuU(t))). Recall that ϕ(u, t_0(u)−) ≠ 0, therefore γ ≠ 0. Let

A ≝ (1/γ−) • γ.

Observe that, as ϕ has finite variation, the integral is well-defined⁸⁴.

γ = 1 + 1 • γ = 1 + (γ−/γ−) • γ = 1 + γ− • ((1/γ−) • γ) ≝ 1 + γ− • A.

As the Doléans equations have unique solution⁸⁵, γ = E(A). That is,

ϕ(u, t) = γ(t) = E(A),  t ∈ [0, t_0(u)).

82 See: Lemma 7.77, page 518.
83 See: Lemma 7.78, page 519.
84 See: Proposition 1.151, page 106.
85 See: Theorem 6.56, page 412.
We prove that A = Ψ on [0, t_0(u)). Let

Y(t, u) ≝ exp(iuX(t)),  Z(u, t) ≝ exp(iuX(t))/ϕ(u, t),  t < t_0(u).

X has independent increments, so Z(u) is a martingale. Integrating by parts on the interval [0, t_0(u)) and using that ϕ has finite variation⁸⁶

Y − Y− • A = Y − Y− • ((1/γ−) • ϕ) =
= Y − Y− • ((1/ϕ−) • ϕ) = Y − (Y−/ϕ−) • ϕ =
= Zϕ − Z− • ϕ =
= Z(0)ϕ(0) + ϕ− • Z + [Z, ϕ] =
= Z(0)ϕ(0) + ϕ− • Z + Σ∆Z∆ϕ =
= Z(0)ϕ(0) + ϕ− • Z + ∆ϕ • Z =
= Z(0)ϕ(0) + ϕ • Z.

ϕ is locally bounded, hence ϕ • Z is a local martingale on [0, t_0(u)). As Y ≝ exp(iuX),

exp(iuX) − exp(iuX−) • A

is a local martingale on [0, t_0(u)). So⁸⁷ A(u) = Ψ(u) and Ψ(u) is deterministic on [0, t_0(u)).
2. Let

V(t) ≝ { 0 if t < t_0(u);  ∆X(t_0(u)) if t ≥ t_0(u) }.

Obviously the process V has independent increments. The spectral measure of V is⁸⁸ a.s.

ν({t_0(u)} × Λ) = P(∆X(t_0(u)) ∈ Λ) ≝ F(Λ).

86 See: Example 4.39, page 249. Let us recall that ϕ ∈ V is deterministic therefore it is predictable.
87 See: Corollary 7.80, page 523.
88 See: Example 7.70, page 514.
PROCESSES WITH INDEPENDENT INCREMENTS
If t < t0 (u) then obviously ϕV (u, t) = 1. With simple calculation89 if t ≥ t0 (u) ϕV (u, t) E (exp (iu∆X (t0 ))) = exp (iux) dF (x) = = Rn
=1+ Rn \{0}
exp (iux) − 1dF (x) =
=1+
Rn \{0}
exp (iux) − 1ν ({t0 (u)} × dx) =
=1+
Rn \{0}
iuh (x) ν ({t0 (u)} × dx) +
L (u, x) ν ({t0 (u)} × dx) .
+ Rn \{0}
where ν ({t0 (u)} × Λ) is deterministic. It is easy to see that in the decomposition of V (h) the local martingale part is
0 if t < t0 (u) h (∆X (t0 (u))) − E (h (∆X (t0 (u)))) if t ≥ t0 (u)
so
0 if t < t0 (u) = E (h (∆X (t0 ))) if t ≥ t0 (u) 0 if t < t0 (u) = . h (x) ν ({t0 (u)} × dx) if t ≥ t0 (u) Rn \{0}
B=
Hence ΨV (u, t) =
0 if t < t0 (u) ϕV (u, t) − 1 if t ≥ t0 (u)
that is ϕV (u, t) = 1 + ΨV (u, t) where ΨV (u, t) is deterministic. 89 See:
Example 7.72, page 514.
3. Obviously U and V are independent and X = U + V on [0, t_0(u)]. From the definition of the predictable cumulant one can easily prove that on [0, t_0(u)]

Ψ_X(u, t) = Ψ_U(u, t) + Ψ_V(u, t).

Therefore if t ∈ [0, t_0(u)] then Ψ_X(u, t) is deterministic.
4. Let

W(u, t) ≝ { 0 if t < t_0(u);  X(t) − X(t_0(u)) if t ≥ t_0(u) }.

W is a semimartingale with independent increments, and with the same argument as above one can show that there is a t_1(u) > t_0(u) such that the predictable cumulant of W is deterministic on [0, t_1(u)]. If t ∈ [0, t_1(u)) then

Ψ_X(u, t) = Ψ_V(u, t) + Ψ_U(u, t) + Ψ_W(u, t),

therefore Ψ_X(u, t) is almost surely deterministic if t ∈ [0, t_1(u)).
5. Let t_∞(u) be the supremum of the time-parameters t(u) for which Ψ_X(u, t) is almost surely deterministic on t ∈ [0, t(u)]. Let t_n(u) ↗ t_∞(u). If t_∞(u) = ∞ then, unifying the zero sets, one can easily show that almost surely Ψ_X(u, t) has a deterministic version. If t_∞(u) < ∞ then as above one can prove that Ψ_X(u, t) is deterministic on [0, t_∞(u)] and one can find a δ > 0 such that Ψ_X(u, t) is deterministic on [0, t_∞(u) + δ], which is impossible. Now for every u with rational coordinates let us construct a deterministic version of Ψ_X(u, t). Unifying the measure-zero sets and using that Ψ_X(u, t) is continuous in u, one can construct an almost surely deterministic version of Ψ_X(u, t).
6. As Ψ is deterministic, the local martingale

exp(iuX) − exp(iuX−) • Ψ_X(u, t)

is bounded on any finite interval. So it is a martingale. Using Fubini's theorem one can easily show that

E(exp(iuX(t))) − E(exp(iuX(t)−)) • Ψ(u) = 1.

That is, for every u

ϕ(u) − ϕ−(u) • Ψ(u) = 1.

Hence ϕ(u) = E(Ψ(u)).
7.4.4 Characteristics of semimartingales with independent increments
Now we show that the characteristics of semimartingales with independent increments are deterministic. The main step is the next famous classical observation:
Theorem 7.85 Let b be an n-dimensional vector, C a positive semidefinite matrix and let ν be a measure on Rⁿ with

ν({0}) = 0 and ∫_{Rⁿ\{0}} ‖x‖² ∧ 1 dν(x) < ∞.

Let h be an arbitrary truncating function. The function

φ(u) ≝ iub − (1/2) uCu + ∫_{Rⁿ\{0}} L(x, u) dν(x)

determines the triplet (b, C, ν).
Proof. Let v ∈ Rⁿ \ {0} and

ψ_v(u) ≝ φ(u) − (1/2) ∫_{−1}^{1} φ(u + tv) dt.

With a simple calculation

ψ_v(u) = (1/4)(vCv) ∫_{−1}^{1} t² dt +
+ (1/2) ∫_{−1}^{1} ∫_{Rⁿ\{0}} exp(iux) (1 − exp(itvx) + itvh(x)) dν(x) dt.

By the integrability assumption one can use Fubini's theorem to change the order of the integration. Hence

ψ_v(u) = (1/6)(vCv) +
+ (1/2) ∫_{Rⁿ\{0}} exp(iux) ∫_{−1}^{1} (1 − exp(itvx)) dt dν(x) =
= (1/6)(vCv) + ∫_{Rⁿ\{0}} exp(iux) (1 − sin(vx)/(vx)) dν(x).

By the integrability condition

σ_v(Λ) ≝ (1/6)(vCv) δ₀(Λ) + ∫_Λ (1 − sin(vx)/(vx)) dν(x),  Λ ∈ B(Rⁿ)
is a finite measure and ψ_v(u) is the Fourier transform of σ_v. If we know σ_v then

vCv = 6σ_v({0}).

If vx ≠ 0 then

1 − sin(vx)/(vx) ≠ 0,

and ν and σ_v are equivalent on the set {vx ≠ 0}. Hence the set of measures {σ_v} determines⁹⁰ ν. If ⇒ denotes the relation that the left-hand side uniquely determines the right-hand side then obviously

φ ⇒ ψ_v ⇒ σ_v ⇒ (C, ν).

Obviously b is determined by (C, ν) and φ. Therefore φ determines (b, C, ν).
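The averaging construction in the proof can be traced numerically in the purely Gaussian case, where ν = 0, σ_v = (vCv/6)δ₀ and hence ψ_v(u) = vCv/6 for every u. A sketch with illustrative parameters, using trapezoid quadrature for the t-integral:

```python
import numpy as np

sigma2, v = 1.7, 0.9

def phi(u):
    # exponent of a centred Gaussian law: b = 0, C = sigma2, nu = 0
    return -sigma2 * u**2 / 2

def psi_v(u, n=100_001):
    # psi_v(u) = phi(u) - (1/2) * integral_{-1}^{1} phi(u + t*v) dt (trapezoid rule)
    t = np.linspace(-1.0, 1.0, n)
    y = phi(u + t * v)
    integral = np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))
    return phi(u) - integral / 2

# with no jump part the construction returns vCv/6, independently of u
print(psi_v(0.0), sigma2 * v**2 / 6)
```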
Corollary 7.86 If (B₁, C₁, ν₁) and (B₂, C₂, ν₂) are different characteristics and Ψ₁ and Ψ₂ are the corresponding predictable cumulants then Ψ₁ ≠ Ψ₂.
Proof. Assume that Ψ₁ = Ψ₂. Let us fix an ω and a t, and let u ∈ Rⁿ. L depends only on x, so if

ν̃_i(Λ) ≝ ν_i(ω, (0, t] × Λ),  i = 1, 2,

then one can write the integral in the definition of the predictable characteristics as

∫_{Rⁿ\{0}} L(u, x) dν̃_i(x),  i = 1, 2.

From the previous theorem, for every t and ω

uB₁(t, ω) = uB₂(t, ω),  uC₁(t, ω)u = uC₂(t, ω)u

and

ν₁(ω, (0, t] × Λ) = ν₂(ω, (0, t] × Λ).

Hence (B₁, C₁, ν₁) = (B₂, C₂, ν₂).
532
PROCESSES WITH INDEPENDENT INCREMENTS
Corollary 7.87 If X is an n-dimensional semimartingale with independent increments then the characteristics of X are deterministic. Corollary 7.88 If X is an n-dimensional process with independent increments then its spectral measure ν, that is the predictable compensator of µX , is deterministic. In this case ν ((0, t] × Λ) is the expected value of the jumps belonging to Λ during the time period (0, t]. Proof. Let ν be the spectral measure of X. We show that ν is deterministic. Let g χΛ , where Λ ∈ B (Rn \ {0}) with 0 ∈ / cl (Λ). As 0 ∈ / cl (Λ) the jumps in Λ cannot accumulate so g • µX is a finite valued process. For an arbitrary s if t > s then
g (∆Xr ) , g • µX (t) − g • µX (s) = s
is independent of the σ-algebra Fs . g ≥ 0 therefore g • µX is increasing so it is a semimartingale. Hence the spectral measure of g • µX is deterministic. As g is bounded g • µX is increasing with bounded jumps. Hence g • µX is locally bounded. If H ≥ 0 is a predictable process then
E H • g • µX = E Hg • µX = E (Hg • ν) = E (H • (g • ν)) . g • ν is predictable, hence the compensator of g • µX ∈ A+ loc is g • ν. This implies that (g • ν) (t) = ν ((0, t] × Λ) is deterministic. ν is defined on B (R+ ) × B (Rn \ {0}). Hence ν is deterministic. Theorem 7.89 (Characteristics and independent increments) Let X be an n-dimensional semimartingale with X (0) = 0. The characteristics of X have a deterministic version if and only if X has independent increments. In this case if s < t then h (u, s, t) E (exp (iu (X (t) − X (s)))) = E (Ψ (u, t) − Ψ (u, s)) , where Ψ denotes the predictable cumulant of X. Proof. One should only prove that if the characteristic are deterministic then X has independent increments. Let X be a semimartingale with X (0) = 0 and let Ψ be the predictable cumulant of X. If the characteristics are deterministic then Ψ is deterministic. As we have proved91 U (u) exp (iuX) − exp (iuX− ) • Ψ 91 See:
Proposition 7.79, page 519.
(7.43)
is a local martingale. But |U(u)| ≤ 1 + Var(Ψ(u)) < ∞. As the characteristics are deterministic, Var(Ψ) is deterministic, so it is obviously integrable by P. Hence on any finite interval U(u) ∈ D. Therefore U(u) is a martingale. If 0 ≤ s < t then, using (7.43),

exp(iu(X(t) − X(s))) = exp(iuX(t))/exp(iuX(s)) =
= U(u, t)/exp(iuX(s)) + (exp(iuX₋) • Ψ)(t)/exp(iuX(s)) =
= (U(u, t) − U(u, s))/exp(iuX(s)) + (U(u, s) + (exp(iuX₋) • Ψ)(t))/exp(iuX(s)) =
= (U(u, t) − U(u, s))/exp(iuX(s)) + (∫_s^t exp(iuX(r−)) dΨ(r) + exp(iuX(s)))/exp(iuX(s)).

Multiplying by χ_F where F ∈ F_s,

χ_F exp(iu(X(t) − X(s))) =
= χ_F exp(−iuX(s))(U(u, t) − U(u, s)) + ∫_s^t χ_F exp(iu(X(r−) − X(s))) dΨ(r) + χ_F.

U is a martingale, hence by the elementary properties of the conditional expectation the expected value of the first term on the right-hand side is zero. Let

f(r) := E(χ_F exp(iu(X(r) − X(s)))).

Taking expected value and by Fubini's theorem changing the integrals⁹²

f(t) = P(F) + ∫_s^t f(r−) dΨ(r) = P(F) + ∫_0^t f(r−) d(Ψ(r) − Ψ(r ∧ s)).

⁹² Observe that we have used that Ψ is deterministic.
As every Doléans equation has just one solution⁹³,

f(t) = P(F) · ℰ(Ψ(t) − Ψ(s ∧ t)) = P(F) · ℰ(Ψ(t) − Ψ(s)).

Therefore for every u

E(χ_F exp(iu(X(t) − X(s)))) = P(F) · ℰ(Ψ(u, t) − Ψ(u, s)).

This means that X(t) − X(s) is independent of F_s and

E(exp(iu(X(t) − X(s)))) = ℰ(Ψ(u, t) − Ψ(u, s)).
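The uniqueness of the solution of the Doléans equation, and the exponential formula for the Doléans exponential of a deterministic finite-variation function, can be checked numerically. The sketch below is an illustrative toy, not part of the book's argument: the drift, the jump size, and the grid are arbitrary assumptions.

```python
import math

def doleans_numeric(psi, s, t, n=200_000):
    """Grid approximation of the Doleans exponential of r -> psi(r) - psi(r ^ s):
    f is multiplied by (1 + increment of psi) over each small step, which is the
    Euler scheme for f(t) = 1 + int_s^t f(r-) dpsi(r)."""
    h = (t - s) / n
    f = 1.0
    for k in range(n):
        f *= 1.0 + (psi(s + (k + 1) * h) - psi(s + k * h))
    return f

def psi(r):
    """A deterministic finite-variation 'cumulant': drift plus one jump at r = 1."""
    return 0.5 * r + (0.25 if r >= 1.0 else 0.0)

s, t = 0.0, 2.0
numeric = doleans_numeric(psi, s, t)
# Exponential formula: E(Z)(t) = exp(Z(t)) * prod over jumps of (1 + dZ) exp(-dZ),
# with Z(r) = psi(r) - psi(s); here the single jump is dZ = 0.25 at r = 1.
closed_form = math.exp(psi(t) - psi(s)) * (1.0 + 0.25) * math.exp(-0.25)
```

The grid product and the closed form agree to several decimal places, which is the uniqueness statement used in the proof in miniature.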
7.4.5
The proof of the formula
Theorem 7.90 (L´ evy–Khintchine formula) If X is an n-dimensional semimartingale with independent increments then E (exp (iu (X (t) − X (s)))) = exp (U ) · V,
(7.44)
where

U := iu[B(t) − B(s)] − (1/2) u[C(t) − C(s)]u + ∫_{(s,t]×(R^n\{0})} (exp(iux) − 1 − iuh(x)) χ_{J^c}(r) dν(r, x)

and

V := ∏_{s<r≤t} exp(−iu∆B(r)) (1 + ∫_{R^n\{0}} (exp(iux) − 1) ν({r} × dx)),

where

J := {r : ν({r} × (R^n \ {0})) > 0}.

Proof. The formula in (7.44) is a direct consequence of the formula of the solution of the Doléans equation⁹⁴. Z(t) := Ψ(t) − Ψ(s) has finite variation, so

ℰ(Z) = exp(Z) ∏ (1 + ∆Z) exp(−∆Z).

⁹³ See: line (6.44), page 411.
⁹⁴ See: Theorem 6.56, page 412.
The jumps ∆Z are the sum of the jumps of B and the jumps of the integral in Ψ. The integral in Ψ has a jump at r if and only if r ∈ J. Hence

∆Z(r) = iu∆B(r) + ∫_{R^n\{0}} (exp(iux) − 1 − iuh(x)) ν({r} × dx).

As in the integral in U one has χ_{J^c}, we have cancelled the jumps

∫_{R^n\{0}} (exp(iux) − 1 − iuh(x)) ν({r} × dx)

from (7.44), which proves the formula for U. ν and B are deterministic, so

∫_{R^n\{0}} h(x) ν({r} × dx) = E(h(∆X(r))) = E(∆B(r) + ∆L(r)) = ∆B(r) + E(∆L(r)),

where L is the local martingale in the decomposition of X(h). If (τ_n) is a localizing sequence of L then E(∆L^{τ_n}(r)) = 0. As |∆L(r)| ≤ 1 + |∆B(r)|, by the Dominated Convergence Theorem E(∆L(r)) = 0. Therefore

∆Z(r) = ∫_{R^n\{0}} (exp(iux) − 1) ν({r} × dx),

which proves the formula for V.

Corollary 7.91 If X is an n-dimensional semimartingale with independent increments then the Fourier transform of ∆X(t) is

E(exp(iu∆X(t))) = 1 + ∫_{R^n\{0}} (exp(iux) − 1) ν({t} × dx),

so ∆X(t) is not zero with positive probability if and only if ν({t} × (R^n \ {0})) > 0, that is if t ∈ J.

Proof. If s ↗ t in (7.44) then all the other terms disappear.

Corollary 7.92 If X is an n-dimensional semimartingale with independent increments then X is continuous in probability if and only if the probability of a jump at time t is zero for every t. In this case the Fourier transform of X(t) is

ϕ(u, t) = exp(Ψ(u, t)).
Proof. As X is right-regular, X is continuous in probability if and only if ∆X(t) = 0 a.s. for every t. This means that X is stochastically continuous if and only if E(exp(iu∆X(t))) ≡ 1 for all t. By the previous corollary, in this case J = ∅ and in this case V = 1 in (7.44).

Example 7.93 If X is an n-dimensional semimartingale with independent increments and X is continuous in probability and 0 ∉ cl(Λ), then for every t the number of jumps in Λ during the time interval (0, t] has a Poisson distribution with parameter ν((0, t] × Λ).

Let N^Λ be the process counting the jumps in Λ. As 0 ∉ cl(Λ), obviously N^Λ has right-regular trajectories, N^Λ(t) is finite for every t and it is a process with independent increments. As X is continuous in probability, N^Λ does not have fixed times of discontinuity, so it is also continuous in probability. By the Lévy–Khintchine formula the Fourier transform of N^Λ(t) has the representation

ϕ(u, t) = exp(Ψ^Λ(u, t)),

where

Ψ^Λ(u, t) := iuB^Λ(t) − (1/2) uC^Λ(t)u + ∫_{(0,t]×(R^n\{0})} (exp(iux) − 1 − iuh(x)) dν^Λ(r, x).

N^Λ is continuous in probability and it has bounded jumps, hence all the moments of N^Λ(t) are finite⁹⁵. Therefore the expected value of N^Λ(t) is finite, so

ν^Λ((0, t] × (R^n \ {0})) = E(N^Λ(t)) < ∞.

Therefore one can write the integral as

∫_{(0,t]×(R^n\{0})} (exp(iux) − 1) dν^Λ(r, x) − iu ∫_{(0,t]×(R^n\{0})} h(x) dν^Λ(r, x)

⁹⁵ See: Proposition 1.114, page 78.
and the predictable cumulant has the representation

Ψ^Λ(u, t) = iuD^Λ(t) − (1/2) uC^Λ(t)u + ∫_{(0,t]×(R^n\{0})} (exp(iux) − 1) dν^Λ(r, x).

The derivative of the Fourier transform at u = 0 is the expected value of the distribution multiplied by i. So as ν^Λ is deterministic

iE(N^Λ(t)) = ϕ′_u(0, t) = iD^Λ(t) + i ∫_{(0,t]×(R^n\{0})} x exp(i0x) dν^Λ(r, x) =
= iD^Λ(t) + iE(∫_{(0,t]×(R^n\{0})} x dN^Λ(r, x)) = iD^Λ(t) + iE(N^Λ(t)).

Hence D^Λ(t) = 0. Differentiating ϕ twice,

E((N^Λ(t))²) = C^Λ(t) + E((N^Λ(t))²).

Hence C^Λ(t) = 0. So as Ψ^Λ is deterministic

Ψ^Λ(u, t) = ∫_{(0,t]×(R^n\{0})} (exp(iux) − 1) dν^Λ(r, x) =
= E(∫_{(0,t]×(R^n\{0})} (exp(iux) − 1) dN^Λ(r, x)) =
= E(Σ_{s≤t} (exp(iu∆N^Λ(s)) − 1)) =
= E(Σ_{s≤t, ∆N^Λ(s)≠0} (exp(iu) − 1)) =
= (exp(iu) − 1) · E(N^Λ(t)) = (exp(iu) − 1) · ν^Λ((0, t] × (R^n \ {0})).

Therefore N^Λ(t) has a Poisson distribution with parameter

λ := ν^Λ((0, t] × (R^n \ {0})).
As ν((0, t] × Λ) is the expected value of the number of jumps of X in Λ during the time interval (0, t], obviously

λ = ν^Λ((0, t] × (R^n \ {0})) = ν((0, t] × Λ).
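Example 7.93 can be illustrated by simulation. The sketch below is an ad hoc toy: the intensity, the Exp(1) jump law, and the set Λ = (1, ∞) are arbitrary choices, not taken from the text. It counts the jumps of a compound Poisson process landing in Λ and checks that the count has Poisson mean and variance equal to ν((0, t] × Λ) = rate · t · P(jump ∈ Λ).

```python
import math
import random

random.seed(12345)

def jumps_in_lambda(rate, t, threshold):
    """Count the jumps of size > threshold of a compound Poisson process
    with intensity `rate` and Exp(1) jump sizes on (0, t]."""
    count = 0
    s = random.expovariate(rate)          # first jump time
    while s <= t:
        if random.expovariate(1.0) > threshold:
            count += 1                    # jump landed in Lambda = (threshold, oo)
        s += random.expovariate(rate)     # exponential inter-arrival gaps
    return count

rate, t, threshold, paths = 2.0, 3.0, 1.0, 20_000
counts = [jumps_in_lambda(rate, t, threshold) for _ in range(paths)]
mean = sum(counts) / paths
var = sum((c - mean) ** 2 for c in counts) / paths
# nu((0,t] x Lambda) = rate * t * P(Exp(1) > 1) = 6/e
expected = rate * t * math.exp(-threshold)
```

For a Poisson distribution the mean and the variance coincide, and both empirical values land near 6/e ≈ 2.21.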
7.5
Decomposition of Processes with Independent Increments
As we have remarked, not every process with independent increments is a semimartingale. On the other hand we have the next nice observation:

Theorem 7.94 Every n-dimensional process X with independent increments has a decomposition X = F + S, where F is a right-regular deterministic process and S is a semimartingale with independent increments.

Proof. The main idea of the proof is the following: we shall decompose X into several parts. During the decomposition we successively remove the different types of jumps of X. The decomposition procedure is nearly the same as the decomposition of Lévy processes. The only difference is that now we can have jumps which occur with positive probability. When the increments are not stationary one should classify the jumps of X by two different criteria:

1. one can take the jumps which occur with positive or with zero probability at a fixed moment of time t,
2. one can take the large and the small jumps.

Let W be the process which is left after we have removed all the jumps of X. We shall prove that all the removed jump-components are semimartingales. As X is not necessarily a semimartingale, W is also not necessarily a semimartingale. Process X can have jumps occurring with positive probability, therefore, as we shall see, W is not necessarily continuous: when we remove the jumps of X occurring with positive probability we can introduce some new jumps. But very importantly the new jumps have deterministic size and they can occur only at fixed moments of time. Let W′ be independent of W with the same distribution as W. As the jumps of W, and of course the jumps of W′, are deterministic and they occur at the same moments of time, W̃ := W − W′ is continuous, as the jumps of W and W′ cancel each other out. As W and W′ are independent, W̃ has independent increments. If the Fourier transform of W is ϕ_W(u, t), then the Fourier transform of W̃ is |ϕ_W(u, t)|². As we have already observed⁹⁶, in this case |ϕ_W(u, t)|² is decreasing. So it has finite variation. This implies that W̃ is a semimartingale⁹⁷.

⁹⁶ See: line (7.41), page 524.
⁹⁷ See: Theorem 7.83, page 524.
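The fact just used, that the Fourier transform of the symmetrized process W − W′ is |ϕ_W|², is the elementary identity E exp(iu(ξ − ξ′)) = |E exp(iuξ)|² for an independent copy ξ′. A quick Monte Carlo sketch, where the Exp(1) law is an arbitrary test choice:

```python
import cmath
import random

random.seed(7)

# For xi ~ Exp(1): E exp(iu*xi) = 1/(1 - iu), so |phi(u)|^2 = 1/(1 + u^2).
u, n = 1.0, 200_000
acc = 0j
for _ in range(n):
    xi = random.expovariate(1.0)
    xi_prime = random.expovariate(1.0)        # independent copy of xi
    acc += cmath.exp(1j * u * (xi - xi_prime))
estimate = acc / n                            # empirical E exp(iu(xi - xi'))
exact = 1.0 / (1.0 + u * u)                   # |1/(1 - iu)|^2
```

The estimate is real up to sampling noise, as it must be: the symmetrized law has a real, nonnegative characteristic function.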
W̃ is continuous, hence its spectral measure is zero. From the Lévy–Khintchine formula⁹⁸ it is clear that W̃ is a Gaussian process. By Cramér's theorem⁹⁹ W has a Gaussian distribution. Therefore W(t) has an expected value for every t. If F(t) := E(W(t)) then W(t) − F(t) has independent increments and it has zero expected value, so it is a martingale, and of course F is just the right-regular deterministic process in the theorem.

1. Obviously

X̂(t) := X(t) − X(h)(t) = Σ_{s≤t} (∆X(s) − h(∆X(s)))

is the process of large jumps, where h(x) := xχ(‖x‖ < 1). X is right-regular, so the large jumps do not have an accumulation point. Hence X̂ has finite variation on finite intervals. If s ≤ t then X̂(t) − X̂(s) is the sum of the large jumps of X during the time period (s, t], so it is independent of the σ-algebra F_s. Hence X̂ has independent increments. As a first step let us separate from X the process of large jumps X̂. To make the notation simple let us denote by X the process X − X̂. As X̂ has finite variation, X̂ is a semimartingale.

2. As a second step we separate the small jumps of X which are not in

J := {t : ν({t} × R^n \ {0}) > 0}.

The construction is basically the same as the construction we have seen in the Lévy process case: Let

Y_m := (x χ_{J^c} χ(1/m ≥ ‖x‖ > 1/(m+1))) • µ^X =: g • µ^X.

Y_m is a process of some of the jumps of X, so it has independent increments. As the jumps are larger than 1/(m+1), they cannot have an accumulation point. Hence Y_m has finite variation on finite intervals. As g is bounded, Y_m ∈ A_loc. Let

Y_m^p := g • ν = (x χ_{J^c} χ(1/m ≥ ‖x‖ > 1/(m+1))) • ν

⁹⁸ See: Corollary 7.90, page 534, Theorem 6.12, page 367.
⁹⁹ See: Theorem A.14, page 551.
be the compensator of Y_m. If t ∈ J^c then¹⁰⁰

ν({t} × R^n \ {0}) = P(∆X(t) ≠ 0) = 0.

This implies that Y_m^p is continuous¹⁰¹ in t. Let L_m := Y_m − Y_m^p be the local martingale of the compensated jumps of Y_m. As Y_m^p is continuous,

∆L_m = ∆Y_m = ∆X χ_{J^c} χ(1/m ≥ ‖∆X‖ > 1/(m+1)).

L_m has finite variation, so it is a pure quadratic jump process¹⁰². Obviously for every coordinate i

[L_p^(i), L_q^(i)] = 0,  p ≠ q.      (7.45)

It is also obvious that

Σ_{m=1}^∞ [L_m^(i)](t) = Σ_{s≤t} (∆X^(i)(s))² χ_{J^c}(s) = (|x_i|² χ_{J^c}) • µ^X.

We want to prove that

√(Σ_{m=1}^∞ [L_m^(i)]) = √((|x_i|² χ_{J^c}) • µ^X) ∈ A^+_loc.      (7.46)

By Jensen's inequality

E(√(Σ_{m=1}^∞ [L_m^(i)])) ≤ √(E(Σ_{m=1}^∞ [L_m^(i)])),

so it is sufficient to show that

Σ_{m=1}^∞ [L_m^(i)] ∈ A^+_loc.

Observe that the jumps of Σ_{m=1}^∞ [L_m^(i)] are smaller than one, so if it is finite then it is a right-regular increasing process with bounded jumps, so it is locally bounded¹⁰³. Therefore the main point is that Σ_{m=1}^∞ [L_m^(i)] is almost surely finite

¹⁰⁰ See: Example 7.70, page 514.
¹⁰¹ As χ_{J^c} is in the integrand of the integral describing it.
¹⁰² See: Example 4.12, page 229.
¹⁰³ See: Proposition 1.152, page 107.
for every moment of time t. As L_m^(i) is a pure quadratic jump process,

Σ_{i=1}^n Σ_{m=1}^∞ [L_m^(i)] = (‖x‖² χ_{J^c}) • µ^X.

So it is almost surely finite for every t if¹⁰⁴

E((‖x‖² χ_{J^c} • µ^X)(t)) = E((‖x‖² χ_{J^c} • ν)(t)) = (‖x‖² χ_{J^c} • ν)(t) < ∞.      (7.47)

We shall show this in the next, third point of the proof. From (7.45) and (7.46) there is a local martingale¹⁰⁵ L := Σ_{m=1}^∞ L_m. The convergence holds uniformly in probability on compact intervals, so for some subsequence, outside a measure-zero set, the trajectories converge uniformly on every compact interval. So almost surely

∆L = ∆X χ(‖∆X‖ ≤ 1) χ_{J^c}.

Processes X − Y_m have independent increments. As X has independent increments, ν is deterministic. Hence Y_m^p is deterministic. Therefore

X − Σ_{k=1}^m L_k = X − Σ_{k=1}^m (Y_k − Y_k^p)

has independent increments for every m. So the limit X − L has independent increments.

3. Now we prove (7.47), that is we show that ‖x‖² χ_{J^c} is ν-integrable. Of course we have already deleted the large jumps of X! So we want to prove that for any process with independent increments

((‖x‖² ∧ 1) χ_{J^c}) • ν < ∞.

Let now X be a process with independent increments and let ν be the spectral measure of X. Let X′ be a process which has the same distribution¹⁰⁶ as X but

¹⁰⁴ ν is deterministic as X has independent increments.
¹⁰⁵ See: Theorem 4.26, page 236. In fact the convergence holds in H², so one can also use Doob's inequality and the completeness of H².
¹⁰⁶ Of course, the infinite dimensional distributions.
independent of X. To construct X′ let us consider the product

(Ω̃, Ã, P̃, F̃) := (Ω, A, P, F) × (Ω, A, P, F).

Let X(ω, ω′) := X(ω) and let X′(ω, ω′) := X(ω′). It is easy to see that X and X′ have independent increments with respect to the filtration F̃ := F × F. For any s the σ-algebra G_s generated by

X(u) − X(v),  u, v ≥ s

is independent of F_s, and the same is true for X′. The increments of X̃ := X − X′ are measurable with respect to G_s × G′_s, and this σ-algebra is independent of F̃_s. So X̃ has independent increments on the extended space. If ϕ_X denotes the Fourier transform of X, then the Fourier transform of X̃ is |ϕ_X|². Function |ϕ_X|² is decreasing¹⁰⁷, so it has finite variation. Hence X̃ is a semimartingale. As X̃ has independent increments, its spectral measure ν̃ is deterministic¹⁰⁸. By the semimartingale property¹⁰⁹

(‖x‖² ∧ 1) • ν̃ < ∞.

Unfortunately, as the jumps of X and X′ can interfere, this does not imply¹¹⁰ that (‖x‖² ∧ 1) • ν < ∞. If t ∈ J then X has a jump at t with positive probability, and if t ∈ J^c then the probability of a jump of X at t is zero¹¹¹. Let (τ_k) be the sequence of stopping times covering the jumps of X. Let G_k be the distribution of τ_k. By the definition of the conditional expectation

P(∆X′(τ_k) ≠ 0, τ_k ∉ J) = E(χ(∆X′(τ_k) ≠ 0) χ(τ_k ∉ J)) =
= ∫_{R₊} E(χ(∆X′(τ_k) ≠ 0) χ(τ_k ∉ J) | τ_k = s) dG_k(s).

∆X′ is independent of τ_k, as τ_k is measurable with respect to the σ-algebra generated by X and X′ is independent of X. The distributions of X and X′ are the same, so the moments of time where they jump with positive probability are equal. So if s ∈ J^c then

E(χ(∆X′(s) ≠ 0)) = P(∆X′(s) ≠ 0) = 0.

¹⁰⁷ See: line (7.41), page 524.
¹⁰⁸ See: Corollary 7.88, page 532.
¹⁰⁹ See: Proposition 7.66, page 509.
¹¹⁰ Consider the case when X is a deterministic process with independent increments. Then X̃ = X − X′ = 0!
¹¹¹ See: Example 7.70, page 514.
Hence by the independence

E(χ(∆X′(τ_k) ≠ 0) χ(τ_k ∉ J) | τ_k = s) = E(χ(∆X′(s) ≠ 0) χ(s ∉ J)) = 0.

This implies that outside a set with zero probability, X and X′ do not have common jumps in J^c. Hence

((‖x‖² ∧ 1) χ_{J^c}) • µ^X ≤ ((‖x‖² ∧ 1) χ_{J^c}) • µ^{X̃}  a.s.

Using again that ν and ν̃ are deterministic,

((‖x‖² ∧ 1) χ_{J^c}) • ν = E(((‖x‖² ∧ 1) χ_{J^c}) • µ^X) ≤
≤ E(((‖x‖² ∧ 1) χ_{J^c}) • µ^{X̃}) ≤
≤ E((‖x‖² ∧ 1) • µ^{X̃}) = (‖x‖² ∧ 1) • ν̃ < ∞.      (7.48)

And that is what we wanted to prove.

4. Now take the process Z := X − L. ∆Z = ∆X χ(‖∆X‖ < 1) χ_J. As ν is σ-finite, there are at most countably many points (t_m) such that ∆Z(t_m) ≠ 0. Define the martingales

U_m(t) := 0 if t < t_m,  U_m(t) := ∆Z(t_m) − E(∆Z(t_m)) if t ≥ t_m.

E(∆Z(t_m)) is meaningful as ‖∆Z(t_m)‖ ≤ 1. Obviously for any i = 1, 2, . . . , n

[U_p^(i), U_q^(i)] = 0,  p ≠ q.

We should show again that

√(Σ_{m=1}^∞ [U_m^(i)]) ∈ A^+_loc.      (7.49)

As above, let Z′ be independent of Z and let the distribution of Z′ be the same as the distribution of Z. Z̃ := Z − Z′ is again a semimartingale with independent increments. If again ν̃ is the spectral measure of Z̃, then, as Z̃ is a semimartingale,
(‖z‖² ∧ 1) • ν̃ < ∞. Hence

Σ_{s≤t, s∈J} E(‖∆Z̃(s)‖²) = E(Σ_{s≤t, s∈J} ‖∆Z̃(s)‖²) ≤ E(((‖z‖² ∧ 1) • µ^{Z̃})(t)) = ((‖z‖² ∧ 1) • ν̃)(t) < ∞.

By the definition of U_m,

E(‖U(t_m)‖²) = Σ_{i=1}^n D²(∆Z^(i)(t_m)) =
= (1/2) Σ_{i=1}^n (D²(∆Z^(i)(t_m)) + D²(−∆Z′^(i)(t_m))) =
= (1/2) Σ_{i=1}^n D²(∆Z^(i)(t_m) − ∆Z′^(i)(t_m)) = (1/2) E(‖∆Z̃(t_m)‖²).

Hence

E(Σ_{t_m≤t} ‖U(t_m)‖²) = Σ_{t_m≤t} E(‖U(t_m)‖²) < ∞,      (7.50)

which as above implies (7.49). Let U be the limit of (Σ_m U_m). If we subtract U from X − L, then W := X − L − U has independent increments¹¹² and

∆W = χ_J · E(∆Z).      (7.51)

5. By (7.51) the jumps of W are fixed and they are deterministic. As we remarked in the introductory part of the proof, this implies that the expected value of W(t) is finite. If F(t) := E(W(t)) then, as W has independent increments, W − F satisfies the martingale condition. As the filtration satisfies the usual conditions, W − F has a right-regular version. As W is already right-regular, F is also right-regular.

6. Observe that X = S + F, where S is a semimartingale.

Let us explicitly state some important observations proved above.

¹¹² The jumps ∆Z disappear, but we bring in the expected values of ∆Z.
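Steps 4 and 5 above can be illustrated on a toy process with a single fixed-time jump (the jump law, probability, and time below are arbitrary assumptions of this sketch): subtracting the compensated jump martingale U leaves a process whose jump at the fixed time is the deterministic value E(∆Z), for every outcome.

```python
import random

random.seed(1)

p, t0 = 0.3, 1.0          # a jump of size 1 occurs at time t0 with probability p

def Z(t, xi):
    """Toy process: a single random jump xi * chi(t >= t0)."""
    return xi if t >= t0 else 0.0

def U(t, xi):
    """Compensated fixed-time jump martingale: (xi - E(xi)) * chi(t >= t0)."""
    return (xi - p) if t >= t0 else 0.0

# W = Z - U has the deterministic jump E(Delta Z) = p at t0, for every outcome.
jumps = []
for _ in range(1000):
    xi = 1.0 if random.random() < p else 0.0
    before = Z(t0 - 1e-9, xi) - U(t0 - 1e-9, xi)
    after = Z(t0, xi) - U(t0, xi)
    jumps.append(after - before)
```

Every sampled path produces exactly the same jump p, which is the content of (7.51) in this miniature setting.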
Corollary 7.95 If X is a continuous process with independent increments then X is a Gaussian process¹¹³.

Corollary 7.96 If X is a process with independent increments then X has a decomposition

X = X̂ + H + G + F,

where the processes in the decomposition are independent and:
1. X̂ is the process of the large jumps of X,
2. H is a martingale with the small jumps of X and H ∈ H^2_0 on any finite interval,
3. G is a Gaussian martingale and,
4. F is a deterministic process.

Proof. From the proof of the previous theorem it is clear that for every t

E(Σ_{i=1}^n [L^(i)](t)) = E(Σ_{s≤t} ‖∆X‖² χ_{J^c}) = E((‖x‖² χ_{J^c} • µ^X)(t)) = ((‖x‖² χ_{J^c}) • ν)(t) < ∞,

and

E(Σ_{i=1}^n [U^(i)](t)) = ((‖x‖² χ_J) • ν)(t) < ∞.

Hence¹¹⁴ L and U are in H^2_0 on any finite interval. Therefore H := L + U is a martingale.

Theorem 7.97 (Characterization of local martingales with independent increments) If X is a local martingale with independent increments then X is a martingale.

Proof. By the previous corollary

X = X̂ + H + G + F.

¹¹³ See: Theorem 6.12, page 367.
¹¹⁴ See: Proposition 2.84, page 170. As in the Lévy process case we could use Doob's inequality to construct L and U and directly prove that L and U are in H² on any finite interval.
1. X is a local martingale, so¹¹⁵

(‖x‖² ∧ ‖x‖) • ν ∈ A^+_loc.

Therefore for every t

((‖x‖ χ(‖x‖ ≥ 1)) • ν)(t) < ∞.

X has independent increments, so ν is deterministic¹¹⁶ and the expression above is deterministic. By the definition of the spectral measures¹¹⁷

E(‖X̂(t)‖) = E(‖Σ_{0<s≤t} ∆X(s) χ(‖∆X(s)‖ ≥ 1)‖) ≤
≤ E(Σ_{0<s≤t} ‖∆X(s)‖ χ(‖∆X(s)‖ ≥ 1)) =
= E((‖x‖ χ(‖x‖ ≥ 1) • µ^X)(t)) =
= E((‖x‖ χ(‖x‖ ≥ 1) • ν)(t)) = ((‖x‖ χ(‖x‖ ≥ 1)) • ν)(t) < ∞.

Hence X̂(t) is integrable for every t.

2. Let m̂(t) := E(X̂(t)). X̂ is the process of the large jumps of X, so X̂ has independent increments. Hence X̂ − m̂ has independent increments and the expected value of the increments is zero. So X̂ − m̂ is a martingale.

3. As X is a local martingale,

F + m̂ = X − (X̂ − m̂ + H + G)

is a deterministic local martingale. This implies that F + m̂ is a constant. Hence X is a martingale.

¹¹⁵ See: Corollary 7.68, page 512.
¹¹⁶ See: Corollary 7.88, page 532.
¹¹⁷ See: (7.31), page 501.
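Theorem 7.97 implies in particular that the expectation of a local martingale with independent increments is constant. A quick Monte Carlo sanity check on the compensated Poisson process, the simplest such martingale; the rate and the time points are arbitrary choices of this sketch:

```python
import random

random.seed(42)

def compensated_poisson(lam, t):
    """One sample of N(t) - lam*t, a martingale with independent increments."""
    n, s = 0, random.expovariate(lam)
    while s <= t:
        n += 1
        s += random.expovariate(lam)     # exponential inter-arrival gaps
    return n - lam * t

lam, paths = 3.0, 20_000
# The empirical mean should be (close to) zero at every time point.
means = {t: sum(compensated_poisson(lam, t) for _ in range(paths)) / paths
         for t in (0.5, 1.0, 2.0)}
```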
Appendix A

RESULTS FROM MEASURE THEORY

A.1 The Monotone Class Theorem
In describing the structure of measurable sets and functions the main tool is the so-called Monotone Class Theorem. The theorem has many forms. In this section we present a simple proof of this important statement. First we give the necessary definitions:

Definition A.1 The set L of bounded real-valued functions defined on a set X is said to be a λ-system if:
1. 1 ∈ L, that is the constant function 1 is in L,
2. L is a vector space,
3. if 0 ≤ f_n ↗ f, f_n ∈ L and f is bounded, then f ∈ L.

Definition A.2 The set V of real-valued functions on a set X is said to be a vector lattice if:
1. V is a vector space, and
2. whenever f ∈ V then |f| ∈ V.¹

Definition A.3 The lattice V in the previous definition is called a Stone-lattice if for every f ∈ V one has f ∧ 1 ∈ V.

Definition A.4 The set P of real-valued functions on a set X is said to be a π-system if whenever f, g ∈ P then fg ∈ P, where fg denotes the product of f and g.

Lemma A.5 If H is an arbitrary set of bounded functions on a set X then there is a set of bounded functions on X, denoted by L(H), which contains H and which is the smallest λ-system on X containing H.

¹ If V is a vector lattice, then for any f, g ∈ V, f ∨ g := (f + g + |f − g|)/2 ∈ V and f ∧ g := (f + g − |f − g|)/2 ∈ V. For any function f, f⁺ := f ∨ 0, f⁻ := (−f) ∨ 0. It is evident from the definition that f = f⁺ − f⁻. If V is a vector lattice and f ∈ V, then f^± ∈ V.
Proof. The set of all bounded functions on X is a λ-system, the intersection of λ-systems is again a λ-system, hence the intersection of all λ-systems containing H is well defined and it is trivially the smallest λ-system containing H. Lemma A.6 If P is a set of bounded functions which is a π-system, then L (P) is also a π-system. Proof. For any f let us introduce Lf {g ∈ L (P) : gf ∈ L (P)} .
1. It is easy to see that L_f is a λ-system for any f. Only the third condition in the definition is not trivial. Let 0 ≤ g_n ↗ g where g is bounded and g_n f ∈ L(P). As f is bounded, there is an α such that f + α1 ≥ 0, so 0 ≤ g_n(f + α1) ↗ g(f + α1). g_n(f + α1) = g_n f + αg_n ∈ L(P), and L(P) is a λ-system, so g, g(f + α1) ∈ L(P), hence gf = g(f + α1) − αg ∈ L(P), that is g ∈ L_f.

2. Fix an f ∈ P and let g ∈ P. As P is product closed, fg ∈ P ⊆ L(P), that is g ∈ L_f, hence P ⊆ L_f. L_f is a λ-system, so

L(P) ⊆ L_f,  if f ∈ P.      (A.1)

3. Fix a g ∈ L(P) and let again f ∈ P. In this case from (A.1) fg ∈ L(P), which means that for any f ∈ P and g ∈ L(P) one has f ∈ L_g, that is P ⊆ L_g. L_g is a λ-system, hence

L(P) ⊆ L_g,  if g ∈ L(P).
4. By definition this means that for any f, g ∈ L(P) one has fg ∈ L(P), that is, L(P) is a π-system.

Lemma A.7 For any N > 0 there is a sequence of polynomials (p_n) such that 0 ≤ p_n(x) ↗ |x| for any |x| ≤ N.

Proof. We shall approximate N − |x| by a decreasing sequence. Let y_0 ≡ N and by induction define the iteration

y_{n+1}(x) := (1/(2N)) (y_n²(x) + N² − x²).

It is clear that y_n(x) is a polynomial in x for all n and y_n ≥ 0 on [−N, N]. We prove by induction that (y_n) is a decreasing sequence. By definition

y_1(x) = N − x²/(2N),

and trivially y_0 ≥ y_1 ≥ 0. If y_{n−1} ≥ y_n ≥ 0 for some n, then

y_n(x) − y_{n+1}(x) = (1/(2N)) (y²_{n−1}(x) − y²_n(x)) = (1/(2N)) (y_{n−1}(x) − y_n(x)) (y_{n−1}(x) + y_n(x)) ≥ 0.

Every decreasing sequence of real numbers bounded from below has a limit. If N ≥ y*(x) ≥ 0 is the limit of (y_n(x)), then

y*(x) = (1/(2N)) ((y*(x))² + N² − x²).

Hence x² = (y*(x) − N)², which means that

|x| = |y*(x) − N| = N − y*(x),

so p_n(x) := N − y_n(x) is an increasing sequence of polynomials converging to |x|.
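The iteration in the proof of Lemma A.7 is easy to run numerically: each p_n(x) = N − y_n(x) is a polynomial in x, and p_n(x) increases to |x| on [−N, N]. A quick sketch with N = 2 (the sample points and iteration count are arbitrary):

```python
def abs_approx(x, N=2.0, iterations=200):
    """p_n(x) = N - y_n(x), where y_0 = N and
    y_{n+1} = (y_n^2 + N^2 - x^2) / (2N).

    (y_n) decreases to N - |x| on [-N, N], so p_n increases to |x|."""
    y = N
    for _ in range(iterations):
        y = (y * y + N * N - x * x) / (2.0 * N)
    return N - y

values = [abs_approx(x) for x in (-1.5, -0.3, 0.0, 0.7, 2.0)]
```

The approximation is monotone from below, consistent with 0 ≤ p_n(x) ↗ |x|.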
Proposition A.8 If a set of bounded functions P is a π-system, then L(P) is a Stone-lattice.

Proof. L(P) is a π- and a λ-system. If f ∈ L(P), then f is bounded, so there is an N such that |f| ≤ N. If (p_n) is a sequence of polynomials on [−N, N] for which p_n(x) ↗ |x|, then 0 ≤ p_n(f(x)) ↗ |f(x)|. L(P) is a λ-system, (p_n(f)) ⊆ L(P), so |f| ∈ L(P), that is L(P) is a lattice. As 1 ∈ L(P), L(P) is a Stone-lattice.

Proposition A.9 If L is a λ-system and a Stone-lattice, then L is exactly the set of σ(L)-measurable functions², where σ(L) is the σ-algebra generated by L.

Proof. Every function in L is σ(L)-measurable, so it is sufficient to prove that every σ(L)-measurable function is in L. Let G := {A : χ_A ∈ L}. L is a lattice, hence if A, B ∈ G, then A ∪ B, A ∩ B ∈ G. 1 ∈ L, and as L is a linear space, χ_{A^c} = 1 − χ_A is in L, so G is an algebra. L is a λ-system, so if A_n ↗ A and A_n ∈ G, then A ∈ G, hence G is a σ-algebra. We shall prove that σ(L) ⊆ G. Let f ∈ L. L is a Stone-lattice, so if

f_n := 1 ∧ (n(f − 1 ∧ f)),

² In the definition of the λ-systems we have assumed that the elements of the λ-system are bounded. In this proposition the boundedness of the functions is not used.
then f_n ∈ L and 0 ≤ f_n ↗ χ(f > 1). L is a λ-system, hence χ(f > 1) ∈ L, that is {f > 1} ∈ G. L is a linear space, so for any α > 0, {f > α} ∈ G. This means that f⁺ is G-measurable. The same argument implies that f⁻ is also G-measurable, that is f is G-measurable. Hence σ(L) ⊆ G. G ⊆ σ(L) is trivial, so G = σ(L). Therefore L contains the characteristic functions of the elements of σ(L). L is a linear space, so it contains the σ(L)-measurable step functions. As L is closed for the monotone limit, it contains all the measurable functions.

Theorem A.10 (Monotone Class Theorem) If P is a π-system and L is a λ-system and P ⊆ L, then L contains all the σ(P)-measurable bounded real-valued functions.

Proof. Trivially L(P) ⊆ L. L(P) is a λ-system and a Stone-lattice, so L contains the σ(L(P))-measurable, hence the σ(P)-measurable bounded functions.

Example A.11 One cannot drop the assumption that the functions in the theorem have to be bounded.
Assume that one could prove the theorem for unbounded functions. Let F and G be two probability distributions on R and assume that the moments of F and G are equal. Let L be the set of functions f for which ∫_R f dG = ∫_R f dF, where the integrals on both sides can be infinite or undefined at the same time. The set L of such functions is a λ-system. The set P of polynomials forms a π-system and, as the moments of F and G are equal, P ⊆ L. By the assumption B(R) = σ(P) ⊆ L, which is impossible since it can happen that F ≠ G but all the moments of the two distributions are the same.
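A concrete instance of two different distributions with identical moments is the classical lognormal example: perturbing the standard lognormal density by the factor 1 + ε sin(2π ln x) leaves every integer moment unchanged, because after the substitution u = ln x the correction integral vanishes by symmetry. The sketch below verifies this numerically; the grid bounds and step count are ad hoc numerical choices.

```python
import math

def moment(n, eps, a=-15.0, b=20.0, m=40_000):
    """n-th moment of the lognormal law perturbed by 1 + eps*sin(2*pi*ln x).

    After substituting u = ln x the integrand is
    exp(n*u) * phi(u) * (1 + eps*sin(2*pi*u)), phi the N(0,1) density,
    integrated by Simpson's rule (m must be even)."""
    h = (b - a) / m
    total = 0.0
    for k in range(m + 1):
        u = a + k * h
        w = 1.0 if k in (0, m) else (4.0 if k % 2 else 2.0)
        phi = math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)
        total += w * math.exp(n * u) * phi * (1.0 + eps * math.sin(2.0 * math.pi * u))
    return total * h / 3.0

# All integer moments agree with the unperturbed lognormal: E X^n = exp(n^2/2).
lognormal_moments = [moment(n, 0.0) for n in range(5)]
perturbed_moments = [moment(n, 0.5) for n in range(5)]
```

The two densities are visibly different, yet every integer moment coincides with exp(n²/2).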
A.2
Projection and the Measurable Selection Theorems
During the discussion of stochastic analysis we assume the completeness of the space (Ω, A, P), as we use the next two theorems³ several times.

Theorem A.12 (Projection Theorem) If the space (Ω, A, P) is complete and U ∈ B(R^n) × A, then

proj_Ω U := {ω : ∃t such that (t, ω) ∈ U} ∈ A.
[11], [42].
Theorem A.13 (Measurable Selection Theorem) If the space (Ω, A, P) is complete and U ∈ B(R₊) × A, then there is an A-measurable function f : Ω → [0, ∞] for which

Graph(f) := {(t, ω) : t = f(ω) < ∞} ⊆ U

and {f < ∞} = proj_Ω U.
A.3
Cramér's Theorem
Theorem A.14 (Cramér) If ξ and η are independent random variables and ξ + η has Gaussian distribution, then ξ and η also have Gaussian distribution.

Without loss of generality one can assume that the distribution of ξ + η is N(0, 1). Let

M_ξ(z) := ∫_R exp(zx) dF_ξ(x),  M_η(z) := ∫_R exp(zx) dF_η(x)

be the complex moment-generating functions of ξ and η. M_{ξ+η}(z) = exp(z²/2). As ξ and η are independent, M_ξ(z) M_η(z) = exp(z²/2) whenever M_ξ(z) and M_η(z) are defined. One should prove that M_ξ(z) and M_η(z) have the form exp(σ²z²/2).

Lemma A.15 M_ξ and M_η are defined on the whole complex plane.

Proof. From the definition of the complex moment-generating functions it is clear that M_ξ and M_η are defined on the strips parallel with the imaginary axis based on the domain of finiteness of

M_ξ(s) := ∫_R exp(sx) dF_ξ(x),  s ∈ R,
M_η(s) := ∫_R exp(sx) dF_η(x),  s ∈ R.

As ξ and η are independent, and as for non-negative independent variables the product rule for expected values holds,

∞ > exp(s²/2) = M_{ξ+η}(s) := E(exp(s(ξ + η))) = E(exp(sξ)) E(exp(sη)) =: M_ξ(s) M_η(s).
As every real moment-generating function is positive⁴, M_ξ(s) and M_η(s) are finite for every s ∈ R. As a consequence all the moments of ξ and η are finite. Hence, as the expected value of the sum is zero, one can assume that E(ξ) = E(η) = 0. Using this and the convexity of exp(|x|), by Jensen's inequality

|M_ξ(z)| := |E(exp(zξ))| ≤ E(|exp(zξ)|) ≤ E(exp(|zξ|)) = E(exp(|z| |ξ + E(η)|)) =
= ∫_R exp(|z| |x + ∫_R y dF_η(y)|) dF_ξ(x) ≤
≤ ∫_R ∫_R exp(|z| |x + y|) dF_η(y) dF_ξ(x) = E(exp(|z| |ξ + η|)) =
= (1/√(2π)) ∫_R exp(|z| |u|) exp(−u²/2) du =
= (2/√(2π)) ∫_0^∞ exp(|z| u) exp(−u²/2) du ≤
≤ (2/√(2π)) ∫_{−∞}^∞ exp(|z| u) exp(−u²/2) du = 2 M_{N(0,1)}(|z|) = 2 exp(|z|²/2).

In a similar way |M_η(z)| ≤ 2 exp(|z|²/2). As

M_ξ(z) M_η(z) = M_{ξ+η}(z) = exp(z²/2) ≠ 0,  M_ξ(0) = M_η(0) = 1,

one can define the complex logarithms⁵ g_ξ(z) := log M_ξ(z) and g_η(z) := log M_η(z).

|M_ξ(z)| = |exp(g_ξ(z))| = exp(Re(g_ξ(z))) ≤ 2 exp(|z|²/2),
|M_η(z)| = |exp(g_η(z))| = exp(Re(g_η(z))) ≤ 2 exp(|z|²/2).

⁴ Possibly +∞.
⁵ If f(z) ≠ 0 and if f is continuously differentiable, then g(z) := ∫_0^z f′(w)/f(w) dw is well-defined and in the whole complex plane exp(g(z)) ≡ f(z).
Taking the real logarithm of both sides,

Re(g_ξ(z)) ≤ ln 2 + |z|²/2 ≤ 1 + |z|²/2,      (A.2)

and of course

Re(g_η(z)) ≤ 1 + |z|²/2.
Lemma A.16 If in the circle |z| < r₀

f(z) = Σ_{n=0}^∞ a_n z^n,

and

A(r) := max_{|z|=r} Re(f(z)),

then for all n > 0 and 0 < r < r₀

|a_n| r^n ≤ 4A⁺(r) − 2 Re(f(0)).

Proof. Let z := r exp(iθ). Then

f(z) = Σ_{n=0}^∞ a_n r^n exp(inθ),  r < r₀.

Hence if r < r₀ then

Σ_{n=0}^∞ r^n |a_n exp(inθ)| = Σ_{n=0}^∞ r^n |a_n| < ∞.

By the Weierstrass criterion, for any r < r₀ the next convergence is uniform in θ:

Re f(z) = Σ_{n=0}^∞ r^n Re(a_n exp(inθ)) = Σ_{n=0}^∞ r^n (Re(a_n) cos nθ − Im(a_n) sin nθ).      (A.3)

Multiplying (A.3) by some cos nθ and by sin nθ and integrating by θ over [0, 2π], by the uniform convergence and by the orthogonality of the trigonometric
functions, if n > 0 then

r^n Re a_n = (1/π) ∫_0^{2π} Re f(z) cos nθ dθ,
r^n Im a_n = −(1/π) ∫_0^{2π} Re f(z) sin nθ dθ,

that is

|a_n r^n| = (1/π) |∫_0^{2π} Re f(z) exp(−inθ) dθ| ≤ (1/π) ∫_0^{2π} |Re(f(z))| dθ.

Integrating (A.3),

Re(f(0)) = Re a₀ = (1/(2π)) ∫_0^{2π} Re(f(z)) dθ.

Hence

|a_n r^n| + 2 Re(f(0)) ≤ (1/π) ∫_0^{2π} (|Re(f(z))| + Re(f(z))) dθ =
= (1/π) ∫_0^{2π} 2 (Re(f(z)))⁺ dθ ≤ (1/π) ∫_0^{2π} 2A⁺(r) dθ = 4A⁺(r).

g_ξ is analytic in the whole complex plane, that is g_ξ(z) = Σ_{n=0}^∞ a_n z^n. By (A.2) and by the lemma, if r > 0 then

|a_n| r^n ≤ 4(1 + r²/2) − 2 · 0.

Hence if n > 2 then a_n = 0, that is g_ξ(z) = a₀ + a₁z + a₂z². But 1 = M_ξ(0) = exp(a₀), so a₀ = 0, and as

a₁ = g′_ξ(0) = E(ξ) = 0,  a₂ = g″_ξ(0)/2 = D²(ξ)/2 > 0,

M_ξ(z) = exp(σ²_ξ z²/2). In a similar way M_η(z) = exp(σ²_η z²/2).
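The coefficient bound of Lemma A.16 can be spot-checked numerically on a sample analytic function; the polynomial coefficients and radii below are arbitrary test choices, and A⁺(r) is estimated by sampling Re f on the circle |z| = r.

```python
import cmath
import math

# f(z) = 1 - 2z + 3z^2 + 0.5z^3, an arbitrary test polynomial
coeffs = [1.0, -2.0, 3.0, 0.5]

def f(z):
    return sum(c * z ** n for n, c in enumerate(coeffs))

def A_plus(r, samples=2000):
    """max(A(r), 0), where A(r) = max of Re f on the circle |z| = r,
    estimated on a finite grid of sample points."""
    m = max(f(r * cmath.exp(2j * math.pi * k / samples)).real
            for k in range(samples))
    return max(m, 0.0)

# Lemma A.16: |a_n| r^n <= 4 A^+(r) - 2 Re f(0) for every n > 0.
checks = [(abs(coeffs[n]) * r ** n, 4 * A_plus(r) - 2 * coeffs[0])
          for r in (0.5, 1.0, 2.0)
          for n in range(1, len(coeffs))]
```

Sampling only underestimates A⁺(r), so the verified inequality is, if anything, slightly stronger than the lemma's bound.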
A.4 Interpretation of Stopped σ-algebras
If X is an arbitrary stochastic process, then the interpretations of the stopped variable and the stopped process X_τ and X^τ are quite obvious and appealing. On the other hand, the definition of the stopped σ-algebra F_τ is a bit formal. The usual interpretation of F_τ is that it contains the events which happened before τ. But in the abstract model of stochastic analysis, it is not clear from the definition how subsets of (Ω, A, P) are related to time, and what does it mean that an abstract event happened before τ? In the canonical model the outcomes in Ω explicitly depend on the time parameter, hence the idea that for some function ω(t) something happened before time t = τ(ω) is perhaps more plausible. To make the next discussion as simple as possible let us assume that Ω is a subset of the right-continuous functions. The restriction that the functions be right-continuous is a bit too restrictive, as the topological or measure theoretic properties of the functions in Ω will play practically⁶ no role below, so with this assumption we just fix the space of possible trajectories. Perhaps the most specific operation of stochastic analysis is truncation. We shall assume that if X is a stochastic process then the truncated process X^τ is also a stochastic process⁷. In the canonical setup this means that the trajectories of X^τ are in Ω. This happens if g(t) := f(t ∧ γ) ∈ Ω for arbitrary number γ and for arbitrary f ∈ Ω. Of course this is a very mild but slightly unusual assumption. If Ω is the set of all right-regular or continuous or increasing functions, or Ω is the set of functions which have jumps of fixed size etc., then the condition is satisfied. Let X be the coordinate process X(t, ω) := ω(t), and let us assume that the filtration F is generated by Ω, that is F = F^Ω = F^X. Let τ be a stopping time of F. Beside F_τ let us define two other σ-algebras. One of them is

G_τ := σ(X^τ) = σ(X(τ ∧ t) : t ∈ Θ).
(A.4)
To define the other, let us introduce on the space Ω an equivalence relation ∼_τ: the outcomes ω and ω′ are equivalent with respect to τ if and only if X^τ(ω) = X^τ(ω′). That is, the outcomes ω and ω′ are equivalent if the trajectories of X for ω and for ω′ are the same up to the random time τ. ∼_τ is trivially an equivalence relation on Ω. For every B let [B]_τ be the set of outcomes which are equivalent to some outcome from B, that is

[B]_τ ≜ {ω : ∃ω′ ∈ B such that ω ∼_τ ω′}.

The obvious interpretation of the elements of the partition generated by ∼_τ is that they are the outcomes of the experiment of observing X up to

⁶ For instance one can also assume that the trajectories are left-continuous. We need some restriction on the trajectories as we should guarantee that the truncated processes remain adapted.
⁷ See: Definition 1.128, page 93.
time τ. As the trajectories of X are right-continuous, ω ∼_τ ω′ if and only if X(τ(ω) ∧ r, ω) = X(τ(ω′) ∧ r, ω′) for every rational number r ≥ 0. If ω ∈ Ω, then as X is progressively measurable⁸

[ω]_τ = {ω′ : X(τ(ω′) ∧ r, ω′) = X(τ(ω) ∧ r, ω), r ∈ Q₊} = ∩_{r∈Q₊} {ω′ : X(τ ∧ r)(ω′) = X(τ ∧ r)(ω)} ∈ F_τ ⊆ F_∞.

Hence all the equivalence classes of ∼_τ are F_∞-measurable subsets of Ω. Let us denote by H_τ the set of F_∞-measurable subsets of Ω which are the union of some collection of sets from the partition generated by ∼_τ. H_τ is obviously a σ-algebra, and one can naturally interpret the sets in H_τ as the events from F_∞ which happened before τ. Obviously H ∈ H_τ if and only if H ∈ F_∞ and [H]_τ = H.

Proposition A.17 In the model just specified F_τ = H_τ = G_τ.

Proof. We shall prove that F_τ ⊆ H_τ ⊆ G_τ ⊆ F_τ.

1. The 'hard' part of the proof is the relation F_τ ⊆ H_τ. Assume first that τ ≡ s. Let

L ≜ {B ∈ F_τ : [B]_τ = B} = {B ∈ F_s : [B]_s = B}.

L is trivially a λ-system. Let us consider the sets

B ≜ ∩_k {X(s_k) ≤ γ_k},  s_k ≤ s,

which obviously form a π-system. By the definition of ∼_τ trivially B = [B]_τ. By the Monotone Class Theorem

F_s ≜ F_s^X = σ(X(s_k) ≤ γ_k, s_k ≤ s) ⊆ L,

which is exactly what we wanted to prove. Now let τ be an arbitrary stopping time of F and let A ∈ F_τ. We prove that A ∈ H_τ. As A ∈ F_τ ⊆ F_∞, one should only prove that A = [A]_τ. Let ω ∈ A and ω ∼_τ ω′. One should prove that ω′ ∈ A. If s ≜ τ(ω) and t ≤ s, then

X(t, ω) = X^τ(t, ω) = X^τ(t, ω′) = X(t, ω′),

so ω ∼_s ω′, where ∼_s denotes the equivalence relation defined by the constant stopping time s. Obviously ω ∈ A ∩ {τ ≤ s} ≜ B ∈ F_s. By the case τ ≡ s just proved, F_s = H_s. Therefore ω′ ∈ [B]_s = B ⊆ A.

⁸ See: Example 1.18, page 11.
2. By the structure of Ω every stopped trajectory X^τ(ω) is in Ω. As Ω is just the set of all trajectories of X, for every ω there is an α(ω) such that X^τ(ω) = X(α(ω)). Let us denote this mapping by α. We shall prove that the mapping

α : (Ω, G_τ) → (Ω, F_∞)    (A.5)

is measurable. If B ≜ {ω : X(t, ω) ≤ γ} is one of the sets generating the σ-algebra F_∞, then

α⁻¹(B) ≜ {ω : α(ω) ∈ B} = {ω : X(t, α(ω)) ≤ γ} = {ω : X^τ(t, ω) ≤ γ} = {X(τ ∧ t) ≤ γ} ∈ G_τ,

from which the measurability (A.5) of α is evident. Assume that A ∈ H_τ, that is A ∈ F_∞ and A = [A]_τ. We prove that α⁻¹(A) = A; hence by the just proved measurability of α obviously A ∈ G_τ, which implies that H_τ ⊆ G_τ. If ω ∈ A, then by definition α(ω) ∼_τ ω. As A = [A]_τ, one has α(ω) ∈ A and hence ω ∈ α⁻¹(A), so A ⊆ α⁻¹(A). On the other hand, if ω ∈ α⁻¹(A) then α(ω) ∈ A. But as ω ∼_τ α(ω), one has ω ∈ [A]_τ = A. Therefore α⁻¹(A) ⊆ A.

3. X is right-continuous, so X is progressively measurable. Hence the variables X(τ ∧ t) are F_{τ∧t} ⊆ F_τ-measurable and so G_τ ⊆ F_τ.

Obviously one can use this proposition only when the space Ω is big enough. Let us assume that the trajectories of X are just right-continuous. Let Ω be the set of all possible trajectories of X and let F ≜ F^Ω = F^X, that is, let us represent X by its canonical model. This space (Ω, F) is called the minimal representation of X. Let us denote by Φ the set of all right-continuous functions. On the set Φ let us define the filtration F^Φ. Obviously if f ∈ Φ then f^γ(t) ≜ f(t ∧ γ) ∈ Φ for all γ. Of course Ω ⊆ Φ and obviously F_t^Ω ⊆ F_t^Φ for every t.
Lemma A.18 Let τ be a stopping time of the minimal representation (Ω, F^Ω). If we extend τ to the space Φ by

τ(φ) ≜ { τ(φ)  if φ ∈ Ω;  +∞  if φ ∈ Φ \ Ω },

then the extended function τ is a stopping time of (Φ, F^Φ).

Proof. {τ ≤ t} ∈ F_t^Ω ⊆ F_t^Φ for every t.

By the proposition just proved H_τ^Φ = F_τ^Φ. This means that A ∈ F_τ^Φ if and only if A = [A]_τ and A ∈ F_∞^Φ. If f ∈ Ω and g ∈ Φ \ Ω, then f and g cannot be ∼_τ-equivalent, by the definition of τ. From this it is clear that there are two types of measurable sets in (Φ, F_τ^Φ). One type of set consists of functions from Ω
and the other type contains functions only from Φ \ Ω. In the second case the equivalence classes generated by τ are singletons.

1. If A ∈ F_τ^Ω ⊆ F_∞^Ω, then by definition A ∩ {τ ≤ t} ∈ F_t^Ω ⊆ F_t^Φ, hence A ∈ F_τ^Φ. By the proposition above A = [A]_τ in Φ, hence also in Ω. Hence F_τ^Ω ⊆ H_τ^Ω.

2. On the other hand, let us assume that A ∈ H_τ^Ω, that is, A ∈ F_∞^Ω and A = [A]_τ in Ω. In this case A ∈ F_∞^Φ, and as the outcomes from Ω and from Φ \ Ω are never equivalent, A = [A]_τ in Φ as well. By the proposition A ∈ F_τ^Φ, that is A ∩ {τ ≤ t} ∈ F_t^Φ. But A ∩ {τ ≤ t} ⊆ Ω for every finite t, so A ∩ {τ ≤ t} ∈ F_t^Φ ∩ Ω. F_t^Φ is generated by the sets

{φ : φ(s) ≤ γ},  s ≤ t,

and

{φ : φ(s) ≤ γ} ∩ Ω = {φ ∈ Ω : φ(s) ≤ γ} = {ω : X(s, ω) ≤ γ}.

From this it is not difficult to prove that F_t^Φ ∩ Ω = F_t^Ω. Therefore A ∩ {τ ≤ t} ∈ F_t^Ω, so A ∈ F_τ^Ω. This means that H_τ^Ω ⊆ F_τ^Ω.

Hence the following proposition is true:

Proposition A.19 Let us assume that the trajectories of a stochastic process X are right-continuous. If τ is a stopping time of the minimal representation of X, then A ∈ F_τ^X if and only if A ∈ F_∞^X and A = [A]_τ. That is, in this case the sets in F_τ^X are the 'events before' τ.
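The equivalence relation ∼_τ and the operation A ↦ [A]_τ can be made concrete in a small discrete-time toy model. The following Python sketch is our illustration, not part of the text: Ω is the set of 0–1 paths of length four, X is the coordinate process, and τ is the hitting time of the value 1.

```python
from itertools import product

# Hypothetical toy model: Omega = all 0/1 paths of length 4, X(t, w) = w[t].
Omega = list(product((0, 1), repeat=4))

def tau(w):
    """First time the path hits 1 (a stopping time); 4 plays the role of +infinity."""
    for t, x in enumerate(w):
        if x == 1:
            return t
    return 4

def stopped_path(w):
    """Trajectory of the stopped process X^tau: t -> X(tau ^ t)."""
    return tuple(w[min(tau(w), t)] for t in range(4))

def equivalent(w1, w2):
    """w1 ~tau w2  iff  the stopped trajectories coincide."""
    return stopped_path(w1) == stopped_path(w2)

def closure(A):
    """[A]_tau: all outcomes equivalent to some outcome of A."""
    return {w for w in Omega for a in A if equivalent(w, a)}

# The event {tau <= 1} is a union of equivalence classes: A = [A]_tau.
A = {w for w in Omega if tau(w) <= 1}
print(closure(A) == A)   # True

# An event that looks into the path strictly after tau is NOT invariant:
B = {w for w in Omega if tau(w) == 0 and w[3] == 0}
print(closure(B) == B)   # False
```

In accordance with Proposition A.19, the first event is an 'event before τ', while the second depends on the trajectory after τ and hence fails A = [A]_τ.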
Appendix B

WIENER PROCESSES

Perhaps the most interesting processes are the Wiener processes. The number of theorems about Wiener processes is huge. In this appendix we summarize the simplest properties of this class of processes.
B.1 Basic Properties
It is worth emphasizing that the name Wiener process refers not to a single process but to a class of processes.

Definition B.1 A process {w(t, ω)}_{t≥0} is a Wiener process if it satisfies the following assumptions:
1. w(0) ≡ 0,
2. w has independent increments,
3. if 0 ≤ s < t then the distribution of w(t) − w(s) is N(0, √(t−s)), that is, the density function of w(t) − w(s) is

g_{t−s}(x) ≜ (1/√(2π(t−s))) exp(−x²/(2(t−s))),

4. w is continuous, that is, for every outcome ω the trajectory w(ω) is continuous.

By the formula for the moments of the normal distribution the next lemma is obvious.

Lemma B.2 For arbitrary 0 ≤ s < t

E([w(t) − w(s)]ⁿ) = { 1·3·…·(n−1)·(t−s)^{n/2}  if n = 2k;  0  if n = 2k+1 }.
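Lemma B.2 can be checked with a quick Monte Carlo experiment. The sketch below is our illustration (sample size and seed arbitrary): it samples the increment w(t) − w(s) directly from N(0, √(t−s)) and compares the empirical moments with 1·3·…·(n−1)·(t−s)^{n/2}.

```python
import random, math

random.seed(0)
s, t, n = 0.5, 2.0, 200_000

# Sample the increment w(t) - w(s) ~ N(0, sqrt(t - s)) directly.
incs = [random.gauss(0.0, math.sqrt(t - s)) for _ in range(n)]

moments = {k: sum(x ** k for x in incs) / n for k in (1, 2, 3, 4)}

# Lemma B.2 with t - s = 1.5: the even moments are 1.5 and 3 * 1.5^2 = 6.75.
print(moments[2])              # ~ 1.5
print(moments[4])              # ~ 6.75
print(moments[1], moments[3])  # both ~ 0
```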
Lemma B.3 If t₁ < t₂ < … < t_k then the distribution of

(w(t₁), w(t₂), …, w(t_k))    (B.1)

has a density function f, and

f(x₁, x₂, …, x_k) = ∏_{i=1}^{k} (1/√(2π(t_i − t_{i−1}))) exp(−(x_i − x_{i−1})²/(2(t_i − t_{i−1}))),

where t₀ ≜ x₀ ≜ 0.

Proof. Let t₀ ≜ 0 and ∆w(t_i) ≜ w(t_{i+1}) − w(t_i). By definition (∆w(t₀), ∆w(t₁), …, ∆w(t_{k−1})) is a vector with independent coordinates, so its density function is

g(u₁, u₂, …, u_k) = ∏_{i=1}^{k} (1/√(2π(t_i − t_{i−1}))) exp(−u_i²/(2(t_i − t_{i−1}))).

The determinant of the linear mapping A : R^k → R^k,

u₁ = x₁,  u₂ = x₂ − x₁,  …,  u_k = x_k − x_{k−1},

is 1. If f is the density function of (B.1), then

∫_H f(x) dx₁…dx_k = P((w(t₁), w(t₂), …, w(t_k)) ∈ H).

By the integral transformation theorem

P((w(t₁), …, w(t_k)) ∈ H) = P(A(w(t₁), …, w(t_k)) ∈ AH) = P((∆w(t₀), ∆w(t₁), …, ∆w(t_{k−1})) ∈ AH) = ∫_{AH} g(u₁, …, u_k) du₁…du_k = ∫_H g(Ax) |det(A)| dx₁…dx_k = ∫_H g(Ax) dx₁…dx_k.

Hence

f(x) = g(Ax) = ∏_{i=1}^{k} (1/√(2π(t_i − t_{i−1}))) exp(−(x_i − x_{i−1})²/(2(t_i − t_{i−1}))),

where x₀ ≜ 0.
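The product density of Lemma B.3 can be verified numerically against the multivariate normal density built from the covariance Cov(w(s), w(t)) = min(s, t); for k = 2 the two formulas agree to rounding error. The sketch below is ours (the point of evaluation is arbitrary).

```python
import math

def f_product(ts, xs):
    """Joint density of (w(t1),...,w(tk)) via the product formula of Lemma B.3."""
    t_prev, x_prev, val = 0.0, 0.0, 1.0
    for t, x in zip(ts, xs):
        d = t - t_prev
        val *= math.exp(-(x - x_prev) ** 2 / (2 * d)) / math.sqrt(2 * math.pi * d)
        t_prev, x_prev = t, x
    return val

def f_gaussian2(t1, t2, x1, x2):
    """Bivariate normal density with covariance matrix [[t1, t1], [t1, t2]]."""
    det = t1 * t2 - t1 * t1
    q = (t2 * x1 * x1 - 2 * t1 * x1 * x2 + t1 * x2 * x2) / det
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

t1, t2, x1, x2 = 0.7, 1.9, 0.3, -0.8
print(abs(f_product((t1, t2), (x1, x2)) - f_gaussian2(t1, t2, x1, x2)) < 1e-12)  # True
```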
As we remarked several times, one should assume that the filtration satisfies the usual assumptions. Every Wiener process is a Lévy process, therefore if we augment the filtration generated by the process with the measure-zero sets, then the augmented filtration satisfies the usual conditions. If the filtration is already given, then one should use the following definition.

Definition B.4 We say that a stochastic process w defined on Θ ≜ [0, ∞) is a Wiener process on the stochastic base (Ω, A, P, F) if:
1. w(0) = 0,
2. for every t ∈ Θ and h > 0 the increment w(t+h) − w(t) is independent of F_t,
3. for every 0 ≤ s < t the distribution of w(t) − w(s) is N(0, √(t−s)),
4. the trajectories w(ω) are continuous for every outcome ω,
5. F satisfies the usual conditions.

Example B.5 If w is a Wiener process under a filtration F then it is not necessarily a Wiener process under a larger filtration G.

If w is a Wiener process under F and G_t ≜ σ(F_t ∪ F_1), then w is not a Wiener process under G: if 1 ≥ s > t, then w(s) is G_t-measurable, so E(w(s) | G_t) = w(s) ≠ w(t), and the martingale property E(w(s) | G_t) = w(t) does not hold.

Perhaps the most well-known property of Wiener processes is the following.

Theorem B.6 (Paley–Wiener–Zygmund) For almost all ω the trajectory w(ω) is nowhere differentiable.

Proof. It is sufficient to prove that almost surely w(t, ω) does not have a right derivative at any t. If f is a real function, then for any t let

D⁺f(t) ≜ lim sup_{h↓0} (f(t+h) − f(t))/h,  D₊f(t) ≜ lim inf_{h↓0} (f(t+h) − f(t))/h.

Obviously f is differentiable at time t from the right if D⁺f(t) = D₊f(t) and the common value is finite. To make the notation simple let [a, b] = [0, 1].
Let j, k ≥ 1 be integers and let

A_{jk} ≜ ∪_{t∈[0,1]} ∩_{h∈(0,1/k]} { |(w(t+h) − w(t))/h| ≤ j } = ∪_{t∈[0,1]} ∩_{h∈(0,1/k]} { |w(t+h) − w(t)| ≤ hj }.

Obviously B ≜ ∪_{j=1}^∞ ∪_{k=1}^∞ A_{jk} contains the outcomes ω for which there is a time t such that

−∞ < D₊w(t, ω) and D⁺w(t, ω) < +∞.

To prove the theorem it is sufficient to show that P(B) = 0. To show this it is enough to show that P(A_{jk}) = 0 for every k and j. Let us fix j and k. Let ω ∈ A_{jk} and let t be a moment of time belonging to ω. By definition, if 0 < h ≤ 1/k then

|w(t+h, ω) − w(t, ω)|/h ≤ j,

which is the same as |w(t+h, ω) − w(t, ω)| ≤ hj whenever 0 < h ≤ 1/k. Let n ≥ 4k and let us partition the interval [0, 1] into n equal parts. Then t ∈ [(i−1)/n, i/n] for some i. Firstly

|w((i+1)/n) − w(i/n)| ≤ |w((i+1)/n) − w(t)| + |w(i/n) − w(t)| ≤ 2j/n + j/n,

as t ∈ [(i−1)/n, i/n], 4/n ≤ 1/k, and therefore

0 < (i+1)/n − t = (i/n − t) + 1/n ≤ 2/n ≤ 1/k,  0 < i/n − t ≤ 1/n ≤ 1/k.    (B.2)
Secondly

|w((i+2)/n) − w((i+1)/n)| ≤ |w((i+2)/n) − w(t)| + |w((i+1)/n) − w(t)| ≤ 3j/n + 2j/n,

since by (B.2)

0 < (i+2)/n − t = (i/n − t) + 2/n ≤ 3/n ≤ 1/k,  0 < (i+1)/n − t ≤ 2/n ≤ 1/k.

Thirdly

|w((i+3)/n) − w((i+2)/n)| ≤ |w((i+3)/n) − w(t)| + |w((i+2)/n) − w(t)| ≤ 4j/n + 3j/n,

since again by (B.2)

0 < (i+3)/n − t = (i/n − t) + 3/n ≤ 4/n ≤ 1/k,  0 < (i+2)/n − t ≤ 3/n ≤ 1/k.

Let C_{in} be the set

C_{in} ≜ ∩_{m=1}^{3} { ω : |w((i+m)/n, ω) − w((i+m−1)/n, ω)| ≤ (2m+1)j/n }.

If ω ∈ A_{jk}, then t ∈ [(i−1)/n, i/n] for some t and i. Hence, by the three inequalities just proved, A_{jk} ⊆ ∪_{i=1}^{n} C_{in}, and it is sufficient to show that

lim_{n→∞} P(∪_{i=1}^{n} C_{in}) = 0.
Let us estimate the probability of C_{in}. By the definition of Wiener processes the distribution of

ξ_m ≜ √n ( w((i+m)/n) − w((i+m−1)/n) )

is N(0, 1). Hence

P(|ξ_m| ≤ α) = (1/√(2π)) ∫_{−α}^{α} exp(−x²/2) dx ≤ (1/√(2π)) · 2α ≤ α.

Using that Wiener processes have independent increments, for every i

P(C_{in}) = P( ∩_{m=1}^{3} { |√n (w((i+m)/n) − w((i+m−1)/n))| ≤ (2m+1)j/√n } ) = ∏_{m=1}^{3} P( |ξ_m| ≤ (2m+1)j/√n ) ≤ 3·5·7·j³ / n^{3/2}.

Hence

lim sup_{n→∞} P(∪_{i=1}^{n} C_{in}) ≤ lim sup_{n→∞} Σ_{i=1}^{n} P(C_{in}) ≤ lim_{n→∞} n · 105j³/n^{3/2} = 0.
Proposition B.7 If w is a Wiener process then almost surely

lim sup_{t→∞} w(t) = ∞  and  lim inf_{t→∞} w(t) = −∞.

Proof. We prove only the first relation. As t ↦ w(t+s) − w(s) is a Wiener process for every s, one should only prove that almost surely

η ≜ sup_{t≥0} w(t) = ∞    (B.3)

for every Wiener process w. Let w be a Wiener process. It is trivial from the definition that if c ≠ 0 then w_c(t) ≜ c·w(t/c²) is also a Wiener process. As w is continuous, it is sufficient to take the supremum in (B.3) at rational points of time, so η is a random variable. Obviously, for c > 0,

sup_t w_c(t) = sup_t c·w(t/c²) = c·η.

The distribution of the supremum of a process depends only on the finite-dimensional distributions of the process. Hence η and c·η have the same distribution. Therefore η can almost surely only be 0 or ∞. The process t ↦ w(t+1) − w(1)
is also a Wiener process, therefore sup_{t≥1}(w(t) − w(1)) is almost surely either zero or +∞. Hence

P(η = 0) ≜ P(sup_{t≥0} w(t) = 0) ≤ P(sup_{t≥1} w(t) ≤ 0) = P( w(1) + sup_{t≥0}(w(t+1) − w(1)) ≤ 0 ) = P( w(1) ≤ 0, sup_{t≥0}(w(t+1) − w(1)) = 0 ).

The two events in the last probability are independent, so

p ≜ P(η = 0) ≤ P(w(1) ≤ 0) · P( sup_{t≥0}(w(t+1) − w(1)) = 0 ) = p/2,

so p = 0.

Corollary B.8 For every number a the set {t : w(t) = a} is not bounded from above. In particular the one-dimensional Wiener process returns to the origin infinitely many times.

Proposition B.9 (Law of large numbers) If w is a Wiener process, then almost surely

lim_{t→∞} w(t)/t = 0.
Proof. By Doob's inequality

E( sup_{2ⁿ≤t≤2ⁿ⁺¹} (w(t)/t)² ) ≤ (1/2^{2n}) E( sup_{2ⁿ≤t≤2ⁿ⁺¹} w²(t) ) ≤ (1/2^{2n}) · 4 · E(w²(2ⁿ⁺¹)) = (4/2^{2n}) D²(w(2ⁿ⁺¹)) = (4/2^{2n}) · 2ⁿ⁺¹ = 8/2ⁿ.

By Markov's inequality

P( sup_{2ⁿ≤t≤2ⁿ⁺¹} |w(t)|/t > ε ) ≤ 8/(2ⁿε²).

By the Borel–Cantelli lemma, almost surely, except for some finite number of n,

sup_{2ⁿ≤t≤2ⁿ⁺¹} |w(t)|/t ≤ ε,

which proves the proposition.
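The law of large numbers is easy to observe on a simulated path. The following sketch is our illustration (grid, horizon and seed are arbitrary): it approximates w by a Gaussian random walk and tracks |w(t)|/t at a few dyadic times.

```python
import random, math

random.seed(1)

# Simulate w on an equally spaced grid and watch |w(t)|/t for large t.
dt, T = 1.0, 2 ** 16
w, ratios, t = 0.0, [], 0.0
while t < T:
    t += dt
    w += random.gauss(0.0, math.sqrt(dt))
    if t in (2 ** 10, 2 ** 13, 2 ** 16):
        ratios.append(abs(w) / t)

print(ratios)   # typically small and shrinking towards 0
```

Since w(t) has standard deviation √t, the ratio |w(t)|/t is of order 1/√t, which is what the printed values reflect.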
Corollary B.10 If w is a Wiener process, then

ŵ(t) ≜ { t·w(1/t)  if t > 0;  0  if t = 0 }

is indistinguishable from a Wiener process.

Corollary B.11 If w is a Wiener process, then for every r > 0

P(A) ≜ P(w(t) ≤ 0 : ∀t ∈ [0, r]) = 0,

and for almost all ω there is an ε(ω, r) > 0 such that

(−ε(ω, r), ε(ω, r)) ⊆ w([0, r], ω).

Proof. The second part easily follows from the first part. As w is a Wiener process, one can assume that

ŵ(t) ≜ { t·w(1/t)  if t > 0;  0  if t = 0 }

is also a Wiener process. But on A the trajectories of ŵ are bounded from above, which implies¹ that P(A) = 0.

Corollary B.12 If w is a Wiener process and G is an open set, then² almost surely

inf{t : w(t) ∈ cl(G)} = inf{t : w(t) ∈ G}.    (B.4)

Proof. Recall that the random variable on the left-hand side is a stopping time³. One can assume that G ≠ ∅, otherwise the statement is trivial. From Proposition B.7 it is easy to see that

τ ≜ inf{t : w(t) ∈ cl(G)} = min{t : w(t) ∈ cl(G)} < ∞.

By the strong Markov property⁴ it is clear that w*(t) ≜ w(τ + t) − w(τ) is also a Wiener process. As w(τ) ∈ cl(G), by the previous statement, for any rational number r > 0 almost surely there is a t(ω, r) such that w(τ(ω) + t(ω, r), ω) ∈ G. From this (B.4) is obvious.

¹ See: Proposition B.7, page 564.
² See: Example 6.10, page 364.
³ See: Example 1.32, page 17.
⁴ See: Proposition 1.109, page 70.
B.2 Existence of Wiener Processes
We defined the Wiener processes by their properties. We now show that these properties are consistent.

Theorem B.13 One can construct a stochastic process w which is a Wiener process.

Proof. First we construct w on the time interval [0, 1]; later we shall extend the construction to R₊.

1. Let (Ω, A, P) be a probability space⁵ on which there are countably many independent random variables (ξ_n) with distribution N(0, 1). Let H ⊆ L²(Ω) be the closed linear space generated by (ξ_n). As a linear combination of independent Gaussian variables is again Gaussian, and as convergence in L² implies weak convergence, all the vectors in H are Gaussian. H is a Hilbert space with the orthonormal basis (ξ_n). Let (e_n) be an orthonormal basis in the Hilbert space L²([0, 1]). Let us define the continuous linear isomorphism T : L²([0, 1]) → H determined by the correspondence e_n ↔ ξ_n:

T : Σ_k a_k e_k ↦ Σ_k a_k ξ_k.

χ([0, t]) ∈ L²([0, 1]) for every t. If χ([0, t]) = Σ_k a_k(t) e_k, then let

w(t) ≜ T(χ([0, t])) = T( Σ_k a_k(t) e_k ) = Σ_k a_k(t) T(e_k) = Σ_k a_k(t) ξ_k = Σ_k (χ([0, t]), e_k) · ξ_k = Σ_k ( ∫₀ᵗ e_k dλ ) · ξ_k.

For any u and v

η ≜ u·w(t) + v·(w(t+h) − w(t)) = T( u·χ([0, t]) + v·χ((t, t+h]) )

is Gaussian. Hence the Fourier transform of η at s = 1 is

E( exp(iu·w(t) + iv·(w(t+h) − w(t))) ) = exp( −D²(η)/2 ) = exp( −‖u·χ([0, t]) + v·χ((t, t+h])‖²₂ / 2 ) = exp( −(u²t + v²h)/2 ) = exp(−u²t/2) exp(−v²h/2).

⁵ For example (Ω, A, P) = ([0, 1], B([0, 1]), λ).
Therefore w(t) and w(t+h) − w(t) are independent⁶ and

w(t+h) − w(t) ≅ N(0, √h).

In the same way one can easily show that w has independent and stationary increments. This means that w is nearly a Wiener process. The only problem is that the trajectories w(ω) are not necessarily continuous⁷!

2. We show that if the orthonormal basis of L²([0, 1]) is the set of Haar functions, then w has a version which is almost surely continuous. Let I(n) be the set of odd numbers between 0 and 2ⁿ, that is, I(0) ≜ {1}, I(1) ≜ {1}, I(2) ≜ {1, 3}, etc. Let H₁⁽⁰⁾(t) ≜ 1, and if n ≥ 1 then for k ∈ I(n) let

H_k^{(n)}(t) ≜ { +2^{(n−1)/2}  if t ∈ [(k−1)/2ⁿ, k/2ⁿ);  −2^{(n−1)/2}  if t ∈ [k/2ⁿ, (k+1)/2ⁿ);  0  if t ∉ [(k−1)/2ⁿ, (k+1)/2ⁿ) }.

The Haar functions (H_k^{(n)})_{k,n} form a complete orthonormal system of L²([0, 1]). One can show the orthonormality by a simple calculation. The proof of the completeness is the following: let f be orthogonal to every H_k^{(n)} and let F(x) ≜ ∫₀ˣ f(t) dt. F(1) − F(0) = (f, H₁⁽⁰⁾) = 0, hence F(1) = F(0) = 0. Similarly

0 = (f, H₁⁽¹⁾) = ∫₀¹ f H₁⁽¹⁾ dλ = ∫₀^{1/2} f dλ − ∫_{1/2}¹ f dλ = F(1/2) − F(0) − (F(1) − F(1/2)) = 2F(1/2),

hence F(1/2) = 0. In a similar way one can prove that F(k/2ⁿ) = 0 for every k and n, that is, F ≡ 0. With the Monotone Class Theorem one can easily prove that ∫_A f dλ = 0 for every A ∈ B([0, 1]), which implies that f = 0 almost surely.

3. For every k ∈ I(n) let

a_k^{(n)}(t) ≜ ∫₀ᵗ H_k^{(n)} dλ = ( χ([0, t]), H_k^{(n)} ).

Let

w_n(t) ≜ Σ_{m=0}^{n} Σ_{k∈I(m)} a_k^{(m)}(t) ξ_k^{(m)}.

As the functions a_k^{(n)} are obviously continuous, the trajectories of w_n(ω) are continuous for every ω. We show that for almost all ω the series (w_n(t, ω)) is uniformly convergent in t. Let

b_n ≜ max_{k∈I(n)} |ξ_k^{(n)}|.

⁶ See: Lemma 1.96, page 60.
⁷ w(t) is defined just as a vector of H, that is, w(t) is defined only up to measure-zero sets.
Since ξ_k^{(n)} ≅ N(0, 1), if x > 0 then

P(|ξ_k^{(n)}| > x) = 2 ∫_x^∞ (1/√(2π)) exp(−u²/2) du ≤ √(2/π) ∫_x^∞ (u/x) exp(−u²/2) du = √(2/π) exp(−x²/2)/x,

hence

P(b_n > n) = P( ∪_{k∈I(n)} { |ξ_k^{(n)}| > n } ) ≤ 2ⁿ √(2/π) exp(−n²/2)/n.

Since Σ_{n=1}^∞ 2ⁿ exp(−n²/2)/n < ∞, by the Borel–Cantelli lemma

P( lim sup_{n→∞} {b_n > n} ) = 0,

that is, for almost all ω there is an n₀(ω) such that b_n(ω) ≤ n whenever n ≥ n₀(ω). Observe that for fixed n the supports of the non-negative functions a_k^{(n)}, k ∈ I(n), are disjoint, hence for every t

Σ_{k∈I(n)} a_k^{(n)}(t) ≤ max_{k∈I(n)} ‖a_k^{(n)}‖_∞ = 2^{(n−1)/2} · 2^{−n} = 2^{−(n+1)/2}.

From this, for almost all ω and for all n large enough,

|w_n(t, ω) − w_{n−1}(t, ω)| = | Σ_{k∈I(n)} ξ_k^{(n)}(ω) a_k^{(n)}(t) | ≤ n Σ_{k∈I(n)} a_k^{(n)}(t) ≤ n·2^{−(n+1)/2}.

Since Σ_{n=1}^∞ n·2^{−(n+1)/2} < ∞, the series (w_n(t, ω)) is uniformly convergent in t for almost all ω. Hence its limit w(t, ω) is almost surely continuous in t. By the construction w is defined and continuous up to a measure-zero set, so one can set to zero the trajectories where w is not continuous or where it is not defined.

4. Finally one should extend w from [0, 1] to [0, ∞). Let w⁽ⁿ⁾, n = 1, 2, …, be countably many independent Wiener processes on [0, 1]. One can construct such processes as we assumed that there are countably many independent N(0, 1) variables on Ω, and these can be arranged in an infinite two-dimensional array. Let w(0) ≜ 0 and let

w(t) ≜ w(n) + w⁽ⁿ⁺¹⁾(t − n)  if t ∈ [n, n + 1).

As w⁽ⁿ⁾(0) = 0 for every n, w is continuous on [0, ∞). With a direct calculation it is easy to check that w is a Wiener process on R₊.

On the space C(R₊) let us define the topology of uniform convergence on compacts. Using the Stone–Weierstrass theorem it is easy to see that C(R₊) is a complete separable metric space. Let B ≜ B(C(R₊)) be the Borel σ-algebra of C(R₊). It is easy to see that the σ-algebra generated by the process

X(ω, t) = ω(t),
ω ∈ C (R+ )
is equal to B.

Definition B.14 A measure W on the measurable space (C(R₊), B) is a Wiener measure if the process

w(t, ω) ≜ ω(t),  ω ∈ C(R₊),

satisfies the following conditions:
1. w(0) = 0 almost surely,
2. w has independent increments,
3. if t > s the distribution of w(t) − w(s) is N(0, √(t−s)).

Proposition B.15 There is a Wiener measure.

Proof. Let w be a Wiener process on some stochastic base (Ω, A, P, F) and let F(ω) ≜ w(ω). Obviously F : Ω → C(R₊). As for every Borel measurable set B ⊆ R

{w(t) ∈ B} = F⁻¹({f ∈ C(R₊) : f(t) ∈ B}) ∈ F_t,

F is (Ω, F_∞) → (C(R₊), B) measurable. The distribution of F,

W(A) ≜ P(F⁻¹(A)),  A ∈ B,

defines a Wiener measure on (C(R₊), B). As the sets {ω(t) ∈ B} generate B, one can easily prove the next observation.
Proposition B.16 The Wiener measure is unique.
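The Haar-function construction in the proof of Theorem B.13 can be sketched numerically: the primitives a_k^{(n)} are triangular 'tent' (Schauder) functions, and a truncated series with independent N(0, 1) coefficients already behaves like a Wiener path. The sketch below is ours; the truncation level is an arbitrary choice.

```python
import random

random.seed(2)
LEVELS = 10  # truncation level of the Haar expansion (our assumption)

def schauder(n, k, t):
    """a_k^(n)(t) = integral_0^t H_k^(n) d(lambda): a tent of height 2^(-(n+1)/2)."""
    if n == 0:
        return t                        # primitive of H_1^(0) = 1
    h = 2.0 ** (-n)                     # half-width of the tent centred at k/2^n
    return 2.0 ** ((n - 1) / 2) * max(0.0, h - abs(t - k * h))

# One independent N(0,1) coefficient xi_k^(n) per Haar function.
coeffs = {(0, 1): random.gauss(0, 1)}
for n in range(1, LEVELS + 1):
    for k in range(1, 2 ** n, 2):       # k runs over the odd numbers I(n)
        coeffs[(n, k)] = random.gauss(0, 1)

def w(t):
    """Truncated series w_LEVELS(t) = sum over (n, k) of xi * a_k^(n)(t)."""
    return sum(xi * schauder(n, k, t) for (n, k), xi in coeffs.items())

print(w(0.0))                                # 0.0: every tent vanishes at t = 0
print(abs(w(1.0) - coeffs[(0, 1)]) < 1e-12)  # True: only a_1^(0)(t) = t survives at t = 1
```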
B.3 Quadratic Variation of Wiener Processes
It is a natural question to ask when the quadratic variation of a Wiener process converges almost surely.

Theorem B.17 (Quadratic variation of Wiener processes) Let w be a Wiener process and let P_n ≜ (t_k^{(n)}) be an infinitesimal sequence of partitions of an interval [a, b]. In the topology of convergence in L²(Ω)

lim_{n→∞} Σ_k ( w(t_k^{(n)}) − w(t_{k−1}^{(n)}) )² = b − a.    (B.5)

In a similar way, if w₁ and w₂ are independent Wiener processes, then

lim_{n→∞} Σ_k ( w₁(t_k^{(n)}) − w₁(t_{k−1}^{(n)}) )( w₂(t_k^{(n)}) − w₂(t_{k−1}^{(n)}) ) = 0.    (B.6)

If P_{n+1} is a refinement of P_n for all n, then the convergence holds almost surely.

Proof. Let ∆w(t_k) ≜ w(t_k) − w(t_{k−1}).

1. By the definition of Wiener processes

E( Σ_k (∆w(t_k^{(n)}))² ) = Σ_k ( t_k^{(n)} − t_{k−1}^{(n)} ) = b − a.

Recall that if the distribution of ξ is N(0, 1), then

D²(ξ²) = D²(χ²₁) = 2.

w has independent increments and the expected value of the increments is zero, hence, as the sequence of partitions is infinitesimal,

E( ( Σ_k (∆w(t_k^{(n)}))² − (b − a) )² ) = D²( Σ_k (∆w(t_k^{(n)}))² ) = Σ_k D²( (∆w(t_k^{(n)}))² ) = 2 Σ_k ( t_k^{(n)} − t_{k−1}^{(n)} )² ≤ 2·(b − a)·max_k ( t_k^{(n)} − t_{k−1}^{(n)} ) → 0.
If w₁ and w₂ are independent Wiener processes, then (w₁ ± w₂)/√2 are also Wiener processes. (B.6) follows from the identity

4 Σ_k ∆w₁(t_k) ∆w₂(t_k) = Σ_k [∆(w₁ + w₂)(t_k)]² − Σ_k [∆(w₁ − w₂)(t_k)]².

2. The proof of the almost sure convergence is a bit more complicated. The quadratic variation depends only on the trajectories, so one can assume that

(Ω, A, P) = (C(R₊), B, W),

where B is the σ-algebra of the Borel measurable sets of the function space C(R₊), and W is the Wiener measure, that is, the common distribution of the Wiener processes. Fix an interval [0, u] and a partition P_n with n points. Let us consider the 2ⁿ sequences of signs ±1 corresponding to the n points. To every sequence of signs (s_k)_{k=1}^{n} and to every f ∈ C[0, ∞) let us assign the function

f̃(t) ≜ f(t_{k−1}^{(n)}) + s_k ( f(t) − f(t_{k−1}^{(n)}) ),  t ∈ [t_{k−1}^{(n)}, t_k^{(n)}).

We shall call the correspondence f ↦ f̃ an alternation. The Gaussian distributions are symmetric, so if w is a Wiener process then the alternated process ω ↦ w̃(ω) is also a Wiener process. The Wiener measure is the common, therefore unique, distribution of every Wiener process, so W is invariant under all the alternations f ↦ f̃. Let B_n ⊆ B be the set of events which are invariant under all the 2ⁿ alternations. It is easy to see that B_n is a σ-algebra. As the (n+1)-th partition refines the n-th one, every alternation corresponding to the partition P_{n+1} is also an alternation corresponding to P_n. Hence B_{n+1} ⊆ B_n. If i ≠ j, then

E( ∆w(t_i^{(n)}) ∆w(t_j^{(n)}) | B_n ) = 0,

since if B is invariant under an alternation of the i-th interval, then by the integral transformation theorem, using that W is invariant under the alternations,

∫_B ∆w(t_i^{(n)}) ∆w(t_j^{(n)}) dW = − ∫_B ∆w(t_i^{(n)}) ∆w(t_j^{(n)}) dW.

On the other hand, (∆w(t_i^{(n)}))² is invariant under every possible alternation, so it is B_n-measurable. By the energy equality

E( w²(u) | B_n ) = E( ( Σ_k ∆w(t_k^{(n)}) )² | B_n ) = E( Σ_k (∆w(t_k^{(n)}))² | B_n ) = Σ_k (∆w(t_k^{(n)}))².
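The L² convergence (B.5) is easy to observe numerically. The sketch below is our illustration: for each n it draws an independent discretized Wiener path on [0, 1] over the uniform partition (so it illustrates convergence in distribution/L², not the pathwise refinement argument) and sums the squared increments.

```python
import random, math

random.seed(3)

def quadratic_variation_sum(n_steps, a=0.0, b=1.0):
    """Sum of squared increments of a simulated Wiener path over a uniform partition."""
    dt = (b - a) / n_steps
    sd = math.sqrt(dt)
    return sum(random.gauss(0.0, sd) ** 2 for _ in range(n_steps))

for n in (100, 10_000, 1_000_000):
    print(n, quadratic_variation_sum(n))   # approaches b - a = 1
```

The variance of the sum is 2/n, in line with the estimate 2(b − a) max_k(t_k − t_{k−1}) in the proof.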
By Lévy's theorem about reversed martingales⁸ the expression on the left-hand side is almost surely convergent. Hence the sum on the right-hand side is also convergent for almost all ω. If the partition sequence is infinitesimal, then by the convergence in L²(Ω) just proved one can easily show that the almost sure limit is the quadratic variation.

It is very natural to ask what happens if the partition P_{n+1} does not refine the partition P_n.

Theorem B.18 (Almost sure convergence of the quadratic variation) If for a sequence of partitions P_n ≜ (t_k^{(n)})

l_n ≜ max_k ( t_{k+1}^{(n)} − t_k^{(n)} ) = o(1/log n),

then for every Wiener process w the sequence in (B.5) is almost surely convergent.

Proof. To make the notation simple let [a, b] = [0, 1]. Let (t_k^{(n)})_{k=0}^{N(n)} be a partition of [0, 1] and let c_k ≜ t_k^{(n)} − t_{k−1}^{(n)}.

1. Recall that the moment-generating function of the distribution χ²₁ is

M₁(s) = E( exp(s·N(0, 1)²) ) = (1/√(2π)) ∫_{−∞}^{∞} exp(sx²) exp(−x²/2) dx = (1/√(2π)) ∫_{−∞}^{∞} exp( −(x²/2)(1 − 2s) ) dx = 1/√(1 − 2s),  s < 1/2.

2. Let ε > 0, let 0 < a ≜ 1 − ε and let s < 0. By Markov's inequality, using the formula for the moment-generating function of χ²₁ and using that if x ≥ 0 then ln(1+x) ≥ x − x²/2,

p_n^{(1)}(a) ≜ P( Σ_k (∆w(t_k^{(n)}))² − 1 ≤ −ε ) = P( Σ_k (∆w(t_k^{(n)}))² ≤ a ) = P( s Σ_k (∆w(t_k^{(n)}))² ≥ sa ) ≤ E( exp( s Σ_k c_k N_k(0, 1)² ) ) / exp(sa) = ∏_k ( 1/√(1 − 2sc_k) ) / exp( sa Σ_k c_k ) = ∏_k exp( −sac_k − ½ ln(1 − 2sc_k) ) ≤ ∏_k exp( −s(ac_k − c_k) + s²c_k² ) = exp( −s(a − 1) + s² Σ_k c_k² ) ≤ exp( −s(a − 1) + s² max_k c_k ) = exp( −s(a − 1) + s² l_n ).

The minimum over s < 0 is attained at

s = (a − 1)/(2l_n).

Substituting it back,

p_n^{(1)}(a) ≤ exp( −(a − 1)²/(2l_n) + (a − 1)²/(4l_n) ) = exp( −(a − 1)²/(4l_n) ) ≜ exp( −K₁(a)/l_n ),    (B.7)

where K₁(a) > 0.

⁸ See: Theorem 1.75, page 46.
3. Now let a ≜ 1 + ε and s > 0. Then

p_n^{(2)}(a) ≜ P( Σ_k (∆w(t_k^{(n)}))² − 1 ≥ ε ) = P( Σ_k (∆w(t_k^{(n)}))² ≥ a ) = P( s Σ_k (∆w(t_k^{(n)}))² ≥ sa ) ≤ E( exp( s Σ_k (∆w(t_k^{(n)}))² ) ) / exp(sa) = ∏_k E( exp( s (∆w(t_k^{(n)}))² ) ) / exp(sa) = ∏_k ( 1/√(1 − 2sc_k) ) / exp(sa) = exp( −½ Σ_k ln(1 − 2sc_k) − sa ) ≜ exp(f(s)).

Obviously f(0) = 0, and as a > 1, f′(0) = 1 − a < 0. If s → 1/(2 max_k c_k) then f(s) → ∞. Therefore f has a minimum at a point s* > 0 where

f′(s*) = Σ_k c_k/(1 − 2s*c_k) − a = 0.    (B.8)

Hence, with x_k ≜ 2s*c_k < 1,

p_n^{(2)}(a) ≤ exp( −½ Σ_k ( ln(1 − x_k) + x_k/(1 − x_k) ) ).

Now we want to estimate

ln(1 − x) + x/(1 − x)
over x ∈ (0, 1):

ln(1 − x) + x/(1 − x) = ∫₀ˣ ( −1/(1 − u) + 1/(1 − u)² ) du = ∫₀ˣ u/(1 − u)² du ≥ ∫₀ˣ (u − u²/2)/(1 − u)² du = ½ ∫₀ˣ (2u − u²)/(1 − u)² du = ½ ∫₀ˣ ( 1/(1 − u)² − 1 ) du = ½ ( x/(1 − x) − x ).

Hence, using (B.8),

p_n^{(2)}(a) ≤ exp( −¼ Σ_k ( x_k/(1 − x_k) − x_k ) ) = exp( −¼ ( 2s* Σ_k c_k/(1 − 2s*c_k) − 2s* Σ_k c_k ) ) = exp( −¼ (2s*a − 2s*) ) = exp( −(s*/2)(a − 1) ).

But again by (B.8)

1/(1 − 2l_n s*) = Σ_k c_k/(1 − 2l_n s*) ≥ Σ_k c_k/(1 − 2s*c_k) = a,

that is, s* ≥ (a − 1)/(2a l_n). Using this,

p_n^{(2)}(a) ≤ exp( −(a − 1)²/(4a l_n) ) ≜ exp( −K₂(a)/l_n ).    (B.9)
4. By the assumption of the theorem l_n = o(1/ln n), that is, l_n = ε_n/ln n for some ε_n → 0. Hence for every K > 0

exp( −K/l_n ) = exp( −(K/ε_n) ln n ) = (1/n)^{K/ε_n} ≜ b_n,

and Σ_n b_n < ∞. Using this and the estimates (B.7) and (B.9) just proved, for every m

Σ_{n=1}^∞ P( | Σ_k (∆w(t_k^{(n)}))² − 1 | ≥ 1/m ) < ∞,

hence by the Borel–Cantelli lemma

Σ_k (∆w(t_k^{(n)}))² → 1 almost surely.
One can ask whether we can improve the estimate of the order of l_n. The answer is no.

Example B.19 There is a sequence of partitions with l_n = O(1/log n) for which (B.5) is not almost surely convergent.
For every integer p ≥ 1 let Π_p be the set of partitions of [0, 1] formed from the intervals

J_p^k ≜ [2k/2^p, (2k+2)/2^p] = [k/2^{p−1}, (k+1)/2^{p−1}]

and

I_p^{2k} ≜ [2k/2^p, (2k+1)/2^p],  I_p^{2k+1} ≜ [(2k+1)/2^p, (2k+2)/2^p],

where in all cases k = 0, 1, …, 2^{p−1} − 1. During the construction of a partition one should choose 2^{p−1} times between one J_p^k and the pair (I_p^{2k}, I_p^{2k+1}), so the number of partitions in Π_p is 2^{2^{p−1}}. For a given p, for the one partition which uses only intervals of type I_p^k the maximal length is 2^{−p}, and for the other 2^{2^{p−1}} − 1 partitions the length of the maximal interval is 2^{−(p−1)}. If we list the partitions of ∪_p Π_p in a single sequence (π_n), grouped by p, then the index n of a partition from Π_p is at most

Σ_{1≤q≤p} 2^{2^{q−1}} < 2^{1+2^{p−1}}.

Observe that if l_n is the size of the largest interval in the n-th partition, then

l_n log n ≤ 2^{−(p−1)} (1 + 2^{p−1}) ≤ 3,

that is, l_n = O(1/log n). Let Q(π) be the approximating sum of the quadratic variation formed with the partition π, and let

M_p ≜ max{ Q(π) : π ∈ Π_p }.

The lim sup of the sequence (Q(π_n)) is the same as the lim sup of the sequence (M_p). Let

M_p^{(k)} ≜ max{ (∆w(I_p^{2k}))² + (∆w(I_p^{2k+1}))², (∆w(J_p^k))² }.

Obviously

M_p = Σ_{0≤k≤2^{p−1}−1} M_p^{(k)}.

The variables ∆w(I_p^{2k}) and ∆w(I_p^{2k+1}) are independent and their distribution is

N(0, σ) with σ² ≜ 2^{−p},

and ∆w(J_p^k) is the sum of the two independent variables ∆w(I_p^{2k}) and ∆w(I_p^{2k+1}). If ξ and η are independent variables with distribution N(0, σ) and

ζ ≜ max{ ξ² + η², (ξ + η)² },

then one can find constants a, b > 0, not depending on σ, such that

E(ζ) = 2(1 + a)σ²  and  D²(ζ) = bσ⁴.

With these constants

E(M_p^{(k)}) = (1 + a) 2^{−(p−1)}  and  D²(M_p^{(k)}) = b 2^{−2p}.

The number of the variables M_p^{(k)} in M_p is 2^{p−1}, so for their sum

E(M_p) = 1 + a  and  D²(M_p) = b 2^{−p−1}.

Since Σ_p b 2^{−p−1} < ∞, by Chebyshev's inequality and the Borel–Cantelli lemma

lim_{p→∞} M_p = 1 + a almost surely, hence lim sup_{n→∞} Q(π_n) = 1 + a > 1,

so (Q(π_n)) cannot converge almost surely to 1.
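The constant a of Example B.19 can be estimated by simulation. Writing max{A, B} = A + (B − A)⁺ with B − A = 2ξη, one gets E(ζ) = 2σ² + E(2ξη)⁺ = 2σ²(1 + 1/π), so 1 + a = 1 + 1/π ≈ 1.318. The Monte Carlo sketch below is ours (sample size and seed arbitrary).

```python
import random

random.seed(4)
n, sigma = 200_000, 1.0

# zeta = max(xi^2 + eta^2, (xi + eta)^2) for independent N(0, sigma) variables.
total = 0.0
for _ in range(n):
    xi, eta = random.gauss(0, sigma), random.gauss(0, sigma)
    total += max(xi * xi + eta * eta, (xi + eta) ** 2)

ratio = total / (n * 2 * sigma ** 2)   # estimates E(zeta) / (2 sigma^2) = 1 + a
print(ratio)                           # strictly above 1, so a > 0
```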
Appendix C

POISSON PROCESSES

Let us first define the point processes.

Definition C.1 Let F be a filtration and let (τ_n) be a sequence of stopping times. (τ_n) generates a point process if it satisfies the following assumptions:
1. τ₀ = 0,
2. τ_n ≤ τ_{n+1},
3. if τ_n(ω) < ∞ then τ_n(ω) < τ_{n+1}(ω).

The investigation of a point process (τ_n) is equivalent to the investigation of the counting process

N(t) ≜ Σ_{n=1}^∞ χ(τ_n ≤ t).
N is finite on the interval [0, τ_∞), where of course τ_∞ ≜ lim_n τ_n. As we defined stochastic processes only on deterministic intervals, we assume that τ_∞ = ∞, that is, we assume that N(t) is finite for every t. Otherwise we should restrict our counting processes to intervals [0, u] where u < τ_∞. The trajectories of N are increasing, so N is regular. By the second and third assumptions N is right-continuous. As the functions τ_n are F-stopping times, N is F-adapted, since for every a ≥ 0

{N(t) ≤ a} = { τ_{[a]+1} > t } ∈ F_t.

Very often the filtration F is not given explicitly and the point process is defined just by the random variables (τ_n). With (τ_n) one can define the counting process N, and the filtration is then the filtration F^N generated by N.
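The counting process N(t) = Σ_n χ(τ_n ≤ t) is straightforward to realize for a finite list of jump times. The sketch below is ours (the jump times are a hypothetical realisation of (τ_n)); it illustrates the right-continuous, unit-jump structure.

```python
def counting_process(jump_times, t):
    """N(t) = sum_n chi(tau_n <= t) for a finite list of jump times."""
    return sum(1 for tau in jump_times if tau <= t)

taus = [0.7, 1.3, 2.0, 4.5]            # hypothetical realisation of (tau_n)
print([counting_process(taus, t) for t in (0.0, 0.7, 1.5, 4.5, 10.0)])
# [0, 1, 2, 4, 4] -- right-continuous, with a unit jump exactly at each tau_n
```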
Definition C.2 The pair (N, F^N) is called the minimal representation of the point process (τ_n).

Proposition C.3 If the trajectories of a process X are right-regular and for every t and ω the trajectory X(ω) is constant on an interval [t, t + δ], where δ > 0 can depend on ω and t, then the filtration F^X generated by X is right-continuous.
Proof. It is well-known that for an arbitrary collection of random variables X ≜ (ξ_γ)_{γ∈Γ} any set C from the σ-algebra generated by X has a representation C = Ψ⁻¹(B), where B ∈ B(R^∞), Ψ(ω) ≜ (ξ_{γ_k}(ω))_k, and the set of indexes (γ_k) ⊆ Γ is at most countable. Let us fix a moment t. The trajectories X(ω) are right-regular and they are constant on an interval after t, hence for every n there is a set A_n ⊆ Ω such that if ω ∈ A_n then the trajectory X(ω) is constant on the closed interval [t, t + 1/n]. If C ∈ F_{t+} then C ∈ F_{t+1/n} for every n. By the just mentioned property of the generated σ-algebras, C = Ψ⁻¹(B_n), where

Ψ(ω) ≜ (X(t₁, ω), …, X(t_k, ω), …),  B_n ∈ B(R^∞),  t_k ≤ t + 1/n.

Let t_k^{(n)} ≜ t_k ∧ t and let Ψ_n be the analogous correspondence defined by (t_k^{(n)})_k. As t_k^{(n)} ≤ t, obviously C_n ≜ Ψ_n⁻¹(B_n) ∈ F_t^X. If ω ∈ A_n, then by the structure of the trajectories Ψ_n(ω) = Ψ(ω) and therefore

C_n ∆ C ⊆ A_n^c.    (C.1)

A_n ↗ Ω, hence A_n^c ↘ ∅. If

C_∞ ≜ lim sup_{n→∞} C_n ≜ ∩_n ∪_{m=n}^∞ C_m ∈ F_t^X,

then as A_n^c ↘ ∅, by (C.1) C = C_∞ ∈ F_t^X. This means that the filtration F^X is right-continuous.

Corollary C.4 If N is the counting process of a point process, then the filtration F^N is right-continuous and the jump times (τ_n) are stopping times with respect to F^N.

Proof. To prove the last statement it is sufficient to remark that τ_n is the hitting time¹ of the open set (n − 1/2, ∞).

The filtration F^N is right-continuous, but the usual assumptions do not hold: one should complete the space (Ω, F_∞^N) with respect to P and add the measure-zero sets to the σ-algebras F_t^N.

Lemma C.5 If τ is a stopping time of the augmented filtration, then τ = τ′ + τ′′, where τ′ is a stopping time of the filtration F^N and τ′′ is almost surely zero.

¹ See: Example 1.32, page 17.
Proof. Let $\tau$ be a stopping time of the augmented filtration. For every $t \ge 0$ there is an $A_t \in \mathcal{F}^N_t$ such that $\{\tau < t\} \stackrel{a.s.}{=} A_t$. If
\[
\tau'(\omega) := \inf\{r \in \mathbb{Q}_+ : \omega \in A_r\},
\]
then $\{\tau' < t\} = \bigcup_{s \in \mathbb{Q}_+,\, s < t} A_s \in \mathcal{F}^N_t$, so $\tau'$ is a stopping time of $\mathcal{F}^N$, and $\{\tau' < t\} \stackrel{a.s.}{=} \{\tau < t\}$ for every $t$,
so $\tau \stackrel{a.s.}{=} \tau'$.

Proposition C.6 (Representation of stopping times) If $\tau$ is a stopping time of the minimal representation of a point process $(\tau_n)$ then there is a sequence of real-valued Borel measurable functions $(\varphi_n)$, where $\varphi_n : \mathbb{R}^n \to \mathbb{R}$, and a constant $\varphi_0$ such that on the set $\{\tau < \infty\}$
\[
\tau = \varphi_0 \chi(\tau_0 \le \tau < \tau_1) + \sum_{n=1}^{\infty} \chi(\tau_n \le \tau < \tau_{n+1})\, \varphi_n(\tau_0, \ldots, \tau_n) = \sum_{n=0}^{\infty} \chi(\tau_n \le \tau < \tau_{n+1})\, \varphi_n(\tau_0, \ldots, \tau_n).
\]
Proof. First assume that $\Omega$ is the canonical space of point processes, that is, $\Omega$ is the space of right-continuous functions which have jumps of size one and which are constant between the jumps. $\Omega$ is closed under truncation and therefore $\mathcal{G}_\tau = \mathcal{F}_\tau$, where²
\[
\mathcal{G}_\tau := \sigma(N(\tau \wedge t) : t \ge 0) \tag{C.2}
\]
is the $\sigma$-algebra defined in line (A.4). $\tau$ is $\mathcal{F}_\tau$-measurable, hence $\tau$ is $\mathcal{G}_\tau$-measurable. By (C.2), for some Borel measurable function³ $\varphi : \mathbb{R}^\infty \to \mathbb{R}$
\[
\tau = \varphi(N(\tau \wedge t_1), N(\tau \wedge t_2), \ldots, N(\tau \wedge t_k), \ldots) = \Bigl( \sum_{n=1}^{\infty} \chi(\tau_{n-1} \le \tau < \tau_n) + \chi(\tau_\infty \le \tau) \Bigr) \varphi(N(\tau \wedge t_k),\ k \in \mathbb{N}).
\]

² See: Proposition A.17, page 556.
³ It is a simple consequence of the Monotone Class Theorem.
On the set $\{\tau_n \le \tau < \tau_{n+1}\}$
\[
N(\tau \wedge t_k) = N(\tau_n \wedge t_k) = \sum_{l=1}^{n} \chi(\tau_l \le t_k),
\]
and on the set $\{\tau_\infty \le \tau\}$
\[
N(\tau \wedge t_k) = N(t_k) = \sum_{l=1}^{\infty} \chi(\tau_l \le t_k),
\]
from which the representation is evident. Now let us assume that $\tau$ is a stopping time of the minimal representation. One can obviously embed the minimal representation into the canonical representation. Let us extend $\tau$ with the definition $\tau(\omega) := \infty$ for every new outcome $\omega$. As we already discussed,⁴ $\tau$ remains a stopping time, therefore one can construct the needed representation.

Example C.7 Point process with a single jump.
Let $0 \le \sigma$ be a random time. One can define a point process with $\tau_1 := \sigma$ and $\tau_k := \infty$ for $k \ge 2$. In this case $N(t,\omega) := \chi(\sigma(\omega) \le t)$. By the just proved proposition, if $\tau \ge 0$ is a stopping time of the filtration $\mathcal{F} := \mathcal{F}^N$ then
\[
\tau = \varphi_0 \chi(\tau < \sigma) + \chi(\sigma \le \tau < \infty)\, \varphi_1(\sigma) + \chi(\tau = \infty) \cdot \infty.
\]
If $\tau < \infty$ then
\[
\tau = \varphi_0 \chi(\tau < \sigma) + \chi(\sigma \le \tau)\, \varphi_1(\sigma). \tag{C.3}
\]
Therefore $\tau$ is constant on the set $\{\tau < \sigma\}$. For instance, $\tau := \sigma \wedge 1$ satisfies (C.3) with $\varphi_0 = 1$ and $\varphi_1(s) = s$.

$N^{\tau_n}$ is bounded for every $n$, so trivially $N \in \mathcal{A}^+_{loc}$. This means that $N$ has a compensator $N^p$.

Example C.8 Predictable compensator of point processes.
Let $N$ be a counting process and let us assume that we study $N$ in its minimal representation. Assume that we added the measure-zero sets to the filtration and that $\mathcal{F}_\infty$ is complete. In this case the usual conditions hold and $N$ has a compensator $N^p$. Let
\[
F_1(t) := \mathbb{P}(\tau_1 \le t), \qquad F_k(t) := \mathbb{P}(\tau_k \le t \mid \tau_1, \ldots, \tau_{k-1}), \tag{C.4}
\]

⁴ See: Lemma A.18, page 557.
APPENDIX C
583
be the conditional distributions of the jumps. Assume that the conditional distributions are regular. Define the so-called integrated conditional hazard rates
\[
A_k(t) := \int_0^{t \wedge \tau_k} \frac{dF_k(u)}{1 - F_k(u-)}. \tag{C.5}
\]
We show that
\[
N^p = \sum_i A_i =: B. \tag{C.6}
\]
To prove this it is sufficient to show that $B$ is predictable and $N^{\tau_k} - B^{\tau_k}$ is a uniformly integrable martingale. Before the proof, let us discuss the interpretation of the formula. $\tau_k$ is larger than $\tau_{k-1}$, and the conditional probability of the random segment $[0, \tau_{k-1}]$ with respect to $F_k$ is zero. Hence the measures related to $F_k$ are concentrated on the set $(\tau_{k-1}, \infty]$. One should pay the compensation fee $A_k$ for the $k$-th jump only on the interval $(\tau_{k-1}, \tau_k]$. After every jump the amount of the compensation can change. The expression in the integral, the hazard rate
\[
\frac{dF_k(u)}{1 - F_k(u-)},
\]
is the amount of compensation one should pay at moment $u$: it is the probability that the $k$-th jump of the process will happen at time $u$ under the condition that there was no jump before $u$. If the distribution $F_k$ is continuous then⁵
\[
\int_0^{t \wedge \tau_k} \frac{dF_k(u)}{1 - F_k(u-)} = -\ln(1 - F_k(t \wedge \tau_k)).
\]

1. Let us first prove that $N^{\tau_k} - B^{\tau_k}$ is a uniformly integrable martingale. To prove this we show first that
\[
\mathbb{E}(N(\theta \wedge \tau_k)) = \mathbb{E}(B(\theta \wedge \tau_k)) \tag{C.7}
\]
for every stopping time $\theta$. Using the representation of the stopping times of the minimal representation,
\[
\theta \stackrel{a.s.}{=} \sum_{k=1}^{\infty} \chi(\tau_{k-1} \le \theta < \tau_k)\, \varphi_{k-1}(\tau_0, \ldots, \tau_{k-1}), \qquad \theta < \infty.
\]
By this one can define Borel measurable functions of $s$ variables $\theta_s := \theta_s(\tau_1, \ldots, \tau_s)$ such that
\[
\theta \wedge \tau_k = \theta_{k-1} \wedge \tau_k.
\]
The measure generated by $F_k$ is concentrated on the set $(\tau_{k-1}, \infty]$. Hence
\[
\mathbb{E}(B(\theta \wedge \tau_n)) = \sum_{k=1}^{\infty} \mathbb{E}(A_k(\theta \wedge \tau_n)) = \sum_{k=1}^{n} \mathbb{E}(A_k(\theta \wedge \tau_k)) = \sum_{k=1}^{n} \mathbb{E}(A_k(\theta_{k-1} \wedge \tau_k)) = \sum_{k=1}^{n} \mathbb{E}\bigl( \mathbb{E}(A_k(\theta_{k-1} \wedge \tau_k) \mid \tau_1, \tau_2, \ldots, \tau_{k-1}) \bigr).
\]

⁵ See: (6.32), page 398.
Let us calculate the conditional expectation. Let
\[
I := \mathbb{E}(A_k(\theta_{k-1} \wedge \tau_k) \mid \tau_1, \tau_2, \ldots, \tau_{k-1}) = \mathbb{E}\Bigl( \int_0^{\theta_{k-1} \wedge \tau_k} \frac{dF_k(u)}{1 - F_k(u-)} \;\Big|\; \tau_1, \ldots, \tau_{k-1} \Bigr).
\]
Using the definition of $A_k$ and the regularity of the conditional expectations,
\[
I = \int_0^{\infty} \int_0^{\theta_{k-1} \wedge s} \frac{dF_k(u)}{1 - F_k(u-)}\, dF_k(s) = \int_0^{\theta_{k-1}} \int_0^{s} \frac{dF_k(u)}{1 - F_k(u-)}\, dF_k(s) + \int_{\theta_{k-1}}^{\infty} \int_0^{\theta_{k-1}} \frac{dF_k(u)}{1 - F_k(u-)}\, dF_k(s).
\]
If $s > \theta_{k-1}$ then in the second term the inner integral is not changing, that is, the second expression is
\[
(1 - F_k(\theta_{k-1})) \int_0^{\theta_{k-1}} \frac{dF_k(u)}{1 - F_k(u-)}.
\]
Integrating by parts in the first expression and using that $F_k(0) = 0$ for $k \ge 1$, the first integral is
\[
F_k(\theta_{k-1}) \int_0^{\theta_{k-1}} \frac{dF_k(u)}{1 - F_k(u-)} - \int_0^{\theta_{k-1}} \frac{F_k(u-)}{1 - F_k(u-)}\, dF_k(u).
\]
Reordering, this is
\[
(F_k(\theta_{k-1}) - 1) \int_0^{\theta_{k-1}} \frac{dF_k(u)}{1 - F_k(u-)} + F_k(\theta_{k-1}).
\]
Adding up the two integrals,
\[
\mathbb{E}\Bigl( \int_0^{\theta_{k-1} \wedge \tau_k} \frac{dF_k(u)}{1 - F_k(u-)} \;\Big|\; \tau_1, \ldots, \tau_{k-1} \Bigr) = F_k(\theta_{k-1}),
\]
hence
\[
\mathbb{E}(B(\theta \wedge \tau_n)) = \sum_{k=1}^{n} \mathbb{E}(F_k(\theta_{k-1})).
\]
On the other hand, using that $\tau_0 = 0$ and $N(0) = 0$,
\[
\mathbb{E}(N(\theta \wedge \tau_n)) = \sum_{i=1}^{n} \mathbb{E}\bigl(N(\theta \wedge \tau_i) - N(\theta \wedge \tau_{i-1})\bigr) = \sum_{i=1}^{n} \mathbb{E}\bigl(N(\theta_{i-1} \wedge \tau_i) - N(\theta_{i-1} \wedge \tau_{i-1})\bigr) =
\]
\[
= \mathbb{E}(\chi(\tau_1 \le \theta_0)) + \sum_{i=2}^{n} \mathbb{E}(\chi(\tau_i \le \theta_{i-1})) = \sum_{i=1}^{n} \mathbb{E}(F_i(\theta_{i-1})),
\]
therefore (C.7) is valid.

2. Observe that (C.7) alone does not imply that the truncated process is a uniformly integrable martingale, as both sides can be infinite. But
\[
|N^{\tau_n}(t) - B^{\tau_n}(t)| \le N^{\tau_n}(t) + B^{\tau_n}(t) \le n + B^{\tau_n}(t).
\]
By the just proved statements
\[
\mathbb{E}(B^{\tau_n}(t)) = \mathbb{E}(B(\tau_n \wedge t)) \le \mathbb{E}(B(\tau_n)) = \mathbb{E}(N(\tau_n)) = n, \tag{C.8}
\]
therefore the truncated process is really a uniformly integrable martingale.

3. One should prove that $B$ is predictable. Every natural process is predictable, hence one should show that $B$ is natural⁶. By definition this means that for any non-negative, bounded martingale⁷ $M$
\[
\mathbb{E}\Bigl( \int_0^t M\, dB \Bigr) = \mathbb{E}\Bigl( \int_0^t M_-\, dB \Bigr).
\]

⁶ See: Theorem 5.10, page 302.
⁷ See: Definition 5.7, page 299.
Let $M$ be a non-negative bounded martingale. $F_1$ is a distribution function⁸, so by the non-negativity of $M$ one can apply Fubini's theorem. Using that $\{\tau_1 < u\} \in \mathcal{F}_{u-}$ and that $\mathbb{E}(M(u) \mid \mathcal{F}_{u-}) = M(u-)$,
\[
\mathbb{E}\Bigl( \int_0^t M\, dA_1 \Bigr) = \mathbb{E}\Bigl( \int_0^t \frac{\chi(u \le \tau_1)\, M(u)}{1 - F_1(u-)}\, dF_1(u) \Bigr) = \int_0^t \frac{\mathbb{E}\bigl( \mathbb{E}\bigl( (1 - \chi(\tau_1 < u)) M(u) \mid \mathcal{F}_{u-} \bigr) \bigr)}{1 - F_1(u-)}\, dF_1(u) =
\]
\[
= \int_0^t \frac{\mathbb{E}\bigl( (1 - \chi(\tau_1 < u))\, \mathbb{E}(M(u) \mid \mathcal{F}_{u-}) \bigr)}{1 - F_1(u-)}\, dF_1(u) = \mathbb{E}\Bigl( \int_0^t \frac{\chi(u \le \tau_1)\, M_-(u)}{1 - F_1(u-)}\, dF_1(u) \Bigr) = \mathbb{E}\Bigl( \int_0^t M_-\, dA_1 \Bigr).
\]
The case $i > 1$ is a bit more complicated, as in this case for the conditional distribution functions $F_i(u) := \mathbb{P}(\tau_i \le u \mid \tau_1, \ldots, \tau_{i-1})$ one cannot apply Fubini's theorem.

4. Let $\mathcal{G}$ be a $\sigma$-algebra. Assume that $V$ is right-regular with finite variation and $V(t)$ is $\mathcal{G}$-measurable for every $t$. If $X := \sum_{k=1}^n \xi_k \chi_{I_k}$ is a step function, where the $I_k$ are intervals in $[0,t]$, then
\[
\mathbb{E}\Bigl( \int_0^t X\, dV \Bigr) = \mathbb{E}\Bigl( \mathbb{E}\Bigl( \int_0^t X(u)\, dV(u) \;\Big|\; \mathcal{G} \Bigr) \Bigr) = \mathbb{E}\Bigl( \mathbb{E}\Bigl( \sum_k \xi_k (V(t_k) - V(t_{k-1})) \;\Big|\; \mathcal{G} \Bigr) \Bigr) =
\]
\[
= \mathbb{E}\Bigl( \sum_k \mathbb{E}(\xi_k \mid \mathcal{G}) (V(t_k) - V(t_{k-1})) \Bigr) = \mathbb{E}\Bigl( \int_0^t \mathbb{E}(X(u) \mid \mathcal{G})\, dV(u) \Bigr).
\]
The set of processes $X$ for which the above identity holds forms a $\lambda$-system. So by the Monotone Class Theorem the identity holds for any product measurable, bounded process $X$. With the Monotone Convergence Theorem one can extend the identity to any non-negative product measurable process $X$.

⁸ Not a conditional distribution function.
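The conditioning identity of step 4 can be verified on a toy discrete example. The sketch below is my illustration (the probability space and all names are ad hoc, not from the text): $\Omega$ has four equally likely points, $\mathcal{G}$ is generated by the partition $\{\{0,1\},\{2,3\}\}$, and $V$ has a single $\mathcal{G}$-measurable jump at time $1$, so that $\int_0^t X\,dV = X(1)\,\Delta V(1)$ for $t \ge 1$.

```python
from fractions import Fraction

# Toy check of the identity E(int X dV) = E(int E(X|G) dV) when V(t) is
# G-measurable for every t.  Omega = {0,1,2,3} with the uniform measure;
# G is generated by the partition {{0,1},{2,3}}; V jumps once, at time 1,
# by an amount v(omega) that is constant on the cells of the partition.
x = {0: Fraction(3), 1: Fraction(5), 2: Fraction(2), 3: Fraction(8)}  # X(1, omega)
v = {0: Fraction(1), 1: Fraction(1), 2: Fraction(4), 3: Fraction(4)}  # jump of V
cells = [{0, 1}, {2, 3}]

def expectation(f):
    # expectation under the uniform measure on the four points
    return sum(f[w] for w in f) / 4

# E(X(1) | G): the average of x over the cell containing omega
cond = {w: sum(x[u] for u in c) / len(c) for c in cells for w in c}

lhs = expectation({w: x[w] * v[w] for w in x})     # E( int_0^t X dV )
rhs = expectation({w: cond[w] * v[w] for w in x})  # E( int_0^t E(X|G) dV )
```

Both sides evaluate to the same rational number; the equality is exact precisely because the jump of $V$ is constant on each cell of the partition generating $\mathcal{G}$.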
5. Using that $F_i$ is $\mathcal{F}_{\tau_{i-1}}$-measurable,
\[
\mathbb{E}\Bigl( \int_0^t M\, dA_i \Bigr) = \mathbb{E}\Bigl( \int_0^t \frac{\chi(\tau_{i-1} < u \le \tau_i)\, M(u)}{1 - F_i(u-)}\, dF_i(u) \Bigr) = \mathbb{E}\Bigl( \int_0^t \frac{\mathbb{E}\bigl( \chi(\tau_{i-1} < u \le \tau_i)\, M(u) \mid \mathcal{F}_{\tau_{i-1}} \bigr)}{1 - F_i(u-)}\, dF_i(u) \Bigr). \tag{C.9}
\]
Let us calculate the conditional expectation under the integral. Observe that
\[
\chi(\tau_{i-1} < u \le \tau_i) = \chi(\tau_{i-1} < u)(1 - \chi(\tau_i < u)) = \lim_{n \nearrow \infty} \chi\Bigl( \tau_{i-1} < u - \frac{1}{n} \Bigr)\Bigl( 1 - \chi\Bigl( \tau_i < u - \frac{1}{n} \Bigr) \Bigr).
\]
Let $F \in \mathcal{F}_{\tau_{i-1}}$ and let
\[
F_n := F \cap \Bigl\{ \tau_{i-1} < u - \frac{1}{n} \Bigr\} \cap \Bigl\{ \tau_i < u - \frac{1}{n} \Bigr\}^c \in \mathcal{F}_{u-}.
\]
$M$ is bounded, so by the Dominated Convergence Theorem, using that $M(u-) = \mathbb{E}(M(u) \mid \mathcal{F}_{u-})$,
\[
\int_F \chi(\tau_{i-1} < u \le \tau_i)\, M(u)\, d\mathbb{P} = \int_{F \cap \{\tau_{i-1} < u \le \tau_i\}} M(u)\, d\mathbb{P} = \lim_{n \nearrow \infty} \int_{F_n} M(u)\, d\mathbb{P} =
\]
\[
= \lim_{n \nearrow \infty} \int_{F_n} M(u-)\, d\mathbb{P} = \int_F \chi(\tau_{i-1} < u \le \tau_i)\, M(u-)\, d\mathbb{P}.
\]
Hence
\[
\mathbb{E}\Bigl( \int_0^t M\, dA_i \Bigr) = \mathbb{E}\Bigl( \int_0^t \frac{\mathbb{E}\bigl( \chi(\tau_{i-1} < u \le \tau_i)\, M(u-) \mid \mathcal{F}_{\tau_{i-1}} \bigr)}{1 - F_i(u-)}\, dF_i(u) \Bigr) = \mathbb{E}\Bigl( \int_0^t \frac{\chi(\tau_{i-1} < u \le \tau_i)\, M(u-)}{1 - F_i(u-)}\, dF_i(u) \Bigr) = \mathbb{E}\Bigl( \int_0^t M_-\, dA_i \Bigr).
\]
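The identity $\mathbb{E}(N(\theta \wedge \tau_n)) = \mathbb{E}(B(\theta \wedge \tau_n))$ behind the proof can also be checked numerically. The following sketch is my illustration (not from the text; all names are ad hoc): it simulates a point process with i.i.d. exponential gaps, that is a Poisson process, and compares the sample mean of $N(t)$ with that of the integrated hazard $B(t) = \lambda \sum_k \max(0,\, t \wedge \tau_k - \tau_{k-1})$, which for exponential gaps telescopes to $\lambda t$ exactly.

```python
import random

def simulate_point_process(lam, t, n_paths, seed=0):
    """Compare the sample means of N(t) and of the integrated hazard
    B(t) = lam * sum_k max(0, min(t, tau_k) - tau_{k-1}) for a point
    process with i.i.d. Exp(lam) gaps (i.e. a Poisson process)."""
    rng = random.Random(seed)
    mean_N = mean_B = 0.0
    for _ in range(n_paths):
        tau_prev, N, B = 0.0, 0, 0.0
        while tau_prev < t:
            tau = tau_prev + rng.expovariate(lam)        # next jump time
            B += lam * max(0.0, min(t, tau) - tau_prev)  # hazard paid on (tau_prev, tau]
            if tau <= t:
                N += 1
            tau_prev = tau
        mean_N += N / n_paths
        mean_B += B / n_paths
    return mean_N, mean_B

mean_N, mean_B = simulate_point_process(lam=2.0, t=3.0, n_paths=20000)
```

With $\lambda = 2$ and $t = 3$ both means are close to $\lambda t = 6$; in the Poisson case $B(t) = \lambda t$ holds pathwise, so the comparison really tests the counting side.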
Example C.9 Processes with a single jump.

Let $N(t) := \chi(\tau \le t)$ and let $F$ be the distribution function of $\tau$. If $F$ is continuous then
\[
N^p(t) = \int_0^{t \wedge \tau} \frac{dF(u)}{1 - F(u)} = -\ln(1 - F(t \wedge \tau)),
\]
so
\[
N^p(\infty) = N^p(\tau) = -\ln(1 - F(\tau)).
\]
If $\tau$ has an exponential distribution with parameter $\lambda = 1$, then $N^p(t) = -\ln(\exp(-t \wedge \tau)) = t \wedge \tau$. If $F$ is strictly increasing and continuous, then
\[
\mathbb{P}(F(\tau) < x) = \mathbb{P}(\tau < F^{-1}(x)) = F(F^{-1}(x)) = x,
\]
so $U := F(\tau)$ is uniformly distributed. Hence
\[
\mathbb{P}(N^p(\infty) < x) = \mathbb{P}(-\ln(1 - U) < x) = \mathbb{P}(U < 1 - \exp(-x)) = 1 - \exp(-x).
\]
Hence in this case $N^p(\infty)$ is exponentially distributed with parameter $\lambda = 1$.
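The last claim of the example is easy to test numerically. The sketch below is my illustration (names are ad hoc): taking $\tau$ uniform on $(0,1)$, so that $F(x) = x$ there, the total compensator mass $-\ln(1 - F(\tau))$ should be standard exponential, hence have sample mean close to $1$.

```python
import math
import random

rng = random.Random(42)

def total_compensator_mass(tau, F):
    # N^p(infinity) = -ln(1 - F(tau)) for a single-jump process with a
    # continuous, strictly increasing distribution function F
    return -math.log(1.0 - F(tau))

# tau uniform on (0, 1): F(x) = x there
samples = [total_compensator_mass(rng.random(), lambda x: x)
           for _ in range(50000)]
sample_mean = sum(samples) / len(samples)
```

Since $-\ln(1-U)$ is exactly $\mathrm{Exp}(1)$ for uniform $U$, the sample mean fluctuates around $1$ with standard error about $1/\sqrt{50000} \approx 0.0045$.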
Example C.10 Predictable compensator of Poisson processes.

Let $\pi(t)$ be a Poisson process with parameter $\lambda$. As $\pi$ is a Lévy process and $\mathbb{E}(\pi(t)) = \lambda t$, the process $\pi(t) - \lambda t$ is a martingale. So by the definition of the predictable compensator $\pi^p(t) = \lambda t$. On the other hand, the distribution of the time between the jumps is exponential with parameter $\lambda$, so
\[
F_k(x \mid \tau_1, \tau_2, \ldots, \tau_{k-1}) =
\begin{cases}
0 & \text{if } x \le \tau_{k-1} \\
1 - \exp(-\lambda(x - \tau_{k-1})) & \text{if } x > \tau_{k-1}
\end{cases}
= 1 - \exp(-\lambda \max(0, x - \tau_{k-1})).
\]
From this, if $t > 0$,
\[
A_i(t) = -\ln(1 - F_i(t \wedge \tau_i)) = -\ln(\exp(-\lambda \max(0, t \wedge \tau_i - \tau_{i-1}))) = \lambda \max(0, t \wedge \tau_i - \tau_{i-1}),
\]
and from this the predictable compensator is
\[
\pi^p(t) = \lambda(t \wedge \tau_1 + \max(0, t \wedge \tau_2 - \tau_1) + \cdots) = \lambda t.
\]
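The last equality deserves one added line (my elaboration, not in the original): if $\tau_k \le t < \tau_{k+1}$, only the first $k+1$ terms of the sum are non-zero and the sum telescopes.

```latex
\pi^p(t)
  =\lambda\bigl(t\wedge\tau_1+\max(0,\,t\wedge\tau_2-\tau_1)+\cdots\bigr)
  =\lambda\bigl((\tau_1-0)+(\tau_2-\tau_1)+\cdots+(\tau_k-\tau_{k-1})+(t-\tau_k)\bigr)
  =\lambda t.
```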
Example C.11 Counting process with Weibull distribution.

Let the times between the jumps be independent with Weibull distribution. In this case
\[
F_k(x) = 1 - \exp\bigl(-\lambda (\max(0, x - \tau_{k-1}))^\alpha\bigr).
\]
The integrated hazard rate is
\[
\Lambda_k(t) = -\ln\bigl(\exp\bigl(-\lambda (\max(0, t \wedge \tau_k - \tau_{k-1}))^\alpha\bigr)\bigr) = \lambda (\max(0, t \wedge \tau_k - \tau_{k-1}))^\alpha.
\]
The compensator is
\[
\lambda\bigl( (t \wedge \tau_1)^\alpha + \max(0, t \wedge \tau_2 - \tau_1)^\alpha + \cdots \bigr).
\]
If $\alpha = 1$ then the compensator is $\lambda t$, which is the compensator of the Poisson process; otherwise the compensator is not deterministic.

Definition C.12 The counting process $N$ is an extended Poisson process if the increments $N(t) - N(s)$ are independent of the $\sigma$-algebra $\mathcal{F}_s$ for all $s < t$.

If $N$ is a counting process then it is a semimartingale. As all the jumps have the same size,
\[
\nu((0,t] \times \Lambda) = \nu((0,t] \times \{1\}) = N^p(t).
\]
If $N^p$ also denotes the measure generated by $N^p$ then it is easy to see that the characteristics of the semimartingale $N$ are $(0, 0, \nu) = (0, 0, N^p)$.

Proposition C.13 The compensator $N^p$ of a counting process $N$ is deterministic if and only if $N$ is an extended Poisson process. In this case
\[
N^p(t) = \mathbb{E}(N(t)) < \infty. \tag{C.10}
\]
Proof. Let $N$ be an extended Poisson process. As $N$ has independent increments, the spectral measure $\nu$ of $N$ is deterministic⁹. If $\Lambda := \{1\}$ then
\[
\mathbb{E}(N(t)) = \mathbb{E}\Bigl( \sum_{s \le t} \Delta N(s)\, \chi(\Delta N(s) \in \Lambda) \Bigr) = \mathbb{E}\bigl( (\chi_\Lambda x \bullet \mu^N)(t) \bigr) = (x \chi_\Lambda \bullet \nu(x))(t) \le \bigl( (x^2 \wedge 1) \bullet \nu(x) \bigr)(t) < \infty.
\]
Therefore, using the independence of the increments,
\[
\mathbb{E}(N(t) - N(s) \mid \mathcal{F}_s) = \mathbb{E}(N(t) - N(s)) = \mathbb{E}(N(t)) - \mathbb{E}(N(s)),
\]
and so $N(t) - \mathbb{E}(N(t))$ is trivially a martingale and (C.10) holds. On the other hand, from the general theory of processes with independent increments we know¹⁰ that if the spectral measure $\nu$, that is, the measure generated by $N^p$, is deterministic, then $N$ has independent increments. So the proposition is true.

Proposition C.14 The Fourier transform of the increment of an extended Poisson process $N$ is
\[
\mathbb{E}\bigl(\exp(iu(N(t) - N(s)))\bigr) = \exp\Bigl( (\exp(iu) - 1)\bigl( (N^p)^c(t) - (N^p)^c(s) \bigr) \Bigr) \times \prod_{s < r \le t} \bigl( 1 + (\exp(iu) - 1)\, \Delta N^p(r) \bigr), \tag{C.11}
\]
where $(N^p)^c$ is the continuous part¹¹ of the compensator $N^p$.

Proof. It is a special case of the Lévy–Khintchine formula¹². Recall that $B = 0$, $C = 0$ and $\nu$ is the measure generated by $N^p$. So
\[
\mathbb{E}\bigl(\exp(iu(N(t) - N(s)))\bigr) = \exp(U)\, V,
\]
where
\[
U := \int_{(s,t] \times (\mathbb{R}\setminus\{0\})} (\exp(iux) - 1 - iuh(x))\, \chi_{J^c}(r)\, d\nu(r,x), \qquad V := \prod_{s < r \le t} \Bigl( 1 + \int_{\mathbb{R}\setminus\{0\}} (\exp(iux) - 1)\, \nu(\{r\} \times dx) \Bigr).
\]

⁹ See: Corollary 7.88, page 532.
¹⁰ See: Theorem 7.89, page 532.
¹¹ $N^p$ is an increasing process and $(N^p)^c$ is its continuous part.
¹² See: Theorem 7.90, page 534.
If $\delta$ denotes the Dirac measure at $x = 1$ then $\nu(\{r\} \times \Lambda) = \Delta N^p(r)\, \delta(\Lambda)$, so the integrals in the formula are¹³
\[
U = (\exp(iu) - 1)\bigl( (N^p)^c(t) - (N^p)^c(s) \bigr), \qquad V = \prod_{s < r \le t} \bigl( 1 + (\exp(iu) - 1)\, \Delta N^p(r) \bigr).
\]

Proposition C.15 An extended Poisson process $N$ has a jump with positive probability at time $t$ if and only if¹⁴ its compensator $N^p$ is discontinuous at time $t$.

Proof. Letting $s \nearrow t$ in the Fourier transform (C.11),
\[
\mathbb{E}(\exp(iu \Delta N(t))) = 1 + (\exp(iu) - 1)\, \Delta N^p(t).
\]
The left-hand side is one for every $u$ if and only if $\Delta N^p(t) = 0$.

Definition C.16 We say that the counting process $N$ is a generalized Poisson process if it has independent increments and $N(t)$ has Poisson distribution for every $t$.

Proposition C.17 If the predictable compensator $N^p$ of a counting process $N$ is deterministic and continuous then $N(t)$ has a Poisson distribution with parameter $\lambda(t) := N^p(t)$. Hence under these conditions $N$ is a generalized Poisson process¹⁵.

Proof. Let us recall that the Fourier transform of the Poisson distribution with parameter $\lambda$ is $\exp(\lambda(\exp(iu) - 1))$. $N^p$ is continuous, so the proposition follows from line (C.11).

The jump times of Poisson processes are totally inaccessible¹⁶. Our goal is to prove the same result for generalized Poisson processes. First we prove a simple but interesting general result:

Lemma C.18 If $A \in \mathcal{A}^+$ then $A^p$ is almost surely continuous if and only if $A$ is regular in the following sense: for every sequence of stopping times $\sigma_n \nearrow \sigma$,
\[
\mathbb{E}(A(\sigma_n)) \to \mathbb{E}(A(\sigma)). \tag{C.12}
\]

¹³ Observe that $h(x) := x \chi(|x| < 1)$, so $h(1) = 0$.
¹⁴ See: Corollary 7.91, page 535.
¹⁵ See: Example 7.93, page 536.
¹⁶ See: Example 3.7, page 183.
Proof. As $A \in \mathcal{A}^+$, by the elementary properties¹⁷ of the predictable compensator $A - A^p$ is a uniformly integrable martingale. Hence by the Optional Sampling Theorem $\mathbb{E}(A(\sigma_n)) = \mathbb{E}(A^p(\sigma_n))$. If $A^p$ is continuous then by the Monotone Convergence Theorem
\[
\lim_{n \to \infty} \mathbb{E}(A(\sigma_n)) = \lim_{n \to \infty} \mathbb{E}(A^p(\sigma_n)) = \mathbb{E}\Bigl( \lim_{n \to \infty} A^p(\sigma_n) \Bigr) = \mathbb{E}\Bigl( A^p\Bigl( \lim_{n \to \infty} \sigma_n \Bigr) \Bigr) = \mathbb{E}(A^p(\sigma)) = \mathbb{E}(A(\sigma)).
\]
On the other hand, let us assume that (C.12) holds. If $t_n \nearrow t$ then
\[
\mathbb{E}(A^p(t_n)) = \mathbb{E}(A(t_n)) \nearrow \mathbb{E}(A(t)) = \mathbb{E}(A^p(t)).
\]
So
\[
\mathbb{E}(|A^p(t) - A^p(t_n)|) = \mathbb{E}(A^p(t) - A^p(t_n)) = \mathbb{E}(A^p(t)) - \mathbb{E}(A^p(t_n)) \to 0.
\]
Hence $A^p(t_n) \nearrow A^p(t)$ almost surely. Let $P \subseteq \mathbb{R}_+ \times \Omega$ be the set of discontinuities of $A^p$. If $\mathbb{P}(\mathrm{proj}_\Omega P) = 0$ then $A^p$ is almost surely continuous. $P$ is a predictable set, so if $\mathbb{P}(\mathrm{proj}_\Omega P) > 0$ then there is¹⁸ a predictable stopping time $\sigma$ such that $\mathrm{Graph}(\sigma) \subseteq P$ and $0 < \mathbb{P}(\sigma < \infty)$. If $(\sigma_n)$ is announcing $\sigma$ then
\[
\lim_{n \to \infty} \mathbb{E}(A(\sigma_n)) = \lim_{n \to \infty} \mathbb{E}(A^p(\sigma_n)) = \mathbb{E}(A^p(\sigma-)) < \mathbb{E}(A^p(\sigma)) = \mathbb{E}(A(\sigma)),
\]
which is impossible. Hence, as $A$ is regular, $A^p$ is almost surely continuous.

Lemma C.19 If $A \in \mathcal{A}_{loc}$ then $A^p$ is continuous if and only if for every sequence of stopping times $\sigma_n \nearrow \sigma$
\[
\lim_{n \to \infty} A(\sigma_n) \stackrel{a.s.}{=} A\Bigl( \lim_{n \to \infty} \sigma_n \Bigr) = A(\sigma). \tag{C.13}
\]

Proof. Let $(\tau_k)$ be a localizing sequence of $A$. $A^{\tau_k} \in \mathcal{A}^+$, so
\[
\mathbb{E}\bigl( A^{\tau_k}(\sigma_n) - A^{\tau_k}(\sigma) \bigr) = \mathbb{E}\bigl( (A^p)^{\tau_k}(\sigma_n) - (A^p)^{\tau_k}(\sigma) \bigr).
\]
If $A^p$ is continuous then one can prove again that $A^{\tau_k}(\sigma_n) \stackrel{a.s.}{\to} A^{\tau_k}(\sigma)$. This obviously implies that $A(\sigma_n) \stackrel{a.s.}{\to} A(\sigma)$. On the other hand, if (C.13) holds then by the Monotone Convergence Theorem $\mathbb{E}(A^{\tau_k}(\sigma_n)) \nearrow \mathbb{E}(A^{\tau_k}(\sigma))$. Hence by the previous lemma¹⁹ $(A^{\tau_k})^p = (A^p)^{\tau_k}$ is almost surely continuous. So $A^p$ is almost surely continuous.

¹⁷ See: Property 4, page 217.
¹⁸ See: Proposition 3.32, page 195.
¹⁹ See: Property 5, page 217.
Proposition C.20 The compensator $N^p$ of a point process $N$ is almost surely continuous if and only if the jump times of $N$ are totally inaccessible²⁰.

Proof. Assume that $N^p$ is continuous, that $\tau$ is a jump time, and that with positive probability there are stopping times $\rho_n \nearrow \tau$ with $\rho_n < \tau$. Then by the previous lemma
\[
N(\tau-) = N\Bigl( \lim_{n \to \infty} \rho_n \Bigr) = N(\tau).
\]
That is, on a set with positive probability $N(\tau-) = N(\tau)$, which is impossible as $\tau$ is a jump time of $N$. On the other hand, if all the jump times are totally inaccessible and $\rho_n \nearrow \rho$ with $\rho_n < \rho$ on a set of positive probability, then on this set $\rho$ cannot be a jump time of $N$, therefore $N(\rho_n) \to N(\rho-) = N(\rho)$. So by the previous lemma $N^p$ is almost surely continuous.

²⁰ See: Example 7.74, page 517.
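A concrete degenerate example may tie the last statements together (my illustration, not from the text): a point process whose single jump happens at a deterministic time.

```latex
% Let tau_1 = 1 identically and tau_k = infty for k >= 2, so F_1(t) = chi(t >= 1).
% The integrated hazard (C.5) gives
A_1(t)=\int_0^{t\wedge\tau_1}\frac{dF_1(u)}{1-F_1(u-)}
      =\chi(t\ge 1)\cdot\frac{1}{1-F_1(1-)}=\chi(t\ge 1),
% so N^p = N is discontinuous at t = 1.  Accordingly, the announcing sequence
% sigma_n = 1 - 1/n makes tau_1 predictable rather than totally inaccessible
% (consistent with Proposition C.20), and E(N(sigma_n)) = 0 does not converge
% to E(N(1)) = 1, so N is not regular in the sense of Lemma C.18.
```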
Notes and Comments

There are many good books on stochastic analysis: [4], [6], [21], [22], [48], [45], [53], [57], [59], [58], [63], [73], [74], [77], [78], [79]. A small part of the literature deals with the general theory, where the integrators are general semimartingales; some of the books describe the theory when the integrators are continuous semimartingales. There are many books, and one can find a lot of lecture notes on the internet, introducing the theory when the integrator is a Wiener process. Perhaps, from a pedagogical point of view, in an introductory course the simplest and most direct approach is the one where the integrators are continuous semimartingales. This approach is sufficiently abstract to cover the most important results, and from this perspective one can easily see the most elegant aspects of stochastic analysis. The main advantage of this approach is that one can avoid the concept of predictability, and every continuous local martingale is locally square-integrable. It is also very important that in this case the predictable quadratic variation and the quadratic variation are equal. On the other hand, the Wiener process case is a bit too elementary. It hides some very important aspects of stochastic integration, mainly the role of the quadratic variation; in particular, if we introduce stochastic integrals only for Wiener processes, we cannot integrate when the integrator is already an integral process with respect to some Wiener process. Of course, the popularity of stochastic integration theory comes from its applications in mathematical finance, and very often stochastic integration is part of some course on derivative prices; even this very simple approach is a bit too demanding for an audience interested mainly in elementary financial applications.
Perhaps the main disadvantage of the approach based on continuous semimartingales is that in some sense it hides the most important aspect of stochastic analysis: its relation to other parts of probability theory. The canonical examples of semimartingales are the Lévy processes, and Lévy processes are mainly discontinuous. One can find a good account of the history of stochastic integration in [49].

Chapter 1 Filtrations and stopping times were first investigated systematically in [19]. A good source about discrete-time martingales is [72]. The concept of predictability was introduced in [66]. As a general introduction one can also use [29] and [30] or [82]. Theorem 1.28 comes from [11]. Proposition 1.109 was borrowed from [74], while Proposition 1.112 is from [28]. The results about the first passage time
of the Wiener process were taken from [78], [53] and [27]. The treatment of localization was taken from [45]. The definition of local martingales was introduced in [37]. The theorem on the quadratic variation of discrete-time martingales was taken from [5] and [61].

Chapter 2 There are several approaches to stochastic integration. The theory was started by Itô [38], [39], [40]. Our introduction is mainly based on [78], which is based on [56]. See also [7] and [8]. The main problem with this approach is that one must first construct the quadratic variation or the predictable quadratic variation of the process, and this construction applies only to locally square-integrable martingales. As every continuous local martingale is locally bounded, perhaps this approach is the most economical one in the continuous case. One can construct the predictable quadratic variation with the Doob–Meyer decomposition, then construct the stochastic integral with the Hilbert space method of this chapter, and with the stochastic integral construct the quadratic variation as the correction term in the integration by parts formula. See [9], [10], [45], [69]. The concept of semimartingales appears in [15]. Fisk's theorem comes from [25].

Chapter 3 In this chapter I follow [45] and [74]. The Fundamental Theorem of Local Martingales is due to J. A. Yan. See [43], [70], [17].

Chapter 4 In this chapter I also followed [45] and [74]. The proof of the discrete-time Davis inequality comes from [5]. The proof of Burkholder's inequality was borrowed from [51]. The invariance of semimartingales under change of measure was studied in [87] and [47]. The discussion of the properties of the stochastic integral was based on [81]. Theorem 4.26 was taken from [57].

Chapter 5 The Doob–Meyer decomposition appears in [64], [65]. The proof was simplified by [75]. See also [36], [74] and [53]. The theory of quasimartingales was developed by [25], [76], [86] and [62].
The Bichteler–Dellacherie theorem was proved in [2], [3] and [14]. It appeared for the first time in [48]. [74] builds the theory of stochastic integration on this theorem. The theory of parametric stochastic integrals comes from [85], [18]. See also [19], [48], [52] and [50]. The present discussion of the integral representation builds on [74]. The Jacod–Yor theorem appears in [46]. See also [12], [13] and [44]. One can find the theorem in [45] and [57]. The final version of the $\mathcal{H}^1$–BMO duality appears in [67] and [68]. One can find a simple proof for the Brownian case in [73]. About BMO spaces see also [54].
Chapter 6 Most of the material is taken from [78], [53] and [74]. See also [69] and [56]. The proof of Lévy's theorem for the continuous case was borrowed from [56], see [9]. Föllmer's theorem comes from [26]. The discussion of Doléans' equation comes from [45], see [16] and [17]. The discussion of local times is mainly based on [74], see [69]. I borrowed the proof of the theorem of Dvoretzky–Erdős–Kakutani from [53], which follows [55].

Chapter 7 In this chapter I followed mainly [45] and [74]. One can also consult [28], [83], [84] or [51]. The theory of Lévy processes and the Lévy–Khintchine formula has a long history; see [31], [60]. The idea of characteristics goes back to [41]. Later it was studied in [88], [1], [33], [47], [32]. The characterization of processes with independent increments comes from [48], [34].

Appendix There is a vast literature on Brownian motion: [78], [53]. The condition for almost sure convergence of the quadratic variation comes from [20]. Example B.19 comes from [24].
References

[1] Benveniste, A., and Jacod, J. Systèmes de Lévy des processus de Markov. Invent. Math. 21 (1973), 183–198.
[2] Bichteler, K. Stochastic integrators. Bull. Amer. Math. Soc. (N.S.) 1, 5 (1979), 761–765.
[3] Bichteler, K. Stochastic integration and Lp-theory of semimartingales. Ann. Probab. 9, 1 (1981), 49–89.
[4] Bichteler, K. Stochastic Integration with Jumps, vol. 89 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2002.
[5] Chow, Y. S., and Teicher, H. Probability Theory: Independence, Interchangeability, Martingales, third ed. Springer Texts in Statistics. Springer-Verlag, New York, 1997.
[6] Chung, K. L., and Williams, R. J. Introduction to Stochastic Integration, second ed. Probability and its Applications. Birkhäuser Boston Inc., Boston, MA, 1990.
[7] Courrège, P. Intégrales stochastiques associées à une martingale de carré intégrable. C. R. Acad. Sci. Paris 256 (1963), 867–870.
[8] Courrège, P. Intégrales stochastiques et martingales de carré intégrable. In Séminaire de Théorie du Potentiel, dirigé par M. Brelot, G. Choquet et J. Deny, 1962/63, No. 7. Secrétariat mathématique, Paris, 1964, p. 20.
[9] Dellacherie, C., and Meyer, P.-A. Probabilités et Potentiel. Hermann, Paris, 1975. Chapitres I à IV, Édition entièrement refondue, Publications de l'Institut de Mathématique de l'Université de Strasbourg, No. XV, Actualités Scientifiques et Industrielles, No. 1372.
[10] Dellacherie, C., and Meyer, P.-A. Probabilités et Potentiel. Chapitres V à VIII, revised ed., vol. 1385 of Actualités Scientifiques et Industrielles [Current Scientific and Industrial Topics]. Hermann, Paris, 1980. Théorie des martingales. [Martingale theory].
[11] Dellacherie, C. Capacités et processus stochastiques. Springer-Verlag, Berlin, 1972. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 67.
[12] Dellacherie, C. Intégrales stochastiques par rapport aux processus de Wiener ou de Poisson.
In Séminaire de Probabilités, VIII (Univ. Strasbourg, année universitaire 1972–1973). Springer, Berlin, 1974, pp. 25–26. Lecture Notes in Math., Vol. 381.
[13] Dellacherie, C. Correction à: "Intégrales stochastiques par rapport aux processus de Wiener ou de Poisson" (Séminaire de Probabilités, VIII (Univ. Strasbourg, année universitaire 1972–1973), pp. 25–26, Lecture Notes in Math., Vol. 381, Springer, Berlin, 1974). In Séminaire de Probabilités, IX (Seconde Partie, Univ. Strasbourg, Strasbourg, Années Universitaires 1973/1974 et 1974/1975). Springer, Berlin, 1975, p. 494. Lecture Notes in Math., Vol. 465.
[14] Dellacherie, C. Un survol de la théorie de l'intégrale stochastique. Stochastic Process. Appl. 10, 2 (1980), 115–144.
[15] Doléans-Dade, C., and Meyer, P.-A. Intégrales stochastiques par rapport aux martingales locales. In Séminaire de Probabilités, IV (Univ. Strasbourg, 1968/69). Lecture Notes in Mathematics, Vol. 124. Springer, Berlin, 1970, pp. 77–107.
[16] Doléans-Dade, C. Quelques applications de la formule de changement de variables pour les semimartingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 16 (1970), 181–194.
[17] Doléans-Dade, C. On the existence and unicity of solutions of stochastic integral equations. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 36, 2 (1976), 93–101.
[18] Doléans, C. Intégrales stochastiques dépendant d'un paramètre. Publ. Inst. Statist. Univ. Paris 16 (1967), 23–33.
[19] Doob, J. L. Stochastic Processes. Wiley Classics Library. John Wiley & Sons Inc., New York, 1990. Reprint of the 1953 original, A Wiley-Interscience Publication.
[20] Dudley, R. M. Sample functions of the Gaussian process. Ann. Probability 1, 1 (1973), 66–103.
[21] Durrett, R. Stochastic Calculus. Probability and Stochastics Series. CRC Press, Boca Raton, FL, 1996. A practical introduction.
[22] Elliott, R. J. Stochastic Calculus and Applications, vol. 18 of Applications of Mathematics (New York). Springer-Verlag, New York, 1982.
[23] Feller, W. An Introduction to Probability Theory and its Applications. Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.
[24] Fernandez de la Vega, W. On almost sure convergence of quadratic Brownian variation. Ann. Probability 2 (1974), 551–552.
[25] Fisk, D. L. Quasi-martingales. Trans. Amer. Math. Soc. 120 (1965), 369–389.
[26] Föllmer, H. Calcul d'Itô sans probabilités. In Seminar on Probability, XV (Univ. Strasbourg, Strasbourg, 1979/1980) (French), vol. 850 of Lecture Notes in Math. Springer, Berlin, 1981, pp. 143–150.
[27] Freedman, D. Brownian Motion and Diffusion. Springer-Verlag, New York, 1983.
[28] Gihman, I. I., and Skorohod, A. V. The Theory of Stochastic Processes. II. Springer-Verlag, New York, 1975. Translated from the Russian by Samuel Kotz, Die Grundlehren der Mathematischen Wissenschaften, Band 218.
[29] Gihman, I. I., and Skorohod, A. V. The Theory of Stochastic Processes. III. Springer-Verlag, Berlin, 1979. Translated from the Russian by Samuel Kotz, With an appendix containing corrections to Volumes I and II, Grundlehren der Mathematischen Wissenschaften, 232.
[30] Gihman, I. I., and Skorohod, A. V. The Theory of Stochastic Processes. I, English ed., vol. 210 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1980. Translated from the Russian by Samuel Kotz.
[31] Gnedenko, B. V., and Kolmogorov, A. N. Limit Distributions for Sums of Independent Random Variables. Translated from the Russian, annotated, and revised by K. L. Chung. With appendices by J. L. Doob and P. L. Hsu. Revised edition. Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills, Ont., 1968.
[32] Grigelionis, B. The Markov property of random processes. Litovsk. Mat. Sb. 8 (1968), 489–502.
[33] Grigelionis, B. The representation of integer-valued random measures as stochastic integrals over the Poisson measure. Litovsk. Mat. Sb. 11 (1971), 93–108.
[34] Grigelionis, B. Martingale characterization of random processes with independent increments. Litovsk. Mat. Sb. 17, 1 (1977), 75–86, 212.
[35] Hiriart-Urruty, J.-B., and Lemaréchal, C. Fundamentals of Convex Analysis. Springer-Verlag, Berlin, 2001.
[36] Ikeda, N., and Watanabe, S. Stochastic Differential Equations and Diffusion Processes, vol. 24 of North-Holland Mathematical Library. North-Holland Publishing Co., Amsterdam, 1981.
[37] Itô, K., and Watanabe, S. Transformation of Markov processes by multiplicative functionals. Ann. Inst. Fourier (Grenoble) 15, fasc. 1 (1965), 13–30.
[38] Itô, K. Stochastic integral. Proc. Imp. Acad. Tokyo 20 (1944), 519–524.
[39] Itô, K. On a stochastic integral equation. Proc. Japan Acad. 22, nos. 1-4 (1946), 32–35.
[40] Itô, K. On the stochastic integral. Sûgaku 1 (1948), 172–177.
[41] Itô, K. On stochastic differential equations. Mem. Amer. Math. Soc. 1951, 4 (1951), 51.
[42] Jacobs, K. Measure and Integral. Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1978. Probability and Mathematical Statistics, With an appendix by Jaroslav Kurzweil.
[43] Jacod, J., and Mémin, J. Caractéristiques locales et conditions de continuité absolue pour les semi-martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 35, 1 (1976), 1–37.
[44] Jacod, J., and Mémin, J. Un théorème de représentation des martingales pour les ensembles régénératifs. In Séminaire de Probabilités, X (Première partie, Univ. Strasbourg, Strasbourg, Année Universitaire 1974/1975). Springer, Berlin, 1976, pp. 24–39. Lecture Notes in Math., Vol. 511.
[45] Jacod, J., and Shiryaev, A. N. Limit Theorems for Stochastic Processes, second ed., vol. 288 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2003.
[46] Jacod, J., and Yor, M. Étude des solutions extrémales et représentation intégrale des solutions pour certains problèmes de martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 38, 2 (1977), 83–125.
[47] Jacod, J. Multivariate point processes: predictable projection, Radon-Nikodým derivatives, representation of martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 31 (1974/75), 235–253.
[48] Jacod, J. Calcul Stochastique et Problèmes de Martingales, vol. 714 of Lecture Notes in Mathematics. Springer, Berlin, 1979.
[49] Jarrow, R., and Protter, P. A short history of stochastic integration and mathematical finance: the early years, 1880–1970. In A Festschrift for Herman Rubin, vol. 45 of IMS Lecture Notes Monogr. Ser. Inst. Math. Statist., Beachwood, OH, 2004, pp. 75–91.
[50] Kailath, T., Segall, A., and Zakai, M. Fubini-type theorems for stochastic integrals. Sankhyā Ser. A 40, 2 (1978), 138–143.
[51] Kallenberg, O. Foundations of Modern Probability. Probability and its Applications (New York). Springer-Verlag, New York, 1997.
[52] Kallianpur, G., and Striebel, C. Stochastic differential equations occurring in the estimation of continuous parameter stochastic processes. Teor. Verojatnost. i Primenen 14 (1969), 597–622.
[53] Karatzas, I., and Shreve, S. E. Brownian Motion and Stochastic Calculus, second ed. Graduate Texts in Mathematics 113. Springer-Verlag, New York, 1991.
[54] Kazamaki, N. Continuous Exponential Martingales and BMO, vol. 1579 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1994.
[55] Knight, F. B. Essentials of Brownian Motion and Diffusion, vol. 18 of Mathematical Surveys. American Mathematical Society, Providence, R.I., 1981.
[56] Kunita, H., and Watanabe, S. On square integrable martingales. Nagoya Math. J. 30 (1967), 209–245.
[57] Liptser, R. S., and Shiryaev, A. N. Theory of Martingales. Mathematics and Its Applications. Kluwer Academic Publishers, Dordrecht, 1989.
[58] Liptser, R. S., and Shiryaev, A. N.
Statistics of Random Processes II: Applications, 2 ed. Applications of Mathematics 6. Springer-Verlang, Berlin, 2001. [59] Liptser, R. S., and Shiryaev, A. N. Statistics of Random Processes I: General Theory, 2 ed. Applications of Mathematics 5. Springer-Verlang, Berlin, 2001. `ve, M. Probability Theory. I-II, fourth ed. Springer-Verlag, New York, [60] Loe 1978. Graduate Texts in Mathematics, Vol. 46. [61] Malliavin, P. Integration and Probability. Graduate Text in Mathematics 157. Springer-Verlang, New York, 1995. ´tivier, M., and Pellaumail, J. On Doleans-F¨ [62] Me ollmer’s measure for quasi-martingales. Illinois J. Math. 19, 4 (1975), 491–504. ´tivier, M. Semimartingales, vol. 2 of de Gruyter Studies in Mathematics. [63] Me Walter de Gruyter & Co., Berlin, 1982. A course on stochastic processes.
[64] Meyer, P.-A. A decomposition theorem for supermartingales. Illinois J. Math. 6 (1962), 193–205.
[65] Meyer, P.-A. Decomposition of supermartingales: the uniqueness theorem. Illinois J. Math. 7 (1963), 1–17.
[66] Meyer, P.-A. Intégrales stochastiques. I, II, III, IV. In Séminaire de Probabilités (Univ. Strasbourg, Strasbourg, 1966/67), Vol. I. Springer, Berlin, 1967, pp. 72–94, 95–117, 118–141, 142–162.
[67] Meyer, P.-A. Le dual de 'H^1' est 'BMO' (cas continu). In Séminaire de Probabilités, VII (Univ. Strasbourg, Année Universitaire 1971–1972). Springer, Berlin, 1973, pp. 136–145. Lecture Notes in Math., Vol. 321.
[68] Meyer, P.-A. Complément sur la dualité entre H^1 et BMO: 'Le dual de "H^1" est "BMO" (cas continu)' (Séminaire de Probabilités, VII (Univ. Strasbourg, année universitaire 1971–1972), pp. 136–145, Lecture Notes in Math., Vol. 321, Springer, Berlin, 1973). In Séminaire de Probabilités, IX (Seconde Partie, Univ. Strasbourg, Strasbourg, années universitaires 1973/1974 et 1974/1975). Springer, Berlin, 1975, pp. 237–238. Lecture Notes in Math., Vol. 465.
[69] Meyer, P.-A. Un cours sur les intégrales stochastiques. In Séminaire de Probabilités, X (Seconde Partie: Théorie des Intégrales Stochastiques, Univ. Strasbourg, Strasbourg, Année Universitaire 1974/1975). Springer, Berlin, 1976, pp. 245–400. Lecture Notes in Math., Vol. 511.
[70] Meyer, P.-A. Notes sur les intégrales stochastiques. II. Le théorème fondamental sur les martingales locales. In Séminaire de Probabilités, XI (Univ. Strasbourg, Strasbourg, 1975/1976). Springer, Berlin, 1977, pp. 463–464. Lecture Notes in Math., Vol. 581.
[71] Neveu, J. Mathematical Foundations of the Calculus of Probability. Translated by Amiel Feinstein. Holden-Day Inc., San Francisco, Calif., 1965.
[72] Neveu, J. Discrete-Parameter Martingales, revised ed. North-Holland Publishing Co., Amsterdam, 1975. Translated from the French by T. P. Speed, North-Holland Mathematical Library, Vol. 10.
[73] Øksendal, B. Stochastic Differential Equations, sixth ed. Universitext. Springer-Verlag, Berlin, 2003. An introduction with applications.
[74] Protter, P. E. Stochastic Integration and Differential Equations, second ed., vol. 21 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 2004. Stochastic Modelling and Applied Probability.
[75] Rao, K. M. On decomposition theorems of Meyer. Math. Scand. 24 (1969), 66–78.
[76] Rao, K. M. Quasi-martingales. Math. Scand. 24 (1969), 79–92.
[77] Rao, M. M. Stochastic Processes and Integration. Martinus Nijhoff Publishers, The Hague, 1979.
[78] Revuz, D., and Yor, M. Continuous Martingales and Brownian Motion. Grundlehren der mathematischen Wissenschaften 293. Springer-Verlag, Berlin, 1999.
[79] Rogers, L. C. G., and Williams, D. Diffusions, Markov Processes, and Martingales. Vol. 1: Foundations. Cambridge Mathematical Library. Cambridge University Press, Cambridge, 2000. Reprint of the second (1994) edition.
[80] Rudin, W. Real and Complex Analysis, third ed. McGraw-Hill Book Co., New York, 1987.
[81] Shiryaev, A. N., and Cherny, A. S. A vector stochastic integral and the fundamental theorem of asset pricing. Tr. Mat. Inst. Steklova 237, Stokhast. Finans. Mat. (2002), 12–56.
[82] Shiryaev, A. N. Probability, second ed., vol. 95 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1996. Translated from the first (1980) Russian edition by R. P. Boas.
[83] Skorohod, A. V. Processes with Independent Increments (in Russian). Izdat. "Nauka", Moscow, 1964.
[84] Skorokhod, A. V. Studies in the Theory of Random Processes. Translated from the Russian by Scripta Technica, Inc. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965.
[85] Stricker, C., and Yor, M. Calcul stochastique dépendant d'un paramètre. Z. Wahrsch. Verw. Gebiete 45, 2 (1978), 109–133.
[86] Stricker, C. Quasimartingales, martingales locales, semimartingales et filtration naturelle. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 39, 1 (1977), 55–63.
[87] Van Schuppen, J. H., and Wong, E. Transformation of local martingales under a change of law. Ann. Probability 2 (1974), 879–888.
[88] Watanabe, S. On discontinuous additive functionals and Lévy measures of a Markov process. Japan. J. Math. 34 (1964), 53–70.
Index Absolutely continuous measure 208 Adapted process 10 Announcing stopping time 182 Associativity 160 Augmented filtration 10, 67, 561
measure 151, 208, 503 process 295 semimartingale 151 Doob’s inequality continuous-time 33 discrete-time 30 Doob–Meyer decomposition 207, 292, 303 random measures 501 Downcrossing 43 Dynkin’s formula 360
Bessel process 359 BMO 341 Burkholder’s inequality 287 Canonical decomposition of special semimartingales 205, 510 Canonical model 3, 555 Carath´ eodory 295 Characteristics L´ evy process 488, 509 local martingales 511 processes with independent increments 532 semimartingales 509 special semimartingale 510 Class D , 102, 292 Closed under truncation 93, 341, 555, 581 Compensated single jump 243 Compensator 141 continuous 141 predictable 141 Complete space 1 Conditional expectation 190 extended 192 generalized 191 Consistent measure spaces 385 Continuous local martingales with finite variation 117 Control measure 478 Counting process 461
Energy equality 35, 117, 119 Evanescent set 208 Excursion intervals 448 Exhausting sequence 188 Exponential martingale L´ evy process 66 Poisson process 131 Wiener process 57 Exponential semimartingale 412 Extended conditional expectation 192 Extremal point 330 Fefferman’s inequality 338 Filtration 8 augmented 10, 67, 561 generated by a process 9 left-continuous 8 right-continuous 8 Finite variation 2, 179 continuous local martingales 117 F∞ , 8 First passage time 15, 18, 57, 80 Laplace transform 82, 494 process 90, 494 Fτ , 20 Fτ + , 20 Ft− , 8 Ft+ , 8 Fundamental Theorem of Local Martingales 220
Davis’ inequality 103, 235, 253, 277, 279, 327 D´ ebut 15 Dellacherie’s formula 299, 301 ∆X, 4 Deterministic process 23 Dirichlet–Doob class 102, 292 Dol´ eans equation 411, 526 solution 412
Generalized conditional expectation 191 Girsanov transformation 379 formula 381
Girsanov transformation
  continuous local martingale 381
  Wiener process 383
Good integrator 308
Green function 428
H^2_0-measure 329
Haar's functions 568
Hitting time 15
  Borel sets 16
  closed sets 17
  open sets 17
Homogeneous increments 59
H^p-martingales 34, 328
Independent increments 58
Indistinguishable processes 6
  predictable 213
Inequality
  Burkholder 287
  Davis 103, 235, 277, 279, 327
  Doob 30, 33, 49, 127, 147, 164, 173
  downcrossing 44
  Fefferman 338
  Jensen 30
  Kunita–Watanabe 134, 152
Infinitesimal sequence of partitions 109
Integrable variation 179
Integral
  Itô–Stieltjes 113
  random measure 497
  Riemann–Stieltjes 112
  stochastic 171
    continuous local martingales 152
    local martingales 248
    purely discontinuous local martingales 248
Integral process 126
Integral Representation Property 346, 373
Integration by parts 129, 177, 222, 353
Itô's isometry 156, 164
Itô–Stieltjes integral 113, 115, 120, 122, 123, 125, 352
Itô's formula
  for continuous semimartingales 353
  for non-continuous semimartingales 394
  time-dependent 356
Jensen's inequality 30
Jump 4
Jump time 465
  Lévy processes 465
  processes with independent increments 517
  right-regular processes 517
  point process 593
Kazamaki's criteria 387, 391
Kolmogorov-type filtered space 385
Kunita–Watanabe inequality 134, 152
L^1-identity 475
  general 477, 486
λ-system 547
Laplace equation 360, 364
Laplace operator 100
Laplace transform
  first passage time 494
  Poisson process 131
  stopping time 82
Lévy kernel 492, 518
Lévy–Khintchine
  formula 488, 534, 539
  formula for extended Poisson processes 590
  kernel 492
Local martingale 94
  constant 145
  continuous part 233
  continuous, square-integrable 96
  exponential transformation 381
  H^2 martingales 148, 170
  independent increments 104, 545
  integration with respect to continuous local martingales 155
  L^2-bounded, not martingale 100
  logarithm 380
  martingale 102, 104, 545
  non-negative are supermartingales 101
  orthogonal 227
    strongly 227, 229, 343, 346
  predictable 205, 229
  purely discontinuous 228
    part 233
    quadratic pure jump 244
  quadratic co-variation 141
  quadratic variation 139
    zero 144
  square-integrable 96
Local martingales
  independent 144, 375
Local time 425
Localization 92
Localizing sequence 92
Locally absolutely continuous measures 264, 378
Locally bounded process 106
  left-regular 106, 169
  right-regular and predictable 200
  right-regular with bounded jumps 107, 510, 540
Locally equivalent measures 378
Locally finite measure 404
Locally integrable variation 180
Locally square-integrable martingales 96
L^p(M) 151, 343
Martingale 30
  BMO 341
  difference 279
  equivalent 34
  exponential 57, 66
    Lévy process 66
  H^p 34, 328
  not in H^2_loc 96
  reversed 45
  square-integrable 34
  structure 46
  trajectory 49
  uniformly integrable 48, 94, 102
Martingale Representation Property 329
Measurable Selection Theorem 196, 550
Measure generated by a process 208
Measure space
  Kolmogorov 98
Meyer–Tanaka formula 436
Meyer–Itô formula 432
Minimal representation 557, 579
Modification 6
Monotone Class Theorem 550
Natural process 299, 585
Novikov's criteria 391
Optional
  set not predictable 27, 200
  σ-algebra 27
Optional Sampling Theorem 53, 153
Orthogonal
  H^2-martingales 230, 470
  local martingales 227
    strongly 227, 229, 343, 346
π-system 547
Point of increase 457
  strict 457
Point process 579
  filtration 580
  minimal representation 579
  predictable compensator 582
  single jump 582
  stopping time 581
Poisson measure 478, 536
Poisson process 461
  characterization, extended 589
  compensated 56, 125
  compound 180, 258, 477, 511
  compound, Fourier transform 467, 477, 486
  extended 589, 590
  generalized 591
  independent 469, 485
    common jumps 471
  Integral Representation Property 347
  predictable compensator 588
  purely discontinuous 226
  totally inaccessible stopping time 183, 591
  with respect to a filtration 461
Potential 293
Predictable
  compensator 141
    compound Poisson process 216
    continuous 591, 592
    existence 213
    point process 582
    Poisson process 588
    properties 217
    random measure 497, 501
      cumulant 518
      kernel 500
  not predictable right-regular process 27, 200, 219
  Optional Sampling Theorem 204
  process 23
    indistinguishable 213
  projection 194, 206
    existence 206
    local martingales 205
    properties 201
  rectangle 27
  sequence 280
Process
  adapted 10
  deterministic 10, 23
  Lévy 58
  optional 27
  optional not predictable 27, 200, 219
  Poisson 461, 590
    extended 589
  predictable 23
    step 25, 173, 308
  progressively measurable 10
  pure quadratic jump 244
  Radon–Nikodym 265, 378
  simple 252
    predictable 24, 165, 175
    bounded 176
  stopped, truncated 19
  symmetric stable 491, 512
  Wiener 559
  with independent increments 58
Progressive measurability 10
  intuitive meaning 12
Projection Theorem 196, 550
Pure quadratic jump process 244
Purely discontinuous local martingales 228, 229, 232, 233
  not in V 233
  quadratic pure jump process 244
Quadratic co-variation 129
  continuous local martingales 141
  local martingales 222
  predictable 224, 229
Quadratic variation 37, 38, 128, 129, 404
  continuous local martingales 139, 169
  fundamental properties 222
  local martingales 222
    purely discontinuous 244
  Poisson process 129
    compensated 142
  predictable 224
  semimartingale 245
  stopping rule 143
  Wiener process 110, 129
Quasi-martingale 305, 311
Radon–Nikodym process 265, 378
Random interval 24
Random measure 473, 474, 497
  finite 498
  integral 208, 497
  locally finite 498
  measurable 498
  Poisson 478
  predictable compensator 496, 501
  σ-finite 498
Random variable 1
Realization 2
Regular 3
  left 4
  right 4
Reversed martingales 45
Riemann–Stieltjes integral 111, 112
Semimartingale 124, 180
  complex 356
  continuous 124
    decomposition 259
    unique decomposition 124
  continuous part 235
  deterministic 523
  good integrator 308
  special 180
  with independent increments 524
sign function 419
Single jump process 243
  continuously compensated 243
Skorohod lemma 445, 447, 448, 450
Space L^p(M) 151, 343
Special semimartingale 180
  canonical decomposition 205, 508, 510
  characterization 257
  stable process 512
Spectral measure
  Lévy process 475
  right-regular process 497
Stable subspace 329
Stationary increments 59
Stochastic basis 8
  extension 374
Stochastic integration
  change of measure 271
  continuous local martingales 155
  continuous H^2 martingales 152
  local martingales 248, 255
    additivity 251
    associativity 252
    purely discontinuous 248
    unambiguous 254
  locally square-integrable martingales 174
  semimartingales 254
    additivity 261
    associativity 262
    locally bounded processes 176
    unambiguous 254
  special semimartingales 259
  Wiener process 318
Stochastic process 2
Stone lattice 547
Stopped process 19
  predictable 23
  progressively measurable 22
Stopped σ-algebra 20
  convergence 52
Stopped variable 20
  Fτ-measurable 22
  Fτ−-measurable 186
Stopping rule
  for quadratic variation 139, 143, 169, 170
  for stochastic integrals 154, 159
Stopping time 13
  accessible 189, 274
  announcing 182
  construction 15
  first passage time 80
  Fτ-measurable 20
  Fτ−-measurable 186
  predictable 182, 190
  Strong Markov property
    Lévy process 70, 183
    processes with independent increments 76
  totally inaccessible 182, 190, 274
    independent increments 517
    Lévy process 465
    point process 593
    Poisson process 183, 591
    right-regular processes 517
  weak 13
  Wiener process 81
    Laplace transform 82
Stricker 313
Strong Markov property
  Lévy process 70, 75, 183, 363, 462, 466
  processes with independent increments 76
Submartingale 29
  integrable 29
Supermartingale 30
  integrable 29
Tanaka's formula 426, 445, 450
Theorem
  Austin 37
  Bichteler–Dellacherie 309
  characterization
    of continuous Lévy processes 60
    of continuous H^2 martingales 148
    of continuous Lévy processes 367, 487
    of continuous processes with independent increments 367, 539, 545
    of H^2 martingales 170
    of H^2_loc martingales 223
    of special semimartingales 257
    of Wiener process 368, 445, 448
  continuous part
    of local martingales 232, 481
    of semimartingales 234
  Cramér 539
  Doléans 302
  Dominated Convergence 162, 175, 253
  Doob–Meyer 292
  Dvoretzky–Erdős–Kakutani 457
  Fisk 117, 139, 205, 215, 293
    generalized 229
  Föllmer 405
  Fubini
    bounded integrand 322, 434
    unbounded integrand 324
  Fundamental Theorem of Local Martingales 219
  generalized Radon–Nikodym 208, 215, 299
  Hahn–Banach 311, 331
  invariance of semimartingales 378
  Itô's isometry 156
  Jacod–Yor 341
  Lévy
    characterization of Wiener processes 60, 368, 445, 448
    decreasing σ-algebras 46, 573
    increasing σ-algebras 41
  Lévy–Khintchine formula 488, 532, 534, 539
  local martingales with independent increments 545
  Martingale Convergence 39
    H^p spaces 40
    non-negative martingales 40
    uniformly integrable martingales 40
  Measurable Selection 196, 550
  Monotone Class 550
  Optional Sampling
    continuous-time 53, 153
    discrete-time 49
    predictable 204
    submartingale 54
  Paley–Wiener–Zygmund 561
  Predictable Section 195, 592
  Projection 16, 17, 196, 550
  Ray–Knight 455
  regularization of martingales 48
  Riesz representation 523
  semimartingales and change of measure 309
  Stricker 313
  structure of processes with independent increments 538
  submartingale convergence 44, 293
  unique decomposition of special semimartingales 205
Thin set 188
Trajectory 2
Truncated process 19
Truncating function 508, 530
ucp convergence 105
Uniform convergence in probability on every compact interval 105, 163, 175
Uniformly integrable 41, 53, 55
Usual conditions
  augmented filtration 67, 69
  filtration 8, 46
  Lévy process 464
  martingale regularization 48
  stochastic basis 8
Vague topology 404
Vector lattice 547
Vector measure 479
Wiener integrals 125
Wiener measure 570, 572
Wiener process 487, 489, 559, 561
  augmented filtration 70
  canonical 9
  construction 567
  exponential martingale 57
  Fubini's theorem 328
  hits an open set
    one-dimensional 566
    two-dimensional 364
  Integral Representation Property 347, 373
  intersecting a line 88
  Law of large numbers 565
  limits in infinity 564
  local maximum is unique 450
  local time 445
  Lévy's representation 494
  maximum 86, 447, 450
  maximum is unique 449
  non-differentiable 561
  reflected 82, 370
  set of zeros 448
  stochastic integration 318
  stopped 80
  Tanaka's formula 445, 450
  truncated
Yor's formula 412