Springer Finance
Editorial Board
M. Avellaneda
G. Barone-Adesi
M. Broadie
M.H.A. Davis
E. Derman
C. Klüppelberg
E. Kopp
W. Schachermayer
Springer
London Berlin Heidelberg New York Hong Kong Milan Paris Tokyo
Springer Finance
Springer Finance is a programme of books aimed at students, academics, and practitioners working on increasingly technical approaches to the analysis of financial markets. It aims to cover a variety of topics, not only mathematical finance but foreign exchanges, term structure, risk management, portfolio theory, equity derivatives, and financial economics.
M. Ammann, Credit Risk Valuation: Methods, Models, and Applications (2001)
E. Barucci, Financial Markets Theory: Equilibrium, Efficiency and Information (2003)
N.H. Bingham and R. Kiesel, Risk-Neutral Valuation: Pricing and Hedging of Financial Derivatives, 2nd Edition (2004)
T.R. Bielecki and M. Rutkowski, Credit Risk: Modeling, Valuation and Hedging (2001)
D. Brigo and F. Mercurio, Interest Rate Models: Theory and Practice (2001)
R. Buff, Uncertain Volatility Models - Theory and Application (2002)
R.-A. Dana and M. Jeanblanc, Financial Markets in Continuous Time (2003)
G. Deboeck and T. Kohonen (Editors), Visual Explorations in Finance with Self-Organizing Maps (1998)
R.J. Elliott and P.E. Kopp, Mathematics of Financial Markets (1999)
H. Geman, D. Madan, S.R. Pliska and T. Vorst (Editors), Mathematical Finance - Bachelier Congress 2000 (2001)
M. Gundlach and F. Lehrbass (Editors), CreditRisk+ in the Banking Industry (2004)
Y.-K. Kwok, Mathematical Models of Financial Derivatives (1998)
M. Külpmann, Irrational Exuberance Reconsidered: The Cross Section of Stock Returns, 2nd Edition (2004)
A. Pelsser, Efficient Methods for Valuing Interest Rate Derivatives (2000)
J.-L. Prigent, Weak Convergence of Financial Markets (2003)
B. Schmid, Credit Risk Pricing Models: Theory and Practice, 2nd Edition (2004)
S.E. Shreve, Stochastic Calculus for Finance I: The Binomial Asset Pricing Model (2004)
S.E. Shreve, Stochastic Calculus for Finance II: Continuous-Time Models (2004)
M. Yor, Exponential Functionals of Brownian Motion and Related Processes (2001)
R. Zagst, Interest-Rate Management (2002)
Y.-L. Zhu and I-L. Chern, Derivative Securities and Difference Methods (2004)
A. Ziegler, Incomplete Information and Heterogeneous Beliefs in Continuous-Time Finance (2003)
A. Ziegler, A Game Theory Analysis of Options: Corporate Finance and Financial Intermediation in Continuous Time, 2nd Edition (2004)
N.H. Bingham and R. Kiesel
Risk-Neutral Valuation
Pricing and Hedging of Financial Derivatives
Second Edition
Springer
Nicholas H. Bingham, ScD
Department of Probability and Statistics
University of Sheffield
Sheffield S3 7RH, UK
Department of Mathematical Sciences
Brunel University
Uxbridge
Middlesex UB8 3PH, UK
Rüdiger Kiesel, PhD
Department of Financial Mathematics
University of Ulm
89069 Ulm, Germany
Department of Statistics
London School of Economics
London WC2A 2AE, UK
British Library Cataloguing in Publication Data
Bingham, N. H.
Risk-neutral valuation: pricing and hedging of financial derivatives. - 2nd ed.
1. Investments - Mathematical models 2. Finance - Mathematical models
I. Title II. Kiesel, Rüdiger, 1962-
332'.015118
ISBN 1852334584
Library of Congress Cataloging-in-Publication Data
Bingham, N. H.
Risk-neutral valuation: pricing and hedging of financial derivatives / N.H. Bingham and R. Kiesel. - 2nd ed.
p. cm. - (Springer finance)
Includes bibliographical references and index.
ISBN 1-85233-458-4 (alk. paper)
1. Investments-Mathematical models. 2. Finance-Mathematical models. I. Kiesel, Rüdiger, 1962- II. Title. III. Series.
HG4515.2.B56 2004
332.64'57-dc22 2003067310
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.
ISBN 1-85233-458-4 Springer-Verlag London Berlin Heidelberg
ISSN 1616-0533
Springer-Verlag is part of Springer Science+Business Media
springeronline.com
© Springer-Verlag London Limited 2004
Printed in the United States of America
The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.
The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.
Typesetting: Camera-ready by authors
12/3830-543210 Printed on acid-free paper SPIN 10798207
To my mother, Blanche Louise Bingham (née Corbitt, 1912-)
and to the memory of my father, Robert Llewelyn Bingham (1904-1972).
Nick
Für Corina, Una und Roman.
Rüdiger
Preface to the Second Edition
Books are written for use, and the best compliment that the community in the field could have paid to the first edition of 1998 was to buy out the print run, and that of the corrected printing, as happened. Meanwhile, the fast-developing field of mathematical finance had moved on, as had our thinking, and it seemed better to recognize this and undertake a thorough-going rewrite for the second edition than to tinker with the existing text.
The second edition is substantially longer than the first; the principal changes are as follows. There is a new chapter (the last, Chapter 9) on credit risk - a field that seemed too important to exclude. We have included continuous-time processes more general than the Gaussian processes of the Black-Scholes theory, in order in particular to model driving noise with jumps. Thus we include material on infinite divisibility and Lévy processes, with hyperbolic models as a principal special case, in recognition of the growing importance of 'Lévy finance'. Chapter 5 is accordingly extended, and new material on Lévy-based models is included in Chapter 7 on incomplete markets. Also on incomplete markets, we include more material on criteria for selecting one equivalent martingale measure from many, and on utility-based approaches. However, arbitrage-based arguments and risk-neutral valuation remain the basic theme.
It is a pleasure to record our gratitude to the many people who have had a hand in this enterprise. We thank our students and colleagues over the years since the first edition was written at Birkbeck - both Nick (Brunel and Sheffield) and Rüdiger (London School of Economics and Ulm). Special thanks go to Holger Hafting, Torsten Kleinow and Matthias Scherer. In particular, we thank Stefan Kassberger for help with Sections 8.5 and 8.6, and with parts of Chapter 9.
We are grateful to a number of sharp-eyed colleagues in the field who have pointed out errors (of commission or omission) in the first edition, and made suggestions. And we thank our editors at Springer-Verlag for bearing with us gracefully while the various changes (in the subject, our jobs and lives etc.) resulted in the second edition being repeatedly delayed.
Again: last, and most, we thank our families for their love, support and forbearance throughout.
August 2003
Nick, Brunel and Sheffield
Rüdiger, Ulm and LSE
Preface to the First Edition
The prehistory of both the theory and the practicalities of mathematical finance can be traced back quite some time. However, the history proper of mathematical finance - at least, the core of it, the subject-matter of this book - dates essentially from 1973. This year is noted for two developments, one practical, one theoretical. On the practical side, the world's first options exchange opened in Chicago. On the theoretical side, Black and Scholes published their famous paper (Black and Scholes 1973) on option pricing, giving in particular explicit formulae, hedging strategies for replicating contingent claims and the Black-Scholes partial differential equation.
Both Black's article (Black 1989) and the recent obituary of Fischer Black (1938-1995) (Chichilnisky 1996) contain accounts of the difficulties Black and Scholes encountered in trying to get their work published. After several rejections by leading journals, the paper finally appeared in 1973 in the Journal of Political Economy. It was alternatively derived and extended later that year by Merton.
Thus, like so many classics, the Black-Scholes and Merton papers were ahead of their time in the economics and financial communities. Their ideas became better assimilated with time, and the Arbitrage Pricing Technique of S.A. Ross was developed by 1976-1978; see (Ross 1976), (Ross 1978). In 1979, the Cox-Ross-Rubinstein treatment by binomial trees (Cox, Ross, and Rubinstein 1979) appeared, allowing an elementary approach showing clearly the basic no-arbitrage argument, which is the basis of the majority of contingent claim pricing models in use. The papers (Harrison and Kreps 1979), (Harrison and Pliska 1981) made the link with the relevant mathematics - martingale theory - explicit. Since then, mathematical finance has developed rapidly - in parallel with the explosive growth in volumes of derivatives traded.
Today, the theory is mature, is unchallengeably important, and has been simplified to the extent that, far from being controversial or arcane as in 1973, it is easy enough to be taught to students - of economics and finance, financial engineering, mathematics and statistics - as part of the canon of modern applied mathematics. Its importance was recognized by the award of the Nobel Prize for Economics in 1997 to the two survivors among the three founding pioneers, Myron Scholes and Robert Merton (Nobel prize laudatio 1997).
The core of the subject-matter of mathematical finance concerns questions of pricing - of financial derivatives such as options - and hedging - covering oneself against all eventualities. Pervading all questions of pricing is the concept of arbitrage. Mispricing will be spotted by arbitrageurs, and exploited to extract riskless profit from your mistake, in potentially unlimited quantities. Thus to misprice is to expose oneself to being used as a money pump by the market. The Black-Scholes theory is the main theoretical tool for pricing of options, and for associated questions of trading strategies for hedging.
Now that the theory is well-established, the profit margins on the standard - 'vanilla' - options are so slender that practitioners constantly seek to develop new - nonstandard or 'exotic' - options which might be traded more profitably. And of course, these have to be priced - or one will be used as a money-pump by arbitrageurs ...
The upshot of all this is that, although standard options are well established nowadays, and are accessible and well understood, practitioners constantly seek new financial products, of ever greater complexity. Faced with this open-ended escalation of the theoretical problems of mathematical finance, there is no substitute for understanding what is going on. The gist of this can be put into one sentence: one should discount everything, and take expected values under an equivalent martingale measure. Now discounting has been with us for a long time - as long as inflation and other concomitants of capitalism - and makes few mathematical demands beyond compound interest and exponential growth. By contrast, equivalent martingale measures - the terminology is from (Harrison and Pliska 1981), where the concept was first made explicit - make highly non-trivial mathematical demands on the reader, and in consequence present the expositor with a quandary.
One can presuppose a mathematical background advanced enough to include measure theory and enough measure-theoretic probability to include martingales - say, to the level covered by the excellent text (Williams 1991). But this is to restrict the subject to a comparative elite, and so fails to address the needs of most practitioners, let alone intending ones. At the other extreme, one can eschew the language of mathematics for that of economics and finance, and hope that by dint of repetition the recipe that eventually emerges will appear natural and well-motivated. Granted a leisurely enough approach, such a strategy is quite viable. However, we prefer to bring the key concepts out into the light of day rather than leave them implicit or unstated. Consequently, we find ourselves committed to using the relevant mathematical language - of measure theory and martingales - explicitly. Now what makes measure theory hard (final year material for good mathematics undergraduates, or postgraduates) is its proofs and its constructions. As these are only a secondary concern here - our primary concern being the relevant concepts, language and viewpoint - we simply take what we need for granted, giving chapter and verse to standard texts, and use it. Always take a pragmatic view in applied mathematics: the proof of the pudding is in the eating.
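The one-sentence gist stated above (discount everything, and take expected values under an equivalent martingale measure) can be illustrated in the simplest possible setting. The following is a minimal sketch, assuming a one-period market with a bank account paying a fixed rate and a single stock that moves to one of two values. The function name `one_period_price`, its parameters and the numbers used are illustrative assumptions and are not taken from the text. The sketch also computes the replicating portfolio, whose cost agrees with the discounted expectation; any other quoted price would let an arbitrageur trade the claim against this portfolio.

```python
from math import isclose


def one_period_price(s0, up, down, rate, payoff):
    """Price a claim in a one-period, two-state market (illustrative sketch).

    s0     -- stock price today
    up     -- gross return of the stock in the 'up' state
    down   -- gross return of the stock in the 'down' state
    rate   -- riskless interest rate over the period
    payoff -- function mapping the terminal stock price to the claim's payoff
    """
    if not down < 1 + rate < up:
        raise ValueError("need down < 1 + rate < up, otherwise there is arbitrage")

    # Risk-neutral probability of the up-move: under it the discounted
    # stock price is a martingale, i.e. s0 = E_Q[S_1] / (1 + rate).
    q = (1 + rate - down) / (up - down)

    s_up, s_down = s0 * up, s0 * down

    # Discount everything and take the expectation under Q.
    price = (q * payoff(s_up) + (1 - q) * payoff(s_down)) / (1 + rate)

    # Replicating portfolio: delta shares of stock plus cash in the bank.
    delta = (payoff(s_up) - payoff(s_down)) / (s_up - s_down)
    cash = (payoff(s_up) - delta * s_up) / (1 + rate)

    return price, q, delta, cash


def call(strike):
    """Payoff function of a European call with the given strike."""
    return lambda s: max(s - strike, 0.0)


price, q, delta, cash = one_period_price(100.0, 1.2, 0.8, 0.05, call(100.0))
print(f"price={price:.4f}  q={q:.3f}  delta={delta:.2f}  cash={cash:.4f}")

# The cost of the replicating portfolio equals the risk-neutral price.
assert isclose(delta * 100.0 + cash, price)
```

Note that the real-world probabilities of the two states never enter the calculation: only the martingale measure and the discount factor do.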
The phrase 'equivalent martingale measure' is hardly the language of choice for practitioners, who think in terms of the risk-adjusted or - as we shall call it - risk-neutral measure: the key concept of the subject is risk neutrality. Since this concept runs through the book like a golden thread (roter Faden, to use the German), we emphasize it by using it in our title.
One of the distinctive features of mathematical finance is that it is, by its very nature, interdisciplinary. At least at this comparatively early stage of the subject's development, everyone involved in it - practitioners, students, teachers, researchers - comes to it with his/her own individual profile of experience, knowledge and motivation. For ourselves, we both have a mathematics and statistics background (though the second author is an ex-practitioner), and teach the subject to a mixed audience with a high proportion of practitioners. It is our hope that the balance we strike here between the mathematical and economic/financial sides of the subject will make the book a useful addition to the burgeoning literature in the field. Broadly speaking, most books are principally aimed at those with a background on one or the other side. Those aiming at a more mathematically advanced audience, such as the excellent recent texts (Lamberton and Lapeyre 1996) and (Musiela and Rutkowski 1997), typically assume more mathematics than we do - specifically, a prior knowledge of measure theory. Those aiming at a more economic/financial audience, such as the equally excellent books (Cox and Rubinstein 1985) and (Hull 1999), typically prefer to 'teach by doing' and leave the mathematical nub latent rather than explicit. We have aimed for a middle way between these two.
We begin with the background on financial derivatives or contingent claims in Chapter 1, and with the mathematical background in Chapter 2, leading into Chapter 3 on stochastic processes in discrete time.
We apply the theory developed here to mathematical finance in discrete time in Chapter 4. The corresponding treatment in continuous time follows in Chapters 5 and 6. The remaining chapters treat incomplete markets and interest rate models.
We are grateful to many people for advice and comments. We thank first our students at Birkbeck College for their patience and interest in the courses from which this book developed, especially Jim Aspinwall and Mark Deacon for many helpful conversations. We are grateful to Tomas Björk and Martin Schweizer for their careful and scholarly suggestions. We thank Jon McLoone from Wolfram UK for the possibility of using Mathematica for numerical experiments. Thanks to Alex Schone who always patiently and helpfully explained the mysteries of LaTeX to the second author over the years - ohne Dich, Alex, würde ich immer noch im Handbuch nachschlagen! It is a pleasure to thank Dr Susan Hezlet and the staff of Springer-Verlag UK for their support and help throughout this project. And, last and most, we thank our families for their love, support and forbearance while this book was being written.
N.H. Bingham
Rüdiger Kiesel
London, March 1998
Contents
Preface to the Second Edition
Preface to the First Edition
1. Derivative Background
  1.1 Financial Markets and Instruments
    1.1.1 Derivative Instruments
    1.1.2 Underlying Securities
    1.1.3 Markets
    1.1.4 Types of Traders
    1.1.5 Modeling Assumptions
  1.2 Arbitrage
  1.3 Arbitrage Relationships
    1.3.1 Fundamental Determinants of Option Values
    1.3.2 Arbitrage Bounds
  1.4 Single-period Market Models
    1.4.1 A Fundamental Example
    1.4.2 A Single-period Model
    1.4.3 A Few Financial-economic Considerations
  Exercises
2. Probability Background
  2.1 Measure
  2.2 Integral
  2.3 Probability
  2.4 Equivalent Measures and Radon-Nikodym Derivatives
  2.5 Conditional Expectation
  2.6 Modes of Convergence
  2.7 Convolution and Characteristic Functions
  2.8 The Central Limit Theorem
  2.9 Asset Return Distributions
  2.10 Infinite Divisibility and the Lévy-Khintchine Formula
  2.11 Elliptically Contoured Distributions
  2.12 Hyperbolic Distributions
  Exercises
3. Stochastic Processes in Discrete Time
  3.1 Information and Filtrations
  3.2 Discrete-parameter Stochastic Processes
  3.3 Definition and Basic Properties of Martingales
  3.4 Martingale Transforms
  3.5 Stopping Times and Optional Stopping
  3.6 The Snell Envelope and Optimal Stopping
  3.7 Spaces of Martingales
  3.8 Markov Chains
  Exercises
4. Mathematical Finance in Discrete Time
  4.1 The Model
  4.2 Existence of Equivalent Martingale Measures
    4.2.1 The No-arbitrage Condition
    4.2.2 Risk-Neutral Pricing
  4.3 Complete Markets: Uniqueness of EMMs
  4.4 The Fundamental Theorem of Asset Pricing: Risk-Neutral Valuation
  4.5 The Cox-Ross-Rubinstein Model
    4.5.1 Model Structure
    4.5.2 Risk-neutral Pricing
    4.5.3 Hedging
  4.6 Binomial Approximations
    4.6.1 Model Structure
    4.6.2 The Black-Scholes Option Pricing Formula
    4.6.3 Further Limiting Models
  4.7 American Options
    4.7.1 Theory
    4.7.2 American Options in the CRR Model
  4.8 Further Contingent Claim Valuation in Discrete Time
    4.8.1 Barrier Options
    4.8.2 Lookback Options
    4.8.3 A Three-period Example
  4.9 Multifactor Models
    4.9.1 Extended Binomial Model
    4.9.2 Multinomial Models
  Exercises
5. Stochastic Processes in Continuous Time
  5.1 Filtrations; Finite-dimensional Distributions
  5.2 Classes of Processes
    5.2.1 Martingales
    5.2.2 Gaussian Processes
    5.2.3 Markov Processes
    5.2.4 Diffusions
  5.3 Brownian Motion
    5.3.1 Definition and Existence
    5.3.2 Quadratic Variation of Brownian Motion
    5.3.3 Properties of Brownian Motion
    5.3.4 Brownian Motion in Stochastic Modeling
  5.4 Point Processes
    5.4.1 Exponential Distribution
    5.4.2 The Poisson Process
    5.4.3 Compound Poisson Processes
    5.4.4 Renewal Processes
  5.5 Lévy Processes
    5.5.1 Distributions
    5.5.2 Lévy Processes
    5.5.3 Lévy Processes and the Lévy-Khintchine Formula
  5.6 Stochastic Integrals; Itô Calculus
    5.6.1 Stochastic Integration
    5.6.2 Itô's Lemma
    5.6.3 Geometric Brownian Motion
  5.7 Stochastic Calculus for Black-Scholes Models
  5.8 Stochastic Differential Equations
  5.9 Likelihood Estimation for Diffusions
  5.10 Martingales, Local Martingales and Semi-martingales
    5.10.1 Definitions
    5.10.2 Semi-martingale Calculus
    5.10.3 Stochastic Exponentials
    5.10.4 Semi-martingale Characteristics
  5.11 Weak Convergence of Stochastic Processes
    5.11.1 The Spaces C^d and D^d
    5.11.2 Definition and Motivation
    5.11.3 Basic Theorems of Weak Convergence
    5.11.4 Weak Convergence Results for Stochastic Integrals
  Exercises
6. Mathematical Finance in Continuous Time
  6.1 Continuous-time Financial Market Models
    6.1.1 The Financial Market Model
    6.1.2 Equivalent Martingale Measures
    6.1.3 Risk-neutral Pricing
    6.1.4 Changes of Numeraire
  6.2 The Generalized Black-Scholes Model
    6.2.1 The Model
    6.2.2 Pricing and Hedging Contingent Claims
    6.2.3 The Greeks
    6.2.4 Volatility
  6.3 Further Contingent Claim Valuation
    6.3.1 American Options
    6.3.2 Asian Options
    6.3.3 Barrier Options
    6.3.4 Lookback Options
    6.3.5 Binary Options
  6.4 Discrete- versus Continuous-time Market Models
    6.4.1 Discrete- to Continuous-time Convergence Reconsidered
    6.4.2 Finite Market Approximations
    6.4.3 Examples of Finite Market Approximations
    6.4.4 Contiguity
  6.5 Further Applications of the Risk-neutral Valuation Principle
    6.5.1 Futures Markets
    6.5.2 Currency Markets
  Exercises
7. Incomplete Markets
  7.1 Pricing in Incomplete Markets
    7.1.1 A General Option-Pricing Formula
    7.1.2 The Esscher Measure
  7.2 Hedging in Incomplete Markets
    7.2.1 Quadratic Principles
    7.2.2 The Financial Market Model
    7.2.3 Equivalent Martingale Measures
    7.2.4 Hedging Contingent Claims
    7.2.5 Mean-variance Hedging and the Minimal ELMM
    7.2.6 Explicit Example
    7.2.7 Quadratic Principles in Insurance
  7.3 Stochastic Volatility Models
  7.4 Models Driven by Lévy Processes
    7.4.1 Introduction
    7.4.2 General Lévy-process Based Financial Market Model
    7.4.3 Existence of Equivalent Martingale Measures
    7.4.4 Hyperbolic Models: The Hyperbolic Lévy Process
xvii
Contents
8.
Interest Rate Theory . . . . . . . . . . . . ... . 8 . 1 The Bond Market . . . . . . . . ... .. . .. .. .. 8. 1 . 1 The Term Structure of Interest Rates .. .. . . 8. 1 . 2 Mathematical Modelling . . . . . . . .. .. ... 8. 1 . 3 Bond Pricing, . . . . . .... .. .. ... ... .. 8.2 Short-rate Models . . . .. . . . . . . . . . . . . . . . . . . .. 8.2. 1 The Term-structure Equation . . . . . . . . . . . . . . . 8.2 . 2 Martingale Modelling . . . . . .. . . . . . . . . . . . . . . . 8.2. 3 Extensions: Multi-Factor Models . . . . . . .. . . . . . 8. 3 Heath-Jarrow-Morton Methodology . . . ... 8. 3 . 1 The Heath-Jarrow-Morton Model Class . .. . . .. 8. 3 . 2 Forward Risk-neutral Martingale Measures . 8. 3 . 3 Completeness . . . . . . . . .. .. . .. . .. . . .. 8. 4 Pricing and Hedging Contingent Claims . 8. 4 . 1 Short-rate Models . . . . . . .. . . .. .. .. 8. 4 . 2 Gaussian HJM Framework . . . . . . . . . . .. ... . 8. 4 . 3 Swaps . . . . . . . . . . . . . . . . . . . . . .. .. . 8. 4 . 4 Caps . . . . .. . . . . . . . . .. . . . 8. 5 Market Models of LIBOR- and Swap-rates . . . . . . .. 8. 5 . 1 Description of the Economy ... . . .. . . .. . . .. . 8. 5 . 2 LIBOR Dynamics Under the Forward LIB OR Measure . . . . . . 8.5 . 3 The Spot LIBOR Measure . . . . . .. . . 8. 5 . 4 Valuation of Caplets and Floorlets in the LMM 8. 5 . 5 The Swap Market Model . . .. . ... . . . 8. 5.6 The Relation Between LIBOR- and Swap-market Models . . . . . . . . . . . . . . 8.6 Potential Models and the Flesaker-Hughston Framework . . . . 8.6. 1 Pricing Kernels and Potentials . . . . . . . . . .. . 8.6.2 The Flesaker-Hughston Framework . . . . ..... .
Exercises
9. Credit Risk
  9.1 Aspects of Credit Risk
    9.1.1 The Market
    9.1.2 What Is Credit Risk?
    9.1.3 Portfolio Risk Models
  9.2 Basic Credit Risk Modeling
  9.3 Structural Models
    9.3.1 Merton's Model
    9.3.2 A Jump-diffusion Model
    9.3.3 Structural Model with Premature Default
    9.3.4 Structural Model with Stochastic Interest Rates
    9.3.5 Optimal Capital Structure - Leland's Approach
  9.4 Reduced Form Models
  9.5 Credit Derivatives
  9.6 Portfolio Credit Risk Models
  9.7 Collateralized Debt Obligations (CDOs)
    9.7.1 Introduction
    9.7.2 Review of Modelling Methods
A. Hilbert Space
B. Projections and Conditional Expectations
C. The Separating Hyperplane Theorem
Bibliography
Index
1. Derivative Background
The main focus of this book is the pricing of financial assets. Price formation in financial markets may be explained in an absolute manner in terms of fundamentals, as, e.g., in the so-called rational expectation model, or, more modestly, in a relative manner, explaining the prices of some assets in terms of other given and observable asset prices. The second approach, which we adopt, is based on the concept of arbitrage. This remarkably simple concept is independent of the beliefs and tastes (preferences) of the actors in the financial market. The basic assumption simply states that all participants in the market prefer more to less, and that any increase in consumption opportunities must somehow be paid for. Underlying all arguments is the question: Is it possible for an investor to restructure his current portfolio (the assets currently owned) in such a way that he has to pay less today for his restructured portfolio and still has the same (or a higher) return at a future date? If such an opportunity exists, the arbitrageur can consume the difference today and has gained a free lunch.
Following our relative pricing approach, we think of financial assets as specific mixtures of some fundamental building blocks. A key observation will be that the economics involved in the relative pricing lead to linearity of the price formation. Consequently, if we are able to extract the prices of these fundamental building blocks from the prices of the financial assets traded in the market, we can create and price new assets simply by choosing new mixtures of the building blocks. It is this special feature of financial asset pricing that allows the use of modern martingale-based probability theory (and made the subject so special to us).
We will review in this chapter the relevant background for the pricing theory in financial markets. We start by describing financial markets, the actors in them and the financial assets traded there. After clarifying the fundamental economic building blocks we come to the key concept of arbitrage. We introduce the general technique of arbitrage pricing and finally specify our first model of a financial market.
1.1 Financial Markets and Instruments
This book is on the risk-neutral (probabilistic) pricing of derivative securities. In practitioner's terms a 'derivative security' is a security whose value depends on the values of other more basic underlying securities; cf. Hull (1999), p. 1. We adopt a more precise academic definition, in the spirit of Ingersoll (1986):
Definition 1.1.1. A derivative security, or contingent claim, is a financial contract whose value at expiration date \(T\) (more briefly, expiry) is determined exactly by the price (or prices within a prespecified time-interval) of the underlying financial assets (or instruments) at time \(T\) (within the time interval \([0, T]\)).
We refer to the underlying assets below simply as 'the underlying'. This section provides the institutional background on derivative securities, the main groups of underlying assets, the markets where derivative securities are traded and the financial agents involved in these activities. As our focus is on (probabilistic) models and not institutional considerations, we refer the reader to excellent sources describing the institutions, such as Davis (1994), Edwards and Ma (1992) and Kolb (1991).
1.1.1 Derivative Instruments
Derivative securities can be grouped under three general headings: Options, Forwards and Futures, and Swaps. In this text we will mainly deal with options, although our pricing techniques may be readily applied to forwards, futures and swaps as well.
Options. An option is a financial instrument giving one the right but not the obligation to make a specified transaction at (or by) a specified date at a specified price. Call options give one the right to buy. Put options give one the right to sell. European options give one the right to buy/sell on the specified date, the expiry date, on which the option expires or matures. American options give one the right to buy/sell at any time prior to or at expiry.
Over-the-counter (OTC) options were long ago negotiated by a broker between a buyer and a seller. In 1973 (the year of the Black-Scholes formula, perhaps the central result of the subject), the Chicago Board Options Exchange (CBOE) began trading in options on some stocks. Since then, the growth of options has been explosive. Options are now traded on all the major world exchanges, in enormous volumes. Risk magazine (12/97) estimated $35 trillion as the gross figure for worldwide derivatives markets in 1996. By contrast, the Financial Times of 7 October 2002 (Special Report on Derivatives) gives the interest rate and currency derivatives volume as $83 trillion - an indication of the rate of growth in recent years! The simplest call and put options are now so standard they are called vanilla options. Many kinds of options now exist, including so-called exotic options. Types include Asian options, which depend on the average price over a period; lookback options, which depend on the maximum or minimum price over a period; and barrier options, which depend on some price level being attained or not.
Terminology. The asset to which the option refers is called the underlying asset or the underlying. The price at which the parties agree to buy/sell the underlying, on/by the expiry date (if exercised), is called the exercise price or strike price. We shall usually use \(K\) for the strike price, time \(t = 0\) for the initial time (when the contract between the buyer and the seller of the option is struck) and time \(t = T\) for the expiry or final time.
Consider, say, a European call option, with strike price \(K\); write \(S(t)\) for the value (or price) of the underlying at time \(t\). If \(S(t) > K\), the option is in the money; if \(S(t) = K\), the option is said to be at the money; and if \(S(t) < K\), the option is out of the money. The payoff from the option is
\[ S(T) - K \ \text{ if } S(T) > K \quad \text{and} \quad 0 \ \text{ otherwise}, \]
more briefly written as \([S(T) - K]^+\). Taking into account the initial payment of an investor one obtains the profit diagram below.
Fig. 1.1. Profit diagram for a European call (vertical axis: profit)
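The terminology above can be made concrete with a minimal Python sketch. The function names and the numerical values are illustrative assumptions, not taken from the text, and the profit calculation assumes that interest on the initial payment is ignored, as is usual for such diagrams.

```python
def call_payoff(spot_at_expiry: float, strike: float) -> float:
    """Payoff [S(T) - K]^+ of a European call at expiry."""
    return max(spot_at_expiry - strike, 0.0)


def call_profit(spot_at_expiry: float, strike: float, premium: float) -> float:
    """Payoff minus the premium paid at t = 0 (assumption: no interest on the premium)."""
    return call_payoff(spot_at_expiry, strike) - premium


def moneyness(spot: float, strike: float) -> str:
    """Classify a call as in, at, or out of the money."""
    if spot > strike:
        return "in the money"
    if spot < strike:
        return "out of the money"
    return "at the money"


if __name__ == "__main__":
    strike, premium = 100.0, 5.0  # hypothetical values chosen for illustration
    for spot in (80.0, 100.0, 105.0, 120.0):
        print(spot, moneyness(spot, strike), call_profit(spot, strike, premium))
```

Under these assumptions the profit is flat at minus the premium for every expiry price below the strike and rises one-for-one above it, which is the shape a profit diagram for a call displays.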
Forwards. A forward contract is an agreement to buy or sell an asset \(S\) at a certain future date \(T\) for a certain price \(K\). The agent who agrees to buy the underlying asset is said to have a long position; the other agent assumes a short position. The settlement date is called the delivery date, and the specified price is referred to as the delivery price. The forward price \(f(t, T)\) is the delivery price that would make the contract have zero value at time \(t\). At the time the contract is set up, \(t = 0\), the forward price therefore equals the delivery price, hence \(f(0, T) = K\). The forward prices \(f(t, T)\) need not (and will not) necessarily be equal to the delivery price \(K\) during the lifetime of the contract.
The payoff from a long position in a forward contract on one unit of an asset with price \(S(T)\) at the maturity of the contract is
\[ S(T) - K. \]
Compared with a call option with the same maturity and strike price \(K\), we see that the investor now faces a downside risk, too. He has the obligation to buy the asset for price \(K\).
Swaps. A swap is an agreement whereby two parties undertake to exchange, at known dates in the future, various financial assets (or cash flows) according to a prearranged formula that depends on the value of one or more underlying assets. Examples are currency swaps (exchange of currencies) and interest-rate swaps (exchange of a fixed for a floating set of interest payments).
1.1.2 Underlying Securities
Stocks. The basis of modern economic life - or of the capitalist system - is the limited liability company (UK: & Co. Ltd, now plc - public limited company), the corporation (US: Inc.), 'die Aktiengesellschaft' (Germany: AG). Such companies are owned by their shareholders; the shares provide partial ownership of the company, pro rata with investment, and have value, reflecting both the value of the company's (real) assets and the earning power of the company's dividends. With publicly quoted companies, shares are quoted and traded on the Stock Exchange. Stock is the generic term for assets held in the form of shares.
Interest Rates. The value of some financial assets depends solely on the level of interest rates (or yields), e.g. Treasury (T-) notes, T-bills, T-bonds, municipal and corporate bonds. These are fixed-income securities by which national, state and local governments and large companies partially finance their economic activity. Fixed-income securities require the payment of interest in the form of a fixed amount of money at predetermined points in time, as well as repayment of the principal at maturity of the security. Interest rates themselves are notional assets, which cannot be delivered. Hedging exposure to interest rates is more complicated than hedging exposure to the price movements of a certain stock. A whole term structure is necessary for a full description of the level of interest rates, and for hedging purposes one must clarify the nature of the exposure carefully. We will discuss the subject of modeling the term structure of interest rates in Chapter 8.
Currencies. A currency is the denomination of the national units of payment (money) and as such is a financial asset. The end of fixed exchange rates and the adoption of floating exchange rates resulted in a sharp increase in exchange rate volatility. International trade, and economic activity involving it, such as most manufacturing industry, involves dealing with more than one currency. A company may wish to hedge against adverse movements of foreign currencies and in doing so use derivative instruments. See, for example, the account of the hedging problems British Steel faced as a result of the sharp increase in the pound sterling in 1996/97, Rennocks (1997).
Indexes. An index tracks the value of a (hypothetical) basket of stocks (FT-SE 100, S&P-500, DAX), bonds (REX), and so on. Again, these are not assets themselves. Derivative instruments on indexes may be used for hedging if no derivative instruments on the particular asset in question (a stock, a bond, a commodity) are available and if the correlation in movement between the index and the asset is significant. Furthermore, institutional funds (such as pension funds, mutual funds etc.), which manage large diversified stock portfolios, try to mimic particular stock indexes and use derivatives on stock indexes as a portfolio management tool. On the other hand, a speculator may wish to bet on a certain overall development in a market without exposing him/herself to a particular asset.
A new kind of index was recently created by the Chicago Board of Trade (CBOT): the Index of Catastrophe Losses (CAT-Index). The growing number of huge natural disasters (such as Hurricane Andrew in 1992, the Kobe earthquake in 1995, etc.) has led the insurance industry to try to find new ways of increasing its capacity to carry risks. The CBOT tried to capitalize on this problem by launching a market in insurance derivatives. Investors have been offered options on the CAT-Index, thereby taking in effect the position of traditional reinsurance.
Derivatives are themselves assets - they are traded, have value etc. - and so can be used as underlying assets for new contingent claims: options on futures, options on baskets of options etc. These developments give rise to so-called exotic options, demanding a sophisticated mathematical machinery to handle them.
1.1.3 Markets
Financial derivatives are basically traded in two ways: on organized exchanges and over-the-counter (OTC). Organized exchanges are subject to regulatory rules, require a certain degree of standardization of the traded instruments (strike price, maturity dates, size of contract etc.) and have a physical location at which trade takes place. Examples are the Chicago Board Options Exchange (CBOE), which coincidentally opened in April 1973, the same year in which the seminal contributions on option prices by Black and Scholes (1973) and Merton (1973) were published; the London International Financial Futures Exchange (LIFFE); and the Deutsche Terminbörse (DTB).
OTC trading takes place via computers and phones between various commercial and investment banks (leading players include institutions such as Bankers Trust, Goldman Sachs (where Fischer Black worked), Citibank, Chase Manhattan and Deutsche Bank).
Due to the growing sophistication of investors boosting demand for increasingly complicated, made-to-measure products, the OTC market volume is currently growing at a much faster pace than trade on most exchanges.
1.1.4 Types of Traders
We can classify the traders of derivative securities into three classes.
Hedgers. Successful companies concentrate on economic activities in which they do best. They use the market to insure themselves against adverse movements of prices, currencies, interest rates etc. Hedging is an attempt to reduce exposure to risk a company already faces. Shorter Oxford English Dictionary (OED): Hedge: 'trans. To cover oneself against loss on (a bet etc.) by betting, etc., on the other side. Also fig. 1672.'
Speculators. Speculators want to take a position in the market - they take the opposite position to hedgers. Indeed, speculation is needed to make hedging possible, in that a hedger, wishing to lay off risk, cannot do so unless someone is willing to take it on.
In speculation, available funds are invested opportunistically in the hope of making a profit: the underlying itself is irrelevant to the investor (speculator), who is only interested in the potential for possible profit that trade involving it may present. Hedging, by contrast, is typically engaged in by companies that have to deal habitually in intrinsically risky assets such as foreign exchange next year, commodities next year, etc. They may prefer to forgo the chance to make exceptional windfall profits when future uncertainty works to their advantage by protecting themselves against exceptional loss. This would serve to protect their economic base (trade in commodities, or manufacture of products using these as raw materials), and also enable them to focus their effort in their chosen area of trade or manufacture. For speculators, on the other hand, it is the market (forex, commodities or whatever) itself that is their main forum of economic activity.
Arbitrageurs. Arbitrageurs try to lock in riskless profit by simultaneously entering into transactions in two or more markets.
The very existence of arbitrageurs means that there can only be very small arbitrage opportunities in the prices quoted in most financial markets. The underlying concept of this book is the absence of arbitrage opportunities (cf. §1.2).
1.1.5 Modeling Assumptions
Contingent Claim Pricing. The fundamental problem in the mathematics of financial derivatives is that of pricing. The modern theory began in 1973 with the seminal Black-Scholes theory of option pricing, Black and Scholes (1973), and Merton's extensions of this theory, Merton (1973).
To expose the relevant features, we start by discussing contingent claim pricing in the simplest (idealized) case and impose the following set of assumptions on the financial markets (we will relax these assumptions subsequently).
No market frictions    No transaction costs, no bid/ask spread, no taxes, no margin requirements, no restrictions on short sales
No default risk        Implying same interest for borrowing and lending
Competitive markets    Market participants act as price takers
Rational agents        Market participants prefer more to less
No arbitrage
Table 1.1. General assumptions
All real markets involve frictions; this assumption is made purely for simplicity. We develop the theory of an ideal - frictionless - market so as to focus on the irreducible essentials of the theory and as a first-order approximation to reality. Understanding frictionless markets is also a necessary step towards understanding markets with frictions.
The risk of failure of a company - bankruptcy - is inescapably present in its economic activity: death is part of life, for companies as for individuals. Those risks also appear at the national level: quite apart from war, or economic collapse resulting from war, recent decades have seen default of interest payments on international debt, or the threat of it. We ignore default risk for simplicity while developing understanding of the principal aspects (for recent overviews on the subject we refer the reader to Jameson (1995) and Madan (1998)).
We assume financial agents to be price takers, not price makers. This implies that even large amounts of trading in a security by one agent do not influence the security's price. Hence agents can buy or sell as much of any security as they wish without changing the security's price.
To assume that market participants prefer more to less is a very weak assumption on the preferences of market participants. Apart from this we will develop a preference-free theory. The relaxation of these assumptions is the subject of ongoing research and we will include comments on this in the text.
We want to mention the special character of the no-arbitrage assumption. If we developed a theoretical price of a financial derivative under our assumptions and this price did not coincide with the price observed, we would take this as an arbitrage opportunity in our model and go on to explore the consequences. This might lead to a relaxation of one of the other assumptions and a restart of the procedure again with no-arbitrage assumed. The no-arbitrage assumption thus has a special status that the others do not. It is the basis for the arbitrage pricing technique that we shall develop, and we discuss it in more detail below.
1.2 Arbitrage
We now turn in detail to the concept of arbitrage, which lies at the centre of the relative pricing theory. This approach works under very weak assumptions. We do not have to impose any assumptions on the tastes (preferences) and beliefs of market participants. The economic agents may be heterogeneous with respect to their preferences for consumption over time and with respect to their expectations about future states of the world. All we assume is that they prefer more to less, or more precisely, an increase in consumption without any costs will always be accepted.
The principle of arbitrage in its broadest sense is given by the following quotation from OED: '3 [Comm.]. The traffic in Bills of Exchange drawn on sundry places, and bought or sold in sight of the daily quotations of rates in the several markets. Also, the similar traffic in Stocks. 1881.'
Used in this broad sense, the term covers financial activity of many kinds, including trade in options, futures and foreign exchange. However, the term arbitrage is nowadays also used in a narrower and more technical sense. Financial markets involve both riskless (bank account) and risky (stocks etc.) assets. To the investor, the only point of exposing oneself to risk is the opportunity, or possibility, of realizing a greater profit than the riskless procedure of putting all one's money in the bank (the mathematics of which - compound interest - does not require a textbook treatment at this level). Generally speaking, the greater the risk, the greater the return required to make investment an attractive enough prospect to attract funds. Thus, for instance, a clearing bank lends to companies at higher rates than it pays to its account holders. The companies' trading activities involve risk; the bank tries to spread the risk over a range of different loans, and makes its money on the difference between high/risky and low/riskless interest rates.
The essence of the technical sense of arbitrage is that it should not be possible to guarantee a profit without exposure to risk. Were it possible to do so, arbitrageurs (we use the French spelling, as is customary) would do so, in unlimited quantity, using the market as a 'money-pump' to extract arbitrarily large quantities of riskless profit. This would, for instance, make it impossible for the market to be in equilibrium. We shall restrict ourselves to markets in equilibrium for simplicity - so we must restrict ourselves to markets without arbitrage opportunities.
The above makes it clear that a market with arbitrage opportunities would be a disorderly market - too disorderly to model. The remarkable thing is the converse. It turns out that the minimal requirement of absence of arbitrage opportunities is enough to allow one to build a model of a financial market that - while admittedly idealized - is realistic enough both to provide real insight and to handle the mathematics necessary to price standard contingent claims. We shall see that arbitrage arguments suffice to determine prices - the arbitrage pricing technique. For an accessible treatment rather different to ours, see e.g. Allingham (1991).
To explain the fundamental arguments of the arbitrage pricing technique we use the following:
Example. Consider an investor who acts in a market in which only three financial assets are traded: (riskless) bonds \(B\) (bank account), stocks \(S\) and European call options \(C\) with strike \(K = 1\) on the stock. The investor may invest today, time \(t = 0\), in all three assets, leave his investment until time \(t = T\) and get his returns back then (we assume the option expires at \(t = T\) also). We assume the current £ prices of the financial assets are given by
\[ B(0) = 1, \quad S(0) = 1, \quad C(0) = 0.2 \]
and that at \(t = T\) there can be only two states of the world: an up-state with £ prices
\[ B(T, u) = 1.25, \quad S(T, u) = 1.75, \quad \text{and therefore } C(T, u) = 0.75, \]
and a down-state with £ prices
\[ B(T, d) = 1.25, \quad S(T, d) = 0.75, \quad \text{and therefore } C(T, d) = 0. \]
Now our investor has a starting capital of £25, and divides it as in Table 1.2 below (we call such a division a portfolio).
Financial asset    Number of    Total amount in £
Bond               10           10
Stock              10           10
Call               25           5
Table 1.2. Original portfolio
Depending on the state of the world at time \(t = T\), this portfolio will give the £ return shown in Table 1.3.
State of the world    Bond     Stock    Call     Total
Up                    12.5     17.5     18.75    48.75
Down                  12.5     7.5      0        20
Table 1.3. Return of original portfolio
Can the investor do better? Let us consider the restructured portfolio of Table 1.4.
Financial asset    Number of    Total amount in £
Bond               11.8         11.8
Stock              7            7
Call               29           5.8
Table 1.4. Restructured portfolio
This portfolio requires only an investment of £24.6. We compute its return in the different possible future states (Table 1.5).
State of the world    Bond     Stock    Call     Total
Up                    14.75    12.25    21.75    48.75
Down                  14.75    5.25     0        20
Table 1.5. Return of the restructured portfolio
We see that this portfolio generates the same time \(T\) return while costing only £24.6 now, a saving of £0.4 against the first portfolio. So the investor should use the second portfolio and have a free lunch today!
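The arithmetic behind the two portfolios of the example can be checked with a short Python sketch. The prices and holdings are those of the example; the dictionary layout and the function names are illustrative assumptions.

```python
# Prices at t = 0 and at t = T in each state of the world, as given in the example.
PRICE_NOW = {"bond": 1.0, "stock": 1.0, "call": 0.2}
PRICE_AT_T = {
    "up": {"bond": 1.25, "stock": 1.75, "call": 0.75},
    "down": {"bond": 1.25, "stock": 0.75, "call": 0.0},
}


def cost(holdings: dict) -> float:
    """Amount in pounds needed at t = 0 to buy the holdings."""
    return sum(units * PRICE_NOW[asset] for asset, units in holdings.items())


def value_at_expiry(holdings: dict, state: str) -> float:
    """Value in pounds of the holdings at t = T in the given state."""
    return sum(units * PRICE_AT_T[state][asset] for asset, units in holdings.items())


original = {"bond": 10, "stock": 10, "call": 25}
restructured = {"bond": 11.8, "stock": 7, "call": 29}

if __name__ == "__main__":
    for name, holdings in (("original", original), ("restructured", restructured)):
        print(
            name,
            round(cost(holdings), 2),
            round(value_at_expiry(holdings, "up"), 2),
            round(value_at_expiry(holdings, "down"), 2),
        )
```

Both portfolios are worth the same in each state at time \(T\), while the restructured one is cheaper today; the rounding in the printout only hides floating-point noise.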
In the above example the investor was able to restructure his portfolio, reducing the current (time \(t = 0\)) expenses without changing the return at the future date \(t = T\) in both possible states of the world. So there is an arbitrage possibility in the above market situation, and the prices quoted are not arbitrage (or market) prices. If we regard (as we shall do) the prices of the bond and the stock (our underlying) as given, the option must be mispriced. In this book we will develop models of financial markets (with different degrees of sophistication) which will allow us to find methods to avoid (or to spot) such pricing errors. For the time being, let us have a closer look at the differences between portfolio 1, consisting of 10 bonds, 10 stocks and 25 call options, in short (10, 10, 25), and portfolio 2, of the form (11.8, 7, 29). The difference (from the point of view of portfolio 1, say) is the following portfolio, D: (-1.8, 3, -4). Sell short three stocks (see below), buy four options and put £1.8 in your bank account. The left-over is exactly the £0.4 of the example.
But what is the effect of doing that? Let us consider the consequences in the possible states of the world. From Table 1.6 below, we see in both cases that the effects of the different positions of the portfolio offset themselves. But clearly the portfolio generates an income at \(t = 0\) and is therefore itself an arbitrage opportunity.
World is in state up               World is in state down
Exercise option         3          Option is worthless     0
Buy 3 stocks at 1.75    -5.25      Buy 3 stocks at 0.75    -2.25
Sell bond               2.25       Sell bond               2.25
Balance                 0          Balance                 0
Table 1.6. Difference portfolio
If we only look at the position in bonds and stocks, we can say that this position covers us against possible price movements of the option, i.e. having £1.8 in your bank account and being three stocks short has the same time \(T\) effects as having four call options outstanding against us. We say that the bond/stock position is a hedge against the position in options.
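The hedge described above can also be recovered by solving for the bond/stock position that reproduces the option's value in both states. The following Python sketch does this for the numbers of the example. The replication step is a standard argument that this passage does not spell out, so it should be read as an illustration under that assumption; the function and variable names are hypothetical.

```python
# Example data: bond and stock prices at t = 0 and at t = T.
B0, S0 = 1.0, 1.0
B_T = 1.25                    # bond price at T, identical in both states
S_UP, S_DOWN = 1.75, 0.75     # stock price at T in the up- and down-state
C_UP, C_DOWN = 0.75, 0.0      # value at T of one call with strike K = 1
C_QUOTED = 0.2                # quoted call price at t = 0


def replicate(claim_up: float, claim_down: float) -> tuple:
    """Return (bonds, stocks) whose time-T value equals the claim in both states."""
    stocks = (claim_up - claim_down) / (S_UP - S_DOWN)
    bonds = (claim_down - stocks * S_DOWN) / B_T
    return bonds, stocks


# Bond/stock position equivalent to holding four calls.
bonds_4, stocks_4 = replicate(4 * C_UP, 4 * C_DOWN)

# Cost today of replicating one call with bonds and stocks.
bonds_1, stocks_1 = replicate(C_UP, C_DOWN)
replication_cost = bonds_1 * B0 + stocks_1 * S0

# Income at t = 0 from buying four calls at the quoted price and
# holding the opposite bond/stock position as a hedge.
income_today = 4 * (replication_cost - C_QUOTED)

if __name__ == "__main__":
    print(round(bonds_4, 2), round(stocks_4, 2))
    print(round(replication_cost, 2), round(income_today, 2))
```

Four calls are equivalent to three stocks financed by borrowing £1.8, so the opposite position (three stocks short, £1.8 in the bank) offsets them, in agreement with Table 1.6. Under these assumptions the cost of replicating one call exceeds the quoted £0.2, which is one way to see why the option is mispriced.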
Let us emphasize that the above arguments were independent of the preferences and plans of the investor. They were also independent of the interpretation of \(t = T\): it could be a fixed time, maybe a year from now, but it could refer to the happening of a certain event, e.g. a stock hitting a certain level, exchange rates at a certain level etc.
1.3 Arbitrage Relationships
In this section we use arbitrage-based arguments (the arbitrage pricing technique) to develop general bounds on the value of options. Such bounds, deduced from the underlying assumption that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial market models. In our analysis here we use stocks as the underlying.
1.3.1 Fundamental Determinants of Option Values
We consider the determinants of the option value in Table 1.7 below. Since we restrict ourselves to stocks not paying dividends, we don't have to consider cash dividends as another natural determinant.
Current stock price    \(S(t)\)
Strike price           \(K\)
Stock volatility       \(\sigma\)
Time to expiry         \(T - t\)
Interest rates         \(r\)
Table 1.7. Determinants affecting option value
We now examine the effects of the single determinants on the option prices (all other factors remaining unchanged).
We saw that at expiry the only variables that mattered were the stock price \(S(T)\) and strike price \(K\): remember the payoffs \(C = (S(T) - K)^+\), \(P = (S(T) - K)^- \ (:= \max\{K - S(T), 0\})\). Looking at the payoffs, we see that an increase in the stock price will increase (decrease) the value of a call (put) option (recall all other factors remain unchanged). The opposite happens if the strike price is increased: the price of a call (put) option will go down (up).
When we buy an option, we bet on a favourable outcome. The actual outcome is uncertain; its uncertainty is represented by a probability density; favourable outcomes are governed by the tails of the density (right or left tail for a call or a put option). An increase in volatility flattens out the density and thickens the tails, so increases the value of both call and put options. Of course, this argument again relies on the fact that we don't suffer from the more severe unfavourable outcomes (which become more likely as volatility increases) - we have the right, but not the obligation, to exercise the option.
A heuristic statement of the effects of time to expiry or interest rates is not so easy to make. In the simplest of models (no dividends, interest rates remain fixed during the period under consideration), one might argue that the longer the time to expiry the more can happen to the price of a stock. So a longer period increases the possibility of movements of the stock price and hence the value of a call (put) should be higher the more time remains before expiry. But only the owner of an American-type option can react immediately to favourable price movements, whereas the owner of a European option has to wait until expiry, and only the stock price then is relevant. Observe the contrast with volatility: an increase in volatility increases the likelihood of favourable outcomes at expiry, whereas the stock price movements before expiry may cancel themselves out. A longer time until expiry might also increase the possibility of adverse effects from which the stock price has to recover before expiry.
We see that by using purely heuristic arguments we are not able to make precise statements. One can, however, show by explicit arbitrage arguments that an increase in time to expiry leads to an increase in the value of call options as well as put options. (We should point out that in case of a dividend-paying stock the statement is not true in general for European-type options.) To qualify the effects of the interest rate we have to consider two aspects. An increase in the interest rate tends to increase the expected growth rate in an economy; and hence the stock price tends to increase. On the other hand, the present value of any future cash flows decreases. These two effects both decrease the value of a put option, while the first effect increases the value of a call option. However, it can be shown that the first effect always dominates the second effect, so the value of a call option will increase with increasing interest rates. The above heuristic statements, in particular the last, will be verified again in appropriate models of financial markets, see §4.5 and §6.2. We summarize in table 1 .8 the effect of an increase of one of the param eters on the value of options on stocks no paying dividends while keeping all others fixed: -
Parameter (increase)    Call        Put
Stock price             Positive    Negative
Strike price            Negative    Positive
Volatility              Positive    Positive
Interest rates          Positive    Negative
Time to expiry          Positive    Positive

Table 1.8. Effects of parameters
We would like to emphasize again that these results all assume that all other variables remain fixed, which of course is not true in practice. For example, stock prices tend to fall (rise) when interest rates rise (fall), and the observable effect on option prices may well be different from the effects deduced under our assumptions. Cox and Rubinstein (1985), pp. 37-39, discuss other possible determining factors of option value, such as expected rate of growth of the stock price, additional properties of stock price movements, investors' attitudes toward risk, characteristics of other assets and institutional environment (tax rules, margin requirements, transaction costs, market structure). They show that in many important circumstances the influence of these variables is marginal or even vanishing.
1.3.2 Arbitrage Bounds
We now use the principle of no-arbitrage to obtain bounds for option prices. Such bounds, deduced from the underlying assumption that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial market models. We focus on European options (puts and calls) with identical underlying (say a stock $S$), strike $K$ and expiry date $T$. Furthermore we assume the existence of a risk-free bank account (bond) with constant interest rate $r$ (continuously compounded) during the time interval $[0, T]$. We start with a fundamental relationship:

Proposition 1.3.1. We have the following put-call parity between the prices of the underlying asset $S$ and European call and put options on stocks that pay no dividends:

$$S + P - C = K e^{-r(T-t)}. \tag{1.1}$$

Proof. Consider a portfolio consisting of one stock, one put and a short position in one call (the holder of the portfolio has written the call); write $V(t)$ for the value of this portfolio. Then

$$V(t) = S(t) + P(t) - C(t)$$

for all $t \in [0, T]$. At expiry we have
$$V(T) = S(T) + (S(T) - K)^- - (S(T) - K)^+ = S(T) + K - S(T) = K.$$

This portfolio thus guarantees a payoff $K$ at time $T$. Using the principle of no-arbitrage, the value of the portfolio must at any time $t$ correspond to the value of a sure payoff $K$ at $T$, that is $V(t) = K e^{-r(T-t)}$. □

Having established (1.1), we concentrate on European calls in the following.
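The payoff identity used in the proof above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the strike value are hypothetical choices made for illustration. It evaluates the terminal value of the portfolio (long stock, long put, short call) for several terminal stock prices.

```python
def call_payoff(s, k):
    """Payoff (S(T) - K)^+ of a European call."""
    return max(s - k, 0.0)


def put_payoff(s, k):
    """Payoff (S(T) - K)^- = max{K - S(T), 0} of a European put."""
    return max(k - s, 0.0)


def parity_portfolio_payoff(s, k):
    """Terminal value of: long one stock, long one put, short one call."""
    return s + put_payoff(s, k) - call_payoff(s, k)


K = 15.0  # hypothetical strike, chosen only for illustration
terminal_prices = (0.0, 7.5, 15.0, 20.0, 100.0)
payoffs = [parity_portfolio_payoff(s, K) for s in terminal_prices]
print(payoffs)  # every entry equals the strike K
```

Whatever the terminal stock price, the portfolio pays exactly $K$, which is why its value at time $t$ must be the discounted strike.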
Proposition 1.3.2. The following bounds hold for European call options:

$$\max\left\{S(t) - e^{-r(T-t)} K,\, 0\right\} = \left(S(t) - e^{-r(T-t)} K\right)^+ \le C(t) \le S(t).$$
Proof. That $C \ge 0$ is obvious, otherwise 'buying' the call would give a riskless profit now and no obligation later. Similarly the upper bound $C \le S$ must hold, since violation would mean that the right to buy the stock has a higher value than owning the stock. This must be false, since a stock offers additional benefits. Now from put-call parity (1.1) and the fact that $P \ge 0$ (use the same argument as above), we have

$$S(t) - K e^{-r(T-t)} = C(t) - P(t) \le C(t),$$

which proves the last assertion. □
It is immediately clear that an American call option can never be worth less than the corresponding European call option, for the American option has the added feature of being able to be exercised at any time until the maturity date. Hence (with the obvious notation): $C_A(t) \ge C_E(t)$. The striking result we are going to show is Theorem 8.2 in Merton (1990):

Proposition 1.3.3. For a stock not paying dividends we have

$$C_A(t) = C_E(t). \tag{1.2}$$

Proof. Exercising the American call at time $t < T$ generates the cash-flow $S(t) - K$. From Proposition 1.3.2 we know that the value of the call must be greater than or equal to $S(t) - K e^{-r(T-t)}$, which is greater than $S(t) - K$. Hence selling the call would have realized a higher cash-flow and the early exercise of the call was suboptimal. □

Remark 1.3.1. Qualitatively, there are two reasons why an American call should not be exercised early:

(i) Insurance. An investor who holds a call option instead of the underlying stock is 'insured' against a fall in stock price below $K$, and if he exercises early, he loses this insurance.

(ii) Interest on the strike price. When the holder exercises the option, he buys the stock and pays the strike price, $K$. Early exercise at $t < T$ deprives the holder of the interest on $K$ between times $t$ and $T$: the later he pays out $K$, the better.

We remark that an American put offers additional value compared to a European put.

1.4 Single-period Market Models
1.4.1 A Fundamental Example

We consider a one-period model, i.e. we allow trading only at $t = 0$ and $t = T = 1$ (say). Our aim is to value at $t = 0$ a European derivative on a stock $S$ with maturity $T$.

First Idea. Model $S_T$ as a random variable on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The derivative is given by $H = f(S_T)$, i.e. it is a random variable (for a suitable function $f(\cdot)$). We could then price the derivative using some discount factor $\beta$ by using the expected value of the discounted future payoff:

$$H_0 = \mathbb{E}(\beta H). \tag{1.3}$$
Problem. How should we pick the probability measure $\mathbb{P}$? According to their preferences, investors will have different opinions about the distribution of the price $S_T$.

Black-Scholes-Merton (Ross) Approach. Use the no-arbitrage principle and construct a hedging portfolio using only known (and already priced) securities to duplicate the payoff $H$. We assume

1. Investors are nonsatiable, i.e. they always prefer more to less.
2. Markets do not allow arbitrage, i.e. the possibility of risk-free profits.

From the no-arbitrage principle we see: If it is possible to duplicate the payoff $H$ of a derivative using a portfolio $V$ of underlying (basic) securities, i.e. $H(\omega) = V(\omega)$, $\forall \omega$, the price of the portfolio at $t = 0$ must equal the price of the derivative at $t = 0$.

Let us assume there are two tradeable assets:

• a risk-free bond (bank account) with $B(0) = 1$ and $B(T) = 1$, that is the interest rate $r = 0$ and the discount factor $\beta(t) = 1$. (In this context we use $\beta(t) = 1/B(t)$ as the discount factor.)
• a risky stock $S$ with $S(0) = 10$ and two possible values at $t = T$:

$$S_T = \begin{cases} 20 & \text{with probability } p, \\ 7.5 & \text{with probability } 1 - p. \end{cases}$$
We call this setting a $(B, S)$-market. The problem is to price a European call at $t = 0$ with strike $K = 15$ and maturity $T$, i.e. the random payoff $H = (S(T) - K)^+$. We can evaluate the call in every possible state at $t = T$ and see $H = 5$ (if $S(T) = 20$) with probability $p$ and $H = 0$ (if $S(T) = 7.5$) with probability $1 - p$. This is illustrated in Figure 1.2.
[Figure: one-period tree. Today: $S_0 = 10$, $B_0 = 1$, $H_0 = ?$. After one period, up-state: $S_1 = 20$, $B_1 = 1$, $H_1 = \max\{20 - 15, 0\} = 5$; down-state: $S_1 = 7.5$, $B_1 = 1$, $H_1 = \max\{7.5 - 15, 0\} = 0$.]

Fig. 1.2. One-period example
The key idea now is to try to find a portfolio combining bond and stock, which synthesizes the cash flow of the option. If such a portfolio exists, holding this portfolio today would be equivalent to holding the option - they would produce the same cash flow in the future. So the price of the option should be the same as the price of constructing the portfolio, otherwise investors could just restructure their holdings in the assets and obtain a risk-free profit today.

We briefly present the construction of the portfolio $\theta = (\theta_0, \theta_1)$, which in the current setting is just a simple exercise in linear algebra. If we buy $\theta_1$ stocks and invest $\theta_0$ £ in the bank account, then today's value of the portfolio is

$$V(0) = \theta_0 + \theta_1 \cdot S(0).$$

In state 1 the stock price is 20 £ and the value of the option 5 £, so

$$\theta_0 + \theta_1 \cdot 20 = 5.$$

In state 2 the stock price is 7.5 £ and the value of the option 0 £, so

$$\theta_0 + \theta_1 \cdot 7.5 = 0.$$

We solve this and get $\theta_0 = -3$ and $\theta_1 = 0.4$. So the value of our portfolio at time 0 in £ is

$$V(0) = -3 B(0) + 0.4 S(0) = -3 + 0.4 \times 10 = 1.$$
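The two linear equations above can be solved mechanically. The sketch below is an illustration only (the helper name `replicate` is hypothetical, not from the text); it uses exact rational arithmetic and the numbers of the example: stock values 20 and 7.5, option values 5 and 0, bond value 1 in both states.

```python
from fractions import Fraction as F


def replicate(s_up, s_down, h_up, h_down, bond=F(1)):
    """Solve theta0 * bond + theta1 * S = H in both states for (theta0, theta1)."""
    theta1 = (h_up - h_down) / (s_up - s_down)
    theta0 = (h_up - theta1 * s_up) / bond
    return theta0, theta1


# Numbers from the example: S(T) in {20, 7.5}, H in {5, 0}, B(T) = 1.
theta0, theta1 = replicate(F(20), F(15, 2), F(5), F(0))

# Today's value V(0) = theta0 * B(0) + theta1 * S(0) with B(0) = 1, S(0) = 10.
price = theta0 * 1 + theta1 * 10
print(theta0, theta1, price)  # -3 2/5 1
```

Note that the probability $p$ of the up-state never enters the computation.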
$V(0)$ is called the no-arbitrage price. Every other price allows a riskless profit, since if the option is too cheap, buy it and finance yourself by selling short the above portfolio (i.e. sell the portfolio without possessing it and promise to deliver it at time $T = 1$ - this is risk-free because you own the option). If on the other hand the option is too dear, write it (i.e. sell it in the market) and cover yourself by setting up the above portfolio.

We see that the no-arbitrage price is independent of the individual preferences of the investor (given by certain probability assumptions about the future, i.e. a probability measure $\mathbb{P}$). But one can identify a special, so-called risk-neutral, probability measure $\mathbb{P}^*$, such that

$$H_0 = \mathbb{E}^*(\beta H) = p^* \cdot \beta (S_1 - K) + (1 - p^*) \cdot 0 = 1.$$

In the above example, we get from $1 = p^* 5 + (1 - p^*) 0$ that $p^* = 0.2$. This probability measure $\mathbb{P}^*$ is equivalent to $\mathbb{P}$, and the discounted stock price process, i.e. $\beta_t S_t$, $t = 0, 1$, follows a $\mathbb{P}^*$-martingale. In the above example, this corresponds to $S(0) = p^* S(T)^{up} + (1 - p^*) S(T)^{down}$, that is $S(0) = \mathbb{E}^*(\beta S(T))$.

We will show that the above generalizes. Indeed, we will find that the no-arbitrage condition is equivalent to the existence of an equivalent martingale measure (first fundamental theorem of asset pricing) and that the property that we can price assets using the expectation operator is equivalent to the uniqueness of the equivalent martingale measure.
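As a small illustration of the risk-neutral probability in the example above, the sketch below recovers $p^*$ from the martingale condition on the stock and reprices the call. The function name `risk_neutral_prob` and the comparison value $p = 1/2$ are assumptions introduced here for illustration only.

```python
from fractions import Fraction as F


def risk_neutral_prob(s0, s_up, s_down, beta=F(1)):
    """p* solving s0 = beta * (p* * s_up + (1 - p*) * s_down)."""
    return (s0 / beta - s_down) / (s_up - s_down)


s0, s_up, s_down, strike = F(10), F(20), F(15, 2), F(15)
p_star = risk_neutral_prob(s0, s_up, s_down)


def expected_call_payoff(p):
    """Expected payoff of the call when the up-state has probability p (beta = 1)."""
    return p * max(s_up - strike, 0) + (1 - p) * max(s_down - strike, 0)


call_price = expected_call_payoff(p_star)     # no-arbitrage price
naive_value = expected_call_payoff(F(1, 2))   # hypothetical subjective p = 1/2
print(p_star, call_price, naive_value)        # 1/5 1 5/2
```

The expectation under a subjective $p$ generally differs from the no-arbitrage price; only $p^*$ reproduces the cost of the replicating portfolio.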
Let us consider the construction of hedging strategies from a different perspective. Consider a one-period $(B, S)$-market setting with discount factor $\beta = 1$. Assume we want to replicate a derivative $H$ (that is a random variable on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$). For each hedging strategy $\theta = (\theta_0, \theta_1)$ we have an initial value of the portfolio $V(0) = \theta_0 + \theta_1 S(0)$ and a time $t = T$ value of $V(T) = \theta_0 + \theta_1 S(T)$. We can write $V(T) = V(0) + (V(T) - V(0))$ with $G(T) = V(T) - V(0) = \theta_1 (S(T) - S(0))$ the gains from trading. So the costs $C(0)$ of setting up this portfolio at time $t = 0$ are given by $C(0) = V(0)$, while maintaining (or achieving) a perfect hedge at $t = T$ requires an additional capital of $C(T) = H - V(T)$. Thus we have two possibilities for finding 'optimal' hedging strategies:

• Mean-variance hedging. Find $\theta_0$ (or alternatively $V(0)$) and $\theta_1$ such that

$$\mathbb{E}\left((H - V(T))^2\right) = \mathbb{E}\left(\left(H - (V(0) + \theta_1 (S(T) - S(0)))\right)^2\right) \to \min.$$

• Risk-minimal hedging. Minimize the cost from trading, i.e. an appropriate functional involving the costs $C(t)$, $t = 0, T$.

In our example, mean-variance hedging corresponds to the standard linear regression problem, and so

$$\theta_1 = \frac{\mathrm{Cov}(H, S(T) - S(0))}{\mathrm{Var}(S(T) - S(0))} \quad \text{and} \quad V_0 = \mathbb{E}(H) - \theta_1 \mathbb{E}(S(T) - S(0)).$$
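The regression formulas above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, the probability $p = 1/2$ in the two-state case and the whole three-state market are invented for illustration, and the "residual" returned is simply the mean squared hedging error at the regression solution.

```python
from fractions import Fraction as F


def mean(xs, probs):
    return sum(p * x for x, p in zip(xs, probs))


def cov(xs, ys, probs):
    mx, my = mean(xs, probs), mean(ys, probs)
    return sum(p * (x - mx) * (y - my) for x, y, p in zip(xs, ys, probs))


def mean_variance_hedge(h, s_T, s_0, probs):
    """Return (theta1, V0, residual) for the regression of H on the gains S(T) - S(0)."""
    gains = [s - s_0 for s in s_T]
    theta1 = cov(h, gains, probs) / cov(gains, gains, probs)
    v0 = mean(h, probs) - theta1 * mean(gains, probs)
    residual = cov(h, h, probs) - theta1 ** 2 * cov(gains, gains, probs)
    return theta1, v0, residual


# Two-state example from the text, with a hypothetical p = 1/2.
two_state = mean_variance_hedge(
    [F(5), F(0)], [F(20), F(15, 2)], F(10), [F(1, 2), F(1, 2)])
print(two_state)  # theta1 = 2/5, V0 = 1, residual = 0: a perfect hedge

# Hypothetical three-state market: H is no longer a linear function of S(T).
three_state = mean_variance_hedge(
    [F(5), F(0), F(0)], [F(20), F(10), F(15, 2)], F(10), [F(1, 3)] * 3)
print(three_state[2] > 0)  # some hedging error remains
```

In the two-state case the regression reproduces the replicating portfolio found earlier; with a third state the call is not linear in the stock and a positive residual is left.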
We can also calculate the optimal value of the risk functional

$$R_{min} = \mathrm{Var}(H) - \theta_1^2 \mathrm{Var}(S(T) - S(0)) = \mathrm{Var}(H)(1 - \rho^2),$$

where $\rho$ is the correlation coefficient of $H$ and $S(T)$. Therefore we can't expect a perfect hedge in general. If however $|\rho| = 1$, i.e. $H$ is a linear function of $S(T)$, a perfect hedge is possible. We call a market complete if a perfect hedge is possible for all contingent claims.

1.4.2 A Single-period Model
We proceed to formalize and extend the above example and present in detail a simple model of a financial market. Despite its simplicity, it already has all the key features needed in the sequel (and the reader should not hesitate to come back here from more advanced chapters to see the bare concepts again). We introduce in passing a little of the terminology and notation of Chapter 4; see also Harrison and Kreps (1979). We use some elementary vocabulary from probability theory, which is explained in detail in Chapter 2.

We consider a single period model, i.e. we have two time-indexes, say $t = 0$, which is the current time (date), and $t = T$, which is the terminal date for all economic activities considered. The financial market contains $d + 1$ traded financial assets, whose prices at time $t = 0$ are denoted by the vector $S(0) \in \mathbb{R}^{d+1}$,

$$S(0) = (S_0(0), S_1(0), \ldots, S_d(0))'$$

(where $'$ denotes the transpose of a vector or matrix). At time $T$, the owner of financial asset number $i$ receives a random payment depending on the state of the world. We model this randomness by introducing a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega| = N$ of points (each corresponding to a certain state of the world) $\omega_1, \ldots, \omega_j, \ldots, \omega_N$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$, which means that every state of the world is possible. $\mathcal{F}$ is the set of subsets of $\Omega$ (events that can happen in the world) on which $\mathbb{P}(\cdot)$ is defined (we can quantify how probable these events are), here $\mathcal{F} = \mathcal{P}(\Omega)$, the set of all subsets of $\Omega$. (In more complicated models it is not possible to define a probability measure on all subsets of the state space $\Omega$, see §2.1.) We can now write the random payment arising from financial asset $i$ as

$$S_i(T) = (S_i(T, \omega_1), \ldots, S_i(T, \omega_j), \ldots, S_i(T, \omega_N))'.$$

At time $t = 0$, the agents can buy and sell financial assets. The portfolio position of an individual agent is given by a trading strategy $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)' \in \mathbb{R}^{d+1}$.
The dynamics of our model using the trading strategy $\varphi$ are as follows: at time $t = 0$ we invest the amount $S(0)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(0)$ and at time $t = T$ we receive the random payment $S(T, \omega)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(T, \omega)$ depending on the realized state $\omega$ of the world. Using the $(d + 1) \times N$-matrix $S$, whose columns are the vectors $S(T, \omega)$, we can write the possible payments more compactly as $S'\varphi$.

What does an arbitrage opportunity mean in our model? As arbitrage is 'making something out of nothing', an arbitrage strategy is a vector $\varphi \in \mathbb{R}^{d+1}$ such that $S(0)'\varphi = 0$, our net investment at time $t = 0$ is zero, and

$$S(T, \omega)'\varphi \ge 0, \ \forall \omega \in \Omega, \text{ and there exists an } \omega \in \Omega \text{ such that } S(T, \omega)'\varphi > 0.$$

We can equivalently formulate this as: $S(0)'\varphi < 0$, we borrow money for consumption at time $t = 0$, and

$$S(T, \omega)'\varphi \ge 0, \ \forall \omega \in \Omega,$$

i.e. we don't have to repay anything at $t = T$. Now this means we had a 'free lunch' at $t = 0$ at the market's expense.

We agreed that we should not have arbitrage opportunities in our model. The consequences of this assumption are surprisingly far-reaching. So assume that there are no arbitrage opportunities. If we analyze the structure of our model above, we see that every statement can be formulated in terms of Euclidean geometry or linear algebra. For instance, absence of arbitrage means that the space

$$\Gamma = \left\{ \begin{pmatrix} x \\ y \end{pmatrix},\ x \in \mathbb{R},\ y \in \mathbb{R}^N : x = -S(0)'\varphi,\ y = S'\varphi,\ \varphi \in \mathbb{R}^{d+1} \right\}$$

and the space

$$\mathbb{R}^{N+1}_+ = \left\{ z \in \mathbb{R}^{N+1} : z_i \ge 0 \ \forall\, 0 \le i \le N, \ \exists\, i \text{ such that } z_i > 0 \right\}$$

have no common points. A statement like that naturally points to the use of a separation theorem for convex subsets, the separating hyperplane theorem (see e.g. Rockafellar (1970) for an account of such results, or Appendix A). Using such a theorem, we come to the following characterization of no arbitrage.

Theorem 1.4.1. There is no arbitrage if and only if there exists a vector

$$\psi \in \mathbb{R}^N, \quad \psi_i > 0, \ \forall\, 1 \le i \le N,$$

such that

$$S\psi = S(0). \tag{1.4}$$

Proof. The implication '$\Leftarrow$' follows straightforwardly: assume that $S(T, \omega)'\varphi \ge 0$, $\omega \in \Omega$ for a vector $\varphi \in \mathbb{R}^{d+1}$. Then
$$S(0)'\varphi = (S\psi)'\varphi = \psi' S'\varphi \ge 0,$$

since $\psi_i > 0$, $\forall\, 1 \le i \le N$. So no arbitrage opportunities exist.

To show the implication '$\Rightarrow$' we use a variant of the separating hyperplane theorem. Absence of arbitrage means that $\Gamma$ and $\mathbb{R}^{N+1}_+$ have no common points. This means that $K \subset \mathbb{R}^{N+1}_+$ defined by

$$K = \left\{ z \in \mathbb{R}^{N+1}_+ : \sum_{i=0}^{N} z_i = 1 \right\}$$

and $\Gamma$ do not meet. But $K$ is a compact and convex set, and by the separating hyperplane theorem (Appendix C), there is a vector $\lambda \in \mathbb{R}^{N+1}$ such that for all $z \in K$

$$\lambda' z > 0,$$

but for all $(x, y)' \in \Gamma$

$$\lambda_0 x + \lambda_1 y_1 + \ldots + \lambda_N y_N = 0.$$

Now choosing $z_i = 1$ successively we see that $\lambda_i > 0$, $i = 0, \ldots, N$, and hence by normalizing we get $\psi = \lambda / \lambda_0$ with $\psi_0 = 1$. Now set $x = -S(0)'\varphi$ and $y = S'\varphi$ and the claim follows. □
The vector $\psi$ is called a state-price vector. We can think of $\psi_j$ as the marginal cost of obtaining an additional unit of account in state $\omega_j$. We can now reformulate the above statement to:

There is no arbitrage if and only if there exists a state-price vector.
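To make the definitions concrete, the following sketch checks a candidate strategy against the arbitrage definition and a candidate vector against the state-price condition $S\psi = S(0)$. The function names are hypothetical, and the "mispriced" payoff matrix is invented purely for illustration; the first market is the bond-and-stock example of §1.4.1.

```python
from fractions import Fraction as F


def payoffs(S, phi):
    """Payoff S(T, w)'phi in each state; S has one row per asset, one column per state."""
    n_states = len(S[0])
    return [sum(phi[i] * S[i][j] for i in range(len(phi))) for j in range(n_states)]


def is_arbitrage(s0, S, phi):
    """Either formulation: zero cost with a chance of gain, or negative cost, and no loss."""
    cost = sum(p * x for p, x in zip(phi, s0))
    pay = payoffs(S, phi)
    if any(v < 0 for v in pay):
        return False
    return cost < 0 or (cost == 0 and any(v > 0 for v in pay))


def is_state_price_vector(s0, S, psi):
    """Check psi_j > 0 for all j and S psi = S(0)."""
    if any(p <= 0 for p in psi):
        return False
    return all(
        sum(S[i][j] * psi[j] for j in range(len(psi))) == s0[i]
        for i in range(len(s0))
    )


s0 = [F(1), F(10)]                          # bond and stock prices today
S = [[F(1), F(1)], [F(20), F(15, 2)]]       # payoffs in up- and down-state
psi = [F(1, 5), F(4, 5)]
phi = [F(-10), F(1)]                        # borrow 10, buy one stock

print(is_state_price_vector(s0, S, psi))    # True
print(is_arbitrage(s0, S, phi))             # False: loses 2.5 in the down-state

S_bad = [[F(1), F(1)], [F(20), F(12)]]      # hypothetical mispriced stock
print(is_arbitrage(s0, S_bad, phi))         # True: zero cost, payoffs 10 and 2
```

In the mispriced market the stock pays more than its price in every state, so no strictly positive $\psi$ can satisfy both rows of $S\psi = S(0)$.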
Using a further normalization, we can clarify the link to our probabilistic setting. Given a state-price vector $\psi = (\psi_1, \ldots, \psi_N)$, we set $\psi_0 = \psi_1 + \ldots + \psi_N$ and for any state $\omega_j$ write $q_j = \psi_j / \psi_0$. We can now view $(q_1, \ldots, q_N)$ as probabilities and define a new probability measure on $\Omega$ by $\mathbb{Q}(\{\omega_j\}) = q_j$, $j = 1, \ldots, N$. Using this probability measure, we see that for each asset $i$ we have the relation

$$\frac{S_i(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_i(T, \omega_j) = \mathbb{E}_{\mathbb{Q}}(S_i(T)).$$

Hence the normalized price of the financial security $i$ is just its expected payoff under some specially chosen 'risk-neutral' probabilities. Observe that we didn't make any use of the specific probability measure $\mathbb{P}$ in our given probability space.

So far we have not specified anything about the denomination of prices. From a technical point of view, we could choose any asset $i$ as long as its price vector $(S_i(0), S_i(T, \omega_1), \ldots, S_i(T, \omega_N))'$ only contains positive entries,
and express all other prices in units of this asset. We say that we use this asset as numeraire. Let us emphasize again that arbitrage opportunities do not depend on the chosen numeraire. It turns out that appropriate choice of the numeraire facilitates the probability-theoretic analysis in complex settings, and we will discuss the choice of the numeraire in detail later on.

For simplicity, let us assume that asset 0 is a riskless bond paying one unit in all states $\omega \in \Omega$ at time $T$. This means that $S_0(T, \omega) = 1$ in all states of the world $\omega \in \Omega$. By the above analysis we must have

$$\frac{S_0(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_0(T, \omega_j) = \sum_{j=1}^{N} q_j \cdot 1 = 1,$$

and $\psi_0$ is the discount on riskless borrowing. Introducing an interest rate $r$, we must have $S_0(0) = \psi_0 = (1 + r)^{-T}$. We can now express the price of asset $i$ at time $t = 0$ as

$$S_i(0) = \sum_{j=1}^{N} q_j \frac{S_i(T, \omega_j)}{(1 + r)^T} = \mathbb{E}_{\mathbb{Q}}\left(\frac{S_i(T)}{(1 + r)^T}\right).$$

We rewrite this as

$$\frac{S_i(0)}{(1 + r)^0} = \mathbb{E}_{\mathbb{Q}}\left(\frac{S_i(T)}{(1 + r)^T}\right).$$
In the language of probability theory, we just have shown that the processes $S_i(t)/(1 + r)^t$, $t = 0, T$ are $\mathbb{Q}$-martingales. (Martingales are the probabilist's way of describing fair games: see Chapter 3.) It is important to notice that under the given probability measure $\mathbb{P}$ (which reflects an individual agent's belief or the market's belief) the processes $S_i(t)/(1 + r)^t$, $t = 0, T$ generally do not form $\mathbb{P}$-martingales.

We use this to shed light on the relationship of the probability measures $\mathbb{P}$ and $\mathbb{Q}$. Since $\mathbb{Q}(\{\omega\}) > 0$ for all $\omega \in \Omega$, the probability measures $\mathbb{P}$ and $\mathbb{Q}$ are equivalent, and (see Chapters 2 and 3) because of the argument above we call $\mathbb{Q}$ an equivalent martingale measure. So we arrived at yet another characterization of arbitrage:

There is no arbitrage if and only if there exists an equivalent martingale measure.
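The passage from state prices to an equivalent martingale measure can be sketched as follows. All numbers here are hypothetical and chosen only for illustration (a bond paying 1 and priced at 4/5, a stock priced at 10 paying 15 or 10, $T = 1$); the function name `martingale_measure` is likewise an assumption of this sketch.

```python
from fractions import Fraction as F


def martingale_measure(psi):
    """Normalize a state-price vector: psi0 = sum(psi), q_j = psi_j / psi0."""
    psi0 = sum(psi)
    return psi0, [p / psi0 for p in psi]


# Hypothetical market with T = 1: row 0 is the riskless bond, row 1 the stock.
S = [[F(1), F(1)], [F(15), F(10)]]
s0 = [F(4, 5), F(10)]
psi = [F(2, 5), F(2, 5)]      # satisfies S psi = S(0) for these numbers

psi0, q = martingale_measure(psi)
r = 1 / psi0 - 1              # from psi0 = (1 + r)^(-T) with T = 1

# Discounted expected payoff of each asset under Q.
discounted = [sum(qj * x for qj, x in zip(q, row)) / (1 + r) for row in S]
print(psi0, q, r, discounted)
```

For these numbers the discounted $\mathbb{Q}$-expectations coincide with today's prices, which is the martingale property stated above.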
We also see that risk-neutral pricing corresponds to using the expectation operator with respect to an equivalent martingale measure. This concept lies at the heart of stochastic (mathematical) finance and will be the golden thread (or roter Faden) throughout this book.

We now know how the given prices of our $(d + 1)$ financial assets should be related in order to exclude arbitrage opportunities, but how should we price a newly introduced financial instrument? We can represent this financial instrument by its random payments $\delta(T) = (\delta(T, \omega_1), \ldots, \delta(T, \omega_j), \ldots, \delta(T, \omega_N))'$ (observe that $\delta(T)$ is a vector in $\mathbb{R}^N$) at time $t = T$ and ask for its price $\delta(0)$ at time $t = 0$. The natural idea is to use an equivalent probability measure $\mathbb{Q}$ and set

$$\delta(0) = \mathbb{E}_{\mathbb{Q}}\left(\delta(T)/(1 + r)^T\right)$$

(recall that all time $t = 0$ and time $t = T$ prices are related in this way). Unfortunately, as we don't have a unique martingale measure in general, we cannot guarantee the uniqueness of the $t = 0$ price. Put another way, we know every equivalent martingale measure leads to a reasonable relative price for our newly created financial instrument, but which measure should one choose?

The easiest way out would be if there were only one equivalent martingale measure at our disposal - and surprisingly enough, the classical economic pricing theory puts us exactly in this situation! Given a set of financial assets on a market, the underlying question is whether we are able to price any new financial asset which might be introduced in the market, or equivalently whether we can replicate the cash-flow of the new asset by means of a portfolio of our original assets. If this is the case and we can replicate every new asset, the market is called complete.

In our financial market situation the question can be restated mathematically in terms of Euclidean geometry: do the vectors $S_i(T)$ span the whole $\mathbb{R}^N$? This leads to:
Theorem 1.4.2. Suppose there are no arbitrage opportunities. Then the model is complete if and only if the matrix equation

$$S'\varphi = \delta$$

has a solution $\varphi \in \mathbb{R}^{d+1}$ for any vector $\delta \in \mathbb{R}^N$.

Linear algebra immediately tells us that the above theorem means that the number of independent vectors in $S'$ must equal the number of states in $\Omega$. In an informal way, we can say that if the financial market model contains 2 ($N$) states of the world at time $T$, it allows for 1 ($N - 1$) sources of randomness (if there is only one state we know the outcome). Likewise we can view the numeraire asset as risk-free and all other assets as risky. We can now restate the above characterization of completeness in an informal (but intuitive) way as:

A financial market model is complete if it contains at least as many independent risky assets as sources of randomness.
The question of completeness can be expressed equivalently in probabilistic language (to be introduced in Chapter 3) as a question of representability of the relevant random variables or whether the $\sigma$-algebra they generate is the full $\sigma$-algebra.
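The linear-algebra criterion for completeness (the payoff vectors must span $\mathbb{R}^N$) amounts to a rank computation. The sketch below is a plain Gaussian elimination over exact fractions; the function names and the three-state market are hypothetical illustrations, not part of the text.

```python
from fractions import Fraction as F


def rank(matrix):
    """Rank of a matrix via Gauss-Jordan elimination with exact fractions."""
    rows = [[F(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def is_complete(S):
    """S has one row per asset and one column per state.

    S' phi = delta is solvable for every delta exactly when rank(S) = N.
    """
    return rank(S) == len(S[0])


two_states = [[1, 1], [20, F(15, 2)]]            # bond and stock, two states
three_states = [[1, 1, 1], [20, 10, F(15, 2)]]   # hypothetical third state added

print(is_complete(two_states))    # True: two independent assets, two states
print(is_complete(three_states))  # False: two assets cannot span three states
```

This check presupposes that the market is arbitrage-free, as in Theorem 1.4.2; it only tests the spanning condition.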
If a financial market model is complete, traditional economic theory shows that there exists a unique system of prices. If there exists only one system of prices, and every equivalent martingale measure gives rise to a price system, we can only have a unique equivalent martingale measure. (We will come back to this important question in Chapters 4 and 6.)

The (arbitrage-free) market is complete if and only if there exists a unique equivalent martingale measure.

Example. We give a more formal example of a binary single-period model. We have $d + 1 = 2$ assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. Keeping the interest rate $r = 0$, we obtain the following vectors (and matrices):

$$S(0) = \begin{bmatrix} S_0(0) \\ S_1(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 150 \end{bmatrix}, \quad S_0(T) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad S_1(T) = \begin{bmatrix} 180 \\ 90 \end{bmatrix}, \quad S = \begin{bmatrix} 1 & 1 \\ 180 & 90 \end{bmatrix}.$$

We try to solve (1.4) for state prices, i.e. we try to find a vector $\psi = (\psi_1, \psi_2)'$, $\psi_i > 0$, $i = 1, 2$ such that

$$\begin{bmatrix} 1 \\ 150 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 180 & 90 \end{bmatrix} \begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix}.$$

Now this has a solution

$$\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 2/3 \\ 1/3 \end{bmatrix},$$

hence showing that there are no arbitrage opportunities in our market model. Furthermore, since $\psi_1 + \psi_2 = 1$ we see that we already have computed risk-neutral probabilities, and so we have found an equivalent martingale measure $\mathbb{Q}$ with

$$\mathbb{Q}(\omega_1) = \frac{2}{3}, \quad \mathbb{Q}(\omega_2) = \frac{1}{3}.$$
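The state prices of this example can be recomputed with a two-by-two solver. The sketch below is illustrative only: `solve_2x2` is a hypothetical helper using Cramer's rule, applied to the matrix $S$ and price vector $S(0)$ of the example.

```python
from fractions import Fraction as F


def solve_2x2(a, b):
    """Solve a x = b for a 2x2 matrix a by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


S = [[F(1), F(1)], [F(180), F(90)]]   # rows: bond, stock; columns: states
s0 = [F(1), F(150)]

psi = solve_2x2(S, s0)                # state prices solving S psi = S(0)
psi0 = sum(psi)
q = [p / psi0 for p in psi]           # risk-neutral probabilities
print(psi, psi0, q)
```

Because the interest rate is zero, the state prices already sum to one and coincide with the risk-neutral probabilities.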
We now want to find out if the market is complete (or equivalently if there is a unique equivalent martingale measure). For that we introduce a new financial asset $\delta$ with random payments $\delta(T) = (\delta(T, \omega_1), \delta(T, \omega_2))'$. For the market to be complete, each such $\delta(T)$ must be in the linear span of $S_0(T)$ and $S_1(T)$. Now since $S_0(T)$ and $S_1(T)$ are linearly independent, their linear span is the whole $\mathbb{R}^2$ ($= \mathbb{R}^{|\Omega|}$) and $\delta(T)$ is indeed in the linear span. Hence we can find a replicating portfolio by solving

$$\begin{bmatrix} \delta(T, \omega_1) \\ \delta(T, \omega_2) \end{bmatrix} = \begin{bmatrix} 1 & 180 \\ 1 & 90 \end{bmatrix} \begin{bmatrix} \varphi_0 \\ \varphi_1 \end{bmatrix}.$$

Let us consider the example of a European option above. There, $\delta(T, \omega_1) = 30$, $\delta(T, \omega_2) = 0$ and the above becomes

$$\begin{bmatrix} 30 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 180 \\ 1 & 90 \end{bmatrix} \begin{bmatrix} \varphi_0 \\ \varphi_1 \end{bmatrix},$$

with solution $\varphi_0 = -30$ and $\varphi_1 = \frac{1}{3}$, telling us to borrow 30 units and buy $\frac{1}{3}$ stocks, which is exactly what we did to set up our portfolio above. Of course, an alternative way of showing market completeness is to recognize that (1.4) above admits only one solution for risk-neutral probabilities, showing the uniqueness of the martingale measure.

Example. Change of Numeraire. We choose a situation similar to the above example, i.e. we have $d + 1 = 2$ assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. But now we assume two risky assets (and no bond) as the financial instruments at our disposal. The price vectors are given by

$$S(0) = \begin{bmatrix} S_0(0) \\ S_1(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad S_0(T) = \begin{bmatrix} 3/4 \\ 5/4 \end{bmatrix}, \quad S_1(T) = \begin{bmatrix} 1/2 \\ 2 \end{bmatrix}, \quad S = \begin{bmatrix} 3/4 & 5/4 \\ 1/2 & 2 \end{bmatrix}.$$

We solve (1.4) and get state prices

$$\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 6/7 \\ 2/7 \end{bmatrix},$$
showing that there are no arbitrage opportunities in our market model. So we find an equivalent martingale measure $\mathbb{Q}$ with

$$\mathbb{Q}(\omega_1) = \frac{3}{4}, \quad \mathbb{Q}(\omega_2) = \frac{1}{4}.$$

Since we don't have a risk-free asset in our model, this normalization (this numeraire) is very artificial, and we shall use one of the modelled assets as a numeraire, say $S_0$. Under this normalization, we have the following asset prices (in terms of $S_0(t, \omega)$!):

$$\tilde{S}(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \tilde{S}_0(T) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \tilde{S}_1(T) = \begin{bmatrix} 2/3 \\ 8/5 \end{bmatrix}.$$

Since the model is arbitrage-free (recall that a change of numeraire doesn't affect the no-arbitrage property), we are able to find risk-neutral probabilities $\tilde{q}_1 = \frac{9}{14}$ and $\tilde{q}_2 = \frac{5}{14}$.

We now compute the prices for a call option to exchange $S_0$ for $S_1$. Define $Z(T) = \max\{S_1(T) - S_0(T), 0\}$ and the cash flow is given by

$$\begin{bmatrix} Z(T, \omega_1) \\ Z(T, \omega_2) \end{bmatrix} = \begin{bmatrix} 0 \\ 3/4 \end{bmatrix}.$$

There are no difficulties in finding the hedge portfolio $\varphi_0 = -\frac{3}{7}$ and $\varphi_1 = \frac{9}{14}$ and pricing the option as $Z_0 = \frac{3}{14}$.
We want to point out the following observation. Using $S_0$ as numeraire, we naturally write

$$\frac{Z(T, \omega)}{S_0(T, \omega)} = \max\left\{ \frac{S_1(T, \omega)}{S_0(T, \omega)} - 1,\, 0 \right\}$$

and see that this seemingly complicated option is (by use of the appropriate numeraire) equivalent to

$$\tilde{Z}(T, \omega) = \max\left\{ \tilde{S}_1(T, \omega) - 1,\, 0 \right\},$$

a European call on the asset $\tilde{S}_1$.
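The change-of-numeraire example can be retraced numerically. The sketch below is an illustration under the data of the example; the helper `solve_2x2` and all variable names are hypothetical. It prices the exchange option three ways: by state prices, by the expectation under the numeraire-$S_0$ probabilities, and by a hedge portfolio.

```python
from fractions import Fraction as F


def solve_2x2(a, b):
    """Solve a x = b for a 2x2 matrix a by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


s0 = [F(1), F(1)]                       # S_0(0), S_1(0)
S0_T = [F(3, 4), F(5, 4)]               # payoffs of asset 0 per state
S1_T = [F(1, 2), F(2)]                  # payoffs of asset 1 per state

# State prices from S psi = S(0).
psi = solve_2x2([S0_T, S1_T], s0)

# Asset 1 expressed in units of asset 0, and the matching probabilities.
S1_tilde = [a / b for a, b in zip(S1_T, S0_T)]
q1 = (s0[1] / s0[0] - S1_tilde[1]) / (S1_tilde[0] - S1_tilde[1])
q_tilde = [q1, 1 - q1]

# Exchange option Z(T) = max{S_1(T) - S_0(T), 0}.
Z = [max(a - b, 0) for a, b in zip(S1_T, S0_T)]
price_state = sum(p * z for p, z in zip(psi, Z))
price_numeraire = s0[0] * sum(q * z / b for q, z, b in zip(q_tilde, Z, S0_T))

# Hedge portfolio from S' phi = Z (rows of S' are the states).
phi = solve_2x2([[S0_T[0], S1_T[0]], [S0_T[1], S1_T[1]]], Z)
price_hedge = phi[0] * s0[0] + phi[1] * s0[1]

print(psi, q_tilde, phi, price_state, price_numeraire, price_hedge)
```

All three routes give the same value, $3/14$, as they must in an arbitrage-free complete market.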
1.4.3 A Few Financial-economic Considerations
The underlying principle for modelling economic behaviour of investors (or economic agents in general) is the maximization of expected utility, that is one assumes that agents have a utility function $U(\cdot)$ and base economic decisions on expected utility considerations. For instance, assuming a one-period model, an economic agent might have a utility function over current ($t = 0$) and future ($t = T$) values of consumption

$$U(c_0, c_T) = u(c_0) + \mathbb{E}(\beta u(c_T)), \tag{1.5}$$

where $c_t$ is consumption at time $t$ and $u(\cdot)$ is a standard utility function expressing

• nonsatiation - investors prefer more to less; $u$ is increasing;
• risk aversion - investors reject an actuarially fair gamble; $u$ is concave;
• and (maybe) decreasing absolute risk aversion and constant relative risk aversion.

Typical examples are power utility $u(x) = (x^\gamma - 1)/\gamma$, log utility $u(x) = \log(x)$ or quadratic utility $u(x) = x^2 + dx$ (for which only the first two properties are true for certain arguments).

Assume such an investor is offered at $t = 0$ at a price $p$ a random payoff $X$ at $t = T$. How much will he buy? Denote by $\xi$ the amount of the asset he chooses to buy and by $e_\tau$, $\tau = 0, T$, his original consumption. Thus, his problem is

$$\max_{\xi} \left[ u(c_0) + \mathbb{E}[\beta u(c_T)] \right]$$

such that

$$c_0 = e_0 - p\xi \quad \text{and} \quad c_T = e_T + X\xi.$$

Substituting the constraints into the objective function and differentiating with respect to $\xi$, we get the first-order condition
$$p u'(c_0) = \mathbb{E}\left[\beta u'(c_T) X\right] \quad \Leftrightarrow \quad p = \mathbb{E}\left[\beta \frac{u'(c_T)}{u'(c_0)} X\right].$$

The investor buys or sells more of the asset until this first-order condition is satisfied. If we use the stochastic discount factor

$$m = \beta \frac{u'(c_T)}{u'(c_0)},$$

we obtain the central equation

$$p = \mathbb{E}(mX). \tag{1.6}$$

We can use (under regularity conditions) the random variable $m$ to perform a change of measure, i.e. define a probability measure $\mathbb{P}^*$ using $\mathbb{P}^*(A) = \mathbb{E}^*(1_A) = \mathbb{E}(m 1_A)$. We write (1.6) under measure $\mathbb{P}^*$ and get $p = \mathbb{E}^*(X)$. Returning to the initial pricing equation, we see that under $\mathbb{P}^*$ the investor has the utility function $u(x) = x$. Such an investor is called risk-neutral, and consequently one often calls the corresponding measure a risk-neutral measure. An excellent discussion of these issues (and further much deeper results) is given in Cochrane (2001).
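The pricing equation $p = \mathbb{E}(mX)$ and the associated change of measure can be illustrated with a toy computation. Everything numerical below is a hypothetical assumption made for this sketch: log utility, $\beta = 1$, two equally likely states, and consumption levels chosen so that $\mathbb{E}(m) = 1$, which is what makes $\mathbb{P}^*(A) = \mathbb{E}(m 1_A)$ a probability measure here.

```python
from fractions import Fraction as F


def marginal_log_utility(c):
    """u'(c) for log utility u(c) = log(c)."""
    return 1 / c


def stochastic_discount_factor(beta, c0, cT):
    """m(w) = beta * u'(c_T(w)) / u'(c_0), state by state."""
    return [beta * marginal_log_utility(c) / marginal_log_utility(c0) for c in cT]


probs = [F(1, 2), F(1, 2)]   # hypothetical real-world probabilities
beta = F(1)                  # hypothetical discount factor
c0 = F(1)                    # consumption today
cT = [F(2), F(2, 3)]         # consumption at T in each state
X = [F(3), F(1)]             # payoff of the offered asset

m = stochastic_discount_factor(beta, c0, cT)
price = sum(pr * mj * xj for pr, mj, xj in zip(probs, m, X))   # p = E(mX)

# Change of measure: P*({w}) = P({w}) * m(w); sums to one since E(m) = 1 here.
p_star = [pr * mj for pr, mj in zip(probs, m)]
price_star = sum(ps * xj for ps, xj in zip(p_star, X))         # p = E*(X)

print(m, price, p_star, price_star)
```

The state with low consumption receives a discount factor above one, so $\mathbb{P}^*$ shifts weight toward it relative to $\mathbb{P}$, while both pricing formulas return the same price.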
Exercises
1.1 Draw payoff diagrams of the following portfolios:
1. A vertical spread: one option is bought and another sold, both on the same underlying stock and with same expiration date, but with different strikes.
2. A horizontal spread: one option is bought and another sold, both on the same underlying stock and with same strike, but with different expiration dates.
3. A straddle: a put and a call on the same underlying stock, with same strike and same expiration date.

1.2 A bear spread is created by buying a call with strike price $K_1$ and selling a call with strike price $K_2$. Both calls are on the same underlying stock and have the same expiry date $T$, but $K_2$ is greater than $K_1$ (i.e. $K_2 > K_1$).
1. Draw the payoff diagram of a bear spread.
2. What are the market expectations of a (rational) investor setting up a bear spread?
3. How does a change in the stock's volatility affect the value of the position?
4. Construct a bear spread using put options. Compare the initial costs with the initial costs of a bear spread using calls.

1.3 Show the following arbitrage bounds for call options.
1. Consider call options, which are identical (same underlying, same expiry date) except for the strike price. The following relations hold:
(a) $C(K_1) \ge C(K_2)$, if $K_2 \ge K_1$;
(b) $e^{-rT}(K_2 - K_1) \ge C(K_1) - C(K_2)$, if $K_2 \ge K_1$;
(c) $\lambda C(K_1) + (1 - \lambda) C(K_2) \ge C(\lambda K_1 + (1 - \lambda) K_2)$, if $K_2 \ge K_1$ and $0 \le \lambda \le 1$.
2. Consider call options, which are identical (same underlying, same strike price) except for the date of expiry, then $C(T_1) \ge C(T_2)$, if $T_1 \ge T_2$.

1.4 Consider the example at the end of §1.4:
1. Solve (1.4) for state prices.
2. Compute risk-neutral probabilities with numeraire $S_0$.
3. Show that the model leads to a complete market.
4. Price a call on $S_0$ and a call on $S_1$ by computing the appropriate expectations and by constructing a hedge portfolio.

1.5 Consider a one-period financial market model consisting of a bank account $B$ and a stock $S$ modelled on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with $\Omega = \{\omega_1, \omega_2\}$, $\mathcal{F} = \mathcal{P}(\Omega)$ and $\mathbb{P}$ a probability measure on $(\Omega, \mathcal{F})$ such that $\mathbb{P}(\{\omega_1\}) > 0$, $\mathbb{P}(\{\omega_2\}) > 0$. Suppose that the current asset prices (time $t = 0$) are $B(0) = 1$ and $S(0) = 5$ and that the terminal prices (time $t = 1$) are $B(1, \omega_1) = B(1, \omega_2) = 1 + r$ with $r = 1/9$, and $S(1, \omega_1) = 20/3$ and $S(1, \omega_2) = 40/9$.
1. Show that the model is free of arbitrage by computing
- a state-price vector;
- an equivalent martingale measure (using $B$ as numeraire).
2. Is the financial market model complete?
3. Consider a contingent claim $X$ with $X(\omega_1) = 7$ and $X(\omega_2) = 2$. Find the time $t = 0$ value of this claim by
- using the risk-neutral valuation formula;
- constructing a replicating portfolio.
2. Probability Background
No one can predict the future! All that can be done by way of prediction is to use what information is available as well as possible. Our task is to make the best quantitative statements we can about uncertainty - which in the financial context is usually uncertainty about the future. The basic tool to quantify uncertainty is a probability density or distribution. We will assume that most readers will be familiar with such things from an elementary course in probability and statistics; for a clear introduction see, e.g. Grimmett and Welsh (1986), or the first few chapters of Grimmett and Stirzaker (2001); Ross (1997), Resnick (2001), Durrett (1999), Ross (1997), Rosenthal (2000) are also useful.

We shall use the language of probability, or randomness, freely to describe situations involving uncertainty. Even in the simplest situations this requires comment: the outcome of a coin toss, for instance, is deterministic given full information about the initial conditions. It is our inability in practice to specify these accurately enough to use Newtonian dynamics to predict the outcome that legitimizes thinking of the outcome as random - and makes coin-tossing available as a useful symmetry-breaking mechanism, e.g. to start a football match.

We note also the important area of Bayesian statistics, in which the emphasis is not on randomness as such, but on uncertainty, and how to quantify it by using probability densities or distributions. This viewpoint has much to recommend it; for an excellent recent textbook treatment, see Robert (1997).

With this by way of preamble, or apology, we now feel free to make explicit use of the language of randomness and probability and the results, viewpoints and insights of probability theory.

The mathematical treatment of probability developed through the study of gambling games (and financial speculation is basically just a sophisticated form of gambling!).
Gambling games go back to antiquity, but the first well documented study of the mathematics of gambling dates from the correspondence of 1654 between Pascal and Fermat. Probability and statistics grew together during the next two and a half centuries; by 1900, a great deal of value was known about both, but neither had achieved a rigorous modern form, or even formulation. Indeed, the famous list of Hilbert problems - posed to the International Congress of Mathematicians in 1900 by the great German mathematician David Hilbert (1862-1943; see Appendix A) - contains (as part of Problem 6) putting probability theory onto a rigorous mathematical footing. The machinery needed to do this is measure theory, originated by Lebesgue (see §2.1, below); the successful harnessing of measure theory to provide a rigorous treatment of probability theory is due to the great Russian mathematician and probabilist Andrei Nikolaevich Kolmogorov (1903-1987), in his classic book Kolmogorov (1933).

We begin with a brief summary in Chapter 2 of what we shall need of probability theory in a static setting. For financial purposes, we need to go further and handle the dynamic setting of information unfolding with time. The framework needed to describe this is that of stochastic processes (or random processes); we turn to these in discrete time in Chapter 3, and in continuous time in Chapter 5.

Unless the reader is already familiar with measure theory, we recommend that he read Chapter 2 taking omitted proofs for granted: our strategy is to summarize what we need, and then use it. For the reader wishing to fill in the background here, or revise it, we recommend particularly Rudin (1976); many other good analysis texts are available. An excellent introductory measure-theoretic text is Williams (1991).

2.1 Measure
The language of modeling financial markets involves that of probability, which in turn involves that of measure theory. This originated with Henri Lebesgue (1875-1941), in his thesis, 'Intégrale, longueur, aire', Lebesgue (1902). We begin with defining a measure on $\mathbb{R}$ generalizing the intuitive notion of length.

The length $\mu(I)$ of an interval $I = (a, b)$, $[a, b]$, $[a, b)$ or $(a, b]$ should be $b - a$: $\mu(I) = b - a$. The length of the disjoint union $I = \bigcup_{r=1}^{n} I_r$ of intervals $I_r$ should be the sum of their lengths:
$$\mu\left(\bigcup_{r=1}^{n} I_r\right) = \sum_{r=1}^{n} \mu(I_r) \qquad \text{(finite additivity)}.$$
Consider now an infinite sequence $I_1, I_2, \ldots$ (ad infinitum) of disjoint intervals. Letting $n$ tend to $\infty$ suggests that length should again be additive over disjoint intervals:
$$\mu\left(\bigcup_{r=1}^{\infty} I_r\right) = \sum_{r=1}^{\infty} \mu(I_r) \qquad \text{(countable additivity)}.$$
For $I$ an interval, $A$ a subset of length $\mu(A)$, the length of the complement $I \setminus A := I \cap A^c$ of $A$ in $I$ should be
$$\mu(I \setminus A) = \mu(I) - \mu(A) \qquad \text{(complementation)}.$$
If $A \subseteq B$ and $B$ has length $\mu(B) = 0$, then $A$ should have length 0 also:
$$A \subseteq B \text{ and } \mu(B) = 0 \Rightarrow \mu(A) = 0 \qquad \text{(completeness)}.$$
The term 'countable' here requires comment. We must distinguish first between finite and infinite sets; then countable sets (like $\mathbb{N} = \{1, 2, 3, \ldots\}$) are the 'smallest', or 'simplest', infinite sets, as distinct from uncountable sets such as $\mathbb{R} = (-\infty, \infty)$.

Let $\mathcal{F}$ be the smallest class of sets $A \subset \mathbb{R}$ containing the intervals, closed under countable disjoint unions and complements, and complete (containing all subsets of sets of length 0 as sets of length 0). The above suggests - what Lebesgue showed - that length can be sensibly defined on the sets in $\mathcal{F}$ on the line, but on no others. There are others - but they are hard to construct (in technical language: the axiom of choice, or some variant of it such as Zorn's lemma, is needed to demonstrate the existence of non-measurable sets - but all such proofs are highly non-constructive). So: some but not all subsets of the line have a length. These are called the Lebesgue-measurable sets, and form the class $\mathcal{F}$ described above; length, defined on $\mathcal{F}$, is called Lebesgue measure $\mu$ (on the real line, $\mathbb{R}$).

Turning now to the general case, we make the above rigorous. Let $\Omega$ be a set.

Definition 2.1.1. A collection $\mathcal{A}_0$ of subsets of $\Omega$ is called an algebra on $\Omega$ if:
(i) $\Omega \in \mathcal{A}_0$,
(ii) $A \in \mathcal{A}_0 \Rightarrow A^c = \Omega \setminus A \in \mathcal{A}_0$,
(iii) $A, B \in \mathcal{A}_0 \Rightarrow A \cup B \in \mathcal{A}_0$.
Using this definition and induction, we can show that an algebra on $\Omega$ is a family of subsets of $\Omega$ closed under finitely many set operations.

Definition 2.1.2. An algebra $\mathcal{A}$ of subsets of $\Omega$ is called a $\sigma$-algebra on $\Omega$ if for any sequence $A_n \in \mathcal{A}$, $(n \in \mathbb{N})$, we have
$$\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}.$$
Such a pair $(\Omega, \mathcal{A})$ is called a measurable space.

Thus a $\sigma$-algebra on $\Omega$ is a family of subsets of $\Omega$ closed under any countable collection of set operations. The main examples of $\sigma$-algebras are $\sigma$-algebras generated by a class $\mathcal{C}$ of subsets of $\Omega$, i.e. $\sigma(\mathcal{C})$ is the smallest $\sigma$-algebra on $\Omega$ containing $\mathcal{C}$. The Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R})$ is the $\sigma$-algebra of subsets of $\mathbb{R}$ generated by the open intervals (equivalently, by half-lines such as $(-\infty, x]$ as $x$ varies in $\mathbb{R}$). As our aim is to define measures on collections of sets, we now turn to set functions.
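For a finite set $\Omega$ the generated $\sigma$-algebra $\sigma(\mathcal{C})$ can be computed by brute force, which may help to make the definition concrete. The following Python sketch is an illustration only: the function name `generate_sigma_algebra` and the example sets are our own assumptions, and it relies on the fact that on a finite $\Omega$ closure under complements and finite unions already gives closure under countable unions.

```python
from itertools import combinations


def generate_sigma_algebra(omega, generators):
    """Smallest family containing `generators` that is closed under
    complement and finite union; on a finite omega this is sigma(C)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # close under complements
        for a in current:
            comp = omega - a
            if comp not in family:
                family.add(comp)
                changed = True
        # close under pairwise unions
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family


omega = {1, 2, 3, 4}
sigma = generate_sigma_algebra(omega, [{1}, {2}])
# The atoms are {1}, {2} and {3, 4}, so sigma consists of all unions of atoms.
print(sorted(sorted(s) for s in sigma))
```

The loop simply repeats the two closure operations until nothing new appears, so it is only practical for very small $\Omega$.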
Definition 2.1.3. Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mu_0$ a non-negative set function $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ such that $\mu_0(\emptyset) = 0$. $\mu_0$ is called:
(i) additive, if $A, B \in \mathcal{A}_0$, $A \cap B = \emptyset \Rightarrow \mu_0(A \cup B) = \mu_0(A) + \mu_0(B)$,
(ii) countably additive, if whenever $(A_n)_{n \in \mathbb{N}}$ is a sequence of disjoint sets in $\mathcal{A}_0$ with $\bigcup A_n \in \mathcal{A}_0$ then
$$\mu_0\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu_0(A_n).$$

Definition 2.1.4. Let $(\Omega, \mathcal{A})$ be a measurable space. A countably additive map $\mu : \mathcal{A} \to [0, \infty]$ is called a measure on $(\Omega, \mathcal{A})$. The triple $(\Omega, \mathcal{A}, \mu)$ is called a measure space.

Recall that our motivating example was to define a measure on $\mathbb{R}$ consistent with our geometrical knowledge of length of an interval. That means we have a suitable definition of measure on a family of subsets of $\mathbb{R}$ and want to extend it to the generated $\sigma$-algebra. The measure-theoretic tool to do so is the Carathéodory extension theorem, for which the following lemma is an inevitable prerequisite.

Lemma 2.1.1. Let $\Omega$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $\Omega$, that is, a family of subsets of $\Omega$ closed under finite intersections: $I_1, I_2 \in \mathcal{I} \Rightarrow I_1 \cap I_2 \in \mathcal{I}$. Let $\mathcal{A} = \sigma(\mathcal{I})$ and suppose that $\mu_1$ and $\mu_2$ are finite measures on $(\Omega, \mathcal{A})$ (i.e. $\mu_1(\Omega) = \mu_2(\Omega) < \infty$) and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then $\mu_1 = \mu_2$ on $\mathcal{A}$.

Theorem 2.1.1 (Carathéodory Extension Theorem). Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mathcal{A} = \sigma(\mathcal{A}_0)$. If $\mu_0$ is a countably additive set function on $\mathcal{A}_0$, then there exists a measure $\mu$ on $(\Omega, \mathcal{A})$ such that $\mu = \mu_0$ on $\mathcal{A}_0$. If $\mu_0$ is finite, then the extension is unique.

For proofs of the above and further discussion, we refer the reader to Chapter 1 and Appendix 1 of Williams (1991) and the appendix in Durrett (1996a).

Returning to the motivating example $\Omega = \mathbb{R}$, we say that $A \subset \mathbb{R}$ belongs to the collection of sets $\mathcal{A}_0$ if $A$ can be written as
$$A = (a_1, b_1] \cup \ldots \cup (a_r, b_r],$$
where $r \in \mathbb{N}$, $-\infty \le a_1 < b_1 \le \ldots \le a_r < b_r \le \infty$. It can be shown that $\mathcal{A}_0$ is an algebra and $\sigma(\mathcal{A}_0) = \mathcal{B}$. For $A$ as above define
$$\mu_0(A) = \sum_{k=1}^{r} (b_k - a_k).$$
$\mu_0$ is well-defined and countably additive on $\mathcal{A}_0$. As intervals belong to $\mathcal{A}_0$ our geometric intuition of length is preserved. Now by Carathéodory's extension theorem there exists a measure $\mu$ on $(\Omega, \mathcal{B})$ extending $\mu_0$ on $(\Omega, \mathcal{A}_0)$. This $\mu$ is called Lebesgue measure.

With the same approach we can generalize:
(i) the area of rectangles $R = (a_1, b_1) \times (a_2, b_2)$ - with or without any of its perimeter included - given by $\mu(R) = (b_1 - a_1)(b_2 - a_2)$ to Lebesgue measure on Borel sets in $\mathbb{R}^2$;
(ii) the volume of cuboids $C = (a_1, b_1) \times (a_2, b_2) \times (a_3, b_3)$ given by
$$\mu(C) = (b_1 - a_1)(b_2 - a_2)(b_3 - a_3)$$
to Lebesgue measure on Borel sets in $\mathbb{R}^3$;
(iii) and similarly in $k$-dimensional Euclidean space $\mathbb{R}^k$. We start with the formula for a $k$-dimensional box,
$$\mu\left(\prod_{i=1}^{k} (a_i, b_i)\right) = \prod_{i=1}^{k} (b_i - a_i),$$
and obtain Lebesgue measure $\mu$, defined on $\mathcal{B}$, in $\mathbb{R}^k$.

We are mostly concerned with a special class of measures:

Definition 2.1.5. A measure $\mathbb{P}$ on a measurable space $(\Omega, \mathcal{A})$ is called a probability measure if
$$\mathbb{P}(\Omega) = 1.$$
The triple $(\Omega, \mathcal{A}, \mathbb{P})$ is called a probability space.
Observe that the above lemma and Carathéodory's extension theorem guarantee uniqueness if we construct a probability measure using the above procedure. For example, the unit cube $[0, 1]^k$ in $\mathbb{R}^k$ has (Lebesgue) measure 1. Using $\Omega = [0, 1]^k$ as the underlying set in the above construction, we find a unique probability (which equals length/area/volume if $k = 1/2/3$).

If a property holds everywhere except on a set of measure zero, we say it holds almost everywhere (a.e.). If it holds everywhere except on a set of probability zero, we say it holds almost surely (a.s.) (or, with probability one).

Roughly speaking, one uses addition in countable (or finite) situations, integration in uncountable ones. As the key measure-theoretic axiom of countable additivity above concerns addition, countably infinite situations (such as we meet in discrete time) fit well with measure theory. By contrast, uncountable situations (such as we meet in continuous time) do not - or at least, are considerably harder to handle. This is why the discrete-time setting of Chapters 3 and 4 is easier than, and precedes, the continuous-time setting of Chapters 5 and 6. Our strategy is to do as much as possible to introduce the key ideas - economic, financial and mathematical - in discrete time (which, because we work with a finite time-horizon, the expiry time $T$, is actually a finite situation), before treating the harder case of continuous time.

2.2 Integral
Let $(\Omega, \mathcal{A})$ be a measurable space. We want to define integration for a suitable class of real-valued functions.

Definition 2.2.1. Let $f : \Omega \to \mathbb{R}$. For $A \subset \mathbb{R}$ define $f^{-1}(A) = \{\omega \in \Omega : f(\omega) \in A\}$. $f$ is called ($\mathcal{A}$-) measurable if $f^{-1}(B) \in \mathcal{A}$ for all $B \in \mathcal{B}$.

Let $\mu$ be a measure on $(\Omega, \mathcal{A})$. Our aim now is to define, for suitable measurable functions, the (Lebesgue) integral with respect to $\mu$. We will denote this integral by
$$\mu(f) = \int_{\Omega} f \, d\mu = \int_{\Omega} f(\omega) \, \mu(d\omega).$$
We start with the simplest functions. If $A \in \mathcal{A}$ the indicator function $1_A(\omega)$ is defined by
$$1_A(\omega) = \begin{cases} 1, & \text{if } \omega \in A \\ 0, & \text{if } \omega \notin A. \end{cases}$$
Then define $\mu(1_A) = \mu(A)$. The next step extends the definition to simple functions. A function $f$ is called simple if it is a finite linear combination of indicators: $f = \sum_{i=1}^{n} c_i 1_{A_i}$ for constants $c_i$ and indicator functions $1_{A_i}$ of measurable sets $A_i$. One then extends the definition of the integral from indicator functions to simple functions by linearity:
$$\mu\left(\sum_{i=1}^{n} c_i 1_{A_i}\right) := \sum_{i=1}^{n} c_i \mu(1_{A_i}),$$
for constants $c_i$ and indicators of measurable sets $A_i$. If $f$ is a non-negative measurable function, we define
$$\mu(f) := \sup\{\mu(f_0) : f_0 \text{ simple}, f_0 \le f\}.$$
The key result in integration theory, which we must use here to guarantee that the integral for non-negative measurable functions is well-defined, is:
Theorem 2.2.1 (Monotone Convergence Theorem). If $(f_n)$ is a sequence of non-negative measurable functions such that $f_n$ is monotonically increasing (not necessarily strictly) to a function $f$ (which is then also measurable), then
$$\mu(f_n) \to \mu(f) \le \infty.$$
We quote that we can construct each non-negative measurable $f$ as the increasing limit of a sequence of simple functions $f_n$:
$$f_n(\omega) \uparrow f(\omega) \quad \text{for all } \omega \in \Omega \quad (n \to \infty), \quad f_n \text{ simple}.$$
Using the monotone convergence theorem we can thus obtain the integral of $f$ as
$$\mu(f) := \lim_{n \to \infty} \mu(f_n).$$
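The approximation by simple functions can be watched at work on a finite measure space, where every integral is a finite sum. The sketch below is a toy illustration under our own assumptions: the three-point space, its weights and the function `f` are invented, and the dyadic choice $f_n = \min(n, \lfloor 2^n f \rfloor / 2^n)$ is one standard way (not necessarily the one the quoted result has in mind) of producing an increasing sequence of simple functions.

```python
import math
from fractions import Fraction


def integral(f_values, mu):
    """Integral on a finite space: sum of f(w) * mu({w})."""
    return sum(f_values[w] * mu[w] for w in mu)


def dyadic_approximation(f_values, n):
    """Simple function f_n = min(n, floor(2^n f) / 2^n), increasing in n."""
    return {
        w: min(Fraction(n), Fraction(math.floor(v * 2**n), 2**n))
        for w, v in f_values.items()
    }


# Hypothetical measure (point masses) and non-negative function.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
f = {"a": Fraction(1, 3), "b": Fraction(5, 7), "c": Fraction(9, 4)}

exact = integral(f, mu)
approximations = [integral(dyadic_approximation(f, n), mu) for n in range(1, 30)]
for n in (1, 2, 5, 10, 29):
    print(n, float(approximations[n - 1]), float(exact))
```

The integrals of the approximations increase with $n$ and stay below $\mu(f)$, in line with the order-preserving property of the integral.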
Since $f_n$ increases in $n$, so does $\mu(f_n)$ (the integral is order-preserving), so $\mu(f_n)$ either increases to a finite limit, or diverges to $\infty$. In the first case, we say $f$ is (Lebesgue-) integrable with (Lebesgue-) integral $\mu(f) = \lim \mu(f_n)$.

Finally if $f$ is a measurable function that may change sign, we split it into its positive and negative parts, $f_{\pm}$:
$$f_+(\omega) := \max(f(\omega), 0), \qquad f_-(\omega) := -\min(f(\omega), 0),$$
$$f(\omega) = f_+(\omega) - f_-(\omega), \qquad |f(\omega)| = f_+(\omega) + f_-(\omega).$$
If both $f_+$ and $f_-$ are integrable, we say that $f$ is too, and define
$$\mu(f) := \mu(f_+) - \mu(f_-).$$
Thus, in particular, $|f|$ is also integrable, and
$$\mu(|f|) = \mu(f_+) + \mu(f_-).$$
The Lebesgue integral thus defined is, by construction, an absolute integral: $f$ is integrable iff $|f|$ is integrable. Thus, for instance, the well-known formula
$$\int_0^{\infty} \frac{\sin x}{x} \, dx = \frac{\pi}{2}$$
has no meaning for Lebesgue integrals, since $\int_1^{\infty} \frac{|\sin x|}{x} \, dx$ diverges to $+\infty$ like $\int_1^{\infty} \frac{1}{x} \, dx$. It has to be replaced by the limit relation
$$\int_0^{X} \frac{\sin x}{x} \, dx \to \frac{\pi}{2} \qquad (X \to \infty).$$
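The contrast between the convergent limit relation and the divergent absolute integral can be checked numerically. The following Python sketch uses a plain composite Simpson rule written for this illustration (the helper names `simpson`, `sinc` and `abs_sin_over_x` and the step sizes are our own choices, not part of the text); it is a rough numerical experiment, not a proof.

```python
import math


def simpson(g, a, b, steps):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def sinc(x):
    # sin(x)/x, extended continuously by 1 at x = 0
    return math.sin(x) / x if x != 0.0 else 1.0


def abs_sin_over_x(x):
    return abs(math.sin(x)) / x


for upper in (10.0, 100.0, 1000.0):
    signed = simpson(sinc, 0.0, upper, int(upper * 200))
    absolute = simpson(abs_sin_over_x, 1.0, upper, int(upper * 200))
    print(upper, signed, absolute)
print("pi/2 =", math.pi / 2)
```

The signed integrals settle near $\pi/2$, while the integrals of $|\sin x|/x$ keep growing, roughly like a multiple of $\log X$.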
The class of (Lebesgue-) integrable functions $f$ on $\Omega$ is written $\mathcal{L}(\Omega)$ or (for reasons explained below) $\mathcal{L}^1(\Omega)$ - abbreviated to $\mathcal{L}^1$ or $\mathcal{L}$.
For $p \ge 1$, the $\mathcal{L}^p$ space $\mathcal{L}^p(\Omega)$ on $\Omega$ is the space of measurable functions $f$ with $\mathcal{L}^p$-norm
$$\|f\|_p := \left(\int_{\Omega} |f|^p \, d\mu\right)^{1/p} < \infty.$$
The case $p = 2$ gives $\mathcal{L}^2$, which is particularly important as it is a Hilbert space (Appendix A).

Turning now to the special case $\Omega = \mathbb{R}^k$ we recall the well-known Riemann integral. Mathematics undergraduates are taught the Riemann integral (G.B. Riemann (1826-1866)) as their first rigorous treatment of integration theory - essentially this is just a rigorization of the school integral. It is much easier to set up than the Lebesgue integral, but much harder to manipulate. For finite intervals $[a, b]$, we quote:
(i) for any function $f$ Riemann-integrable on $[a, b]$, it is Lebesgue-integrable to the same value (but many more functions are Lebesgue integrable);
(ii) $f$ is Riemann-integrable on $[a, b]$ iff it is continuous a.e. on $[a, b]$.
Thus the question, 'Which functions are Riemann-integrable?' cannot be answered without the language of measure theory - which gives one the technically superior Lebesgue integral anyway.

Suppose that $F(x)$ is a non-decreasing function on $\mathbb{R}$:
$$F(x) \le F(y) \quad \text{if } x \le y.$$
Such functions can have at most countably many discontinuities, which are at worst jumps. We may without loss redefine $F$ at jumps so as to be right-continuous. We now generalize the starting points above:
• Measure. We take $\mu((a, b]) := F(b) - F(a)$.
• Integral. We have $\mu(1_{(a,b]}) = \mu((a, b]) = F(b) - F(a)$.
We may now follow through the successive extension procedures used above. We obtain:
• Lebesgue-Stieltjes measure $\mu_F$,
• Lebesgue-Stieltjes integral $\mu_F(f) = \int f \, d\mu_F$, or even $\int f \, dF$.
The approach generalizes to higher dimensions; we omit further details.

If instead of being monotone non-decreasing, $F$ is the difference of two such functions, $F = F_1 - F_2$, we can define the integrals $\int f \, dF_1$, $\int f \, dF_2$ as above, and then define
$$\int f \, dF = \int f \, d(F_1 - F_2) := \int f \, dF_1 - \int f \, dF_2.$$
If $[a, b]$ is a finite interval and $F$ is defined on $[a, b]$, a finite collection of points, $x_0, x_1, \ldots, x_n$ with $a = x_0 < x_1 < \ldots < x_n = b$, is called a partition
of $[a, b]$, $\mathcal{P}$ say. The sum $\sum_{i=1}^{n} |F(x_i) - F(x_{i-1})|$ is called the variation of $F$ over the partition. The least upper bound of this over all partitions $\mathcal{P}$ is called the variation of $F$ over the interval $[a, b]$, $V_a^b(F)$:
$$V_a^b(F) := \sup_{\mathcal{P}} \sum |F(x_i) - F(x_{i-1})|.$$
This may be $+\infty$; but if $V_a^b(F) < \infty$, $F$ is said to be of finite variation on $[a, b]$, $F \in FV_a^b$. If $F$ is of finite variation on all finite intervals, $F$ is said to be locally of finite variation, $F \in FV_{loc}$; if $F$ is of finite variation on the real line $\mathbb{R}$, $F$ is of finite variation, $F \in FV$.

We quote that the following two properties are equivalent:
• $F$ is locally of finite variation,
• $F$ can be written as the difference $F = F_1 - F_2$ of two monotone functions.
So the above procedure defines the integral $\int f \, dF$ when the integrator $F$ is of finite variation.

Remark 2.2.1. (i) When we pass from discrete to continuous time, we will need to handle both 'smooth' paths and paths that vary by jumps - of finite variation - and 'rough' ones - of unbounded variation but bounded quadratic variation;
(ii) The Lebesgue-Stieltjes integral $\int g(x) \, dF(x)$ is needed to express the expectation $\mathbb{E}\, g(X)$, where $X$ is a random variable with distribution function $F$ and $g$ a suitable function.
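The variation of a function over a partition, as defined before Remark 2.2.1, is easy to compute directly. The Python sketch below is only an illustration: the helper names `variation` and `uniform_partition` and the two test functions are our own choices. For a monotone $F$ every partition gives $F(b) - F(a)$, while for $\sin$ on $[0, 2\pi]$ fine partitions approach the variation 4.

```python
import math


def variation(F, partition):
    """Variation of F over a partition a = x_0 < x_1 < ... < x_n = b."""
    return sum(abs(F(x1) - F(x0)) for x0, x1 in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]


# Monotone F: the sum telescopes to F(b) - F(a) for any partition.
print(variation(math.exp, uniform_partition(0.0, 1.0, 50)), math.e - 1)

# sin on [0, 2*pi]: up by 1, down by 2, up by 1 again.
for n in (4, 10, 1000):
    print(n, variation(math.sin, uniform_partition(0.0, 2 * math.pi, n)))
```

Since the variation over the interval is a supremum over all partitions, each printed value is a lower bound for it.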
2.3 Probability
As we remarked in the introduction of this chapter, the mathematical theory of probability can be traced to 1654, to correspondence between Pascal (1623-1662) and Fermat (1601-1665). However, the theory remained both incomplete and non-rigorous until the 20th century. It turns out that the Lebesgue theory of measure and integral sketched above is exactly the machinery needed to construct a rigorous theory of probability adequate for modelling reality (option pricing etc.) for us. This was realized by Kolmogorov (1903-1987), whose classic book of 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung (Foundations of Probability Theory) inaugurated the modern era in probability, Kolmogorov (1933).

Recall from your first course on probability that, to describe a random experiment mathematically, we begin with the sample space $\Omega$, the set of all possible outcomes. Each point $\omega$ of $\Omega$, or sample point, represents a possible - random - outcome of performing the random experiment. For a set $A \subseteq \Omega$ of points $\omega$ we want to know the probability $\mathbb{P}(A)$ (or $\Pr(A)$, $\mathrm{pr}(A)$). We clearly want
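(1) $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
(2) $\mathbb{P}(A) \ge 0$ for all $A$,
(3) If $A_1, A_2, \ldots, A_n$ are disjoint, $\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \mathbb{P}(A_i)$ (finite additivity), which, as above, we will strengthen to
(3') If $A_1, A_2, \ldots$ (ad inf.) are disjoint,
$$\mathbb{P}\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mathbb{P}(A_i) \qquad \text{(countable additivity)}.$$
(4) If $B \subseteq A$ and $\mathbb{P}(A) = 0$, then $\mathbb{P}(B) = 0$ (completeness).

Then by (1) and (3) (with $A = A_1$, $\Omega \setminus A = A_2$),
$$\mathbb{P}(A^c) = \mathbb{P}(\Omega \setminus A) = 1 - \mathbb{P}(A).$$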
So the class $\mathcal{F}$ of subsets of $\Omega$ whose probabilities $\mathbb{P}(A)$ are defined (call such $A$ events) should be closed under countable, disjoint unions and complements, and contain the empty set $\emptyset$ and the whole space $\Omega$. So, $\mathcal{F}$ should be a $\sigma$-algebra and $\mathbb{P}$ should be defined on $\mathcal{F}$ according to Definition 2.1.5. Repeating this:

Definition 2.3.1. A probability space, or Kolmogorov triple, is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ satisfying Kolmogorov axioms (1), (2), (3'), (4) above.
A probability space is a mathematical model of a random experiment. Often we quantify outcomes $\omega$ of random experiments by defining a real-valued function $X$ on $\Omega$, i.e. $X : \Omega \to \mathbb{R}$. If such a function is measurable it is called a random variable.

Definition 2.3.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable (vector) $X$ is a function $X : \Omega \to \mathbb{R}$ ($X : \Omega \to \mathbb{R}^k$) such that $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \in \mathcal{B}(\mathbb{R})$ ($B \in \mathcal{B}(\mathbb{R}^k)$).

In particular we have for a random variable $X$ that $\{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$ for all $x \in \mathbb{R}$. Hence we can define the distribution function $F_X$ of $X$ by
$$F_X(x) := \mathbb{P}(\{\omega : X(\omega) \le x\}).$$
The smallest $\sigma$-algebra containing all the sets $\{\omega : X(\omega) \le x\}$ for all real $x$ (equivalently, $\{X < x\}$, $\{X \ge x\}$, $\{X > x\}$) is called the $\sigma$-algebra generated by $X$, written $\sigma(X)$. Thus,
$$X \text{ is } \mathcal{F}\text{-measurable (is a random variable) iff } \sigma(X) \subseteq \mathcal{F}.$$
The events in the $\sigma$-algebra generated by $X$ are the events $\{\omega : X(\omega) \in B\}$, where $B$ runs through the Borel $\sigma$-algebra on the line. When the (random) value $X(\omega)$ is known, we know which of these events have happened.
Interpretation. Think of $\sigma(X)$ as representing what we know when we know $X$, or in other words the information contained in $X$ (or in knowledge of $X$). This is reflected in the following result, due to J.L. Doob, which we quote:
$$\sigma(X) \subseteq \sigma(Y) \quad \text{if and only if} \quad X = g(Y)$$
for some measurable function $g$. For, knowing $Y$ means we know $X := g(Y)$ - but not vice versa, unless the function $g$ is one-to-one (injective), when the inverse function $g^{-1}$ exists, and we can go back via $Y = g^{-1}(X)$.

Note. An extended discussion of generated $\sigma$-algebras in the finite case is given in Dothan (1990), Chapter 3. Although technically avoidable, this is useful preparation for the general case, needed for continuous time.

A measure (§2.1) determines an integral (§2.2). A probability measure $\mathbb{P}$, being a special kind of measure (a measure of total mass one) determines a special kind of integral, called an expectation.

Definition 2.3.3. The expectation $\mathbb{E}$ of a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is defined by
$$\mathbb{E}X := \int_{\Omega} X \, d\mathbb{P}, \quad \text{or} \quad \int_{\Omega} X(\omega) \, d\mathbb{P}(\omega).$$
The expectation - also called the mean - describes the location of a distribution (and so is called a location parameter). Information about the scale of a distribution (the corresponding scale parameter) is obtained by considering the variance
$$\mathrm{Var}(X) := \mathbb{E}\left[(X - \mathbb{E}(X))^2\right] = \mathbb{E}(X^2) - (\mathbb{E}X)^2.$$
If $X$ is real-valued, say, with distribution function $F$, recall that $\mathbb{E}X$ is defined in your first course on probability by
$$\mathbb{E}X := \int x f(x) \, dx \quad \text{if } X \text{ has a density } f$$
or if $X$ is discrete, taking values $x_n$ ($n = 1, 2, \ldots$) with probability function $f(x_n)$ ($\ge 0$) ($\sum_{x_n} f(x_n) = 1$),
$$\mathbb{E}X := \sum x_n f(x_n).$$
These two formulae are the special cases (for the density and discrete cases) of the general formula
$$\mathbb{E}X := \int_{-\infty}^{\infty} x \, dF(x)$$
where the integral on the right is a Lebesgue-Stieltjes integral. This in turn agrees with the definition above, since if $F$ is the distribution function of $X$,
$$\int_{\Omega} X \, d\mathbb{P} = \int_{-\infty}^{\infty} x \, dF(x)$$
follows by the change of variable formula for the measure-theoretic integral, on applying the map $X : \Omega \to \mathbb{R}$ (we quote this: see any book on measure theory, e.g. Dudley (1989)).

Clearly the expectation operator $\mathbb{E}$ is linear. It even becomes multiplicative if we consider independent random variables.

Definition 2.3.4. Random variables $X_1, \ldots, X_n$ are independent if whenever $A_i \in \mathcal{B}$ for $i = 1, \ldots, n$ we have
$$\mathbb{P}\left(\bigcap_{i=1}^{n} \{X_i \in A_i\}\right) = \prod_{i=1}^{n} \mathbb{P}(\{X_i \in A_i\}).$$
Using Lemma 2.1.1 we can give a more tractable condition for independence:

Lemma 2.3.1. In order for $X_1, \ldots, X_n$ to be independent, it is necessary and sufficient that for all $x_1, \ldots, x_n \in (-\infty, \infty]$,
$$\mathbb{P}\left(\bigcap_{i=1}^{n} \{X_i \le x_i\}\right) = \prod_{i=1}^{n} \mathbb{P}(\{X_i \le x_i\}).$$
Now using the usual measure-theoretic steps (going from simple to integrable functions) it is easy to show:

Theorem 2.3.1 (Multiplication Theorem). If $X_1, \ldots, X_n$ are independent and $\mathbb{E}|X_i| < \infty$, $i = 1, \ldots, n$, then
$$\mathbb{E}\left(\prod_{i=1}^{n} X_i\right) = \prod_{i=1}^{n} \mathbb{E}(X_i).$$
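The multiplication theorem can be checked exactly on a small product space. The following Python sketch is a toy illustration under our own assumptions (two fair dice, with hypothetical random variables `X`, `Y` and `S`); it also shows that the product rule generally fails without independence.

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: product space with the product measure.
faces = range(1, 7)
prob = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}


def expectation(X):
    """E[X] as a finite sum over the sample space."""
    return sum(X(w) * p for w, p in prob.items())


X = lambda w: w[0]          # first die
Y = lambda w: w[1] ** 2     # a function of the second die only
S = lambda w: w[0] + w[1]   # depends on both dice, so not independent of X

lhs_indep = expectation(lambda w: X(w) * Y(w))
rhs_indep = expectation(X) * expectation(Y)
lhs_dep = expectation(lambda w: X(w) * S(w))
rhs_dep = expectation(X) * expectation(S)

print(lhs_indep, rhs_indep)   # equal: X and Y are independent
print(lhs_dep, rhs_dep)       # differ by Var(X)
```

For the dependent pair the gap $\mathbb{E}(XS) - \mathbb{E}X \, \mathbb{E}S$ is the covariance of $X$ and $S$, which here equals the variance of a single die.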
We now review the distributions we will mainly use in our models of financial markets.

Examples. (i) Bernoulli distribution. Recall our arbitrage-pricing example from §1.4. There we were given a stock with price $S(0)$ at time $t = 0$. We assumed that after a period of time $\Delta t$ the stock price could have only one of two values, either $S(\Delta t) = e^{u} S(0)$ with probability $p$ or $S(\Delta t) = e^{d} S(0)$ with probability $1 - p$ ($u, d \in \mathbb{R}$). Let $R(\Delta t) = r(1)$ be a random variable modeling the logarithm of the stock return over the period $[0, \Delta t]$; then
$$\mathbb{P}(r(1) = u) = p \quad \text{and} \quad \mathbb{P}(r(1) = d) = 1 - p.$$
We say that $r(1)$ is distributed according to a Bernoulli distribution. Clearly $\mathbb{E}(r(1)) = up + d(1 - p)$ and $\mathrm{Var}(r(1)) = u^2 p + d^2 (1 - p) - (\mathbb{E}(r(1)))^2$.
The standard case of a Bernoulli distribution is given by choosing $u = 1$, $d = 0$ (which is not a very useful choice in financial modeling).

(ii) Binomial distribution. If we consider the logarithm of the stock return over $n$ periods (of equal length), say over $[0, T]$, then subdividing into the periods $1, \ldots, n$ we have
$$R(T) = \log\left[\frac{S(T)}{S(0)}\right] = \log\left[\frac{S(T)}{S(T - \Delta t)} \cdots \frac{S(\Delta t)}{S(0)}\right] = \log\left[\frac{S(T)}{S(T - \Delta t)}\right] + \ldots + \log\left[\frac{S(\Delta t)}{S(0)}\right] = r(n) + \ldots + r(1).$$
Assuming that $r(i)$, $i = 1, \ldots, n$ are independent and that each $r(i)$ is Bernoulli distributed as above we have that $R(T) = \sum_{i=1}^{n} r(i)$ is binomially distributed. Linearity of the expectation operator and independence yield $\mathbb{E}(R(T)) = \sum_{i=1}^{n} \mathbb{E}(r(i))$ and $\mathrm{Var}(R(T)) = \sum_{i=1}^{n} \mathrm{Var}(r(i))$.

Again for the standard case one would use $u = 1$, $d = 0$. The shorthand notation for a binomial random variable $X$ is then $X \sim B(n, p)$ and we can compute
$$\mathbb{P}(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad \mathbb{E}(X) = np, \quad \mathrm{Var}(X) = np(1 - p).$$
(iii) Normal distribution. As we will show in the sequel the limit of a sequence of appropriate normalized binomial distributions is the (standard) normal distribution. We say a random variable $X$ is normally distributed with parameters $\mu, \sigma^2$, in short $X \sim N(\mu, \sigma^2)$, if $X$ has density function
$$f_{\mu, \sigma^2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right\}.$$
One can show that $\mathbb{E}(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$, and thus a normally distributed random variable is fully described by knowledge of its mean and variance.

Returning to the above example, one of the key results of this text will be that the limiting model of a sequence of financial markets with one-period asset returns modeled by a Bernoulli distribution is a model where the distribution of the logarithms of instantaneous asset returns is normal. That means $S(t + \Delta t)/S(t)$ is lognormally distributed (i.e. $\log(S(t + \Delta t)/S(t))$ is normally distributed).
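The moment formulas of examples (i) and (ii), and the closeness of a standardized binomial distribution to the normal one, can be checked numerically. The Python sketch below is an illustration under our own assumptions: the helper names, the choice $n = 400$, $p = 1/2$ and the comparison point are arbitrary, and a single probability comparison is of course no substitute for the limit theorem shown later in the text.

```python
import math


def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_return_moments(n, p, u, d):
    """Mean and variance of R(T) = r(1) + ... + r(n), with the r(i)
    independent and taking the values u, d with probabilities p, 1 - p."""
    mean_one = p * u + (1 - p) * d
    var_one = p * u**2 + (1 - p) * d**2 - mean_one**2
    return n * mean_one, n * var_one


n, p = 400, 0.5
mean = sum(k * binomial_pmf(n, p, k) for k in range(n + 1))
var = sum((k - mean) ** 2 * binomial_pmf(n, p, k) for k in range(n + 1))
print(mean, var, log_return_moments(n, p, 1, 0))

# P((X - np) / sqrt(np(1 - p)) <= 1) compared with the normal value.
threshold = n * p + math.sqrt(n * p * (1 - p))
binom_prob = sum(binomial_pmf(n, p, k) for k in range(n + 1) if k <= threshold)
print(binom_prob, normal_cdf(1.0))
```

With $u = 1$, $d = 0$ the function `log_return_moments` reduces to the binomial values $np$ and $np(1 - p)$, matching the direct summation over the probability function.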
Although rejected by many empirical studies (see Eberlein and Keller (1995) for a recent overview), such a model seems to be the standard in use among financial practitioners (and we will call it the standard model in the following). The main arguments against using normally distributed random variables for modeling log-returns (i.e. log-normal distributions for returns) are asymmetry and (semi-) heavy tails. We know that distributions of financial asset returns are generally rather close to being symmetric around
zero, but there is a definite tendency towards asymmetry. This may be explained by the fact that the markets react differently to positive as opposed to negative information (see Shephard (1996) §1.3.4). Since the normal distribution is symmetric it is not possible to incorporate this empirical fact in the standard model. Financial time series also suggest modeling by probability distributions whose densities behave for $x \to \pm\infty$ as
$$|x|^{\rho} \exp\{-\sigma |x|\}$$
with $\rho \in \mathbb{R}$, $\sigma > 0$. This means that we should replace the normal distribution with a distribution with heavier tails. Such a model would exhibit higher probabilities of extreme events, and the passage from ordinary observations (around the mean) to extreme observations would be more sudden. Among the suggested (classes of) distributions to address these facts are the hyperbolic distributions (see Eberlein and Keller (1995) and §2.12 below); more general distributions of normal inverse Gaussian type (see Barndorff-Nielsen (1998), Rydberg (1999), Rydberg (1997)) also appear to be very promising.

(iv) Poisson distribution. Sometimes we want to incorporate in our model of financial markets the possibility of sudden jumps. Using the standard model, we model the asset price process by a continuous stochastic process, so we need an additional process generating the jumps. To do this we use point processes in general and the Poisson process in particular. For a Poisson process, the probabilities of a jump (and of no jump, respectively) during a small interval $\Delta t$ are approximately
$$\mathbb{P}(\nu(1) = 1) \approx \lambda \Delta t \quad \text{and} \quad \mathbb{P}(\nu(1) = 0) \approx 1 - \lambda \Delta t,$$
where $\lambda$ is a positive constant called the rate or intensity. Modelling small intervals in such a way we get for the number of jumps $N(T) = \nu(1) + \ldots + \nu(n)$ in the interval $[0, T]$ the probability function
$$\mathbb{P}(N(T) = k) = \frac{e^{-\lambda T} (\lambda T)^k}{k!}, \quad k = 0, 1, \ldots$$
and we say the process $N(T)$ has a Poisson distribution with parameter $\lambda T$. We can show $\mathbb{E}(N(T)) = \lambda T$ and $\mathrm{Var}(N(T)) = \lambda T$.

Glossary. Table 2.1 summarizes the two parallel languages, measure-theoretic and probabilistic, which we have established.
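Returning to the Poisson example (iv): the passage from many small intervals, each carrying a jump with probability about $\lambda \Delta t$, to the Poisson probability function can be observed numerically. In the Python sketch below we assume, for illustration only, that the $n$ intervals behave like independent Bernoulli trials with success probability exactly $\lambda T / n$; the values of `lam` and `T` and the helper names are our own choices.

```python
import math


def poisson_pmf(lam_t, k):
    """P(N(T) = k) for a Poisson distribution with parameter lam_t."""
    return math.exp(-lam_t) * lam_t**k / math.factorial(k)


def binomial_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


lam, T = 2.0, 1.5   # illustrative intensity and time horizon


def max_gap(n, k_max=10):
    """Largest difference between the n-interval binomial probabilities
    and the Poisson probabilities for k = 0, ..., k_max."""
    dt = T / n
    return max(
        abs(binomial_pmf(n, lam * dt, k) - poisson_pmf(lam * T, k))
        for k in range(k_max + 1)
    )


for n in (10, 100, 10_000):
    print(n, max_gap(n))
```

The gap shrinks as the subdivision becomes finer, which is the numerical counterpart of the heuristic derivation above.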
Measure                      Probability
Integral                     Expectation
Measurable set               Event
Measurable function          Random variable
Almost-everywhere (a.e.)     Almost-surely (a.s.)

Table 2.1. Measure-theoretic and probabilistic languages

2.4 Equivalent Measures and Radon-Nikodym Derivatives

Given two measures $\mathbb{P}$ and $\mathbb{Q}$ defined on the same $\sigma$-algebra $\mathcal{F}$, we say that $\mathbb{P}$ is absolutely continuous with respect to $\mathbb{Q}$, written
$$\mathbb{P} \ll \mathbb{Q}$$
if $\mathbb{P}(A) = 0$, whenever $\mathbb{Q}(A) = 0$, $A \in \mathcal{F}$. We quote from measure theory the vitally important Radon-Nikodym theorem:
Theorem 2.4.1 (Radon-Nikodym). $\mathbb{P} \ll \mathbb{Q}$ iff there exists a ($\mathcal{F}$-) measurable function $f$ such that
$$\mathbb{P}(A) = \int_A f \, d\mathbb{Q} \quad \forall A \in \mathcal{F}.$$
(Note that since the integral of anything over a null set is zero, any $\mathbb{P}$ so representable is certainly absolutely continuous with respect to $\mathbb{Q}$ - the point is that the converse holds.)

Since $\mathbb{P}(A) = \int_A d\mathbb{P}$, this says that $\int_A d\mathbb{P} = \int_A f \, d\mathbb{Q}$ for all $A \in \mathcal{F}$. By analogy with the chain rule of ordinary calculus, we write $d\mathbb{P}/d\mathbb{Q}$ for $f$; then
$$\int_A d\mathbb{P} = \int_A \frac{d\mathbb{P}}{d\mathbb{Q}} \, d\mathbb{Q} \quad \forall A \in \mathcal{F}.$$
Symbolically,
$$\text{if } \mathbb{P} \ll \mathbb{Q}, \quad d\mathbb{P} = \frac{d\mathbb{P}}{d\mathbb{Q}} \, d\mathbb{Q}.$$
The measurable function (random variable) $d\mathbb{P}/d\mathbb{Q}$ is called the Radon-Nikodym derivative (RN-derivative) of $\mathbb{P}$ with respect to $\mathbb{Q}$.

If $\mathbb{P} \ll \mathbb{Q}$ and also $\mathbb{Q} \ll \mathbb{P}$, we call $\mathbb{P}$ and $\mathbb{Q}$ equivalent measures, written $\mathbb{P} \sim \mathbb{Q}$. Then $d\mathbb{P}/d\mathbb{Q}$ and $d\mathbb{Q}/d\mathbb{P}$ both exist, and
$$\frac{d\mathbb{P}}{d\mathbb{Q}} = 1 \Big/ \frac{d\mathbb{Q}}{d\mathbb{P}}.$$
=
44
2. Probability Background
IP Q iff IP, Q have the same null sets, iff IP, Q have the same a.s. sets, iff IP, Q have the same sets of positive measure. Far from being an abstract theoretical result, the Radon-Nikodym theo rem is of key practical importance, in two ways: It is the key to the concept of conditioning (§2.5, §2.6 below ) , which is of central importance throughout, The concept of equivalent measures is central to the key idea of mathemati cal finance, risk-neutrality, and hence to its main results, the Black-Scholes formula, fundamental theorem of asset pricing, etc. The key to all this is that prices should be the discounted expected values under an equiva lent martingale measure. Thus equivalent measures, and the operation of change of measure, are of central economic and financial importance. We shall return to this later in connection with the main mathematical result on change of measure, Girsanov's theorem ( see §5.7) . ""'
•
•
2.5 Conditional Expectation
For basic events define
$$\mathbb{P}(A|B) := \mathbb{P}(A \cap B)/\mathbb{P}(B) \quad \text{if } \mathbb{P}(B) > 0. \qquad (2.1)$$
From this definition, we get the multiplication rule
$$\mathbb{P}(A \cap B) = \mathbb{P}(A|B)\mathbb{P}(B).$$
Using the partition equation $\mathbb{P}(B) = \sum_n \mathbb{P}(B|A_n)\mathbb{P}(A_n)$ with $(A_n)$ a finite or countable partition of $\Omega$, we get the Bayes rule
$$\mathbb{P}(A_i|B) = \frac{\mathbb{P}(A_i)\mathbb{P}(B|A_i)}{\sum_j \mathbb{P}(A_j)\mathbb{P}(B|A_j)}.$$
We can always write $\mathbb{P}(A) = \mathbb{E}(1_A)$ with $1_A(\omega) = 1$ if $\omega \in A$ and $1_A(\omega) = 0$ otherwise. Then the above can be written
$$\mathbb{E}(1_A|B) = \frac{\mathbb{E}(1_A 1_B)}{\mathbb{P}(B)}. \qquad (2.2)$$
This suggests defining, for suitable random variables $X$, the $\mathbb{P}$-average of $X$ over $B$ as
$$\mathbb{E}(X|B) = \frac{\mathbb{E}(X 1_B)}{\mathbb{P}(B)}. \qquad (2.3)$$
Consider now discrete random variables $X$ and $Y$. Assume $X$ takes values $x_1, \ldots, x_m$ with probabilities $f_1(x_i) > 0$, $Y$ takes values $y_1, \ldots, y_n$ with probabilities $f_2(y_j) > 0$, while the vector $(X, Y)$ takes values $(x_i, y_j)$ with probabilities $f(x_i, y_j) > 0$. Then the marginal distributions are
$$f_1(x_i) = \sum_{j=1}^{n} f(x_i, y_j) \quad \text{and} \quad f_2(y_j) = \sum_{i=1}^{m} f(x_i, y_j).$$
We can use the standard definition above for the events $\{Y = y_j\}$ and $\{X = x_i\}$ to get
$$\mathbb{P}(Y = y_j | X = x_i) = \frac{\mathbb{P}(X = x_i, Y = y_j)}{\mathbb{P}(X = x_i)} = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
Thus conditional on $X = x_i$ (given the information $X = x_i$), $Y$ takes on the values $y_1, \ldots, y_n$ with (conditional) probabilities
$$f_{Y|X}(y_j | x_i) = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
So we can compute its expectation as usual:
$$\mathbb{E}(Y | X = x_i) = \sum_j y_j f_{Y|X}(y_j | x_i) = \frac{\sum_j y_j f(x_i, y_j)}{f_1(x_i)}.$$
Now define the random variable $Z = \mathbb{E}(Y|X)$, the conditional expectation of $Y$ given $X$, as follows: if $X(\omega) = x_i$, then $Z(\omega) = \mathbb{E}(Y | X = x_i) = z_i$ (say). Observe that in this case $Z$ is given by a 'nice' function of $X$. However, a more abstract property also holds true. Since $Z$ is constant on the sets $\{X = x_i\}$ it is $\sigma(X)$-measurable (these sets generate the $\sigma$-algebra). Furthermore
$$\int_{\{X = x_i\}} Z \, d\mathbb{P} = z_i \mathbb{P}(X = x_i) = \sum_j y_j f_{Y|X}(y_j | x_i) \mathbb{P}(X = x_i) = \sum_j y_j \mathbb{P}(Y = y_j; X = x_i) = \int_{\{X = x_i\}} Y \, d\mathbb{P}.$$
Since the $\{X = x_i\}$ generate $\sigma(X)$, this implies
$$\int_G Z \, d\mathbb{P} = \int_G Y \, d\mathbb{P} \quad \forall G \in \sigma(X).$$
=
fYl x (y l x) : = f(x,(x)y) . h
46
2. Probability Background
Its expectation is 00
lE(YIX
=
x)
1
=
y fY l x ( Y l x)dy
=
- 00
So we define
{
J�oo Yf (x , y )dy . JI (x)
lE(YIX x) if JI (x) > 0 if JI (x) 0, o and call c(X) the conditional expectation of Y given X, denoted by lE(YIX) . Observe that on sets with probability zero (i.e { X(w ) Xj JI (x) O}) the choice of c(x) is arbitrary, hence lE(YIX ) is only defined up to a set of probability zerOj we speak of different versions in such cases. With this definition we again find c(x)
=
=
=
=
w :
1 c(X) dJP 1 YdJP =
G
VG
G
Indeed, for sets G with G = {w : X(w ) Fubini's theorem
E B}
E
=
a(X) .
with B a Borel set, we find by
00
1 c(X)dJP 1 I B (x)c(x)JI (x)dx =
G
- 00 00
=
1 I B (x)JI (x) 1 yfyl x (y l x)dydx
- 00 00
=
00
00
- 00
1 1 I B (x) yf (x , y) dydx 1 YdJP. =
G
- 00 - 00
Now these sets $G$ generate $\sigma(X)$ and by a standard technique (the $\pi$-systems lemma, see Williams (2001), §2.3) the claim is true for all $G \in \sigma(X)$.
Example. Bivariate Normal Distribution, $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$.
\[ \mathbb{E}(Y | X = x) = \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(x - \mu_1), \]
the familiar regression line of statistics (linear model) - see Exercise 2.6.
General Case. Here, we follow Kolmogorov's construction using the Radon-Nikodym theorem. Suppose that $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$, $\mathcal{G} \subset \mathcal{F}$. If $Y$ is a non-negative random variable with $\mathbb{E}Y < \infty$, then
\[ Q(G) := \int_G Y\,d\mathbb{P} \qquad (G \in \mathcal{G}) \]
is non-negative, $\sigma$-additive - because
\[ \int_G Y\,d\mathbb{P} = \sum_n \int_{G_n} Y\,d\mathbb{P} \]
if $G = \bigcup_n G_n$, $G_n$ disjoint - and defined on the $\sigma$-algebra $\mathcal{G}$, so it is a measure on $\mathcal{G}$. If $\mathbb{P}(G) = 0$, then $Q(G) = 0$ also (the integral of anything over a null set is zero), so $Q \ll \mathbb{P}$. By the Radon-Nikodym theorem, there exists a Radon-Nikodym derivative of $Q$ with respect to $\mathbb{P}$ on $\mathcal{G}$, which is $\mathcal{G}$-measurable. Following Kolmogorov, we call this Radon-Nikodym derivative the conditional expectation of $Y$ given (or conditional on) $\mathcal{G}$, $\mathbb{E}(Y|\mathcal{G})$, whose existence we now have established. For $Y$ that changes sign, split into $Y = Y^+ - Y^-$, and define $\mathbb{E}(Y|\mathcal{G}) := \mathbb{E}(Y^+|\mathcal{G}) - \mathbb{E}(Y^-|\mathcal{G})$. We summarize:
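Definition 2.5.1. Let $Y$ be a random variable with $\mathbb{E}(|Y|) < \infty$ and $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We call a random variable $Z$ a version of the conditional expectation $\mathbb{E}(Y|\mathcal{G})$ of $Y$ given $\mathcal{G}$, and write $Z = \mathbb{E}(Y|\mathcal{G})$, a.s., if
(i) $Z$ is $\mathcal{G}$-measurable;
(ii) $\mathbb{E}(|Z|) < \infty$;
(iii) for every set $G$ in $\mathcal{G}$, we have
\[ \int_G Y\,d\mathbb{P} = \int_G Z\,d\mathbb{P} \qquad \forall G \in \mathcal{G}. \tag{2.4} \]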
Notation. Suppose $\mathcal{G} = \sigma(X_1, \ldots, X_n)$. Then
\[ \mathbb{E}(Y|\mathcal{G}) = \mathbb{E}(Y | \sigma(X_1, \ldots, X_n)) =: \mathbb{E}(Y | X_1, \ldots, X_n), \]
and one can compare the general case with the motivating examples above.
To see the intuition behind conditional expectation, consider the following situation. Assume an experiment has been performed, i.e. $\omega \in \Omega$ has been realized. However, the only information we have is the set of values $X(\omega)$ for every $\mathcal{G}$-measurable random variable $X$. Then $Z(\omega) = \mathbb{E}(Y|\mathcal{G})(\omega)$ is the expected value of $Y(\omega)$ given this information.
We used the traditional approach to define conditional expectation via the Radon-Nikodym theorem. Alternatively, one can use Hilbert space projection theory (Neveu (1975) and Jacod and Protter (2000) follow this route). Indeed, for $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ one can show that the conditional expectation $Z = \mathbb{E}(Y|\mathcal{G})$ is the least-squares-best $\mathcal{G}$-measurable predictor of $Y$: amongst all $\mathcal{G}$-measurable random variables it minimizes the quadratic distance, i.e.
\[ \mathbb{E}[(Y - \mathbb{E}(Y|\mathcal{G}))^2] = \min\{\mathbb{E}[(Y - X)^2] : X\ \mathcal{G}\text{-measurable}\}. \]
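The least-squares characterization can be illustrated in the simplest setting $\mathcal{G} = \sigma(X)$ with $X$ discrete, where every $\mathcal{G}$-measurable predictor is a function $g(X)$. The sketch below is only an illustration under assumed data: the joint probability table is hypothetical, and the perturbations tried are an arbitrary small selection rather than an exhaustive search.

```python
from fractions import Fraction as Fr

# Hypothetical joint pmf of (X, Y); illustrative numbers only.
xs, ys = [0, 1], [1, 2, 3]
f = {
    (0, 1): Fr(1, 10), (0, 2): Fr(2, 10), (0, 3): Fr(1, 10),
    (1, 1): Fr(3, 10), (1, 2): Fr(1, 10), (1, 3): Fr(2, 10),
}
f1 = {x: sum(f[(x, y)] for y in ys) for x in xs}
cond_mean = {x: sum(y * f[(x, y)] for y in ys) / f1[x] for x in xs}


def mse(g):
    """E[(Y - g(X))^2] for a predictor g given as a dict x -> value."""
    return sum((y - g[x]) ** 2 * f[(x, y)] for x in xs for y in ys)


best = mse(cond_mean)

# Perturb the conditional mean on each atom {X = x}: the error can only grow,
# and it grows by exactly E[(g(X) - E(Y|X))^2] (orthogonality of the residual).
for d0 in (Fr(-1, 2), Fr(0), Fr(1, 3)):
    for d1 in (Fr(-1, 4), Fr(0), Fr(1, 5)):
        g = {0: cond_mean[0] + d0, 1: cond_mean[1] + d1}
        assert mse(g) - best == d0 ** 2 * f1[0] + d1 ** 2 * f1[1]
        assert mse(g) >= best

print(best)
```

The exact identity checked inside the loop is the Pythagorean decomposition behind the projection point of view.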
Note. 1. To check that something is a conditional expectation: we have to check that it integrates the right way over the right sets (i.e., as in (2.4)).
2. From (2.4): if two things integrate the same way over all sets $B \in \mathcal{G}$, they have the same conditional expectation given $\mathcal{G}$.
3. For notational convenience, we shall pass between $\mathbb{E}(Y|\mathcal{G})$ and $\mathbb{E}_{\mathcal{G}}Y$ at will.
4. The conditional expectation thus defined coincides with any we may have already encountered - in regression or multivariate analysis, for example. However, this may not be immediately obvious. The conditional expectation defined above - via $\sigma$-algebras and the Radon-Nikodym theorem - is rightly called by Williams 'the central definition of modern probability' (see Williams (1991), p.84). It may take a little getting used to. As with all important but non-obvious definitions, it proves its worth in action: see §2.6 below for properties of conditional expectations, and Chapter 3 for its use in studying stochastic processes, particularly martingales (which are defined in terms of conditional expectations).
We now discuss the fundamental properties of conditional expectation. From the definition, linearity of conditional expectation follows from the linearity of the integral. Further properties are given by
Proposition 2.5.1.
1. If $\mathcal{G} = \{\emptyset, \Omega\}$, $\mathbb{E}(Y | \{\emptyset, \Omega\}) = \mathbb{E}Y$.
2. If $\mathcal{G} = \mathcal{F}$, $\mathbb{E}(Y|\mathcal{F}) = Y$ $\mathbb{P}$-a.s.
3. If $Y$ is $\mathcal{G}$-measurable, $\mathbb{E}(Y|\mathcal{G}) = Y$ $\mathbb{P}$-a.s.
4. Positivity. If $X \geq 0$, then $\mathbb{E}(X|\mathcal{G}) \geq 0$ $\mathbb{P}$-a.s.
5. Taking out what is known. If $Y$ is $\mathcal{G}$-measurable and bounded, $\mathbb{E}(YZ|\mathcal{G}) = Y\mathbb{E}(Z|\mathcal{G})$ $\mathbb{P}$-a.s.
6. Tower property. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y|\mathcal{G}) | \mathcal{G}_0] = \mathbb{E}[Y|\mathcal{G}_0]$ a.s.
7. Conditional mean formula. $\mathbb{E}[\mathbb{E}(Y|\mathcal{G})] = \mathbb{E}Y$ $\mathbb{P}$-a.s.
8. Role of independence. If $Y$ is independent of $\mathcal{G}$, $\mathbb{E}(Y|\mathcal{G}) = \mathbb{E}Y$ a.s.
9. Conditional Jensen formula. If $c : \mathbb{R} \to \mathbb{R}$ is convex, and $\mathbb{E}|c(X)| < \infty$, then $\mathbb{E}(c(X)|\mathcal{G}) \geq c(\mathbb{E}(X|\mathcal{G}))$.
Proof. 1. Here $\mathcal{G} = \{\emptyset, \Omega\}$ is the smallest possible $\sigma$-algebra (any $\sigma$-algebra of subsets of $\Omega$ contains $\emptyset$ and $\Omega$), and represents 'knowing nothing'. We have to check (2.4) for $G = \emptyset$ and $G = \Omega$. For $G = \emptyset$ both sides are zero; for $G = \Omega$ both sides are $\mathbb{E}Y$.
2. Here $\mathcal{G} = \mathcal{F}$ is the largest possible $\sigma$-algebra, and represents 'knowing everything'. We have to check (2.4) for all sets $G \in \mathcal{F}$. The only integrand that integrates like $Y$ over all sets is $Y$ itself, or a function agreeing with $Y$ except on a set of measure zero.
Note. When we condition on $\mathcal{F}$ ('knowing everything'), we know $Y$ (because we know everything). There is thus no uncertainty left in $Y$ to average out, so taking the conditional expectation (averaging out remaining randomness) has no effect, and leaves $Y$ unaltered.
3. Recall that $Y$ is always $\mathcal{F}$-measurable (this is the definition of $Y$ being a random variable). For $\mathcal{G} \subset \mathcal{F}$, $Y$ may not be $\mathcal{G}$-measurable, but if it is, the proof above applies with $\mathcal{G}$ in place of $\mathcal{F}$.
Note. To say that $Y$ is $\mathcal{G}$-measurable is to say that $Y$ is known given $\mathcal{G}$ - that is, when we are conditioning on $\mathcal{G}$. Then $Y$ is no longer random (being known when $\mathcal{G}$ is given), and so counts as a constant when the conditioning is performed.
4. Let $Z$ be a version of $\mathbb{E}(X|\mathcal{G})$. If $\mathbb{P}(Z < 0) > 0$, then for some $n$, the set $G := \{Z < -n^{-1}\} \in \mathcal{G}$ and $\mathbb{P}(\{Z < -n^{-1}\}) > 0$. Thus
\[ 0 \leq \mathbb{E}(X \mathbf{1}_G) = \mathbb{E}(Z \mathbf{1}_G) < -n^{-1}\mathbb{P}(G) < 0, \]
which contradicts the positivity of $X$.
5. First, consider the case when $Y$ is discrete. Then $Y$ can be written as
\[ Y = \sum_{n=1}^{N} b_n \mathbf{1}_{B_n} \]
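for constants $b_n$ and events $B_n \in \mathcal{G}$. Then for any $B \in \mathcal{G}$, $B \cap B_n \in \mathcal{G}$ also (as $\mathcal{G}$ is a $\sigma$-algebra), and using linearity and (2.4):
\begin{align*}
\int_B Y \mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} &= \int_B \Big( \sum_{n=1}^{N} b_n \mathbf{1}_{B_n} \Big) \mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} = \sum_{n=1}^{N} b_n \int_{B \cap B_n} \mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} \\
&= \sum_{n=1}^{N} b_n \int_{B \cap B_n} Z\,d\mathbb{P} = \int_B \sum_{n=1}^{N} b_n \mathbf{1}_{B_n} Z\,d\mathbb{P} = \int_B YZ\,d\mathbb{P}.
\end{align*}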
Since this holds for all $B \in \mathcal{G}$, the result holds by (2.4). For the general case, we approximate to a general random variable $Y$ by a sequence of discrete random variables $Y_n$, for each of which the result holds as just proved. We omit details of the proof here, which involves the standard approximation steps based on the monotone convergence theorem from measure theory (see e.g. Williams (1991), p.90, proof of (j)). We are thus left to show that $\mathbb{E}(|ZY|) < \infty$, which follows from the assumption that $Y$ is bounded and $Z \in L^1$.
6. $\mathbb{E}_{\mathcal{G}_0}\mathbb{E}_{\mathcal{G}}Y$ is $\mathcal{G}_0$-measurable, and for $C \in \mathcal{G}_0 \subset \mathcal{G}$, using the definition of $\mathbb{E}_{\mathcal{G}_0}$, $\mathbb{E}_{\mathcal{G}}$:
\[ \int_C \mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}}Y]\,d\mathbb{P} = \int_C \mathbb{E}_{\mathcal{G}}Y\,d\mathbb{P} = \int_C Y\,d\mathbb{P}. \]
So $\mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}}Y]$ satisfies the defining relation for $\mathbb{E}_{\mathcal{G}_0}Y$. Being also $\mathcal{G}_0$-measurable, it is $\mathbb{E}_{\mathcal{G}_0}Y$ (a.s.).
We also have:
6'. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y|\mathcal{G}_0) | \mathcal{G}] = \mathbb{E}[Y|\mathcal{G}_0]$ a.s.
Proof. $\mathbb{E}[Y|\mathcal{G}_0]$ is $\mathcal{G}_0$-measurable, so $\mathcal{G}$-measurable as $\mathcal{G}_0 \subset \mathcal{G}$, so $\mathbb{E}[\,\cdot\,|\mathcal{G}]$ has no effect on it, by 3.
Note. 6, 6' are the two forms of the iterated conditional expectations property. When conditioning on two $\sigma$-algebras, one larger (finer), one smaller (coarser), the coarser rubs out the effect of the finer, either way round. This may be thought of as the coarse-averaging property: we shall use this term interchangeably with the iterated conditional expectations property; Williams (1991) uses the term tower property.
7. Take $\mathcal{G}_0 = \{\emptyset, \Omega\}$ in 6 and use 1.
8. If $Y$ is independent of $\mathcal{G}$, $Y$ is independent of $\mathbf{1}_B$ for every $B \in \mathcal{G}$. So by (2.4) and linearity,
\[ \int_B \mathbb{E}(Y|\mathcal{G})\,d\mathbb{P} = \int_B Y\,d\mathbb{P} = \int_\Omega \mathbf{1}_B Y\,d\mathbb{P} = \mathbb{E}(\mathbf{1}_B Y) = \mathbb{E}(\mathbf{1}_B)\mathbb{E}(Y) = \int_B \mathbb{E}Y\,d\mathbb{P}, \]
using the multiplication theorem for independent random variables. Since this holds for all $B \in \mathcal{G}$, the result follows by (2.4).
9. Recall (see e.g. Williams (1991), §6.6a, §9.7h, §9.8h), that for every convex function there exists a countable sequence $((a_n, b_n))$ of points in $\mathbb{R}^2$ such that
\[ c(x) = \sup_n (a_n x + b_n), \qquad x \in \mathbb{R}. \]
For each fixed $n$ we use 4 to see from $c(X) \geq a_n X + b_n$ that
\[ \mathbb{E}[c(X)|\mathcal{G}] \geq a_n \mathbb{E}(X|\mathcal{G}) + b_n. \]
So,
\[ \mathbb{E}[c(X)|\mathcal{G}] \geq \sup_n (a_n \mathbb{E}(X|\mathcal{G}) + b_n) = c(\mathbb{E}(X|\mathcal{G})). \]
□
Remark 2.5.1. If in 6, 6' we take $\mathcal{G} = \mathcal{G}_0$, we obtain:
\[ \mathbb{E}[\mathbb{E}(X|\mathcal{G}) | \mathcal{G}] = \mathbb{E}(X|\mathcal{G}). \]
Thus the map $X \to \mathbb{E}(X|\mathcal{G})$ is idempotent: applying it twice is the same as applying it once. Hence we may identify the conditional expectation operator as a projection. This point of view, which is powerful and useful, is developed in Appendix B.
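On a finite sample space, a $\sigma$-algebra is generated by a partition and conditional expectation is just block-wise averaging, so properties 6, 6', 7 and the idempotence of Remark 2.5.1 can be verified directly. The following sketch assumes a hypothetical setup (a fair die, $Y(\omega) = \omega^2$, and two nested partitions chosen for illustration); the helper `cond_exp` is not from the text.

```python
from fractions import Fraction as Fr

# Hypothetical finite sample space: six equally likely outcomes.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fr(1, 6) for w in omega}
Y = {w: Fr(w * w) for w in omega}  # an arbitrary random variable


def cond_exp(V, partition):
    """E(V | G), where G is the sigma-algebra generated by a finite partition."""
    out = {}
    for block in partition:
        p_block = sum(prob[w] for w in block)
        avg = sum(V[w] * prob[w] for w in block) / p_block
        for w in block:
            out[w] = avg  # constant on each block, hence G-measurable
    return out


fine = [[1, 2], [3, 4], [5, 6]]    # generates G
coarse = [[1, 2, 3, 4], [5, 6]]    # generates G_0, a sub-sigma-algebra of G

E_G = cond_exp(Y, fine)
E_G0 = cond_exp(Y, coarse)

assert cond_exp(E_G, coarse) == E_G0   # 6:  E[E(Y|G)|G_0] = E[Y|G_0]
assert cond_exp(E_G0, fine) == E_G0    # 6': E[E(Y|G_0)|G] = E[Y|G_0]
assert cond_exp(E_G, fine) == E_G      # idempotence (Remark 2.5.1)
mean_Y = sum(Y[w] * prob[w] for w in omega)
assert sum(E_G[w] * prob[w] for w in omega) == mean_Y  # 7: E[E(Y|G)] = EY
```

In either order of conditioning, the result is the coarser average, which is the 'coarse-averaging' reading of the tower property.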
2.6 Modes of Convergence
So far, we have dealt with one probability measure - or its expectation operator - at a time. We shall, however, have many occasions to consider a whole sequence of them, converging (in a suitable sense) to some limiting probability measure. Such situations arise, for example, whenever we approximate a financial model in continuous time (such as the continuous-time Black-Scholes model of §6.2) by a sequence of models in discrete time (such as the discrete-time Black-Scholes model of §4.6).
In the stochastic-process setting - such as the passage from discrete to continuous Black-Scholes models mentioned above - we need concepts beyond those we have to hand, which we develop later. We confine ourselves here to setting out what we need to discuss convergence of random variables, in the various senses that are useful.
The first idea that occurs to one is to use the ordinary convergence concept in this new setting, of random variables: then if $X_n$, $X$ are random variables,
\[ X_n \to X \qquad (n \to \infty) \]
would be taken literally - as if the $X_n$, $X$ were non-random. For instance, if $X_n$ is the observed frequency of heads in a long series of $n$ independent tosses of a fair coin, $X = 1/2$ the expected frequency, then the above in this case would be the man-in-the-street's idea of the 'law of averages'. It turns out that the above statement is false in this case, taken literally: some qualification is needed. However, the qualification needed is absolutely the minimal one imaginable: one merely needs to exclude a set of probability zero - that is, to assert convergence on a set of probability one ('almost surely'), rather than everywhere.
Definition 2.6.1. If $X_n$, $X$ are random variables, we say $X_n$ converges to $X$ almost surely -
\[ X_n \to X \qquad (n \to \infty) \quad \text{a.s.} \]
- if $X_n \to X$ with probability one - that is, if
\[ \mathbb{P}(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1. \]
The loose idea of the 'law of averages' has as its precise form a statement on convergence almost surely. This is Kolmogorov's strong law of large numbers, see e.g. Williams (1991), §12.10, which is quite difficult to prove.
Weaker convergence concepts are also useful: they may hold under weaker conditions, or they may be easier to prove.
Definition 2.6.2. If $X_n$, $X$ are random variables, we say that $X_n$ converges to $X$ in probability -
\[ X_n \to X \qquad (n \to \infty) \quad \text{in probability} \]
- if, for all $\epsilon > 0$,
\[ \mathbb{P}(\{\omega : |X_n(\omega) - X(\omega)| > \epsilon\}) \to 0 \qquad (n \to \infty). \]
It turns out that convergence almost surely implies convergence in probability, but not in general conversely. Thus almost-sure convergence is a stronger convergence concept than convergence in probability. This comparison is reflected in the form the 'law of averages' takes for convergence in probability: this is called the weak law of large numbers, which as its name implies is a weaker form of the strong law of large numbers. It is correspondingly much easier to prove: indeed, we shall prove it in §2.8 below.
Recall the $L^p$-spaces of $p$th-power integrable functions (§2.2). We similarly define the $L^p$-spaces of $p$th-power integrable random variables: if $p \geq 1$ and $X$ is a random variable with
\[ \|X\|_p := (\mathbb{E}|X|^p)^{1/p} < \infty, \]
we say that $X \in L^p$ (or $L^p(\Omega, \mathcal{F}, \mathbb{P})$ to be precise). For $X_n$, $X \in L^p$, there is a natural convergence concept: we say that $X_n$ converges to $X$ in $L^p$, or in $p$th mean,
\[ X_n \to X \quad \text{in } L^p, \]
if $\|X_n - X\|_p \to 0$ $(n \to \infty)$, that is, if
\[ \mathbb{E}(|X_n - X|^p) \to 0 \qquad (n \to \infty). \]
The cases $p = 1, 2$ are particularly important: if $X_n \to X$ in $L^1$, we say that $X_n \to X$ in mean; if $X_n \to X$ in $L^2$ we say that $X_n \to X$ in mean square. Convergence in $p$th mean is not directly comparable with convergence almost surely (of course, we have to restrict to random variables in $L^p$ for the comparison even to be meaningful): neither implies the other. Both, however, imply convergence in probability.
All the modes of convergence discussed so far involve the values of random variables. Often, however, it is only the distributions of random variables that matter. In such cases, the natural mode of convergence is the following:
Definition 2.6.3. We say that random variables $X_n$ converge to $X$ in distribution if the distribution functions of $X_n$ converge to that of $X$ at all points of continuity of the latter:
\[ X_n \to X \quad \text{in distribution,} \]
if
\[ \mathbb{P}(\{X_n \leq x\}) \to \mathbb{P}(\{X \leq x\}) \qquad (n \to \infty) \]
for all points $x$ at which the right-hand side is continuous.
The restriction to continuity points $x$ of the limit seems awkward at first, but it is both natural and necessary. It is also quite weak: note that the function $x \mapsto \mathbb{P}(\{X \leq x\})$, being monotone in $x$, is continuous except for at most countably many jumps. The set of continuity points is thus uncountable: 'most' points are continuity points.
Convergence in distribution is (by far) the weakest of the modes of convergence introduced so far: convergence in probability implies convergence in distribution, but not conversely. There is, however, a partial converse (which we shall need in §2.8): if the limit $X$ is constant (non-random), convergence in probability and in distribution are equivalent.
Weak Convergence. If $\mathbb{P}_n$, $\mathbb{P}$ are probability measures, we say that
\[ \mathbb{P}_n \to \mathbb{P} \qquad (n \to \infty) \quad \text{weakly} \]
if
\[ \int f\,d\mathbb{P}_n \to \int f\,d\mathbb{P} \qquad (n \to \infty) \tag{2.5} \]
for all bounded continuous functions $f$. This definition is given a full-length book treatment in Billingsley (1968), and we refer to this for background and details. For ordinary (real-valued) random variables, weak convergence of their probability measures is the same as convergence in distribution of their distribution functions. However, the weak-convergence definition above applies equally, not just to this one-dimensional case, or to the finite-dimensional (vector-valued) setting, but also to infinite-dimensional settings such as arise in convergence of stochastic processes. We shall need such a framework in the passage from discrete- to continuous-time Black-Scholes models.
2.7 Convolution and Characteristic Functions
The most basic operation on numbers is addition; the most basic operation on random variables is addition of independent random variables. If $X$, $Y$ are independent, with distribution functions $F$, $G$, and
\[ Z := X + Y, \]
let $Z$ have distribution function $H$. Then since $X + Y = Y + X$ (addition is commutative), $H$ depends on $F$ and $G$ symmetrically. We call $H$ the convolution (German: Faltung) of $F$ and $G$, written
\[ H = F * G. \]
Suppose first that $X$, $Y$ have densities $f$, $g$. Then
\[ H(z) = \mathbb{P}(Z \leq z) = \mathbb{P}(X + Y \leq z) = \iint_{\{(x, y) : x + y \leq z\}} f(x) g(y)\,dx\,dy, \]
since by independence of $X$ and $Y$ the joint density of $X$ and $Y$ is the product $f(x)g(y)$ of their separate (marginal) densities, and to find probabilities in the density case we integrate the joint density over the relevant region. Thus
\[ H(z) = \int_{-\infty}^{\infty} f(x) \Big\{ \int_{-\infty}^{z - x} g(y)\,dy \Big\}\,dx = \int_{-\infty}^{\infty} f(x) G(z - x)\,dx. \]
If
\[ h(z) := \int_{-\infty}^{\infty} f(x) g(z - x)\,dx \]
(and of course symmetrically with $f$ and $g$ interchanged), then integrating we recover the equation above, after interchanging the order of integration. This is legitimate, as the integrals are non-negative, by Fubini's theorem, which we quote from measure theory, see e.g. Williams (1991), §8.2. This shows that if $X$, $Y$ are independent with densities $f$, $g$, and $Z = X + Y$, then $Z$ has density $h$, where
\[ h(x) = \int_{-\infty}^{\infty} f(x - y) g(y)\,dy. \]
We write
\[ h = f * g, \]
and call the density $h$ the convolution of the densities $f$ and $g$.
If $X$, $Y$ do not have densities, the argument above may still be taken as far as
\[ H(z) = \mathbb{P}(Z \leq z) = \mathbb{P}(X + Y \leq z) = \int_{-\infty}^{\infty} F(z - y)\,dG(y) \]
(and, again, symmetrically with $F$ and $G$ interchanged), where the integral on the right is the Lebesgue-Stieltjes integral of §2.2. We again write
\[ H = F * G, \]
and call the distribution function $H$ the convolution of the distribution functions $F$ and $G$.
In sum: addition of independent random variables corresponds to convolution of distribution functions or densities.
Now we frequently need to add (or average) lots of independent random variables: for example, when forming sample means in statistics - when the bigger the sample size is, the better. But convolution involves integration, so adding $n$ independent random variables involves $n - 1$ integrations, and this is awkward to do for large $n$. One thus seeks a way to transform distributions so as to make the awkward operation of convolution as easy to handle as the operation of addition of independent random variables that gives rise to it.
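For integer-valued random variables the integrals above become sums, and the convolution can be computed directly. The following sketch is a discrete analogue only (probability mass functions in place of densities), with fair dice as an assumed example; the function name `convolve` is introduced here for illustration.

```python
from fractions import Fraction as Fr


def convolve(p, q):
    """pmf of X + Y for independent X ~ p and Y ~ q (dicts value -> probability).

    Discrete analogue of h(z) = integral of f(x) g(z - x) dx.
    """
    h = {}
    for x, px in p.items():
        for y, qy in q.items():
            h[x + y] = h.get(x + y, Fr(0)) + px * qy
    return h


die = {k: Fr(1, 6) for k in range(1, 7)}  # hypothetical example: a fair die

two = convolve(die, die)        # total of two dice
three_a = convolve(two, die)    # (die * die) * die
three_b = convolve(die, two)    # die * (die * die)

assert sum(two.values()) == 1
assert three_a == three_b       # the order of convolution does not matter
print(two[7], three_a[10])
```

Adding $n$ dice this way needs $n - 1$ nested convolutions, which is the awkwardness that motivates the transform approach.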
Definition 2.7.1. If $X$ is a random variable with distribution function $F$, its characteristic function $\phi$ (or $\phi_X$ if we need to emphasize $X$) is
\[ \phi(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x) \qquad (t \in \mathbb{R}). \]
Note. Here $i := \sqrt{-1}$. All other numbers - $t$, $x$ etc. - are real; all expressions involving $i$ such as $e^{itx}$, $\phi(t) = \mathbb{E}(e^{itX})$ are complex numbers.
The characteristic function takes convolution into multiplication: if $X$, $Y$ are independent,
\[ \phi_{X+Y}(t) = \phi_X(t)\phi_Y(t). \]
For, as $X$, $Y$ are independent, so are $e^{itX}$ and $e^{itY}$ for any $t$, so by the multiplication theorem (Theorem 2.3.1),
\[ \mathbb{E}(e^{it(X+Y)}) = \mathbb{E}(e^{itX} \cdot e^{itY}) = \mathbb{E}(e^{itX}) \cdot \mathbb{E}(e^{itY}), \]
as required.
We list some properties of characteristic functions that we shall need.
1. $\phi(0) = 1$. For, $\phi(0) = \mathbb{E}(e^{i \cdot 0 \cdot X}) = \mathbb{E}(e^0) = \mathbb{E}(1) = 1$.
2. $|\phi(t)| \leq 1$ for all $t \in \mathbb{R}$.
Proof.
\[ |\phi(t)| = \Big| \int_{-\infty}^{\infty} e^{itx}\,dF(x) \Big| \leq \int_{-\infty}^{\infty} |e^{itx}|\,dF(x) = \int_{-\infty}^{\infty} 1\,dF(x) = 1. \]
Thus, in particular the characteristic function always exists (the integral defining it is always absolutely convergent). This is a crucial advantage, far outweighing the disadvantage of having to work with complex rather than real numbers (the nuisance value of which is in fact slight).
3. $\phi$ is continuous (indeed, $\phi$ is uniformly continuous).
Proof.
\[ |\phi(t + u) - \phi(t)| = \Big| \int_{-\infty}^{\infty} \{e^{i(t+u)x} - e^{itx}\}\,dF(x) \Big| = \Big| \int_{-\infty}^{\infty} e^{itx}(e^{iux} - 1)\,dF(x) \Big| \leq \int_{-\infty}^{\infty} |e^{iux} - 1|\,dF(x), \]
for all $t$. Now as $u \to 0$, $|e^{iux} - 1| \to 0$, and $|e^{iux} - 1| \leq 2$. The bound on the right tends to zero as $u \to 0$ by Lebesgue's dominated convergence theorem (which we quote from measure theory: see e.g. Williams (1991), §5.9), giving continuity; the uniformity follows as the bound holds uniformly in $t$.
4. (Uniqueness theorem): $\phi$ determines the distribution function $F$ uniquely.
Technically, $\phi$ is the Fourier-Stieltjes transform of $F$, and here we are quoting the uniqueness property of this transform. Were uniqueness not to hold, we would lose information on taking characteristic functions, and so $\phi$ would not be useful.
5. (Continuity theorem): If $X_n$, $X$ are random variables with distribution functions $F_n$, $F$ and characteristic functions $\phi_n$, $\phi$, then convergence of $\phi_n$ to $\phi$,
\[ \phi_n(t) \to \phi(t) \qquad (n \to \infty) \quad \text{for all } t \in \mathbb{R}, \]
is equivalent to convergence in distribution of $X_n$ to $X$. This result is due to Lévy; see e.g. Williams (1991), §18.1.
6. Moments. Suppose $X$ has $k$th moment: $\mathbb{E}|X|^k < \infty$. Take the Taylor (power-series) expansion of $e^{itx}$ as far as the $k$th power term:
\[ e^{itx} = 1 + itx + \ldots + (itx)^k / k! + o(t^k), \]
where '$o(t^k)$' denotes an error term of smaller order than $t^k$ for small $t$. Now replace $x$ by $X$, and take expectations. By linearity, we obtain
\[ \phi(t) = \mathbb{E}(e^{itX}) = 1 + it\,\mathbb{E}X + \ldots + \frac{(it)^k}{k!}\mathbb{E}(X^k) + e(t), \]
where the error term $e(t)$ is the expectation of the error terms (now random, as $X$ is random) obtained above (one for each value $X(\omega)$ of $X$). It is not obvious, but it is true, that $e(t)$ is still of smaller order than $t^k$ for $t \to 0$:
\[ \phi(t) = 1 + it\,\mathbb{E}X + \ldots + \frac{(it)^k}{k!}\mathbb{E}(X^k) + o(t^k) \qquad (t \to 0). \]
We shall need the case $k = 2$ in dealing with the central limit theorem below.
Examples. 1. Standard Normal Distribution, $N(0, 1)$. For the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}} \exp\{-\frac{1}{2}x^2\}$, one has, by the process of 'completing the square' (familiar from when one first learns to solve quadratic equations!),
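\begin{align*}
\int_{-\infty}^{\infty} e^{tx} f(x)\,dx &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\Big\{tx - \frac{1}{2}x^2\Big\}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\Big\{-\frac{1}{2}(x - t)^2 + \frac{1}{2}t^2\Big\}\,dx \\
&= \exp\Big\{\frac{1}{2}t^2\Big\} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\Big\{-\frac{1}{2}(x - t)^2\Big\}\,dx.
\end{align*}
The second factor on the right is 1 (it has the form of a normal integral). This gives the integral on the left as $\exp\{\frac{1}{2}t^2\}$.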
Now replace $t$ by $it$ (legitimate by analytic continuation, which we quote from complex analysis, see e.g. Burkill and Burkill (1970)). The right becomes $\exp\{-\frac{1}{2}t^2\}$. The integral on the left becomes the characteristic function of the standard normal density - which we have thus now identified (and will need below in §2.8).
2. General Normal Distribution, $N(\mu, \sigma^2)$. Consider the transformation $x \mapsto \mu + \sigma x$. Applied to a random variable $X$, this adds $\mu$ to the mean (a change of location), and multiplies the variance by $\sigma^2$ (a change of scale). One can check that if $X$ has the standard normal density above, then $\mu + \sigma X$ has density
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Big\{-\frac{1}{2}(x - \mu)^2 / \sigma^2\Big\}, \]
and characteristic function
\[ \mathbb{E}e^{it(\mu + \sigma X)} = \exp\{i\mu t\}\,\mathbb{E}(e^{i(\sigma t)X}) = \exp\{i\mu t\} \exp\Big\{-\frac{1}{2}(\sigma t)^2\Big\} = \exp\Big\{i\mu t - \frac{1}{2}\sigma^2 t^2\Big\}. \]
Thus the general normal density and its characteristic function are
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Big\{-\frac{1}{2}(x - \mu)^2 / \sigma^2\Big\}, \qquad \phi(t) = \exp\Big\{i\mu t - \frac{1}{2}\sigma^2 t^2\Big\}. \]
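3. Poisson Distribution, $P(\lambda)$. Here, the probability mass function is
\[ f(k) := \mathbb{P}(X = k) = e^{-\lambda}\lambda^k / k! \qquad (k = 0, 1, 2, \ldots). \]
The characteristic function is thus
\[ \phi(t) = \mathbb{E}(e^{itX}) = \sum_{k=0}^{\infty} \frac{e^{-\lambda}\lambda^k}{k!} e^{itk} = e^{-\lambda} \sum_{k=0}^{\infty} (\lambda e^{it})^k / k! = e^{-\lambda} \exp\{\lambda e^{it}\} = \exp\{-\lambda(1 - e^{it})\}. \]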
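The closed form for the Poisson characteristic function, and the way characteristic functions turn convolution into multiplication, can be checked numerically. This is a rough sketch with assumed parameter values ($\lambda = 2.5$, split as $1 + 1.5$) and a truncated series; it relies on the standard fact that the sum of independent Poisson variables with parameters $\lambda_1$, $\lambda_2$ is Poisson with parameter $\lambda_1 + \lambda_2$.

```python
import cmath
import math


def poisson_cf_series(lam, t, terms=100):
    """E(e^{itX}) for X ~ Poisson(lam), summing the pmf series term by term."""
    total = 0j
    pk = math.exp(-lam)  # P(X = 0)
    for k in range(terms):
        total += pk * cmath.exp(1j * t * k)
        pk *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return total


def poisson_cf_closed(lam, t):
    """Closed form exp{-lam (1 - e^{it})}."""
    return cmath.exp(-lam * (1 - cmath.exp(1j * t)))


for t in (-2.0, 0.0, 0.7, 3.0):
    # series and closed form agree (the truncation error is negligible here)
    assert abs(poisson_cf_series(2.5, t) - poisson_cf_closed(2.5, t)) < 1e-12
    # property 2: modulus at most 1
    assert abs(poisson_cf_closed(2.5, t)) <= 1 + 1e-15
    # independent Poisson(1) + Poisson(1.5): product of characteristic functions
    product = poisson_cf_closed(1.0, t) * poisson_cf_closed(1.5, t)
    assert abs(product - poisson_cf_closed(2.5, t)) < 1e-12
```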
2.8 The Central Limit Theorem
Readers of this book will be well aware that
\[ \Big(1 + \frac{x}{n}\Big)^n \to e^x \qquad (n \to \infty) \quad \forall x \in \mathbb{R}. \]
This is the formula governing the passage from discrete to continuous compound interest. Invest one pound (or dollar) for one year at $100x\%$ p.a.; with interest compounded $n$ times p.a., our capital after one year is $(1 + \frac{x}{n})^n$. With continuous compounding, our capital after one year is the exponential $e^x$: exponential growth corresponds to continuously compounded interest.
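A quick numerical illustration of the compound-interest limit, assuming an interest rate of 5% p.a. chosen purely for the example:

```python
import math


def capital(x, n):
    """Capital after one year from one unit at rate x, compounded n times p.a."""
    return (1 + x / n) ** n


x = 0.05  # hypothetical rate: 5% per annum
for n in (1, 12, 365, 10**6):
    # shortfall relative to continuous compounding shrinks roughly like 1/n
    print(n, capital(x, n), math.exp(x) - capital(x, n))
```

For positive $x$ the discretely compounded capital increases with $n$ and stays below $e^x$.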
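We need two extensions: the formula still holds with $x \in \mathbb{R}$ replaced by a complex number $z \in \mathbb{C}$:
\[ \Big(1 + \frac{z}{n}\Big)^n \to e^z \qquad (n \to \infty) \quad \forall z \in \mathbb{C}, \]
and if $z_n \in \mathbb{C}$, $z_n \to z$,
\[ \Big(1 + \frac{z_n}{n}\Big)^n \to e^z \qquad (n \to \infty). \]
As a first illustration of the power of transform methods, we prove the weak law of large numbers:
Theorem 2.8.1 (Weak Law of Large Numbers). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$, then
\[ \frac{1}{n} \sum_{i=1}^{n} X_i \to \mu \qquad (n \to \infty) \quad \text{in probability.} \]
Proof. If the $X_i$ have characteristic function $\phi$, then by the moment property of §2.7 with $k = 1$,
\[ \phi(t) = 1 + i\mu t + o(t) \qquad (t \to 0). \]
Now using the i.i.d. assumption, $\frac{1}{n}\sum_{i=1}^{n} X_i$ has characteristic function
\[ \mathbb{E}\Big(\exp\Big\{it \cdot \frac{1}{n} \sum_{j=1}^{n} X_j\Big\}\Big) = \prod_{j=1}^{n} \mathbb{E}\Big(\exp\Big\{\frac{it}{n} X_j\Big\}\Big) = \big(\phi(t/n)\big)^n = \Big(1 + \frac{i\mu t}{n} + o(1/n)\Big)^n \to e^{i\mu t} \qquad (n \to \infty), \]
and $e^{i\mu t}$ is the characteristic function of the constant $\mu$ (for fixed $t$, $o(1/n)$ is an error term of smaller order than $1/n$ as $n \to \infty$). By the continuity theorem,
\[ \frac{1}{n} \sum_{i=1}^{n} X_i \to \mu \quad \text{in distribution,} \]
and as $\mu$ is constant, this says (see §2.6) that
\[ \frac{1}{n} \sum_{i=1}^{n} X_i \to \mu \quad \text{in probability.} \]
□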
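The key step of the weak-law argument, $(\phi(t/n))^n \to e^{i\mu t}$, can be watched numerically for a concrete distribution. The sketch below assumes Bernoulli summands with a hypothetical success probability $p = 0.3$ (so $\mu = p$) and a single fixed $t$; it illustrates the convergence and is not a proof.

```python
import cmath

p = 0.3  # hypothetical success probability; the mean is mu = p


def phi(t):
    """Characteristic function of a Bernoulli(p) variable: q + p e^{it}."""
    return (1 - p) + p * cmath.exp(1j * t)


def phi_mean(t, n):
    """Characteristic function of the sample mean of n i.i.d. copies."""
    return phi(t / n) ** n


t = 2.0
limit = cmath.exp(1j * p * t)  # characteristic function of the constant mu
ns = (10, 100, 1000, 10000)
errors = [abs(phi_mean(t, n) - limit) for n in ns]

for n, err in zip(ns, errors):
    print(n, err)
```

The distance to the limit shrinks roughly in proportion to $1/n$, in line with the $o(1/n)$ bookkeeping in the argument.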
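The main result of this section is the same argument carried one stage further.
Theorem 2.8.2 (Central Limit Theorem). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$, then with $N(0, 1)$ the standard normal distribution,
\[ \frac{\sqrt{n}}{\sigma}\Big(\frac{1}{n} \sum_{i=1}^{n} X_i - \mu\Big) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \mu)/\sigma \to N(0, 1) \qquad (n \to \infty) \quad \text{in distribution.} \]
That is, for all $x \in \mathbb{R}$,
\[ \mathbb{P}\Big(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \mu)/\sigma \leq x\Big) \to \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}y^2}\,dy \qquad (n \to \infty). \]
Proof. We first centre at the mean. If $X_i$ has characteristic function $\phi$, let $X_i - \mu$ have characteristic function $\phi_0$. Since $X_i - \mu$ has mean 0 and second moment $\sigma^2$, the case $k = 2$ of the moment property of §2.7 gives
\[ \phi_0(t) = 1 - \frac{1}{2}\sigma^2 t^2 + o(t^2) \qquad (t \to 0). \]
Now $\sqrt{n}(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu)/\sigma$ has characteristic function
\begin{align*}
\mathbb{E}\Big(\exp\Big\{it \cdot \frac{\sqrt{n}}{\sigma}\Big(\frac{1}{n} \sum_{j=1}^{n} X_j - \mu\Big)\Big\}\Big) &= \mathbb{E}\Big(\prod_{j=1}^{n} \exp\Big\{\frac{it}{\sigma\sqrt{n}}(X_j - \mu)\Big\}\Big) = \prod_{j=1}^{n} \mathbb{E}\Big(\exp\Big\{\frac{it}{\sigma\sqrt{n}}(X_j - \mu)\Big\}\Big) \\
&= \Big(\phi_0\Big(\frac{t}{\sigma\sqrt{n}}\Big)\Big)^n = \Big(1 - \frac{\frac{1}{2}\sigma^2 t^2}{\sigma^2 n} + o\Big(\frac{1}{n}\Big)\Big)^n \to e^{-\frac{1}{2}t^2} \qquad (n \to \infty),
\end{align*}
and $e^{-\frac{1}{2}t^2}$ is the characteristic function of the standard normal distribution $N(0, 1)$. The result follows by the continuity theorem.
□
Note. In Theorem 2.8.2, we:
• centre the $X_i$ by subtracting the mean (to get mean 0);
• scale the resulting $X_i - \mu$ by dividing by the standard deviation $\sigma$ (to get variance 1).
Then if $Y_i := (X_i - \mu)/\sigma$ are the resulting standardized variables, $\frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_i$ converges in distribution to standard normal.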
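For Bernoulli summands the distribution of the standardized sum is available exactly, so the statement of Theorem 2.8.2 can be compared with the normal limit without simulation. The sketch below assumes $p = 0.3$ and the evaluation point $x = 0.5$ (both arbitrary choices) and uses the error function from the Python standard library for $\Phi$.

```python
import math


def binom_cdf_standardized(n, p, x):
    """P((S_n - n p) / sqrt(n p q) <= x) for S_n ~ Binomial(n, p), computed exactly."""
    q = 1 - p
    cutoff = n * p + x * math.sqrt(n * p * q)
    return sum(math.comb(n, k) * p**k * q**(n - k)
               for k in range(n + 1) if k <= cutoff)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


p, x = 0.3, 0.5  # hypothetical parameter and evaluation point
gaps = {n: abs(binom_cdf_standardized(n, p, x) - Phi(x)) for n in (10, 100, 1000)}

for n, gap in gaps.items():
    print(n, gap)
```

Because the binomial distribution lives on a lattice, the gap need not decrease monotonically in $n$; it is of order $1/\sqrt{n}$.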
Example. The Binomial Case. If each $X_i$ is Bernoulli distributed with parameter $p \in (0, 1)$,
\[ \mathbb{P}(X_i = 1) = p, \qquad \mathbb{P}(X_i = 0) = q := 1 - p \]
- so $X_i$ has mean $p$ and variance $pq$ - $S_n := \sum_{i=1}^{n} X_i$ is binomially distributed with parameters $n$ and $p$:
\[ \mathbb{P}\Big(\sum_{i=1}^{n} X_i = k\Big) = \binom{n}{k} p^k q^{n-k} = \frac{n!}{(n - k)!\,k!}\,p^k q^{n-k}. \]
A direct attack on the distribution of $\frac{1}{\sqrt{n}}\sum_{i=1}^{n} (X_i - p)/\sqrt{pq}$ can be made via
\[ \mathbb{P}\Big(a \leq \sum_{i=1}^{n} \frac{X_i - p}{\sqrt{npq}} \leq b\Big) = \sum_{k :\ np + a\sqrt{npq}\, \leq\, k\, \leq\, np + b\sqrt{npq}} \frac{n!}{(n - k)!\,k!}\,p^k q^{n-k}. \]
Since $n$, $k$ and $n - k$ will all be large here, one needs an approximation to the factorials. The required result is Stirling's formula of 1730:
\[ n! \sim \sqrt{2\pi}\,e^{-n} n^{n + \frac{1}{2}} \qquad (n \to \infty) \]
(the symbol $\sim$ indicates that the ratio of the two sides tends to 1). The argument can be carried through to obtain the sum on the right as a Riemann sum (in the sense of the Riemann integral: §2.2) for $\int_a^b \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\,dx$, whence the result. This, the earliest form of the central limit theorem, is the de Moivre-Laplace limit theorem (Abraham de Moivre, 1667-1754; P.S. de Laplace, 1749-1827). The proof of the de Moivre-Laplace limit theorem sketched above is closely analogous to the passage from the discrete to the continuous Black-Scholes formula: see §4.6 and §6.4.
Local Limit Theorems. The central limit theorem as proved above is a global limit theorem: it relates to distributions and convergence thereof. The de Moivre-Laplace limit theorem above, however, deals directly with individual probabilities in the discrete case (the sum of a large number of which is shown to approximate an integral). A limit theorem dealing with densities and convergence thereof in the density case, or with the discrete analogues of densities - such as the individual probabilities $\mathbb{P}(S_n = k)$ in the binomial case above - is called a local limit theorem.
Poisson Limit Theorem. The de Moivre-Laplace limit theorem - convergence of binomial to normal - is only one possible limiting regime for binomial models. The next most important one has a Poisson limit in place of a normal one. Suppose we have a sequence of binomial models $B(n, p)$, where the success probability $p = p_n$ varies with $n$, in such a way that
\[ np_n \to \lambda > 0 \qquad (n \to \infty). \tag{2.6} \]
Thus $p_n \to 0$ - indeed, $p_n \sim \lambda/n$. This models a situation where we have a large number $n$ of Bernoulli trials, each with small probability $p_n$ of success, but such that $np_n$, the expected total number of successes, is 'neither large nor small, but intermediate'. Binomial models satisfying condition (2.6) converge to the Poisson model $P(\lambda)$ with parameter $\lambda > 0$. This result is sometimes called the law of small numbers. The Poisson distribution is widely used to model statistics of accidents, insurance claims and the like, where one has a large number $n$ of individuals at risk, each with a small probability $p_n$ of generating an accident, insurance claim etc. ('success probability' seems a strange usage here!).
2.9 Asset Return Distributions
Suppose (to anticipate Chapter 4) that we hold a stock, whose price S is observed at discrete time-points t = n 8 , e.g. (n = 1 , 2, . . . ) , as Set) = S(n8 ) , or Sn say. Of course Sn :2: 0, so its distribution is concentrated on [0, 00 ) . But an investor may be more interested in relative performance than the absolute price, and here a law giving weight to both positive and negative half-lines will be appropriate. There are two ways to achieve this. One is to work with the log-price, log Set) . The other is to focus on the return, the relative gain over a time-period, Sn +l - Sn Sn · For short time-intervals 8 , the change Sn+ 1 - Sn will be small compared to Sn , so Rn will be small, and so, since .
.L "n . D
=
log(1 + x ) "" x ( x --+ 0) , we have approximately Since
log(1 + Rn) = 10g(Sn + dSn) = log Sn+l - log Sn , working with returns is substantially equivalent to working with log-prices. Return Interval. The first question is how the return distribution depends on the return interval 8 . Since the price change over an interval is the sum of the price changes over constituent intervals, the longer 8 is, the more the re turn over it is that of a sum, of numerous summands. If we assume that price changes over disjoint time-intervals are independent, at least approximately (prices respond to the 'driving noise the unpredictable, or random, changes in economic environment), then return distributions are those of sums of many
62
2. Probability Background
individually small and nearly independent terms. This is the classic prescrip tion for obtaining a Gaussian or normal distribution, by the central limit theorem (§2.8) and its extensions. This phenomenon is called aggregational Gaussianity. To summarize: for long enough 8 ( the empirical rule of thumb is sixteen trading days - say, monthly returns ) , return distributions will be ap proximately Gaussian. This is in accordance with the Black-Scholes-Merton model developed below ( Chapters 4 and 6). Since for a Gaussian density f(x) - log f(x) is quadratic, the 'tails' ( very large or small values ) of a Gaussian distribution exhibit very rapid decay log-quadratic decay. At the other end of the scale is small 8, representing high-frequency data ( 'tick data' ) - perhaps on a scale of minutes, or even seconds. Here return distributions behave quite differently. There are good theoretical grounds, backed by good empirical evidence, for modeling return distributions by dis tributions or densities whose 'tails' decrease like a power. That is, the density f ( or distribution function F) satisfies or for exponents b, f3 and constants Ci . Such tails are called Pareto tails, after the Italian economist Vilfredo Pareto ( 1845-1923) . Note that such tail decay is very slow, particularly compared to the enormously fast log-quadratic tail decay in the Gaussian case. For intermediate values of 8 ( say, of the order of a day ) , tail-decay in termediate between the ultra-fast log-quadratic Gaussian and the power-law Pareto cases may be expected. This is indeed found. In §2. 1 2 we present one of the simplest models for this, the hyperbolic distributions with log-linear tail decay ( - log f(x) decays at ±oo like linear functions of x) . We now focus on the Gaussian approximation for asset return distribu tions. By above, we cannot expect this to be even close for 8 too small. Excluding such 8, how well does the approximation perform? It has two prin cipal deficiencies. Skewness. 
The Gaussian/normal laws are symmetric. However, real return data show asymmetry, or skewness. The positive tail (large positive values) reflects upside or profit, the corresponding negative tail reflects downside or loss. One must expect asymmetry, as these two are quite different. This is partly for practical (or legal) reasons: extreme losses lead to insolvency, and the exit of the financial agent or 'player' from the market, while windfall profits have less visible results. It is partly also for psychological reasons: a given amount of loss gives more pain than the same amount of profit gives pleasure. Thus to fit reality well we will need to model asymmetric return distributions, and this will lead us beyond the Gaussian case.

Tail decay. Many real data sets display slower tail decay - 'heavier tails' - than is consistent with the Gaussian model.

To summarize: we will be content for most but not all of this book with a Gaussian-based model, as a benchmark or first approximation. In §§2.10-2.12 below we lay the foundations to go further. We develop this more fully later in the context of stochastic processes - specifically in §5.5 on Lévy processes.
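The three rates of tail decay discussed above can be compared numerically. The sketch below is only an illustration under stated assumptions: the standard normal stands for log-quadratic decay, the two-sided exponential density $e^{-|x|}/2$ is used as a stand-in for log-linear decay (it is not the hyperbolic law of §2.12), and a Pareto tail with a hypothetical exponent $\beta = 3$ stands for power-law decay.

```python
import math

def gaussian_tail(x):
    """P(Z > x) for a standard normal Z: log-quadratic decay."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def laplace_tail(x):
    """P(X > x), x >= 0, for the density exp(-|x|)/2: log-linear decay."""
    return 0.5 * math.exp(-x)

def pareto_tail(x, beta):
    """P(X > x) = x**(-beta) for x >= 1: power-law (Pareto) decay."""
    return x ** (-beta)

# Tail probabilities at a few thresholds; beta = 3 is an arbitrary choice.
for x in (4.0, 8.0, 12.0):
    print(f"x={x:5.1f}  Gaussian={gaussian_tail(x):.3e}  "
          f"log-linear={laplace_tail(x):.3e}  Pareto={pareto_tail(x, 3.0):.3e}")
```

At each threshold the Gaussian tail is the smallest and the Pareto tail the largest, with the log-linear case in between, and the gaps widen as the threshold grows.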
2.10 Infinite Divisibility and the Lévy-Khintchine Formula

For the normal family $N(\mu, \sigma^2)$, the characteristic function is
$$\phi(t) = \exp\{i\mu t - \tfrac12 \sigma^2 t^2\}.$$
For each $n = 1, 2, \ldots$, this is the $n$th power of
$$\phi_n(t) = \exp\{i\mu t/n - \tfrac12 \sigma^2 t^2/n\},$$
which is a characteristic function (of $N(\mu/n, \sigma^2/n)$). Again, for the Poisson family $Po(\lambda)$, the characteristic function is
$$\phi(t) = \exp\{-\lambda(1 - e^{it})\},$$
which for each $n$ is the $n$th power of
$$\phi_n(t) = \exp\{-\lambda(1 - e^{it})/n\},$$
a characteristic function (of $Po(\lambda/n)$). Thus for each $n$, the normal and Poisson laws are those of the sum of $n$ independent copies of random variables drawn from some common distribution. We regard the normal and Poisson laws as being thus 'divided into $n$ pieces'; as this is possible for each $n = 1, 2, \ldots$, we call them infinitely divisible.
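The identity $\phi = \phi_n^n$ for the normal and Poisson families can be checked numerically. The following is a minimal sketch; the parameter values and the grid of $t$-values are arbitrary choices made for illustration.

```python
import cmath

def normal_cf(t, mu, sigma2):
    """Characteristic function of N(mu, sigma2) at t."""
    return cmath.exp(1j * mu * t - 0.5 * sigma2 * t * t)

def poisson_cf(t, lam):
    """Characteristic function of Po(lam) at t, written as exp{-lam(1 - e^{it})}."""
    return cmath.exp(-lam * (1 - cmath.exp(1j * t)))

def max_nth_power_gap(n, ts, mu=0.3, sigma2=2.0, lam=1.7):
    """Largest value of |phi(t) - phi_n(t)**n| over the grid ts, for both families."""
    gap = 0.0
    for t in ts:
        gap = max(gap, abs(normal_cf(t, mu, sigma2)
                           - normal_cf(t, mu / n, sigma2 / n) ** n))
        gap = max(gap, abs(poisson_cf(t, lam) - poisson_cf(t, lam / n) ** n))
    return gap

ts = [k / 10 for k in range(-50, 51)]
gaps = {n: max_nth_power_gap(n, ts) for n in (2, 5, 50)}
print(gaps)  # every entry is at rounding-error level
```

The check works for every $n$ because dividing the parameters by $n$ stays inside the same family, which is exactly the 'division into $n$ pieces' described above.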
One can generalize all this.

Definition 2.10.1. A random variable $X$, or its distribution function $F$, is infinitely divisible if for each $n = 1, 2, \ldots$ there is a distribution function $F_n$ with $F$ as its $n$-fold convolution power: $F = F_n * F_n * \cdots * F_n$ ($n$ factors), that is, $X$ has the same distribution as
$$X_{n1} + X_{n2} + \cdots + X_{nn},$$
with $X_{ni}$ ($i = 1, \ldots, n$) independent with common distribution $F_n$.
We write $I$ for the class of infinitely divisible distributions. It turns out also that $I$ can be described apparently more generally, as the class of limit laws of row sums of 'triangular arrays' $\{X_{nk} : k = 1, \ldots, k_n,\ n = 1, 2, \ldots\}$, where the elements in the $n$th row, $X_{n1}, \ldots, X_{nk_n}$, are independent, and satisfy a condition of 'asymptotic negligibility': roughly, that for large $n$, no $X_{nk}$ makes a significant contribution to the limit distribution.

It turns out that the class $I$ can be characterized exactly. This is the content of the classical Lévy-Khintchine formula (P. Lévy (1886-1971) in 1934, A. Ya. Khintchine (1894-1956) in 1936): $\phi \in I$, that is, $\phi(\cdot)$ is an infinitely-divisible characteristic function, iff $\phi$ is of the form
$$\phi(u) = \exp\{-\psi(u)\},$$
with the Lévy exponent $\psi$ given by
$$\psi(u) = \frac{c^2}{2} u^2 - i\alpha u + \int_{\{|x| < 1\}} (1 - e^{-iux} - iux)\,\mu(dx) + \int_{\{|x| \ge 1\}} (1 - e^{-iux})\,\mu(dx),$$
with $\alpha, c \in \mathbb{R}$ and $\mu$ a $\sigma$-finite measure on $\mathbb{R} \setminus \{0\}$ satisfying
$$\int \min\{1, |x|^2\}\,\mu(dx) < \infty.$$
(Various other alternative forms for the integrand are used; what matters is the different behaviour near the origin and away from it.) The theory surrounding infinite divisibility and the Lévy-Khintchine formula is one of the crowning glories of probability theory - the solution of the classical 'central-limit problem' (so-called as the formulation above via triangular arrays generalizes that of §2.8). We must refer for proof to any good graduate-level monograph on probability, e.g. Feller (1968), XVII.

The true setting of the theory is actually not the static or distributional one above, but the dynamic setting of stochastic processes. We will return to this in the context of Lévy processes in §5.5. Suffice it to say for now that of the characteristics $(\alpha, c, \mu)$ above, the $\alpha$ represents a deterministic component, or 'drift', the $c$ represents a normally distributed or Gaussian part, and the integral involving $\mu$ represents a 'sum of jumps' term, whose meaning will emerge later.
Self-decomposability. The infinitely-divisible laws are those obtainable as limits of triangular arrays $\{X_{nk}\}$ with two indices. What can be obtained by restricting to one index, and looking at limits of
$$\frac{1}{a_n}(X_1 + X_2 + \cdots + X_n) - b_n,$$
for suitable centering and scaling constants $b_n$, $a_n$ and $X_1, X_2, \ldots$ independent (not necessarily identically distributed)? Such laws are called self-decomposable; call the class they form $SD$. As one-suffix arrays above are special cases of two-suffix ones, self-decomposable laws are infinitely divisible:
$$SD \subset I.$$
It turns out that a law is in $SD$ iff its characteristic function $\phi$ is such that, for each $\rho \in (0, 1)$,
$$\phi(t) = \phi(\rho t)\,\phi_\rho(t)$$
for some characteristic function $\phi_\rho$ (whence the name 'self-decomposable'). Self-decomposable laws have many nice properties; we quote two, for later use.
(i) They are absolutely continuous (possess densities), and are unimodal ('one-peaked').
(ii) In one dimension, they are the laws with Lévy measure $\mu$ of the form
$$\mu(dx) = \frac{k(x)}{|x|}\,dx$$
with $k$ increasing on $(-\infty, 0)$ and decreasing on $(0, \infty)$.
In particular, they are easy to recognize from the Lévy-Khintchine formula, and easy to simulate from. For these properties, and further background, we refer to Sato (1999), §5.3.

2.11 Elliptically Contoured Distributions
Recall (§2.7) that the normal/Gaussian density $f$ and its characteristic function $\phi$ are given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big\{-\frac12 \frac{(x - \mu)^2}{\sigma^2}\Big\} \quad \text{and} \quad \phi(t) = \exp\{i\mu t - \tfrac12 \sigma^2 t^2\},$$
and (§2.3) that such laws are useful in modeling asset return distributions. We may hold, not just one asset, but a whole portfolio of assets, $r$ of them say. To describe the return distribution of such a portfolio, we need to work in $r$ dimensions, with a density $f(x)$ ($x = (x_1, \ldots, x_r)$) and characteristic function $\phi(t)$ ($t = (t_1, \ldots, t_r)$). The univariate normal law above generalizes
to the multivariate normal ('multinormal'), the basis of multivariate analysis in statistics:
$$f(x) = (2\pi)^{-r/2} |\Sigma|^{-1/2} \exp\{-\tfrac12 (x - \mu)^T \Sigma^{-1} (x - \mu)\}, \qquad \phi(t) = \exp\{i t^T \mu - \tfrac12 t^T \Sigma t\}$$
(Edgeworth's Theorem: F. Edgeworth (1845-1926) in 1892). Here $\mu = (\mu_1, \ldots, \mu_r)^T$ is the mean vector, $\Sigma = (\sigma_{ij})$ the covariance matrix ($r \times r$, positive definite, symmetric). Thus $\mu$, $\Sigma$ completely specify the multinormal in $r$ dimensions, $N_r(\mu, \Sigma)$ say, and are interpretable via the Markowitz mean-variance theory.

One way to keep most of the desirable features of the multinormal - interpretable parameters $\mu$, $\Sigma$, elliptical contours, linear regression etc. - is to replace $f$ by a more general form, i.e. assume that the density $f$ is a function of the quadratic form $Q := (x - \mu)^T \Sigma^{-1} (x - \mu)$:
$$f(x) = |\Sigma|^{-1/2}\, g(Q). \qquad (EC)$$
Here $f$ is called elliptically contoured: we write $f \sim EC_r(\mu, \Sigma; g)$, and call $g : \mathbb{R}_+ \to \mathbb{R}_+$ the density generator of $f$, or 'shape'. Then $\theta := (\mu, \Sigma)$ is the parameter, or parametric part, of the model, $g$ the non-parametric part. The characteristic function (CF) $\psi$ of $f$ is of the form
$$\psi(t) = e^{i t^T \mu}\, \phi(t^T \Sigma t) \qquad (EC')$$
for some scalar function $\phi$ called the characteristic generator of $\psi$, or $f$ (Fang, Kotz, and Ng (1990), Ch. 2, Cambanis, Huang, and Simons (1981)). It is convenient to write here $f \sim EC_r(\mu, \Sigma; \phi)$ also.

Examples. 1. The normal/Gaussian case. The density generator of the multivariate normal distribution is given by
$$g(u) = (2\pi)^{-r/2} \exp\{-\tfrac12 u\}.$$
The characteristic function is
$$\psi(\theta) = \exp\{i\theta^T \mu - \tfrac12 \theta^T \Sigma \theta\}, \quad \text{so} \quad \phi(u) = \exp\{-\tfrac12 u\}.$$
The multivariate normal is a member (the $N = s = 1$, $t = \tfrac12$ case) of the class of symmetric Kotz-type distributions, which are characterized by exponentially decaying density generators of the form
$$g(u) = C_r u^{N-1} \exp\{-t u^s\}, \qquad s, t > 0,\ 2N + r > 2,$$
$C_r$ a constant. For further details see Fang, Kotz, and Ng (1990), §3.2.

2. The multivariate t-distribution. For the multivariate t-distribution with $m$ degrees of freedom the density generator exhibits power decay. It is a member (the $N = \tfrac12(r + m)$, $m$ an integer case) of the class of symmetric multivariate Pearson type VII distributions with density generators
$$g(u) = (\pi m)^{-r/2}\, \frac{\Gamma(N)}{\Gamma(N - r/2)} \Big(1 + \frac{u}{m}\Big)^{-N}, \qquad N > r/2,\ m > 0.$$
Again we refer to Fang, Kotz, and Ng (1990), §3.3 for further discussion.
Elliptically contoured distributions are well adapted to modelling any desired rate of tail decay. Indeed, the more slowly $g(u)$ decays as $u$ increases, the more slowly $f(x)$ decays as $x$ moves away from $\mu$. The whole range of rate of tail decay is possible. For example, the $r$-variate Student t-distribution with $m$ degrees of freedom,
$$g(u) = \mathrm{const}\, \Big(1 + \frac{u}{m - 2}\Big)^{-\frac12 (r + m)},$$
gives Pareto or power-law decay, while
$$g(u) = \mathrm{const}\, e^{-u/2}$$
gives the multinormal case, with log-quadratic decay.

So far as modeling skewness or asymmetry is concerned: the function $f$ in (EC) is certainly not symmetrical in its components $x_i$ (unless $\Sigma$ is the identity matrix). Nevertheless, the functional form (EC), which restricts $f$ to have elliptical contours (paths in $x$-space of constant $f$-value), does impose a partial symmetry restriction, and we say that $f$ is elliptically symmetric.

Some elliptically contoured distributions are infinitely divisible; in principle, these can be identified via the function $\psi$ from (EC') and the Lévy-Khintchine formula. Restricting further to self-decomposability, we obtain the class
$$SDEC := SD \cap EC.$$
Recall that laws in $SD$ are unimodal (§2.10), which corresponds via (EC) to $g$ being decreasing. We shall meet examples in §2.12 below.
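The role of the density generator can be made concrete in two dimensions. The sketch below assumes the normalization $f(x) = |\Sigma|^{-1/2} g(Q)$ for (EC), with the Gaussian generator $g(u) = (2\pi)^{-r/2} e^{-u/2}$ and the Pearson type VII generator with $N = (r + m)/2$; the function names and parameter values are hypothetical choices for illustration only.

```python
import math

def quad_form(x, mu, sigma):
    """Q = (x - mu)^T Sigma^{-1} (x - mu) for a 2x2 covariance matrix sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Inverse of a 2x2 matrix written out explicitly.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def g_normal(u, r=2):
    """Gaussian density generator."""
    return (2.0 * math.pi) ** (-r / 2) * math.exp(-u / 2.0)

def g_student(u, m, r=2):
    """Pearson type VII generator with N = (r + m)/2 (multivariate t, m d.f.)."""
    n_shape = (r + m) / 2.0
    const = ((math.pi * m) ** (-r / 2)
             * math.gamma(n_shape) / math.gamma(n_shape - r / 2))
    return const * (1.0 + u / m) ** (-n_shape)

def ec_density(x, mu, sigma, g):
    """Elliptically contoured density |Sigma|^{-1/2} g(Q) in two dimensions."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    return det ** -0.5 * g(quad_form(x, mu, sigma))

mu = (0.0, 0.0)
sigma = ((1.0, 0.0), (0.0, 4.0))
# (1, 0) and (0, 2) lie on the same ellipse Q = 1, so the densities agree.
print(ec_density((1.0, 0.0), mu, sigma, g_normal),
      ec_density((0.0, 2.0), mu, sigma, g_normal))
# The ratio of the t generator to the Gaussian one grows rapidly with u.
print([g_student(u, 5) / g_normal(u) for u in (1.0, 10.0, 50.0)])
```

Only $g$ changes between the two laws; the location, the covariance structure and the elliptical shape of the contours are shared, which is the sense in which $g$ is the non-parametric part of the model.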
2.12 Hyperbolic Distributions

Our concern here is the hyperbolic family, a four-parameter family with two type and two shape parameters. Recall that, for normal (Gaussian) distributions, the log-density is quadratic - that is, parabolic - and the tails are very thin. The hyperbolic family is specified by taking the log-density instead to be hyperbolic, and this leads to thicker tails as desired (but not as thick as for the stable family).

Before turning to the specifics of notation, parametrization etc., we comment briefly on the origin and scope of the hyperbolic distributions. Both the definition and the bulk of applications stem from Barndorff-Nielsen and co-workers. Thus Barndorff-Nielsen (1977) contains the definition and an application to the distribution function of particle size in a medium such as sand (see also Barndorff-Nielsen, Blæsild, Jensen, and Sørensen (1985)). Later, in Barndorff-Nielsen, Blæsild, Jensen, and Sørensen (1985), hyperbolic distribution functions are used to model turbulence. Now the phenomenon of atmospheric turbulence may be regarded as a mechanism whereby energy, when present in localized excess on one volume scale in air, cascades downwards to smaller and smaller scales (note the analogy to the decay of larger particles into smaller and smaller ones in the sand studies). Barndorff-Nielsen had the acute insight that this 'energy cascade effect' might be paralleled in the 'information cascade effect', whereby price-sensitive information originates in, say, a global newsflash, and trickles down through national and local level to smaller and smaller units of the economic and social environment. This insight is acknowledged by Eberlein and Keller (1995) (see also Eberlein, Keller, and Prause (1998) and Eberlein and Raible (1998)), who introduced hyperbolic distribution functions into finance and gave detailed empirical studies of their use to model financial data, particularly daily stock returns. Further and related studies are Bibby and Sørensen (1997), Chan (1999), Eberlein and Jacod (1997), Küchler (1999), Rydberg (1999) and Rydberg (1997).

We need some background on Bessel functions, see Watson (1944).
Recall the Bessel functions $J_\nu$ of the first kind, Watson (1944), §3.11, $Y_\nu$ of the second kind, Watson (1944), §3.53, and $K_\nu$, Watson (1944), §3.7, there called a Bessel function with imaginary argument or Macdonald function, nowadays usually called a Bessel function of the third kind. From the integral representation
$$K_\nu(x) = \frac12 \int_0^\infty u^{\nu - 1} \exp\{-\tfrac12 x (u + 1/u)\}\,du \qquad (x > 0) \qquad (2.7)$$
(Watson (1944), §6.23) one sees that
$$f(x) = \frac{(\psi/\chi)^{\lambda/2}}{2 K_\lambda(\sqrt{\psi\chi})}\, x^{\lambda - 1} \exp\{-\tfrac12 (\psi x + \chi/x)\} \qquad (x > 0) \qquad (2.8)$$
is a probability density function. The corresponding law is called the generalized inverse Gaussian $GIG_{\lambda,\psi,\chi}$; the inverse Gaussian is the case $\lambda = 1$: $IG_{\chi,\psi} = GIG_{1,\psi,\chi}$. These laws were introduced by Good (1953); for a monograph treatment of their statistical properties, see Jørgensen (1982), and for their role in models of financial markets, Shiryaev (1999), III, 1.d.
Now consider a Gaussian (normal) law $N(\mu + \beta\sigma^2, \sigma^2)$ where the parameter $\sigma^2$ is random and is sampled from $GIG_{1,\psi,\chi}$. The resulting law is a mean-variance mixture of normal laws, the mixing law being generalized inverse Gaussian. It is written $\mathbb{E}_{\sigma^2} N(\mu + \beta\sigma^2, \sigma^2)$; it has a density of the form
$$\frac{\sqrt{\alpha^2 - \beta^2}}{2\alpha\delta K_1(\delta\sqrt{\alpha^2 - \beta^2})} \exp\{-\alpha\sqrt{\delta^2 + (x - \mu)^2} + \beta(x - \mu)\} \qquad (2.9)$$
(Barndorff-Nielsen (1977)), where $\alpha^2 = \psi + \beta^2$ and $\delta^2 = \chi$. Just as the Gaussian law has log-density a quadratic - or parabolic - function, so this law has log-density a hyperbolic function. It is accordingly called a hyperbolic distribution.

Various parametrizations are possible. Here $\mu$ is a location and $\delta$ a scale parameter, while $\alpha > 0$ and $\beta$ ($0 \le |\beta| < \alpha$) are shape parameters. One may pass from $(\alpha, \beta)$ to $(\phi, \gamma)$ via
$$\alpha = (\phi + \gamma)/2, \qquad \beta = (\phi - \gamma)/2,$$
so $\phi\gamma = \alpha^2 - \beta^2$, and then to $(\xi, \chi)$ via
$$\xi = (1 + \delta\sqrt{\phi\gamma})^{-1/2}, \qquad \chi = \xi\,\frac{\beta}{\alpha} = \xi\,\frac{\phi - \gamma}{\phi + \gamma}.$$
This parameterization (in which $\xi$ and $\chi$ correspond to the classical shape parameters of skewness and kurtosis) has the advantage of being affine invariant (invariant under changes of location and scale). The range of $(\xi, \chi)$ is the interior of a triangle
$$\nabla = \{(\xi, \chi) : 0 \le |\chi| < \xi < 1\},$$
called the shape triangle (see Figure 1). It suffices for our purpose to restrict to the centred ($\mu = 0$) symmetric ($\beta = 0$, or $\chi = 0$) case, giving the two-parameter family of densities (writing $\zeta := \xi^{-2} - 1$)
$$hyp_{\zeta,\delta}(x) = \frac{1}{2\delta K_1(\zeta)} \exp\Big\{-\zeta\sqrt{1 + \Big(\frac{x}{\delta}\Big)^2}\Big\} \qquad (\zeta, \delta > 0). \qquad (2.10)$$
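The normalizing constants above can be checked numerically without a special-function library. The sketch below evaluates $K_\nu$ directly from the integral representation (2.7) after the substitution $u = e^s$, then integrates a generalized inverse Gaussian density and the symmetric centred hyperbolic density (2.10). The trapezoidal rule, the integration ranges and the parameter values are assumptions made for illustration; the GIG density used is the standard form $\propto x^{\lambda-1}\exp\{-\tfrac12(\psi x + \chi/x)\}$.

```python
import math

def trapezoid(func, lo, hi, steps):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return h * total

def bessel_k(nu, x):
    """K_nu(x) from (2.7) with u = exp(s): (1/2) int exp(nu*s - x*cosh(s)) ds."""
    return 0.5 * trapezoid(lambda s: math.exp(nu * s - x * math.cosh(s)),
                           -20.0, 20.0, 4000)

def gig_mass(lam, psi, chi):
    """Total mass of the GIG density, integrated in the variable s = log x."""
    norm = (psi / chi) ** (lam / 2) / (2.0 * bessel_k(lam, math.sqrt(psi * chi)))

    def integrand(s):
        x = math.exp(s)
        # density at x times the Jacobian x of the substitution x = exp(s)
        return norm * math.exp(lam * s - 0.5 * (psi * x + chi / x))

    return trapezoid(integrand, -20.0, 20.0, 4000)

def hyp_log_density(x, zeta, delta, log_norm):
    """Logarithm of (2.10); log_norm = -log(2 * delta * K_1(zeta))."""
    return log_norm - zeta * math.sqrt(1.0 + (x / delta) ** 2)

def hyp_mass(zeta, delta):
    """Total mass of the hyperbolic density (2.10) over [-60, 60]."""
    log_norm = -math.log(2.0 * delta * bessel_k(1.0, zeta))
    return trapezoid(lambda x: math.exp(hyp_log_density(x, zeta, delta, log_norm)),
                     -60.0, 60.0, 12000)

print(gig_mass(1.0, 2.0, 0.5), hyp_mass(1.5, 0.8))  # both close to 1
```

The logarithm of (2.10) is a hyperbola in $x$, with asymptotes of slope $\pm\zeta/\delta$; evaluating `hyp_log_density` at two large arguments recovers that slope.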
Infinite Divisibility. Recall (Feller (1971), XIII, 7, Theorem 1) that a function $\omega$ is the Laplace transform of an infinitely divisible probability law on $\mathbb{R}_+$ iff $\omega = e^{-\psi}$, where $\psi(0) = 0$ and $\psi$ has a completely monotone derivative (that is, the derivatives of $\psi'$ alternate in sign). Grosswald (1976) showed that if
$$Q_\nu(x) := \frac{K_{\nu - 1}(\sqrt{x})}{\sqrt{x}\, K_\nu(\sqrt{x})} \qquad (\nu \ge 0,\ x > 0),$$
then $Q_\nu$ is completely monotone. Hence Barndorff-Nielsen and Halgreen (1977) showed that the generalized inverse Gaussian laws GIG are infinitely divisible. Now the GIG are the mixing laws giving rise to the hyperbolic laws as normal mean-variance mixtures. This transfers infinite divisibility (see e.g. Kelker (1971), Keilson and Steutel (1974), §§1,2), so the hyperbolic laws are infinitely divisible.
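Characteristic Functions. The mixture representation transfers to characteristic functions on taking the Fourier transform. It gives the characteristic function of a random variable $Z$ with law $hyp_{\zeta,\delta}$ as
$$\phi(u) = \frac{\zeta}{K_1(\zeta)}\, \frac{K_1(\sqrt{\zeta^2 + \delta^2 u^2})}{\sqrt{\zeta^2 + \delta^2 u^2}}.$$
Equivalently, $\phi(u) = \exp\{\kappa(u^2/2)\}$, where $\kappa(\cdot)$ is the cumulant generating function of the law IG, $\kappa(s) := \log \mathbb{E}(e^{-sY})$, where $Y$ has law $IG_{\psi,\chi}$ (recall $\chi = \delta^2$), and Grosswald's result above is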
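$$Q_\nu(t) = \int_0^\infty \frac{q_\nu(x)}{x + t}\,dx,$$
where
$$q_\nu(x) = \frac{2}{\pi^2 x\,\big(J_\nu^2(\sqrt{x}) + Y_\nu^2(\sqrt{x})\big)} > 0$$
(thus $Q_\nu$ is a Stieltjes transform, or iterated Laplace transform, Widder (1941), VIII). Using this and the Lévy-Khintchine formula Eberlein and Keller (1995) obtained the density $\nu(x)$ of the Lévy measure $\mu(dx)$ of $Z$ as
$$\nu(x) = \frac{1}{\pi^2 |x|} \int_0^\infty \frac{\exp\{-|x|\sqrt{2y + (\zeta/\delta)^2}\}}{y\,\big(J_1^2(\delta\sqrt{2y}) + Y_1^2(\delta\sqrt{2y})\big)}\,dy + \frac{\exp\{-|x|\zeta/\delta\}}{|x|}, \qquad (2.12)$$
and then
$$\kappa(u^2/2) = \int_{-\infty}^{\infty} (e^{iux} - 1 - iux)\,\nu(x)\,dx.$$
Now (Watson (1944), §7.21)
$$J_\nu(x) \sim \sqrt{2/\pi x}\, \cos(x - \tfrac12 \pi\nu - \tfrac14 \pi), \qquad Y_\nu(x) \sim \sqrt{2/\pi x}\, \sin(x - \tfrac12 \pi\nu - \tfrac14 \pi) \qquad (x \to \infty).$$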
So the denominator in the integral in (2.12) is asymptotic to a multiple of $y^{1/2}$ as $y \to \infty$. The asymptotics of the integral as $x \downarrow 0$ are determined by that of the integral as $y \to \infty$, and (writing $\sqrt{2y + (\zeta/\delta)^2}$ as $t$, say) this can be read off from the Hardy-Littlewood-Karamata theorem for Laplace transforms (Feller (1971), XIII.5, Theorem 2, or Bingham, Goldie, and Teugels (1987), Theorem 1.7.1). We see that
$$\nu(x) \sim c/x^2 \qquad (x \downarrow 0)$$
for $c$ a constant. In particular the Lévy measure is infinite.

Tails and Shape. The classic empirical studies of Bagnold (1941) and Bagnold and Barndorff-Nielsen (1979) reveal the characteristic pattern that, when log-density is plotted against log-size of particle, one obtains a unimodal curve approaching linear asymptotics at $\pm\infty$. Now the simplest such curve is the hyperbola, which contains four parameters: location of the mode, the slopes of the asymptotics, and curvature near the mode (the modal height is absorbed by the density normalization). This is the empirical basis for the hyperbolic laws in particle-size studies. Following Barndorff-Nielsen's suggested analogy, a similar pattern was sought, and found, in financial data, with log-density plotted against log-price. Studies by Eberlein and Keller (1995), Eberlein, Keller, and Prause (1998), Eberlein and Raible (1998), Bibby and Sørensen (1997), Rydberg (1999), Rydberg (1997) and other authors show that hyperbolic densities provide a good fit for a range of financial data, not only in the tails but throughout the distribution. The hyperbolic tails are log-linear: much fatter than normal tails but much thinner than stable ones. We will return later to the stochastic process version of this theory in §7.4.4.
Exercises

2.1 On $[0, 1]$, let $Q$ be the set of rationals in $[0, 1]$, and let $f(x) := 1_Q(x)$ be its indicator function.
1. $f = 0$ a.e. (almost all numbers are irrational - rationals are flukes!). So the Lebesgue integrals of $f$ and $0$ coincide: $\int_0^1 1_Q(x)\,dx = 0$.
2. $f$ is discontinuous everywhere - there are both rationals and irrationals arbitrarily close to every real number. So the Riemann integral $\int_0^1 1_Q(x)\,dx$ is not defined (indeed, it is as far from being defined as it possibly could be!).

2.2 Carry out the direct proof of the de Moivre-Laplace limit theorem of §2.8 by Stirling's formula (for the fainthearted, the details are in Feller (1968), §VII.3).

2.3 For the Cauchy density $f(x) := 1/(\pi(1 + x^2))$, the characteristic function is $\phi(t) = e^{-|t|}$. (The proof is easiest using complex analysis - contour integration and Jordan's lemma.)
2.4 For the density $f(x) := \tfrac12 e^{-|x|}$, the characteristic function is $\phi(t) = 1/(1 + t^2)$. (One can integrate the integral defining $\phi(t)$ twice by parts.) Note: the resemblance to Exercise 2.3 above is no accident, but an instance of the Fourier integral theorem.

2.5 If $X_n$ is normally distributed with mean $\mu$ and variance $\sigma^2/n$, one would expect any reasonable definition of convergence in distribution to have $X_n$ converging in distribution to the constant $\mu$ (as the variance vanishes in the limit, and we know that a random variable with zero variance is constant, a.s.).
1. Draw pictures of the densities $f_n(x)$ of $X_n$, concentrating more and more about $\mu$;
2. Draw a picture of the distribution function of the constant $\mu$, and note that it does not have a density;
3. Interpret the restriction to continuity points of the distribution function of the limit in Definition 2.7.3 as designed to handle situations of this type.

2.6 The bivariate normal density $f(x, y)$ is defined, for parameters $\mu_1, \mu_2 \in \mathbb{R}$, $\sigma_1, \sigma_2 > 0$, $-1 < \rho < 1$ by
$$f(x, y) := c \exp\{-\tfrac12 Q(x, y)\}, \qquad c = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}},$$
$$Q(x, y) := \frac{1}{1 - \rho^2}\Big[\Big(\frac{x - \mu_1}{\sigma_1}\Big)^2 - 2\rho\Big(\frac{x - \mu_1}{\sigma_1}\Big)\Big(\frac{y - \mu_2}{\sigma_2}\Big) + \Big(\frac{y - \mu_2}{\sigma_2}\Big)^2\Big].$$
1. Show that this is indeed a density (integrates to 1). (Complete the square - cf. solving quadratic equations!)
2. Show that if $(X, Y)$ has density $f(x, y)$, $X$ is $N(\mu_1, \sigma_1^2)$ (and so $Y$ is $N(\mu_2, \sigma_2^2)$ by symmetry).
3. Show that the conditional density of $Y$ given $X = x$ is normal, i.e.
$$Y \mid X = x \sim N\Big(\mu_2 + \rho\frac{\sigma_2}{\sigma_1}(x - \mu_1),\ \sigma_2^2(1 - \rho^2)\Big).$$
4. Interpret the conditional means in terms of the population regression line of §2.5, and its sample analogue given by the familiar method of least squares.
5. Interpret the conditional variance as decomposing the variance (= variability) $\sigma_2^2$ of $Y$ into two parts, the part $\sigma_2^2\rho^2$ accounted for by knowledge of $X$ and the remaining part $\sigma_2^2(1 - \rho^2)$.
6. Interpret $\rho^2$ as the fraction of the variability of $Y$ accounted for by knowledge of $X$ (or vice versa, by symmetry).
7. Show that $(X, Y)$ has (joint) characteristic function
$$\phi(t_1, t_2) := \mathbb{E}(\exp\{i t_1 X + i t_2 Y\}) = \exp\{i(\mu_1 t_1 + \mu_2 t_2) - \tfrac12(\sigma_1^2 t_1^2 + 2\rho\sigma_1\sigma_2 t_1 t_2 + \sigma_2^2 t_2^2)\}.$$
8. Show that the fifth parameter $\rho$ is the correlation coefficient between $X$ and $Y$.
9. Deduce that $X$, $Y$ are independent if and only if they are uncorrelated (recall that independent random variables are uncorrelated when they are square-integrable - have variances, so a covariance and a correlation - but the converse is false in general).

Note. The bivariate normal distribution models the common situation of two random variables, each of which is partially but not completely informative about the other. It is basic to statistics - in particular, to regression, and so is its extension to higher dimensions, the multivariate normal distribution. In finance, it is basic to correlation-based options - options such as quantos, involving the currencies of two different but interlinked economies.
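A numerical sanity check can accompany Exercise 2.6 (it is no substitute for the proof asked for in part 3). The sketch below divides the bivariate normal density by the $N(\mu_1, \sigma_1^2)$ marginal of $X$ and compares the result with the normal density whose mean lies on the regression line and whose variance is $\sigma_2^2(1 - \rho^2)$. The parameter values are hypothetical.

```python
import math

def normal_density(z, mean, var):
    """Density of N(mean, var) at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bivariate_density(x, y, mu1, mu2, s1, s2, rho):
    """f(x, y) = c exp{-Q/2}, with c and Q as in Exercise 2.6."""
    u, v = (x - mu1) / s1, (y - mu2) / s2
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    c = 1.0 / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))
    return c * math.exp(-0.5 * q)

def conditional_density(y, x, mu1, mu2, s1, s2, rho):
    """Density of Y given X = x: joint density over the N(mu1, s1^2) marginal."""
    joint = bivariate_density(x, y, mu1, mu2, s1, s2, rho)
    return joint / normal_density(x, mu1, s1 * s1)

# Hypothetical parameter values, chosen only for illustration.
params = dict(mu1=0.1, mu2=-0.4, s1=1.5, s2=0.8, rho=0.6)
x = 1.3
cond_mean = params["mu2"] + params["rho"] * params["s2"] / params["s1"] * (x - params["mu1"])
cond_var = params["s2"] ** 2 * (1.0 - params["rho"] ** 2)

for y in (-2.0, -0.5, 0.0, 1.7):
    print(y, conditional_density(y, x, **params), normal_density(y, cond_mean, cond_var))
```

The two printed columns agree to rounding error, and `cond_var` is smaller than $\sigma_2^2$ by the factor $1 - \rho^2$, which is the variance decomposition of part 5.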
3. Stochastic Processes in Discrete Time

3.1 Information and Filtrations
Access to full, accurate, up-to-date information is clearly essential to anyone actively engaged in financial activity or trading. Indeed, information is arguably the most important determinant of success in financial life. Partly for simplicity, partly to reflect the legislation and regulations against insider trading, we shall confine ourselves to the situation where agents take decisions on the basis of information in the public domain, and available to all. We shall further assume that information once known remains known - is not forgotten - and can be accessed in real time.

In reality, of course, matters are more complicated. Information overload is as much of a danger as information scarcity. The ability to retain information, organize it, and access it quickly is one of the main factors that will discriminate between the abilities of different economic agents to react to changing market conditions. However, we restrict ourselves here to the simplest possible situation and do not differentiate between agents on the basis of their information-processing abilities. Thus as time passes, new information becomes available to all agents, who continually update their information. What we need is a mathematical language to model this information flow, unfolding with time. This is provided by the idea of a filtration; we outline below the elements of this theory that we shall need.

The Kolmogorov triples $(\Omega, \mathcal{F}, \mathbb{P})$, and the Kolmogorov conditional expectations $\mathbb{E}(X|\mathcal{B})$, give us all the machinery we need to handle static situations involving randomness. To handle dynamic situations, involving randomness which unfolds with time, we need further structure.

We may take the initial, or starting, time as $t = 0$. Time may evolve discretely, or continuously. We postpone the continuous case to Chapter 5; in the discrete case, we may suppose time evolves in integer steps, $t = 0, 1, 2, \ldots$ (say, stock-market quotations daily, or tick data by the second). There may be a final time $T$, or time horizon, or we may have an infinite time horizon (in the context of option pricing, the time horizon $T$ is the expiry time).

We wish to model a situation involving randomness unfolding with time. As above, we suppose, for simplicity, that information is never lost (or forgotten): thus, as time increases we learn more. We recall from Chapter 2 that
$\sigma$-algebras represent information or knowledge. We thus need a sequence of $\sigma$-algebras $\mathbb{F} = \{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$, which are increasing:
$$\mathcal{F}_n \subset \mathcal{F}_{n+1} \qquad (n = 0, 1, 2, \ldots),$$
with $\mathcal{F}_n$ representing the information, or knowledge, available to us at time $n$. We shall always suppose all $\sigma$-algebras to be complete (this can be avoided, and is not always appropriate, but it simplifies matters and suffices for our purposes). Thus, $\mathcal{F}_0$ represents the initial information (if there is none, $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial $\sigma$-algebra). On the other hand,
$$\mathcal{F}_\infty := \sigma\Big(\bigcup_n \mathcal{F}_n\Big)$$
represents all we ever will know (the 'Doomsday $\sigma$-algebra'). Often, $\mathcal{F}_\infty$ will be $\mathcal{F}$ (the $\sigma$-algebra from Chapter 2, representing 'knowing everything'). But this will not always be so; see e.g. Williams (1991), §15.8 for an interesting example.

Such a family $\mathbb{F} := \{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$ is called a filtration; a probability space endowed with such a filtration, $\{\Omega, \mathbb{F}, \mathcal{F}, \mathbb{P}\}$, is called a stochastic basis or filtered probability space. These definitions are due to P. A. Meyer of Strasbourg; Meyer and the Strasbourg (and more generally, French) school of probabilists have been responsible for the 'general theory of (stochastic) processes', and for much of the progress in stochastic integration since the 1960s; see e.g. Dellacherie and Meyer (1978), Dellacherie and Meyer (1982), Meyer (1966), Meyer (1976).

For the special case of a finite state space $\Omega = \{\omega_1, \ldots, \omega_n\}$ and a given $\sigma$-algebra $\mathcal{F}$ on $\Omega$ (which in this case is just an algebra), we can always find a unique finite partition $\mathcal{P} = \{A_1, \ldots, A_l\}$ of $\Omega$, i.e. the sets $A_i$ are disjoint and $\bigcup_{i=1}^{l} A_i = \Omega$, corresponding to $\mathcal{F}$. A filtration $\mathbb{F}$ therefore corresponds to a sequence of finer and finer partitions $\mathcal{P}_n$. At time $t = 0$ the agents only know that some event $\omega \in \Omega$ will happen, at time $T < \infty$ they know which specific event $\omega^*$ has happened. During the flow of time the agents learn the specific structure of the ($\sigma$-) algebras $\mathcal{F}_n$, which means they learn the corresponding partitions $\mathcal{P}_n$. Having the information in $\mathcal{F}_n$ revealed is equivalent to knowing in which $A_i^{(n)} \in \mathcal{P}_n$ the event $\omega^*$ is. Since the partitions become finer the information on $\omega^*$ becomes more detailed with each step.

Unfortunately this nice interpretation breaks down as soon as $\Omega$ becomes infinite. It turns out that the concept of filtrations rather than that of partitions is relevant for the more general situations of infinite $\Omega$, infinite $T$ and continuous-time processes.
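The correspondence between a filtration and a sequence of finer and finer partitions can be illustrated on a small finite state space. The sketch below assumes a hypothetical experiment of three coin tosses, with the information at time $n$ given by the first $n$ tosses; the function names are ours.

```python
from itertools import product

def partition_after(n, horizon):
    """Partition of {H,T}^horizon into blocks of outcomes sharing the first n tosses."""
    blocks = {}
    for omega in product("HT", repeat=horizon):
        blocks.setdefault(omega[:n], set()).add(omega)
    return [frozenset(block) for block in blocks.values()]

def refines(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(block <= big for big in coarse) for block in fine)

HORIZON = 3
partitions = [partition_after(n, HORIZON) for n in range(HORIZON + 1)]

for n, blocks in enumerate(partitions):
    print(n, len(blocks), "blocks")  # 1, 2, 4 and 8 blocks
print(all(refines(partitions[n + 1], partitions[n]) for n in range(HORIZON)))
```

At time 0 there is a single block (the trivial $\sigma$-algebra); at time 3 every block is a single outcome, so the agent knows which $\omega^*$ has happened.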
3.2 Discrete-parameter Stochastic Processes

The word 'stochastic' (derived from the Greek) is roughly synonymous with 'random'. It is perhaps unfortunate that usage favours 'stochastic process' rather than the simpler 'random process', but as it does, we shall follow it.

We need a framework which can handle dynamic situations, in which time evolves, and in which new information unfolds with time. In particular, we need to be able to speak in terms of 'the information available at time $n$', or, 'what we know at time $n$'. Further, we need to be able to increase $n$ - thereby increasing the information available as new information (typically, new price information) comes in, and talk about the information flow over time. One has a clear mental picture of what is meant by this - there is no conceptual difficulty. However, what is needed is a precise mathematical construct, which can be conveniently manipulated - perhaps in quite complicated ways - and yet which bears the above heuristic meaning. Now 'information' is not only an ordinary word, but even a technical term in mathematics - many books have been written on the subject of information theory. However, information theory in this sense is not what we need: for us, the emphasis is on the flow of information, and how to model and describe it. With this by way of motivation, we proceed to give some of the necessary definitions.

A stochastic process $X = \{X_n : n \in I\}$ is a family of random variables, defined on some common probability space, indexed by an index-set $I$. Usually (always in this book), $I$ represents time (sometimes $I$ represents space, and one calls $X$ a spatial process). Here, $I = \{0, 1, 2, \ldots, T\}$ (finite horizon) or $I = \{0, 1, 2, \ldots\}$ (infinite horizon).

The (stochastic) process $X = (X_n)_{n=0}^{\infty}$ is said to be adapted to the filtration $\mathbb{F} = (\mathcal{F}_n)_{n=0}^{\infty}$ if
$$X_n \text{ is } \mathcal{F}_n\text{-measurable for all } n.$$
So if $X$ is adapted, we will know the value of $X_n$ at time $n$. If
$$\mathcal{F}_n = \sigma(X_0, X_1, \ldots, X_n)$$
we call $(\mathcal{F}_n)$ the natural filtration of $X$. Thus a process is always adapted to its natural filtration. A typical situation is that
$$\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$$
is the natural filtration of some process $W = (W_n)$. Then $X$ is adapted to $\mathbb{F} = (\mathcal{F}_n)$, i.e. each $X_n$ is $\mathcal{F}_n$- (or $\sigma(W_0, \ldots, W_n)$-) measurable, iff
$$X_n = f_n(W_0, W_1, \ldots, W_n)$$
for some measurable function $f_n$ (non-random) of $n + 1$ variables.
Notation. For a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$, $X(\omega)$ is the value $X$ takes on $\omega$ ($\omega$ represents the randomness). For a stochastic process $X = (X_n)$, it is convenient (e.g., if using suffixes, $n_i$ say) to use $X_n$, $X(n)$ interchangeably, and we shall feel free to do this. With $\omega$ displayed, these become $X_n(\omega)$, $X(n, \omega)$ etc.

The concept of a stochastic process is very general - and so very flexible - but it is too general for useful progress to be made without specifying further structure or further restrictions. There are two main types of stochastic process which are both general enough to be sufficiently flexible to model many commonly encountered situations, and sufficiently specific and structured to have a rich and powerful theory. These two types are Markov processes and martingales. A Markov process models a situation in which where one is, is all one needs to know when wishing to predict the future - how one got there provides no further information. Such a 'lack of memory' property, though an idealization of reality, is very useful for modeling purposes. We shall encounter Markov processes more in continuous time (see Chapter 5) than in discrete time, where usage dictates that they are called Markov chains, see §3.8. For an excellent and accessible recent treatment of Markov chains, see e.g. Norris (1997). Martingales, on the other hand (see §3.3 below), model fair gambling games - situations where there may be lots of randomness (or unpredictability), but no tendency to drift one way or another: rather, there is a tendency towards stability, in that the chance influences tend to cancel each other out on average.
3.3 Definition and Basic Properties of Martingales

Excellent accounts of discrete-parameter martingales are Neveu (1975), Jacod and Protter (2000), Williams (1991) and Williams (2001), to which we refer the reader for detailed discussions. We will summarize what we need to use martingales for modeling in finance.

Definition 3.3.1. A process $X = (X_n)$ is called a martingale relative to $(\{\mathcal{F}_n\}, \mathbb{P})$ if
(i) $X$ is adapted (to $\{\mathcal{F}_n\}$);
(ii) $\mathbb{E}|X_n| < \infty$ for all $n$;
(iii) $\mathbb{E}[X_n | \mathcal{F}_{n-1}] = X_{n-1}$ $\mathbb{P}$-a.s. ($n \ge 1$).
$X$ is a supermartingale if in place of (iii)
$$\mathbb{E}[X_n | \mathcal{F}_{n-1}] \le X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1);$$
$X$ is a submartingale if in place of (iii)
$$\mathbb{E}[X_n | \mathcal{F}_{n-1}] \ge X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1).$$
Martingales have a useful interpretation in terms of dynamic games: a martingale is 'constant on average', and models a fair game; a supermartingale is 'decreasing on average', and models an unfavourable game; a submartingale is 'increasing on average', and models a favourable game.
Note. 1. Martingales have many connections with harmonic functions in probabilistic potential theory. The terminology in the inequalities above comes from this: supermartingales correspond to superharmonic functions, submartingales to subharmonic functions.
2. $X$ is a submartingale (supermartingale) if and only if $-X$ is a supermartingale (submartingale); $X$ is a martingale if and only if it is both a submartingale and a supermartingale.
3. $(X_n)$ is a martingale if and only if $(X_n - X_0)$ is a martingale. So we may without loss of generality take $X_0 = 0$ when convenient.
4. If $X$ is a martingale, then for $m < n$, using the iterated conditional expectation and the martingale property repeatedly (all equalities are in the a.s.-sense),
$$\mathbb{E}[X_n \mid \mathcal{F}_m] = \mathbb{E}[\mathbb{E}(X_n \mid \mathcal{F}_{n-1}) \mid \mathcal{F}_m] = \mathbb{E}[X_{n-1} \mid \mathcal{F}_m] = \dots = \mathbb{E}[X_m \mid \mathcal{F}_m] = X_m,$$
and similarly for submartingales, supermartingales.
From the Oxford English Dictionary: martingale (etymology unknown) 1. 1589. An article of harness, to control a horse's head. 2. Naut. A rope for guying down the jib-boom to the dolphin-striker. 3. A system of gambling which consists in doubling the stake when losing in order to recoup oneself (1815). Thackeray: 'You have not played as yet? Do not do so; above all avoid a martingale if you do.'
Gambling games have been studied since time immemorial - indeed, the Pascal-Fermat correspondence of 1654 which started the subject was on a problem (de Méré's problem) related to gambling. The doubling strategy above has been known at least since 1815. The term 'martingale' in our sense is due to J. Ville (1939). Martingales were studied by Paul Lévy (1886-1971) from 1934 on (see obituary Loève (1973)) and by J.L. Doob (1910-) from 1940 on. The first systematic exposition was Doob (1953). This classic book, though hard going, is still a valuable source of information.
Examples. 1. Mean zero random walk: $S_n = \sum_{i=1}^n X_i$, with $X_i$ independent with $\mathbb{E}(X_i) = 0$, is a martingale (submartingale: positive mean; supermartingale: negative mean).
2. Stock prices: $S_n = S_0 \zeta_1 \cdots \zeta_n$ with $\zeta_i$ independent positive r.v.s with existing first moment.
3. Accumulating data about a random variable (Williams (1991), pp. 96, 166-167). If $\xi \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, $M_n := \mathbb{E}(\xi \mid \mathcal{F}_n)$ (so $M_n$ represents our best estimate of $\xi$ based on knowledge at time $n$), then using iterated conditional expectations
$$\mathbb{E}[M_n \mid \mathcal{F}_{n-1}] = \mathbb{E}[\mathbb{E}(\xi \mid \mathcal{F}_n) \mid \mathcal{F}_{n-1}] = \mathbb{E}[\xi \mid \mathcal{F}_{n-1}] = M_{n-1},$$
so $(M_n)$ is a martingale. One has the convergence
$$M_n \to M_\infty := \mathbb{E}[\xi \mid \mathcal{F}_\infty] \quad \text{a.s. and in } L^1, \qquad \mathcal{F}_\infty := \sigma\Big(\bigcup_n \mathcal{F}_n\Big).$$
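Example 3 can be checked by brute force on a small finite probability space. The following is a minimal sketch, assuming $N = 4$ fair coin tosses, $\mathcal{F}_n$ generated by the first $n$ tosses, and the illustrative choice $\xi = (\text{number of heads})^2$; the function names `xi`, `cond_exp` and `is_martingale` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction
from itertools import product

N = 4  # number of fair coin tosses (illustrative assumption)

def xi(path):
    """Terminal random variable: here, the squared number of heads."""
    return sum(path) ** 2

def cond_exp(prefix):
    """M_n = E[xi | F_n] on the event that the first n tosses equal `prefix`."""
    rest = N - len(prefix)
    total = sum(Fraction(xi(prefix + tail)) for tail in product((0, 1), repeat=rest))
    return total / 2 ** rest

def is_martingale():
    """Check E[M_{n+1} | F_n] = M_n on every atom of F_n, for n < N."""
    for n in range(N):
        for prefix in product((0, 1), repeat=n):
            avg_next = (cond_exp(prefix + (0,)) + cond_exp(prefix + (1,))) / 2
            if avg_next != cond_exp(prefix):
                return False
    return True

print(is_martingale())  # True
print(cond_exp(()))     # M_0 = E[xi] = 5
```

Exact rational arithmetic is used so that the martingale identity is tested as an equality rather than up to rounding error.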
3.4 Martingale Transforms
Now think of a gambling game, or series of speculative investments, in discrete time. There is no play at time 0; there are plays at times $n = 1, 2, \dots$, and
$$\Delta X_n := X_n - X_{n-1}$$
represents our net winnings per unit stake at play $n$. Thus if $X_n$ is a martingale, the game is 'fair on average'.
Call a process $C = (C_n)_{n=1}^\infty$ predictable if $C_n$ is $\mathcal{F}_{n-1}$-measurable for all $n \ge 1$. Think of $C_n$ as your stake on play $n$ ($C_0$ is not defined, as there is no play at time 0). Predictability says that you have to decide how much to stake on play $n$ based on the history before time $n$ (i.e., up to and including play $n - 1$). Your winnings on game $n$ are $C_n \Delta X_n = C_n (X_n - X_{n-1})$. Your total (net) winnings up to time $n$ are
$$Y_n = \sum_{k=1}^n C_k \Delta X_k = \sum_{k=1}^n C_k (X_k - X_{k-1}).$$
We write
$$Y = C \bullet X, \qquad Y_n = (C \bullet X)_n, \qquad \Delta Y_n = C_n \Delta X_n$$
($(C \bullet X)_0 = 0$ as $\sum_{k=1}^0$ is empty), and call $C \bullet X$ the martingale transform of $X$ by $C$.
Theorem 3.4.1. (i) If $C$ is a bounded non-negative predictable process and $X$ is a supermartingale, $C \bullet X$ is a supermartingale null at zero.
(ii) If $C$ is bounded and predictable and $X$ is a martingale, $C \bullet X$ is a martingale null at zero.
Proof. $Y = C \bullet X$ is integrable, since $C$ is bounded and $X$ integrable. Now
$$\mathbb{E}[Y_n - Y_{n-1} \mid \mathcal{F}_{n-1}] = \mathbb{E}[C_n (X_n - X_{n-1}) \mid \mathcal{F}_{n-1}] = C_n \mathbb{E}[(X_n - X_{n-1}) \mid \mathcal{F}_{n-1}]$$
(as $C_n$ is bounded, so integrable, and $\mathcal{F}_{n-1}$-measurable, so can be taken out)
$$\le 0 \ \text{in case (i), as } C \ge 0 \text{ and } X \text{ is a supermartingale}, \qquad = 0 \ \text{in case (ii), as } X \text{ is a martingale.} \qquad \Box$$
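Part (ii) of Theorem 3.4.1 can be illustrated numerically. The sketch below assumes a fair $\pm 1$ game over a short horizon and a hypothetical staking rule (double after a loss, reset to 1 after a win); the stake for play $n$ is computed from earlier increments only, which is what predictability requires. The names `stake`, `transform` and `expected_winnings` are illustrative, not from the text.

```python
from fractions import Fraction
from itertools import product

def stake(history):
    """Predictable stake C_n: depends only on the increments before play n.

    Illustrative rule: double after each loss, reset to 1 after a win.
    """
    c = 1
    for step in history:
        c = 2 * c if step == -1 else 1
    return c

def transform(path):
    """(C . X)_n = sum_k C_k (X_k - X_{k-1}) along one path of +/-1 increments."""
    return sum(stake(path[:k]) * path[k] for k in range(len(path)))

def expected_winnings(n):
    """E[(C . X)_n] for the fair +/-1 walk, by enumerating all 2^n paths."""
    paths = list(product((-1, 1), repeat=n))
    return sum(Fraction(transform(p)) for p in paths) / len(paths)

for n in range(1, 7):
    print(n, expected_winnings(n))  # each expectation is 0
```

On a fixed finite horizon the stake is bounded, so the theorem applies and the expected net winnings vanish whatever predictable rule is substituted for `stake`.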
Interpretation. You can't beat the system! In the martingale case, predictability of $C$ means we can't foresee the future (which is realistic and fair). So we expect to gain nothing - as we should.
Note. 1. Martingale transforms were introduced and studied by Burkholder (1966). For a textbook account, see e.g. Neveu (1975), VIII.4.
2. Martingale transforms are the discrete analogues of stochastic integrals. They dominate the mathematical theory of finance in discrete time, just as stochastic integrals dominate the theory in continuous time.
3. We will deal with stochastic integrals in Chapter 5, where we cover Itô calculus, a probabilistic elaboration of ordinary calculus, the old-fashioned term for which is infinitesimal calculus. Martingale transforms belong to the probabilistic elaboration of the discrete analogue of this, the calculus of finite differences. The passage from discrete to continuous is written formally as
$$\Delta Y_n = C_n \Delta X_n \quad \longrightarrow \quad dY_t = C_t \, dX_t,$$
where the $dY$, $dX$ on the right are stochastic differentials.
Lemma 3.4.1 (Martingale Transform Lemma). An adapted sequence of real integrable random variables $(X_n)$ is a martingale iff for any bounded predictable sequence $(C_n)$,
$$\mathbb{E}\Big(\sum_{k=1}^n C_k \Delta X_k\Big) = 0 \quad (n = 1, 2, \dots).$$
Proof. If $(X_n)$ is a martingale, $Y$ defined by $Y_0 = 0$,
$$Y_n = \sum_{k=1}^n C_k \Delta X_k \quad (n \ge 1)$$
is the martingale transform $C \bullet X$, so is a martingale. Now $\mathbb{E}(Y_1) = \mathbb{E}(C_1 \mathbb{E}(X_1 - X_0 \mid \mathcal{F}_0)) = 0$ and we see by induction that
$$\mathbb{E}(Y_{n+1}) = \mathbb{E}(Y_n) + \mathbb{E}\big(C_{n+1} \mathbb{E}(X_{n+1} - X_n \mid \mathcal{F}_n)\big) = 0.$$
Conversely, if the condition of the lemma holds, choose $j$, and for any $\mathcal{F}_j$-measurable set $A$ write $C_n = 0$ for $n \ne j + 1$, $C_{j+1} = \mathbf{1}_A$. Then $(C_n)$ is predictable, so the condition of the lemma, $\mathbb{E}(\sum_{k=1}^n C_k \Delta X_k) = 0$, becomes
$$\mathbb{E}[\mathbf{1}_A (X_{j+1} - X_j)] = 0.$$
Since this holds for every set $A \in \mathcal{F}_j$, the definition of conditional expectation gives
$$\mathbb{E}(X_{j+1} \mid \mathcal{F}_j) = X_j.$$
Since this holds for every $j$, $(X_n)$ is a martingale. $\Box$
Remark 3.4.1. The proof above is a good example of the value of Kolmogorov's definition of conditional expectation - which reveals itself, not in immediate transparency, but in its ease of handling in proofs. We shall see in Chapter 4 the financial significance of martingale transforms $H \bullet M$.
3.5 Stopping Times and Optional Stopping
A random variable $\tau$ taking values in $\{0, 1, 2, \dots; +\infty\}$ is called a stopping time (or optional time) if
$$\{\tau \le n\} = \{\omega : \tau(\omega) \le n\} \in \mathcal{F}_n \quad \forall n \le \infty.$$
From $\{\tau = n\} = \{\tau \le n\} \setminus \{\tau \le n - 1\}$ and $\{\tau \le n\} = \bigcup_{k \le n} \{\tau = k\}$, we see the equivalent characterization
$$\{\tau = n\} \in \mathcal{F}_n \quad \forall n \le \infty.$$
Call a stopping time $\tau$ bounded if there is a constant $K$ such that $\mathbb{P}(\tau \le K) = 1$. (Since $\tau(\omega) \le K$ for some constant $K$ and all $\omega \in \Omega \setminus N$ with $\mathbb{P}(N) = 0$, all identities hold true except on a null set, i.e. almost surely.)
Example. Suppose $(X_n)$ is an adapted process and we are interested in the time of first entry of $X$ into a Borel set $B$ (typically one might have $B = [c, \infty)$):
$$\tau := \inf\{n \ge 0 : X_n \in B\}.$$
Now $\{\tau \le n\} = \bigcup_{k \le n} \{X_k \in B\} \in \mathcal{F}_n$ and $\tau = \infty$ if $X$ never enters $B$. Thus $\tau$ is a stopping time.
Intuitively, think of $\tau$ as a time at which you decide to quit a gambling game: whether or not you quit at time $n$ depends only on the history up to and including time $n$ - NOT the future. Thus stopping times model gambling and other situations where there is no foreknowledge, or prescience of the future; in particular, in the financial context, where there is no insider trading. Furthermore, since a gambler cannot cheat the system, the expectation of his hypothetical fortune (playing with unit stake) should equal his initial fortune.
Theorem 3.5.1 (Doob's Stopping-time Principle (STP)). Let $\tau$ be a bounded stopping time and $X = (X_n)$ a martingale. Then $X_\tau$ is integrable, and
$$\mathbb{E}(X_\tau) = \mathbb{E}(X_0).$$
Proof. Assume $\tau(\omega) \le K$ for all $\omega$, where we can take $K$ to be an integer, and write
$$X_{\tau(\omega)}(\omega) = \sum_{k=0}^\infty X_k(\omega) \mathbf{1}_{\{\tau(\omega) = k\}} = \sum_{k=0}^K X_k(\omega) \mathbf{1}_{\{\tau(\omega) = k\}}.$$
Thus using successively the linearity of the expectation operator, the martingale property of $X$, the $\mathcal{F}_k$-measurability of $\{\tau = k\}$ and finally the definition of conditional expectation, we get
$$\mathbb{E}(X_\tau) = \mathbb{E}\Big[\sum_{k=0}^K X_k \mathbf{1}_{\{\tau = k\}}\Big] = \sum_{k=0}^K \mathbb{E}\big[X_k \mathbf{1}_{\{\tau = k\}}\big] = \sum_{k=0}^K \mathbb{E}\big[\mathbb{E}(X_K \mid \mathcal{F}_k) \mathbf{1}_{\{\tau = k\}}\big] = \sum_{k=0}^K \mathbb{E}\big[X_K \mathbf{1}_{\{\tau = k\}}\big] = \mathbb{E}\Big[X_K \sum_{k=0}^K \mathbf{1}_{\{\tau = k\}}\Big] = \mathbb{E}(X_K) = \mathbb{E}(X_0). \qquad \Box$$
The stopping time principle also holds if $X = (X_n)$ is a supermartingale; then the conclusion is
$$\mathbb{E} X_\tau \le \mathbb{E} X_0.$$
Also, alternative conditions such as
(i) $X = (X_n)$ is bounded ($|X_n(\omega)| \le L$ for some $L$ and all $n$, $\omega$);
(ii) $\mathbb{E}\tau < \infty$ and $(X_n - X_{n-1})$ is bounded;
suffice for the proof of the stopping time principle.
The stopping time principle is important in many areas, such as sequential analysis in statistics. We turn in the next section to related ideas specific to the gambling/financial context.
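The Stopping-time Principle can be verified exactly for a simple bounded stopping time. The sketch below assumes the fair $\pm 1$ random walk started at 0 and the rule 'quit when first ahead by `level`, or at the horizon $K$ at the latest', i.e. the bounded stopping time $\tau \wedge K$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def stopped_value(path, level=1):
    """S at time min(tau, K), where tau is the first n with S_n = level and K = len(path)."""
    s = 0
    for step in path:
        s += step
        if s == level:
            return s  # quit as soon as we are ahead by `level`
    return s          # horizon K reached without hitting the level

def expected_stopped_value(K, level=1):
    """E[S at time min(tau, K)] for the fair +/-1 walk, enumerating all 2^K paths."""
    paths = list(product((-1, 1), repeat=K))
    return sum(Fraction(stopped_value(p, level)) for p in paths) / len(paths)

def prob_level_reached(K, level=1):
    """P(tau <= K): the probability that the level is reached by time K."""
    paths = list(product((-1, 1), repeat=K))
    hits = sum(1 for p in paths if stopped_value(p, level) == level)
    return Fraction(hits, len(paths))

for K in (1, 3, 5, 10):
    print(K, expected_stopped_value(K), prob_level_reached(K))
```

The expectation stays at $0 = \mathbb{E}(S_0)$ for every horizon $K$, even though the probability of having reached the level grows with $K$: the rare paths that have not yet won carry correspondingly large losses.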
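We now wish to create the concept of the $\sigma$-algebra of events observable up to a stopping time $\tau$, in analogy to the $\sigma$-algebra $\mathcal{F}_n$ which represents the events observable up to time $n$.
Definition 3.5.1. Let $\tau$ be a stopping time. The stopping time $\sigma$-algebra $\mathcal{F}_\tau$ is defined to be
$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le n\} \in \mathcal{F}_n \text{ for all } n\}.$$
Proposition 3.5.1. For $\tau$ a stopping time, $\mathcal{F}_\tau$ is a $\sigma$-algebra.
Proof. We simply have to check the defining properties. Clearly $\Omega$, $\emptyset$ are in $\mathcal{F}_\tau$. Also for $A \in \mathcal{F}_\tau$ we find
$$A^c \cap \{\tau \le n\} = \{\tau \le n\} \setminus (A \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
thus $A^c \in \mathcal{F}_\tau$. Finally, for a family $A_i \in \mathcal{F}_\tau$, $i = 1, 2, \dots$ we have
$$\Big(\bigcup_{i=1}^\infty A_i\Big) \cap \{\tau \le n\} = \bigcup_{i=1}^\infty \big(A_i \cap \{\tau \le n\}\big) \in \mathcal{F}_n. \qquad \Box$$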
Proposition 3.5.2. Let $\sigma$, $\tau$ be stopping times with $\sigma \le \tau$. Then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.
Proof. Since $\sigma \le \tau$ we have $\{\tau \le n\} \subseteq \{\sigma \le n\}$. So for $A \in \mathcal{F}_\sigma$ we get
$$A \cap \{\tau \le n\} = (A \cap \{\sigma \le n\}) \cap \{\tau \le n\} \in \mathcal{F}_n,$$
since $(\cdot) \in \mathcal{F}_n$ as $A \in \mathcal{F}_\sigma$. So $A \in \mathcal{F}_\tau$. $\Box$
Proposition 3.5.3. For any adapted sequence of random variables $X = (X_n)$ and a.s. finite stopping time $\tau$, define
$$X_\tau = \sum_{n=0}^\infty X_n \mathbf{1}_{\{\tau = n\}}.$$
Then $X_\tau$ is $\mathcal{F}_\tau$-measurable.
Proof. Let $B$ be a Borel set. We need to show $\{X_\tau \in B\} \in \mathcal{F}_\tau$. Now using the fact that on the set $\{\tau = k\}$ we have $X_\tau = X_k$, we find
$$\{X_\tau \in B\} \cap \{\tau \le n\} = \bigcup_{k=0}^n \{X_\tau \in B\} \cap \{\tau = k\} = \bigcup_{k=0}^n \{X_k \in B\} \cap \{\tau = k\}.$$
Now sets $\{X_k \in B\} \cap \{\tau = k\} \in \mathcal{F}_k \subseteq \mathcal{F}_n$, and the result follows. $\Box$
We are now in position to obtain an important extension of the Stopping Time Principle, Theorem 3.5.1.
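Theorem 3.5.2 (Doob's Optional-Sampling Theorem, OST). Let $X = (X_n)$ be a martingale and let $\sigma$, $\tau$ be bounded stopping times with $\sigma \le \tau$. Then
$$\mathbb{E}[X_\tau \mid \mathcal{F}_\sigma] = X_\sigma,$$
and thus $\mathbb{E}(X_\tau) = \mathbb{E}(X_\sigma)$.
Proof. First observe that $X_\tau$ and $X_\sigma$ are integrable (use the sum representation and the fact that $\tau$ is bounded by an integer $K$) and $X_\sigma$ is $\mathcal{F}_\sigma$-measurable by Proposition 3.5.3. So it only remains to prove that
$$\mathbb{E}(X_\tau \mathbf{1}_A) = \mathbb{E}(X_\sigma \mathbf{1}_A) \quad \forall A \in \mathcal{F}_\sigma. \qquad (3.1)$$
For any such fixed $A \in \mathcal{F}_\sigma$, define $\rho$ by
$$\rho(\omega) = \sigma(\omega) \mathbf{1}_A(\omega) + \tau(\omega) \mathbf{1}_{A^c}(\omega).$$
Since
$$\{\rho \le n\} = (A \cap \{\sigma \le n\}) \cup (A^c \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
$\rho$ is a stopping time, and from $\rho \le \tau$ we see that $\rho$ is bounded. So the STP (Theorem 3.5.1) implies $\mathbb{E}(X_\rho) = \mathbb{E}(X_0) = \mathbb{E}(X_\tau)$. But
$$\mathbb{E}(X_\rho) = \mathbb{E}(X_\sigma \mathbf{1}_A + X_\tau \mathbf{1}_{A^c}), \qquad \mathbb{E}(X_\tau) = \mathbb{E}(X_\tau \mathbf{1}_A + X_\tau \mathbf{1}_{A^c}).$$
So subtracting yields (3.1). $\Box$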
We can establish a further characterization of the martingale property.
Proposition 3.5.4. Let $X = (X_n)$ be an adapted sequence of random variables with $\mathbb{E}(|X_n|) < \infty$ for all $n$ and $\mathbb{E}(X_\tau) = 0$ for all bounded stopping times $\tau$. Then $X$ is a martingale.
Proof. Let $0 \le m < n < \infty$ and $A \in \mathcal{F}_m$. Define a stopping time $\tau$ by $\tau = n \mathbf{1}_A + m \mathbf{1}_{A^c}$. Then
$$0 = \mathbb{E}(X_\tau) = \mathbb{E}(X_n \mathbf{1}_A + X_m \mathbf{1}_{A^c}),$$
$$0 = \mathbb{E}(X_m) = \mathbb{E}(X_m \mathbf{1}_A + X_m \mathbf{1}_{A^c}),$$
and by subtraction we obtain $\mathbb{E}(X_m \mathbf{1}_A) = \mathbb{E}(X_n \mathbf{1}_A)$. Since this holds for all $A \in \mathcal{F}_m$, $\mathbb{E}(X_n \mid \mathcal{F}_m) = X_m$ by definition of conditional expectation. This says that $(X_n)$ is a martingale, as required. $\Box$
Write $X^\tau = (X^\tau_n)$ for the sequence $X = (X_n)$ stopped at time $\tau$, where we define
$$X^\tau_n(\omega) := X_{\tau(\omega) \wedge n}(\omega).$$
Proposition 3.5.5. (i) If $X$ is adapted and $\tau$ is a stopping time, then the stopped sequence $X^\tau$ is adapted.
(ii) If $X$ is a martingale (super-, submartingale) and $\tau$ is a stopping time, $X^\tau$ is a martingale (super-, submartingale).
Proof. Let $C_j := \mathbf{1}_{\{j \le \tau\}}$; then
$$X_{\tau \wedge n} = X_0 + \sum_{j=1}^n C_j (X_j - X_{j-1})$$
(as the right is $X_0 + \sum_{j=1}^{\tau \wedge n} (X_j - X_{j-1})$, which telescopes to $X_{\tau \wedge n}$). Since $\{j \le \tau\}$ is the complement of $\{\tau < j\} = \{\tau \le j - 1\} \in \mathcal{F}_{j-1}$, $(C_n)$ is predictable. So $(X^\tau_n)$ is adapted.
If $X$ is a martingale, so is $X^\tau$ as it is the martingale transform of $(X_n)$ by $(C_n)$ (use Theorem 3.4.1).
The right-hand side above is $X_{\tau \wedge (n-1)} + C_n (X_n - X_{n-1})$. So taking conditional expectation given $\mathcal{F}_{n-1}$ and using predictability of $(C_n)$,
$$\mathbb{E}[X_{\tau \wedge n} \mid \mathcal{F}_{n-1}] = X_{\tau \wedge (n-1)} + C_n \big(\mathbb{E}[X_n \mid \mathcal{F}_{n-1}] - X_{n-1}\big).$$
Then $C_n \ge 0$ shows that if $X$ is a supermartingale (submartingale), so is the stopped sequence $X^\tau$. $\Box$
Note. 1. See e.g. Neveu (1975), Lemma II-12.14 for a variant on this proof avoiding the language of martingale transforms.
2. Part (ii) of the proposition says that if a game is fair, it remains fair when stopped (by a stopping time, i.e. without prescience); if it is (un-)favourable, it remains (un-)favourable when stopped.
Examples and Applications. 1. Simple Random Walk. Recall the simple random walk: $S_n := \sum_{k=1}^n X_k$, where the $X_n$ are independent tosses of a fair coin, taking values $\pm 1$ with equal probability 1/2. Suppose we decide to bet until our net gain is first $+1$, then quit. Let $\tau$ be the time we quit; $\tau$ is a stopping time.
The stopping time $\tau$ has been analyzed in detail; see e.g. Grimmett and Stirzaker (2001), §5.3, or Exercise 3.4. From this, note:
(i) $\tau < \infty$ a.s.: the gambler will certainly achieve a net gain of $+1$ eventually;
(ii) $\mathbb{E}\tau = +\infty$: the mean waiting-time until this happens is infinity. Hence also:
(iii) No bound can be imposed on the gambler's maximum net loss before his net gain first becomes $+1$.
At first sight, this looks like a foolproof way to make money out of nothing: just bet until you get ahead (which happens eventually, by (i)), then quit. However, as a gambling strategy, this is hopelessly impractical: because of
(ii), you need unlimited time, and because of (iii), you need unlimited capital - neither of which is realistic.
Notice that the Stopping-time Principle (Theorem 3.5.1) fails here: we start at zero, so $S_0 = 0$, $\mathbb{E} S_0 = 0$; but $S_\tau = 1$, so $\mathbb{E} S_\tau = 1$. This example shows two things:
• Conditions are indeed needed here, or the conclusion may fail (none of the conditions in Theorem 3.5.1 or the alternatives (i), (ii) given are satisfied in this example).
• Any practical gambling (or trading) strategy needs to have some integrability or boundedness restrictions to eliminate such theoretically possible but practically ridiculous cases.
2. The Doubling Strategy. The strategy of doubling when losing - the martingale, according to the Oxford English Dictionary (§3.3) - has similar properties. We play until the time $\tau$ of our first win. Then $\tau$ is a stopping time, and is geometrically distributed with parameter $p = 1/2$. If $\tau = n$, our winnings on the $n$th play are $2^{n-1}$ (our previous stake of 1 doubled on each of the previous $n - 1$ losses). Our cumulative losses to date are $1 + 2 + \dots + 2^{n-2} = 2^{n-1} - 1$ (summing the geometric series), giving us a net gain of 1. The mean time of play is $\mathbb{E}(\tau) = 2$ (so doubling strategies accelerate our eventually certain win to give a finite expected waiting time for it). But no bound can be put on the losses one may need to sustain before we win, so again we would need unlimited capital to implement this strategy - which would be suicidal in practice as a result.
3. The Saint Petersburg Game. A single play of the Saint Petersburg game consists of a sequence of coin tosses stopped at the first head; if this is the $r$th toss, the player receives a prize of $\$2^r$. (Thus the expected gain is $\sum_{r=1}^\infty 2^{-r} \cdot 2^r = +\infty$, so the random variable is not integrable, and martingale theory does not apply.) Let $S_n$ denote the player's cumulative gain after $n$ plays of the game. The question arises as to what the 'fair price' of a ticket to play the game is.
It turns out that fair prices exist (in a suitable sense), but the fair price of the $n$th play varies with $n$ - surprising, as all the plays are replicas of each other. Other examples may be constructed of games which are 'fair' in some sense, but in which the player sustains a net loss, tending to $-\infty$, with probability one. For a discussion of such examples, see e.g. Feller (1968), X.3.
4. Ruin Probabilities. Let $S_n$ be the total assets of an insurance company at the end of year $n$. During year $n$ premiums totaling $c$ pounds are received, while claims totaling $\zeta_n$ pounds are paid, so
$$S_n = S_{n-1} + c - \zeta_n.$$
Let $\eta_n = c - \zeta_n$ and suppose that $\eta_1, \eta_2, \dots$ are independent random variables which are normally distributed with mean $\mu > 0$ and variance $\sigma^2$. Let $B$ (for 'bankrupt') be the event that the wealth of the insurance company is negative at some time $n$. We show
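$$\mathbb{P}(B) \le \exp\{-2\mu S_0 / \sigma^2\}. \qquad (3.2)$$
Thus, $\mu S_0 / \sigma^2$ must be large in order for the insurance company to be successful with a high probability.
To show (3.2) we use the exponential martingale: for a sequence of i.i.d. random variables $(\eta_i)$ with existing moment generating function, i.e. $\phi(\theta) := \mathbb{E}(\exp\{\theta \eta_1\}) < \infty$ for some $\theta \ne 0$, one can check that the sequence
$$M_n = \frac{\exp\{\theta S_n\}}{\phi(\theta)^n}$$
is a martingale ($S_n = S_0 + \sum_{i=1}^n \eta_i$ as usual). Above we assumed $\eta_i$ to be normally distributed, so $\phi(\theta) = \exp\{\theta\mu + \theta^2 \sigma^2 / 2\}$. Choose $\theta = -2\mu / \sigma^2$; then $\phi(\theta) = 1$ and $M_n = \exp\{-2\mu S_n / \sigma^2\}$ is a martingale.
Now the event of bankruptcy occurs if the wealth of the insurance company is negative, so define the stopping time $\tau = \inf\{n : S_n < 0\}$. Thus $\mathbb{P}(B) = \mathbb{P}(\tau < \infty)$. Since for any $n$ the stopping time $\tau \wedge n$ is bounded and by Proposition 3.5.5 $M_{\tau \wedge n}$ is a martingale, we can use Doob's stopping principle, Theorem 3.5.1, to obtain
$$\exp\{-2\mu S_0 / \sigma^2\} = \mathbb{E}(M_0) = \mathbb{E}(M_{\tau \wedge n}).$$
Now
$$\mathbb{E}(M_{\tau \wedge n}) = \mathbb{E}\big(M_\tau \mathbf{1}_{\{\tau \le n\}}\big) + \mathbb{E}\big(M_n \mathbf{1}_{\{\tau > n\}}\big) \ge \mathbb{E}\big(M_\tau \mathbf{1}_{\{\tau \le n\}}\big) \ge \mathbb{P}(\tau \le n),$$
where we used the positivity of the second expression, and the fact that from $S_\tau < 0$ and so $-2\mu S_\tau / \sigma^2 \ge 0$ it follows that $\exp\{\cdot\} \ge 1$. Combining the last two expressions, we find
$$\mathbb{P}(\tau \le n) \le \exp\{-2\mu S_0 / \sigma^2\}.$$
Since the inequality holds for all $n$, (3.2) follows.
3.6 The Snell Envelope and Optimal Stopping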
We now discuss the Snell envelope, which will be an important tool for the valuation of American options. The idea is due to Snell (1952); for a textbook account, see e.g. Neveu (1975), VI.
Definition 3.6.1. If $X = (X_n)_{n=0}^N$ is a sequence adapted to a filtration $\mathcal{F}_n$ with $\mathbb{E}(|X_n|) < \infty$, the sequence $Z = (Z_n)_{n=0}^N$ defined by
$$Z_N := X_N, \qquad Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\} \quad (n \le N - 1)$$
is called the Snell envelope of $X$.
We shall see in Chapter 4 that the Snell envelope is the tool needed in pricing American options.
Theorem 3.6.1. The Snell envelope $Z$ of $X$ is a supermartingale, and is the smallest supermartingale dominating $X$ (that is, with $Z_n \ge X_n$ for all $n$).
Proof. First, $Z_n \ge \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$, so $Z$ is a supermartingale, and $Z_n \ge X_n$, so $Z$ dominates $X$.
Next, let $Y = (Y_n)$ be any other supermartingale dominating $X$; we must show $Y$ dominates $Z$ also. First, since $Z_N = X_N$ and $Y$ dominates $X$, we must have $Y_N \ge Z_N$. Assume inductively that $Y_n \ge Z_n$. Then as $Y$ is a supermartingale,
$$Y_{n-1} \ge \mathbb{E}(Y_n \mid \mathcal{F}_{n-1}) \ge \mathbb{E}(Z_n \mid \mathcal{F}_{n-1}),$$
and as $Y$ dominates $X$,
$$Y_{n-1} \ge X_{n-1}.$$
Combining,
$$Y_{n-1} \ge \max\{X_{n-1}, \mathbb{E}(Z_n \mid \mathcal{F}_{n-1})\} = Z_{n-1}.$$
By repeating this argument (or more formally, by backward induction), $Y_n \ge Z_n$ for all $n$, as required. $\Box$
Note. The definition of the Snell envelope involves a new feature, which calls for emphasis. It is defined by recursion, backwards in time. Thus we begin by defining $Z$ at the terminal time $N$, and, after repeated recursion (or by induction), end by defining $Z$ at the initial time 0. This is no accident: it is essentially the method of dynamic programming (DP), due to Richard Bellman, from operational research (OR). We shall encounter it again in Chapter 4, where we will price American options by backward recursion through a binary tree.
Proposition 3.6.1. $\tau^* := \inf\{n \ge 0 : Z_n = X_n\}$ is a stopping time, and the stopped process $Z^{\tau^*}$ is a martingale.
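Proof. Since $Z_N = X_N$, $\tau^* \in \{0, 1, \dots, N\}$ is well-defined and clearly bounded. For $k = 0$, $\{\tau^* = 0\} = \{Z_0 = X_0\} \in \mathcal{F}_0$; for $k \ge 1$,
$$\{\tau^* = k\} = \{Z_0 > X_0\} \cap \dots \cap \{Z_{k-1} > X_{k-1}\} \cap \{Z_k = X_k\} \in \mathcal{F}_k.$$
So $\tau^*$ is a stopping time. As in the proof of Proposition 3.5.5,
$$Z^{\tau^*}_n = Z_{n \wedge \tau^*} = Z_0 + \sum_{j=1}^n C_j \Delta Z_j,$$
where $C_j = \mathbf{1}_{\{j \le \tau^*\}}$ is predictable. For $n \le N - 1$,
$$Z^{\tau^*}_{n+1} - Z^{\tau^*}_n = C_{n+1} (Z_{n+1} - Z_n) = \mathbf{1}_{\{n+1 \le \tau^*\}} (Z_{n+1} - Z_n).$$
Now $Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\}$, and by definition of $\tau^*$,
$$Z_n > X_n \quad \text{on } \{n + 1 \le \tau^*\}.$$
So from the definition of $Z_n$,
$$Z_n = \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) \quad \text{on } \{n + 1 \le \tau^*\}.$$
We next prove
$$Z^{\tau^*}_{n+1} - Z^{\tau^*}_n = \mathbf{1}_{\{n+1 \le \tau^*\}} \big(Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\big). \qquad (3.3)$$
For, suppose first that $\tau^* \ge n + 1$. Then the left of (3.3) is $Z_{n+1} - Z_n$, the right is $Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$, and these agree on $\{n + 1 \le \tau^*\}$ by above. The other possibility is that $\tau^* < n + 1$, i.e. $\tau^* \le n$. Then the left of (3.3) is $Z_{\tau^*} - Z_{\tau^*} = 0$, while the right is zero because the indicator is zero, completing the proof of (3.3).
Now apply $\mathbb{E}(\cdot \mid \mathcal{F}_n)$ to (3.3): since $\{n + 1 \le \tau^*\} = \{\tau^* \le n\}^c \in \mathcal{F}_n$,
$$\mathbb{E}\big[(Z^{\tau^*}_{n+1} - Z^{\tau^*}_n) \mid \mathcal{F}_n\big] = \mathbf{1}_{\{n+1 \le \tau^*\}} \mathbb{E}\big((Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)) \mid \mathcal{F}_n\big) = \mathbf{1}_{\{n+1 \le \tau^*\}} \big(\mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\big) = 0.$$
So $\mathbb{E}(Z^{\tau^*}_{n+1} \mid \mathcal{F}_n) = Z^{\tau^*}_n$. This says that $Z^{\tau^*}$ is a martingale, as required. $\Box$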
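Write $\mathcal{T}_{n,N}$ for the set of stopping times taking values in $\{n, n + 1, \dots, N\}$ (a finite set, as $\Omega$ is finite). Call a stopping time $\sigma \in \mathcal{T}_{n,N}$ optimal for $(X_n)$ if
$$\mathbb{E}(X_\sigma \mid \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau \mid \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}.$$
We next see that the Snell envelope can be used to solve the optimal stopping problem for $(X_n)$ in $\mathcal{T}_{0,N}$. Recall that $\mathcal{F}_0 = \{\emptyset, \Omega\}$ so $\mathbb{E}(Y \mid \mathcal{F}_0) = \mathbb{E}(Y)$ for any integrable random variable $Y$.
Theorem 3.6.2. $\tau^*$ solves the optimal stopping problem for $X$:
$$Z_0 = \mathbb{E}(X_{\tau^*}) = \sup\{\mathbb{E}(X_\tau) : \tau \in \mathcal{T}_{0,N}\}.$$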
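Proof. To prove the first statement we use that $(Z^{\tau^*}_n)$ is a martingale and $Z_{\tau^*} = X_{\tau^*}$; then
$$Z_0 = Z^{\tau^*}_0 = \mathbb{E}(Z^{\tau^*}_N) = \mathbb{E}(Z_{\tau^*}) = \mathbb{E}(X_{\tau^*}). \qquad (3.4)$$
Now for any stopping time $\tau \in \mathcal{T}_{0,N}$, since $Z$ is a supermartingale (above), so is the stopped process $Z^\tau$ (see Proposition 3.5.5). Together with the property that $Z$ dominates $X$ this yields
$$Z_0 = Z^\tau_0 \ge \mathbb{E}(Z^\tau_N) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau). \qquad (3.5)$$
Combining (3.4) and (3.5) and taking the supremum on $\tau$ gives the result. $\Box$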
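The backward recursion of Definition 3.6.1 is straightforward to carry out on a tree. The following is a minimal sketch under assumptions that are not in the text: a recombining two-period binomial tree with up-probability $p = 1/2$, a price that doubles or halves from $S_0 = 4$, and the discounted put payoff $X_n = (4/5)^n (5 - S_n)^+$. Because the payoff depends on the path only through the state $(n, j)$, the conditional expectation $\mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$ reduces to an average over the two successor nodes. All function and parameter names are illustrative.

```python
from fractions import Fraction

def snell_envelope(payoff, N, p):
    """Backward recursion Z_N = X_N, Z_n = max(X_n, E[Z_{n+1} | F_n]).

    State (n, j): time n with j up-moves so far; payoff(n, j) returns X_n there.
    """
    Z = [[None] * (n + 1) for n in range(N + 1)]
    for j in range(N + 1):
        Z[N][j] = payoff(N, j)
    for n in range(N - 1, -1, -1):
        for j in range(n + 1):
            continuation = p * Z[n + 1][j + 1] + (1 - p) * Z[n + 1][j]
            Z[n][j] = max(payoff(n, j), continuation)
    return Z

def stopping_region(payoff, Z):
    """States where Z_n = X_n; tau* is the first time the path enters this set."""
    return [(n, j) for n in range(len(Z)) for j in range(n + 1)
            if Z[n][j] == payoff(n, j)]

# Hypothetical example data: S_0 = 4, up factor 2, down factor 1/2, strike 5.
S0, K, beta = 4, 5, Fraction(4, 5)

def put(n, j):
    """Discounted put payoff X_n = beta^n (K - S_n)^+ in state (n, j)."""
    s = Fraction(S0) * Fraction(2) ** j / Fraction(2) ** (n - j)
    return beta ** n * max(K - s, 0)

Z = snell_envelope(put, 2, Fraction(1, 2))
print(Z[0][0])                  # 34/25
print(stopping_region(put, Z))  # [(1, 0), (2, 0), (2, 1), (2, 2)]
```

In this example $Z_0 = 34/25 > X_0 = 1$, so it is not optimal to stop at time 0; $\tau^*$ stops at time 1 after a down-move and otherwise at the terminal time.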
The same argument, starting at time $n$ rather than time 0, gives
Theorem 3.6.3. If $\tau^*_n := \inf\{j \ge n : Z_j = X_j\}$,
$$Z_n = \mathbb{E}(X_{\tau^*_n} \mid \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau \mid \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}.$$
As we are attempting to maximize our payoff by stopping $X = (X_n)$ at the most advantageous time, Theorem 3.6.3 shows that $\tau^*_n$ gives the best stopping time that is realistic: it maximizes our expected payoff given only information currently available.
Optimal stopping problems have both an extensive - and quite deep - mathematical theory and applications to areas such as gambling, as well as the more speculative areas of mathematical finance. For a textbook treatment, see e.g. Chow, Robbins, and Siegmund (1991) (the Snell envelope is treated in §4.4), or Neveu (1975). The gambling- and game-theoretic side of things is developed in the classic Dubins and Savage (1976), and its recent sequel Maitra and Sudderth (1996).
There are extensive links between the martingale theory of Chapter 3 and potential theory, classical and probabilistic. The least supermartingale majorant - Snell envelope - in martingale theory corresponds to the least excessive majorant in potential theory. This is called the réduite (reduced function) in potential theory. It occurs in the fundamental theorem of gambling (Maitra and Sudderth (1996), §3.1); for the setting of probabilistic potential theory, see e.g. Meyer (1966), IX.2.
We proceed by analyzing optimal stopping times. One can characterize optimality by establishing a martingale property:
Proposition 3.6.2. The stopping time $\sigma \in \mathcal{T}_{0,N}$ is optimal for $(X_n)$ if and only if the following two conditions hold.
(i) $Z_\sigma = X_\sigma$;
(ii) $Z^\sigma$ is a martingale.
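Proof. We start by showing that (i) and (ii) imply optimality. If $Z^\sigma$ is a martingale then
$$Z_0 = \mathbb{E}(Z^\sigma_0) = \mathbb{E}(Z^\sigma_N) = \mathbb{E}(Z_\sigma) = \mathbb{E}(X_\sigma),$$
where we used (i) for the last identity. Since $Z$ is a supermartingale, Proposition 3.5.5 implies that $Z^\tau$ is a supermartingale for any $\tau \in \mathcal{T}_{0,N}$. Now $Z$ dominates $X$, and so
$$Z_0 = \mathbb{E}(Z^\tau_0) \ge \mathbb{E}(Z^\tau_N) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau).$$
Combining, $\sigma$ is optimal.
Now assume that $\sigma$ is optimal. Thus
$$Z_0 = \sup\{\mathbb{E}(X_\tau) : \tau \in \mathcal{T}_{0,N}\} = \mathbb{E}(X_\sigma) \le \mathbb{E}(Z_\sigma),$$
since $Z$ dominates $X$. Since $Z^\sigma$ is a supermartingale, we also have $Z_0 \ge \mathbb{E}(Z_\sigma)$. Combining,
$$\mathbb{E}(X_\sigma) = \mathbb{E}(Z_\sigma) = Z_0.$$
But $X \le Z$, so $X_\sigma \le Z_\sigma$, while by the above $X_\sigma$ and $Z_\sigma$ have the same expectation. So they must be a.s. equal: $X_\sigma = Z_\sigma$ a.s., showing (i). To see (ii), observe that for any $n \le N$
$$\mathbb{E}(Z_\sigma) = Z_0 \ge \mathbb{E}(Z_{\sigma \wedge n}) \ge \mathbb{E}(Z_\sigma),$$
where the second inequality follows from Doob's OST (Theorem 3.5.2) with the bounded stopping times $(\sigma \wedge n) \le \sigma$ and the supermartingale $Z$. Using that $Z$ is a supermartingale again, we also find
$$Z_{\sigma \wedge n} \ge \mathbb{E}(Z_\sigma \mid \mathcal{F}_n). \qquad (3.6)$$
As above, this inequality between random variables with equal expectations forces a.s. equality: $Z_{\sigma \wedge n} = \mathbb{E}(Z_\sigma \mid \mathcal{F}_n)$ a.s. Apply $\mathbb{E}(\cdot \mid \mathcal{F}_{n-1})$:
$$\mathbb{E}(Z_{\sigma \wedge n} \mid \mathcal{F}_{n-1}) = \mathbb{E}\big(\mathbb{E}(Z_\sigma \mid \mathcal{F}_n) \mid \mathcal{F}_{n-1}\big) = \mathbb{E}(Z_\sigma \mid \mathcal{F}_{n-1}) = Z_{\sigma \wedge (n-1)},$$
by above with $n - 1$ for $n$. This says
$$\mathbb{E}(Z^\sigma_n \mid \mathcal{F}_{n-1}) = Z^\sigma_{n-1},$$
so $Z^\sigma$ is a martingale. $\Box$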
From Proposition 3.6.1 and its definition (first time when $Z$ and $X$ are equal) it follows that $\tau^*$ is the smallest optimal stopping time. To find the largest optimal stopping time we try to find the time when $Z$ 'ceases to be a martingale'. In order to do so we need a structural result of genuine interest and importance.
Theorem 3.6.4 (Doob Decomposition). Let $X = (X_n)$ be an adapted process with each $X_n \in L^1$. Then $X$ has an (essentially unique) Doob decomposition
$$X = X_0 + M + A : \qquad X_n = X_0 + M_n + A_n \quad \forall n \qquad (3.7)$$
with $M$ a martingale null at zero, $A$ a predictable process null at zero. If also $X$ is a submartingale ('increasing on average'), $A$ is increasing: $A_n \le A_{n+1}$ for all $n$, a.s.
Proof. If $X$ has a Doob decomposition (3.7),
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = \mathbb{E}[M_n - M_{n-1} \mid \mathcal{F}_{n-1}] + \mathbb{E}[A_n - A_{n-1} \mid \mathcal{F}_{n-1}].$$
The first term on the right is zero, as $M$ is a martingale. The second is $A_n - A_{n-1}$, since $A_n$ (and $A_{n-1}$) is $\mathcal{F}_{n-1}$-measurable by predictability. So
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = A_n - A_{n-1}, \qquad (3.8)$$
and summation gives
$$A_n = \sum_{k=1}^n \mathbb{E}[X_k - X_{k-1} \mid \mathcal{F}_{k-1}], \quad \text{a.s.}$$
So set $A_0 = 0$ and use this formula to define $(A_n)$, clearly predictable. We then use (3.7) to define $(M_n)$, then a martingale, giving the Doob decomposition (3.7). To see uniqueness, assume two decompositions, i.e. $X_n = X_0 + M_n + A_n = X_0 + \tilde{M}_n + \tilde{A}_n$; then $M_n - \tilde{M}_n = \tilde{A}_n - A_n$. Thus the martingale $M_n - \tilde{M}_n$ is predictable and so must be constant a.s.
If $X$ is a submartingale, the LHS of (3.8) is $\ge 0$, so the RHS of (3.8) is $\ge 0$, i.e. $(A_n)$ is increasing. $\Box$
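The construction in the proof can be carried out explicitly. The sketch below assumes the simple symmetric random walk $S_n$ and the submartingale $X_n = S_n^2$ (an illustrative choice, not taken from the text); it computes the compensator from the summation formula for $A_n$ and recovers the familiar decomposition $S_n^2 = M_n + n$. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def X(path):
    """Submartingale X_n = S_n^2, with S_n the sum of the +/-1 steps in `path`."""
    return sum(path) ** 2

def compensator(path):
    """A_n = sum_{k=1}^n E[X_k - X_{k-1} | F_{k-1}] along the given path.

    Each conditional expectation averages over the two equally likely next steps.
    """
    a = Fraction(0)
    for k in range(1, len(path) + 1):
        past = path[:k - 1]
        exp_next = Fraction(X(past + (1,)) + X(past + (-1,)), 2)
        a += exp_next - X(past)
    return a

def martingale_part(path):
    """M_n = X_n - X_0 - A_n."""
    return X(path) - X(()) - compensator(path)

def is_martingale(N):
    """Check E[M_{n+1} | F_n] = M_n on every path prefix of length n < N."""
    for n in range(N):
        for prefix in product((-1, 1), repeat=n):
            avg = (martingale_part(prefix + (1,)) + martingale_part(prefix + (-1,))) / 2
            if avg != martingale_part(prefix):
                return False
    return True

print(compensator((1, -1, -1, 1)))  # 4, i.e. A_n = n
print(is_martingale(5))             # True
```

Note that `compensator` only ever inspects `path[:k - 1]`, so $A_n$ depends on the first $n - 1$ steps alone, which is the predictability asserted in the theorem.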
Although the Doob decomposition is a simple result in discrete time, the analogue in continuous time - the Doob-Meyer decomposition - is deep. This illustrates the contrasts that may arise between the theories of stochastic processes in discrete and continuous time.
Equipped with the Doob decomposition we return to the above setting and can write
$$Z = Z_0 + L + B$$
with $L$ a martingale and $B$ predictable and decreasing. Then $M = Z_0 + L$ is a martingale and $A = (-B)$ is increasing and we have $Z = M - A$.
Definition 3.6.2. Define a random variable $\nu : \Omega \to \mathbb{N}_0$ by setting
$$\nu(\omega) = \begin{cases} N & \text{if } A_N(\omega) = 0, \\ \min\{n \ge 0 : A_{n+1} > 0\} & \text{if } A_N(\omega) > 0. \end{cases}$$
Observe that $\nu$ (bounded by $N$) is a stopping time, since
$$\{\nu = n\} = \bigcap_{k \le n} \{A_k = 0\} \cap \{A_{n+1} > 0\} \in \mathcal{F}_n$$
as $A$ is predictable.
Proposition 3.6.3. $\nu$ is optimal for $(X_n)$, and it is the largest optimal stopping time for $(X_n)$.
Proof. We use Proposition 3.6.2. Since for $k \le \nu(\omega)$, $Z_k(\omega) = M_k(\omega) - A_k(\omega) = M_k(\omega)$, $Z^\nu$ is a martingale and thus we have (ii) of Proposition 3.6.2. To see (i) we write
$$Z_\nu = \sum_{k=0}^{N-1} \mathbf{1}_{\{\nu = k\}} Z_k + \mathbf{1}_{\{\nu = N\}} Z_N = \sum_{k=0}^{N-1} \mathbf{1}_{\{\nu = k\}} \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} + \mathbf{1}_{\{\nu = N\}} X_N.$$
Now $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) = \mathbb{E}(M_{k+1} - A_{k+1} \mid \mathcal{F}_k) = M_k - A_{k+1}$. On $\{\nu = k\}$ we have $A_k = 0$ and $A_{k+1} > 0$, so $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) < Z_k$. Hence $Z_k = \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} = X_k$ on the set $\{\nu = k\}$. So
$$Z_\nu = \sum_{k=0}^{N-1} \mathbf{1}_{\{\nu = k\}} X_k + \mathbf{1}_{\{\nu = N\}} X_N = X_\nu,$$
which is (i) of Proposition 3.6.2. Now take $\tau \in \mathcal{T}_{0,N}$ with $\tau \ge \nu$ and $\mathbb{P}(\tau > \nu) > 0$. From the definition of $\nu$ and the fact that $A$ is increasing, $A_\tau > 0$ with positive probability. So $\mathbb{E}(A_\tau) > 0$, and
$$\mathbb{E}(Z_\tau) = \mathbb{E}(M_\tau) - \mathbb{E}(A_\tau) = \mathbb{E}(Z_0) - \mathbb{E}(A_\tau) < \mathbb{E}(Z_0).$$
So $\tau$ cannot be optimal. $\Box$
3.7 Spaces of Martingales
We now collect several results on martingales.
Definition 3.7.1. (i) Call $X$ $L^p$-bounded if
$$\sup_n \mathbb{E}|X_n|^p < \infty.$$
If a martingale $X$ is bounded in $L^2$ we say $X$ is square-integrable.
(ii) We say that if
We turn now to the convergence theorems that make martingales so powerful a tool.
Theorem 3.7.1 (Doob). An $L^1$-bounded supermartingale is a.s. convergent: there exists $X_\infty$ finite such that
$$X_n \to X_\infty \quad (n \to \infty) \quad \text{a.s.}$$
In particular, we have:
Corollary 3.7.1 (Doob's Martingale Convergence Theorem). An $L^1$-bounded martingale converges a.s.
Recall that for $\xi$ with $\mathbb{E}(|\xi|) < \infty$ the process $M_n = \mathbb{E}(\xi \mid \mathcal{F}_n)$ is a martingale. Since from Fatou's lemma $\mathbb{E}(|X_\infty|) \le \liminf \mathbb{E}(|X_n|) < \infty$ in the above setting, one is tempted to use $X_\infty$ to generate $X_n$ via $X_n = \mathbb{E}(X_\infty \mid \mathcal{F}_n)$. However, this construction fails without further conditions. We need:
Definition 3.7.2. A family $\mathcal{C}$ of real-valued random variables is uniformly integrable (UI) if, given any $\epsilon > 0$, there exists some $K_\epsilon$ such that
$$\mathbb{E}\big(|X| \mathbf{1}_{\{|X| > K_\epsilon\}}\big) < \epsilon \quad \text{for all } X \in \mathcal{C}.$$
We can use (UI) to get $\sup \mathbb{E}(|X_n|) \le 1 + K_1$, i.e. to see that (UI) implies $L^1$-boundedness. We also have
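Proposition 3.7.1. Suppose $X$ is a martingale bounded in $L^p$ for some $p > 1$. Then $X$ is (UI).
Proof. Let $A = \sup_{n \ge 0} \mathbb{E}(|X_n|^p)$. Given $\epsilon > 0$ define $K_\epsilon$ by $K_\epsilon^{p-1} = A / \epsilon$; then
$$\sup_n \mathbb{E}\big(|X_n| \mathbf{1}_{\{|X_n| > K_\epsilon\}}\big) \le \sup_n \mathbb{E}\big(|X_n| (|X_n| / K_\epsilon)^{p-1} \mathbf{1}_{\{|X_n| > K_\epsilon\}}\big) \le K_\epsilon^{1-p} A = \epsilon. \qquad \Box$$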
The central result now is:
Theorem 3.7.2. The following are equivalent for martingales $X = (X_n)$:
(i) $X_n$ converges in $L^1$.
(ii) There exists an integrable random variable $X_\infty$ with $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ for all $n$.
(iii) $X$ is (UI).
Proof. (i) $\Rightarrow$ (ii). Given $n \ge 0$ and $F \in \mathcal{F}_n$, we have for $m > n$
$$|\mathbb{E}((X_\infty - X_n) \mathbf{1}_F)| = |\mathbb{E}((X_m - X_n) \mathbf{1}_F) + \mathbb{E}((X_\infty - X_m) \mathbf{1}_F)| = |\mathbb{E}((X_\infty - X_m) \mathbf{1}_F)| \le \mathbb{E}(|X_\infty - X_m|) \to 0 \quad (m \to \infty).$$
So $\mathbb{E}(X_\infty \mathbf{1}_F) = \mathbb{E}(X_n \mathbf{1}_F)$. Since $X_n$ is $\mathcal{F}_n$-measurable and the above is true for all $F \in \mathcal{F}_n$, we conclude $\mathbb{E}(X_\infty \mid \mathcal{F}_n) = X_n$.
(ii) $\Rightarrow$ (iii). We use the following:
(a) $\mathbb{E}[|X_n| \mathbf{1}_{\{|X_n| > K\}}] \le \mathbb{E}[|X_\infty| \mathbf{1}_{\{|X_n| > K\}}]$, using the conditional Jensen inequality.
(b) $K \mathbb{P}(|X_n| > K) \le \mathbb{E}(|X_n|) \le \mathbb{E}(|X_\infty|)$, by truncation and Jensen's inequality.
(c) If $X \in L^1$, then for any $\epsilon > 0$, there exists some $\delta > 0$ such that
$$\mathbb{P}(F) < \delta \ \Rightarrow \ \mathbb{E}(|X| \mathbf{1}_F) < \epsilon.$$
Now given any $\epsilon > 0$, choose $\delta$ such that (c) holds for $X_\infty$ and choose $K_\epsilon$ in (b) such that $\mathbb{P}(|X_n| > K_\epsilon) < \delta$. Then using (a)
$$\mathbb{E}[|X_n| \mathbf{1}_{\{|X_n| > K_\epsilon\}}] \le \mathbb{E}[|X_\infty| \mathbf{1}_{\{|X_n| > K_\epsilon\}}] \le \epsilon$$
for all $n$, so $(X_n)$ is (UI).
(iii) $\Rightarrow$ (i). From $X$ (UI) we know that $X$ is bounded in $L^1$, hence $X_n \to X_\infty$ a.s. for some $X_\infty \in L^1$. Since almost-sure convergence of a (UI) process implies $L^1$ convergence, (i) follows. $\Box$
3.8 Markov Chains
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $\{X_n, n = 0, 1, \dots\}$ be a sequence of random variables (r.v.s) with a discrete state space $I$. We interpret $X_n$ as the state of some dynamic system at time $n$.
Definition 3.8.1. The stochastic process $\{X_n, n = 0, 1, \dots\}$ with state space $I$ is called a discrete-time Markov chain if, for each $n = 0, 1, \dots$,
$$\mathbb{P}(X_{n+1} = i_{n+1} \mid X_0 = i_0, \dots, X_n = i_n) = \mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n)$$
for all possible values of $i_0, \dots, i_{n+1} \in I$.
In the following we consider only Markov chains with time-homogeneous transition probabilities; that is, we assume
$$\mathbb{P}(X_{n+1} = j \mid X_n = i) = p_{ij}, \quad i, j \in I,$$
independently of the time parameter $n$. The probabilities $p_{ij}$ are called one-step transition probabilities and satisfy
$$p_{ij} \ge 0, \quad i, j \in I, \qquad \text{and} \qquad \sum_{j \in I} p_{ij} = 1, \quad i \in I.$$
We call $P = (p_{ij})$ the transition matrix. The $n$-step transition probabilities are defined by
$$p^{(n)}_{ij} = \mathbb{P}(X_n = j \mid X_0 = i), \quad i, j \in I,$$
for any $n = 1, 2, \dots$ ($p^{(0)}_{ij} = \delta_{ij}$).
Examples.

1. Gambler's ruin. Consider a gambling game in which on any turn you win 1 pound with probability $p = 0.4$ or lose 1 pound with probability $1 - p = 0.6$. Suppose further that you adopt the rule that you quit playing if your fortune reaches $N$ pounds (play also stops when your fortune reaches 0). For instance, for $N = 5$ the transition matrix is
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0.6 & 0 & 0.4 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0.4 & 0 & 0 \\ 0 & 0 & 0.6 & 0 & 0.4 & 0 \\ 0 & 0 & 0 & 0.6 & 0 & 0.4 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$

2. Bond ratings. Ratings of countries/firms can be thought of as following a Markov chain. Rating agencies typically assign classes like AAA, AA, A, BBB, BB, B, C, D. See Chapter 9 for details.

3. Branching process. Consider a population in which each individual in the $n$th generation gives birth, producing $k$ children with probability $p_k$. The number of individuals in generation $n$, $X_n$, can be any nonnegative integer. If we let $Y_1, Y_2, \ldots$ be independent random variables with
$$\mathbb{P}(Y_m = k) = p_k,$$
then we can write the transition probabilities as
$$p(i, j) = \mathbb{P}(Y_1 + \cdots + Y_i = j).$$

One can easily see that the $m$-step transition probability $\mathbb{P}(X_{n+m} = j \mid X_n = i)$ is the $(i, j)$ entry of the $m$th power of the transition matrix $P$. We also have the following important relation.
Proposition 3.8.1 (Chapman-Kolmogorov Equations). For all $n, m = 0, 1, \ldots$,
$$p_{ij}^{(n+m)} = \sum_{k \in I} p_{ik}^{(n)}\, p_{kj}^{(m)}, \quad i, j \in I.$$
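The Chapman-Kolmogorov equations can be checked numerically on the gambler's ruin chain from the examples. The following is a minimal sketch using exact rational arithmetic; the helper names `gamblers_ruin_matrix`, `mat_mul` and `mat_pow` are illustrative choices, not part of the text.

```python
from fractions import Fraction


def gamblers_ruin_matrix(n_target, p_win):
    """Transition matrix on states 0..n_target; 0 and n_target are absorbing."""
    size = n_target + 1
    matrix = [[Fraction(0)] * size for _ in range(size)]
    matrix[0][0] = Fraction(1)
    matrix[n_target][n_target] = Fraction(1)
    for i in range(1, n_target):
        matrix[i][i + 1] = p_win        # win one pound
        matrix[i][i - 1] = 1 - p_win    # lose one pound
    return matrix


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(matrix, n):
    """n-th power of a square matrix (n = 0 gives the identity)."""
    size = len(matrix)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


P = gamblers_ruin_matrix(5, Fraction(2, 5))
P2, P3, P5 = mat_pow(P, 2), mat_pow(P, 3), mat_pow(P, 5)

# Chapman-Kolmogorov with n = 2, m = 3: P^(5) = P^(2) P^(3).
chapman_kolmogorov_holds = (P5 == mat_mul(P2, P3))
```

Exact fractions are used so that the comparison of the two sides is not affected by floating-point rounding.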
Let $T_y = \min\{n \ge 1 : X_n = y\}$ be the time of first return to $y$, and let
$$g_y = \mathbb{P}(T_y < \infty)$$
be the probability that $X_n$ returns to $y$ when it starts at $y$. Then $T_y$ is a stopping time. The key property of Markov processes is

Theorem 3.8.1 (Strong Markov Property). Suppose $T$ is a stopping time of the Markov chain $\{X_n\}_{n \ge 0}$ with transition matrix $P$. Then, conditionally on $T < \infty$ and on the value of $X_T$:
(i) the process after $T$ and the process before $T$ are independent;
(ii) the process after $T$ is a Markov chain with transition matrix $P$.
More details and further discussion can be found in e.g. Norris (1997).
Exercises
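3.1 Show that in general,
$$\mathrm{Var}\Big(\sum_{i=1}^n X_i\Big) = \sum_{i=1}^n \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j).$$
Also, show that if $(X_n)$ is an $L^2$ martingale difference sequence (that is, $X_n = Z_n - Z_{n-1}$ with $(Z_n)$ an $L^2$ martingale),
$$\mathrm{Var}\Big(\sum_{i=1}^n X_i\Big) = \sum_{i=1}^n \mathrm{Var}(X_i).$$
In particular, this holds if the $X_n$ are independent.

3.2 1. Let $X, Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$. Show that the mean-square error
$$\mathbb{E}\big[(Y - (aX + b))^2\big]$$
is minimized for $a^* = \mathrm{Cov}(X, Y)/\mathrm{Var}(X)$ and $b^* = \mathbb{E}(Y) - a^*\,\mathbb{E}(X)$.
2. Now let $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ and $\mathcal{G}$ a $\sigma$-algebra with $\mathcal{G} \subseteq \mathcal{F}$. Show that
$$\min_{a,\, b;\ X \in L^2(\Omega, \mathcal{G}, \mathbb{P})} \mathbb{E}\big[(Y - (aX + b))^2\big] = \mathbb{E}\big[(Y - \mathbb{E}(Y \mid \mathcal{G}))^2\big].$$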
3.3 A number $d$ of balls are distributed between two urns, I and II. At each time $n = 0, 1, 2, \ldots$ a ball is chosen - each with equal probability $1/d$ - and transferred to the other urn.
1. Show that the number of balls in urn I forms a Markov chain (see e.g. Feller (1968) or Norris (1997) for background), with transition probabilities
$$p_{i,i+1} = (d - i)/d, \qquad p_{i,i-1} = i/d, \qquad p_{ij} = 0 \text{ otherwise} \quad (i = 0, 1, \ldots, d).$$
2. Show that the stationary distribution is $(\pi_i)$, where
$$\pi_i = 2^{-d}\binom{d}{i}$$
- that is, if the process is started in this distribution, it stays in it. (This is the Ehrenfest urn, treated in detail in Cox and Miller (1972), 129-132, Feller (1968), 377-378, etc. It exhibits a strong 'central push' towards the central states, and is a discrete-time analogue of the Ornstein-Uhlenbeck velocity process of §5.7.)

3.4 Consider a gambler who bets a unit stake on a succession of independent plays, each of which he wins with probability $p$, loses with probability $q := 1 - p$, with the strategy of quitting when first ahead. Write $S_n$ for his net gain after $n$ plays,
$$f_n := \mathbb{P}(S_1 \le 0, \ldots, S_{n-1} \le 0, S_n = 1)$$
for the probability that he quits at time $n$,
$$F(s) := \sum_{n=1}^{\infty} f_n s^n$$
for the generating function of the sequence $(f_n)$. Show that:
1. $F(s) = \big(1 - \sqrt{1 - 4pqs^2}\big)/(2qs)$;
2. $f_{2n-1} = \dfrac{(-1)^{n-1}}{2q}\dbinom{1/2}{n}(4pq)^n$, $f_{2n} = 0$;
3. He eventually wins with probability 1 if $p \ge q$, $p/q$ if $p < q$;
4. For $p \ge q$ (when he is certain to win eventually), the expected duration of play is $1/(p - q)$ if $p > q$, $+\infty$ if $p = q = \tfrac12$.
Thus if the game is fair, the expected waiting time to quitting when first ahead is infinite. (For a detailed account, see e.g. Feller (1968), XI.3, Grimmett and Stirzaker (2001), §5.3.)

3.5 In the fair game case $p = q = \tfrac12$ of the above:
1. For each real $\theta$, show that $M_n := (\cosh\theta)^{-n} e^{\theta S_n}$ is a martingale;
2. If $T := \inf\{n : S_n = 1\}$ is the duration of play, $\mathbb{P}(T < \infty) = 1$;
3. $\mathbb{E}(s^T) = \sum_n s^n\, \mathbb{P}(T = n) = \big(1 - \sqrt{1 - s^2}\big)/s$, $\mathbb{P}(T = 2n - 1) = (-1)^{n-1}\dbinom{1/2}{n}$;
4. $\mathbb{E}(T) = \infty$. (Differentiate $\mathbb{E}(s^T)$ in 3 and put $s = 1$.)
(This illustrates the power of martingale methods in such problems; for a detailed treatment, see Williams (1991), §10.12.)
4. Mathematical Finance in Discrete Time
4.1 The Model
We will study so-called finite markets - i.e. discrete-time models of financial markets in which all relevant quantities take a finite number of values. Following the approach of Harrison and Pliska (1981) and Taqqu and Willinger (1987), it suffices, to illustrate the ideas, to work with a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega|$ of points $\omega$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$.

We specify a time horizon $T$, which is the terminal date for all economic activities considered. (For a simple option-pricing model the time horizon typically corresponds to the expiry date of the option.) As before, we use a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t=0}^T$ consisting of $\sigma$-algebras $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_T$: we take $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial $\sigma$-field, $\mathcal{F}_T = \mathcal{F} = \mathcal{P}(\Omega)$ (here $\mathcal{P}(\Omega)$ is the power-set of $\Omega$, the class of all $2^{|\Omega|}$ subsets of $\Omega$: we need every possible subset, as they all - apart from the empty set - carry positive probability).

The financial market contains $d + 1$ financial assets. The usual interpretation is to assume one risk-free asset (bond, bank account) labeled 0, and $d$ risky assets (stocks, say) labeled 1 to $d$. While the reader may keep this interpretation as a mental picture, we prefer not to use it directly. The prices of the assets at time $t$ are random variables, $S_0(t, \omega), S_1(t, \omega), \ldots, S_d(t, \omega)$ say, non-negative and $\mathcal{F}_t$-measurable (i.e. adapted: at time $t$, we know the prices $S_i(t)$). We write $S(t) = (S_0(t), S_1(t), \ldots, S_d(t))'$ for the vector of prices at time $t$. Hereafter we refer to the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the set of trading dates, the price process $S$ and the information structure $\mathbb{F}$, which is typically generated by the price process $S$, together as a securities market model.

It will be essential to assume that the price process of at least one asset follows a strictly positive process.

Definition 4.1.1. A numeraire is a price process $(X(t))_{t=0}^T$ (a sequence of random variables), which is strictly positive for all $t \in \{0, 1, \ldots, T\}$.
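For the standard approach the risk-free bank account process is used as numeraire. In some applications, however, it is more convenient to use a security other than the bank account and we therefore just use $S_0$ without further specification as a numeraire. We furthermore take $S_0(0) = 1$ (that is, we reckon in units of the initial value of our numeraire), and define $\beta(t) := 1/S_0(t)$ as a discount factor.

A trading strategy (or dynamic portfolio) $\varphi$ is an $\mathbb{R}^{d+1}$-valued stochastic process $\varphi = (\varphi(t))_{t=1}^T = \big((\varphi_0(t), \varphi_1(t), \ldots, \varphi_d(t))'\big)_{t=1}^T$ which is predictable: each $\varphi_i(t)$ is $\mathcal{F}_{t-1}$-measurable. Here $\varphi_i(t)$ denotes the number of units of asset $i$ held in the portfolio at time $t$, to be determined on the basis of information available before time $t$.

The value of the portfolio at time $t$ is the scalar product
$$V_\varphi(t) = \varphi(t) \cdot S(t) := \sum_{i=0}^d \varphi_i(t) S_i(t) \quad (t = 1, 2, \ldots, T), \qquad V_\varphi(0) = \varphi(1) \cdot S(0).$$
Hence
$$\varphi(t) \cdot (S(t) - S(t-1)) = \varphi(t) \cdot \Delta S(t)$$
is the change in the market value of the portfolio due to changes in security prices between time $t - 1$ and $t$. The gains process of a trading strategy $\varphi$ is given by
$$G_\varphi(t) := \sum_{\tau=1}^t \varphi(\tau) \cdot (S(\tau) - S(\tau - 1)) = \sum_{\tau=1}^t \varphi(\tau) \cdot \Delta S(\tau) \quad (t = 1, 2, \ldots, T).$$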
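Observe the - for now - formal similarity of the gains process $G_\varphi$ from trading in $S$ following a trading strategy $\varphi$ to the martingale transform of $S$ by $\varphi$.

Define $\tilde{S}(t) = (1, \beta(t) S_1(t), \ldots, \beta(t) S_d(t))'$, the vector of discounted prices, and consider the discounted value process
$$\tilde{V}_\varphi(t) = \beta(t)\,(\varphi(t) \cdot S(t)) = \varphi(t) \cdot \tilde{S}(t) \quad (t = 1, 2, \ldots, T)$$
and the discounted gains process
$$\tilde{G}_\varphi(t) := \sum_{\tau=1}^t \varphi(\tau) \cdot (\tilde{S}(\tau) - \tilde{S}(\tau - 1)) = \sum_{\tau=1}^t \varphi(\tau) \cdot \Delta\tilde{S}(\tau) \quad (t = 1, 2, \ldots, T).$$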
Observe that the discounted gains process reflects the gains from trading with assets 1 to $d$ only, which in case of the standard model (a bank account and $d$ stocks) are the risky assets.

We will only consider special classes of trading strategies.

Definition 4.1.4. The strategy $\varphi$ is self-financing, $\varphi \in \Phi$, if
$$\varphi(t) \cdot S(t) = \varphi(t + 1) \cdot S(t) \quad (t = 1, 2, \ldots, T - 1). \tag{4.1}$$

Interpretation. When new prices $S(t)$ are quoted at time $t$, the investor adjusts his portfolio from $\varphi(t)$ to $\varphi(t + 1)$, without bringing in or consuming any wealth. The following result (which is trivial in our current setting, but requires a little argument in continuous time) shows that renormalising security prices (i.e. changing the numeraire) has essentially no economic effects.
Proposition 4.1.1 (Numeraire Invariance). Let $X(t)$ be a numeraire. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to $X(t)^{-1} S(t)$.

Proof. Since $X(t)$ is strictly positive for all $t = 0, 1, \ldots, T$ we have the following equivalence, which implies the claim:
$$\varphi(t) \cdot S(t) = \varphi(t + 1) \cdot S(t) \quad (t = 1, 2, \ldots, T - 1)$$
$$\Leftrightarrow \quad \varphi(t) \cdot X(t)^{-1} S(t) = \varphi(t + 1) \cdot X(t)^{-1} S(t) \quad (t = 1, 2, \ldots, T - 1). \qquad \Box$$

Corollary 4.1.1. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to the discounted price process $\tilde{S}(t) = \beta(t) S(t)$.

We now give a characterization of self-financing strategies in terms of the discounted processes.
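Proposition 4.1.2. A trading strategy $\varphi$ belongs to $\Phi$ if and only if the discounted value and gains processes $\tilde{V}_\varphi$, $\tilde{G}_\varphi$ satisfy
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \tilde{G}_\varphi(t) \quad (t = 0, 1, \ldots, T). \tag{4.2}$$

Proof. Assume $\varphi \in \Phi$. Then using the defining relation (4.1), the numeraire invariance theorem and the fact that $S_0(0) = 1$,
$$\begin{aligned}
V_\varphi(0) + \tilde{G}_\varphi(t) &= \varphi(1) \cdot \tilde{S}(0) + \sum_{\tau=1}^t \varphi(\tau) \cdot (\tilde{S}(\tau) - \tilde{S}(\tau - 1)) \\
&= \varphi(1) \cdot \tilde{S}(0) + \varphi(t) \cdot \tilde{S}(t) + \sum_{\tau=1}^{t-1} (\varphi(\tau) - \varphi(\tau + 1)) \cdot \tilde{S}(\tau) - \varphi(1) \cdot \tilde{S}(0) \\
&= \varphi(t) \cdot \tilde{S}(t) = \tilde{V}_\varphi(t).
\end{aligned}$$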
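Assume now that (4.2) holds true. By the numeraire invariance theorem it is enough to show the discounted version of relation (4.1). Summing up to $t = 2$, (4.2) is
$$\varphi(2) \cdot \tilde{S}(2) = \varphi(1) \cdot \tilde{S}(0) + \varphi(1) \cdot (\tilde{S}(1) - \tilde{S}(0)) + \varphi(2) \cdot (\tilde{S}(2) - \tilde{S}(1)).$$
Subtracting $\varphi(2) \cdot \tilde{S}(2)$ on both sides gives $\varphi(2) \cdot \tilde{S}(1) = \varphi(1) \cdot \tilde{S}(1)$, which is (4.1) for $t = 1$. Proceeding similarly - or by induction - we can show $\varphi(t) \cdot \tilde{S}(t) = \varphi(t + 1) \cdot \tilde{S}(t)$ for $t = 2, \ldots, T - 1$ as required. $\Box$

We are allowed to borrow (so $\varphi_0(t)$ may be negative) and sell short (so $\varphi_i(t)$ may be negative for $i = 1, \ldots, d$). So it is hardly surprising that if we decide what to do about the risky assets and fix an initial endowment, the numeraire will take care of itself, in the following sense.

Proposition 4.1.3. If $(\varphi_1(t), \ldots, \varphi_d(t))'$ is predictable and $V_0$ is $\mathcal{F}_0$-measurable, there is a unique predictable process $(\varphi_0(t))_{t=1}^T$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)'$ is self-financing with initial value of the corresponding portfolio $V_\varphi(0) = V_0$.

Proof. If $\varphi$ is self-financing, then by Proposition 4.1.2,
$$\tilde{V}_\varphi(t) = V_0 + \tilde{G}_\varphi(t) = V_0 + \sum_{\tau=1}^t \big(\varphi_1(\tau) \Delta\tilde{S}_1(\tau) + \cdots + \varphi_d(\tau) \Delta\tilde{S}_d(\tau)\big).$$
On the other hand,
$$\tilde{V}_\varphi(t) = \varphi(t) \cdot \tilde{S}(t) = \varphi_0(t) + \varphi_1(t) \tilde{S}_1(t) + \cdots + \varphi_d(t) \tilde{S}_d(t).$$
Equate these:
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^t \big(\varphi_1(\tau) \Delta\tilde{S}_1(\tau) + \cdots + \varphi_d(\tau) \Delta\tilde{S}_d(\tau)\big) - \big(\varphi_1(t) \tilde{S}_1(t) + \cdots + \varphi_d(t) \tilde{S}_d(t)\big),$$
which defines $\varphi_0(t)$ uniquely. The terms in $\tilde{S}_i(t)$ are
$$\varphi_i(t) \Delta\tilde{S}_i(t) - \varphi_i(t) \tilde{S}_i(t) = -\varphi_i(t) \tilde{S}_i(t - 1),$$
which is $\mathcal{F}_{t-1}$-measurable. So
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^{t-1} \big(\varphi_1(\tau) \Delta\tilde{S}_1(\tau) + \cdots + \varphi_d(\tau) \Delta\tilde{S}_d(\tau)\big) - \big(\varphi_1(t) \tilde{S}_1(t - 1) + \cdots + \varphi_d(t) \tilde{S}_d(t - 1)\big),$$
where as $\varphi_1, \ldots, \varphi_d$ are predictable, all terms on the right-hand side are $\mathcal{F}_{t-1}$-measurable, so $\varphi_0$ is predictable. $\Box$

Remark 4.1.1. Proposition 4.1.3 has a further important consequence: for defining a gains process $\tilde{G}_\varphi$ only the components $(\varphi_1(t), \ldots, \varphi_d(t))'$ are needed. If we require them to be predictable, they correspond in a unique way (after fixing the initial endowment) to a self-financing trading strategy.

Definition. A contingent claim $X$ with maturity date $T$ is an arbitrary $\mathcal{F}_T$-measurable random variable (which is, by the finiteness of the probability space, bounded). We denote the class of all contingent claims by $L^0 = L^0(\Omega, \mathcal{F}, \mathbb{P})$.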
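The construction in the proof of Proposition 4.1.3 can be traced along a single price path. The sketch below is only an illustration under stated assumptions: it works with one realised path of discounted prices (so predictability is not modelled), and the function names and the numbers in the example are hypothetical.

```python
from fractions import Fraction as F


def numeraire_holdings(v0, risky_holdings, disc_prices):
    """Numeraire positions phi_0(1), ..., phi_0(T) along one price path.

    risky_holdings[t-1][i] : units of risky asset i+1 held over (t-1, t]
    disc_prices[t][i]      : discounted price of risky asset i+1 at time t
    """
    phi0, gains = [], F(0)
    for t in range(1, len(disc_prices)):
        holdings = risky_holdings[t - 1]
        # Cost of the risky position phi(t) at the prices of time t-1.
        cost = sum(h * s for h, s in zip(holdings, disc_prices[t - 1]))
        # phi_0(t) = V_0 + discounted gains up to t-1 - cost of risky position.
        phi0.append(v0 + gains - cost)
        gains += sum(h * (s1 - s0) for h, s0, s1 in
                     zip(holdings, disc_prices[t - 1], disc_prices[t]))
    return phi0


def discounted_value(t, phi0, risky_holdings, disc_prices, at_time):
    """phi(t) . S~(at_time); the discounted numeraire price is always 1."""
    return phi0[t - 1] + sum(
        h * s for h, s in zip(risky_holdings[t - 1], disc_prices[at_time]))


# Hypothetical example: one risky asset, T = 3, initial endowment 50.
prices = [[F(100)], [F(110)], [F(99)], [F(105)]]
risky = [[F(2)], [F(-1)], [F(3)]]
phi0 = numeraire_holdings(F(50), risky, prices)

# Self-financing condition (4.1) in discounted terms, for t = 1, 2.
self_financing = all(
    discounted_value(t, phi0, risky, prices, t)
    == discounted_value(t + 1, phi0, risky, prices, t)
    for t in (1, 2))
```

In this example the numeraire position absorbs every rebalancing of the risky asset, so the initial value is 50 and the terminal discounted value equals 50 plus the discounted gains.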
4.2 Existence of Equivalent Martingale Measures
4.2.1 The No-arbitrage Condition
The central principle in the single-period example was the absence of arbitrage opportunities, i.e. the absence of investment strategies for making profits without any exposure to risk. As mentioned there, this principle is central for any market model, and we now define the mathematical counterpart of this economic principle in our current setting.
Definition 4.2.1. Let $\tilde{\Phi} \subset \Phi$ be a set of self-financing strategies. A strategy $\varphi \in \tilde{\Phi}$ is called an arbitrage opportunity or arbitrage strategy with respect to $\tilde{\Phi}$ if $\mathbb{P}\{V_\varphi(0) = 0\} = 1$, and the terminal wealth of $\varphi$ satisfies
$$\mathbb{P}\{V_\varphi(T) \ge 0\} = 1 \quad\text{and}\quad \mathbb{P}\{V_\varphi(T) > 0\} > 0.$$

So an arbitrage opportunity is a self-financing strategy with zero initial value, which produces a non-negative final value with probability one and has a positive probability of a positive final value. Observe that arbitrage opportunities are always defined with respect to a certain class of trading strategies.

Definition 4.2.2. We say that a security market $M$ is arbitrage-free if there are no arbitrage opportunities in the class $\Phi$ of trading strategies.
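On a finite sample space in which every state has positive probability, the conditions of Definition 4.2.1 reduce to checks over the list of states. The following sketch illustrates this for a one-period market; the function names and the two example markets are hypothetical and chosen only to show one market that admits arbitrage and one in which the same portfolio is not an arbitrage.

```python
def portfolio_values(phi, prices_now, prices_next_by_state):
    """Initial value and list of terminal values (one per state) of phi."""
    v0 = sum(h * s for h, s in zip(phi, prices_now))
    v_end = [sum(h * s for h, s in zip(phi, prices_next))
             for prices_next in prices_next_by_state]
    return v0, v_end


def is_arbitrage(v0, v_end):
    """Zero initial value, never negative at the end, positive in some state."""
    return v0 == 0 and all(v >= 0 for v in v_end) and any(v > 0 for v in v_end)


prices_now = (100, 100)                      # (bond, stock) at time 0
market_a = [(105, 120), (105, 105)]          # stock never does worse than the bond
market_b = [(105, 120), (105, 90)]           # stock can underperform the bond
phi = (-1, 1)                                # borrow one bond, buy one share

arbitrage_in_a = is_arbitrage(*portfolio_values(phi, prices_now, market_a))
arbitrage_in_b = is_arbitrage(*portfolio_values(phi, prices_now, market_b))
```

The check concerns one given strategy only; showing that a market is arbitrage-free requires ruling out every strategy in the class considered.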
We will allow ourselves to use 'no-arbitrage' in place of 'arbitrage-free' when convenient.

We will use the following mental picture in analyzing the sample paths of the price processes. We observe a realization $S(t, \omega)$ of the price process $S(t)$. We want to know which sample point $\omega \in \Omega$ - or random outcome - we have. Information about $\omega$ is captured in the filtration $\mathbb{F} = \{\mathcal{F}_t\}$. In our current setting we can switch to the unique sequence of partitions $\{\mathcal{P}_t\}$ corresponding to the filtration $\{\mathcal{F}_t\}$. So at time $t$ we know the set $A_t \in \mathcal{P}_t$ with $\omega \in A_t$. Now recall the structure of the subsequent partitions. A set $A \in \mathcal{P}_t$ is the disjoint union of sets $A_1, \ldots, A_K \in \mathcal{P}_{t+1}$. Since $S(u)$ is $\mathcal{F}_u$-measurable, $S(t)$ is constant on $A$ and $S(t + 1)$ is constant on the $A_k$, $k = 1, \ldots, K$. So we can think of $A$ as the time 0 state in a single-period model, and each $A_k$ corresponds to a state at time 1 in the single-period model. We can therefore think of a multi-period market model as a collection of consecutive single-period markets. What is the effect of a 'global' no-arbitrage condition on the single-period markets?
Lemma 4.2.1. If the market model contains no arbitrage opportunities, then for all $t \in \{0, 1, \ldots, T - 1\}$, for all self-financing trading strategies $\varphi \in \Phi$ and for any $A \in \mathcal{P}_t$, we have
(i) $\mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) \ge 0 \mid A) = 1 \ \Rightarrow\ \mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) = 0 \mid A) = 1$,
(ii) $\mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) \le 0 \mid A) = 1 \ \Rightarrow\ \mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) = 0 \mid A) = 1$.

Observe that the conditions in the lemma are just the defining conditions of an arbitrage opportunity from Definition 4.2.1. They are formulated for a single-period model from $t$ to $t + 1$ with respect to the available information $\omega \in A$. The economic meaning of this result answers the question raised above. No arbitrage 'globally' implies no arbitrage 'locally'. From this the idea of the proof is immediate. Any local trading strategy can be embedded in a global strategy for which we can use the global no-arbitrage condition.
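Proof. We only prove (i) ((ii) is shown in a similar fashion). Fix $t \in \{0, \ldots, T - 1\}$ and $\varphi \in \Phi$. Suppose $\mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) \ge 0 \mid A) = 1$ for some $A \in \mathcal{P}_t$ and define a new trading strategy $\psi$ for all times $u = 1, \ldots, T$ as follows:

For $u \le t$: $\psi(u) = 0$ ('do nothing before time $t$').

For $u = t + 1$: $\psi(t + 1) = 0$ if $\omega \notin A$, and
$$\psi_k(t + 1, \omega) = \begin{cases} \varphi_k(t + 1, \omega) & \text{if } \omega \in A \text{ and } k \in \{1, \ldots, d\}, \\ \varphi_0(t + 1, \omega) - \tilde{V}_\varphi(t, \omega) & \text{if } \omega \in A \text{ and } k = 0. \end{cases}$$
(If $\omega$ happens to be in $A$ at time $t$, follow strategy $\varphi$ when dealing with the risky assets, but modify the holdings in the numeraire appropriately in order to compensate for doing nothing when $\omega \notin A$.)

For $u > t + 1$: $\psi_k(u) = 0$ for $k \in \{1, \ldots, d\}$ and
$$\psi_0(u, \omega) = \begin{cases} \tilde{V}_\psi(t + 1, \omega) & \text{if } \omega \in A, \\ 0 & \text{if } \omega \notin A. \end{cases}$$
(Invest the amount $\tilde{V}_\psi(t + 1)$ into the numeraire account if $\omega$ happens to be in $A$, otherwise do nothing.)

The next step now is to show that the strategy $\psi$ is a self-financing trading strategy. By construction $\psi$ is predictable, hence a trading strategy. For $\omega \notin A$, $\psi \equiv 0$, so we only have to consider $\omega \in A$. The relevant point in time is $t + 1$. Recall that $\psi(t) = 0$, hence $\psi(t) \cdot \tilde{S}(t) = 0$. Now
$$\begin{aligned}
\psi(t + 1) \cdot \tilde{S}(t) &= \big(\varphi_0(t + 1) - \tilde{V}_\varphi(t)\big) + \sum_{k=1}^d \varphi_k(t + 1) \tilde{S}_k(t) \\
&= \sum_{k=0}^d \varphi_k(t + 1) \tilde{S}_k(t) - \tilde{V}_\varphi(t) \\
&= \varphi(t + 1) \cdot \tilde{S}(t) - \tilde{V}_\varphi(t) \\
&= \varphi(t) \cdot \tilde{S}(t) - \tilde{V}_\varphi(t) = 0,
\end{aligned}$$
using the fact that $\varphi$ is self-financing. Since $\psi(u) \cdot \tilde{S}(u) = 0$ for $u \le t$ we have $\psi(u + 1) \cdot \tilde{S}(u) = \psi(u) \cdot \tilde{S}(u)$ for all $u \le t$ (and for all $\omega \in \Omega$). When $u \ge t + 1$ and $\omega \in A$ the portfolio $\psi(u + 1)$ consists of the numeraire asset only (with constant discounted value equal to 1), so
$$\psi(u + 1) \cdot \tilde{S}(u) = \tilde{V}_\psi(t + 1) = \psi(u) \cdot \tilde{S}(u).$$
Therefore the strategy $\psi$ is self-financing.

We now analyze the value process of $\psi$. Using our assumption $\mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) \ge 0 \mid A) = 1$ we see that for all $u \ge t + 1$ and $\omega \in A$
$$\begin{aligned}
\tilde{V}_\psi(u) = \psi(u) \cdot \tilde{S}(u) &= \psi(t + 1) \cdot \tilde{S}(t + 1) \\
&= \big(\varphi_0(t + 1) - \tilde{V}_\varphi(t)\big) + \sum_{k=1}^d \varphi_k(t + 1) \tilde{S}_k(t + 1) \\
&= \tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) \ge 0.
\end{aligned}$$
Hence $\tilde{V}_\psi(T) \ge 0$, and $\psi$ has zero initial value. The assumption of an arbitrage-free market implies $\tilde{V}_\psi(T) = 0$, or
$$0 = \mathbb{P}(\tilde{V}_\psi(T) > 0) = \mathbb{P}(\{\tilde{V}_\psi(T) > 0\} \cap A) = \mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) > 0 \mid A)\, \mathbb{P}(A).$$
Therefore $\mathbb{P}(\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) = 0 \mid A) = 1$. $\Box$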
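The fundamental insight in the single-period example was the equivalence of the no-arbitrage condition and the existence of risk-neutral probabilities. For the multi-period case we now use the probabilistic machinery of Chapter 2 to establish the corresponding result.

Definition 4.2.3. A probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F}_T)$ equivalent to $\mathbb{P}$ is called a martingale measure for $\tilde{S}$ if the process $\tilde{S}$ follows a $\mathbb{P}^*$-martingale with respect to the filtration $\mathbb{F}$. We denote by $\mathcal{P}(\tilde{S})$ the class of equivalent martingale measures.

Proposition 4.2.1. Let $\mathbb{P}^*$ be an equivalent martingale measure ($\mathbb{P}^* \in \mathcal{P}(\tilde{S})$) and $\varphi \in \Phi$ any self-financing strategy. Then the wealth process $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale with respect to the filtration $\mathbb{F}$.

Proof. By the self-financing property of $\varphi$ (compare Proposition 4.1.2, (4.2)), we have
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \tilde{G}_\varphi(t) \quad (t = 0, 1, \ldots, T).$$
So
$$\tilde{V}_\varphi(t + 1) - \tilde{V}_\varphi(t) = \tilde{G}_\varphi(t + 1) - \tilde{G}_\varphi(t) = \varphi(t + 1) \cdot (\tilde{S}(t + 1) - \tilde{S}(t)).$$
So for $\varphi \in \Phi$, $\tilde{V}_\varphi$ is the martingale transform of the $\mathbb{P}^*$-martingale $\tilde{S}$ by $\varphi$ (see Theorem 3.4.1) and hence a $\mathbb{P}^*$-martingale itself. $\Box$

Observe that in our setting all processes are bounded, i.e. the martingale transform theorem is applicable without further restrictions. The next result is the key for the further development.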
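Proposition 4.2.2. If an equivalent martingale measure exists - that is, if $\mathcal{P}(\tilde{S}) \ne \emptyset$ - then the market $M$ is arbitrage-free.

Proof. Assume such a $\mathbb{P}^*$ exists. For any self-financing strategy $\varphi$, we have as before
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \sum_{\tau=1}^t \varphi(\tau) \cdot \Delta\tilde{S}(\tau).$$
By Proposition 4.2.1, $\tilde{S}(t)$ a (vector) $\mathbb{P}^*$-martingale implies $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale. So the initial and final $\mathbb{P}^*$-expectations are the same,
$$\mathbb{E}^*(\tilde{V}_\varphi(T)) = \mathbb{E}^*(\tilde{V}_\varphi(0)).$$
If the strategy is an arbitrage opportunity, its initial value - the right-hand side above - is zero. Therefore the left-hand side $\mathbb{E}^*(\tilde{V}_\varphi(T))$ is zero, but $\tilde{V}_\varphi(T) \ge 0$ (by definition). Also each $\mathbb{P}^*(\{\omega\}) > 0$ (by assumption, each $\mathbb{P}(\{\omega\}) > 0$, so by equivalence each $\mathbb{P}^*(\{\omega\}) > 0$). This and $\tilde{V}_\varphi(T) \ge 0$ force $\tilde{V}_\varphi(T) = 0$. So no arbitrage is possible. $\Box$

Proposition 4.2.3. If the market is arbitrage-free, then the class $\mathcal{P}(\tilde{S})$ of equivalent martingale measures is non-empty.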
Because of the fundamental nature of this result we will provide two proofs. The first proof is based on our previous observation that the 'global' no-arbitrage condition implies also no-arbitrage 'locally'. We therefore can combine single-period results to prove the multi-period claim. The second proof uses functional-analytic techniques (as does the corresponding proof in Chapter 1), i.e. a variant of the Hahn-Banach theorem.

First proof. From Lemma 4.2.1 we know that each of the underlying single-period market models is free of arbitrage. By the results in Chapter 1 this implies the existence of risk-neutral probabilities. That is, for each $t \in \{0, 1, \ldots, T - 1\}$ and each $A \in \mathcal{P}_t$ there exists a probability measure $\mathbb{P}(t, A)$ such that each cell $A_i \subset A$, $i = 1, \ldots, K_A$, in the partition $\mathcal{P}_{t+1}$ has a positive probability mass and
$$\sum_{i=1}^{K_A} \mathbb{P}(t, A)(A_i) = 1.$$
Furthermore
$$\mathbb{E}_{\mathbb{P}(t, A)}\big(\tilde{S}(t + 1)\big) = \tilde{S}(t)$$
(where we restrict ourselves to $\omega \in A$). We can think of the probability measures $\mathbb{P}(t, A)$ as conditional risk-neutral probability measures given the event $A$ occurred at time $t$. Now we can define a probability measure $\mathbb{P}^*$ on $\Omega$ by defining the probabilities of the simple events $\{\omega\}$ (observe that $\mathcal{F}_T = \mathcal{P}(\Omega)$, hence the final partition consists of all simple events). To each such $\{\omega\}$ there exists a single path from 0 to $T$ and $\mathbb{P}^*(\{\omega\})$ is set equal to the product of the conditional probabilities along the path. By construction
$$\sum_{\omega \in \Omega} \mathbb{P}^*(\{\omega\}) = 1.$$
Since the conditional risk-neutral probabilities are greater than 0, $\mathbb{P}^*(\{\omega\}) > 0$ for each $\omega \in \Omega$ and $\mathbb{P}^*$ is an equivalent measure. The final step is to show that $\mathbb{P}^*$ is a martingale measure. We thus have to show
$$\mathbb{E}^*\big(\tilde{S}_k(t + 1) \mid \mathcal{F}_t\big) = \tilde{S}_k(t)$$
for any $k = 1, \ldots, d$, $t = 0, \ldots, T - 1$. Now $\tilde{S}_k(t)$ is $\mathcal{F}_t$-measurable, and as any $A \in \mathcal{F}_t$ can be written as a union of $A' \in \mathcal{P}_t$ the claim follows from
$$\int_{A'} \tilde{S}_k(t + 1)\, d\mathbb{P}^* = \int_{A'} \tilde{S}_k(t)\, d\mathbb{P}^*,$$
which is true by construction of $\mathbb{P}^*$. (Recall that we have $\mathbb{E}_{\mathbb{P}(t, A')}\big(\tilde{S}_k(t + 1)\big) = \mathbb{E}_{\mathbb{P}(t, A')}\big(\tilde{S}_k(t)\big)$.) $\Box$
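The construction in the first proof can be made concrete on a recombining binomial tree, where every single-period market has the same risk-neutral up-probability. The sketch below is an illustration only: the binomial model and the parameters `U`, `D`, `R`, `S0`, `T` are assumptions that do not appear in the text. It builds $\mathbb{P}^*$ as the product of conditional probabilities along each path and checks the martingale property cell by cell.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical binomial parameters: up factor, down factor, interest rate.
U, D, R, S0, T = F(6, 5), F(9, 10), F(1, 20), F(100), 3
Q_UP = (1 + R - D) / (U - D)    # single-period risk-neutral up-probability


def disc_price(path):
    """Discounted stock price after following `path` (tuple of 'u'/'d')."""
    price = S0
    for move in path:
        price *= U if move == 'u' else D
    return price / (1 + R) ** len(path)


def path_prob(path):
    """P*: product of conditional risk-neutral probabilities along the path."""
    prob = F(1)
    for move in path:
        prob *= Q_UP if move == 'u' else 1 - Q_UP
    return prob


omega = list(product('ud', repeat=T))           # sample space: all full paths
total_mass = sum(path_prob(w) for w in omega)


def cond_exp_next(prefix):
    """E*[ S~(t+1) | first t moves equal `prefix` ], with t = len(prefix)."""
    cell = [w for w in omega if w[:len(prefix)] == prefix]
    mass = sum(path_prob(w) for w in cell)
    return sum(path_prob(w) * disc_price(w[:len(prefix) + 1])
               for w in cell) / mass


# Martingale property on every cell of every partition P_t, t = 0..T-1.
is_martingale = all(cond_exp_next(prefix) == disc_price(prefix)
                    for t in range(T) for prefix in product('ud', repeat=t))
```

Here each cell of the partition $\mathcal{P}_t$ is identified with the set of paths sharing the same first $t$ moves, which is the filtration generated by the price process in this model.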
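For the second proof (for which we follow Schachermayer (2003)) we need some auxiliary observations. Recall the definition of arbitrage, i.e. Definition 4.2.1, in our finite-dimensional setting: a self-financing trading strategy $\varphi \in \Phi$ is an arbitrage opportunity if $V_\varphi(0) = 0$, $V_\varphi(T, \omega) \ge 0$ for all $\omega \in \Omega$ and there exists an $\omega \in \Omega$ with $V_\varphi(T, \omega) > 0$.

Now call $L^0 = L^0(\Omega, \mathcal{F}, \mathbb{P})$ the set of random variables on $(\Omega, \mathcal{F})$ and
$$L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}) := \{X \in L^0 : X(\omega) \ge 0\ \forall \omega \in \Omega \text{ and } \exists\, \omega \in \Omega \text{ s.t. } X(\omega) > 0\}.$$
(Observe that $L^0_{++}$ is a cone - closed under vector addition and multiplication by positive scalars.) Using $L^0_{++}$ we can write the arbitrage condition more compactly as
$$V_\varphi(0) = \tilde{V}_\varphi(0) = 0 \ \Rightarrow\ \tilde{V}_\varphi(T) \notin L^0_{++}(\Omega, \mathcal{F}, \mathbb{P})$$
for any self-financing strategy $\varphi$.

Lemma 4.2.2. In an arbitrage-free market any predictable vector process $\varphi' = (\varphi_1, \ldots, \varphi_d)$ satisfies
$$\tilde{G}_{\varphi'}(T) \notin L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}).$$
(Observe the slight abuse of notation: for the value of the discounted gains process the zeroth component of a trading strategy doesn't matter. Hence we use the operator $\tilde{G}$ for $d$-dimensional vectors as well.)

Proof. By Proposition 4.1.3 there exists a unique predictable process $(\varphi_0(t))$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)$ has zero initial value and is self-financing. Assume $\tilde{G}_{\varphi'}(T) \in L^0_{++}(\Omega, \mathcal{F}, \mathbb{P})$. Then using Proposition 4.1.2,
$$V_\varphi(T) = \beta(T)^{-1} \tilde{V}_\varphi(T) = \beta(T)^{-1}\big(V_\varphi(0) + \tilde{G}_\varphi(T)\big) = \beta(T)^{-1} \tilde{G}_{\varphi'}(T),$$
which - as $\tilde{G}_{\varphi'}(T) \in L^0_{++}$ and $\beta(T) > 0$ - is non-negative and strictly positive for at least one $\omega$. Since $V_\varphi(0) = 0$, $\varphi$ is an arbitrage opportunity, which contradicts the assumption that the market is arbitrage-free. $\Box$

Definition 4.2.4. We call the subspace $K$ of $L^0(\Omega, \mathcal{F}, \mathbb{P})$ defined by
$$K = \{X \in L^0(\Omega, \mathcal{F}, \mathbb{P}) : X = \tilde{G}_\varphi(T),\ \varphi \text{ predictable}\}$$
the set of contingent claims attainable at price 0.

By Lemma 4.2.2, in an arbitrage-free market
$$K \cap L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}) = \emptyset. \tag{4.3}$$

Second proof of Proposition 4.2.3. Since our market model is finite we can use results from Euclidean geometry, in particular we can identify $L^0$ with $\mathbb{R}^{|\Omega|}$. By assumption we have (4.3), i.e. $K$ and $L^0_{++}$ do not intersect. So $K$ does not meet the subset
$$D := \Big\{X \in L^0_{++} : \sum_{\omega \in \Omega} X(\omega) = 1\Big\}.$$
Now $D$ is a compact convex set. By the separating hyperplane theorem, there is a vector $\lambda = (\lambda(\omega) : \omega \in \Omega)$ such that for all $X \in D$
$$\lambda \cdot X := \sum_{\omega \in \Omega} \lambda(\omega) X(\omega) > 0, \tag{4.4}$$
but for all $\tilde{G}_\varphi(T)$ in $K$,
$$\lambda \cdot \tilde{G}_\varphi(T) = \sum_{\omega \in \Omega} \lambda(\omega)\, \tilde{G}_\varphi(T)(\omega) = 0. \tag{4.5}$$
Choosing each $\omega \in \Omega$ successively and taking $X$ to be 1 on this $\omega$ and zero elsewhere, (4.4) tells us that each $\lambda(\omega) > 0$. So
$$\mathbb{P}^*(\{\omega\}) := \frac{\lambda(\omega)}{\sum_{\omega' \in \Omega} \lambda(\omega')}$$
defines a probability measure equivalent to $\mathbb{P}$ (no non-empty null sets). With $\mathbb{E}^*$ as $\mathbb{P}^*$-expectation, (4.5) says that
$$\mathbb{E}^*\big(\tilde{G}_\varphi(T)\big) = 0,$$
i.e.
$$\mathbb{E}^*\Big(\sum_{\tau=1}^T \varphi(\tau) \cdot \Delta\tilde{S}(\tau)\Big) = 0.$$
In particular, choosing for each $i$ to hold only stock $i$,
$$\mathbb{E}^*\Big(\sum_{\tau=1}^T \varphi_i(\tau)\, \Delta\tilde{S}_i(\tau)\Big) = 0 \quad (i = 1, \ldots, d).$$
Since this holds for any predictable $\varphi$ (boundedness holds automatically as $\Omega$ is finite), the martingale transform lemma tells us that the discounted price processes $(\tilde{S}_i(t))$ are $\mathbb{P}^*$-martingales. $\Box$

Note. Our situation is finite-dimensional, so all we have used here is Euclidean geometry. We have a subspace, and a cone not meeting the subspace except at the origin. Take $\lambda$ orthogonal to the subspace on the same side of the subspace as the cone. The separating hyperplane theorem holds also in infinite-dimensional situations, where it is a form of the Hahn-Banach theorem of functional analysis (Appendix C). For proofs, variants and background, see e.g. Bott (1942) and Valentine (1964).

We now combine Propositions 4.2.2 and 4.2.3 as a first central theorem in this chapter.

Theorem 4.2.1 (No-arbitrage Theorem). The market $M$ is arbitrage-free if and only if there exists a probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which the discounted $d$-dimensional asset price process $\tilde{S}$ is a $\mathbb{P}^*$-martingale.

4.2.2 Risk-Neutral Pricing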
We now turn to the main underlying question of this text, namely the pricing of contingent claims (i.e. financial derivatives). As in Chapter 1 the basic idea is to reproduce the cash flow of a contingent claim in terms of a portfolio of the underlying assets. On the other hand, the equivalence of the no-arbitrage condition and the existence of risk-neutral probability measures implies the possibility of using risk-neutral measures for pricing purposes. We will explore the relation of these two approaches in this subsection.

We say that a contingent claim $X$ is attainable if there exists a replicating strategy $\varphi \in \Phi$ such that
$$V_\varphi(T) = X.$$
So the replicating strategy generates the same time $T$ cash-flow as does $X$. Working with discounted values (recall we use $\beta$ as the discount factor) we find
$$\beta(T) X = \tilde{V}_\varphi(T) = V_\varphi(0) + \tilde{G}_\varphi(T). \tag{4.6}$$
So the discounted value of a contingent claim is given by the initial cost of setting up a replication strategy and the gains from trading. In a highly efficient security market we expect that the law of one price holds true, that is, for a specified cash-flow there exists only one price at any time instant. Otherwise arbitrageurs would use the opportunity to cash in a riskless profit (recall that a whole industry of hedge funds relies on such opportunities; see also the case of option mispricing at former NatWest Markets as an excellent example of how arbitrageurs exploit mispricing). So the no-arbitrage condition implies that for an attainable contingent claim its time $t$ price must be given by the value (initial cost) of any replicating strategy (we say the claim is uniquely replicated in that case). This is the basic idea of the arbitrage pricing theory.

Let us investigate replicating strategies a bit further. The idea is to replicate a given cash-flow at a given point in time. Using a self-financing trading strategy the investor's wealth may go negative at time $t < T$, but he must be able to cover his debt at the final date. To avoid negative wealth the concept of admissible strategies is introduced. A self-financing trading strategy $\varphi \in \Phi$ is called admissible if $V_\varphi(t) \ge 0$ for each $t = 0, 1, \ldots, T$. We write $\Phi_a$ for the class of admissible trading strategies. The modeling assumption of admissible strategies reflects the economic fact that the broker should be protected from unbounded short sales. In our current setting all processes are bounded anyway, so this distinction is not really needed and we use self-financing strategies when addressing the mathematical aspects of the theory. (In fact one can show that a security market which is arbitrage-free with respect to $\Phi_a$ is also arbitrage-free with respect to $\Phi$.)

We now return to the main question of the section: given a contingent claim $X$, i.e. a cash-flow at time $T$, how can we determine its value (price) at time $t < T$? For an attainable contingent claim this value should be given by the value of any replicating strategy at time $t$, i.e. there should be a unique value process (say $V_X(t)$) representing the time $t$ value of the simple contingent claim $X$. The following proposition ensures that the value processes of replicating trading strategies coincide, thus proving the uniqueness of the value process.

Proposition 4.2.4. Suppose the market $M$ is arbitrage-free. Then any attainable contingent claim $X$ is uniquely replicated in $M$.

Proof. Suppose there is an attainable contingent claim $X$ and strategies $\varphi$ and $\psi$ such that
$$V_\varphi(T) = V_\psi(T) = X,$$
but there exists a $\tau < T$ such that
$$V_\varphi(u) = V_\psi(u) \text{ for every } u < \tau \quad\text{and}\quad V_\varphi(\tau) \ne V_\psi(\tau).$$
Define $A := \{\omega \in \Omega : V_\varphi(\tau, \omega) > V_\psi(\tau, \omega)\}$, then $A \in \mathcal{F}_\tau$ and $\mathbb{P}(A) > 0$ (otherwise just rename the strategies). Define the $\mathcal{F}_\tau$-measurable random variable $Y := V_\varphi(\tau) - V_\psi(\tau)$ and consider the trading strategy $\xi$ defined by
$$\xi(u) = \begin{cases} \varphi(u) - \psi(u), & u \le \tau, \\ \mathbf{1}_{A^c}\,(\varphi(u) - \psi(u)) + \mathbf{1}_A\,(Y\beta(\tau), 0, \ldots, 0), & \tau < u \le T. \end{cases}$$
The idea here is to use $\varphi$ and $\psi$ to construct a self-financing strategy with zero initial investment (hence use their difference $\xi$) and put any gains at time $\tau$ in the savings account (i.e. invest them risk-free) up to time $T$.

We need to show formally that $\xi$ satisfies the conditions of an arbitrage opportunity. By construction $\xi$ is predictable and the self-financing condition (4.1) is clearly true for $t \ne \tau$, and for $t = \tau$ we have, using that $\varphi, \psi \in \Phi$,
$$\xi(\tau) \cdot S(\tau) = (\varphi(\tau) - \psi(\tau)) \cdot S(\tau) = V_\varphi(\tau) - V_\psi(\tau),$$
$$\begin{aligned}
\xi(\tau + 1) \cdot S(\tau) &= \mathbf{1}_{A^c}\,(\varphi(\tau + 1) - \psi(\tau + 1)) \cdot S(\tau) + \mathbf{1}_A\, Y \beta(\tau) S_0(\tau) \\
&= \mathbf{1}_{A^c}\,(\varphi(\tau) - \psi(\tau)) \cdot S(\tau) + \mathbf{1}_A\,(V_\varphi(\tau) - V_\psi(\tau))\, \beta(\tau) \beta^{-1}(\tau) \\
&= V_\varphi(\tau) - V_\psi(\tau).
\end{aligned}$$
Comparing these two, $\xi$ is self-financing, and its initial value is zero. Also
$$V_\xi(T) = \mathbf{1}_{A^c}\,(\varphi(T) - \psi(T)) \cdot S(T) + \mathbf{1}_A\,(Y\beta(\tau), 0, \ldots, 0) \cdot S(T).$$
The first term is zero, as $V_\varphi(T) = V_\psi(T)$. The second term is $\ge 0$, as $Y > 0$ on $A$ and $S_0(T) > 0$, and indeed
$$\mathbb{P}\{V_\xi(T) > 0\} = \mathbb{P}\{A\} > 0.$$
Hence the market contains an arbitrage opportunity with respect to the class $\Phi$ of self-financing strategies. But this contradicts the assumption that the market $M$ is arbitrage-free. $\Box$
This uniqueness property allows us now to define the important concept of an arbitrage price process.

Definition 4.2.5. Suppose the market is arbitrage-free. Let $X$ be any attainable contingent claim with time $T$ maturity. Then the arbitrage price process $\pi_X(t)$, $0 \le t \le T$, or simply arbitrage price of $X$, is given by the value process of any replicating strategy $\varphi$ for $X$.
The construction of hedging strategies that replicate the outcome of a contingent claim (for example a European option) is an important problem in both practical and theoretical applications. Hedging is central to the theory of option pricing. The classical arbitrage valuation models, such as the Black-Scholes model (Black and Scholes (1973)), depend on the idea that an option can be perfectly hedged using the underlying asset (in our case the assets of the market model), so making it possible to create a portfolio that replicates the option exactly. Hedging is also widely used to reduce risk, and the kinds of delta-hedging strategies implicit in the Black-Scholes model are used by participants in option markets. We will come back to hedging problems subsequently.

Analyzing the arbitrage-pricing approach we observe that the derivation of the price of a contingent claim doesn't require any specific preferences of the agents other than nonsatiation, i.e. agents prefer more to less, which rules out arbitrage. So, the pricing formula for any attainable contingent claim must be independent of all preferences that do not admit arbitrage. In particular, an economy of risk-neutral investors must price a contingent claim in the same manner. This fundamental insight, due to Cox and Ross (1976) in the case of a simple economy - a riskless asset and one risky asset - and in its general form due to Harrison and Kreps (1979), simplifies the pricing formula enormously. In its general form the price of an attainable simple contingent claim is just the expected value of the discounted payoff with respect to an equivalent martingale measure.
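Proposition 4.2.5. The arbitrage price process of any attainable contingent claim $X$ is given by the risk-neutral valuation formula
$$\pi_X(t) = \beta(t)^{-1}\, \mathbb{E}^*\big(X \beta(T) \mid \mathcal{F}_t\big) \quad \forall\, t = 0, 1, \ldots, T, \tag{4.7}$$
where $\mathbb{E}^*$ is the expectation operator with respect to an equivalent martingale measure $\mathbb{P}^*$.

Proof. Since we assume the market is arbitrage-free, there exists (at least) an equivalent martingale measure $\mathbb{P}^*$. By Proposition 4.2.1 the discounted value process $\tilde{V}_\varphi$ of any self-financing strategy $\varphi$ is a $\mathbb{P}^*$-martingale. So for any contingent claim $X$ with maturity $T$ and any replicating trading strategy $\varphi \in \Phi$ we have for each $t = 0, 1, \ldots, T$
$$\begin{aligned}
\pi_X(t) = V_\varphi(t) &= \beta(t)^{-1}\, \tilde{V}_\varphi(t) \\
&= \beta(t)^{-1}\, \mathbb{E}^*\big(\tilde{V}_\varphi(T) \mid \mathcal{F}_t\big) && \text{(as $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale)} \\
&= \beta(t)^{-1}\, \mathbb{E}^*\big(\beta(T) V_\varphi(T) \mid \mathcal{F}_t\big) && \text{(undoing the discounting)} \\
&= \beta(t)^{-1}\, \mathbb{E}^*\big(\beta(T) X \mid \mathcal{F}_t\big) && \text{(as $\varphi$ is a replicating strategy for $X$).}
\end{aligned}$$
$\Box$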
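The valuation formula (4.7) and its link to replication can be illustrated on a small binomial tree. The following sketch is an assumption-laden example rather than anything from the text: the two-period binomial model, the call payoff and the parameters `U`, `D`, `R`, `S0`, `K` are hypothetical. It prices the claim by backward induction under the risk-neutral probability and computes the one-period replicating holdings at each node.

```python
from fractions import Fraction as F

# Hypothetical two-period binomial model and European call with strike K.
U, D, R, S0, K, T = F(6, 5), F(9, 10), F(1, 20), F(100), F(100), 2
Q = (1 + R - D) / (U - D)           # risk-neutral up-probability


def stock(t, ups):
    """Stock price at time t after `ups` up-moves."""
    return S0 * U ** ups * D ** (t - ups)


def price(t, ups):
    """pi_X(t) at node (t, ups): discounted conditional P*-expectation."""
    if t == T:
        return max(stock(t, ups) - K, F(0))
    return (Q * price(t + 1, ups + 1) + (1 - Q) * price(t + 1, ups)) / (1 + R)


def hedge(t, ups):
    """(bond units, shares) held over (t, t+1] that replicate from this node.

    The bond is assumed to have price (1 + R)**t at time t.
    """
    up, down = price(t + 1, ups + 1), price(t + 1, ups)
    shares = (up - down) / (stock(t + 1, ups + 1) - stock(t + 1, ups))
    bond_units = (up - shares * stock(t + 1, ups + 1)) / (1 + R) ** (t + 1)
    return bond_units, shares


def hedge_value(t, ups):
    """Cost at time t of setting up the replicating portfolio at this node."""
    bond_units, shares = hedge(t, ups)
    return bond_units * (1 + R) ** t + shares * stock(t, ups)


price_at_zero = price(0, 0)
```

At every node the cost of the replicating portfolio coincides with the risk-neutral price, which is the content of Proposition 4.2.5 in this special case.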
4.3 Complete Markets: Uniqueness of Equivalent Martingale Measures
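The last section made clear that attainable contingent claims can be priced using an equivalent martingale measure. In this section we will discuss the question of the circumstances under which all contingent claims are attainable. This would be a very desirable property of the market $M$, because we would then have solved the pricing question (at least for contingent claims) completely. Since contingent claims are merely $\mathcal{F}_T$-measurable random variables in our setting, it should be no surprise that we can give a criterion in terms of probability measures. We start with:

Definition 4.3.1. A market $M$ is complete if every contingent claim is attainable, i.e. for every $\mathcal{F}_T$-measurable random variable $X \in L^0$ there exists a replicating self-financing strategy $\varphi \in \Phi$ such that $V_\varphi(T) = X$.

In the case of an arbitrage-free market $M$ one can even insist on replicating nonnegative contingent claims by an admissible strategy $\varphi \in \Phi_a$. Indeed, if $\mathbb{P}^*$ is an equivalent martingale measure and $\varphi$ replicates $X \ge 0$, then by Proposition 4.2.1 $\tilde{V}_\varphi(t) = \mathbb{E}^*(\tilde{V}_\varphi(T) \mid \mathcal{F}_t) \ge 0$ for each $t$.

Theorem 4.3.1. An arbitrage-free market $M$ is complete if and only if there exists a unique probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which discounted asset prices are martingales.

Proof. '$\Rightarrow$': Assume that the arbitrage-free market $M$ is complete. Then for any contingent claim $X$ there exists a self-financing strategy $\varphi$ replicating $X$, and by Proposition 4.1.2,
$$\beta(T) X = \tilde{V}_\varphi(T) = V_\varphi(0) + \sum_{\tau=1}^T \varphi(\tau) \cdot \Delta\tilde{S}(\tau).$$
We know by the no-arbitrage theorem (Theorem 4.2.1) that an equivalent martingale measure $\mathbb{P}^*$ exists; we have to prove uniqueness. So, let $\mathbb{P}_1$, $\mathbb{P}_2$ be two such equivalent martingale measures. For $i = 1, 2$, $(\tilde{V}_\varphi(t))_{t=0}^T$ is a $\mathbb{P}_i$-martingale. So
$$\mathbb{E}_i\big(\tilde{V}_\varphi(T)\big) = \mathbb{E}_i\big(\tilde{V}_\varphi(0)\big) = V_\varphi(0),$$
as the value at time zero is non-random ($\mathcal{F}_0 = \{\emptyset, \Omega\}$) and $\beta(0) = 1$. So
$$\mathbb{E}_1(\beta(T) X) = \mathbb{E}_2(\beta(T) X).$$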
Since $X$ is arbitrary, $\mathbb{E}_1$, $\mathbb{E}_2$ have to agree on integrating all integrands. Now $\mathbb{E}_i$ is expectation (i.e. integration) with respect to the measure $\mathbb{P}_i$, and measures that agree on integrating all integrands must coincide. So $\mathbb{P}_1 = \mathbb{P}_2$, giving uniqueness as required.

'$\Leftarrow$': Assume that the arbitrage-free market $\mathcal{M}$ is incomplete: then there exists a non-attainable $\mathcal{F}_T$-measurable random variable $X$ (a contingent claim). By Proposition 4.1.3, we may confine attention to the risky assets $S_1, \ldots, S_d$, as these suffice to tell us how to handle the numeraire $S_0$. Consider the following set of random variables:
\[ \tilde K := \left\{ Y \in L^0 : Y = Y_0 + \sum_{t=1}^{T} \varphi(t) \cdot \Delta \tilde S(t), \ Y_0 \in \mathbb{R}, \ \varphi \text{ predictable} \right\}. \]
Then $\beta(T) X$ does not belong to $\tilde K$, so $\tilde K$ is a proper subset of the set $L^0$ of all random variables on $\Omega$ (which may be identified with $\mathbb{R}^{|\Omega|}$). Let $\mathbb{P}^*$ be a probability measure equivalent to $\mathbb{P}$ under which discounted prices are martingales (such $\mathbb{P}^*$ exist by the no-arbitrage theorem (Theorem 4.2.1)). Define the scalar product $(Z, Y) \mapsto \mathbb{E}^*(ZY)$ on random variables on $\Omega$. Since $\tilde K$ is a proper subset, there exists a non-zero random variable $Z$ orthogonal to $\tilde K$ (since $\Omega$ is finite, $\mathbb{R}^{|\Omega|}$ is Euclidean: this is just Euclidean geometry). That is,
\[ \mathbb{E}^*(ZY) = 0, \quad \forall\, Y \in \tilde K. \]
Choosing the special $Y = 1 \in \tilde K$ given by $\varphi_i(t) = 0$, $t = 1, 2, \ldots, T$; $i = 1, \ldots, d$ and $Y_0 = 1$ we find
\[ \mathbb{E}^*(Z) = 0. \]
Write $\|X\|_\infty := \sup\{ |X(\omega)| : \omega \in \Omega \}$, and define $\mathbb{P}^{**}$ by
\[ \mathbb{P}^{**}(\{\omega\}) = \left( 1 + \frac{Z(\omega)}{2 \|Z\|_\infty} \right) \mathbb{P}^*(\{\omega\}). \]
By construction, $\mathbb{P}^{**}$ is equivalent to $\mathbb{P}^*$ (same null sets - actually, as $\mathbb{P}^* \sim \mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, neither do $\mathbb{P}^*$, $\mathbb{P}^{**}$). From $\mathbb{E}^*(Z) = 0$,
we see that $\sum_{\omega} \mathbb{P}^{**}(\omega) = 1$, i.e. $\mathbb{P}^{**}$ is a probability measure. As $Z$ is non-zero, $\mathbb{P}^{**}$ and $\mathbb{P}^*$ are different. Now
\[
\begin{aligned}
\mathbb{E}^{**}\left( \sum_{t=1}^{T} \varphi(t) \cdot \Delta \tilde S(t) \right)
&= \sum_{\omega} \mathbb{P}^{**}(\omega) \left( \sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta \tilde S(t, \omega) \right) \\
&= \sum_{\omega} \left( 1 + \frac{Z(\omega)}{2 \|Z\|_\infty} \right) \mathbb{P}^*(\omega) \left( \sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta \tilde S(t, \omega) \right).
\end{aligned}
\]
The '1' term on the right gives
\[ \mathbb{E}^*\left( \sum_{t=1}^{T} \varphi(t) \cdot \Delta \tilde S(t) \right), \]
which is zero since this is a martingale transform of the $\mathbb{P}^*$-martingale $\tilde S(t)$ (recall martingale transforms are by definition null at zero). The '$Z$' term gives a multiple of the inner product
\[ \left( Z, \ \sum_{t=1}^{T} \varphi(t) \cdot \Delta \tilde S(t) \right), \]
which is zero as $Z$ is orthogonal to $\tilde K$ and $\sum_{t=1}^{T} \varphi(t) \cdot \Delta \tilde S(t) \in \tilde K$. By the martingale transform lemma (Lemma 3.4.1), $\tilde S(t)$ is a $\mathbb{P}^{**}$-martingale since $\varphi$ is an arbitrary predictable process. Thus $\mathbb{P}^{**}$ is a second equivalent martingale measure, different from $\mathbb{P}^*$. So incompleteness implies non-uniqueness of equivalent martingale measures, as required. $\Box$

Martingale Representation. To say that every contingent claim can be replicated means that every $\mathbb{P}^*$-martingale (where $\mathbb{P}^*$ is the risk-neutral measure, which is unique) can be written, or represented, as a martingale transform (of the discounted prices) by a replicating (perfect-hedge) trading strategy $\varphi$. In stochastic-process language, this says that all $\mathbb{P}^*$-martingales can be represented as martingale transforms of discounted prices. Such martingale representation theorems hold much more generally, and are very important. For background, see Revuz and Yor (1991) and Yor (1978).
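The construction of a second equivalent martingale measure $\mathbb{P}^{**}$ in the proof of the completeness theorem can be checked numerically. The following is a minimal sketch in Python, assuming a hypothetical one-period market with zero interest rate, one risky asset and three states (so the market is incomplete); the prices, the measure `p_star` and the vector `z` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

# Hypothetical incomplete market (assumption): one period, zero interest rate
# (so beta = 1), one risky asset, three states.
s0 = Fraction(100)
s1 = [Fraction(120), Fraction(100), Fraction(80)]            # S(1, w)
p_star = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]    # an EMM P*


def expect(measure, values):
    """Expectation of a random variable on the finite state space."""
    return sum(p * v for p, v in zip(measure, values))


delta_s = [s - s0 for s in s1]          # increment of the discounted price
assert expect(p_star, delta_s) == 0     # discounted price is a P*-martingale

# Z is orthogonal to K = {Y0 + phi * delta_s} for the product (Z, Y) -> E*(ZY):
# it suffices to test the two spanning elements Y = 1 and Y = delta_s.
z = [Fraction(1), Fraction(-1), Fraction(1)]
assert expect(p_star, z) == 0
assert expect(p_star, [a * b for a, b in zip(z, delta_s)]) == 0

# P**({w}) = (1 + Z(w) / (2 ||Z||_inf)) * P*({w})
z_sup = max(abs(v) for v in z)
p_2star = [(1 + v / (2 * z_sup)) * p for v, p in zip(z, p_star)]

print("P** =", [str(p) for p in p_2star])
print("total mass:", sum(p_2star))
print("E**(delta S):", expect(p_2star, delta_s))
```

In this example the new weights are strictly positive, sum to one, and still give the price increment zero expectation, so the sketch exhibits two distinct equivalent martingale measures in an incomplete market.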
4.4 The Fundamental Theorem of Asset Pricing: Risk-Neutral Valuation
We summarize what we have achieved so far. We call a measure $\mathbb{P}^*$ under which discounted prices $\tilde S(t)$ are $\mathbb{P}^*$-martingales a martingale measure. Such a $\mathbb{P}^*$ equivalent to the actual probability measure $\mathbb{P}$ is called an equivalent martingale measure. Then:

• No-arbitrage theorem (Theorem 4.2.1): If the market is arbitrage-free, equivalent martingale measures $\mathbb{P}^*$ exist.
• Completeness theorem (Theorem 4.3.1): If the market is complete (all contingent claims can be replicated), equivalent martingale measures are unique.

Combining:

Theorem 4.4.1 (Fundamental Theorem of Asset Pricing). In an arbitrage-free complete market $\mathcal{M}$, there exists a unique equivalent martingale measure $\mathbb{P}^*$.
The term fundamental theorem of asset pricing was introduced in Dybvig and Ross (1987). It is used for theorems establishing the equivalence of an economic modelling condition, such as no-arbitrage, to a mathematical modelling condition, such as the existence of equivalent martingale measures.

Assume now that $\mathcal{M}$ is an arbitrage-free complete market and let $X$ be any contingent claim and $\varphi$ a self-financing strategy replicating it (which exists by completeness). Then
\[ V_\varphi(T) = X. \]
As $\tilde V_\varphi(t)$ is the martingale transform of the $\mathbb{P}^*$-martingale $\tilde S(t)$ (by $\varphi(t)$), $\tilde V_\varphi(t)$ is a $\mathbb{P}^*$-martingale. So $V_\varphi(0) \, (= \tilde V_\varphi(0)) = \mathbb{E}^*(\tilde V_\varphi(T)) = \mathbb{E}^*(\beta(T) X)$, giving us the risk-neutral pricing formula
\[ V_\varphi(0) = \mathbb{E}^*(\beta(T) X). \]
More generally, the same argument gives $\tilde V_\varphi(t) = \beta(t) V_\varphi(t) = \mathbb{E}^*(\beta(T) X \mid \mathcal{F}_t)$:
\[ V_\varphi(t) = \beta(t)^{-1}\, \mathbb{E}^*\left( \beta(T) X \mid \mathcal{F}_t \right) \quad (t = 0, 1, \ldots, T). \tag{4.8} \]
It is natural to call $V_\varphi(0) = \pi_X(0)$ above the arbitrage price (or more exactly, arbitrage-free price) of the contingent claim $X$ at time 0, and $V_\varphi(t) = \pi_X(t)$ above the arbitrage price (or more exactly, arbitrage-free price) of the simple contingent claim $X$ at time $t$. For, if an investor sells the claim $X$ at time $t$ for $V_\varphi(t)$, he can follow strategy $\varphi$ to replicate $X$ at time $T$ and clear the claim; an investor selling for this value is perfectly hedged. To sell the claim for any other amount would provide an arbitrage opportunity (as with the argument for put-call parity). We note that, to calculate prices as above, we need to know only:
1. $\Omega$, the set of all possible states,
2. the $\sigma$-field $\mathcal{F}$ and the filtration (or information flow) $(\mathcal{F}_t)$,
3. $\mathbb{P}^*$.
We do not need to know the underlying probability measure $\mathbb{P}$ - only its null sets, to know what 'equivalent to $\mathbb{P}$' means (actually, in this finite model, there are no non-empty null sets, so we do not need to know even this).
Now pricing of contingent claims is our central task, and for pricing purposes $\mathbb{P}^*$ is vital and $\mathbb{P}$ itself irrelevant. We thus may - and shall - focus attention on $\mathbb{P}^*$, which is called the risk-neutral probability measure. Risk-neutrality is the central concept of the subject and the underlying theme of this text. The concept of risk-neutrality is due in its modern form to Harrison and Pliska (1981), though the idea can be traced back to actuarial practice much earlier (see Esscher (1932) and also Gerber and Shiu (1995)). Harrison and Pliska call $\mathbb{P}^*$ the reference measure; Bjork (1999) calls it the risk-adjusted or martingale measure; Dothan (1990) uses equilibrium price measure. The term 'risk-neutral' reflects the $\mathbb{P}^*$-martingale property of the risky assets, since martingales model fair games (one can't win systematically by betting on a martingale). To summarize, we have:
Theorem 4.4.2 (Risk-neutral Pricing Formula). In an arbitrage-free complete market $\mathcal{M}$, arbitrage prices of contingent claims are their discounted expected values under the risk-neutral (equivalent martingale) measure $\mathbb{P}^*$.

There exist several variants and ramifications of the results we have presented so far.

Finite, Discrete Time; Finite Probability Space (our model)
Like Harrison and Pliska (1981) in their seminal paper we used several results from functional analysis. Taqqu and Willinger (1987) provide an approach based on probabilistic methods and allowing a geometric interpretation which yields a connection to linear programming. They analyze certain geometric properties of the sample paths of a given vector-valued stochastic process representing the different stock prices through time. They show that under the requirement that no arbitrage opportunities exist, the price increments between two periods can be converted to martingale differences (see Chapter 3) through an equivalent martingale measure. From a probabilistic point of view this provides a converse to the classical notion that 'one cannot win betting on a martingale' by saying 'if one cannot win betting on a process, then it must be a martingale under an equivalent martingale measure'. Furthermore, they give a characterization of complete markets in terms of an extremal property of a probability measure in the convex set $\mathcal{P}(S)$ of martingale measures for $S$ (not necessarily equivalent to $\mathbb{P}$):

The market model $\mathcal{M}$ is complete under a measure $Q$ on $(\Omega, \mathcal{F})$ if and only if $Q$ is an extreme point of $\mathcal{P}(S)$ (i.e. $Q$ cannot be expressed as a strictly convex combination of two distinct probability measures in $\mathcal{P}(S)$).
They also show that the problem of attainability of a simple contingent claim can be viewed and formulated as the 'dual problem' to finding a certain martingale measure for the price process S .
Finite, Discrete Time; General Probability Space
The no-arbitrage condition remains equivalent to the existence of an equivalent martingale measure. The first proof of this was given by Dalang, Morton, and Willinger (1990) using deep functional analytic methods (such as measurable selection and measure-decomposition theorems). There exist now several more accessible proofs, in particular by Schachermayer (1992), using more elementary results from functional analysis (orthogonality arguments in properly chosen spaces, see also Kabanov and Kramkov (1995)) and by Rogers (1994), using a method which essentially comes down to maximizing expected utility of gains from trade over all possible trading strategies.

Discrete Time; Infinite Horizon; General Probability Space

Under this setting the equivalence of no-arbitrage opportunities and existence of an equivalent martingale measure breaks down (see Back and Pliska (1991) and Dalang, Morton, and Willinger (1990) for counterexamples). Introducing a weaker regularity concept than no-arbitrage, namely no free lunch with bounded risk - requiring an absolute bound on the maximal loss occurring in certain basic trading strategies (see Schachermayer (1994) for an exact mathematical definition, Kreps (1981) for related concepts) - Schachermayer (1994) established the following beautiful result:
The condition no free lunch with bounded risk is equivalent to the existence of an equivalent martingale measure.
For a recent overview of variants of fundamental asset pricing theorems proved by probabilistic techniques, we refer the reader to Jacod and Shiryaev (1998). We will not pursue these approaches further, but use our finite discrete-time and finite probability space setting to explore several models which are widely used in practice.

Note. We return to these matters in the more complicated setting of continuous time in Chapter 6; see §6.1 and Theorem 6.1.2.

4.5 The Cox-Ross-Rubinstein Model

In this section we consider simple discrete-time financial market models. The development of the risk-neutral pricing formula is particularly clear in this setting since we require only elementary mathematical methods. The link to the fundamental economic principles of the arbitrage pricing method can be obtained equally straightforwardly. Moreover binomial models, by their very construction, give rise to simple and efficient numerical procedures. We start with the paradigm of all binomial models - the celebrated Cox, Ross, and Rubinstein (1979) model (CRR-model).
4.5.1 Model Structure
We take $d = 1$, that is, our model consists of two basic securities. Recall that the essence of the relative pricing theory is to take the price processes of these basic securities as given and price secondary securities in such a way that no arbitrage is possible. Our time horizon is $T$ and the set of dates in our financial market model is $t = 0, 1, \ldots, T$. Assume that the first of our given basic securities is a (riskless) bond or bank account $B$, which yields a riskless rate of return $r > 0$ in each time interval $[t, t+1]$, i.e.
\[ B(t+1) = (1+r) B(t), \quad B(0) = 1. \]
So its price process is $B(t) = (1+r)^t$, $t = 0, 1, \ldots, T$. Furthermore, we have a risky asset (stock) $S$ with price process
\[ S(t+1) = \begin{cases} (1+u) S(t) & \text{with probability } p, \\ (1+d) S(t) & \text{with probability } 1-p, \end{cases} \qquad t = 0, 1, \ldots, T-1, \]
with $-1 < d < u$, $S_0 \in \mathbb{R}_0^+$ (see Figure 4.1 below).

[Fig. 4.1. One-step tree diagram: from $S(0)$ the price moves to $S(1) = (1+u)S(0)$ with probability $p$, or to $S(1) = (1+d)S(0)$ with probability $1-p$.]
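Alternatively we write this as
\[ Z(t+1) := \frac{S(t+1)}{S(t)} - 1, \quad t = 0, 1, \ldots, T-1. \]
We set up a probabilistic model by considering the returns process $Z(t)$, $t = 1, \ldots, T$ as random variables defined on probability spaces $(\tilde\Omega_t, \tilde{\mathcal{F}}_t, \tilde{\mathbb{P}}_t)$ with
\[ \tilde\Omega_t = \tilde\Omega = \{d, u\}, \quad \tilde{\mathcal{F}}_t = \tilde{\mathcal{F}} = \mathcal{P}(\tilde\Omega) = \{\emptyset, \{d\}, \{u\}, \tilde\Omega\}, \]
\[ \tilde{\mathbb{P}}_t = \tilde{\mathbb{P}} \ \text{ with } \ \tilde{\mathbb{P}}(\{u\}) = p, \quad \tilde{\mathbb{P}}(\{d\}) = 1 - p, \quad p \in (0, 1). \]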
On these probability spaces we define
\[ Z(t, u) = u \ \text{ and } \ Z(t, d) = d, \quad t = 1, 2, \ldots, T. \]
Our aim, of course, is to define a probability space on which we can model the basic securities $(B, S)$. Since we can write the stock price as
\[ S(t) = S(0) \prod_{\tau=1}^{t} (1 + Z(\tau)), \quad t = 1, 2, \ldots, T, \]
the above definitions suggest using as the underlying probabilistic model of the financial market the product space $(\Omega, \mathcal{F}, \mathbb{P})$ (see e.g. Williams (1991), Chapter 8), i.e.
\[ \Omega = \tilde\Omega_1 \times \cdots \times \tilde\Omega_T = \tilde\Omega^T = \{d, u\}^T, \]
with each $\omega \in \Omega$ representing the successive values of $Z(t)$, $t = 1, 2, \ldots, T$. Hence each $\omega \in \Omega$ is a $T$-tuple $\omega = (\omega_1, \ldots, \omega_T)$ and $\omega_t \in \tilde\Omega = \{d, u\}$. For the $\sigma$-algebra we use $\mathcal{F} = \mathcal{P}(\Omega)$ and the probability measure is given by
\[ \mathbb{P}(\{\omega\}) = \tilde{\mathbb{P}}_1(\{\omega_1\}) \times \cdots \times \tilde{\mathbb{P}}_T(\{\omega_T\}) = \tilde{\mathbb{P}}(\{\omega_1\}) \times \cdots \times \tilde{\mathbb{P}}(\{\omega_T\}). \]
The role of a product space is to model independent replication of a random experiment. The $Z(t)$ above are two-valued random variables, so can be thought of as tosses of a biased coin; we need to build a probability space on which we can model a succession of such independent tosses. Now we redefine (with a slight abuse of notation) the $Z(t)$, $t = 1, \ldots, T$ as random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ (the $t$th projection) as
\[ Z(t, \omega) = Z(t, \omega_t). \]
Observe that by this definition (and the above construction) $Z(1), \ldots, Z(T)$ are independent and identically distributed with
\[ \mathbb{P}(Z(t) = u) = p = 1 - \mathbb{P}(Z(t) = d). \]
To model the flow of information in the market we use the obvious filtration
\[
\begin{aligned}
\mathcal{F}_0 &= \{\emptyset, \Omega\} && \text{(trivial $\sigma$-field)}, \\
\mathcal{F}_t &= \sigma(Z(1), \ldots, Z(t)) = \sigma(S(1), \ldots, S(t)), \\
\mathcal{F}_T &= \mathcal{F} = \mathcal{P}(\Omega) && \text{(class of all subsets of $\Omega$)}.
\end{aligned}
\]
This construction emphasizes again that a multi-period model can be viewed as a sequence of single-period models. Indeed, in the Cox-Ross-Rubinstein case we use identical and independent single-period models. As we will see in the sequel this will make the construction of equivalent martingale measures relatively easy. Unfortunately we can hardly defend the assumption of independent and identically distributed price movements at each time period in practical applications.
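For small $T$ the product space can be written out explicitly. The following is a minimal sketch in Python, assuming illustrative parameter values (`T`, `u`, `d`, `p`, `s0` are not taken from the text); it enumerates $\Omega = \{d, u\}^T$, builds the product measure, and checks numerically that the measure has total mass one and that the first two returns are independent.

```python
from itertools import product
from math import isclose

# Illustrative parameters (assumptions, not from the text).
T, u, d, p, s0 = 3, 0.2, -0.1, 0.6, 100.0

# Omega = {d, u}^T: each w is a T-tuple of single-period outcomes.
omega = list(product("du", repeat=T))


def prob(w):
    """Product measure P({w}) = P~({w_1}) x ... x P~({w_T})."""
    ups = w.count("u")
    return p**ups * (1 - p) ** (T - ups)


def Z(t, w):
    """Return Z(t, w) = Z(t, w_t), the t-th projection (t = 1, ..., T)."""
    return u if w[t - 1] == "u" else d


def S(t, w):
    """Stock price S(t, w) = S(0) * prod_{tau=1}^{t} (1 + Z(tau, w))."""
    price = s0
    for tau in range(1, t + 1):
        price *= 1 + Z(tau, w)
    return price


def measure(event):
    """P(A) for an event A given as a predicate on w."""
    return sum(prob(w) for w in omega if event(w))


total = measure(lambda w: True)
p_up_1 = measure(lambda w: w[0] == "u")
p_up_2 = measure(lambda w: w[1] == "u")
p_up_both = measure(lambda w: w[0] == "u" and w[1] == "u")

print("number of paths:", len(omega))
print("total mass is 1:", isclose(total, 1.0))
print("Z(1), Z(2) independent on {u, u}:", isclose(p_up_both, p_up_1 * p_up_2))
```

Enumerating all $2^T$ paths is only practical for small $T$; the sketch is meant to mirror the construction, not to be an efficient pricing tool.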
Remark 4.5.1. We used this example to show explicitly how to construct the underlying probability space. Having done this in full once, we will from now on feel free to take for granted the existence of an appropriate probability space on which all relevant random variables can be defined.

4.5.2 Risk-neutral Pricing
We now turn to the pricing of derivative assets in the Cox-Ross-Rubinstein market model. To do so we first have to discuss whether the Cox-Ross-Rubinstein model is arbitrage-free and complete. To answer these questions we have, according to our fundamental theorems (Theorems 4.2.1 and 4.3.1), to understand the structure of equivalent martingale measures in the Cox-Ross-Rubinstein model. In trying to do this we use (as is quite natural and customary) the bond price process $B(t)$ as numeraire. Our first task is to find an equivalent martingale measure $Q$ such that the $Z(1), \ldots, Z(T)$ remain independent and identically distributed, i.e. a probability measure $Q$ defined as a product measure via a measure $\tilde Q$ on $(\tilde\Omega, \tilde{\mathcal{F}})$ such that $\tilde Q(\{u\}) = q$ and $\tilde Q(\{d\}) = 1 - q$. We have:

Proposition 4.5.1. (i) A martingale measure $Q$ for the discounted stock price $\tilde S$ exists if and only if
\[ d < r < u. \tag{4.9} \]
(ii) If equation (4.9) holds true, then there is a unique such measure in $\mathcal{P}$ characterized by
\[ q = \frac{r - d}{u - d}. \tag{4.10} \]

Proof. Since $S(t) = \tilde S(t) B(t) = \tilde S(t)(1+r)^t$, we have
\[ Z(t+1) = \frac{S(t+1)}{S(t)} - 1 = \frac{\tilde S(t+1)}{\tilde S(t)} (1+r) - 1. \]
So, the discounted price $(\tilde S(t))$ is a $Q$-martingale if and only if for $t = 0, 1, \ldots, T-1$
\[ \mathbb{E}^Q[\tilde S(t+1) \mid \mathcal{F}_t] = \tilde S(t) \ \Leftrightarrow \ \mathbb{E}^Q[\tilde S(t+1)/\tilde S(t) \mid \mathcal{F}_t] = 1 \ \Leftrightarrow \ \mathbb{E}^Q[Z(t+1) \mid \mathcal{F}_t] = r. \]
But $Z(1), \ldots, Z(T)$ are mutually independent and hence $Z(t+1)$ is independent of $\mathcal{F}_t = \sigma(Z(1), \ldots, Z(t))$. So
\[ r = \mathbb{E}^Q(Z(t+1) \mid \mathcal{F}_t) = \mathbb{E}^Q(Z(t+1)) = uq + d(1-q) \]
is a weighted average of $u$ and $d$; this can be $r$ if and only if $r \in [d, u]$. As $Q$ is to be equivalent to $\mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, $r = d, u$ are excluded and (4.9) is proved. To prove uniqueness and to find the value of $q$ we simply observe that under (4.9)
\[ u \times q + d \times (1 - q) = r \]
has a unique solution. Solving it for $q$ leads to the above formula. $\Box$
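The content of Proposition 4.5.1 is easy to check numerically. The following is a minimal sketch in Python; the function name `risk_neutral_q` and the parameter values are illustrative assumptions, not from the text. It computes $q$ from (4.10), rejects parameters violating (4.9), and verifies that the discounted stock price is a one-step martingale under $q$.

```python
from math import isclose


def risk_neutral_q(u, d, r):
    """Return q = (r - d) / (u - d); condition (4.9), d < r < u, is required."""
    if not d < r < u:
        raise ValueError("no equivalent martingale measure: need d < r < u")
    return (r - d) / (u - d)


# Illustrative values (assumption).
u, d, r = 0.2, -0.1, 0.05
q = risk_neutral_q(u, d, r)

# E^Q[Z(t+1)] = u q + d (1 - q) should equal r.
expected_return = u * q + d * (1 - q)

# One-step martingale check for the discounted price:
# E^Q[S(t+1)] / (1 + r) should equal S(t).
s_t = 100.0
discounted_next = (q * s_t * (1 + u) + (1 - q) * s_t * (1 + d)) / (1 + r)

print("q =", q)
print("expected return equals r:", isclose(expected_return, r))
print("martingale property holds:", isclose(discounted_next, s_t))
```

Calling `risk_neutral_q` with, say, `r = 0.25` (so that `r > u`) raises a `ValueError`, which corresponds to the failure of condition (4.9).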
From now on we assume that (4.9) holds true. Using the above Proposition we immediately get:

Corollary 4.5.1. The Cox-Ross-Rubinstein model is arbitrage-free.

Proof. By Proposition 4.5.1 there exists an equivalent martingale measure and this is by the no-arbitrage theorem (Theorem 4.2.1) enough to guarantee that the Cox-Ross-Rubinstein model is free of arbitrage. $\Box$

Uniqueness of the solution of the linear equation $uq + d(1-q) = r$ under (4.9) gives completeness of the model, by the completeness theorem (Theorem 4.3.1):

Proposition 4.5.2. The Cox-Ross-Rubinstein model is complete.

One can translate this result - on uniqueness of the equivalent martingale measure - into financial language. Completeness means that all contingent claims can be replicated. If we do this in the large, we can do it in the small by restriction, and conversely, we can build up our full model from its constituent components. To summarize:

Corollary 4.5.2. The multi-period model is complete if and only if every underlying single-period model is complete.
We can now use the risk-neutral valuation formula to price every contingent claim in the Cox-Ross-Rubinstein model.

Proposition 4.5.3. The arbitrage price process of a contingent claim $X$ in the Cox-Ross-Rubinstein model is given by
\[ \pi_X(t) = B(t)\, \mathbb{E}^*\left( X / B(T) \mid \mathcal{F}_t \right) \quad \forall\, t = 0, 1, \ldots, T, \]
where $\mathbb{E}^*$ is the expectation operator with respect to the unique equivalent martingale measure $\mathbb{P}^*$ characterized by $p^* = (r - d)/(u - d)$.

Proof. This follows directly from Proposition 4.2.5 since the Cox-Ross-Rubinstein model is arbitrage-free and complete. $\Box$

We now give simple formulas for pricing (and hedging) of European contingent claims $X = f(S_T)$ for suitable functions $f$ (in this simple framework all functions $f : \mathbb{R} \to \mathbb{R}$). We use the notation
\[ F_\tau(x, p) := \sum_{j=0}^{\tau} \binom{\tau}{j} p^j (1-p)^{\tau - j} f\left( x (1+u)^j (1+d)^{\tau - j} \right). \tag{4.11} \]
Observe that this is just an evaluation of $f(S(j))$ along the probability-weighted paths of the price process. Accordingly, $j$ and $\tau - j$ are the numbers of times $Z(i)$ takes the values $u$ and $d$ respectively.
Corollary 4.5.3. Consider a European contingent claim with expiry $T$ given by $X = f(S_T)$. The arbitrage price process $\pi_X(t)$, $t = 0, 1, \ldots, T$ of the contingent claim is given by (set $\tau = T - t$)
\[ \pi_X(t) = (1+r)^{-\tau} F_\tau(S_t, p^*). \tag{4.12} \]

Proof. Recall that
\[ S(t) = S(0) \prod_{j=1}^{t} (1 + Z(j)), \quad t = 1, 2, \ldots, T. \]
By Proposition 4.5.3 the price $\pi_X(t)$ of a contingent claim $X = f(S_T)$ at time $t$ is
\[
\begin{aligned}
\pi_X(t) &= (1+r)^{-(T-t)}\, \mathbb{E}^*\left[ f(S(T)) \mid \mathcal{F}_t \right] \\
&= (1+r)^{-(T-t)}\, \mathbb{E}^*\left[ f\left( S(t) \prod_{i=t+1}^{T} (1 + Z(i)) \right) \,\Big|\, \mathcal{F}_t \right] \\
&= (1+r)^{-(T-t)}\, \mathbb{E}^*\left[ f\left( x \prod_{i=t+1}^{T} (1 + Z(i)) \right) \right]_{x = S(t)} \\
&= (1+r)^{-\tau} F_\tau(S(t), p^*).
\end{aligned}
\]
We used the role-of-independence property of conditional expectations from Proposition 2.5.1 in the next-to-last equality. It is applicable since $S(t)$ is $\mathcal{F}_t$-measurable and $Z(t+1), \ldots, Z(T)$ are independent of $\mathcal{F}_t$. $\Box$

An immediate consequence is the pricing formula for the European call option, i.e. $X = f(S_T)$ with $f(x) = (x - K)^+$.

Corollary 4.5.4. Consider a European call option with expiry $T$ and strike price $K$ written on (one share of) the stock $S$. The arbitrage price process $\Pi_C(t)$, $t = 0, 1, \ldots, T$ of the option is given by (set $\tau = T - t$)
\[ \Pi_C(t) = (1+r)^{-\tau} \sum_{j=0}^{\tau} \binom{\tau}{j} p^{*j} (1 - p^*)^{\tau - j} \left( S(t)(1+u)^j (1+d)^{\tau - j} - K \right)^+. \tag{4.13} \]
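The pricing formulas of Corollaries 4.5.3 and 4.5.4 translate directly into a few lines of code. The following is a minimal sketch in Python; the function names `F_tau` and `crr_price` and all parameter values are illustrative assumptions, not from the text. It evaluates $(1+r)^{-\tau} F_\tau(S_t, p^*)$ for an arbitrary payoff function and checks put-call parity at time 0.

```python
from math import comb, isclose


def F_tau(f, tau, x, p, u, d):
    """F_tau(x, p): sum over j of C(tau, j) p^j (1-p)^(tau-j) f(x (1+u)^j (1+d)^(tau-j))."""
    return sum(
        comb(tau, j) * p**j * (1 - p) ** (tau - j)
        * f(x * (1 + u) ** j * (1 + d) ** (tau - j))
        for j in range(tau + 1)
    )


def crr_price(f, s_t, t, T, u, d, r):
    """Arbitrage price (1 + r)^(-tau) * F_tau(S_t, p*) with tau = T - t."""
    p_star = (r - d) / (u - d)
    tau = T - t
    return (1 + r) ** (-tau) * F_tau(f, tau, s_t, p_star, u, d)


# Illustrative parameters (assumption).
u, d, r, T, s0, K = 0.2, -0.1, 0.05, 3, 100.0, 100.0


def call(x):
    """Payoff (x - K)^+ of the European call."""
    return max(x - K, 0.0)


def put(x):
    """Payoff (K - x)^+ of the European put."""
    return max(K - x, 0.0)


c0 = crr_price(call, s0, 0, T, u, d, r)
p0 = crr_price(put, s0, 0, T, u, d, r)

print("call price at t = 0:", c0)
print("put price at t = 0:", p0)
# Put-call parity: C - P = S(0) - K (1 + r)^(-T).
print("parity holds:", isclose(c0 - p0, s0 - K * (1 + r) ** (-T)))
```

Passing the payoff as a function keeps the sketch close to the notation $F_\tau$ of (4.11), so the same code prices any European claim of the form $f(S_T)$.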
For a European put option, we can either argue similarly or use put-call parity.

4.5.3 Hedging
Since the Cox-Ross-Rubinstein model is complete we can find unique hedging strategies for replicating contingent claims. Recall that this means we can find a self-financing portfolio $\varphi(t) = (\varphi_0(t), \varphi_1(t))$, $\varphi$ predictable, such that the value process $V_\varphi(t) = \varphi_0(t) B(t) + \varphi_1(t) S(t)$ satisfies
\[ \Pi_X(t) = V_\varphi(t), \quad \text{for all } t = 0, 1, \ldots, T. \]
Using the bond as numeraire we get the discounted equation
\[ \tilde\Pi_X(t) = \tilde V_\varphi(t) = \varphi_0(t) + \varphi_1(t) \tilde S(t), \quad \text{for all } t = 0, 1, \ldots, T. \]
By the pricing formula, Proposition 4.5.3, we know the arbitrage price process, and using the restriction of predictability of $\varphi$, this leads to a unique replicating portfolio process $\varphi$. We can compute this portfolio process at any point in time as follows. The equation $\tilde\Pi_X(t) = \varphi_0(t) + \varphi_1(t) \tilde S(t)$ has to be true for each $\omega \in \Omega$ and each $t = 1, \ldots, T$. Given such a $t$ we can only use information up to (and including) time $t - 1$ to ensure that $\varphi$ is predictable. Therefore we know $S(t-1)$, but we only know that $S(t) = (1 + Z(t)) S(t-1)$. However, the fact that $Z(t) \in \{d, u\}$ leads to the following system of equations, which can be solved for $\varphi_0(t)$ and $\varphi_1(t)$ uniquely. Making the dependence of $\tilde\Pi_X$ on $\tilde S$ explicit, and writing $1 + \tilde u = (1+u)/(1+r)$ and $1 + \tilde d = (1+d)/(1+r)$ for the discounted one-period factors (so that $\tilde S(t) = (1 + \tilde u) \tilde S(t-1)$ or $(1 + \tilde d) \tilde S(t-1)$), we have
\[
\begin{aligned}
\tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde u)) &= \varphi_0(t) + \varphi_1(t) \tilde S_{t-1}(1 + \tilde u), \\
\tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde d)) &= \varphi_0(t) + \varphi_1(t) \tilde S_{t-1}(1 + \tilde d).
\end{aligned}
\]
This gives two simultaneous linear equations in two unknowns, with solution
\[
\begin{aligned}
\varphi_0(t) &= \frac{\tilde S_{t-1}(1 + \tilde u)\, \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde d)) - \tilde S_{t-1}(1 + \tilde d)\, \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde u))}{\tilde S_{t-1}(1 + \tilde u) - \tilde S_{t-1}(1 + \tilde d)} \\
&= \frac{(1 + \tilde u)\, \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde d)) - (1 + \tilde d)\, \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde u))}{\tilde u - \tilde d}, \\
\varphi_1(t) &= \frac{\tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde u)) - \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde d))}{\tilde S_{t-1}(1 + \tilde u) - \tilde S_{t-1}(1 + \tilde d)} \\
&= \frac{\tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde u)) - \tilde\Pi_X(t, \tilde S_{t-1}(1 + \tilde d))}{\tilde S_{t-1}(\tilde u - \tilde d)}.
\end{aligned}
\]
Observe that we only need to have information up to time $t - 1$ to compute $\varphi(t)$, hence $\varphi$ is predictable. We make this rather abstract construction more transparent by constructing the hedge portfolio for the European contingent claims.

Proposition 4.5.4. The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European contingent claim $f(S_T)$ with time of expiry $T$ is given by (again using $\tau = T - t$)
\[
\begin{aligned}
\varphi_1(t) &= \frac{(1+r)^{-\tau} \left( F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*) \right)}{S_{t-1}(u - d)}, \\
\varphi_0(t) &= \frac{(1+u) F_\tau(S_{t-1}(1+d), p^*) - (1+d) F_\tau(S_{t-1}(1+u), p^*)}{(u - d)(1+r)^T}.
\end{aligned}
\]
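Proof. $(1+r)^{-\tau} F_\tau(S_t, p^*)$ must be the value of the portfolio at time $t$ if the strategy $\varphi = (\varphi(t))$ replicates the claim:
\[ \varphi_0(t)(1+r)^t + \varphi_1(t) S(t) = (1+r)^{-\tau} F_\tau(S_t, p^*). \]
Now $S(t) = S(t-1)(1 + Z(t)) = S(t-1)(1+u)$ or $S(t-1)(1+d)$, so:
\[
\begin{aligned}
\varphi_0(t)(1+r)^t + \varphi_1(t) S(t-1)(1+u) &= (1+r)^{-\tau} F_\tau(S_{t-1}(1+u), p^*), \\
\varphi_0(t)(1+r)^t + \varphi_1(t) S(t-1)(1+d) &= (1+r)^{-\tau} F_\tau(S_{t-1}(1+d), p^*).
\end{aligned}
\]
Subtract:
\[ \varphi_1(t) S(t-1)(u - d) = (1+r)^{-\tau} \left( F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*) \right). \]
So
\[ \varphi_1(t) = \frac{(1+r)^{-\tau} \left( F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*) \right)}{S(t-1)(u - d)}. \]
Using any of the equations in the above system and solving for $\varphi_0(t)$ completes the proof. $\Box$

To write the corresponding result for the European call, we use the notation
\[ C(\tau, x) := \sum_{j=0}^{\tau} \binom{\tau}{j} p^{*j} (1 - p^*)^{\tau - j} \left( x(1+u)^j (1+d)^{\tau - j} - K \right)^+. \]
Then $(1+r)^{-\tau} C(\tau, x)$ is the value of the call at time $t$ (with time to expiry $\tau$) given that $S(t) = x$.

Corollary 4.5.5. The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European call option with time of expiry $T$ and strike price $K$ is given by
\[
\begin{aligned}
\varphi_1(t) &= \frac{(1+r)^{-\tau} \left( C(\tau, S_{t-1}(1+u)) - C(\tau, S_{t-1}(1+d)) \right)}{S_{t-1}(u - d)}, \\
\varphi_0(t) &= \frac{(1+u)\, C(\tau, S_{t-1}(1+d)) - (1+d)\, C(\tau, S_{t-1}(1+u))}{(u - d)(1+r)^T}.
\end{aligned}
\]
Notice that the numerator in the equation for $\varphi_1(t)$ is the difference of two values of $C(\tau, x)$, with the larger value of $x$ in the first term (recall $u > d$). When the payoff function $C(\tau, x)$ is an increasing function of $x$, as for the European call option considered here, this is non-negative. In this case $\varphi_1(t) \geq 0$, which gives: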
Corollary 4.5.6. When the payoff function is a non-decreasing function of
the asset price S(t) , the perfect-hedging strategy replicating the claim does not involve short-selling of the risky asset.
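The hedging formulas for the European call can be verified path by path. The following is a minimal sketch in Python, assuming illustrative parameter values and hypothetical helper names (`C`, `hedge`, `value`, `check_path`); it computes $(\varphi_0(t), \varphi_1(t))$ from $S(t-1)$ alone, then checks on every path that the portfolio is self-financing, replicates the call value at each date, and never shorts the stock.

```python
from itertools import product
from math import comb, isclose

# Illustrative parameters (assumption).
u, d, r, T, K, s0 = 0.2, -0.1, 0.05, 3, 100.0, 100.0
p_star = (r - d) / (u - d)


def C(tau, x):
    """C(tau, x): undiscounted risk-neutral expected call payoff, tau periods to expiry."""
    return sum(
        comb(tau, j) * p_star**j * (1 - p_star) ** (tau - j)
        * max(x * (1 + u) ** j * (1 + d) ** (tau - j) - K, 0.0)
        for j in range(tau + 1)
    )


def hedge(t, s_prev):
    """Return (phi0(t), phi1(t)), which depend on S(t-1) only (predictability)."""
    tau = T - t
    c_up, c_dn = C(tau, s_prev * (1 + u)), C(tau, s_prev * (1 + d))
    phi1 = (1 + r) ** (-tau) * (c_up - c_dn) / (s_prev * (u - d))
    phi0 = ((1 + u) * c_dn - (1 + d) * c_up) / ((u - d) * (1 + r) ** T)
    return phi0, phi1


def value(t, s_t):
    """Call value (1 + r)^(-tau) C(tau, S_t) at time t."""
    return (1 + r) ** (-(T - t)) * C(T - t, s_t)


def check_path(path):
    """Check self-financing and replication along one path of 'u'/'d' moves."""
    s_prev, holdings = s0, []
    for t, move in enumerate(path, start=1):
        phi0, phi1 = hedge(t, s_prev)
        s_t = s_prev * (1 + (u if move == "u" else d))
        # Cost of the portfolio at t-1 equals the call value at t-1.
        cost = phi0 * (1 + r) ** (t - 1) + phi1 * s_prev
        assert isclose(cost, value(t - 1, s_prev), abs_tol=1e-9)
        # Portfolio value at t equals the call value at t.
        worth = phi0 * (1 + r) ** t + phi1 * s_t
        assert isclose(worth, value(t, s_t), abs_tol=1e-9)
        holdings.append((phi0, phi1))
        s_prev = s_t
    return holdings


all_holdings = [check_path(path) for path in product("ud", repeat=T)]
no_short_selling = all(phi1 >= -1e-12 for h in all_holdings for _, phi1 in h)
print("replication verified on", len(all_holdings), "paths")
print("no short-selling of the stock:", no_short_selling)
```

The small tolerances only absorb floating-point rounding; the identities being checked are exact in the model.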
If we do not use the pricing formula from Proposition 4.5.3 (i.e. the information on the price process), but only the final values of the option (or more generally of a contingent claim), we are still able to compute the arbitrage price and to construct the hedging portfolio by backward induction. In essence this is again only applying the one-period calculations for each time interval and each state of the world. We outline this procedure for the European call starting with the last period $[T-1, T]$. We have to choose a replicating portfolio $\varphi(T) = (\varphi_0(T), \varphi_1(T))$ based on the information available at time $T - 1$ (and so $\mathcal{F}_{T-1}$-measurable). So for each $\omega \in \Omega$ the following equation has to hold:
\[ \Pi_X(T, \omega) = \varphi_0(T, \omega) B(T, \omega) + \varphi_1(T, \omega) S(T, \omega). \]
Given the information $\mathcal{F}_{T-1}$ we know all but the last coordinate of $\omega$, and this gives rise to two equations (with the same notation as above):
\[
\begin{aligned}
\Pi_X(T, S_{T-1}(1+u)) &= \varphi_0(T)(1+r)^T + \varphi_1(T) S_{T-1}(1+u), \\
\Pi_X(T, S_{T-1}(1+d)) &= \varphi_0(T)(1+r)^T + \varphi_1(T) S_{T-1}(1+d).
\end{aligned}
\]
Since we know the payoff structure of the contingent claim at time $T$, for example in the case of a European call $\Pi_X(T, S_{T-1}(1+u)) = ((1+u) S_{T-1} - K)^+$ and $\Pi_X(T, S_{T-1}(1+d)) = ((1+d) S_{T-1} - K)^+$, we can solve the above system and obtain
\[
\begin{aligned}
\varphi_0(T) &= \frac{(1+u)\, \Pi_X(T, S_{T-1}(1+d)) - (1+d)\, \Pi_X(T, S_{T-1}(1+u))}{(u - d)(1+r)^T}, \\
\varphi_1(T) &= \frac{\Pi_X(T, S_{T-1}(1+u)) - \Pi_X(T, S_{T-1}(1+d))}{S_{T-1}(u - d)}.
\end{aligned}
\]
Using this portfolio one can compute the arbitrage price of the contingent claim at time $T - 1$ given that the current asset price is $S_{T-1}$ as
\[ \Pi_X(T-1, S_{T-1}) = \varphi_0(T)(1+r)^{T-1} + \varphi_1(T) S_{T-1}. \]
Now the arbitrage prices at time $T - 1$ are known and one can repeat the procedure to successively compute the prices at $T - 2, \ldots, 1, 0$. The advantage of our risk-neutral pricing procedure over this approach is that we have a single formula for the price of the contingent claim at all times $t$ at once, and do not have to use a backward induction only to compute a price at a special time $t$.
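The backward induction just described can be sketched in a few lines. The following Python code is a minimal illustration under assumed parameter values; the function name `backward_induction` is hypothetical. At each node it solves the two replication equations for $(\varphi_0, \varphi_1)$ and takes the cost of that portfolio as the price one period earlier, and it compares the result at time 0 with the closed-form binomial sum.

```python
from math import comb, isclose


def backward_induction(payoff, s0, T, u, d, r):
    """Price payoff(S_T) by one-period replication, stepping back from T to 0."""
    # prices[j]: claim value at the current date in the node with j up-moves.
    prices = [payoff(s0 * (1 + u) ** j * (1 + d) ** (T - j)) for j in range(T + 1)]
    for t in range(T, 0, -1):                     # period [t-1, t]
        new_prices = []
        for j in range(t):                        # node at time t-1 with j up-moves
            s_prev = s0 * (1 + u) ** j * (1 + d) ** (t - 1 - j)
            pi_up, pi_dn = prices[j + 1], prices[j]
            # Solve pi_up = phi0 (1+r)^t + phi1 s_prev (1+u) and the d-equation.
            phi1 = (pi_up - pi_dn) / (s_prev * (u - d))
            phi0 = ((1 + u) * pi_dn - (1 + d) * pi_up) / ((u - d) * (1 + r) ** t)
            # Price at t-1 is the cost of setting up this portfolio.
            new_prices.append(phi0 * (1 + r) ** (t - 1) + phi1 * s_prev)
        prices = new_prices
    return prices[0]


# Illustrative parameters (assumption).
u, d, r, T, s0, K = 0.2, -0.1, 0.05, 3, 100.0, 100.0


def call(x):
    """Payoff (x - K)^+ of the European call."""
    return max(x - K, 0.0)


price = backward_induction(call, s0, T, u, d, r)

# Closed-form risk-neutral price at t = 0 for comparison.
p_star = (r - d) / (u - d)
closed_form = (1 + r) ** (-T) * sum(
    comb(T, j) * p_star**j * (1 - p_star) ** (T - j)
    * call(s0 * (1 + u) ** j * (1 + d) ** (T - j))
    for j in range(T + 1)
)
print("backward induction:", price)
print("agrees with closed form:", isclose(price, closed_form))
```

The sketch exploits the fact that the tree recombines, so a node is identified by its number of up-moves; this is a convenience of the implementation rather than part of the argument in the text.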
4.6 Binomial Approximations
Suppose we observe financial assets during a continuous time period $[0, T]$. To construct a stochastic model of the price processes of these assets (to value contingent claims, for example) one basically has two choices: one could model the processes as continuous-time stochastic processes (for which the theory of stochastic calculus is needed) or one could construct a sequence of discrete-time models in which the continuous-time price processes are approximated by discrete-time stochastic processes in a suitable sense. We describe the second approach now by examining the asymptotic properties of a sequence of Cox-Ross-Rubinstein models.

4.6.1 Model Structure
We assume that all random variables subsequently introduced are defined on a suitable probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We want to model two assets, a riskless bond $B$ and a risky stock $S$, which we now observe in a continuous-time interval $[0, T]$. To transfer the continuous-time framework into a binomial structure we make the following adjustments. Looking at the $n$th Cox-Ross-Rubinstein model in our sequence, there is a prespecified number $k_n$ of trading dates. We set $\Delta_n = T / k_n$ and divide $[0, T]$ in $k_n$ subintervals of length $\Delta_n$, namely $I_j = [j \Delta_n, (j+1) \Delta_n]$, $j = 0, \ldots, k_n - 1$. We suppose that trading occurs only at the equidistant time points $t_{n,j} = j \Delta_n$, $j = 0, \ldots, k_n - 1$. We fix $r_n$ as the riskless interest rate over each interval $I_j$, and hence the bond process (in the $n$th model) is given by
\[ B(t_{n,j}) = (1 + r_n)^j, \quad j = 0, \ldots, k_n. \]
In the continuous-time model we compound continuously with spot rate $r \geq 0$ and hence the bond price process $B(t)$ is given by $B(t) = e^{rt}$. In order to approximate this process in the discrete-time framework, we choose $r_n$ such that
\[ 1 + r_n = e^{r \Delta_n}. \tag{4.14} \]
With this choice we have for any $j = 0, \ldots, k_n$ that $(1 + r_n)^j = \exp(r j \Delta_n) = \exp(r t_{n,j})$. Thus we have approximated the bond process exactly at the time points of the discrete model. Next we model the one-period returns $S(t_{n,j+1}) / S(t_{n,j})$ of the stock by a family of random variables $Z_{n,i}$; $i = 1, \ldots, k_n$ taking values $\{d_n, u_n\}$ with
\[ \mathbb{P}(Z_{n,i} = u_n) = p_n = 1 - \mathbb{P}(Z_{n,i} = d_n) \]
for some $p_n \in (0, 1)$, which relate to the drift and volatility parameter $\sigma > 0$ of the stock. With these $Z_{n,j}$ we model the stock price process $S_n$ in the $n$th Cox-Ross-Rubinstein model as
\[ S_n(t_{n,j}) = S_n(0) \prod_{i=1}^{j} (1 + Z_{n,i}), \quad j = 0, 1, \ldots, k_n. \]
With the specification of the one-period returns we get a complete description of the discrete dynamics of the stock price process in each Cox-Ross-Rubinstein model. We call such a finite sequence $Z_n = (Z_{n,i})_{i=1}^{k_n}$ a lattice or tree. The parameters $u_n$, $d_n$, $p_n$, $k_n$ differ from lattice to lattice, but remain constant throughout a specific lattice. In the triangular array $(Z_{n,i})$, $i = 1, \ldots, k_n$; $n = 1, 2, \ldots$ we assume that the random variables are row-wise independent (but we allow dependence between rows). The approximation of a continuous-time setting by a sequence of lattices is called the lattice approach. It is important to stress that for each $n$ we get a different discrete stock price process $S_n(t)$ and that in general these processes do not coincide on common time points (and are also different from the price process $S(t)$).

Turning back to a specific Cox-Ross-Rubinstein model, we now have as in §4.5 a discrete-time bond and stock price process. We want arbitrage-free financial market models and therefore have to choose the parameters $u_n$, $d_n$, $p_n$ accordingly. An arbitrage-free financial market model is guaranteed by the existence of an equivalent martingale measure, and by Proposition 4.5.1 (i) the (necessary and) sufficient condition for that is
\[ d_n < r_n < u_n. \]
The risk-neutrality approach implies that the expected (under an equivalent martingale measure) one-period return must equal the one-period return of the riskless bond and hence we get (see Proposition 4.5.1 (ii))
\[ p_n^* = \frac{r_n - d_n}{u_n - d_n}. \tag{4.15} \]
So the only parameters to choose freely in the model are $u_n$ and $d_n$. In the next sections we consider some special choices.

4.6.2 The Black-Scholes Option Pricing Formula
We now choose the parameters in the above lattice approach in a special way. Assuming the risk-free rate of interest $r$ as given, we have by (4.14) $1 + r_n = e^{r \Delta_n}$, and the remaining degrees of freedom are resolved by choosing $u_n$ and $d_n$. We use the following choice:
\[ 1 + u_n = e^{\sigma \sqrt{\Delta_n}}, \quad \text{and} \quad 1 + d_n = (1 + u_n)^{-1} = e^{-\sigma \sqrt{\Delta_n}}. \]
By Condition (4.15) the risk-neutral probabilities for the corresponding single-period models are given by
\[ p_n^* = \frac{r_n - d_n}{u_n - d_n} = \frac{e^{r \Delta_n} - e^{-\sigma \sqrt{\Delta_n}}}{e^{\sigma \sqrt{\Delta_n}} - e^{-\sigma \sqrt{\Delta_n}}}. \]
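The effect of this parameter choice can be explored numerically. The following is a minimal sketch in Python, assuming illustrative market data and hypothetical function names (`crr_call_n`, `black_scholes_call`); it prices a European call at time 0 in the $n$th model by applying formula (4.13) with $r_n$, $u_n$, $d_n$, $p_n^*$ as above, and compares the result with the standard Black-Scholes value for several $k_n$.

```python
from math import comb, erf, exp, log, sqrt


def crr_call_n(s0, K, r, sigma, T, k_n):
    """Time-0 call price in the CRR model with k_n periods of length T / k_n."""
    dt = T / k_n
    r_n = exp(r * dt) - 1                      # (4.14)
    u_n = exp(sigma * sqrt(dt)) - 1
    d_n = exp(-sigma * sqrt(dt)) - 1
    if not d_n < r_n < u_n:
        raise ValueError("model admits arbitrage: need d_n < r_n < u_n")
    p_n = (r_n - d_n) / (u_n - d_n)            # (4.15)
    total = sum(
        comb(k_n, j) * p_n**j * (1 - p_n) ** (k_n - j)
        * max(s0 * (1 + u_n) ** j * (1 + d_n) ** (k_n - j) - K, 0.0)
        for j in range(k_n + 1)
    )
    return (1 + r_n) ** (-k_n) * total


def black_scholes_call(s, K, r, sigma, T):
    """Standard Black-Scholes price of a European call at time 0."""
    def N(x):
        """Standard normal cumulative distribution function."""
        return 0.5 * (1 + erf(x / sqrt(2)))
    d1 = (log(s / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return s * N(d1) - K * exp(-r * T) * N(d2)


# Illustrative market data (assumption).
s0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
bs = black_scholes_call(s0, K, r, sigma, T)
errors = {}
# k_n is kept moderate: comb(k_n, j) is converted to float in the sum.
for k_n in (10, 100, 500):
    errors[k_n] = abs(crr_call_n(s0, K, r, sigma, T, k_n) - bs)
    print(k_n, errors[k_n])
```

For much larger $k_n$ the binomial coefficients no longer fit into a float, and one would sum in log space or use backward induction instead; the sketch deliberately stays with the direct formula.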
We can now price contingent claims in each Cox-Ross-Rubinstein model using the expectation operator with respect to the (unique) equivalent martingale measure characterized by the probabilities $p_n^*$ (compare §4.5.2). In particular we can compute the price $\Pi_C(t)$ at time $t$ of a European call on the stock $S$ with strike $K$ and expiry $T$ by Formula (4.13) of Corollary 4.5.4. Let us reformulate this formula slightly. We define
\[ a_n = \min\left\{ j \in \mathbb{N}_0 : S(0)(1 + u_n)^j (1 + d_n)^{k_n - j} > K \right\}. \tag{4.16} \]
Then we can rewrite the pricing formula (4.13) for $t = 0$ in the setting of the $n$th Cox-Ross-Rubinstein model as
\[
\begin{aligned}
\Pi_C^{(n)}(0) &= (1 + r_n)^{-k_n} \sum_{j = a_n}^{k_n} \binom{k_n}{j} p_n^{*j} (1 - p_n^*)^{k_n - j} \left( S(0)(1 + u_n)^j (1 + d_n)^{k_n - j} - K \right) \\
&= S(0) \left[ \sum_{j = a_n}^{k_n} \binom{k_n}{j} \left( \frac{p_n^*(1 + u_n)}{1 + r_n} \right)^j \left( \frac{(1 - p_n^*)(1 + d_n)}{1 + r_n} \right)^{k_n - j} \right] \\
&\quad - (1 + r_n)^{-k_n} K \left[ \sum_{j = a_n}^{k_n} \binom{k_n}{j} p_n^{*j} (1 - p_n^*)^{k_n - j} \right].
\end{aligned}
\]
Denoting the binomial cumulative distribution function with parameters $(n, p)$ as $B^{n,p}(\cdot)$, we see that the second bracketed expression is just
\[ \bar B^{k_n, p_n^*}(a_n) = 1 - B^{k_n, p_n^*}(a_n). \]
Also the first bracketed expression is $\bar B^{k_n, \hat p_n}(a_n)$ with
\[ \hat p_n = \frac{p_n^*(1 + u_n)}{1 + r_n}. \]
That $\hat p_n$ is indeed a probability can be shown straightforwardly. Using this notation we have in the $n$th Cox-Ross-Rubinstein model for the price of a European call at time $t = 0$ the following formula:
\[ \Pi_C^{(n)}(0) = S_n(0)\, \bar B^{k_n, \hat p_n}(a_n) - K (1 + r_n)^{-k_n}\, \bar B^{k_n, p_n^*}(a_n). \tag{4.17} \]
(We stress again that the underlying is $S_n(t)$, dependent on $n$, but $S_n(0) = S(0)$ for all $n$.) We now look at the limit of this expression.

Proposition 4.6.1. We have the following limit relation:
\[ \lim_{n \to \infty} \Pi_C^{(n)}(0) = \Pi_C^{BS}(0) \]
with $\Pi_C^{BS}(0)$ given by the Black-Scholes formula (we use $S = S(0)$ to ease the notation)
$$\Pi_C^{BS}(0) = S\, N(d_1(S,T)) - K e^{-rT} N(d_2(S,T)).$$
The functions $d_1(s,t)$ and $d_2(s,t)$ are given by
$$d_1(s,t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}, \qquad d_2(s,t) = d_1(s,t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}},$$
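and $N(\cdot)$ is the standard normal cumulative distribution function.

Proof. Since $S_n(0) = S$ (say) all we have to do to prove the proposition is to show
(i) $\lim_{n\to\infty} \bar B^{k_n, \hat p_n}(a_n) = N(d_1(S,T))$,
(ii) $\lim_{n\to\infty} \bar B^{k_n, p_n^*}(a_n) = N(d_2(S,T))$.
These statements involve the convergence of distribution functions. To show (i) we interpret
$$\bar B^{k_n, \hat p_n}(a_n) = \mathbb{P}(a_n \le Y_n \le k_n)$$
with $(Y_n)$ a sequence of random variables distributed according to the binomial law with parameters $(k_n, \hat p_n)$. We normalize $Y_n$ to
$$\tilde Y_n = \frac{Y_n - \mathbb{E}(Y_n)}{\sqrt{\mathrm{Var}(Y_n)}} = \frac{Y_n - k_n \hat p_n}{\sqrt{k_n \hat p_n (1 - \hat p_n)}} = \frac{\sum_{j=1}^{k_n} (B_{j,n} - \hat p_n)}{\sqrt{k_n \hat p_n (1 - \hat p_n)}},$$
where $B_{j,n}$, $j = 1, \ldots, k_n$; $n = 1, 2, \ldots$ are row-wise independent Bernoulli random variables with parameter $\hat p_n$. Now using the central limit theorem we know that for $\alpha_n \to \alpha$, $\beta_n \to \beta$ we have
$$\lim_{n\to\infty} \mathbb{P}(\alpha_n \le \tilde Y_n \le \beta_n) = N(\beta) - N(\alpha).$$
By definition we have
$$\mathbb{P}(a_n \le Y_n \le k_n) = \mathbb{P}(\alpha_n \le \tilde Y_n \le \beta_n)$$
with
$$\alpha_n = \frac{a_n - k_n \hat p_n}{\sqrt{k_n \hat p_n (1 - \hat p_n)}}, \qquad \beta_n = \frac{k_n (1 - \hat p_n)}{\sqrt{k_n \hat p_n (1 - \hat p_n)}}.$$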
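Using the following limiting relations:
$$\lim_{n\to\infty} \hat p_n = \frac{1}{2}, \qquad \lim_{n\to\infty} k_n (1 - 2\hat p_n)\sqrt{\Delta_n} = -T\left( \frac{r}{\sigma} + \frac{\sigma}{2} \right),$$
and the defining relation for $a_n$, Formula (4.16), we get
$$\lim_{n\to\infty} \alpha_n = \lim_{n\to\infty} \frac{\dfrac{\log(K/S) + k_n \sigma\sqrt{\Delta_n}}{2\sigma\sqrt{\Delta_n}} - k_n \hat p_n}{\sqrt{k_n \hat p_n (1 - \hat p_n)}} = \lim_{n\to\infty} \frac{\log(K/S) + \sigma k_n \sqrt{\Delta_n}(1 - 2\hat p_n)}{2\sigma\sqrt{k_n \Delta_n \hat p_n (1 - \hat p_n)}}$$
$$= \frac{\log(K/S) - (r + \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_1(S,T).$$
Furthermore we have
$$\lim_{n\to\infty} \beta_n = \lim_{n\to\infty} \sqrt{k_n\, \hat p_n^{-1} (1 - \hat p_n)} = +\infty.$$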
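So $N(\beta_n) \to 1$, $N(\alpha_n) \to N(-d_1) = 1 - N(d_1)$, completing the proof of (i). To prove (ii) we can argue in very much the same way and arrive at parameters $\alpha_n^*$ and $\beta_n^*$ with $\hat p_n$ replaced by $p_n^*$. Using the following limiting relations:
$$\lim_{n\to\infty} p_n^* = \frac{1}{2}, \qquad \lim_{n\to\infty} k_n (1 - 2p_n^*)\sqrt{\Delta_n} = T\left( \frac{\sigma}{2} - \frac{r}{\sigma} \right),$$
we get
$$\lim_{n\to\infty} \alpha_n^* = \lim_{n\to\infty} \frac{\log(K/S) + \sigma k_n \sqrt{\Delta_n}(1 - 2p_n^*)}{2\sigma\sqrt{k_n \Delta_n p_n^* (1 - p_n^*)}} = \frac{\log(K/S) - (r - \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_2(S,T).$$
For the upper limit we get
$$\lim_{n\to\infty} \beta_n^* = \lim_{n\to\infty} \sqrt{k_n (p_n^*)^{-1} (1 - p_n^*)} = +\infty,$$
whence (ii) follows similarly. □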
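The limit relation of Proposition 4.6.1 can be observed numerically. The sketch below evaluates formula (4.17) through binomial tail sums and compares the result with the Black-Scholes value. It assumes $\Delta_n = T/k_n$; the helper names (`norm_cdf`, `binom_tail`, `crr_call`, `black_scholes_call`) and the inputs are illustrative choices, not notation from the text.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(.)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binom_tail(k, p, a):
    """P(Y >= a) for Y ~ Binomial(k, p); terms are computed in log space."""
    total = 0.0
    for j in range(max(a, 0), k + 1):
        log_term = (math.lgamma(k + 1) - math.lgamma(j + 1) - math.lgamma(k - j + 1)
                    + j * math.log(p) + (k - j) * math.log(1.0 - p))
        total += math.exp(log_term)
    return total

def crr_call(S0, K, r, sigma, T, k):
    """European call price at t = 0 via formula (4.17) with k periods."""
    delta = T / k
    up = math.exp(sigma * math.sqrt(delta))     # 1 + u_n
    down = 1.0 / up                             # 1 + d_n
    growth = math.exp(r * delta)                # 1 + r_n
    p_star = (growth - down) / (up - down)
    p_hat = p_star * up / growth
    # a_n from (4.16): smallest j with S0 * up^j * down^(k - j) > K
    a = math.floor((math.log(K / S0) + k * math.log(up)) / (2.0 * math.log(up))) + 1
    return S0 * binom_tail(k, p_hat, a) - K * growth ** (-k) * binom_tail(k, p_star, a)

def black_scholes_call(S0, K, r, sigma, T):
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

bs = black_scholes_call(100.0, 100.0, 0.06, 0.2, 1.0)
for k in (3, 30, 300, 2000):
    print(k, round(crr_call(100.0, 100.0, 0.06, 0.2, 1.0, k), 4), round(bs, 4))
```

The tail sums are accumulated in log space only to avoid overflow of the binomial coefficients for large $k$; this is an implementation choice of the sketch.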
By the above proposition we have derived the classical Black-Scholes European call option valuation formula as an asymptotic limit of option prices in a sequence of Cox-Ross-Rubinstein type models with a special choice of parameters. We will therefore call these models discrete Black-Scholes models. Let us mention here that in the continuous-time Black-Scholes model the dynamics of the (stochastic) stock price process $S(t)$ are modeled by a geometric Brownian motion (or exponential Wiener process). The sample paths of this stochastic price process are almost all continuous and the probability law of $S(t)$ at any time $t$ is lognormal. In particular the time $T$ distribution of $\log\{S(T)/S(0)\}$ is $N(T\mu, T\sigma^2)$ (here $\mu$ is the growth rate, $\sigma$ the volatility of the stock). Looking back at the construction of our sequence of Cox-Ross-Rubinstein models we see that
$$\log\frac{S_n(T)}{S(0)} = \sum_{i=1}^{k_n} \log(1 + Z_{n,i}),$$
with $\log(1 + Z_{n,i})$ Bernoulli random variables with
$$\mathbb{P}(\log(1 + Z_{n,i}) = \sigma\sqrt{\Delta_n}) = p_n^* = 1 - \mathbb{P}(\log(1 + Z_{n,i}) = -\sigma\sqrt{\Delta_n}).$$
By the (triangular array version of the) central limit theorem, we know that $\log(S_n(T)/S(0))$ properly normalized converges in distribution to a random variable with standard normal distribution. Doing similar calculations as in the above proposition we can compute the normalizing constants and get
$$\lim_{n\to\infty} \log\frac{S_n(T)}{S(0)} \sim N\left( T(r - \sigma^2/2),\, T\sigma^2 \right),$$
i.e. $S_n(T)/S(0)$ is in the limit lognormally distributed. Using the terminology of weak convergence, we can therefore say that the probability measures $\mathbb{P}_n$ induced by the distributions of $S_n(T)/S(0)$ converge to the probability measure $Q$ induced by $N(T(r - \sigma^2/2), T\sigma^2)$. Therefore as a direct consequence of the definition of weak convergence we have

Proposition 4.6.2. Let $X$ be a contingent claim of the form $X = h(S(T))$ with $h$ a bounded, uniformly continuous real function. Denote by $\Pi_X^n$ resp. $\Pi_X$ the time $t = 0$ price of $X$ in the $n$th discrete-time resp. the continuous-time Black-Scholes market model. Then
$$\lim_{n\to\infty} \Pi_X^n = \Pi_X.$$

Proof. Writing the pricing formula for the contingent claim using the expectation operator with respect to the risk-neutral probability measures, we have
$$\Pi_X^n = \mathbb{E}_{\mathbb{P}_n}(h(S_n(T))) = \int h \, d\mathbb{P}_n, \quad\text{resp.}\quad \Pi_X = \mathbb{E}_Q(h(S(T))) = \int h \, dQ$$
(since the $\sigma$-field at $t = 0$ is assumed to be trivial, we can use expectation instead of conditional expectation). The result now follows from the portmanteau theorem of weak-convergence theory (see e.g. the book Billingsley (1968), §1.2). □
Example. Using $h(x) = \max\{0, (K - x)\}$ we get the above convergence for the European put option, and put-call parity gives the result for the European call option (as above). Observe that $g(x) = \max\{0, (x - K)\}$ is unbounded, so we can't apply Proposition 4.6.2 to give another direct proof of Proposition 4.6.1.
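The convergence for a bounded payoff can be illustrated by computing the risk-neutral expectation directly over the binomial law of the number of up moves. This is a sketch under the assumptions $\Delta_n = T/k$ and an explicit discount factor $e^{-rT} = (1 + r_n)^{-k}$ in front of the expectation; the function names and inputs are illustrative.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def crr_price(payoff, S0, r, sigma, T, k):
    """Discounted risk-neutral expectation of payoff(S_n(T)) in the k-period model."""
    delta = T / k
    up = math.exp(sigma * math.sqrt(delta))
    down = 1.0 / up
    p = (math.exp(r * delta) - down) / (up - down)
    expectation = 0.0
    for j in range(k + 1):                       # j = number of up moves
        log_prob = (math.lgamma(k + 1) - math.lgamma(j + 1) - math.lgamma(k - j + 1)
                    + j * math.log(p) + (k - j) * math.log(1.0 - p))
        expectation += math.exp(log_prob) * payoff(S0 * up ** (2 * j - k))
    return math.exp(-r * T) * expectation

def black_scholes_put(S0, K, r, sigma, T):
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S0 * norm_cdf(-d1)

K = 100.0
put_payoff = lambda s: max(K - s, 0.0)           # bounded by K, as Proposition 4.6.2 requires
for k in (3, 30, 300, 2000):
    print(k, round(crr_price(put_payoff, 100.0, 0.06, 0.2, 1.0, k), 4),
          round(black_scholes_put(100.0, K, 0.06, 0.2, 1.0), 4))
```

Any other bounded payoff function can be passed as `payoff` in the same way.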
We now turn briefly to different choices of $u_n$ and $d_n$ and their effects.

4.6.3 Further Limiting Models

As already mentioned, different choices of the sequences $(u_n)$ and $(d_n)$ lead to different asymptotic stock price processes. We briefly discuss two possible choices.

Jump Stock Price Movements
The key to the results in the last section was the weak convergence of the sequence of random variables $\log(S_n(T)/S(0))$. To show this convergence we basically used the De Moivre-Laplace theorem for binomial random variables. We now use another classical limit theorem for binomial random variables - the 'weak law of small numbers' or 'law of rare events', which states that for certain parameters the limiting distribution is a Poisson distribution (compare §2.9). Indeed, if we choose $u_n = u = e^{\zeta}$, $\zeta > 0$ (independent of $n$) and $d_n = e^{\xi\Delta_n}$ with some $0 < \xi < r$ we have (for $n$ large enough) an arbitrage-free market model with unique risk-neutral probabilities $p_n^*$ given by
$$p_n^* = \frac{\exp(r\Delta_n) - \exp(\xi\Delta_n)}{u - \exp(\xi\Delta_n)} \to 0, \quad (n \to \infty).$$
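A short numerical check makes the rare-event regime visible. The sketch assumes $n$ periods of length $\Delta_n = T/n$ and uses illustrative values for $r$, $\xi$ and $u$; the function name is hypothetical. It prints $p_n^*$, which tends to zero, and the expected number of up moves $n p_n^*$, which in this sketch settles near $T(r - \xi)/(u - 1)$.

```python
import math

def jump_model_probability(r, xi, u, T, n):
    """Risk-neutral up probability p_n^* for u_n = u and d_n = exp(xi * Delta_n)."""
    delta = T / n                                # assumption: Delta_n = T / n
    d_n = math.exp(xi * delta)
    return (math.exp(r * delta) - d_n) / (u - d_n)

r, xi, u, T = 0.06, 0.02, 1.5, 1.0               # illustrative values with 0 < xi < r, u > 1
limit = T * (r - xi) / (u - 1.0)
for n in (10, 100, 1000, 10000):
    p = jump_model_probability(r, xi, u, T, n)
    print(n, p, n * p, limit)
```

A finite limit of the expected number of up moves is exactly the situation in which a Poisson limit for binomial counts is to be expected.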
For this lattice approach the step size of an upward move remains constant through all Cox-Ross-Rubinstein models, but the probability it will occur becomes very small. On the other hand, the size of a downward move becomes very small (as $\Delta_n \to 0$, we have $d_n \to 1$), but its probability becomes very close to 1. Recall that in the sequence of Cox-Ross-Rubinstein models we modeled the stock price at time $T$ as
$$S_n(T) = S(0) \prod_{i=1}^{k_n} (1 + Z_{n,i})$$
with $\log(1 + Z_{n,i})$ Bernoulli random variables. Given the size of the up and down movements and the probabilities $p_n^*$ as above, an application of the law of rare events (see §2.9) shows that the corresponding sequence of equivalent probability measures $\mathbb{P}_n$ of the Cox-Ross-Rubinstein models converges weakly to the probability measure $Q$ induced by a Poisson distribution with parameter
$$\lambda = \frac{T u (r - \xi)}{u - 1}.$$
We can apply the portmanteau theorem again to find the valuation formula of a European put and use put-call parity to get the pricing formula for a European call. We use the following notation: $C_n$ is the time $t = 0$ price of a European call in the $n$th Cox-Ross-Rubinstein model with parameters as above and
$$\bar\Psi_\mu(x) = 1 - \Psi_\mu(x - 1) = \sum_{i=x}^{\infty} \frac{e^{-\mu}\mu^i}{i!}$$
the complementary Poisson distribution function with parameter $\mu$. With this notation we have the following limiting relation: The parameter $\lambda$ is given as above and $x = (\log(K/S(0)) - \xi T)/\log u$.

In the limiting continuous-time model the stock price process has to be modelled in such a way that 'jumps' are possible, i.e. the paths of the stochastic stock price process must allow discontinuities. This is done by using the continuous-time Poisson process (or another point process, see §5.2). The distribution of the stock price process in the continuous-time model is then log-Poisson. This kind of binomial model was introduced by Cox and Ross (1976); see also Cox and Rubinstein (1985), p. 365 for a somewhat different textbook treatment.

Constant Elasticity of Variance Diffusion
We now allow the up and down movements of the binomial process to differ predictably from period to period. More explicitly we write (using the notation from above)

To obtain an arbitrage-free market, we have to choose the probabilities in the underlying single-period models according to (4.10), i.e.

This, of course, implies that the equivalent martingale measure for the $n$th Cox-Ross-Rubinstein model is dependent on the whole family of probabilities $p_{n,0}^*, \ldots, p_{n,k_n-1}^*$. For instance, if we use the functions
$$u(y,t) = \mu y t + \sigma y^{\rho}\sqrt{t}, \quad\text{and}\quad d(y,t) = \mu y t - \sigma y^{\rho}\sqrt{t}, \quad 0 < \rho \le 1,$$
and set $u_n(S(t),t) = \exp\{u(S(t),t)\}$ and $d_n(S(t),t) = \exp\{d(S(t),t)\}$,
we have
With these parameters, one can show that the probability measures $\mathbb{P}_n$ converge weakly to a probability measure $Q$ induced by a certain gamma-type distribution. This leads to the constant elasticity of variance option pricing formula for the limit of European call option prices at time 0 in the above sequence of Cox-Ross-Rubinstein models:
$$\lim_{n\to\infty} C_n = S(0) \sum_{i=1}^{\infty} g(i, x)\, G(i + \lambda, y) - K e^{-rT} \sum_{i=1}^{\infty} g(i + \lambda, x)\, G(i, y).$$
The function $g(i,u)$ is the gamma density function
$$g(i,u) = \frac{e^{-u} u^{i-1}}{(i-1)!},$$
and the function $G(i,z)$ the complementary gamma distribution function
$$G(i,z) = \int_z^{\infty} g(i,u)\, du.$$
The parameters are given as $x = 2\lambda r S(0)^{1/\lambda} e^{rT/\lambda} / (\sigma^2 (e^{rT/\lambda} - 1))$, $y = 2\lambda r K^{1/\lambda} / (\sigma^2 (e^{rT/\lambda} - 1))$ and $\lambda = 1/(2(1 - \rho))$. The corresponding continuous-time stock price dynamics are given by
$$dS(t) = \mu S(t)\, dt + \sigma S(t)^{\rho}\, dW(t)$$
(where $dW(t)$ denotes the stochastic differential with respect to the Wiener process - we treat this in Chapter 5) and the constant elasticity in the (conditional) variance term (in front of $dW(t)$) gives the name to this model.

Remark 4.6.1. The numerics of the above approximations have been subject to investigation for quite some time (see Broadie and Detemple (1997) and Leisen (1996) for discussion and references). Such numerical schemes are easy to implement, for instance using Mathematica, and the reader is invited to do so.

4.7 American Options

4.7.1 Theory
Consider a general multi-period framework. The holder of an American derivative security can 'exercise' in any period $t$ and receive payment $f(S_t)$ (or more generally a non-negative payment $f_t$). In order to hedge such an option, we want to construct a self-financing trading strategy $\varphi_t$ such that for the corresponding value process $V_\varphi(t)$
$$V_\varphi(0) = x \ \text{(initial capital)}, \qquad V_\varphi(t) \ge f_t, \ \forall t. \tag{4.19}$$
Such a hedging portfolio is minimal, if for a stopping time $\tau$
$$V_\varphi(\tau) = f_\tau.$$
We assume now that we work in a market model $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, which is complete with $\mathbb{P}^*$ the unique martingale measure. Then for any hedging strategy $\varphi$ we have that under $\mathbb{P}^*$
$$M(t) = \tilde V_\varphi(t) = \beta(t) V_\varphi(t) \tag{4.20}$$
is a martingale. Thus we can use the STP (Theorem 3.5.1) to find for any stopping time $\tau$
$$V_\varphi(0) = M_0 = \mathbb{E}^*(\tilde V_\varphi(\tau)). \tag{4.21}$$
Since we require $V_\varphi(\tau) \ge f_\tau$ for any stopping time we find for the required initial capital
$$x \ge \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau) f_\tau). \tag{4.22}$$
Suppose now that $\tau^*$ is such that $V_\varphi(\tau^*) = f_{\tau^*}$; then the strategy $\varphi$ is minimal, and since $V_\varphi(t) \ge f_t$ for all $t$ we have
$$x = \mathbb{E}^*(\beta(\tau^*) f_{\tau^*}) = \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau) f_\tau). \tag{4.23}$$
Thus (4.23) is a necessary condition for the existence of a minimal strategy $\varphi$. We will show that it is also sufficient and call the price in (4.23) the rational price of an American contingent claim. Now consider the problem of the option writer to construct such a strategy $\varphi$. At time $T$ the hedging strategy needs to cover $f_T$, i.e. $V_\varphi(T) \ge f_T$ is required. At time $T - 1$ the option holder can either exercise and receive $f_{T-1}$ or hold the option to expiry, in which case $B(T-1)\mathbb{E}^*(\beta(T) f_T \mid \mathcal{F}_{T-1})$ needs to be covered. Thus the hedging strategy of the writer has to satisfy
$$V_\varphi(T-1) = \max\{ f_{T-1},\, B(T-1)\mathbb{E}^*(\beta(T) f_T \mid \mathcal{F}_{T-1}) \}. \tag{4.24}$$
Using a backward induction argument we can show that
$$V_\varphi(t-1) = \max\{ f_{t-1},\, B(t-1)\mathbb{E}^*(\beta(t) V_\varphi(t) \mid \mathcal{F}_{t-1}) \}. \tag{4.25}$$
Considering only discounted values (we write $\tilde f_t = \beta(t) f_t$), this leads to
$$\tilde V_\varphi(t-1) = \max\{ \tilde f_{t-1},\, \mathbb{E}^*(\tilde V_\varphi(t) \mid \mathcal{F}_{t-1}) \}. \tag{4.26}$$
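The recursion (4.26) translates directly into a backward loop. The sketch below runs it on a recombining binomial tree and also reads off the first time at which the envelope equals the discounted payoff along a given path. The tree setting, the function names and the put example are assumptions made for illustration; the text itself works in a general complete market.

```python
import math

def snell_envelope(discounted_payoff, p_star, N):
    """Backward recursion (4.26) on a recombining tree with N periods.

    discounted_payoff(t, j) is the discounted payoff at time t after j up moves.
    Returns Z with Z[t][j] = max(payoff, one-step conditional expectation of Z).
    """
    Z = [[0.0] * (t + 1) for t in range(N + 1)]
    for j in range(N + 1):
        Z[N][j] = discounted_payoff(N, j)
    for t in range(N - 1, -1, -1):
        for j in range(t + 1):
            continuation = p_star * Z[t + 1][j + 1] + (1.0 - p_star) * Z[t + 1][j]
            Z[t][j] = max(discounted_payoff(t, j), continuation)
    return Z

def first_optimal_time(Z, discounted_payoff, path):
    """Smallest s with Z_s equal to the discounted payoff along a path of 1 (up) / 0 (down)."""
    j = 0
    for t in range(len(Z)):
        if abs(Z[t][j] - discounted_payoff(t, j)) < 1e-12:
            return t
        j += path[t]
    return len(Z) - 1

# Illustrative American put: S0 = K = 100, r = 6%, sigma = 20%, T = 1, N = 3.
S0, K, r, sigma, T, N = 100.0, 100.0, 0.06, 0.2, 1.0, 3
delta = T / N
up = math.exp(sigma * math.sqrt(delta))
p_star = (math.exp(r * delta) - 1.0 / up) / (up - 1.0 / up)
f = lambda t, j: math.exp(-r * delta * t) * max(K - S0 * up ** (2 * j - t), 0.0)

Z = snell_envelope(f, p_star, N)
print(round(Z[0][0], 4), first_optimal_time(Z, f, [0, 0, 0]))
```

Since the numeraire equals 1 at time 0, `Z[0][0]` is also the undiscounted initial value in this example.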
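Thus we see that $\tilde V_\varphi(t)$ is the Snell envelope $Z_t$ of $\tilde f_t$. In particular, we know that
$$Z_t = \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\tilde f_\tau \mid \mathcal{F}_t) \tag{4.27}$$
and the stopping time $\tau^* = \min\{ s \ge t : Z_s = \tilde f_s \}$ is optimal. So
$$Z_t = \mathbb{E}^*(\tilde f_{\tau^*} \mid \mathcal{F}_t). \tag{4.28}$$
In case $t = 0$ we can use $\tau_0^* = \min\{ s \ge 0 : Z_s = \tilde f_s \}$, and then
$$x = Z_0 = \mathbb{E}^*(\tilde f_{\tau_0^*}) = \sup_{\tau \in \mathcal{T}_0} \mathbb{E}^*(\tilde f_\tau) \tag{4.29}$$
is the rational option price. We still need to construct the strategy $\varphi$. To do this recall that $Z$ is a supermartingale and so the Doob decomposition yields
$$Z = \tilde M - \tilde A \tag{4.30}$$
with a martingale $\tilde M$ and a predictable, increasing process $\tilde A$. We write $M_t = \tilde M_t B_t$ and $A_t = \tilde A_t B_t$. Since the market is complete, we know that there exists a self-financing strategy $\varphi$ such that
$$\tilde V_\varphi(t) = \tilde M_t. \tag{4.31}$$
Also using (4.30) we find $Z_t B_t = V_\varphi(t) - A_t$. Now on $C = \{ (t, \omega) : 0 \le t < \tau^*(\omega) \}$ we have that $Z$ is a martingale and thus $A_t(\omega) = 0$. Thus we obtain from $\tilde V_\varphi(t) = Z_t$ that
$$V_\varphi(t) = B_t \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\tilde f_\tau \mid \mathcal{F}_t) \quad \forall (t, \omega) \in C. \tag{4.32}$$
Now $\tau^*$ is the smallest exercise time and $A_{\tau^*(\omega)} = 0$. Thus
$$\tilde V_\varphi(\tau^*(\omega), \omega) = Z_{\tau^*(\omega)}(\omega) = \tilde f_{\tau^*(\omega)}(\omega). \tag{4.33}$$
Undoing the discounting we find
$$V_\varphi(\tau^*) = f_{\tau^*} \tag{4.34}$$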
and therefore cp is a minimal hedge. Now consider the problem of the option holder, how to find the optimal exercise time. We observe that the optimal exercise time must be an optimal stopping time, since for any other stopping time a ( use Proposition 3.6.2) (4.35)
4.7 American Options
141
and holding the asset longer would generate a larger payoff. Thus the holder needs to wait until Za 1a i.e. (i) of Proposition 3.6.2 is true. On the other hand with the largest stopping time (compare Definition 3.6.2) we see that :::; This follows since using cp after with initial capital from exercising will always yield a higher portfolio value than the strategy of exercising later. To see this recall that V<,O = ZtBt + At with At > 0 for t > So we must have (J :::; and since At = 0 for t :::; we see that z a is a martingale. Now criterion (ii) of Proposition 3.6.2 is true and is thus optimal. So Proposition 4 . 7. 1 . A stopping time (J E Tt is an optimal exercise time for =
1/
(J
1/.
1/
1/.
1/
1/
(J
the American option (it) if and only if
lE* ({3( (J )fa ) = sup lE* ({3( T )fr) . rET,
( 4.36)
4.7.2 American Options in the CRR Model

We now consider how to evaluate an American put option in a standard CRR model. We assume that the time interval $[0,T]$ is divided into $N$ equal subintervals of length $\Delta$ say. Assuming the risk-free rate of interest $r$ (over $[0,T]$) as given, we have $1 + \rho = e^{r\Delta}$ (where we denote the risk-free rate of interest in each subinterval by $\rho$). The remaining degrees of freedom are resolved by choosing $u$ and $d$ as follows:
$$1 + u = e^{\sigma\sqrt{\Delta}}, \quad\text{and}\quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}}.$$
By condition (4.10), the risk-neutral probabilities for the corresponding single period models are given by
$$p^* = \frac{\rho - d}{u - d} = \frac{e^{r\Delta} - e^{-\sigma\sqrt{\Delta}}}{e^{\sigma\sqrt{\Delta}} - e^{-\sigma\sqrt{\Delta}}}.$$
Thus the stock with initial value $S = S(0)$ is worth $S(1+u)^i(1+d)^j$ after $i$ steps up and $j$ steps down. Consequently, after $N$ steps, there are $N + 1$ possible prices, $S(1+u)^i(1+d)^{N-i}$ $(i = 0, \ldots, N)$. There are $2^N$ possible paths through the tree. It is common to take $N$ of the order of 30, for two reasons:
• typical lengths of time to expiry of options are measured in months (9 months, say); this gives a time step around the corresponding number of days,
• $2^{30}$ paths is about the order of magnitude that can be comfortably handled by computers (recall that $2^{10} = 1{,}024$, so $2^{30}$ is somewhat over a billion).

We can now calculate both the value of an American put option and the optimal exercise strategy by working backwards through the tree (this method of backward recursion in time is a form of the dynamic programming (DP) technique, due to Richard Bellman, which is important in many areas of optimization and Operational Research).

1. Draw a binary tree showing the initial stock value and having the right number, $N$, of time intervals.
2. Fill in the stock prices: after one time interval, these are $S(1+u)$ (upper) and $S(1+d)$ (lower); after two time intervals, $S(1+u)^2$, $S$ and $S(1+d)^2 = S/(1+u)^2$; after $i$ time intervals, these are $S(1+u)^j(1+d)^{i-j} = S(1+u)^{2j-i}$ at the node with $j$ 'up' steps and $i - j$ 'down' steps (the '$(i,j)$' node).
3. Using the strike price $K$ and the prices at the terminal nodes, fill in the payoffs $f^A_{N,j} = \max\{K - S(1+u)^j(1+d)^{N-j}, 0\}$ from the option at the terminal nodes underneath the terminal prices.
4. Work back down the tree, from right to left. The no-exercise values $f_{i,j}$ of the option at the $(i,j)$ node are given in terms of those of its upper and lower right neighbours in the usual way, as discounted expected values under the risk-neutral measure:
$$f_{i,j} = e^{-r\Delta}\left[ p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right].$$
The intrinsic (or early-exercise) value of the American put at the $(i,j)$ node - the value there if it is exercised early - is $K - S(1+u)^j(1+d)^{i-j}$ (when this is nonnegative, and so has any value). The value of the American put is the higher of these:
$$f^A_{i,j} = \max\{ f_{i,j},\, K - S(1+u)^j(1+d)^{i-j} \} = \max\left\{ e^{-r\Delta}\left( p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right),\, K - S(1+u)^j(1+d)^{i-j} \right\}.$$
5. The initial value of the option is the value $f^A_{0,0}$ filled in at the root of the tree.
6. At each node, it is optimal to exercise early if the early-exercise value there exceeds the value $f_{i,j}$ there of expected discounted future payoff.

Note. The above procedure is simple to describe and understand, and simple to program. It is laborious to implement numerically by hand, on examples big enough to be non-trivial. Numerical examples are worked through in detail in Hull (1999), p.359-360 and Cox and Rubinstein (1985), p.241-242.

Mathematically, the task remains of describing the continuation region - the part of the tree where early exercise is not optimal. This is a classical optimal stopping problem, and as we mentioned above, a solution by explicit formulas is not known - indeed, is probably not feasible. It would take us too far afield to pursue such questions here; for a fairly thorough (but quite difficult) treatment, see Shiryaev et al. (1995). We return to the theory of American options in the continuous-time context in §6.3.1.

We conclude by showing the equivalence of American and European calls without using arbitrage arguments.
Theorem 4.7.1. Let $(Z_n)_0^N$ be the payoff sequence of an American option. Then $h = Z_N$ is the payoff of the corresponding European option. Write $C_A(n)$, $C_E(n)$ for the values at time $n$ of the American and European options. Then
(i) $C_A(n) \ge C_E(n)$,
(ii) If $C_E(n) \ge Z_n$, then $C_A(n) = C_E(n)$.

Proof. (i) We use the supermartingale resp. martingale property of the price processes of the discounted American resp. European call to get
$$\tilde C_A(n) \ge \mathbb{E}^*(\tilde C_A(N) \mid \mathcal{F}_n) = \mathbb{E}^*(\tilde C_E(N) \mid \mathcal{F}_n) = \tilde C_E(n).$$
(ii) $(\tilde C_E(n))$ is a $\mathbb{P}^*$-martingale, so in particular a $\mathbb{P}^*$-supermartingale. Being the Snell envelope of $(\tilde Z_n)$, $(\tilde C_A(n))$ is the least $\mathbb{P}^*$-supermartingale dominating $(\tilde Z_n)$. So if $C_E(n) \ge Z_n$ as in the condition of the theorem, $\tilde C_E(n) \ge \tilde C_A(n)$, so $C_E(n) = C_A(n)$. □

Corollary 4.7.1. In the Black-Scholes model with one risky asset, the American call option is equivalent to its European counterpart.

Proof. Here $Z_n = (S_n - K)^+$. Discounting,
$$\tilde C_E(n) = (1 + \rho)^{-N}\, \mathbb{E}^*\left( (S_N - K)^+ \mid \mathcal{F}_n \right) \ge \mathbb{E}^*\left( \tilde S_N - K(1 + \rho)^{-N} \mid \mathcal{F}_n \right) = \tilde S_n - K(1 + \rho)^{-N},$$
as $\tilde S_n$ is a $\mathbb{P}^*$-martingale. Without the discounting, this says
$$C_E(n) \ge S_n - K(1 + \rho)^{-(N-n)}.$$
This gives $C_E(n) \ge S_n - K$; also $C_E(n) \ge 0$; so $C_E(n) \ge (S_n - K)^+ = Z_n$, and the result follows from the theorem. □
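The statement of Corollary 4.7.1 and the backward recursion of §4.7.2 can be checked together on a CRR tree. The following is a minimal sketch with an illustrative choice $N = 30$; the function name `crr_tree_price` and all numerical inputs are assumptions. It prices a payoff with and without the early-exercise comparison and reports both values for a call and for a put.

```python
import math

def crr_tree_price(payoff, S0, r, sigma, T, N, american):
    """Backward induction on a CRR tree; takes max with intrinsic value if american."""
    delta = T / N
    up = math.exp(sigma * math.sqrt(delta))      # 1 + u
    disc = math.exp(-r * delta)                  # one-period discount factor
    p = (1.0 / disc - 1.0 / up) / (up - 1.0 / up)
    # Terminal payoffs; index j counts the up moves.
    values = [payoff(S0 * up ** (2 * j - N)) for j in range(N + 1)]
    for i in range(N - 1, -1, -1):
        for j in range(i + 1):
            # values[j] and values[j + 1] still hold the time-(i + 1) values here.
            value = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                value = max(value, payoff(S0 * up ** (2 * j - i)))
            values[j] = value
    return values[0]

K = 100.0
call = lambda s: max(s - K, 0.0)
put = lambda s: max(K - s, 0.0)
args = (100.0, 0.06, 0.2, 1.0, 30)

print("call:", crr_tree_price(call, *args, american=False),
      crr_tree_price(call, *args, american=True))
print("put: ", crr_tree_price(put, *args, american=False),
      crr_tree_price(put, *args, american=True))
```

In this sketch the two call values agree, while the American put is worth more than the European put, which is why the recursion of §4.7.2 is needed for puts.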
4.8 Further Contingent Claim Valuation in Discrete Time

4.8.1 Barrier Options

Barrier options are options whose payoff depends on whether or not the stock price attains some specified level before expiry. We will be brief here, referring to §6.3.3 for a more extensive discussion of barrier options in continuous time. The simplest case is that of a single, constant barrier at level $H$. The option may pay ('knock in') or not ('knock out') according to whether or not level $H$ is attained, from below ('up') or above ('down'). There are thus four possibilities - 'up and in', 'up and out', 'down and in', 'down and out' - for the basic - single, constant barrier - case. In addition, one may have two barriers, with the option knocking in (or out) if the price reaches either a lower barrier $H_1$ or an upper barrier $H_2$. More generally, one may have non-constant - 'moving' - barriers, with the level a function of time.

As always, it pays to be flexible, and to be able to work in discrete or continuous time, as seems more appropriate for the problem in hand. For a full treatment in continuous time, see Zhang (1997), Chapters 10, 11, or §6.3.3. Now a continuous-time price process model, such as the Black-Scholes model based on geometric Brownian motion (§6.2), may be approximated in various ways by discrete-time models (such as the discrete Black-Scholes model, the Cox-Ross-Rubinstein binomial tree model of §4.5); for the passage from discrete to continuous time, see §4.6 (and more generally, §5.9 below).

When we have a barrier option in discrete time, we price it as with the American options of §6.3.1 by backward induction. Some sample paths hit the barriers, and for these we can fill in the payoff from the boundary conditions that define the barriers; as before, we fill in the payoff at the terminal nodes at expiry. We then proceed backwards in time recursively, at each stage using all current information to fill in, as before, the payoffs at new nodes one time step earlier. When we reach the root, the payoff is the value of the option initially.

Problems may easily be encountered when dealing with barrier options in discrete time if the discretization process is not chosen and handled with care. A new discretization process, due to Rogers and Stapleton (1998), proceeds by first discretizing space, by steps $\delta x > 0$, and then discretizing time, into $\tau_0, \tau_1, \ldots$, where
$$\tau_0 := 0, \qquad \tau_{n+1} := \inf\{ t > \tau_n : |X(t) - X(\tau_n)| > \delta x \}, \quad n \ge 0,$$
and deal with the resulting random walk $(\xi_n)$, where $\xi_n := X(\tau_n)$. This approximation scheme is accurate, reasonably fast, and very flexible: it is capable of handling a wide variety of problems, with moving as well as fixed barriers. For the theory, and detailed comparison with other available methods, see Rogers and Stapleton (1998); another approach is due to AitSahlia and Lai (1998b). Techniques useful here include continuity corrections for approximations to normality, Edgeworth expansions, and Richardson extrapolation.
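The backward induction with boundary conditions described above can be sketched for one of the four basic cases, an 'up and out' call. The sketch uses a plain CRR tree with the barrier monitored only at the tree dates; it is not the Rogers and Stapleton scheme, and the function name and inputs are illustrative assumptions.

```python
import math

def up_and_out_call(S0, K, H, r, sigma, T, N):
    """Up-and-out call on a CRR tree; the barrier H is checked at tree dates only."""
    delta = T / N
    up = math.exp(sigma * math.sqrt(delta))
    disc = math.exp(-r * delta)
    p = (1.0 / disc - 1.0 / up) / (up - 1.0 / up)

    def stock(i, j):
        return S0 * up ** (2 * j - i)            # price after i periods, j up moves

    # Terminal nodes: zero if knocked out, otherwise the call payoff.
    values = [0.0 if stock(N, j) >= H else max(stock(N, j) - K, 0.0)
              for j in range(N + 1)]
    for i in range(N - 1, -1, -1):
        for j in range(i + 1):
            if stock(i, j) >= H:
                values[j] = 0.0                  # boundary condition at the barrier
            else:
                values[j] = disc * (p * values[j + 1] + (1.0 - p) * values[j])
    return values[0]

print(up_and_out_call(100.0, 100.0, 130.0, 0.06, 0.2, 1.0, 3))
print(up_and_out_call(100.0, 100.0, 1e9, 0.06, 0.2, 1.0, 3))   # barrier never reached
```

With the barrier placed far above every node the recursion reduces to the ordinary European call. How the result depends on where the barrier lies relative to the tree nodes is the kind of discretization issue the text warns about.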
4.8.2 Lookback Options

Lookback - or hindsight - options, which we discuss in more detail in §6.3.4 in continuous time, are options that convey the right to 'buy at the low, sell at the high' - in other words, to eliminate the regret that an investor operating in real time on current, partial knowledge would feel looking back in time with complete knowledge. Again, most of the theory is for continuous time (see e.g. Zhang (1997), Chapter 12), but a discrete-time framework may be preferred - or needed, if the only prices available are those sampled at certain discrete time-points. Care is obviously needed here, as discretization of time will miss the extremes of the peaks and troughs giving the highs and lows in continuous time. Discrete lookback options have been studied from several viewpoints; see e.g. Heynen and Kat (1995), Kat (1995) and Levy and Mantion (1997). An interesting approach using duality theory for random walks has been given by AitSahlia and Lai (1998a).

4.8.3 A Three-period Example
Assume we have two basic securities: a risk-free bond and a risky stock. The one-year risk-free interest rate (continuously compounded) is $r = 0.06$ and the volatility of the stock is 20%. We price calls and puts in a three-period Cox-Ross-Rubinstein model. The up and down movements of the stock price are given by
$$1 + u = e^{\sigma\sqrt{\Delta}} = 1.1224 \quad\text{and}\quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}} = 0.8910,$$
with $\sigma = 0.2$ and $\Delta = 1/3$. We obtain risk-neutral probabilities by (4.10)
$$p^* = \frac{e^{r\Delta} - (1 + d)}{u - d} = 0.5584.$$
We assume that the price of the stock at time $t = 0$ is $S(0) = 100$. To price a European call option with maturity one year ($N = 3$) and strike $K = 100$ we can either use the valuation formula (4.13) or work our way backwards through the tree. Prices of the stock and the call are given in Figure 4.2 below. One can implement the simple evaluation formulae for the CRR- and the BS-models and compare the values. Figure 4.3 is for $S = 100$, $K = 90$, $r = 0.06$, $\sigma = 0.2$, $T = 1$.
To price a European put, with price process denoted by $p(t)$, and an American put, $P(t)$, (maturity $N = 3$, strike 100), we can for the European put either use the put-call parity (1.1), the risk-neutral pricing formula, or work backwards through the tree. For the prices of the American put we use the technique outlined in §4.7.2. Prices of the two puts are given in Figure 4.4. We indicate the early exercise times of the American put in bold type. Recall that the discrete-time rule is to exercise if the intrinsic value $K - S(t)$ is larger than the value of the corresponding European put.
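The whole example can be recomputed node by node. The sketch below builds the three-period tree, prices the call and both puts by backward induction and prints one line per node for comparison with Figures 4.2 and 4.4. It assumes a strike of 100, which is consistent with the terminal call values of Figure 4.2; the variable and function names are illustrative.

```python
import math

S0, K, r, sigma, T, N = 100.0, 100.0, 0.06, 0.2, 1.0, 3
delta = T / N
up = math.exp(sigma * math.sqrt(delta))          # 1 + u
down = 1.0 / up                                  # 1 + d
disc = math.exp(-r * delta)
p_star = (math.exp(r * delta) - down) / (up - down)

# stock[i][j]: price after i periods with j up moves
stock = [[S0 * up ** j * down ** (i - j) for j in range(i + 1)] for i in range(N + 1)]

def backward(payoff, american=False):
    """Node values by backward induction; early exercise is allowed if american."""
    value = [[0.0] * (i + 1) for i in range(N + 1)]
    value[N] = [payoff(s) for s in stock[N]]
    for i in range(N - 1, -1, -1):
        for j in range(i + 1):
            v = disc * (p_star * value[i + 1][j + 1] + (1.0 - p_star) * value[i + 1][j])
            value[i][j] = max(v, payoff(stock[i][j])) if american else v
    return value

call = backward(lambda s: max(s - K, 0.0))
euro_put = backward(lambda s: max(K - s, 0.0))
amer_put = backward(lambda s: max(K - s, 0.0), american=True)

for i in range(N + 1):
    for j in range(i, -1, -1):                   # highest stock price first
        print(f"t={i}  S={stock[i][j]:7.2f}  c={call[i][j]:6.2f}  "
              f"p={euro_put[i][j]:6.2f}  P={amer_put[i][j]:6.2f}")
```

Nodes where the printed American value equals the positive intrinsic value $K - S$ before expiry are the early-exercise nodes in this sketch.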
time t = 0: S = 100, c = 11.56
t = 1: S = 112.24, c = 18.21; S = 89.10, c = 3.67
t = 2: S = 125.98, c = 27.96; S = 100, c = 6.70; S = 79.38, c = 0
t = 3: S = 141.40, c = 41.40; S = 112.24, c = 12.24; S = 89.10, c = 0; S = 70.72, c = 0

Fig. 4.2. Stock and European call prices

[Plot titled 'Approximating CRR prices'; horizontal axis labelled 'Approximation', with ticks from 50 to 200]

Fig. 4.3. Approximation of Black-Scholes price by Binomial models
time t = 0: p = 5.82, P = 6.18
t = 1: p = 2.08, P = 2.08; p = 10.65, P = 11.59
t = 2: p = 0, P = 0; p = 4.76, P = 4.76; p = 18.71, P = 20.62
t = 3: p = 0, P = 0; p = 0, P = 0; p = 10.90, P = 10.90; p = 29.28, P = 29.28

Fig. 4.4. European p(.) and American P(.) put prices
4.9 Multifactor Models

We now discuss examples of discrete-time financial market models with more than two underlying assets. Such models are useful for the evaluation of multivariate contingent claims, such as options on multiple assets (options on the maximum of two or more asset prices, dual-strike options, and portfolio or basket options). For the exposition we assume $d + 1$ financial assets $S_0, S_1, \ldots, S_d$. We assume $S_0 = B$, a risk-free bank account or bond, and use $B$ as numeraire.

4.9.1 Extended Binomial Model

This model, proposed by Boyle, Evnine, and Gibbs (1989), uses a single binomial tree for each of the underlying $d$ risky assets. So we have $2^d$ branches per node. We discuss the case $d = 2$ (i.e. the model consists of two risky assets and the bank account) in detail; the generalization to $d > 2$ is straightforward. To show that this model is arbitrage-free we have to find an equivalent martingale measure, and to show that it is complete we have to prove uniqueness of the equivalent martingale measure. A similar argument to that for the Cox-Ross-Rubinstein model shows that the multi-period extended binomial model is arbitrage-free (complete) if and only if the single-period model is (compare §4.5.2). So it is enough to discuss the single-period model with trading dates $t = 0$ and $t = 1$ $(= T)$. We assume a risk-free rate of return of $r \ge 0$, so $B(0) = 1$ and $B(1) = 1 + r$. Furthermore we have two risky assets, $S_1$ and $S_2$. Since both risky assets are modeled by single binomial trees, we have four possible states of the world at time $t = 1$ with values of $(S_1(1), S_2(1))$ given by $(u_1 S_1(0), u_2 S_2(0))$ with probability $p_{uu}$, $(u_1 S_1(0), d_2 S_2(0))$ with probability $p_{ud}$, $(d_1 S_1(0), u_2 S_2(0))$ with probability $p_{du}$ and $(d_1 S_1(0), d_2 S_2(0))$ with probability $p_{dd}$, where we assume $u_i > d_i$, $i = 1, 2$ and positive probabilities. Under the risk-neutral probabilities $p_{uu}^*, p_{ud}^*, p_{du}^*, p_{dd}^*$ the discounted stock price processes $\tilde S_i(t) = S_i(t)/B(t)$ have to be martingales. These martingale conditions imply the following two equations:
$$\mathbb{E}^*[\tilde S_1(1)] = \tilde S_1(0) \iff (p_{uu}^* + p_{ud}^*) u_1 + (p_{du}^* + p_{dd}^*) d_1 = 1 + r,$$
$$\mathbb{E}^*[\tilde S_2(1)] = \tilde S_2(0) \iff (p_{uu}^* + p_{du}^*) u_2 + (p_{ud}^* + p_{dd}^*) d_2 = 1 + r.$$
Furthermore, besides the fact that the $p^*$ have to be positive to generate an equivalent measure, we must have
$$p_{uu}^* + p_{ud}^* + p_{du}^* + p_{dd}^* = 1.$$
So we have three equations for the unknown probabilities $p_{uu}^*, p_{ud}^*, p_{du}^*, p_{dd}^*$ and in general (depending on the parameters $u_1, d_1, u_2, d_2, r$) we will have several (even infinitely many) solutions of the system of the equations above. This means that the extended binomial model is arbitrage-free, but not complete (in accordance with our rule of thumb (§1.4) that we should have as many financial assets to trade in as states of the world).

4.9.2 Multinomial Models
The extended binomial model shows that while it is tempting to model each asset by a single binomial tree, we lose the desirable property of market completeness in doing so. We will therefore now construct an arbitrage-free, complete market model ( with d > 2 financial assets ) following the infor mal rule of allowing as many different states of the world as we have assets to trade in. Furthermore the stochastic stock price processes in this model can be constructed to be of Markovian nature, that is, rather than the single period returns being independent unconditionally, they are independent given the present value of the process. This also allows for a more realistic repre sentation of the true prices and is more in line with the most prominent continuous-time model, the Black-Scholes market model, in which the stock price processes are Markovian. We follow an approach that is basically due to He (1990) . Again we only discuss the d = 2 case ( with the risk-free bank account B, with rate of return r 2: 0, as numeraire asset and two risky assets 81 , S2 ) ; the case d > 2 follows by the same prescription. Let us start with the single-period model. As in the extended binomial case above we assume trading dates t = 0 and t = 1 ( = T) , but now we have only three possible states of the world at time t = 1 . Indeed we set with
4.9 Multifactor Models
149
;
==
JP( Zl = uu , Z2 U 21 ) = P I ; JP(Zl = U 12 , Z2 = U22 ) = P2 JP(Zl = U 13 , Z2 U 2 3 ) = P3 · In general Zl and Z2 are not independent, but we still can choose Uij in
such a way that they are uncorrelated. Under the risk-neutral probabilities the discounted stock price processes Si (t) Si (t)/B ( t) have to be martingales. These martingale conditions imply the following two equations:
=
p i , P i , Pa ,
1E[ Sl ( l)] = Sl (O) 1E[ S2 ( 1 )] = S2 (0)
{:}
{:}
uupi + U 1 2P; + U13P; = (1 + r ) , U 2 1 Pi + U22 P; + U 2 3P; = ( 1 + r ) .
Furthermore, besides the fact that the $p_k^*$ have to be positive to generate an equivalent measure, we must have

$$p_1^* + p_2^* + p_3^* = 1.$$

Therefore we have three equations for the three unknown probabilities and in general (given reasonable parameters $u_{ij}$) we will have a unique solution of the system of equations above, and hence an arbitrage-free, complete financial market model.

In the multi-period setting with time horizon T and the set of trading dates given by $\{0 = t_0 < t_1 < \dots < t_n = T\}$ of equidistant time points with distance $\Delta_n$ (observe that we have n time steps), we model the stock price processes by

$$S_i(t_k) = S_i(0) \prod_{j=1}^{k} Z_i^{(j)}, \quad k = 0, 1, \dots, n, \quad i = 1, 2,$$

with a sequence of independent random vectors $(Z^{(j)})_{1 \le j \le n}$ such that $Z_1^{(j)}, Z_2^{(j)}$ are uncorrelated (but possibly dependent) and

$$\mathbb{P}(Z_1^{(j)} = u_{11}^{(j)}, Z_2^{(j)} = u_{21}^{(j)}) = p_1^{(j)}; \quad \mathbb{P}(Z_1^{(j)} = u_{12}^{(j)}, Z_2^{(j)} = u_{22}^{(j)}) = p_2^{(j)}; \quad \mathbb{P}(Z_1^{(j)} = u_{13}^{(j)}, Z_2^{(j)} = u_{23}^{(j)}) = p_3^{(j)}.$$

Since for each j the random vector $Z^{(j)}$ can be in one of three possible states, the above argument applies for each 'underlying' single-period market and the multi-period market is arbitrage-free and complete. The most important case here is

$$Z_i^{(j+1)} = u_i(S(t_j), t_j, \epsilon^{(j)}), \quad i = 1, 2, \quad j = 0, \dots, n - 1,$$

with a sequence of independent random vectors $(\epsilon^{(j)})_{0 \le j \le n-1}$ such that $\epsilon_1^{(j)}, \epsilon_2^{(j)}$ are uncorrelated (but possibly dependent) and sufficiently smooth functions $u_i$. Then the values $u_{ik}^{(j+1)}$ are predictable functions of $S(t_j)$, making the discrete stochastic process $S_i(t)$ Markovian. We will construct a financial market model of this type in §6.4.
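As a minimal numerical sketch of the single-period argument above (two martingale conditions plus the normalisation, in three unknowns), the following Python snippet solves for the risk-neutral probabilities by Cramer's rule. The return values in `u1`, `u2` and the rate `r` are hypothetical, chosen only for illustration, and the function names are not from the text.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def risk_neutral_probs(u1, u2, r):
    """Solve u1.p = 1 + r, u2.p = 1 + r, p1 + p2 + p3 = 1 by Cramer's rule."""
    A = [list(u1), list(u2), [1, 1, 1]]
    b = [1 + r, 1 + r, 1]
    d = det3(A)
    if d == 0:
        raise ValueError("singular system: no unique solution")
    probs = []
    for col in range(3):
        Ac = [row[:] for row in A]
        for row in range(3):
            Ac[row][col] = b[row]  # replace one column by the right-hand side
        probs.append(det3(Ac) / d)
    return probs


# Hypothetical one-period returns of the two risky assets in states 1, 2, 3.
u1 = [Fraction(3, 2), Fraction(1), Fraction(1, 2)]
u2 = [Fraction(1, 2), Fraction(2), Fraction(1)]
r = Fraction(1, 20)  # hypothetical risk-free rate of 5%

p = risk_neutral_probs(u1, u2, r)
# The model is arbitrage-free and complete only if all three are strictly positive.
arbitrage_free = all(q > 0 for q in p)
print(p, arbitrage_free)
```

Exact rational arithmetic is used so that the martingale conditions can be checked without rounding error; for other parameter choices the solution may contain non-positive entries, in which case no equivalent martingale measure exists.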
Exercises
4.1 Construct hedging strategies for the European call and put in the setting of the example in §4.8.4.

4.2 Compare the Black-Scholes price with Cox-Ross-Rubinstein price approximations. Is the convergence of Cox-Ross-Rubinstein prices to the Black-Scholes price 'smooth' or 'oscillating'? (See (Leisen 1996) for details.)

4.3 Consider a European call option, written on a stock $S$ with strike price 100, that matures in one year. Assume the continuously compounded risk-free interest rate is 5%, the current price of the stock is 90 and its volatility is $\sigma = 0.2$.
1. Set up a three-period binomial (Cox-Ross-Rubinstein) model for the stock price movements.
2. Compute the risk-neutral probabilities and find the value of the call at each node.
3. Construct a hedging portfolio for the call.

4.4 Consider put options, written on a stock $S$, with strike price 100 that mature in one year. Assume the continuously compounded risk-free interest rate is 6%, the current price of the stock is 100 and its volatility is $\sigma = 0.25$.
1. Set up a three-period binomial (Cox-Ross-Rubinstein) model for the stock price movements.
2. Compute the risk-neutral probabilities and find the value of a European put at each node.
3. Construct a hedging portfolio for the European put.
4. Now compute the values of a corresponding American put at each node and set up a hedging portfolio. Compare with the hedging portfolio in 3.

4.5 Consider a European powered call option, written on a stock $S$, with expiry $T$ and strike $K$. The payoff is ($p > 1$):

$$C_p(T) = \begin{cases} (S(T) - K)^p, & S(T) \ge K; \\ 0, & S(T) < K. \end{cases}$$

Assume that $T = 1$ year, $S(0) = 90$, $\sigma = 0.3$, $K = 100$. Consider a two-period binomial model.
1. Price $C_p$ using the risk-neutral valuation formula.
2. Construct a hedge portfolio and compute arbitrage prices (which of course will agree with the risk-neutral prices) using the hedging portfolio.
3. Compare the hedge portfolio with a hedge portfolio for a usual European call. What are the implications for the risk-management of powered call options?

4.6 In static hedging of exotic options, one tries to construct a portfolio of standard options - with varying strikes and maturities but fixed weights that will not require any further adjustment - that will exactly replicate the value of the given target option for a chosen range of future times and market levels. We will construct a static hedge for a barrier option in a five-period binomial model. Consider a zero interest-rate world with a stock worth 100 today. The stock price can move up or down by 10, each with probability 0.5, at the end of a fixed period. Our target for replication is a five-period up-and-out European-style call with a strike of 70 and a barrier of 120. This option has natural boundaries both at expiration in five periods and on the knockout barrier at 120. Create a portfolio of ordinary options that collectively have the same payoff as the up-and-out call on the boundaries. To create such a portfolio follow the steps:
1. Start with an ordinary call struck at 70. It has the same payoff if the barrier is never reached.
2. Add a short position in 10 five-period calls with strike 120 to the portfolio to make the portfolio value 0 at the time 4 boundary point.
3. Add a long position in 5 three-period calls struck at 120 to complete the portfolio.
For each portfolio, compute the value-process at every node and compare it with the value of the barrier option.
5. Stochastic Processes in Continuous Time
5.1 Filtrations; Finite-dimensional Distributions
The underlying set-up is as in Chapter 3: we need a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$, equipped with a filtration, i.e. a nondecreasing family $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$ of sub-$\sigma$-fields of $\mathcal{F}$: $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $0 \le s < t < \infty$. Here, $\mathcal{F}_t$ represents the information available at time t, and the filtration $\mathbb{F}$ represents the information flow evolving with time. We assume that $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$, the stochastic basis (or filtered probability space), satisfies the 'usual conditions' (in Meyer's terminology; see §3.2, Dellacherie and Meyer (1982) and Meyer (1976)):
a. $\mathcal{F}_0$ contains all $\mathbb{P}$-null sets of $\mathcal{F}$;
b. $\mathbb{F}$ is right-continuous, i.e. $\mathcal{F}_t = \mathcal{F}_{t+} := \bigcap_{s > t} \mathcal{F}_s$.

A stochastic process $X = (X(t))_{t \ge 0}$ is a family of random variables defined on $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$. We say X is adapted if $X(t) \in \mathcal{F}_t$ (i.e. $X(t)$ is $\mathcal{F}_t$-measurable) for each t: thus $X(t)$ is known when $\mathcal{F}_t$ is known, at time t.

If $\{t_1, \dots, t_n\}$ is a finite set of time points in $[0, \infty)$, $(X(t_1), \dots, X(t_n))$ is a random n-vector, with a distribution, $\mu(t_1, \dots, t_n)$ say. The class of all such distributions as $\{t_1, \dots, t_n\}$ ranges over all finite subsets of $[0, \infty)$ is called the class of all finite-dimensional distributions of X. These satisfy certain obvious consistency conditions:
1. deletion of one point $t_i$ can be obtained by 'integrating out the unwanted variable', as usual when passing from joint to marginal distributions,
2. permutation of the $t_i$ permutes the arguments of the measure $\mu(t_1, \dots, t_n)$ on $\mathbb{R}^n$.

Conversely, a collection of finite-dimensional distributions satisfying these two consistency conditions arises from a stochastic process in this way (this is the content of the Daniell-Kolmogorov theorem). This classical result (due to P.J. Daniell in 1918 and A.N. Kolmogorov in 1933) is the basic existence theorem for stochastic processes. For the proof, which depends on compactness arguments, see e.g. Karatzas and Shreve (1991), §2.2A (for compactness, see e.g. Rudin (1976)).

Important though it is as a general existence result, however, the Daniell-Kolmogorov theorem does not take us very far. It gives a stochastic process X as a random function on $[0, \infty)$, i.e. a random variable on $\mathbb{R}^{[0, \infty)}$. This is a vast and unwieldy space; we shall usually be able to confine attention to much smaller and more manageable spaces, of functions satisfying regularity conditions. The most important of these is continuity: we want to be able to realize $X = (X(t, \omega))_{t \ge 0}$ as a random continuous function, i.e. a member of $C[0, \infty)$; such a process X is called path-continuous (since the map $t \mapsto X(t, \omega)$ is called the sample path, or simply path, given by $\omega$) - or more briefly, continuous. This is possible for the extremely important case of Brownian motion (§5.3), for example, and its relatives. Sometimes we need to allow our random function $X(t, \omega)$ to have jumps. It is then customary, and convenient, to require $X(t)$ to be right-continuous with left limits (RCLL), or càdlàg (continu à droite, limite à gauche) - i.e. to have X in the space $D[0, \infty)$ of all such functions (the Skorohod space). This is the case, for instance, for the Poisson process and its relatives (see §5.4).

General results on realisability - whether or not it is possible to realize, or obtain, a process so as to have its paths in a particular function space - are known; see for example the Kolmogorov-Centsov theorem in Karatzas and Shreve (1991), §2.2B. For our purposes, however, it is usually better to construct the processes we need directly on the function space on which they naturally live.

Given a stochastic process X, it is sometimes possible to improve the regularity of its paths without changing its distribution (that is, without changing its finite-dimensional distributions). For background on results of this type (separability, measurability, versions, regularization etc.) see e.g. the classic book Doob (1953).

There are several ways to define 'sameness' of two processes X and Y.
We say
(i) X and Y have the same finite-dimensional distributions if, for any integer n and $\{t_1, \dots, t_n\}$ a finite set of time points in $[0, \infty)$, the random vectors $(X(t_1), \dots, X(t_n))$ and $(Y(t_1), \dots, Y(t_n))$ have the same distribution;
(ii) Y is a modification of X if, for every $t \ge 0$, we have $\mathbb{P}(X_t = Y_t) = 1$;
(iii) X and Y are indistinguishable if almost all their sample paths agree:

$$\mathbb{P}[X_t = Y_t; \ \forall\, 0 \le t < \infty] = 1.$$
Indistinguishable processes are modifications of each other; the converse is not true in general. However, if both processes have right-continuous sample paths, the two concepts are equivalent. This will cover most processes we encounter in this book. For proof, see e.g. Protter (2004), p. 4. A process is called progressively measurable if, for each $t \ge 0$, the map $(s, \omega) \mapsto X_s(\omega)$ from $[0, t] \times \Omega$ is measurable with respect to $\mathcal{B}([0, t]) \otimes \mathcal{F}_t$. Progressive measurability holds for adapted processes with right-continuous (or left-continuous) paths (see e.g. Karatzas and Shreve (1991), p. 4-5) - and so always in the generality in which we work.
Finally, a random variable $\tau : \Omega \to [0, \infty]$ is a stopping time if $\{\tau \le t\} \in \mathcal{F}_t$ for all $t \ge 0$. If $\{\tau < t\} \in \mathcal{F}_t$ for all t, $\tau$ is called an optional time. For right-continuous filtrations $\mathbb{F}$ the concepts of stopping and optional times are equivalent. For a set $A \subset \mathbb{R}^d$ and a stochastic process X, we can define the hitting time of A for X as

$$\tau_A := \inf\{t > 0 : X_t \in A\}.$$

For our usual situation (RCLL processes and Borel sets) hitting times are stopping times. We will also need the stopping time $\sigma$-algebra $\mathcal{F}_\tau$ defined as

$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le t\} \in \mathcal{F}_t \ \forall\, t \ge 0\}.$$

Intuitively, $\mathcal{F}_\tau$ represents the events known at time $\tau$.

The continuous-time theory is technically much harder than the discrete-time theory, for two reasons:
1. questions of path-regularity arise in continuous time but not in discrete time;
2. uncountable operations (such as taking the supremum over an interval) arise in continuous time. But measure theory is constructed using countable operations: uncountable operations risk losing measurability.

This is why discrete and continuous time are treated in separate chapters in this book. For further technical background, we must refer to standard works treating stochastic processes measure-theoretically, e.g. Doob (1953), Meyer (1966) and Revuz and Yor (1991).

5.2 Classes of Processes
5.2.1 Martingales
The martingale property in continuous time is just that suggested by the discrete-time case:

Definition 5.2.1. A stochastic process $X = (X(t))_{0 \le t < \infty}$ is a martingale relative to $(\mathbb{F}, \mathbb{P})$ if
(i) X is adapted, and $\mathbb{E}|X(t)| < \infty$ for all $0 \le t < \infty$;
(ii) $\mathbb{E}[X(t) | \mathcal{F}_s] = X(s)$ $\mathbb{P}$-a.s. ($0 \le s \le t$),
and similarly for sub- and supermartingales.

There are regularization results, under which one can take $X(t)$ RCLL in t (basically $t \mapsto \mathbb{E}X(t)$ has to be right-continuous). Then the analogues of the results for discrete-time martingales hold true (compare Chapter 3). Among the contrasts with the discrete case, we mention that the Doob-Meyer decomposition below, easy in discrete time (§3.3.3), is a deep result in continuous time. For background, see e.g. Meyer (1966) - and subsequent work by Meyer and the French school (Dellacherie and Meyer (1978) and Revuz and Yor (1991)).

Interpretation. Martingales model fair games. Submartingales model favourable games. Supermartingales model unfavourable games. Martingales represent situations in which there is no drift, or tendency, though there may be lots of randomness. In the typical statistical situation where we have data = signal + noise, martingales are used to model the noise component. It is no surprise that we will be dealing constantly with such decompositions later (with 'semi-martingales').
Closed martingales. Some martingales are of the form

$$X_t = \mathbb{E}[X | \mathcal{F}_t] \quad (t \ge 0)$$

for some integrable random variable X. Then X is said to close $(X_t)$, which is called a closed (or closable) martingale, or a regular martingale. It turns out that closed martingales have specially good convergence properties:

$$X_t \to X_\infty \quad (t \to \infty) \quad \text{a.s. and in } L^1,$$

and then also

$$X_t = \mathbb{E}[X_\infty | \mathcal{F}_t], \quad \text{a.s.}$$

This property is equivalent also to uniform integrability (UI):

$$\sup_t \int_{\{|X_t| > x\}} |X_t| \, d\mathbb{P} \to 0 \quad (x \to \infty).$$

For proofs, see e.g. Williams (1991), Ch. 14, Neveu (1975), IV.2.

Doob-Meyer Decomposition. One version in continuous time of the Doob decomposition (Theorem 3.6.4) in discrete time - called the Doob-Meyer (or the Meyer) decomposition - follows next but needs one more definition. A process X is called of class (D) if $\{X_\tau : \tau \text{ a finite stopping time}\}$ is uniformly integrable. Then a (càdlàg, adapted) process Z is a submartingale of class (D) if and only if it has a decomposition

$$Z = Z_0 + M + A$$

with M a uniformly integrable martingale and A a predictable increasing process, both null at 0. This decomposition is unique (see e.g. Rogers and Williams (1994), VI §6).
Square-integrable Martingales. For $M = (M_t)$ a martingale, write $M \in \mathcal{M}^2$ if M is $L^2$-bounded:

$$\sup_t \mathbb{E}(M_t^2) < \infty,$$

and $M \in \mathcal{M}_0^2$ if further $M_0 = 0$. Write $c\mathcal{M}^2$, $c\mathcal{M}_0^2$ for the subclasses of continuous M. For $M \in \mathcal{M}^2$, M is convergent:

$$M_t \to M_\infty \quad \text{a.s. and in mean square}$$

for some random variable $M_\infty \in L^2$. One can recover M from $M_\infty$ by

$$M_t = \mathbb{E}[M_\infty | \mathcal{F}_t].$$

The bijection

$$M = (M_t) \leftrightarrow M_\infty$$

is in fact an isometry, and as $M_\infty \in L^2$, which is a Hilbert space, so too is $\mathcal{M}^2$. For proofs, see e.g. Rogers and Williams (1994) IV.4, §§23-28, or Neveu (1975), VII.

Quadratic Variation. A non-negative right-continuous submartingale is of class (D) (see e.g. Karatzas and Shreve (1991), 1.4). So it has a Doob-Meyer decomposition. We specialize this to $X^2$, with $X \in c\mathcal{M}^2$:

$$X^2 = X_0^2 + M + A,$$

with M a continuous martingale and A a continuous (so predictable) and increasing process. We write $\langle X \rangle := A$ here, and call $\langle X \rangle$ the quadratic variation of X. We shall see later that this is a crucial tool for the stochastic integral. We shall further introduce a variant on $\langle X \rangle$ (the 'angle-bracket process'), called $[X]$ (the 'square-bracket process'), needed to handle jumps.

Quadratic Covariation. We write $\langle M, M \rangle$ for $\langle M \rangle$, and extend $\langle \cdot \rangle$ to a bilinear form $\langle \cdot, \cdot \rangle$ with two different arguments by the polarization identity:

$$\langle M, N \rangle := \tfrac{1}{4} \left( \langle M + N, M + N \rangle - \langle M - N, M - N \rangle \right).$$

(The polarization identity reflects the Hilbert-space structure of the inner product $\langle \cdot, \cdot \rangle$.) If N is of finite variation, $M \pm N$ has the same quadratic variation as M, so $\langle M, N \rangle = 0$. Where there is a Hilbert-space structure, one can use the language of projections, of Pythagoras' theorem etc., and draw diagrams as in Euclidean space. For a nice treatment of the Linear Model of statistics in such terms (analysis of variance = ANOVA, sums of squares etc.), see Williams (2001), Chapter 8.
5.2.2 Gaussian Processes
A vector $X \in \mathbb{R}^n$ has the multivariate normal distribution in n dimensions if all linear combinations $a'X = \sum_{i=1}^{n} a_i X_i$ of its components are normally distributed (in one dimension); see e.g. Rao (1973). Such a distribution is determined by a vector $\mu$ of means and a non-negative definite $n \times n$ matrix $\Sigma$ of covariances, and is written $N(\mu, \Sigma)$. Then X has distribution $N(\mu, \Sigma)$ if and only if it has characteristic function

$$\phi_X(t) := \mathbb{E}(\exp\{i t' X\}) = \exp\{i t' \mu - \tfrac{1}{2} t' \Sigma t\} \quad (t \in \mathbb{R}^n).$$

Further, if $\Sigma$ is positive definite (so non-singular), X has density

$$f_X(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\{-\tfrac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu)\}$$

(Edgeworth's formula: F.Y. Edgeworth in 1892).

A process $X = (X(t))_{t \ge 0}$ is Gaussian if all its finite-dimensional distributions are Gaussian. Such a process can be specified by:
1. a measurable function $\mu = \mu(t)$ with $\mathbb{E}(X(t)) = \mu(t)$, the mean function;
2. a non-negative definite function $\sigma(s, t)$ with $\sigma(s, t) = \mathrm{cov}(X(s), X(t))$, the covariance function.

Gaussian processes have many interesting properties. Among these, we quote Belyaev's dichotomy: with probability one, the paths of a Gaussian process are either continuous, or extremely pathological: for example, unbounded above and below on any time interval, however short. Naturally, we shall confine attention in this book to continuous Gaussian processes.
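The statement that a Gaussian process is specified by its mean and covariance functions can be illustrated by sampling one finite-dimensional distribution. The sketch below is only an illustration under stated assumptions: the time points and the covariance function $\sigma(s, t) = e^{-|s - t|}$ are hypothetical choices (not taken from the text), and the Cholesky factorisation is one standard way, among others, of turning independent $N(0, 1)$ variables into a vector with law $N(\mu, \Sigma)$ when $\Sigma$ is positive definite.

```python
import math
import random


def cholesky(S):
    """Lower-triangular L with L L^T = S, for a symmetric positive definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return L


def sample_gaussian(mu, L, rng):
    """One draw X = mu + L Z, with Z a vector of independent N(0, 1) variables."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]


# Hypothetical time points, mean function and covariance function.
times = [0.25, 0.5, 1.0]
mu = [0.0 for _ in times]
S = [[math.exp(-abs(s - t)) for t in times] for s in times]

rng = random.Random(0)
L = cholesky(S)
draws = [sample_gaussian(mu, L, rng) for _ in range(20000)]

# Empirical covariance of (X(0.25), X(0.5)); should be close to exp(-0.25).
emp_cov_01 = sum(d[0] * d[1] for d in draws) / len(draws)
print(emp_cov_01, math.exp(-0.25))
```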
5.2.3 Markov Processes
X is Markov if for each t, each $A \in \sigma(X(s) : s > t)$ (the 'future') and $B \in \sigma(X(s) : s < t)$ (the 'past'),

$$\mathbb{P}(A | X(t), B) = \mathbb{P}(A | X(t)).$$

That is, if you know where you are (at time t), how you got there doesn't matter so far as predicting the future is concerned - equivalently, past and future are conditionally independent given the present. The same definition applies to Markov processes in discrete time.

X is said to be strong Markov if the above holds with the fixed time t replaced by a stopping time $\tau$ (a random variable). This is a real restriction of the Markov property in the continuous-time case (though not in discrete time). Perhaps the simplest example of a Markov process that is not strong Markov is given by

$$X(t) := 0 \quad (t \le \tau), \qquad X(t) := t - \tau \quad (t \ge \tau),$$

where $\tau$ is an exponentially distributed random variable. Then X is Markov (from the lack of memory property of the exponential distribution), but not strong Markov (the Markov property fails at the stopping time $\tau$). One must expect the strong Markov property to fail in cases, as here, when 'all the action is at random times'. Another example of a Markov but not strong Markov process is a left-continuous Poisson process - obtained by taking a Poisson process (see below) and modifying its paths to be left-continuous rather than right-continuous. For background and further properties, see e.g. Cox and Miller (1972), Chapter 5, Ethier and Kurtz (1986), Chapter 4, and Karatzas and Shreve (1991), §2.6.
5.2.4 Diffusions
A diffusion is a path-continuous strong Markov process such that for each time t and state x the following limits exist:

$$\mu(t, x) := \lim_{h \downarrow 0} \frac{1}{h} \mathbb{E}[(X(t + h) - X(t)) \,|\, X(t) = x],$$
$$\sigma^2(t, x) := \lim_{h \downarrow 0} \frac{1}{h} \mathbb{E}[(X(t + h) - X(t))^2 \,|\, X(t) = x].$$

Then $\mu(t, x)$ is called the drift, $\sigma^2(t, x)$ the diffusion coefficient. The term 'diffusion' derives from physical situations involving Brownian motion (§5.3 below). The mathematics of heat diffusing through a conducting medium (which goes back to Fourier in the early 19th century) is intimately linked with Brownian motion (the mathematics of which is 20th century).

The theory of diffusions can be split according to dimension. For one-dimensional diffusions, there are a number of ways of treating the theory; see for instance the classic treatments of Breiman (1992) and Doob (1953). For higher-dimensional diffusions, there is basically one way: via the stochastic differential equation methodology (or its reformulation in terms of a martingale problem). This shows the best way to treat the one-dimensional case: the best method is the one that generalises. It also shows that Markov processes and martingales, as well as being the two general classes of stochastic process with which one can get anywhere mathematically, are also intimately linked technically. We will encounter diffusions largely as solutions of stochastic differential equations in §5.6; for further background see Grimmett and Stirzaker (2001), Chapter 13, Revuz and Yor (1991), Chapter 7, and Stroock and Varadhan (1979).
5.3 Brownian Motion
Brownian motion originates in work of the botanist Robert Brown in 1828. It was introduced into finance by Louis Bachelier in 1900, and developed in physics by Albert Einstein in 1905; see §5.3.4 for background and references. The fact that Brownian motion exists is quite deep, and was first proved by Norbert Wiener (1894-1964) in 1923. In honour of this, Brownian motion is also known as the Wiener process, and the probability measure generating it - the measure $\mathbb{P}^*$ on $C[0, 1]$ (one can extend to $C[0, \infty)$) defined by

$$\mathbb{P}^*(A) = \mathbb{P}(W_\cdot \in A) = \mathbb{P}(\{t \mapsto W_t(\omega)\} \in A)$$

for all Borel sets $A \subseteq C[0, 1]$ - is called Wiener measure.

5.3.1 Definition and Existence

Definition 5.3.1. A stochastic process $X = (X(t))_{t \ge 0}$ is a standard (one-dimensional) Brownian motion, BM or BM($\mathbb{R}$), on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, if
(i) $X(0) = 0$ a.s.,
(ii) X has independent increments: $X(t + u) - X(t)$ is independent of $\sigma(X(s) : s \le t)$ for $u \ge 0$,
(iii) X has stationary increments: the law of $X(t + u) - X(t)$ depends only on u,
(iv) X has Gaussian increments: $X(t + u) - X(t)$ is normally distributed with mean 0 and variance u, $X(t + u) - X(t) \sim N(0, u)$,
(v) X has continuous paths: $X(t)$ is a continuous function of t, i.e. $t \mapsto X(t, \omega)$ is continuous in t for all $\omega \in \Omega$.

The path continuity in (v) can be relaxed by assuming it only a.s.; we can then get continuity by excluding a suitable null-set from our probability space. We shall henceforth denote standard Brownian motion BM($\mathbb{R}$) by $W = (W(t))$ (W for Wiener), though $B = (B(t))$ (B for Brown) is also common. Standard Brownian motion BM($\mathbb{R}^d$) in d dimensions is defined by $W(t) := (W_1(t), \dots, W_d(t))$, where $W_1, \dots, W_d$ are independent standard Brownian motions in one dimension (independent copies of BM($\mathbb{R}$)).

We turn next to Wiener's theorem, on existence of Brownian motion. The proof, in which we follow Steele (2001), Ch. 3, is a streamlined version of the classical one due to Lévy in his book of 1948 and Ciesielski in 1961 (see below for references).

Theorem 5.3.1 (Wiener). Brownian motion exists.
Covariance. Before addressing existence, we first find the covariance function. For $s \le t$, $W_t = W_s + (W_t - W_s)$, so as $\mathbb{E}(W_t) = 0$,

$$\mathrm{Cov}(W_s, W_t) = \mathbb{E}(W_s W_t) = \mathbb{E}(W_s^2) + \mathbb{E}[W_s (W_t - W_s)].$$

The last term is $\mathbb{E}(W_s)\mathbb{E}(W_t - W_s)$ by independent increments, and this is zero, so

$$\mathrm{Cov}(W_s, W_t) = \mathbb{E}(W_s^2) = s \quad (s \le t): \qquad \mathrm{Cov}(W_s, W_t) = \min(s, t).$$

A Gaussian process (one whose finite-dimensional distributions are Gaussian) is specified by its mean function and its covariance function, so among centered (zero-mean) Gaussian processes, the covariance function $\min(s, t)$ serves as the signature of Brownian motion.

Finite-dimensional Distributions. For $0 \le t_1 < \dots < t_n$, the joint law of $X(t_1), X(t_2), \dots, X(t_n)$ can be obtained from that of $X(t_1), X(t_2) - X(t_1), \dots, X(t_n) - X(t_{n-1})$. These are jointly Gaussian, hence so are $X(t_1), \dots, X(t_n)$: the finite-dimensional distributions are multivariate normal. Recall (§5.2.2) that the multivariate normal law in n dimensions, $N_n(\mu, \Sigma)$, is specified by the mean vector $\mu$ and the covariance matrix $\Sigma$ (non-negative definite). So to check the finite-dimensional distributions of BM - stationary independent increments with $W_t \sim N(0, t)$ - it suffices to show that they are multivariate normal with mean zero and covariance $\mathrm{Cov}(W_s, W_t) = \min(s, t)$ as above.
Construction of BM. It suffices to construct BM for $t \in [0, 1]$. This gives $t \in [0, n]$ by dilation, and $t \in [0, \infty)$ by letting $n \to \infty$. First, take $L^2[0, 1]$, and any complete orthonormal system (cons) $(\phi_n)$ on it. Now $L^2$ is a Hilbert space, under the inner product

$$\langle f, g \rangle = \int_0^1 f(x) g(x) \, dx \quad \left(\text{or } \int fg\right),$$

so norm $\|f\| := (\int f^2)^{1/2}$. By Parseval's identity,

$$\int_0^1 fg = \sum_{n=0}^{\infty} \langle f, \phi_n \rangle \langle g, \phi_n \rangle$$

(where convergence of the series on the right is in $L^2$, or in mean square: $\|f - \sum_{k=0}^{n} \langle f, \phi_k \rangle \phi_k\| \to 0$ as $n \to \infty$). Now take, for $s, t \in [0, 1]$,

$$f(x) = 1_{[0, s]}(x), \qquad g(x) = 1_{[0, t]}(x).$$

Parseval's identity becomes

$$\min(s, t) = \sum_{n=0}^{\infty} \int_0^s \phi_n(x) \, dx \int_0^t \phi_n(x) \, dx.$$

Now take $(Z_n)$ independent and identically distributed $N(0, 1)$, and write

$$W_t = \sum_{n=0}^{\infty} Z_n \int_0^t \phi_n(x) \, dx.$$

This is a sum of independent random variables. Kolmogorov's theorem on random series ('three-series theorem', Shiryaev (1996), IV, §2, Theorem 3) says that it converges a.s. if the sum of the variances converges. This is $\sum_{n=0}^{\infty} (\int_0^t \phi_n(x) \, dx)^2$, $= t$ by above. So the series above converges a.s., and by excluding the exceptional null set from our probability space (as we may), everywhere.
The Haar System. Define

$$H(t) = \begin{cases} 1 & \text{on } [0, \tfrac{1}{2}), \\ -1 & \text{on } [\tfrac{1}{2}, 1], \\ 0 & \text{else.} \end{cases}$$

Write $H_0(t) \equiv 1$, and for $n \ge 1$, express n in dyadic form as $n = 2^j + k$ for a unique $j = 0, 1, \dots$ and $k = 0, 1, \dots, 2^j - 1$. Using this notation for n, j, k throughout, write

$$H_n(t) := 2^{j/2} H(2^j t - k)$$

(so $H_n$ has support $[k/2^j, (k + 1)/2^j]$). So if m, n ($m \ne n$) have the same j, $H_m H_n \equiv 0$, while if m, n have different js, $j_1$ and $j_2$ say, one can check that $H_m H_n$ is either identically zero or is $2^{(j_1 + j_2)/2}$ on half its support, $-2^{(j_1 + j_2)/2}$ on the other half, so $\int H_m H_n = 0$. Also $H_n^2$ is $2^j$ on $[k/2^j, (k + 1)/2^j]$, so $\int H_n^2 = 1$. Combining:

$$\int H_m H_n = \delta_{mn},$$

and $(H_n)$ form an orthonormal system, called the Haar system. For completeness: the indicator of any dyadic interval $[k/2^j, (k + 1)/2^j]$ is in the linear span of the $H_n$s (take the difference of two consecutive $H_n$s and scale). Linear combinations of such indicators are dense in $L^2[0, 1]$. Combining: the Haar system $(H_n)$ is a complete orthonormal system in $L^2[0, 1]$.
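The Parseval calculation $\min(s, t) = \sum_n \int_0^s \phi_n \int_0^t \phi_n$ can be checked numerically for the Haar system. The following is a minimal sketch, not part of the original argument: the function names are illustrative, the integral of $H_n$ is evaluated in closed form from the definition above, and the series is truncated at $n < 2^{\text{levels}}$. For dyadic s and t of sufficiently low level the truncated sum is already exact, because the indicators then lie in the span of the Haar functions used.

```python
def haar_index(n):
    """Split n >= 1 as n = 2**j + k with 0 <= k < 2**j."""
    j = n.bit_length() - 1
    return j, n - (1 << j)


def haar_integral(n, t):
    """Integral of the Haar function H_n over [0, t], for t in [0, 1]."""
    if n == 0:
        return t  # H_0 is identically 1
    j, k = haar_index(n)
    a, b = k / 2**j, (k + 1) / 2**j  # support [a, b] of H_n
    # H_n is +2^(j/2) on the left half and -2^(j/2) on the right half,
    # so the integral rises linearly to the midpoint and falls back to 0 at b.
    return 2.0 ** (j / 2) * max(0.0, min(t - a, b - t))


def parseval_sum(s, t, levels):
    """Truncated Parseval series: sum over n < 2**levels."""
    return sum(haar_integral(n, s) * haar_integral(n, t) for n in range(2**levels))


exact_dyadic = parseval_sum(0.25, 0.75, 3)   # dyadic points: truncation is exact
approx_general = parseval_sum(0.3, 0.7, 10)  # general points: small truncation error
print(exact_dyadic, approx_general)
```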
The Schauder System. We obtain the Schauder system by integrating the Haar system. Consider the triangular function (or 'tent function')

$$\Delta(t) = \begin{cases} 2t & \text{on } [0, \tfrac{1}{2}], \\ 2(1 - t) & \text{on } [\tfrac{1}{2}, 1], \\ 0 & \text{else.} \end{cases}$$

Write $\Delta_0(t) := t$, $\Delta_1(t) := \Delta(t)$, and define the nth Schauder function $\Delta_n$ by

$$\Delta_n(t) := \Delta(2^j t - k) \quad (n = 2^j + k \ge 1).$$

Note that $\Delta_n$ has support $[k/2^j, (k + 1)/2^j]$ (so is 'localized' on this dyadic interval, which is small for n, j large). We see that

$$\int_0^t H(u) \, du = \tfrac{1}{2} \Delta(t),$$

and similarly

$$\int_0^t H_n(u) \, du = \lambda_n \Delta_n(t),$$

where $\lambda_0 = 1$ and for $n \ge 1$,

$$\lambda_n = \tfrac{1}{2} \times 2^{-j/2} \quad (n = 2^j + k \ge 1).$$

The Schauder system $(\Delta_n)$ is again a cons on $L^2[0, 1]$.
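Theorem 5.3.2. For $(Z_n)_{0}^{\infty}$ independent $N(0, 1)$ random variables, $\lambda_n$, $\Delta_n$ as above,

$$W_t := \sum_{n=0}^{\infty} \lambda_n Z_n \Delta_n(t)$$

converges uniformly on $[0, 1]$, a.s. The process $W = (W_t : t \in [0, 1])$ is Brownian motion.

Lemma 5.3.1. For $Z_n$ independent $N(0, 1)$,

$$|Z_n| \le C \sqrt{\log n} \quad \forall\, n \ge 2,$$

for some random variable $C < \infty$ a.s.

Proof. For $x > 1$,

$$\mathbb{P}(|Z_n| > x) = \sqrt{2/\pi} \int_x^{\infty} e^{-u^2/2} \, du \le \sqrt{2/\pi} \int_x^{\infty} u \, e^{-u^2/2} \, du = \sqrt{2/\pi} \, e^{-x^2/2}.$$

So for any $a > 1$,

$$\mathbb{P}(|Z_n| > \sqrt{2a \log n}) \le \sqrt{2/\pi} \exp\{-a \log n\} = \sqrt{2/\pi} \, n^{-a}.$$

Since $\sum n^{-a} < \infty$ for $a > 1$, the Borel-Cantelli lemma gives

$$\mathbb{P}(|Z_n| > \sqrt{2a \log n} \text{ for infinitely many } n) = 0.$$

So

$$C := \sup_{n \ge 2} \frac{|Z_n|}{\sqrt{\log n}} < \infty \quad \text{a.s.}$$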
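Proof of Theorem 5.3.2.

1. Convergence. Choose J and $M \ge 2^J$; then

$$\sum_{n=M}^{\infty} \lambda_n |Z_n| \Delta_n(t) \le C \sum_{n=M}^{\infty} \lambda_n \sqrt{\log n} \, \Delta_n(t).$$

The right is majorized by

$$C \sum_{j=J}^{\infty} \sum_{k=0}^{2^j - 1} \tfrac{1}{2} 2^{-j/2} \sqrt{j + 1} \, \Delta_{2^j + k}(t)$$

(perhaps including some extra terms at the beginning, using $n = 2^j + k < 2^{j+1}$, $\log n \le (j + 1) \log 2$, and $\Delta_n(\cdot) \ge 0$, so the series is absolutely convergent). In the inner sum, only one term is non-zero (t can belong to only one dyadic interval $[k/2^j, (k + 1)/2^j)$), and each $\Delta_n(t) \in [0, 1]$. So

$$\text{LHS} \le C \sum_{j=J}^{\infty} \tfrac{1}{2} 2^{-j/2} \sqrt{j + 1} \quad \forall\, t \in [0, 1],$$

and this tends to 0 as $J \to \infty$, so as $M \to \infty$. So the series $\sum \lambda_n Z_n \Delta_n(t)$ is absolutely and uniformly convergent, a.s. Since continuity is preserved under uniform convergence and each $\Delta_n(t)$ (so each partial sum) is continuous, $W_t$ is continuous in t.

2. Covariance. By absolute convergence and Fubini's theorem,

$$\mathbb{E}(W_t) = \sum_{n=0}^{\infty} \lambda_n \Delta_n(t) \, \mathbb{E}(Z_n) = 0.$$

So the covariance is

$$\mathbb{E}(W_s W_t) = \sum_{m} \sum_{n} \mathbb{E}(Z_m Z_n) \int_0^s \phi_m \int_0^t \phi_n = \sum_{n} \int_0^s \phi_n \int_0^t \phi_n = \min(s, t),$$

by the Parseval calculation above (here $\phi_n = H_n$, so that $\int_0^t \phi_n = \lambda_n \Delta_n(t)$).

3. Joint Distributions. Take $t_1, \dots, t_m \in [0, 1]$; we have to show that $(W(t_1), \dots, W(t_m))$ is multivariate normal, with mean vector 0 and covariance matrix $(\min(t_i, t_j))$. The multivariate characteristic function is

$$\mathbb{E} \exp\Big\{ i \sum_{j=1}^{m} u_j W(t_j) \Big\} = \mathbb{E} \exp\Big\{ i \sum_{j=1}^{m} u_j \sum_{n=0}^{\infty} \lambda_n Z_n \Delta_n(t_j) \Big\},$$

which by independence of the $Z_n$ is

$$\prod_{n=0}^{\infty} \mathbb{E} \exp\Big\{ i \lambda_n Z_n \sum_{j=1}^{m} u_j \Delta_n(t_j) \Big\}.$$

Since each $Z_n$ is $N(0, 1)$, the right-hand side is

$$\prod_{n=0}^{\infty} \exp\Big\{ -\tfrac{1}{2} \lambda_n^2 \Big( \sum_{j=1}^{m} u_j \Delta_n(t_j) \Big)^2 \Big\} = \exp\Big\{ -\tfrac{1}{2} \sum_{n=0}^{\infty} \lambda_n^2 \Big( \sum_{j=1}^{m} u_j \Delta_n(t_j) \Big)^2 \Big\}.$$

The sum in the exponent on the right is

$$\sum_{n=0}^{\infty} \lambda_n^2 \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \Delta_n(t_j) \Delta_n(t_k) = \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \sum_{n=0}^{\infty} \int_0^{t_j} H_n(u) \, du \int_0^{t_k} H_n(u) \, du,$$

giving

$$\sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \min(t_j, t_k),$$

by the Parseval calculation, as $(H_n)$ are a cons. Combining,

$$\mathbb{E} \exp\Big\{ i \sum_{j=1}^{m} u_j W(t_j) \Big\} = \exp\Big\{ -\tfrac{1}{2} \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \min(t_j, t_k) \Big\}.$$

This says that $(W(t_1), \dots, W(t_m))$ is multinormal with mean 0 and covariance function $\min(t_j, t_k)$ as required. This completes the construction of BM.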
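The construction in Theorem 5.3.2 translates directly into a simulation recipe. The sketch below is only an illustration under stated assumptions: it truncates the series at a finite number of terms (here 64, a hypothetical choice), uses Python's standard pseudo-random generator for the $Z_n$, and the function names are not from the text. Truncating at $2^L$ terms yields the piecewise-linear interpolation of the path through its values at the dyadic points $k/2^L$, so at those points the covariance $\min(s, t)$ should be reproduced up to Monte Carlo error.

```python
import random


def tent(x):
    """Triangular function: 2x on [0, 1/2], 2(1 - x) on [1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return 2.0 * x
    if 0.5 < x <= 1.0:
        return 2.0 * (1.0 - x)
    return 0.0


def schauder(n, t):
    """n-th Schauder function Delta_n(t), with n = 2**j + k for n >= 1."""
    if n == 0:
        return t
    j = n.bit_length() - 1
    k = n - (1 << j)
    return tent((1 << j) * t - k)


def lam(n):
    """Coefficient lambda_n: 1 for n = 0, (1/2) * 2**(-j/2) for n = 2**j + k."""
    if n == 0:
        return 1.0
    j = n.bit_length() - 1
    return 0.5 * 2.0 ** (-j / 2)


def brownian_path(z, grid):
    """Truncated series sum_{n < len(z)} lambda_n z[n] Delta_n(t) at each t in grid."""
    return [sum(lam(n) * z[n] * schauder(n, t) for n in range(len(z))) for t in grid]


rng = random.Random(1)
n_terms, n_paths = 64, 4000      # hypothetical truncation level and sample size
grid = [0.25, 0.5, 1.0]          # dyadic points, where the truncation is exact
paths = [brownian_path([rng.gauss(0.0, 1.0) for _ in range(n_terms)], grid)
         for _ in range(n_paths)]

cov_est = sum(p[0] * p[1] for p in paths) / n_paths   # target: min(0.25, 0.5) = 0.25
var_est = sum(p[2] ** 2 for p in paths) / n_paths     # target: Var(W_1) = 1
print(cov_est, var_est)
```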
Wavelets. The Haar system $(H_n)$, and the Schauder system $(\Delta_n)$ obtained by integration from it, are examples of wavelet systems. The original function, H or $\Delta$, is a mother wavelet, and the 'daughter wavelets' are obtained from it by dilation and translation. The expansion of the theorem is the wavelet expansion of BM with respect to the Schauder system $(\Delta_n)$. For any $f \in C[0, 1]$, we can form its wavelet expansion

$$f(t) = \sum_{n=0}^{\infty} c_n \Delta_n(t),$$

with wavelet coefficients $c_n$. Here $c_n$ are given by

$$c_n = f\Big( \frac{k + \tfrac{1}{2}}{2^j} \Big) - \frac{1}{2} \Big[ f\Big( \frac{k}{2^j} \Big) + f\Big( \frac{k + 1}{2^j} \Big) \Big].$$

This is the form that gives the $\Delta_n(\cdot)$ term its correct triangular influence, localized on the dyadic interval $[k/2^j, (k + 1)/2^j]$. Thus for f = BM, $c_n = \lambda_n Z_n$, with $\lambda_n$, $Z_n$ as above. The wavelet construction of BM above is, in modern language, the classical 'broken-line' construction of BM due to Lévy in his book of 1948 - the Lévy representation of BM using the Schauder system, and extended to general cons by Ciesielski in 1961; see McKean (1969), §1.2 for a textbook account. The earliest expansion of BM - the 'Fourier-Wiener expansion' - used the trigonometric cons (Paley and Zygmund 1930-32, Paley, Wiener and Zygmund 1932); see Kahane (1985), Preface and §16.3.
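The coefficient formula above can be tried out on a deterministic function. The following sketch rests on assumptions not spelled out in the text: the test function $f(t) = t^2$ is a hypothetical example, the coefficient for $n = 0$ is taken to be $f(1) - f(0)$, and $f(0)$ is added back as a constant (it vanishes for Brownian paths). With the series truncated at $n < 2^L$, the partial sum is the 'broken-line' interpolant of f through the dyadic points $k/2^L$, which is what the test values below rely on.

```python
def tent(x):
    """Triangular function: 2x on [0, 1/2], 2(1 - x) on [1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return 2.0 * x
    if 0.5 < x <= 1.0:
        return 2.0 * (1.0 - x)
    return 0.0


def schauder(n, t):
    """n-th Schauder function Delta_n(t), with n = 2**j + k for n >= 1."""
    if n == 0:
        return t
    j = n.bit_length() - 1
    k = n - (1 << j)
    return tent((1 << j) * t - k)


def coeff(f, n):
    """Wavelet coefficient c_n of f with respect to the Schauder system."""
    if n == 0:
        return f(1.0) - f(0.0)  # assumption: slope term for Delta_0(t) = t
    j = n.bit_length() - 1
    k = n - (1 << j)
    left, mid, right = k / 2**j, (k + 0.5) / 2**j, (k + 1) / 2**j
    # value at the midpoint minus the average of the endpoint values
    return f(mid) - 0.5 * (f(left) + f(right))


def partial_sum(f, levels, t):
    """Broken-line approximation: f(0) + sum over n < 2**levels of c_n Delta_n(t)."""
    return f(0.0) + sum(coeff(f, n) * schauder(n, t) for n in range(2**levels))


def square(t):
    return t * t  # hypothetical test function


at_dyadic = partial_sum(square, 4, 3 / 16)  # dyadic point of level 4: reproduced exactly
at_general = partial_sum(square, 4, 0.3)    # elsewhere: linear interpolation error
print(at_dyadic, at_general)
```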
Note. We shall see that Brownian motion is a fractal, and wavelets are a useful tool for the analysis of fractals more generally. For background, see e.g. Holschneider (1995), §4.4.
For further background, see any measure-theoretic text on stochastic pro cesses. A treatment starting directly from our main reference of measure theoretic results ( Williams (1991)) is Rogers and Williams (1994) , Chapter 1 . The classic is Doob (1953), VIII.2. Excellent modern texts include Karatzas and Shreve (1991) and Revuz and Yor ( 1991) ( see particularly Karatzas and Shreve (1991) , §2.2-4 for construction ) . From the mathematical point of view, Brownian motion owes much of its importance to belonging to all the important classes of stochastic processes: it is ( strong ) Markov, a ( continuous ) martingale, Gaussian, a diffusion, a Levy process etc. From an applied point of view, as its diverse origins Brown's work in botany, Bachelier's in economics, Einstein's in statistical mechanics etc. - suggest, Brownian motion has a universal character, and is ubiquitous both in theory and in applied modeling. The universal nature of Brownian motion as a stochastic process is simply the dynamic counterpart - where we work with evolution in time - of the universal nature of its static counterpart, the normal ( or Gaussian ) distribution - in probability,
5.3 Brownian Motion
167
statistics, science, economics etc. Both arise from the same source, the central limit theorem. This says that when we average large numbers of independent and comparable objects, we obtain the normal distribution (see §2.8) in a static context, or Brownian motion in a dynamic context (see §5. 1 1 for the machinery - weak convergence - needed to handle such limiting results for stochastic processes; cf. §2.6 for its static counterpart) . What the central limit theorem really says is that, when what we observe is the result of a very large number of individually very small influences, the normal distribution or Brownian motion will inevitably and automatically emerge. This explains the central role of the normal distribution in statistics - basically, this is why statistics works. It also explains the central role of Brownian motion as the basic model of random fluctuations, or random noise as one often says. As the word noise suggests, this usage comes from electrical engineering and the early days of radio (see e.g. Wax (1954)) . When we come to studying the dynamics of stochastic processes by means of stochastic differential equations (§5.8 below), we will usually find a 'driving noise' term. The most basic driving noise process is Brownian motion; its role is to represent the 'random buffeting' of the object under study by a myriad of influences which we have no hope of studying in detail - and indeed, no need to. By using the central limit theorem, we make the very complexity of the situation work on our side: Brownian motion is a comparatively simple and tractable process to work with - vastly simpler than the underlying random buffeting whose effect it approximates and represents. The precise circumstances in which one obtains the normal or Gaussian distribution, or Brownian motion, have been much studied (this was the predominant theme in Levy's life's work, for instance) . 
One needs means and variances to exist (which is why the mean $\mu$ and the variance $\sigma^2$ are needed to parametrize the normal or Gaussian family). One also needs either independence, or something not too far removed from it, such as suitable martingale dependence (for martingale central limit theory, see the excellent book Hall and Heyde (1980)) or Markov dependence (see Ethier and Kurtz (1986)).

5.3.2 Quadratic Variation of Brownian Motion
Recall that an $N(\mu, \sigma^2)$-distributed random variable $\xi$ has moment-generating function
$$M(t) := \mathbb{E}(\exp\{t\xi\}) = \exp\{\mu t + \tfrac{1}{2}\sigma^2 t^2\}.$$
We take $\mu = 0$ below; we can recover the general case by adding $\mu$ back on. So, for $\xi$ $N(0, \sigma^2)$-distributed,
$$M(t) = \exp\{\tfrac{1}{2}\sigma^2 t^2\} = 1 + \tfrac{1}{2}\sigma^2 t^2 + \frac{1}{2!}\left(\tfrac{1}{2}\sigma^2 t^2\right)^2 + O(t^6) = 1 + \frac{1}{2!}\sigma^2 t^2 + \frac{3}{4!}\sigma^4 t^4 + O(t^6).$$
As the Taylor coefficients of the moment-generating function are the moments, up to the factorials (hence the name moment-generating function!), $\mathbb{E}(\xi^2) = \mathrm{Var}(\xi) = \sigma^2$, $\mathbb{E}(\xi^4) = 3\sigma^4$, so $\mathrm{Var}(\xi^2) = \mathbb{E}(\xi^4) - [\mathbb{E}(\xi^2)]^2 = 2\sigma^4$. For $W$ Brownian motion on $\mathbb{R}$, this gives
$$\mathbb{E}(W(t)) = 0, \quad \mathrm{Var}(W(t)) = \mathbb{E}(W(t)^2) = t, \quad \mathrm{Var}(W(t)^2) = 2t^2.$$
In particular, for $t > 0$ small, this shows that the variance of $W(t)^2$ is negligible compared with its expected value. Thus, the randomness in $W(t)^2$ is negligible compared to its mean for $t$ small. This suggests that if we take a fine enough partition $P$ of $[0, t]$ - a finite set of points $0 = t_0 < t_1 < \ldots < t_n = t$ with grid mesh $\|P\| := \max |t_i - t_{i-1}|$ small enough - then writing $\Delta W(t_i) := W(t_i) - W(t_{i-1})$ and $\Delta t_i := t_i - t_{i-1}$,
$$\sum_{i=1}^n (\Delta W(t_i))^2$$
will closely resemble
$$\sum_{i=1}^n \mathbb{E}((\Delta W(t_i))^2) = \sum_{i=1}^n \Delta t_i = \sum_{i=1}^n (t_i - t_{i-1}) = t.$$
This is in fact true:
$$\sum_{i=1}^n (\Delta W(t_i))^2 \to \sum_{i=1}^n \Delta t_i = t \quad \text{in probability} \quad (\max |t_i - t_{i-1}| \to 0).$$
This limit is called the quadratic variation of $W$ over $[0, t]$.

Start with the formal definitions. A partition $\pi_n$ of $[0, t]$ is a finite set of points $t_{ni}$ such that $0 = t_{n0} < t_{n1} < \ldots < t_{n,k(n)} = t$; the mesh of the partition is $|\pi_n| := \max_i (t_{ni} - t_{n,(i-1)})$, the maximal subinterval length. We consider nested sequences $(\pi_n)$ of partitions (each refines its predecessors by adding further partition points), with $|\pi_n| \to 0$. Call (writing $t_i$ for $t_{ni}$ for simplicity)
$$\pi_n W := \sum_i (W(t_i) - W(t_{i-1}))^2$$
the quadratic variation of $W$ on $(\pi_n)$. The following classical result is due to Lévy (in his book of 1948); the proof below is from Protter (2004), §I.3.
Theorem 5.3.3 (Lévy). The quadratic variation of a Brownian path over $[0, t]$ exists and equals $t$, in mean square (and hence in probability): $\langle W \rangle_t = t$.
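Proof. Write $\Delta_i W := W(t_i) - W(t_{i-1})$ and $\Delta_i t := t_i - t_{i-1}$. Then
$$\pi_n W - t = \sum_i \{(\Delta_i W)^2 - (\Delta_i t)\} = \sum_i Y_i,$$
say, where since $\Delta_i W \sim N(0, \Delta_i t)$, $\mathbb{E}[(\Delta_i W)^2] = \Delta_i t$, so the $Y_i$ have zero mean, and are independent by independent increments of $W$. So
$$\mathbb{E}[(\pi_n W - t)^2] = \mathbb{E}\Big[\Big(\sum_i Y_i\Big)^2\Big] = \sum_i \mathbb{E}(Y_i^2),$$
since variance adds over independent summands. Now as $\Delta_i W \sim N(0, \Delta_i t)$, $(\Delta_i W)/\sqrt{\Delta_i t} \sim N(0, 1)$, so $(\Delta_i W)^2/\Delta_i t \sim Z^2$, where $Z \sim N(0, 1)$. So $Y_i = (\Delta_i W)^2 - \Delta_i t \sim (Z^2 - 1)\Delta_i t$, and
$$\mathbb{E}[(\pi_n W - t)^2] = \sum_i \mathbb{E}[(Z^2 - 1)^2](\Delta_i t)^2 = c \sum_i (\Delta_i t)^2,$$
writing $c$ for $\mathbb{E}[(Z^2 - 1)^2]$, $Z \sim N(0, 1)$, a finite constant. But
$$\sum_i (\Delta_i t)^2 \le \max_i \Delta_i t \times \sum_i \Delta_i t = |\pi_n| t,$$
giving
$$\mathbb{E}[(\pi_n W - t)^2] \le c\, t\, |\pi_n| \to 0,$$
as required. □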
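The mean-square convergence in the theorem can be illustrated numerically. The following is a minimal sketch, not part of the original text: it assumes a uniform partition of $[0, t]$ into $n$ steps, uses only Python's standard library, and the function names `quadratic_variation` and `mean_square_error` are hypothetical. For a uniform partition the bound in the proof is attained with $c = \mathbb{E}[(Z^2 - 1)^2] = 2$, so the mean-square error should be close to $2t^2/n$.

```python
import math
import random


def quadratic_variation(t, n, rng):
    """Sum of squared Brownian increments over a uniform n-step partition of [0, t]."""
    dt = t / n
    sd = math.sqrt(dt)
    # Increments over disjoint intervals are independent N(0, dt).
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n))


def mean_square_error(t, n, n_paths, rng):
    """Monte Carlo estimate of E[(pi_n W - t)^2] over n_paths simulated paths."""
    total = 0.0
    for _ in range(n_paths):
        total += (quadratic_variation(t, n, rng) - t) ** 2
    return total / n_paths


if __name__ == "__main__":
    rng = random.Random(0)
    t = 1.0
    for n in (10, 100, 1000):
        mse = mean_square_error(t, n, 500, rng)
        # Theory for a uniform partition: c * sum (dt)^2 = 2 * t^2 / n.
        print(n, mse, 2.0 * t * t / n)
```

Refining the partition by a factor of ten should shrink the estimated mean-square error by roughly the same factor, in line with the bound $c\,t\,|\pi_n|$.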
Remark 5.3.1.
1. From convergence in mean square, one can always extract an a.s. convergent subsequence.
2. The conclusion above extends in full generality to a.s. convergence, but an easy proof requires the reversed martingale convergence theorem, which we omit.
3. There is an easy extension to a.s. convergence under the extra restriction $\sum_n |\pi_n| < \infty$, using the Borel-Cantelli lemma and Chebychev's inequality.
4. If we consider the theorem over $[0, t + dt]$, $[0, t]$ and subtract, we can write the result formally as $(dW_t)^2 = dt$. This can be regarded either as a convenient piece of symbolism, or acronym, or as the essence of Itô calculus.
Note. The quadratic variation as defined above involves the limit of the quadratic variation over every sequence of partitions whose maximal subinterval length tends to zero. We stress that this is not the same as taking the supremum of the quadratic variation over all partitions - indeed, this would give $\infty$, rather than $t$ (by the law of the iterated logarithm for Brownian motion, see Rogers and Williams (1994), I.16). This second definition - strong quadratic variation - is the appropriate one in some contexts, such as Lyons' theory of rough paths, but we shall not need it, and quadratic variation will always be defined in the first sense in this book.
Suppose now we look at the ordinary variation $\sum |\Delta W(t)|$, rather than the quadratic variation $\sum (\Delta W(t))^2$. Then instead of $\sum (\Delta W(t))^2 \sim \sum \Delta t = t$, we get $\sum |\Delta W(t)| \sim \sum \sqrt{\Delta t}$. Now for $\Delta t$ small, $\sqrt{\Delta t}$ is of a larger order of magnitude than $\Delta t$. So if $\sum \Delta t = t$ converges, $\sum \sqrt{\Delta t}$ diverges to $+\infty$. This suggests:
Corollary 5.3.1 (Lévy). The paths of Brownian motion are of unbounded variation - their variation is $+\infty$ on every interval.

Because of the above corollary, we will not be able to define integrals with respect to Brownian motion by a path-by-path procedure (for BM the relevant convergence in the above results in fact takes place with probability one). However, turning to the class $c\mathcal{M}^2$ of continuous square-integrable martingales, we find that these processes have finite quadratic variation, but all variations of higher order are zero and, except for trivial cases, all variations of lower order are infinite with positive probability. So quadratic variation is indeed the right variation to study. Returning to Brownian motion, we observe that for $s < t$,
$$\mathbb{E}(W(t)^2 \mid \mathcal{F}_s) = \mathbb{E}([W(s) + (W(t) - W(s))]^2 \mid \mathcal{F}_s)$$
$$= W(s)^2 + 2W(s)\,\mathbb{E}[(W(t) - W(s)) \mid \mathcal{F}_s] + \mathbb{E}[(W(t) - W(s))^2 \mid \mathcal{F}_s] = W(s)^2 + 0 + (t - s).$$
So $W(t)^2 - t$ is a martingale. This shows that the quadratic variation is the adapted increasing process in the Doob-Meyer decomposition of $W^2$ (recall that $W^2$ is a nonnegative submartingale and thus can be written as the sum of a martingale and an adapted increasing process). This result extends to the class $c\mathcal{M}^2$ (and indeed to the broader class of local martingales (see §5.10 below)).

Theorem 5.3.4. A martingale $M \in c\mathcal{M}^2$ is of finite quadratic variation $\langle M \rangle$, and $\langle M \rangle$ is the unique continuous increasing adapted process vanishing at zero with $M^2 - \langle M \rangle$ a martingale.
The quadratic variation result above leads to Lévy's 1948 result, the martingale characterization of Brownian motion. Recall that $W(t)$ is a continuous martingale with respect to its natural filtration $(\mathcal{F}_t)$ and with quadratic variation $t$. There is a remarkable converse:

Theorem 5.3.5 (Martingale Characterization of BM). If $M$ is any continuous, square-integrable (local) $(\mathcal{F}_t)$-martingale with $M(0) = 0$ and quadratic variation $t$, then $M$ is an $(\mathcal{F}_t)$-Brownian motion. Expressed differently this is: If $M$ is any continuous, square-integrable (local) $(\mathcal{F}_t)$-martingale with $M(0) = 0$ and $M(t)^2 - t$ a martingale, then $M$ is an $(\mathcal{F}_t)$-Brownian motion.

In view of the fact that $\langle W \rangle(t) = t$, a further useful fact about Brownian motion may be guessed: If $M$ is a continuous martingale then there exists a Brownian motion $W(t)$ such that $M(t) = W(\langle M \rangle(t))$, i.e. the martingale $M$ can be transformed into a Brownian motion by a random time-change. These results already imply that Brownian motion is the fundamental continuous martingale, and we will provide further evidence of this throughout the remainder of this chapter. For further details and proofs, see e.g. Karatzas and Shreve (1991), §§1.5, 3.4, Rogers and Williams (2000), I.2, Revuz and Yor (1991), I.2, IV.1.
5.3.3 Properties of Brownian Motion

Brownian Scaling. For any $c > 0$, write
$$W_c(t) := c^{-1} W(c^2 t), \quad t \ge 0,$$
with $W$ BM. Then $W_c$ is Gaussian, with mean 0, variance $c^{-2} \times c^2 t = t$ and covariance
$$\mathrm{Cov}(W_c(s), W_c(t)) = c^{-2} \min(c^2 s, c^2 t) = \min(s, t) = \mathrm{Cov}(W(s), W(t)).$$
Also $W_c$ has continuous paths, as $W$ does. So $W_c$ has all the properties of Brownian motion. So, $W_c$ is Brownian motion. It is said to be derived from $W$ by Brownian scaling with scale-factor $c > 0$. Since $(W(ut) : t \ge 0) = (\sqrt{u}\,W(t) : t \ge 0)$ in law, $\forall u > 0$, $W$ is called self-similar with index 1/2 (Bingham, Goldie, and Teugels (1987), §8.5). Brownian motion is thus a fractal. A piece of Brownian path, looked at under a microscope, still looks Brownian, however much we 'zoom in and magnify'. Of course, the contrast with a function $f$ with some smoothness is stark: a differentiable function begins to look straight under repeated zooming and magnification, because it has a tangent.
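The variance and covariance of the rescaled process can be checked by simulation. This is a minimal sketch under stated assumptions: it samples $W$ only at the two rescaled times $c^2 s < c^2 t$, builds $W_c(u) = c^{-1}W(c^2 u)$ from independent Gaussian increments, and the helper names `scaled_pair` and `sample_moments` are hypothetical.

```python
import math
import random


def scaled_pair(c, s, t, rng):
    """Sample (W_c(s), W_c(t)) with W_c(u) = W(c^2 u) / c, for 0 < s < t."""
    # W at the rescaled times, built from independent Gaussian increments.
    w1 = rng.gauss(0.0, math.sqrt(c * c * s))
    w2 = w1 + rng.gauss(0.0, math.sqrt(c * c * (t - s)))
    return w1 / c, w2 / c


def sample_moments(c, s, t, n_paths, rng):
    """Monte Carlo estimates of (Var W_c(t), Cov(W_c(s), W_c(t))); both means are 0."""
    pairs = [scaled_pair(c, s, t, rng) for _ in range(n_paths)]
    var_t = sum(y * y for _, y in pairs) / n_paths
    cov_st = sum(x * y for x, y in pairs) / n_paths
    return var_t, cov_st


if __name__ == "__main__":
    rng = random.Random(0)
    s, t = 0.5, 2.0
    for c in (0.1, 1.0, 10.0):
        # Theory: variance t and covariance min(s, t), whatever the scale factor c.
        print(c, sample_moments(c, s, t, 20000, rng), (t, min(s, t)))
```

The estimates should not depend on the scale factor, which is the content of the scaling property.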
Time-Inversion. Write
$$X_t := tW(1/t) \quad (t > 0), \qquad X_0 := 0.$$
Then $X$ has mean 0 and covariance
$$\mathrm{Cov}(X_s, X_t) = st \cdot \mathrm{Cov}(W(1/s), W(1/t)) = st \cdot \min(1/s, 1/t) = \min(t, s) = \min(s, t).$$
Since $X$ has continuous paths also, as above, $X$ is Brownian motion. We say that $X$ is obtained from $W$ by time-inversion. This property is useful in transforming properties of BM 'in the large' ($t \to \infty$) to properties 'in the small', or local properties ($t \to 0$). For example, one can translate the law of the iterated logarithm (LIL) from global to local form. Using time-inversion, we see that - as the zero-set of Brownian motion
$$Z := \{t \ge 0 : W_t = 0\}$$
is unbounded (contains infinitely many points increasing to infinity), it must also contain infinitely many points decreasing to zero. That is, any zero of Brownian motion (e.g., time $t = 0$, as we are choosing to start our BM at the origin) produces an 'echo' - an infinite sequence of zeros at positive times decreasing to zero. How can we hope to graph such a function? (We can't!) How on earth does it manage to escape from zero, when hitting zero at one time $u$, say, forces zero to be hit infinitely many times in any time-interval $[u, u + \epsilon]$ ($\epsilon > 0$)? The answer to these questions involves excursion theory, one of Itô's great contributions to probability theory (1970). When BM is at zero, it is as likely to leave to the right as to the left, by symmetry - but it will leave, immediately, with probability one. These 'excursions away from zero' - above and below - happen according to a Poisson random measure governing the excursions - the excursion measure - on path-space. As there are infinitely many excursions in finite time-intervals, the excursion measure has infinite mass - it is $\sigma$-finite but not finite. For details of the form of the Brownian excursion, background, proofs etc., we refer to Rogers and Williams (2000), or Bertoin (1996), IV.

Note however that, far from being pathological as one might at first imagine, the behaviour described above is what one expects of a normal, well-behaved process: the technical term is '{0} is regular for 0' (Bertoin (1996), IV), and 'regular' is used to describe good, not bad, behaviour. Since Brownian motion has continuous paths, its zero-set $Z$ is closed. Since each zero is, by above, a limit-point of zeros, $Z$ is a perfect set. The zero-set is also uncountable ('big', in one sense), but Lebesgue-null - has Lebesgue measure zero ('small', in another sense). The machinery for measuring the size of small sets such as $Z$ is that of Hausdorff measures. The Hausdorff measure properties of $Z$ have been studied in great detail. The zero-set $Z$ has a fractal structure, which it inherits from that of $W$ under Brownian scaling. The natural machinery for studying the fine detail of the structure of fractals is, as above, that of Hausdorff measures.
Parameters of Brownian Motion - Estimation and Hypothesis Testing. If we form $\mu t + \sigma W_t$ - or replace $N(0, t)$ by $N(\mu t, \sigma^2 t)$ in the definition of Brownian increments - we obtain a Lévy process that has continuous paths and Gaussian increments, called Brownian motion with drift $\mu$ and diffusion coefficient $\sigma$, $BM(\mu, \sigma)$, rather than standard Brownian motion $BM = BM(0, 1)$ as above. By above, the quadratic variation of a segment of $BM(\mu, \sigma)$ path on the time-interval $[0, t]$ is $\sigma^2 t$, a.s. So, if we can observe a Brownian path completely over any time-interval however short, then in principle we can determine the diffusion coefficient $\sigma$ with probability one. In particular, we can distinguish between two different $\sigma$s - $\sigma_1$ and $\sigma_2$, say - with certainty. In technical language: the Wiener measures $\mathbb{P}_1$ and $\mathbb{P}_2$ representing these two Brownian motions with different $\sigma$s on function space are mutually singular. By contrast, if the two $\sigma$s are the same, the two measures are mutually absolutely continuous. We can then test a hypothesis $H_0 : \mu = \mu_0$ against an alternative hypothesis $H_1 : \mu = \mu_1$ by means of the appropriate likelihood ratio (LR). To find the form of the LR, we shall use Girsanov's theorem, which we discuss in §5.7. In practice, of course, we cannot observe a Brownian path exactly over a time-interval: there would be an infinite amount of information, and our ability to sample is finite. So one must use an appropriate discretization - and then we lose the ability to pick up the diffusion coefficient with certainty. Problems of this kind are not only of theoretical interest, but also important in practice. In mathematical finance, when the driving noise is modeled by Brownian motion, the diffusion coefficient is called the volatility, the parameter that describes how sensitive a stock-price is to price-sensitive information (or economic uncertainty, or driving noise). Volatility enters explicitly into the most famous formula of mathematical finance, the Black-Scholes formula. Volatility estimation is of major importance. So too is volatility modeling: alas, in real financial data the assumption of constant volatility is usually untenable for detailed modeling, and one resorts instead to more complicated models, say involving stochastic volatility, §7.3. For recent work here, see Barndorff-Nielsen and Shephard (2001).
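The remark that the diffusion coefficient can be read off from the quadratic variation, even over a very short time-interval, can be sketched as follows. This is an illustration only, assuming an equally spaced discretization of a $BM(\mu, \sigma)$ path; the names `simulate_bm_with_drift` and `realized_variance` are hypothetical, and the estimator is the sum of squared increments divided by the interval length, which approximates $\sigma^2$.

```python
import math
import random


def simulate_bm_with_drift(mu, sigma, t, n, rng):
    """Sample mu*s + sigma*W(s) on an n-step uniform grid of [0, t]."""
    dt = t / n
    path = [0.0]
    for _ in range(n):
        step = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + step)
    return path


def realized_variance(path, t):
    """Discretized quadratic variation divided by t: an estimate of sigma^2."""
    qv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    return qv / t


if __name__ == "__main__":
    rng = random.Random(0)
    mu, sigma = 0.5, 0.3
    # A short interval sampled finely still pins down sigma, but not mu.
    for t in (1.0, 0.01):
        path = simulate_bm_with_drift(mu, sigma, t, 50000, rng)
        print(t, math.sqrt(realized_variance(path, t)), path[-1] / t)
```

With a finite number of sampling points the estimate of $\sigma$ carries a sampling error, which is the loss of certainty described above; the crude drift estimate `path[-1] / t` is far noisier over a short interval.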
5.3.4 Brownian Motion in Stochastic Modeling
To begin at the beginning: Brownian motion is named after Robert Brown (1773-1858), the Scottish botanist who in 1828 observed the irregular and haphazard - apparently random - motion of pollen particles suspended in water. Similar phenomena are observed in gases - witness the familiar sight of dust particles dancing in sunbeams. During the 19th century, it became suspected that the explanation was that the particles were being bombarded by the molecules in the surrounding medium - water or air. Note that this picture requires three different scales: microscopic (water or air molecules), mesoscopic (pollen or dust particles) and macroscopic (you, the observer). These ideas entered the kinetic theory of gases, and statistical mechanics,
through the pioneering work of Maxwell, Gibbs and Boltzmann. However, some scientists still doubted the existence of atoms and molecules (not then observable directly). Enter the birth of the quantum age in 1900 with the quantum hypothesis of Max Planck (1858-1947). Louis Bachelier (1870-1946) introduced Brownian motion into the field of economics and finance in his thesis Théorie de la spéculation of 1900. His work lay dormant until much later; we will pick up its influence on Itô, Samuelson, Merton and others below. Albert Einstein (1879-1955), in his work of 1905, attacked the problem of demonstrating the existence of molecules, and for good measure estimating Avogadro's number (c. $6.02 \times 10^{23}$) experimentally. Einstein realized that what was informative was the mean square displacement of the Brownian particle - its diffusion coefficient, in our terms. This is proportional to time, and the constant $D$ of proportionality, $\mathrm{Var}\,W_t = Dt$, is informative about Avogadro's number (which, roughly, gives the scale factor in going from the microscopic to the macroscopic scale). This Einstein relation is the prototype of a class of results now known in statistical mechanics as fluctuation-dissipation theorems. All this was done without any proper mathematical underpinning. This was provided by Wiener in 1923, as mentioned earlier. Quantum mechanics emerged in 1925-28 with the work of Heisenberg, Schrödinger and Dirac, and with the 'Copenhagen interpretation' of Bohr, Born and others, it became clear that the quantum picture is both inescapable at the subatomic level and intrinsically probabilistic. The work of Richard P. Feynman (1918-1988) in the late 1940s on quantum electrodynamics (QED), and his approach to quantum mechanics via 'path integrals', introduced Wiener measure squarely into quantum theory. Feynman's work on quantum mechanics was made mathematically rigorous by Mark Kac (1914-1984) (QED is still problematic!); the Feynman-Kac formula (giving a stochastic representation for the solutions of certain PDEs) stems from this. Subsequent developments involve Itô calculus, and we shall consider them in §5.6 below. Suffice it to say here that Itô's work of 1944 picked up where Bachelier left off, and created the machinery needed to use Brownian motion to model stock prices successfully (note: stock prices are nonnegative - positive, until the firm goes bankrupt - while Brownian motion changes sign, indeed has lots of sign changes, as we saw above when discussing its zero-set $Z$). The economist Paul Samuelson in 1965 advocated the Itô model - geometric Brownian motion - for financial modelling. Then in 1973 Black and Scholes gave their famous formula, and the same year Merton derived it by Itô calculus. Today Itô calculus is a fundamental tool in stochastic modeling generally, and the modelling of financial markets in particular. In sum: wherever we look - statistical mechanics, quantum theory, economics, finance - we see a random world, in which much that we observe is
driven by random noise, or random fluctuations. Brownian motion gives us an invaluable model for describing these, in a wide variety of settings. This is statistically natural. The ubiquitous nature of Brownian motion is the dynamic counterpart of the ubiquitous nature of the normal distribution. This rests ultimately on the Central Limit Theorem (CLT) - known to physicists as the Law of Errors - and is, fundamentally, why statistics works.

5.4 Point Processes
Suppose that one is studying earthquakes, or volcanic eruptions. The events of interest are sudden isolated shocks, which occur at random instants, the history of which unfolds with time. Such situations occur in financial settings also: at the macro-economic level, the events might be stock-market crashes, devaluations etc. At the micro-economic level, they might be individual transactions. In other settings, the events might be the occurrence of telephone calls, insurance claims, accidents or admissions to hospital etc. The mathematical framework needed to handle such situations is that of point processes. A point process is a stochastic process whose realizations are, not paths as above, but counting measures: random measures $\mu$ whose value on each interval $I$ (or Borel set, more generally) is a non-negative integer $\mu(I)$. Often, each point may come labeled with some quantity (the size of the transaction, or of the earthquake on the Richter scale, for instance), giving what is called a marked point process. We turn below to the simplest and most fundamental point process, the Poisson process, and the simplest way to build it. Stochastic processes with stationary independent increments are called Lévy processes (after the great French probabilist Paul Lévy (1886-1971)); see §5.5 below, and for a modern textbook reference, see Bertoin (1996). The two most basic prototypes of Lévy processes are Poisson processes and Brownian motion (§5.3). We include below a number of results without proof. For proofs and background, we refer to any good book on stochastic processes, e.g. Durrett (1999).
5.4.1 Exponential Distribution

A random variable $T$ is said to have an exponential distribution with rate $\lambda$, or $T \sim \text{exponential}(\lambda)$, if $\mathbb{P}(T \le t) = 1 - e^{-\lambda t}$ for all $t \ge 0$. Recall $\mathbb{E}(T) = 1/\lambda$ and $\mathrm{Var}(T) = 1/\lambda^2$. Further important properties are

Proposition 5.4.1. (i) Exponentially distributed random variables possess the 'lack of memory' property:
$$\mathbb{P}(T > s + t \mid T > t) = \mathbb{P}(T > s).$$
(ii) Let $T_1, T_2, \ldots, T_n$ be independent exponentially distributed random variables with parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$ resp. Then $\min\{T_1, T_2, \ldots, T_n\}$ is exponentially distributed with rate $\lambda_1 + \lambda_2 + \ldots + \lambda_n$.
(iii) Let $T_1, T_2, \ldots, T_n$ be independent exponentially distributed random variables with parameter $\lambda$. Then $G_n = T_1 + T_2 + \ldots + T_n$ has a Gamma$(n, \lambda)$ distribution. That is, its density is
$$f_{G_n}(t) = \lambda e^{-\lambda t}\,\frac{(\lambda t)^{n-1}}{(n-1)!} \quad \text{for } t \ge 0.$$
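Parts (i) and (ii) of the proposition lend themselves to a quick Monte Carlo check. The sketch below is illustrative only; it relies on the standard library call `random.Random.expovariate(rate)`, and the helper names `conditional_tail` and `mean_of_minimum` are hypothetical.

```python
import math
import random


def conditional_tail(rate, s, t, n, rng):
    """Estimate P(T > s + t | T > t) for T ~ exponential(rate)."""
    survivors = 0
    exceed = 0
    for _ in range(n):
        x = rng.expovariate(rate)
        if x > t:
            survivors += 1
            if x > s + t:
                exceed += 1
    return exceed / survivors


def mean_of_minimum(rates, n, rng):
    """Estimate E[min(T_1, ..., T_k)] for independent exponentials with the given rates."""
    total = 0.0
    for _ in range(n):
        total += min(rng.expovariate(r) for r in rates)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    rate, s, t = 2.0, 0.3, 0.5
    # Lack of memory: the conditional tail should match P(T > s) = exp(-rate * s).
    print(conditional_tail(rate, s, t, 100000, rng), math.exp(-rate * s))
    rates = [1.0, 2.0, 3.0]
    # The minimum is exponential with rate sum(rates), so its mean is 1 / sum(rates).
    print(mean_of_minimum(rates, 50000, rng), 1.0 / sum(rates))
```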
5.4.2 The Poisson Process

Definition 5.4.1. Let $t_1, t_2, \ldots$ be independent exponential($\lambda$) random variables. Let $T_n = t_1 + \ldots + t_n$ for $n \ge 1$, $T_0 = 0$, and define $N(s) = \max\{n : T_n \le s\}$.

Interpretation: Think of $t_i$ as the time between arrivals of events, then $T_n$ is the arrival time of the $n$th event and $N(s)$ the number of arrivals by time $s$.

Lemma 5.4.1. $N(s)$ has a Poisson distribution with mean $\lambda s$.
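The construction in the definition translates directly into a simulation. The following is a minimal sketch, with the hypothetical function name `poisson_count`: it adds up exponential inter-arrival times until they exceed $s$ and returns the number of arrivals. Since a Poisson law has equal mean and variance, both sample statistics should be close to $\lambda s$.

```python
import random


def poisson_count(rate, s, rng):
    """N(s) = max{n : T_n <= s}, where T_n is the sum of n exponential(rate) gaps."""
    n = 0
    arrival = rng.expovariate(rate)  # T_1
    while arrival <= s:
        n += 1
        arrival += rng.expovariate(rate)  # next arrival time T_{n+1}
    return n


if __name__ == "__main__":
    rng = random.Random(0)
    rate, s = 3.0, 2.0
    counts = [poisson_count(rate, s, rng) for _ in range(20000)]
    mean = sum(counts) / len(counts)
    var = sum((k - mean) ** 2 for k in counts) / len(counts)
    # Both should be close to rate * s = 6.
    print(mean, var, rate * s)
```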
The Poisson process can also be characterised via

Theorem 5.4.1. If $\{N(s), s \ge 0\}$ is a Poisson process, then
(i) $N(0) = 0$,
(ii) $N(t + s) - N(s) = \text{Poisson}(\lambda t)$, and
(iii) $N(t)$ has independent increments.
Conversely, if (i), (ii) and (iii) hold, then $\{N(s), s \ge 0\}$ is a Poisson process.

The above characterization can be used to extend the definition of the Poisson process to include time-dependent intensities.

Definition 5.4.2. We say that $\{N(s), s \ge 0\}$ is a Poisson process with rate $\lambda(r)$ if
(i) $N(0) = 0$,
(ii) $N(t + s) - N(s)$ is Poisson with mean $\int_s^{t+s} \lambda(r)\,dr$, and
(iii) $N(t)$ has independent increments.

5.4.3 Compound Poisson Processes
We now associate i.i.d. random variables $Y_i$ with each arrival and consider
$$S(t) = Y_1 + \ldots + Y_{N(t)}, \qquad S(t) = 0 \text{ if } N(t) = 0.$$
Theorem 5.4.2. Let $(Y_i)$ be i.i.d. and $N$ be an independent nonnegative integer random variable, and $S$ as above.
(i) If $\mathbb{E}(N) < \infty$, then $\mathbb{E}(S) = \mathbb{E}(N)\mathbb{E}(Y_1)$.
(ii) If $\mathbb{E}(N^2) < \infty$, then $\mathrm{Var}(S) = \mathbb{E}(N)\mathrm{Var}(Y_1) + \mathrm{Var}(N)(\mathbb{E}(Y_1))^2$.
(iii) If $N = N(t)$ is Poisson($\lambda t$), then $\mathrm{Var}(S) = t\lambda\,\mathbb{E}(Y_1^2)$.
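The moment formulas can be checked by simulation. In the Poisson case $\mathbb{E}(N) = \mathrm{Var}(N) = \lambda t$, so part (ii) gives $\mathrm{Var}(S) = \lambda t\,(\mathrm{Var}(Y_1) + (\mathbb{E}(Y_1))^2) = \lambda t\,\mathbb{E}(Y_1^2)$. The sketch below is illustrative only and assumes exponential claim sizes with mean $m$, for which $\mathbb{E}(Y_1^2) = 2m^2$; the function name `compound_poisson` and the `draw_claim` callback are hypothetical.

```python
import random


def compound_poisson(rate, t, draw_claim, rng):
    """Sample S(t) = Y_1 + ... + Y_{N(t)}, with N a Poisson process of the given rate."""
    total = 0.0
    arrival = rng.expovariate(rate)
    while arrival <= t:
        total += draw_claim(rng)  # one i.i.d. mark per arrival
        arrival += rng.expovariate(rate)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    rate, t, m = 2.0, 3.0, 0.5
    samples = [
        compound_poisson(rate, t, lambda r: r.expovariate(1.0 / m), rng)
        for _ in range(20000)
    ]
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    # Theory: E S = rate*t*m and Var S = rate*t*E(Y^2) = rate*t*2*m^2.
    print(mean, rate * t * m)
    print(var, rate * t * 2.0 * m * m)
```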
A typical application in the insurance context is a Poisson model of claim arrival with random claim sizes. Again we are interested in bounds on the ruin probability. Let $(X(t))$ model the capital of an insurance company. The initial capital is $X_0 = u > 0$, insurance payments arrive continuously at a constant rate $c > 0$ and claims are received at random times $t_1, t_2, \ldots$, where the amounts paid out at these times are described by nonnegative random variables $Y_1, Y_2, \ldots$. Assuming (i) arrivals according to a Poisson process, (ii) i.i.d. claim sizes $Y_i$ with $F(x) = \mathbb{P}(Y_1 \le x)$, $F(0) = 0$, $\mu = \int_0^\infty x\,dF(x) < \infty$, and (iii) independence of arrival and claim size process, we use as a model
$$X(t) = u + ct - S(t). \qquad (5.1)$$
Using the above we see that a natural requirement is $c > \lambda\mu$. We are again interested in the probability of ruin. Write
$$\tau = \inf\{t \ge 0 : X(t) \le 0\},$$
then the probability of ruin is $\mathbb{P}(\tau < \infty)$ and the probability of ruin before time $t$ is $\mathbb{P}(\tau \le t)$.

Theorem 5.4.3. Let $R$ be the (unique) root of the equation
$$\frac{\lambda}{c}\int_0^\infty e^{rx}(1 - F(x))\,dx = 1.$$
Then for any $t \ge 0$
$$\mathbb{P}(\tau \le t) \le e^{-Ru},$$
and thus $\mathbb{P}(\tau < \infty) \le e^{-Ru}$.
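To make the bound concrete, consider the special case of exponential claim sizes with mean $m$ (an assumption made here for illustration, not in the text). Then $1 - F(x) = e^{-x/m}$, the integral equals $1/(1/m - r)$ for $r < 1/m$, and the root is $R = 1/m - \lambda/c$. The sketch below finds $R$ by bisection and compares a simulated finite-horizon ruin frequency with $e^{-Ru}$; the function names are hypothetical.

```python
import math
import random


def lundberg_lhs(r, rate, premium, mean_claim):
    """(rate/premium) * int_0^inf e^{r x} e^{-x/mean_claim} dx, valid for r < 1/mean_claim."""
    return (rate / premium) / (1.0 / mean_claim - r)


def adjustment_coefficient(rate, premium, mean_claim, tol=1e-12):
    """Solve lundberg_lhs(r) = 1 on (0, 1/mean_claim) by bisection."""
    if premium <= rate * mean_claim:
        raise ValueError("need premium rate c > lambda * mu")
    lo, hi = 0.0, 1.0 / mean_claim
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The left-hand side is increasing in r, below 1 at r = 0.
        if lundberg_lhs(mid, rate, premium, mean_claim) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ruin_before(u, rate, premium, mean_claim, horizon, rng):
    """True if X(t) = u + c t - S(t) reaches (-inf, 0] at or before `horizon`."""
    t, claims = 0.0, 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return False
        claims += rng.expovariate(1.0 / mean_claim)
        # Capital only drops at claim times, so ruin is checked there.
        if u + premium * t - claims <= 0.0:
            return True


if __name__ == "__main__":
    rng = random.Random(0)
    u, rate, premium, m = 2.0, 1.0, 1.5, 1.0
    R = adjustment_coefficient(rate, premium, m)
    n = 2000
    freq = sum(ruin_before(u, rate, premium, m, 50.0, rng) for _ in range(n)) / n
    print(R, 1.0 / m - rate / premium)  # bisection root vs closed form
    print(freq, math.exp(-R * u))       # simulated ruin frequency vs the bound
```

For other claim distributions the closed-form integral in `lundberg_lhs` would have to be replaced, for instance by numerical integration.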
5.4.4 Renewal Processes
Suppose we use components - light-bulbs, say - whose lifetimes $X_1, X_2, \ldots$ are independent, all with law $F$ on $(0, \infty)$. The first component is installed new, used until failure, then replaced, and we continue in this way. Write
$$S_n := \sum_{i=1}^n X_i, \qquad N_t := \max\{k : S_k < t\}.$$
Then $N = (N_t : t \ge 0)$ is called the renewal process generated by $F$; it is a counting process, counting the number of failures seen by time $t$. The law $F$ has the lack-of-memory property iff the components show no aging - that is, if a component still in use behaves as if new. The condition for this is
$$\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t) \quad (s, t > 0),$$
or
$$\mathbb{P}(X > s + t) = \mathbb{P}(X > s)\,\mathbb{P}(X > t).$$
Writing $\bar{F}(x) := 1 - F(x)$ $(x \ge 0)$ for the tail of $F$, this says that
$$\bar{F}(s + t) = \bar{F}(s)\bar{F}(t) \quad (s, t \ge 0).$$
Obvious solutions are
$$\bar{F}(t) = e^{-\lambda t}, \qquad F(t) = 1 - e^{-\lambda t}$$
for some $\lambda > 0$ - the exponential law $E(\lambda)$. Now
$$f(s + t) = f(s)f(t) \quad (s, t \ge 0)$$
is a 'functional equation' - the Cauchy functional equation - and it turns out that these are the only solutions, subject to minimal regularity (such as one-sided boundedness, as here - even on an interval of arbitrarily small length!). For details, see e.g. Bingham, Goldie, and Teugels (1987), §1.1.1. So the exponential laws $E(\lambda)$ are characterized by the lack-of-memory property. Also, the lack-of-memory property corresponds in the renewal context to the Markov property. The renewal process generated by $E(\lambda)$ is called the Poisson (point) process with rate $\lambda$, $Ppp(\lambda)$. So: among renewal processes, the only Markov processes are the Poisson processes. When we meet Lévy processes we shall find also: among renewal processes, the only Lévy processes are the Poisson processes. It is the lack of memory property of the exponential distribution that (since the inter-arrival times of the Poisson process are exponentially distributed) makes the Poisson process the basic model for events occurring 'out of the blue'. For basic background, see e.g. Grimmett and Stirzaker (2001), §6.8. Excellent textbook treatments are Embrechts, Klüppelberg, and Mikosch (1997) (motivated by insurance and finance applications) and Daley and Vere-Jones (1988) (motivated by the geophysical applications mentioned above).
5.5 Lévy Processes
5.5.1 Distributions

The Lévy-Khintchine Formula. The form of the general infinitely-divisible distribution was studied in the 1930s by several people (including Kolmogorov and de Finetti). The final result, due to Lévy and Khintchine, is expressed in CF language - indeed, cannot be expressed otherwise. To describe the CF of the general i.d. law, we need three components:
(i) a real $a$ (called the drift, or deterministic drift),
(ii) a non-negative $\sigma$ (called the diffusion coefficient, or normal component, or Gaussian component),
(iii) a (positive) measure $\mu$ on $\mathbb{R}$ (or $\mathbb{R} \setminus \{0\}$) for which
$$\int_{-\infty}^{\infty} \min(1, |x|^2)\,\mu(dx) < \infty,$$
that is,
$$\int_{|x| < 1} |x|^2\,\mu(dx) < \infty, \qquad \int_{|x| \ge 1} \mu(dx) < \infty,$$
called the Lévy measure.

The result is (recall §2.10)
Theorem 5.5.1 (Lévy-Khintchine Formula). A function $\phi$ is the characteristic function of an infinitely divisible distribution iff it has the form
$$\phi(u) = \exp\{-\Psi(u)\} \quad (u \in \mathbb{R}),$$
where
$$\Psi(u) = iau + \tfrac{1}{2}\sigma^2 u^2 + \int \left(1 - e^{iux} + iux\,1_{(-1,1)}(x)\right)\mu(dx) \qquad (5.2)$$
for some real $a$, $\sigma \ge 0$ and Lévy measure $\mu$.

Examples.
1. Normal $N(m, \sigma^2)$. Here the Lévy measure vanishes, $\mu = 0$, and with the sign convention of (5.2) the drift is $a = -m$.
2. Compound Poisson $CP(\lambda, F)$. Here $\sigma = 0$, $\mu$ has finite total mass (far from true in general!), $\lambda$ say, and $\mu = \lambda F$. Then $\int_{-1}^{1} |x|\,d\mu(x) < \infty$, and with the sign convention of (5.2), $a = -\int_{-1}^{1} x\,\mu(dx)$.
3. Cauchy. See below (under 'Stability').

The Central Limit Problem. Recall the classical Central Limit Theorem (CLT). If $X_1, X_2, \ldots$ are iid with mean $\mu$ and variance $\sigma^2$, $S_n := \sum_{k=1}^n X_k$, then $(S_n - n\mu)/(\sigma\sqrt{n})$ is asymptotically standard normal:
$$\mathbb{P}\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) \to \Phi(x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^x e^{-\frac{1}{2}y^2}\,dy \quad (n \to \infty) \quad \forall x \in \mathbb{R}.$$
Self-decomposability. Recall that if, in the central limit problem of §2.10, we restrict from (two-suffix) triangular arrays $(X_{nk})$ to (one-suffix) sequences $(X_n)$, we come to a subclass of the infinitely divisible laws $I$, called the class of self-decomposable laws $SD$: $SD \subset I$.
Stability. Suppose we now restrict to identical distribution as well as independence in $SD$ above. That is, we seek the class of limit laws of random walks $S_n = \sum_{k=1}^n X_k$ with $(X_n)$ iid - after an affine transformation (centering and scaling) - that is, for all limit laws of $(S_n - a_n)/b_n$. It turns out that the class of limit laws so obtained is the same as the class of laws for which $S_n$ has the same type as $X_1$ - i.e. the same law to within an affine transformation, or a change of location and scale. Thus the type is 'stable' (invariant, unchanged) under addition of independent copies, whence such laws are called stable. They form the class $S$:
$$S \subset SD \subset I.$$
It turns out that this class of stable laws can be described explicitly by parameters - four in all, of which two (location and scale, specifying the law within the type) are of minor importance, leaving two essential parameters, called the index $\alpha$ ($\alpha \in (0, 2]$) and the skewness parameter $\beta$ ($\beta \in [-1, 1]$). To within type, the Lévy exponent is
$$\Psi(u) = |u|^\alpha\left(1 - i\beta\,\mathrm{sgn}(u)\tan\tfrac{1}{2}\pi\alpha\right)$$
for $\alpha \ne 1$ ($0 < \alpha < 1$ or $1 < \alpha \le 2$) and
$$\Psi(u) = |u|\left(1 + i\beta\,\mathrm{sgn}(u)\log|u|\right)$$
if $\alpha = 1$. The Lévy measure is absolutely continuous, with density of the form
$$\mu(dx) = \begin{cases} c_+\,dx/x^{1+\alpha}, & x > 0, \\ c_-\,dx/|x|^{1+\alpha}, & x < 0, \end{cases}$$
with $c_+, c_- \ge 0$ and $\beta = (c_+ - c_-)/(c_+ + c_-)$. For proof, see Gnedenko and Kolmogorov (1954), Feller (1968), XVIII.6, or Breiman (1992), §§9.8-11.

The case $\alpha = 2$ (for which $\beta$ drops out) gives the normal/Gaussian case, already familiar. The case $\alpha = 1$ and $\beta = 0$ gives the (symmetric) Cauchy law above. The case $\alpha = 1$, $\beta \ne 0$ gives the asymmetric Cauchy case, which is awkward, and we shall not pursue it. From the form of the Lévy exponents of the remaining stable CFs (where the argument $u$ appears only in $|u|^\alpha$ and $\mathrm{sgn}(u)$), we see that, if $S_n = X_1 + \ldots + X_n$ with $X_i$ independent copies,
$$S_n/n^{1/\alpha} = X_1 \quad \text{in distribution} \quad (n = 1, 2, \ldots).$$
This is called the scaling property of the stable laws; those (all except the asymmetric Cauchy) that possess it are called strictly stable. The stable densities do not have explicit closed forms in general, only series expansions. The normal and (symmetric) Cauchy densities are known (above), as is one further important special case: Lévy's density. Here $\alpha = 1/2$, $\beta = +1$. One can check that for each $a$,
$$f(x) = \frac{a}{\sqrt{2\pi x^3}}\exp\{-\tfrac{1}{2}a^2/x\} = \frac{a}{x^{3/2}}\,\phi(a/\sqrt{x}) \quad (x > 0),$$
with $\phi$ the standard normal density, has Laplace transform $\exp\{-a\sqrt{2s}\}$ $(s \ge 0)$; see Rogers and Williams (1994), §I.9 for proof. This is the density of the first-passage time of Brownian motion over a level $a > 0$. The other remarkable case is that of $\alpha = 3/2$, $\beta = 0$, studied by the Norwegian physicist J. Holtsmark in 1919 in connection with the gravitational field of stars - this before Lévy's work on stability. The power 3/2 comes from 3 dimensions and the inverse square law of gravity.

5.5.2 Lévy Processes
Suppose we have a process $X = (X_t : t \ge 0)$ that has stationary independent increments. Such a process is called a Lévy process, in honour of its creator, the great French probabilist Paul Lévy (1886-1971). Then for each $n = 1, 2, \ldots$,
$$X_t = X_{t/n} + (X_{2t/n} - X_{t/n}) + \ldots + (X_t - X_{(n-1)t/n})$$
displays $X_t$ as the sum of $n$ independent (by independent increments), identically distributed (by stationary increments) random variables. Consequently, $X_t$ is infinitely divisible, so its CF is given by the Lévy-Khintchine formula (5.2). The prime example is: the Wiener process, or Brownian motion, is a Lévy process.

Poisson Processes. The increment $N_{t+u} - N_u$ $(t, u \ge 0)$ of a Poisson process is the number of failures in $(u, t + u]$ (in the language of renewal theory). By the lack-of-memory property of the exponential, this is independent of the failures in $[0, u]$, so the increments of $N$ are independent. It is also identically distributed to the number of failures in $[0, t]$, so the increments of $N$ are stationary. That is, $N$ has stationary independent increments, so is a Lévy process: Poisson processes are Lévy processes.

We need an important property: two Poisson processes (on the same filtration) are independent iff they never jump together (a.s.). For proof, see e.g. Revuz and Yor (1991), XII.1.
The Poisson count in an interval of length $t$ is Poisson $P(\lambda t)$ (where the rate $\lambda$ is the parameter in the exponential $E(\lambda)$ of the renewal-theory viewpoint), and the Poisson counts of disjoint intervals are independent. This extends from intervals to Borel sets: (i) for a Borel set $B$, the Poisson count in $B$ is Poisson $P(\lambda|B|)$, where $|\cdot|$ denotes Lebesgue measure; (ii) Poisson counts over disjoint Borel sets are independent.

Poisson (Random) Measures. If $\nu$ is a finite measure, call a random measure $\phi$ Poisson with intensity (or characteristic) measure $\nu$ if for each Borel set $B$, $\phi(B)$ has a Poisson distribution with parameter $\nu(B)$, and for disjoint $B_1, \ldots, B_n$, $\phi(B_1), \ldots, \phi(B_n)$ are independent. One can extend to $\sigma$-finite measures $\nu$: if $(E_n)$ are disjoint with union $\mathbb{R}$ and each $\nu(E_n) < \infty$, construct $\phi_n$ from $\nu$ restricted to $E_n$ and write $\phi$ for $\sum \phi_n$.

Poisson Point Processes. With $\nu$ as above a ($\sigma$-finite) measure on $\mathbb{R}$, consider the product measure $\mu = \nu \times dt$ on $\mathbb{R} \times [0, \infty)$, and a Poisson measure $\phi$ on it with intensity $\mu$. Then $\phi$ has the form

$$\phi = \sum_{t \ge 0} \delta_{(e(t), t)},$$

where the sum is countable (for background and details, see Bertoin (1996), §0.5, whose treatment we follow here). Thus $\phi$ is the sum of Dirac measures over 'Poisson points' $e(t)$ occurring at Poisson times $t$. Call $e = (e(t) : t \ge 0)$ a Poisson point process with characteristic measure $\nu$,

$$e = Ppp(\nu).$$

For each Borel set $B$,

$$N(t, B) := \phi(B \times [0, t]) = \mathrm{card}\{s \le t : e(s) \in B\}$$

is the counting process of $B$ (it counts the Poisson points in $B$) and is a Poisson process with rate (parameter) $\nu(B)$. All this reverses: starting with an $e = (e(t) : t \ge 0)$ whose counting processes over Borel sets $B$ are Poisson $P(\nu(B))$, then, as no point can contribute to more than one count over disjoint sets, disjoint counting processes never jump together, so are independent by above, and $\phi := \sum_{t \ge 0} \delta_{(e(t), t)}$ is a Poisson measure with intensity $\mu = \nu \times dt$.
Note. The link between point processes and martingales goes back to S. Watanabe in 1964. The approach via Poisson point processes is due to K. Ito in 1970 (Proc. 6th Berkeley Symp.); see below, and - in the context of excursion theory - Rogers and Williams (2000) , VI §8. For a monograph treatment of Poisson processes, see Kingman (1993) .
5.5.3 Lévy Processes and the Lévy-Khintchine Formula
We can now sketch the close link between the general Lévy process on the one hand and the general infinitely-divisible law given by the Lévy-Khintchine formula (L-K) on the other. We follow Bertoin (1996), §1.1. First, if $X = (X_t)$ is Lévy, the law of each $X_1$ is infinitely divisible, so given by

$$\mathbb{E}\exp\{iuX_1\} = \exp\{-\Psi(u)\} \qquad (u \in \mathbb{R})$$

with $\Psi$ a Lévy exponent as in (5.2). Similarly,

$$\mathbb{E}\exp\{iuX_t\} = \exp\{-t\Psi(u)\} \qquad (u \in \mathbb{R}),$$

for rational $t$ at first and general $t$ by approximation and càdlàg paths. Then $\Psi$ is called the Lévy exponent, or characteristic exponent, of the Lévy process $X$.

Conversely, given a Lévy exponent $\Psi(u)$ as in (5.2), construct a Brownian motion $B$ as in §5.3, and an independent Poisson point process $\Delta = (\Delta_t : t \ge 0)$ with characteristic measure $\mu$, the Lévy measure in (5.2). Then $X^{(1)}(t) := -at + \sigma B_t$ (the drift enters with a minus sign under the sign convention $\mathbb{E}\exp\{iuX_t\} = \exp\{-t\Psi(u)\}$) has CF

$$\mathbb{E}\exp\{iuX^{(1)}(t)\} = \exp\{-t\Psi^{(1)}(u)\} = \exp\{-t(iau + \tfrac12\sigma^2u^2)\},$$

giving the non-integral terms in (5.2). For the 'large' jumps of $\Delta$, write

$$\Delta_t^{(2)} := \begin{cases} \Delta_t & \text{if } |\Delta_t| \ge 1, \\ 0 & \text{else.} \end{cases}$$

Then $\Delta^{(2)}$ is a Poisson point process with characteristic measure $\mu^{(2)}(dx) := 1(|x| \ge 1)\mu(dx)$. Since $\int \min(1, |x|^2)\mu(dx) < \infty$, $\mu^{(2)}$ has finite mass, so $\Delta^{(2)}$, a $Ppp(\mu^{(2)})$, is discrete and its counting process

$$X_t^{(2)} := \sum_{s \le t} \Delta_s^{(2)} \qquad (t \ge 0)$$

is compound Poisson, with Lévy exponent

$$\Psi^{(2)}(u) = \int (1 - e^{iux})\,1(|x| \ge 1)\mu(dx).$$

There remain the 'small jumps',

$$\Delta_t^{(3)} := \begin{cases} \Delta_t & \text{if } |\Delta_t| < 1, \\ 0 & \text{else,} \end{cases}$$

a $Ppp(\mu^{(3)})$, where $\mu^{(3)}(dx) = 1(|x| < 1)\mu(dx)$, and independent of $\Delta^{(2)}$ because $\Delta^{(2)}$, $\Delta^{(3)}$ are Poisson point processes that never jump together. For each $\varepsilon > 0$, the 'compensated sum of jumps'
$$X_t^{(\varepsilon,3)} := \sum_{s \le t} 1(\varepsilon < |\Delta_s| < 1)\Delta_s - t\int x\,1(\varepsilon < |x| < 1)\mu(dx) \qquad (t \ge 0)$$

is a Lévy process with Lévy exponent

$$\Psi^{(\varepsilon,3)}(u) = \int (1 - e^{iux} + iux)\,1(\varepsilon < |x| < 1)\mu(dx).$$

Use of a suitable maximal inequality allows passage to the limit $\varepsilon \downarrow 0$ (going from finite to possibly countably infinite sums of jumps): $X_t^{(\varepsilon,3)} \to X_t^{(3)}$, a Lévy process with Lévy exponent

$$\Psi^{(3)}(u) = \int (1 - e^{iux} + iux)\,1(|x| < 1)\mu(dx),$$

independent of $X^{(2)}$ and with càdlàg paths. Combining:

Theorem 5.5.2. For $a \in \mathbb{R}$, $\sigma \ge 0$, $\int \min(1, |x|^2)\mu(dx) < \infty$ and

$$\Psi(u) = iau + \tfrac12\sigma^2u^2 + \int \left(1 - e^{iux} + iux\,1(|x| < 1)\right)\mu(dx),$$

the construction above yields a Lévy process

$$X = X^{(1)} + X^{(2)} + X^{(3)}$$

with Lévy exponent $\Psi = \Psi^{(1)} + \Psi^{(2)} + \Psi^{(3)}$. Here the $X^{(i)}$ are independent Lévy processes, with Lévy exponents $\Psi^{(i)}$; $X^{(1)}$ is Gaussian; $X^{(2)}$ is a compound Poisson process with jumps of modulus $\ge 1$; $X^{(3)}$ is a compensated sum of jumps of modulus $< 1$. The jump process $\Delta X = (\Delta X_t : t \ge 0)$ is a $Ppp(\mu)$, and similarly $\Delta X^{(i)}$ is a $Ppp(\mu^{(i)})$ for $i = 2, 3$.

Subordinators. We resort to complex numbers in the CF $\phi(u) = \mathbb{E}(e^{iuX})$ because this always exists - for all real $u$ - unlike the ostensibly simpler moment-generating function (MGF) $M(u) := \mathbb{E}(e^{uX})$, which may well diverge for some real $u$. However, if the random variable $X$ is non-negative, then for $s \ge 0$ the Laplace-Stieltjes transform (LST)

$$\psi(s) := \mathbb{E}(e^{-sX}) \le \mathbb{E}1 = 1$$

always exists. For $X \ge 0$ we have both the CF and the LST to hand, but the LST is usually simpler to handle. We can pass from CF to LST formally by taking $u = is$, and this can be justified by analytic continuation.

Some Lévy processes $X$ have increasing (i.e. non-decreasing) sample paths; these are called subordinators (Bertoin (1996), Ch. III). From the construction above, subordinators can have no negative jumps, so $\mu$ has support in $(0, \infty)$ and no mass on $(-\infty, 0)$. Because increasing functions have FV,
one must have paths of (locally) finite variation, the condition for which can be shown to be

$$\int \min(1, |x|)\mu(dx) < \infty.$$

Thus the Lévy exponent must be of the form

$$\Psi(u) = -idu + \int_0^\infty (1 - e^{iux})\mu(dx),$$

with $d \ge 0$. It is more convenient to use the Laplace exponent $\Phi(s) = \Psi(is)$:

$$\mathbb{E}(\exp\{-sX_t\}) = \exp\{-t\Phi(s)\} \quad (s \ge 0), \qquad \Phi(s) = ds + \int_0^\infty (1 - e^{-sx})\mu(dx).$$
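Example. The Stable Subordinator. Here $d = 0$, $\Phi(s) = s^\alpha$ $(0 < \alpha < 1)$,

$$\mu(dx) = \frac{\alpha\,dx}{\Gamma(1 - \alpha)\,x^{1+\alpha}} \qquad (x > 0)$$

(integrating by parts, $\int_0^\infty (1 - e^{-sx})\,\alpha x^{-1-\alpha}dx = s\int_0^\infty e^{-sx}x^{-\alpha}dx = \Gamma(1-\alpha)s^\alpha$). The special case $\alpha = 1/2$ is particularly important: this arises as the first passage time of Brownian motion over positive levels, and gives rise to the Lévy density above.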
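The Laplace transform $\exp\{-a\sqrt{2s}\}$ quoted for the Lévy density can be checked by simulation. The sketch below is only illustrative: it assumes the standard reflection-principle fact (not derived in the text) that the first-passage time $T_a$ of Brownian motion over the level $a$ has the same law as $a^2/Z^2$ with $Z \sim N(0,1)$. The function names are hypothetical.

```python
import math
import random


def first_passage_laplace_mc(a, s, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[exp(-s * T_a)].

    Assumption (reflection principle, not proved in the text):
    T_a has the same distribution as a^2 / Z^2 with Z ~ N(0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        zz = z * z
        if zz == 0.0:
            # T_a would be infinite here, contributing exp(-inf) = 0.
            continue
        total += math.exp(-s * a * a / zz)
    return total / n_samples


def first_passage_laplace_exact(a, s):
    """Closed form exp(-a * sqrt(2 s)) stated in the text."""
    return math.exp(-a * math.sqrt(2.0 * s))


# Compare the simulated and closed-form values for a = 1, s = 1/2.
print(first_passage_laplace_mc(1.0, 0.5), first_passage_laplace_exact(1.0, 0.5))
```

The two printed numbers should agree up to Monte Carlo error, which shrinks like the inverse square root of the sample size.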
Classification.

IV (Infinite Variation). The sample paths have infinite variation on finite time-intervals, a.s. This occurs iff

$$\sigma > 0 \quad \text{or} \quad \int \min(1, |x|)\mu(dx) = \infty.$$

FV (Finite Variation, on finite time-intervals, a.s.). This is the complementary case:

$$\sigma = 0 \quad \text{and} \quad \int \min(1, |x|)\mu(dx) < \infty.$$

IA (Infinite Activity). Here there are infinitely many jumps in finite time intervals, a.s.: $\mu$ has infinite mass, equivalently $\int_{-1}^{1} \mu(dx) = \infty$:

$$\mu(\mathbb{R}) = \infty.$$

FA (Finite Activity). Here there are only finitely many jumps in finite time, a.s., and we are in the compound Poisson case:

$$\mu(\mathbb{R}) < \infty.$$
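The finite-activity (compound Poisson) case is simple enough to simulate directly. As a rough sketch, take the hypothetical Lévy measure $\mu = \lambda(\tfrac12\delta_{+1} + \tfrac12\delta_{-1})$, so that all jumps have modulus 1 and the Lévy exponent is $\Psi(u) = \int(1 - e^{iux})\mu(dx) = \lambda(1 - \cos u)$. The code compares a Monte Carlo estimate of $\mathbb{E}\exp\{iuX_t\}$ with $\exp\{-t\Psi(u)\}$. The choice of jump law and all function names are assumptions made for illustration.

```python
import math
import random


def compound_poisson_value(t, rate, rng):
    """Sample X_t: the sum of all jumps up to time t, each jump +1 or -1 with probability 1/2."""
    x = 0
    clock = rng.expovariate(rate)        # time of the first jump
    while clock <= t:
        x += 1 if rng.random() < 0.5 else -1
        clock += rng.expovariate(rate)   # exponential waiting time to the next jump
    return x


def empirical_cf(u, t, rate, n_paths=100_000, seed=0):
    """Monte Carlo estimate of E exp(i u X_t); real-valued because the jump law is symmetric."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += math.cos(u * compound_poisson_value(t, rate, rng))
    return total / n_paths


def levy_exponent(u, rate):
    """Psi(u) = rate * (1 - cos u) for the measure rate * (delta_{+1} + delta_{-1}) / 2."""
    return rate * (1.0 - math.cos(u))


u, t, rate = 1.0, 1.0, 2.0
print(empirical_cf(u, t, rate), math.exp(-t * levy_exponent(u, rate)))
```

An infinite-activity process cannot be simulated this way, since it has infinitely many jumps in every time interval; one would have to truncate the small jumps first, as in the construction of $X^{(\varepsilon,3)}$.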
Economic Interpretation. Suppose $X$ is used as a driving noise process in a financial market model for asset prices (example: $X$ = BM in the Black-Scholes-Merton model). If prices move continuously, the Brownian model is appropriate: among Lévy processes, only Brownian motions have continuous paths ($\mu = 0$, so there are no jumps). If prices move by intermittent jumps, a compound Poisson (FA) model is appropriate - but this is more suitable for modelling economic shocks, or the effects of big transactions. For the more common case of the everyday movement of traded stocks under the competitive effects of supply and demand, numerous small trades predominate, economic agents are price takers and not price makers, and a model with infinite activity (IA) is appropriate.

There is a parallel between the financial situation above - the IA case (lots of small traders) as a limiting case of the FA case (a few large ones) - and the applied probability areas of queues and dams. Think of work arriving from the point of view of you, the server. It arrives in large discrete chunks, one with each arriving customer. As long as there is work to be done, you work non-stop to clear it; when no one is there, you are idle. The limiting situation is that of a dam. Raindrops may be discrete, but one can ignore this from the water-engineering viewpoint. When water is present in the dam, it flows out through the outlet pipe at constant rate (unit rate, say); when the dam is empty, nothing is there to flow out.

Lévy Processes as Semi-martingales. The martingale concept, though crucial, is a little too restrictive, and one needs to generalise it. We will be brief here; see §5.10 below for more detail. First, a local martingale $M = (M(t))$ is a process such that, for some sequence of stopping times $S_n \to \infty$, each stopped process $M^{(n)} = (M(t \wedge S_n))$ is a martingale. This localization idea can be applied elsewhere: a process $(A(t))$ (adapted to our filtration, understood) is locally of finite variation if each $(A(t \wedge S_n))$ is of finite variation for some sequence of stopping times $S_n \to \infty$. A semi-martingale is a process $(X(t))$ expressible as

$$X(t) = M(t) + A(t)$$

with $(M(t))$ a local martingale and $(A(t))$ locally of finite variation (the concept is due to Meyer). The Gaussian component $X^{(1)}$ is a martingale; so too is the compensated sum of (small) jumps process $X^{(3)}$, while the sum of large jumps process $X^{(2)}$ is (locally) of finite variation, being compound Poisson. Thus a Lévy process $X = X^{(1)} + X^{(2)} + X^{(3)}$ is a semi-martingale. Indeed, Lévy processes are the prototypes, and motivating examples, of semi-martingales.

The natural domain of stochastic integration is predictable integrands and semi-martingale integrators. Thus, stochastic integration works with a general Lévy process as integrator. Here, however, the theory simplifies considerably. For a monograph treatment of stochastic calculus in this stripped-down setting of Lévy processes, we refer to the book Applebaum (2004).

Note. What Constitutes Pathological Behaviour? Weierstrass, and several other analysts of the 19th century, constructed examples of functions which were continuous but nowhere differentiable. These were long regarded as interesting but pathological. Similarly for the paths of Brownian motion. This used to be regarded as very interesting mathematically, but of limited relevance to modelling the real world. Then - following the work of B. B. Mandelbrot (plus computer graphics, etc.) - fractals attracted huge attention. It was then realized that such properties were typical of fractals, and so - as we now see fractals everywhere (to quote the title of Barnsley's book) - ubiquitous rather than pathological.

The situation with Lévy paths of infinite activity is somewhat analogous. Because one cannot draw them (or even visualise them, perhaps), they used to be regarded as mathematically interesting but clearly idealised so far as modelling of the real world goes. The above economic/financial interpretation has changed all this. 'Lévy finance' is very much alive at the moment (see e.g. Bingham and Kiesel (2001), Bingham and Kiesel (2002) and Barndorff-Nielsen, Mikosch, and Resnick (2000)). Moral: one never quite knows when this sort of thing is going to happen in mathematics!
5.6 Stochastic Integrals; Itô Calculus
5.6.1 Stochastic Integration
Stochastic integration was introduced by K. Itô in 1944, hence its name Itô calculus. It gives a meaning to

$$\int_0^t X\,dY = \int_0^t X(s, \omega)\,dY(s, \omega),$$

for suitable stochastic processes $X$ and $Y$, the integrand and the integrator. We shall confine our attention here mainly to the basic case with integrator Brownian motion: $Y = W$. Much greater generality is possible: for $Y$ a continuous martingale, see Karatzas and Shreve (1991) or Revuz and Yor (1991); for a systematic general treatment, see Meyer (1976) or Protter (2004).

The first thing to note is that stochastic integrals with respect to Brownian motion, if they exist, must be quite different from the measure-theoretic integral of §2.2. For, the Lebesgue-Stieltjes integrals described there have as integrators the difference of two monotone (increasing) functions, which are locally of finite variation. But we know from §5.3.4 that Brownian motion is of infinite (unbounded) variation on every interval. So Lebesgue-Stieltjes and Itô integrals must be fundamentally different.
In view of the above, it is quite surprising that Itô integrals can be defined at all. But if we take for granted Itô's fundamental insight that they can be, it is obvious how to begin and clear enough how to proceed. We begin with the simplest possible integrands $X$, and extend successively in much the same way that we extended the measure-theoretic integral of Chapter 2.

Indicators. If $X(t, \omega) = 1_{[a,b]}(t)$, there is exactly one plausible way to define $\int X\,dW$:

$$\int_0^t X(s, \omega)\,dW(s, \omega) := \begin{cases} 0 & \text{if } t \le a, \\ W(t) - W(a) & \text{if } a \le t \le b, \\ W(b) - W(a) & \text{if } t \ge b. \end{cases}$$

Simple Functions. Extend by linearity: if $X$ is a linear combination of indicators, $X = \sum_{i=1}^n c_i 1_{[a_i, b_i]}$, we should define

$$\int_0^t X\,dW := \sum_{i=1}^n c_i \int_0^t 1_{[a_i, b_i]}\,dW.$$

Already one wonders how to extend this from constants $c_i$ to suitable random variables, and one seeks to simplify the obvious but clumsy three-line expressions above. We begin again, this time calling a stochastic process $X$ simple if there is a partition $0 = t_0 < t_1 < \ldots < t_n = T < \infty$ and uniformly bounded $\mathcal{F}_{t_k}$-measurable random variables $\xi_k$ ($|\xi_k| \le C$ for all $k = 0, \ldots, n$ and $\omega$, for some $C$) and if $X(t, \omega)$ can be written in the form

$$X(t, \omega) = \xi_0(\omega)1_{\{0\}}(t) + \sum_{i=0}^n \xi_i(\omega)1_{(t_i, t_{i+1}]}(t) \qquad (0 \le t \le T,\ \omega \in \Omega).$$

Then, if $t_k \le t < t_{k+1}$,

$$I_t(X) := \int_0^t X\,dW = \sum_{i=0}^{k-1} \xi_i(W(t_{i+1}) - W(t_i)) + \xi_k(W(t) - W(t_k)) = \sum_{i=0}^n \xi_i(W(t \wedge t_{i+1}) - W(t \wedge t_i)).$$

Note that by definition $I_0(X) = 0$ $\mathbb{P}$-a.s. We collect some properties of the stochastic integral defined so far:

Lemma 5.6.1. (i) $I_t(aX + bY) = aI_t(X) + bI_t(Y)$. (ii) $\mathbb{E}(I_t(X) \mid \mathcal{F}_s) = I_s(X)$ $\mathbb{P}$-a.s. $(0 \le s < t < \infty)$, hence $I_t(X)$ is a continuous martingale.
Proof. (i) follows from the fact that linear combinations of simple functions are simple. (ii) There are two cases to consider.

(a) Both $s$ and $t$ belong to the same interval $[t_k, t_{k+1})$. Then

$$I_t(X) = I_s(X) + \xi_k(W(t) - W(s)).$$

But $\xi_k$ is $\mathcal{F}_{t_k}$-measurable, so $\mathcal{F}_s$-measurable ($t_k \le s$), so independent of $W(t) - W(s)$ (independent increments property of $W$). So

$$\mathbb{E}(I_t(X) \mid \mathcal{F}_s) = I_s(X) + \xi_k\,\mathbb{E}(W(t) - W(s) \mid \mathcal{F}_s) = I_s(X).$$

(b) $s < t$ belongs to a different interval from $t$: $s \in [t_m, t_{m+1})$ for some $m < k$. Then

$$I_t(X) = I_s(X) + \xi_m(W(t_{m+1}) - W(s)) + \sum_{i=m+1}^{k-1} \xi_i(W(t_{i+1}) - W(t_i)) + \xi_k(W(t) - W(t_k))$$

(if $k = m + 1$, the sum on the right is empty, and does not appear). Take $\mathbb{E}(\,\cdot \mid \mathcal{F}_s)$ on the right. The first term gives $I_s(X)$. The second gives $\xi_m\,\mathbb{E}[(W(t_{m+1}) - W(s)) \mid \mathcal{F}_s] = \xi_m \cdot 0 = 0$, as $\xi_m$ is $\mathcal{F}_s$-measurable, and similarly so do the third and fourth, completing the proof. □

Note. The stochastic integral for simple integrands is essentially a martingale transform, and the above is essentially the proof of Chapter 3 that martingale transforms are martingales (compare Theorem 3.4.1).

We pause to note a property of square-integrable martingales which we shall need below. Call $M(t) - M(s)$ the increment of $M$ over $(s, t]$. Then for a martingale $M$, the product of the increments over disjoint intervals has zero mean. For, if $s < t \le u < v$,

$$\mathbb{E}[(M(v) - M(u))(M(t) - M(s))] = \mathbb{E}[\mathbb{E}((M(v) - M(u))(M(t) - M(s)) \mid \mathcal{F}_u)] = \mathbb{E}[(M(t) - M(s))\,\mathbb{E}((M(v) - M(u)) \mid \mathcal{F}_u)],$$

taking out what is known (as $s, t \le u$). The inner expectation is zero by the martingale property, so the left-hand side is zero, as required.

We now can add further properties of the stochastic integral for simple functions.
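Lemma 5.6.2. (i) We have the Itô isometry

$$\mathbb{E}\left((I_t(X))^2\right) = \mathbb{E}\left(\int_0^t X(s)^2\,ds\right).$$

(ii) $\mathbb{E}\left((I_t(X) - I_s(X))^2 \mid \mathcal{F}_s\right) = \mathbb{E}\left(\int_s^t X(u)^2\,du \,\Big|\, \mathcal{F}_s\right)$ $\mathbb{P}$-a.s.

Proof. We only show (i), the proof of (ii) is similar. The left-hand side in (i) above is $\mathbb{E}(I_t(X) \cdot I_t(X))$, i.e.

$$\mathbb{E}\left[\left(\sum_{i=0}^{k-1} \xi_i(W(t_{i+1}) - W(t_i)) + \xi_k(W(t) - W(t_k))\right)^2\right].$$

Expanding out the square, the cross-terms have expectation zero by above, leaving

$$\mathbb{E}\left[\sum_{i=0}^{k-1} \xi_i^2(W(t_{i+1}) - W(t_i))^2 + \xi_k^2(W(t) - W(t_k))^2\right].$$

Since $\xi_i$ is $\mathcal{F}_{t_i}$-measurable, each $\xi_i^2$-term is independent of the squared Brownian increment term following it, which has expectation $\mathrm{Var}(W(t_{i+1}) - W(t_i)) = t_{i+1} - t_i$. So we obtain

$$\sum_{i=0}^{k-1} \mathbb{E}(\xi_i^2)(t_{i+1} - t_i) + \mathbb{E}(\xi_k^2)(t - t_k).$$

This is (using Fubini's theorem) $\int_0^t \mathbb{E}(X(u)^2)\,du = \mathbb{E}\left(\int_0^t X(u)^2\,du\right)$, as required. □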
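The isometry can be illustrated numerically for one particular simple integrand. The following sketch assumes a uniform partition of $[0, t]$ and the hypothetical choice $\xi_i = +1$ if $W(t_i) \ge 0$ and $-1$ otherwise, which is bounded and known at the left endpoint $t_i$. Since $X^2 \equiv 1$, the right-hand side of the isometry is simply $t$, and the martingale property gives mean zero.

```python
import random


def simple_integral_sample(n_steps, t, rng):
    """One sample of I_t(X) = sum_i xi_i * (W(t_{i+1}) - W(t_i)) on a uniform partition.

    xi_i = +1 if W(t_i) >= 0, else -1 (an illustrative choice, measurable at time t_i).
    """
    dt = t / n_steps
    w = 0.0
    integral = 0.0
    for _ in range(n_steps):
        xi = 1.0 if w >= 0.0 else -1.0    # decided before the increment is drawn
        dw = rng.gauss(0.0, dt ** 0.5)    # Brownian increment over one cell
        integral += xi * dw
        w += dw
    return integral


def isometry_check(n_paths=20_000, n_steps=50, t=1.0, seed=0):
    """Return Monte Carlo estimates of E[I_t(X)] and E[I_t(X)^2]."""
    rng = random.Random(seed)
    samples = [simple_integral_sample(n_steps, t, rng) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    second_moment = sum(x * x for x in samples) / n_paths
    return mean, second_moment


# The first value should be near 0, the second near t = 1.
print(isometry_check())
```

Evaluating $\xi_i$ before drawing the increment is what makes the integrand adapted; using the sign of $W(t_{i+1})$ instead would break both the martingale property and the isometry.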
The Itô isometry above suggests that $\int_0^t X\,dW$ should be defined only for processes with

$$\int_0^t \mathbb{E}(X(u)^2)\,du < \infty \qquad \text{for all } t.$$

We then can transfer convergence on a suitable $L^2$-space of stochastic processes to a suitable $L^2$-space of martingales. This gives us an $L^2$-theory of stochastic integration (compare the $L^2$-spaces introduced in Chapter 2), for which Hilbert-space methods are available (see Appendix A). For the financial applications we have in mind, there is a fixed time interval - $[0, T]$ say - on which we work (e.g., an option is written at time $t = 0$, with expiry time $t = T$). Then the above becomes

$$\int_0^T \mathbb{E}(X(u)^2)\,du < \infty.$$
By analogy with the integral of Chapter 2, we seek a class of integrands suitably approximable by simple integrands. It turns out that:

(i) The suitable class of integrands is the class of $(\mathcal{B}([0, \infty)) \otimes \mathcal{F})$-measurable, $(\mathcal{F}_t)$-adapted processes $X$ with $\int_0^t \mathbb{E}(X(u)^2)\,du < \infty$ for all $t > 0$.
(ii) Each such $X$ may be approximated by a sequence of simple integrands $X_n$ so that the stochastic integral $I_t(X) = \int_0^t X\,dW$ may be defined as the limit of $I_t(X_n) = \int_0^t X_n\,dW$.
(iii) The properties from both lemmas above remain true for the stochastic integral $\int_0^t X\,dW$ defined by (i) and (ii).

It is not possible to include detailed proofs of these assertions in a book of this type (recall that we did not construct the measure-theoretic integral of Chapter 2 in detail either - and this is harder!). The key technical ingredients needed are Hilbert-space methods (see Appendices A and B) in spaces defined by integrals related to the quadratic variation of the integrator (which is just $t$ in our Brownian motion setting here) and the Kunita-Watanabe inequalities (Rogers and Williams (1994), IV.28 and Meyer (1976), II). For full details, see e.g. Karatzas and Shreve (1991), §§3.1-2 and Revuz and Yor (1991), IV.1-2. For a good treatment at an accessible level, see e.g. Øksendal (1998), §3.1.

Example. We calculate $\int W(u)\,dW(u)$. We start by approximating the integrand by a sequence of simple functions.

Approximation.

$$X_n(u) = \begin{cases} W(0) = 0 & \text{if } 0 \le u \le t/n, \\ W(t/n) & \text{if } t/n < u \le 2t/n, \\ \quad\vdots \\ W\left(\frac{(n-1)t}{n}\right) & \text{if } (n-1)t/n < u \le t. \end{cases}$$

By definition,

$$I_t(X) = \int_0^t W(u)\,dW(u) = \lim_{n \to \infty} \sum_{k=0}^{n-1} W\left(\frac{kt}{n}\right)\left(W\left(\frac{(k+1)t}{n}\right) - W\left(\frac{kt}{n}\right)\right).$$

Rearranging terms, we obtain for the sum on the right

$$\sum_{k=0}^{n-1} W\left(\frac{kt}{n}\right)\left(W\left(\frac{(k+1)t}{n}\right) - W\left(\frac{kt}{n}\right)\right) = \frac12 W(t)^2 - \frac12\left[\sum_{k=0}^{n-1}\left(W\left(\frac{(k+1)t}{n}\right) - W\left(\frac{kt}{n}\right)\right)^2\right].$$

Since the second term approximates the quadratic variation of $W$ and hence tends to $t$ for $n \to \infty$, we find

$$\int_0^t W(u)\,dW(u) = \frac12 W(t)^2 - \frac12 t. \qquad (5.3)$$
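The rearrangement used in this example is an exact algebraic identity for every path, and the convergence of the sum of squared increments to $t$ can be observed numerically. The sketch below simulates Brownian increments on a uniform grid (the grid size, seed and function name are illustrative assumptions) and returns the left-endpoint sum, the terminal value $W(t)$ and the realised quadratic variation.

```python
import math
import random


def ito_sum_w_dw(t=1.0, n=100_000, seed=0):
    """Left-endpoint Riemann sum for the integral of W dW on a uniform grid of n cells.

    Returns (ito_sum, w_t, quad_var), where quad_var is the sum of squared increments.
    """
    rng = random.Random(seed)
    dt = t / n
    w = 0.0
    ito_sum = 0.0
    quad_var = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += w * dw       # integrand evaluated at the left endpoint
        quad_var += dw * dw
        w += dw
    return ito_sum, w, quad_var


ito_sum, w_t, quad_var = ito_sum_w_dw()
# Exact identity on the grid: ito_sum = W(t)^2 / 2 - quad_var / 2.
print(ito_sum, 0.5 * w_t ** 2 - 0.5 * quad_var)
# Limit formula (5.3): replaces quad_var by t = 1.
print(0.5 * w_t ** 2 - 0.5 * 1.0, quad_var)
```

Evaluating the integrand at the right endpoint instead would change the sign of the correction term, which is one way to see that the choice of evaluation point matters for stochastic integrals.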
Note the contrast with ordinary (Newton-Leibniz) calculus! Itô calculus requires the second term on the right - the Itô correction term - which arises from the quadratic variation of $W$.

One can construct a closely analogous theory for stochastic integrals with the Brownian integrator $W$ above replaced by a square-integrable martingale integrator $M$. The properties above hold, with Lemma 5.6.2(i) replaced by

$$\mathbb{E}\left[\left(\int_0^t X(u)\,dM(u)\right)^2\right] = \mathbb{E}\left[\int_0^t X(u)^2\,d\langle M\rangle(u)\right].$$
The natural class of integrands $X$ to use here is the class of predictable processes (a slight extension of left-continuity of sample paths).

Quadratic Variation, Quadratic Covariation. We shall need to extend quadratic variation and quadratic covariation to stochastic integrals. The quadratic variation of $I_t(X) = \int_0^t X(u)\,dW(u)$ is $\int_0^t X(u)^2\,du$. This is proved in the same way as the case $X \equiv 1$, that $W$ has quadratic variation process $t$. More generally, if $Z(t) = \int_0^t X(u)\,dM(u)$ for a continuous martingale integrator $M$, then $\langle Z\rangle(t) = \int_0^t X^2(u)\,d\langle M\rangle(u)$. Similarly (or by polarization), if $Z_i(t) = \int_0^t X_i(u)\,dM_i(u)$ $(i = 1, 2)$, $\langle Z_1, Z_2\rangle(t) = \int_0^t X_1(u)X_2(u)\,d\langle M_1, M_2\rangle(u)$.

Semi-martingales. It turns out that semi-martingales give the natural class of stochastic integrators: one can define the stochastic integral

$$\int_0^t H(u)\,dX(u) = \int_0^t H(u)\,dM(u) + \int_0^t H(u)\,dA(u)$$

for predictable integrands $H$ (as above), and for semi-martingale integrators $X$ - but for no larger class of integrators, if one is to preserve reasonable convergence and approximation properties for the operation of stochastic integration. For details, see e.g. Meyer (1976), Protter (2004), Rogers and Williams (2000), Rogers and Williams (1994).

With integrands as general as above, stochastic integrals are no longer martingales in general, but only local martingales (see e.g. Øksendal (1998), p. 35). For our purposes, one loses little by thinking of bounded integrands (recall that we usually have a finite time horizon $T$, the expiry time, and that bounded processes are locally integrable, but not integrable in general).

Semi-martingales and an associated theory of stochastic integration with good convergence properties provide the tools needed to study convergence of a sequence of financial models (in discrete time, say) to a limiting model (in continuous time, say). See for example Kurtz and Protter (1991), to which we return in §6.4.

5.6.2 Itô's Lemma
Suppose that $b$ is adapted and locally integrable (so $\int_0^t b(s)\,ds$ is defined as an ordinary integral, as in Chapter 2), and $\sigma$ is adapted and measurable with $\int_0^t \mathbb{E}(\sigma(u)^2)\,du < \infty$ for all $t$ (so $\int_0^t \sigma(s)\,dW(s)$ is defined as a stochastic integral, as in §5.6.1). Then

$$X(t) := x_0 + \int_0^t b(s)\,ds + \int_0^t \sigma(s)\,dW(s)$$

defines a stochastic process $X$ with $X(0) = x_0$ (which is often called an Itô process). It is customary, and convenient, to express such an equation symbolically in differential form, in terms of the stochastic differential equation

$$dX(t) = b(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = x_0. \qquad (5.4)$$
Now suppose $f : \mathbb{R} \to \mathbb{R}$ is of class $C^2$. The question arises of giving a meaning to the stochastic differential $df(X(t))$ of the process $f(X(t))$, and finding it. Given a partition $\mathcal{P}$ of $[0, t]$, i.e. $0 = t_0 < t_1 < \ldots < t_n = t$, we can use Taylor's formula to obtain

$$f(X(t)) - f(X(0)) = \sum_{k=0}^{n-1} \left[f(X(t_{k+1})) - f(X(t_k))\right] = \sum_{k=0}^{n-1} f'(X(t_k))\Delta X(t_k) + \frac12 \sum_{k=0}^{n-1} f''(X(t_k) + \theta_k\Delta X(t_k))(\Delta X(t_k))^2$$

with $0 < \theta_k < 1$. We know that $\sum (\Delta X(t_k))^2 \to \langle X\rangle(t)$ in probability (so, taking a subsequence, with probability one), and with a little more effort one can prove

$$\sum_{k=0}^{n-1} f''(X(t_k) + \theta_k\Delta X(t_k))(\Delta X(t_k))^2 \to \int_0^t f''(X(u))\,d\langle X\rangle(u).$$

The first sum is easily recognized as an approximating sequence of a stochastic integral (compare the example in §5.6.1); indeed, we find

$$\sum_{k=0}^{n-1} f'(X(t_k))\Delta X(t_k) \to \int_0^t f'(X(u))\,dX(u).$$
So we have

Theorem 5.6.1 (Basic Itô Formula). If $X$ has stochastic differential given by (5.4) and $f \in C^2$, then $f(X)$ has stochastic differential

$$df(X(t)) = f'(X(t))\,dX(t) + \tfrac12 f''(X(t))\,d\langle X\rangle(t),$$

or writing out the integrals,

$$f(X(t)) = f(x_0) + \int_0^t f'(X(u))\,dX(u) + \frac12 \int_0^t f''(X(u))\,d\langle X\rangle(u).$$
More generally, suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is a function, continuously differentiable once in its first argument (which will denote time), and twice in its second argument (space): $f \in C^{1,2}$. By the Taylor expansion of a smooth function of several variables we get for $t$ close to $t_0$ (we use subscripts to denote partial derivatives: $f_t := \partial f/\partial t$, $f_{tx} := \partial^2 f/\partial t\,\partial x$):

$$f(t, X(t)) = f(t_0, X(t_0)) + (t - t_0)f_t(t_0, X(t_0)) + (X(t) - X(t_0))f_x(t_0, X(t_0))$$
$$\qquad + \tfrac12 (t - t_0)^2 f_{tt}(t_0, X(t_0)) + \tfrac12 (X(t) - X(t_0))^2 f_{xx}(t_0, X(t_0)) + (t - t_0)(X(t) - X(t_0))f_{tx}(t_0, X(t_0)) + \ldots,$$

which may be written symbolically as

$$df = f_t\,dt + f_x\,dX + \tfrac12 f_{tt}(dt)^2 + f_{tx}\,dt\,dX + \tfrac12 f_{xx}(dX)^2 + \ldots.$$

In this, we substitute $dX(t) = b(t)\,dt + \sigma(t)\,dW(t)$ from above, to obtain

$$df = f_t\,dt + f_x(b\,dt + \sigma\,dW) + \tfrac12 f_{tt}(dt)^2 + f_{tx}\,dt\,(b\,dt + \sigma\,dW) + \tfrac12 f_{xx}(b\,dt + \sigma\,dW)^2 + \ldots$$

Now using the formal multiplication rules $dt \cdot dt = 0$, $dt \cdot dW = 0$, $dW \cdot dW = dt$ (which are just shorthand for the corresponding properties of the quadratic variations, compare §5.3.4), we expand

$$(b\,dt + \sigma\,dW)^2 = \sigma^2\,dt + 2b\sigma\,dt\,dW + b^2(dt)^2 = \sigma^2\,dt + \text{higher-order terms}$$

to get finally

$$df = \left(f_t + bf_x + \tfrac12\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW + \text{higher-order terms}.$$
As above, the higher-order terms are irrelevant, and summarizing, we obtain Itô's lemma, the analogue for the Itô or stochastic calculus of the chain rule for ordinary (Newton-Leibniz) calculus:

Theorem 5.6.2 (Itô's Lemma). If $X(t)$ has stochastic differential given by (5.4), then $f = f(t, X(t))$ has stochastic differential

$$df = \left(f_t + bf_x + \tfrac12\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW.$$

That is, writing $f_0$ for $f(0, x_0)$, the initial value of $f$,

$$f = f_0 + \int_0^t \left(f_t + bf_x + \tfrac12\sigma^2 f_{xx}\right)dt + \int_0^t \sigma f_x\,dW.$$
We will make good use of:

Corollary 5.6.1. $\mathbb{E}(f(t, X(t))) = f_0 + \int_0^t \mathbb{E}\left(f_t + bf_x + \tfrac12\sigma^2 f_{xx}\right)dt.$

Proof. $\int_0^t \sigma f_x\,dW$ is a stochastic integral, so a martingale, so its expectation is constant (= 0, as it starts at 0). □

Note. Powerful as it is in the setting above, Itô's lemma really comes into its own in the more general setting of semi-martingales (of which $X$ above is an important example). It says there that if $X$ is a semi-martingale and $f$ is a smooth function as above, then $f(t, X(t))$ is also a semi-martingale. The ordinary differential $dt$ gives rise to the finite-variation part, the stochastic differential gives rise to the martingale part. This closure property under very general non-linear operations is very powerful and important.

Itô Lemma in Higher Dimensions. If $f(t, x_1, \ldots, x_d)$ is $C^1$ in its zeroth (time) argument $t$ and $C^2$ in its remaining $d$ space arguments $x_i$, and $M = (M_1, \ldots, M_d)$ is a continuous vector martingale, then (writing $f_i$, $f_{ij}$ for the first partial derivatives of $f$ with respect to its $i$th argument and the second partial derivatives with respect to the $i$th and $j$th arguments) $f(t, M(t))$ has stochastic differential

$$df(t, M(t)) = f_0(t, M(t))\,dt + \sum_{i=1}^d f_i(t, M(t))\,dM_i(t) + \frac12 \sum_{i,j=1}^d f_{ij}(t, M(t))\,d\langle M_i, M_j\rangle(t).$$
5 . Stochastic Processes in Continuous Time
Application.
To compute ! WdW, use f(x) = x 2 Then .
W(t) 2
=
W(0) 2 +
t
t
J 2W(u)dW(u) � J 2du, o
+
0
which after rearranging is just (5.3) . 5.6.3 Geometric Brownian Motion
Now that we have both Brownian motion $W$ and Itô's Lemma to hand, we can introduce the most important stochastic process for us, a relative of Brownian motion - geometric (or exponential, or economic) Brownian motion.

Suppose we wish to model the time evolution of a stock price $S(t)$ (as we will, in the Black-Scholes theory to follow in Chapter 6). Consider how $S$ will change in some small time-interval from the present time $t$ to a time $t + dt$ in the near future. Writing $dS(t)$ for the change $S(t + dt) - S(t)$ in $S$, the return on $S$ in this interval is $dS(t)/S(t)$. It is economically reasonable to expect this return to decompose into two components, a systematic part and a random part. The systematic part could plausibly be modelled by $\mu\,dt$, where $\mu$ is some parameter representing the mean rate of return of the stock. The random part could plausibly be modelled by $\sigma\,dW(t)$, where $dW(t)$ represents the noise term driving the stock price dynamics, and $\sigma$ is a second parameter describing how much effect this noise has - how much the stock price fluctuates. Thus $\sigma$ governs how volatile the price is, and is called the volatility of the stock. The role of the driving noise term (cf. §5.3) is to represent the random buffeting effect of the multiplicity of factors at work in the economic environment in which the stock price is determined by supply and demand.

Putting this together, we have the stochastic differential equation

$$dS(t) = S(t)(\mu\,dt + \sigma\,dW(t)), \qquad S(0) > 0,$$

due to Itô in 1944. This corrects Bachelier's earlier attempt of 1900 (he did not have the factor $S(t)$ on the right - missing the interpretation in terms of returns, and leading to negative stock prices!). Incidentally, Bachelier's work served as Itô's motivation in introducing Itô calculus. The mathematical importance of Itô's work was recognised early, and led on to the work of Doob (1953), Meyer (1976) and many others (see the memorial volume Ikeda, Watanabe, M., and Kunita (1996) in honour of Itô's eightieth birthday in 1995). The economic importance of geometric Brownian motion was recognized by Paul A. Samuelson in his work from 1965 on (Samuelson (1965)), for which Samuelson received the Nobel Prize in Economics in 1970, and by Robert Merton (see Merton (1990) for a full bibliography), in work for which he was similarly honoured in 1997.
The differential equation above has the unique solution

$$S(t) = S(0)\exp\left\{\left(\mu - \tfrac12\sigma^2\right)t + \sigma W(t)\right\}.$$

For, writing

$$f(t, x) := S(0)\exp\left\{\left(\mu - \tfrac12\sigma^2\right)t + \sigma x\right\},$$

we have $f_t = (\mu - \tfrac12\sigma^2)f$, $f_x = \sigma f$, $f_{xx} = \sigma^2 f$, and with $x = W(t)$, one has $dx = dW(t)$, $(dx)^2 = dt$. Thus Itô's lemma gives

$$df(t, W(t)) = f_t\,dt + f_x\,dW(t) + \tfrac12 f_{xx}(dW(t))^2 = f\left(\left(\mu - \tfrac12\sigma^2\right)dt + \sigma\,dW(t) + \tfrac12\sigma^2\,dt\right) = f(\mu\,dt + \sigma\,dW(t)),$$

so $f(t, W(t))$ is a solution of the stochastic differential equation, and the initial condition $f(0, W(0)) = S(0)$ holds as $W(0) = 0$, giving existence.

For uniqueness, we need the stochastic (or Doléans, or Doléans-Dade) exponential (see §5.10 below), giving $Y = \mathcal{E}(X) = \exp\{X - \tfrac12\langle X\rangle\}$ (with $X$ a continuous semi-martingale) as the unique solution to the stochastic differential equation

$$dY(t) = Y(t-)\,dX(t), \qquad Y(0) = 1$$

(for the general definition and properties see e.g. Jacod and Shiryaev (1987), I.4, Protter (2004), II.8, Revuz and Yor (1991), IV.3, VIII.1, Rogers and Williams (2000), IV.19). (Incidentally, this is one of the few cases where a stochastic differential equation can be solved explicitly. Usually we must be content with an existence and uniqueness statement, and a numerical algorithm for calculating the solution; see §5.7 below.) Thus $S(t)$ above is, up to the factor $S(0)$, the stochastic exponential of $\mu t + \sigma W(t)$, Brownian motion with mean (or drift) $\mu$ and variance (or volatility) $\sigma^2$. In particular,

$$\log S(t) = \log S(0) + \left(\mu - \tfrac12\sigma^2\right)t + \sigma W(t)$$

has a normal distribution. Thus $S(t)$ itself has a lognormal distribution. This geometric Brownian motion model, and the log-normal distribution that it entails, are the basis for the Black-Scholes model for stock-price dynamics in continuous time, which we study in detail in §6.2.
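As a rough numerical illustration of the explicit solution, one can drive both the closed-form expression and a simple Euler discretisation of $dS = S(\mu\,dt + \sigma\,dW)$ with the same simulated Brownian increments and compare the terminal values. The Euler scheme is a technique brought in here purely for comparison, and the parameter values and function name are illustrative assumptions.

```python
import math
import random


def gbm_exact_and_euler(s0=100.0, mu=0.05, sigma=0.2, t=1.0, n=10_000, seed=0):
    """Return (exact, euler) terminal values of geometric Brownian motion on one path.

    exact: S(0) * exp((mu - sigma^2 / 2) * t + sigma * W(t))
    euler: S_{k+1} = S_k + S_k * (mu * dt + sigma * dW_k), using the same increments.
    """
    rng = random.Random(seed)
    dt = t / n
    w = 0.0
    s_euler = s0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        s_euler += s_euler * (mu * dt + sigma * dw)
        w += dw
    s_exact = s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w)
    return s_exact, s_euler


# The two values should be close for a fine time grid.
print(gbm_exact_and_euler())
```

Dropping the $-\tfrac12\sigma^2 t$ correction in the exponent would produce a visible, systematic gap between the two values, which is the Itô correction term at work.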
5.7 Stochastic Calculus for Black-Scholes Models
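In this section we collect the main tools for the analysis of financial markets with uncertainty modelled by Brownian motions.

Consider first independent $N(0,1)$ random variables $Z_1, \ldots, Z_n$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Given a vector $\gamma = (\gamma_1, \ldots, \gamma_n)$, consider a new probability measure $\tilde{\mathbb{P}}$ on $(\Omega, \mathcal{F})$ defined by

$$\tilde{\mathbb{P}}(d\omega) = \exp\Big\{\sum_{i=1}^n \gamma_i Z_i(\omega) - \tfrac12\sum_{i=1}^n \gamma_i^2\Big\}\,\mathbb{P}(d\omega).$$

As $\exp\{\cdot\} > 0$ and integrates to 1, as $\int \exp\{\gamma_i Z_i\}\,d\mathbb{P} = \exp\{\tfrac12\gamma_i^2\}$, this is a probability measure. It is also equivalent to $\mathbb{P}$ (has the same null sets), again as the exponential term is positive. Also

$$\tilde{\mathbb{P}}(Z_i \in dz_i,\ i = 1, \ldots, n) = \exp\Big\{\sum_{i=1}^n \gamma_i z_i - \tfrac12\sum_{i=1}^n \gamma_i^2\Big\}\,\mathbb{P}(Z_i \in dz_i,\ i = 1, \ldots, n) = (2\pi)^{-n/2}\exp\Big\{-\tfrac12\sum_{i=1}^n (z_i - \gamma_i)^2\Big\}\,dz_1 \cdots dz_n.$$

This says that if the $Z_i$ are independent $N(0,1)$ under $\mathbb{P}$, they are independent $N(\gamma_i, 1)$ under $\tilde{\mathbb{P}}$. Thus the effect of the change of measure $\mathbb{P} \to \tilde{\mathbb{P}}$, from the original measure $\mathbb{P}$ to the equivalent measure $\tilde{\mathbb{P}}$, is to change the mean, from $0 = (0, \ldots, 0)$ to $\gamma = (\gamma_1, \ldots, \gamma_n)$.

This result extends to infinitely many dimensions - i.e., from random vectors to stochastic processes, indeed with random rather than deterministic means. Let $W = (W_1, \ldots, W_d)$ be a $d$-dimensional Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with the filtration $\mathbb{F}$ satisfying the usual conditions. Let $(\gamma(t) : 0 \le t \le T)$ be a measurable, adapted $d$-dimensional process with $\int_0^T \gamma_i(t)^2\,dt < \infty$ a.s., $i = 1, \ldots, d$, and define the process $(L(t) : 0 \le t \le T)$ by

$$L(t) = \exp\Big\{-\int_0^t \gamma(s)'\,dW(s) - \tfrac12\int_0^t \|\gamma(s)\|^2\,ds\Big\}. \qquad (5.5)$$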
Then $L$ is continuous, and, being the stochastic exponential of $-\int_0^t \gamma(s)'\,dW(s)$, is a local martingale. Given sufficient integrability on the process $\gamma$, $L$ will in fact be a (continuous) martingale. For this, Novikov's condition suffices:

$$\mathbb{E}\Big(\exp\Big\{\tfrac12\int_0^T \|\gamma(s)\|^2\,ds\Big\}\Big) < \infty. \qquad (5.6)$$
We are now in the position to state a version of Girsanov's theorem, which will be one of our main tools in studying continuous-time financial market models.

Theorem 5.7.1 (Girsanov). Let $\gamma$ be as above and satisfy Novikov's condition; let $L$ be the corresponding continuous martingale. Define the processes $\tilde{W}_i$, $i = 1, \ldots, d$ by

$$\tilde{W}_i(t) := W_i(t) + \int_0^t \gamma_i(s)\,ds, \qquad (0 \le t \le T),\ i = 1, \ldots, d.$$

Then under the equivalent probability measure $\tilde{\mathbb{P}}$ (defined on $(\Omega, \mathcal{F}_T)$) with Radon-Nikodym derivative

$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} = L(T),$$

the process $\tilde{W} = (\tilde{W}_1, \ldots, \tilde{W}_d)$ is $d$-dimensional Brownian motion.
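A Monte Carlo sanity check of the theorem in the simplest case ($d = 1$, constant $\gamma$) may help fix ideas. This is a sketch under stated assumptions, not the book's construction: the function name and parameters are hypothetical. Sampling $W(T)$ under $\mathbb{P}$ and weighting by $L(T) = \exp\{-\gamma W(T) - \tfrac12\gamma^2 T\}$ gives expectations under $\tilde{\mathbb{P}}$, so the weighted moments of $\tilde{W}(T) = W(T) + \gamma T$ should be those of $N(0, T)$.

```python
import math
import random


def girsanov_moments(gamma, t, n_paths=100_000, seed=0):
    """Estimate E_P[L], E_P[L * W~] and E_P[L * W~^2] with W~ = W(t) + gamma * t.

    By Girsanov's theorem these should be close to 1, 0 and t respectively,
    since W~(t) is N(0, t) under the measure with density L(t).
    """
    rng = random.Random(seed)
    sum_l = sum_lw = sum_lw2 = 0.0
    for _ in range(n_paths):
        w = rng.gauss(0.0, math.sqrt(t))                    # W(t) under P
        l = math.exp(-gamma * w - 0.5 * gamma ** 2 * t)     # Radon-Nikodym weight L(t)
        w_tilde = w + gamma * t                             # shifted process at time t
        sum_l += l
        sum_lw += l * w_tilde
        sum_lw2 += l * w_tilde ** 2
    return sum_l / n_paths, sum_lw / n_paths, sum_lw2 / n_paths


if __name__ == "__main__":
    total_mass, mean, second_moment = girsanov_moments(gamma=0.5, t=1.0)
    print("E_P[L(T)]          :", total_mass)      # should be near 1
    print("mean of W~ under P~:", mean)            # should be near 0
    print("E~[W~(T)^2]        :", second_moment)   # should be near T
```

The first output checks that $L(T)$ integrates to one (the martingale property), the other two that the drift $\gamma t$ has been removed without changing the variance.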
In particular, for $\gamma(t)$ constant ($= \gamma$), change of measure by introducing the Radon-Nikodym derivative $\exp\{-\gamma W(t) - \tfrac12\gamma^2 t\}$ corresponds to a change of drift from $c$ to $c - \gamma$. If $\mathbb{F} = (\mathcal{F}_t)$ is the Brownian filtration (basically $\mathcal{F}_t = \sigma(W(s), 0 \le s \le t)$, slightly enlarged to satisfy the usual conditions), any pair of equivalent probability measures $\mathbb{Q} \sim \mathbb{P}$ on $\mathcal{F} = \mathcal{F}_T$ is a Girsanov pair, i.e.

$$\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t)$$

with $L$ defined as above. Girsanov's theorem (or the Cameron-Martin-Girsanov theorem) is formulated in varying degrees of generality, discussed and proved, e.g. in Karatzas and Shreve (1991), §3.5, Protter (2004), III.6, Revuz and Yor (1991), VIII, Dothan (1990), §5.4 (discrete time), §11.6 (continuous time).

Our main application of the Girsanov theorem will be the change of measure in the Black-Scholes model of a financial market (§6.2) to obtain the risk-neutral martingale measure, which will, as in the discrete-time case, guarantee an arbitrage-free market model and may be used for pricing contingent claims (see §6.1). To discuss questions of attainability and market completeness we will need:

Theorem 5.7.2 (Representation Theorem). Let $M = (M(t))_{t \ge 0}$ be a RCLL local martingale with respect to the Brownian filtration $(\mathcal{F}_t)$. Then

$$M(t) = M(0) + \int_0^t H(s)\,dW(s), \qquad t \ge 0,$$

with $H = (H(t))_{t \ge 0}$ a progressively measurable process such that $\int_0^t H(s)^2\,ds < \infty$, $t \ge 0$, with probability one. That is, all Brownian local martingales may be represented as stochastic integrals with respect to Brownian motion (and as such are continuous).

The following corollary has important economic consequences.

Corollary 5.7.1. Let $G$ be an $\mathcal{F}_T$-measurable random variable, $0 < T < \infty$, with $\mathbb{E}(|G|) < \infty$; then there exists a process $H$ as in Theorem 5.7.2 such that

$$G = \mathbb{E}G + \int_0^T H(s)\,dW(s).$$
We refer to, e.g., Karatzas and Shreve (1991), §3.4 and Revuz and Yor (1991), V.3 for multidimensional versions of the result and proof.

As mentioned above, the economic relevance of the representation theorem is that it shows that the Black-Scholes model is complete - that is, that every contingent claim (modelled as an appropriate random variable) can be replicated by a dynamic trading strategy. Mathematically, the result is purely a consequence of properties of the Brownian filtration. The desirable mathematical properties of Brownian motion are thus seen to have hidden within them desirable economic and financial consequences of real practical value.

The next result, which is an example of the rich interplay between probability theory and analysis, links stochastic differential equations (SDEs) with partial differential equations (PDEs). Such links between probability and stochastic processes on the one hand and analysis and partial differential equations on the other are very important, and have been extensively studied (for background and details, see e.g. the excellent treatments Bass (1995) and Durrett (1996b)).

Suppose we consider a stochastic differential equation,

$$dX(t) = \mu(t, X(t))\,dt + \sigma(t, X(t))\,dW(t) \qquad (t_0 \le t \le T),$$

with initial condition $X(t_0) = x$. For suitably well-behaved functions $\mu$, $\sigma$, this stochastic differential equation will have a unique solution $X = (X(t) : t_0 \le t \le T)$ (compare §5.8). Taking existence of a unique solution for granted for the moment, consider a smooth function $F(t, X(t))$ of it. By Itô's lemma,

$$dF = F_t\,dt + F_x\,dX + \tfrac12 F_{xx}\,d\langle X\rangle,$$

and as $d\langle X\rangle = d\langle \mu\,dt + \sigma\,dW\rangle = \sigma^2\,d\langle W\rangle = \sigma^2\,dt$, this is

$$dF = F_t\,dt + F_x(\mu\,dt + \sigma\,dW) + \tfrac12\sigma^2 F_{xx}\,dt = \left(F_t + \mu F_x + \tfrac12\sigma^2 F_{xx}\right)dt + \sigma F_x\,dW.$$

Now suppose that $F$ satisfies the partial differential equation

$$F_t + \mu F_x + \tfrac12\sigma^2 F_{xx} = 0$$

with boundary condition $F(T, x) = h(x)$. Then the above expression for $dF$ gives

$$dF = \sigma F_x\,dW,$$

which can be written in stochastic-integral rather than stochastic-differential form as

$$F(s, X(s)) = F(t_0, X(t_0)) + \int_{t_0}^s \sigma(u, X(u))\,F_x(u, X(u))\,dW(u).$$
Under suitable conditions, the stochastic integral on the right is a martingale, so has constant expectation, which must be 0 as it starts at 0. Then

$$F(t_0, x) = \mathbb{E}\left(F(s, X(s)) \mid X(t_0) = x\right).$$

For simplicity, we restrict to the time-homogeneous case: $\mu(t, x) = \mu(x)$ and $\sigma(t, x) = \sigma(x)$, and assume $\mu$ and $\sigma$ Lipschitz, and $h \in C_0^2$ ($h$ twice continuously differentiable, with compact support). Then (see e.g. Øksendal (1998), §§7.1, 8.2) the stochastic integral is a martingale, and replacing $t_0$, $s$ by $t$, $T$ we get the stochastic representation

$$F(t, x) = \mathbb{E}\left(F(T, X(T)) \mid X(t) = x\right)$$

for the solution $F$. Conversely, any solution $F$ which is in $C^{1,2}$ (has continuous derivatives of order one in $t$ and two in $x$) and is bounded on compact $t$-sets arises in this way (see also Karatzas and Shreve (1991), §5.7B for alternative conditions). This gives:

Theorem 5.7.3 (Feynman-Kac Formula). For $\mu(x)$, $\sigma(x)$ Lipschitz, the solution $F = F(t, x)$ to the partial differential equation

$$F_t + \mu F_x + \tfrac12\sigma^2 F_{xx} = 0 \qquad (5.7)$$

with final condition $F(T, x) = h(x)$ has the stochastic representation

$$F(t, x) = \mathbb{E}\left[h(X(T)) \mid X(t) = x\right],$$

where $X$ satisfies the stochastic differential equation

$$dX(s) = \mu(X(s))\,ds + \sigma(X(s))\,dW(s) \qquad (t \le s \le T)$$

with initial condition $X(t) = x$.
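The stochastic representation suggests a simple Monte Carlo method for the PDE: simulate $X$ from $x$ at time $t$ and average $h(X(T))$. The sketch below is illustrative only; the function name, the Euler time-stepping and the test case are assumptions, not taken from the text. The test case $\mu(x) = -x$, $\sigma$ constant, $h(x) = x$ is chosen because $F(t, x) = x\,e^{-(T-t)}$ solves (5.7) in closed form (here $F_t = x e^{-(T-t)}$, $\mu F_x = -x e^{-(T-t)}$, $F_{xx} = 0$); this $h$ is not compactly supported, so the example illustrates the formula rather than the exact hypotheses above.

```python
import math
import random


def feynman_kac_mc(mu, sigma, h, t, x, T, n_paths=4000, n_steps=100, seed=0):
    """Estimate F(t, x) = E[h(X(T)) | X(t) = x] by Euler time-stepping and averaging."""
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        xs = x
        for _ in range(n_steps):
            # One Euler step of dX = mu(X) ds + sigma(X) dW
            xs += mu(xs) * dt + sigma(xs) * sqrt_dt * rng.gauss(0.0, 1.0)
        total += h(xs)
    return total / n_paths


if __name__ == "__main__":
    estimate = feynman_kac_mc(
        mu=lambda y: -y,          # linear mean-reverting drift (assumed test case)
        sigma=lambda y: 0.5,      # constant dispersion
        h=lambda y: y,            # final condition F(T, x) = x
        t=0.0, x=2.0, T=1.0,
    )
    print("Monte Carlo estimate:", estimate)
    print("closed-form solution:", 2.0 * math.exp(-1.0))
```

The estimate carries both a statistical error (of order $n_{\text{paths}}^{-1/2}$) and a discretisation bias from the Euler steps.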
The Feynman-Kac formula gives a stochastic representation to solutions of partial differential equations. We shall return to the Feynman-Kac formula in §6.2 below in connection with the Black-Scholes partial differential equation.

Application. One classical application of the Feynman-Kac formula is to Kac's proof of Lévy's arc-sine law for Brownian motion. Let $\tau_t$ be the amount of time in $[0, t]$ for which Brownian motion takes positive values. Then the proportion $\tau_t/t$ has the arc-sine law - the law on $[0, 1]$ with density

$$\frac{1}{\pi\sqrt{x(1 - x)}} \qquad (x \in [0, 1]).$$

For proof, see e.g. Steele (2001), §15.3.

5.8 Stochastic Differential Equations
The most successful single branch of mathematical or scientific knowledge we have is the calculus, dating from Newton and Leibniz in the 17th century, and the resulting theories of differential equations, ordinary and partial (ODEs and PDEs). With any differential equation, the two most basic questions are those of existence and uniqueness of solutions - and to formulate such questions precisely, one has to specify what one means by a solution. For example, for PDEs, 19th century work required solutions in terms of ordinary functions - the only concept available at that time. More modern work has available the concept of generalised functions or distributions (in the sense of Laurent Schwartz (1915-2002)). It has been found that a much cleaner and more coherent theory of PDEs can be obtained if one is willing to admit such generalised functions as solutions. Furthermore, to obtain existence and uniqueness results, one has to impose reasonable regularity conditions on the coefficients occurring in the differential equation.

Perhaps the most basic general existence theorem here is Picard's theorem, for an ordinary differential equation (non-linear, in general)

$$dx(t) = b(t, x(t))\,dt, \qquad x(0) = x_0,$$

or to use its alternative and equivalent expression as an integral equation,

$$x(t) = x_0 + \int_0^t b(s, x(s))\,ds.$$

If one assumes the Lipschitz condition

$$|b(t, x) - b(t, y)| \le K|x - y|$$

for some constant $K$ and all $t \in [0, T]$ for some $T > 0$, and boundedness of $b$ on compact sets, one can construct a unique solution $x$ by the Picard iteration

$$x^{(0)}(t) := x_0, \qquad x^{(n+1)}(t) := x_0 + \int_0^t b(s, x^{(n)}(s))\,ds.$$
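The Picard iteration is constructive and can be run numerically. A minimal sketch follows, with the integral approximated by the trapezoidal rule on a fixed grid; the function name, grid size and iteration count are illustrative assumptions. For $b(t, x) = x$, $x_0 = 1$ the exact iterates are the Taylor partial sums of $e^t$, so the scheme should approach $e$ at $t = 1$.

```python
import math


def picard_iterates(b, x0, T, n_grid=1000, n_iter=20):
    """Run n_iter Picard iterations for x' = b(t, x), x(0) = x0 on a uniform grid of [0, T].

    Returns the grid and the final iterate, with the integral in each iteration
    approximated by the trapezoidal rule.
    """
    dt = T / n_grid
    ts = [k * dt for k in range(n_grid + 1)]
    x = [x0] * (n_grid + 1)  # x^(0)(t) = x0 for all t
    for _ in range(n_iter):
        new_x = [x0]
        integral = 0.0
        for k in range(n_grid):
            # trapezoidal contribution of b(s, x^(n)(s)) over [t_k, t_{k+1}]
            integral += 0.5 * dt * (b(ts[k], x[k]) + b(ts[k + 1], x[k + 1]))
            new_x.append(x0 + integral)
        x = new_x  # x^(n+1)
    return ts, x


if __name__ == "__main__":
    ts, x = picard_iterates(lambda t, y: y, x0=1.0, T=1.0)
    print("Picard approximation at t = 1:", x[-1])
    print("exact value e                :", math.e)
```

The same fixed-point structure underlies the stochastic version of the iteration, with an additional stochastic integral in each step.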
See e.g. Hale (1969), Theorem 1.5.3, or any textbook on analysis or differential equations. (The result may also be obtained as an application of Banach's contraction-mapping principle in functional analysis.)

Naturally, stochastic calculus and stochastic differential equations contain all the complications of their non-stochastic counterparts, and more besides. Thus by analogy with PDEs alone, we must expect study of SDEs to be complicated by the presence of more than one concept of a solution. The first solution concept that comes to mind is that obtained by sticking to the non-stochastic theory, and working pathwise: take each sample path of a stochastic process as a function, and work with that. This gives the concept of a strong solution of a stochastic differential equation. Here we are given the probabilistic set-up - the filtered probability space in which our SDE arises - and work within it. The most basic results, like their non-stochastic counterparts, assume regularity of coefficients (e.g., Lipschitz conditions), and construct a unique solution by a stochastic version of Picard iteration. The following such result is proved in Karatzas and Shreve (1991), §5.2. Consider the stochastic differential equation

$$dX(t) = b(t, X(t))\,dt + \sigma(t, X(t))\,dW(t), \qquad X(0) = \xi,$$

where $b(t, x)$ is a $d$-vector of drifts, $\sigma(t, x)$ is a $d \times r$ dispersion matrix, $W(t)$ is an $r$-dimensional Brownian motion, $\xi$ is a square-integrable random $d$-vector independent of $W$, and we work on a filtered probability space satisfying the usual conditions on which $W$ and $\xi$ are both defined. Suppose that the coefficients $b$, $\sigma$ satisfy the following global Lipschitz and growth conditions:

$$\|b(t, x) - b(t, y)\| + \|\sigma(t, x) - \sigma(t, y)\| \le K\|x - y\|,$$

$$\|b(t, x)\|^2 + \|\sigma(t, x)\|^2 \le K^2(1 + \|x\|^2),$$

for all $t \ge 0$, $x, y \in \mathbb{R}^d$, for some constant $K > 0$.

Theorem 5.8.1. Under the above Lipschitz and growth conditions,

(i) the Picard iteration $X^{(0)}(t) := \xi$,

$$X^{(n+1)}(t) := \xi + \int_0^t b(s, X^{(n)}(s))\,ds + \int_0^t \sigma(s, X^{(n)}(s))\,dW(s)$$

converges, to $X(t)$ say;

(ii) $X(t)$ is the unique strong solution to the stochastic differential equation $X(0) = \xi$,

$$X(t) = \xi + \int_0^t b(s, X(s))\,ds + \int_0^t \sigma(s, X(s))\,dW(s);$$

(iii) $X(t)$ is square-integrable, and for each $T > 0$ there exists a constant $C$, depending only on $K$ and $T$, such that $X(t)$ satisfies the growth condition

$$\mathbb{E}\left(\|X(t)\|^2\right) \le C\left(1 + \mathbb{E}(\|\xi\|^2)\right)e^{Ct} \qquad (0 \le t \le T).$$
Unfortunately, it turns out that not all SDEs have strong solutions. However, in many cases one can nevertheless solve them, by setting up a filtered probability space for oneself, setting up an SDE of the required form on it, and solving the SDE there. The resulting solution concept is that of a weak solution; see e.g. Karatzas and Shreve (1991), §5.3. Naturally, weak solutions are distributional, rather than pathwise, in nature. However, it turns out that it is the weak solution concept that is often more appropriate for our purposes. This is particularly so in that we will often be concerned with convergence of a sequence of (discrete) financial models to a (continuous) limit. The relevant convergence concept here is that of weak convergence (see §5.10). In the continuous setting, the price dynamics are described by a stochastic differential equation, in a discrete setting by a stochastic difference equation. One seeks results in which weak solutions of the one converge weakly to weak solutions of the other; see §6.4.

The Ornstein-Uhlenbeck Process. The most important example of a stochastic differential equation for us is that for geometric Brownian motion (§5.6.3 above). We close here with another example. Consider a model of the velocity $V(t)$ of a particle at time $t$ ($V(0) = v_0$), moving through a fluid or gas, which exerts a force on the particle consisting of:
1. a frictional drag, assumed proportional to the velocity,
2. a noise term resulting from the random bombardment of the particle by the molecules of the surrounding fluid or gas.
The basic model for processes of this type is given by the (linear) stochastic differential equation

$$dV = -\beta V\,dt + \sigma\,dW, \qquad (5.8)$$

whose solution is called the Ornstein-Uhlenbeck (velocity) process with relaxation time $1/\beta$ and diffusion coefficient $D := \tfrac12\sigma^2/\beta^2$. It is a stationary Gaussian Markov process (not stationary-increments Gaussian Markov like Brownian motion), whose limiting (ergodic) distribution is $N(0, \beta D)$ (this is the classical Maxwell-Boltzmann distribution of statistical mechanics) and whose limiting correlation function is $e^{-\beta|\cdot|}$.

If we integrate the Ornstein-Uhlenbeck velocity process to get the Ornstein-Uhlenbeck displacement process, we lose the Markov property (though the process is still Gaussian). Being non-Markov, the resulting process is much more difficult to analyse (see e.g. Rogers and Williams (1994), §I.23, p. 54).

The Ornstein-Uhlenbeck process is important in many areas, including:
1. statistical mechanics, where it originated,
2. mathematical finance, where it appears in the Vasicek model for the term structure of interest-rates, see §8.2.2.

We solve the stochastic differential equation (5.8). We know that

$$V(t) = e^{-\beta t}$$

solves the homogeneous part of (5.8). Using the method of variation of constants (or of parameters), we have

$$d(CV) = C'V + V'C = C'e^{-\beta t} - \beta CV\,dt.$$

But $CV$ solves (5.8), and identifying terms leads to

$$C'(t) = e^{\beta t}\sigma\,dW(t).$$

So

$$V^*(t) = e^{-\beta t}\int_0^t e^{\beta u}\sigma\,dW(u)$$

solves (5.8). This approach to solving linear stochastic differential equations can be generalised, see Karatzas and Shreve (1991), §5.6. Kloeden and Platen (1992), Chapter 4, contains more examples of explicitly solvable stochastic differential equations.

Numerics. Most differential equations (ordinary or partial) do not have solutions in closed form, and so have to be solved numerically in practice. This is all the more true in the more complicated setting of stochastic differential equations, where it is the exception rather than the rule for there to be an explicit solution. One is thus reliant in practice on numerical methods of solution, and here a great deal is known. We must refer elsewhere for details; for an excellent monograph account, see Kloeden and Platen (1992). For a recent overview on numerical methods in finance we refer to Rogers and Talay (1997).
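As a small illustration of such numerical methods, the sketch below applies the Euler-Maruyama scheme (the simplest scheme treated in Kloeden and Platen) to the Ornstein-Uhlenbeck equation (5.8), where the answer is known. The function names and parameter values are assumptions made for the example. From the explicit solution $v_0 e^{-\beta t} + V^*(t)$ and the Itô isometry, $V(T)$ has mean $v_0 e^{-\beta T}$ and variance $\sigma^2(1 - e^{-2\beta T})/(2\beta)$, which the simulated moments should approximate.

```python
import math
import random


def ou_euler_terminal(beta, sigma, v0, T, n_steps, rng):
    """One Euler-Maruyama path of dV = -beta V dt + sigma dW; returns V(T)."""
    dt = T / n_steps
    v = v0
    for _ in range(n_steps):
        v += -beta * v * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    return v


def ou_terminal_moments(beta, sigma, v0, T, n_steps=100, n_paths=5000, seed=0):
    """Sample mean and variance of V(T) over independent Euler-Maruyama paths."""
    rng = random.Random(seed)
    vals = [ou_euler_terminal(beta, sigma, v0, T, n_steps, rng) for _ in range(n_paths)]
    mean = sum(vals) / n_paths
    var = sum((v - mean) ** 2 for v in vals) / (n_paths - 1)
    return mean, var


if __name__ == "__main__":
    beta, sigma, v0, T = 1.0, 0.5, 1.0, 2.0  # hypothetical parameters
    mean, var = ou_terminal_moments(beta, sigma, v0, T)
    print("simulated mean, variance:", mean, var)
    print("exact mean              :", v0 * math.exp(-beta * T))
    print("exact variance          :", sigma ** 2 * (1 - math.exp(-2 * beta * T)) / (2 * beta))
```

As $T$ grows the exact variance tends to $\sigma^2/(2\beta) = \beta D$, the variance of the limiting distribution quoted above.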
5.9 Likelihood Estimation for Diffusions
We consider diffusions

$$dX_t = \mu(X_t; \theta)\,dt + \sigma(X_t; \theta)\,dW_t \qquad (5.9)$$

with $W_t$ a standard Brownian motion, and $\theta \in \Theta$ with $\Theta$ a compact parameter space. We require the diffusions to be time-homogeneous and stationary and work under the following additional assumptions.

Assumption 1. For all $\theta \in \Theta$ there exists a strong solution of (5.9). This may be guaranteed by imposing Lipschitz and linear growth bounds:

$$|\mu(x; \theta) - \mu(y; \theta)| \le K|x - y|, \qquad |\sigma(x; \theta) - \sigma(y; \theta)| \le K|x - y|,$$

$$|\mu(x; \theta)|^2 \le K^2(1 + |x|^2), \qquad |\sigma(x; \theta)|^2 \le K^2(1 + |x|^2),$$

where $K$ is some fixed constant independent of $\theta$.

Assumption 2. $\Theta$ is compact and the true parameter value $\theta_0$ is an element of $\Theta$, i.e. $\theta_0 \in \Theta$.

Given observations $X_0, X_{t_1}, \ldots, X_{t_n}$ (which we may assume to be equidistant, i.e. to be observations at $n + 1$ equidistant time points $t_0 = 0, t_1, \ldots, t_n$, with $\Delta = t_k - t_{k-1}$) the likelihood function of the discrete-time data is

$$L_n(\theta) = p(X_0, X_{t_1}, \ldots, X_{t_n}; \theta) = p(X_0, t_0; \theta)\prod_{k=0}^{n-1} p(X_{t_k}, t_k; X_{t_{k+1}}, t_{k+1}; \theta) = p(X_0, t_0; \theta)\prod_{k=0}^{n-1} p(X_{t_k}, \Delta; X_{t_{k+1}}; \theta),$$

where $p(\cdot\,; \theta)$ denotes the joint density; the factorisation into transition densities follows from the Markovian nature, and the final equality from the assumption that we have a time-homogeneous process. We shall be concerned with the log-likelihood function $\ell_n(\theta)$,

$$\ell_n(\theta) = \log L_n(\theta) = \log p(X_0, t_0; \theta) + \sum_{k=0}^{n-1}\log p(X_{t_k}, \Delta; X_{t_{k+1}}; \theta).$$
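For the few diffusions with a known transition density, the log-likelihood can be evaluated directly. The following sketch does this for the Ornstein-Uhlenbeck process $dX = -\beta X\,dt + \sigma\,dW$, whose transition law over a step $\Delta$ is Gaussian with mean $x e^{-\beta\Delta}$ and variance $\sigma^2(1 - e^{-2\beta\Delta})/(2\beta)$. All function names are hypothetical; $\sigma$ is treated as known, the initial term $\log p(X_0, t_0; \theta)$ is dropped, and the maximisation is a crude grid search, so this is an illustration of the product formula rather than a recommended estimator.

```python
import math
import random


def ou_transition_logpdf(x_next, x, delta, beta, sigma):
    """Log of the exact Gaussian transition density of the OU process over a step delta."""
    mean = x * math.exp(-beta * delta)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * beta * delta)) / (2.0 * beta)
    return -0.5 * math.log(2.0 * math.pi * var) - (x_next - mean) ** 2 / (2.0 * var)


def log_likelihood(xs, delta, beta, sigma):
    """Sum of log transition densities (the initial-observation term is omitted)."""
    return sum(
        ou_transition_logpdf(xs[k + 1], xs[k], delta, beta, sigma)
        for k in range(len(xs) - 1)
    )


def simulate_ou(n, delta, beta, sigma, x0, seed=0):
    """Simulate n exact OU transitions of length delta starting from x0."""
    rng = random.Random(seed)
    a = math.exp(-beta * delta)
    sd = math.sqrt(sigma ** 2 * (1.0 - a * a) / (2.0 * beta))
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1] + sd * rng.gauss(0.0, 1.0))
    return xs


def mle_beta_grid(xs, delta, sigma, grid):
    """Grid point maximising the log-likelihood in beta (sigma assumed known)."""
    return max(grid, key=lambda b: log_likelihood(xs, delta, b, sigma))


if __name__ == "__main__":
    xs = simulate_ou(n=3000, delta=0.1, beta=1.0, sigma=0.5, x0=0.0)
    grid = [0.2 + 0.05 * i for i in range(57)]
    print("grid MLE of beta:", mle_beta_grid(xs, 0.1, 0.5, grid))
```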
Assumption 3. The likelihood function $\ell_n(\theta)$ is twice continuously differentiable in $\theta$ in the interior of $\Theta$. Furthermore the class of random matrices so obtained,

$$\left\{\frac{\partial^2 \ell_n(\theta)}{\partial\theta^2},\ \theta \in \Theta\right\},$$

is uniformly bounded. So the observed information

$$I_n(\theta) = -\left[\frac{\partial^2 \ell_n}{\partial\theta^2}(\theta)\right]$$

exists.

Assumption 4. The expected information matrix has full rank and is uniformly bounded for $\theta \in \Theta$.

Assumption 5. For every vector $\lambda \in \mathbb{R}^d$ we have that the quadratic form $\lambda' I_n(\theta)\lambda \to \infty$.

These assumptions are needed to assure that the maximum likelihood estimator $\hat\theta_n$ can be obtained from the (true) likelihood function and is consistent and asymptotically normal.

Theorem 5.9.1. Under assumptions 1-5 the maximum likelihood estimator $\hat\theta_n$ exists and is consistent and asymptotically normal (CAN), i.e. with $\theta_0$ the true parameter

$$\hat\theta_n \to \theta_0 \quad (n \to \infty) \text{ in probability}, \qquad (5.10)$$

and

$$I_n(\theta_0)^{1/2}(\hat\theta_n - \theta_0) \to N \quad (n \to \infty) \text{ in distribution}, \qquad N \sim N(0, 1). \qquad (5.11)$$
For a proof see e.g. Barndorff-Nielsen and Sørensen (1994) or Brandt and Santa-Clara (2002).

Indirect Inference. A naive approach to the above estimation problem is to use a discrete Euler approximation:

$$X^{(\Delta)}_{(k+1)\Delta} = X^{(\Delta)}_{k\Delta} + \mu(X^{(\Delta)}_{k\Delta}; \theta)\Delta + \sigma(X^{(\Delta)}_{k\Delta}; \theta)\sqrt{\Delta}\,\varepsilon^{(\Delta)}_{k+1}, \qquad (5.12)$$

where

$$X^{(\Delta)}_t = X^{(\Delta)}_{k\Delta} \quad \text{for } k\Delta \le t < (k + 1)\Delta,$$

with $\varepsilon^{(\Delta)}$ Gaussian white noise. From Karatzas and Shreve (1991) we know that the process $X^{(\Delta)}_t$ converges weakly to $X(t)$ for $\Delta \to 0$ at a sufficient rate. Thus a naive approach to estimate $\theta$ would be to use an approximation of type (5.12) (usually $\Delta = 1$ is chosen) and compute the maximum likelihood estimator. However, it is known that such an estimator is inconsistent.
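The inconsistency is easy to see in a case where everything can be computed. The sketch below is an illustrative assumption, not the text's own example: it uses the Ornstein-Uhlenbeck process $dX = -\beta X\,dt + \sigma\,dW$ with $\sigma$ known, simulates it exactly at spacing $\Delta = 1$, and maximises the Gaussian pseudo-likelihood implied by the Euler scheme. That maximiser is a least-squares ratio, and by the stationary autocorrelation $e^{-\beta\Delta}$ it converges to $(1 - e^{-\beta\Delta})/\Delta$ rather than to $\beta$.

```python
import math
import random


def simulate_ou_exact(n, delta, beta, sigma, x0=0.0, seed=0):
    """Simulate n exact OU transitions of length delta (no discretisation error)."""
    rng = random.Random(seed)
    a = math.exp(-beta * delta)
    sd = math.sqrt(sigma ** 2 * (1.0 - a * a) / (2.0 * beta))
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1] + sd * rng.gauss(0.0, 1.0))
    return xs


def euler_pseudo_mle_beta(xs, delta):
    """Maximiser in beta of the Euler pseudo-likelihood for dX = -beta X dt + sigma dW.

    Under the Euler scheme X_{k+1} - X_k ~ N(-beta * X_k * delta, sigma^2 * delta),
    so with sigma known the maximiser is the least-squares coefficient below.
    """
    num = sum(x * (x_next - x) for x, x_next in zip(xs, xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return -num / (delta * den)


if __name__ == "__main__":
    beta, delta = 1.0, 1.0  # hypothetical true parameter and sampling interval
    xs = simulate_ou_exact(n=20_000, delta=delta, beta=beta, sigma=0.5)
    print("Euler pseudo-MLE of beta:", euler_pseudo_mle_beta(xs, delta))
    print("its large-sample limit  :", (1.0 - math.exp(-beta * delta)) / delta)
    print("true beta               :", beta)
```

More data does not help here: the gap between the limit and $\beta$ only closes as $\Delta \to 0$.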
Generalized Method of Moments Estimator. This approach has been used in Chan, Karolyi, Longstaff, and Sanders (1992) and will be the main benchmark. The main idea here is to approximate conditional moments of $X_{i\Delta} - X_{(i-1)\Delta}$ given $X_{(i-1)\Delta}$. In Chan, Karolyi, Longstaff, and Sanders (1992) the authors discuss as a first idea

$$\mathrm{Var}\left(X_{i\Delta} - X_{(i-1)\Delta} \mid X_{(i-1)\Delta} = x\right) = \sigma^2 x^{2\gamma}\Delta,$$

and one could use these moments to develop an estimator. However, as a generalization of the idea, all kinds of moments implied by the model might be used. To be more concrete, in Chan, Karolyi, Longstaff, and Sanders (1992) the authors proceed as follows. Define the residuals $\varepsilon_i$ (the increment $X_{i\Delta} - X_{(i-1)\Delta}$ less its conditional mean) and form the moment conditions

$$\varepsilon_i, \qquad \varepsilon_i X_{(i-1)\Delta}, \qquad \varepsilon_i^2 - \Delta\sigma^2 X_{(i-1)\Delta}^{2\gamma}, \qquad \varepsilon_i^2 X_{(i-1)\Delta} - \Delta\sigma^2 X_{(i-1)\Delta}^{1+2\gamma}.$$

Find the parameters by optimizing the quadratic form in the sample averages of these moments, with a positive definite weight matrix $W(\theta)$. A drawback of this method is that only information on the moments is used; that is, even if the transition probabilities are known this information is not used, ignoring aspects of time aggregation. (See e.g. Campbell, Lo, and MacKinlay (1997), Hamilton (1994), James and Webber (2000) for further details.)

Further discussion of the above methods and related approaches to this problem can be found in Mátyás (1999), Liesenfeld and Breitung (1999), Gourieroux and Monfort (1996), Chapter 6 and Sørensen (2001). Discussion from a non-parametric point of view is in Aït-Sahalia (1996), Chapman and Pearson (2000), Downing (1999), Stanton (1997) with a recent overview in Kiesel (2001).
5.10 Martingales, Local Martingales and Semi-martingales

5.10.1 Definitions

Local Martingales. For $X = (X_t)$ a stochastic process and $\tau$ a stopping time, call $X^\tau$ the process $X$ stopped at $\tau$,

$$X^\tau_t := X_{\tau \wedge t}.$$

If $X$ is a process and $(\tau_n)$ an increasing sequence of stopping times, call $X$ a local martingale if $(X^{\tau_n}) = (X_{\tau_n \wedge t})$ is a martingale for each $n$. The sequence $(\tau_n)$ is called a localizing sequence.

Semi-martingales. An adapted, càdlàg process $X = (X_t)$ is a semi-martingale if it can be decomposed as

$$X_t = X_0 + N_t + A_t,$$

with $N$ a local martingale and $A$ a process locally of finite variation (we will abbreviate 'locally of finite variation' to FV). Such processes are also called decomposable, and the above is called the (semi-martingale, understood) decomposition. For background, see e.g. Protter (2004), III, Rogers and Williams (1994), IV.15. (Recall: a function $f$ is of finite variation if the supremum of its variation $\sum_i |f(x_{i+1}) - f(x_i)|$ over partitions $x_i$ is finite; this happens iff $f$ can be represented as the difference of two monotone functions.)
Previsible (= Predictable) Processes. The crucial difference between left-continuous (e.g., càglàd) functions and right-continuous (e.g., càdlàg) ones is that with left-continuity, one can 'predict' the value at $t$ - 'see it coming' - knowing the values before $t$. We write $\mathcal{P}$, called the predictable (or previsible) $\sigma$-algebra, for the $\sigma$-algebra on $\mathbb{R}_+ \times \Omega$ ($\mathbb{R}_+$ for time $t \ge 0$, $\Omega$ for randomness $\omega \in \Omega$ - we need both for a stochastic process $X = (X_t(\omega))$) generated by (= smallest $\sigma$-field containing) the adapted càglàd processes. (We shall almost always be dealing with adapted processes, so the operative thing here is the left-continuity.) We also write $X \in \mathcal{P}$ as shorthand for 'the process $X$ is $\mathcal{P}$-measurable', and $X \in b\mathcal{P}$ if also $X$ is bounded.
Predictability and Semi-Martingales. Let us confess here why we need to introduce the last two concepts. One can develop a theory of stochastic integrals, $\int_0^t H(s, \omega)\,dM(s, \omega)$ or $\int_0^t H_s\,dM_s$, where $H$, $M$ are stochastic processes and the integrator $M$ is a semi-martingale, the integrand $H$ is previsible (and bounded, or $L^2$, or whatever). This can be done (though we shall have to quote many proofs; see e.g. Protter (2004) or Rogers and Williams (2000)). More: this theory is the most general theory of stochastic integration possible, if one demands even reasonably good properties (appropriate behaviour under passage to the limit, for example). For emphasis:

Integrands: previsible,
Integrators: semi-martingales.

Prototype. $H$ is left-continuous (and bounded, or $L^2$, etc.); $M$ is Brownian motion (§5.3).
Economic Interpretation. Think of the integrator $M$ as, e.g., a stock-price process. The increments over $[t, t + u]$ ($u > 0$, small) represent 'new information'. Think of the integrand $H$ as the amount of stock held. The investor has no advance warning of the price change $M_{t+dt} - M_t$ over the immediate future $[t, t + dt]$, but has to commit himself on the basis of what he knows already. So $H_t$ needs to be predictable before $t$ (e.g., left-continuity will do), hence predictability of integrands. By contrast, $M_{t+dt} - M_t$ represents new price-sensitive information, or 'driving noise'. The value process of the portfolio is the limit of sums of terms such as $H_t(M_{t+dt} - M_t)$, the stochastic integral $\int_0^t H_s\,dM_s$. This is the continuous-time analogue of the martingale transform in discrete time; see e.g. Neveu (1975), VIII.4.
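The role of predictability can be seen already in the discrete-time martingale transform. The sketch below is an illustration with hypothetical names and a deliberately simple model: $M$ is a symmetric $\pm 1$ random walk. It compares a predictable strategy, whose holding over each step depends only on past increments, with an 'anticipating' one that peeks at the coming increment. The first has gains averaging to zero; the second earns one unit on every step.

```python
import random


def trading_gains(n_steps, n_paths, seed=0):
    """Average terminal gains of a predictable and an anticipating strategy.

    The price M is a symmetric +/-1 random walk. Gains are martingale transforms
    sum_k H_k (M_{k+1} - M_k); only the choice of H_k differs.
    """
    rng = random.Random(seed)
    predictable_total = 0.0
    anticipating_total = 0.0
    for _ in range(n_paths):
        last_increment = 1  # arbitrary initial holding, fixed before any price move
        for _ in range(n_steps):
            increment = rng.choice((-1, 1))                  # M_{k+1} - M_k
            predictable_total += last_increment * increment  # H_k uses only the past
            anticipating_total += increment * increment      # H_k = sign of the coming move
            last_increment = increment
    return predictable_total / n_paths, anticipating_total / n_paths


if __name__ == "__main__":
    predictable, anticipating = trading_gains(n_steps=100, n_paths=5000)
    print("average gain, predictable strategy :", predictable)
    print("average gain, anticipating strategy:", anticipating)
```

The anticipating strategy is a sure profit, which is why integrands that can see the increment they are multiplied against have to be excluded from a theory meant to model trading.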
Note. 1. The definitive setting for stochastic integration has locally bounded predictable (= previsible) integrands and semi-martingale integrators (see e.g. Rogers and Williams (2000), VI.38). We shall use this in Chapter 6, with integrands as trading strategies, integrators as price processes and integrals as portfolio value processes. Thus in particular, restriction to locally bounded trading strategies is natural.
2. Stochastic calculus with 'anticipating' integrands, 'backward' stochastic integrals, etc., have been developed, and are useful (e.g., in more advanced areas such as the Malliavin calculus, see Nualart (1995)). But let us learn to walk before we learn to run.
Doob-Meyer Decomposition: Local Form. If $Z$ is a local submartingale, $Z$ may be decomposed uniquely as

$$Z = Z_0 + M + A,$$

with $M$ a local martingale and $A$ a predictable increasing process, both null at 0. For a proof, see e.g. Rogers and Williams (2000), VI.32.

Square-Bracket Process and Product Rule. Recall that in §5.2 we introduced the (angle-bracket) quadratic variation process $\langle X\rangle$ of a continuous square-integrable martingale $X$. Then
$$X^2 = 2\int X\,dX + \langle X\rangle$$

gives the Doob-Meyer decomposition of the submartingale $X^2$ on the left into its martingale part $2\int X\,dX$ and its increasing part $\langle X\rangle$. If continuity of $X$ is dropped, one needs a more general definition of quadratic variation - and one needs to distinguish between $X_-$ (the process $t \mapsto X_{t-}$) and $X$. We now define the (square-bracket) quadratic variation $[X]$ for a semi-martingale $X$ by

$$[X] := X^2 - 2\int X_-\,dX.$$

Writing $[X, X]$ for $[X]$ and using polarization as before, one has

$$[X, Y] = XY - \int X_-\,dY - \int Y_-\,dX.$$

This is called the product rule (or integration by parts). It is the simplest special case of Itô's formula (§5.10.2 below) - to which it is in fact equivalent.

For semi-martingales $X$, one now has two quadratic variations, $[X]$ and $\langle X\rangle$. They differ by the sum of the squares of the jumps. Writing $\Delta X_s$ for the jump $X_s - X_{s-}$ (if any),

$$[X]_t = \langle X\rangle_t + \sum_{s \le t}(\Delta X_s)^2.$$

For full detail, see Protter (2004), II.6, Rogers and Williams (1994), VI.34. For $X$ continuous, the two quadratic variations coincide, as there are no jumps. (Because of this, some authors prefer to use only one quadratic variation, which has its advantages, but we prefer to use two; cf. Protter (2004), III.3.)
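On a discrete time grid the product rule is an exact algebraic identity, which makes it a convenient thing to verify numerically. The sketch below uses illustrative names throughout and simulated Brownian paths as the test processes. It computes left-endpoint sums in place of $\int X_-\,dY$ and sums of products of increments in place of $[X, Y]$, and also shows the realised quadratic variation of a Brownian path over $[0, T]$ settling near $T$.

```python
import math
import random


def brownian_path(T, n_steps, seed=0):
    """Brownian path sampled on a uniform grid of [0, T], started at 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def discrete_bracket(x, y):
    """Sum of products of increments: the discrete analogue of [X, Y]."""
    return sum((x[k + 1] - x[k]) * (y[k + 1] - y[k]) for k in range(len(x) - 1))


def left_point_integral(h, x):
    """Sum of h_k (x_{k+1} - x_k): the integrand is evaluated at the left endpoint."""
    return sum(h[k] * (x[k + 1] - x[k]) for k in range(len(x) - 1))


if __name__ == "__main__":
    w = brownian_path(1.0, 20_000, seed=1)
    b = brownian_path(1.0, 20_000, seed=2)  # independent of w
    lhs = w[-1] * b[-1]
    rhs = left_point_integral(w, b) + left_point_integral(b, w) + discrete_bracket(w, b)
    print("X_T Y_T                          :", lhs)
    print("int X dY + int Y dX + [X, Y]     :", rhs)
    print("realised [W, W] on [0, 1]        :", discrete_bracket(w, w))
    print("realised [W, B], independent BMs :", discrete_bracket(w, b))
```

Evaluating the integrand at the left endpoint is what makes the sums approximate the Itô integral; a midpoint or right-endpoint rule would shift part of the bracket term into the integrals.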
5.10.2 Semi-martingale Calculus

The change-of-variable formula (or chain rule) of ordinary calculus extends to the Lebesgue-Stieltjes integral, and tells us that for smooth $f$ ($\in C^1$) and continuous $A$, of FV (on compacts),

$$f(A_t) - f(A_0) = \int_0^t f'(A_s)\,dA_s.$$

(This of course does not apply to $\int_0^t B_s\,dB_s = \tfrac12 B_t^2 - \tfrac12 t$, but there the integral is Itô, not Lebesgue-Stieltjes.) Rather less well-known is the extension to $A$ only right-continuous:

$$f(A_t) - f(A_0) = \int_{0+}^t f'(A_{s-})\,dA_s + \sum_{0 < s \le t}\left\{f(A_s) - f(A_{s-}) - f'(A_{s-})\Delta A_s\right\}.$$
Proof is deferred, as this is a special case of the result below (or see a book on stochastics, e.g. Rogers and Williams (2000), IV.18).

One thus seeks a setting capable of handling both the last two displayed results together. Recall that the square-bracket process $[X, X]$ is non-decreasing, so can be split into a continuous part, $[X, X]^c$ say, and a sum-of-jumps part.

Theorem 5.10.1 (Itô's Formula for General Semi-martingales). For $X$ a semi-martingale and $f \in C^2$, then $f(X)$ is also a semi-martingale, and

$$f(X_t) - f(X_0) = \int_{0+}^t f'(X_{s-})\,dX_s + \tfrac12\int_{0+}^t f''(X_{s-})\,d[X, X]^c_s + \sum_{0 < s \le t}\left\{f(X_s) - f(X_{s-}) - f'(X_{s-})\Delta X_s\right\}.$$

First, we note the special case for $X$ continuous.

Theorem 5.10.2 (Itô's Formula for Continuous Semi-martingales). For $X$ a continuous semi-martingale and $f \in C^2$, $f(X)$ is also a continuous semi-martingale, and

$$f(X_t) - f(X_0) = \int_0^t f'(X_s)\,dX_s + \tfrac12\int_0^t f''(X_s)\,d[X, X]_s.$$

Corollary 5.10.1. If $X = X_0 + M + A$ is the decomposition of the continuous semi-martingale $X$, that of $f(X)$ is

$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\,dM_s + \left\{\int_0^t f'(X_s)\,dA_s + \tfrac12\int_0^t f''(X_s)\,d[M]_s\right\}.$$

Proof. Let $\mathcal{A}$ be the class of $C^2$ functions $f$ for which the desired result holds. Then $\mathcal{A}$ is a vector space. But $\mathcal{A}$ is also closed under multiplication, so is an algebra. Indeed, if $f, g \in \mathcal{A}$, write $F_t$, $G_t$ for the semi-martingales $f(X_t)$ and $g(X_t)$ and use the product rule. If $X = X_0 + M + A$ is the decomposition of $X$ into a (continuous) local martingale $M$ and a FV process $A$, the continuous local martingale part of $F = f(X)$ is $\int f'(X_s)\,dM_s$. So writing $X^{cm}$ for the continuous martingale part of $X$, etc.,

$$[F, G]_t = [F^{cm}, G^{cm}]_t,$$

which by above is

$$\left[\int_0^{\cdot} f'(X_s)\,dM_s, \int_0^{\cdot} g'(X_s)\,dM_s\right]_t = \int_0^t f'(X_s)g'(X_s)\,d[M, M]_s.$$

The product rule now says that

$$F_tG_t - F_0G_0 = \int_0^t F_s\,dG_s + \int_0^t G_s\,dF_s + \int_0^t (f'g')(X_s)\,d[M]_s.$$

As Itô's formula holds for $f$ (by assumption),

$$dF_t = f'(X_t)\,dX_t + \tfrac12 f''(X_t)\,d[M]_t,$$

and similarly for $g$. Substituting,

$$F_tG_t - F_0G_0 = \int_0^t \left\{F_sg'(X_s) + f'(X_s)G_s\right\}dX_s + \tfrac12\int_0^t \left\{F_sg''(X_s) + 2f'(X_s)g'(X_s) + f''(X_s)G_s\right\}d[M]_s.$$

This says that Itô's formula also holds for $fg$. So $\mathcal{A}$ is an algebra, and a vector space. Since $\mathcal{A}$ contains $f(x) = x$, $\mathcal{A}$ contains all polynomials. One can now reduce from the local-martingale to the martingale case by a localization argument, and extend from $\mathcal{A}$ to $C^2$ by an approximation argument. For details, we refer to Rogers and Williams (2000), IV.32 (and for the extension to discontinuous $X$, VI.39). A different approach, using Taylor's formula for $f$, is in Protter (2004), II.7. $\square$
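Itô's formula for a continuous semi-martingale can be checked pathwise on a fine grid. The following sketch is illustrative; the function name and the choice $f(x) = x^3$, $X = W$ are assumptions made for the example. It accumulates the left-endpoint sums for $\int_0^T f'(W_s)\,dW_s$ and $\tfrac12\int_0^T f''(W_s)\,ds$ along one simulated Brownian path and compares their total with $f(W_T) - f(W_0) = W_T^3$.

```python
import math
import random


def ito_check_cubic(T=1.0, n_steps=100_000, seed=0):
    """Compare W_T^3 with the discretised right-hand side of Ito's formula.

    For f(x) = x^3 the formula reads
        W_T^3 = int_0^T 3 W_s^2 dW_s + (1/2) int_0^T 6 W_s ds,
    using [W, W]_s = s. Both integrals are approximated by left-endpoint sums.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    w = 0.0
    stochastic_integral = 0.0  # sum of f'(W) dW
    correction = 0.0           # sum of (1/2) f''(W) dt
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        stochastic_integral += 3.0 * w * w * dw
        correction += 0.5 * 6.0 * w * dt
        w += dw
    return w ** 3, stochastic_integral + correction


if __name__ == "__main__":
    lhs, rhs = ito_check_cubic()
    print("f(W_T) - f(W_0)          :", lhs)
    print("Ito right-hand side (sum):", rhs)
```

Dropping the correction term leaves a discrepancy that does not vanish as the grid is refined, which is the numerical face of the second-order term in the theorem.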
Note. 1. Higher dimensions. For vector functions $f = (f_1, \ldots, f_n)$ and vector processes $X = (X^1, \ldots, X^n)$, we use the Einstein summation convention and $D_if$ for $\partial f/\partial x_i$. Then the Itô formula extends (with much the same proof) as

$$f(X_t) - f(X_0) = \int_0^t D_if(X_s)\,dX^i_s + \tfrac12\int_0^t D_{ij}f(X_s)\,d[X^i, X^j]_s.$$

2. With $f(x, y) = xy$, we then recover the product rule (integration-by-parts formula):

$$X(t)Y(t) = \int_0^t X(u-)\,dY(u) + \int_0^t Y(u-)\,dX(u) + [X, Y](t). \qquad (5.13)$$

3. In particular, the class of semi-martingales is closed under $C^2$ functions. This gives a powerful - and highly non-linear - closure property of the class of semi-martingales.

4. Differential Notation. In the one-dimensional case, we may re-write Itô's formula in shorthand form using differential notation instead of integral notation as

$$df(X_t) = f'(X_t)\,dX_t + \tfrac12 f''(X_t)\,d[X]_t.$$

The left is called the stochastic differential of $f(X)$. So the above gives the stochastic differential of $f(X)$ in terms of the stochastic differential $dX$ of $X$ and the quadratic variation $[X, X]$ or $[X]$ of $X$.

5. Formalism. For the continuous case, we can obtain Itô's formula from Taylor's formula in differential form, if we adopt the following rules for differentials:

$$dX\,dY = d[X, Y], \qquad dX\,dV = 0$$

whenever $V$ has FV. In particular, for $X = W$ BM in $\mathbb{R}^n$,

$$(dW^i_t)^2 = dt, \qquad dW^i_t\,dW^j_t = 0 \ (i \ne j), \qquad dW\,dV = 0$$

for $V$ of finite variation on compacts. The interpretation here is that for $i \ne j$, the Brownian increments $dW^i$ and $dW^j$ are independent with zero mean, so $\mathbb{E}(dW^i_t\,dW^j_t) = \mathbb{E}(dW^i_t)\cdot\mathbb{E}(dW^j_t) = 0 \times 0 = 0$, while for $i = j$ the earlier symbolism $(dW_t)^2 = dt$ becomes $(dW^i_t)^2 = dt$. The formalism above works well, and is a flexible and reliable tool in practice.

Recall that normally in calculus we simply omit all differentials of order higher than one. We do have some prior experience of retaining second-order differentials, e.g. in differential geometry (first and second quadratic forms, curvature, geodesics etc.). So retaining second-order differentials, and manipulating them by the above simple rules, is not a complete culture-shock. (Differential geometry and stochastic calculus in fact combine, as stochastic differential geometry - see e.g. Rogers and Williams (2000), V §5.)

Lévy's Martingale Characterization of Brownian Motion. We may
now complete Levy's result on quadratic variation of Brownian motion (The orem 5.3.3) to an 'if and only if' statement. Recall that Brownian motion is continuous, so the angle-bracket and square-bracket quadratic variations are the same. Theorem 5 . 10.3
Xo
t:
=
O.
. Let X be a continuous local martingale with Then X (Levy) is Brownian motion BM iff it has quadratic variation [X, Xl t t (t � 0). =
5 . 1 0 Martingales , Local Martingales and Semi-martingales
215
Proof (Kunita-Watanabe). We already know that BM has quadratic variation $t$. For the converse, fix $\theta \in \mathbb{R}$ and take $M_t := f(X_t, t)$, where
$$f(x, t) := \exp\{i\theta x + \tfrac{1}{2}\theta^2 t\}.$$
Now $f_x = i\theta f$, $f_{xx} = -\theta^2 f$, $f_t = \tfrac{1}{2}\theta^2 f$ (in an obvious notation for partial derivatives). By Itô's formula,
$$df(X_t, t) = f_x\,dX_t + f_t\,dt + \tfrac{1}{2} f_{xx}\,d[X, X]_t,$$
and as we are given $[X, X]_t = t$,
$$df = i\theta f\,dX_t + \tfrac{1}{2}\theta^2 f\,dt - \tfrac{1}{2}\theta^2 f\,dt = i\theta f\,dX_t.$$
So $f$ is a stochastic integral w.r.t. a continuous local martingale, so $M = f$ is itself a continuous local martingale. Also $|M_t| = \exp(\tfrac{1}{2}\theta^2 t) < \infty$, so $M$ is bounded on compact time intervals, whence we can drop the 'local'. So for $s \leq t$, $\mathbb{E}(M_t \mid \mathcal{F}_s) = M_s$, or
$$\mathbb{E}\left[\exp\{i\theta (X_t - X_s)\} \mid \mathcal{F}_s\right] = \exp\{-\tfrac{1}{2}\theta^2 (t - s)\}.$$
So $X$ has stationary independent Gaussian increments, and is continuous, so is BM. $\square$
Much the same proof applies in higher dimensions. One assumes
$$[X^i, X^j]_t = \delta_{ij}\,t,$$
and concludes $X = (X^1, \ldots, X^d)$ is BM($\mathbb{R}^d$) (Lévy process, Gaussian increments $N_d(0, tI)$, path-continuous).
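The rules $(dW^i)^2 = dt$ and $dW^i\,dW^j = 0$ $(i \neq j)$, and the hypothesis $[X^i, X^j]_t = \delta_{ij} t$ of Lévy's theorem, can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `realized_covariation` and all parameter values are illustrative choices, not part of the text.

```python
import numpy as np

def realized_covariation(t=1.0, n_steps=200_000, seed=0):
    """Sum the products dW^i dW^j of Brownian increments over a grid on [0, t].

    Returns a 2x2 matrix whose (i, j) entry approximates [W^i, W^j]_t.
    """
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    # Increments of two independent Brownian motions: each is N(0, dt).
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
    return dW.T @ dW

if __name__ == "__main__":
    # Diagonal entries should be close to t, off-diagonal entries close to 0.
    print(realized_covariation())
```

As the grid is refined the matrix approaches $t$ times the identity, which is the discrete counterpart of the multiplication table for the differentials.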
5.10.3 Stochastic Exponentials
Theorem 5.10.4 (Doléans). For $X$ a continuous semi-martingale with $X_0 = 0$ and $Z_0$ an $\mathcal{F}_0$-measurable random variable, there exists a unique (continuous) semi-martingale $Z$ satisfying
$$Z_t = Z_0 + \int_{(0,t]} Z_s\,dX_s.$$
Then
$$Z_t = Z_0\,\mathcal{E}(X)_t, \quad \text{where} \quad \mathcal{E}(X)_t := \exp\{X_t - \tfrac{1}{2}[X]_t\}.$$
Proof. Existence. By Itô's formula,
$$d\mathcal{E}(X)_t = \mathcal{E}(X)_t\,dX_t - \tfrac{1}{2}\mathcal{E}(X)_t\,d[X]_t + \tfrac{1}{2}\mathcal{E}(X)_t\,d[X]_t = \mathcal{E}(X)_t\,dX_t$$
(the first term in the centre is the first partial for the $X_t$-argument, the second that for the $[X]_t$-argument, the third the second-order, or Itô, term for the first argument). So $\mathcal{E}(X)_t$ satisfies the equation.

Uniqueness. Write $Y_t := \exp\{-X_t + \tfrac{1}{2}[X]_t\}$. By Itô's formula again (much as above),
$$dY_t = -Y_t\,dX_t + \tfrac{1}{2}Y_t\,d[X]_t + \tfrac{1}{2}Y_t\,d[X]_t = -Y_t\,dX_t + Y_t\,d[X]_t.$$
Now let $Z_t$ be any solution to the equation. By the product rule, as $dZ_t = Z_t\,dX_t$,
$$d(Z_t Y_t) = Z_t\,dY_t + Y_t\,dZ_t + d[Y, Z]_t = Z_t(-Y_t\,dX_t + Y_t\,d[X]_t) + Y_t(Z_t\,dX_t) + (-Y_t Z_t)\,d[X]_t = 0$$
(as $Z_t = Z_0 + \int_0^t Z\,dX$, $Y_t = Y_0 - \int_0^t Y\,dX$ + Stieltjes integral, and the Stieltjes integral does not contribute, $[Y, Z]_t = \int_0^t Z(-Y)\,d[X, X] = -\int_0^t YZ\,d[X]$). So $d(Z_t Y_t) = 0$, $Z_t Y_t$ is constant, $= Z_0$ as $Y_0 = 1$, so $Z_t = Z_0 / Y_t = Z_0\,\mathcal{E}(X)_t$. $\square$
This result was proved by Catherine Doléans (Doléans-Dade (1970), ZfW), so stochastic exponentials are also called Doléans (or Doléans-Dade) exponentials.

Remark 5.10.1. 1. We may re-write the (stochastic) integral equation in the theorem in differential notation as
$$dZ_t = Z_t\,dX_t.$$
This is then a stochastic differential equation (SDE), the exponential SDE. It has a unique solution, the stochastic exponential as above. This is actually unusual: few SDEs can be solved explicitly.

2. For possibly discontinuous semi-martingales, the defining equation becomes $Z_t = Z_0 + \int_{(0,t]} Z_{s-}\,dX_s$, and the solution becomes
$$Z_t = Z_0 \exp\{X_t - \tfrac{1}{2}[X, X]_t\} \prod_{0 < s \leq t} (1 + \Delta X_s) \exp\{-\Delta X_s + \tfrac{1}{2}(\Delta X_s)^2\},$$
the product (which may be infinite) converging. For proof, see e.g. Protter (2004), II.8.
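Application. Geometric - or 'Exponential' - Brownian Motion. We return to the geometric Brownian motion (GBM) process of §5.6.3. The process GBM may now be recognized as the stochastic exponential that solves the defining SDE,
$$dS_t = S_t(\mu\,dt + \sigma\,dW_t), \qquad S_0 = x \qquad \text{(GBM)}$$
(here $\mu \in \mathbb{R}$, $\sigma > 0$, $W$ is BM). Indeed, with $X_t = \mu t + \sigma W_t$ we have $[X]_t = \sigma^2 t$, so $S_t = x\,\mathcal{E}(X)_t = x \exp\{(\mu - \tfrac{1}{2}\sigma^2)t + \sigma W_t\}$.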
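To make the identification of GBM with a stochastic exponential concrete, the sketch below compares the closed form $x\exp\{X_t - \tfrac{1}{2}[X]_t\}$, with $X_t = \mu t + \sigma W_t$ and $[X]_t = \sigma^2 t$, against an Euler discretization of the SDE $dS_t = S_t(\mu\,dt + \sigma\,dW_t)$ driven by the same Brownian increments. The function name `gbm_paths` and the parameter values are assumptions made for illustration only.

```python
import numpy as np

def gbm_paths(x=1.0, mu=0.05, sigma=0.2, t=1.0, n_steps=1_000, seed=1):
    """Return (closed-form path, Euler path) for GBM on a common Brownian path."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    W = np.cumsum(dW)
    times = dt * np.arange(1, n_steps + 1)

    # Stochastic exponential: X_t = mu*t + sigma*W_t and [X]_t = sigma^2 * t.
    X = mu * times + sigma * W
    exact = x * np.exp(X - 0.5 * sigma**2 * times)

    # Euler scheme for dS = S (mu dt + sigma dW): S_{k+1} = S_k (1 + mu dt + sigma dW_k).
    euler = x * np.cumprod(1.0 + mu * dt + sigma * dW)
    return exact, euler

if __name__ == "__main__":
    exact, euler = gbm_paths()
    print(abs(exact[-1] - euler[-1]) / exact[-1])  # small relative discrepancy
```

The two paths differ only by discretization error, which shrinks as `n_steps` grows; the closed form itself needs no time-stepping, which is what makes this SDE unusual.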
5.10.4 Semi-martingale Characteristics
Recall the treatment of Lévy processes and the Lévy-Khintchine formula in §5.5.3. Thus to specify a Lévy process we need to specify three things: a drift $a$, a variance $\sigma^2$ and a Lévy or jump measure. It turns out that semi-martingales provide a far-reaching generalization of Lévy processes, which is valuable for modelling purposes in a number of contexts. Correspondingly, one will need (at least) a generalization of the 'Lévy triple' above to specify them. It turns out further that one can indeed specify a semi-martingale by such a triple, in such a way that the special case of Lévy processes stands out clearly. However, the truncation function $\mathbf{1}(|x| < 1)$ used in Th. 5.5.2 to separate the large and small jumps is canonical only in its broad outline and not in its detailed form. Accordingly, neither in the Lévy case nor the general semi-martingale case can one expect uniqueness. One specifies the triple with respect to a chosen 'truncation function', and has to be prepared to be flexible and change this if need be. We will merely sketch the theory, referring to the standard text Jacod and Shiryaev (1987), Ch. II, III for details.

First, we need a truncation function $h$: a bounded function of compact support with $h(x) = x$ near 0 (prototype: $h(x) = x\mathbf{1}(|x| < 1)$, though taking $h$ to be continuous may be convenient). The introduction of $h$ allows us to decompose the jumps into the 'small' ones (those with values in the neighbourhood of 0 in which $h(x) = x$) and the 'large' ones (the others). Semi-martingales in which the bounded-variation process $A$ in the decomposition $X = X_0 + M + A$ is also predictable are called special semi-martingales. Semi-martingales with bounded jumps are special. So, using $h$ to eliminate the large jumps reduces $X$ to a special semi-martingale, $X(h)$ say. Write its decomposition as
$$X(h) = X_0 + M(h) + B(h).$$
Then $B(h)$ is predictable, and depends on $h$. With $X^c$ the continuous martingale part of $X$, the quadratic covariation $C = (C^{ij})$, that is,
$$C^{ij} = \langle X^{i,c}, X^{j,c} \rangle,$$
gives a continuous part, $C$, independent of $h$. The jumps of $X$ may be compensated by a predictable random measure $\nu$, again independent of $h$. Then the triple
$$(B, C, \nu)$$
gives the characteristics of the semi-martingale $X$ (Jacod and Shiryaev (1987), II.2a). It turns out (Jacod and Shiryaev (1987), II.4c) that a semi-martingale has independent increments if and only if it has a version $(B, C, \nu)$ of its characteristics that is deterministic. Specializing to Lévy processes (increments stationary as well as independent): Lévy processes are those semi-martingales whose characteristics are deterministic and depend on time only via
$$B_t(\omega) = bt, \qquad C_t(\omega) = ct, \qquad \nu(\omega; dt, dx) = dt\,\mu(dx);$$
then $(b, c, \mu)$ are as in the Lévy-Khintchine formula.

We have seen from the Fundamental Theorem of Asset Pricing that to price assets, one needs to change measure to the (for complete markets) or an (for incomplete markets) equivalent martingale measure. For market models generated by Lévy or semi-martingale processes (incomplete, in general), one needs to know how the characteristics behave under change of measure. This question is thoroughly studied in Jacod and Shiryaev (1987), III.3d, to which we refer for details. In brief: $C$ is unaltered, $\nu$ picks up a multiplicative factor, while $B$ is radically changed (Jacod and Shiryaev (1987), III, (3.27)). This is to be expected: the basic (Brownian) form of Girsanov's theorem shows how change of measure changes the drift. Further, two Brownian motions with parameters $(\mu_i, \sigma_i)$ have equivalent measures if and only if they have the same variance: $\sigma_1 = \sigma_2$.
Applications. 1. Jump-Diffusion Models. Suppose we take a market model, which has Brownian noise as in the Black-Scholes case, but with the additional feature of jumps, or 'shocks', at the instants of a Poisson process but of random sizes. Models of this type go back to Merton in 1976 (Merton (1990), Ch. 9); see also Lamberton and Lapeyre (1996), Ch. 7 and Lando (1995). In particular, Lando's paper gives details of a number of parametric models arising as special cases.

2. Hyperbolic Models. The hyperbolic distributions of §2.12 may be used to build a Lévy process (hyperbolic, or generalized hyperbolic, Lévy motion). The Lévy measure is known explicitly, and hence also its semi-martingale characteristics. The model is incomplete; there is a continuum of equivalent martingale measures, giving prices which span the entire non-arbitrage interval (bid-ask spread). For details of the model, and in particular, for asset pricing and option pricing questions, we refer to Eberlein (2001) and the references cited there.

3. Fundamental Theorem of Utility Maximization. The classical Merton problem is that of finding the optimal balance between investment and consumption - that is, to find the optimal strategy for an investor who wishes to maximize his utility (with respect to some utility function) over time, suitably choosing how much to consume and how much to invest, again over time. In the Brownian (or Black-Scholes) case, the problem is solved in Merton (1990), Part II. The problem may be formulated much more generally, in a semi-martingale context, using the characteristics. For logarithmic utility, a solution is obtained - 'Fundamental Theorem of Utility Maximization' - in Goll and Kallsen (2003), to which we refer for details.

4. Stochastic Volatility Models of Ornstein-Uhlenbeck Type. A Lévy field is a random measure $Z$ (that is, for disjoint measurable sets $A_i$ one has $Z(\bigcup A_i) = \sum Z(A_i)$) such that for each $A$, the random variable $Z(A)$ is infinitely divisible. Such a random field may be described by characteristics; see e.g. Jacod and Shiryaev (1987), II.1.2, Barndorff-Nielsen and Shephard (2001), §3. In particular, specializing to the homogeneous case, there is a one-to-one correspondence between self-decomposable laws - or stationary stochastic processes $X = (X_t)$ with each $X_t$ having this law - and Lévy processes $Z = (Z_t)$ with
$$X_t = e^{-\lambda t} X_0 + \int_0^t e^{-\lambda(t - s)}\,dZ(\lambda s)$$
for all $\lambda > 0$. Then $X$ is called a stochastic process of Ornstein-Uhlenbeck type, and satisfies the SDE
$$dX_t = -\lambda X_t\,dt + dZ(\lambda t).$$
The prototype is $Z$ Brownian motion, when $X$ is the classical Ornstein-Uhlenbeck process. The motivation here is stochastic volatility modelling. This takes account of the well-known empirical observation of volatility smile, volatility skew etc. Thus volatility is not constant, but changes, unpredictably, and so one seeks a stochastic model to describe a process $\{\sigma^2(t) : t \geq 0\}$ rather than a parameter $\sigma^2$. The principal model is the one above, using the SDE with $\sigma^2(t)$ in the role of $X_t$:
$$d\sigma^2(t) = -\lambda \sigma^2(t)\,dt + dZ(\lambda t).$$
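The role of the time change $Z(\lambda t)$ can be seen in the Brownian prototype. The sketch below, an illustration under the assumption that $Z$ is standard Brownian motion (not a model taken from the cited papers), simulates the exact one-step transition of $dX_t = -\lambda X_t\,dt + dZ(\lambda t)$ and returns terminal values; the helper name `ou_terminal_values` and all parameters are hypothetical.

```python
import numpy as np

def ou_terminal_values(lam, x0=0.0, t=10.0, n_steps=100, n_paths=50_000, seed=2):
    """Simulate X_t for dX = -lam X dt + dZ(lam t), with Z standard Brownian motion."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    decay = np.exp(-lam * dt)
    # Var of int_t^{t+dt} exp(-lam (t+dt-s)) dZ(lam s) = (1 - exp(-2 lam dt)) / 2,
    # because Z(lam s) has quadratic variation lam * s.
    step_sd = np.sqrt(0.5 * (1.0 - decay**2))
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        x = decay * x + step_sd * rng.normal(size=n_paths)
    return x

if __name__ == "__main__":
    for lam in (0.5, 2.0):
        # The sample variance is close to 1/2 for both values of lam.
        print(lam, ou_terminal_values(lam).var())
```

In this Gaussian case the stationary law is $N(0, \tfrac{1}{2})$ whatever the value of $\lambda$: the rate $\lambda$ controls how fast the process forgets its past, while the marginal distribution is left unchanged, which is the point of driving the SDE by $Z(\lambda t)$ rather than $Z(t)$.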
For background and details, see Barndorff-Nielsen and Shephard (2001) and Barndorff-Nielsen and Shephard (2002).

5.11 Weak Convergence of Stochastic Processes

5.11.1 The Spaces $C^d$ and $D^d$
In the situations relevant to us, the metric space $S$ (or more precisely $(S, d)$, where $d$ is the metric) will be a space of functions, and the stochastic processes $X_n$, $X$ for which the weak convergence of $X_n$ to $X$ is to be studied have paths (the maps $t \to X_n(t, \omega)$) in the function space $S$.
The most basic case of interest is where $S$ is the space of continuous functions $C$ (or $C^d$ in $d$ dimensions) on the time interval $[0, T]$ (with $t = 0$ as the initial time and $t = T$ as expiry). The natural choice of a metric $d$ on $S = C$ is the uniform metric
$$d(f, g) := \sup\{|f(t) - g(t)| : t \in [0, T]\}.$$
This induces the topology (in metric situations the word 'topology' can be replaced by 'convergence concept') of uniform convergence:
$$f_n \to f \text{ in } (C, d) \ (n \to \infty) \quad \Leftrightarrow \quad f_n \to f \text{ uniformly on } [0, T]: \ \sup_{t \in [0, T]} |f_n(t) - f(t)| \to 0 \ (n \to \infty).$$
For a thorough treatment of weak convergence on $C$ under this metric, see Billingsley (1968), Chapter 2. This set-up is all we need to handle situations where the relevant stochastic processes have continuous paths. This covers cases such as Brownian motion, geometric Brownian motion, diffusions, and solutions to stochastic differential equations. However, it is not adequate to handle situations where a discrete-parameter stochastic process approximates to a continuous-parameter one (as in the de Moivre-Laplace central limit theorem, or the passage from the discrete to the continuous Black-Scholes model, see §4.6). In such discontinuous situations we need to go beyond continuous functions to functions with jumps (the simplest form of discontinuity - which will suffice for our purpose). Recall (§5.1) that we allow stochastic processes with paths that are right-continuous with left limits, but may have jump discontinuities. The space of such functions is called the Skorohod space $D = D[0, T]$ (or $D^d$ in higher dimensions). There are various ways of metrising $D$; the most important one - which suffices for our purposes - is the Skorohod metric $d$ (for the precise definition, which allows a suitable time change in the time interval $[0, T]$, see Billingsley (1968), Chapter 3, where a thorough treatment of weak convergence on $D$ under this metric is given). The motivation there is to handle the large-sample behaviour of empirical distributions; as above, our motivation here is to handle discrete financial models approximating to continuous ones.
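A small numerical illustration of why the uniform metric is too strict for paths with jumps: two step functions whose jump times differ by $1/n$ stay at uniform distance 1 however large $n$ is, whereas a continuous perturbation of size $1/n$ converges uniformly. This is a sketch on a finite grid, with hypothetical helper names (`sup_distance`, `step`); it does not implement the Skorohod metric itself.

```python
import numpy as np

def sup_distance(f, g, grid):
    """Uniform distance between f and g, evaluated on a finite grid."""
    return float(np.max(np.abs(f(grid) - g(grid))))

def step(jump_time):
    """Right-continuous unit step function jumping from 0 to 1 at jump_time."""
    return lambda t: (t >= jump_time).astype(float)

grid = np.linspace(0.0, 1.0, 100_001)

def identity(t):
    return t

def wiggle(n):
    """Continuous function within 1/n of the identity in the uniform metric."""
    return lambda t: t + np.sin(n * t) / n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        d_cont = sup_distance(identity, wiggle(n), grid)
        d_jump = sup_distance(step(0.5), step(0.5 + 1.0 / n), grid)
        print(n, d_cont, d_jump)
```

The continuous distances shrink like $1/n$ while the step-function distances remain equal to 1, although the jump times converge. Allowing a small change of time scale, as the Skorohod metric does, is what lets such paths be regarded as close.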
5.11.2 Definition and Motivation
Let $S$ denote a metric space with metric $d$ and let $C(S)$ denote the set of real-valued continuous functions defined on $S$. Let $C_b(S)$ and $C_0(S)$ denote the subsets of $C(S)$ given by all continuous functions that are bounded and have compact support, respectively. Suppose we are given $S$-valued random variables $X_n$, $n = 1, 2, \ldots$, and $X$, which may possibly be defined on different probability spaces. Let $\mathbb{E}_n$, $\mathbb{E}$ denote expectation on the probability spaces on which $X_n$, $X$ are defined. Then we say that the sequence $\{X_n\}$ converges in distribution to $X$ if
$$\mathbb{E}_n f(X_n) \to \mathbb{E} f(X) \quad \text{for all } f \in C_b(S).$$
Let $\mathbb{P}_n$, $n = 1, 2, \ldots$, and $\mathbb{P}$ denote the measures defined on $(S, \mathcal{B}(S))$ that are induced by $X_n$, $n = 1, 2, \ldots$, and $X$, respectively. We have for any $f \in C_b(S)$
$$\mathbb{E}_n f(X_n) \to \mathbb{E} f(X) \quad \Leftrightarrow \quad \int_S f(s)\,\mathbb{P}_n(ds) \to \int_S f(s)\,\mathbb{P}(ds).$$
We use this as our formal definition:

Definition 5.11.1. A sequence of probabilities $(\mathbb{P}^{(n)})$ on some metric space $S$ is said to converge weakly to $\mathbb{P}$ if
$$\int_S f(s)\,\mathbb{P}^{(n)}(ds) \to \int_S f(s)\,\mathbb{P}(ds)$$
for every bounded and continuous function $f$ on the metric space.

We will use the notation $\mathbb{P}_n \to \mathbb{P}$ weakly or, slightly abusing terminology, $X_n \to X$ weakly. Let $g(\cdot)$ be any continuous function from $S$ into any metric space. A direct consequence of the definition of weak convergence is that $X_n \to X$ weakly implies $g(X_n) \to g(X)$ weakly.
To indicate the reason for our particular interest in this notion of convergence, consider for the moment the diffusion process $X(\cdot)$ where
$$dX(t) = b(X(t))\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = x.$$
This random variable takes values in the space $S = D^d[0, \infty)$ (of course $X(\cdot)$ takes values in the smaller space $C^d[0, \infty)$, but it will become clear shortly why we prefer the larger space). Suppose we are interested in calculating a quantity such as
$$V(x) = \mathbb{E}_x\left[\int_0^T k(X(s))\,ds + g(X(T))\right],$$
where the functions $k(\cdot)$ and $g(\cdot)$ are bounded and continuous. Considered as a function defined on $D^d[0, \infty)$, the mapping $\varphi \to \int_0^T k(\varphi(s))\,ds + g(\varphi(T))$ is bounded and continuous. Suppose also that there are approximations $\xi^h(\cdot)$ to $X(\cdot)$ available for which the analogous functional
$$V^h(x) = \mathbb{E}_x\left[\int_0^T k(\xi^h(s))\,ds + g(\xi^h(T))\right]$$
could be readily computed. Then if the sense in which the processes $\xi^h(\cdot)$ approximate $X(\cdot)$ is actually weak convergence, we can conclude $V^h(x) \to V(x)$. This observation is the motivation for our subsequent investigations.
5.11.3 Basic Theorems of Weak Convergence
Further characterizations of weak convergence are given by the following result, called the portmanteau theorem:

Theorem 5.11.1. Suppose $\{\mathbb{P}^{(n)}, n = 1, 2, \ldots\}$ and $\mathbb{P}$ are probability measures on a metric space. The following are equivalent:
(i) $\mathbb{P}^{(n)}$ converges weakly to $\mathbb{P}$.
(ii) $\limsup \mathbb{P}^{(n)}(F) \leq \mathbb{P}(F)$ for all closed sets $F$.
(iii) $\liminf \mathbb{P}^{(n)}(G) \geq \mathbb{P}(G)$ for all open sets $G$.
(iv) $\lim \mathbb{P}^{(n)}(A) = \mathbb{P}(A)$ for all Borel sets $A$ such that $\mathbb{P}(\partial A) = 0$.

There is a simple condition that ensures that a sequence of probability measures has a weakly convergent subsequence.

Definition 5.11.2. A sequence of probability measures $\mathbb{P}^{(n)}$ on a metric space is tight if for every $\epsilon > 0$ there exists a compact set $K$ such that $\sup_n \mathbb{P}^{(n)}(K^c) \leq \epsilon$.

The most important result here is:

Theorem 5.11.2 (Prohorov). If a sequence of probability measures on a metric space is tight, it is relatively compact in the topology of weak convergence, i.e. it has a weakly convergent subsequence.
This is a direct theorem (Billingsley (1968), Theorem 6.1). The converse - that relative compactness implies tightness - needs conditions on the metric space. Separability and completeness suffice for this (Billingsley (1968), Theorem 6.2).

Example. From Random Walk to Brownian Motion. The basic motivating example for weak convergence theory is the dynamic (or stochastic process) analogue of a static (or distributional) result, the central limit theorem of §2.9. Recall that this says that if we have a random walk $(S_n)$ - a sequence of sums $S_n = \sum_{k=1}^n X_k$ of independent, identically distributed random variables $X_n$ - then if the $X_n$ have mean $\mu$ and variance $\sigma^2$, $S_n$, when centred and scaled to give $(S_n - n\mu)/(\sigma\sqrt{n})$, converges in distribution to the standard normal law, $N(0, 1)$. We can turn the random walk into a stochastic process by defining
$$\xi_n(k/n, \omega) := (S_k(\omega) - k\mu)/(\sigma\sqrt{n}), \qquad k = 0, 1, \ldots, n,$$
and extending the definition of $\xi_n$ from the discrete set of time points $k/n$ to the continuous set of all $t \in [0, 1]$ by making $\xi_n(t, \omega)$ a 'broken line' process:
1. equal to $\xi_n(k/n, \omega)$ above at each time $t = k/n$, $k = 0, 1, \ldots, n$;
2. linear on each subinterval $[(k - 1)/n, k/n]$;
3. continuous on $[0, 1]$.
($\xi_n$ is called the continuous piecewise-linear interpolant of the values $\xi_n(k/n)$ above.) Thus $\xi_n$ has paths in $C = C[0, 1]$, the space of continuous real-valued functions defined on $[0, 1]$. If $\mathbb{P}_n$ is the probability measure on $C$ giving the distribution of $\xi_n$, $\mathbb{P}_n(A) = \mathbb{P}(\omega : \xi_n(\cdot, \omega) \in A)$ for all suitable sets $A \subset C$ (technically: the Borel sets of $C$ under its sup-norm topology), and $\mathbb{P}_\infty$ is the probability measure on $C$ giving the distribution of Brownian motion (the Wiener process $W$ of §5.3), the Erdős-Kac-Donsker invariance principle states that
$$\mathbb{P}_n \to \mathbb{P}_\infty \quad \text{weakly},$$
that is,
$$\mathbb{E}(f(\xi_n)) \to \mathbb{E}(f(W))$$
for each bounded continuous function $f : C \to \mathbb{R}$. This result is due to Erdős and Kac in 1946, Donsker in 1951; for proof see e.g. the classic book Billingsley (1968), Chapter 2, §10, or the more recent monograph Jacod and Shiryaev (1987), VII.3. Much more general versions of the result have been proved; results of this type are known as functional central limit theorems. For background, see e.g. the two books just cited, and Ethier and Kurtz (1986).
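The invariance principle can be illustrated by Monte Carlo for one particular bounded continuous functional. The sketch below takes a simple $\pm 1$ random walk (so $\mu = 0$, $\sigma = 1$) and the functional $f(\varphi) = \min\{\max(\sup_t \varphi(t), 0), c\}$, and compares $\mathbb{E} f(\xi_n)$ with $\mathbb{E} f(W)$, the latter computed from the fact (Exercise 5.5) that the maximum of $W$ over $[0, 1]$ has the law of $|W(1)|$. The function names, the cap $c = 2$ and the sample sizes are illustrative assumptions.

```python
import math
import numpy as np

def capped_max_mean(n, n_paths=5_000, cap=2.0, seed=3):
    """Monte Carlo estimate of E f(xi_n) for f = capped running maximum."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1.0, 1.0], size=(n_paths, n))
    # Values of xi_n at the grid points k/n, k = 1..n; the broken-line
    # interpolant attains its maximum at a grid point (or at t = 0).
    walk = np.cumsum(steps, axis=1) / math.sqrt(n)
    path_max = np.maximum(walk.max(axis=1), 0.0)  # include xi_n(0) = 0
    return float(np.minimum(path_max, cap).mean())

def brownian_limit(cap=2.0):
    """E min(|Z|, cap) for Z ~ N(0, 1), the limit value E f(W)."""
    return (math.sqrt(2.0 / math.pi) * (1.0 - math.exp(-cap**2 / 2.0))
            + cap * math.erfc(cap / math.sqrt(2.0)))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, capped_max_mean(n), brownian_limit())
```

The estimates approach the Brownian value from below as $n$ grows, with a discrepancy of order $1/\sqrt{n}$ plus Monte Carlo noise.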
5.11.4 Weak Convergence Results for Stochastic Integrals
We closely follow Duffie and Protter (1992). The following set-up is fixed for this section. For each $n$, there is a probability space $(\Omega^{(n)}, \mathcal{F}^{(n)}, \mathbb{P}^{(n)})$ and a filtration $\mathbb{F}^{(n)} = (\mathcal{F}^{(n)}_t)_{t \leq T}$ of sub-$\sigma$-fields of $\mathcal{F}^{(n)}$ (satisfying the usual conditions) on which $X^{(n)}$ and $\varphi^{(n)}$ are RCLL adapted processes with values respectively in $\mathbb{R}^d$ and $M^{k \times d}$ (the space of $k \times d$ matrices). (We can, and do, always take an RCLL version of any semi-martingale.) We let $\mathbb{E}^{(n)}$ denote the expectation with respect to $(\Omega^{(n)}, \mathcal{F}^{(n)}, \mathbb{P}^{(n)})$. There is also a probability space and filtration on which the corresponding properties hold for $X$ and $\varphi$, respectively. Moreover, $(X^{(n)}, \varphi^{(n)}) \to (X, \varphi)$ weakly. We assume throughout that $X^{(n)}$ is a semi-martingale for each $n$, which implies the existence of $\int \varphi^{(n)}(s-)\,dX^{(n)}(s)$. The following property of $\{X^{(n)}\}$ is the key to the following discussion.

Definition 5.11.3. A sequence $\{X^{(n)}\}$ of semi-martingales is good if, for any $\{\varphi^{(n)}\}$, the weak convergence of $(X^{(n)}, \varphi^{(n)})$ to $(X, \varphi)$ implies that $X$ is a semi-martingale and also the convergence of $(X^{(n)}, \varphi^{(n)}, \int \varphi^{(n)}(\cdot -)\,dX^{(n)})$ to $(X, \varphi, \int \varphi(\cdot -)\,dX)$.

The following result, showing that 'goodness' is closed under appropriate stochastic integration, is due to Kurtz and Protter (1991), who also provide necessary and sufficient conditions for goodness.

Proposition 5.11.1. If $(X^{(n)})$ is good and $(X^{(n)}, \varphi^{(n)})$ converges weakly, then $(\int \varphi^{(n)}(\cdot -)\,dX^{(n)})$ is also good.
We now state a simple condition for goodness sufficient for our applications.

Condition G. A sequence $\{X^{(n)} = M^{(n)} + A^{(n)}\}$ of semi-martingales satisfies condition (G) if $\{\mathbb{E}^{(n)}(\sup_{t \leq T} |\Delta M^{(n)}(t)|)\}$ and $\{\mathbb{E}^{(n)}(|A^{(n)}|_T)\}$ are bounded. (Here $|\cdot|_T$ denotes variation on $[0, T]$.)

Theorem 5.11.3. If $\{X^{(n)}\}$ satisfies Condition G, then $\{X^{(n)}\}$ is good.
The case of stochastic differential equations of the following form is particularly important:
$$Z^{(n)}(t) = \int_0^t f_n(s, Z^{(n)}(s-))\,dX^{(n)}(s), \qquad Z(t) = \int_0^t f(s, Z(s-))\,dX(s),$$
where $f_n$ and $f$ are continuous functions on $\mathbb{R}_+ \times \mathbb{R}^d$ into $M^{d \times m}$ such that:
(i) $x \to f_n(t, x)$ is Lipschitz (uniformly in $t$), for each $n$;
(ii) $t \to f_n(t, x)$ is left continuous with right limits (LCRL) for each $x$ and $n$;
(iii) for any sequence $\{x_n\}$ of RCLL functions with $x_n \to x$ in the Skorohod topology, $(y_n, x_n)$ converges to $(y, x)$ (Skorohod), where $y_n(s) = f_n(s+, x_n(s))$, $y(s) = f(s+, x(s))$.
The following theorem is proved in Kurtz and Protter (1991).

Theorem 5.11.4. Suppose $\{X^{(n)}\}$ is good, and let $(f_n)_{n \geq 1}$ and $f$ satisfy (i)-(iii). Suppose ( ... solutions of
$$Z^{(n)}(t) = \int_0^t f_n(s, Z^{(n)}(s-))\,dX^{(n)}(s), \qquad Z(t) = \int_0^t f(s, Z(s-))\,dX(s),$$
respectively. Then $(Z^{(n)},$ ...

In order to apply the results to 'discrete-time' trading strategies ...
The following convergence result is sufficient for our purposes.

Lemma 5.11.1. Let $(S^{(n)})$ and $S$ be $D^d[0, T]$-valued on the same probability space, $S$ be continuous, and $S^{(n)} \to S$ weakly. For each $n$, let the random times $\{T^{(n)}_k\}$ define a grid on $[0, T]$ with mesh size converging to 0 almost surely as $n \to \infty$. For some continuous $f : \mathbb{R}^d \times [0, T] \to \mathbb{R}$, let $\varphi^{(n)}(t) = f(S^{(n)}(T^{(n)}_k), T^{(n)}_k)$, $t \in [T^{(n)}_k, T^{(n)}_{k+1})$, and $\varphi(t) = f(S(t), t)$. Then $(\varphi^{(n)}, S^{(n)}) \to (\varphi, S)$ weakly.

The immediate consequence is:

Corollary 5.11.1. Suppose, moreover, that $\{S^{(n)}\}$ is good. Then
$$\int \varphi^{(n)}(t-)\,dS^{(n)}(t) \to \int \varphi(t-)\,dS(t) \quad \text{weakly}.$$
Exercises
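5.1 (Bayes Formula). Assume $\tilde{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$ and $Z$ is its Radon-Nikodým derivative, with density process $Z(t) := \mathbb{E}(Z \mid \mathcal{F}_t)$. If $Y$ is bounded (or $\tilde{\mathbb{P}}$-integrable) and $\mathcal{F}_t$-measurable, then
$$\tilde{\mathbb{E}}(Y \mid \mathcal{F}_s) = \frac{1}{Z(s)}\,\mathbb{E}(Y Z(t) \mid \mathcal{F}_s) \quad \text{a.s.}, \quad s \leq t. \qquad (5.14)$$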
5.2 Let $W$ be a Brownian motion.
1. If $c$ is constant, show that
$$Y(t) := \exp\{cW(t) - \tfrac{1}{2}c^2 t\}$$
is a martingale.
2. Fix $a < 0 < b$ and let $T := \inf\{t \geq 0 : W(t) = a \text{ or } W(t) = b\}$. Show that
$$M(t) := e^{-\frac{1}{2}c^2 t} \cosh\left\{c\left(W(t) - \frac{a + b}{2}\right)\right\}$$
is a martingale.
3. Use the optional sampling theorem to find $\mathbb{E}(\exp\{-c^2 T/2\})$, the Laplace transform of $T$ (which determines the distribution of $T$).
4. Letting $a \to -\infty$, find the distribution of the first-passage time $\tau(b) := \inf\{t : W(t) \geq b\}$.
5. Let $\tau := \{\tau(b) : b > 0\}$ be the resulting first-passage time process of $W$. Use the stationary independent increments property of $W$ to show that $\tau$ has stationary independent increments, so is a Lévy process. (So $\tau$ is an increasing Lévy process - a subordinator - which we shall need in §6.3.3.)
5.3 If $W$ is Brownian motion and $c > 0$,
$$V(t) := e^{-\frac{1}{2}ct}\,W(c e^{ct})$$
is an Ornstein-Uhlenbeck (velocity) process.

5.4 If $W$ is Brownian motion, so $W(0) = 0$, we can define a new process by conditioning on $W(1) = 0$ also.
1. We can define the conditional distributions of all the $W(s)$, $0 < s < 1$, together by extending the argument in Exercise 5.5 above.
2. We can define a stochastic process $Y = \{Y(t) : 0 \leq t \leq 1\}$ with $Y(0) = Y(1) = 0$ whose unconditional distributions are the conditional distributions above. (These distributions satisfy the Daniell-Kolmogorov consistency conditions of §5.1.)
3. Call the resulting process $Y$ the Brownian bridge or tied-down Brownian motion. Show that $Y$ is zero-mean Gaussian with covariance function
$$\mathrm{Cov}(Y(s), Y(t)) = \min\{s, t\} - st \quad (0 \leq s, t \leq 1).$$

5.5 If $W$ is Brownian motion and
$$M(t) := \max\{W(s) : 0 \leq s \leq t\}$$
is its maximum process (not Markovian!), then:
1. $M(t)$ has the same distribution as $|W(t)|$;
2. $M(t)$, $|W(t)|$ have density
$$f(x, t) = \left(\frac{2}{\pi t}\right)^{1/2} \exp\left\{-\frac{x^2}{2t}\right\} \quad (x \geq 0).$$
(Reflect $W$ in the level $x$ and use the strong Markov property of $W$ at the stopping time $\tau(x)$ to start the process afresh; cf. §6.3.4.)

5.6 Let $W(t)$ be a standard Brownian motion and $a(t)$ a bounded, measurable and adapted process, $0 \leq t \leq T < \infty$. Define
$$\xi(t) = \int_0^t a(u)\,dW(u) - \frac{1}{2}\int_0^t a(u)^2\,du$$
and $Z(t) = \exp\{\xi(t)\}$.
1. Show that $Z(t)$ satisfies the stochastic integral equation
$$Z(t) = 1 + \int_0^t Z(u)a(u)\,dW(u).$$
2. Set $Y(t) = 1/Z(t)$. Show that $Y(t)$ satisfies the stochastic differential equation (SDE)
$$dY(t) = Y(t)a^2(t)\,dt - Y(t)a(t)\,dW(t), \qquad Y(0) = 1.$$
3. Consider the SDE
$$dX(t) = a(t)X(t)\,dt + b(t)\,dW(t), \qquad (*)$$
with $a$, $b$ sufficiently regular processes. Show that $X(t) = X_1(t)X_2(t)$ with
$$X_1(t) = \exp\left\{\int_0^t a(u)\,du\right\}, \qquad X_2(t) = X(0) + \int_0^t \exp\left\{-\int_0^s a(u)\,du\right\} b(s)\,dW(s)$$
satisfies $(*)$.

5.7 (Vasicek process, see §8.2). Let $r$ be given by
$$dr(t) = (\alpha - \beta r(t))\,dt + \gamma\,dW(t), \qquad r(0) = r_0,$$
with $\alpha$, $\beta$, $\gamma$, $r_0$ positive constants, $W$ standard Brownian motion.
1. Show that $r(t)$ has the normal distribution with mean and variance
$$\mathbb{E}(r(t)) = \frac{\alpha}{\beta} + \left(r_0 - \frac{\alpha}{\beta}\right)e^{-\beta t}, \qquad \mathrm{Var}(r(t)) = \frac{\gamma^2}{2\beta}\left(1 - e^{-2\beta t}\right).$$
2. Deduce that as $t \to \infty$, $r(t)$ has limiting distribution $N\left(\frac{\alpha}{\beta}, \frac{\gamma^2}{2\beta}\right)$ (the Maxwell-Boltzmann distribution of §5.7). If $r_0 \neq \frac{\alpha}{\beta}$ this shows that $r$ exhibits mean-reversion.
3. Deduce that if $r_0$ has this distribution as its initial distribution, it is strongly stationary (it retains this distribution for all time).

5.8 (Cox-Ingersoll-Ross process, see §8.2). Let $r$ be given by
$$dr(t) = (\alpha - \beta r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t), \qquad r(0) = r_0,$$
with $\alpha$, $\beta$, $\sigma$, $r_0$ positive constants, $W$ standard Brownian motion. ($r$ is an example of a Bessel process; for a general treatment see Revuz and Yor (1991), XI.)
1. Find $\mathbb{E}(r(t))$ and its limiting value as $t \to \infty$. (This is another example of a mean-reverting process.)
2. Use Itô's formula to compute $d(r(t)^2)$ and compute $\mathbb{E}(r(t)^2)$.
5. Stochastic Processes in Continuous Time
5.9
1 . a) Show that if X
=
Poisson(J.t) , ¢(s)
=
then
JE ( e x )
=
e-!'(I - s) .
b) Differentiate once and set s 1 in (i) to conclude that JE( X ) c) Differentiate k times and set s 1 in (i) to conclude that =
=
J.t.
=
JE( X(X 1) · · · (X - k + 1)) -
=
J.t k .
d) Use Var (X) JE( X(X 1)) +JE ( X) JE ( X) 2 to conclude Var( X) J.t. e) Use (i) to show that if X Poisson(J.t) and Y = Poisson(lI)are inde pendent, then X + Y Poisson(J.t + ) 2. Consider a Poisson process N(t) with rate ,x, and suppose that each time an event occurs it is classified either type I or type II event. Suppose further that each event is classified type I event with probability p and a type II event with probability I -p. Let NI (t) aIi'd N2 (t) denote respectively the number of type I and type II events occurring in [0, t] . Show that a) NI (t) and N2 (t) are both Poisson processes having respective rates ,Xp and 'x(1 p) . b ) The two processes are independent. 5 . 1 0 Show that the Levy measure given in §5.5.1. does indeed give the sym metric Cauchy law in the case 1, /3 O. (Use symmetry to show that the relevant integral is a function of l ui only, so we can take u > O. Now note that the integral is real. Differentiate under the integral sign, and use 1000 X-I sin xdx 1'0/2.) -
=
-
=
II .
=
as
-
0: =
=
=
=
6. Mathematical Finance in Continuous Time

This chapter discusses the general principles of continuous-time financial market models. In the first section we use a rather general model, which will serve also as a reference in the later chapters. A thorough discussion of the benchmark multi-dimensional Black-Scholes model is the topic of the second section. We discuss the valuation of several standard and exotic contingent claims in the continuous-time Black-Scholes model in the third section. After examining the relation between continuous-time and discrete-time models we close with a discussion of futures and currency markets.

6.1 Continuous-time Financial Market Models

6.1.1 The Financial Market Model
We start with a general model of a frictionless (compare Chapter 1) security market where investors are allowed to trade continuously up to some fixed finite planning horizon $T$. Uncertainty in the financial market is modelled by a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a filtration $\mathbb{F} = (\mathcal{F}_t)$ ...
Definition 6.1.1. A numeraire is a price process $X(t)$ almost surely strictly positive for each $t \in [0, T]$.
We assume now that $S_0(t)$ is a non-dividend paying asset, which is (almost surely) strictly positive, and use $S_0$ as numeraire. 'Historically' (see Harrison and Pliska (1981)) the money market account $B(t)$, given by $B(t) = e^{r(t)}$ with a positive deterministic process $r(t)$ and $r(0) = 0$, was used as a numeraire, and the reader may think of $S_0(t)$ as being $B(t)$.

Our principal task will be the pricing and hedging of contingent claims, which we model as $\mathcal{F}_T$-measurable random variables. This implies that the contingent claims specify a stochastic cash-flow at time $T$ and that they may depend on the whole path of the underlying in $[0, T]$ - because $\mathcal{F}_T$ contains all that information. We will often have to impose further integrability conditions on the contingent claims under consideration. As we have seen in Chapter 4, the fundamental concept in (arbitrage) pricing and hedging contingent claims is the interplay of self-financing replicating portfolios and risk-neutral probabilities. Although the current setting is on a much higher level of sophistication, the key ideas remain the same.

We call an $\mathbb{R}^{d+1}$-valued predictable, locally bounded process
$$\varphi(t) = (\varphi_0(t), \ldots, \varphi_d(t)), \qquad t \in [0, T],$$
with $\int_0^T \mathbb{E}(\varphi_0(t))\,dt < \infty$, $\sum_{i=0}^d \int_0^T \mathbb{E}(\varphi_i^2(t))\,dt < \infty$, a trading strategy (or dynamic portfolio process). (The conditions ensure that the stochastic integral $\int_0^t \varphi(u)\,dS(u)$ exists; cf. §5.6.) Here $\varphi_i(t)$ denotes the number of shares of asset $i$ held in the portfolio at time $t$ - to be determined on the basis of information available before time $t$; i.e. the investor selects his time $t$ portfolio after observing the prices $S(t-)$. The components $\varphi_i(t)$ may assume negative as well as positive values, reflecting the fact that we allow short sales and assume that the assets are perfectly divisible.
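Definition 6.1.2. (i) The value of the portfolio $\varphi$ at time $t$ is given by the scalar product
$$V_\varphi(t) := \varphi(t) \cdot S(t) = \sum_{i=0}^d \varphi_i(t) S_i(t), \qquad t \in [0, T].$$
The process $V_\varphi(t)$ is called the value process, or wealth process, of the trading strategy $\varphi$.
(ii) The gains process $G_\varphi(t)$ is defined by
$$G_\varphi(t) := \int_0^t \varphi(u)\,dS(u) = \sum_{i=0}^d \int_0^t \varphi_i(u)\,dS_i(u).$$
(iii) A trading strategy $\varphi$ is called self-financing if the wealth process $V_\varphi(t)$ satisfies
$$V_\varphi(t) = V_\varphi(0) + G_\varphi(t) \quad \text{for all } t \in [0, T].$$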
Remark 6.1.1. (i) The financial implications of the above equations are that all changes in the wealth of the portfolio are due to capital gains, as opposed to withdrawals of cash or injections of new funds.
(ii) The definition of a trading strategy includes regularity assumptions in order to ensure the existence of stochastic integrals. In the current rather general set-up these conditions are flexible enough to allow a discussion of the no-arbitrage condition; however, they prove too restrictive for the question of market completeness (see below).

Our objective is to be very flexible with regard to the chosen numeraire. The next results underline this.

Proposition 6.1.1 (Numeraire Invariance Theorem). Self-financing portfolios remain self-financing after a numeraire change.

Proof. We follow Geman, El Karoui, and Rochet (1995), and assume $S_k$, $X$ are continuous (the general case requires additional conditions and is treated in Delbaen and Schachermayer (1995b)). From a financial viewpoint this property is immediately clear. For a mathematical deduction we use the product rule (5.13) to get
$$d\left(\frac{S_k(t)}{X(t)}\right) = S_k(t)\,d\left(\frac{1}{X(t)}\right) + \frac{1}{X(t)}\,dS_k(t) + d\langle S_k, 1/X \rangle(t),$$
where $\langle S_k, 1/X \rangle(t)$ denotes the quadratic covariation between the semi-martingales $S_k$ and $1/X$. In the same manner
$$d\left(\frac{V_\varphi(t)}{X(t)}\right) = V_\varphi(t)\,d\left(\frac{1}{X(t)}\right) + \frac{1}{X(t)}\,dV_\varphi(t) + d\langle V_\varphi, 1/X \rangle(t).$$
The self-financing condition $dV_\varphi(t) = \sum_{k=0}^d \varphi_k(t)\,dS_k(t)$ implies that
$$d\left(\frac{V_\varphi(t)}{X(t)}\right) = \sum_{k=0}^d \varphi_k(t)\left[S_k(t)\,d\left(\frac{1}{X(t)}\right) + \frac{1}{X(t)}\,dS_k(t) + d\langle S_k, 1/X \rangle(t)\right] = \sum_{k=0}^d \varphi_k(t)\,d\left(\frac{S_k(t)}{X(t)}\right).$$
Thus the portfolio expressed in the new numeraire remains self-financing. $\square$
Turning back to the special numeraire So (t) we consider the discounted price process Set)
:=
with Si (t) = Si (t)/ So (t) , i process V
=
Set) So (t)
=
( 1 , S l (t) , . . . Sd (t »
1 , 2, . . , d. .
-
Furthermore, the discounted wealth
232
6. Mathematical Finance in Continuous Time
$$\tilde V_\varphi(t) := \frac{V_\varphi(t)}{S_0(t)} = \varphi_0(t) + \sum_{i=1}^{d} \varphi_i(t) \tilde S_i(t)$$
and the discounted gains process $\tilde G_\varphi(t)$ is
$$\tilde G_\varphi(t) := \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u).$$
Observe that $\tilde G_\varphi(t)$ does not depend on the numeraire component $\varphi_0$. As in Chapter 4, it is convenient to reformulate the self-financing condition in terms of the discounted processes:

Proposition 6.1.2. Let $\varphi$ be a trading strategy. Then $\varphi$ is self-financing if and only if
$$\tilde V_\varphi(t) = \tilde V_\varphi(0) + \tilde G_\varphi(t).$$
Of course, $V_\varphi(t) \ge 0$ if and only if $\tilde V_\varphi(t) \ge 0$.
The proof follows by the numeraire invariance theorem using $S_0$ as numeraire. □

Remark 6.1.2. As in Chapter 4 (Proposition 4.1.3) the above result shows that a self-financing strategy is completely determined by its initial value and the components $\varphi_1, \ldots, \varphi_d$. In other words, any set of predictable processes $\varphi_1, \ldots, \varphi_d$ such that the stochastic integrals $\int \varphi_i\,d\tilde S_i$, $i = 1, \ldots, d$, exist can be uniquely extended to a self-financing strategy $\varphi$ with specified initial value $\tilde V_\varphi(0) = v$ by setting the cash holding as
$$\varphi_0(t) = v + \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u) - \sum_{i=1}^{d} \varphi_i(t) \tilde S_i(t), \quad t \in [0, T].$$
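The construction of the cash holding in Remark 6.1.2 can be illustrated on a discrete time grid. The following is only a sketch under stated assumptions: the stochastic integral is replaced by a left-endpoint sum over grid points, and the function names `cash_holding` and `rebalancing_costs` are hypothetical helpers introduced here, not notation from the text. It checks that rebalancing with the resulting $\varphi_0$ never requires new funds.

```python
def cash_holding(v, phi, s_disc):
    """Discrete analogue of the cash-holding formula of Remark 6.1.2.

    v      : initial discounted value of the strategy
    phi    : phi[k][i] = units of risky asset i held over (t_k, t_{k+1}]
    s_disc : s_disc[k][i] = discounted price of asset i at grid time t_k
    Returns the list phi_0(t_0), ..., phi_0(t_n).
    """
    d = len(s_disc[0])
    gains = 0.0  # left-endpoint approximation of the discounted gains process
    phi0 = []
    for k in range(len(s_disc)):
        if k > 0:
            gains += sum(phi[k - 1][i] * (s_disc[k][i] - s_disc[k - 1][i])
                         for i in range(d))
        risky_value = sum(phi[k][i] * s_disc[k][i] for i in range(d))
        phi0.append(v + gains - risky_value)
    return phi0


def rebalancing_costs(phi0, phi, s_disc):
    """Value after rebalancing minus value before, at each t_k with k >= 1.

    A self-financing strategy has all costs equal to zero.
    """
    d = len(s_disc[0])
    costs = []
    for k in range(1, len(s_disc)):
        before = phi0[k - 1] + sum(phi[k - 1][i] * s_disc[k][i] for i in range(d))
        after = phi0[k] + sum(phi[k][i] * s_disc[k][i] for i in range(d))
        costs.append(after - before)
    return costs
```

Any choice of risky holdings works here, which mirrors the statement that the components $\varphi_1, \ldots, \varphi_d$ and the initial value determine the strategy.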
6.1.2 Equivalent Martingale Measures
As in Chapter 4, we develop a relative pricing theory for contingent claims. Again the underlying concept is the link between the no-arbitrage condition and certain probability measures. We begin with:

Definition 6.1.3. A self-financing trading strategy $\varphi$ is called an arbitrage opportunity if the wealth process $V_\varphi$ satisfies the following set of conditions:
$$V_\varphi(0) = 0, \quad \mathbb{P}(V_\varphi(T) \ge 0) = 1, \quad \text{and} \quad \mathbb{P}(V_\varphi(T) > 0) > 0.$$
Arbitrage opportunities represent the limitless creation of wealth through risk-free profit and thus should not be present in a well-functioning market. Observe that as in Chapter 4 arbitrage opportunities are defined with respect to a certain class of trading strategies. The main tool in investigating arbitrage opportunities is the concept of equivalent martingale measures:
Definition 6.1.4. We say that a probability measure $\mathbb{Q}$ defined on $(\Omega, \mathcal{F})$ is a (strong) equivalent martingale measure if:
(i) $\mathbb{Q}$ is equivalent to $\mathbb{P}$,
(ii) the discounted price process $\tilde S$ is a $\mathbb{Q}$-local martingale (martingale).
We denote the set of martingale measures by $\mathcal{P}$.
A useful criterion in determining whether a given equivalent measure is indeed a martingale measure is the observation that the growth rates relative to the numeraire of all given primary assets under the measure in question must coincide. For example, in the case $S_0(t) = B(t)$ we have:

Lemma 6.1.1. Assume $S_0(t) = B(t) = \exp\{\int_0^t r(u)\,du\}$. Then $\mathbb{Q} \sim \mathbb{P}$ is a martingale measure if and only if every asset price process $S_i$ has price dynamics under $\mathbb{Q}$ of the form
$$dS_i(t) = r(t) S_i(t)\,dt + dM_i(t),$$
where $M_i$ is a $\mathbb{Q}$-local martingale.
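The proof is an application of Itô's formula and is left as an exercise. In order to proceed, we have to impose further restrictions on the set of trading strategies.

Definition 6.1.5. A self-financing trading strategy $\varphi$ is called tame (relative to the numeraire $S_0$) if
$$\tilde V_\varphi(t) \ge 0 \quad \text{for all } t \in [0, T].$$
We write $\Phi$ for the set of tame trading strategies.

We next analyse the value process under equivalent martingale measures for such strategies.

Proposition 6.1.3. For $\varphi \in \Phi$, $\tilde V_\varphi(t)$ is a supermartingale under each $\mathbb{Q} \in \mathcal{P}$.

Proof. Since $\tilde S$ is a $\mathbb{Q}$-local martingale and $\varphi$ is self-financing,
$$\tilde V_\varphi(t) = \tilde V_\varphi(0) + \int_0^t \varphi(u)\,d\tilde S(u)$$
is a local martingale. Now $\tilde V_\varphi(t) \ge 0$, and a non-negative local martingale is a supermartingale. □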
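Theorem 6.1.1. Assume $\mathcal{P} \neq \emptyset$. Then the market model contains no arbitrage opportunities in $\Phi$.

Proof. For any $\varphi \in \Phi$ and under any $\mathbb{Q} \in \mathcal{P}$, $\tilde V_\varphi(t)$ is a supermartingale by Proposition 6.1.3. That is,
$$\mathbb{E}_{\mathbb{Q}}(\tilde V_\varphi(t) \mid \mathcal{F}_u) \le \tilde V_\varphi(u), \quad \text{for all } u \le t \le T.$$
For $\varphi \in \Phi$ to be an arbitrage opportunity we must have $\tilde V_\varphi(0) = V_\varphi(0) = 0$. So
$$\mathbb{E}_{\mathbb{Q}}(\tilde V_\varphi(t)) \le 0, \quad \text{for all } 0 \le t \le T.$$
Now $\varphi$ is tame, so $\tilde V_\varphi(t) \ge 0$, $0 \le t \le T$, implying $\mathbb{E}_{\mathbb{Q}}(\tilde V_\varphi(t)) = 0$, $0 \le t \le T$, and in particular $\mathbb{E}_{\mathbb{Q}}(\tilde V_\varphi(T)) = 0$. But an arbitrage opportunity $\varphi$ also has to satisfy $\mathbb{P}(V_\varphi(T) \ge 0) = 1$, and since $\mathbb{Q} \sim \mathbb{P}$, this means $\mathbb{Q}(V_\varphi(T) \ge 0) = 1$. Both together yield
$$\mathbb{Q}(V_\varphi(T) > 0) = \mathbb{P}(V_\varphi(T) > 0) = 0,$$
and hence the result follows. □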
Having seen that the absence of arbitrage opportunities is implied by the existence of an equivalent martingale measure, a natural question is to ask whether the converse is true, which would yield a fundamental theorem of asset pricing. In the discrete setting of Chapter 4, we could indeed prove a fundamental theorem of asset pricing, but our arguments relied heavily on the discreteness of the setting. In the current context, the restriction on trading strategies introduced by the no-arbitrage condition is not strict enough (as pointed out by Duffie and Huang (1985)) and some further requirements have to be put on the portfolios to establish a version of a fundamental theorem of asset pricing in continuous time. These requirements should of course be economically meaningful, as is the no-arbitrage condition. A lot of effort has been put into solving this interesting and challenging question, and several alternatives have been proposed. To give the reader an impression of developments in this area, we quote a result by Delbaen and Schachermayer (1994) (leaving aside some technical assumptions).

First, we need some additional notation. A simple predictable trading strategy is a predictable process that can be represented as a finite linear combination of stochastic processes of the form $\psi \mathbf{1}_{(\tau_1, \tau_2]}$, where $\tau_1$ and $\tau_2$ are stopping times and $\psi$ is an $\mathcal{F}_{\tau_1}$-measurable random variable. We say that a simple predictable trading strategy is $\delta$-admissible if the relative wealth process $\tilde V_\varphi(t) \ge -\delta$ for every $t \in [0, T]$.
Definition 6.1.6. A price process $S$ satisfies NFLVR (no free lunch with vanishing risk) if for any sequence $(\varphi_n)$ of simple trading strategies such that $\varphi_n$ is $\delta_n$-admissible and the sequence $\delta_n$ tends to zero, we have $\tilde V_{\varphi_n}(T) \to 0$ in probability as $n \to \infty$.
The following fundamental theorem of asset pricing is proved in Delbaen and Schachermayer (1994).

Theorem 6.1.2 (Fundamental Theorem of Asset Pricing). For a financial market model with bounded prices, there exists an equivalent martingale measure if and only if the condition NFLVR holds.
One can relax boundedness here to local boundedness, and then NFLVR is equivalent to existence of an equivalent local martingale measure. With price processes unrestricted beyond being semimartingales, one needs a new concept, that of sigma-martingales. Then NFLVR is equivalent to existence of an equivalent sigma-martingale measure (Delbaen and Schachermayer (1998), Th. 1.1). For a recent survey of the complications lurking in the background here, see Kabanov (2001).

6.1.3 Risk-neutral Pricing
We now assume that there exists a strong equivalent martingale measure $\mathbb{P}^*$ (which implies that there are no arbitrage opportunities with respect to $\Phi$ in the financial market model $\mathcal{M}$). Until further notice we use $\mathbb{P}^*$ as our reference measure, and when using the terms martingale and local martingale, we always assume that the underlying probability measure is $\mathbb{P}^*$. In particular, we restrict our attention to contingent claims $X$ such that $X/S_0(T) \in L^1(\mathcal{F}, \mathbb{P}^*)$. We now define a further subclass of trading strategies:

Definition 6.1.7. A self-financing trading strategy $\varphi$ is called ($\mathbb{P}^*$-)admissible if the relative gains process
$$\tilde G_\varphi(t) = \int_0^t \varphi(u)\,d\tilde S(u)$$
is a ($\mathbb{P}^*$-)martingale. The class of all ($\mathbb{P}^*$-)admissible trading strategies is denoted $\Phi(\mathbb{P}^*)$.

By definition $\tilde S$ is a martingale, and $\tilde G$ is the stochastic integral with respect to $\tilde S$. By Remark 6.1.2 we see that any sufficiently integrable processes $\varphi_1, \ldots, \varphi_d$ give rise to $\mathbb{P}^*$-admissible trading strategies. We do not assume that admissible trading strategies are tame. However, since we only used the supermartingale property of $\tilde G_\varphi$ in the proof of Theorem 6.1.1, we can repeat the above argument to obtain
Theorem 6.1.3. The financial market model $\mathcal{M}$ contains no arbitrage opportunities in $\Phi(\mathbb{P}^*)$.
Under the assumption that no arbitrage opportunities exist, the question of pricing and hedging a contingent claim reduces to the existence of replicating self-financing trading strategies (compare Chapter 4 and §1.4). Formally:

Definition 6.1.8. (i) A contingent claim $X$ is called attainable if there exists at least one admissible trading strategy such that
$$V_\varphi(T) = X.$$
We call such a trading strategy $\varphi$ a replicating strategy for $X$.
(ii) The financial market model $\mathcal{M}$ is said to be complete if any contingent claim is attainable.
Again, we emphasise that this depends on the class of trading strategies. On the other hand, it does not depend on the numeraire: it is an easy exercise in the continuous asset-price process case to show that if a contingent claim is attainable in a given numeraire it is also attainable in any other numeraire and the replicating strategies are the same (the general case is more involved, see Delbaen and Schachermayer (1995b)). If a contingent claim $X$ is attainable, $X$ can be replicated by a portfolio $\varphi \in \Phi(\mathbb{P}^*)$, so that in the absence of arbitrage its price process $\Pi_X(t)$ must satisfy
$$\Pi_X(t) = V_\varphi(t).$$
Of course the questions arise of what will happen if $X$ can be replicated by more than one portfolio, and what the relation of the price process to the equivalent martingale measure(s) is. The following central theorem is the key to answering these questions:

Theorem 6.1.4 (Risk-neutral Valuation Formula). The arbitrage price process of any attainable claim is given by the risk-neutral valuation formula
$$\Pi_X(t) = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right]. \tag{6.1}$$
The uniqueness question is immediate from the above theorem:

Corollary 6.1.1. For any two replicating portfolios $\varphi, \psi \in \Phi(\mathbb{P}^*)$ we have
$$V_\varphi(t) = V_\psi(t).$$
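Proof of Theorem 6.1.4. Since $X$ is attainable, there exists a replicating strategy $\varphi \in \Phi(\mathbb{P}^*)$ such that $V_\varphi(T) = X$ and $\Pi_X(t) = V_\varphi(t)$. Since $\varphi \in \Phi(\mathbb{P}^*)$ the discounted value process $\tilde V_\varphi(t)$ is a martingale, and hence
$$\Pi_X(t) = V_\varphi(t) = S_0(t) \tilde V_\varphi(t) = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \tilde V_\varphi(T) \mid \mathcal{F}_t \right] = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{V_\varphi(T)}{S_0(T)} \,\Big|\, \mathcal{F}_t \right] = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right].$$
□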
Remark 6.1.3. The above argument addresses only the case of different replicating portfolios for $X$ in $\Phi(\mathbb{P}^*)$. Of course the choice of $\mathbb{P}^*$ was itself completely arbitrary, and we could have several, indeed many, equivalent martingale measures $\mathbb{Q}$. This makes the definition of admissible strategies with respect to different martingale measures somewhat unsatisfactory. However, if we combine our previous two definitions and call a self-financing trading strategy $\varphi$ admissible if $\varphi$ is tame and $\mathbb{Q}$-admissible for some $\mathbb{Q} \in \mathcal{P}$, and $X$ attainable if there exists an admissible $\varphi$ replicating it, we can show that for admissible $\varphi_1 \in \Phi(\mathbb{Q}_1)$ and $\varphi_2 \in \Phi(\mathbb{Q}_2)$ replicating $X$ we have
$$\mathbb{E}_{\mathbb{Q}_1}\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right] = \mathbb{E}_{\mathbb{Q}_2}\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right],$$
implying that the arbitrage price process is unique. More generally, one can even show that
IIx(O) qsupEP So (O) IEq [ X(T) 1Ft] =
s'
0
where $B(X)$ is the class of all admissible (in the wider sense) trading strategies replicating $X$.

Remark 6.1.4. Existence of an equivalent martingale measure is closely related to properties of the stochastic (Doléans) exponential $\mathcal{E}(U)$ of §5.10. Under suitable conditions, a probability measure $\mathbb{Q}$ is equivalent to the underlying probability measure $\mathbb{P}$ if and only if it is of the form $\mathbb{Q}^U$ for a suitable local martingale $U$, where
$$\frac{d\mathbb{Q}^U}{d\mathbb{P}} = \mathcal{E}_T(U) \quad \mathbb{P}\text{-a.s.}$$
Then if $\eta_t := \mathbb{E}(d\mathbb{Q}/d\mathbb{P} \mid \mathcal{F}_t)$, $t \in [0, T]$, one has $U_t = \int_0^t \eta_{u-}^{-1}\,d\eta_u$, $t \in [0, T]$. For details and references see e.g. Jacod and Shiryaev (1987), III.3, Musiela and Rutkowski (1997), §10.1.4, Protter (2004), III.6, Revuz and Yor (1991), VIII.1.
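For pure pricing purposes it is sufficient to find a strong equivalent martingale measure, but for risk-management purposes it is more important to find the replicating portfolio. We have the following useful result, which shows that hedging is equivalent to the existence of a stochastic integral representation of the discounted claim.

Lemma 6.1.2. Assume that the discounted contingent claim $X/S_0(T)$ is $\mathbb{P}^*$-integrable. If the ($\mathbb{P}^*$-)martingale $M$ defined by
$$M(t) := \mathbb{E}_{\mathbb{P}^*}\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right]$$
admits an integral representation of the form
$$M(t) = x + \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u),$$
with $\varphi_1, \ldots, \varphi_d$ predictable and locally bounded, then $X$ is attainable.

Proof. We are looking for a trading strategy $\varphi \in \Phi(\mathbb{P}^*)$ such that $V_\varphi(T) = X$. We use the components $\varphi_1, \ldots, \varphi_d$ from the integral representation of $M$ and then define
$$\varphi_0(t) = x + \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u) - \sum_{i=1}^{d} \varphi_i(t) \tilde S_i(t).$$
By Remark 6.1.2 this gives a self-financing strategy with initial value $v = x$ and
$$\tilde V_\varphi(t) = v + \tilde G_\varphi(t) = x + \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u) = M(t).$$
By definition $M$ is a non-negative martingale, so $\varphi \in \Phi(\mathbb{P}^*)$ follows, and also by definition
$$V_\varphi(T) = S_0(T) \tilde V_\varphi(T) = S_0(T) M(T) = S_0(T) \frac{X}{S_0(T)} = X.$$
□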
We thus see that (given some integrability conditions) attainability, and hence completeness, is closely linked to martingale representation theorems. Statements and proofs of the martingale representation property require rather technical definitions involving the fine structure of the filtration $\mathbb{F}$. Moreover, in a general set-up the conditions on the process $\varphi$ (the integrand) no longer guarantee that $\varphi$ may be used as a suitable trading strategy. We refer the reader to Jacod and Shiryaev (1987), III.4 and Revuz and Yor (1991), V, for an exhaustive treatment.

We close by stating the following partial analogue of the completeness theorem in the discrete setting, Theorem 4.3.1 (see Harrison and Pliska (1981), §3.4):
Theorem 6.1.5. If the strong martingale measure $\mathbb{P}^*$ is the unique martingale measure for the financial market model $\mathcal{M}$, then $\mathcal{M}$ is complete, in the restricted sense that every contingent claim $X$ satisfying
$$\frac{X}{S_0(T)} \in L^1(\mathcal{F}, \mathbb{P}^*)$$
is attainable.

6.1.4 Changes of Numeraire
We now develop the technique of change of numeraire, which will be extremely useful in the valuation of contingent claims, and also for deduction of theoretical results (we follow Geman, El Karoui, and Rochet (1995) in this section). We put ourselves in the following framework. Assume that we used $S_0(t)$ as numeraire, and that there exists a (strong) equivalent martingale measure $\mathbb{P}^*$, i.e. the basic security prices discounted with respect to $S_0$ are $\mathbb{P}^*$-martingales.

Theorem 6.1.6. Let $X(t)$ be a non-dividend-paying numeraire such that $X(t)/S_0(t)$ is a $\mathbb{P}^*$-martingale. Then there exists a probability measure $\mathbb{Q}_X$, defined by its Radon-Nikodym derivative $\eta(T)$ with respect to $\mathbb{P}^*$,
$$\eta(t) = \frac{d\mathbb{Q}_X}{d\mathbb{P}^*}\Big|_{\mathcal{F}_t} = \frac{X(t)}{X(0) S_0(t)},$$
such that:
(i) the basic security prices discounted with respect to $X$ are $\mathbb{Q}_X$-martingales;
(ii) if a contingent claim $Y$ is attainable under $(S_0(t), \mathbb{P}^*)$, then it is attainable under $(X(t), \mathbb{Q}_X)$, and the replicating portfolio is the same. So the arbitrage price processes, given by the risk-neutral valuation formula, coincide.

Proof. (i) Denote by $\hat Z(t) = Z(t)/X(t)$ the relative price of a security $Z$ (it may be one of the components $S_i$) with respect to the numeraire $X(t)$. Since $\eta(t) = X(t)/(X(0) S_0(t))$ is a $\mathbb{P}^*$-martingale we can use the Bayes formula (5.14) (see Exercise 5.1) and the $\mathbb{P}^*$-martingale property of $Z(t)/S_0(t)$ to get for $0 \le u \le t$
$$\mathbb{E}_{\mathbb{Q}_X}\left[ \hat Z(t) \mid \mathcal{F}_u \right] = \frac{\mathbb{E}_{\mathbb{P}^*}\left[ \hat Z(t) \eta(t) \mid \mathcal{F}_u \right]}{\eta(u)} = \frac{X(0) S_0(u)}{X(u)}\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{Z(t)}{X(t)} \frac{X(t)}{X(0) S_0(t)} \,\Big|\, \mathcal{F}_u \right] = \frac{S_0(u)}{X(u)} \frac{Z(u)}{S_0(u)} = \frac{Z(u)}{X(u)} = \hat Z(u).$$
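So the basic securities are martingales, and hence so is any portfolio.
(ii) If $Y$ is attainable under $\mathbb{P}^*$, then there exists a $\mathbb{P}^*$-admissible portfolio $\varphi$ such that $V_\varphi(T) = Y$, and by the risk-neutral valuation formula
$$V_\varphi(t) = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{Y}{S_0(T)} \,\Big|\, \mathcal{F}_t \right].$$
Using again the Bayes formula (5.14) we obtain
$$\mathbb{E}_{\mathbb{Q}_X}\left[ \frac{Y}{X(T)} \,\Big|\, \mathcal{F}_t \right] = \frac{1}{\eta(t)}\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{Y}{X(T)} \eta(T) \,\Big|\, \mathcal{F}_t \right] = \frac{X(0) S_0(t)}{X(t)}\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{Y}{X(T)} \frac{X(T)}{X(0) S_0(T)} \,\Big|\, \mathcal{F}_t \right] = \frac{V_\varphi(t)}{X(t)}.$$
Since self-financing portfolios remain self-financing under a change of numeraire (by the numeraire invariance theorem), and as by part (i) $V_\varphi(t)/X(t)$ is a $\mathbb{Q}_X$-martingale, it follows that $\varphi$ is $\mathbb{Q}_X$-admissible and the result follows. □

Of course the above is true for a general change of numeraire:

Corollary 6.1.2. Let $X(t)$, $Y(t)$ be numeraires satisfying the assumptions of the above theorem and $Z$ a contingent claim. Then we have the change-of-numeraire formula
$$X(t)\,\mathbb{E}_{\mathbb{Q}_X}\left[ \frac{Z}{X(T)} \,\Big|\, \mathcal{F}_t \right] = Y(t)\,\mathbb{E}_{\mathbb{Q}_Y}\left[ \frac{Z}{Y(T)} \,\Big|\, \mathcal{F}_t \right]. \tag{6.2}$$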
Proof. We can use the second part of the above theorem, or use the Radon-Nikodym derivative
$$\frac{d\mathbb{Q}_X}{d\mathbb{Q}_Y}\Big|_{\mathcal{F}_T} = \frac{d\mathbb{Q}_X / d\mathbb{P}^*}{d\mathbb{Q}_Y / d\mathbb{P}^*} = \frac{X(T)}{X(0) S_0(T)} \Big/ \frac{Y(T)}{Y(0) S_0(T)} = \frac{X(T)/Y(T)}{X(0)/Y(0)}$$
and repeat the argument. □

Examples. (i) The money market account as numeraire. Assuming that a riskless asset (e.g. bank account) exists, it is natural to take it as a numeraire. More precisely, define $B(t)$ as the value at date $t$ of a fund created by investing one monetary unit at time $t = 0$ and continuously reinvested at the (instantaneously riskless) instantaneous interest rate $r(t)$. Under technical assumptions (see Chapter 8, where we discuss fixed-income securities in detail)
$$B(t) = \exp\left\{ \int_0^t r(u)\,du \right\}.$$
The discounted price process of a security with respect to the numeraire $B(t)$ is simply
$$\tilde S(t) = \exp\left\{ -\int_0^t r(u)\,du \right\} S(t).$$
'Historically' (see Harrison and Pliska (1981)) $B(t)$ was used as the numeraire $S_0(t)$ and then $\mathbb{Q}_B = \mathbb{P}^*$.

(ii) Zero-coupon bonds as numeraire. A zero-coupon bond is the natural choice of numeraire if one looks at the time $t$ price of an asset giving the right to a single cash-flow at a well-defined future time $T$. The simplest such asset is a zero-coupon bond with cash-flow 1 at time $T$. We denote its time $t$ price by $p(t, T)$ (again see Chapter 8 for further discussion). We assume that the money-market account $B(t)$ (as defined above) was used to define $\mathbb{P}^*$. Since $p(T, T) = 1$, the risk-neutral valuation formula gives
$$p(t, T) = B(t)\,\mathbb{E}_{\mathbb{P}^*}\left[ \frac{1}{B(T)} \,\Big|\, \mathcal{F}_t \right].$$
So $p(t, T)/B(t)$ is a $\mathbb{P}^*$-martingale, and by Theorem 6.1.6
$$\frac{d\mathbb{Q}_T}{d\mathbb{P}^*} = \frac{1}{p(0, T) B(T)} = \frac{1}{p(0, T)} \exp\left\{ -\int_0^T r(u)\,du \right\}.$$
The relative price process of a basic asset $Z$ with respect to $p(t, T)$, i.e. $\hat Z(t) = Z(t)/p(t, T)$, is called the forward price $F_Z(t)$ of the security $Z$, and is given by
$$F_Z(t) = \mathbb{E}_{\mathbb{Q}_T}\left[ Z(T) \mid \mathcal{F}_t \right],$$
implying that $F_Z$ is a $\mathbb{Q}_T$-martingale. To rephrase this: the forward price, relative to time $T$, of a security which pays no dividends up to time $T$, is equal to the expectation of the value at time $T$ of this security under the 'forward risk-neutral' martingale measure $\mathbb{Q}_T$.

(iii) The numeraire portfolio. A natural idea is to introduce a numeraire such that one can work with the historical probability $\mathbb{P}$. In general, this cannot be achieved by choosing a single asset as numeraire. However, it is known that for continuous asset price processes there exists a self-financing, predictable and positive portfolio
Proposition 6.1.4. Denote by $T$ and $K$ respectively the maturity and exercise price of a European call option on an asset $Z$. The time $t = 0$ price $C(0)$ of the call can be written as
$$C(0) = p(0, T)\,\mathbb{E}_{\mathbb{Q}_T}\left[ (Z(T) - K)^+ \right],$$
or
$$C(0) = Z(0)\,\mathbb{Q}_Z(A) - K p(0, T)\,\mathbb{Q}_T(A),$$
where $A = \{\omega : Z(T, \omega) > K\}$.

Proof. The first expression is immediate from Theorem 6.1.6. To derive the second write
$$\frac{C(0)}{p(0, T)} = \mathbb{E}_{\mathbb{Q}_T}\left[ (Z(T) - K)^+ \right] = \mathbb{E}_{\mathbb{Q}_T}\left[ Z(T) \mathbf{1}_A \right] - K\,\mathbb{Q}_T(A).$$
Using $Z$ as a numeraire in the middle expectation, we have by the change-of-numeraire formula (6.2)
$$p(0, T)\,\mathbb{E}_{\mathbb{Q}_T}\left[ \frac{Z(T)}{p(T, T)} \mathbf{1}_A \right] = Z(0)\,\mathbb{E}_{\mathbb{Q}_Z}\left[ \mathbf{1}_A \right],$$
which gives the second expression. □
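The two expressions for $C(0)$ in Proposition 6.1.4 can be compared numerically in a special case. The sketch below assumes a constant interest rate `r` and constant volatility `sigma` (the classical Black-Scholes setting, which the proposition itself does not require), so that $p(0, T) = e^{-rT}$ and both exercise probabilities are normal distribution values. All function names are hypothetical helpers introduced for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exercise_probabilities(z0, k, r, sigma, t):
    """Return (Q_Z(A), Q_T(A)) for A = {Z(T) > K}, assuming constant r and sigma."""
    d2 = (math.log(z0 / k) + (r - 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d1 = d2 + sigma * math.sqrt(t)  # the drift shifts by sigma^2 under Q_Z
    return norm_cdf(d1), norm_cdf(d2)


def call_price_two_measures(z0, k, r, sigma, t):
    """Second expression: C(0) = Z(0) Q_Z(A) - K p(0,T) Q_T(A)."""
    q_z, q_t = exercise_probabilities(z0, k, r, sigma, t)
    return z0 * q_z - k * math.exp(-r * t) * q_t


def call_price_forward_measure(z0, k, r, sigma, t, n=20000, width=10.0):
    """First expression: p(0,T) E_{Q_T}[(Z(T) - K)^+], by midpoint quadrature
    over a standard normal variable x, with Z(T) lognormal under Q_T."""
    h = 2.0 * width / n
    total = 0.0
    for j in range(n):
        x = -width + (j + 0.5) * h
        z_t = z0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * x)
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += max(z_t - k, 0.0) * density * h
    return math.exp(-r * t) * total
```

With constant coefficients the forward measure $\mathbb{Q}_T$ coincides with $\mathbb{P}^*$, which is why a single lognormal law suffices in the quadrature.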
Corollary 6.1.3 (Exchange Option). Consider the option to exchange asset 2 against asset 1 at time $T$, which gives one the right to the cash flow $(Z_1(T) - K Z_2(T))^+$ (with $K = 1$). Using $Z_2$ as a numeraire, its price $C_{ex}(0)$ is such that
$$\frac{C_{ex}(0)}{Z_2(0)} = \mathbb{E}_{\mathbb{Q}_{Z_2}}\left[ \left( \frac{Z_1(T)}{Z_2(T)} - K \right)^+ \right],$$
or
$$C_{ex}(0) = Z_1(0)\,\mathbb{Q}_{Z_1}(A) - K Z_2(0)\,\mathbb{Q}_{Z_2}(A),$$
where $A = \{\omega : Z_1(T, \omega) > K Z_2(T, \omega)\}$.

In the following we will frequently use this technique.
6.2 The Generalized Black-Scholes Model
6.2.1 The Model
We consider a financial market $\mathcal{M}_{BS}$ in which $d + 1$ assets are traded continuously. The first of these is an asset without systematic risk, called the bank
account (or sometimes the bond), with price process defined as in §6.1.4, i.e. its dynamics are given by
$$dB(t) = B(t) r(t)\,dt, \quad B(0) = 1. \tag{6.3}$$
The remaining $d$ assets (usually referred to as stocks) are subject to systematic risk. The price process $S_i(t)$, $0 \le t \le T$, of the stock $1 \le i \le d$ is modelled by the linear stochastic differential equation
$$dS_i(t) = S_i(t) \left( b_i(t)\,dt + \sum_{j=1}^{n} \sigma_{ij}(t)\,dW_j(t) \right), \quad S_i(0) = p_i \in (0, \infty); \tag{6.4}$$
here $W(t) = (W_1(t), \ldots, W_n(t))'$, $0 \le t \le T$, is a standard $n$-dimensional Brownian motion, defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$. We assume that the underlying filtration $\mathbb{F} = \{\mathcal{F}_t, 0 \le t \le T\}$ is the Brownian filtration (basically $\mathcal{F}_t = \sigma(W(s), 0 \le s \le t)$, slightly enlarged to satisfy the usual conditions, compare §5.7).

We interpret the process $\{r(t), 0 \le t \le T\}$ as the interest-rate process, also called the short rate or the spot rate; i.e. $r(t)$ is the (instantaneously riskless) instantaneous interest rate. The vector process $\{b(t)' = (b_1(t), \ldots, b_d(t)), 0 \le t \le T\}$ is the vector of appreciation rates for the stocks, measuring the instantaneous rate of change of $S$ at time $t$. Finally the matrix process of volatilities $\{\sigma(t) = (\sigma_{ij}(t)), 1 \le i \le d, 1 \le j \le n, 0 \le t \le T\}$ models the instantaneous intensity with which the $j$th source of uncertainty influences the price of the $i$th stock at time $t$. The processes $r(t)$, $b(t)$, $\sigma(t)$ are referred to as the coefficients of the model $\mathcal{M}_{BS}$. We assume that these coefficients are progressively measurable with respect to $\mathbb{F}$ and satisfy the integrability condition
$$\int_0^T \left( |r(t)| + \|b(t)\| + \|\sigma(t)\|^2 \right) dt < \infty \quad \text{a.s.}$$
Remark 6.2.1. (i) The class of progressively measurable processes is a slight enlargement of the class of adapted processes with (right- or left-) continuous paths, which we will mainly encounter.
(ii) $r(t)$, $b(t)$, $\sigma(t)$ adapted to $\mathbb{F}$ implies that anticipation of the future is precluded and allows for dependence of the stock prices on their past. In the special case when the coefficients are deterministic the stock price process $S$ becomes a Markov process, and in fact a vector diffusion process.
(iii) The case $d = n = 1$ and constant coefficients is the classical Black-Scholes world (as in Black and Scholes (1973)).

The first question is of course to ask whether the model $\mathcal{M}_{BS}$ is free of arbitrage (which we use here meaning with respect to tame trading strategies). As we have seen in §6.1.2, we can preclude arbitrage opportunities if we show the existence of an equivalent martingale measure. To get a feel of how such a measure can be found we use the classical Black-Scholes world.
Example. Black-Scholes model (Martingale Measures). The model now is
$$dB(t) = r B(t)\,dt, \quad B(0) = 1,$$
$$dS(t) = S(t) \left( b\,dt + \sigma\,dW(t) \right), \quad S(0) = p \in (0, \infty),$$
with constant coefficients $b \in \mathbb{R}$, $r, \sigma \in \mathbb{R}_+$. We write as usual $\tilde S(t) = S(t)/B(t)$ for the discounted stock price process (with the bank account being the natural numeraire), and get from Itô's formula
$$d\tilde S(t) = \tilde S(t) \left\{ (b - r)\,dt + \sigma\,dW(t) \right\}.$$
Because we use the Brownian filtration any pair of equivalent probability measures $\mathbb{P} \sim \mathbb{Q}$ on $\mathcal{F}_T$ is a Girsanov pair (§5.7), i.e.
$$\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t)$$
with
$$L(t) = \exp\left\{ -\int_0^t \gamma(s)\,dW(s) - \frac{1}{2} \int_0^t \gamma(s)^2\,ds \right\},$$
and $(\gamma(t) : 0 \le t \le T)$ a measurable, adapted $d$-dimensional process with $\int_0^T \gamma(t)^2\,dt < \infty$ a.s. By Girsanov's Theorem 5.7.1 we have
$$dW(t) = d\tilde W(t) - \gamma(t)\,dt,$$
where $\tilde W$ is a $\mathbb{Q}$-Wiener process. Thus the $\mathbb{Q}$-dynamics for $\tilde S$ are
$$d\tilde S(t) = \tilde S(t) \left\{ (b - r - \sigma \gamma(t))\,dt + \sigma\,d\tilde W(t) \right\}.$$
Since $\tilde S$ has to be a local martingale under $\mathbb{Q}$ we must have
$$b - r - \sigma \gamma(t) = 0, \quad t \in [0, T],$$
and so we must choose
$$\gamma(t) \equiv \gamma = \frac{b - r}{\sigma}$$
(the 'market price of risk'). Indeed, this argument leads to a unique martingale measure, and we will make use of this fact later on. Using the product rule (5.13), we find the $\mathbb{Q}$-dynamics of $S$ as
$$dS(t) = S(t) \left\{ r\,dt + \sigma\,d\tilde W(t) \right\}.$$
We see that the appreciation rate $b$ is replaced by the interest rate $r$, hence the terminology risk-neutral (or yield-equating) martingale measure.

We now generalise the arguments used in the above example.
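As a numerical sanity check of the example, one can verify that weighting the discounted stock price by the Girsanov density with $\gamma = (b - r)/\sigma$ removes the drift, so that $\mathbb{E}_{\mathbb{P}}[L(T) \tilde S(T)] = S(0)$. The sketch below assumes constant coefficients and uses simple midpoint quadrature over $W(T) = \sqrt{T}\,x$ with $x$ standard normal; the function names and grid parameters are illustrative choices, not part of the text.

```python
import math


def market_price_of_risk(b, r, sigma):
    """gamma = (b - r) / sigma for the one-dimensional model."""
    return (b - r) / sigma


def expected_discounted_price(s0, b, r, sigma, t, use_density=True,
                              n=20000, width=10.0):
    """Midpoint quadrature for E_P[L(T) * S~(T)] (or E_P[S~(T)] if
    use_density is False), where W(T) = sqrt(T) * x under P."""
    gamma = market_price_of_risk(b, r, sigma)
    h = 2.0 * width / n
    total = 0.0
    for j in range(n):
        x = -width + (j + 0.5) * h
        w = math.sqrt(t) * x
        normal_density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        # Discounted price: S~(T) = S(0) exp((b - r - sigma^2/2) T + sigma W(T)).
        s_disc = s0 * math.exp((b - r - 0.5 * sigma ** 2) * t + sigma * w)
        # Girsanov density: L(T) = exp(-gamma W(T) - gamma^2 T / 2).
        weight = math.exp(-gamma * w - 0.5 * gamma ** 2 * t) if use_density else 1.0
        total += weight * s_disc * normal_density * h
    return total
```

Without the density the expectation grows like $S(0) e^{(b - r)T}$, which is the drift that the change of measure is designed to remove.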
Theorem 6.2.1. (i) If $\mathcal{M}_{BS}$ is arbitrage-free (with respect to tame strategies), then there exists a progressively measurable process $\gamma : [0, T] \times \Omega \to \mathbb{R}^d$, called the market price of risk (or relative risk), such that
$$b(t) - r(t) \mathbf{1}_d = \sigma(t) \gamma(t), \quad 0 \le t \le T \quad \text{a.s.} \tag{6.5}$$
($\mathbf{1}_d$ denotes the $d$-dimensional vector with all entries 1).
(ii) Conversely, if such a process $\gamma(\cdot)$ exists and satisfies, in addition to the above requirements,
$$\int_0^T \|\gamma(t)\|^2\,dt < \infty \quad \text{a.s.} \tag{6.6}$$
and
$$\mathbb{E}\left[ \exp\left\{ -\int_0^T \gamma(t)'\,dW(t) - \frac{1}{2} \int_0^T \|\gamma(t)\|^2\,dt \right\} \right] = 1, \tag{6.7}$$
then $\mathcal{M}_{BS}$ is arbitrage-free.
Proof. (Compare Karatzas (1996).) We give a sketch of the proof of (i), leaving out the measure-theoretic arguments. Let $\varphi(t)$ be a tame trading strategy. Suppose there exists a set $A \subset [0, T] \times \Omega$ with positive product measure $(\lambda \otimes \mathbb{P})(A) > 0$, where $\lambda$ denotes Lebesgue measure, such that for all $t \in [0, T]$ we have
$$\sigma(t)' \varphi(t) = 0 \ \text{('no exposure to risk')}, \quad \varphi(t)'(b(t) - r(t) \mathbf{1}_d) \neq 0 \ \text{('non-zero rate')}, \quad \text{on } A.$$
For an arbitrary constant $c > 0$, construct a new trading strategy
$$\psi(t) = \begin{cases} c \cdot \mathrm{sgn}\left( \varphi(t)'(b(t) - r(t) \mathbf{1}_d) \right) \varphi(t) & \text{on } A, \\ 0 & \text{otherwise.} \end{cases}$$
For the discounted value process of the trading strategy $\psi$ we have the following dynamics:
$$d\tilde V_\psi(t) = \sum_{i=1}^{d} \psi_i(t)\,d\tilde S_i(t) = \sum_{i=1}^{d} \psi_i(t) \tilde S_i(t) \left\{ (b_i(t) - r(t))\,dt + \sum_{j=1}^{n} \sigma_{ij}(t)\,dW_j(t) \right\} = \sum_{i=1}^{d} \tilde S_i(t)\,c\,|\varphi_i(t)(b_i(t) - r(t))|\,dt.$$
So $\tilde V_\psi(t) \ge 0$, $0 \le t \le T$, hence $\psi$ is tame. In particular, $\tilde V_\psi(T) > 0$ on a set $B \in \mathcal{F}_T$ with $\mathbb{P}(B) > 0$. This means $\psi$ gives rise to an arbitrage opportunity.
To rule out such an arbitrage opportunity we must have that every vector in the kernel of $\sigma'(t, \omega)$ must be orthogonal to $b(t, \omega) - r(t, \omega) \mathbf{1}_d$ for a.e. $(t, \omega)$. Thus $b(t, \omega) - r(t, \omega) \mathbf{1}_d$ should belong to $(\mathrm{kernel}(\sigma'(t, \omega)))^\perp = \mathrm{range}(\sigma(t, \omega))$, which is precisely the above condition. It then can be shown that $\gamma(\cdot)$ can be selected in Condition (6.5) to be progressively measurable.

To prove (ii), recall that by Theorem 6.1.1 the existence of an equivalent martingale measure rules out arbitrage. The above Conditions (6.6) and (6.7) ensure that the exponential process
$$L(t) = \exp\left\{ -\int_0^t \gamma(u)'\,dW(u) - \frac{1}{2} \int_0^t \|\gamma(u)\|^2\,du \right\}, \quad 0 \le t \le T \tag{6.8}$$
is a martingale, and thus
$$\mathbb{P}^*(A) := \mathbb{E}(L(T) \mathbf{1}_A), \quad A \in \mathcal{F}_T \tag{6.9}$$
defines an equivalent probability measure $\mathbb{P}^*$ with Radon-Nikodym derivative
$$\frac{d\mathbb{P}^*}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t), \quad 0 \le t \le T.$$
(Occasionally, we shall call $L$ the Girsanov density.) $\mathbb{P}^*$ is the risk-neutral equivalent martingale measure. Under $\mathbb{P}^*$,
$$\tilde W(t) := W(t) + \int_0^t \gamma(s)\,ds, \quad 0 \le t \le T$$
is Brownian motion. The stock-price process dynamics under $\mathbb{P}^*$ are
$$dS_i(t) = S_i(t) \left( r(t)\,dt + \sum_{j=1}^{n} \sigma_{ij}(t)\,d\tilde W_j(t) \right), \quad i = 1, \ldots, d,$$
or equivalently for the discounted price processes,
$$d\tilde S_i(t) = \tilde S_i(t) \left( \sum_{j=1}^{n} \sigma_{ij}(t)\,d\tilde W_j(t) \right), \quad i = 1, \ldots, d.$$
So $\tilde S$ is a local $\mathbb{P}^*$-martingale and therefore $\mathbb{P}^*$ is an equivalent martingale measure. □

Remark 6.2.2. Observe that $\tilde S_i$ are the stochastic exponentials of the processes $Z_i = \sum_{j=1}^{n} \int \sigma_{ij}\,d\tilde W_j$ (compare §5.6.1). So under the Novikov condition
they are martingales. As an illustration of the effect of different choices of $\gamma$ we consider a simple model with two securities $S_1$, $S_2$. Let the price-process dynamics be given by
$$dS_1(t) = S_1(t) \left( b_1(t)\,dt + \sigma_1(t)\,dW(t) \right),$$
$$dS_2(t) = S_2(t) \left( b_2(t)\,dt + \sigma_2(t)\,dW(t) \right).$$
Assume there is a process $\gamma(\cdot)$ (which we might use to define a change of measure) such that
$$\frac{b_1(t) - r(t)}{\sigma_1(t)} = \gamma(t) = \frac{b_2(t) - r(t)}{\sigma_2(t)}.$$
Observe that in the numerators we have the excess rate of return of the risky assets over the risk-free rate and in the denominators the volatility of the assets. So these quotients can be interpreted as the risk premium per unit of volatility. As in the theorem above this ratio is often called the market price of risk (compare §8.2). We can rewrite the above as
$$dS_1(t) = S_1(t) \left( (r(t) + \gamma(t) \sigma_1(t))\,dt + \sigma_1(t)\,dW(t) \right),$$
$$dS_2(t) = S_2(t) \left( (r(t) + \gamma(t) \sigma_2(t))\,dt + \sigma_2(t)\,dW(t) \right).$$
If, for example, $\gamma \equiv 0$, then
$$dS_1(t) = S_1(t) \left( r(t)\,dt + \sigma_1(t)\,dW(t) \right),$$
$$dS_2(t) = S_2(t) \left( r(t)\,dt + \sigma_2(t)\,dW(t) \right),$$
and we see that $\tilde S_i = S_i/B$, $i = 1, 2$, are (local) martingales under $\mathbb{P}$, so we are already in our usual risk-neutral setting. If we set $\gamma = \sigma_2$, we get (doing some stochastic calculus)
$$d\left[ \frac{S_1(t)}{S_2(t)} \right] = (\sigma_1(t) - \sigma_2(t)) \frac{S_1(t)}{S_2(t)}\,dW(t).$$
So $S_1/S_2$ is a (local) martingale in this setting. Since the attitude towards risk is described by $\sigma_2$ ($= \gamma$) (the 'risk' in holding the asset $S_2$), $\mathbb{P}$ is called a risk-neutral measure with respect to $S_2$. (Observe that in this example we calculated the drift coefficients $b_i$ from the volatilities $\sigma_i$, the risk-free rate $r$ and the market price of risk $\gamma$ in order to use the original probability measure $\mathbb{P}$ as a martingale measure. Our usual approach is to change the underlying measure using $\gamma$ determined by $r$, $b_i$, $\sigma_i$.)

We now turn to the question of market completeness. Again we start by looking at the classical Black-Scholes example.
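The stochastic calculus behind the ratio dynamics for the choice $\gamma = \sigma_2$ can be spelled out as follows. This is a sketch that assumes strictly positive prices and coefficients regular enough for Itô's formula and the product rule to apply; time arguments are suppressed for readability.

```latex
\begin{align*}
dS_1 &= S_1\bigl((r + \sigma_1\sigma_2)\,dt + \sigma_1\,dW\bigr), \qquad
dS_2 = S_2\bigl((r + \sigma_2^2)\,dt + \sigma_2\,dW\bigr), \\
d\Bigl(\frac{1}{S_2}\Bigr)
  &= -\frac{dS_2}{S_2^2} + \frac{d\langle S_2 \rangle}{S_2^3}
   = \frac{1}{S_2}\bigl(-(r + \sigma_2^2)\,dt - \sigma_2\,dW + \sigma_2^2\,dt\bigr)
   = \frac{1}{S_2}\bigl(-r\,dt - \sigma_2\,dW\bigr), \\
d\Bigl(\frac{S_1}{S_2}\Bigr)
  &= \frac{dS_1}{S_2} + S_1\,d\Bigl(\frac{1}{S_2}\Bigr)
     + d\Bigl\langle S_1, \frac{1}{S_2} \Bigr\rangle \\
  &= \frac{S_1}{S_2}\bigl((r + \sigma_1\sigma_2) - r - \sigma_1\sigma_2\bigr)\,dt
     + \frac{S_1}{S_2}(\sigma_1 - \sigma_2)\,dW \\
  &= (\sigma_1 - \sigma_2)\,\frac{S_1}{S_2}\,dW .
\end{align*}
```

The drift cancels exactly because the covariation term $-\sigma_1 \sigma_2\,dt$ offsets the extra drift $\gamma \sigma_1 = \sigma_1 \sigma_2$ of $S_1$.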
Example. Black-Scholes model (Completeness). We already know that we have a unique martingale measure $\mathbb{P}^*$ (recall $\gamma = (b - r)/\sigma$ in Girsanov's transformation). Given a contingent claim $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, then $X \in L^1(\Omega, \mathcal{F}, \mathbb{P}^*)$ also, and we can define the $\mathbb{P}^*$-martingale
$$M(t) := \mathbb{E}_{\mathbb{P}^*}\left[ \frac{X}{B(T)} \,\Big|\, \mathcal{F}_t \right].$$
Using the martingale representation theorem for one-dimensional Brownian motion (compare Corollary 5.7.1), we know that under $\mathbb{P}^*$
$$M(t) = M(0) + \int_0^t h(u)\,d\tilde W(u).$$
Now the $\mathbb{P}^*$-dynamics of $\tilde S$ are
$$d\tilde S(t) = \sigma \tilde S(t)\,d\tilde W(t),$$
so in fact
$$M(t) = M(0) + \int_0^t \varphi_1(u)\,d\tilde S(u),$$
with
$$\varphi_1(t) = \frac{h(t)}{\sigma \tilde S(t)}.$$
Using Remark 6.1.2 and defining
$$\varphi_0(t) = M(t) - \varphi_1(t) \tilde S(t) = M(t) - \frac{h(t)}{\sigma},$$
we obtain a self-financing replicating trading strategy (in terms of the discounted values $(1, \tilde S)$). We have thus shown that $X$ is attainable, and since $X$ was arbitrary the model is complete (in the restricted sense).

Turning to the general case, we restrict consideration to contingent claims $X$ with $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. From Theorem 6.1.5 we know that uniqueness of the martingale measure implies that a financial model is complete. In $\mathcal{M}_{BS}$ all equivalent martingale measures are given by means of Girsanov's transformation, and hence are characterised by their corresponding Radon-Nikodym derivatives with respect to $\mathbb{P}$. However, these Radon-Nikodym derivatives are characterised by functions $\gamma$ satisfying (use Theorem 6.2.1)
$$b(t) - r(t) \mathbf{1}_d = \sigma(t) \gamma(t), \quad 0 \le t \le T.$$
Keeping this in mind, a characterisation of completeness in $\mathcal{M}_{BS}$ will involve conditions on the coefficients of the model. More precisely:
Theorem 6.2.2. The following are equivalent:
(i) there exists a unique equivalent martingale measure for the discounted stock price process $\tilde S$;
(ii) the generalised Black-Scholes model $\mathcal{M}_{BS}$ is complete (in the restricted sense that every contingent claim $X$ with $X/B(T) \in L^1(\Omega, \mathcal{F}, \mathbb{P}^*)$ is attainable);
(iii) equality $n = d$ holds and the volatility matrix $\sigma(t, \omega)$ is $(\lambda \otimes \mathbb{P})$-a.e. non-singular.
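Condition (iii) and the market-price-of-risk equation (6.5) can be checked pointwise for given coefficient values. The sketch below is an illustration only: it treats a single $(t, \omega)$ with numerical values for $\sigma$, $b$ and $r$, solves $\sigma \gamma = b - r \mathbf{1}_d$ by Gaussian elimination with partial pivoting, and returns `None` when $\sigma$ is not square or is numerically singular. The function names and the tolerance are assumptions made here.

```python
def solve_linear_system(matrix, rhs, tol=1e-12):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting.

    Returns None if a pivot is smaller than tol (numerically singular matrix).
    """
    n = len(matrix)
    aug = [[float(entry) for entry in row] + [float(value)]
           for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < tol:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def market_price_of_risk(sigma, b, r):
    """Return gamma with sigma * gamma = b - r * 1_d, or None if sigma is
    not a non-singular square matrix (n = d fails or sigma is singular)."""
    d = len(sigma)
    if len(b) != d or any(len(row) != d for row in sigma):
        return None
    excess = [b_i - r for b_i in b]
    return solve_linear_system(sigma, excess)
```

A `None` result only signals that this particular coefficient snapshot violates (iii); the theorem itself requires non-singularity for almost every $(t, \omega)$.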
Remark 6.2.3. (i) If $n \le d$ and the matrix $\sigma$ has full rank, we can reduce the number of stocks by duplicating some of them as $(t, \omega)$-dependent linear combinations of others.
(ii) We discuss completeness by assuming that trading is restricted to primary securities. Completeness of financial market models with traded put and call options is discussed, e.g. in Bajeux-Besnainou and Rochet (1996) and Madan and Milne (1993). In particular the question of static hedging is treated in Carr, Ellis, and Gupta (1998) (compare also Exercise 4.6).

Proof of Theorem 6.2.2. Again we only sketch the proof. (i) $\Leftrightarrow$ (iii) follows from the special structure of the Girsanov density used to perform a change of measure in the Brownian setting (compare §5.8).
To show (iii) $\Rightarrow$ (ii) (which is equivalent to showing (i) $\Rightarrow$ (ii), a special case of Theorem 6.1.5) we repeat the argument given in the classical Black-Scholes model. Let $\mathbb{P}^*$ be the martingale measure defined by the unique Girsanov transformation with parameter $\gamma(t) = \sigma(t)^{-1}(b(t) - r(t) \mathbf{1}_d)$. Using the martingale representation property for the multidimensional Brownian motion (compare Corollary 5.7.1), we have for any contingent claim $X$
$$M(t) := \mathbb{E}_{\mathbb{P}^*}\left( \frac{X}{B(T)} \,\Big|\, \mathcal{F}_t \right) = M(0) + \int_0^t h'(u)\,d\tilde W(u) = M(0) + \sum_{i=1}^{d} \int_0^t h_i(u)\,d\tilde W_i(u), \quad 0 \le t \le T,$$
with $\tilde W(t) = W(t) + \int_0^t \gamma(u)\,du$. On the other hand, the $\mathbb{P}^*$-dynamics of $\tilde S$ are
$$d\tilde S_i(t) = \tilde S_i(t) \sum_{j=1}^{d} \sigma_{ij}(t)\,d\tilde W_j(t).$$
We thus have
$$M(t) = M(0) + \sum_{i=1}^{d} \int_0^t \varphi_i(u)\,d\tilde S_i(u),$$
with
$$\varphi_i(t) = \frac{\left( (\sigma'(t))^{-1} h(t) \right)_i}{\tilde S_i(t)}, \quad i = 1, \ldots, d.$$
6 . Mathematical Finance in Continuous Time
Using Remark 6.1.2 and defining $\varphi_0(t) = M(t) - \varphi'(t)\tilde S(t)$, $\varphi$ is admissible. Then $X$ is attainable using $\varphi$ as replicating trading strategy (the replicating trading strategy for the undiscounted assets is of course $B(t) \cdot \varphi(t)$), and since $X$ was arbitrary the model is complete (in the restricted sense).

To show (ii) implies (i) we use an argument like that in the first part of the proof of Theorem 4.3.1 to show that in a complete market the mapping $H : \mathcal{P} \to \mathbb{R}$ given by $H(Q) := \mathbb{E}_Q(Y/B(T))$ is constant for any (integrable) contingent claim $Y$ (see Jacka (1992)). If there are two equivalent martingale measures $Q_1$ and $Q_2$ with $Q_1 \ne Q_2$, then we must have an event $A \in \mathcal{F}$ with $Q_1(A) \ne Q_2(A)$. So we can define a contingent claim $Y = \mathbf{1}_A$ with $\mathbb{E}_{Q_1}(\mathbf{1}_A/B(T)) \ne \mathbb{E}_{Q_2}(\mathbf{1}_A/B(T))$, which is a contradiction to the above statement. So we must have a unique martingale measure. $\square$
6.2.2 Pricing and Hedging Contingent Claims
We assume now that $d = n$, that the coefficients of the model satisfy the integrability conditions, that $\sigma$ is non-singular and that $\gamma(\cdot)$ in (6.5) exists. Then the financial market model admits a unique equivalent martingale measure $\mathbb{P}^*$ with Radon-Nikodym derivative given by the Girsanov transformation

$$L(t) = \exp\left\{ -\int_0^t \gamma(u)'\,dW(u) - \frac{1}{2}\int_0^t \|\gamma(u)\|^2\,du \right\}, \qquad 0 \le t \le T,$$

and $\gamma(t) = \sigma^{-1}(t)(b(t) - r(t)\mathbf{1}_d)$. By Theorems 6.2.1 and 6.2.2, the model is free of arbitrage and complete. We call such a model standard. Recall that a contingent claim $X$ is an $\mathcal{F}_T$-measurable random variable such that $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. Using this we get (we write $\mathbb{E}^*$ for $\mathbb{E}_{\mathbb{P}^*}$ in this section):
Theorem 6.2.3 (Risk-neutral Valuation Formula). Let $\mathcal{M}_{BS}$ be a standard multi-dimensional Black-Scholes model and $X$ a contingent claim. The arbitrage price process of $X$ is given by the risk-neutral valuation formula

$$\Pi_X(t) = B(t)\,\mathbb{E}^*\!\left[\frac{X}{B(T)} \,\Big|\, \mathcal{F}_t\right], \qquad 0 \le t \le T.$$

Proof. Since the model is complete, any such contingent claim is attainable, and the result follows from Theorem 6.1.4. $\square$
Of course, for practitioners it is of equal importance to find the replicating portfolio (which exists by completeness). The martingale representation
theorem guarantees the existence, but the explicit construction is rather involved.

We now look at the special case of the classical Black-Scholes model. By the risk-neutral valuation principle the price of a contingent claim $X$ is given by $\Pi_X(t) = e^{-r(T-t)}\,\mathbb{E}^*[X \mid \mathcal{F}_t]$, with $\mathbb{E}^*$ given via the Girsanov density

$$L(t) = \exp\left\{ -\frac{b - r}{\sigma}\,W(t) - \frac{1}{2}\left(\frac{b - r}{\sigma}\right)^2 t \right\}.$$
Furthermore, if $X$ is of form $X = \Phi(S(T))$ with a sufficiently integrable function $\Phi$, then the price process is also given by $\Pi_X(t) = F(t, S(t))$, where $F$ solves the Black-Scholes partial differential equation

$$F_t(t, s) + r s F_s(t, s) + \tfrac{1}{2}\sigma^2 s^2 F_{ss}(t, s) - r F(t, s) = 0, \qquad (6.11)$$

$$F(T, s) = \Phi(s). \qquad (6.12)$$
To obtain this partial differential equation we used the Feynman-Kac representation (see §5.7) of the $\mathbb{P}^*$-martingale

$$M(t) = \exp\{-rT\}\,\mathbb{E}^*[\Phi(S(T)) \mid \mathcal{F}_t]; \qquad (6.13)$$

indeed, with $G(t, S(t)) = M(t)$ we know from Theorem 5.7.3 that $G$ satisfies the partial differential equation (recall that $S$ satisfies the $\mathbb{P}^*$-SDE $dS = S(r\,dt + \sigma\,d\tilde W)$)

$$G_t(t, s) + r s G_s(t, s) + \tfrac{1}{2}\sigma^2 s^2 G_{ss}(t, s) = 0,$$

with 'initial' (more accurately, final or terminal) condition $G(T, s) = e^{-rT}\Phi(s)$. By the risk-neutral valuation principle $\Pi_X(t) = e^{rt}M(t)$, and so $F(t, s) = e^{rt}G(t, s)$, and computing the partial derivatives of $F$ we obtain the representation.

Specialising further and considering a European call with strike $K$ and maturity $T$ on the stock $S$ (so $\Phi(S(T)) = (S(T) - K)^+$), we can evaluate the above expected value (which is easier than solving the Black-Scholes partial differential equation) and obtain:

Proposition 6.2.1 (Black-Scholes Formula). The Black-Scholes price process of a European call is given by

$$C(t) = S(t)\,N(d_1(S(t), T - t)) - K e^{-r(T-t)}\,N(d_2(S(t), T - t)). \qquad (6.14)$$

The functions $d_1(s, t)$ and $d_2(s, t)$ are given by
$$d_1(s, t) = \frac{\log(s/K) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}},$$

$$d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + \left(r - \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}.$$
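As a hedged numerical illustration of formula (6.14), the sketch below evaluates the call price using only the Python standard library. The function names (`norm_cdf`, `bs_call`) and the sample parameters are assumptions made for this example; the second argument `tau` plays the role of the time to expiry $T - t$.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function N(x), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d1(s, tau, K, r, sigma):
    """d_1(s, tau) of the Black-Scholes formula; tau is the time to expiry."""
    return (log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))


def d2(s, tau, K, r, sigma):
    """d_2(s, tau) = d_1(s, tau) - sigma * sqrt(tau)."""
    return d1(s, tau, K, r, sigma) - sigma * sqrt(tau)


def bs_call(s, tau, K, r, sigma):
    """Black-Scholes price of a European call with time to expiry tau."""
    if tau <= 0.0:
        return max(s - K, 0.0)          # payoff at expiry
    return (s * norm_cdf(d1(s, tau, K, r, sigma))
            - K * exp(-r * tau) * norm_cdf(d2(s, tau, K, r, sigma)))


# Hypothetical at-the-money example: S = K = 100, r = 5%, sigma = 20%, one year.
price = bs_call(100.0, 1.0, 100.0, 0.05, 0.20)
print(round(price, 4))
```

The price always dominates the no-arbitrage lower bound $(s - Ke^{-r\tau})^+$, which gives a quick sanity check on any implementation.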
Observe that we have already deduced this formula as a limit of a discrete-time setting in §4.6. To obtain a replicating portfolio we use Itô's lemma to find the dynamics of the $\mathbb{P}^*$-martingale $M(t) = G(t, S(t))$:

$$dM(t) = \sigma S(t)\,G_s(t, S(t))\,d\tilde W(t).$$

Using this representation, we get in terms of the notation in the Black-Scholes model

$$h(t) = \sigma S(t)\,G_s(t, S(t)),$$

which gives for the stock component of the replicating portfolio using the discounted assets

$$\varphi_1(t) = G_s(t, S(t))\,B(t),$$

and using the self-financing condition the cash component is

$$\varphi_0(t) = G(t, S(t)) - G_s(t, S(t))\,S(t).$$

To transfer this portfolio to undiscounted values we multiply it by the discount factor, i.e. $F(t, S(t)) = B(t)\,G(t, S(t))$, and get:
Proposition 6.2.2. The replicating strategy in the classical Black-Scholes model is given by

$$\varphi_0 = \frac{F(t, S(t)) - F_s(t, S(t))\,S(t)}{B(t)}, \qquad \varphi_1 = F_s(t, S(t)).$$
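For a European call the proposition can be checked numerically. The sketch below assumes the standard facts that $B(t) = e^{rt}$ in the classical model and that $F_s = N(d_1)$ for the call; the function name `replicating_strategy` and the sample inputs are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def replicating_strategy(t, s, T, K, r, sigma):
    """Return (phi0, phi1, price) for a European call at time t < T.

    phi1 is the number of shares (F_s), phi0 the number of units of the
    bank account B(t) = exp(r * t), and price is F(t, s).
    """
    tau = T - t
    d1 = (log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    price = s * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)
    phi1 = norm_cdf(d1)                      # F_s for the call
    phi0 = (price - phi1 * s) / exp(r * t)   # (F - F_s * S) / B(t)
    return phi0, phi1, price


# Hypothetical inputs: halfway through a one-year call.
phi0, phi1, price = replicating_strategy(0.5, 105.0, 1.0, 100.0, 0.05, 0.20)
print(phi0, phi1, price)
```

The cash holding `phi0` is negative: the call is replicated by a leveraged long position in the stock, and `phi0 * B(t) + phi1 * S(t)` reproduces the option price.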
In their original paper, Black and Scholes (1973) used an arbitrage pricing approach (rather than our risk-neutral valuation approach) to deduce the price of a European call as the solution of a partial differential equation (we call this the PDE approach). The idea is as follows: start by assuming that the option price $C(t)$ is given by $C(t) = f(t, S(t))$ for some sufficiently smooth function $f : [0, T] \times \mathbb{R}_+ \to \mathbb{R}$. By Itô's formula (Theorem 5.6.2) we find for the dynamics of the option price process (observe that we work under $\mathbb{P}$, so $dS = S(b\,dt + \sigma\,dW)$)

$$dC = \left\{ f_t(t, S) + f_s(t, S)\,S b + \tfrac{1}{2} f_{ss}(t, S)\,S^2\sigma^2 \right\} dt + f_s S \sigma\,dW. \qquad (6.15)$$
Consider a portfolio $\psi$ consisting of a short position in $\psi_1(t) = f_s(t, S(t))$ stocks and a long position in $\psi_2(t) = 1$ call and assume the portfolio is self-financing. Then its value process is

$$V_\psi(t) = -\psi_1(t)\,S(t) + C(t),$$

and by the self-financing condition we have (to ease the notation we omit the arguments)

$$\begin{aligned}
dV_\psi &= -\psi_1\,dS + dC \\
&= -f_s\,(S b\,dt + S\sigma\,dW) + \left( f_t + f_s S b + \tfrac{1}{2} f_{ss} S^2\sigma^2 \right) dt + f_s S\sigma\,dW \\
&= \left( f_t + \tfrac{1}{2} f_{ss} S^2\sigma^2 \right) dt.
\end{aligned}$$
So the dynamics of the value process of the portfolio do not have any exposure to the driving Brownian motion, and its appreciation rate in an arbitrage-free world must therefore equal the risk-free rate (for a mathematically precise formulation of this equality we refer the reader to Musiela and Rutkowski (1997), §5.2), i.e.

$$dV_\psi(t) = r V_\psi(t)\,dt = (-r f_s S + r C)\,dt.$$

Comparing the coefficients and using $C(t) = f(t, S(t))$, we must have

$$-r S f_s + r f = f_t + \tfrac{1}{2}\sigma^2 S^2 f_{ss}.$$

This leads to the Black-Scholes partial differential equation (6.11) for $f$, i.e.

$$f_t + r s f_s + \tfrac{1}{2}\sigma^2 s^2 f_{ss} - r f = 0.$$

Since $C(T) = (S(T) - K)^+$ we need to impose the terminal condition $f(T, s) = (s - K)^+$ for all $s \in \mathbb{R}_+$.

Note. One point in the justification of the above argument is missing: we have to show that the trading strategy short $\psi_1$ stocks and long one call is self-financing. In fact, this is not true, since $\psi_1 = \psi_1(t, S(t))$ is dependent on the stock price process. Formally, for the self-financing condition to be true we must have

$$dV_\psi(t) = -d(\psi_1(t)\,S(t)) + dC(t) = -\psi_1(t)\,dS(t) + dC(t).$$

Now $\psi_1(t) = \psi_1(t, S(t))$ depends on the stock price and so we have

$$d(\psi_1(t, S(t))\,S(t)) = \psi_1(t)\,dS(t) + S(t)\,d\psi_1(t, S(t)) + d\langle \psi_1, S \rangle(t).$$

We see that the portfolio $\psi$ is self-financing if

$$S(t)\,d\psi_1(t, S(t)) + d\langle \psi_1, S \rangle(t) = 0.$$

It is an exercise in Itô calculus to show that this is not the case (see Exercise 6.1).
6.2.3 The Greeks
We will now analyse the impact of the underlying parameters in the standard Black-Scholes model on the prices of call and put options. The Black-Scholes option values depend on the (current) stock price, the volatility, the time to maturity, the interest rate and the strike price. The sensitivities of the option price with respect to the first four parameters are called the Greeks and are widely used for hedging purposes. We can determine the impact of these parameters by taking partial derivatives. Recall the Black-Scholes formula for a European call (6.14):

$$C(0) = C(S, T, K, r, \sigma) = S\,N(d_1(S, T)) - K e^{-rT}\,N(d_2(S, T)),$$

with the functions $d_1(s, t)$ and $d_2(s, t)$ given by

$$d_1(s, t) = \frac{\log(s/K) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}, \qquad d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + \left(r - \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}.$$

One obtains

$$\begin{aligned}
\Delta &:= \frac{\partial C}{\partial S} = N(d_1) > 0, \\
\mathcal{V} &:= \frac{\partial C}{\partial \sigma} = S\sqrt{T}\,n(d_1) > 0, \\
\Theta &:= \frac{\partial C}{\partial T} = \frac{S\sigma}{2\sqrt{T}}\,n(d_1) + K r e^{-rT}\,N(d_2) > 0, \\
\rho &:= \frac{\partial C}{\partial r} = T K e^{-rT}\,N(d_2) > 0, \\
\Gamma &:= \frac{\partial^2 C}{\partial S^2} = \frac{n(d_1)}{S\sigma\sqrt{T}} > 0.
\end{aligned}$$
(As usual $N$ is the cumulative normal distribution function and $n$ is its density.) From the definitions it is clear that $\Delta$ (delta) measures the change in the value of the option compared with the change in the value of the underlying asset, $\mathcal{V}$ (vega) measures the change of the option compared with the change in the volatility of the underlying, and similar statements hold for $\Theta$ (theta) and $\rho$ (rho) (observe that these derivatives are in line with our arbitrage-based considerations in §1.3). Furthermore, $\Delta$ gives the number of shares in the replication portfolio for a call option (see Proposition 6.2.2), so $\Gamma$ measures the sensitivity of our portfolio to the change in the stock price.
The Black-Scholes partial differential equation (6.11) can be used to obtain the relation between the Greeks, i.e. (observe that $\Theta$ is the derivative of $C$, the price of a European call, with respect to the time to expiry $T - t$, while in the Black-Scholes PDE the partial derivative with respect to the current time $t$ appears)

$$rC = \tfrac{1}{2} s^2\sigma^2\,\Gamma + r s\,\Delta - \Theta.$$
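The closed-form Greeks and the relation between them can be checked numerically. The following is a minimal sketch under the sign convention of this section ($\Theta$ is the derivative with respect to time to expiry); the function name `call_and_greeks` and the sample parameters are assumptions made for illustration.

```python
from math import erf, exp, log, pi, sqrt


def N(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def n(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def call_and_greeks(S, T, K, r, sigma):
    """Black-Scholes call price and Greeks; T is the time to expiry."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    disc_strike = K * exp(-r * T)
    return {
        "price": S * N(d1) - disc_strike * N(d2),
        "delta": N(d1),
        "vega": S * sqrt(T) * n(d1),
        # derivative with respect to time to expiry, hence positive
        "theta": S * sigma * n(d1) / (2.0 * sqrt(T)) + r * disc_strike * N(d2),
        "rho": T * disc_strike * N(d2),
        "gamma": n(d1) / (S * sigma * sqrt(T)),
    }


# Hypothetical parameters.
S, T, K, r, sigma = 100.0, 1.0, 100.0, 0.05, 0.20
g = call_and_greeks(S, T, K, r, sigma)

# Relation between the Greeks: r C = (1/2) s^2 sigma^2 Gamma + r s Delta - Theta.
lhs = r * g["price"]
rhs = 0.5 * S ** 2 * sigma ** 2 * g["gamma"] + r * S * g["delta"] - g["theta"]
print(lhs, rhs)
```

Both sides agree up to floating-point rounding, since the relation is just the Black-Scholes PDE rewritten in terms of the Greeks.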
Let us now compute the dynamics of the call option's price $C(t)$ under the risk-neutral martingale measure $\mathbb{P}^*$. Using formula (6.15) we find

$$dC(t) = rC(t)\,dt + \sigma N(d_1(S(t), T - t))\,S(t)\,d\tilde W(t).$$

Defining the elasticity coefficient of the option's price as

$$\eta^C(t) = \frac{\Delta(S(t), T - t)\,S(t)}{C(t)} = \frac{N(d_1(S(t), T - t))\,S(t)}{C(t)},$$

we can rewrite the dynamics as

$$dC(t) = rC(t)\,dt + \sigma\,\eta^C(t)\,C(t)\,d\tilde W(t).$$

So, as expected in the risk-neutral world, the appreciation rate of the call option equals the risk-free rate $r$. The volatility coefficient is $\sigma\eta^C$, and hence stochastic. It is precisely this feature that causes difficulties when assessing the impact of options in a portfolio.

6.2.4 Volatility
One of the main issues raised by the Black-Scholes formula is the question of modelling the volatility $\sigma$ (for a discussion of further sensitivities in using the Black-Scholes model see Dana and Jeanblanc (2002)). Before we can implement the Black-Scholes formula to price options, we have to estimate $\sigma$.

Because the formula is explicit, we can, as noted in §6.2.3, determine the vega, the partial derivative $\mathcal{V} = \partial C/\partial\sigma$, finding

$$\mathcal{V} = S\sqrt{T}\,n(d_1).$$

The important thing to note here is that vega is always positive. This mathematical consequence of the explicit Black-Scholes formula is reassuring, as it is evident on financial grounds: increasing volatility does not make us more likely to 'win' (possess an option in the money at expiry), but does mean that when we win, we 'win bigger', hence that our option - the right to make this one-way bet - is worth more.
Next, since vega is positive, $C$ is a continuous - indeed, differentiable - strictly increasing function of $\sigma$. Turning this round, $\sigma$ is a continuous (differentiable) strictly increasing function of $C$; indeed,

$$\mathcal{V} = \frac{\partial C}{\partial\sigma}, \qquad \text{so} \qquad \frac{1}{\mathcal{V}} = \frac{\partial\sigma}{\partial C}.$$

Informally, one can get the graph of $\sigma$ against $C$ from that of $C$ against $\sigma$ - thought of as drawn on tracing paper - by turning the paper over and rotating through a right angle. Formally, one needs explicit proof of the relevant result on inverse functions, for which see a text on analysis (e.g. Burkill (1962), §3.9). Thus the value $\sigma = \sigma(C)$ corresponding to the actual value $C = C(\sigma)$ at which call options are observed to be traded in the market can be read off. The value of $\sigma$ obtained in this way is called the implied volatility.

First, note that the accuracy with which the (implied) volatility $\sigma$ can be inferred from the observed price $C$ depends on the vega $\mathcal{V}$, or rather its reciprocal $1/\mathcal{V}$, by the formulae

$$\delta C \sim \mathcal{V}\,\delta\sigma, \qquad \delta\sigma \sim \mathcal{V}^{-1}\,\delta C;$$

if vega is small, i.e. $\mathcal{V}^{-1}$ is large, small errors $\delta C$ in call-option prices are magnified into large errors $\delta\sigma$ in the implied volatility, and one loses accuracy. But when vega is large, one gains accuracy: since implied volatility depends on sensitivity analysis, one expects it to work best when prices are sensitive to $\sigma$!

Next, note that determination of implied volatility in this way is well adapted to the familiar and classical Newton-Raphson iteration procedure of numerical analysis. If the transcendental equation $C = C(\sigma)$ is to be solved for $\sigma$, and $\sigma_n$ is the current approximation, then the next is

$$\sigma_{n+1} := \sigma_n - \frac{C(\sigma_n) - C}{C'(\sigma_n)} = \sigma_n - \frac{C(\sigma_n) - C}{\mathcal{V}(\sigma_n)}.$$

For background, see a text on numerical analysis, e.g. Jacques and Judd (1987).

But the most important thing to note about implied volatility is that here one is at the mercy of the model. One of the great merits of the Black-Scholes analysis is that it uses a simple, parametric model and gives explicit formulae in its conclusions.
All parametric models - in statistics, finance or any other field - are wrong in the sense that they represent at best idealised or simplified descriptions of reality, and yield correspondingly idealised or simplified conclusions. One cannot expect the geometric Brownian motion model underlying the Black-Scholes analysis to be any more than a rough description of reality. Thus, one cannot expect the above derivation of implied volatility, which depends explicitly on the details of this model, to give anything
more than a rough description of actual volatility of actual prices (see Shaw (1998), Chapter 1, for the pitfalls arising when using implied volatility in the context of exotic options).

Since volatility is so important - just as the Black-Scholes analysis is so important - it has been studied in great detail, and much empirical knowledge has been built up about implied volatility in a range of empirical studies. Now volatility is a parameter of the model, and so should be constant - it represents a property of the particular underlying stock in question. However, volatility is observed to vary, most notably with the underlying stock price $S$ - or equivalently, with the strike price $K$. In particular, for values of $S$ much higher than the strike price $K$ - that is, for options deeply in the money - the volatility is higher. The resulting upward curve in the graph of $\sigma$ against $K$ is called the volatility smile.

Since volatility is observed in practice to vary with the price of the underlying, which is random - a stochastic process - an obvious way to refine the model is to accept that the volatility is itself random and treat it accordingly. Such stochastic volatility models (on which we touched in §5.10) are undoubtedly important, but as they are much more complicated than their counterparts with deterministic volatility, we postpone a treatment of them in the context of incomplete market models to §7.2. For references to the recent literature, see e.g. Campbell, Lo, and MacKinlay (1997), §§9.3.6, 12.2.1, 12.2.2, Frey (1997) and Hobson (1998).

Returning to the Black-Scholes model with its geometric Brownian motion dynamics, the other obvious way to estimate volatility is to use, not the option price observed now, but the underlying asset price observed over time. The resulting volatility estimate is called the historic volatility. To estimate it, given data available in practice - finitely many values of the stock price at finitely many time points $t_1 < \dots < t_n$ - one may follow the usual statistical procedure for parameter estimation, the method of maximum likelihood (see §5.9 and e.g. Rao (1973)).

One can avoid explicit use of a parametric model as simple as geometric Brownian motion: motivated by the important problem of volatility estimation, there has been an upsurge of recent interest in estimation of diffusion coefficients for diffusion processes. For details, background and references, see e.g. Dohnal (1987), Florens-Zmirou (1989) and Genon-Catalot and Jacod (1994).

We now have two types of volatility estimate: implied and historic. Were the model exact, these two would agree (approximately, to allow for sampling error), as they would simply be different ways of measuring the same thing. However, empirical studies reveal more disagreement between implied and historic volatility estimates than can be accounted for merely by sampling error - which of course reflects, in another guise, the deficiencies of the underlying model - geometric Brownian motion - in the Black-Scholes analysis.

One obvious way to seek to avoid the limitations of this - or any other - parametric model is to avoid choice of a parametric model altogether and
resort instead to the methods of non-parametric statistics. Non-parametric volatility estimates have been studied recently, but to discuss them in detail here would take us too far afield. We refer for details to Campbell, Lo, and MacKinlay (1997), §12.3.4, Ghysels et al. (1998), Fouque, Papanicolaou, and Sircar (2000).
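The Newton-Raphson iteration for implied volatility discussed in this section can be sketched in a few lines. This is an illustration under stated assumptions rather than a production routine: the function names, the starting value `sigma0 = 0.2`, the tolerance and the sample option are all hypothetical, and no safeguard (such as bisection) is included for prices where vega is very small.

```python
from math import erf, exp, log, pi, sqrt


def _N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def _n(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def bs_call(S, T, K, r, sigma):
    """Black-Scholes call price with time to expiry T."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return S * _N(d1) - K * exp(-r * T) * _N(d1 - sigma * sqrt(T))


def bs_vega(S, T, K, r, sigma):
    """Vega, the derivative of the call price with respect to sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return S * sqrt(T) * _n(d1)


def implied_vol(price, S, T, K, r, sigma0=0.2, tol=1e-10, max_iter=50):
    """Solve bs_call(sigma) = price for sigma by Newton-Raphson."""
    sigma = sigma0
    for _ in range(max_iter):
        diff = bs_call(S, T, K, r, sigma) - price
        if abs(diff) < tol:
            return sigma
        vega = bs_vega(S, T, K, r, sigma)
        if vega < 1e-12:
            raise ValueError("vega too small: price carries no information on sigma")
        sigma -= diff / vega            # sigma_{n+1} = sigma_n - (C(sigma_n) - C) / V(sigma_n)
        if sigma <= 0.0:
            raise ValueError("iteration left the admissible region sigma > 0")
    raise RuntimeError("Newton-Raphson did not converge")


# Round trip on a hypothetical option: price with sigma = 35%, then recover it.
observed = bs_call(100.0, 0.5, 110.0, 0.03, 0.35)
recovered = implied_vol(observed, 100.0, 0.5, 110.0, 0.03)
print(recovered)
```

The guard on small vega reflects the remark above: when $\mathcal{V}$ is small, errors in the observed price are magnified in the implied volatility, so the iteration is unreliable there.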
6.3 Further Contingent Claim Valuation

We generally consider a Black-Scholes setting.

6.3.1 American Options
American Calls. As in discrete time (§1.3.3), these are equivalent to European calls - there is no advantage in exercise before expiry. One can also handle extensions, including
(i) the perpetual call ($T = \infty$), which goes back to McKean in 1965;
(ii) options on stocks paying dividends.
For a full account, we refer to Karatzas and Shreve (1998), §2.6.

American Puts. We saw in §4.8, in discrete time, the link between American options and the Snell envelope. We also saw, in the binomial model, how to price American options by backward induction (dynamic programming), and how the stopping and continuation regions $\mathcal{S}$ and $\mathcal{C}$ - the parts of the tree where it is optimal to exercise now and where it is optimal to hold for exercise later - emerge as a corollary of this procedure. We now consider American options (puts, as above) in more detail, following the excellent survey by Myneni (1992) and the textbook account Karatzas and Shreve (1998), to which we refer for proofs, background and further references. We content ourselves with formulating and discussing the main results. We stress that American options are much harder to handle than European ones, and the available results are much less explicit than in the European case. In particular, explicit pricing formulae - and explicit descriptions of the regions $\mathcal{S}$, $\mathcal{C}$ or the optimal stopping boundary $S^*$ separating the two - are not known, and are probably inaccessible in general.

Early-exercise Decomposition. We use our standard approach, risk-neutral valuation. Writing $\mathcal{T}_{t,T}$ for the set of stopping times $\tau$ taking values in $[t, T]$, the optimal stopping problem is to maximise the risk-neutral expectation of our discounted payoff $(K - S(\tau))^+$ over all stopping times $\tau \in \mathcal{T}_{t,T}$. Write

$$J(t) := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{t,T}} \mathbb{E}\left[ e^{-r\tau}(K - S(\tau))^+ \mid \mathcal{F}_t \right]$$
for this optimal expected payoff over all stopping times (optimal expected gain on stopping without foreknowledge of the future). Then as in discrete time, $J(t)$ is the Snell envelope (least supermartingale majorant of the discounted payoff process). When the current stock price at time $t$ is $x$, write

$$P(x, t) := \sup\left\{ \mathbb{E}_x\left[ e^{-r(\tau - t)}(K - S(\tau))^+ \right] : \tau \in \mathcal{T}_{t,T} \right\}$$

for the value function of the option (so $P(x, t) =: P(t) = e^{rt}J(t)$, using that $S(t) = x$ contains as much relevant information as $\mathcal{F}_t$). Then $P(x, t) \ge (K - x)^+$, with equality if and only if stopping now is optimal:

$$\mathcal{S} = \{(x, t) \in \mathbb{R}_+ \times [0, T] : P(x, t) = (K - x)^+\},$$

$$\mathcal{C} = \{(x, t) \in \mathbb{R}_+ \times [0, T] : P(x, t) > (K - x)^+\}.$$

Then $P$ is continuous (Myneni (1992), Prop. 3.1), whence the stopping region $\mathcal{S}$ is closed and its complement, the continuation region $\mathcal{C}$, is open. Also $\mathcal{S}(t) := \{x : (x, t) \in \mathcal{S}\}$ and $\mathcal{C}(t) := \{x : (x, t) \in \mathcal{C}\}$ are intervals, the graph of $S^*(t) := \sup\{x : x \in \mathcal{S}(t)\}$ is contained in $\mathcal{S}$, and for each $t$, $S^*(t)$ gives the price level at or below which exercising now is optimal (as the option is a put - giving us the right to sell at price $K$ - the optimal exercise region will clearly be of this form).

The Snell envelope, being a supermartingale, may be (a.s.) decomposed uniquely into a martingale part and a potential part (the Riesz decomposition; see below), as follows:

$$J(t) = \mathbb{E}\left[ e^{-rT}(K - S(T))^+ \mid \mathcal{F}_t \right] + \mathbb{E}\left[ \int_t^T e^{-ru}\,rK\,\mathbf{1}_{\{S(u) < S^*(u)\}}\,du \,\Big|\, \mathcal{F}_t \right]$$

(Myneni (1992), Theorem 3.2; the result is due to El Karoui and Karatzas). Because $P(t) = e^{rt}J(t)$, there is an analogous decomposition of the value function,

$$P(x, t) = p(x, t) + e(x, t),$$

where

$$p(x, t) = \mathbb{E}_x\left[ e^{-r(T - t)}(K - S(T))^+ \right]$$

is the price of the corresponding European option, which can be thought of as the intrinsic value of the American option - what it is worth if the possibility of early exercise is removed. Accordingly, the second term

$$e(x, t) = \mathbb{E}_x\left[ \int_t^T e^{-r(u - t)}\,rK\,\mathbf{1}_{\{S(u) < S^*(u)\}}\,du \right]$$
is the early-exercise premium - the extra value conferred by the right to exercise early. It is proportional to the strike price $K$, as it should be; it also shows very clearly the role of the indicator process $\mathbf{1}_{\{S(t) < S^*(t)\}}$, which tells us whether or not exercise now is optimal. Because of this, the result is often called the early-exercise decomposition. The proof involves the Itô-Tanaka extension of Itô's formula (and so local time), dual predictable projections (compensators), and the Doob-Meyer decomposition in continuous time. For background on the Riesz decomposition - the probabilistic counterpart of the classical decomposition of F. Riesz of 1930 of a superharmonic function into a harmonic function and potential - see Doob (1984) and Neveu (1975).

Extensions. The theory may be extended in various ways, including perpetual options, options on dividend-paying stocks, restrictions on borrowing (of cash), restrictions on short selling (of stocks), higher interest rates for borrowing than for lending etc. For a detailed account, see Karatzas and Kou (1998).

Convergence of American Options. Given a sequence of discrete-time financial models approximating a continuous-time model, the weak convergence theory of §6.4 below ensures that the price of, say, a European option in the discrete setting converges to that of its continuous-time counterpart. This argument does not, however, apply to the more complicated setting of American options, as here the early exercise possibility introduces a control to which standard weak-convergence theory does not apply. Nevertheless, under mild conditions the prices of American options in the discrete-time setting do indeed converge to their continuous-time counterparts. For details, see Amin and Khanna (1994).

By appropriate choice of topology, weak convergence theory can be extended to American options, giving convergence results for price processes, their Snell envelopes and hedging strategies. For formulation and proof of results of this type, see Lamberton and Pagès (1990).
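Since no closed formula is available for the American put, discrete-time approximation is the usual numerical route. The sketch below prices the put by backward induction in a binomial tree and reports the early-exercise premium as the difference from the European value. The Cox-Ross-Rubinstein parametrisation ($u = e^{\sigma\sqrt{\Delta t}}$, $d = 1/u$) is an assumption chosen for this illustration, as are the function name `crr_put` and the sample parameters.

```python
from math import exp, sqrt


def crr_put(S0, K, r, sigma, T, n, american=True):
    """Price a put in an n-step binomial tree by backward induction.

    With american=True the value at each node is the maximum of the
    immediate exercise value and the discounted continuation value
    (the discrete-time Snell envelope); otherwise the European value.
    """
    dt = T / n
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    disc = exp(-r * dt)
    p = (exp(r * dt) - d) / (u - d)        # risk-neutral up-probability

    # Terminal payoffs; index j counts the number of up-moves.
    values = [max(K - S0 * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]

    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                exercise = K - S0 * u ** j * d ** (step - j)
                values[j] = max(cont, exercise)
            else:
                values[j] = cont
    return values[0]


# Hypothetical at-the-money put: S0 = K = 100, r = 5%, sigma = 20%, T = 1.
american_put = crr_put(100.0, 100.0, 0.05, 0.20, 1.0, 500, american=True)
european_put = crr_put(100.0, 100.0, 0.05, 0.20, 1.0, 500, american=False)
premium = american_put - european_put      # discrete early-exercise premium
print(american_put, european_put, premium)
```

The premium is strictly positive here, in line with the early-exercise decomposition: for $r > 0$ the right to exercise the put early has value.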
6.3.2 Asian Options
Here the payoff is a function of the average price of the underlying between contract time and expiry time. Asian options are widely used in practice, for instance for oil and foreign currencies. The averaging complicates the mathematics, but, e.g., protects the holder against speculative attempts to manipulate the asset price near expiry. For details and references, see e.g. Geman and Yor (1993), and Rogers and Shi (1995).
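To give a feel for the effect of averaging, a fixed-strike Asian call can be priced by plain Monte Carlo simulation under the risk-neutral measure. This is only a rough sketch: the discrete arithmetic average over `n_steps` equally spaced dates stands in for the continuous average, and the function names, the seed and the parameters are assumptions made for this example.

```python
import math
import random


def asian_call_mc(S0, K, r, sigma, T, n_steps=50, n_paths=10000, seed=0):
    """Monte Carlo price of a fixed-strike arithmetic-average Asian call.

    Paths are geometric Brownian motion with risk-neutral drift r; the
    average is taken over n_steps equally spaced dates in (0, T].
    """
    rng = random.Random(seed)
    dt = T / n_steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        s = S0
        running = 0.0
        for _ in range(n_steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            running += s
        total += max(running / n_steps - K, 0.0)
    return math.exp(-r * T) * total / n_paths


def european_call_bs(S0, K, r, sigma, T):
    """Black-Scholes European call, used only for comparison."""
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d1 - sigma * math.sqrt(T))


asian = asian_call_mc(100.0, 100.0, 0.05, 0.20, 1.0)
european = european_call_bs(100.0, 100.0, 0.05, 0.20, 1.0)
print(asian, european)
```

With a positive interest rate the simulated Asian price comes out well below the European one, since averaging reduces the effective volatility of the quantity on which the payoff depends.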
The Rogers-Shi Lower Bound. Pricing options typically involves calculating an expectation of the form $\mathbb{E}(X^+)$, where $X$ is a random variable such as $S - K$ (European call), $K - S$ (European put) or $Y - K$ with $Y = \int_0^T S\,d\mu$ (Asian option, as above). Rogers and Shi (1995), §3, point out a general method yielding a lower bound, which may be surprisingly accurate. Splitting $X$ into its positive and negative parts $X^+$ and $X^-$ ($X = X^+ - X^-$, $|X| = X^+ + X^-$), we have $X^+ \ge X$, whence

$$\mathbb{E}(X^+ \mid Z) \ge \mathbb{E}(X \mid Z),$$

and also $\mathbb{E}(X^+ \mid Z) \ge 0$, so

$$\mathbb{E}(X^+ \mid Z) \ge \big(\mathbb{E}(X \mid Z)\big)^+.$$

Thus using iterated conditional expectations and the above,

$$\mathbb{E}(X^+) = \mathbb{E}\big(\mathbb{E}(X^+ \mid Z)\big) \ge \mathbb{E}\big((\mathbb{E}(X \mid Z))^+\big).$$

The accuracy of this lower bound may be estimated from the quite general bounds

$$0 \le \mathbb{E}\big(\mathbb{E}(X^+ \mid Z) - (\mathbb{E}(X \mid Z))^+\big) \le \tfrac{1}{2}\,\mathbb{E}\sqrt{\operatorname{var}(X \mid Z)}.$$

The success of the method hinges on finding a suitable choice of conditioning variable $Z$ for which $\mathbb{E}(X \mid Z)$, and hence the expectation of its positive part, is easier to handle than $\mathbb{E}(X^+)$ but the two are still close. In the case of a fixed-strike Asian option considered above, a suitable choice is given by

$$Z := \int_0^T W(u)\,du$$

(a zero-mean Gaussian variable). For the details (which are quite complicated), and numerical illustration, we refer to Rogers and Shi (1995).

The Geman-Yor Method. We now outline the approach to Asian options of Geman and Yor (Geman and Yor (1993); Yor (1992a), Yor (1992b), Chapter 6) in terms of Bessel processes and Laplace transforms. For $\delta \ge 0$ and $W$ a standard $BM(\mathbb{R})$, consider the SDE

$$d\rho(t) = \delta\,dt + 2\sqrt{\rho(t)}\,dW(t), \qquad \rho_0 = a \ge 0.$$

This SDE has a solution $\rho = (\rho(t))_{t \ge 0}$, or $\rho^\nu$, called a Bessel-squared process with index $\nu := \tfrac{1}{2}\delta - 1$, $BESQ^\nu$. As for the name: the square root $R^\nu$ of a Bessel-squared process $\rho^\nu$ is a Bessel process $BES^\nu$, a diffusion whose
generator involves the Bessel operator giving rise to the classical Bessel functions (see Watson (1944)). The semigroup of $BESQ^\nu$ and the transition probability of $BES^\nu$ involve the Bessel function of imaginary argument, $I_\nu$. The link with mathematical finance arises because the exponential of a drifting Brownian motion (the price process in the lognormal or geometric Brownian motion model of Black-Scholes theory) is a time-changed Bessel process. Specifically (Geman and Yor (1993), Prop. 2.3):

$$\exp(W(t) + \nu t) = R^\nu\left( \int_0^t \exp\{2(W(s) + \nu s)\}\,ds \right).$$

For the fixed-strike Asian option, the payoff is

$$(A(T) - K)^+, \qquad A(x) := \frac{1}{x - t_0}\int_{t_0}^x S(u)\,du,$$

and risk-neutral valuation gives the price at time $t$ as (writing $\mathbb{E}$ for the risk-neutral expectation)

$$C_{t,T}(K) = e^{-r(T - t)}\,\mathbb{E}\left[ (A(T) - K)^+ \mid \mathcal{F}_t \right].$$

Define

$$C^{(\nu)}(h, q) := \mathbb{E}\left[ \left( \int_0^h \exp\{2(W(u) + \nu u)\}\,du - q \right)^+ \right].$$

Use of the Markov (indeed, independent increments) property of Brownian motion at time $t$ and Brownian scaling leads to

$$C_{t,T}(K) = \frac{e^{-r(T - t)}}{T - t_0}\,\frac{4S(t)}{\sigma^2}\,C^{(\nu)}(h, q),$$

where

$$\nu := \frac{2r}{\sigma^2} - 1, \qquad h := \frac{\sigma^2}{4}(T - t), \qquad q := \frac{\sigma^2}{4S(t)}\left( K(T - t_0) - \int_{t_0}^t S(u)\,du \right).$$

Thus $q$ is a random variable whose value is known at time $t$. Negative values of $q$ reflect high stock prices over the time already expired. Geman and Yor (1993), (3.1), obtain the following contingent pricing formula (which applies only when $q \le 0$):

$$C_{t,T}(K) = S(t)\,\frac{1 - e^{-r(T - t)}}{r(T - t_0)} - e^{-r(T - t)}\left( K - \frac{1}{T - t_0}\int_{t_0}^t S(u)\,du \right).$$
This explicit pricing formula - the Geman-Yor formula - somewhat resembles the classic Black-Scholes formula in structure. It does not involve the volatility $\sigma$ explicitly (only implicitly, via the stock price $S(t)$ and its integral to date, $\int_{t_0}^t S(u)\,du$).

From the Geman-Yor formula, one may make comparisons between Asian option prices and those of their European counterparts. Geman and Yor showed that:
(i) for $\nu + 1 \ge 0$ (that is, for $r \ge 0$) the Asian option price is less than that of its European counterpart, for any strike price $K$;
(ii) for $\nu < 0$ the reverse is true, at least for $K$ close enough to zero.
Note that for options on domestic stock the risk-neutral drift $r$ is the domestic spot rate, which will be positive, so case (i) applies. For options on foreign stock, or currency options, the risk-neutral drift is the difference of the foreign and domestic spot rates, and can have either sign. In particular, the widespread belief that Asian options are cheaper than European ones - which partly accounts for the popularity of Asian options - is only partly true.

As above, our ability to price Asian options is tantamount to our knowledge of the functions $C^{(\nu)}(h, q)$. Geman and Yor show that the Laplace transform (in $h$) of this function can be identified in terms of confluent hypergeometric functions (for background on these, see Slater (1960)). They find that, writing $\mu := \sqrt{2\lambda + \nu^2}$,

$$\int_0^\infty e^{-\lambda h}\,C^{(\nu)}(h, q)\,dh = \frac{\int_0^{1/(2q)} e^{-x}\,x^{\frac{1}{2}(\mu - \nu) - 2}\,(1 - 2qx)^{\frac{1}{2}(\mu + \nu) + 1}\,dx}{\lambda(\lambda - 2 - 2\nu)\,\Gamma\!\left(\frac{1}{2}(\mu - \nu) - 1\right)}.$$

Numerical inversion of the Laplace transform may now be applied to the right-hand side to obtain numerical values of $C^{(\nu)}(h, q)$. The fast Fourier transform (FFT) is applied to this problem in Eydeland and Geman (1995).

6.3.3 Barrier Options
The question of whether or not a particular stock will attain a particular level within a specified period has long been an important one for risk managers. From at least 1967 - predating both CBOE and Black-Scholes in 1973 - practitioners have sought to reduce their exposure to specific risks of this kind by buying options designed with such barrier-crossing events in mind. As usual, the motivation is that buying specific options - that is, taking out specific insurance - is a cheaper way of covering oneself against a specific danger than buying a more general one.

One-barrier options specify a stock-price level, $H$ say, such that the option pays ('knocks in') or not ('knocks out') according to whether or not level $H$ is attained, from below ('up') or above ('down'). There are thus four possibilities: 'up and in', 'up and out', 'down and in' and 'down and out'. Since
barrier options are path-dependent (they involve the behaviour of the path, rather than just the current price or price at expiry), they may be classified as exotic; alternatively, the four basic one-barrier types above may be regarded as 'vanilla barrier' options, with their more complicated variants, described below, as 'exotic barrier' options. Note that holding both a knock-in option and the corresponding knock-out is equivalent to the corresponding vanilla option with the barrier removed. The sum of the prices of the knock-in and the knock-out is thus the price of the vanilla - again showing the attractiveness of barrier options as being cheaper than their vanilla counterparts.

A barrier option is often designed to pay a rebate - a sum specified in advance - to compensate the holder if the option is rendered otherwise worthless by hitting/not hitting the barrier. We restrict attention to zero rebate here for simplicity.

Consider, to be specific, a down-and-out call option with strike $K$ and barrier $H$ (the other possibilities may be handled similarly). The payoff is (unless otherwise stated min and max are over $[0, T]$)

$$(S(T) - K)^+\,\mathbf{1}_{\{\min S(\cdot) \ge H\}} = (S(T) - K)\,\mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}},$$

so by risk-neutral pricing the value of the option is

$$DOC_{K,H} := \mathbb{E}\left[ e^{-rT}(S(T) - K)\,\mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}} \right],$$

where $S$ is geometric Brownian motion, $S(t) = p_0 \exp\{(\mu - \tfrac{1}{2}\sigma^2)t + \sigma W(t)\}$. Write $c := (\mu - \tfrac{1}{2}\sigma^2)/\sigma$; then $\min S(\cdot) \ge H$ iff $\min(ct + W(t)) \ge \sigma^{-1}\log(H/p_0)$. Writing $X$ for $X(t) := ct + W(t)$, drifting Brownian motion with drift $c$, and $m$, $M$ for its minimum and maximum processes

$$m(t) := \min\{X(s) : s \in [0, t]\}, \qquad M(t) := \max\{X(s) : s \in [0, t]\},$$

the payoff function involves the bivariate process $(X, m)$, and the option price involves the joint law of this process.

Consider first the case $c = 0$: we require the joint law of standard Brownian motion and its maximum or minimum, $(W, M)$ or $(W, m)$. Taking $(W, M)$ for definiteness, we start the Brownian motion $W$ at the origin at time zero, choose a level $b > 0$, and run the process until the first-passage time (see Exercise 5.2)

$$\tau(b) := \inf\{t \ge 0 : W(t) \ge b\}$$

at which the level $b$ is first attained. This is a stopping time, and we may use the strong Markov property for $W$ at time $\tau(b)$. The process now begins afresh at level $b$, and by symmetry the probabilistic properties of its further evolution are invariant under reflection in the level $b$ (thought of as a mirror). This reflection principle leads to the joint density of $(W(t), M(t))$ as

$$\mathbb{P}_0(W(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}}\exp\left\{ -\tfrac{1}{2}(2y - x)^2/t \right\} dx\,dy \qquad (0 \le x \le y),$$
a formula due to Lévy. (Lévy also obtained the identity in law of the bivariate processes $(M(t) - W(t), M(t))$ and $(|W(t)|, L(t))$, where $L$ is the local time process of $W$ at zero: see e.g. Revuz and Yor (1991), VI.2.) The idea behind the reflection principle goes back to work of Désiré André in 1887, and indeed further, to the method of images of Lord Kelvin (1824-1907), then Sir William Thomson, of 1848 on electrostatics. For background on this, see any good book on electromagnetism, e.g. Jeans (1925), Chapter VIII.
Lévy's formula for the joint density of $(W(t), M(t))$ may be extended to the case of general drift $c$ by the usual method for changing drift, Girsanov's theorem. The general result is
$$\mathbb{P}_0(X(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}}\exp\left\{-\frac{(2y - x)^2}{2t} + cx - \frac12 c^2 t\right\}dx\,dy \qquad (0 \le x \le y).$$
See e.g. Rogers and Williams (1994), I, (13.10), or Harrison (1985), §1.8. As an alternative to the probabilistic approach above, a second approach to this formula makes explicit use of Kelvin's language (mirrors, sources, sinks); see e.g. Cox and Miller (1972), §5.7.
Given such an explicit formula for the joint density of $(X(t), M(t))$, or equivalently of $(X(t), m(t))$, we can calculate the option price by integration. The factor $S(T) - K$, or $S - K$, gives rise to two terms, in $S$ and $K$, while the integrals, involving relatives of the normal density function $n$, may be obtained explicitly in terms of the normal distribution function $N$; both features are familiar from the Black-Scholes formula. Indeed, this resemblance makes it convenient to decompose the price $DOC_{K,H}$ of the down-and-out call into the (Black-Scholes) price of the corresponding vanilla call, $C_K$ say, and the knockout discount, $KOD_{K,H}$ say, by which the knockout barrier at $H$ lowers the price:
$$DOC_{K,H} = C_K - KOD_{K,H}.$$
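The details of the integration, which are tedious, are omitted; the result is, writing $\lambda := r - \frac12\sigma^2$,
$$KOD_{K,H} = p_0\left(\frac{H}{p_0}\right)^{2 + 2\lambda/\sigma^2} N(c_1) - Ke^{-rT}\left(\frac{H}{p_0}\right)^{2\lambda/\sigma^2} N(c_2),$$
where $p_0$ is the initial stock price as usual and $c_1$, $c_2$ are functions of the price $p = p_0$ and time $t = T$ given by
$$c_{1,2}(p, t) = \frac{\log(H^2/(pK)) + (r \pm \frac12\sigma^2)t}{\sigma\sqrt{t}}.$$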
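As a numerical sketch of the decomposition $DOC_{K,H} = C_K - KOD_{K,H}$, the following Python fragment evaluates the vanilla call, the knockout discount and the down-and-out call. It is an illustration under stated assumptions rather than a definitive implementation: the closed form used for the knockout discount is the standard reflection-principle result for a barrier at or below both spot and strike ($H \le \min(p_0, K)$), the drift is taken to be the riskless rate $r$, and all function names are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def vanilla_call(p0, K, r, sigma, T):
    """Black-Scholes price C_K of the European call without barrier."""
    d1 = (log(p0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return p0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


def knockout_discount(p0, K, H, r, sigma, T):
    """Knockout discount KOD_{K,H}; assumes H <= min(p0, K) (assumption)."""
    lam = r - 0.5 * sigma**2
    root = sigma * sqrt(T)
    c1 = (log(H * H / (p0 * K)) + (r + 0.5 * sigma**2) * T) / root
    c2 = (log(H * H / (p0 * K)) + (r - 0.5 * sigma**2) * T) / root
    power = 2.0 * lam / sigma**2
    return (p0 * (H / p0) ** (2.0 + power) * norm_cdf(c1)
            - K * exp(-r * T) * (H / p0) ** power * norm_cdf(c2))


def down_and_out_call(p0, K, H, r, sigma, T):
    """DOC_{K,H} = C_K - KOD_{K,H}."""
    return vanilla_call(p0, K, r, sigma, T) - knockout_discount(p0, K, H, r, sigma, T)


if __name__ == "__main__":
    # Hypothetical parameters: spot 100, strike 100, barrier 90.
    print(vanilla_call(100.0, 100.0, 0.05, 0.2, 1.0))
    print(down_and_out_call(100.0, 100.0, 90.0, 0.05, 0.2, 1.0))
```

Since a knock-in plus the corresponding knock-out equals the vanilla option, `knockout_discount` is also the price of the down-and-in call under the same assumptions. Two sanity checks are built into the formula: a barrier far below the spot leaves the vanilla price unchanged, and a barrier at the spot makes the down-and-out call worthless.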
The notation is that of the excellent text Musiela and Rutkowski (1997), §9.6, to which we refer for further detail. The other cases of vanilla barrier options, and their sensitivity analysis, are given in detail in Zhang (1997), Chapter 10.
Geman and Yor (1996) adapt the Laplace-transform method they developed for the Asian options of §6.3.2 to two-barrier options. They show (their Appendix 2) how, and why, the distribution of $(X(t), m(t), M(t))$ becomes simpler if the fixed time $t$ is replaced by a random time $T_\theta$, independent of $X$ and exponentially distributed with parameter $\frac12\theta^2$ (such exponentially distributed times lead directly to Laplace transforms). We refer to Geman and Yor (1996) for details of their proof, which involves Girsanov's theorem, Brownian scaling, Wiener-Hopf factorization and excursion theory. The Geman-Yor method gives numerical results for pricing after numerical inversion of the Laplace transform. It also gives numerical results for hedging to the same accuracy, which is a rare occurrence for path-dependent options. In particular, this approach completely avoids the dangers inherent in Monte-Carlo simulation applied to barrier-crossing problems. Before simulating, we must discretize to pass from a diffusion in continuous time to a random walk in discrete time, which reduces calculating probabilities to counting paths. However, the path counts inevitably depend on the specifics of the discretization used (irrelevant to the original problem), to a degree that leads to unavoidable loss of accuracy.
Many other, exotic, types of barrier options have been developed.
These include:
(i) moving-boundary (or 'floating-barrier') options, where the barrier at constant level $H$ is replaced by a function $H(t)$ of time;
(ii) Asian barrier options, where the boundary crossing is by an average of the price process rather than the price process itself;
(iii) barrier options where the barrier operates only for part of the time interval $[0, T]$: forward start, for an interval of the form $[u, T]$; early ending, for one of the form $[0, v]$; window, for one of the form $[u, v] \subset [0, T]$;
(iv) outside barrier options, where the barrier and the payoff function relate to different underlying assets (for instance, a stock and a currency).
For details, see e.g. Zhang (1997), Chapter 11.

6.3.4 Lookback Options
In everything we have encountered so far, uncertainty has unfolded with time, and our task has been to make optimal use of the information available to date. For options, at expiry T the investor is in possession of the history of the price evolution over the time interval [0, T] of the option's life, and it may well be - will be - that with hindsight he could have done better than he actually did without hindsight. It is only natural to look back with regret at what inability to foresee the future has cost one. If only one could buy at the low, and sell at the high . . .
Goldman, Sosin, and Gatto (1979) made two key realizations:
(i) this natural wish on the part of investors would provide a potential market for options providing them with the right to do just that, looking back in time;
(ii) the relevant mathematics (risk-neutral valuation, and the distributional results on drifting Brownian motion and its maximum and minimum, used above in our treatment of barrier options) existed to price such options.
It thus became as practical a matter to develop, and price, such lookback options as other option types we have already encountered, and this was accordingly done. Of course, the term 'option' is applied here by extension rather than strictly: the options described above will always be exercised! They will accordingly be more expensive than options encountered before.
We write $S$ for the price process, $M^S_t$, $m^S_t$ for its maximum and minimum over $[0, t]$, $M^S_{[u,v]}$, $m^S_{[u,v]}$ for its maximum and minimum over $[u, v]$, and similarly with $S$ replaced by other processes $X$. The two basic types of standard, vanilla lookback options are the lookback call, with payoff
$$LC(T) := (S(T) - m^S_T)^+ = S(T) - m^S_T,$$
giving one the right at time $T$ to buy at the low over $[0, T]$, and the lookback put, with payoff
$$LP(T) := (M^S_T - S(T))^+ = M^S_T - S(T),$$
giving one the right at time $T$ to sell at the high over $[0, T]$. Risk-neutral pricing gives the arbitrage price at time $t \in [0, T)$ of the lookback call as
$$LC(t) = e^{-r(T-t)}\,\mathbb{E}\left[(S(T) - m^S_T) \mid \mathcal{F}_t\right] = e^{-r(T-t)}\,\mathbb{E}(S(T) \mid \mathcal{F}_t) - e^{-r(T-t)}\,\mathbb{E}(m^S_T \mid \mathcal{F}_t) = I_1 - I_2,$$
say. Now
$$I_1 = e^{-r(T-t)}\,\mathbb{E}(S(T) \mid \mathcal{F}_t) = S(t),$$
as the discounted price process is a $\mathbb{P}^*$-martingale. Also
$$S(t) = p_0\exp\left\{(r - \tfrac12\sigma^2)t + \sigma W(t)\right\},$$
with $W$ a standard $\mathbb{P}^*$-Brownian motion, so for $t \le u \le T$,
$$S(u) = S(t)\exp\{X(u) - X(t)\},$$
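where
$$X(t) = ct + \sigma W(t), \qquad c := r - \tfrac12\sigma^2,$$
is a drifting $\mathbb{P}^*$-Brownian motion. Now
$$m^S_{[t,T]} = \min_{[t,T]} S(\cdot) = S(t)\exp\{m^X_{[t,T]}\},$$
where $m^X_{[t,T]}$ is understood here as the minimum over $[t, T]$ of the increment $X(\cdot) - X(t)$, and
$$m^S_T = \min\{m^S_t,\ m^S_{[t,T]}\}.$$
By stationary independent increments of $X$, $m^X_{[t,T]}$ is independent of $\mathcal{F}_t$ and equal in law to $m^X_\tau$, where $\tau := T - t$. To evaluate $I_2$, it thus suffices to find
$$f(s, m) := \mathbb{E}\left[\min\{m,\ s\exp\{m^X_\tau\}\}\right],$$
as then
$$I_2 = e^{-r(T-t)} f(S(t), m^S_t),$$
giving $I_2$ in terms of data known at time $t$. Now
$$f(s, m) - m = \mathbb{E}\left[\min\{m,\ s\exp\{m^X_\tau\}\} - m\right] = \mathbb{E}\left[(s\exp\{m^X_\tau\} - m)\,\mathbf{1}_{\{m^X_\tau \le \log(m/s)\}}\right].$$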
Since we already have an explicit expression for the density of the minimum of a drifting Brownian motion, we can calculate the expectation on the right by multiplying the function on the right by the density and integrating. As before, the calculations are simplified by reducing to the driftless case $c = 0$ by Girsanov's theorem. We omit the details of the calculation, which are quite tedious; the result is, writing $s$ for $S(t)$, $m$ for $m^S_t$, $\tau$ for $T - t$ as before, and $r_{1,2} := r \pm \frac12\sigma^2$,

with a similar formula for the put, $LP(t)$. For details, see Musiela and Rutkowski (1997), Proposition 9.7.1.
The options above are floating strike lookback options, in that the role previously played by the strike $K$ is now played by a random variable. But one can also have fixed strike lookback options, with payoffs
$$\max(M^S_T - K,\ 0), \qquad \max(K - m^S_T,\ 0)$$
for calls and puts respectively. One can have partial lookback options, with payoffs
$$\max(S(T) - \lambda m^S_T,\ 0)$$
for calls,
$$\max(\lambda M^S_T - S(T),\ 0)$$
for puts, where $\lambda \in (0, 1]$ is the degree of partiality: one only partially eliminates regret, and pays less for the option accordingly. And one can have American lookback options, which may be exercised early. For details and background on such exotic lookback options, see Zhang (1997), Chapter 12.
The theory above supposes that prices are monitored continuously. In practice, prices are only monitored discretely, and as we saw in our treatment of barrier options, discretization can produce appreciable errors, particularly if Monte-Carlo simulation is resorted to. For a contemporary account of discrete lookback (and barrier) options, see Levy and Mantion (1997).

6.3.5 Binary Options
A binary (or digital) option is a contract whose payoff depends in a discontinuous way on the terminal price of the underlying asset. The most popular variants are:
Cash-or-nothing options. Here the payoffs at expiry of the European call resp. put are given by
$$BCC(T) = C\,\mathbf{1}_{\{S(T) > K\}} \quad\text{resp.}\quad BCP(T) = C\,\mathbf{1}_{\{S(T) < K\}}.$$
=
S{T) 1 Kl {K1 < S ( T)
The valuation of these options shows the power of the risk-neutral valuation formula. Recall the situation for a European call. Here the payoff is given by $C(T) = (S(T) - K)\,\mathbf{1}_{\{S(T) > K\}}$ and the arbitrage price by the risk-neutral valuation formula (6.14):
$$C(t) = e^{rt}\,\mathbb{E}^*\left[C(T)e^{-rT} \mid \mathcal{F}_t\right].$$
We can now price all the above options by using the linearity of the expectation operator; see Exercise 6.4.
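The linearity argument can be sketched in a few lines of Python. The fragment below is an illustration only, in the Black-Scholes model with constant $r$ and $\sigma$: it prices the cash-or-nothing call and put, prices an asset-or-nothing call (payoff $S(T)\,\mathbf{1}_{\{S(T) > K\}}$, a standard binary variant assumed here), and recovers the European call from $(S(T) - K)\,\mathbf{1}_{\{S(T) > K\}} = S(T)\,\mathbf{1}_{\{S(T) > K\}} - K\,\mathbf{1}_{\{S(T) > K\}}$. Function names and parameter values are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d1_d2(s, K, r, sigma, tau):
    """Black-Scholes d1, d2 for spot s and time to expiry tau = T - t."""
    d1 = (log(s / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return d1, d1 - sigma * sqrt(tau)


def cash_or_nothing_call(s, K, r, sigma, tau, cash=1.0):
    """Price of the payoff cash * 1{S(T) > K}."""
    _, d2 = d1_d2(s, K, r, sigma, tau)
    return cash * exp(-r * tau) * norm_cdf(d2)


def cash_or_nothing_put(s, K, r, sigma, tau, cash=1.0):
    """Price of the payoff cash * 1{S(T) < K}."""
    _, d2 = d1_d2(s, K, r, sigma, tau)
    return cash * exp(-r * tau) * norm_cdf(-d2)


def asset_or_nothing_call(s, K, r, sigma, tau):
    """Price of the payoff S(T) * 1{S(T) > K} (assumed variant)."""
    d1, _ = d1_d2(s, K, r, sigma, tau)
    return s * norm_cdf(d1)


def european_call(s, K, r, sigma, tau):
    """Call price by linearity: asset-or-nothing minus K cash-or-nothing."""
    return (asset_or_nothing_call(s, K, r, sigma, tau)
            - K * cash_or_nothing_call(s, K, r, sigma, tau, cash=1.0))


if __name__ == "__main__":
    print(european_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

The cash-or-nothing call and put together pay the cash amount with certainty, so their prices sum to the discounted cash amount; this gives a quick consistency check on any implementation.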
6.4 Discrete- versus Continuous-time Market Models
6.4.1 Discrete- to Continuous-time Convergence Reconsidered
Discrete-time financial market models, where the underlying probability space is finite and trading occurs at only finitely many points in time, are largely understood (cf. Chapter 4). The intuitive assumption of no arbitrage can be characterised by the probabilistically desirable martingale property of the security price process with respect to an equivalent martingale measure. Using the risk-neutral valuation principle, we are able to price attainable contingent claims in a unique way. For such claims, it is also possible to obtain risk-free hedging portfolios satisfying the needs of risk-management restrictions. In addition, we can explicitly characterise the property of completeness of the market via the uniqueness of the equivalent martingale measure.
As we have seen so far in this chapter, our knowledge concerning continuous-time securities markets is less satisfactory. The concept of no-arbitrage has to be replaced by the more involved concept of no free lunch with vanishing risk, and true market completeness is hardly ever encountered in a continuous-time setting (we are only able to replicate contingent claims in some reasonably large subset of the set of contingent claims).
Despite these shortcomings, continuous-time models have become central to the financial community, both academics and practitioners. The economic acceptability of continuous-time models is usually motivated by some sort of limiting procedure: steadily increasing the number of trading dates leads to a continuous model. This naive approximation, however, does not allow continuous-time models to be viewed as proper limits of discrete-time financial markets, because the fundamental properties of no arbitrage and completeness are in general not preserved by such limiting procedures. The following question (first proposed by Kreps (1982)) is the key to our discussion:
If an idealised economy (i.e. a continuous-time security market model) has a certain property ($\Theta$ say), is it possible to approximate the idealised economy by a sequence of 'real-life' economies (i.e. finite securities market models) such that each finite economy has property $\Theta$?
If so, we say the approximation procedure has the structure-preserving property.
A property of considerable practical and theoretical interest is the property of 'dynamic completeness' of a market model. For example, the Black-Scholes risk-neutral model is often criticised for its assumptions of continuous dynamic hedging (and constant volatility, which we addressed in §6.2). Practitioners look for some tangible ideas to improve risk-management faced with violation of the Black-Scholes world by discrete-dynamic hedging. A proper convergence concept quantifying the general validity and robustness of the continuous security market will allow for such improvement. Furthermore, this convergence would allow for another way to tackle these problems numerically.
There are basically two approximation procedures that have the structure-preserving property.
The first approximation method relies on pathwise approximation results for stochastic processes and was developed by Taqqu, Willinger et al. in a series of papers (Cutland, Kopp, and Willinger (1991), Taqqu and Willinger (1987) and Willinger and Taqqu (1991)). Such pathwise approximations rely on the fine structure of the filtrations (flow of information) used and therefore provide a natural approach to dynamic hedging and completeness. The drawback of this method is the relative complexity of the convergence results and the limited computational tractability (in general the number of nodes in the approximating tree grows exponentially with the number of trading days).
The second approximation method is based on the notion of weak convergence for stochastic processes. In this framework, the limiting stochastic process of the risky assets is approximated by finite Markov chain models. The concept of weak convergence of these objects is well-established and can readily be used. The computational efficiency of the approximation is, due to the Markovian nature of the approximating models, very high. The obvious drawback, of course, is that we now use a weaker convergence theory, which only controls the distributions rather than the particular paths. Furthermore, the theory applies only under the assumptions of a diffusion setting (which is included in the setting of §6.2). In the sequel we will use the second approximation method, which is capable of producing all relevant results for our purposes here.

6.4.2 Finite Market Approximations
We now want to develop a general theory of approximations of continuous-time markets, as defined in §6.2 (restricted to the diffusion setting), by discrete-time market models. Our analysis in Chapter 4 and §6.2 showed that the defining properties of market models are the ($(d+1)$-dimensional) stochastic process $S$, describing the price processes (discounted with respect to an appropriate numeraire) of the modelled securities, and the (one of possibly several) equivalent martingale measure used for pricing (and hedging) purposes. We can characterise such a martingale measure by its Radon-Nikodym derivative $L$ with respect to the original probability measure (compare §5.8 and Remark 6.1.4). To emphasise this we denote the financial market models by $\mathcal{M}(S, L)$. We assume that the numeraire processes are bounded and, to avoid notational distractions, assume they are equal to one, i.e. we are working with the already discounted security prices in this section.
Definition 6.4.1. A finite market approximation of the continuous-time securities market model $\mathcal{M}(S, L)$ is a sequence of discrete-time market models $(\mathcal{M}^{(n)}(S^{(n)}, L^{(n)}))_{n=1}^{\infty}$, where the $\{S^{(n)}(t), t \in [0, T]\}$ are $(d+1)$-dimensional and the $\{L^{(n)}(t), t \in [0, T]\}$ are real-valued stochastic processes defined on
probability spaces $(\Omega^{(n)}, \mathcal{F}^{(n)}, \mathbb{P}^{(n)})$ with the following properties:
(i) For each $n \ge 0$, the sample paths of the process $S^{(n)}$ are piecewise constant and RCLL (right-continuous with left limits) and jump only at times $t$ given by a grid $\mathcal{T}_n = \{0 = t_0^{(n)} < t_1^{(n)} < t_2^{(n)} < \dots < t_{k_n}^{(n)} = T\}$, where for $n \to \infty$ the grid mesh $\sup_k |t_k^{(n)} - t_{k-1}^{(n)}| \to 0$.
(ii) For each $n \ge 0$, $S^{(n)} = \{S^{(n)}(t) : t \in \mathcal{T}_n\}$ is a finite Markov chain.
(iii) For each $n \ge 0$, $\{L^{(n)}(t), t \in \mathcal{T}_n\}$ is the Radon-Nikodym derivative (used for the change of measure) in the discrete-time market model $\mathcal{M}^{(n)}$.
(iv) $(S^{(n)}, L^{(n)})$ converges weakly to $(S, L)$, that is, for all bounded continuous functions $h : D^{d+2}[0, T] \to \mathbb{R}$, we have
$$\lim_{n \to \infty}\mathbb{E}_n\left[h(S^{(n)}, L^{(n)})\right] = \mathbb{E}\left[h(S, L)\right],$$
where $D^{d+2}[0, T]$ is the space of all RCLL functions from $[0, T]$ to $\mathbb{R}^{d+2}$, and $\mathbb{E}_n$, $\mathbb{E}$ denote the expectations under the probability measures $\mathbb{P}^{(n)}$, $\mathbb{P}$.
We denote by $\mathbb{Q}^{(n)}$, $\mathbb{Q}$ the martingale measure on $(\Omega^{(n)}, \mathcal{F}^{(n)})$ resp. $(\Omega, \mathcal{F})$ induced by the Radon-Nikodym derivative processes $L^{(n)}$, $L$, i.e.
$$\frac{d\mathbb{Q}^{(n)}}{d\mathbb{P}^{(n)}} = L^{(n)}, \qquad \frac{d\mathbb{Q}}{d\mathbb{P}} = L.$$
Of course, the above definition doesn't require the approximating markets to share the properties of the limiting market. Due to the importance of this we give the following definition.
Definition 6.4.2. A finite market approximation is structure-preserving with respect to the property $\Theta$ of the continuous-time limit, if each discrete-time market has the property $\Theta$.
The property we will focus on is dynamic completeness.
By the theory of weak convergence (see §5.9), we have the following consequence for contingent claim pricing using finite market approximation to a continuous-time market model:
Theorem 6.4.1. Let $\mathcal{M}^{(n)}(S^{(n)}, L^{(n)})$ be a finite market approximation of a continuous-time market model $\mathcal{M}(S, L)$ and assume that $\{L^{(n)}(T)\}$ is uniformly integrable. For contingent claims $Y = f(S)$ and $Y^{(n)} = f(S^{(n)})$, with $f : D^{d+1}[0, T] \to \mathbb{R}_+$ a bounded function, and time $t = 0$ prices $\Pi_Y$ resp. $\Pi_{Y^{(n)}}$ of the contingent claim in their respective markets, we have the following limiting relation:
$$\Pi_Y = \lim_{n \to \infty}\Pi_{Y^{(n)}}.$$
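Proof. We will use the risk-neutral valuation formula in the discrete- and continuous-time markets. Since $\{L^{(n)}(T)\}$ is uniformly integrable and $f$ is bounded, the sequence of random variables $\{L^{(n)}(T)f(S^{(n)})\}$ is uniformly integrable. This and convergence in distribution give convergence of expectations (see e.g. Theorem 25.12 in Billingsley (1986)):
$$\Pi_Y = \mathbb{E}_{\mathbb{Q}}[f(S)] = \mathbb{E}[L(T)f(S)] = \lim_{n \to \infty}\mathbb{E}_n\left[L^{(n)}(T)f(S^{(n)})\right] = \lim_{n \to \infty}\mathbb{E}_{\mathbb{Q}^{(n)}}\left[f(S^{(n)})\right] = \lim_{n \to \infty}\Pi_{Y^{(n)}}. \qquad \Box$$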
Remark 6.4.1. (i) Observe that Theorem 6.4.1 applies to all finite market approximations (not just to structure-preserving approximations).
(ii) A simple sufficient condition for uniform integrability is uniform boundedness in $L^p$ for some $p > 1$ (see §2.6 and Williams (1991), Chapter 13).
Example. As an example one can consider pricing a European put option in a continuous-time model. The theorem tells us that we can price the put by pricing the corresponding put in the appropriate discrete-time setting and then computing the limit of the discrete-time model prices (a European call is then priced by put-call parity).
The next theorem explores the consequences for trading strategies for contingent claims.
Theorem 6.4.2. Let $\mathcal{M}^{(n)}(S^{(n)}, L^{(n)})$ be a finite market approximation of a continuous-time market model $\mathcal{M}(S, L)$, and assume that $\{S^{(n)}\}$ is a good sequence of semi-martingales. Assume that the sequence of discrete-time trading strategies $\varphi^{(n)}$ converges weakly to a continuous-time trading strategy $\varphi$. Then the (discrete-time) gains $G_{\varphi^{(n)}}$ and value $V_{\varphi^{(n)}}$ processes converge weakly to their continuous-time counterparts $G_\varphi$ and $V_\varphi$.
The proof is a direct consequence of the definition of the processes and the goodness property of semi-martingales.
Of course the real work in applying Theorem 6.4.2 is to show the goodness of the underlying approximating process $\{S^{(n)}\}$ and the weak convergence of the trading strategies. Having established these facts, one can apply the theorem to justify 'discrete-time hedging of continuous-time claims'. Suppose we construct a continuous-time hedging strategy $\varphi$ to hedge a contingent claim $X$; then the theorem tells us that under appropriate conditions on the approximating discrete-time models, we can hedge (in the limit) the contingent claim with the induced discrete-time strategies. Given that in practice trading is discrete, this is very reassuring for risk managers trying to cover their positions with trading strategies deduced from continuous-time models.
6.4.3 Examples of Finite Market Approximations
We consider a Black-Scholes type continuous-time financial market model with a diffusion setting within the framework of §6.2. Let us consider a securities market consisting of $d$ risky stocks and one locally risk-free bond. The $d$-dimensional vector of stock prices, $S$, and the bond price, $B$, are described by the stochastic differential equations
$$dB(t) = B(t)r(S(t))\,dt, \qquad B(0) = 1,$$
$$dS(t) = b(S(t))\,dt + \sigma(S(t))\,dW(t), \qquad S_i(0) = p_i \in (0, \infty), \qquad (6.16)$$
where $W = (W_1, \dots, W_d)$ is a $d$-dimensional standard Brownian motion defined on a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We assume that the appreciation rate vector $b : \mathbb{R}^d \to \mathbb{R}^d$, the volatility matrix $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$ and the interest-rate process $r : \mathbb{R}^d \to \mathbb{R}$ are continuous. Furthermore, $r$ is assumed to be bounded and $\sigma$ to be non-singular. Define the covariance matrix $a(x) = \sigma(x)\sigma'(x) = (a_{ij}(x))_{i,j = 1, \dots, d}$. This is necessarily non-negative definite; we assume that it is positive definite: then for some number $\delta > 0$,

(6.17)
In addition, we assume that $b$ and $\sigma$ satisfy the uniform Lipschitz condition, that is, there exists a constant $C > 0$ such that for all $x, y \in \mathbb{R}^d$,
$$|b(x) - b(y)| + \|\sigma(x) - \sigma(y)\| \le C|x - y|. \qquad (6.18)$$
Given this setting, it follows by the results of §6.2 that the market $\mathcal{M} = \mathcal{M}(B, S)$ described by the above bond process $B$ and the $d$-dimensional vector $S$, with the bond price process used as a numeraire, is arbitrage-free and complete (in the restricted sense). In particular, there exists a unique martingale measure given by its Radon-Nikodym derivative
$$L(t) = \exp\left\{-\int_0^t \gamma(S(u))'\,dW(u) - \frac12\int_0^t \|\gamma(S(u))\|^2\,du\right\}, \qquad 0 \le t \le T,$$
with
$$\gamma(x) = \sigma(x)^{-1}(b(x) - r(x)\mathbf{1}_d).$$
(Recall $\mathbf{1}_d = (1, \dots, 1)'$.) Observe that $L$ satisfies the stochastic differential equation
$$dL(t) = -\gamma'(S(t))L(t)\,dW(t), \qquad L(0) = 1.$$
($L$ is the stochastic exponential as defined in §5.10, compare §6.2.2.) Our aim now is to construct a sequence of discrete-time financial markets $\mathcal{M}^{(n)}$, sharing the properties of no-arbitrage and completeness, which approximates the above continuous-time financial market.
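The stochastic differential equation for the Radon-Nikodym derivative $L$ follows from Itô's formula in one step. The following short derivation is a sketch; the auxiliary process $Z$ is notation introduced here only for this purpose.

```latex
% Z is an auxiliary process (notation introduced for this sketch only).
\begin{align*}
Z(t) &:= -\int_0^t \gamma(S(u))'\,dW(u) - \frac12\int_0^t \|\gamma(S(u))\|^2\,du,
  \qquad L(t) = e^{Z(t)},\\
dZ(t) &= -\gamma(S(t))'\,dW(t) - \tfrac12\|\gamma(S(t))\|^2\,dt,
  \qquad d\langle Z\rangle(t) = \|\gamma(S(t))\|^2\,dt,\\
dL(t) &= L(t)\,dZ(t) + \tfrac12 L(t)\,d\langle Z\rangle(t)\\
      &= L(t)\bigl(-\gamma(S(t))'\,dW(t) - \tfrac12\|\gamma(S(t))\|^2\,dt\bigr)
         + \tfrac12 L(t)\|\gamma(S(t))\|^2\,dt\\
      &= -\gamma(S(t))'L(t)\,dW(t).
\end{align*}
```

The drift terms cancel, which is why $L$ is a local martingale starting from $L(0) = e^{0} = 1$.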
We do this by utilising a finite Markov-chain approximation scheme (for an excellent overview of such methods, see Kushner and Dupuis (1992)), and start by approximating the stock price processes.
Let $h > 0$ be a scalar approximation parameter. We wish to find a sequence of Markov chains on finite state spaces that converges in distribution (as $h \to 0$) to the process defined in (6.16) over the time interval $[0, T]$. The basis of the approximation is a discrete-time parameter, finite state space Markov chain $\{\xi_n^h, n < \infty\}$ whose 'local properties' are consistent with those given in (6.16). The continuous-time parameter approximating process will be a piecewise constant interpolation of this chain, with appropriately chosen interpolation intervals.
To make the above precise, for each $h > 0$ let $\{\xi_n^h, n < \infty\}$ be a discrete parameter Markov chain on a discrete state space $S_h \subset \mathbb{R}^d$. Suppose we have an interpolation interval $\Delta t^h(x) > 0$, and define $\Delta t_n^h = \Delta t^h(\xi_n^h)$. Let $\sup_x \Delta t^h(x) \to 0$ as $h \to 0$, but $\inf_x \Delta t^h(x) > 0$ for each $h > 0$. Define the difference $\Delta\xi_n^h = \xi_{n+1}^h - \xi_n^h$ and let $\mathbb{E}_{x,n}^h$ resp. $\mathrm{Cov}_{x,n}^h$ denote the conditional expectation resp. covariance given $\{\xi_i^h, i \le n, \xi_n^h = x\}$. Suppose that the chain obeys the following 'local consistency' conditions:
$$\sup_n \|\xi_{n+1}^h - \xi_n^h\| \to 0 \quad \text{as } h \to 0,$$
$$\mathbb{E}_{x,n}^h(\Delta\xi_n^h) = b^h(x)\Delta t^h(x) = b(x)\Delta t^h(x) + o(\Delta t^h(x)),$$
$$\mathrm{Cov}_{x,n}^h(\Delta\xi_n^h - \mathbb{E}_{x,n}^h\Delta\xi_n^h) = a^h(x)\Delta t^h(x) = a(x)\Delta t^h(x) + o(\Delta t^h(x)).$$
Note that the chain has the 'local properties' of the diffusion process (6.16), in the following sense:
$$\mathbb{E}[S(t + \Delta t) - S(t) \mid S(t)] \approx b(S(t))\Delta t, \qquad \mathrm{Cov}[S(t + \Delta t) - S(t) \mid S(t)] \approx a(S(t))\Delta t.$$
We outline the construction of such a Markov chain (the reader should consult He (1990) for details). Assume $T = 1$ and set $h = 1/n$, hence our grid
$$\mathcal{T}_n = \{0 = t_0^{(n)} < t_1^{(n)} < \dots < t_n^{(n)} = T\} \quad \text{with } t_k^{(n)} = k/n,\ k = 0, \dots, n.$$
Now construct a sequence of triangular arrays of independent, identically distributed, $d$-dimensional random vectors $(\epsilon_k^{(n)})_{k \le n}$, with components $\epsilon_{k,i}^{(n)}$ uncorrelated, but possibly dependent. Each component takes exactly $d + 1$ different values and
$$\mathbb{E}(\epsilon_{k,i}^{(n)}) = 0, \qquad \mathbb{E}(\epsilon_{k,i}^{(n)}\epsilon_{k,j}^{(n)}) = \delta_{ij}.$$
We now define the $d$-variate, $(d+1)$-nomial approximation process for the diffusion, $S^{(n)}$, as the solution of the stochastic difference equation (we omit the index $n$ for the grid)
$$S^{(n)}(t_{k+1}) = S^{(n)}(t_k) + \frac{b(S^{(n)}(t_k))}{n} + \sigma(S^{(n)}(t_k))\frac{\epsilon_k^{(n)}}{\sqrt{n}}, \qquad (6.19)$$
and $S^{(n)}(0) = S(0)$. From (6.19), we see that the random vector $\epsilon_k^{(n)}$ is used to approximate the random increment of the Brownian motion from time $t_k$ to time $t_{k+1}$, and that $S^{(n)}$ is a Markov chain. We check the consistency requirements: for the drift condition we use that the $\epsilon_k^{(n)}$ have expectation vector zero:
$$\mathbb{E}\left[S^{(n)}(t_{k+1}) - S^{(n)}(t_k) \mid S^{(n)}(t_k) = x\right] = \frac{b(x)}{n} + \frac{\sigma(x)}{\sqrt{n}}\,\mathbb{E}(\epsilon_k^{(n)}) = b(x)\frac{1}{n};$$
for the covariance it follows that
$$\mathrm{Cov}\left[S^{(n)}(t_{k+1}) - S^{(n)}(t_k) \mid S^{(n)}(t_k) = x\right] = \frac{1}{n}\sigma(x)\,\mathrm{Cov}(\epsilon_k^{(n)})\,\sigma(x)' = a(x)\frac{1}{n}.$$
Hence the constructed Markov chain has the local consistency property. Define $S^{(n)}(t) = S^{(n)}(t_k)$ for $t_k \le t < t_{k+1}$; then the sample paths of $S^{(n)}$ are piecewise constant and only have jumps at $t_k$ (of course $S^{(n)}(T) = S^{(n)}(t_n)$). In the same way, replacing differential equations by difference equations, we model bond price processes $B^{(n)}$ as
$$B^{(n)}(t_{k+1}) = B^{(n)}(t_k)\left(1 + \frac{r(S^{(n)}(t_k))}{n}\right), \qquad B^{(n)}(0) = 1,$$
and the Radon-Nikodym processes $L^{(n)}$ as
$$L^{(n)}(t_{k+1}) = L^{(n)}(t_k)\left(1 - \gamma(S^{(n)}(t_k))'\frac{\epsilon_k^{(n)}}{\sqrt{n}}\right), \qquad L^{(n)}(0) = 1.$$
We refer the reader to He (1990) for the actual construction of the processes $L^{(n)}$, starting with the state-price vectors (compare §1.4). In our current setting, we know from §4.7.2 that the discrete-time markets are free of arbitrage and complete, so $L^{(n)}(T)$ defines the unique martingale measure. We have the following convergence theorem.
Theorem 6.4.3. The sequence of discrete-time financial markets $\mathcal{M}^{(n)}((B^{(n)}, S^{(n)}), L^{(n)})$ is a structure-preserving finite market approximation with respect to dynamic completeness of $\mathcal{M}((B, S), L)$.
Proof. We already know that completeness is shared by each of the $\mathcal{M}^{(n)}$'s and $\mathcal{M}$. So it only remains to show that $X^{(n)} = ((B^{(n)}, S^{(n)}), L^{(n)})$ converges weakly to $X = ((B, S), L)$. This is done by applying the martingale
central limit theorem (Ethier and Kurtz (1986), Chapter 7, and again see He (1990) for the computational details). $\Box$
We now show weak convergence of contingent prices. Here a contingent claim is defined to be a random variable of the form $Y = g(S)$ for some measurable, bounded function $g : D^d[0, T] \to \mathbb{R}_+$. Define by $\Pi_Y$, $\Pi_Y^{(n)}$ the time $t = 0$ values of the contingent claim in the continuous-time market and the $n$th approximating market.
Proposition 6.4.1. We have the following convergence:
$$\lim_{n \to \infty}\Pi_Y^{(n)} = \Pi_Y.$$
Proof. Since the discount factors $1/B$ resp. $1/B^{(n)}$ are bounded, we only have to show uniform integrability of $L^{(n)}(T)$ to be able to apply Theorem 6.4.1. Using Remark 6.4.1, it is enough to show uniform boundedness of $L^{(n)}(T)$ in $L^2$. The continuity of $b$, $\sigma$ and $r$ together with the condition on the covariance matrix (6.17) ensure that $\gamma$ is continuous. Furthermore, mimicking the uniform boundedness proof of the successive approximations from any existence proof for strong solutions of stochastic differential equations (see §5.8 or Kloeden and Platen (1992), proof of Theorem 4.5.3), we can show that the sequences $S^{(n)}$ are uniformly $L^2$ bounded, i.e. for $n = 1, 2, \dots$
$$\sup_{0 \le k \le n}\mathbb{E}\left(|S_i^{(n)}(t_k)|^2\right) \le C < \infty, \qquad i = 1, \dots, d$$
(recall the initial conditions are $S_i(0) = p_i \in (0, \infty)$, $i = 1, \dots, d$, and that the components of the $\epsilon_k^{(n)}$ have second moment equal to 1). So $\gamma(S^{(n)}(t_k))$ is uniformly bounded for $n = 1, 2, \dots$:
$$\sup_{0 \le k \le n}\mathbb{E}\left(|\gamma_i(S^{(n)}(t_k))|^2\right) \le C_\gamma < \infty, \qquad i = 1, \dots, d \qquad (6.20)$$
for some constant $C_\gamma$. Now using successively the Cauchy-Schwarz inequality, independence of the $\epsilon_k^{(n)}$ and $S^{(n)}(t_k)$, zero correlation between the components of the $\epsilon_k^{(n)}$, and condition (6.20):
We thus see that
$$\mathbb{E}\left[L^{(n)}(T)^2\right] \le \exp\{dC_\gamma\},$$
so $L^{(n)}(T)$ is uniformly $L^2$-bounded and the claim follows. $\Box$
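Proposition 6.4.1 can be illustrated numerically in the classical Black-Scholes case. The Python sketch below is an illustration under stated assumptions, not the construction of He (1990): it takes $d = 1$, $b(s) = bs$, $\sigma(s) = \sigma s$ and constant $r$ in (6.19), lets $\epsilon_k^{(n)}$ take the values $\pm 1$ with probability $\frac12$ each, replaces the step $1/n$ by $T/n$ for a general horizon, and prices a European put by backward induction under the unique martingale measure of the resulting binomial market. Function names and parameter values are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_put(s0, K, r, sigma, T):
    """Continuous-time (Black-Scholes) price of a European put."""
    d1 = (log(s0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return K * exp(-r * T) * norm_cdf(-d2) - s0 * norm_cdf(-d1)


def binomial_put(s0, K, r, b, sigma, T, n):
    """Put price in the n-step binomial market built from (6.19) with d = 1."""
    dt = T / n
    up = 1.0 + b * dt + sigma * sqrt(dt)      # stock factor when epsilon = +1
    down = 1.0 + b * dt - sigma * sqrt(dt)    # stock factor when epsilon = -1
    growth = 1.0 + r * dt                     # one-step bond factor
    q = (growth - down) / (up - down)         # martingale probability of an up move
    if not 0.0 < q < 1.0:
        raise ValueError("n too small: no martingale measure on this tree")
    # Terminal payoffs, indexed by the number k of up moves.
    values = [max(K - s0 * up**k * down**(n - k), 0.0) for k in range(n + 1)]
    # Backward induction: discounted conditional expectation under q.
    for step in range(n, 0, -1):
        values = [(q * values[k + 1] + (1.0 - q) * values[k]) / growth
                  for k in range(step)]
    return values[0]


if __name__ == "__main__":
    target = black_scholes_put(100.0, 100.0, 0.05, 0.2, 1.0)
    for n in (10, 100, 1000):
        print(n, binomial_put(100.0, 100.0, 0.05, 0.10, 0.2, 1.0, n), target)
```

The appreciation rate `b` enters the tree but not the limit: the change to the martingale measure removes it, so the discrete prices approach the Black-Scholes put price whatever value of `b` is chosen, provided `n` is large enough for `q` to lie in $(0, 1)$.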
To apply Theorem 6.4.2 we have to show:
Proposition 6.4.2. $(S^{(n)})$ is a good sequence of semi-martingales.
Proof. For the general case we use Theorem 5.11.3 and the condition (G) of §5.11.4. We will only outline a proof in the classical Black-Scholes case (see Duffie and Protter (1992) for further details). In this case we have
$$S^{(n)}(t_{k+1}) = S^{(n)}(t_k) + S^{(n)}(t_k)\left(\frac{b}{n} + \frac{\sigma}{\sqrt{n}}\epsilon_k^{(n)}\right).$$
Using the piecewise-constant process $S^{(n)}(t)$, we can write this as
$$dS^{(n)}(t) = S^{(n)}(t)\,dX^{(n)}(t),$$
with $X^{(n)}$ for $t_k \le t < t_{k+1}$ defined as
$$X^{(n)}(t) = \frac{1}{\sqrt{n}}\sum_{j=1}^{nt_k}\left(\frac{b}{\sqrt{n}} + \sigma\epsilon_j^{(n)}\right) =: \frac{1}{\sqrt{n}}\sum_{j=1}^{nt_k} Y_j^{(n)}.$$
We already know that $S^{(n)}$ converges weakly, so by Proposition 5.11.1 we can conclude that $S^{(n)}$ is good if we can show that $X^{(n)}$ converges weakly and is good. Weak convergence of $X^{(n)}$ (to $bt + \sigma W(t)$) follows along the lines of the example in §5.11.3. To show that $X^{(n)}$ is good we show that it satisfies condition (G) of §5.11.4 and use Theorem 5.11.3. Now the $(Y_k^{(n)})_{k \le n}$ are independent, and have finite means (inherited from the $(\epsilon_k^{(n)})_{k \le n}$). So
$$M^{(n)}(t) = \frac{1}{\sqrt{n}}\sum_{j=1}^{nt_k}\left(Y_j^{(n)} - \mathbb{E}(Y_j^{(n)})\right)$$
is a martingale, and thus a decomposition of $X^{(n)}$ is
$$X^{(n)}(t) = M^{(n)}(t) + A^{(n)}(t), \qquad A^{(n)}(t) = \frac{1}{\sqrt{n}}\sum_{j=1}^{nt_k}\mathbb{E}(Y_j^{(n)}).$$
The jumps in $M^{(n)}$ are uniformly bounded, and because $A^{(n)}$ is deterministic with $A^{(n)}(t) \to bt$ $(n \to \infty)$, $\mathbb{E}(|A^{(n)}(T)|)$ is bounded. Thus condition (G) holds and we are done. $\Box$
The last ingredient in Theorem 6.4.2 is a sequence of weakly converging discrete-time trading strategies. We give two examples:
$\Delta$-hedging. Assume that we are given a contingent claim $Y = g(S(T))$ with a measurable and bounded function $g : \mathbb{R}^d \to \mathbb{R}_+$. The completeness of the financial market models $\mathcal{M}^{(n)}(S^{(n)}, L^{(n)})$ guarantees the existence of replicating strategies $\varphi^{(n)}$ for $Y^{(n)} = g(S^{(n)}(T))$ (compare §4.5). Similarly, completeness of $\mathcal{M}(S, L)$ leads to a replicating strategy $\varphi$ in the corresponding continuous-time model for $Y = g(S(T))$. For $g$ and $\sigma$ sufficiently smooth, He showed in He (1990) the weak convergence of $\varphi^{(n)}$ to $\varphi$.
Discrete Black-Scholes Hedging. (Compare Duffie and Protter (1992).) Consider again the special case of the standard Black-Scholes model. Suppose we are interested in replicating a standard European call option with strike $K$ and expiry $T$ on the stock. Recall the Black-Scholes pricing formula (6.14):
$$C(t, S(t)) = S(t)N(d_1(S(t), T - t)) - Ke^{-r(T-t)}N(d_2(S(t), T - t)),$$
with $d_1(s, t)$ and $d_2(s, t)$ given by
$$d_1(s, t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}, \qquad d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}.$$
Write $C_s$ for $\partial C(t,s)/\partial s$. With bank account $B(t) = e^{rt}$, the continuous-time replicating strategy is given by (compare Proposition 6.2.2)
$$\varphi_1(t) = C_s(t, S(t)), \qquad \varphi_0(t) = e^{-rt}\left(C(t, S(t)) - C_s(t, S(t))\,S(t)\right).$$
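As a numerical companion to the replicating strategy, the following is a minimal sketch, assuming a constant interest rate, a bank account $B(t) = e^{rt}$, and the standard identity $C_s = N(d_1)$ for the call delta. The function names are illustrative and not taken from the text.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(s: float, tau: float, K: float, r: float, sigma: float) -> float:
    """d1(s, tau) of the Black-Scholes formula, for time to expiry tau > 0."""
    return (math.log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def d2(s: float, tau: float, K: float, r: float, sigma: float) -> float:
    """d2(s, tau) = d1(s, tau) - sigma * sqrt(tau)."""
    return d1(s, tau, K, r, sigma) - sigma * math.sqrt(tau)


def call_price(s: float, tau: float, K: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call with time to expiry tau."""
    return (s * norm_cdf(d1(s, tau, K, r, sigma))
            - K * math.exp(-r * tau) * norm_cdf(d2(s, tau, K, r, sigma)))


def replicating_strategy(t: float, s: float, T: float, K: float,
                         r: float, sigma: float) -> tuple:
    """Return (phi0, phi1): units of the bank account B(t) = exp(r t)
    (an assumption of this sketch) and units of stock held at time t."""
    tau = T - t
    phi1 = norm_cdf(d1(s, tau, K, r, sigma))   # C_s(t, s) = N(d1)
    c = call_price(s, tau, K, r, sigma)
    phi0 = math.exp(-r * t) * (c - phi1 * s)   # remaining value sits in the bank account
    return phi0, phi1
```

By construction the portfolio value $\varphi_0(t)B(t) + \varphi_1(t)S(t)$ equals $C(t, S(t))$; at $t = 0$ the bank position reduces to $-Ke^{-rT}N(d_2)$, i.e. the hedge is financed by borrowing.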
Consider the induced discrete-time strategies, with the stock holding $\varphi_1^{(n)}$ obtained from $C_s$ evaluated along the discrete-time price process $S^{(n)}$, and $\varphi_0^{(n)}$ given so that $(\varphi_0^{(n)}, \varphi_1^{(n)})$ is self-financing with initial endowment $C(0, S(0))$. Using Lemma 5.11.1 we see that $\varphi^{(n)} \to \varphi$ weakly. We can now use Proposition 6.4.2 and these results to obtain from Theorem 6.4.2:

Proposition 6.4.3. The gain and value processes of the discrete-time trading strategies in the above examples converge weakly to their continuous-time counterparts.
Together with Proposition 6.4.1, the above results are very reassuring. Not only do the contingent claim prices converge, but we also have weak convergence of the (delta-) hedging strategies, which is a first step in justification of the use of approximating discrete-time models for the risk-management of contingent claims. The second example is particularly important for practical needs, as in applications, hedging strategies are often designed using a continuous-time model, but naturally only performed in discrete time steps.
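The practical point made above (a hedge designed in continuous time but rebalanced only at discrete dates) can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: Black-Scholes dynamics with real-world drift `b`, equally spaced rebalancing dates, and hypothetical parameter values chosen here. It illustrates the convergence numerically and does not reproduce the weak-convergence argument.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_d1(s, tau, K, r, sigma):
    return (math.log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def bs_call(s, tau, K, r, sigma):
    d1 = bs_d1(s, tau, K, r, sigma)
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def hedging_error(n_steps, rng, S0=100.0, K=100.0, T=1.0, r=0.05, b=0.10, sigma=0.2):
    """Terminal value of a self-financing delta hedge, rebalanced n_steps times,
    minus the call payoff, along one simulated path (illustrative parameters)."""
    dt = T / n_steps
    s = S0
    delta = norm_cdf(bs_d1(s, T, K, r, sigma))
    cash = bs_call(s, T, K, r, sigma) - delta * s      # initial endowment C(0, S(0))
    for i in range(1, n_steps + 1):
        z = rng.gauss(0.0, 1.0)
        s *= math.exp((b - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        cash *= math.exp(r * dt)                       # interest on the bank account
        if i < n_steps:                                # rebalance (self-financing)
            new_delta = norm_cdf(bs_d1(s, T - i * dt, K, r, sigma))
            cash -= (new_delta - delta) * s
            delta = new_delta
    return cash + delta * s - max(s - K, 0.0)


def rms_hedging_error(n_steps, n_paths=400, seed=0):
    """Root-mean-square hedging error over n_paths simulated paths."""
    rng = random.Random(seed)
    errors = [hedging_error(n_steps, rng) for _ in range(n_paths)]
    return math.sqrt(sum(e * e for e in errors) / n_paths)
```

Comparing, say, `rms_hedging_error(5)` with `rms_hedging_error(100)` shows the error shrinking as the rebalancing grid is refined.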
6. Mathematical Finance in Continuous Time
Remark 6.4.2.
(i) Duffie and Protter (1992) give several other interesting examples where the above theory can be applied.
(ii) Weak convergence techniques can also be used to link results on discrete-time optimal consumption and investment problems to those in continuous time (recall the functionals which can be used in a weak convergence setting, see §5.9.2). Pioneering work in this area has been done in He (1991). For an introductory treatment of the theory of optimal consumption and investment we refer the reader to Karatzas and Shreve (1991), §5.8. An excellent recent treatment of this theory was given by Korn (1997a).
(iii) Further applications of the weak convergence approach can be found in Nelson and Ramaswamy (1990), where emphasis is put on the numerical side. As examples the reader can find the constant elasticity of variance stock price process (see §4.6.3) and the Cox-Ingersoll-Ross diffusion model for the short rate (see §8.2).
(iv) As mentioned above, a comparison of the weak-convergence approach with pathwise approximating methods is given in Willinger and Taqqu (1991). The pathwise approach using methods from nonstandard analysis is given an extended treatment in a series of papers by Cutland, Kopp and Willinger (Cutland, Kopp, and Willinger (1991), Cutland, Kopp, and Willinger (1993a), Cutland, Kopp, and Willinger (1993b)).

6.4.4 Contiguity
The results above give settings where, in a sequence of approximating markets where asset prices converge, option prices converge too. We now consider this general question afresh using the concept of contiguity; we follow Hubalek and Schachermayer (1998). We have in mind a sequence of approximating market models, the $n$th of which has (real-world or physical) probability measure $\mathbb{P}_n$. Let $\mathbb{Q}_n$ be the corresponding (unique, if we assume completeness) risk-neutral measure. We say, following Roussas (1972), that $(\mathbb{Q}_n)$ is contiguous to $(\mathbb{P}_n)$ if, for all sequences of events $(A_n)$,
$$\mathbb{P}_n(A_n) \to 0 \quad \text{implies} \quad \mathbb{Q}_n(A_n) \to 0.$$
At the other extreme, $(\mathbb{P}_n)$ and $(\mathbb{Q}_n)$ are entirely separated if, along some subsequence $(n_k)$, there are sets $(A_{n_k})$ with
$$\mathbb{P}_{n_k}(A_{n_k}) \to 0, \qquad \mathbb{Q}_{n_k}(A_{n_k}) \to 1.$$
With $(S_n)$ the asset-price process in the $n$th model, the result of Hubalek and Schachermayer (1998) is as follows. If the asset-price models $(S_n \mid \mathbb{P}_n)$ converge weakly to a complete asset-price model $(S \mid \mathbb{P})$, and if $\mathbb{Q}_n$, $\mathbb{Q}$ are the equivalent martingale measures, then if
(i) the terminal values $S_n(T)$ are uniformly $\mathbb{Q}_n$-convergent;
(ii) $(\mathbb{Q}_n)$ is contiguous with respect to $(\mathbb{P}_n)$
- then $(S_n \mid \mathbb{Q}_n)$ converge weakly to $(S \mid \mathbb{Q})$ (and so option prices converge). The result may fail if either of conditions (i), (ii) is omitted. Examples of sequences where asset prices converge but option prices do not may readily be constructed by using a sequence of binomial models $(S_n, \mathbb{P}_n)$, where for each $n$ the up and down probabilities $u_n$, $d_n$ may depend on $n$ ('a non-homogeneous binomial model'). By contrast, with homogeneous binomial models the situation cannot occur.

One may loosely summarize failure of convergence of asset prices to give convergence of option prices to the corresponding limit as saying the options are 'asymptotically wrongly priced'. Since option prices are usually fixed by arbitrage arguments, this suggests that in such situations there may be arbitrage opportunities in the limit. This is indeed the case: Kabanov and Kramkov (1998) develop several concepts of asymptotic arbitrage, which are relevant to results of this kind.

We must refer to the original sources cited above for further details. Relevant background on weak-convergence theory is in Jacod and Shiryaev (1987), esp. V 2.32, VI 3.18. See also Rachev and Rüschendorf (1994). Kabanov and Kramkov (1994) and Kabanov and Kramkov (1998) develop a theory of large financial markets, which - roughly - constitute a sequence of ordinary ('small') markets. The idea of asymptotic arbitrage mentioned above may be used to develop versions of the Fundamental Theorem of Asset Pricing in such a setting. For results of this type using contiguity (actually, 'bicontiguity') as above, see Klein (2000).
6.5 Further Applications of the Risk-neutral Valuation Principle
6.5.1 Futures Markets
We have been dealing so far with derivatives based on underlying assets - stock - existing, and available for trading, now. It frequently happens, however, that the underlying assets relevant in a particular market will instead be available at some time in the future, and need not even exist now. Obvious examples include crop commodities - wheat, sugar, coffee etc. - which might not yet be planted, or be still growing, and so whose eventual price remains uncertain - for instance, because of the uncertainty of future weather. The principal factors determining yield of crops such as cereals, for instance, are rainfall and hours of sunshine during the growing season. Oil is another example of a commodity widely traded in the future, and here the uncertainty is more a result of political factors, shipping costs etc. Financial assets, such as currencies, bonds and stock indexes, may also be traded in the future, on exchanges such as the London International Financial Futures and Options
Exchange (LIFFE) and the Tokyo International Financial Futures Exchange (TIFFE), and we shall restrict attention to financial futures for simplicity. We thus have the existence of two parallel markets in some asset - the spot market, for assets traded in the present, and the futures market, for assets to be realized in the future. We may also consider the combined spot-futures market.

Futures prices, like spot prices, are determined on the floor of the exchange by supply and demand, and are quoted in the financial press. Futures contracts, however - contracts on assets traded in the futures markets - have various special characteristics. Parties to futures contracts are subject to a daily settlement procedure known as marking to market. The initial deposit, paid when the contract is entered into, is adjusted daily by margin payments reflecting the daily movement in futures prices. The underlying asset and price are specified in the contract, as is the delivery date. Futures contracts are highly liquid - and indeed, are intended more for trading than for delivery. Being assets, futures contracts may be the subject of futures options.

A transaction may instead, however, involve an agreement between two parties for the sale/purchase of a specified asset at some specified future time for some specified price. Such an agreement is a forward contract; unlike futures, forward contracts are not traded. They are intended for delivery, and are not liquid. For further background, and detailed discussion of futures and their forward counterparts, we refer to e.g. Duffie (1989), Kolb (1991) and Musiela and Rutkowski (1997), Chapters 1, 3, 6.

We shall as before write $t = 0$ for the time when a contract, or option, is written, $t$ for the present time, $T$ for the expiry time of the option, and $T^*$ for the delivery time specified in the futures (or forward) contract. We will have $T^* \ge T$, and in general $T^* > T$; beyond this, $T^*$ will not affect the pricing of options with expiry $T$.

Despite their fundamental differences, futures prices $f_S(t,T)$ - on a stock $S$ at time $t$ with expiry $T$ - and the corresponding forward prices $F_S(t,T)$ are closely linked. We use the notation $B(t,T)$ of Chapter 8 for the bond price process. We recall the definition of a predictable (or previsible) process from Chapters 3, 5. We quote (Musiela and Rutkowski (1997), Prop. 3.3.2) the following result, whose proof involves simple arbitrage arguments:
Proposition 6.5.1. If the bond price process $B(t,T)$ is predictable, the combined spot-futures market is arbitrage-free if and only if the futures and forward prices agree: for every underlying $S$ and every $t \le T$,
$$f_S(t,T) = F_S(t,T).$$
In the important special case of the futures analogue of the Black-Scholes model, to which we turn below, the bond price process - or interest-rates process - is deterministic, so predictable. However, it is as well to work in a more general framework when possible, to be able to incorporate stochastic
interest rates, as in Chapter 8.

Black's Futures Options Formula. We turn now to the problem of extending our option pricing theory from spot markets to futures markets. We assume that the stock-price dynamics $S$ are given by geometric Brownian motion
$$dS(t) = bS(t)\,dt + \sigma S(t)\,dW(t),$$
and that interest rates are deterministic. We know that there exists a unique equivalent martingale measure, $\mathbb{P}^*$ (for the discounted stock price processes), with expectation $\mathbb{E}^*$. Write
$$f(t) := f_S(t, T^*)$$
for the futures price $f(t)$ corresponding to the spot price $S(t)$. Then risk-neutral valuation gives
$$f(t) = \mathbb{E}^*(S(T^*) \mid \mathcal{F}_t) \quad (t \in [0, T^*]),$$
while forward prices are given in terms of bond prices by
$$F(t) = S(t)/B(t, T^*) \quad (t \in [0, T^*]).$$
So by Proposition 6.5.1
$$f(t) = F(t) = S(t)e^{r(T^*-t)} \quad (t \in [0, T^*]).$$
So we can use the product rule (5.13) to determine the dynamics of the futures price:
$$df(t) = (b - r)f(t)\,dt + \sigma f(t)\,dW(t), \qquad f(0) = S(0)e^{rT^*}.$$
We know that the unique equivalent martingale measure in this setting is given by means of a Girsanov density
$$L(t) = \exp\left\{-\frac{b-r}{\sigma}W(t) - \frac{1}{2}\left(\frac{b-r}{\sigma}\right)^2 t\right\},$$
so the $\mathbb{P}^*$-dynamics of the futures price are
$$df(t) = \sigma f(t)\,d\tilde{W}(t)$$
with $\tilde{W}$ a $\mathbb{P}^*$-Brownian motion, so $f$ is a $\mathbb{P}^*$-martingale. Observe that our derivation depends critically on the fact that interest rates are deterministic; for a more extended treatment, we refer to Musiela and Rutkowski (1997), §6.1.

We turn briefly now to the futures analogue of the Black-Scholes formula, due to Black (1976). We use the same notation - strike $K$, expiry $T$ - as in the spot case, and write $N$ for the standard normal distribution function.
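The two steps used above (the product rule for $f(t) = S(t)e^{r(T^*-t)}$ and the Girsanov change of drift) can be spelled out as follows. This is a sketch assuming a constant interest rate $r$; the symbol $\tilde{W}$ for the $\mathbb{P}^*$-Brownian motion is notation chosen here.

```latex
\begin{align*}
df(t) &= e^{r(T^*-t)}\,dS(t) - r\,e^{r(T^*-t)}S(t)\,dt \\
      &= e^{r(T^*-t)}S(t)\bigl(b\,dt + \sigma\,dW(t)\bigr) - r f(t)\,dt \\
      &= (b-r)f(t)\,dt + \sigma f(t)\,dW(t), \\[4pt]
d\tilde{W}(t) &= dW(t) + \frac{b-r}{\sigma}\,dt
      \quad\text{(Brownian motion under } \mathbb{P}^*\text{)}, \\[4pt]
df(t) &= (b-r)f(t)\,dt + \sigma f(t)\Bigl(d\tilde{W}(t) - \frac{b-r}{\sigma}\,dt\Bigr)
       = \sigma f(t)\,d\tilde{W}(t).
\end{align*}
```

No second-order term appears in the first line because $e^{r(T^*-t)}$ is deterministic and of finite variation.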
Theorem 6.5.1. The arbitrage price $C$ of a European futures call option is
$$C(t) = c(f(t), T - t),$$
where $c(f,t)$ is given by Black's futures options formula:
$$c(f,t) := e^{-rt}\left(f N(d_1(f,t)) - K N(d_2(f,t))\right),$$
where
$$d_{1,2}(f,t) := \frac{\log(f/K) \pm \frac{1}{2}\sigma^2 t}{\sigma\sqrt{t}}.$$
Proof. By risk-neutral valuation,
$$C(t) = B(t)\,\mathbb{E}^*\left[(f(T) - K)^+ / B(T) \mid \mathcal{F}_t\right],$$
with $B(t) = e^{rt}$. For simplicity, we work with $t = 0$; the extension to the general case is immediate. Thus
$$C(0) = \mathbb{E}^*\left[(f(T) - K)^+ / B(T)\right] = \mathbb{E}^*\left[e^{-rT} f(T) 1_D\right] - \mathbb{E}^*\left[e^{-rT} K 1_D\right] = I_1 - I_2,$$
say, where
$$D := \{f(T) > K\}.$$
Thus
$$I_2 = e^{-rT} K\, \mathbb{P}^*(f(T) > K) = e^{-rT} K\, \mathbb{P}^*\left(f(0)\exp\left\{\sigma \tilde{W}(T) - \tfrac{1}{2}\sigma^2 T\right\} > K\right),$$
where $\tilde{W}$ is a standard Brownian motion under $\mathbb{P}^*$. Now $\xi := -\tilde{W}(T)/\sqrt{T}$ is standard normal, with law $N$ under $\mathbb{P}^*$, so
$$I_2 = e^{-rT} K\, \mathbb{P}^*\left(\xi < \frac{\log(f(0)/K) - \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}\right) = e^{-rT} K N\left(\frac{\log(f(0)/K) - \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}\right) = e^{-rT} K N(d_2(f(0), T)).$$
A similar calculation, also proceeding as in the spot-market case, gives
$$I_1 = e^{-rT} f(0) N(d_1(f(0), T)). \qquad \Box$$
Observe that the quantities $d_1$ and $d_2$ do not depend on the interest rate $r$. This is intuitively clear from the classical Black approach: one sets up a replicating risk-free portfolio consisting of a position in futures options and an offsetting position in the underlying futures contract. The portfolio requires no initial investment and therefore should not earn any interest.
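Black's formula is easy to check numerically. The sketch below (function names are illustrative) implements $c(f,t)$ and compares it with the spot Black-Scholes price in the special case where the futures delivery date coincides with the option expiry, so that $f(0) = S(0)e^{rT}$; that special case is an assumption of the example, not a restriction of the theorem.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_d12(f: float, t: float, K: float, sigma: float) -> tuple:
    """d1 and d2 of Black's formula; note that r does not enter."""
    v = sigma * math.sqrt(t)
    d1 = (math.log(f / K) + 0.5 * sigma ** 2 * t) / v
    return d1, d1 - v


def black_call(f: float, t: float, K: float, r: float, sigma: float) -> float:
    """c(f, t): price of a European futures call with time to expiry t."""
    d1, d2 = black_d12(f, t, K, sigma)
    return math.exp(-r * t) * (f * norm_cdf(d1) - K * norm_cdf(d2))


def bs_call(s: float, t: float, K: float, r: float, sigma: float) -> float:
    """Spot Black-Scholes call price, used here only for comparison."""
    d1 = (math.log(s / K) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)
```

The interest rate enters `black_call` only through the discount factor $e^{-rt}$, which is the computational counterpart of the remark that $d_1$ and $d_2$ do not depend on $r$.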
6.5.2 Currency Markets
Frequently, we will be interested in assets - risk-free bonds and risky stocks - in several countries simultaneously. For simplicity, we restrict attention to just two countries, the domestic country with interest rate $r_d$ and the foreign country with interest rate $r_f$. Again in the simplest case, both are positive constants; we can thus write as usual
$$B_d(t) = e^{r_d t}, \qquad B_f(t) = e^{r_f t}$$
for the domestic and foreign 'bank accounts'. The link between the domestic and foreign economies is the exchange-rate process, $Q$ say, by which one passes from denomination in foreign to domestic currency. The fluctuations in exchange rates depend on a multiplicity of factors: the interest rates $r_d$, $r_f$, the performance of the two economies, the policies of the two governments and central banks, the corresponding things in countries significantly linked to them, the markets' perceptions of all of these, etc. Thus a multi-dimensional noise process is appropriate for modelling purposes; the simplest plausible model is geometric Brownian motion with $d$-dimensional noise process:
$$dQ(t) = Q(t)\left(\mu_Q\,dt + \sigma_Q \cdot dW(t)\right), \qquad Q(0) > 0,$$
where $\mu_Q$ is a constant drift (which may have either sign), $\sigma_Q$ is a positive vector of volatilities, $W(t)$ is a standard $d$-dimensional Brownian motion. Solving, one obtains
$$Q(t) = Q(0)\exp\left\{\left(\mu_Q - \tfrac{1}{2}\|\sigma_Q\|^2\right)t + \sigma_Q \cdot W(t)\right\},$$
where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$. The value of our foreign savings account $B_f(t)$ is $B_f(t)Q(t)$ in domestic currency, and $B_f(t)Q(t)/B_d(t)$ when discounted by the domestic interest rate. We write this process as $Q^*$: its dynamics are given by
$$dQ^*(t) = Q^*(t)\left((\mu_Q + r_f - r_d)\,dt + \sigma_Q \cdot dW(t)\right),$$
with solution
$$Q^*(t) = Q(0)\exp\left\{\left(\mu_Q + r_f - r_d - \tfrac{1}{2}\|\sigma_Q\|^2\right)t + \sigma_Q \cdot W(t)\right\}.$$
To avoid arbitrage - between the domestic and foreign bond markets (the only markets presently in play) - we need to pass to an equivalent martingale measure eliminating the drift term in the dynamics above. We have
a $d$-dimensional noise process, and cannot expect uniqueness of equivalent martingale measures unless there are $d$ independent traded assets available. When this is the case, there exists a unique equivalent martingale measure $\mathbb{P}^*$, called the domestic martingale measure, giving the dynamics as
$$dQ(t) = Q(t)\left((r_d - r_f)\,dt + \sigma_Q \cdot d\tilde{W}(t)\right), \qquad Q(0) > 0,$$
with $\tilde{W}$ a $\mathbb{P}^*$-Brownian motion. This, as its name implies, is the risk-neutral probability measure for an investor reckoning everything in terms of the domestic currency. Risk-neutral valuation gives the price process of a contingent claim $X$ as
$$\pi_X(t) = e^{-r_d(T-t)}\,\mathbb{E}^*(X \mid \mathcal{F}_t).$$
Consider now an agent involved in international trade, wishing to limit his exposure to adverse movements in the exchange-rate process $Q$. He will seek to purchase an option protecting him against this, in the same way that an agent dealing in a stock $S$ will purchase an option in the stock. Of course, an exchange rate is not a tangible asset in the sense that a stock is; nevertheless, it is possible, and helpful, to treat options on currency in a way closely analogous to how we treat options on stock.

For this purpose, consider the forward price at time $t$ of one unit of the foreign currency, to be delivered at the settlement date $T$, in terms of the domestic currency. It is natural to call this the forward exchange rate, $F_Q(t,T)$ say. In terms of the bond-price processes, one has
$$F_Q(t,T) = \frac{B_f(t,T)}{B_d(t,T)}\,Q(t) = e^{(r_d - r_f)(T-t)}\,Q(t),$$
a relationship known as interest-rate parity. For, absence of arbitrage requires that the forward exchange premium must be the difference $r_d - r_f$ between the two interest rates.

Currency options may now be constructed, and priced, analogously to options on stock. For example, a standard currency European call option may be constructed, with payoff
$$C_Q(T) := (Q(T) - K)^+,$$
where $Q(T)$ is the spot exchange rate at the option's delivery date and $K$ is the strike price. Finding the arbitrage price of this option is formally the same as finding the price of a futures option, with the forward price of the stock replaced by the forward exchange rate. We already have the solution to this problem, in Black's futures options formula. Adapted to the present context, this yields the following currency options formula, due to Garman and Kohlhagen (1983) and to Biger and Hull (1983) independently.

Proposition 6.5.2. The arbitrage price of the currency European call option above is
$$C_Q(t) = e^{-r_d(T-t)}\left[F(t)N(d_1(F(t), T-t)) - K N(d_2(F(t), T-t))\right],$$
where $F(t) = F_Q(t,T)$ is the forward exchange rate,
$$d_{1,2}(F,t) := \frac{\log(F/K) \pm \frac{1}{2}\sigma_Q^2 t}{\sigma_Q\sqrt{t}}.$$
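A short numerical sketch of the currency option formula follows. It assumes constant rates $r_d$, $r_f$ and a scalar volatility `sigma_q`, and it computes the forward exchange rate from interest-rate parity in the form $F_Q(t,T) = Q(t)e^{(r_d - r_f)(T-t)}$. Function and parameter names are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def forward_rate(q: float, tau: float, r_d: float, r_f: float) -> float:
    """Forward exchange rate for delivery in tau years (interest-rate parity)."""
    return q * math.exp((r_d - r_f) * tau)


def currency_call(q: float, tau: float, K: float, r_d: float, r_f: float,
                  sigma_q: float) -> float:
    """Price of a European call on the exchange rate, written as Black's
    formula applied to the forward exchange rate and discounted at r_d."""
    F = forward_rate(q, tau, r_d, r_f)
    v = sigma_q * math.sqrt(tau)
    d1 = (math.log(F / K) + 0.5 * sigma_q ** 2 * tau) / v
    d2 = d1 - v
    return math.exp(-r_d * tau) * (F * norm_cdf(d1) - K * norm_cdf(d2))
```

With $r_f = 0$ the formula collapses to the Black-Scholes call price on a non-dividend-paying stock, and a higher foreign rate lowers the forward exchange rate and hence the call price.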
Exercises
6.1 In the setting of the classical deduction of the Black-Scholes differential equation (compare §6.2.2), show that the trading strategy 'short $f_s$ stocks, long a call' is not self-financing. Show furthermore that the additional cost associated with this trading strategy up to time $T$ can be represented by a random variable with zero mean under the risk-neutral martingale measure in the standard Black-Scholes model.

6.2
1. Use the put-call parity (compare §1.3) to compute the pricing formula for the European put option in the classical Black-Scholes setting:
$$P(0) = P(S, T, K, r, \sigma) = K e^{-rT} N(-d_2(S,T)) - S N(-d_1(S,T)).$$
2. Compute the 'Greeks' for the European put option (with the help of a computer programme such as Mathematica, if available).

6.3 Use the parameters of the example in §4.8.4 to compute Black-Scholes prices of the European put and call options. Construct discrete (using the tree) and continuous (using the Black-Scholes $\Delta$) hedge-portfolios. Compare your portfolios at intermediate time-points.

6.4 Recall (§6.3.5) that a binary (or digital) option is a contract whose payoff depends in a discontinuous way on the terminal price of the underlying asset. The most popular variants are:
Cash-or-nothing options. Here the payoffs at expiry of the European call resp. put are given by
$$BCC(T) = C\,1_{\{S(T) > K\}} \quad \text{resp.} \quad BCP(T) = C\,1_{\{S(T) < K\}}.$$
Asset-or-nothing options. Here the corresponding payoffs are
$$BAC(T) = S(T)\,1_{\{S(T) > K\}} \quad \text{resp.} \quad BAP(T) = S(T)\,1_{\{S(T) < K\}}.$$
1. Show that the arbitrage price of the cash-or-nothing call is
$$BCC(t) = e^{-r(T-t)}\,C\,N(d_2(S(t), T-t)),$$
with parameters as in the Black-Scholes formula.
2. Compute the delta, vega and theta for these options and compare them with the European call and put. (Use e.g. Mathematica for these calculations.)
6.5 Let $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ be a standard filtered probability space (in particular $\mathbb{F}$ is the internal filtration). Consider the following financial market model on this space.
bank account: $dB(t) = rB(t)\,dt$
stock 1: $dS_1(t) = S_1(t)\left(b_1\,dt + \sigma_{11}\,dW_1(t) + \sigma_{12}\,dW_2(t)\right)$
stock 2: $dS_2(t) = S_2(t)\left(b_2\,dt + \sigma_{21}\,dW_1(t) + \sigma_{22}\,dW_2(t)\right)$
with all coefficients constant, and the volatility matrix
$$\sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix}$$
nonsingular.
1. Using $B(t)$ as numeraire find an equivalent martingale measure $\mathbb{P}^*$ such that $S_i/B$, $i = 1, 2$, are martingales and state the dynamics of $S_1$ and $S_2$.
2. Now use $S_2$ as numeraire. Define another change of measure from $\mathbb{P}^*$ to a further martingale measure $\mathbb{Q}$ such that $S_1/S_2$ is a martingale and state the $\mathbb{Q}$-dynamics of $S_1/S_2$.
3. Use the risk-neutral valuation principle to find the time $t = 0$ price of the exchange option $C_{ex}$ with payoff at time $T$
$$C_{ex}(T) = (S_1(T) - S_2(T))^+.$$
Hint: Recall the Black-Scholes formula (6.14) for pricing a European call with strike $K$:
$$C_{BS} = S N(d_1) - e^{-rT} K N(d_2) \quad \text{with} \quad d_{1,2} = \frac{\log(S/K) + (r \pm \frac{1}{2}\sigma^2)T}{\sigma\sqrt{T}}.$$
7. Incomplete Markets

We now return to general continuous-time financial market models in the setting of §6.1, i.e. there are $d + 1$ primary traded assets whose price processes are given by stochastic processes $S_0, \ldots, S_d$, which are assumed to be adapted, right-continuous with left-limits (RCLL) and strictly positive semi-martingales on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ (as usual $\mathbb{F} = (\mathcal{F}_t)_{t \le T}$). We assume that the market is free of arbitrage, in the sense that there exist equivalent martingale measures, but it contains non-attainable contingent claims, i.e. there are cash flows that cannot be replicated by self-financing trading strategies. In view of Theorem 6.1.5 this means that we do not have a unique equivalent martingale measure. We try to answer the obvious questions in this setting: how should we price the non-attainable contingent claims, i.e. which of the possible equivalent martingale measures should we pick for our valuation formula based on expectation, and how can we construct hedging strategies for the non-attainable contingent claims to minimize the risk?

We try to answer these two questions in the general setting and then consider a prominent example of an incomplete market, a market with stochastic volatility, in more detail. Further approaches to the problems of pricing and hedging contingent claims in incomplete markets (which we don't cover), e.g. construction and use of super-replicating strategies, embedding in complete markets, use of the numeraire portfolio etc., are treated in e.g. Duffie and Skiadas (1994), El Karoui and Quenez (1995), El Karoui and Quenez (1997), Gourieroux, Laurent, and Pham (1998), Karatzas (1996). The general theory of incomplete markets is developed in Magill and Quinzii (1996).

7.1 Pricing in Incomplete Markets
7.1.1 A General Option-Pricing Formula
We describe an approach proposed by Davis (compare Davis (1994) and Davis (1997)). If the market is incomplete, we have several choices of equivalent martingale measures to price contingent claims. Recall that we didn't impose conditions on the preferences of the economic agents other than that they prefer more to less. So it is quite natural to specify the preferences further in
order to select one of the equivalent martingale measures. The specification of the investor's attitude towards risk is done in terms of a utility function.

Definition 7.1.1. A continuous function $U: (0, \infty) \to \mathbb{R}$ that is strictly increasing, strictly concave and continuously differentiable with $\lim_{x \to \infty} U'(x) = 0$, $\lim_{x \to 0} U'(x) = \infty$ is called a utility function.
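As a small numerical illustration of Definition 7.1.1, the sketch below uses logarithmic utility and the power utility $(x^\alpha - 1)/\alpha$ with $\alpha < 1$, $\alpha \ne 0$ (both are standard examples). The function names are illustrative; the code only spot-checks monotonicity, concavity and the behaviour of $U'$ at a few points, which is of course not a proof.

```python
import math


def log_utility(x: float) -> float:
    """U(x) = log x on (0, infinity)."""
    return math.log(x)


def power_utility(x: float, alpha: float) -> float:
    """U(x) = (x**alpha - 1) / alpha for alpha < 1, alpha != 0."""
    return (x ** alpha - 1.0) / alpha


def marginal_power_utility(x: float, alpha: float) -> float:
    """U'(x) = x**(alpha - 1): large near 0 and small for large x."""
    return x ** (alpha - 1.0)


def midpoint_concave(u, a: float, b: float) -> bool:
    """Spot check of strict concavity: U((a+b)/2) > (U(a)+U(b))/2."""
    return u(0.5 * (a + b)) > 0.5 * (u(a) + u(b))
```

For small $|\alpha|$ the power utility is numerically close to $\log x$, which is the sense in which logarithmic utility corresponds to the case $\alpha = 0$.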
Utility functions are commonly used in economics to quantify decision problems (see Ingersoll (1986) for motivation of special structure, further discussion etc.). The most commonly used examples of utility functions are $U(x) = \log x$ and $U(x) = x^\alpha/\alpha$ for $\alpha \in (-\infty, 1) \setminus \{0\}$, or $U(x) = (x^\alpha - 1)/\alpha$ ($\alpha < 1$), which includes $U(x) = \log x$ as the limiting case $\alpha = 0$. Some authors restrict to $U$ non-negative and bounded - e.g., $U(x) = 1 - e^{-cx}$ ($c > 0$).

An investor with such a utility function $U$ and initial endowment $x$ trading only in the underlying assets $S_0, \ldots, S_d$ forms a dynamic portfolio $\varphi$, whose value at time $t$ is $V_{\varphi,x}(t)$ (we need to keep track of the initial endowment). His objective is to maximise expected utility under the original probability measure of his final wealth at time $T$, given that he is allowed to choose his trading strategy $\varphi$ from a suitable subset $\Phi_a$ of the set of self-financing trading strategies. We write
$$\mathcal{U}(x) = \sup_{\varphi \in \Phi_a} \mathbb{E}\left[U(V_{\varphi,x}(T))\right]$$
for the maximal utility. Now suppose that a contingent claim $X$ (a sufficiently integrable random variable) is made available for trading with current purchase price $p$. We ask the question whether the maximal utility above can be increased by doing so. To find a 'fair' price $p$ for a contingent claim we follow a 'marginal rate of substitution argument' quite commonly used in pricing: $p$ is a fair price for the contingent claim if diverting a little of his funds into it at time zero has a neutral effect on the investor's achievable utility. More precisely, if we set
$$W(\delta, x, p) = \sup_{\varphi \in \Phi_a} \mathbb{E}\left[U\left(V_{\varphi,x-\delta}(T) + \frac{\delta}{p}X\right)\right],$$
then we can state:

Definition 7.1.2. Suppose that for each fixed $(x, p)$, $W(\delta, x, p)$ is differentiable as a function of $\delta$ at $\delta = 0$, and that there is a unique solution $\hat{p}(x)$ of the equation
$$\frac{\partial W}{\partial \delta}(0, x, p) = 0.$$
Then $\hat{p}(x)$ is the fair option price at time $t = 0$.
Before we can go on, we have to check whether this pricing approach is consistent with the arbitrage price for attainable contingent claims (otherwise the new approach would introduce arbitrage opportunities in the market and
we would have to reject it). To this end let $X$ be an attainable contingent claim with arbitrage price $p_0$ (found as the unique initial endowment of any self-financing replicating trading strategy). Suppose that $X$ is offered for $p$ and that the investor buys $\delta/p$ contingent claims with the amount of cash $\delta$, and so has a remaining endowment of $x - \delta$, which can be used to set up a dynamic trading strategy $\varphi \in \Phi_a$. Since $U$ is increasing it is optimal for the investor to sell the replicating portfolio short (so no exposure results from the contingent claim at time $T$) and invest $x + \delta\left(\frac{p_0}{p} - 1\right)$ optimally. This results in attaining an expected utility of $\mathcal{U}\left(x + \delta\left(\frac{p_0}{p} - 1\right)\right)$. The marginal rate of substitution is therefore
$$\frac{\partial}{\partial \delta}\,\mathcal{U}\left[x + \delta\left(\frac{p_0}{p} - 1\right)\right]\Big|_{\delta=0} = \left(\frac{p_0}{p} - 1\right)\mathcal{U}'(x),$$
which is only zero for $p = p_0$.

In general it is very hard to check that the function $\mathcal{U}$ is differentiable (to justify our rather informal deduction above). Fortunately only a 'weak' form of differentiability, quite similar to a 'viscosity solution'-like approach to optimal control problems, suffices (see Fleming and Soner (1993) and Karatzas (1996) for details on such weak approaches). For the purpose of our current outline we will take differentiability of $\mathcal{U}$ for granted, and assume additionally the existence of a trading strategy $\varphi^* \in \Phi_a$ such that $\mathcal{U}(x) = \mathbb{E}\left[U(V_{\varphi^*,x}(T))\right]$. Under these assumptions we obtain a general pricing formula based on Definition 7.1.2.

Theorem 7.1.1 (Davis). Suppose that $\mathcal{U}$ is differentiable at each $x \in \mathbb{R}_+$ and that $\mathcal{U}'(x) > 0$. Then the fair price $\hat{p}(x)$ of Definition 7.1.2 is given by
$$\hat{p}(x) = \frac{\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right]}{\mathcal{U}'(x)}. \qquad (7.1)$$
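Proof. Assume that $\varphi^* \in \Phi_a$ is such that
$$W(\delta, x, p) = \mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]$$
for $\delta = 0$; then one can show (see Davis (1997), Lemma 2) that
$$\frac{\partial W}{\partial \delta}(0, x, p) = \frac{\partial}{\partial \delta}\,\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]\Big|_{\delta=0}.$$
So we have to differentiate $\mathbb{E}(U(\cdot))$ with respect to $\delta$. Using a Taylor expansion of $U$ around $V_{\varphi^*,x-\delta}(T)$ we have
$$\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right] = \mathbb{E}\left[U(V_{\varphi^*,x-\delta}(T))\right] + \frac{\delta}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x-\delta}(T))\,X\right] + o(\delta).$$
Neglecting the $o(\delta)$-term and differentiating with respect to $\delta$, we obtain
$$\frac{\partial}{\partial \delta}\,\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]\Big|_{\delta=0} = \frac{\partial}{\partial \delta}\,\mathbb{E}\left[U(V_{\varphi^*,x-\delta}(T))\right]\Big|_{\delta=0} + \frac{1}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right].$$
Since $\mathcal{U}(x) = \mathbb{E}\left[U(V_{\varphi^*,x}(T))\right]$ the first-order condition for optimality is
$$-\mathcal{U}'(x) + \frac{1}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right] = 0,$$
and from this (7.1) follows. $\Box$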
We ask for the reader's patience until §7.3, where we will give an application of this pricing approach.
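In the meantime, formula (7.1) can be sanity-checked in a complete market, where the fair price should reduce to the arbitrage price. The sketch below is an illustration under assumptions not made in the text: a Black-Scholes market with drift $b$, logarithmic utility, and the standard Merton result that the optimal terminal wealth is $V^*(T) = x\exp\{(r + \theta^2/2)T + \theta W(T)\}$ with $\theta = (b-r)/\sigma$, so that the maximal utility has derivative $1/x$. The expectation in (7.1) is computed by trapezoidal quadrature against the normal density; all names are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, T, K, r, sigma):
    """Black-Scholes call price, used as the arbitrage-price benchmark."""
    d1 = (math.log(s / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def davis_price_log_utility(x, S0, K, T, r, b, sigma, n=4001, zmax=8.0):
    """Fair price (7.1) of a call for a log-utility investor with endowment x.

    Assumes U(v) = log v and the Merton optimal wealth stated in the lead-in.
    """
    theta = (b - r) / sigma
    h = 2.0 * zmax / (n - 1)
    expectation = 0.0                                   # E[U'(V*(T)) X]
    for i in range(n):
        z = -zmax + i * h
        weight = h * (0.5 if i in (0, n - 1) else 1.0)  # trapezoidal rule
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        W_T = math.sqrt(T) * z                          # W(T) under the real-world measure
        S_T = S0 * math.exp((b - 0.5 * sigma ** 2) * T + sigma * W_T)
        V_T = x * math.exp((r + 0.5 * theta ** 2) * T + theta * W_T)
        payoff = max(S_T - K, 0.0)
        expectation += weight * density * payoff / V_T  # U'(v) = 1/v
    return expectation / (1.0 / x)                      # divide by derivative of maximal utility
```

In this example the result agrees with the Black-Scholes price and does not depend on the endowment $x$, in line with the consistency check for attainable claims given before Theorem 7.1.1.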
7.1.2 The Esscher Measure

Let $S(t)$ denote the price at time $t$ of a non-dividend-paying stock. Assume that there is a stochastic process $\{X(t)\}$ with stationary and independent increments (i.e. a Lévy process, §5.5) such that
$$S(t) = S(0)e^{X(t)}, \qquad t \ge 0.$$
For $\psi$ a measurable function and $Y$ a random variable, write
$$\mathbb{E}[\psi(Y); h] = \mathbb{E}\left[\psi(Y)e^{hY}\right] / \mathbb{E}\left[e^{hY}\right].$$
Assume that the moment-generating function
$$M(h,t) = \mathbb{E}\left[e^{hX(t)}\right]$$
of $X(t)$ exists; then
$$M(h,t) = [M(h,1)]^t.$$
The process
$$\left\{e^{hX(t)}M(h,1)^{-t}\right\}_{t \ge 0}$$
is a positive martingale and can be used to define a change of probability measure, i.e. it can be used to define the Radon-Nikodym derivative $d\mathbb{Q}/d\mathbb{P}$ of a new probability measure $\mathbb{Q}$ with respect to the original probability measure $\mathbb{P}$; $\mathbb{Q}$ is called the Esscher measure of parameter $h$. Gerber and Shiu (1995) introduced the risk-neutral Esscher measure: the Esscher measure of parameter $h = h^*$ such that the process
$$\left\{e^{-rt}S(t)\right\}_{t \ge 0}$$
is a martingale. The condition
$$\mathbb{E}\left[e^{-rt}S(t); h^*\right] = S(0)$$
yields
$$e^{rt} = \mathbb{E}\left[e^{X(t)}; h^*\right] = \frac{\mathbb{E}\left[e^{(1+h^*)X(t)}\right]}{M(h^*, t)} = \left[\frac{M(1+h^*, 1)}{M(h^*, 1)}\right]^t$$
or
$$e^r = \frac{M(1+h^*, 1)}{M(h^*, 1)}.$$
This equation then uniquely determines the parameter $h^*$. Because for $t \ge 0$
$$e^{hX(t)}M(h,1)^{-t} = \frac{e^{hX(t)}}{\mathbb{E}\left[e^{hX(t)}\right]} = \frac{S(t)^h}{\mathbb{E}\left[S(t)^h\right]},$$
we have the following:
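Lemma 7.1.1 (Factorization Formula). For a measurable function $g$ and $h$, $k$ and $t$ real numbers, with $t \ge 0$,
$$\mathbb{E}\left[S(t)^k g(S(t)); h\right] = \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right].$$

Proof.
$$\begin{aligned}
\mathbb{E}\left[S(t)^k g(S(t)); h\right] &= \mathbb{E}\left[S(t)^k g(S(t))\,e^{hX(t)}M(h,1)^{-t}\right] \\
&= \frac{\mathbb{E}\left[S(t)^{k+h} g(S(t))\right]}{\mathbb{E}\left[S(t)^h\right]} \\
&= \frac{\mathbb{E}\left[S(t)^{k+h}\right]}{\mathbb{E}\left[S(t)^h\right]} \cdot \frac{\mathbb{E}\left[S(t)^{k+h} g(S(t))\right]}{\mathbb{E}\left[S(t)^{k+h}\right]} \\
&= \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right]. \qquad \Box
\end{aligned}$$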
We apply the above technique to the valuation of a European call with maturity $T$ and strike $K$ on the underlying with price dynamics $S(t)$. By the risk-neutral valuation principle, we have to calculate
$$\begin{aligned}
\mathbb{E}\left[e^{-rT}(S(T) - K)^+; h^*\right] &= \mathbb{E}\left[e^{-rT}(S(T) - K)1_{\{S(T) > K\}}; h^*\right] \\
&= e^{-rT}\left\{\mathbb{E}\left[S(T)1_{\{S(T) > K\}}; h^*\right] - K\,\mathbb{E}\left[1_{\{S(T) > K\}}; h^*\right]\right\}.
\end{aligned}$$
To evaluate the first term, we apply the factorization formula with $k = 1$, $h = h^*$ and $g(x) = 1_{\{x > K\}}$ and get
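$$\begin{aligned}
\mathbb{E}\left[S(T)1_{\{S(T) > K\}}; h^*\right] &= \mathbb{E}\left[S(T); h^*\right]\,\mathbb{E}\left[1_{\{S(T) > K\}}; h^* + 1\right] \\
&= \mathbb{E}\left[e^{-rT}S(T); h^*\right]e^{rT}\,\mathbb{P}\left[S(T) > K; h^* + 1\right] \\
&= S(0)e^{rT}\,\mathbb{P}\left[S(T) > K; h^* + 1\right],
\end{aligned}$$
where we used the martingale property of $e^{-rt}S(t)$ under the risk-neutral Esscher measure for the last step. Now the pricing formula for the European call becomes
$$S(0)\,\mathbb{P}\left[S(T) > K; h^* + 1\right] - e^{-rT}K\,\mathbb{P}\left[S(T) > K; h^*\right].$$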
For $X(t)$ a Brownian motion, the above formula recovers the Black-Scholes pricing formula.

It is possible to generalise the above approach to the multi-asset case. Let $X(t) = (X_1(t), \ldots, X_d(t))'$. Then
$$M(z,t) = \mathbb{E}\left[e^{z'X(t)}\right], \qquad z \in \mathbb{R}^d,$$
is the moment-generating function of $X(t)$. Assuming that $\{X(t)\}_{t \ge 0}$ is a Lévy process (has stationary independent increments: §5.5), we have
$$M(z,t) = [M(z,1)]^t, \qquad t \ge 0.$$
For $h = (h_1, \ldots, h_d)' \in \mathbb{R}^d$ for which $M(h,1)$ exists the positive martingale
$$\left\{e^{h'X(t)}M(h,1)^{-t}\right\}_{t \ge 0}$$
can be used to define a new measure, the Esscher measure of parameter vector $h$. Again $h^*$ is defined as the parameter $h = h^*$ such that $e^{-rt}S_j(t)$, $j = 1, \ldots, d$, are martingales. We can rewrite these conditions as
$$e^r = \frac{M(1_j + h^*, 1)}{M(h^*, 1)}, \qquad j = 1, \ldots, d,$$
where $1_j = (0, \ldots, 0, 1, 0, \ldots, 0)'$ with the $j$th coordinate being 1. As in the one-dimensional case we have a factorization formula. For $k = (k_1, \ldots, k_d)'$ let $S(t)^k = S_1(t)^{k_1} \cdots S_d(t)^{k_d}$; then
$$\mathbb{E}\left[S(t)^k g(S(t)); h\right] = \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right].$$

Remark 7.1.1.
(i) The choice of the Esscher measure may be justified by a utility-maximising argument for a representative agent along lines similar to those leading to the pricing formula in §7.1.2. However, the class of possible trading strategies for the agent is very restrictive in this case (see Gerber and Shiu (1995) for details).
(ii) The Esscher measure offers a very attractive way to find at least one equivalent martingale measure in many incomplete market situations (see for instance its use in the hyperbolic Lévy model, Eberlein and Keller (1995)).
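Returning to the single-asset case, the defining equation $e^r = M(1+h^*,1)/M(h^*,1)$ can be solved numerically. The sketch below assumes that $X(t)$ is a Brownian motion with drift, $X(1) \sim N(\mu, \sigma^2)$, so that $\log M(h,1) = \mu h + \sigma^2 h^2/2$ and the Esscher measure of parameter $h$ makes $X(T)$ normal with mean $(\mu + h\sigma^2)T$ and variance $\sigma^2 T$. The bisection routine, its bracket `[lo, hi]` and all names are illustrative choices.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def solve_h_star(log_mgf, r, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve log M(1+h, 1) - log M(h, 1) = r for h by bisection.

    The left-hand side is increasing in h because log M(., 1) is convex;
    the bracket [lo, hi] is assumed to contain the root.
    """
    def g(h):
        return log_mgf(1.0 + h) - log_mgf(h) - r

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def esscher_call_normal(S0, K, T, r, mu, sigma):
    """Esscher price S0*P[S(T)>K; h*+1] - exp(-rT)*K*P[S(T)>K; h*]
    when X(1) ~ N(mu, sigma^2) (Brownian motion with drift)."""
    h_star = solve_h_star(lambda h: mu * h + 0.5 * sigma ** 2 * h ** 2, r)

    def prob_exceeds_strike(h):
        # Under the Esscher measure of parameter h: X(T) ~ N((mu + h sigma^2) T, sigma^2 T).
        mean = (mu + h * sigma ** 2) * T
        return norm_cdf((math.log(S0 / K) + mean) / (sigma * math.sqrt(T)))

    return (S0 * prob_exceeds_strike(h_star + 1.0)
            - math.exp(-r * T) * K * prob_exceeds_strike(h_star))
```

In this Gaussian case the root is available in closed form, $h^* = (r - \mu)/\sigma^2 - 1/2$, and the resulting price coincides with the Black-Scholes formula whatever the value of $\mu$.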
7.2 Hedging in Incomplete Markets
In Chapter 6 we used general martingale representation theorems to prove existence of hedging strategies and to construct them. The existence of non-attainable contingent claims implies in that context that there exist (sufficiently) integrable, $\mathcal{F}_T$-measurable random variables $X$ for which we cannot find an integral representation (with respect to an equivalent martingale measure) in terms of the (discounted) price processes $S = (S_0, \ldots, S_d)'$. Therefore such claims carry an intrinsic risk, and our aim can only be to reduce the remaining risk to this minimal component. We will now describe several criteria to quantify the remaining risk and the related construction of 'optimal' hedging strategies. To avoid technicalities, we assume that price processes are already discounted (i.e. use $S_0 \equiv 1$ as numeraire), and that the prices of the risky assets are given by continuous, square-integrable semi-martingales, i.e. $S = P + M + A$ with $P = (P_1, \ldots, P_d)'$, $P_i \in [0, \infty)$, $M = (M_1, \ldots, M_d)'$ a square-integrable martingale under $\mathbb{P}$ and $A = \int \alpha\,d\langle M \rangle$, with $\alpha = (\alpha_1, \ldots, \alpha_d)'$ a predictable process.

One of the most prominent problems in the contemporary mathematical finance literature is the pricing and hedging of contingent claims (random cash flows at a prespecified time point) in an incomplete market. In case of a complete market, the pricing and hedging of the contingent claim can be done via the unique martingale measure and martingale representation results. To price a contingent claim, we calculate an expected value with respect to the martingale measure; to hedge a contingent claim perfectly, we can obtain a hedging strategy by using the integrand of an appropriate stochastic integral, as in Chapter 6, or Harrison and Pliska (1981).
In the incomplete setting, however, there are infinitely many martingale measures, each leading to a different pricing formula, and so the question of finding a 'price' of a contingent claim and a suitable hedging strategy is more involved. One stream of thought, originating in the work of Föllmer and Sondermann (1986) and subsequently considerably improved by, to name a few, Föllmer and Schweizer (1991), Schweizer (1991), Schweizer (1994), Pham, Rheinländer, and Schweizer (1998), Schweizer (2001b) (an excellent review), is to use quadratic hedging approaches: local risk-minimization and mean-variance hedging (see §7.2.2 below). Typically, one tries to find a hedging strategy that minimises a suitably defined (quadratic) risk function. In case an optimal hedging strategy exists it could be used to define a dynamic value process for the contingent claim; however, there is typically a positive probability for the wealth process of an investor following this strategy to be negative at the end, and thus this approach seems not to be suitable for pricing purposes (see Korn (1997b) for a detailed discussion).

Alternative approaches relying on utility-based arguments for pricing contingent claims in incomplete markets are Aurell and Simdyankin (1998), Davis (1994), Karatzas and Kou (1996), Rouge and El Karoui (2000). (For a detailed study of the use of utility functions and optimal investment, see
Kramkov and Schachermayer (1999).) In all of these papers, links to hedging problems and thus to martingale measures can be established.

7.2.1 Quadratic Principles
Quadratic principles have a distinguished history in finance and insurance, arising from their even more distinguished history in statistics and mathematics. The fountainhead in statistics is the method of least squares, introduced by Gauss and Legendre in the early 19th century, and its development into the Linear Model of statistics (see e.g. Plackett (1960)).

The fountainhead in finance is the work of Markowitz, from 1952 on. Markowitz emphasised that one should think in terms of risk, as well as (expected) returns. Since expectations about returns can be summarized by means (mean vectors, in the multi-dimensional case) and risks by variances (covariance matrices), this gives one the mean-variance framework, still one of the cornerstones of contemporary investment theory. Mean-variance portfolio theory fits naturally with any developments that depend on distributional assumptions (on asset returns) only through the first two moments, such as normality assumptions or investors maximising quadratic utility. In an equilibrium setting, mean-variance theory leads to capital asset pricing models and the concept of diversification: one should hold a basket of assets to mimic the (efficient) market portfolio in order to be exposed only to systematic risk and diversify away specific risk exposure (which according to this theory is not rewarded anyway). For further aspects of mean-variance theory, see e.g. Bodie, Kane, and Marcus (1999), Elton and Gruber (1995), Ingersoll (1986).

A third aspect is the use of quadratic loss functions. In many areas - decision-theoretic statistics, optimization theory etc. - one introduces a loss function to quantify choice between alternatives, and is guided by the principle of minimising expected loss. For background, see e.g. Robert (1997), Chapter 2 (esp. §2.5.1) (Bayesian and decision-theoretic statistics), Whittle (1996) (LQG: linear dynamics, quadratic cost, Gaussian noise).
It is as natural to think of maximising expected utility as of minimising expected loss. To quantify this, one needs to select a utility function, $u$ say. The classic treatment here is von Neumann and Morgenstern (1953); for a more recent treatment see Elton and Gruber (1995), Ingersoll (1986) (finance and investment), or Robert (1997), Chapter 2 (Bayesian statistics). Here quadratic loss,

$$\min_{\theta \in \Theta} \mathbb{E}\left[(W(\theta) - L)^2\right],$$

where $W(\cdot)$ is the wealth function, $L$ the target for final wealth and $\Theta$ an appropriate space of strategies, corresponds in the utility formulation

$$\max_{\theta \in \Theta} \mathbb{E}\left[u(W(\theta))\right]$$

to a quadratic utility function of the form $u(w) = w - cw^2$. (Note that while the former seems very natural, the latter seems less so, as one needs to restrict $w$
in order to avoid negative wealth, and it implies unrealistic attitudes of investors towards absolute risk aversion; again see Elton and Gruber (1995) or Ingersoll (1986).)

Relevant here is the contrast between economic or financial theories that do, or do not, depend on the attitude to risk of individual agents. On the one hand, the assumption of non-satiation (preferring more to less) plus the condition of absence of arbitrage lead in complete markets (informally: there are at least as many traded assets as sources of risk) via the arbitrage pricing technique to pricing and hedging results including the classic Black-Scholes-Merton theory. On the other hand, incomplete market situations (more sources of risk than traded assets) are by definition characterised by the nonexistence of a perfect hedge for some securities, and thus require further assumptions on, e.g., the agent's risk preferences as expressed in his utility function or loss function to solve the hedging and pricing problems. Premium principles, well-established in insurance, might also be used in this framework; see Schweizer (2001a) for application of quadratic valuation principles, such as the variance and the standard-deviation principle.

From the mathematical point of view, one uses martingale theory, as martingales model random phenomena without systematic drift - fair games in gambling, absence of arbitrage in financial market models etc. (see Musiela and Rutkowski (1997)). A mean-variance framework as above necessitates second moments, so one uses the $L^2$-theory of martingales, in particular the Kunita-Watanabe inequalities and decomposition (Rogers and Williams (2000), IV.4, Dellacherie and Meyer (1978), VII.2). They allow one to use Hilbert space methods, and the geometric language of projections and orthogonality. This is very natural in situations where all the essential features are modelled by the mean and (co-)variances.
The classical situation of this type is, of course, the Gaussian (or multivariate normal) case, where means and variances determine - in the presence of Gaussianity - the structure. The downside is that one cannot then model other features - asymmetry and heavy tails, for instance - not present in the Gaussian case. One way to proceed is to retain means and variances - for convenience, their economic interpretation in the Markowitz case etc. - but replace the Gaussian assumption by something more general. One is led to a semi-parametric model, with means and variances forming the parametric component and a non-parametric component describing other aspects. See for instance Bingham and Kiesel (2002) and the references quoted there for background.
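The correspondence between quadratic loss and quadratic utility noted earlier in this subsection can be checked numerically. The following Python sketch is illustrative only: the one-period wealth function $W(\theta) = w_0 + \theta R$, the three-point return distribution and the target $L$ are assumptions made for the example, not part of the text. Expanding the square gives $\mathbb{E}[(W - L)^2] = L^2 - 2L\,\mathbb{E}[u(W)]$ for $u(w) = w - cw^2$ with $c = 1/(2L)$, so both criteria should select the same $\theta$.

```python
import numpy as np

# Hypothetical one-period setting: wealth W(theta) = w0 + theta * R,
# where R is a discrete excess return with known probabilities.
returns = np.array([-0.10, 0.02, 0.15])
probs = np.array([0.3, 0.4, 0.3])
w0, target = 1.0, 1.2            # initial wealth and target L (assumed values)
c = 1.0 / (2.0 * target)         # quadratic utility u(w) = w - c * w**2

thetas = np.linspace(0.0, 10.0, 2001)        # grid of candidate positions
wealth = w0 + np.outer(thetas, returns)      # shape (n_theta, n_states)

# Exact expectations over the three states.
expected_loss = ((wealth - target) ** 2) @ probs
expected_utility = (wealth - c * wealth ** 2) @ probs

theta_loss = thetas[np.argmin(expected_loss)]     # minimiser of quadratic loss
theta_util = thetas[np.argmax(expected_utility)]  # maximiser of quadratic utility
print(theta_loss, theta_util)
```

The grid search is deliberately naive; in this toy setting the minimiser is also available in closed form as $(L - w_0)\,\mathbb{E}[R]/\mathbb{E}[R^2]$.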
7.2.2 The Financial Market Model
We use the setting outlined in e.g. Heath, Platen, and Schweizer (2001), Pham, Rheinländer, and Schweizer (1998) and Schweizer (2001b), to which we also refer for further details.
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ satisfying the usual conditions of right-continuity and completeness, where $T \in (0, \infty]$ is a fixed time-horizon. All processes considered will be indexed by $t \in [0, T]$. We consider a market with $d + 1$ assets with price processes $(S_t) = (S^0_t, S^1_t, \ldots, S^d_t)$ available for trading. One of the assets, $S^0$, is used as a numeraire, that is, $S^0$ is assumed to be strictly positive and all other assets are discounted with $S^0$. (In order to reflect the time value of money, it is natural when following asset values over time to work in discounted terms. Indeed, we have a choice as to which numeraire - the basic asset to take as our unit of accounting - to use, and discounting by it makes its value constant.) We denote the discounted assets by $X^1, \ldots, X^d$, with $X_t = (X^1_t, \ldots, X^d_t) = (S^1_t / S^0_t, \ldots, S^d_t / S^0_t)$, and consider only the discounted values in the sequel.

We assume that $X$ satisfies the structure condition (SC); this means $X$ admits the decomposition

$$X = X_0 + M + A, \tag{SC}$$

where $M \in \mathcal{M}^2_{0,\mathrm{loc}}(\mathbb{P})$ is an $\mathbb{R}^d$-valued locally square-integrable local $\mathbb{P}$-martingale null at 0, and $A$ is an $\mathbb{R}^d$-valued adapted continuous process of finite variation null at 0. Furthermore, we denote by $\langle M \rangle = (\langle M \rangle^{ij})_{i,j=1,\ldots,d} = (\langle M^i, M^j \rangle)_{i,j=1,\ldots,d}$ the matrix-valued covariance process of $M$. We assume that $A$ is absolutely continuous with respect to $\langle M \rangle$, in the sense that

$$A^i_t = \left( \int_0^t d\langle M \rangle_s\, \lambda_s \right)^i = \sum_{j=1}^d \int_0^t \lambda^j_s\, d\langle M^i, M^j \rangle_s, \qquad 0 \le t \le T,\ i = 1, 2, \ldots, d,$$

with a predictable $\mathbb{R}^d$-valued process $\lambda$ such that the variance process of the stochastic integral $\int \lambda\, dM$, namely

$$K_t = \int_0^t \lambda_s'\, d\langle M \rangle_s\, \lambda_s = \sum_{i,j=1}^d \int_0^t \lambda^i_s \lambda^j_s\, d\langle M^i, M^j \rangle_s,$$
is $\mathbb{P}$-almost surely finite for $t \in [0, T]$. We fix an RCLL version of $K$ and call this the mean-variance tradeoff (MVT) process. As shown in Delbaen and Schachermayer (1995a) and Schweizer (1995), the structure condition (SC) is equivalent to a rather weak form of the no-arbitrage condition (no free lunch with bounded risk). Heuristically speaking, we can say that the MVT process $K$ 'measures' the extent to which $X$ deviates from being a martingale (consult Schweizer (1994), Schweizer (1995) for a precise statement and an explanation of the terminology 'mean-variance tradeoff'). If $X$ is continuous (as it will be throughout) and admits an equivalent local martingale measure (see §7.2.3 below), the structure condition is automatically satisfied.
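As an illustration of the MVT process, consider the hypothetical one-dimensional case $dX_t = X_t(\mu\, dt + \sigma\, dW_t)$ with constant $\mu, \sigma$ (an assumption made for this example only). Then $d\langle M \rangle_t = \sigma^2 X_t^2\, dt$ and $dA_t = \mu X_t\, dt$, so $\lambda_t = \mu / (\sigma^2 X_t)$ and $K_t = (\mu/\sigma)^2 t$ along every path. The Python sketch below discretises the defining integral and should reproduce this value; the function name and parameters are illustrative.

```python
import numpy as np

def mvt_process(mu, sigma, x0, T, n_steps, seed=0):
    """Discretised K_t = int_0^t lambda_s^2 d<M>_s for dX = X (mu dt + sigma dW)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)
    # Exact simulation of the geometric Brownian motion on the time grid.
    log_x = np.log(x0) + np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * dW)
    x_left = np.concatenate(([x0], np.exp(log_x)))[:-1]   # left endpoints X_{t_k}
    d_bracket = (sigma * x_left) ** 2 * dt                 # d<M> = sigma^2 X^2 dt
    lam = mu / (sigma**2 * x_left)                         # from dA = lambda d<M>
    return np.cumsum(lam**2 * d_bracket)

K = mvt_process(mu=0.08, sigma=0.2, x0=100.0, T=1.0, n_steps=250)
print(K[-1])   # terminal value K_T on the simulated path
```

In this example $K_T$ is deterministic, which is the situation singled out later in Lemma 7.2.3.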
By construction, the $X^i$ are discounted price processes that together with the constant price process 1 form a financial market model. We now introduce as a further traded asset a European contingent claim, i.e. a random payoff at time $T$, in our market. Formally, a contingent claim $H$ is an $\mathcal{F}_T$-measurable random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. A standard example would be a European call option on $X^i$ with strike $K$ and payoff $H = \max\{X^i_T - K, 0\}$. If $X$ is continuous and admits an equivalent local martingale measure (see below), the structure condition is automatically satisfied.
7.2.3 Equivalent Martingale Measures
The martingale property is evidently suitable for financial modelling because it captures unpredictability - absence of systematic movement or drift. The basic insight - due to Harrison and Kreps (1979) and Harrison and Pliska (1981) in the financial setting of Chapters 4 and 6 - is that, although discounted prices are not martingales in general, they become martingales under a suitable change of measure. One passes to an equivalent measure (same null sets: same things possible, same things impossible), under which discounted processes become martingales. Such a measure is called an equivalent martingale measure (EMM); localization may be needed - working with local martingales rather than martingales - leading to equivalent local martingale measures (ELMM). Mathematically, this change-of-measure technique is Girsanov's theorem, §5.7. Although this idea dates in the financial context from around 1980, it may be traced back much further in the actuarial and insurance settings. The intuition is that it amounts to shifting probability weight between outcomes - giving more weight to unfavourable ones - to change the risk-averse into a risk-neutral environment.

Returning to the mathematical formulation, we denote by $\mathcal{P}$ the set of equivalent local martingale measures (ELMM). We assume that $X$ admits an equivalent local martingale measure $Q \in \mathcal{P}$, and hence that our market model is free of arbitrage opportunities. For our subsequent analysis, two equivalent martingale measures will prove to be important. We denote by

$$\mathcal{P}^2 = \left\{ Q \in \mathcal{P} \,\Big|\, \frac{dQ}{d\mathbb{P}} \in L^2(\mathbb{P}) \right\}$$
the set of all ELMMs with square-integrable density. Now define the strictly positive continuous local $\mathbb{P}$-martingale

$$\hat{Z}_t := \mathcal{E}\left( -\int \lambda\, dM \right)_t = \exp\left( -\int_0^t \lambda_s'\, dM_s - \tfrac{1}{2} K_t \right)$$

(this is the stochastic exponential of Theorem 5.10.4; see Musiela and Rutkowski (1997), §10.1.4 and recall the definition of $K$). If $\hat{Z}$ is a square-integrable $\mathbb{P}$-martingale, then

$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} := \hat{Z}_T \in L^2(\mathbb{P})$$

defines an equivalent probability measure $\hat{\mathbb{P}} \sim \mathbb{P}$ under which $X$ is a local martingale, i.e. $\hat{\mathbb{P}} \in \mathcal{P}^2$. $\hat{\mathbb{P}}$ is called the minimal equivalent local martingale measure for $X$ (see Föllmer and Schweizer (1991), Schweizer (1995)). As our second important ELMM we need the variance-optimal ELMM $\tilde{\mathbb{P}}$. The formal definition is
Definition 7.2.1. The variance-optimal ELMM $\tilde{\mathbb{P}}$ is the unique element in $\mathcal{P}^2$ that minimizes

$$1 + \mathrm{Var}_{\mathbb{P}}\left[ \frac{dQ}{d\mathbb{P}} \right]$$

over all $Q \in \mathcal{P}^2$.
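To make the density of the minimal martingale measure concrete, the following Python sketch evaluates it in a hypothetical Black-Scholes-type model $dX_t = X_t(\mu\, dt + \sigma\, dW_t)$ with constant coefficients (an assumption of this example; in that one-asset model the market is complete, so the minimal measure is simply the unique EMM). There $-\int \lambda\, dM = -\theta W$ with $\theta = \mu/\sigma$ and $K_T = \theta^2 T$. Expectations are computed with Gauss-Hermite quadrature from NumPy rather than by simulation. One would expect $\mathbb{E}[\hat{Z}_T] = 1$, $\mathbb{E}[\hat{Z}_T X_T] = X_0$ and $1 + \mathrm{Var}_{\mathbb{P}}[\hat{Z}_T] = e^{K_T}$.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

mu, sigma, x0, T = 0.08, 0.2, 100.0, 1.0    # illustrative parameters
theta = mu / sigma                           # market price of risk, K_T = theta**2 * T

# Gauss-Hermite nodes and weights for the weight function exp(-z**2 / 2).
z, w = hermegauss(60)
w = w / np.sqrt(2.0 * np.pi)                 # normalise to the standard normal law
W_T = np.sqrt(T) * z

Z_T = np.exp(-theta * W_T - 0.5 * theta**2 * T)              # density dP_hat/dP
X_T = x0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)   # terminal price

total_mass = np.sum(w * Z_T)          # E[Z_T]
price_of_X = np.sum(w * Z_T * X_T)    # E[Z_T X_T], the P_hat-expectation of X_T
second_moment = np.sum(w * Z_T**2)    # E[Z_T^2] = 1 + Var[Z_T]
print(total_mass, price_of_X, second_moment)
```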
The existence of $\tilde{\mathbb{P}}$ for continuous $X$ was shown by Delbaen and Schachermayer (1996). As mentioned above, the first formal definition of the minimal martingale measure was given in Föllmer and Schweizer (1991), where the terminology was motivated by the fact that a change of measure to the minimal martingale measure disturbs the overall martingale and orthogonality structure as little as possible. Subsequently the minimal martingale measure was found to be very useful in a number of hedging and pricing applications (see Schweizer (2001b) for discussion and references). Furthermore, Schweizer (1995) provides several characterizations of the minimal martingale measure in terms of minimizing certain functionals over suitable classes of (signed) equivalent local martingale measures (see also Schweizer (1999)).

7.2.4 Hedging Contingent Claims
We now turn to the problem of hedging, that is, covering oneself against future losses associated with possession, or sale, of a contingent claim. The problem of hedging requires the introduction of trading strategies, i.e. selection of a covering portfolio in the underlying assets, and loss functions, i.e. how and when to value the success of a strategy. Informally, we can think of this problem as follows. By using our initial capital and following a certain trading strategy, we can generate a value process, which we use to cover our exposure. That is, at time T we have a certain value VT from trading; we need to cover H and evaluate the difference H - VT . Rewriting this, we get a decomposition
$$H = V_T + L_T, \qquad L_T := H - V_T. \tag{7.2}$$
As we shall see, decompositions like this under different measures play a decisive role in the solution of various hedging problems.
The corresponding mathematical tool in our setting is the basic decomposition for $L^2$-martingales, the (Galtchouk-)Kunita-Watanabe decomposition (see Rogers and Williams (2000), IV.4). $L^2$-martingales $M, N$ null at 0 are strongly orthogonal if $\mathbb{E}(M_\tau N_\tau) = 0$ for all stopping times $\tau$ (equivalently, if $MN$ is a uniformly integrable martingale). Such an $M$ possesses an orthogonal decomposition into its continuous part and its purely discontinuous part. In the financial context the continuous part is the stochastic integral - seen as a gains from trading process - and the purely discontinuous part orthogonal to it is the unhedgeable risk $L$.

Turning to the problem of exact definitions of trading strategies, we note that $\mathcal{P} \neq \emptyset$ implies that $X$ is a semi-martingale under $\mathbb{P}$. Thus we can introduce stochastic integrals with respect to $X$ and associate the integrands with trading strategies.
Definition 7.2.2. Denote by $L(X)$ the linear space of all $\mathbb{R}^d$-valued predictable $X$-integrable processes $\vartheta$. A self-financing trading strategy is any pair $(V_0, \vartheta)$ such that $V_0$ is an $\mathcal{F}_0$-measurable random variable and $\vartheta \in L(X)$. We can think of $\vartheta^i_t$ as being the number of shares of asset $i$ held at time $t$ and $V_0$ the initial capital of an investor. We associate a value process $V_t(V_0, \vartheta)$ with it given by

$$V_t(V_0, \vartheta) := V_0 + \int_0^t \vartheta_u\, dX_u. \tag{7.3}$$
Here, we call $G_t(\vartheta) = \int_0^t \vartheta_u\, dX_u$ the gains from trading process. The process is self-financing since, as soon as the initial capital is fixed, no further in- or outflow of funds is needed.

We now turn to the problem of identifying the risk associated with the hedging problem, focusing first on the problem of covering our exposure to $H$ at time $T$. Starting with an initial capital $V_0$ and following a trading strategy $\vartheta$, we obtain a final wealth $V_T(V_0, \vartheta)$. Our assessment of the quality of our trading strategy for the problem at hand will thus rely on the difference between $H$ and $V_T(V_0, \vartheta)$. Several approaches are possible. We could use a quadratic loss function; this leads to the notion of mean-variance hedging, and is described in further detail below. Other possibilities are Value-at-Risk (VaR) principles to cover against extreme losses, as in Föllmer and Leukert (1999) and Föllmer and Leukert (2000), or value-preserving strategies as in Korn (1997c).

In order to be able to consider quadratic loss functions, we restrict the class of trading strategies using

Definition 7.2.3. Let $\Theta$ be the space of all $\vartheta \in L(X)$ for which $G_T(\vartheta)$ is in $L^2(\mathbb{P})$ and the gains from trading process $G_t(\vartheta)$ is a $Q$-martingale for each $Q \in \mathcal{P}^2$. A pair $(V_0, \vartheta)$ such that $V_0 \in \mathbb{R}$ and $\vartheta \in \Theta$ is called a mean-variance optimal strategy for $H$, if it solves the optimization problem

$$\min\ \mathbb{E}\left[ (H - V_0 - G_T(\vartheta))^2 \right] \tag{7.4}$$

over $\vartheta \in \Theta$.
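The objects entering the quadratic criterion can be made concrete in discrete time. The Python sketch below is a minimal illustration under stated assumptions: the four equally likely price paths, the call payoff and the constant holding of 0.5 shares are invented for the example, and the helper names `gains`, `value_process` and `quadratic_loss` are hypothetical. It computes the discrete analogue of the gains from trading process, the self-financing value process of (7.3), and the expected squared hedging error.

```python
import numpy as np

def gains(theta, X):
    """Discrete analogue of G_t(theta): theta[k] is held over (t_k, t_{k+1}]."""
    theta, X = np.asarray(theta, float), np.asarray(X, float)
    return np.concatenate(([0.0], np.cumsum(theta * np.diff(X))))

def value_process(V0, theta, X):
    """Self-financing value process V_t = V0 + G_t(theta), cf. (7.3)."""
    return V0 + gains(theta, X)

def quadratic_loss(H, V0, thetas, paths, probs):
    """E[(H - V_T)^2] over finitely many scenarios."""
    V_T = np.array([value_process(V0, th, x)[-1] for th, x in zip(thetas, paths)])
    return float(np.sum(np.asarray(probs) * (np.asarray(H) - V_T) ** 2))

# Four equally likely two-step paths and a call struck at 100 (assumed data).
paths = [[100, 110, 120], [100, 110, 100], [100, 90, 100], [100, 90, 80]]
probs = [0.25] * 4
H = [max(p[-1] - 100, 0) for p in paths]
thetas = [[0.5, 0.5]] * 4          # hold half a share throughout
loss = quadratic_loss(H, 5.0, thetas, paths, probs)
print(loss)
```

Minimising such a loss over the initial capital and all admissible (predictable) holdings is the discrete counterpart of the mean-variance hedging problem; the sketch only evaluates one candidate strategy.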
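The solution of the problem can be given in terms of a decomposition (7.2) under the variance-optimal measure $\tilde{\mathbb{P}}$. From Gouriéroux, Laurent, and Pham (1998) we know

$$\tilde{Z}_t := \tilde{\mathbb{E}}\left[ \frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} \,\Big|\, \mathcal{F}_t \right] = \tilde{Z}_0 + \int_0^t \tilde{\zeta}_u\, dX_u$$

for some $\tilde{\zeta} \in \Theta$. We have (see Schweizer (2001b))

Theorem 7.2.1. Let $H \in L^2(\mathbb{P})$ be a contingent claim and write the Galtchouk-Kunita-Watanabe decomposition of $H$ under $\tilde{\mathbb{P}}$ with respect to $X$ as

$$H = \tilde{\mathbb{E}}(H) + \int_0^T \xi^{H,\tilde{\mathbb{P}}}_s\, dX_s + L^{H,\tilde{\mathbb{P}}}_T = V^{H,\tilde{\mathbb{P}}}_T, \tag{7.5}$$

with

$$V^{H,\tilde{\mathbb{P}}}_t := \tilde{\mathbb{E}}[H \mid \mathcal{F}_t] = \tilde{\mathbb{E}}(H) + \int_0^t \xi^{H,\tilde{\mathbb{P}}}_s\, dX_s + L^{H,\tilde{\mathbb{P}}}_t.$$

Then the mean-variance optimal strategy for $H$ is given by

$$V_0^* = \tilde{\mathbb{E}}(H), \tag{7.6}$$

$$\vartheta^*_t = \xi^{H,\tilde{\mathbb{P}}}_t - \frac{\tilde{\zeta}_t}{\tilde{Z}_t} \left( V^{H,\tilde{\mathbb{P}}}_{t-} - \tilde{\mathbb{E}}(H) - \int_0^t \vartheta^*_s\, dX_s \right). \tag{7.7}$$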
Moreover, the minimal total risk of H is given by
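where $Z_t := \mathbb{E}\left[ \frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} \,\Big|\, \mathcal{F}_t \right]$ is the density process of $\tilde{\mathbb{P}}$ with respect to $\mathbb{P}$.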
Observe that the solution hinges, like the solution for the problem of finding risk-minimising strategies discussed below, on finding a decomposition of type (7.2), this time however under the variance-optimal martingale measure $\tilde{\mathbb{P}}$. Using Theorem 7.2.1 we now have a recipe to find the solution, i.e. the initial value and strategy $\vartheta^*$ that solve the problem (7.4). However, the main obstacle is the explicit calculation of the trading strategy, which is in the general case an open problem, mainly due to the fact that the structure of $\tilde{\mathbb{P}}$ has to be explored in more detail. However, solutions exist in many important cases, especially in the case when the minimal martingale measure $\hat{\mathbb{P}}$ and the variance-optimal martingale measure $\tilde{\mathbb{P}}$ coincide (mainly due to the fact that the former is well-studied). We will return to that issue later on.

An alternative to minimizing some loss function at the terminal date $T$ is to insist on obtaining the target value, i.e. to be perfectly hedged, and have $V_T = H$. By the very definition of an incomplete market, that cannot be guaranteed using only self-financing strategies, and we must allow an additional movement of funds. We then have to monitor these additional cash-flows, and an obvious approach is to look for a strategy that minimizes the additional funding required. To formalize these ideas we have to introduce further terminology.

Definition 7.2.4. A pre-strategy is any pair $\varphi = (\vartheta, \eta)$, where $\vartheta \in L(X)$ and $\eta = (\eta_t)$ is a real-valued adapted process such that the value process $V_t(\varphi) = \vartheta_t' X_t + \eta_t$ is right-continuous. We call a pre-strategy an RM-strategy if $\vartheta \in L^2(X)$. The cumulative cost process $C(\varphi)$ is then defined by

$$C_t(\varphi) := V_t(\varphi) - \int_0^t \vartheta_u\, dX_u. \tag{7.9}$$

A pre-strategy $\varphi$ is called mean-self-financing if $C(\varphi)$ is a martingale under $\mathbb{P}$.
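A discrete-time sketch may help to fix the meaning of the cost process (7.9). The Python code below is illustrative only: the price path, the holdings and the helper names `cost_process` and `self_financing_cash` are assumptions for this example, and the convention that $\vartheta_k$ is the position held after rebalancing at time $t_k$ is one of several possible discretisations. For a self-financing strategy the cost stays at the initial capital; an external cash injection shows up as a jump in $C$.

```python
import numpy as np

def cost_process(theta, eta, X):
    """C_k = V_k - sum_{j<k} theta_j (X_{j+1} - X_j), with V_k = theta_k X_k + eta_k."""
    theta, eta, X = (np.asarray(a, float) for a in (theta, eta, X))
    value = theta * X + eta
    gains = np.concatenate(([0.0], np.cumsum(theta[:-1] * np.diff(X))))
    return value - gains

def self_financing_cash(V0, theta, X):
    """Cash holdings eta that make rebalancing cost-free at every date."""
    theta, X = np.asarray(theta, float), np.asarray(X, float)
    eta = np.empty_like(X)
    eta[0] = V0 - theta[0] * X[0]
    for k in range(1, len(X)):
        # Rebalance from theta[k-1] to theta[k] at price X[k] without new funds.
        eta[k] = eta[k - 1] + (theta[k - 1] - theta[k]) * X[k]
    return eta

X = [100.0, 105.0, 98.0, 110.0]          # assumed discounted price path
theta = [0.5, 0.7, 0.2, 0.0]             # assumed share holdings
eta_sf = self_financing_cash(10.0, theta, X)
C_sf = cost_process(theta, eta_sf, X)    # constant, equal to the initial capital

eta_inj = eta_sf.copy()
eta_inj[2:] += 2.0                       # inject 2 units of cash at the third date
C_inj = cost_process(theta, eta_inj, X)
print(C_sf, C_inj)
```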
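The objective now is to minimise some loss functional associated with the cost process, and again a quadratic approach can be used. Indeed, if $C(\varphi)$ is square-integrable, we define the risk process of $\varphi$ by

$$R_t(\varphi) := \mathbb{E}\left[ (C_T(\varphi) - C_t(\varphi))^2 \mid \mathcal{F}_t \right], \tag{7.10}$$

and we now can look for a strategy that minimises this process in a suitable way. Following Föllmer and Sondermann (1986), we have

Definition 7.2.5. An RM-strategy $\varphi$ is called risk-minimizing if $R_t(\varphi) \le R_t(\tilde{\varphi})$ $\mathbb{P}$-a.s. for every $t \in [0, T]$ and for any RM-strategy $\tilde{\varphi}$ such that $V_T(\tilde{\varphi}) = V_T(\varphi)$ $\mathbb{P}$-a.s.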
In general, RM-strategies with $V_T(\varphi) = H$ are not self-financing. However, one can show that they are mean-self-financing, so we can think of them, using the standard martingale interpretation, as being self-financing on average. In the special case when $X$ is already a local $\mathbb{P}$-martingale, finding a risk-minimising strategy for a contingent claim $H \in L^2(\mathcal{F}_T, \mathbb{P})$ again uses a Galtchouk-Kunita-Watanabe decomposition. Indeed we can write

$$H = \mathbb{E}(H \mid \mathcal{F}_0) + \int_0^T \vartheta^H_u\, dX_u + L^H_T, \tag{7.11}$$

for some $\vartheta^H \in L^2(X)$ and some square-integrable $\mathbb{P}$-martingale $L^H$ null at 0 strongly orthogonal to the space of stochastic integrals $\{\int \vartheta\, dX \mid \vartheta \in L^2(X)\}$. With this decomposition as the key tool, one gets
Theorem 7.2.2. Suppose that $X$ is a local $\mathbb{P}$-martingale. Then every contingent claim $H \in L^2(\mathcal{F}_T, \mathbb{P})$ admits a unique risk-minimising strategy $\varphi^*$ with $V_T(\varphi^*) = H$ $\mathbb{P}$-a.s. $\varphi^*$ is explicitly given in terms of decomposition (7.11) by

$$\vartheta^* = \vartheta^H, \qquad V_t(\varphi^*) = \mathbb{E}[H \mid \mathcal{F}_t] =: V_t^*, \qquad C(\varphi^*) = \mathbb{E}[H \mid \mathcal{F}_0] + L^H.$$

In the general case, when $X$ is no longer a local martingale under $\mathbb{P}$ but merely a semi-martingale, risk-minimising strategies do not exist in general (see Schweizer (2001b)). One has to be less ambitious, and instead of trying to minimize the variability of the (remaining) risk process up to maturity, one tries to minimize the local variability. The actual definition (see Schweizer (1991)) is rather delicate in continuous time, and so we only provide a formulation which is equivalent under mild technical assumptions.

Definition 7.2.6. Denote by $\Theta$ the space of all processes $\vartheta \in L(X)$ for which the stochastic integral $\int \vartheta\, dX$ is in the space $\mathcal{S}^2$ of semi-martingales. An $L^2$-strategy is any pre-strategy $\varphi = (\vartheta, \eta)$ with $\vartheta \in \Theta$ such that $V(\varphi)$ is square-integrable, i.e. $V_t(\varphi) \in L^2(\mathbb{P})$ for each $t \in [0, T]$. An $L^2$-strategy $\varphi$ is called a pseudo-locally risk-minimising strategy if its cost process $C(\varphi)$ is a square-integrable $\mathbb{P}$-martingale and strongly $\mathbb{P}$-orthogonal to $M$, the $\mathbb{P}$-martingale part of $X$.
Again, the key to finding such a pseudo-locally risk-minimising strategy is to obtain a suitable decomposition of type (7.2). For $H \in L^2(\mathbb{P})$ the decomposition is known as the Föllmer-Schweizer decomposition:

$$H = H_0 + \int_0^T \xi^H_u\, dX_u + L^H_T, \tag{7.12}$$

with $H_0 \in L^2(\mathcal{F}_0, \mathbb{P})$, $\xi^H \in \Theta$ and a square-integrable $\mathbb{P}$-martingale $L^H$ null at 0 and strongly $\mathbb{P}$-orthogonal to $M$. Again, the locally risk-minimising strategy $\varphi$ can be read off as in Theorem 7.2.2. In case of continuous $X$, the construction of such a decomposition can be achieved with the help of the minimal ELMM $\hat{\mathbb{P}}$ via a Galtchouk-Kunita-Watanabe decomposition of the value process $V$ under $\hat{\mathbb{P}}$.
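The link between the Föllmer-Schweizer decomposition and the minimal measure can be illustrated in a one-period toy model. The Python sketch below is a discrete analogue under stated assumptions, not the continuous-time construction of the text: the three-state increment, its probabilities and the call payoff are invented, and the density $1 - (\Delta A / \mathrm{Var}(\Delta X))\,\Delta M$ is used as the one-period counterpart of the minimal martingale density (it can become negative in other examples). In one period the decomposition $H = H_0 + \xi\, \Delta X + L$ with $\mathbb{E}[L] = 0$ and $\mathbb{E}[L\, \Delta M] = 0$ is a linear regression, and one would expect $H_0$ to equal the expectation of $H$ under the minimal measure.

```python
import numpy as np

# Hypothetical one-period trinomial model under P (all numbers are illustrative).
dX = np.array([-10.0, 2.0, 12.0])     # increment X_1 - X_0 in each state
p = np.array([0.3, 0.4, 0.3])         # physical probabilities
H = np.maximum(dX, 0.0)               # call payoff with strike equal to X_0

def mean(y):
    """Expectation under P."""
    return float(np.sum(p * y))

dA = mean(dX)                         # predictable (drift) part of the increment
dM = dX - dA                          # martingale part under P
var_dX = mean(dM**2)

# Decomposition H = H0 + xi * dX + L with L centred and orthogonal to dM.
xi = mean((H - mean(H)) * dM) / var_dX
H0 = mean(H) - xi * dA
L = H - H0 - xi * dX

# One-period analogue of the minimal martingale density.
Z_hat = 1.0 - (dA / var_dX) * dM
print(xi, H0, mean(Z_hat * H))
```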
7.2.5 Mean-variance Hedging and the Minimal ELMM
We now return to the discussion of the mean-variance hedging problem (7.4) and exploit the solution given in Theorem 7.2.1. As we saw, the key was to find a decomposition of type (7.2) under the variance-optimal martingale measure $\tilde{\mathbb{P}}$. We consider a particular case discussed in detail in Pham, Rheinländer, and Schweizer (1998), to which we refer for proofs of the two lemmas below. The general idea is to consider a situation where the variance-optimal and the minimal ELMM coincide, so that one can characterise the solution of (7.4) in terms of the latter. We start with a result on the Föllmer-Schweizer decomposition under stronger integrability assumptions.
Lemma 7.2.1 (PRS 1). Assume that $X$ is continuous. If $K$ is bounded, then any $H \in L^p(\mathcal{F}_T, \mathbb{P})$ with $p \ge 2$ has a Föllmer-Schweizer decomposition

$$H = H_0 + \int_0^T \xi^H_s\, dX_s + L^H_T \quad \mathbb{P}\text{-a.s.}$$

with $H_0 \in \mathbb{R}$, $\xi^H \in L^p(M)$ and $L^H \in \mathcal{M}^p(\mathbb{P})$.
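Here $L^p(M)$ denotes the space of all predictable processes $\vartheta$ such that

$$\left\| \left( \int_0^T \vartheta_s'\, d\langle M \rangle_s\, \vartheta_s \right)^{1/2} \right\|_{L^p(\mathbb{P})} < \infty,$$

which in the case $K$ bounded coincides with $\Theta$. We now apply this result to get a Föllmer-Schweizer decomposition of the density process of the minimal martingale measure. Observing that if $K_T$ is bounded, we have in addition

$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} \in L^r(\mathbb{P}) \text{ for every } r < \infty \quad \text{and} \quad \frac{d\mathbb{P}}{d\hat{\mathbb{P}}} \in L^r(\mathbb{P}) \text{ for every } r < \infty, \tag{7.13}$$

we get by Lemma 7.2.1 and (7.13) the Föllmer-Schweizer decomposition

$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} = \hat{Z}_T = \mathbb{E}(\hat{Z}_T^2) + \int_0^T \zeta_s\, dX_s + L_T \tag{7.14}$$

with $L$ in $\mathcal{M}^r(\mathbb{P})$ for every $r < \infty$.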
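Lemma 7.2.2 (PRS 2). Assume that $X$ is continuous, $K$ is bounded, and $X$ satisfies the special assumption

$$L_T = 0 \text{ in the decomposition (7.14).} \tag{7.15}$$

For fixed $H \in L^{2+\varepsilon}(\mathcal{F}_T, \mathbb{P})$ with some $\varepsilon > 0$, the optimization problem (7.4) over $\vartheta \in \Theta$ (with initial capital $x$) has a solution $\xi^{(x)}$ given in feedback form by

$$\xi^{(x)}_t = \xi^H_t - \frac{\zeta_t}{\hat{Z}^0_t} \left( \hat{V}_{t-} - x - \int_0^t \xi^{(x)}_s\, dX_s \right),$$

where

$$\hat{Z}^0_t := \hat{\mathbb{E}}(\hat{Z}_T \mid \mathcal{F}_t) = \mathbb{E}(\hat{Z}_T^2) + \int_0^t \zeta_s\, dX_s, \qquad 0 \le t \le T,$$

$$\hat{V}_t := \hat{\mathbb{E}}(H \mid \mathcal{F}_t) = H_0 + \int_0^t \xi^H_s\, dX_s + L^H_t, \qquad 0 \le t \le T.$$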
Furthermore, the minimal quadratic risk is given by
(7.16)

A basic tool in obtaining these formulas is the identity

with the process $N$ defined by

(7.17)

Then one has $\xi^{(x)} = \xi^H - N_-\, \zeta$. We shall make use of these identities later on.
The special assumption (7.15) ensures in effect that the variance-optimal and the minimal martingale measures coincide (see Rheinländer and Schweizer (1997) for further discussion). In the special case of a deterministic final value of the mean-variance tradeoff we get

Lemma 7.2.3. Suppose that $X$ is a continuous process such that $\mathcal{P}^2 \neq \emptyset$. If the final value $K_T$ of the mean-variance tradeoff process is deterministic, the variance-optimal martingale measure $\tilde{\mathbb{P}}$ and the minimal martingale measure $\hat{\mathbb{P}}$ coincide, i.e. $\tilde{\mathbb{P}} = \hat{\mathbb{P}}$, and

$$\hat{Z}_t = \mathcal{E}\left( -\int \lambda\, dM \right)_t,$$

$$\hat{Z}^0_t = \tilde{Z}_t = e^{K_T}\, \mathcal{E}\left( -\int \lambda\, dX \right)_t,$$

$$\zeta_t = \tilde{\zeta}_t = -e^{K_T}\, \mathcal{E}\left( -\int \lambda\, dX \right)_t \lambda_t = -\hat{Z}^0_t\, \lambda_t,$$

$$\frac{\hat{Z}_t}{\hat{Z}^0_t} = e^{-(K_T - K_t)}.$$
In a multidimensional diffusion setting, Gouriéroux, Laurent, and Pham (1998) obtain more general results by dynamic programming arguments.

7.2.6 Explicit Example
We consider the problem of a financial agent who has committed himself to a random payment $X$ at time $T$ (e.g. the payoff of an option). The financial agent has an initial endowment $c$ at time $t = 0$ (e.g. the asking price of the option) and is allowed to trade in the financial assets $S_1, \ldots, S_d$ during the time interval $[0, T]$, using a suitable trading strategy $\varphi$ (e.g. a self-financing portfolio) with gains process $G^{\varphi}$. The hedger's objective is to minimise the expectation of a quadratic cost functional at time $T$: for fixed $c > 0$, solve the optimization problem

$$\min_{\varphi}\ \mathbb{E}\left[ (X - c - G^{\varphi}(T))^2 \right]. \tag{7.18}$$

The asset prices are assumed to follow

$$dS_1(t) = S_1(t) \left( b_1(t)\, dt + \sigma_{11}(t)\, dW_1(t) \right),$$

$$dS_2(t) = S_2(t) \left( b_2(t)\, dt + \sigma_{21}(t)\, dW_1(t) + \sigma_{22}(t)\, dW_2(t) \right),$$

with $b_i, \sigma_{ij} > 0$; $i, j = 1, 2$ bounded adapted functions, $\sigma_{11}$ bounded away from 0 (uniformly in $\omega$) and $b_1(t)/\sigma_{11}(t)$ a deterministic function. But our investor is only allowed to use trading strategies involving the bank account and $S_1$, so he faces an incomplete market situation (in his eyes $X = S_2(T)$ is not attainable, for instance). We assume now that $X \in L^p(\Omega, \mathcal{F}, \mathbb{P})$ with $p > 2$ and

$$\Phi = \left\{ \varphi : \varphi \text{ predictable},\ \mathbb{E} \int_0^T \varphi(t)^2 S_1(t)^2\, dt < \infty \right\}.$$

Then (7.18) can be solved by a Hilbert-space projection argument: $\varphi^* \in \Phi$ is optimal if and only if

$$\mathbb{E}\left[ (X - c - G^{\varphi^*}(T))\, G^{\varphi}(T) \right] = 0 \quad \text{for all } \varphi \in \Phi.$$

We use a decomposition

$$X = X_0 + \int_0^T \varphi_1(u)\, dS_1(u) + N(T), \tag{7.19}$$

where $X_0 \in \mathbb{R}$, $\varphi_1 \in \Phi$ and $N$ is a square-integrable $\mathbb{P}$-martingale with $N(0) = 0$ and orthogonal to $S_1$ (so $\mathbb{E}(S_1(T) N(T)) = 0$). So for any $\varphi \in \Phi$ we get $\mathbb{E}((X - c - G^{\varphi}(T)) \ldots$ , we have $\mathbb{E}[(X - c - G^{\varphi_1}(T))\, G^{\varphi}(T) \ldots$
$$S_1(t) = p_1 + \int_0^t S_1(u) b_1(u)\, du + \int_0^t S_1(u) \sigma_{11}(u)\, dW_1(u), \qquad 0 \le t \le T,$$

so the martingale part of $S_1$ is a stochastic integral with respect to $W_1$, and any square-integrable $\mathbb{P}$-martingale orthogonal to the martingale part of $S_1$ is given as an appropriate stochastic integral with respect to $W_2$. Now any martingale measure in this setting is characterised in terms of a Girsanov transformation with density

$$L(t) = \exp\left\{ -\left( \int_0^t \gamma_1(u)\, dW_1(u) + \int_0^t \gamma_2(u)\, dW_2(u) \right) \right\} \times \exp\left\{ -\frac{1}{2} \int_0^t \left( \gamma_1(u)^2 + \gamma_2(u)^2 \right) du \right\}.$$

Since $S_1$ must be a martingale under $\hat{\mathbb{P}}$, we have $\gamma_1 = b_1/\sigma_{11}$ (recall $dW_1 = d\hat{W}_1 - \gamma_1\, dt$ under $\hat{\mathbb{P}}$). In order to find a minimal martingale measure, we must choose $\gamma_2$ in such a way that any $\mathbb{P}$-stochastic integral with respect to $W_2$ remains a $\hat{\mathbb{P}}$-martingale, and this implies $\gamma_2 = 0$ (i.e. $\hat{W}_2 = W_2$).

The next step is to find a Föllmer-Schweizer decomposition. Using the minimal martingale measure we get from the martingale representation theorem the $\hat{\mathbb{P}}$-decomposition

$$V(t) = \hat{\mathbb{E}}(X \mid \mathcal{F}_t) = X_0 + \int_0^t \varphi(u)\, dS_1(u) + \int_0^t \psi(u)\, dW_2(u), \qquad 0 \le t \le T,$$

which leads to

$$X = X_0 + \int_0^T \varphi(u)\, dS_1(u) + \int_0^T \psi(u)\, dW_2(u) \quad \mathbb{P}\text{-a.s.} \tag{7.20}$$

Now we can use the strategy $\varphi$ to describe the optimal strategy:

Theorem 7.2.3. Let $G^*$ be the solution of the stochastic differential equation

$$dG^*(t) = F(G^*(t))\, dS_1(t), \qquad G^*(0) = 0, \tag{7.21}$$

with

$$F(G^*(t)) = \varphi(t) + \frac{b_1(t)}{\sigma_{11}(t)^2\, S_1(t)} \left( V(t) - c - G^*(t) \right)$$

and $\varphi$ as in (7.20). Then the hedging strategy $\varphi^*(t) = F(G^*(t))$ solves (7.18).
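Proof. (Compare Schweizer (1992), p. 176.) Our integrability conditions ensure that the stochastic differential equation (7.21) has a unique solution such that $\varphi^* \in \Phi$. Setting

$$H(t, \varphi) = \mathbb{E}\left[ (V(t) - c - G^*(t))\, G^{\varphi}(t) \right], \qquad 0 \le t \le T,$$

we have to show $H(T, \varphi) = 0$ for all $\varphi \in \Phi$. An application of Itô's formula gives

$$H(t) = -\int_0^t \mathbb{E}\left[ \frac{b_1(u)^2}{\sigma_{11}(u)^2} (V(u) - c - G^*(u))\, G^{\varphi}(u) \right] du.$$

Since $b_1(t)/\sigma_{11}(t)$ is deterministic, this implies

$$\frac{d}{dt} H(t) = -\frac{b_1(t)^2}{\sigma_{11}(t)^2}\, H(t).$$

From $G^{\varphi}(0) = 0$ we know that $H(0) = 0$ and so $H(t) = 0$. □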
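The feedback rule of Theorem 7.2.3 can be explored by simulation. The Python sketch below is a rough illustration under stated assumptions: constant coefficients, the claim $X = S_2(T)$, an Euler-type discretisation of the gains, and parameter values chosen for the example; the function name `simulate` is hypothetical. With constant coefficients, $V(t) = S_2(t)\, e^{(b_2 - \sigma_{21} b_1/\sigma_{11})(T - t)}$ and $\varphi(t) = V(t)\sigma_{21}/(S_1(t)\sigma_{11})$ in (7.20). The code compares the quadratic hedging error of $\varphi$ alone with that of the feedback strategy $\varphi^*$ when the initial endowment $c$ lies below $X_0$.

```python
import numpy as np

def simulate(c, n_paths=20000, n_steps=200, T=1.0, seed=1):
    """Monte Carlo estimate of E[(X - c - G(T))^2] for phi and for phi* (X = S2(T))."""
    rng = np.random.default_rng(seed)
    b1, s11 = 0.2, 0.2                  # S1 coefficients; b1/s11 is deterministic
    b2, s21, s22 = 0.1, 0.1, 0.2        # coefficients of the non-traded asset S2
    dt = T / n_steps
    S1 = np.full(n_paths, 1.0)
    S2 = np.full(n_paths, 1.0)
    G_phi = np.zeros(n_paths)           # gains from using phi of (7.20) only
    G_star = np.zeros(n_paths)          # gains from the feedback rule of (7.21)
    drift_hat = b2 - s21 * b1 / s11     # drift of S2 under the minimal measure
    for k in range(n_steps):
        V = S2 * np.exp(drift_hat * (T - k * dt))     # V(t) = E_hat[S2(T) | F_t]
        phi = V * s21 / (S1 * s11)                     # integrand w.r.t. S1
        phi_star = phi + b1 / (s11**2 * S1) * (V - c - G_star)
        dW1 = rng.normal(0.0, np.sqrt(dt), n_paths)
        dW2 = rng.normal(0.0, np.sqrt(dt), n_paths)
        S1_new = S1 * np.exp((b1 - 0.5 * s11**2) * dt + s11 * dW1)
        dS1 = S1_new - S1
        G_phi += phi * dS1
        G_star += phi_star * dS1
        S2 = S2 * np.exp((b2 - 0.5 * (s21**2 + s22**2)) * dt + s21 * dW1 + s22 * dW2)
        S1 = S1_new
    X = S2
    return np.mean((X - c - G_phi) ** 2), np.mean((X - c - G_star) ** 2)

mse_phi, mse_star = simulate(c=0.5)
print(mse_phi, mse_star)
```

With these parameters $X_0 = 1$, so the endowment falls short by 0.5; the strategy $\varphi$ leaves this shortfall untouched, whereas the feedback term invests in the drift of $S_1$ to reduce it, which is why one expects a markedly smaller quadratic error for $\varphi^*$.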
The above proof relies crucially on the assumption that $b_1(t)/\sigma_{11}(t)$ is deterministic. Further related problems, which have been treated in e.g. Delbaen and Schachermayer (1996), Schweizer (1994) and Schweizer (1995), are to minimise

$$\mathbb{E}\left[ (X - c - G^{\varphi}(T))^2 \right]$$

over all $c \in \mathbb{R}$ and all self-financing trading strategies $\varphi$, and to minimise

$$\mathrm{Var}\left( X - G^{\varphi}(T) \right)$$

over all self-financing trading strategies $\varphi$. All these problems are related to closedness questions of sets of random variables given by appropriate stochastic integrals in the set $L^2(\mathbb{P})$ of $\mathcal{F}_T$-measurable random variables; see Monat and Stricker (1995) and Schweizer (1994).

Remark 7.2.1. (i) The minimal martingale measure has several further interesting features, e.g. it is the closest equivalent martingale measure to $\mathbb{P}$ in terms of a suitably defined relative entropy (for details and further properties see Delbaen and Schachermayer (1996), Föllmer and Schweizer (1991) and Schweizer (1995)).
(ii) Given these desirable features it seems natural to use the minimal martingale measure $\hat{\mathbb{P}}$ as the equivalent martingale measure to price contingent claims via expectations. In doing so we would assume that all non-traded risks can be traded away, and are thus unpriced (by preserving orthogonality any risk orthogonal to the observable one is non-compensated under $\hat{\mathbb{P}}$). An equilibrium justification of the minimal martingale measure is given in Pham and Touzi (1996).
(iii) A drawback of the above optimization over self-financing strategies is that value processes of such strategies are unbounded from below, allowing for negative wealth. A related optimization problem addressing these shortcomings is treated in Korn (1997a).
So far we insisted that the trading strategies should be self-financing. If we relax this condition, i.e. enlarge the class of possible trading strategies, we face the problem of risk-minimizing hedging. We consider trading strategies $\varphi$ such that the value processes are given by
$$V_\varphi(t) = x + \int_0^t \varphi(u)\,dS(u) + C(t), \quad 0 \le t \le T,$$
with $C$ an adapted process. Then the cost process $C_\varphi$ of a strategy equals
$$C_\varphi(t) = V_\varphi(t) - \int_0^t \varphi(u)\,dS(u), \quad \forall t \in [0, T].$$
We again treat the situation that $\mathbb{P}$ is a martingale measure for $S$ first. For a given contingent claim $X$ we will then look for a strategy $\varphi^*$ which replicates the claim, in the sense that $X = V_{\varphi^*}(T)$, and which at each time $t$ minimises the remaining risk
$$\mathbb{E}\big((C_\varphi(T) - C_\varphi(t))^2 \mid \mathcal{F}_t\big)$$
over all replicating strategies $\varphi$. If the contingent claim is attainable the remaining risk can be reduced to zero (and the risk-minimising strategy is self-financing). In the general case we are no longer able to achieve this. To construct a risk-minimising strategy (following Föllmer (1991) and Föllmer and Sondermann (1986)), we use again the decomposition
$$X = X_0 + \int_0^T \varphi^X(u)\,dS(u) + N(T), \tag{7.22}$$
where $N$ is a martingale orthogonal to $S$. Then the value process associated with $\varphi^*$ is given by
$$V_{\varphi^*}(t) = \mathbb{E}(X \mid \mathcal{F}_t) = X_0 + \int_0^t \varphi^X(u)\,dS(u) + N(t),$$
and we can construct the risk-minimising strategy by setting
$$\varphi^* = \varphi^X, \qquad C_{\varphi^*} = V_{\varphi^*} - \int \varphi^*\,dS. \tag{7.23}$$
The associated cost process is given by $C_{\varphi^*} = X_0 + N$, and so the remaining risk at time $t$ is given by
$$\mathbb{E}\big((N(T) - N(t))^2 \mid \mathcal{F}_t\big).$$
Observe that $\mathbb{E}(N(T) - N(t) \mid \mathcal{F}_t) = 0$, $0 \le t \le T$, so the cost process associated to the risk-minimising strategy is a martingale (we call such a strategy mean-self-financing).
In the general case, $\mathbb{P}$ is not a martingale measure for $S$, and $S$ is only a semi-martingale under $\mathbb{P}$. In this case Schweizer (compare Schweizer (1988) and Schweizer (1991)) introduced the criterion of local risk-minimization, which is essentially equivalent to the associated cost process $C_\varphi$ corresponding to a replicating strategy $\varphi$ being a martingale orthogonal
to the martingale part of $S$ under $\mathbb{P}$. Such a strategy will be called optimal. Again, a Föllmer-Schweizer decomposition leads to an optimal strategy given as in (7.23). Conversely, an optimal strategy $\varphi^*$ leads to a decomposition
$$X = V_{\varphi^*}(T) = C_{\varphi^*}(T) + \int_0^T \varphi^*(u)\,dS(u),$$
and this is of form (7.22). So existence and uniqueness of an optimal strategy is equivalent to existence and uniqueness of the Föllmer-Schweizer decomposition. Again the minimal martingale measure plays a central role in constructing such a decomposition.
The above theory has been developed in Föllmer and Schweizer (1991), Föllmer and Sondermann (1986) and Schweizer (1988). We refer the reader to Delbaen et al. (1997), Monat and Stricker (1995), Pham, Rheinländer, and Schweizer (1998), Schweizer (1994) and Schweizer (1995) for extensions, to Schäl (1994) and Schweizer (1991) for a treatment in a discrete-time framework and to Lamberton and Lapeyre (1993) and Rossi (1995) for applications in the context of the hedging of stock index options.
7.2.7 Quadratic Principles in Insurance
Until fairly recently, the literature on mathematical finance (for a brief historical summary of which see the preface to Karatzas and Shreve (1998)) and on
insurance/reinsurance/actuarial mathematics proceeded largely separately. This has begun to change, and there are now growing inter-connections between these two areas. One of the driving forces here has been the growth of securitization: the process of identifying specific risks that a risk-bearer may wish to transfer to a counter-party, and developing them into tradeable commodities (as in the growing market in credit derivatives, for example; see Chapter 9). Another is that the success of ideas in mathematical finance - pricing and hedging of options and derivatives, for example - has led to such ideas diffusing into other areas such as decision-taking by corporate management (the field of real options: see e.g. Dixit and Pindyck (1994)). On the mathematical front, martingale methods have developed in insurance in parallel with finance, following the work of Gerber and others from the 1970s on, see Gerber (1973). And the concept of market price of risk familiar from finance finds its counterpart in the concept of safety loading in insurance.
Embrechts (2000) gives an excellent survey of the interplay between actuarial and financial ideas in insurance, specifically in premium pricing. Insurance works because customers (individuals, say) and insurers (companies, say) have different utility functions, and this difference allows a band of possible prices low enough to attract takers and high enough to attract writers. Various possible premium principles emerge, based on expectations, variances, standard deviations, semi-variances, quantiles, and Esscher transforms. For background, we refer to the two books Goovaerts, de Vylder, and Haezendonck (1994) and Goovaerts, Kaas, van Heerwaarden, and Bauwelinckx (1990).
Schweizer (2001a) also exploits the links between actuarial and financial valuation principles.
One motivation here is that the selection of a valuation principle on actuarial grounds may provide a mechanism for pricing in incomplete financial markets (where, since equivalent martingale measures are non-unique, so too are prices). He uses a utility-indifference approach.
Sometimes both insurance and financial aspects appear together in the same model. This happens when, for example, an insurance policy pays out amounts depending on the value of certain stocks, or stock indexes, at certain times, as in unit-linked life insurance contracts. These have been studied in some detail by Møller, in several papers. Risk-minimization strategies and hedging are considered in this context in Møller (1998). Transformation of actuarial valuation principles into their financial counterparts is developed in such a setting in Møller (2001b), extending Schweizer (2001a). Discrete time models of this kind are developed in Møller (2001a).
The general set up extends that of Föllmer and Sondermann (1986). One has a strategy $\varphi$, given by $(\xi, \eta)$, with the vector $\xi$ representing holdings of risky assets and $\eta$ representing holdings of cash (or other riskless assets). The cost process is
$$C_t(\varphi) = V_t(\varphi) - \int_0^t \xi_u\,dX_u + A_t,$$
extending (7.9) by incorporating a payments process $A = (A_t)$, which subsumes the random payments at random times involved in the contract. Then mean self-financing strategies are defined in terms of this $C_t$ as before. The risk process $R_t(\varphi)$ is as in (7.10), and then risk-minimising strategies are defined as before. Such strategies are obtained explicitly in Møller (2001c), for fairly general payments processes $A$, via the Kunita-Watanabe decomposition. These results are applied to several situations in insurance: unit-linked life insurance contracts, single life term insurance, and index-linked claim amounts in non-life insurance.
7.3 Stochastic Volatility Models
As already mentioned in §6.2.4, the only unobservable parameter in the Black-Scholes formula is the stock's volatility $\sigma$. Since we observe option prices in the financial markets every day, we can use numerical methods to invert the formula and deduce the so-called implied volatility $\sigma_{imp}$. Under the assumptions of the Black-Scholes model we should have $\sigma_{imp}$ a constant function, i.e. $\sigma_{imp} = \sigma$. However, in reality we typically observe a U-shaped graph of $\sigma_{imp}$ as a function of the strike price $K$. This is referred to as the volatility smile, which additionally is quite often lopsided - it exhibits skewness (volatility smirk).
We will now discuss a model that tries to explain this effect by introducing a stochastic volatility. The idea is to model volatility as a stochastic process dependent on a further exogenous parameter (for an endogenously determined volatility see Platen and Schweizer (1994) and references). We consider the following extension of the standard Black-Scholes model.
Let $(W_1, W_2)$ be a two-dimensional standard Brownian motion on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$. The riskless asset $B$ satisfies
$$dB(t) = rB(t)\,dt, \quad B(0) = 1,$$
where $r \ge 0$ is a fixed constant. The price process of the risky asset $S$ is described by a stochastic volatility model of the form
$$dS(t) = S(t)\,b(t, S(t), Y(t))\,dt + S(t)\,\sigma(Y(t)) \times \Big(\sqrt{1 - \rho^2(t, S(t), Y(t))}\,dW_1(t) + \rho(t, S(t), Y(t))\,dW_2(t)\Big), \tag{7.24}$$
$$dY(t) = m(t, S(t), Y(t))\,dt + v(t, S(t), Y(t))\,dW_2(t), \tag{7.25}$$
where $\sigma$ is the volatility process and $\rho \in (-1, 1)$ is the correlation process between the asset price and its volatility. We assume that all processes are bounded and sufficiently smooth to guarantee unique strong solutions of the various stochastic differential equations we encounter.
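The inversion of the Black-Scholes formula mentioned at the start of this section can be sketched as follows. This is a minimal illustration, not the book's procedure: it assumes a European call, uses plain bisection (the call price is increasing in the volatility), and the function names and the bracket `[lo, hi]` are hypothetical choices made for the example.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes price of a European call with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price: float, s: float, k: float, r: float, tau: float,
                lo: float = 1e-4, hi: float = 5.0, tol: float = 1e-10) -> float:
    """Solve bs_call(sigma) = price for sigma by bisection on [lo, hi]."""
    if not bs_call(s, k, r, lo, tau) <= price <= bs_call(s, k, r, hi, tau):
        raise ValueError("price is not attainable for a volatility in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, mid, tau) < price:
            lo = mid  # price too low: the volatility must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to market quotes across a range of strikes $K$ (with $S$, $r$ and the maturity fixed) gives the graph of $\sigma_{imp}$ against $K$ whose shape is the smile discussed above.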
Since we have only $S$ and $B$ available for trading, we face an incomplete market, and there exist several equivalent martingale measures. However, we know that all possible equivalent martingale measures $\mathbb{Q}$ are characterised by their Girsanov densities (compare §5.7), i.e. $d\mathbb{Q}/d\mathbb{P} = L(T)$ with
$$L(t) = \exp\Big\{-\Big(\int_0^t \gamma_1(u)\,dW_1(u) + \int_0^t \gamma_2(u)\,dW_2(u)\Big)\Big\} \times \exp\Big\{-\frac{1}{2}\int_0^t \big(\gamma_1(u)^2 + \gamma_2(u)^2\big)\,du\Big\}.$$
In order to make $\tilde{S} = S/B$ a martingale under $\mathbb{Q}$, we know from formula (6.5) in §6.2 that (omitting the arguments)
$$b - r = \sigma\Big(\sqrt{1 - \rho^2}\,\gamma_1 + \rho\,\gamma_2\Big). \tag{7.26}$$
Since (7.26) does not specify $(\gamma_1, \gamma_2)$ completely, we must impose additional conditions to determine the equivalent martingale measure to be used for pricing and hedging purposes uniquely.
The Hull-White Model. Hull and White (1987) considered in their seminal paper the following version of (7.24), (7.25):
$$dS(t) = S(t)\,b(t, S(t), Y(t))\,dt + S(t)\,\sigma(Y(t))\,dW_1(t), \tag{7.27}$$
$$dY(t) = m(t, S(t), Y(t))\,dt + v(t, S(t), Y(t))\,dW_2(t). \tag{7.28}$$
So the asset price and its volatility are uncorrelated and (7.26) with $\rho = 0$ leads to $(b - r) = \sigma\gamma_1$ with $\gamma_2$ unspecified. Hull and White then assumed that the volatility has zero systematic risk, which implies that $\gamma_2 = 0$. So we have a complete specification of an equivalent martingale measure $\mathbb{P}^*$. As in the complete model, the drift $b$ of the discounted stock price process is replaced by the growth rate $r$ of the risk-free asset. Now the price process of a contingent claim $X$ is given by the risk-neutral valuation principle
$$\Pi_X(t) = e^{-r(T-t)}\,\mathbb{E}^*(X \mid \mathcal{F}_t).$$
For the case of a European call option $C$ on $S$ with maturity $T$ and strike $K$, we have $X = (S(T) - K)^+$, and using the independence of $S$ and $Y$ the price process becomes
$$\begin{aligned} C(t, S(t), \sigma(Y_t)) &= e^{-r(T-t)}\,\mathbb{E}^*\big[(S(T) - K)^+ \mid \mathcal{F}_t\big] \\ &= e^{-r(T-t)}\,\mathbb{E}^*\Big[\mathbb{E}^*\big[(S(T) - K)^+ \mid \sigma(Y_u),\ t \le u \le T\big] \,\Big|\, \mathcal{F}_t\Big] \\ &= \mathbb{E}^*\Big[C_{BS}\Big(t, \Big(\frac{1}{T-t}\int_t^T \sigma(Y_u)^2\,du\Big)^{1/2}\Big) \,\Big|\, \mathcal{F}_t\Big], \end{aligned}$$
where $C_{BS}(t, V)$ denotes the standard Black-Scholes price (compare (6.14)) as a function of time and constant volatility $V$. So the stochastic volatility call price is the risk-neutral expectation of the Black-Scholes option price formula over the distribution of the mean future volatility. Other approaches along these lines allowing for non-zero correlation $\rho$ are treated in Heston (1993) and Stein and Stein (1991).
Minimal Martingale Measure. In the setting (7.24) the $\mathbb{P}$-semi-martingale $S(t)$ can be decomposed as (omitting the arguments)
$$dS = Sb\,dt + S\sigma\sqrt{1 - \rho^2}\,dW_1 + S\sigma\rho\,dW_2.$$
Hence the martingale part $M$ of $S$ is given by the stochastic integrals, and its quadratic variation has dynamics
$$d\langle M\rangle = \big(S^2\sigma^2(1 - \rho^2) + S^2\sigma^2\rho^2\big)\,dt = S^2\sigma^2\,dt.$$
Using $\alpha = (b - r)/(\sigma^2 S)$ we have $\alpha\,d\langle M\rangle = S(b - r)\,dt$ and so by Theorem 7.2.1 the Girsanov density $L(t)$ of the minimal martingale measure is given by
$$L(t) = \exp\Big\{-\int_0^t \alpha\,dM - \frac{1}{2}\int_0^t \alpha^2\,d\langle M\rangle\Big\} = \exp\Big\{-\int_0^t \frac{b - r}{\sigma}\Big(\sqrt{1 - \rho^2}\,dW_1 + \rho\,dW_2\Big) - \frac{1}{2}\int_0^t \Big(\frac{b - r}{\sigma}\Big)^2 du\Big\}.$$
So the processes $\gamma_1, \gamma_2$ are given as $\gamma_1 = \frac{b - r}{\sigma}\sqrt{1 - \rho^2}$ and $\gamma_2 = \frac{b - r}{\sigma}\rho$. In the Hull-White model (7.27) ((7.24) with $\rho = 0$) the minimal martingale measure $\hat{\mathbb{P}}$ is thus specified by $\gamma_1 = \frac{b - r}{\sigma}$ and $\gamma_2 = 0$. This means $\hat{\mathbb{P}} = \mathbb{P}^*$, the minimal martingale measure is the risk-neutral measure in this case, and we obtain identical option pricing formulas.
Davis' Formula. In the stochastic volatility case, the pricing formula (7.1) has an interpretation as a multiplicative functional of the Markov semigroup of some Markov process in $\mathbb{R}^n$. The option pricing formula can then be expressed as a discounted expectation (with a discount factor characterised by the underlying Markov process and not necessarily given by the bond price) under an equivalent martingale measure. However, if the Hull-White model (7.27) is further restricted so that all coefficients are deterministic functions of $Y(t)$ (with $Y$ and $S$ still independent) the discount factor becomes $e^{-rt}$ and the equivalent martingale measure is obtained via the risk-neutral Girsanov density, i.e. $\gamma_1 = \frac{b - r}{\sigma}$ and $\gamma_2 = 0$. We refer the reader to Davis (1997) for details and further references.
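The Hull-White representation above, in which the call price is the expectation of the Black-Scholes price evaluated at the root of the mean future variance, lends itself to a simple Monte Carlo sketch. The code below is illustrative only: it assumes, as a stand-in for the general dynamics of $Y$, that the instantaneous variance follows a geometric Brownian motion independent of the asset price, and the names `hull_white_call`, `mu_v` and `xi` are hypothetical.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes price of a European call with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def hull_white_call(s: float, k: float, r: float, tau: float, v0: float,
                    mu_v: float, xi: float, n_paths: int = 2000,
                    n_steps: int = 50, seed: int = 0) -> float:
    """Average the Black-Scholes price over simulated mean variances.

    Assumption (for illustration): the variance v solves
    dv = mu_v * v dt + xi * v dW_2, independently of the asset price.
    """
    rng = random.Random(seed)
    dt = tau / n_steps
    total = 0.0
    for _ in range(n_paths):
        v, integrated_var = v0, 0.0
        for _ in range(n_steps):
            integrated_var += v * dt  # left-point rule for the time integral
            z = rng.gauss(0.0, 1.0)
            v *= math.exp((mu_v - 0.5 * xi ** 2) * dt + xi * math.sqrt(dt) * z)
        mean_vol = math.sqrt(integrated_var / tau)
        total += bs_call(s, k, r, mean_vol, tau)
    return total / n_paths
```

With `xi = 0` and `mu_v = 0` the variance is constant and the routine reduces to the ordinary Black-Scholes price, which gives a quick sanity check of the implementation.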
Remark 7.3.1. (i) Mean-variance hedging strategies in stochastic volatility models are constructed in Pham, Rheinländer, and Schweizer (1998) §4.3; a further discussion of hedging strategies is contained in Renault and Touzi (1996).
(ii) Hobson (1998) gives an overview of the attempts to incorporate stochastic volatility in option pricing models using partial differential equation techniques.
(iii) Discrete-time models of stochastic volatility focus on the statistical and descriptive patterns of price changes in (often) short time intervals (interdaily volatility). In this context ARCH and GARCH models (below) are quite popular. For an overview of the various methods employed we refer the reader to Gourieroux (1997) and Shephard (1996). A pricing method within a GARCH framework is developed in Duan (1995).
General references on stochastic volatility are the survey of Hobson (1998) and the monograph of Fouque, Papanicolaou, and Sircar (2000). We refer to these sources for background. Below we briefly discuss some further aspects.
Mean Reversion. Recall the Ornstein-Uhlenbeck SDE of §5.8:
$$dV = -\beta V\,dt + \sigma\,dW.$$
This is the prototype of an SDE with a central push, or restoring force, tending to return the particle to its mean position at the origin. Similarly, suppose the volatility $\sigma_t$ is of form $f(Y_t)$, where the process $Y$ satisfies a SDE of the form
$$dY_t = \alpha(m - Y_t)\,dt + p(t, Y_t)\,dZ_t$$
with $Z$ a Brownian motion correlated with that driving the stock price. Then $m$ is the long run mean of $Y$, and $\alpha$ is the rate at which $Y_t$ tends to $m$, the rate of mean reversion. It is commonly observed that volatility tends to go up when prices go down, and this leverage effect may be modelled by taking the correlation $\rho$ between the two Brownian motions negative.
Time-series Methods. One of the stylized facts of financial data is that time series of stock prices show a tendency toward volatility clustering; periods of high volatility tend to occur, punctuated by periods of low volatility, the process showing stickiness, or a tendency to persist in its current state. A range of time-series methods have been developed with the aim of modelling this phenomenon, most notably ARCH (auto-regressive conditionally heteroscedastic) and GARCH (generalized ARCH). These models have been extensively developed and applied, particularly in the econometrics literature. For background and details we refer to Shephard (1996).
Long Memory. The phenomenon of long memory or long-range dependence has been studied in many contexts (hydrology, internet traffic, finance, etc.). For general background see Beran (1994).
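The mean-reverting dynamics described under Mean Reversion can be illustrated with a short simulation. The sketch assumes a constant diffusion coefficient (called `beta` here, a hypothetical name) in place of the general coefficient in the SDE for $Y$, which makes the transition law Gaussian and allows exact sampling on a time grid.

```python
import math
import random


def simulate_mean_reverting(y0: float, m: float, alpha: float, beta: float,
                            dt: float, n_steps: int, seed: int = 0) -> list:
    """Sample dY = alpha * (m - Y) dt + beta dZ exactly on a grid of mesh dt.

    Assumption: beta is constant, so Y is an Ornstein-Uhlenbeck process
    centred at the long-run mean m with mean-reversion rate alpha.
    """
    rng = random.Random(seed)
    decay = math.exp(-alpha * dt)
    # Standard deviation of the Gaussian increment over one step.
    noise_sd = beta * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    path = [y0]
    for _ in range(n_steps):
        path.append(m + (path[-1] - m) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path
```

With `beta = 0` the path decays deterministically, $Y_t = m + (Y_0 - m)e^{-\alpha t}$, which shows directly that $\alpha$ is the rate at which $Y_t$ returns to $m$. Taking $\sigma_t = f(Y_t)$ for a positive function $f$ (for instance the exponential, again an illustrative choice) then gives a mean-reverting volatility.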
Complete Stochastic Volatility Models. An interesting class of models has recently been developed by Hobson and Rogers (1998) which have stochastic volatility but - because they do not involve an extra source of randomness - remain complete. Using the notation of Hobson and Rogers (1998) write $P_t$ for the price process, $Z_t := \log(P_t e^{-rt})$ for discounted log-prices, and
$$S_t := \int_0^\infty \lambda e^{-\lambda u}(Z_t - Z_{t-u})\,du.$$
Then $S$ satisfies an SDE of the form
$$dS_t = dZ_t - \lambda S_t\,dt,$$
which may be solved to give
$$\log(P_t e^{-rt}) =: Z_t = Z_0 + (S_t - S_0) + \lambda\int_0^t S_u\,du.$$
One may obtain a risk-neutral measure for which an analogue of risk-neutral valuation holds, a PDE and Feynman-Kac formula etc. We must refer to Hobson and Rogers (1998) for further details of this promising approach.
7.4 Models Driven by Lévy Processes
7.4.1 Introduction
We begin with the basic stochastic differential equation (SDE) of Black-Scholes theory for the price process $S = (S_t)$,
$$dS_t = S_t(\mu\,dt + \sigma\,dW_t), \tag{7.29}$$
where $\mu$ is the drift (mean growth rate), $\sigma$ the volatility, and $W = (W_t)$ - the driving noise process - a Wiener process or Brownian motion. The solution of the SDE (7.29) is
$$S_t = S_0\exp\Big\{\Big(\mu - \frac{1}{2}\sigma^2\Big)t + \sigma W_t\Big\}, \tag{7.30}$$
the stochastic exponential of the drifting Brownian motion $\mu t + \sigma W_t$. Now the driving noise process $W$ is a Lévy process (§5.5). Stationarity is a sensible assumption - at least for modelling markets in equilibrium on not too large a timescale - and although the independent increments assumption is certainly open to question, it is reasonable to a first approximation, which is all we attempt here. What singles out the Wiener process $W$ among Lévy processes is path-continuity. Now the driving noise represents the net effect
of the random buffeting of the multiplicity of factors at work in the economic environment, and one would expect that, analyzed closely, this would be discontinuous, as the individual 'shocks' - pieces of price-sensitive information - arrive. (Indeed, price processes themselves are discontinuous looked at closely enough: in addition to the discrete shocks, one has discreteness of monetary values and the effect on supply and demand of individual transactions.) One is thus led to consider a SDE for the price process $Y = (Y_t)$ of the form
$$dY_t = Y_{t-}(b\,dt + \sigma\,dZ_t), \tag{7.31}$$
with $Z = (Z_t)$ a suitable driving Lévy process on a probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$. Now a Lévy process, or its law, is characterized via the Lévy-Khintchine formula by a drift $a$, the variance $\sigma^2$ of any Gaussian (Wiener, Brownian) component, and a jump measure $d\mu$. Since the form of $d\mu$ is constrained only by integrability restrictions, such a model would be non-parametric. While the modelling flexibility of such an approach, coupled with the theoretical power of modern non-parametric statistics, raises interesting possibilities, these would take us far beyond our modest scope here. We are led to seek suitable parametric families of Lévy processes, flexible enough to provide realistic models and tractable enough to allow empirical estimation of parameters from actual financial data. We refer to Chan (1999) for a theoretical analysis of models of price processes with driving noise a general Lévy process.
To return to the Lévy process, recall that the sample path of a Lévy process $Z = (Z_t)$ can be decomposed into a drift term $bt$, a Gaussian or Wiener term, and a term involving jumps. This jump component has finitely or infinitely many jumps in each time-interval, almost surely, according to whether the Lévy or jump measure is finite or infinite. Of course, the latter case is unrealistic in detail - but so are all models. It is, however, better adapted to modelling most financial data than the former. In the finite case, the influence of individual jumps is visible, indeed predominates, and we are in effect modelling shocks. This is appropriate for phenomena such as stock market crashes, or markets dominated by 'big players', where individual trades shift prices. To model the everyday movement of ordinary quoted stocks under the market pressure of many agents, an infinite measure is appropriate.
7.4.2 General Lévy-process Based Financial Market Model
Recall our Lévy-process based model of a financial price process (7.31). The characteristic function takes the form
$$\mathbb{E}(\exp\{-i\theta Z_t\}) = \exp\{-t\psi(\theta)\}$$
with $\psi$ the Lévy exponent of $Z$. The Lévy-Khintchine formula implies
$$\psi(\theta) = \frac{c^2\theta^2}{2} + ia\theta + \int_{\{|x| < 1\}} \big(1 - e^{-i\theta x} - i\theta x\big)\,\mu(dx) + \int_{\{|x| \ge 1\}} \big(1 - e^{-i\theta x}\big)\,\mu(dx)$$
with $a, c \in \mathbb{R}$ and $\mu$ a $\sigma$-finite measure on $\mathbb{R} \setminus \{0\}$ satisfying
$$\int \min\{1, x^2\}\,\mu(dx) < \infty.$$
$\mu$ is called the Lévy measure. From the Lévy-Khintchine formula we deduce the Lévy decomposition of $Z$, which says that $Z$ must be a linear combination of a standard Brownian motion $W$ and a pure jump process $X$ independent of $W$ (a process is a pure jump process if its quadratic variation is simply $[X]_t = \sum_{0 < s \le t}(\Delta X_s)^2$). We write
$$Z_t = cW_t + X_t. \tag{7.32}$$
Under further assumptions on $Z_t$ we can find a Lévy decomposition of $X$ (for details see Chan (1999), §2 or Shiryaev (1999), III §1b and VII §3c). This leads to the decomposition
$$Z_t = cW_t + M_t + at, \tag{7.33}$$
where $M_t$ is a martingale with $M_0 = 0$ and $a = \mathbb{E}(X_1)$. We shall assume the existence of such a decomposition (7.33). Then we can restate (7.31) as
$$dY_t = Y_{t-}\big((a\sigma + b)\,dt + c\sigma\,dW_t + \sigma\,dM_t\big), \tag{7.34}$$
where the coefficients $b$ and $\sigma$ are constants (though one can generalize to deterministic functions). Now (7.34) has an explicit solution
$$Y_t = Y_0\exp\Big\{\int_0^t c\sigma\,dW_s + \int_0^t \sigma\,dM_s + \int_0^t \Big(a\sigma + b - \frac{c^2\sigma^2}{2}\Big)ds\Big\} \times \prod_{0 < s \le t}(1 + \sigma\Delta M_s)\exp\{-\sigma\Delta M_s\}.$$
In order to ensure that $Y_t \ge 0$ for all $t$ almost surely, we need $\sigma\Delta M_t \ge -1$ for all $t$. A sufficient condition is that the jumps of $X$ should be suitably bounded from below. We also introduce the (locally) risk-free bank account (short rate) process $B_t$ with
$$B_t = \exp\Big\{\int_0^t r_s\,ds\Big\}, \tag{7.35}$$
with $r_t$ a suitable process.
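The explicit solution of (7.34) can be checked numerically in the simplest jump setting. The sketch below is an illustration under stated assumptions, not part of the text's development: it takes $X$ to be a compound Poisson process with intensity `lam` and jump sizes uniform on $[-0.2, 0.4]$ (a hypothetical choice with mean $0.1$, so that $a = \mathbb{E}(X_1)$ is `lam * 0.1`), and compares the general formula with the form it collapses to when there are finitely many jumps.

```python
import math
import random


def sample_path_ingredients(lam: float, t: float, seed: int = 0):
    """Return W_t and the jump sizes of a compound Poisson process X on [0, t]."""
    rng = random.Random(seed)
    jumps = []
    clock = rng.expovariate(lam)  # exponential waiting time to the first jump
    while clock <= t:
        jumps.append(rng.uniform(-0.2, 0.4))  # hypothetical jump law, mean 0.1
        clock += rng.expovariate(lam)
    return rng.gauss(0.0, math.sqrt(t)), jumps


def price_general(y0, b, sigma, c, a, w_t, jumps, t):
    """Explicit solution of (7.34), written with M_t = X_t - a * t."""
    m_t = sum(jumps) - a * t
    exponent = c * sigma * w_t + sigma * m_t + (a * sigma + b - 0.5 * c ** 2 * sigma ** 2) * t
    correction = 1.0
    for j in jumps:  # the jumps of M coincide with the jumps of X
        correction *= (1.0 + sigma * j) * math.exp(-sigma * j)
    return y0 * math.exp(exponent) * correction


def price_simplified(y0, b, sigma, c, w_t, jumps, t):
    """Same quantity after cancelling the compensator and the jump exponentials."""
    growth = math.exp(c * sigma * w_t + (b - 0.5 * c ** 2 * sigma ** 2) * t)
    return y0 * growth * math.prod(1.0 + sigma * j for j in jumps)
```

Because every factor $1 + \sigma\Delta M_s$ is positive when the jumps are bounded below by $-1/\sigma$, the simulated price stays positive, in line with the sufficient condition stated above.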
7.4.3 Existence of Equivalent Martingale Measures
To characterize equivalent martingale measures $\mathbb{Q}$ under which discounted price processes $\tilde{S}_t = Y_t/B_t$ are (local) $\mathcal{F}_t$-martingales, we rely on Girsanov's theorem for semi-martingales. (See Jacod and Shiryaev (1987), III §3d, for a thorough treatment, or Shiryaev (1999), VII §3g for a textbook summary. Bühlmann, Delbaen, Embrechts, and Shiryaev (1996) provide a discussion geared toward financial applications.) We follow the exposition in Chan (1999), to which we refer for technical details. Define a process $L_t$ as
$$L_t = 1 + \int_0^t G_s L_{s-}\,dW_s + \int_0^t\int_{\mathbb{R}} L_{s-}[H(s, x) - 1]\,M(ds, dx), \tag{7.36}$$
with functions $G$ and $H$ satisfying certain regularity conditions. Then
Theorem 7.4.1. Assume $\mathbb{Q}$ is absolutely continuous with respect to $\mathbb{P}$ on $\mathcal{F}_T$, with
$$\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L_t$$
and $\mathbb{E}(L_T) = 1$. Under $\mathbb{Q}$ the process
$$\tilde{W}_t = W_t - \int_0^t G_s\,ds$$
is a standard Brownian motion and the process $X$ is a quadratic pure jump process with compensator measure given by $\tilde{\nu}(dt, dx) = dt\,\tilde{\nu}_t(dx)$, where $\tilde{\nu}_t(dx) = H(t, x)\nu(dx)$, with $\nu$ denoting the Lévy measure of $X$, and the previsible part is given by
$$\tilde{a}_t = \mathbb{E}_{\mathbb{Q}}(X_t) = at + \int_0^t\int_{\mathbb{R}} x(H(s, x) - 1)\,\nu(dx)\,ds.$$
Using Theorem 7.4.1 we can write the discounted process $\tilde{S}$ in terms of the $\mathbb{Q}$-martingale $\tilde{M}$ and the $\mathbb{Q}$-Brownian motion $\tilde{W}$ and read off a necessary and sufficient condition for $\tilde{S}$ to be a $\mathbb{Q}$-martingale:
$$c\sigma_t G_t + a\sigma_t + b_t - r_t + \int_{\mathbb{R}} \sigma_t x(H(t, x) - 1)\,\nu(dx) = 0. \tag{7.37}$$
Since the martingale condition (7.37) doesn't specify the functions G and H uniquely, we have an infinite number of equivalent martingale measures, i.e. the market model is incomplete. We hence face the problem of choosing a particular martingale measure for pricing (and hedging) contingent claims.
Choice of an Equivalent Martingale Measure.
Minimal Martingale Measure. Consider the problem of hedging a contingent claim $H$ with maturity $T$ (modelled as a bounded $\mathcal{F}_T$-measurable random variable) in an incomplete financial market model. Under an equivalent martingale measure $\mathbb{Q}$ we can only obtain a representation of the form
$$\tilde{H} = H_0 + \int_0^T \xi_t\,d\tilde{S}_t + L_T,$$
where $(L_t)$ is a square-integrable martingale orthogonal to the martingale part of $\tilde{S}$ under $\mathbb{P}$. The integrand $\xi$ corresponds to a trading strategy that would reduce the remaining risk to the intrinsic component of the contingent claim. Therefore we try to find a martingale measure that allows for such a decomposition and preserves orthogonality. Such a measure is called a minimal martingale measure.
Esscher Transforms. The idea here is to define equivalent measures via
$$\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = \exp\Big\{-\int_0^t \theta_s\,dZ_s + \int_0^t \psi(\theta_s)\,ds\Big\}, \tag{7.38}$$
where $\psi(\theta) = -\log\mathbb{E}[\exp(-\theta Z_1)]$ is the Lévy exponent of $Z$ given by (7.31). One then has to choose $\theta_s$ to satisfy the martingale conditions. The use of Esscher transforms as a technical tool has a long history in actuarial sciences. Gerber and Shiu (1995) were the first to introduce it systematically to option pricing. Chan (1999) provides an interpretation of it in terms of entropy: if the measure $\mathbb{P}$ encapsulates information about market behaviour, then pricing by Esscher transforms amounts to choosing the equivalent martingale measure that is closest to $\mathbb{P}$ in terms of information content. Equilibrium based justifications have been given in Bühlmann, Delbaen, Embrechts, and Shiryaev (1998) and Gerber and Shiu (1995). Further background information can be found in Chan (1999), and Shiryaev (1999), VII §3c.
We outline an approach suggested by Rogers (1998) (for a general discussion of optimal consumption/investment problems see Karatzas and Shreve (1991) and Korn (1997a)). Consider a financial market with a discount process $\beta(t) = e^{-\delta t}$, $\delta > 0$ a constant, and introduce a representative agent with a utility function $U$. Suppose that the wealth process of the investor satisfies
$$dX_t = rX_t\,dt + \pi_t\Big(\frac{dY_t}{Y_{t-}} - r\,dt\Big) - C_t\,dt, \tag{7.39}$$
with $(\pi_t)$ resp. $(C_t)$ the portfolio resp. consumption process of the investor. The return process $dY_t/Y_{t-}$ is given as in (7.31) with a suitable driving Lévy process. The investor wishes to maximize
$$\mathbb{E}\Big(\int_0^\infty \exp\{-\delta t\}\,U(C_t)\,dt\Big). \tag{7.40}$$
We specialize to a utility function $U(x) = -\gamma^{-1}e^{-\gamma x}$ and solve the investor's optimization problem following the standard Hamilton-Jacobi-Bellman approach (see Karatzas and Shreve (1991), §5.8, or Korn (1997a)). This leads to an optimal consumption process $C_t^*$ (and an optimal portfolio process $\pi^*$). Now the equivalent martingale measure is given by
$$e^{-rt}\,\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = e^{-rt}L_t \propto e^{-\delta t}\,U'(C_t^*). \tag{7.41}$$
Solving (7.41) leads to
$$L_t = \exp\{-\theta^* Z_t + \psi(\theta^*)t\}$$
with an optimal parameter $\theta^*$, which is exactly of form (7.38).
7.4.4 Hyperbolic Models: The Hyperbolic Lévy Process
We now specialize the SDE (7.31) to the case where the driving Lévy noise is that generating the hyperbolic distribution of §2.12. The solution is given by the stochastic exponential
$$Y(t) = Y(0)\exp\{Z^{\zeta,\delta}(t) + \mu t\}\prod_{0 < s \le t}\big(1 + \Delta Z^{\zeta,\delta}(s)\big)e^{-\Delta Z^{\zeta,\delta}(s)} \tag{7.42}$$
(here the quadratic variation $[Z]_t$ is just $\sum_{s \le t}(\Delta Z(s))^2$, with no continuous component, as $Z$ is a pure jump process). Passing to logarithms to pass from prices to returns, one obtains two terms, the hyperbolic term $Z_t + \mu t$ and the sum-of-jumps term. To first order, this is $-\frac{1}{2}\sum_{s \le t}(\Delta Z(s))^2$.
As in the other non-normality approaches mentioned above, the drawback of the model is that the underlying stochastic model of the financial market becomes incomplete. We thus face the question of choosing an appropriate equivalent martingale measure for pricing purposes.
We concentrate now on the symmetric centered case, i.e. set $\mu = \beta = 0$. This leads to modelling the stock-price process by (7.42) (i.e. (7.31) with driving noise a hyperbolic Lévy process). As mentioned above, the return process so generated is hyperbolic to a first approximation. To generate exactly hyperbolic returns along time-intervals of length 1, Eberlein and Keller (1995) suggest writing
$$S(t) = S(0)\exp\{Z^{\zeta,\delta}(t)\} \tag{7.43}$$
as a model for stock prices, and we shall work with this model in the sequel.
To price contingent claims in the hyperbolic Lévy model, we use Esscher transforms, which are defined via (7.38). Now in our model (7.43), the function $\theta(\cdot)$ in (7.38) reduces to a constant. Therefore, defining the moment generating function of $Z^{\zeta,\delta}(t)$ as
$$M(\theta, t) = \mathbb{E}\big[\exp\{\theta Z^{\zeta,\delta}(t)\}\big], \tag{7.44}$$
the Esscher transforms are defined by
$$L_t = \exp\{\theta Z^{\zeta,\delta}(t)\}\,M(\theta, 1)^{-t} \tag{7.45}$$
(observe that $L$ is a positive martingale). According to (7.38) we define equivalent measures via
$$\frac{d\mathbb{P}^\theta}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L_t$$
and call $\mathbb{P}^\theta$ the Esscher measure of parameter $\theta$. The risk-neutral Esscher measure is the Esscher measure of parameter $\theta = \theta^*$ such that the process
$$\{e^{-rt}S(t)\}_{t \ge 0} \tag{7.46}$$
is a martingale (with $r$ the daily interest rate). From the martingale condition
$$\mathbb{E}[e^{-rt}S(t); \theta^*] = S(0)$$
we find
$$e^r = \frac{M(\theta^* + 1, 1)}{M(\theta^*, 1)},$$
from which the parameter $\theta^*$ is uniquely determined. Indeed, since the moment generating function $M^{\zeta,\delta}(u, 1)$ is
$$M^{\zeta,\delta}(u, 1) = \frac{\zeta}{K_1(\zeta)}\,\frac{K_1\big(\sqrt{\zeta^2 - \delta^2u^2}\big)}{\sqrt{\zeta^2 - \delta^2u^2}}, \quad |u| < \frac{\zeta}{\delta},$$
we have
$$r = \log\left[\frac{K_1\big(\sqrt{\zeta^2 - \delta^2(\theta^* + 1)^2}\big)}{K_1\big(\sqrt{\zeta^2 - \delta^2\theta^{*2}}\big)}\right] - \frac{1}{2}\log\left[\frac{\zeta^2 - \delta^2(\theta^* + 1)^2}{\zeta^2 - \delta^2\theta^{*2}}\right]. \tag{7.47}$$
Given the daily interest rate $r$ and the parameters $\zeta, \delta$, Equation (7.47) can be solved by numerical methods for the martingale parameter $\theta^*$.
We now value a European call with maturity $T$ and strike $K$ in the hyperbolic model, that is we assume that the underlying $S(t)$ has price dynamics given by (7.43). By the risk-neutral valuation principle we have to calculate
$$\mathbb{E}\big[e^{-rT}(S(T) - K)^+; \theta^*\big] = e^{-rT}\Big\{\mathbb{E}\big[S(T)\mathbf{1}_{\{S(T) > K\}}; \theta^*\big] - K\,\mathbb{E}\big[\mathbf{1}_{\{S(T) > K\}}; \theta^*\big]\Big\}.$$
To evaluate the first term, we apply the factorization formula with $k = 1$, $h = \theta^*$ and $g(x) = \mathbf{1}_{\{x > K\}}$ and get
$$\begin{aligned} \mathbb{E}\big[S(T)\mathbf{1}_{\{S(T) > K\}}; \theta^*\big] &= \mathbb{E}[S(T); \theta^*]\,\mathbb{E}\big[\mathbf{1}_{\{S(T) > K\}}; \theta^* + 1\big] \\ &= \mathbb{E}[e^{-rT}S(T); \theta^*]\,e^{rT}\,\mathbb{P}[S(T) > K; \theta^* + 1] \\ &= S(0)e^{rT}\,\mathbb{P}[S(T) > K; \theta^* + 1], \end{aligned}$$
where we used the martingale property of $e^{-rt}S(t)$ under the risk-neutral Esscher measure for the last step. Now the pricing formula for the European call becomes
$$S(0)\,\mathbb{P}[S(T) > K; \theta^* + 1] - e^{-rT}K\,\mathbb{P}[S(T) > K; \theta^*]. \tag{7.48}$$
We now can use formula (7.48) to compute the value of a European call with strike $K$ and maturity $T$. Denote the density of $\mathcal{L}(Z^{\zeta,\delta}(t))$ by $f_t^{\zeta,\delta}$ (compare (2.8) for the exact form). Then
$$\mathbb{E}\big[e^{-rT}(S_T - K)^+; \theta^*\big] = S(0)\int_c^\infty f_T^{\zeta,\delta}(x; \theta^* + 1)\,dx - e^{-rT}K\int_c^\infty f_T^{\zeta,\delta}(x; \theta^*)\,dx, \tag{7.49}$$
where $c = \log(K/S(0))$.
Eberlein and Keller (1995) and Eberlein, Keller, and Prause (1998) compare option prices obtained from the Black-Scholes model and prices found using the hyperbolic model with market prices. They find that the hyperbolic model provides very accurate prices and a reduction of the smile effect observed in the Black-Scholes model.
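The numerical solution of (7.47) for the martingale parameter $\theta^*$ can be sketched with the standard library alone. The code below is a minimal illustration under stated assumptions: $K_1$ is evaluated from the integral representation $K_1(x) = \int_0^\infty e^{-x\cosh t}\cosh t\,dt$ by the trapezoid rule, the root is found by bisection (the right-hand side of (7.47) is increasing in $\theta$ because the logarithm of a moment generating function is convex), and the parameter values used to exercise it are hypothetical rather than estimates from data.

```python
import math


def bessel_k1(x: float, n: int = 2000) -> float:
    """Modified Bessel function K_1(x), x > 0, by the trapezoid rule."""
    t_max = math.acosh(1.0 + 745.0 / x)  # beyond this the integrand underflows
    h = t_max / n
    total = 0.5 * math.exp(-x)  # end point t = 0 carries weight 1/2
    for i in range(1, n + 1):
        ch = math.cosh(i * h)
        total += (0.5 if i == n else 1.0) * math.exp(-x * ch) * ch
    return total * h


def log_mgf_increment(theta: float, zeta: float, delta: float) -> float:
    """Right-hand side of (7.47): log M(theta + 1, 1) - log M(theta, 1)."""
    z1 = math.sqrt(zeta ** 2 - delta ** 2 * (theta + 1.0) ** 2)
    z0 = math.sqrt(zeta ** 2 - delta ** 2 * theta ** 2)
    return math.log(bessel_k1(z1) / bessel_k1(z0)) - math.log(z1 / z0)


def esscher_parameter(r: float, zeta: float, delta: float,
                      eps: float = 1e-6, n_iter: int = 60) -> float:
    """Bisection for theta* on the interval where both moment generating functions exist."""
    lo, hi = -zeta / delta + eps, zeta / delta - 1.0 - eps
    if lo >= hi:
        raise ValueError("need zeta / delta > 1/2 for a non-empty search interval")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if log_mgf_increment(mid, zeta, delta) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `esscher_parameter(0.0002, 1.5, 0.01)` returns the parameter for a daily rate of 0.02% with the illustrative values $\zeta = 1.5$, $\delta = 0.01$. Once $\theta^*$ is known, the two tail probabilities in (7.48) can be obtained by numerical integration of the Esscher-transformed densities as in (7.49).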
8. Interest Rate Theory
In this chapter, we apply the techniques developed in the previous chapters to the fast-growing fixed-income securities market. We mainly focus on the continuous-time model (since the available tools from stochastic calculus allow an elegant presentation) and comment on the discrete-time analogue (Jarrow (1996) gives a splendid account of discrete-time models). As we want to develop a relative pricing theory, based on the no-arbitrage assumption, we will assume prices of some underlying objects as given. In the present context we take zero-coupon bonds as the building blocks of our theory. In doing so we face the additional modelling restriction that the value of a zero-coupon bond at time of maturity is predetermined (= 1). Furthermore, since the entirety of fixed-income securities gives rise to the term-structure of interest rates (sometimes called the yield curve), which describes the relationship between the yield-to-maturity and the maturity of a given fixed-income security, we face the further task of calibrating our model to a whole continuum of initial values (and not just to a vector of prices).
A first attempt at explaining the behaviour of the yield curve is in terms of a continuum of spot rates of maturities between $\underline{T}$ and $\bar{T}$, where $\underline{T}$ is the shortest (instantaneous) lending/borrowing period, and $\bar{T}$ the longest maturity of interest. We model these rates as correlated stochastic variables with the degree of correlation decreasing in terms of the difference in maturity. Discretizing the maturity spectrum, we are tempted to start with a generalized Black-Scholes model (as in §6.2)
$$dr_i(t) = a_i(t)\,dt + \sum_{j=1}^d b_{ij}(t)\,dW_j(t), \quad i = 1, \ldots, d,$$
with $W = (W_1, \ldots, W_d)$ a standard $d$-dimensional Brownian motion. The degree of (instantaneous) correlation of the different rates can then be described in terms of their covariation
$$d\langle r_i, r_j\rangle(t) = \sum_{k=1}^d b_{ik}(t)b_{jk}(t)\,dt = \rho_{ij}(t)\,dt.$$
Empirical investigations in the degree of correlation show that the $\rho_{ij}$ are usually very large (close to 1). In fact, using principal component analysis
(Mardia, Kent, and Bibby ( 1 979) , Chapter 8, Krzanowski ( 1 988) , §2.2) one can show that the first principal component often explains 80 - 90% of the total variance, and that the first three components taken together describe up to 90-95% of the total variance. Therefore, especially in early approaches, it has been tempting to choose one specific rate, usually the short rate, r { t) , as a proxy for the single variable that principal component analysis indicates can best describe the movements of the yield curve. The standard model is then of the form dr{t)
=
a{t, r { t ) ) dt + b{t, r { t))dW { t) .
(8. 1 )
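The principal-component remark above can be illustrated with a small numerical sketch. It assumes a purely hypothetical correlation matrix in which every pair of rates has the same correlation `rho`; neither this matrix nor the function names come from the text. The share of total variance carried by the first principal component is then the leading eigenvalue divided by the dimension (the trace of a correlation matrix), computed here by plain power iteration.

```python
import math

def equicorrelation_matrix(d, rho):
    """d x d correlation matrix whose off-diagonal entries all equal rho (an assumption)."""
    return [[1.0 if i == j else rho for j in range(d)] for i in range(d)]

def leading_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive-definite matrix by power iteration."""
    d = len(matrix)
    vec = [float(i + 1) for i in range(d)]  # arbitrary non-degenerate start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        image = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        eigenvalue = math.sqrt(sum(x * x for x in image))
        vec = [x / eigenvalue for x in image]  # normalise, so the norm converges to the eigenvalue
    return eigenvalue

def first_component_share(d, rho):
    """Fraction of total variance explained by the first principal component."""
    return leading_eigenvalue(equicorrelation_matrix(d, rho)) / d

share = first_component_share(10, 0.9)  # ten rates, pairwise correlation 0.9
```

For this special matrix the leading eigenvalue is $1 + (d-1)\rho$, so ten rates with pairwise correlation 0.9 give a share of 0.91, in line with the 80-90% order of magnitude quoted above.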
After developing the general features of a model using zero-coupon bonds as building blocks, based on the general theory in §6.1 and §6.2, we will discuss various such models of form (8.1), taking the short rate as state variable, and explore their implications and shortcomings in §8.2. By contrast, the modern, evolutionary approach pioneered by Heath, Jarrow, and Morton (1992) takes a continuum of instantaneous forward rates as the building blocks to describe the dynamics of the whole term structure. We will discuss this approach in detail in §8.3. Finally, we apply the different models to contingent claim pricing in §8.4. Modern approaches to the pricing of interest-rate derivatives are discussed in §8.5 and §8.6. The structure of our approach below follows the treatment of Björk (1997). For additional textbook accounts, see Part II of Musiela and Rutkowski (1997) and Brigo and Mercurio (2001).
8.1 The Bond Market
8.1.1 The Term Structure of Interest Rates
We start with a heuristic discussion, which we will formalize in the following section. The main traded objects we consider are zero-coupon bonds. A zero-coupon bond is a bond that has no coupon payments. The price of a zero-coupon bond at time $t$ that pays, say, a sure £1 at time $T \ge t$ is denoted $p(t, T)$. All zero-coupon bonds are assumed to be default-free and have strictly positive prices. Various different interest rates are defined in connection with zero-coupon bonds, but we will only consider continuously compounded interest rates (which facilitates theoretical considerations). Using the arbitrage pricing technique, we easily obtain pricing formulas for coupon bonds. Coupon bonds are bonds with regular interest payments, called coupons, plus a principal repayment at maturity. Let $c_j$ be the payments at times $t_j$, $j = 1, \ldots, n$, and $F$ be the face value paid at time $t_n$. Then the price of the coupon bond $B_c$ must satisfy
$$B_c = \sum_{j=1}^{n} c_j\,p(0, t_j) + F\,p(0, t_n). \tag{8.2}$$
Hence, we see that a coupon bond is equivalent to a portfolio of zero-coupon bonds. The yield-to-maturity is defined as an interest rate per annum that equates the present value of future cash flows to the current market value. Using continuous compounding, the yield-to-maturity $y_c$ is defined by the relation
$$B_c = \sum_{j=1}^{n} c_j \exp\{-y_c t_j\} + F \exp\{-y_c t_n\}.$$
If for instance the $t_j$, $j = 1, \ldots, n$, are expressed in years, then $y_c$ is an annual continuously compounded yield-to-maturity (with the continuously compounded annual interest rate $r(T)$ defined by the relation $p(0, T) = \exp\{-r(T)(T/365)\}$, where $T$ is measured in days). The term structure of interest rates is defined as the relationship between the yield-to-maturity on a zero-coupon bond and the bond's maturity. Normally, this yields an upward sloping curve (Fig. 8.1), but flat and downward sloping curves have also been observed.
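Formula (8.2) and the defining relation for the yield-to-maturity can be sketched in a few lines. The flat discount curve, the cash flows and all function names below are illustrative assumptions rather than anything prescribed by the text; the yield is found by bisection, which works because the present value is strictly decreasing in the yield when all cash flows are positive.

```python
import math

def coupon_bond_price(coupons, times, face, zero_price):
    """Formula (8.2): sum of c_j * p(0, t_j) plus F * p(0, t_n)."""
    pv_coupons = sum(c * zero_price(t) for c, t in zip(coupons, times))
    return pv_coupons + face * zero_price(times[-1])

def price_at_yield(y, coupons, times, face):
    """Present value of the cash flows discounted at a single continuous yield y."""
    pv_coupons = sum(c * math.exp(-y * t) for c, t in zip(coupons, times))
    return pv_coupons + face * math.exp(-y * times[-1])

def yield_to_maturity(price, coupons, times, face, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve price_at_yield(y) = price for y by bisection on the bracket [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if price_at_yield(mid, coupons, times, face) > price:
            lo = mid  # discounted value too high, so the yield must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical inputs: flat 5% zero curve, three annual coupons of 4, face value 100.
flat_curve = lambda t: math.exp(-0.05 * t)
bond_price = coupon_bond_price([4.0, 4.0, 4.0], [1.0, 2.0, 3.0], 100.0, flat_curve)
bond_yield = yield_to_maturity(bond_price, [4.0, 4.0, 4.0], [1.0, 2.0, 3.0], 100.0)
```

With a flat zero curve the recovered yield coincides with the flat rate, which gives a simple consistency check between (8.2) and the yield definition.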
Fig. 8.1. Yield curve (yield plotted against maturity)
In constructing the term structure of interest rates, we face the additional problem that in most economies no zero-coupon bonds with maturity greater than one year are traded (in the USA, Treasury bills are only traded with maturity up to one year). We can, however, use prices of coupon bonds and invert formula (8.2) for zero-coupon prices. In practice, additional complications arise, since the maturities of coupon bonds are not equally spaced and trading in bonds with some maturities may be too thin to give reliable prices. We refer the reader to Edwards and Ma (1992) and Jarrow and Turnbull (2000) for further discussion of these issues.
8.1.2 Mathematical Modelling
We use a standard setting as described in §6.1 and §6.2 (we follow Björk (1997) and El Karoui, Myneni, and Viswanathan (1992a)). Let $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ be a filtered probability space with a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \le T^*}$ satisfying the usual conditions (used to model the flow of information) and fix a terminal time horizon $T^*$. We assume that all processes are defined on this probability space. The basic building blocks for our relative pricing approach, zero-coupon bonds, are defined as follows.
Definition 8.1.1. A zero-coupon bond with maturity date $T$, also called a $T$-bond, is a contract that guarantees the holder a cash payment of one unit on the date $T$. The price at time $t$ of a bond with maturity date $T$ is denoted by $p(t, T)$.
Obviously we have $p(t, t) = 1$ for all $t$. We shall assume that the price process $p(t, T)$, $t \in [0, T]$, is adapted and strictly positive and that for every fixed $t$, $p(t, T)$ is continuously differentiable in the $T$ variable. Based on arbitrage considerations (recall our basic aim is to construct a market model that is free of arbitrage), we now define several risk-free interest rates. Given three dates $t < T_1 < T_2$ the basic question is: what is the risk-free rate of return, determined at the contract time $t$, over the interval $[T_1, T_2]$ of an investment of 1 at time $T_1$? To answer this question we consider the arbitrage Table 8.1 below (compare §1.3 for the use of arbitrage tables).
Time            | t                                  | T1        | T2
Action          | Sell one T1-bond;                  | Pay out 1 | Receive p(t,T1)/p(t,T2)
                | buy p(t,T1)/p(t,T2) T2-bonds       |           |
Net investment  | 0                                  | -1        | +p(t,T1)/p(t,T2)
Table 8.1. Arbitrage table for forward rates
To exclude arbitrage opportunities, the equivalent constant rate of interest $R$ over this period (we pay out 1 at time $T_1$ and receive $e^{R(T_2 - T_1)}$ at $T_2$) has thus to be given by
$$e^{R(T_2 - T_1)} = \frac{p(t, T_1)}{p(t, T_2)}.$$
We formalize this in:
Definition 8.1.2. (i) The forward rate for the period $[T_1, T_2]$ as seen at time $t$ is defined as
$$R(t; T_1, T_2) = -\frac{\log p(t, T_2) - \log p(t, T_1)}{T_2 - T_1}.$$
(ii) The spot rate $R(T_1, T_2)$ for the period $[T_1, T_2]$ is defined as
$$R(T_1, T_2) = R(T_1; T_1, T_2).$$
(iii) The instantaneous forward rate with maturity $T$, at time $t$, is defined by
$$f(t, T) = -\frac{\partial \log p(t, T)}{\partial T}.$$
(iv) The instantaneous short rate at time $t$ is defined by
$$r(t) = f(t, t).$$
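The rates in Definition 8.1.2 can be computed directly from a discount function. The sketch below assumes a hypothetical curve $p(0, T) = \exp\{-(0.03\,T + 0.001\,T^2)\}$, chosen only because its forward rates are known in closed form ($f(0, T) = 0.03 + 0.002\,T$); the function names are likewise illustrative. The instantaneous forward rate is approximated by a central difference in the maturity variable.

```python
import math

def example_curve(maturity):
    """Hypothetical discount function p(0, T); not taken from the text."""
    return math.exp(-(0.03 * maturity + 0.001 * maturity ** 2))

def forward_rate(p_t1, p_t2, t1, t2):
    """R(t; T1, T2) = -(log p(t, T2) - log p(t, T1)) / (T2 - T1)."""
    return -(math.log(p_t2) - math.log(p_t1)) / (t2 - t1)

def instantaneous_forward(zero_price, maturity, bump=1e-5):
    """f(t, T) = -d/dT log p(t, T), approximated by a central difference."""
    up = math.log(zero_price(maturity + bump))
    down = math.log(zero_price(maturity - bump))
    return -(up - down) / (2.0 * bump)

rate_1y_2y = forward_rate(example_curve(1.0), example_curve(2.0), 1.0, 2.0)
inst_2y = instantaneous_forward(example_curve, 2.0)
```

For the assumed curve the forward rate over $[1, 2]$ is 0.033 and the instantaneous forward rate at maturity 2 is 0.034, so the period rate is the average of the instantaneous rates over the period, as the definitions suggest.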
Definition 8.1.3. The money account process is defined by
$$B(t) = \exp\left\{\int_0^t r(s)\,ds\right\}.$$
The interpretation of the money market account is a strategy of instantaneously reinvesting at the current short rate. An immediate consequence of the definitions is
Lemma 8.1.1. For $t \le s \le T$ we have
$$p(t, T) = p(t, s)\exp\left\{-\int_s^T f(t, u)\,du\right\},$$
and in particular
$$p(t, T) = \exp\left\{-\int_t^T f(t, s)\,ds\right\}.$$
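Lemma 8.1.1 says that bond prices are recovered by integrating the forward curve. A minimal numerical sketch, assuming a hypothetical linear forward curve $f(0, u) = 0.03 + 0.002\,u$ and using Simpson's rule for the integral (both the curve and the function names are assumptions made for illustration):

```python
import math

def simpson(func, lower, upper, steps=200):
    """Composite Simpson rule on [lower, upper]; steps is rounded up to an even number."""
    if steps % 2:
        steps += 1
    h = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(lower + i * h)
    return total * h / 3.0

def discount_ratio(forward, start, maturity):
    """p(t, T) / p(t, s) = exp(-integral of f(t, u) over [s, T]), with s = start, T = maturity."""
    return math.exp(-simpson(forward, start, maturity))

linear_forward = lambda u: 0.03 + 0.002 * u  # hypothetical f(0, u)
p_0_2 = discount_ratio(linear_forward, 0.0, 2.0)
p_0_1 = discount_ratio(linear_forward, 0.0, 1.0)
p_ratio_1_2 = discount_ratio(linear_forward, 1.0, 2.0)
```

Here `p_0_2` equals $\exp\{-0.064\}$ and factorises as `p_0_1 * p_ratio_1_2`, which is the first statement of the lemma with $t = 0$, $s = 1$, $T = 2$.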
In what follows, we model the above processes in a generalized Black-Scholes framework. That is, we assume that $W = (W_1, \ldots, W_d)$ is a standard $d$-dimensional Brownian motion and the filtration $\mathbb{F}$ is the augmentation of the filtration generated by $W(t)$. The dynamics of the various processes are given as follows:
Short-rate Dynamics:
$$dr(t) = a(t)\,dt + b(t)\,dW(t), \tag{8.3}$$
Bond-price Dynamics:
$$dp(t, T) = p(t, T)\{m(t, T)\,dt + v(t, T)\,dW(t)\}, \tag{8.4}$$
Forward-rate Dynamics:
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t). \tag{8.5}$$
We assume that in the above formulas, the coefficients meet standard conditions required to guarantee the existence of the various processes - that is, existence of solutions of the various stochastic differential equations; see §5.8. Furthermore, we assume that the processes are smooth enough to allow differentiation and certain operations involving changing of order of integration and differentiation. Since the actual conditions are rather technical, we refer the reader to Björk (1997), Heath, Jarrow, and Morton (1992) and Protter (2004) (the latter reference for the stochastic Fubini theorem) for these conditions. Following Björk (1997) for formulation and proof, we now give a small toolbox for the relationships between the processes specified above.
Proposition 8.1.1. (i) If $p(t, T)$ satisfies (8.4), then for the forward-rate dynamics we have
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t),$$
where $\alpha$ and $\sigma$ are given by
$$\alpha(t, T) = v_T(t, T)\,v(t, T) - m_T(t, T), \qquad \sigma(t, T) = -v_T(t, T).$$
(ii) If $f(t, T)$ satisfies (8.5), then the short rate satisfies
$$dr(t) = a(t)\,dt + b(t)\,dW(t),$$
where $a$ and $b$ are given by
$$a(t) = f_T(t, t) + \alpha(t, t), \qquad b(t) = \sigma(t, t). \tag{8.6}$$
(iii) If $f(t, T)$ satisfies (8.5), then $p(t, T)$ satisfies
$$dp(t, T) = p(t, T)\left\{\left(r(t) + A(t, T) + \tfrac{1}{2}\|S(t, T)\|^2\right)dt + S(t, T)\,dW(t)\right\},$$
where
$$A(t, T) = -\int_t^T \alpha(t, s)\,ds, \qquad S(t, T) = -\int_t^T \sigma(t, s)\,ds. \tag{8.7}$$
Proof. To prove (i) we only have to apply Itô's formula to the defining equation for the forward rates and we leave this as Exercise 8.7 below. To prove (ii) we start by integrating the forward-rate dynamics. This leads to
$$f(t, t) = r(t) = f(0, t) + \int_0^t \alpha(s, t)\,ds + \int_0^t \sigma(s, t)\,dW(s). \tag{8.8}$$
Writing also $\alpha$ and $\sigma$ in integrated form
$$\alpha(s, t) = \alpha(s, s) + \int_s^t \alpha_T(s, u)\,du,$$
$$\sigma(s, t) = \sigma(s, s) + \int_s^t \sigma_T(s, u)\,du,$$
and inserting this into (8.8), we find
$$r(t) = f(0, t) + \int_0^t \alpha(s, s)\,ds + \int_0^t\!\!\int_s^t \alpha_T(s, u)\,du\,ds + \int_0^t \sigma(s, s)\,dW(s) + \int_0^t\!\!\int_s^t \sigma_T(s, u)\,du\,dW(s).$$
After changing the order of integration we can identify terms to establish (ii). For (iii) we use a technique from Heath, Jarrow, and Morton (1992). By the definition of the forward rates we may write the bond-price process as
$$p(t, T) = \exp\{Z(t, T)\},$$
where $Z$ is given by
$$Z(t, T) = -\int_t^T f(t, s)\,ds. \tag{8.9}$$
Again we write (8.5) in integrated form:
$$f(t, s) = f(0, s) + \int_0^t \alpha(u, s)\,du + \int_0^t \sigma(u, s)\,dW(u).$$
We insert the integrated form in (8.9) to get
$$Z(t, T) = -\int_t^T f(0, s)\,ds - \int_0^t\!\!\int_t^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_t^T \sigma(u, s)\,ds\,dW(u).$$
Now, splitting the integrals and changing the order of integration gives us
$$Z(t, T) = -\int_0^T f(0, s)\,ds - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u)$$
$$\qquad + \int_0^t f(0, s)\,ds + \int_0^t\!\!\int_u^t \alpha(u, s)\,ds\,du + \int_0^t\!\!\int_u^t \sigma(u, s)\,ds\,dW(u)$$
$$= Z(0, T) - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u)$$
$$\qquad + \int_0^t f(0, s)\,ds + \int_0^t\!\!\int_0^s \alpha(u, s)\,du\,ds + \int_0^t\!\!\int_0^s \sigma(u, s)\,dW(u)\,ds.$$
The integrand (in $s$) of the last line is just the integrated form of the forward-rate dynamics (8.5) over the interval $[0, s]$. Since $r(s) = f(s, s)$, this last line above equals $\int_0^t r(s)\,ds$. So we obtain
$$Z(t, T) = Z(0, T) + \int_0^t r(s)\,ds - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u).$$
Using $A$ and $S$ from (8.7), the stochastic differential of $Z$ is given by
$$dZ(t, T) = (r(t) + A(t, T))\,dt + S(t, T)\,dW(t).$$
Now we can apply Itô's lemma to the process $p(t, T) = \exp\{Z(t, T)\}$ to complete the proof. □
8.1.3 Bond Pricing, Martingale Measures and Trading Strategies
We will now examine the mathematical structure of our bond-market model in more detail. As usual, our first task is to find a convenient characterization of the no-arbitrage assumption. By Theorem 6.1.1, absence of arbitrage is guaranteed by the existence of an equivalent martingale measure $Q$. Recall that by definition an equivalent martingale measure has to satisfy $Q \sim \mathbb{P}$ and the discounted price processes (with respect to a suitable numeraire) of the basic securities have to be local $Q$-martingales. For the bond market this implies that the discounted prices of all zero-coupon bonds with maturities $0 \le T \le T^*$ have to be local martingales. More precisely, taking the risk-free bank account $B(t)$ as numeraire we have
Definition 8.1.4. A measure $Q \sim \mathbb{P}$ defined on $(\Omega, \mathcal{F}, \mathbb{P})$ is an equivalent martingale measure for the bond market, if for every fixed $0 \le T \le T^*$ the process
$$\frac{p(t, T)}{B(t)}, \qquad 0 \le t \le T,$$
is a local $Q$-martingale.
Assume now that there exists at least one equivalent martingale measure, say $Q$. Defining contingent claims as $\mathcal{F}_T$-measurable random variables $X$ such that $X/B(T) \in L^1(\mathcal{F}_T, Q)$ with some $0 \le T \le T^*$ (notation: $T$-contingent claims), we can use the risk-neutral valuation principle (6.1) to obtain:
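Proposition 8.1.2. Consider a $T$-contingent claim $X$. Then the price process $\Pi_X(t)$, $0 \le t \le T$, of the contingent claim is given by
$$\Pi_X(t) = B(t)\,\mathbb{E}_Q\left[\left.\frac{X}{B(T)}\,\right|\,\mathcal{F}_t\right] = \mathbb{E}_Q\left[\left. X e^{-\int_t^T r(s)\,ds}\,\right|\,\mathcal{F}_t\right].$$
In particular, the price process of a zero-coupon bond with maturity $T$ is given by
$$p(t, T) = \mathbb{E}_Q\left[\left. e^{-\int_t^T r(s)\,ds}\,\right|\,\mathcal{F}_t\right].$$
Proof. We just have to apply Theorem 6.1.4.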
We thus see that the relevant dynamics of the price processes are those given under a martingale measure $Q$. The implication for model building is that it is natural to model all objects directly under a martingale measure $Q$. This approach is called martingale modelling. The price one has to pay in using this approach lies in the statistical problems associated with parameter estimation. Turning now to market completeness, we define trading strategies - as with our models in Chapters 4 and 6 - as dynamic investment portfolios involving some or possibly all of the (basic) traded securities (in the present case zero-coupon bonds). In contrast to our previous market models, we now have a continuum of bonds with different maturities available for trade. So a portfolio could include an infinite number of financial securities (consider e.g. the rolling-over strategy to define a risk-free savings process involving always only bonds of the shortest possible maturity). We shall, however, in accordance with practical restrictions only consider portfolios involving an arbitrary, but finite, number of financial securities (see Björk, Di Masi, Kabanov, and Runggaldier (1997) and Björk, Kabanov, and Runggaldier (1997) for infinite portfolios). For any collection of maturities $0 < T_1 < T_2 < \ldots < T_k = T^*$ we then consider bond-trading strategies $\varphi(t) = (\varphi_1(t), \ldots, \varphi_k(t))$, where $\varphi_i(t)$ is the holding in the $T_i$-bond at time $t$.
A bond trading strategy $\varphi$ is said to be self-financing if the wealth process $V_\varphi$, given by
$$V_\varphi(t) = \sum_{i=1}^{k} \varphi_i(t)\,p(t, T_i),$$
satisfies
$$V_\varphi(t) = V_\varphi(0) + \sum_{i=1}^{k} \int_0^t \varphi_i(u)\,dp(u, T_i)$$
for every $t \in [0, T^*]$. If we set up the savings account $B$ and allow for trading in it, we will add a zeroth component to a given trading strategy. Using only trading strategies with a finite number of bonds, the important question in hedging and pricing problems in our market model is that of attainability of a given contingent claim rather than completeness of the market model.
8.2 Short-rate Models
Following our introductory remarks, we now look at models of the short rate of the type
$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t), \tag{8.1}$$
with functions $a$, $b$ sufficiently regular (for instance, satisfying the conditions in §5.7) and $W$ a real-valued Brownian motion. The crucial point in this setting is the assumption on the probability measure under which the short rate is modelled. If we model under the objective probability measure $\mathbb{P}$ and assume that a locally risk-free asset $B$ (the money market) exists, we face the question whether in an arbitrage-free market bond prices - quite naturally viewed as derivatives with the short rate as underlying - are uniquely determined. In contrast to the equity market setting, with a risky asset and a risk-free asset available for trading, the short rate $r$ is not the price of a traded asset, and hence we can only set up portfolios consisting of putting money in the bank account. We thus face an incomplete market situation, and the best we can hope for is to find consistency requirements for bonds of different maturity. Given a single 'benchmark' bond, we should then be able to price all other bonds relative to this given bond. On the other hand, if we assume that the short rate is modelled under an equivalent martingale measure (recall all discounted bond prices have to be martingales, see Definition 8.1.4), we can immediately price all contingent claims via the risk-neutral valuation formula (Proposition 8.1.2). The drawback in this case is the question of calibrating the model (we do not observe the parameters of the process under an equivalent martingale measure, but rather under the objective probability measure!).
8.2.1 The Term-structure Equation
Let us assume that the short-rate dynamics satisfy (8.1) under the historical probability measure $\mathbb{P}$. In our Brownian setting, we know that each equivalent martingale measure $Q$ is given in terms of a Girsanov density
$$L(t) = \exp\left\{-\int_0^t \gamma(u)\,dW(u) - \frac{1}{2}\int_0^t \gamma(u)^2\,du\right\}, \qquad 0 \le t \le T,$$
with $\gamma$ progressively measurable (compare §5.10). Assume now that $\gamma$ is given as $\gamma(t) = \lambda(t, r(t))$, with a sufficiently smooth function $\lambda$. (We will use the notation $Q(\lambda)$ to emphasize the dependence of the equivalent martingale measure on $\lambda$.) By Girsanov's Theorem 5.7.1, we know that $\tilde{W} = W + \int \lambda\,dt$ is a $Q(\lambda)$-Brownian motion. So the $Q(\lambda)$-dynamics of $r$ are given by
$$dr(t) = \{a(t, r(t)) - b(t, r(t))\lambda(t, r(t))\}\,dt + b(t, r(t))\,d\tilde{W}(t).$$
Now consider a $T$-contingent claim $X = \Phi(r(T))$, for some sufficiently smooth function $\Phi: \mathbb{R} \to \mathbb{R}_+$. We know that using the risk-neutral valuation formula we obtain arbitrage-free prices for any contingent claim by applying the expectation operator under an equivalent martingale measure (to the discounted time $T$ value). An extension (see e.g. Karatzas and Shreve (1991), Theorem 5.7.6) of the Feynman-Kac formula (Theorem 5.7.3) yields, for any $Q(\lambda)$,
$$\mathbb{E}_{Q(\lambda)}\left[\left. e^{-\int_t^T r(u)\,du}\,\Phi(r(T))\,\right|\,\mathcal{F}_t\right] = F(t, r(t)),$$
where $F: [0, T^*] \times \mathbb{R} \to \mathbb{R}$ satisfies the partial differential equation (we omit the arguments $(t, r)$):
$$F_t + (a - b\lambda)F_r + \frac{1}{2}b^2 F_{rr} - rF = 0 \tag{8.10}$$
and terminal condition $F(T, r) = \Phi(r)$, for all $r \in \mathbb{R}$. Suppose now that the price process $p(t, T)$ of a $T$-bond is determined by the assessment, at time $t$, of the segment $\{r(\tau), t \le \tau \le T\}$ of the short rate process over the term of the bond. So we assume
$$p(t, T) = F(t, r(t); T),$$
with $F$ a sufficiently smooth function. Since we know that the value of a zero-coupon bond is one unit at maturity, we have the terminal condition $F(T, r; T) = 1$. Thus we have
Proposition 8.2.1 (Term-structure Equation). If there exists an equivalent martingale measure of type $Q(\lambda)$ for the bond market (implying that the no-arbitrage condition holds) and the price processes $p(t, T)$ of $T$-bonds are given by a sufficiently smooth function $F$ as above, then $F$ will satisfy the partial differential equation (8.10) with terminal condition $F(T, r; T) = 1$.
Remark 8.2.1. (i) We emphasize again that the existence of $Q(\lambda)$ implies that the bond market is free of arbitrage, but that the reverse implication is not true in general (compare §6.1; Rogers (1995) calls the 'equivalence' a 'folk' theorem). (ii) Comparison of the above PDE (8.10) with the Black-Scholes PDE (6.11) shows that we have the additional appearance of the term $\lambda$. The important fact to notice is that $\lambda$ is not determined within the model, so in order to solve the equation we have to specify $\lambda$ a priori. This reflects the fact that the model is incomplete. Specification of $\lambda$ is a statistical problem, and is done by calibrating the model to 'benchmark' bond prices (see §8.2.3).
We investigate the role of the function $\lambda$ a little further. Under our assumption on the bond prices, an application of Itô's formula to the process $F(t, r(t); T)$ (recall $r$ has dynamics given by (8.1)) yields
$$dF(t, r(t); T) = F(t, r(t); T)\,(m(t, T)\,dt + v(t, T)\,dW(t)),$$
with the functions $m(t, T) = m(t, r(t); T)$ and $v(t, T) = v(t, r(t); T)$ given by
$$m = \frac{F_t + aF_r + \frac{1}{2}b^2 F_{rr}}{F}, \qquad v = \frac{bF_r}{F}.$$
We are now in the framework of (8.4) with the functions $m$, $v^2$ being the mean and variance, respectively, of the instantaneous rate of return at time $t$ on a bond with maturity date $T$ given that the current spot rate is $r(t) = r$. Rewriting (8.10), $\lambda$ satisfies
$$\lambda = \frac{F_t + aF_r + \frac{1}{2}b^2 F_{rr} - rF}{bF_r},$$
or in terms of $m$, $v$ above,
$$\lambda(t, r) = \frac{m(t, r; T) - r}{v(t, r; T)}, \qquad T \ge t.$$
Observe that in the numerator $m(t, r; T) - r$ we have the risk premium for the $T$-bond, i.e. the rate of return over the risk-free rate commanded by a $T$-bond. In the denominator we have the volatility of the $T$-bond, and thus the quotient can be interpreted as the risk premium per unit of volatility of the $T$-bond. The ratio is often called the market price of risk, reflecting the fact that it specifies the expected instantaneous rate of return on a bond per additional unit of risk (compare §6.2.2).
8.2.2 Martingale Modelling
We now fix an equivalent martingale measure $Q$ (which we assume to exist), and return to modelling the short-rate dynamics directly under $Q$. Thus we assume that $r$ has $Q$-dynamics
$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t) \tag{8.1}$$
with $W$ a (real-valued) $Q$-Wiener process. We can immediately apply the risk-neutral valuation technique (Proposition 8.1.2) to obtain the price process $\Pi_X(t)$ of any sufficiently integrable $T$-contingent claim $X$ by computing the $Q$-expectation, i.e.
$$\Pi_X(t) = \mathbb{E}_Q\left[\left. X e^{-\int_t^T r(u)\,du}\,\right|\,\mathcal{F}_t\right]. \tag{8.11}$$
If, additionally, the contingent claim is of the form $X = \Phi(r(T))$ with a sufficiently smooth function $\Phi$, we can again use the Feynman-Kac formula to obtain:
Proposition 8.2.2 (Term-structure Equation). Consider $T$-contingent claims of the form $X = \Phi(r(T))$. Then arbitrage-free price processes are given by $\Pi_X(t) = F(t, r(t))$, where $F$ is the solution of the partial differential equation
$$F_t + aF_r + \frac{b^2}{2}F_{rr} - rF = 0 \tag{8.12}$$
with terminal condition $F(T, r) = \Phi(r)$ for all $r \in \mathbb{R}$. In particular, $T$-bond prices are given by $p(t, T) = F(t, r(t); T)$, with $F$ solving (8.12) and terminal condition $F(T, r; T) = 1$.
Suppose we want to evaluate the price of a European call option with maturity $S$ and strike $K$ on an underlying $T$-bond. This means we have to price the $S$-contingent claim
$$X = \max\{p(S, T) - K, 0\}.$$
We first have to find the price process $p(t, T) = F(t, r; T)$ by solving (8.12) with terminal condition $F(T, r; T) = 1$. Secondly, we use the risk-neutral valuation principle to obtain $\Pi_X(t) = G(t, r)$, with $G$ solving
$$G_t + aG_r + \frac{b^2}{2}G_{rr} - rG = 0 \quad\text{and}\quad G(S, r) = \max\{F(S, r; T) - K, 0\}, \quad \forall r \in \mathbb{R}.$$
So we are clearly in need of efficient methods of solving the above partial differential equations, or from a modelling point of view, we need short-rate models that facilitate this computational task. Fortunately, there is a class of models, exhibiting an affine term structure (ATS), which allows for simplification.
Definition 8.2.1. If bond prices are given as
$$p(t, T) = \exp\{A(t, T) - B(t, T)r\}, \qquad 0 \le t \le T,$$
where $A(t, T)$ and $B(t, T)$ are deterministic functions, we say that the model possesses an affine term structure.
Assuming that we have such a model in which both $a$ and $b^2$ are affine in $r$, say $a(t, r) = \alpha(t) - \beta(t)r$ and $b(t, r) = \sqrt{\gamma(t) + \delta(t)r}$, we find that $A$ and $B$ are given as solutions of ordinary differential equations,
$$A_t(t, T) - \alpha(t)B(t, T) + \frac{\gamma(t)}{2}B^2(t, T) = 0, \qquad A(T, T) = 0,$$
$$(1 + B_t(t, T)) - \beta(t)B(t, T) - \frac{\delta(t)}{2}B^2(t, T) = 0, \qquad B(T, T) = 0.$$
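For a concrete check of the two ordinary differential equations, consider the constant-coefficient Vasicek case $dr = (\alpha - \beta r)\,dt + \mathit{vol}\,dW$, i.e. $\delta = 0$ and $\gamma = \mathit{vol}^2$ in the notation above. The sketch below integrates the system backwards from the terminal conditions with a classical Runge-Kutta scheme and compares the result with the well-known Vasicek closed form. The parameter values and function names are assumptions made for illustration only.

```python
import math

def vasicek_closed_form(t, T, alpha, beta, vol):
    """Closed-form A(t, T), B(t, T) for dr = (alpha - beta r) dt + vol dW."""
    tau = T - t
    B = (1.0 - math.exp(-beta * tau)) / beta
    A = (B - tau) * (alpha * beta - 0.5 * vol ** 2) / beta ** 2 - vol ** 2 * B ** 2 / (4.0 * beta)
    return A, B

def solve_affine_odes(t, T, alpha, beta, vol, steps=2000):
    """Integrate A_t = alpha B - vol^2 B^2 / 2 and B_t = beta B - 1 backwards from A = B = 0 at T."""
    def rhs(A, B):
        return alpha * B - 0.5 * vol ** 2 * B ** 2, beta * B - 1.0
    h = (t - T) / steps  # negative step: we move from T down to t
    A, B = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(A, B)
        k2 = rhs(A + 0.5 * h * k1[0], B + 0.5 * h * k1[1])
        k3 = rhs(A + 0.5 * h * k2[0], B + 0.5 * h * k2[1])
        k4 = rhs(A + h * k3[0], B + h * k3[1])
        A += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        B += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return A, B

def affine_bond_price(t, T, r, alpha, beta, vol):
    """p(t, T) = exp{A(t, T) - B(t, T) r} as in Definition 8.2.1."""
    A, B = vasicek_closed_form(t, T, alpha, beta, vol)
    return math.exp(A - B * r)

# Hypothetical parameters: alpha = 0.02, beta = 0.5, vol = 0.01, five-year bond.
A_num, B_num = solve_affine_odes(0.0, 5.0, 0.02, 0.5, 0.01)
A_exact, B_exact = vasicek_closed_form(0.0, 5.0, 0.02, 0.5, 0.01)
```

The numerical and closed-form pairs should agree to high accuracy, which confirms the signs in the two equations for this special case; with time-dependent coefficients only the right-hand side `rhs` would need to change.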
The equation for $B$ is a Riccati equation, which can be solved analytically, see Ince (1944), §2.15, 12.51, A.21. Using the solution for $B$ we get $A$ by integrating. Examples of short-rate models exhibiting an affine term structure include the following.
1. Vasicek model: $dr = (\alpha - \beta r)\,dt + \gamma\,dW$;
2. Cox-Ingersoll-Ross (CIR) model: $dr = (\alpha - \beta r)\,dt + \delta\sqrt{r}\,dW$;
3. Ho-Lee model: $dr = \alpha(t)\,dt + \gamma\,dW$;
4. Hull-White (extended Vasicek) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \gamma(t)\,dW$;
5. Hull-White (extended CIR) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \delta(t)\sqrt{r}\,dW$.
For further development and discussion we refer the reader to Björk (1995), Brown and Schaefer (1995), Duffie (1992) and Duffie and Kan (1995). In many cases evaluating the risk-neutral formula (8.13) directly is easier than solving the various partial differential equations. We analyze the Hull-White (extended Vasicek) models (sometimes called Gaussian models) a little further in this spirit, following Rogers (1995). Let us assume that $\alpha, \beta, \gamma: \mathbb{R}_+ \to \mathbb{R}_+$ are locally bounded functions. Then, by a slight extension of the argument presented in §5.8, we can solve explicitly for $r$. Setting $K(t) = \int_0^t \beta(u)\,du$, we find that
�
,- K,,)
{ I r(o) +
e K ( u ) a( u)du +
I
e K ( u ) 1( u )dW( u)
}.
This is a Gaussian process (compare §5.2) with mean and covariance functions given by
$$\mathbb{E}_Q(r(t)) = e^{-K(t)}\left\{r(0) + \int_0^t e^{K(u)}\alpha(u)\,du\right\},$$
$$\mathrm{Cov}_Q(r(s), r(t)) = e^{-K(s) - K(t)}\int_0^{s \wedge t} e^{2K(u)}\gamma^2(u)\,du.$$
Now $\int_0^t r(u)\,du$ has a normal distribution $N(m(t), v(t))$, with
$$m(t) = \int_0^t e^{-K(s)}\left\{r(0) + \int_0^s e^{K(u)}\alpha(u)\,du\right\}ds$$
and
$$v(t) = 2\int_0^t\!\!\int_0^s\!\!\int_0^u e^{-K(s) - K(u) + 2K(y)}\gamma^2(y)\,dy\,du\,ds.$$
So an application of the risk-neutral valuation principle (using Formula (8.11) with $X = 1$) gives the bond prices $p(0, T)$ as the moment generating function of a normally distributed random variable evaluated at $-1$, i.e.
$$p(0, T) = \mathbb{E}_Q\left[\exp\left\{-\int_0^T r(u)\,du\right\}\right] = \exp\left\{-m(T) + \tfrac{1}{2}v(T)\right\}.$$
From this we can deduce that we have an affine term structure with functions
$$B(t, T) = \int_t^T \exp\{-(K(u) - K(t))\}\,du,$$
$$A(t, T) = -\int_t^T\!\!\int_t^u \left\{\alpha(s)e^{K(s) - K(u)} - \int_t^s \gamma^2(y)e^{-K(s) - K(u) + 2K(y)}\,dy\right\}ds\,du.$$
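The Gaussian bond-price formula $p(0, T) = \exp\{-m(T) + \tfrac{1}{2}v(T)\}$, with $m$ and $v$ the mean and variance of $\int_0^T r(u)\,du$, can be evaluated by straightforward quadrature. The sketch below is a hedged illustration: it uses running trapezoidal sums on a uniform grid, takes the coefficient functions as Python callables, and checks the result against the constant-coefficient Vasicek closed form. Grid size, parameter values and function names are assumptions.

```python
import math

def cumulative_trapezoid(values, step):
    """Running trapezoidal integral of equally spaced samples, starting from 0."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * step * (left + right))
    return out

def gaussian_bond_price(r0, alpha, beta, gamma, maturity, steps=4000):
    """p(0, T) = exp(-m(T) + v(T) / 2) for dr = (alpha(t) - beta(t) r) dt + gamma(t) dW."""
    h = maturity / steps
    grid = [i * h for i in range(steps + 1)]
    K = cumulative_trapezoid([beta(s) for s in grid], h)  # K(t) = integral of beta
    # Mean of r(s): exp(-K(s)) * (r0 + integral_0^s exp(K(u)) alpha(u) du).
    drift = cumulative_trapezoid([math.exp(k) * alpha(s) for k, s in zip(K, grid)], h)
    mean_r = [math.exp(-k) * (r0 + d) for k, d in zip(K, drift)]
    m = cumulative_trapezoid(mean_r, h)[-1]
    # Variance: 2 * triple integral, built from the innermost integral outwards.
    inner = cumulative_trapezoid([math.exp(2.0 * k) * gamma(s) ** 2 for k, s in zip(K, grid)], h)
    middle = cumulative_trapezoid([math.exp(-k) * q for k, q in zip(K, inner)], h)
    v = 2.0 * cumulative_trapezoid([math.exp(-k) * q for k, q in zip(K, middle)], h)[-1]
    return math.exp(-m + 0.5 * v)

def vasicek_bond_price(r0, alpha, beta, vol, maturity):
    """Closed-form Vasicek price, used here only as a reference value."""
    B = (1.0 - math.exp(-beta * maturity)) / beta
    A = (B - maturity) * (alpha * beta - 0.5 * vol ** 2) / beta ** 2 - vol ** 2 * B ** 2 / (4.0 * beta)
    return math.exp(A - B * r0)

# Hypothetical constant coefficients passed as functions of time.
quadrature_price = gaussian_bond_price(0.03, lambda t: 0.02, lambda t: 0.5, lambda t: 0.01, 5.0)
reference_price = vasicek_bond_price(0.03, 0.02, 0.5, 0.01, 5.0)
```

Because the coefficients enter as callables, the same routine covers time-dependent $\alpha$, $\beta$ and $\gamma$, for which no simple closed form may be available; the constant case merely provides a value to compare against.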
We also see from (8.13) that the bond price processes have a log-Gaussian distribution, which makes it easy to price contingent claims. For example the price process of a European call option with maturity $S$ and strike $K$ on an underlying $T$-bond is given by the risk-neutral valuation formula (8.11) as
$$\Pi_X(t) = \mathbb{E}_Q\left[\left. e^{-\int_t^S r(u)\,du}\max\{p(S, T) - K, 0\}\,\right|\,\mathcal{F}_t\right]$$
and can be evaluated in closed form.
Remark 8.2.2. (i) Clearly the main point of criticism in dealing with Gaussian models is that there is a positive probability that interest rates become negative (hardly observed in practice!). Negative interest rates are avoided by modifying the variance structure of Gaussian models, e.g. to a specification as in the Hull-White (extended Cox-Ingersoll-Ross) model (also called squared Gaussian), or to a lognormal model of short-rate dynamics $dr = ra(t)\,dt + rb(t)\,dW$ as in the Black-Derman-Toy model (see Black, Derman, and Toy (1990)). While the first modification still exhibits an affine term structure, the second leads to serious computational problems. (ii) Interest rates have a tendency to return to some long-run level, a phenomenon known as mean reversion. On the modelling side this is reflected in specifying the drift parameter $a(t, r(t)) = \alpha(t) - \beta(t)r(t)$; see Chan et al. (1992) for a discussion of empirical evidence of mean reversion in short rates. (iii) We only considered single-factor models here, i.e. models trying to explain the term structure by assuming a single underlying state variable. This assumption is open to obvious criticism: it implies that the long rate is a deterministic function of the spot rate, and that prices of bonds of different maturities are perfectly correlated. Various multi-factor models have been proposed to overcome these shortcomings. In these models, additional factors, such as the rate of inflation, volatility of the short rate and short-term mean of the short rate are used to get an improved fit to reality, but at the same time the computational tractability of such models decreases. We refer the reader to Chen (1996), Duffie (1996), El Karoui, Myneni, and Viswanathan (1992a) and El Karoui, Myneni, and Viswanathan (1992b) for further discussion. (iv) Interest-rate models are often categorized as 'equilibrium' or 'arbitrage' models. As Rogers (1995) remarks, these approaches are essentially equivalent (see also Duffie (1996), Exercise 9.3).
8.2.3 Extensions: Multi-Factor Models
The one-factor model is of limited use in modelling the empirical properties of the whole term structure. This motivated the extension of the approach to multi-factor models. In this context we assume that the short rate is of form
$$r_t = R(X_t, t), \qquad 0 \le t,$$
where $X$ is an Itô process in $\mathbb{R}^d$ solving a stochastic differential equation of the form $dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t$ with suitable functions $R$, $\mu$, $\sigma$ and a $d$-dimensional Brownian motion $W$. We now have the economic interpretation that the current short rate depends on $d$ factors (state variables), which can be interpreted to be some macro-economic variables. Alternatively, we can use certain interest rates (medium (2 yrs) or long (10 yrs) yield) or coefficients of the short-rate process (stochastic mean reversion level or stochastic volatility) to obtain a better fit of the term structure. To price a derivative security, we can use the risk-neutral valuation approach, i.e. the arbitrage-free price process of a derivative with real-valued dividend function $h$ and terminal payoff function $g$ at maturity $T$ is given by
$$F(X_t, t) = \mathbb{E}_Q\left[\left.\int_t^T e^{-\int_t^s R(X_u, u)\,du}\,h(X_s, s)\,ds + e^{-\int_t^T R(X_u, u)\,du}\,g(X_T, T)\,\right|\,\mathcal{F}_t\right].$$
The same Feynman-Kac-based argument as above shows that we have the PDE characterization for the price function $F$
$$\mathcal{D}F(x, t) - R(x, t)F(x, t) + h(x, t) = 0, \qquad (x, t) \in \mathbb{R}^d \times [0, T),$$
with boundary condition
$$F(x, T) = g(x, T), \qquad x \in \mathbb{R}^d,$$
where
$$\mathcal{D}F(x, t) = F_t(x, t) + F_x(x, t)\mu(x, t) + \frac{1}{2}\mathrm{tr}\left[\sigma(x, t)\sigma(x, t)^{\top}F_{xx}(x, t)\right].$$
A popular example is the multi-factor Cox-Ingersoll-Ross model with specification of the factors as
$$dX_{i,t} = a_i(\mu_i - X_{i,t})\,dt + \sigma_i\sqrt{X_{i,t}}\,dW_{i,t}$$
(or possibly a more complex volatility structure). Here $a_i$ is as usual the speed of mean reversion, $\mu_i$ the long-term factor level and $\sigma_i$ the factor volatility. For the case $d = 2$ we have
$$\mathcal{D}F(x, t) = F_t(x, t) + F_{x_1}(x, t)\,(a_1(\mu_1 - x_1)) + F_{x_2}(x, t)\,(a_2(\mu_2 - x_2)) + \frac{1}{2}\left[\sigma_1^2 x_1 F_{x_1 x_1}(x, t) + \sigma_2^2 x_2 F_{x_2 x_2}(x, t)\right].$$
8.3 Heath-Jarrow-Morton Methodology
8.3.1 The Heath-Jarrow-Morton Model Class
As pointed out in Remark 8.2.2, modelling the term structure with only one explanatory variable leads to various undesirable properties of the model (to say the least). Various authors have proposed models with more than one state variable, e.g. the short rate and a long rate and/or intermediate rates. The Heath-Jarrow-Morton method (compare Heath, Jarrow, and Morton (1992)) is at the far end of this spectrum - they propose using the entire forward rate curve as their (infinite-dimensional) state variable. More precisely, for any fixed $T \le T^*$ the dynamics of instantaneous, continuously compounded forward rates $f(t, T)$ are exogenously given by
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t), \tag{8.5}$$
where $W$ is a $d$-dimensional Brownian motion with respect to the underlying (objective) probability measure $\mathbb{P}$ and $\alpha(t, T)$ resp. $\sigma(t, T)$ are adapted $\mathbb{R}$- resp. $\mathbb{R}^d$-valued processes (satisfying the conditions given in §8.1.2). For any fixed maturity $T$, the initial condition of the stochastic differential equation (8.5) is determined by the current value of the empirical (observed) forward rate for the future date $T$ which prevails at time 0. Observe that we have defined an infinite-dimensional stochastic system, and that by construction
we obtain a perfect fit to the observed term structure (thus avoiding the problem of inverting the yield curve). The exogenous specification of the family of forward rates $\{f(t, T); T > 0\}$ is equivalent to a specification of the entire family of bond prices $\{p(t, T); T > 0\}$; recall Lemma 8.1.1. Furthermore, by Proposition 8.1.1 we obtain the dynamics of the bond-price processes as
$$dp(t, T) = p(t, T)\{m(t, T)\,dt + S(t, T)\,dW(t)\}, \tag{8.14}$$
where
$$m(t, T) = r(t) + A(t, T) + \frac{1}{2}\|S(t, T)\|^2, \tag{8.15}$$
$A(t, T) = -\int_t^T \alpha(t, s)\,ds$ and $S(t, T) = -\int_t^T \sigma(t, s)\,ds$ (compare (8.7)). We now explore what conditions we must impose on the coefficients in order to ensure the existence of an equivalent martingale measure with respect to a suitable numeraire. By Theorem 6.1.1, we then could conclude that our bond market model is free of arbitrage. As a first possible choice of numeraire, we use the money-market account $B$ (assuming that there exists a measurable version of $f(t, t)$ in $[0, T^*]$), given by
$$B(t) = \exp\left\{\int_0^t f(u, u)\,du\right\} = \exp\left\{\int_0^t r(u)\,du\right\}.$$
So we allow investments in a savings account too (and we have to enlarge the set of trading strategies, compare §8.1.3). We must find an equivalent measure such that
$$Z(t, T) = \frac{p(t, T)}{B(t)}$$
is a martingale for every $0 \le T \le T^*$. We will call such a measure risk-neutral martingale measure to emphasize the dependence on the numeraire.
Theorem 8.3.1. Assume that the family of forward rates is given by (8.5). Then there exists a risk-neutral martingale measure if and only if there exists an adapted process $\lambda(t)$, with the properties that
(i) $\int_0^{T^*} \|\lambda(t)\|^2\,dt < \infty$, a.s. and $\mathbb{E}(L(T^*)) = 1$, with
$$L(t) = \exp\left\{-\int_0^t \lambda(u)'\,dW(u) - \frac{1}{2}\int_0^t \|\lambda(u)\|^2\,du\right\}. \tag{8.16}$$
(ii) For all $0 \le T \le T^*$ and for all $t \le T$, we have
$$\alpha(t, T) = \sigma(t, T)\int_t^T \sigma(t, s)\,ds + \sigma(t, T)\lambda(t). \tag{8.17}$$
Proof. Since we are working in a Brownian framework we know that any equivalent measure $Q \sim \mathbb{P}$ is given via a Girsanov density (8.16) (compare §5.8 and §6.2). Using the product rule (5.13) and Girsanov's Theorem 5.7.1, we find the $Q$-dynamics of $Z(t,T)$ (omitting the arguments) as:
$$dZ = Z\left(A + \tfrac{1}{2}\|S\|^2 - S\lambda\right)dt + Z S\,d\tilde W, \tag{8.18}$$
with $\tilde W$ a $Q$-Brownian motion. In order for $Z$ to be a local $Q$-martingale, the drift coefficient in (8.18) has to be zero, so we obtain
$$A(t,T) + \tfrac{1}{2}\|S(t,T)\|^2 = S(t,T)\lambda(t). \tag{8.19}$$
Taking the derivative with respect to $T$, we get
$$-\alpha(t,T) - \sigma(t,T)S(t,T) = -\sigma(t,T)\lambda(t),$$
which after rearranging (recall $S(t,T) = -\int_t^T \sigma(t,s)\,ds$) is (8.17). □
As in §8.2.1, it is possible to interpret $\lambda$ as a risk premium, which has to be exogenously specified to allow the choice of a particular risk-neutral martingale measure. In view of (8.17) this leads to a restriction on drift and volatility coefficients in the specification of the forward rate dynamics (8.5). The particular choice $\lambda \equiv 0$ means that we assume we model directly under a risk-neutral martingale measure $Q$. In that case the relations between the various infinitesimal characteristics for the forward rate are known as the 'Heath-Jarrow-Morton drift condition'.
Theorem 8.3.2 (Heath-Jarrow-Morton). Assume that $Q$ is a risk-neutral martingale measure for the bond market and that the forward-rate dynamics under $Q$ are given by
$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,d\tilde W(t), \tag{8.20}$$
with $\tilde W$ a $Q$-Brownian motion. Then we have:
(i) the Heath-Jarrow-Morton drift condition
$$\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds, \qquad 0 \le t \le T \le T^*,\ Q\text{-a.s.}, \tag{8.21}$$
(ii) and bond-price dynamics under $Q$ are given by
$$dp(t,T) = p(t,T)\,r(t)\,dt + p(t,T)\,S(t,T)\,d\tilde W(t),$$
with $S$ as in (8.7).
Proof. Just use $\lambda \equiv 0$ in Theorem 8.3.1. □

As already mentioned, calibration of a Heath-Jarrow-Morton (HJM) model is done by definition: today's forward rates are used as initial values for the stochastic differential equation (8.20) and so perfectly fitted. Even if we specify the forward-rate dynamics under a risk-neutral martingale measure $Q$, Theorem 8.3.2 tells us that we may freely specify the volatility structure. The drift parameters are then uniquely determined. Notice also that, since $\mathbb{P}$ and $Q$ are equivalent, the volatility process is the same under $\mathbb{P}$ as under $Q$. By definition $r(T) = f(T,T)$, and integrating the forward-rate dynamics (specified under a risk-neutral martingale measure), we get
$$r(T) = f(0,T) - \int_0^T \sigma(u,T)S(u,T)\,du + \int_0^T \sigma(u,T)\,d\tilde W(u).$$
This implies that in general the expectation of the future short-term rate under a risk-neutral martingale measure does not equal the current value $f(0,T)$ of the instantaneous forward rate. So under a spot martingale measure the unbiased forward rate form of the expectation hypothesis, $\mathbb{E}_Q(r(T)) = f(0,T)$, is not true.
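The drift condition and the resulting bias in the expected short rate can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the constant (Ho-Lee type) volatility are assumptions chosen for illustration, and the integrals are approximated by simple quadrature.

```python
import math

def hjm_drift(sigma, t, T, n=2000):
    """HJM drift alpha(t,T) = sigma(t,T) * int_t^T sigma(t,s) ds (trapezoid rule).

    `sigma` is a deterministic scalar volatility function sigma(t, T).
    """
    if T <= t:
        return 0.0
    h = (T - t) / n
    total = 0.5 * (sigma(t, t) + sigma(t, T))
    for i in range(1, n):
        total += sigma(t, t + i * h)
    return sigma(t, T) * total * h

def short_rate_bias(sigma, T, n=400):
    """E_Q[r(T)] - f(0,T) = int_0^T alpha(u,T) du (midpoint rule).

    The stochastic integral in the representation of r(T) has zero mean,
    so only the drift term contributes to the expectation.
    """
    h = T / n
    return sum(hjm_drift(sigma, (i + 0.5) * h, T, 200) for i in range(n)) * h

def ho_lee(t, T):
    """Hypothetical constant forward-rate volatility of 1% (illustration only)."""
    return 0.01

if __name__ == "__main__":
    # For constant volatility s: alpha(t,T) = s^2 (T - t), bias = s^2 T^2 / 2.
    print(hjm_drift(ho_lee, 1.0, 3.0))
    print(short_rate_bias(ho_lee, 2.0))
```

For a constant volatility the bias is strictly positive, which illustrates why the expected short rate under $Q$ differs from the forward rate.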
8.3.2 Forward Risk-neutral Martingale Measures
As pointed out in §6.1.4, for many valuation problems in the bond market it is more suitable to use the bond price process $p(t,T^*)$ as numeraire. We then have to find an equivalent probability measure $Q^*$ such that the auxiliary process
$$Z^*(t,T) = \frac{p(t,T)}{p(t,T^*)}, \qquad \forall t \in [0,T],$$
is a martingale under $Q^*$ for all $T \le T^*$. We will call such a measure a forward risk-neutral martingale measure. In this setting, a savings account is not used and the existence of a martingale measure $Q^*$ guarantees that there are no arbitrage opportunities between bonds of different maturities. In order to find sufficient conditions for the existence of such a martingale measure, we follow the same programme as above. By (8.14) bond price dynamics under the original probability measure $\mathbb{P}$ are given as
$$dp(t,T) = p(t,T)\,\{m(t,T)\,dt + S(t,T)\,dW(t)\},$$
with $m(t,T)$ as in (8.15). Now applying the product rule (5.13) to the quotient $p(t,T)/p(t,T^*)$ we find
$$dZ^*(t,T) = Z^*(t,T)\,\{\tilde m(t,T)\,dt + (S(t,T) - S(t,T^*))\,dW(t)\}, \tag{8.22}$$
with
$$\tilde m(t,T) = m(t,T) - m(t,T^*) - S(t,T^*)\,(S(t,T) - S(t,T^*)).$$
Again, any equivalent martingale measure $Q^*$ is given via a Girsanov density $L(t)$ defined by a function $\gamma(t)$ as in Theorem 8.3.1. So the drift coefficient of $Z^*(t,T)$ under $Q^*$ is given as
$$\tilde m(t,T) - (S(t,T) - S(t,T^*))\,\gamma(t).$$
Now for $Z^*(t,T)$ to be a $Q^*$-martingale this coefficient has to be zero, and replacing $\tilde m$ with its definition we get
$$(A(t,T) - A(t,T^*)) + \tfrac{1}{2}\left(\|S(t,T)\|^2 - \|S(t,T^*)\|^2\right) = (S(t,T^*) + \gamma(t))\,(S(t,T) - S(t,T^*)).$$
Written in terms of the coefficients of the forward-rate dynamics, this identity simplifies to
$$\int_T^{T^*} \alpha(t,s)\,ds + \frac{1}{2}\left\|\int_T^{T^*} \sigma(t,s)\,ds\right\|^2 = \gamma(t)\int_T^{T^*} \sigma(t,s)\,ds.$$
Taking the derivative with respect to $T$, we obtain
$$\alpha(t,T) + \sigma(t,T)\int_T^{T^*} \sigma(t,s)\,ds = \gamma(t)\,\sigma(t,T).$$
We have thus proved:
Theorem 8.3.3. Assume that the family of forward rates is given by (8.5). Then there exists a forward risk-neutral martingale measure if and only if there exists an adapted process $\gamma(t)$, with the properties (i) of Theorem 8.3.1, such that
$$\alpha(t,T) = \sigma(t,T)\,\big(S(t,T^*) - S(t,T) + \gamma(t)\big), \qquad 0 \le t \le T \le T^*.$$

We now investigate the relation between risk-neutral ($Q$) and forward risk-neutral ($Q^T$ for any $T \le T^*$) martingale measures. Suppose we modelled the forward rates under $Q$. Then by the change of numeraire technique (Theorem 6.1.6) and example (ii) in §6.1.4, we find
$$\eta(t) = \frac{dQ^T}{dQ}\bigg|_{\mathcal{F}_t} = \frac{p(t,T)}{p(0,T)B(t)} = \exp\left\{\int_0^t S(u,T)\,d\tilde W(u) - \frac{1}{2}\int_0^t \|S(u,T)\|^2\,du\right\}.$$
Using Formula (8.8) and Condition (8.21) (observe that a change of measure using $\eta$ implies $\tilde W(t) = W^T(t) + \int_0^t S(u,T)\,du$), we find that under $Q^T$
$$r(T) = f(0,T) + \int_0^T \sigma(u,T)\,dW^T(u).$$
This shows that for every $T$, the unbiased forward rate form of the expectation hypothesis under a forward risk-neutral measure $Q^T$ (we can use any $T \le T^*$) is true. Furthermore, by the risk-neutral valuation principle, we know that we can evaluate (attainable) contingent claims $X$, computing discounted expectations with respect to equivalent martingale measures. Now consider the special case $X = r(T)$. Using Corollary 6.1.2 we can relate the forward risk-neutral (using $Q^T$ as measure) pricing formula with the risk-neutral pricing formula (using a risk-neutral martingale measure $Q$) and get
$$\mathbb{E}_{Q^T}[r(T)\,|\,\mathcal{F}_t] = \frac{1}{p(t,T)}\,\mathbb{E}_Q\left[r(T)\,e^{-\int_t^T r(s)\,ds}\,\Big|\,\mathcal{F}_t\right].$$
This implies
$$\mathbb{E}_{Q^T}[r(T)\,|\,\mathcal{F}_t] = -\frac{1}{p(t,T)}\,\mathbb{E}_Q\left[\frac{\partial}{\partial T}e^{-\int_t^T r(s)\,ds}\,\Big|\,\mathcal{F}_t\right] = -\frac{1}{p(t,T)}\,\frac{\partial}{\partial T}\,\mathbb{E}_Q\left[e^{-\int_t^T r(s)\,ds}\,\Big|\,\mathcal{F}_t\right] = -\frac{p_T(t,T)}{p(t,T)} = f(t,T).$$
Since $r(T) = f(T,T)$, we see that the forward rate is a martingale under $Q^T$ (compare Example (ii) in §6.1.4 for the general case of non-dividend-paying securities).

8.3.3 Completeness
We now give a brief indication about the question of market completeness for the bond market. In contrast to previous models, we now have (theoretically) a continuum of traded assets (one for each time of maturity $T$). Rather than going into detail on how to resolve the problems arising from the infinite number of basic securities, we will try to generate the finite situation familiar to us. (For discussion of the general problem see Björk (1997), Björk, Di Masi, Kabanov, and Runggaldier (1997) and Björk, Kabanov, and Runggaldier (1997).) As outlined in §8.1, we would like to choose 'basic maturities' $0 < T_1 < T_2 < \ldots < T_d$, and use the corresponding bond price processes as basic underlying processes. We have to investigate whether there exist maturities such that the market is complete with respect to trading strategies involving only these bonds and the savings account. This question is linked to uniqueness of the martingale measure.
Proposition 8.3.1. The following conditions are equivalent:
(i) The risk-neutral martingale measure is unique.
(ii) There exist maturities $T_1, \ldots, T_d$ such that the matrix $\{\sigma_i(t,T_j)\}_{i,j=1,\ldots,d}$ is non-singular.
(iii) There exist maturities $T_1, \ldots, T_d$ such that the matrix $\{S_i(t,T_j)\}_{i,j=1,\ldots,d}$ is non-singular.

Proof. The relevant equation to consider is (8.17) from Theorem 8.3.1, namely
$$\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds + \sigma(t,T)\lambda(t) = \sum_{i=1}^d \sigma_i(t,T)\int_t^T \sigma_i(t,s)\,ds + \sum_{i=1}^d \sigma_i(t,T)\lambda_i(t).$$
Taking $\alpha$ and $\sigma$ as exogenously given, this is for each $t$ an infinite-dimensional linear system of equations for the determination of the $d$-dimensional vector $\lambda(t)$. So we cannot freely specify all drift terms $\alpha(\cdot,T)$ and volatilities $\sigma(\cdot,T)$. However, given the benchmark maturities $T_1, \ldots, T_d$, we are able to solve the system uniquely, i.e.
$$\alpha(t,T_j) = \sum_{i=1}^d \sigma_i(t,T_j)\int_t^{T_j} \sigma_i(t,s)\,ds + \sum_{i=1}^d \sigma_i(t,T_j)\lambda_i(t), \qquad j = 1, \ldots, d.$$
(Note that given the $\lambda$ the remaining $\alpha(\cdot,T)$ are specified by the volatility structure.) The equivalence of (iii) follows from Equation (8.19) in the proof of Theorem 8.3.1. □

Of course, the question of completeness is now only transferred to finding such benchmark maturities. Answering this question in detail is beyond the scope of this text, and we only cite the relevant result from Björk (1997), Björk, Di Masi, Kabanov, and Runggaldier (1997) and Björk, Kabanov, and Runggaldier (1997).
Proposition 8.3.2. Assume that the functions $\sigma_1(t,T), \ldots, \sigma_d(t,T)$ are deterministic and real-analytic in the $t$ and $T$ variables, and furthermore, for each $t$ are independent as functions of $T$. Then it is possible to choose maturities $T_1, \ldots, T_d$ such that the matrix $\{S_i(t,T_j)\}_{i,j}$ is non-singular. Apart from a finite set of 'forbidden' points, these maturities can be chosen freely as long as they are distinct.

We are now in a position to use the general theory to conclude:

Theorem 8.3.4. Under the conditions of Proposition 8.3.2 and with respect to bond trading strategies involving only bonds of maturities $T_1, \ldots, T_d$ described by Proposition 8.3.2 (and the savings account), the bond market model is complete.

Proof. By Theorem 8.3.1 and Propositions 8.3.1 and 8.3.2, there exists a unique risk-neutral martingale measure. Repetition of the arguments presented to prove Theorem 6.2.2 shows that uniqueness of the equivalent martingale measure implies completeness (in the restricted sense) of the market model. □

Remark 8.3.1. From a practical point of view the theoretical question of market uniqueness is far less important than the question of attainability. Recall that in an arbitrage-free market model any attainable contingent claim admits a unique arbitrage price given by the risk-neutral valuation formula (with respect to any martingale measure). However, without further specification there may only be replicating strategies involving infinitely many bonds (maybe one has to use different bonds for every $t$). In the complete setting above we know that we can replicate any contingent claim using the set of benchmark bonds for hedging.
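The benchmark system from the proof of Proposition 8.3.1 can be illustrated for $d = 2$. The sketch below is only an illustration under stated assumptions: the two volatility functions, their parameters `A1`, `A2`, `B` and all function names are hypothetical, and the $2 \times 2$ system is solved by Cramer's rule. It builds drifts from a known $\lambda$ via (8.17) and then recovers $\lambda$ from two benchmark maturities.

```python
import math

# Hypothetical two-factor volatility structure (illustration only):
#   sigma_1(t, T) = A1,   sigma_2(t, T) = A2 * exp(-B * (T - t)).
A1, A2, B = 0.01, 0.02, 0.5

def vols(t, T):
    """Row (sigma_1(t,T), sigma_2(t,T))."""
    return (A1, A2 * math.exp(-B * (T - t)))

def integrated_vols(t, T):
    """Row (int_t^T sigma_1(t,s) ds, int_t^T sigma_2(t,s) ds) in closed form."""
    tau = T - t
    return (A1 * tau, A2 * (1.0 - math.exp(-B * tau)) / B)

def drift(t, T, lam):
    """alpha(t,T) from (8.17) for a given market price of risk vector lam."""
    s, integ = vols(t, T), integrated_vols(t, T)
    return sum(s[k] * (integ[k] + lam[k]) for k in range(2))

def solve_lambda(t, T1, T2, alpha1, alpha2):
    """Solve the 2x2 benchmark system for (lambda_1, lambda_2)."""
    rows = [vols(t, T1), vols(t, T2)]
    rhs = []
    for T, alpha in ((T1, alpha1), (T2, alpha2)):
        s, integ = vols(t, T), integrated_vols(t, T)
        rhs.append(alpha - sum(s[k] * integ[k] for k in range(2)))
    det = rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]
    if abs(det) < 1e-15:
        raise ValueError("volatility matrix is singular for these maturities")
    lam1 = (rhs[0] * rows[1][1] - rows[0][1] * rhs[1]) / det
    lam2 = (rows[0][0] * rhs[1] - rows[1][0] * rhs[0]) / det
    return (lam1, lam2)

if __name__ == "__main__":
    lam_true = (0.1, -0.3)
    a1, a2 = drift(0.0, 1.0, lam_true), drift(0.0, 5.0, lam_true)
    print(solve_lambda(0.0, 1.0, 5.0, a1, a2))
```

Choosing two equal maturities makes the matrix singular, which mirrors the requirement that the benchmark maturities be distinct.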
8.4 Pricing and Hedging Contingent Claims

8.4.1 Short-rate Models
We consider models under a risk-neutral martingale measure. Given a $T$-contingent claim of the form $X = \Phi(r(T))$ ($\Phi$ sufficiently regular), we can evaluate the risk-neutral pricing formula (or solve the term structure equation, Proposition 8.2.2) to obtain the value process $F(t,r(t))$ of $X$. We can construct a $\Delta$-hedging strategy covering ourselves against movements in the interest rate $r(t)$, the underlying factor, by using another financial asset of the form $Y = \Psi(r(T))$ ($\Psi$ sufficiently regular) with value process $G(t,r(t))$. The intuitive idea is that we can define a predictable process $\varphi(t)$ by considering a portfolio with value process $V_\varphi(t) = F(t,r(t)) + \varphi(t)\,G(t,r(t))$. This position has a sensitivity with respect to the short rate given as
$$\frac{\partial V_\varphi}{\partial r} = F_r(t,r(t)) + \varphi(t)\,G_r(t,r(t)).$$
So we choose
$$\varphi(t) = -\frac{F_r(t,r(t))}{G_r(t,r(t))}$$
(assume $G_r \neq 0$). One can now use the bank account together with the financial asset $Y$ to construct (under technical conditions) a self-financing portfolio with initial value $F(0,r(0))$. This construction shows the restrictions of single-factor models since any derivative can in principle be used to ($\Delta$-)hedge any other.
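The hedge ratio $\varphi = -F_r/G_r$ can be approximated by finite differences when $F$ and $G$ are available as functions of $(t, r)$. The following is a minimal sketch; the function `hedge_ratio`, the step size `h` and the two exponential-affine value functions are assumptions made for illustration, not a model taken from the text.

```python
import math

def hedge_ratio(F, G, t, r, h=1e-5):
    """Return phi = -F_r / G_r using central finite differences in r."""
    F_r = (F(t, r + h) - F(t, r - h)) / (2.0 * h)
    G_r = (G(t, r + h) - G(t, r - h)) / (2.0 * h)
    if G_r == 0.0:
        raise ValueError("hedging asset has zero sensitivity to the short rate")
    return -F_r / G_r

def portfolio_sensitivity(F, G, phi, t, r, h=1e-5):
    """Sensitivity of V = F + phi * G with respect to r (should be close to 0)."""
    V_up = F(t, r + h) + phi * G(t, r + h)
    V_down = F(t, r - h) + phi * G(t, r - h)
    return (V_up - V_down) / (2.0 * h)

# Hypothetical value functions of exponential-affine form exp(-b * r).
def F_example(t, r):
    return math.exp(-2.0 * r)

def G_example(t, r):
    return math.exp(-5.0 * r)

if __name__ == "__main__":
    phi = hedge_ratio(F_example, G_example, 0.0, 0.03)
    print(phi, portfolio_sensitivity(F_example, G_example, phi, 0.0, 0.03))
```

The negative hedge ratio reflects that both example assets lose value when the short rate rises, so the hedge is a short position in $Y$.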
8.4.2 Gaussian HJM Framework
Assume that the dynamics of the forward rate are given under a risk-neutral martingale measure $Q$ by
$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,d\tilde W(t), \qquad f(0,T) = \hat f(0,T),$$
with all processes real-valued. We restrict the class of models by assuming that the forward rate's volatility is deterministic. The HJM-drift condition (8.21) and the integrated form of the forward rate (compare (8.8)) lead to
$$f(t,t) = r(t) = f(0,t) + \int_0^t \big(-\sigma(u,t)S(u,t)\big)\,du + \int_0^t \sigma(u,t)\,d\tilde W(u),$$
which implies that the short rate (as well as the forward rates $f(t,T)$) have Gaussian probability laws (hence the terminology). By Theorem 8.3.2, bond price dynamics under $Q$ are given by
$$dp(t,T) = p(t,T)\,\{r(t)\,dt + S(t,T)\,d\tilde W(t)\},$$
which we can solve in special cases (see Exercise 8.5 below) explicitly for $p(t,T)$.

To price options on zero-coupon bonds, we use the change of numeraire technique. Consider a European call $C$ on a $T^*$-bond with maturity $T \le T^*$ and strike $K$. So we consider the $T$-contingent claim
$$X = (p(T,T^*) - K)^+. \tag{8.23}$$
By Proposition 6.1.4, the price of this call at time 0 is given as
$$C(0) = p(0,T^*)\,Q^*(A) - K\,p(0,T)\,Q^T(A),$$
with $A = \{\omega : p(T,T^*) > K\}$ and $Q^T$ resp. $Q^*$ the $T$- resp. $T^*$-forward risk-neutral measure. Now
$$\tilde Z(t,T) = \frac{p(t,T^*)}{p(t,T)}$$
has $Q$-dynamics (omitting the arguments and writing $S^*$ for $S(t,T^*)$)
$$d\tilde Z = \tilde Z\,\{S(S - S^*)\,dt - (S - S^*)\,d\tilde W(t)\},$$
so $\tilde Z$ has a deterministic variance coefficient. Now
$$Q^T(p(T,T^*) \ge K) = Q^T\left(\frac{p(T,T^*)}{p(T,T)} \ge K\right) = Q^T(\tilde Z(T,T) \ge K).$$
Since $\tilde Z(t,T)$ is a $Q^T$-martingale with $Q^T$-dynamics
$$d\tilde Z(t,T) = -\tilde Z(t,T)\,(S(t,T) - S(t,T^*))\,dW^T(t),$$
we find that under $Q^T$ (use the stochastic exponential, compare §5.7 and §5.8) (again $S = S(t,T)$, $S^* = S(t,T^*)$)
$$\tilde Z(T,T) = \frac{p(0,T^*)}{p(0,T)}\exp\left\{-\int_0^T (S - S^*)\,dW^T(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt\right\}$$
(with $W^T$ a $Q^T$-Brownian motion). The stochastic integral in the exponential is Gaussian (compare §5.2) with zero mean and variance
$$\Sigma^2(T) = \int_0^T (S(t,T) - S(t,T^*))^2\,dt.$$
So
$$Q^T(p(T,T^*) \ge K) = N(d_2),$$
with
$$d_2 = \frac{\log\left(\dfrac{p(0,T^*)}{K\,p(0,T)}\right) - \frac{1}{2}\Sigma^2(T)}{\sqrt{\Sigma^2(T)}}.$$
Similarly, for the first term
$$Z^*(t,T) = \frac{p(t,T)}{p(t,T^*)}$$
has $Q$-dynamics (compare (8.22))
$$dZ^* = Z^*\,\{S^*(S^* - S)\,dt + (S - S^*)\,d\tilde W(t)\},$$
and also a deterministic variance coefficient. Now
$$Q^*(p(T,T^*) \ge K) = Q^*\left(\frac{p(T,T)}{p(T,T^*)} \le \frac{1}{K}\right) = Q^*\left(Z^*(T,T) \le \frac{1}{K}\right).$$
Under $Q^*$, $Z^*(t,T)$ is a martingale with
$$dZ^*(t,T) = Z^*(t,T)\,(S(t,T) - S(t,T^*))\,dW^*(t),$$
so
$$Z^*(T,T) = \frac{p(0,T)}{p(0,T^*)}\exp\left\{\int_0^T (S - S^*)\,dW^*(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt\right\}.$$
Again we have a Gaussian variable with the (same) variance $\Sigma^2(T)$ in the exponential. Using this fact it follows (after some computations) that
$$Q^*(p(T,T^*) \ge K) = N(d_1),$$
with $d_1 = d_2 + \sqrt{\Sigma^2(T)}$. So we obtain:
Proposition 8.4.1. The price of the call option defined in (8.23) is given by
$$C(0) = p(0,T^*)\,N(d_1) - K\,p(0,T)\,N(d_2), \tag{8.24}$$
with parameters given as above.

Remark 8.4.1. (i) Observe that the above reasoning generalizes to assets $S(t)$ in place of $p(t,T^*)$ as long as $\tilde Z(t,T) = S(t)/p(t,T)$ has a deterministic volatility coefficient.
(ii) Since we have a closed formula for the option's price above, we can compute sensitivities with respect to the underlying in much the same way as in the standard Black-Scholes model. In particular, we can construct $\Delta$-neutral hedging portfolios.

8.4.3 Swaps
This section is devoted to the pricing of swaps. We consider the case of a forward swap settled in arrears. Such a contingent claim is characterized by:
• a fixed time $t$, the contract time,
• dates $T_0 < T_1 < \cdots < T_n$, equally distanced: $T_{i+1} - T_i = \delta$,
• $R$, a prespecified fixed rate of interest,
• $K$, a nominal amount.

A swap contract $S$ with $K$ and $R$ fixed for the period $T_0, \ldots, T_n$ is a sequence of payments, where the amount of money paid out at $T_{i+1}$, $i = 0, \ldots, n-1$, is defined by
$$X_{i+1} = K\delta\,(L(T_i,T_i) - R).$$
The floating rate over $[T_i, T_{i+1}]$ observed at $T_i$ is a simple rate defined as
$$p(T_i,T_{i+1}) = \frac{1}{1 + \delta L(T_i,T_i)}.$$
We do not need to specify a particular interest-rate model here; all we need is the existence of a risk-neutral martingale measure. Using the risk-neutral pricing formula we obtain (we may use $K = 1$),
$$\Pi(t,S) = \sum_{i=1}^n \mathbb{E}_Q\left[e^{-\int_t^{T_i} r(s)\,ds}\,\delta\,(L(T_{i-1},T_{i-1}) - R)\,\Big|\,\mathcal{F}_t\right]$$
$$= \sum_{i=1}^n \mathbb{E}_Q\left[\mathbb{E}_Q\left[e^{-\int_{T_{i-1}}^{T_i} r(s)\,ds}\,\Big|\,\mathcal{F}_{T_{i-1}}\right] e^{-\int_t^{T_{i-1}} r(s)\,ds}\left(\frac{1}{p(T_{i-1},T_i)} - (1 + \delta R)\right)\Big|\,\mathcal{F}_t\right]$$
$$= \sum_{i=1}^n \big(p(t,T_{i-1}) - (1 + \delta R)\,p(t,T_i)\big) = p(t,T_0) - \sum_{i=1}^n c_i\,p(t,T_i),$$
with $c_i = \delta R$, $i = 1, \ldots, n-1$, and $c_n = 1 + \delta R$. So a swap is a linear combination of zero-coupon bonds, and we obtain its price accordingly. This again shows the power of risk-neutral pricing. Using the linearity of the expectation operator we can reduce complicated claims to sums of simpler ones.

8.4.4 Caps
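An interest cap is a contract where the seller of the contract promises to pay a certain amount of cash to the holder of the contract if the interest rate exceeds a certain predetermined level (the cap rate) at some future date. A cap can be broken down into a series of caplets. A caplet is a contract written at $t$, in force between $[T_0,T_1]$, $\delta = T_1 - T_0$; the nominal amount is $K$, and the cap rate is denoted by $R$. The relevant interest rate (LIBOR, for instance) is observed in $T_0$ and defined by
$$p(T_0,T_1) = \frac{1}{1 + \delta L(T_0,T_0)}.$$
A caplet $C$ is a $T_1$-contingent claim with payoff $X = K\delta\,(L(T_0,T_0) - R)^+$. Writing $L = L(T_0,T_0)$, $p = p(T_0,T_1)$, $R^* = 1 + \delta R$, we have $L = (1 - p)/(\delta p)$, and (assuming $K = 1$)
$$X = \delta\,(L - R)^+ = \delta\left(\frac{1 - p}{\delta p} - R\right)^+ = \left(\frac{1}{p} - R^*\right)^+.$$
The risk-neutral pricing formula leads to
$$\Pi_C(t) = \mathbb{E}_Q\left[e^{-\int_t^{T_1} r(s)\,ds}\left(\frac{1}{p} - R^*\right)^+\Big|\,\mathcal{F}_t\right]$$
$$= \mathbb{E}_Q\left[\mathbb{E}_Q\left[e^{-\int_{T_0}^{T_1} r(s)\,ds}\,\Big|\,\mathcal{F}_{T_0}\right] e^{-\int_t^{T_0} r(s)\,ds}\left(\frac{1}{p} - R^*\right)^+\Big|\,\mathcal{F}_t\right]$$
$$= \mathbb{E}_Q\left[p(T_0,T_1)\,e^{-\int_t^{T_0} r(s)\,ds}\left(\frac{1}{p} - R^*\right)^+\Big|\,\mathcal{F}_t\right] = \mathbb{E}_Q\left[e^{-\int_t^{T_0} r(s)\,ds}\,(1 - pR^*)^+\,\Big|\,\mathcal{F}_t\right]$$
$$= R^*\,\mathbb{E}_Q\left[e^{-\int_t^{T_0} r(s)\,ds}\left(\frac{1}{R^*} - p\right)^+\Big|\,\mathcal{F}_t\right].$$
So a caplet is equivalent to $R^*$ put options on a $T_1$-bond with maturity $T_0$ and strike $1/R^*$.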
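The reduction of a caplet to bond puts can be combined with the Gaussian bond-option formula of Proposition 8.4.1. The sketch below is an illustration under explicit assumptions: the forward-rate volatility is taken to be a constant `sigma` (so that $S(t,T_0) - S(t,T_1) = \sigma\delta$ and $\Sigma^2(T_0) = \sigma^2\delta^2 T_0$), the put is obtained from the call by put-call parity, and all function names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bond_call(p0_T, p0_Tstar, K, Sigma2):
    """Time-0 call with expiry T and strike K on a T*-bond.

    Uses C(0) = p(0,T*) N(d1) - K p(0,T) N(d2) with d1 = d2 + sqrt(Sigma2).
    """
    root = math.sqrt(Sigma2)
    d2 = (math.log(p0_Tstar / (K * p0_T)) - 0.5 * Sigma2) / root
    d1 = d2 + root
    return p0_Tstar * norm_cdf(d1) - K * p0_T * norm_cdf(d2)

def bond_put(p0_T, p0_Tstar, K, Sigma2):
    """Put via put-call parity: P = C - p(0,T*) + K p(0,T)."""
    return bond_call(p0_T, p0_Tstar, K, Sigma2) - p0_Tstar + K * p0_T

def caplet_via_bond_put(p0_T0, p0_T1, R, delta, sigma, T0):
    """Caplet (unit nominal) as R* puts with strike 1/R* on the T1-bond.

    Assumes a constant forward-rate volatility sigma (an illustrative choice).
    """
    R_star = 1.0 + delta * R
    Sigma2 = (sigma * delta) ** 2 * T0
    return R_star * bond_put(p0_T0, p0_T1, 1.0 / R_star, Sigma2)

if __name__ == "__main__":
    # Hypothetical discount factors p(0,T0) = 0.97, p(0,T1) = 0.955.
    print(caplet_via_bond_put(0.97, 0.955, 0.02, 0.5, 0.01, 1.0))
```

As the volatility tends to zero the price approaches the intrinsic value $(p(0,T_0) - R^* p(0,T_1))^+$, which is a convenient sanity check.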
Black's Model for Caps. We discuss briefly this widely used approach. Recall that a caplet is a $T_1$-contingent claim. Instead of using a risk-neutral probability measure $Q$, we will now use a risk-neutral $T_1$-forward measure. Now
$$p(T_0,T_1) = \frac{1}{1 + \delta L(T_0,T_0)}$$
and so (for the current time $t$)
$$1 + \delta L(t,T_0) = \frac{p(t,T_0)}{p(t,T_1)} = Z(t,T_1).$$
Under $Q^{T_1}$, $Z(t,T_1)$ is a martingale with dynamics
$$dZ(t,T_1) = Z(t,T_1)\,(S(t,T_0) - S(t,T_1))\,dW^{T_1}(t)$$
(with $W^{T_1}$ a $Q^{T_1}$-Brownian motion). So
$$dL(t,T_0) = \delta^{-1} Z(t,T_1)\,(S(t,T_0) - S(t,T_1))\,dW^{T_1}(t) = \delta^{-1}(1 + \delta L(t,T_0))\,(S(t,T_0) - S(t,T_1))\,dW^{T_1}(t) = L(t,T_0)\,\Sigma(t,T_0)\,dW^{T_1}(t),$$
with
$$\Sigma(t,T_0) = \frac{1 + \delta L(t,T_0)}{\delta L(t,T_0)}\,(S(t,T_0) - S(t,T_1)).$$
So under the assumption that the volatility of the forward rate is deterministic (as e.g. in a Gaussian HJM-framework) we can repeat the arguments in §8.4.2. We find
$$\Pi_C(t) = \delta\,p(t,T_1)\,\big[L(t,T_0)\,N(d_1) - R\,N(d_2)\big], \tag{8.25}$$
where
$$d_1 = \frac{\log[L(t,T_0)/R] + \sigma_B^2(t,T_0)/2}{\sigma_B(t,T_0)}, \qquad d_2 = d_1 - \sigma_B(t,T_0),$$
and
$$\sigma_B^2(t,T_0) = \int_t^{T_0} (\Sigma(u,T_0))^2\,du.$$
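Black's caplet formula with the parameters $d_1$, $d_2$ and $\sigma_B$ defined above is straightforward to evaluate. The following sketch assumes unit nominal and takes the integrated volatility `sigma_B` as an input; the function names and the numbers in the usage example are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(p_t_T1, L, R, delta, sigma_B):
    """Black price of a caplet with unit nominal.

    p_t_T1  : discount factor p(t, T1)
    L       : forward LIBOR L(t, T0)
    R       : cap rate
    delta   : accrual period T1 - T0
    sigma_B : square root of the integrated variance over [t, T0]
    """
    d1 = (math.log(L / R) + 0.5 * sigma_B ** 2) / sigma_B
    d2 = d1 - sigma_B
    return delta * p_t_T1 * (L * norm_cdf(d1) - R * norm_cdf(d2))

def sigma_B_flat(vol, t, T0):
    """sigma_B for a volatility that is constant in time (an assumption)."""
    return vol * math.sqrt(T0 - t)

if __name__ == "__main__":
    s = sigma_B_flat(0.20, 0.0, 1.0)
    print(black_caplet(0.955, 0.03, 0.03, 0.5, s))
```

For an at-the-money caplet ($L = R$) the bracket reduces to $L\,(2N(\sigma_B/2) - 1)$, which gives a quick check of an implementation.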
Remark 8.4.2. If we want to price a cap using the above forward risk-neutral approach, we need a method of modelling all (relevant) forward LIBOR rates simultaneously. The solution to this challenging problem is presented in the following section.
8.5 Market Models of LIBOR- and Swap-rates

The so-called market models of LIBOR and swap rates have enjoyed increasing popularity during the last few years. The main reason for the success of this approach to term structure modelling, developed in a series of papers by Miltersen, Sandmann, and Sondermann (1997), Brace, Gatarek, and Musiela (1997) and Jamshidian (1997), can be seen in its consistency with the market practice of pricing caps, floors and swaptions by means of the Black formula, while at the same time providing a consistent and coherent framework for the joint modelling of a whole set of forward rates. Further aspects that set the class of market models apart from other interest-rate models are the relative ease of calibration to market data (e.g. term structures or cap prices), and the use of discretely compounded forward LIBOR and forward swap rates (both directly observable in the market) as fundamental quantities in the modelling process. By contrast, traditional short-rate or HJM-models are based on the description of the arbitrage-free dynamics of continuously compounded instantaneous short or forward rates, which are not market observables.

8.5.1 Description of the Economy
We start by defining the tenor structure $\mathcal{T} = \{T_0, \ldots, T_n\}$ as a set of maturities $T_i$ with $0 = T_0 < T_1 < \ldots < T_n$, where $T_n$ is the time horizon of our economy. A given tenor structure $\mathcal{T}$ is associated with a set $\{\tau_1, \ldots, \tau_n\}$ of year fractions, where $\tau_i = T_i - T_{i-1}$, $i = 1, \ldots, n$. We assume that in the financial market under consideration, there exist zero-coupon bonds $p(\cdot,T_i)$ of all maturities $T_i$, $i = 1, \ldots, n$. The discretely compounded forward LIBOR rate prevailing at time $t$ over the future period from $T_{i-1}$ to $T_i$ is defined by
$$L(t,T_{i-1}) = \frac{p(t,T_{i-1}) - p(t,T_i)}{\tau_i\,p(t,T_i)}.$$
We assume further that we are given a stochastic basis $(\Omega, \mathbb{F} = (\mathcal{F}_t), \mathcal{F}, \mathbb{P})$, on which a $d$-dimensional Brownian motion $W = (W(t))_{t \in [0,T_n]}$ is defined. The filtration $\mathbb{F}$ is the $\mathbb{P}$-augmentation of the natural filtration generated by $W$.
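The definition of the forward LIBOR rate translates directly into code. The sketch below is only an illustration: the function name and the list-based representation of the tenor structure and the discount curve are assumptions.

```python
def forward_libors(tenor, discount):
    """Forward LIBOR rates L(t, T_{i-1}), i = 1..n, from zero-coupon bond prices.

    tenor    : [T_0, ..., T_n], strictly increasing
    discount : [p(t,T_0), ..., p(t,T_n)] observed at the same time t
    Returns  : [L(t,T_0), ..., L(t,T_{n-1})]
    """
    if len(tenor) != len(discount):
        raise ValueError("tenor and discount must have the same length")
    rates = []
    for i in range(1, len(tenor)):
        tau_i = tenor[i] - tenor[i - 1]          # year fraction tau_i
        rates.append((discount[i - 1] - discount[i]) / (tau_i * discount[i]))
    return rates

if __name__ == "__main__":
    # Hypothetical annual curve built from simple rates of 2% and 3%.
    curve = [1.0, 1.0 / 1.02, 1.0 / (1.02 * 1.03)]
    print(forward_libors([0.0, 1.0, 2.0], curve))
```

Equivalently, $1 + \tau_i L(t,T_{i-1}) = p(t,T_{i-1})/p(t,T_i)$, so the rates can be read off ratios of adjacent discount factors.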
8.5.2 LIBOR Dynamics Under the Forward LIBOR Measure

We are now in the position to introduce the important notion of the forward LIBOR measure.

Definition 8.5.1. Let $T_k \in \mathcal{T}$. We call a $\mathbb{P}$-equivalent probability measure $Q^{T_k}$ the forward LIBOR measure for the maturity $T_k$, or more briefly $T_k$-forward measure, if all bond price processes
$$\left(\frac{p(t,T_i)}{p(t,T_k)}\right)_{t \in [0,\min\{T_i,T_k\}]}, \qquad i = 1, \ldots, n,$$
relative to the numeraire $p(\cdot,T_k)$ follow local martingales under $Q^{T_k}$. Correspondingly, the tuple $(Q^{T_k}, p(\cdot,T_k))$ is called the numeraire pair for the maturity $T_k$.
Now, bearing in mind the definition of $L(\cdot,\cdot)$, it can be easily observed that the process $(L(t,T_{k-1}))_{t \in [0,T_{k-1}]}$ of the forward LIBOR over the period from $T_{k-1}$ to $T_k$ is a local martingale under $Q^{T_k}$. So if we place ourselves in a diffusion setting, we can posit, for every $k \in \{1, \ldots, n\}$, the following driftless dynamics under the respective forward LIBOR measure $Q^{T_k}$:
$$dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot dW^{T_k}(t) = L(t,T_{k-1})\sum_{i=1}^d \sigma_i(t,T_{k-1})\,dW_i^{T_k}(t), \qquad 0 \le t \le T_{k-1}, \tag{8.26}$$
where $W^{T_k}$ is a $d$-dimensional Brownian motion with respect to $\mathbb{F}$ under the measure $Q^{T_k}$ and has the instantaneous covariance matrix $\rho = (\rho_{ij})_{i,j=1,\ldots,d} \in \mathbb{R}^{d \times d}$, i.e. quadratic covariation
$$d[W_i^{T_k}, W_j^{T_k}](t) = \rho_{ij}\,dt, \qquad i,j = 1, \ldots, d.$$
For simplicity, we assume that $\sigma : [0,T_{n-1}] \times \{T_1, \ldots, T_{n-1}\} \to \mathbb{R}^d$ is a bounded and deterministic function, with $\sigma(\cdot,\cdot) = (\sigma_1(\cdot,\cdot), \ldots, \sigma_d(\cdot,\cdot))$ a row vector. The unique strong solution of SDE (8.26) is easily seen to be
$$L(t,T_{k-1}) = L(0,T_{k-1}) \cdot \exp\left(\int_0^t \sigma(s,T_{k-1}) \cdot dW^{T_k}(s) - \frac{1}{2}\int_0^t \|\sigma(s,T_{k-1})\|^2\,ds\right), \qquad 0 \le t \le T_{k-1},$$
which is a strictly positive martingale assuming $L(0,T_{k-1})$ is positive. Obviously, $L(t,T_{k-1})$ is lognormally distributed for $0 \le t \le T_{k-1}$, a property that we will make use of later on in this section.
Our next step will now be to derive the dynamics of $L(\cdot,T_{k-1})$ under the $Q^{T_i}$ forward measure, with $T_i \in \mathcal{T}$. In the course of the derivation, we will exploit the following version of Girsanov's Theorem (Hunt and Kennedy (2000), Theorem 5.20; cf. Theorem 5.7.1 here):

Theorem 8.5.1 (Girsanov). Suppose $W$ is a $d$-dimensional $(\mathbb{F},\mathbb{P})$ Brownian motion, $Q \sim \mathbb{P}$, and the strictly positive $\mathbb{P}$-martingale $\zeta$ with $\zeta(t) = \frac{dQ}{d\mathbb{P}}\big|_{\mathcal{F}_t}$ has a continuous version. Then $\tilde W$ defined by
$$\tilde W_i = W_i - [W_i, X], \quad i = 1, \ldots, d, \qquad X(t) = \int_0^t \frac{d\zeta(s)}{\zeta(s)},$$
is a $d$-dimensional $(\mathbb{F},Q)$ Brownian motion.
This leads us to the main result of this section.

Theorem 8.5.2. Take $T_k \in \{T_1, \ldots, T_n\}$ as fixed and assume
$$dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot dW^{T_k}(t),$$
where $W^{T_k}$ is a $d$-dimensional $(\mathbb{F},Q^{T_k})$ Brownian motion with instantaneous covariance matrix $\rho = (\rho_{ij})_{i,j=1,\ldots,d} \in \mathbb{R}^{d \times d}$. Then the following relations for the LIBOR dynamics under the forward measure $Q^{T_i}$, $T_i \in \mathcal{T}$, hold:
$$i < k: \quad dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot \left(\sum_{j=i+1}^{k} \frac{\tau_j L(t,T_{j-1})}{1 + \tau_j L(t,T_{j-1})}\,\rho\,\sigma(t,T_{j-1})'\,dt + dW^{T_i}(t)\right),$$
$$i > k: \quad dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot \left(-\sum_{j=k+1}^{i} \frac{\tau_j L(t,T_{j-1})}{1 + \tau_j L(t,T_{j-1})}\,\rho\,\sigma(t,T_{j-1})'\,dt + dW^{T_i}(t)\right),$$
where $0 \le t \le \min\{T_i, T_{k-1}\}$ and $W^{T_i}$ is a $d$-dimensional $(\mathbb{F},Q^{T_i})$ Brownian motion with instantaneous covariance matrix $\rho$.
Proof. First we consider the case $i < k$. We may assume that $k \ge 2$, because $k = 1$ leads to $i = 0$ and $0 \le t \le T_0 = 0$, which is trivial. We start by deriving the LIBOR dynamics under $Q^{T_{k-1}}$. In order to obtain the Radon-Nikodym derivative of $Q^{T_{k-1}}$ with respect to $Q^{T_k}$, we employ the change-of-numeraire technique (see §6.1.4):
$$\frac{dQ^{T_{k-1}}}{dQ^{T_k}} = \frac{p(0,T_k)\,p(T_{k-1},T_{k-1})}{p(T_{k-1},T_k)\,p(0,T_{k-1})} = \frac{1 + \tau_k L(T_{k-1},T_{k-1})}{1 + \tau_k L(0,T_{k-1})}.$$
Now for $0 \le t \le T_{k-1}$,
$$\zeta(t) = \frac{dQ^{T_{k-1}}}{dQ^{T_k}}\bigg|_{\mathcal{F}_t} = \mathbb{E}_{Q^{T_k}}\left[\frac{1 + \tau_k L(T_{k-1},T_{k-1})}{1 + \tau_k L(0,T_{k-1})}\,\Big|\,\mathcal{F}_t\right] = \frac{1 + \tau_k L(t,T_{k-1})}{1 + \tau_k L(0,T_{k-1})},$$
where the last equation follows because $L(\cdot,T_{k-1})$ is a $Q^{T_k}$-martingale. The above version of Girsanov's Theorem tells us that the process $W^{T_{k-1}}$ defined by
$$W_i^{T_{k-1}} = W_i^{T_k} - [W_i^{T_k}, X], \qquad i = 1, \ldots, d,$$
with
$$X(t) = \int_0^t \frac{1 + \tau_k L(0,T_{k-1})}{1 + \tau_k L(s,T_{k-1})}\,d\left(\frac{1 + \tau_k L(s,T_{k-1})}{1 + \tau_k L(0,T_{k-1})}\right) = \int_0^t \frac{1}{1 + \tau_k L(s,T_{k-1})}\,d\big(1 + \tau_k L(s,T_{k-1})\big) = \int_0^t \frac{\tau_k L(s,T_{k-1})}{1 + \tau_k L(s,T_{k-1})}\,\sigma(s,T_{k-1}) \cdot dW^{T_k}(s),$$
is a $Q^{T_{k-1}}$-Brownian motion. Simplifying, we get
$$dW_i^{T_{k-1}}(t) = dW_i^{T_k}(t) - \frac{\tau_k L(t,T_{k-1})}{1 + \tau_k L(t,T_{k-1})}\sum_{j=1}^d \rho_{ij}\,\sigma_j(t,T_{k-1})\,dt, \qquad i = 1, \ldots, d,$$
or in vector form
$$dW^{T_{k-1}}(t) = dW^{T_k}(t) - \frac{\tau_k L(t,T_{k-1})}{1 + \tau_k L(t,T_{k-1})}\,dW^{T_k}(t)\,dW^{T_k}(t)'\,\sigma(t,T_{k-1})' = dW^{T_k}(t) - \frac{\tau_k L(t,T_{k-1})}{1 + \tau_k L(t,T_{k-1})}\,\rho\,\sigma(t,T_{k-1})'\,dt.$$
For the LIBOR dynamics under $Q^{T_{k-1}}$ we obtain
$$dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot dW^{T_k}(t) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot \left(\frac{\tau_k L(t,T_{k-1})}{1 + \tau_k L(t,T_{k-1})}\,\rho\,\sigma(t,T_{k-1})'\,dt + dW^{T_{k-1}}(t)\right),$$
which completes the proof for the case $i = k - 1$. The corresponding result for general $i < k - 1$ can now easily be obtained via backward induction by repeating the above argument. The case $i > k$ can be handled in a similar fashion. □
An immediate consequence of the above theorem is that a forward LIBOR process $L(\cdot,T_{k-1})$ is a lognormal martingale only under its respective forward measure $Q^{T_k}$. Put differently, there exists no measure under which all LIBORs are simultaneously lognormal. We further remark that, while it can be shown that there exists a solution for the stated system of SDEs, it is not known analytically. As a consequence, one has to resort to numerical methods such as Monte Carlo simulation when pricing certain complex derivatives that depend on the simultaneous realization of several LIBOR rates in the above setup. The relation given in the following corollary is central for the simulation of the LIBOR market model, as for simulation purposes one measure $Q^{T_k}$ has to be chosen, under which all forward LIBOR rates have to be evolved simultaneously.

Corollary 8.5.1. Under the assumptions of the above theorem, we find the following relation between the Brownian motions $W^{T_k}$ and $W^{T_{k-1}}$ under the respective measures $Q^{T_k}$ and $Q^{T_{k-1}}$:
$$dW^{T_{k-1}}(t) = dW^{T_k}(t) - \frac{\tau_k L(t,T_{k-1})}{1 + \tau_k L(t,T_{k-1})}\,\rho\,\sigma(t,T_{k-1})'\,dt.$$
Supposing that we choose e.g. the terminal measure $Q^{T_n}$ as reference, the above corollary helps us to inductively construct the LIBOR processes $L(\cdot,T_{n-1})$, $L(\cdot,T_{n-2})$ etc. under $Q^{T_n}$. For further information about the simulation of LIBOR dynamics in a market model, we refer the reader to Brace, Musiela, and Schlögl (1998), Hull and White (2000) and Rebonato (2002).
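The inductive construction under the terminal measure can be sketched as a small Monte Carlo routine. This is a minimal illustration under simplifying assumptions that are not made in the text: a single driving factor ($d = 1$, $\rho = 1$), volatilities constant in time, and a log-Euler discretization with the drift frozen over each step. The function name and the 0-based indexing (`L[m]` stands for $L(\cdot,T_m)$, `tau[m]` for $T_{m+1} - T_m$) are hypothetical.

```python
import math
import random

def simulate_terminal_path(L0, tau, sigma, tenor, steps_per_period, rng):
    """One path of the reset values L(T_m, T_m), m = 0..n-1, under Q^{T_n}.

    Under the terminal measure the drift of L(., T_m) is
        -sigma_m * sum_{j=m+1}^{n-1} tau_j L_j / (1 + tau_j L_j) * sigma_j,
    so the last rate is driftless and earlier rates get a negative drift.
    """
    n = len(L0)
    L = list(L0)
    for period in range(1, n):                   # evolve over (T_{period-1}, T_period]
        dt = (tenor[period] - tenor[period - 1]) / steps_per_period
        for _ in range(steps_per_period):
            z = rng.gauss(0.0, 1.0)              # common Brownian increment
            w = [tau[j] * L[j] / (1.0 + tau[j] * L[j]) * sigma[j] for j in range(n)]
            new_L = list(L)
            for m in range(period, n):           # rates that have not yet reset
                drift = -sigma[m] * sum(w[m + 1:])
                new_L[m] = L[m] * math.exp((drift - 0.5 * sigma[m] ** 2) * dt
                                           + sigma[m] * math.sqrt(dt) * z)
            L = new_L
    return L

if __name__ == "__main__":
    rng = random.Random(2024)
    tenor = [0.0, 0.5, 1.0, 1.5]
    path = simulate_terminal_path([0.03, 0.032, 0.034], [0.5] * 3, [0.2] * 3,
                                  tenor, 5, rng)
    print(path)
```

Averaging the last component over many paths should reproduce $L(0,T_{n-1})$ up to Monte Carlo error, since that rate is a martingale under the chosen measure.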
8.5.3 The Spot LIBOR Measure

We define the price process of an implied savings account by
$$B(T_k) = \prod_{i=0}^{k-1} \frac{1}{p(T_i,T_{i+1})} = \prod_{i=0}^{k-1} \big(1 + \tau_{i+1} L(T_i,T_i)\big), \qquad T_k \in \mathcal{T},$$
with $B(0) = 1$. $B(T_n)$ can be interpreted as the $T_n$-value of a trading strategy that can be described as follows. At time $T_0$, invest one unit of currency in the bond maturing at $T_1$. At time $T_1$, invest the proceeds, i.e. $1/p(T_0,T_1)$, in the bond maturing at $T_2$, and so on. Repeating this so-called roll-over strategy in the bond with the shortest time to maturity available generates a value of $B(T_n)$ in $T_n$. We define a measure $Q^S \sim Q^{T_n}$ by
$$\frac{dQ^S}{dQ^{T_n}} = B(T_n)\,p(0,T_n).$$
$Q^S$ is called the spot LIBOR measure. One can easily prove that for $0 \le i \le k \le n$,
$$p(T_i,T_k) = \mathbb{E}_{Q^S}\left(\frac{B(T_i)}{B(T_k)}\,\Big|\,\mathcal{F}_{T_i}\right).$$
In other words, the discrete-time processes
$$\left(\frac{p(T_i,T_k)}{B(T_i)}\right)_{0 \le i \le k}$$
are $Q^S$-martingales with respect to the filtration $(\mathcal{F}_{T_i})_{0 \le i \le k}$. This shows that we can interpret the spot LIBOR measure $Q^S$ as a risk-neutral martingale measure. This leads us to the following
Theorem 8.5.3. The forward LIBOR dynamics under the spot LIBOR measure are described by
$$dL(t,T_{k-1}) = L(t,T_{k-1})\,\sigma(t,T_{k-1}) \cdot \left(\sum_{j=\beta(t)}^{k} \frac{\tau_j L(t,T_{j-1})}{1 + \tau_j L(t,T_{j-1})}\,\rho\,\sigma(t,T_{j-1})'\,dt + dW^S(t)\right),$$
where $\beta(t) = l$ for $T_{l-2} < t \le T_{l-1}$, $0 \le t \le T_{k-1}$, and $W^S$ is a $d$-dimensional $(\mathbb{F},Q^S)$ Brownian motion with instantaneous covariance matrix $\rho$.
The proof proceeds in a similar fashion as the proof of the corresponding result for the dynamics under the forward LIBOR measure and is therefore omitted.
8.5.4 Valuation of Caplets and Floorlets in the LMM

A caplet with reset date $T_k$ and maturity $T_{k+1}$ and strike rate $K$, or briefly a $T_k$-caplet with strike $K$, is a derivative that pays the holder
$$\tau_{k+1}\,(L(T_k,T_k) - K)^+$$
at time $T_{k+1}$. So a caplet can be regarded as a call option on a LIBOR rate, or, equivalently, an insurance against interest rates rising above a certain level. We are now interested in the arbitrage-free price of a $T_k$-caplet. If we choose to work under the $Q^{T_{k+1}}$-measure, we know from the preceding sections that
$$L(t,T_k) = L(0,T_k)\exp\left(\int_0^t \sigma(s,T_k) \cdot dW^{T_{k+1}}(s) - \frac{1}{2}\int_0^t \|\sigma(s,T_k)\|^2\,ds\right)$$
for $0 \le t \le T_k$, and therefore
$$\log L(T_k,T_k) \sim N(m, s^2)$$
with
$$m = \log L(0,T_k) - \frac{1}{2}\int_0^{T_k} \|\sigma(s,T_k)\|^2\,ds$$
and
$$s^2 = \int_0^{T_k} \sigma(s,T_k)\,\rho\,\sigma(s,T_k)'\,ds.$$
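Before proceeding, let us briefly comment on the calculation of the above variance. Using the Itô isometry, we get
$$\mathrm{Var}(\log L(T_k,T_k)) = \mathrm{Var}\left(\int_0^{T_k} \sigma(s,T_k) \cdot dW^{T_{k+1}}(s)\right) = \mathrm{Var}\left(\sum_{i=1}^d \int_0^{T_k} \sigma_i(s,T_k)\,dW_i^{T_{k+1}}(s)\right)$$
$$= \sum_{i,j=1}^d \mathrm{Cov}\left(\int_0^{T_k} \sigma_i(s,T_k)\,dW_i^{T_{k+1}}(s),\ \int_0^{T_k} \sigma_j(s,T_k)\,dW_j^{T_{k+1}}(s)\right)$$
$$= \sum_{i,j=1}^d \mathbb{E}\left(\int_0^{T_k} \sigma_i(s,T_k)\,dW_i^{T_{k+1}}(s)\int_0^{T_k} \sigma_j(s,T_k)\,dW_j^{T_{k+1}}(s)\right) = \int_0^{T_k} \sigma(s,T_k)\,\rho\,\sigma(s,T_k)'\,ds.$$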
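For volatility vectors that are piecewise constant in time, the variance integral reduces to a finite sum of quadratic forms $\sigma\rho\sigma'$. The sketch below assumes such a piecewise-constant specification; the function names and the representation of the schedule as a list of `(length, sigma_vector)` pairs are hypothetical.

```python
def quadratic_form(sigma, rho):
    """sigma * rho * sigma' for a row vector sigma and a correlation matrix rho."""
    d = len(sigma)
    return sum(sigma[i] * rho[i][j] * sigma[j] for i in range(d) for j in range(d))

def integrated_variance(schedule, rho):
    """int_0^{T_k} sigma(s) rho sigma(s)' ds for piecewise-constant sigma.

    schedule : list of (interval_length, sigma_vector) pairs covering [0, T_k]
    rho      : d x d instantaneous correlation matrix of the Brownian motion
    """
    return sum(length * quadratic_form(sigma, rho) for length, sigma in schedule)

if __name__ == "__main__":
    rho = [[1.0, 0.5], [0.5, 1.0]]
    schedule = [(0.5, [0.10, 0.05]), (0.5, [0.08, 0.04])]
    print(integrated_variance(schedule, rho))
```

With an identity correlation matrix the result is the time integral of the squared Euclidean norm of $\sigma$; with perfectly correlated factors it is the integral of the squared sum of the components.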
With this in mind, calculating today's caplet price, which we denote by $C(0,T_k,K)$, is an easy task:
$$C(0,T_k,K) = \tau_{k+1}\,p(0,T_{k+1})\,\big[L(0,T_k)\,N(d_1) - K\,N(d_2)\big],$$
with
$$d_1 = \frac{\log(L(0,T_k)/K) + s^2/2}{s} \qquad \text{and} \qquad d_2 = \frac{\log(L(0,T_k)/K) - s^2/2}{s}.$$
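This is the Black formula for caplets. Similarly, one can show that today's price of a $T_k$-floorlet with strike $K$, i.e. a derivative that pays the holder
$$\tau_{k+1}\,(K - L(T_k,T_k))^+$$
at time $T_{k+1}$, is
$$\tau_{k+1}\,p(0,T_{k+1})\,\big[K\,N(-d_2) - L(0,T_k)\,N(-d_1)\big],$$
with $d_1$ and $d_2$ defined as above.

8.5.5 The Swap Market Model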
An interest rate swap (IRS) is a contract to exchange fixed against floating payments, where the floating payments typically depend on LIBOR rates. An IRS is specified by its reset dates $T_\alpha, T_{\alpha+1}, \ldots, T_{\beta-1}$, its payment dates $T_{\alpha+1}, \ldots, T_\beta$, and the fixed rate $K$. At every $T_j \in \{T_{\alpha+1}, \ldots, T_\beta\}$, the fixed payment is $\tau_j K$ with $\tau_j = T_j - T_{j-1}$, while the floating payment is $\tau_j L(T_{j-1}, T_{j-1})$. That is, the floating payment in $T_j$ is already determined in $T_{j-1}$. The set of fixed (floating) payments is called the fixed (floating) leg of the swap, and the party that makes the fixed (floating) payments is said to hold a payer (receiver) swap. The value of a swap in $t \le T_\alpha$ can be determined without making any distributional assumptions on the LIBOR rates, as the following considerations show. The $T_{j-1}$-value of the floating payment $\tau_j L(T_{j-1}, T_{j-1})$ that is paid (respectively received) at time $T_j$ is
$$p(T_{j-1}, T_j)\, \tau_j L(T_{j-1}, T_{j-1}) = 1 - p(T_{j-1}, T_j),$$
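and therefore the $t$-value ($t \le T_{j-1}$) of the floating payment must be $p(t, T_{j-1}) - p(t, T_j)$. It follows that the value of a payer swap in $t \le T_\alpha$ is
$$\begin{aligned}
\sum_{i=\alpha+1}^{\beta} \big( p(t, T_{i-1}) - p(t, T_i) \big) - \sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i K
&= \sum_{i=\alpha+1}^{\beta} \big( p(t, T_{i-1}) - (1 + \tau_i K)\, p(t, T_i) \big) \\
&= \sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i \big( L(t, T_{i-1}) - K \big),
\end{aligned} \tag{8.27}$$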
or alternatively
$$\sum_{i=\alpha+1}^{\beta} \big( p(t, T_{i-1}) - p(t, T_i) \big) - \sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i K = p(t, T_\alpha) - p(t, T_\beta) - \sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i K, \tag{8.28}$$
since other prices obviously give rise to arbitrage opportunities. Accordingly, the value of a receiver swap is
$$\sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i \big( K - L(t, T_{i-1}) \big) = p(t, T_\beta) - p(t, T_\alpha) + \sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i K.$$
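The two representations of the payer swap value, one in terms of forward LIBOR rates and one in terms of bond prices only, can be compared numerically. In the sketch below the bond prices $p(t, T_\alpha), \ldots, p(t, T_\beta)$, the accrual periods and the fixed rate are made-up inputs, and the function names are our own.

```python
def forward_libor(p_prev, p_next, tau):
    """Simple forward rate for [T_{i-1}, T_i] implied by two bond prices."""
    return (p_prev / p_next - 1.0) / tau


def payer_swap_value_libor(bonds, taus, K):
    """Sum over payment dates of p(t,T_i) * tau_i * (L(t,T_{i-1}) - K).

    bonds = [p(t,T_alpha), ..., p(t,T_beta)], taus[i-1] = T_i - T_{i-1}.
    """
    value = 0.0
    for i in range(1, len(bonds)):
        L = forward_libor(bonds[i - 1], bonds[i], taus[i - 1])
        value += bonds[i] * taus[i - 1] * (L - K)
    return value


def payer_swap_value_bonds(bonds, taus, K):
    """p(t,T_alpha) - p(t,T_beta) - K * sum_i tau_i p(t,T_i)."""
    annuity = sum(tau * p for tau, p in zip(taus, bonds[1:]))
    return bonds[0] - bonds[-1] - K * annuity


# Hypothetical discount curve with three semi-annual payments.
bonds = [0.98, 0.955, 0.93, 0.90]
taus = [0.5, 0.5, 0.5]
v_libor = payer_swap_value_libor(bonds, taus, 0.05)
v_bonds = payer_swap_value_bonds(bonds, taus, 0.05)
```

Both functions return the same number up to rounding, and the receiver swap value is simply its negative.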
The forward swap rate (FSR) at time $t$ of the above IRS, which we denote by $S_{\alpha,\beta}(t)$, is the value for the fixed rate $K$ that makes the $t$-value of the IRS zero. $S_{\alpha,\beta}(t)$ can thus be obtained by equating Expression (8.27) to zero and solving for $K$, which gives
$$S_{\alpha,\beta}(t) = \sum_{i=\alpha+1}^{\beta} w_i(t)\, L(t, T_{i-1}) \quad \text{with} \quad w_i(t) = \frac{\tau_i\, p(t, T_i)}{\sum_{j=\alpha+1}^{\beta} \tau_j\, p(t, T_j)}, \tag{8.29}$$
or equivalently by equating (8.28) to zero, which gives
$$S_{\alpha,\beta}(t) = \frac{p(t, T_\alpha) - p(t, T_\beta)}{\sum_{i=\alpha+1}^{\beta} \tau_i\, p(t, T_i)}.$$
Equation (8.29) shows that the FSR can be expressed as a suitably weighted average of the spanning forward LIBORs.

Now we introduce swap options, or swaptions for short, which, along with caps and floors, constitute the most popular instruments in the interest-rate derivatives world. A European payer swaption gives the holder the right to enter a swap as fixed-rate payer at a fixed rate $K$ (the swaption strike) at a future date that normally coincides with the first reset date $T_\alpha$ of the underlying swap. Similarly, a European receiver swaption gives the holder the right to enter a swap as fixed-rate receiver. To express the value of a payer swaption as a function of the swap rate, first notice that the following relation, which can easily be verified, holds for the $t$-value of a payer swap: for $0 \le t \le T_\alpha$,
$$\sum_{i=\alpha+1}^{\beta} p(t, T_i)\, \tau_i \big( L(t, T_{i-1}) - K \big) = \big( S_{\alpha,\beta}(t) - K \big) \sum_{i=\alpha+1}^{\beta} \tau_i\, p(t, T_i). \tag{8.30}$$
Needless to say, a similar formula can be derived for receiver swaps. The advantage of the expression on the right-hand side of Equation (8.30) over the expression on the left-hand side (which is our Formula (8.27)) is that one can instantly tell from $S_{\alpha,\beta}(t)$ if the $t$-value of the payer swap is positive or negative, which is not at all obvious from Formula (8.27). At the maturity date $T_\alpha$, a payer swaption is exercised if and only if the value of the underlying swap is positive, which is the case if and only if $S_{\alpha,\beta}(T_\alpha) - K > 0$ holds. Clearly, the payer-swaption value in $T_\alpha$ is
$$\big( S_{\alpha,\beta}(T_\alpha) - K \big)^+ \sum_{i=\alpha+1}^{\beta} \tau_i\, p(T_\alpha, T_i),$$
and the receiver-swaption value is
$$\big( K - S_{\alpha,\beta}(T_\alpha) \big)^+ \sum_{i=\alpha+1}^{\beta} \tau_i\, p(T_\alpha, T_i).$$
Now we turn to the problem of modelling the dynamics of the swap rate, and the related question of deriving a pricing formula for swaptions. First, we observe that
$$A_{\alpha,\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i\, p(t, T_i)$$
is the $t$-price of a portfolio of bonds (i.e. a traded asset). Consequently, $A_{\alpha,\beta}(t)$, which is known as the accrual factor or present value of a basis point, can be used as numeraire. Now note that
$$S_{\alpha,\beta}(t) = \frac{p(t, T_\alpha) - p(t, T_\beta)}{A_{\alpha,\beta}(t)},$$
where the numerator can be regarded as the price of a traded asset as well. We conclude that, in order for our model to be arbitrage-free, the swap rate $S_{\alpha,\beta}(\cdot)$ has to be a martingale under the numeraire pair $(Q_{\alpha,\beta}, A_{\alpha,\beta}(\cdot))$; $Q_{\alpha,\beta}$ is the so-called forward swap measure. To proceed, we assume that $S_{\alpha,\beta}(\cdot)$ follows a lognormal martingale:
$$dS_{\alpha,\beta}(t) = \sigma(t)\, S_{\alpha,\beta}(t)\, dW^{\alpha,\beta}(t),$$
where $\sigma$ is a deterministic function and $W^{\alpha,\beta}(\cdot)$ is a standard $Q_{\alpha,\beta}$-Brownian motion. The fact that the forward swap rate $S_{\alpha,\beta}(t)$ is lognormally distributed under $Q_{\alpha,\beta}$ motivates the name lognormal forward swap model. This leads us to the following
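Theorem 8.5.4. The price of a payer swaption as specified above in a lognormal forward swap model is consistent with Black's formula for swaptions and is thus given by
$$PS(0, \{T_\alpha, \ldots, T_\beta\}, K) = A_{\alpha,\beta}(0) \big( S_{\alpha,\beta}(0) N(d_1) - K N(d_2) \big)$$
with
$$d_1 = \frac{\log(S_{\alpha,\beta}(0)/K) + \frac{1}{2} \Sigma^2(T_\alpha)}{\Sigma(T_\alpha)} \quad \text{and} \quad d_2 = d_1 - \Sigma(T_\alpha) \quad \text{with} \quad \Sigma^2(T_\alpha) = \int_0^{T_\alpha} \sigma(s)^2 \, ds.$$
The price of a receiver swaption is given by
$$A_{\alpha,\beta}(0) \big( K N(-d_2) - S_{\alpha,\beta}(0) N(-d_1) \big).$$

Proof. For the payer swaption, we have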
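$$\begin{aligned}
PS(0, \{T_\alpha, \ldots, T_\beta\}, K) &= A_{\alpha,\beta}(0)\, \mathbb{E}^{Q_{\alpha,\beta}}\left( \frac{(S_{\alpha,\beta}(T_\alpha) - K)^+ A_{\alpha,\beta}(T_\alpha)}{A_{\alpha,\beta}(T_\alpha)} \right) \\
&= A_{\alpha,\beta}(0)\, \mathbb{E}^{Q_{\alpha,\beta}}\big( (S_{\alpha,\beta}(T_\alpha) - K)^+ \big) \\
&= A_{\alpha,\beta}(0) \big( S_{\alpha,\beta}(0) N(d_1) - K N(d_2) \big),
\end{aligned}$$
and it is evident that our judicious choice of the numeraire simplifies the calculations considerably. The same argument goes through for the receiver swaption. $\square$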
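To illustrate Black's formula for swaptions, the following sketch computes the forward swap rate and the accrual factor from a set of bond prices and then prices payer and receiver swaptions. It assumes a constant volatility $\sigma$, so that $\Sigma(T_\alpha) = \sigma \sqrt{T_\alpha}$; the function names and the numerical inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def swap_rate_and_annuity(bonds, taus):
    """Forward swap rate S(0) and accrual factor A(0).

    bonds = [p(0,T_alpha), ..., p(0,T_beta)], taus[i-1] = T_i - T_{i-1}.
    """
    annuity = sum(tau * p for tau, p in zip(taus, bonds[1:]))
    return (bonds[0] - bonds[-1]) / annuity, annuity


def black_swaption(bonds, taus, K, sigma, T_alpha, payer=True):
    """Black swaption price under the assumption sigma(s) = sigma (constant)."""
    S0, A0 = swap_rate_and_annuity(bonds, taus)
    total_vol = sigma * math.sqrt(T_alpha)      # Sigma(T_alpha)
    d1 = (math.log(S0 / K) + 0.5 * total_vol ** 2) / total_vol
    d2 = d1 - total_vol
    if payer:
        return A0 * (S0 * norm_cdf(d1) - K * norm_cdf(d2))
    return A0 * (K * norm_cdf(-d2) - S0 * norm_cdf(-d1))


# Hypothetical 1y-into-3y swaption with annual payments.
bonds = [0.95, 0.92, 0.89, 0.86]
taus = [1.0, 1.0, 1.0]
payer = black_swaption(bonds, taus, K=0.03, sigma=0.2, T_alpha=1.0)
receiver = black_swaption(bonds, taus, K=0.03, sigma=0.2, T_alpha=1.0, payer=False)
```

The difference between the payer and the receiver price equals $A_{\alpha,\beta}(0)(S_{\alpha,\beta}(0) - K)$, the value of the underlying forward payer swap, which gives a quick consistency check.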
So far, we have only been concerned with the evolution of one swap rate. While this is sufficient for the purpose of swaption pricing, the pricing of more complicated derivatives whose value depends on more than one swap rate necessitates the modelling of the simultaneous evolution of a set of swap rates under one probability measure. This leads us to the class of Swap Market Models (SMMs), which are arbitrage-free models of the joint evolution of a set of swap rates under a common probability measure. The modeller has some freedom of choice when it comes to determining the set of swap rates that is to be modelled. Examples are:

1. The set of swap rates $\{S_{\alpha,\alpha+1}(t), S_{\alpha+1,\alpha+2}(t), \ldots, S_{\beta-1,\beta}(t)\}$. As this is exactly the set of LIBOR rates $\{L(t, T_\alpha), L(t, T_{\alpha+1}), \ldots, L(t, T_{\beta-1})\}$, this choice leads us back to the LMM framework.
2. The set of swap rates $\{S_{\alpha,\alpha+1}(t), S_{\alpha,\alpha+2}(t), \ldots, S_{\alpha,\beta}(t)\}$.
3. The set of swap rates $\{S_{\alpha,\beta}(t), S_{\alpha+1,\beta}(t), \ldots, S_{\beta-1,\beta}(t)\}$.

All of the above choices have in common that they fully describe the LIBOR structure from $T_\alpha$ to $T_\beta$, so that all products that can be priced in a LMM can also be priced in a SMM (and vice versa, as the LIBORs also determine the swap rates). Which of the above sets is chosen of course depends on the concrete application or the interest-rate derivative to be priced. The ideas and tools underlying the construction of a SMM are very similar to those of a LMM, and so are the results (e.g. measure relationships). The interested reader might want to consult Hunt and Kennedy (2000), Pelsser (2000) and Rutkowski (1999) for further details.

8.5.6 The Relation Between LIBOR- and Swap-market Models
As already remarked, all products that can be priced in a LMM framework can in principle also be priced in a SMM framework and vice versa. This is due to the fact that swap rates can be expressed as weighted sums of LIBOR rates, and LIBOR rates can be expressed as functions of swap rates. However, this does not mean that these model classes are equivalent. As we have seen, one of the main advantages of the lognormal LMM is that it prices caplets with Black's caplet formula, which is the market standard. Accordingly, the lognormal SMM prices swaptions with Black's swaption formula, which is also standard among practitioners. At this stage, a natural question is whether lognormal LMMs yield swaption prices that agree with Black prices, and whether lognormal SMMs give Black-consistent caplet prices. In general, the answer is negative. The reason for the incompatibility of lognormal LMMs and lognormal SMMs lies in the fact that if LIBOR rates are lognormal under their respective forward measures, swap rates (that are determined by these LIBOR rates) cannot be lognormal under their respective forward swap measures and vice versa. Even though this might seem disappointing at first, as it implies that neither the lognormal LMM nor the lognormal SMM can price both caplets and swaptions according to the market standard, it does not constitute a major drawback or practical limitation. The incompatibility is mostly of a theoretical nature, as it can be shown (for example by simulation studies) that swap rates in the lognormal LMM are "almost" lognormal, and therefore swaption prices in the lognormal LMM (which usually have to be obtained by Monte Carlo simulations, as no closed formulas exist) are very similar to those one would obtain by the Black formula. For more information on the incompatibility problem, the interested reader is referred to Rebonato (1999) and Brigo and Mercurio (2001).

8.6 Potential Models and the Flesaker-Hughston Framework
The aim of this section is to present the basic ideas of the potential approach to term structure modelling as set out in Rogers (1997), and to develop connections with the positive interest rate models formulated by Flesaker and Hughston (1997) and Flesaker and Hughston (1996b). Our discussion builds on Rogers (1997), Jin and Glasserman (2001) and Hunt and Kennedy (2000).

8.6.1 Pricing Kernels and Potentials
Our starting point is a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$. We assume that $(\mathcal{F}_t)$ is the $\mathbb{P}$-augmentation of the natural filtration generated by a continuous and positive process $(a_t)$. Define
$$A_t = \int_0^t a_s \, ds.$$
As $A = (A_t)$ is an increasing process, $A_\infty = \lim_{t \to \infty} A_t$ exists. We assume that $\mathbb{E}(A_\infty) < \infty$ holds. Define the pricing kernel $(Z_t)$ as
$$Z_t = \mathbb{E}(A_\infty \mid \mathcal{F}_t) - A_t, \qquad t \in [0, \infty].$$
Without loss of generality, we may assume that $Z_0 = \mathbb{E}(A_\infty) = 1$. Because $\mathbb{E}(Z_t \mid \mathcal{F}_s) \le Z_s$ for $0 \le s \le t \le \infty$, $(Z_t)$ is a supermartingale. Furthermore, $\lim_{t \to \infty} \mathbb{E}(Z_t) = 0$ for obvious reasons. Stochastic processes characterized by these two properties are called potentials.

We are now in the position to define zero coupon bond prices by
$$p(t, T) = \frac{\mathbb{E}(Z_T \mid \mathcal{F}_t)}{Z_t}, \qquad 0 \le t \le T. \tag{8.31}$$
Observe that bond prices tend to zero with increasing time to maturity as
$$\lim_{T \to \infty} p(t, T) = \lim_{T \to \infty} \frac{1}{Z_t}\, \mathbb{E}(A_\infty - A_T \mid \mathcal{F}_t) = 0.$$
For the bond prices in $t = 0$, we get
$$p(0, T) = \mathbb{E}(Z_T) = 1 - \mathbb{E}(A_T) = 1 - \mathbb{E}\left( \int_0^T a_s \, ds \right) = 1 - \int_0^T \mathbb{E}(a_s) \, ds,$$
where we have used Fubini's theorem. Differentiating both sides with respect to $T$ leads to
$$\frac{\partial}{\partial T}\, p(0, T) = -\mathbb{E}(a_T).$$
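The above relation can be used to calibrate the model to an observed bond price structure.

As is well-known from the treatment of the HJM framework, the specification of bond-price dynamics implicitly gives rise to the dynamics of forward rates and vice versa. Using the well-known relation between bond prices and instantaneous forward rates, we can derive the forward-rate dynamics in the above model:
$$f(t, T) = -\frac{\partial}{\partial T} \log p(t, T) = -\frac{\frac{\partial}{\partial T}\, \mathbb{E}(Z_T \mid \mathcal{F}_t)}{\mathbb{E}(Z_T \mid \mathcal{F}_t)} = \frac{\mathbb{E}(a_T \mid \mathcal{F}_t)}{\mathbb{E}(Z_T \mid \mathcal{F}_t)}.$$
Here we have used that
$$\mathbb{E}(A_T \mid \mathcal{F}_t) = \int_0^T \mathbb{E}(a_s \mid \mathcal{F}_t) \, ds$$
by Fubini's theorem for conditional expectations; differentiating both sides then gives
$$\mathbb{E}(a_T \mid \mathcal{F}_t) = \frac{\partial}{\partial T}\, \mathbb{E}(A_T \mid \mathcal{F}_t).$$
Obviously, the forward rates $f(t, T)$ are strictly positive, so that the bond prices $p(t, T) = \exp\left( -\int_t^T f(t, s) \, ds \right)$ are strictly decreasing in $T$. Furthermore, using $r(t) = f(t, t)$, the short rate is simply $r(t) = a_t / Z_t$.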
Before proceeding, let us briefly touch on the question of absence of arbitrage in our framework. From Equation (8.31), it is immediately clear that all normalized bond price processes $(p(t, T) Z_t)$ are $\mathbb{P}$-martingales. However, this does not imply that the model is arbitrage-free, as the process $(1/Z_t)$ does not necessarily represent the price process of a traded asset. Nevertheless, Rutkowski (1999) is able to prove that in our framework, there exists an implied money market account $B = (B_t)$, $B$ being an increasing process of finite variation, and a corresponding risk-neutral measure $\mathbb{P}^*$, such that all bond price processes discounted by $B$ are $\mathbb{P}^*$-martingales, which proves that the model excludes arbitrage opportunities.
8.6.2 The Flesaker-Hughston Framework
This section presents the basic ideas of the positive interest rate framework introduced by Flesaker and Hughston (1997) and Flesaker and Hughston (1996b), and shows how their approach relates to the potential approach described above. Flesaker and Hughston introduce a general class of term structure models by postulating that bond prices are of the form
$$p(t, T) = \frac{\int_T^\infty \phi(s) M_{ts} \, ds}{\int_t^\infty \phi(s) M_{ts} \, ds}, \qquad 0 \le t \le T < \infty, \tag{8.32}$$
where $\phi$ is a deterministic and positive function, and $(M_{tT})_{t \ge 0}$ is a family of positive martingales indexed by $T$ with respect to a probability measure $\mathbb{P}$ and a filtration $(\mathcal{F}_t)$. Both integrals in (8.32) are assumed to be finite almost surely. From the definition, it is evident that bond prices are strictly decreasing in $T$, i.e. interest rates are always positive. The potential approach and the Flesaker-Hughston approach are closely related, as the following theorem shows.
Theorem 8.6.1. (i) Given a Flesaker-Hughston model, define the pricing kernel $(Z_t)$ as
$$Z_t = \int_t^\infty \phi(s) M_{ts} \, ds.$$
Then $(Z_t)$ is a potential and the zero coupon bond prices defined by
$$p(t, T) = \frac{\mathbb{E}(Z_T \mid \mathcal{F}_t)}{Z_t}$$
are of the form of Equation (8.32).
(ii) Conversely, given a $(\mathcal{G}_t)$-adapted, positive and continuous process $(a_t)$ and a positive and increasing process $(A_t)$ with $A_t = \int_0^t a_s \, ds$, define a potential $(Z_t)$ by setting $Z_t = \mathbb{E}(A_\infty \mid \mathcal{G}_t) - A_t$. Then there exist a deterministic and positive function $\phi$, and a family of positive $(\mathbb{P}, (\mathcal{G}_t))$-martingales $(M_{tT})_{t \ge 0}$ indexed by $T$ with $M_{0T} = 1$, such that $Z_t$ can be represented as
$$Z_t = \int_t^\infty \phi(s) M_{ts} \, ds.$$
Consequently, bond prices are of the form of Equation (8.32).
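Proof. (i) With Fubini's theorem for conditional expectations, we get
$$\mathbb{E}(Z_T \mid \mathcal{F}_t) = \mathbb{E}\left( \int_T^\infty \phi(s) M_{Ts} \, ds \,\Big|\, \mathcal{F}_t \right) = \int_T^\infty \phi(s)\, \mathbb{E}(M_{Ts} \mid \mathcal{F}_t) \, ds = \int_T^\infty \phi(s) M_{ts} \, ds$$
and therefore
$$p(t, T) = \frac{\mathbb{E}(Z_T \mid \mathcal{F}_t)}{Z_t} = \frac{\int_T^\infty \phi(s) M_{ts} \, ds}{\int_t^\infty \phi(s) M_{ts} \, ds}.$$
The potential property of $(Z_t)$ is apparent.

(ii) Observe
$$\mathbb{E}(Z_T \mid \mathcal{G}_t) = \mathbb{E}(A_\infty \mid \mathcal{G}_t) - \mathbb{E}(A_T \mid \mathcal{G}_t) = \mathbb{E}\left( \int_T^\infty a_s \, ds \,\Big|\, \mathcal{G}_t \right) = \int_T^\infty \mathbb{E}(a_s \mid \mathcal{G}_t) \, ds$$
and set
$$M_{tT} = \frac{\mathbb{E}(a_T \mid \mathcal{G}_t)}{\mathbb{E}(a_T)} \quad \text{and} \quad \phi(t) = \mathbb{E}(a_t).$$
Then $M_{0T} = 1$ and $(M_{tT})_{t \ge 0}$ is a family of strictly positive $(\mathbb{P}, (\mathcal{G}_t))$-martingales indexed by $T$, while $\phi$ is a deterministic and positive function. This leads to
$$Z_t = \mathbb{E}(Z_t \mid \mathcal{G}_t) = \int_t^\infty \phi(s) M_{ts} \, ds,$$
which completes the proof. $\square$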
Remark 8.6.1. Observe that (i) above, together with the comment on the no-arbitrage property of potential models in the preceding section, shows that Flesaker-Hughston models are arbitrage-free.

Probably the most popular subclass of term structure models in the Flesaker-Hughston framework is the class of rational models. The building blocks in this case are a positive $(\mathbb{P}, (\mathcal{F}_t))$-martingale $(N_t)_{t \ge 0}$ with $N_0 = 1$ and $(\mathcal{F}_t)$ the $\mathbb{P}$-augmentation of the natural filtration generated by $(N_t)$, two positive and deterministic functions $f, g : \mathbb{R}_+ \to \mathbb{R}_+$ with $f(t) + g(t) = 1$ for all $t \ge 0$, and a positive and deterministic function $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ defined by $\phi(t) = -\frac{\partial}{\partial t}\, p(0, t)$, where we assume that $p(0, t)$ is strictly decreasing in $t$. We now introduce a family $(M_{tT})_{t \ge 0}$ of strictly positive martingales by setting
$$M_{tT} = f(T) + g(T) N_t.$$
Then Formula (8.32) immediately yields the bond prices
$$p(t, T) = \frac{\int_T^\infty \phi(s) M_{ts} \, ds}{\int_t^\infty \phi(s) M_{ts} \, ds} = \frac{\int_T^\infty \phi(s) \big( f(s) + g(s) N_t \big) \, ds}{\int_t^\infty \phi(s) \big( f(s) + g(s) N_t \big) \, ds} = \frac{F(T) + G(T) N_t}{F(t) + G(t) N_t}$$
with positive and decreasing functions $F$ and $G$ defined by
$$F(T) = \int_T^\infty \phi(s) f(s) \, ds \quad \text{and} \quad G(T) = \int_T^\infty \phi(s) g(s) \, ds.$$
A nice feature of this class of models is that the consistency with the initial term structure is already guaranteed by its very construction, because $p(0, T) = (F(T) + G(T)) / (F(0) + G(0))$, and one easily checks that $F(T) + G(T) = p(0, T)$ and $F(0) + G(0) = 1$. It can also be shown, see e.g. Flesaker and Hughston (1997), that both bond prices and interest rates are bounded above and below:
$$\frac{G(T)}{G(t)} \le p(t, T) \le \frac{F(T)}{F(t)} \quad \text{and} \quad -\frac{F'(t)}{F(t)} \le r_t \le -\frac{G'(t)}{G(t)}.$$
The (still quite general) class of rational models can be further restricted by making assumptions on the martingale $(N_t)$. A convenient choice that leads to closed-form expressions for caps and swaptions is to assume that $(N_t)$ follows a lognormal martingale with initial condition $N_0 = 1$. For more information on the class of the resultant rational lognormal models, we refer the reader to Flesaker and Hughston (1996a).
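A concrete rational model may help to see how the pieces fit together. The sketch below assumes a flat initial curve $p(0, t) = e^{-Rt}$, so that $\phi(t) = R e^{-Rt}$, and the hypothetical choice $g(s) = c\, e^{-ks}$, $f(s) = 1 - g(s)$ with $0 < c < 1$; for this choice $G(T) = \frac{cR}{R + k} e^{-(R + k)T}$ and $F(T) = e^{-RT} - G(T)$ in closed form. All parameter values are made up for illustration.

```python
import math

R = 0.04   # flat initial yield, i.e. p(0, t) = exp(-R t)  (assumption)
C = 0.5    # g(s) = C * exp(-KAPPA * s), f(s) = 1 - g(s)   (assumption)
KAPPA = 0.3


def G(T):
    """G(T) = int_T^inf phi(s) g(s) ds with phi(s) = R exp(-R s)."""
    return C * R / (R + KAPPA) * math.exp(-(R + KAPPA) * T)


def F(T):
    """F(T) = int_T^inf phi(s) f(s) ds = p(0, T) - G(T)."""
    return math.exp(-R * T) - G(T)


def bond_price(t, T, N_t):
    """Rational-model bond price (F(T) + G(T) N_t) / (F(t) + G(t) N_t)."""
    return (F(T) + G(T) * N_t) / (F(t) + G(t) * N_t)


# At t = 0 with N_0 = 1 the initial curve is reproduced.
p0 = bond_price(0.0, 5.0, 1.0)

# For any positive value of the martingale the price stays between the bounds.
prices = [bond_price(1.0, 5.0, n) for n in (0.1, 1.0, 10.0)]
lower, upper = G(5.0) / G(1.0), F(5.0) / F(1.0)
```

Since the bond price is a weighted average of $F(T)/F(t)$ and $G(T)/G(t)$ with positive weights $F(t)$ and $G(t) N_t$, it can never leave the interval spanned by these two ratios, whatever the value of $N_t$.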
Exercises
8.1 We modify the approach outlined in §8.2.3 to calibrate the following slightly simplified Hull-White model:
$$dr = (a(t) - \beta r)\, dt + \gamma \, dW,$$
where $\beta$ and $\gamma$ are deterministic constants, which we assume to be known. We are given the empirical term structure $\{p^*(0, T);\ T \ge 0\}$ and the problem is to choose $a$ in order to fit the model.

1. Find the functions $A(t, T)$, $B(t, T)$ from the term-structure equation.
2. Justify that forward rates under an affine term structure are given by $f(0, T) = B_T(0, T)\, r(0) - A_T(0, T)$, and compute the forward rates in the specific example.
3. From the observed forward-rate curve $f^*(0, T)$ solve 2. for $a(t)$.
4. Show that theoretical bond prices in the above version of the Hull-White model are given by
$$p(t, T) = P_0 \exp\left\{ B(t, T) f^*(0, t) - \frac{\gamma^2}{4\beta} B^2(t, T) \big( 1 - e^{-2\beta t} \big) - B(t, T)\, r(t) \right\},$$
with $P_0 = \frac{p^*(0, T)}{p^*(0, t)}$ and $B(t, T)$ specified in 1.

8.2 We outline the calibration procedure in the HJM framework. The following steps are required:

1. Specify the volatilities $\sigma(t, T)$.
2. Compute the drift parameters (Theorem 8.3.2).
3. Observe today's forward-rate structure $\{f^*(0, T);\ T \ge 0\}$.
4. Integrate in order to get the forward rates.
5. Compute the bond prices.

Now assume a constant volatility process $\sigma(t, T) = \sigma$ and carry this approach out. This leads to the Ho-Lee model.

8.3 We now calculate option prices in the model of Exercise 8.1,
$$dr = (a(t) - \beta r)\, dt + \gamma \, dW.$$
Use the approach in §8.4 to price a European call with maturity $T$, strike $K$ on an underlying bond with maturity $S$ ($> T$). Now $Y(t) = Y_{S,T}(t) = \frac{p(t, S)}{p(t, T)}$. Check that $Y$ has deterministic volatility, and find the parameters for the pricing formula.

8.4 $\Delta$-hedge a zero-coupon bond $p(t, T)$ using two other zero-coupon bonds that mature on dates $T_1$ and $T_2$. Use the self-financing condition and compute the sensitivity of changes in $dx$ (consider only first-order terms) in the bonds in order to set up equations to find the portfolio weights. Is your portfolio insensitive to large changes in $x(t)$? Here, $(dx)^2$ must be considered. Include a third zero-coupon bond with maturity $T_3$ in your portfolio to obtain a Gamma-neutral position.
8.5 In the Gaussian HJM framework, assume that the volatility is specified by
$$\sigma(t, T) = \sigma \exp\{-\lambda (T - t)\},$$
where $\sigma > 0$ and $\lambda \ge 0$. This specification leads to a deterministic, time-stationary volatility, dependent on $T - t$ and not on $T$, $t$ separately, which increases as time to maturity decreases. The HJM Condition (8.21) implies
$$\alpha(t, T) = \frac{\sigma^2}{\lambda} \exp\{-\lambda (T - t)\} \big( 1 - \exp\{-\lambda (T - t)\} \big).$$

1. Verify that bond prices are given by
$$p(t, T) = \frac{p(0, T)}{p(0, t)} \exp\{ S(t, T)\, x(t) - a(t, T) \},$$
where
$$x(t) = r(t) - f(0, t), \qquad S(t, T) = -\frac{1}{\lambda} \big\{ 1 - \exp\{-\lambda (T - t)\} \big\}, \qquad a(t, T) = \frac{\sigma^2}{4\lambda}\, S(t, T)^2 \big( 1 - \exp\{-2\lambda t\} \big).$$
2. Since $x(t)$ is not dependent on the maturity $T$, it can be used as a single factor in a factor model (replacing e.g. $r(t)$). Use $x(t)$ to construct $\Delta$- and $\Gamma$-neutral portfolios of $T$-bonds.
3. Find the specific form of European call option prices in this setting (use (8.24)) and compute sensitivities with respect to $x(t)$.

8.6 Verify the caplet Formula (8.25) in the setting of §8.4.4. (Repeat the deduction of (8.24) in §8.4.2.)

8.7 Prove Proposition 8.1.1 part (i).
9. Credit Risk

Approaches to modelling financial assets subject to credit risk can roughly be divided into two types of models: reduced-form and structural models. While reduced-form models typically use a point process to model the default event (exogenously), structural models try to describe the default-triggering event within the framework of all traded assets.

The structural approach goes back to Merton (1974), where the dynamics of the value of the assets of a firm are described by a standard geometric Brownian motion and the default event is triggered by this value process crossing a default boundary given by the value of a single bond issued. Although this model gave valuable insight into the default process, shortcomings have subsequently been raised: the liabilities of the firm are supposed to consist only of a single class of debt, the debt has a zero coupon, bankruptcy is triggered only at maturity of the debt, bankruptcy is costless and interest rates are assumed to be constant over time. Thus, the assumptions of the Merton model are highly stylized versions of reality and are not able to account for the magnitude of yield spreads. This motivated several generalizations of Merton's model: Black and Cox (1976) incorporate classes of senior and junior debt, safety covenants, dividends, and restrictions on cash distributions to shareholders; Geske (1977) considers coupon bonds by using a compound options approach and provides a formula for subordinate debt within this compound option framework. Leland (1994) extends the model further to incorporate bankruptcy costs and taxes, which makes it possible to work with optimal capital structure. Zhou (2001) and Madan (2000) use Lévy processes to model the value-of-the-firm process.

While in most papers using option pricing frameworks bankruptcy is triggered at the moment when the value of the firm reaches the value of the debt, they model default as the time when the value of the firm reaches some constant threshold value $K$ that serves as a distress boundary, i.e. the default time $\tau$ can then be expressed formally as $\tau = \inf\{t \ge 0 : V(t) \le K\}$, the first passage time for $V(t)$ (the value of the firm's assets at time $t$) to cross the lower bound $K$. If the value of the assets breaches this level, default is triggered, some form of restructuring occurs and the remaining assets of the firm are allocated among the firm's claimants. Implicit in this formulation is the assumption that once this level is reached, default occurs on all outstanding liabilities at the same time. Thus, contrary to Merton's model, default can occur prior to maturity. Nielsen, Saa-Requejo, and Santa-Clara (1993), Briys and de Varenne (1997) and Hsu, Saa-Requejo, and Santa-Clara (1997) allow for stochastic default boundaries and deviation from the absolute priority rule. Because of the complexities of all these extensions, often a closed-form solution can no longer be obtained and numerical procedures must be used.

Literature related to credit risk in book form includes Schönbucher (2003), Bielecki and Rutkowski (2002) and Duffie and Singleton (2003). Madan (2000), Rogers (1999) and Lando (1997) are overview papers.

9.1 Aspects of Credit Risk

9.1.1 The Market
According to the International Swaps and Derivatives Association, the credit derivatives market grew 37%, with total notional outstandings reaching $2.15 trillion during the first half of 2002. Notional outstanding volume in interest rate and currency derivatives increased 20%, to $99.83 trillion, in the first half, while equity derivatives outstanding volumes rose to $2.45 trillion, up 6%. This growth demonstrates the importance of credit derivatives as a mechanism for mitigation and dispersion of credit risk.

9.1.2 What Is Credit Risk?
We can distinguish between individual risk elements:

1. Default Probability. The probability that the obligor or counterparty will default on its contractual obligations to repay its debt.
2. Recovery Rates. The extent to which the face value of an obligation can be recovered once the obligor has defaulted.
3. Credit Migration. The extent to which the credit quality of the obligor or counterparty improves or deteriorates,

and portfolio risk elements:

1. Default and Credit Quality Correlation. The degree to which the default or credit quality of one obligor is related to the default or credit quality of another.
2. Risk Contribution and Credit Concentration. The extent to which an individual instrument or the presence of an obligor in the portfolio contributes to the totality of risk in the overall portfolio.

Since credit risk focuses on default probabilities, recovery in default, identity of the counterparty - all factors that are not directly relevant to market risk - the modelling of credit risk requires the development of new techniques. In particular, since the underlying risk variable in credit risk - occurrence or otherwise of default - is not normally distributed, we find credit portfolio distributions which are asymmetric with fat tails (limited upside potential with remote possibilities of severe losses). In addition, when implementing models, we face the problem of sparse data. Data on credit events are much more limited than information on market risk. Credit events are infrequent and many credit instruments are not marked-to-market on a daily basis, so parameter estimation is difficult. Finally, market risk tends to focus on a relatively short time horizon, whereas credit risk analysis is concerned with a much longer horizon.

9.1.3 Portfolio Risk Models
Banks implement credit risk models, which may be Ratings Based (CreditMetrics). So we need:

1. the definition of the possible states for each obligor's credit quality, and a description of how likely obligors are to be in any of these states at the horizon date - Ratings and Transition Matrix,
2. the revaluation of exposures in all possible credit states - using term structure of bond spreads and risk-free interest rates,
3. the interaction and correlation between credit migrations of different obligors - use of an unseen driver of credit migrations.

Ratings are in principle supplied by commercial firms, so-called rating agencies. Rating agencies evaluate the creditworthiness of corporate, municipal, and sovereign issuers of debt securities. In the USA, where capital markets are the primary source of debt capital, rating agencies have assumed enormous importance. The importance of these agencies has been increased through the internal rating approach in the new Basel capital accord (Basel II), which requires established rating agencies as benchmark.

Of course, this increased importance triggers the question of how reliable rating agencies are. Several empirical studies (some by the agencies themselves) find:

• Moody's ratings have a high predictive power for defaults; spreads are generally higher for lower rated bonds,
• bond and equity values move in the expected direction when issuers' ratings change,
• ratings do help predict financial distress and bond spreads.

However, recent studies show that equity-based default probabilities (see below) change well before ratings when firms fall into financial distress (rating stickiness) and bond prices based on average spreads show inconsistencies with actual prices.

Equity-Based Models (Moody's KMV, CreditGrades) rely on the argument that a firm defaults when its asset value drops to the value of its contractual obligations (or a critical threshold - the default point). One can use an option pricing framework to derive the value of debt and equity.

Each model contains parameters that affect the risk measures produced, but which, because of a lack of suitable data, must be set on a judgmental basis. Empirical studies such as Gordy (2000) and Koyluoglu (1998) show that parameterization of various models can be harmonized, but use only default-driven versions.

9.2 Basic Credit Risk Modeling
A complete mathematical framework is developed in Bielecki and Rutkowski (2002). For our purpose, a simplification of their framework suffices. To introduce our basic model, we specify a time horizon $T^* > 0$ and assume an underlying stochastic basis $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T^*}$ a filtration that supports the following objects:

• the firm's value process $V$, thought of as the total value of the firm's assets,
• the barrier process (signalling process) $v$, which will serve to specify the default time,
• the promised contingent claim $X$, representing the firm's liabilities to be redeemed at time $T \le T^*$ (other notation $D$, $L$),
• $\tau$, a default time, which
  - in the structural approach is defined as $\tau := \inf\{t > 0 : V_t < v_t\}$, so $\tau$ is an $\mathbb{F}$-stopping time;
  - in the intensity-based approach is not a stopping time for the market filtration,
• the recovery claim $\tilde{X}$, which represents the recovery payoff received at $T$, if default occurs prior to or at the claim's maturity date $T$ (recovery at maturity),
• the recovery process $Z$, which specifies the recovery payoff received at the time of default, if default happens prior to or at $T$ (recovery at default).

Technical Assumptions. $V$, $Z$, $A$, $v$ are progressively measurable with respect to $\mathbb{F}$; $X$ and $\tilde{X}$ are $\mathcal{F}_T$-measurable. All processes are assumed to satisfy further suitable conditions.

Suppose there exists an equivalent martingale measure (EMM) $\mathbb{P}^*$ (implying that the financial market model is arbitrage-free). So discounted price processes of tradeable securities, which pay no coupons or dividends, follow $\mathbb{F}$-martingales under $\mathbb{P}^*$. Let $r$ be the short-term interest rate process and use as the discount factor the savings (bank) account, which we assume to exist:
$$B_t = \exp\left( \int_0^t r_u \, du \right).$$
Let $H = (H_t) = (1_{\{\tau \le t\}})$ be the indicator process of the default event, $D = (D_t)$ the cash flows received by the owner of the defaultable claim and $X^d(T) = X 1_{\{\tau > T\}} + \tilde{X} 1_{\{\tau \le T\}}$. Then the process $D$ of a defaultable claim, which settles at time $T$, equals
$$D_t = X^d(T)\, 1_{\{t \ge T\}} + \int_{(0, t]} Z_u \, dH_u,$$
where the first term takes care of the payoff at $T$ (if any) and the second term captures the payments in case of premature default. Observe that $D$ is of finite variation over $[0, T]$. Now let $X^d(t, T)$ be the price process of a defaultable claim. Thus, $X^d(t, T)$ represents the current value at time $t$ of all future cash flows associated with a given defaultable claim. We have

Definition 9.2.1 (Risk-neutral Valuation Formula). The price process of a defaultable claim which settles at $T$ is given as
$$X^d(t, T) = B_t\, \mathbb{E}^*\left( \int_{(t, T]} B_u^{-1} \, dD_u \,\Big|\, \mathcal{F}_t \right) \qquad \forall t \in [0, T].$$

Use of the formula depends on the attainability of a defaultable claim, which is not obvious. Usually one argues that pricing the defaultable claim according to the above formula does not introduce an arbitrage opportunity into a previously arbitrage-free market.

Example.
In case of recovery at maturity, we have $Z = 0$. So
$$D_t = X^d(T)\, 1_{\{t \ge T\}}.$$
Then the valuation formula is
$$X^d(t, T) = B_t\, \mathbb{E}^*\left( \big( X 1_{\{\tau > T\}} + \tilde{X} 1_{\{\tau \le T\}} \big) B_T^{-1} \,\Big|\, \mathcal{F}_t \right),$$
and the discounted price process follows an $\mathbb{F}$-martingale under $\mathbb{P}^*$ (given some integrability conditions).

9.3 Structural Models
9.3.1 Merton's Model
The basic foundations of structural models have been laid in the seminal paper Merton (1974). Here it is assumed that a firm is financed by equity and a single zero-coupon bond with notional amount (face value) $F$ and maturity $T$. The firm's value is given by
$$dV(t) = (r - \delta) V(t) \, dt + \sigma V(t) \, dW(t)$$
under an equivalent martingale measure $\mathbb{P}^*$, with $r$, $\sigma$ constant, $W$ a Brownian motion and constant payout (dividend) rate $\delta$, which may be negative (i.e. pay-in). Default is only possible at maturity. There are two possibilities:
$$V_T \ge F, \ \text{thus } D_T = F, \qquad \text{or} \qquad V_T < F, \ \text{thus } D_T = V_T.$$
So the firm defaults if the value of the firm at maturity $T$ is below the notional amount $F$. We have recovery at maturity and the stopping time $\tau$ is formally given as
$$\tau = T\, 1_{\{V_T < F\}} + \infty\, 1_{\{V_T \ge F\}}.$$
For equity owners the payoff is
$$S_T = \max\{V_T - F, 0\},$$
thus stocks can be viewed as call options on the value of the firm with
$$S_t = V_t e^{-\delta (T - t)} \Phi(d_1) - F e^{-r (T - t)} \Phi(d_2)$$
and
$$d_1 = \frac{\log(V_t / F) + (r - \delta + \sigma^2/2)(T - t)}{\sigma \sqrt{T - t}}, \qquad d_2 = d_1 - \sigma \sqrt{T - t}.$$
For bond owners the payoff is
$$D_T = F - \max\{F - V_T, 0\}.$$

Proposition 9.3.1 (Merton). Under the above assumptions, bonds can be viewed as the difference of a risk-free payment and a put option on the value of the firm with
$$p^d(t, T) = F e^{-r (T - t)} - P_E(V_t, F),$$
where $P_E(V_t, F)$ denotes the price of a European put on the value of the firm with strike $F$ and maturity $T$. So
$$p^d(t, T) = V_t e^{-\delta (T - t)} \Phi(-d_1) + F e^{-r (T - t)} \Phi(d_2).$$
We use the notation di di (Vt , T - t ) , i 1 , 2. Then pd (t, T) F e- r ( T - t ) + Vt e -<5 ( T - t )4i ( - d d - F e - r ( T - t )4i ( - d2 )
Proof.
=
=
=
=
Vt e- <5 (T - t )4i ( - d 1 ) + F e- r (T - t )4i (d2 ) .
D
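The Merton formulas for equity and debt can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the input values are hypothetical, and the normal distribution function is built from `math.erf`. It evaluates the call representation of equity and the bond price $V_t e^{-\delta(T-t)}\Phi(-d_1) + F e^{-r(T-t)}\Phi(d_2)$, whose sum must equal the dividend-discounted firm value.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def merton_d1_d2(V, F, r, delta, sigma, tau):
    """d1 and d2 for firm value V, face value F and time to maturity tau."""
    d1 = (log(V / F) + (r - delta + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return d1, d1 - sigma * sqrt(tau)


def merton_equity(V, F, r, delta, sigma, tau):
    """Equity viewed as a European call on the firm value."""
    d1, d2 = merton_d1_d2(V, F, r, delta, sigma, tau)
    return V * exp(-delta * tau) * norm_cdf(d1) - F * exp(-r * tau) * norm_cdf(d2)


def merton_bond(V, F, r, delta, sigma, tau):
    """Defaultable bond: risk-free payment minus a European put."""
    d1, d2 = merton_d1_d2(V, F, r, delta, sigma, tau)
    return V * exp(-delta * tau) * norm_cdf(-d1) + F * exp(-r * tau) * norm_cdf(d2)


# Hypothetical inputs, chosen only for illustration.
V0, F, r, delta, sigma, tau = 100.0, 80.0, 0.05, 0.0, 0.25, 1.0
equity = merton_equity(V0, F, r, delta, sigma, tau)
bond = merton_bond(V0, F, r, delta, sigma, tau)
# Risk-neutral probability of default P*(V_T < F) = Phi(-d2).
default_prob = norm_cdf(-merton_d1_d2(V0, F, r, delta, sigma, tau)[1])
```

Equity and debt together exhaust the (payout-adjusted) firm value, which gives a simple sanity check on any implementation.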
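Typically, we do not observe the value of a firm. Thus, in order to calculate the parameters relevant for the dynamics of the firm's value, i.e. $V_0$ and $\sigma$, we have to use the identities
$$S_0 = C_E(V_0, F), \qquad \sigma_S = \frac{\Phi(d_1)\,V_0\,\sigma}{S_0},$$
where $C_E$ is the European call price, and $S_0$ and $\sigma_S$ are the value and volatility of equity, to obtain risk-neutral default probabilities $\mathbb{P}^*(V_T < F)$.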
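To illustrate how $V_0$ and $\sigma$ might be backed out from equity data, here is a hedged sketch. It assumes zero payout rate, that the two identities are the call-price relation $S_0 = C_E(V_0, F)$ and $\sigma_S S_0 = \Phi(d_1) V_0 \sigma$, and it uses a simple fixed-point iteration with bisection; this is one possible numerical scheme, not the procedure used by KMV. All names and numbers are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_and_delta(V, F, r, sigma, T):
    """Call price on the firm value and its delta Phi(d1) (zero payout assumed)."""
    d1 = (log(V / F) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return V * norm_cdf(d1) - F * exp(-r * T) * norm_cdf(d2), norm_cdf(d1)


def implied_asset_value(S0, F, r, sigma, T):
    """Solve C(V) = S0 by bisection; the call price is increasing in V."""
    lo, hi = S0, S0 + F  # C(V) <= V and C(V) >= V - F exp(-rT) bracket the root
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if call_and_delta(mid, F, r, sigma, T)[0] < S0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def calibrate(S0, sigma_S, F, r, T, n_iter=100):
    """Fixed-point iteration on sigma = sigma_S * S0 / (Phi(d1) * V)."""
    sigma = sigma_S * S0 / (S0 + F * exp(-r * T))  # crude starting value
    for _ in range(n_iter):
        V = implied_asset_value(S0, F, r, sigma, T)
        delta = call_and_delta(V, F, r, sigma, T)[1]
        sigma = sigma_S * S0 / (delta * V)
    return implied_asset_value(S0, F, r, sigma, T), sigma


# Round trip: generate equity data from known asset parameters, then recover them.
V_true, sigma_true, F, r, T = 100.0, 0.2, 80.0, 0.05, 1.0
S0, delta_true = call_and_delta(V_true, F, r, sigma_true, T)
sigma_S = delta_true * V_true * sigma_true / S0
V_hat, sigma_hat = calibrate(S0, sigma_S, F, r, T)
```

The round trip recovers the asset value and volatility that generated the synthetic equity data; with real data, convergence of such an iteration is not guaranteed and should be monitored.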
This approach has been used by KMV (now Moody's KMV, named after the founders Kealhofer, McQuown and Vasicek) in developing a portfolio model of credit risk. From the risk-neutral valuation formula, we get
$$p^d(t,T) = e^{-r(T-t)}\,\mathbb{E}^*\big( F \mathbf{1}_{\{V_T \ge F\}} + V_T \mathbf{1}_{\{V_T < F\}} \,\big|\, \mathcal{F}_t \big).$$
We thus have
$$p_t^* = \mathbb{P}^*(V_T < F \mid \mathcal{F}_t) = \Phi(-d_2(V_t, T-t))$$
as risk-neutral probabilities of default. Let $\delta_t^*$ be the conditional risk-neutral expected recovery rate upon default,
$$\delta_t^* = \frac{\mathbb{E}^*(V_T \mathbf{1}_{\{V_T < F\}} \mid \mathcal{F}_t)}{F\,\mathbb{P}^*(V_T < F \mid \mathcal{F}_t)} = \frac{V_t e^{-\delta(T-t)}\,\Phi(-d_1(V_t, T-t))}{F e^{-r(T-t)}\,\Phi(-d_2(V_t, T-t))},$$
and $w_t^* = 1 - \delta_t^*$ be the conditional risk-neutral expected write-down rate upon default. We can then write the formula above as
$$p^d(t,T) = F e^{-r(T-t)}(1 - p_t^*) + F e^{-r(T-t)}\delta_t^* p_t^* = F_t(1 - p_t^* w_t^*),$$
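where $F_t = F e^{-r(T-t)}$. For spreads $s(t,T)$ we find (using the interest formula $p^d(t,T) = F e^{-(r + s(t,T))(T-t)}$)
$$s(t,T) = -\frac{1}{T-t}\log(1 - p_t^* w_t^*).$$
Let $l_t = F_t / V_t$ be the leverage ratio. In terms of $l_t$ the valuation formula is
$$p^d(t,T) = F_t\Big( \Phi(h_2(l_t, T-t)) + \frac{1}{l_t}\, e^{-\delta(T-t)}\,\Phi(-h_1(l_t, T-t)) \Big),$$
where
$$h_{1,2}(l_t, T-t) = \frac{-\log(l_t) - \delta(T-t) \pm \sigma^2(T-t)/2}{\sigma\sqrt{T-t}}.$$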
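The dependence of the Merton spread on leverage can be sketched as follows. This is an illustrative example under stated assumptions: the spread is defined through $p^d(t,T) = F e^{-(r+s)(T-t)}$, the leverage is $l_t = F e^{-r(T-t)}/V_t$, and the function names and parameter values are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def h1_h2(leverage, sigma, delta, tau):
    """The quantities h1 and h2 as functions of the leverage ratio."""
    h1 = (-log(leverage) - delta * tau + 0.5 * sigma ** 2 * tau) / (sigma * sqrt(tau))
    return h1, h1 - sigma * sqrt(tau)


def merton_spread(leverage, sigma, delta, tau):
    """Credit spread implied by the bond price relative to F exp(-r tau)."""
    h1, h2 = h1_h2(leverage, sigma, delta, tau)
    price_ratio = norm_cdf(h2) + exp(-delta * tau) * norm_cdf(-h1) / leverage
    return -log(price_ratio) / tau


def default_decomposition(leverage, sigma, delta, tau):
    """Risk-neutral default probability and expected write-down rate."""
    h1, h2 = h1_h2(leverage, sigma, delta, tau)
    p_star = norm_cdf(-h2)
    recovery = exp(-delta * tau) * norm_cdf(-h1) / (leverage * p_star)
    return p_star, 1.0 - recovery


# Hypothetical parameters: 25% asset volatility, no payout, five-year debt.
spreads = {l: merton_spread(l, 0.25, 0.0, 5.0) for l in (0.3, 0.6, 0.9)}
p_star, w_star = default_decomposition(0.6, 0.25, 0.0, 5.0)
```

Higher leverage produces a wider spread, and the spread can equivalently be recovered from the default probability and write-down rate via $-\log(1 - p^* w^*)/(T-t)$.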
As Kim, Ramaswamy, and Sundaresan (1993) and Jones, Mason, and Rosenfeld (1984) point out, Merton's model does not generate the levels of yield spreads observed in the market. They show that this model is unable to generate yield spreads in excess of 120 basis points, whereas over the 1926-1986 period the yield spreads of AAA-rated corporates ranged from 15 to 215 basis points.

9.3.2 A Jump-diffusion Model
Because the empirical literature shows that credit spread curves can be flat or even downward-sloping, that short-term debt often does not have zero credit spreads, and that sudden drops in the value of firms are possible, Zhou (2001) incorporates a jump diffusion for the underlying asset value. Here the value-of-the-firm process is a jump-diffusion with the following ingredients:
• $\sigma > 0$ is a constant;
• $W$ is a standard Brownian motion;
• $N_t$ is a Poisson process with intensity $\lambda$;
• $(\epsilon_i)$ is a family of independent random variables with distribution $F$;
• $k = \int x\,F(dx) = \mathbb{E}\epsilon_i$.
Bond prices are given by $p(t,T) = e^{-r(T-t)}$ with $r > 0$ the short rate. Let $Q$ be an equivalent martingale measure. Now we assume that default is only possible at maturity $T$, and that bond owners receive the remaining equity. Then
$$p^d(t,T) = p(t,T)\,\mathbb{E}_Q\big( V(T)\mathbf{1}_{\{V(T) < F\}} + F\mathbf{1}_{\{V(T) \ge F\}} \,\big|\, \mathcal{F}_t \big) = p(t,T)\,F\,\Big\{ 1 - \mathbb{E}_Q\big( (1 - 1/l(T))\,\mathbf{1}_{\{V(T) < F\}} \,\big|\, \mathcal{F}_t \big) \Big\},$$
where
$$l(t) = \frac{F\,p(t,T)}{V(t)}.$$
This is again a combination of a risk-free zero-coupon bond and a European put on $1/l(t)$. If $1 + \epsilon$ is log-normally distributed with parameters
$$\mathbb{E}\log(1+\epsilon) = \gamma - \frac{\delta^2}{2}, \qquad \mathrm{Var}\log(1+\epsilon) = \delta^2, \qquad \mathbb{E}\epsilon = k = e^{\gamma} - 1$$
under the historical measure $\mathbb{P}$, one can find an equivalent martingale measure $\mathbb{P}^*$ such that the intensity of $N$ is $\tilde{\lambda} > 0$ and the $(1+\epsilon_i)$ are log-normally distributed with parameters
$$\mathbb{E}^*\log(1+\epsilon_i) = \tilde{\gamma} - \frac{\delta^2}{2}, \qquad \mathrm{Var}^*\log(1+\epsilon_i) = \delta^2, \qquad \mathbb{E}^*\epsilon_i = \tilde{k} = e^{\tilde{\gamma}} - 1$$
under $\mathbb{P}^*$. The put price can then be written in closed form, with parameters including $\lambda' = \tilde{\lambda}(1 + \tilde{k})$.
Theorem 9.3. 1 . The price of a credit risky bond is given by
p(O, T) = p(O, T)F
( ( X T) exp{ -A' T} �1
n
+
with the leverage $l_0$ and
$$d_{1,n} = \frac{\log(V_0/F) + (r_n + \sigma_n^2/2)\,T}{\sigma_n\sqrt{T}} = d_{2,n} + \sigma_n\sqrt{T}.$$
9.3.3 Structural Model with Premature Default
We discuss a setting first used in Black and Cox (1976), following the exposition of Bielecki and Rutkowski (2002), §3.2. We again assume that the value of the firm is modelled under the risk-neutral measure $\mathbb{P}^*$ as
$$dV_t = V_t\big( (r - \kappa)\,dt + \sigma\,dW_t \big),$$
where $\kappa \ge 0$ represents the payout ratio (continuous dividend payment), $\sigma > 0$ is the (constant) volatility and $r$ is the (constant) short-term interest rate. Now safety covenants for the bondholders, which give the bondholders the right to force the firm into bankruptcy or reorganization if the firm does not meet ex-ante specified standards, are modelled in terms of a time-dependent deterministic barrier
$$\bar{v}(t) = K e^{-\gamma(T-t)}, \qquad t \in [0,T),$$
with some constant $K$. As soon as the value of the firm crosses this lower threshold the bondholders take over the firm. Otherwise default takes place at maturity of debt depending on whether or not $V_T < L$. So
$$v_t = \begin{cases} \bar{v}(t) & \text{for } t < T, \\ L & \text{for } t = T. \end{cases}$$
We can define the default time $\tau$ as
$$\tau = \inf\{t \in [0,T] : V_t < v_t\} = \bar{\tau} \wedge \hat{\tau},$$
where $\bar{\tau} = \inf\{t \in [0,T) : V_t < \bar{v}(t)\}$ and $\hat{\tau} = T\,\mathbf{1}_{\{V_T < L\}} + \infty\,\mathbf{1}_{\{V_T \ge L\}}$. Observe that $\tau$ is a stopping time of the asset's filtration. Also, the recovery process is $Z = \beta_2 V$ and the recovery payoff is $\tilde{X} = \beta_1 V_T$ with constants $\beta_1, \beta_2 \in [0,1]$, i.e. these processes are proportional to the firm value. Finally,
$$\bar{v}(t) = K e^{-\gamma(T-t)} \le L\,p(t,T) = L e^{-r(T-t)} \qquad \forall t \in [0,T].$$
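Thus the payoff to the bondholder at the default time never exceeds the value of debt discounted at the risk-free rate. Mathematically, default is triggered at a first-passage time.
We use the risk-neutral valuation approach to find the price of the defaultable bond on $\{\tau > t\}$:
$$\begin{aligned}
p^d(t,T) ={}& \mathbb{E}^*\big( L e^{-r(T-t)}\,\mathbf{1}_{\{\bar{\tau} \ge T,\, V_T \ge L\}} \,\big|\, \mathcal{F}_t \big) && \text{(no default)} \\
&+ \mathbb{E}^*\big( \beta_1 V_T e^{-r(T-t)}\,\mathbf{1}_{\{\bar{\tau} \ge T,\, V_T < L\}} \,\big|\, \mathcal{F}_t \big) && \text{(default at } T\text{)} \\
&+ \mathbb{E}^*\big( K\beta_2 e^{-\gamma(T-\bar{\tau})} e^{-r(\bar{\tau}-t)}\,\mathbf{1}_{\{t < \bar{\tau} < T\}} \,\big|\, \mathcal{F}_t \big) && \text{(default at } t < \tau < T\text{)} \\
={}& D_1(t,T) + D_2(t,T) + D_3(t,T),
\end{aligned}$$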
say. To evaluate these conditional expectations, we need the following results on the distribution of the first-passage time of Brownian motion with drift. So let $Y$ be given by
$$Y_t = Y_0 + \sigma_Y W_t + \nu t, \qquad Y_0 > 0,$$
and define $\tau_Y = \inf\{t \ge 0 : Y_t \le 0\}$. Then, for any $t < s$, we have on $\{\tau_Y > t\}$
$$\mathbb{P}(\tau_Y \le s \mid \mathcal{F}_t) = \Phi\Big( \frac{-Y_t - \nu(s-t)}{\sigma_Y\sqrt{s-t}} \Big) + e^{-2\nu\sigma_Y^{-2} Y_t}\,\Phi\Big( \frac{-Y_t + \nu(s-t)}{\sigma_Y\sqrt{s-t}} \Big). \qquad (9.2)$$
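The first-passage distribution of a drifted Brownian motion is easy to evaluate numerically. The sketch below is illustrative only: the function name and the parameter values are hypothetical, and the formula coded is the standard one, $\Phi\big((-y - \nu u)/(\sigma\sqrt{u})\big) + e^{-2\nu y/\sigma^2}\Phi\big((-y + \nu u)/(\sigma\sqrt{u})\big)$ for current level $y > 0$ and horizon $u = s - t$.

```python
from math import erf, exp, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def first_passage_prob(y, nu, sigma, u):
    """Probability that Y hits 0 within the next u time units, given Y_t = y > 0."""
    s = sigma * sqrt(u)
    return (norm_cdf((-y - nu * u) / s)
            + exp(-2.0 * nu * y / sigma ** 2) * norm_cdf((-y + nu * u) / s))


# Hypothetical example: log-distance 0.3 to the barrier, slight drift towards it.
horizons = (0.5, 1.0, 2.0, 5.0)
probs = [first_passage_prob(0.3, -0.05, 0.2, u) for u in horizons]

# Driftless case: the formula reduces to the reflection principle 2 * Phi(-y / (sigma sqrt(u))).
driftless = first_passage_prob(0.5, 0.0, 0.2, 1.0)
```

In the driftless case the two terms coincide, which is the reflection principle; for a fixed starting level the hitting probability increases with the horizon.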
Recall that we have met the first-passage problem for a drifting Brownian motion already, in §6.3.3 on barrier options. We dealt with it by using Girsanov's theorem to reduce to the driftless case, and the reflection principle; see e.g. Rogers and Williams (1994), I, (13.10) or Harrison (1985), §1.8. We may thus obtain an explicit formula for the joint distribution. Again for $t < s$ and $y \ge 0$ on the set $\{\tau_Y > t\}$
$$\mathbb{P}(Y_s \ge y,\ \tau_Y > s \mid \mathcal{F}_t) = \Phi\Big( \frac{-y + Y_t + \nu(s-t)}{\sigma_Y\sqrt{s-t}} \Big) - e^{-2\nu\sigma_Y^{-2} Y_t}\,\Phi\Big( \frac{-y - Y_t + \nu(s-t)}{\sigma_Y\sqrt{s-t}} \Big). \qquad (9.3)$$
Now let $Y_t = \log(V_t/\bar{v}(t))$ and $y = \log(x/\bar{v}(s))$, so that $\sigma_Y = \sigma$ and the drift of $Y$ is $\nu = \tilde{\nu} := r - \kappa - \gamma - \sigma^2/2$. An application of (9.2) gives for every $t < s \le T$
$$\mathbb{P}^*(\bar{\tau} \le s \mid \mathcal{F}_t) = \Phi\Big( \frac{\log(\bar{v}(t)/V_t) - \tilde{\nu}(s-t)}{\sigma\sqrt{s-t}} \Big) + \Big( \frac{\bar{v}(t)}{V_t} \Big)^{2\tilde{a}}\,\Phi\Big( \frac{\log(\bar{v}(t)/V_t) + \tilde{\nu}(s-t)}{\sigma\sqrt{s-t}} \Big), \qquad (9.4)$$
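with $\tilde{a} = \tilde{\nu}/\sigma^2 = (r - \kappa - \gamma - \sigma^2/2)/\sigma^2$. By Equation (9.3), we find that for every $t < s \le T$ and $x \ge \bar{v}(s)$ we have on $\{\bar{\tau} > t\}$
$$\mathbb{P}^*(V_s \ge x,\ \bar{\tau} \ge s \mid \mathcal{F}_t) = \Phi\Big( \frac{\log(V_t/\bar{v}(t)) - \log(x/\bar{v}(s)) + \tilde{\nu}(s-t)}{\sigma\sqrt{s-t}} \Big) - \Big( \frac{\bar{v}(t)}{V_t} \Big)^{2\tilde{a}}\,\Phi\Big( \frac{-\log(V_t/\bar{v}(t)) - \log(x/\bar{v}(s)) + \tilde{\nu}(s-t)}{\sigma\sqrt{s-t}} \Big). \qquad (9.5)$$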
Proposition 9.3.2. Set $\nu = r - \kappa - \sigma^2/2$, $\tilde{\nu} = \nu - \gamma$ and $\tilde{a} = \tilde{\nu}\sigma^{-2}$. Assume that $\tilde{\nu}^2 + 2\sigma^2(r - \gamma) > 0$. Then the price process of a defaultable bond on $\{\tau > t\}$ equals
$$\begin{aligned}
p^d(t,T) ={}& L\,p(t,T)\big( \Phi(h_1(V_t, T-t)) - R_t^{2\tilde{a}}\,\Phi(h_2(V_t, T-t)) \big) \\
&+ \beta_1 V_t e^{-\kappa(T-t)}\big( \Phi(h_3(V_t, T-t)) - \Phi(h_4(V_t, T-t)) \big) \\
&+ \beta_1 V_t e^{-\kappa(T-t)} R_t^{2\tilde{a}+2}\big( \Phi(h_5(V_t, T-t)) - \Phi(h_6(V_t, T-t)) \big) \\
&+ \beta_2 V_t\big( R_t^{\theta+\zeta}\,\Phi(h_7(V_t, T-t)) + R_t^{\theta-\zeta}\,\Phi(h_8(V_t, T-t)) \big),
\end{aligned}$$
where $R_t = \bar{v}(t)/V_t$, $\theta = \tilde{a} + 1$, $\zeta = \sigma^{-2}\sqrt{\tilde{\nu}^2 + 2\sigma^2(r - \gamma)}$ and
$$\begin{aligned}
h_1(V_t, T-t) &= \frac{\log(V_t/L) + \nu(T-t)}{\sigma\sqrt{T-t}}, \\
h_2(V_t, T-t) &= \frac{\log \bar{v}^2(t) - \log(L V_t) + \nu(T-t)}{\sigma\sqrt{T-t}}, \\
h_3(V_t, T-t) &= \frac{\log(L/V_t) - (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \\
h_4(V_t, T-t) &= \frac{\log(K/V_t) - (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \\
h_5(V_t, T-t) &= \frac{\log \bar{v}^2(t) - \log(L V_t) + (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \\
h_6(V_t, T-t) &= \frac{\log \bar{v}^2(t) - \log(K V_t) + (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}}, \\
h_7(V_t, T-t) &= \frac{\log(\bar{v}(t)/V_t) + \zeta\sigma^2(T-t)}{\sigma\sqrt{T-t}}, \\
h_8(V_t, T-t) &= \frac{\log(\bar{v}(t)/V_t) - \zeta\sigma^2(T-t)}{\sigma\sqrt{T-t}}.
\end{aligned}$$
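Proof (Outline). We need to evaluate
$$\begin{aligned}
D_1(t,T) &= L e^{-r(T-t)}\,\mathbb{P}^*(\bar{\tau} \ge T,\ V_T \ge L \mid \mathcal{F}_t), \\
D_2(t,T) &= \beta_1 e^{-r(T-t)}\,\mathbb{E}^*\big( V_T \mathbf{1}_{\{\bar{\tau} \ge T,\, V_T < L\}} \,\big|\, \mathcal{F}_t \big), \\
D_3(t,T) &= K\beta_2 e^{-\gamma T + rt}\,\mathbb{E}^*\big( e^{(\gamma - r)\bar{\tau}}\,\mathbf{1}_{\{t < \bar{\tau} < T\}} \,\big|\, \mathcal{F}_t \big).
\end{aligned}$$
We consider only $t = 0$ for convenience (for the general case apply the strong Markov property). Now the formula for $D_1(0,T)$ follows from Equation (9.5). To compute $D_2(0,T)$ we observe, again using (9.5), that
$$\mathbb{E}^*\big( V_T \mathbf{1}_{\{V_T < L,\, \bar{\tau} \ge T\}} \big) = \int_K^L x\,d\mathbb{P}^*(V_T < x,\ \bar{\tau} \ge T),$$
where the distribution function is given explicitly by (9.5) and the resulting integral can be evaluated by a tedious calculation. Finally, for $D_3(0,T)$, i.e. the case of bankruptcy before maturity $T$, we use (9.4) to get
$$\mathbb{E}^*\big( e^{(\gamma - r)\bar{\tau}}\,\mathbf{1}_{\{\bar{\tau} < T\}} \big) = \int_0^T e^{(\gamma - r)u}\,d\mathbb{P}^*(\bar{\tau} \le u),$$
where again the last integral can be evaluated. $\Box$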
Black and Cox (1976) also suggest a way to model the strict priority rule. Assume $\beta_1 = \beta_2 = 1$, i.e. no bankruptcy costs, and that the firm's debt can be classified into junior and senior bonds with the same maturity $T$. At maturity, payments to the holders of junior bonds are only made if the promised payments to the holders of senior bonds have been made. So let
$$L = L_s + L_j$$
be the total face value of the firm's liabilities ($L_s$ resp. $L_j$ is the face value of senior resp. junior bonds). Denote by $p^d(t,T; L, \bar{v})$ the price of a defaultable bond with face value $L$ and barrier function $\bar{v}$. Then on $\{\tau > t\}$ the value of senior debt is
$$p_s^d(t,T) = p^d(t,T; L_s, \bar{v}),$$
and at time of default it is $\min\{\bar{v}(\tau), L_s\,p(\tau,T)\}$ for $\tau < T$. Now on $\{\tau > t\}$ the value of junior debt is
$$p_j(t,T) = p^d(t,T) - p_s^d(t,T) = p^d(t,T; L, \bar{v}) - p^d(t,T; L_s, \bar{v}),$$
and at $\tau < T$ it equals $\min\{\bar{v}(\tau) - L_s\,p(\tau,T),\ L_j\,p(\tau,T)\}$. If $\bar{v}(t) = K p(t,T)$ for some constant $K \le L$, we get
$$p_j(t,T) = \begin{cases} L_j\,p(t,T) & \text{if } K = L, \\ p^d(t,T) - L_s\,p(t,T) & \text{if } L_s \le K < L, \\ p^d(t,T) - p_s^d(t,T) & \text{if } K < L_s. \end{cases}$$
9.3.4 Structural Model with Stochastic Interest Rates
We now introduce stochastic interest rates into the modelling framework, as e.g. in Black and Cox (1976), Longstaff and Schwartz (1995) and Briys and de Varenne (1997). We assume the same dynamics of the firm value process as before, i.e.
$$dV_t = V_t\big( (r_t - \kappa_t)\,dt + \sigma(t)\,dW_t \big), \qquad (9.6)$$
where $r_t$, $\kappa_t$ and $\sigma(t)$ are suitable processes: $r_t$ models the stochastic short rate, $\kappa_t$ is the dividend rate and $\sigma(t)$ specifies the volatility. The dynamics of default-free bond prices are given by
$$dp(t,T) = p(t,T)\big( r_t\,dt + b(t,T)\,dW_t \big). \qquad (9.7)$$
We consider the forward value of the firm $F_V(t,T) = V_t/p(t,T)$ under the $T$-forward measure $\mathbb{P}^T$. Now
$$dF_V(t,T) = F_V(t,T)\big( -\kappa_t\,dt + (\sigma(t) - b(t,T))\,dW_t^T \big), \qquad (9.8)$$
where $W^T$ is a $\mathbb{P}^T$-Brownian motion. Now $F_V^{\kappa}(t,T) = F_V(t,T)\,e^{-\int_t^T \kappa(u)\,du}$ has the dynamics
$$dF_V^{\kappa}(t,T) = F_V^{\kappa}(t,T)\,(\sigma(t) - b(t,T))\,dW_t^T. \qquad (9.9)$$
If $(\sigma(t) - b(t,T))$ is deterministic, $F_V^{\kappa}$ is a Gaussian process. In case of a boundary function of the form $K p(t,T)\,e^{-\int_t^T \kappa(u)\,du}$ one can exploit this property to again obtain an analytic expression for the forward value of a defaultable bond. The main steps are an application of the change-of-numeraire technique to transfer the valuation problem to the forward values (as outlined above), and, in case $(\sigma(t) - b(t,T))$ is constant, calculation of first-passage times similar to the ones performed in the last section. If $(\sigma(t) - b(t,T))$ is not constant, one has to perform a deterministic time change to be able to use the technique above. For background on time-changes of Brownian motion, we refer to e.g. Revuz and Yor (1991), V.1.

Note. In Equation (9.8), the key component is the 'volatility' coefficient, $(\sigma(t) - b(t,T))$, which we need to be deterministic in order to find explicit formulae. This is a strong restriction, but does include models with both volatility and interest rates being stochastic.

9.3.5 Optimal Capital Structure - Leland's Approach
The basic idea is that equity holders (= owners of the firm) can choose the bankruptcy policy in such a way that the value of equity is maximized (or the value of debt is minimized). We assume the standard model with $\kappa = 0$ and $r > 0$ constant. Also, the outstanding debt is a consol bond, i.e. a bond with infinite maturity, which pays continuously at a constant rate $c$. Its price $D_c(t)$ at any date $t \in \mathbb{R}_+$ equals
$$D_c(t) = \lim_{T \to \infty} \mathbb{E}^*\Big( \int_t^T c\,e^{-r(s-t)}\,\mathbf{1}_{\{\bar{\tau} > s\}}\,ds \,\Big|\, \mathcal{F}_t \Big) + \lim_{T \to \infty} \mathbb{E}^*\big( K\beta_2 e^{\gamma(\bar{\tau}-T)} e^{-r(\bar{\tau}-t)}\,\mathbf{1}_{\{t < \bar{\tau} < T\}} \,\big|\, \mathcal{F}_t \big),$$
which can be evaluated in closed form using the first-passage results of §9.3.3; the resulting expression involves the constants $\zeta = \sigma^{-2}\sqrt{\tilde{\nu}^2 + 2\sigma^2(r - \gamma)}$ and $\tilde{\zeta} = \sigma^{-2}\sqrt{\tilde{\nu}^2 + 2\sigma^2 r}$. Now $\gamma = 0$, $\bar{v}(t) = \bar{v}$, $\kappa = 0$ and $\beta_2 = 1$ gives
$$D_c(t) = \frac{c}{r}(1 - q_t) + \bar{v}\,q_t, \qquad \text{where } q_t = \Big( \frac{\bar{v}}{V_t} \Big)^{\alpha} \text{ (with } \alpha = 2r/\sigma^2\text{)}.$$
It is in the stockholders' interest to select the bankruptcy level in such a way that the value of the firm's equity $E(V_t)$ is maximized. Now
$$E(V_t) = V_t - D_c(t) = V_t - \frac{c}{r}(1 - q_t) - \bar{v}\,q_t.$$
This can be solved and gives
$$\bar{v}^* = \frac{c}{r}\,\frac{\alpha}{\alpha + 1},$$
which (as expected) does not depend on the current value of the firm. One can also express $D_c(t)$ then in terms of $\bar{v}^*$:
$$D_c^*(t) = \frac{c}{r} - \frac{1}{\alpha V_t^{\alpha}}\,(\bar{v}^*)^{\alpha + 1}.$$
Leland (1994) includes
a. Bankruptcy costs: At time of default, bondholders receive $\beta_2 \bar{v}$, $\beta_2 \in [0,1]$. So $D_t = \frac{c}{r}(1 - q_t) + \beta_2 \bar{v}\,q_t$, and we have bankruptcy costs $B(V_t) = (1 - \beta_2)\,\bar{v}\,q_t$.
b. Tax benefits: As long as the firm is solvent, there is a tax deduction. This can be interpreted as a security that pays a constant coupon $\tilde{c} < c$. After default, tax benefits cannot be claimed:
$$T(V_t) = \frac{\tilde{c}}{r}(1 - q_t).$$
Use $\tilde{c} = \mu c$, where $\mu$ is the corporate tax rate.
So the total value of the firm equals
$$G(V_t) = V_t + T(V_t) - B(V_t) = V_t + \frac{\tilde{c}}{r}(1 - q_t) - (1 - \beta_2)\,\bar{v}\,q_t,$$
where the second term gives the present value of future tax refunds, and the third term gives the present value of bankruptcy costs. Then we obtain for equity
$$E(V_t) = G(V_t) - D(V_t) = V_t - \frac{c - \tilde{c}}{r}(1 - q_t) - \bar{v}\,q_t.$$
This implies no bankruptcy costs for equity holders! The same methods as previously give v* r.:.;;l/2 . One can also optimize the coupon rate at t 0 : c c + Vo (r + (7 2 /2) . =
=
=
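The optimal bankruptcy level in the basic Leland setting (no taxes, no bankruptcy costs) can be verified numerically. The sketch below is illustrative and rests on the assumptions stated in the comments: equity is $E = V - \frac{c}{r}(1 - q) - \bar{v} q$ with $q = (\bar{v}/V)^{\alpha}$ and $\alpha = 2r/\sigma^2$; the function names and parameter values are hypothetical.

```python
def equity_value(V, v_bar, c, r, sigma):
    """Equity value for firm value V, bankruptcy level v_bar and consol coupon c."""
    alpha = 2.0 * r / sigma ** 2
    q = (v_bar / V) ** alpha
    return V - (c / r) * (1.0 - q) - v_bar * q


def optimal_barrier(c, r, sigma):
    """Closed-form maximiser (c / r) * alpha / (alpha + 1)."""
    alpha = 2.0 * r / sigma ** 2
    return (c / r) * alpha / (alpha + 1.0)


def consol_value(V, v_bar, c, r, sigma):
    """Debt value (c / r)(1 - q) + v_bar * q."""
    alpha = 2.0 * r / sigma ** 2
    q = (v_bar / V) ** alpha
    return (c / r) * (1.0 - q) + v_bar * q


# Hypothetical parameters.
V, c, r, sigma = 100.0, 5.0, 0.05, 0.2
alpha = 2.0 * r / sigma ** 2
v_star = optimal_barrier(c, r, sigma)

# Brute-force check: maximise equity over a grid of candidate bankruptcy levels.
grid = [0.01 * k for k in range(1, 10000)]
v_grid = max(grid, key=lambda v: equity_value(V, v, c, r, sigma))

# Debt value at the optimum, in the form c/r - (v*)^(alpha+1) / (alpha V^alpha).
debt_star = c / r - v_star ** (alpha + 1.0) / (alpha * V ** alpha)
```

The grid maximiser agrees with the closed form, which can also be written as $c/(r + \sigma^2/2)$ and does not depend on $V$.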
9.4 Reduced Form Models
9.4.1 Intensity-based Valuation of a Defaultable Claim
Motivation. In the intensity-based valuation approach, the default time $\tau$ is assumed to be a non-negative random variable with $\mathbb{P}^*(\tau < \infty) = 1$, $\mathbb{P}^*(\tau = 0) = 0$, $\mathbb{P}^*(\tau > t) > 0$ (we assume $\mathbb{P}^*$ is a risk-neutral measure). As a useful example we can think of $\tau$ as an exponentially distributed random variable with intensity parameter $\lambda$; then the time of default can be thought of as the time of the first jump of the corresponding Poisson process (see §5.4). We can make this framework more general by assuming that we have a Poisson process with rate (or intensity) $\gamma(t)$, see Definition 5.4.2. Then, with
$$\Gamma_t = \int_0^t \gamma(u)\,du,$$
we can calculate the survival probability
$$\mathbb{P}^*(\tau > t) = \exp\Big\{ -\int_0^t \gamma(u)\,du \Big\}.$$
We can go a step further and assume that the intensity $\gamma$ is a stochastic process as well, and define a counting process $N_t^* = N(\Gamma(t))$ with $N$ a standard Poisson process independent of $\gamma$. Then if $\tau$ is the time of the first jump of the process $N^*$, we obtain for the survival probability
$$\mathbb{P}^*(\tau > t) = \mathbb{P}^*(\Gamma(\tau) > \Gamma(t)) = \mathbb{E}^*\Big( \exp\Big\{ -\int_0^t \gamma(u)\,du \Big\} \Big). \qquad (9.10)$$
We can differentiate (9.10) to obtain an expression for the density of $\tau$:
$$\mathbb{P}^*(\tau \in dt) = \mathbb{E}^*\Big( \gamma_t \exp\Big\{ -\int_0^t \gamma(u)\,du \Big\} \Big)\,dt.$$
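The construction of the default time as the first jump of a time-changed Poisson process suggests a direct simulation recipe: draw a unit exponential variable $E$ (the first jump time of a standard Poisson process) and set $\tau = \Gamma^{-1}(E)$. The sketch below assumes, purely for illustration, a deterministic linear intensity $\gamma(t) = a + bt$ with $b > 0$, so that the cumulative hazard can be inverted in closed form; names and parameters are hypothetical.

```python
import random
from math import exp, sqrt


def cumulative_hazard(t, a, b):
    """Gamma(t) for the assumed intensity gamma(t) = a + b * t."""
    return a * t + 0.5 * b * t * t


def default_time(unit_exponential, a, b):
    """Solve Gamma(tau) = E for tau (positive root of the quadratic, b > 0)."""
    return (-a + sqrt(a * a + 2.0 * b * unit_exponential)) / b


rng = random.Random(0)  # fixed seed so the experiment is reproducible
a, b = 0.02, 0.01
samples = [default_time(rng.expovariate(1.0), a, b) for _ in range(20000)]

t = 5.0
empirical_survival = sum(tau > t for tau in samples) / len(samples)
exact_survival = exp(-cumulative_hazard(t, a, b))
```

The empirical survival frequency should agree with $\exp\{-\Gamma_t\}$ up to Monte Carlo error. For a stochastic intensity the same recipe applies path by path, with the cumulative hazard computed along each simulated intensity path.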
Now consider a defaultable zero-coupon bond with face value $X = 1$ and a deterministic payment at maturity $T$ of $Z(\tau)$ in case of premature default at $\tau$. We assume the short rate $r(t)$ to be deterministic as well, so the time $t$ price of a default-free bond with maturity $T$ is $p(t,T) = e^{-\int_t^T r(u)\,du}$. The risk-neutral valuation formula implies that the value of the defaultable bond at $t = 0$ is
$$\begin{aligned}
p^d(0,T) &= \mathbb{E}^*\Big( e^{-\int_0^T r(u)\,du}\big( \mathbf{1}_{\{\tau > T\}} + Z(\tau)\mathbf{1}_{\{\tau \le T\}} \big) \Big) \\
&= p(0,T) - \mathbb{E}^*\Big( e^{-\int_0^T r(u)\,du}\,(1 - Z(\tau))\,\mathbf{1}_{\{\tau \le T\}} \Big) \\
&= p(0,T) - \mathbb{E}^*\Big( e^{-\int_0^T r(u)\,du} \int_0^T (1 - Z(u))\,\gamma(u)\,e^{-\int_0^u \gamma(s)\,ds}\,du \Big).
\end{aligned}$$
These pricing formulas are the key ingredients for various reduced-form models. Modifications are related to different modelling assumptions, such as recovery rates and seniority.

The Model. Following the exposition in Bielecki and Rutkowski (2002), we now outline a general modelling approach. We assume the setting of the basic model (see §9.2). In particular, we have an underlying stochastic basis $(\Omega, \mathcal{G}, \mathbb{P}^*, \mathbb{F})$ with $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T^*}$, $\mathcal{F}_t \subset \mathcal{G}$. Now, the default time $\tau$ of a defaultable claim is a non-negative random variable with $\mathbb{P}^*(\tau < \infty) = 1$, $\mathbb{P}^*(\tau = 0) = 0$, $\mathbb{P}^*(\tau > t) > 0$. The associated jump process $H = (H_t) = (\mathbf{1}_{\{\tau \le t\}})$ is called the default process. Define $\mathbb{H}$ as the filtration generated by $H$, i.e. $\mathcal{H}_t = \sigma(H_u : u \le t) = \sigma(\{\tau \le u\} : u \le t)$. Consider the enlarged filtration $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$ with $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}_t = \sigma(\mathcal{F}_t, \mathcal{H}_t)$. $\tau$ is not necessarily a stopping time w.r.t. $\mathbb{F}$, but $\tau$ is a stopping time w.r.t. $\mathbb{G}$. We want to value defaultable claims within the framework of a financial market model for which we assume that $\mathbb{F}$ contains the market information (and $\mathbb{H}$ contains the information on the default time). In this market, we assume the existence of a savings account
$$B_t = \exp\Big\{ \int_0^t r_u\,du \Big\},$$
where $r$ is the short-term interest rate process. We assume the existence of an equivalent martingale measure $\mathbb{P}^*$ (risk-neutral measure) such that the discounted price process of any tradeable security, which pays no dividends or coupons, follows a $\mathbb{G}$-martingale under $\mathbb{P}^*$. Recall that the cash-flow process $D_t$ (payments from time $t$ on) of a defaultable claim equals
$$D_t = X^d(T)\,\mathbf{1}_{\{t \ge T\}} + \int_{(0,t]} Z_u\,dH_u,$$
where the pay-out at maturity $T$ is $X^d(T) = X\mathbf{1}_{\{\tau > T\}} + \tilde{X}\mathbf{1}_{\{\tau \le T\}}$ and the pay-out in case of premature default is modelled by $Z$. Let $X^d(t,T)$ be the price process of the defaultable claim. Using the risk-neutral valuation formula we obtain
$$X^d(t,T) = B_t\,\mathbb{E}^*\Big( \int_{(t,T]} B_u^{-1}\,dD_u \,\Big|\, \mathcal{G}_t \Big). \qquad (9.11)$$
The supporting argument (as usual) is that $X^d(t,T)$ does not give rise to arbitrage in a previously arbitrage-free market. The aim now is to evaluate Formula (9.11) for various specifications of the default time $\tau$. Mathematically this corresponds to calculating the conditional expectation in (9.11) for the various approaches to specifying the filtration $\mathbb{G}$. In particular, we will construct models where the intensity of default is adapted to the filtration $\mathbb{F}$.
Another Specific Example. Let $F(t) = \mathbb{P}^*(\tau \le t)$ be the cumulative distribution function of $\tau$ with $F' = f$ its density (in particular we assume that $F$ is differentiable). We consider a defaultable zero-coupon bond with face value $X = 1$ and a deterministic payment at maturity $T$ of $Z$ in case of premature default. Furthermore, we assume $\mathbb{F}$ to be trivial (so $r$ is deterministic) and thus $\mathbb{G} = \mathbb{H}$. The valuation formula implies that the value of the defaultable bond at $t \le T$ is
$$p^d(t,T) = e^{-\int_t^T r(u)\,du}\,\mathbb{E}^*\big( \mathbf{1}_{\{\tau > T\}} + Z\mathbf{1}_{\{\tau \le T\}} \,\big|\, \mathcal{H}_t \big). \qquad (9.12)$$
Using our assumptions, we find on $\{t < \tau\}$, which is the information contained in $\mathcal{H}_t$,
$$\begin{aligned}
p^d(t,T) &= e^{-\int_t^T r(u)\,du}\big( 1 - (1 - Z)\,\mathbb{P}^*(\tau \le T \mid t < \tau) \big) \\
&= e^{-\int_t^T r(u)\,du}\Big( 1 - (1 - Z)\,\frac{\mathbb{P}^*(t < \tau \le T)}{\mathbb{P}^*(\tau > t)} \Big) \\
&= e^{-\int_t^T r(u)\,du}\Big( 1 - (1 - Z)\,\frac{F(T) - F(t)}{1 - F(t)} \Big) \\
&= e^{-\int_t^T r(u)\,du}\Big( \frac{1 - F(T)}{1 - F(t)} + Z\,\frac{F(T) - F(t)}{1 - F(t)} \Big).
\end{aligned}$$
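We can now introduce the hazard function $\Gamma$, defined by
$$\Gamma(t) = -\log(1 - F(t)).$$
Then
$$\Gamma'(t) = \gamma(t) = \frac{f(t)}{1 - F(t)}, \qquad (9.13)$$
and we call $\gamma$ the intensity function. Then the survival probability is
$$\mathbb{P}^*(\tau > t) = 1 - F(t) = e^{-\Gamma(t)} = \exp\Big\{ -\int_0^t \gamma(u)\,du \Big\}.$$
We can now write the valuation formula as
$$p^d(t,T) = e^{-\int_t^T (r(u) + \gamma(u))\,du} + Z\,e^{-\int_t^T r(u)\,du}\Big( 1 - e^{-\int_t^T \gamma(u)\,du} \Big). \qquad (9.14)$$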
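For constant short rate and intensity the valuation formula collapses to elementary exponentials, which makes the role of the intensity as a rate adjustment explicit. The following is a minimal sketch under those assumptions (constant $r$ and $\gamma$, deterministic recovery $Z$ paid at maturity); the function names and the numbers are hypothetical.

```python
from math import exp, log


def defaultable_zero(r, gamma, Z, tau):
    """Bond price for constant short rate r, intensity gamma, recovery Z, maturity tau."""
    return exp(-(r + gamma) * tau) + Z * exp(-r * tau) * (1.0 - exp(-gamma * tau))


def credit_spread(r, gamma, Z, tau):
    """Spread over the default-free yield: log(p / p^d) / tau."""
    return log(exp(-r * tau) / defaultable_zero(r, gamma, Z, tau)) / tau


r, gamma, tau = 0.04, 0.02, 5.0
spread_zero_recovery = credit_spread(r, gamma, 0.0, tau)   # zero recovery
spread_partial = credit_spread(r, gamma, 0.4, tau)         # 40% recovery
spread_full_recovery = credit_spread(r, gamma, 1.0, tau)   # no loss at default
```

With zero recovery the spread equals the intensity itself; with positive recovery it lies strictly between zero and the intensity, and it vanishes when nothing is lost at default.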
Hence, in order to value a defaultable claim, one has to adjust the short rate with a default-related quantity. In case $Z = 0$, i.e. zero recovery, and constant $r$, $\gamma$, we find that the credit spread implied by the model is
$$\frac{1}{T-t}\log\frac{p(t,T)}{p^d(t,T)} = \frac{1}{T-t}\log\exp\{\gamma(T-t)\} = \gamma.$$
Thus, under these assumptions, it is very easy to calibrate the model from actual spread data (i.e. market prices).

General Modelling Framework. In the general case we set
$$F_t = \mathbb{P}^*(\tau \le t \mid \mathcal{F}_t)$$
and postulate $F_t < 1$ for all $t \in \mathbb{R}_+$. The survival process $G$ of the random time $\tau$ with respect to the filtration $\mathbb{F}$ is then defined as
$$G_t = 1 - F_t = \mathbb{P}^*(\tau > t \mid \mathcal{F}_t).$$
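Since $\{\tau \le t\} \subseteq \{\tau \le s\}$ for $0 \le t \le s$, we have
$$\mathbb{E}^*(F_s \mid \mathcal{F}_t) = \mathbb{P}^*(\tau \le s \mid \mathcal{F}_t) \ge \mathbb{P}^*(\tau \le t \mid \mathcal{F}_t) = F_t.$$
So $(F_t)$ (resp. $(G_t)$) follows a bounded, nonnegative $\mathbb{F}$-submartingale (resp. supermartingale).

Definition 9.4.1. The $\mathbb{F}$-hazard process of $\tau$ under $\mathbb{P}^*$, denoted by $\Gamma$, is defined by
$$1 - F_t = \exp\{-\Gamma_t\} \quad \text{or} \quad \Gamma_t = -\log G_t = -\log(1 - F_t) \qquad \forall t \in \mathbb{R}_+. \qquad (9.15)$$
Since $G_0 = 1$, we have $\Gamma_0 = 0$. Also, from $\mathbb{P}^*(\tau < \infty) = 1$, we find $\lim_{t \to \infty} \Gamma_t = \infty$. We will only consider the case where $\Gamma$ (and thus $F$ and $G$) is a continuous process.

Lemma 9.4.1. For any $\mathcal{G}$-measurable random variable $Y$ and any $t \le s$ we have
$$\mathbb{E}^*(\mathbf{1}_{\{\tau > s\}} Y \mid \mathcal{G}_t) = \mathbf{1}_{\{\tau > t\}}\,e^{\Gamma_t}\,\mathbb{E}^*(\mathbf{1}_{\{\tau > s\}} Y \mid \mathcal{F}_t).$$
If $Y$ is $\mathcal{F}_s$-measurable, then
$$\mathbb{E}^*(\mathbf{1}_{\{\tau > s\}} Y \mid \mathcal{G}_t) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*(e^{\Gamma_t - \Gamma_s}\,Y \mid \mathcal{F}_t).$$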
Proof. For any event $A$ belonging to the $\sigma$-field $\mathcal{G}_t$, there exists an event $B \in \mathcal{F}_t$ such that $A \cap \{\tau > t\} = B \cap \{\tau > t\}$. The usual measure-theoretic approximation argument then implies that for a $\mathcal{G}_t$-measurable random variable $Y$ there exists an $\mathcal{F}_t$-measurable random variable $\tilde{Y}$ such that $Y = \tilde{Y}$ on $\{\tau > t\}$. Furthermore, we find that for any $\mathcal{G}$-measurable random variable $Y$ and any $t \in \mathbb{R}_+$ we have
$$\mathbb{E}^*(\mathbf{1}_{\{\tau > t\}} Y \mid \mathcal{G}_t) = \mathbf{1}_{\{\tau > t\}}\,\frac{\mathbb{E}^*(\mathbf{1}_{\{\tau > t\}} Y \mid \mathcal{F}_t)}{\mathbb{P}^*(\tau > t \mid \mathcal{F}_t)}. \qquad (9.16)$$
One then only needs to use Equation (9.16) together with the tower property of conditional expectation to prove the claims. $\Box$

The final step is to establish a representation for the pre-default value of a defaultable claim in terms of the hazard process (or its intensity $\gamma$ in case it exists) of the default time. In doing so we will be able to find convenient formulations for the valuation equation (9.11) in important applications.
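Theorem 9.4.1. The value process $X^d(t,T)$ of a defaultable claim has the following representation for $t \in [0,T]$.

1. $$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,B_t\,\mathbb{E}^*\Big( \int_{(t,T]} B_u^{-1} e^{\Gamma_t - \Gamma_u} Z_u\,d\Gamma_u + B_T^{-1} X e^{\Gamma_t - \Gamma_T} \,\Big|\, \mathcal{F}_t \Big).$$

2. In case $\Gamma$ admits an intensity $\gamma$,
$$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( \int_{(t,T]} e^{-\int_t^u (r_s + \gamma_s)\,ds}\,\gamma_u Z_u\,du \,\Big|\, \mathcal{F}_t \Big) + \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( e^{-\int_t^T (r_s + \gamma_s)\,ds}\,X \,\Big|\, \mathcal{F}_t \Big).$$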
For proof and further discussion, we refer the reader to Bielecki and Rutkowski (2002), Chapter 8.

Valuation of General Defaultable Claims. We can now use Theorem 9.4.1 to value various defaultable claims. We will always assume that $\Gamma$ admits an intensity $\gamma$.

Fractional Recovery of Par (Face) Value. Let $V$ represent the claim's constant par value and $\delta$ the claim's recovery rate. Thus the pre-default value of the claim has no influence on the recovery in case of default. We set $Z_t = \delta \cdot V$, $0 \le t \le T$, and Theorem 9.4.1 yields
$$X^{\delta}(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( \int_{(t,T]} \delta V\,e^{-\int_t^u (r_s + \gamma_s)\,ds}\,\gamma_u\,du \,\Big|\, \mathcal{F}_t \Big) + \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( e^{-\int_t^T (r_s + \gamma_s)\,ds}\,V \,\Big|\, \mathcal{F}_t \Big).$$
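When the short rate and the intensity are constant, the recovery-of-par formula can be integrated in closed form. The sketch below, which is an illustration under exactly that assumption and not part of the original exposition, compares the closed form $\delta V \gamma \frac{1 - e^{-(r+\gamma)(T-t)}}{r+\gamma} + V e^{-(r+\gamma)(T-t)}$ with a midpoint-rule evaluation of the integral. Function names and inputs are hypothetical.

```python
from math import exp


def recovery_of_par_value(par, rec, r, gamma, tau):
    """Pre-default value with constant r and gamma; rec is the recovery rate."""
    a = r + gamma
    return rec * par * gamma * (1.0 - exp(-a * tau)) / a + par * exp(-a * tau)


def recovery_of_par_numeric(par, rec, r, gamma, tau, steps=20000):
    """Same quantity, with the recovery integral evaluated by the midpoint rule."""
    h = tau / steps
    integral = h * sum(exp(-(r + gamma) * (k + 0.5) * h) * gamma for k in range(steps))
    return rec * par * integral + par * exp(-(r + gamma) * tau)


# Hypothetical inputs: par 100, 40% recovery, 5% short rate, 3% intensity, 5 years.
closed_form = recovery_of_par_value(100.0, 0.4, 0.05, 0.03, 5.0)
numeric = recovery_of_par_numeric(100.0, 0.4, 0.05, 0.03, 5.0)
zero_recovery = recovery_of_par_value(100.0, 0.0, 0.05, 0.03, 5.0)
```

The recovery term adds value on top of the zero-recovery price, and the numerical integral reproduces the closed form to high accuracy.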
Fractional Recovery of No-default Value. Here, it is assumed that in case of default a fixed fraction of an equivalent non-defaultable security is received. In case of a defaultable bond this scheme is known as fractional recovery of treasury value. By the risk-neutral valuation formula, the time $t$ value of the non-defaultable equivalent security $X^e$ is given by the discounted expectation (under $\mathbb{P}^*$) of the promised payoff $X$, so
$$X^e(t,T) = \mathbb{E}^*\Big( e^{-\int_t^T r_s\,ds}\,X \,\Big|\, \mathcal{F}_t \Big).$$
Now assume that $Z_t = \delta X^e(t,T)$ with $\delta$ the recovery rate. The valuation equation (9.11), Theorem 9.4.1 and an application of Fubini's theorem yield
$$X^{e,\delta}(t,T) = (1 - \delta)\,\mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( e^{-\int_t^T (r_s + \gamma_s)\,ds}\,X \,\Big|\, \mathcal{F}_t \Big) + \delta\,\mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( e^{-\int_t^T r_s\,ds}\,X \,\Big|\, \mathcal{G}_t \Big).$$
For corporate bonds, one can find a convenient pricing formula. Let $L = 1 - \delta$ be the stochastic loss rate and $L_t = \mathbb{E}^*(L \mid \mathcal{F}_t)$ the risk-neutral mean fraction of market value lost if default occurs at time $t$. Thus $L_t$ captures all information about recovery that is important for bond pricing. The pricing formula
$$p^d(t,T) = \mathbb{E}^*\Big( \exp\Big\{ -\int_t^T (r_u + s_u)\,du \Big\} \,\Big|\, \mathcal{F}_t \Big), \qquad (9.17)$$
where $s_t = \gamma_t L_t$ is now the risk-neutral conditional expected rate of loss of market value, and the necessary technical conditions under which it holds, are given in Duffie and Singleton (1999). Building on its specific form, one can build tractable models for defaultable bond pricing parallel to the default-free models. In particular, HJM-type forward spread models can be constructed. One observes the initial default-free forward rates $f(0,T)$ and the initial forward spread rates $s(0,T)$. After specification of the volatilities $\sigma_f(t,u)$ for forward rates and $\sigma_s(t,u)$ for spreads and fractional loss at default $L$, defaultable bond prices are given as
$$p^d(t,T) = \exp\Big\{ -\int_t^T (f(t,u) + s(t,u))\,du \Big\}.$$
In addition to the standard HJM-drift condition, the spread dynamics
$$ds(t,u) = \mu_s(t,u)\,dt + \sigma_s(t,u)\,dW(t)$$
have to satisfy a drift condition (under recovery of market value)
$$\mu_s(t,u) = \sigma_s(t,u)\int_t^u \sigma_f(t,v)\,dv + \sigma_f(t,u)\int_t^u \sigma_s(t,v)\,dv,$$
see Schönbucher (1998) and Schönbucher (2003) for additional details.

Reduced-form Models with State Variables. In many applications, it is useful to think of underlying state variables that drive the economy (economic cycle) and subsequently have an influence on the default intensity. To model such state variables, we assume that $Y$ is a $d$-dimensional stochastic process defined on $(\Omega, \mathcal{G}, \mathbb{F}, \mathbb{P}^*)$ and follows an $\mathbb{F}$-Markov process under $\mathbb{P}^*$. One can then model $\tau$ as the first jump time of a Cox process, which has an intensity of the form $\lambda_t = \lambda(Y_t)$ for some function $\lambda : \mathbb{R}^d \to \mathbb{R}_+$, as e.g. in Lando (1998). Under the further assumptions that the promised payoff $X$ at $T$ of the defaultable claim is $\mathcal{F}_T$-measurable, that the recovery process $Z$ is $\mathbb{F}$-predictable and, finally, that the short-rate process satisfies $r_t = r(Y_t)$ for some function $r : \mathbb{R}^d \to \mathbb{R}$, we can apply Theorem 9.4.1 to obtain

Proposition 9.4.1. The price process of a defaultable claim with the above specifications is given by
$$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\Big( \int_t^T e^{-\int_t^u (r(Y_s) + \lambda(Y_s))\,ds}\,\lambda(Y_u) Z_u\,du + e^{-\int_t^T (r(Y_s) + \lambda(Y_s))\,ds}\,X \,\Big|\, \mathcal{F}_t \Big).$$
Motivated by the pricing formula (9.17), Duffie and Singleton (1997), Duffie and Singleton (1999) and Duffee (1999) used an econometric model for the term structure of credit spreads. They modelled the short-rate process and the short-spread process using underlying square-root process state variables $X_1, X_2, X_3$. They assumed that the state variables satisfy
$$\begin{aligned}
dX_1(t) &= [\kappa_{11}(\theta_1 - X_1(t)) + \kappa_{12}(\theta_2 - X_2(t))]\,dt + \sqrt{X_1(t)}\,dW_1(t), \\
dX_2(t) &= \kappa_{22}(\theta_2 - X_2(t))\,dt + \sigma_{22}\sqrt{X_2(t)}\,dW_2(t), \\
dX_3(t) &= \kappa_{33}(\theta_3 - X_3(t))\,dt + \sigma_{32}\sqrt{X_2(t)}\,dW_2(t) + \sigma_{33}\sqrt{X_3(t)}\,dW_3(t).
\end{aligned}$$
Conditions on the coefficients are needed to ensure that $s(t) > 0$ (positive affine function of correlated square-root diffusions) and that the correlation between $s$ and $r$ is negative.

9.4.2 Rating-based Models
Usually there is a deterioration of credit quality until risky debt goes into default mode - this is called credit migration. Credit quality corresponds to the probability that a firm will be able to meet its contractual obligations. Ordering firms according to their default probabilities leads to (discrete) rating systems. These rating systems allow empirical calculation of transition matrices. Table 9.1 shows Standard and Poor's cumulative default rates (percent) ordered according to the rating classes of S&P.

Rating    Year 1      2      3      4      5      6      7      8      9     10
AAA         0.00   0.00   0.05   0.11   0.17   0.31   0.47   0.76   0.87   1.00
AA          0.00   0.02   0.07   0.15   0.27   0.43   0.62   0.77   0.85   0.96
A           0.04   0.12   0.21   0.36   0.56   0.76   1.01   1.34   1.69   2.06
BBB         0.24   0.54   0.85   1.52   2.19   2.91   3.52   4.09   4.55   5.03
BB          1.01   3.40   6.32   9.38  12.38  15.72  17.77  20.03  22.05  23.69
B           5.45  12.36  19.03  24.28  28.38  31.66  34.73  37.58  40.02  42.24
CCC        23.69  33.52  41.13  47.43  54.25  56.37  57.94  58.40  59.52  60.91

Table 9.1. Standard and Poor's cumulative default rates (percent)

One can now assume that the credit quality of a firm is a continuous-time Markov chain $M$ on a finite state space $E = \{1, \ldots, K, K+1\}$ (the rating classes) with transition probability matrix $P(t)$. We assume that the state $K+1$ corresponds to bankruptcy and that a bankrupt firm remains in that state. As usual, we call $p_{i,j}(t)$ the probability that at time $t$ the process, which started in state $i$, is in state $j$. Thus $p_{K+1,i}(t) = 0$, $i = 1, \ldots, K$, and $p_{K+1,K+1}(t) = 1$. The semigroup of $M$ is
$$P(t) = \exp(t\Lambda) = \sum_{k=0}^{\infty} \frac{(t\Lambda)^k}{k!}, \qquad (9.18)$$
where $\Lambda$ is the generator matrix
$$\Lambda = \begin{pmatrix}
-\lambda_{1,1} & \lambda_{1,2} & \cdots & \cdots \\
\lambda_{2,1} & -\lambda_{2,2} & \cdots & \cdots \\
\vdots & & \ddots & \vdots \\
\lambda_{K-1,1} & \lambda_{K-1,2} & \cdots\ -\lambda_{K-1,K-1} & \lambda_{K-1,K} \\
0 & 0 & \cdots & 0
\end{pmatrix},$$
where by definition
$$\lambda_{i,j} = \lim_{t \to 0} \frac{p_{i,j}(t) - p_{i,j}(0)}{t}.$$
We have $\lambda_{i,j} \ge 0$, $i \ne j$, and $\sum_j \lambda_{i,j} = 0$, $\forall i$. Now assume that a firm has rating $k(t)$ at time $t$. Let $\lambda_{i,j}(X_t) > 0$ be the state-dependent, risk-neutral transition intensity from rating $i$ to $j$ (where $X_t$ is a suitable process for the state variable). With $r(t) = r(X_t)$, defaultable bond prices are as usual
$$p_d(X_t, k_t, t, T) = \mathbb{E}\Big( e^{-\int_t^T r(u)\,du}\,\mathbf{1}_{\{\tau > T\}} \,\Big|\, \mathcal{G}_t \Big) + \mathbb{E}\Big( e^{-\int_t^{\tau} r(u)\,du}\,\delta\,\mathbf{1}_{\{\tau \le T\}} \,\Big|\, \mathcal{G}_t \Big),$$
where $\delta$ is the recovery rate. Now assume fractional recovery of market value, so $\delta = (1 - L)\,p_d(\tau-, T)$ with a constant $L$. Then
$$p_d(X_t, k_t, t, T) = \mathbb{E}\Big( e^{-\int_t^T (r(u) + h(u)L)\,du} \,\Big|\, \mathcal{G}_t \Big),$$
where $h(t) = \lambda_{k_t, K+1}(X_t)$ is the rate of transition from the current rating into default. For further details on rating-based models, we refer the reader to Jarrow, Lando, and Turnbull (1997), Lando (1998), Lando (2000) and Bielecki and Rutkowski (2002), Chapters 11 and 12.
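The semigroup relation $P(t) = \exp(t\Lambda)$ can be illustrated with a small generator. The sketch below is a toy example: the three-state generator (two rating classes plus an absorbing default state) and its intensities are hypothetical, the diagonal is chosen so that each row sums to zero, and the matrix exponential is approximated by a truncated power series in pure Python.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transition_matrix(generator, t, terms=40):
    """Truncated series sum_k (t * generator)^k / k! for the transition matrix P(t)."""
    n = len(generator)
    scaled = [[t * x for x in row] for row in generator]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term (t * generator)^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical generator: states 0 and 1 are rating classes, state 2 is default.
generator = [
    [-0.11, 0.10, 0.01],
    [0.05, -0.15, 0.10],
    [0.00, 0.00, 0.00],   # default is absorbing
]
P1 = transition_matrix(generator, 1.0)
P2 = transition_matrix(generator, 2.0)
P1_squared = mat_mul(P1, P1)
```

The last column of $P(t)$ contains cumulative default probabilities per starting rating, which increase with the horizon, and the semigroup property $P(2) = P(1)P(1)$ holds up to rounding. For larger matrices or long horizons a library routine for the matrix exponential would be the more robust choice.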
9.5 Credit Derivatives
We only discuss two specific examples.

Credit Default Swaps, CDS. A credit default swap is an exchange of a periodic payment against a one-off contingent payment if some credit event occurs on a reference asset. The basic cash flow is shown in Table 9.2.

[Table 9.2. Cash flow of a credit default swap: the protection buyer pays a periodic fee to the protection seller; the protection seller makes a contingent payment to the protection buyer.]
The ingredients of the basic structure are specification of
1. maturity $T$: usually from one to ten years,
2. underlying: corporate or sovereign,
3. credit event: default, bankruptcy, downgrade.
Let $c(T)$ be the fixed coupon that the protection buyer pays. The payment continues until either default or maturity. In case of default, assume that the payment from the protection seller to the protection buyer is equal to the difference between the notional amount of the bond and the recovery value $\delta$. The fixed side of the payment is set so that the contract value is zero at initiation. Thus, since the cash flow at coupon date $i$ for the protection buyer is $c(T)\mathbf{1}_{\{\tau > i\}}$ and the payment for the protection seller at time of default $\tau$ is $(1 - \delta)\mathbf{1}_{\{\tau \le T\}}$, we obtain
$$c(T) = \frac{(1 - \delta)\,\mathbb{E}^*\big( e^{-r\tau}\,\mathbf{1}_{\{\tau \le T\}} \big)}{\sum_{i=1}^{T} e^{-ri}\,\mathbb{P}^*(\tau > i)},$$
where we assume constant interest rates. Since both $\mathbb{E}^*(e^{-r\tau}\mathbf{1}_{\{\tau \le T\}})$ and $\mathbb{P}^*(\tau > i)$ are readily available in intensity-based models (or can be inferred from market data assuming such a model), these models are typically used to price CDS. Extensions of CDS include:
1. contingent credit swaps, which require an additional trigger, i.e. a credit event with respect to another entity or a movement in equity prices or interest rates,
2. total (rate of) return swaps, which transfer an asset's total economic performance including - but not restricted to - its credit-related performance.

First-to-default Swap (FtD) and Basket Default Swap. Now several assets are bundled together, and a credit swap is created on the whole basket. The default event is defined in terms of default on any of the assets in the basket, e.g. the first default of any asset in case of a first-to-default swap or the $i$th-to-default, or any other similar contract structure. The additional modelling component to be considered now is the dependence of defaults (e.g. clustering of defaults). For the FtD, recall that for intensity-based models the default time $\tau = \min\{\tau_1, \ldots, \tau_n\}$ has intensity $\lambda = \lambda_1 + \cdots + \lambda_n$, where $\lambda_i$ is the intensity of $\tau_i$, the default time of asset $i$. Thus, with an affine model for $\lambda_i$, tractable models for FtD can be obtained. Further details on credit derivatives can be found in Bielecki and Rutkowski (2002) and Schönbucher (2003).
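To make the CDS pricing argument concrete, here is a hedged sketch under simplifying assumptions that are not in the original text: a constant default intensity, a constant interest rate, annual coupon dates $i = 1, \ldots, T$, and a fair coupon defined by equating the premium leg $c\sum_i e^{-ri}\,\mathbb{P}^*(\tau > i)$ with the protection leg $(1 - \delta)\,\mathbb{E}^*(e^{-r\tau}\mathbf{1}_{\{\tau \le T\}})$. The first-to-default variant simply uses the summed intensity and, as a further assumption, a common recovery for all names. All function names are hypothetical.

```python
from math import exp


def cds_coupon(hazard, r, recovery, maturity):
    """Fair annual coupon for constant intensity `hazard` and integer maturity in years."""
    a = r + hazard
    # E*[exp(-r tau) 1{tau <= T}] for an exponential default time.
    protection_leg = (1.0 - recovery) * hazard / a * (1.0 - exp(-a * maturity))
    # Sum of discounted survival probabilities exp(-r i) * exp(-hazard i).
    annuity = sum(exp(-a * i) for i in range(1, maturity + 1))
    return protection_leg / annuity


def first_to_default_coupon(hazards, r, recovery, maturity):
    """FtD coupon: the first default time has intensity equal to the sum of intensities."""
    return cds_coupon(sum(hazards), r, recovery, maturity)


single_name = cds_coupon(0.02, 0.03, 0.4, 5)
basket = first_to_default_coupon([0.02, 0.01, 0.03], 0.03, 0.4, 5)
```

In this constant-parameter setting the fair coupon works out to $(1 - \delta)\frac{\lambda}{r + \lambda}(e^{r + \lambda} - 1)$, independent of maturity, and protection on the first default in a basket is more expensive than protection on any single name.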
9.6 Portfolio Credit Risk Models
We will only consider a rating-based credit portfolio model, such as CreditMetrics. A formal description of such a model consists of $n+1$ rating categories (of which the first one corresponds to the default state), a transition matrix of probabilities of rating changes within the time horizon of interest, and some re-evaluation procedure for the exposures within each rating class. To introduce dependencies between the individual exposures, a latent factor driving the transitions is assumed.

Model Description. Assume the portfolio consists of $m$ bonds (obligors) that we consider at discrete time periods $t=0,1,\dots,T$ corresponding to the coupon payments. Let $R_i(t)$ be the state indicator (i.e. rating class) at time $t$. We assume $n+1$ rating classes, i.e. the state space is $\{0,1,\dots,n\}$ with class 0 corresponding to default.
Let $S=(S_1,\dots,S_m)'$ be an $m$-dimensional random vector. Furthermore, let

$$-\infty=c_{-1}<c_0<c_1<\dots<c_n=\infty$$

be a sequence of cut-off levels. We assume

$$R_i=j\iff S_i\in(c_{j-1},c_j],\qquad j\in\{0,\dots,n\},\ i\in\{1,\dots,m\}.$$
We call $(S_i(t),(c_j)_{j\in\{-1,0,\dots,n\}})_{i\in\{1,\dots,m\}}$ a dynamic latent variable model for the state vector $R=(R_1,\dots,R_m)'$.

Dynamic Modelling. In general Merton-type (or asset-based) models the value of $S_i(t)$ may be interpreted as the value of the assets of the firm. Indeed, a variety of distributions is possible for $S_i$, corresponding to various Lévy-type specifications for stock price modelling (e.g. the hyperbolic model, Eberlein (2001), the Variance-Gamma model and relatives, Carr, Geman, Madan, and Yor (2002), Carr, Chang, and Madan (1998)). In particular, jump-diffusion models are easily incorporated; see Hamilton, James, and Webber (2001) for such a model. The relation to the standard approach is seen by recalling the standard Black-Scholes model defined via the SDE

$$dA_t=A_t(\mu\,dt+\sigma\,dW_t)$$

with constant coefficients and a standard Brownian motion $W$. The solution of the SDE is

$$A_t=A_0\exp\left\{\left(\mu-\frac{\sigma^2}{2}\right)t+\sigma W_t\right\},$$
hence motivating the use of a normal return distribution. One can now consider a general exponential Lévy process model for asset values with a Lévy process $L$. Now the solution of the SDE can be written as

$$A_t=A_0\exp\{L_t\}$$

with $L_t$ a related Lévy process (which can be computed explicitly). Hence we may assume that the return distribution $S_t=\log(A_t)$ is generated by a Lévy process. See §5.5, §7.4, and Bingham and Kiesel (2002) for an overview and further discussion of Lévy-type models.

Factor Modelling. In a typical credit portfolio model, dependencies of individual obligors are modelled via dependencies of the underlying latent variables $S$. In light of the typical portfolio analysis, the vector $S$ is embedded in a factor model, which allows for easy analysis of correlation, the typical measure of dependence. One assumes that the underlying variables $S_i$ are driven by a vector of common factors. Typically, this vector is assumed to be normally distributed (see e.g. JP Morgan (1997)). Let $Z\sim N(0,\Sigma)$ be
a $p$-dimensional normal vector and $\epsilon=(\epsilon_1,\dots,\epsilon_m)'$ independent normally distributed random variables, independent also of $Z$. Define

$$Y_i=\sum_{j=1}^{p}a_{ij}Z_j+\sigma_i\epsilon_i,\qquad i=1,\dots,m.$$

Setting $S_i=Y_i$ generates a Gaussian factor model. However, such a setting corresponds to the standard Brownian motion return structure. To capture Lévy-type equity models, we use a normal mean-variance mixture model (compare Bingham and Kiesel (2002)). To define such a model, let $W$ be a further positive random variable, independent of $\epsilon$ and $Z$, and define

$$S_i=a_i+b_iW^2+WY_i,\qquad i=1,\dots,m,$$

with $a_i,b_i$ constants. Then $S$ has a $(p+1)$-dimensional conditional independence structure. Now $S$ inherits (in principle) the correlation matrix of $Y$. The individual returns $S_i$ are heavy-tailed and the vector $S$ exhibits tail dependence. Such a model could alternatively be generated by using heavy-tailed factors and heavy-tailed idiosyncratic risk. An advantage of such a model is that it allows analytic approximations, in the sense that the loss distribution can be approximated as a function of the factor risk only.

Approximation of Loss Distribution. Assume that $n=1$, i.e. we only have two rating classes, one of which corresponds to default. We calculate the distribution of the number of defaults for one period. Assume the collateral consists of $m$ bonds, all from the same rating class, implying a common default boundary $c$. Assume one normally distributed common factor (as in Gordy (2000)), i.e.

$$Y_i=\rho Z+\sqrt{1-\rho^2}\,\epsilon_i$$

and

$$S_i=a+bW^2+WY_i=a+bW^2+\rho WZ+\sqrt{1-\rho^2}\,W\epsilon_i.$$

So the conditional distribution of $S_i$ given $(Z,W)$ is normal. Also, by conditional independence, we get for the number $M$ of defaults in one period (let $W$ have density $g$)

$$\pi_l=\mathbb{P}(M=l)=\int_0^\infty\int_{-\infty}^{\infty}\binom{m}{l}\,\Phi\left(\frac{c-a-bw^2-\rho wz}{\sqrt{1-\rho^2}\,w}\right)^{l}\left(1-\Phi\left(\frac{c-a-bw^2-\rho wz}{\sqrt{1-\rho^2}\,w}\right)\right)^{m-l}\phi(z)g(w)\,dz\,dw.$$
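To make the mixing integral for $\pi_l=\mathbb{P}(M=l)$ concrete, here is a minimal numerical sketch for the Gaussian special case only. It assumes $W\equiv1$ and $a=b=0$, so that the integral over $w$ drops out, takes $Z$ and the $\epsilon_i$ to be standard normal, and integrates over $z$ with a plain trapezoidal rule. The function names, the grid size and the truncation at $|z|\le8$ are illustrative choices, not part of the text.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def default_count_distribution(m, c, rho, grid=2001, zmax=8.0):
    """Return [P(M = 0), ..., P(M = m)] for W = 1, a = b = 0 (assumed)."""
    h = 2.0 * zmax / (grid - 1)
    s = math.sqrt(1.0 - rho * rho)
    probs = [0.0] * (m + 1)
    for k in range(grid):
        z = -zmax + k * h
        # Trapezoidal weight times the density of the common factor
        weight = h * norm_pdf(z) * (0.5 if k in (0, grid - 1) else 1.0)
        # Conditional default probability given Z = z
        q = norm_cdf((c - rho * z) / s)
        for l in range(m + 1):
            probs[l] += weight * math.comb(m, l) * q ** l * (1.0 - q) ** (m - l)
    return probs
```

With `rho = 0` the result collapses to a binomial distribution with success probability $\Phi(c)$. For `rho > 0` the mean stays at $m\Phi(c)$ while the tails become heavier. A non-degenerate $W$ would require a second integration over $w$ against its density $g$.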
Using the above formula, we can obtain the loss distribution analytically; see also Croughy, Galai, and Mark (2001), Ong (1999) and the article Frey and McNeil (2003) and references.

Multi-period losses: here we simply assume independence of the subsequent periods. Hence the only change in the above calculation is an adjustment of the number of bonds in the collateral. The distribution of losses in the second period is given as

$$\mathbb{P}(M_2=l)=\sum_{j=0}^{m}\mathbb{P}(M_2=l\mid M_1=j)\,\mathbb{P}(M_1=j)=\sum_{j=0}^{m}\mathbb{P}(M(m-j)=l\mid M(m)=j)\,\pi_j(m),$$

with the notation that $M(m)$ is the number of defaults given that $m$ bonds are in the collateral, and $\pi_j(m)=\mathbb{P}(M(m)=j)$.

An application of a strong law of large numbers (Hall and Heyde (1980), Theorem 2.19) yields a convenient approximation:

$$\lim_{m\to\infty}\frac{1}{m}M(m)=\Phi\left(\frac{c-a-bW^2-\rho WZ}{\sqrt{1-\rho^2}\,W}\right)\quad\text{a.s.}\tag{9.19}$$
Approximation (9.19) allows us to compute the one-period number of losses given the joint distribution of $(Z,W)$. See Lucas, Klaassen, Spreij, and Straetmans (1999) for such an analytic approach.

Copula Modelling. Multivariate normal models don't show tail dependence, i.e. for $(X_1,X_2)$ bivariate normal with correlation $\rho\in(-1,1)$, we have $\lambda=0$, where

$$\lambda=\lim_{u\to1^-}\mathbb{P}\left(X_2>F_2^{-1}(u)\mid X_1>F_1^{-1}(u)\right)$$

is the coefficient of (upper) tail dependence and $F_1,F_2$ are the marginal distribution functions of $X_1,X_2$. The degree of tail dependence depends on the copula of $(X_1,X_2)'$, to which we now turn.

A copula $C$ is a multivariate distribution with standard uniform marginal distributions. That is, $C$ is a mapping $[0,1]^d\to[0,1]$ with the following properties:
1. $C(u_1,\dots,u_d)$ is increasing in each component $u_i$.
2. $C(1,\dots,1,u_i,1,\dots,1)=u_i$ for all $i\in\{1,\dots,d\}$, $u_i\in[0,1]$.
3. For all $(a_1,\dots,a_d),(b_1,\dots,b_d)\in[0,1]^d$ with $a_i\le b_i$ we have

$$\sum_{i_1=1}^{2}\cdots\sum_{i_d=1}^{2}(-1)^{i_1+\dots+i_d}\,C(u_{1,i_1},\dots,u_{d,i_d})\ge0,$$

where $u_{j,1}=a_j$ and $u_{j,2}=b_j$ for all $j\in\{1,\dots,d\}$.
If $X=(X_1,\dots,X_d)'$ has joint distribution $F$ with continuous marginals $F_1,\dots,F_d$, then the distribution function of the transformed vector

$$(F_1(X_1),\dots,F_d(X_d))$$

is a copula $C$ and

$$F(x_1,\dots,x_d)=C(F_1(x_1),\dots,F_d(x_d)).$$

Thus copulas can be used to link marginal distributions (as an alternative to using the joint distribution). We can now consider a general latent-variable model. That is, we model each of the underlying variables individually and subsequently use a copula to model the dependence structure. If we assume that the copula function is symmetric (in all variables), the random vector $S$ will be exchangeable:

$$(S_1,\dots,S_m)\stackrel{d}{=}(S_{p(1)},\dots,S_{p(m)})$$

for every permutation $p$ of $\{1,\dots,m\}$. In this case, all possible $k$-dimensional marginal distributions are identical, and in particular, we have for the default probabilities

$$\pi_k=\mathbb{P}(R_{i_1}=0,\dots,R_{i_k}=0),\qquad\forall\,\{i_1,\dots,i_k\}\subset\{1,\dots,m\},\ 1\le k\le m.$$

In this case, the distribution of the number of defaults can be computed via an application of the inclusion-exclusion principle:

$$\mathbb{P}(M=k)=\binom{m}{k}\,\mathbb{P}(R_1=0,\dots,R_k=0,R_{k+1}\ne0,\dots,R_m\ne0).\tag{9.20}$$

If we fix the individual default probabilities of a group of $k$ obligors to be $\pi$, we get

$$\pi_k=C_{1,\dots,k}(\pi,\dots,\pi),\tag{9.21}$$

where $C_{1,\dots,k}$ is the $k$-dimensional margin of $C$. Using copulae with positive tail dependence for the factors (e.g. the $t$ distribution) leads to heavier tails of the loss distribution, i.e. increased VaRs (compare results from the study of Frey and McNeil (2003)).
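The inclusion-exclusion step behind (9.20) can be written out explicitly. The sketch below assumes exchangeability and uses the standard expansion $\mathbb{P}(R_1=0,\dots,R_k=0,R_{k+1}\ne0,\dots,R_m\ne0)=\sum_{i=0}^{m-k}(-1)^i\binom{m-k}{i}\pi_{k+i}$ with $\pi_0=1$. This expansion is an assumption added here, since the display above does not spell it out. The function name is hypothetical.

```python
import math


def default_count_from_joint(pi):
    """Distribution of the number of defaults M in an exchangeable model.

    pi[k] is the probability that k given obligors all default
    (pi[0] must equal 1); the result lists P(M = k) for k = 0, ..., m.
    """
    m = len(pi) - 1
    return [
        math.comb(m, k)
        * sum((-1) ** i * math.comb(m - k, i) * pi[k + i] for i in range(m - k + 1))
        for k in range(m + 1)
    ]
```

For the independence copula, $\pi_k=\pi^k$ and the result is the binomial distribution. For the comonotone copula, $\pi_k=\pi$ for all $k\ge1$ and all mass sits on $M=0$ and $M=m$. The alternating sum can suffer from cancellation for large $m$, so this form is best suited to small portfolios or exact arithmetic.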
9.7 Collateralized Debt Obligations (CDOs)
9.7.1 Introduction
Collateralized Debt Obligations (CDOs) are an important example of asset-backed securities (ABS) and as such are backed by a pool of assets. Basic
information on ABS and CDOs is given in Bowler and Tierny (1999), Lucas (2001) and Rayre (2001) or in textbook form Bluhm, Overbeck, and Wagner (2003). In the case of CDOs we distinguish two basic types, based on the type of debt used as the collateral: Collateralized Loan Obligations (CLOs), backed by a pool of loans, and Collateralized Bond Obligations (CBOs), backed by a pool of bonds. The typical structure is shown below:

Collateral ----> SPV ----> Notes
So we start with a pool of credit risky assets. This pool is then transferred to a special purpose vehicle (SPV), which is a company set up only for the purpose of the transaction. The SPV then issues securities or structured notes backed by the cash flow of the asset pool. Thus interest and principal of the notes are paid from interest and principal proceeds from the pool. The notes are divided into several classes according to their credit quality: senior notes and mezzanine, which usually carry ratings from triple-A to single-B. There is frequently an unrated equity class. The holders of the notes are paid interest and principal in order of seniority. One can distinguish the following types of CDOs.

Arbitrage CDOs. In an arbitrage CDO, an issuer seeks to capture an arbitrage between the pricing/yield of high-yield sub-investment-grade securities that are acquired in the capital markets and the yield on investment-grade bond assets that are sold to investors. This allows investors who could not otherwise invest in sub-investment-grade assets to participate in this market. There are two basic types of arbitrage CDOs, namely cash-flow CDOs and market value CDOs. While for the former credit events short of default are not relevant for the performance, the collateral pool of the latter is marked to market regularly, and the asset manager is required to trade actively.

Conventional CDOs. A balance sheet CLO is typically created by a bank or financial institution wishing to securitize illiquid loan assets that it has originated. The loan assets may be fairly heterogeneous, although most balance sheet CLOs have been done using investment-grade loans.

9.7.2 Review of Modelling Methods
To construct a model to evaluate CDOs, the following quantities have to be modelled:
• default probabilities for the assets in the pool,
• default dependence of the assets,
• loss in event of default,
• timing of default.

We will now review several modelling approaches that have been proposed in the literature and are being used in practical applications.
Moody's Binomial Expansion Technique. A simple approximation is given by Moody's binomial expansion technique (BET), which is based on a diversity score: represent the loss distribution of $N$ bonds from the same industry by that of $M\le N$ independent identical bonds, i.e. construct a comparison portfolio. The parameter $M$ is known as the diversity score (DS). Table 9.3 reports the suggested diversity score for a standard application.

| Number of firms | Diversity score |
|-----------------|-----------------|
| 1               | 1.0             |
| 2               | 1.5             |
| 3               | 2.0             |
| 4               | 2.3             |
| 5               | 2.6             |
| 6               | 3.0             |
| 7               | 3.2             |
| 8               | 3.5             |
| 9               | 3.7             |
| 10              | 4.0             |
| > 10            | case by case    |

Table 9.3. Moody's Diversity Score
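Once a diversity score has been fixed, the comparison portfolio of $M$ independent identical bonds is handled with a plain binomial calculation. The following minimal sketch assumes an integer diversity score (the tabulated scores are not always integers, so some rounding or interpolation convention would be needed in practice) and a user-supplied, hypothetical loss function `loss(n)` giving the loss when `n` bonds default.

```python
import math


def bet_distribution(M, p):
    """Binomial probabilities q_n = P(Z = n), n = 0, ..., M, for integer M."""
    return [math.comb(M, n) * p ** n * (1.0 - p) ** (M - n) for n in range(M + 1)]


def bet_expected_loss(M, p, loss):
    """Expected loss sum_n q_n * loss(n) over the comparison portfolio."""
    q = bet_distribution(M, p)
    return sum(q[n] * loss(n) for n in range(M + 1))
```

For instance, `bet_expected_loss(4, 0.1, lambda n: n)` returns the expected number of defaults $Mp=0.4$. A tranche payoff can be passed in the same way as a function of `n`.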
Given the diversity score, Moody's technique is as follows. To calculate default probabilities under a diversity score of $M$, we can now use the binomial distribution. Thus $Z$, the number of defaulting bonds, has distribution

$$q_n=\mathbb{P}(Z=n)=\binom{M}{n}p^n(1-p)^{M-n},$$

with $p$ the default probability according to the rating of the tranche. Furthermore, the expected loss is computed as

$$\sum_{n=0}^{M}q_nL_n,$$

where $L_n$ is the loss incurred when $n$ bonds default in the portfolio. The problematic aspect of this approach is that there is no probabilistic basis for the diversity score. The idea is simply to match the mean and the standard deviation of the return distribution associated with the collateral pool.

Infectious Defaults. This is a first application in this field of the important general phenomenon of market contagion. The idea has been put forward by Davis and Lo (2001a) and Davis and Lo (2001b), to which we refer for details of the calculations involved. The basic idea is that a bond can either default directly or may be infected to default by the default of a different bond (infectious default).
So assume that we consider $n$ bonds, and use the indicator $Z_i=1$ if bond $i$ defaults, $Z_i=0$ otherwise. Then $N=Z_1+\dots+Z_n$ is the number of defaulted bonds. For $i=1,\dots,n$ and $j=1,\dots,n$ with $i\ne j$ let $X_i,Y_{ij}$ be independent Bernoulli random variables with $\mathbb{P}(X_i=1)=p$ and $\mathbb{P}(Y_{ij}=1)=q$. Then

$$Z_i=X_i+(1-X_i)\Bigl(1-\prod_{j\ne i}(1-X_jY_{ji})\Bigr),$$

where the second term models infection. $\mathbb{P}(N=k)$ can be computed in closed form, also

$$\mathbb{E}(N)=n\left(1-(1-p)(1-pq)^{n-1}\right)$$

and

$$\mathrm{Var}(N)=\mathbb{E}(N)+n(n-1)\beta_n^{pq}-(\mathbb{E}(N))^2,$$

where $\beta_n^{pq}=\mathbb{E}(Z_iZ_j)$ for $i\ne j$ (see Davis and Lo for the explicit expression). To see the effect of $q$, we keep the expected number of defaults constant while increasing $q$. We consider a total of $n=50$ bonds.

| q    | implied p | std. dev. |
|------|-----------|-----------|
| 0    | 0.5       | 3.54      |
| 0.05 | 0.194     | 6.05      |
| 0.1  | 0.116     | 7.70      |
| 0.2  | 0.064     | 10.32     |

Table 9.4. Effect of infection parameter
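The implied values of $p$ in the table can be recovered from the expression $\mathbb{E}(N)=n(1-(1-p)(1-pq)^{n-1})$ by a one-dimensional root search. The sketch below uses bisection and adds a small simulation. The function names are hypothetical, and the simulation uses the observation (an assumption worth checking against the model definition) that, given $K$ direct defaults, each remaining bond is infected independently with probability $1-(1-q)^K$.

```python
import random


def expected_defaults(n, p, q):
    """E(N) = n * (1 - (1 - p) * (1 - p*q)**(n - 1))."""
    return n * (1.0 - (1.0 - p) * (1.0 - p * q) ** (n - 1))


def implied_p(n, q, target, tol=1e-12):
    """Solve expected_defaults(n, p, q) = target for p by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # E(N) is increasing in p, so move the bracket accordingly
        if expected_defaults(n, mid, q) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_defaults(n, p, q, rng):
    """Draw one realisation of N, the total number of defaults."""
    direct = [rng.random() < p for _ in range(n)]
    k = sum(direct)
    infect_prob = 1.0 - (1.0 - q) ** k
    # Bonds that did not default directly may still be infected
    infected = sum(1 for d in direct if not d and rng.random() < infect_prob)
    return k + infected
```

With `n = 50` and target 25, `implied_p` gives roughly 0.194, 0.116 and 0.064 for `q` equal to 0.05, 0.1 and 0.2. Repeated calls to `simulate_defaults` with a seeded `random.Random` can then be used to estimate the standard deviation of `N`.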
The parameter $q$ can thus be used to model the volatility of the default (loss) distribution.

To model the timing of defaults, the following extension of the model is used. Assume $n$ bonds with exponentially distributed default time; then $p=1-e^{-\lambda t}$ is the probability of default within $[0,t]$. Let $N_t$ be the number of defaults in $[0,t]$; then the default rate is proportional to the number of bonds alive:

$$\mathbb{P}(\text{default in }[t,t+dt]\mid N_t)=\mathbb{E}(dN_t\mid N_t)=\lambda(n-N_t)\,dt.$$

For $m(t)=\mathbb{E}(N_t)$, we have

$$m(t)=n\lambda t-\int_0^t\lambda m(s)\,ds$$

(as $M_t=N_t-\int_0^t\lambda(n-N_s)\,ds$ is a martingale). Thus

$$m(t)=n\left(1-e^{-\lambda t}\right).$$
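The integral equation for $m(t)$ is equivalent to the ODE $m'(t)=\lambda(n-m(t))$ with $m(0)=0$, whose solution is the closed form just stated. As a quick sanity check, the following sketch compares an explicit Euler discretisation of the ODE with the closed form. The step count and function names are illustrative assumptions.

```python
import math


def expected_defaults_closed_form(n, lam, t):
    """m(t) = n * (1 - exp(-lam * t))."""
    return n * (1.0 - math.exp(-lam * t))


def expected_defaults_euler(n, lam, t, steps=100_000):
    """Explicit Euler scheme for m'(s) = lam * (n - m(s)), m(0) = 0."""
    dt = t / steps
    m = 0.0
    for _ in range(steps):
        m += lam * (n - m) * dt
    return m
```

For `n = 50`, `lam = 0.03` and `t = 5`, both functions give about 6.96 expected defaults.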
Interaction is modelled by increasing the total hazard rate after a default by a factor $a$ for an exponentially distributed time interval. It is possible to compute the distribution of the number of defaults; see Davis and Lo (2001b) for details.
Intensity-based Approach. Duffie and Garleanu (2001) present an intensity-based model. They model the default intensities of bonds as

$$\lambda_i=X_i+X_c,\qquad i=1,\dots,n,$$

with $X_1,\dots,X_n,X_c$ independent affine processes (say of CIR-type). The common intensity factor $X_c$ allows one to model dependency. Efficient simulation methods are available for obtaining the distribution of the number of defaults.

Ratings-based Modelling. One could also use a diffusion-driven CreditMetrics extension. Here every bond is modelled using the structural approach, i.e. we model the underlying value of the firm by a diffusion process. Default occurs if the value-of-the-firm process falls below a threshold. Dependencies within the pool can be captured by the cross-variation of the underlying Brownian motions. See also Hamilton, James, and Webber (2001) for an extension of this approach.
A. Hilbert Space
Recall our use of $n$-dimensional Euclidean space $\mathbb{R}^n$, the set of $n$-vectors or $n$-tuples $x=(x_1,\dots,x_n)$ with each $x_i\in\mathbb{R}$. Here one has the ordinary Euclidean length - the norm

$$\|x\|:=\left(\sum_{i=1}^{n}x_i^2\right)^{1/2}$$

- and the inner product (or dot product) of two vectors $x$ and $y$:

$$(x,y),\ \text{or}\ x\cdot y,\ :=\sum_{i=1}^{n}x_iy_i.$$

This setting is adequate for handling finite-dimensional situations, but not for infinite-dimensional ones. The simplest infinite-dimensional situation containing the above as a special case is the space $\ell^2$ of square-summable sequences $x=(x_1,x_2,\dots)$ with

$$\|x\|:=\left(\sum_{n=1}^{\infty}x_n^2\right)^{1/2}<\infty.$$

Because of the Cauchy-Schwarz inequality

$$\left|\sum_{n=1}^{\infty}x_ny_n\right|\le\left(\sum_{n=1}^{\infty}x_n^2\right)^{1/2}\left(\sum_{n=1}^{\infty}y_n^2\right)^{1/2},$$

if $x,y\in\ell^2$, then

$$(x,y):=\sum_{n=1}^{\infty}x_ny_n$$

is defined, and

$$|(x,y)|\le\|x\|\,\|y\|$$

(the series $\sum_{n=1}^{\infty}x_ny_n$ is absolutely convergent, that is, $\sum_{n=1}^{\infty}|x_n||y_n|<\infty$ - just replace $x_n,y_n$ by $|x_n|,|y_n|$ in the above). One may choose to work instead with complex sequences: all the above goes through, with the changes

$$\|x\|:=\left(\sum_{n=1}^{\infty}|x_n|^2\right)^{1/2},\qquad(x,y):=\sum_{n=1}^{\infty}x_n\overline{y_n}.$$

A Hilbert space is a (possibly - indeed, usually) infinite-dimensional vector space endowed with such an inner product, and also complete (in the sense of metric spaces: all Cauchy sequences converge - see Burkill and Burkill (1970)). For background, we must refer to the excellent textbook treatments of e.g. Young (1988) and Bollobás (1990). Note that we already have a good supply of Hilbert spaces: finite-dimensional ones such as $\mathbb{R}^n$ and infinite-dimensional ones such as $\ell^2$ and $L^2$.

Hilbert spaces closely resemble ordinary Euclidean spaces in many respects. In particular, one can take orthogonal complements. If $M$ is a vector subspace of a Hilbert space $H$ which is closed (contains all its limit points), one can form its orthogonal complement

$$M^\perp:=\{y\in H:(x,y)=0\ \forall x\in M\};$$

then $M^\perp$ is also a closed vector subspace of $H$, and any $z\in H$ is uniquely expressible in the form

$$z=x+y\quad\text{with}\quad x\in M,\ y\in M^\perp$$

(and so $(x,y)=0$). One then says that $H$ is the direct sum of $M$ and $M^\perp$, written $H=M\oplus M^\perp$. If $z=x+y$,

$$\|z\|^2=(z,z)=(x+y,x+y)=(x,x)+(x,y)+(y,x)+(y,y)=\|x\|^2+2(x,y)+\|y\|^2.$$

In particular, if $(x,y)=0$ (i.e. $x,y$ are orthogonal),

$$\|x+y\|^2=\|x\|^2+\|y\|^2.$$

This is the (in general) infinite-dimensional version of Pythagoras' theorem. Hilbert spaces are the easiest of infinite-dimensional spaces to work with because they so closely resemble finite-dimensional, or Euclidean, ones. In particular, one can think geometrically in a Hilbert space, using diagrams as one would for (say) $\mathbb{R}^2$ or $\mathbb{R}^3$; see Appendix C.

Functional analysis is the study of infinite-dimensional spaces, and so Hilbert-space theory forms an important part of it. Excellent textbook treatments are available; we particularly recommend Young (1988) for Hilbert space alone, Bollobás (1990) for the more general setting of functional analysis.
B. Projections and Conditional Expectations
Given a Hilbert space (or more generally, an inner product space) $V$, suppose $V$ is the direct sum of a closed subspace $M$ and its orthogonal complement $M^\perp$:

$$V=M\oplus M^\perp.$$

In the direct-sum decomposition

$$z=x+y$$

of a vector $z\in V$ into a sum of $x\in M$ and $y\in M^\perp$, consider the map $P:z\to x$. This is called the orthogonal projection of $V$ onto $M$. It is linear, and idempotent: since the direct-sum decomposition of $x=Pz$ is $x=x+0$, $P(Pz)=Pz=x$, or

$$P^2=P.$$

By Pythagoras' theorem, $\|z\|^2=\|x\|^2+\|y\|^2$, so

$$\|Pz\|^2=\|x\|^2\le\|z\|^2.$$

In particular, $\|Pz\|\le\|z\|$. That is, application of $P$ decreases the norm of a vector: one says that $P$ has norm $\le1$, $\|P\|\le1$. Conversely, we quote that on an inner product space, these properties - $P$ linear and idempotent with $\|P\|\le1$ - characterise orthogonal projections.

The range $R(P)$ of $P$, that is, the set of $x$ of the form $x=Pz$ for some $z$, is, as above, the set of vectors invariant under $P$ (that is, the set of $z$ with $Pz=z$); $P$ is called the projection onto its range $M=R(P)$. The orthogonal complement $M^\perp$ of the range - the set of $z$ with zero $x$-component - is the set of $z$ annihilated by $P$, the kernel (or nullspace) $N(P)$ of $P$. Thus the direct-sum decomposition for the orthogonal projection $P$ onto $M$ is

$$V=M\oplus M^\perp=R(P)\oplus N(P).$$

The situation is symmetric between $M$ and $M^\perp$: write $Q:=I-P$, with $I$ the identity mapping. Then $Q$ is linear, and as $P^2=P$,

$$Q^2=(I-P)^2=I-2P+P^2=I-2P+P=I-P=Q:$$

$Q^2=Q$, and conversely, if $Q^2=Q$, then $P^2=P$. Thus also

$$V=M\oplus M^\perp=N(Q)\oplus R(Q),\qquad Q:=I-P.$$
In particular, when $V$ is finite-dimensional, the content of the above reduces to linear algebra (there is no need to assume $M$ closed, as closure is automatic, so analysis is not needed). For a textbook treatment in this context, see Halmos (1958).

The above use of orthogonal projection, Pythagoras' theorem etc. underlies the theory of the familiar linear model of statistics: normally distributed errors, least squares, regression, analysis of variance etc. For such a geometric treatment of the linear model, see e.g. Rawlings (1988), Chapter 6.

Conditional Expectations and Projections
We confine ourselves here to the $L^2$ theory. Take the vector space $L^2=L^2(\Omega,\mathcal{F},\mathbb{P})$ of square-integrable random variables $X$ on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. This is a Hilbert space under the norm

$$\|X\|:=\left(\mathbb{E}(X^2)\right)^{1/2}$$

and inner product

$$(X,Y):=\mathbb{E}(XY).$$

The space $L^2$ is complete, by the Riesz-Fischer theorem (see e.g. Bollobás (1990)).

If $M$ is a vector subspace of $L^2$ which is closed (equivalently: complete), given any $X\in L^2$ one can 'drop a perpendicular' from $X$ to $M$, obtaining $Y\in M$ with

$$\|X-Y\|=\inf\{\|X-W\|:W\in M\}\quad\text{and}\quad(X-Y,Z)=0\ \text{for all}\ Z\in M$$

(see e.g. Williams (1991), §6.11). Then the map $X\mapsto Y$ is the orthogonal projection onto $M$: it is linear, idempotent and of norm at most 1. Suppose now that $\mathcal{G}$ is a sub-$\sigma$-field of $\mathcal{F}$ and $M$ is the $L^2$-space of $\mathcal{G}$-measurable functions. Then

$$X\mapsto\mathbb{E}(X\mid\mathcal{G})$$

is the orthogonal projection of $X$ onto $M=L^2(\Omega,\mathcal{G},\mathbb{P})$. It gives the best predictor (in the least-squares sense) of $X$ given $\mathcal{G}$ (that is, it minimises the mean-square error of all predictors of $X$ given the information represented by $\mathcal{G}$): see e.g. Williams (1991), §9.4. The idempotence of this conditional expectation operator follows from the iterated conditional expectation operation of §2.5. This idempotence is also suggested by the above interpretation: forming our best estimate given available information should give the same result done once as done twice.
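On a finite sample space the projection property can be checked by hand. The sketch below is an illustration only: it assumes a finite $\Omega=\{0,\dots,N-1\}$ with given probabilities and a sub-$\sigma$-field generated by a partition into blocks, so that the conditional expectation is the probability-weighted average over each block. The function names are hypothetical.

```python
def conditional_expectation(x, prob, blocks):
    """E(X | G) on a finite sample space, G generated by the partition `blocks`."""
    y = [0.0] * len(x)
    for block in blocks:
        mass = sum(prob[w] for w in block)
        avg = sum(prob[w] * x[w] for w in block) / mass
        for w in block:
            y[w] = avg  # constant on each block, hence G-measurable
    return y


def inner(u, v, prob):
    """Inner product (U, V) = E(UV) on the finite sample space."""
    return sum(p * a * b for p, a, b in zip(prob, u, v))
```

With `prob = [0.1, 0.2, 0.3, 0.4]`, `x = [1, 4, 2, 8]` and `blocks = [[0, 1], [2, 3]]`, the residual `x - y` is orthogonal to every block indicator, applying the map twice changes nothing, and $\|X\|^2=\|Y\|^2+\|X-Y\|^2$ holds, in line with Pythagoras' theorem.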
This picture of conditional expectation as projection is powerful - partly because, in the $L^2$-setting, it allows us to think and argue geometrically. The excellent text Neveu (1975) on martingales is based on this viewpoint.
C. The Separating Hyperplane Theorem
In a vector space $V$, if $x$ and $y$ are vectors, the set of linear combinations $\alpha x+\beta y$, with scalars $\alpha,\beta\ge0$ with sum $\alpha+\beta=1$, represents geometrically the line segment joining $x$ to $y$. Each such linear combination $\lambda x+(1-\lambda)y$, with $0\le\lambda\le1$, is called a convex combination of $x$ and $y$. A set $C$ in $V$ is called convex if, for all pairs $x$ and $y$ of points in $C$, all convex combinations of $x$ and $y$ are also in $C$. If $V$ has dimension $n$ and $U$ is a subspace of dimension $n-1$, $U$ is said to have codimension 1. If $U$ is a subspace,

$$x+U:=\{x+u:u\in U\}$$

is called the translate of $U$ by the vector $x$. A hyperplane in $V$ is a translate of a subspace of codimension 1. Such a hyperplane is always representable in the form

$$H=[f,\alpha]:=\{x:f(x)=\alpha\},$$

for some scalar $\alpha$ and linear functional $f$: that is, a map $f:V\to\mathbb{R}$ with

$$f(x+y)=f(x)+f(y)\quad(x,y\in V),\qquad f(\lambda x)=\lambda f(x)\quad(x\in V,\ \lambda\in\mathbb{R}).$$

Such an $f$ is of the form

$$f(x)=f_1x_1+\dots+f_nx_n;$$

then $f=(f_1,\dots,f_n)$ defines a vector $f$ in $V$, and the hyperplane $H=[f,\alpha]$ consists of those vectors $x$ in $V$ whose projections onto $f$ have magnitude $\alpha$. The hyperplane $[f,\alpha]$ bounds the set $A\subset V$ if

$$f(x)\ge\alpha\ \ \forall x\in A\quad\text{or}\quad f(x)\le\alpha\ \ \forall x\in A;$$

the hyperplane $[f,\alpha]$ separates the sets $A,B\subset V$ if

$$f(x)\ge\alpha\ \ \forall x\in A\quad\text{and}\quad f(x)\le\alpha\ \ \forall x\in B,$$

or the same inequalities with $A,B$ (or $\ge,\le$) interchanged. The following result is crucial for many purposes, both in mathematics and in economics and finance.
Theorem C.0.1 (Separating Hyperplane Theorem). If $A$, $B$ are two non-empty disjoint convex sets in a vector space $V$, they can be separated by a hyperplane.

For proof and background, see e.g. Valentine (1964), Part II and Bott (1942). The restriction to finite dimension is not in fact necessary: the result is true as stated even if $V$ has infinite dimension (for proof, see e.g. Bollobás (1990), Chapter 3). In this form, the result is closely linked to the Hahn-Banach theorem, the cornerstone of functional analysis. Again, Bollobás (1990) is a fine introduction.
Remark. When using a book on functional analysis, it is usually a good idea to look out for the results whose proof depends on the Hahn-Banach theorem: these are generally the key results, and the hard ones. The same is true in mathematical economics or finance of the separating hyperplane theorem.
Bibliography
Aït-Sahalia, Y., 1996, Nonparametric pricing of interest rate derivative securities, Econometrica 64, 527-600.
AitSahlia, F., and T.-L. Lai, 1998a, Random walk duality and the valuation of discrete lookback options, Working paper, Department of Statistics, Stanford University.
AitSahlia, F., and T.-L. Lai, 1998b, Valuation of discrete barrier and hindsight options, To appear in Journal of Financial Engineering.
Allingham, M., 1991, Arbitrage. Elements of financial economics. (MacMillan, New York).
Amin, K., and A. Khanna, 1994, Convergence of American option values from discrete- to continuous-time financial models, Mathematical Finance 4, 289-304.
Applebaum, D.B., 2004, Lévy processes and stochastic calculus. (Cambridge University Press, Cambridge).
Aurell, E., and S.I. Simdyankin, 1998, Pricing risky option simply, International Journal of Theoretical and Applied Finance 1, 1-23.
Back, K., and S.R. Pliska, 1991, On the fundamental theorem of asset pricing with an infinite state space, Journal of Mathematical Economics 20, 1-18.
Bagnold, R.A., 1941, The physics of blown sand and desert dunes. (Matthew, London).
Bagnold, R.A., and O.E. Barndorff-Nielsen, 1979, The pattern of natural size distributions, Sedimentology 27, 199-207.
Bajeux-Besnainou, I., and R. Portrait, 1997, The numeraire portfolio: A new approach to continuous time finance, The European Journal of Finance.
Bajeux-Besnainou, I., and J-C. Rochet, 1996, Dynamic spanning: Are options an appropriate instrument?, Mathematical Finance 6, 1-16.
Barndorff-Nielsen, O.E., 1977, Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. London A 353, 401-419.
Barndorff-Nielsen, O.E., 1998, Processes of normal inverse Gaussian type, Finance and Stochastics 2, 41-68.
Barndorff-Nielsen, O.E., P. Blaesild, J.L. Jensen, and M. Sørensen, 1985, The fascination of sand, in A.C. Atkinson, and S.E. Fienberg, eds.: A celebration of statistics (Springer, New York).
Barndorff-Nielsen, O.E., and O. Halgreen, 1977, Infinite divisibility of the hyperbolic and generalized inverse Gaussian distributions, Z. Wahrschein. 38, 309-312.
Barndorff-Nielsen, O.E., T. Mikosch, and S.I. Resnick, 2000, Lévy processes: theory and applications. (Birkhäuser Verlag, Basel).
Barndorff-Nielsen, O.E., and N. Shephard, 2001, Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics, J. R. Statist. Soc. B 63, 167-241.
Barndorff-Nielsen, O.E., and N. Shephard, 2002, Econometric analysis of realized volatility and its use in estimating stochastic volatility models, J. R. Statist. Soc. B 64, 253-280.
Barndorff-Nielsen, O.E., and M. Sørensen, 1994, A review of some aspects of asymptotic likelihood theory for stochastic processes, International Statistical Review 62, 133-165.
Bass, R.F., 1995, Probabilistic techniques in analysis. (Springer, Berlin Heidelberg New York).
Beran, J., 1994, Statistics for long-memory processes. (Chapman & Hall, London).
Bertoin, J., 1996, Lévy processes vol. 121 of Cambridge tracts in mathematics. (Cambridge University Press, Cambridge).
Bibby, B.M., and M. Sørensen, 1997, A hyperbolic diffusion model for stock prices, Finance and Stochastics 1, 25-41.
Bielecki, T.R., and M. Rutkowski, 2002, Credit risk: modeling, valuation and hedging. (Springer, New York).
Biger, N., and J. Hull, 1983, The valuation of currency options, Finan. Management 12, 24-28.
Billingsley, P., 1968, Convergence of probability measures. (Wiley, New York).
Billingsley, P., 1986, Probability Theory. (Wiley, New York).
Bingham, N.H., and R. Kiesel, 2001, Hyperbolic and semiparametric models in finance, in P. Sollich, A.C.C. Coolen, L.P. Hughston, and R.F. Streater, eds.: Disordered and complex systems (Amer. Inst. of Physics).
Bingham, N.H., and R. Kiesel, 2002, Semi-parametric modelling in finance: theoretical foundation, Quantitative Finance 2 pp. 241-250.
Bingham, N.H., C.M. Goldie, and J.L. Teugels, 1987, Regular Variation. (Cambridge University Press, Cambridge).
Björk, T., 1995, Arbitrage theory in continuous time, Notes from Ascona meeting.
Björk, T., 1997, Interest rate theory, in Financial Mathematics, ed. by W.J. Runggaldier Lecture Notes in Mathematics pp. 53-122. Springer, Berlin New York London.
Björk, T., 1999, Arbitrage theory in continuous time. (Oxford University Press, Oxford).
Björk, T., G. Di Masi, Y. Kabanov, and W. Runggaldier, 1997, Towards a general theory of bond markets, Finance and Stochastics 1, 141-174.
Björk, T., Y. Kabanov, and W. Runggaldier, 1997, Bond market structure in the presence of marked point processes, Mathematical Finance 7, 211-239.
Black, F., 1976, The pricing of commodity contracts, J. Financial Economics 31, 167-179.
Black, F., 1989, How we came up with the option formula, J. Portfolio Management 15, 4-8.
Black, F., and J.C. Cox, 1976, Valuing corporate securities: Some effects of bond indenture provisions, Journal of Finance 31, 351-367.
Black, F., E. Derman, and W. Toy, 1990, A one-factor model of interest rates and its application to treasury bond options, Finan. Analysts J. pp. 33-39.
Black, F., and M. Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political Economy 72, 637-659.
Bluhm, C., L. Overbeck, and C. Wagner, 2003, An introduction to credit risk modelling. (Chapman & Hall, London).
Bodie, Z., A. Kane, and A.J. Marcus, 1999, Investments. (McGraw-Hill) 4th edn.
Bollobás, B., 1990, Linear analysis. An introductory course. (Cambridge University Press, Cambridge).
Bott, T., 1942, Convex sets, American Math. Monthly 49, 527-535.
Bowler, T., and J.F. Tierny, 1999, Credit derivatives and structured credit, Working paper, Deutsche Bank.
Boyle, P.P., J. Evnine, and S. Gibbs, 1989, Numerical evaluation of multivariate contingent claims, Review of Financial Studies 2, 241-250.
Brace, A., D. Gatarek, and M. Musiela, 1997, The market model of interest rate dynamics, Mathematical Finance 7, 127-154.
Brace, A., M. Musiela, and W. Schlogl, 1998, A simulation algorithm based on measure relationships in the lognormal market models, Working paper, University of New South Wales.
Brandt, W., and P. Santa-Clara, 2002, Simulated likelihood estimation of diffusions with an application to exchange rate dynamics in incomplete markets, Journal of Financial Economics 63, 161-210.
Breiman, L., 1992, Probability. (Siam, Philadelphia) 2nd edn. First edition Addison-Wesley, Reading, Mass. 1968.
Brigo, D., and F. Mercurio, 2001, Interest rate models - theory and practice. (Springer).
Briys, E., and F. de Varenne, 1997, Valuing risky fixed rate debt: An extension, Journal of Financial and Quantitative Analysis 32, 239-248.
Broadie, M., and J. Detemple, 1997, Recent advances in numerical methods for pricing derivative securities, in L.C.G. Rogers, and D. Talay, eds.: Numerical Methods in Finance (Cambridge University Press, Cambridge).
Brown, R.H., and S.M. Schaefer, 1995, Interest rate volatility and the shape of the term structure, in S.D. Howison, F.P. Kelly, and P. Wilmott, eds.: Mathematical models in finance (Chapman & Hall, London).
Bühlmann, H., F. Delbaen, P. Embrechts, and A. Shiryaev, 1996, No arbitrage, change of measure and conditional Esscher transforms, CWI Quarterly 9, 291-317.
Bühlmann, H., F. Delbaen, P. Embrechts, and A. Shiryaev, 1998, On Esscher transforms in discrete financial models, Preprint, ETH Zürich.
Burkholder, D.L., 1966, Martingale transforms, Ann. Math. Statist. 37, 1494-1504.
Burkill, J.C., 1962, A first course in mathematical analysis. (Cambridge University Press, Cambridge).
Burkill, J.C., and H. Burkill, 1970, A second course in mathematical analysis. (Cambridge University Press, Cambridge).
Cambanis, S., S. Huang, and G.S. Simons, 1981, On the theory of elliptically contoured distributions, J. Multivariate Analysis 11, 368-385.
Campbell, J.Y., A.W. Lo, and A.C. MacKinlay, 1997, The econometrics of financial markets. (Princeton University Press, Princeton).
Carr, P., E. Chang, and D.B. Madan, 1998, The Variance-Gamma process and option pricing, European Finance Review 2, 79-105.
Carr, P., K. Ellis, and V. Gupta, 1998, Static hedging of exotic options, Journal of Finance pp. 1165-1190.
Carr, P., H. Geman, D.B. Madan, and M. Yor, 2002, The fine structure of asset returns: An empirical investigation, Journal of Business 75, 305-332.
Chan, K.C., et al., 1992, An empirical comparison of alternative models of the short-term interest rates, Journal of Finance 47, 1209-1228.
Chan, K.C., G.A. Karolyi, F.A. Longstaff, and A.B. Sanders, 1992, An empirical comparison of alternative models of the short-term interest rate, Journal of Finance 47, 1209-1227.
Chan, T., 1999, Pricing contingent claims on stocks driven by Lévy processes, Annals Applied Probab. 9, 504-528.
Chapman, D.A., and N.D. Pearson, 2000, Is the short rate drift actually nonlinear? Journal of Finance 55, 355-388.
Chen, L., 1996, Interest rate dynamics, derivatives pricing, and risk management vol. 435 of Lecture notes in economics and mathematical systems. (Springer, Berlin Heidelberg New York).
Chichilnisky, G., 1996, Fisher Black: Obituary, Notices of the AMS pp. 319-322.
Chow, Y.S., H. Robbins, and D. Siegmund, 1991, The theory of optimal stopping. (Dover, New York) 2nd edn. 1st ed., Great expectations: The theory of optimal stopping, 1971.
Cochrane, J.H., 2001, Asset pricing. (Princeton University Press, Princeton).
Cox, D.R., and H.D. Miller, 1972, The theory of stochastic processes. (Chapman and Hall, London and New York) First published 1965 by Methuen & Co Ltd.
Cox, J.C., and S.A. Ross, 1976, The valuation of options for alternative stochastic processes, Journal of Financial Economics 3, 145-166.
Cox, J.C., S.A. Ross, and M. Rubinstein, 1979, Option pricing: a simplified approach, J. Financial Economics 7, 229-263.
Cox, J.C., and M. Rubinstein, 1985, Options markets. (Prentice-Hall, Englewood Cliffs, NJ).
Croughy, M., D. Galai, and R. Mark, 2001, Risk management. (McGraw Hill, New York).
Cutland, N.J., E. Kopp, and W. Willinger, 1991, A nonstandard approach to option pricing, Mathematical Finance 1, 1-38.
Cutland, N.J., E. Kopp, and W. Willinger, 1993a, From discrete to continuous financial models: New convergence results for option pricing, Mathematical Finance 3, 101-123.
Cutland, N.J., E. Kopp, and W. Willinger, 1993b, A nonstandard treatment of options driven by Poisson processes, Stochastics and Stochastics Reports 42, 115-133.
Dalang, R.C., A. Morton, and W. Willinger, 1990, Equivalent martingale measures and no-arbitrage in stochastic securities market models, Stochastics and Stochastics Reports 29, 185-201.
Daley, D., and D. Vere-Jones, 1988, An introduction to the theory of point processes. (Springer, New York).
Dana, R.-A., and M. Jeanblanc, 2002, Financial markets in continuous time. (Springer, Berlin Heidelberg New York).
Davis, M.H.A., 1994, A general option pricing formula, Preprint, Imperial College.
Davis, M.H.A., 1997, Option pricing in incomplete markets, in M.A.H. Dempster, and S.R. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Davis, M., and V. Lo, 2001a, Infectious Default, Quantitative Finance 1, 382-386.
Davis, M., and V. Lo, 2001b, Modelling default correlation in bond portfolios, in Carol Alexander, eds.: Mastering risk volume 2: Applications (Financial Times Prentice-Hall, Englewood Cliffs, NJ).
Delbaen, F., et al., 1997, Weighted norm inequalities and hedging in incomplete markets, Finance and Stochastics 1, 181-227.
Delbaen, F., and W. Schachermayer, 1994, A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463-520.
Delbaen, F., and W. Schachermayer, 1995a, The existence of absolutely continuous local martingale measures, Ann. Appl. Prob. 5, 926-945.
Delbaen, F., and W. Schachermayer, 1995b, The no-arbitrage property under a change of numeraire, Stochastics and Stochastics Reports 53, 213-226.
Delbaen, F., and W. Schachermayer, 1996, The variance-optimal martingale measure for continuous processes, Bernoulli 2, 81-106.
Delbaen, F., and W. Schachermayer, 1998, The fundamental theorem of asset pricing for unbounded stochastic processes, Math. Annal 312, 215-250.
Dellacherie, C., and P.-A. Meyer, 1978, Probabilities and potential vol. A. (Hermann, Paris).
Dellacherie, C., and P.-A. Meyer, 1982, Probabilities and potential vol. B. (North Holland, Amsterdam New York).
Dixit, A.K., and R.S. Pindyck, 1994, Investment under uncertainty. (Princeton University Press, Princeton).
Dohnal, G., 1987, On estimating the diffusion coefficient, J. Appl. Probab. 24, 105-114.
Doob, J.L., 1984, Classical potential theory and its probabilistic counterpart vol. 262 of Grundl. math. Wissenschaft. (Springer, Berlin Heidelberg New York).
Doob, J.L., 1953, Stochastic processes. (Wiley, New York).
Dothan, M.U., 1990, Prices in financial markets. (Oxford University Press, Oxford).
Downing, C., 1999, Nonparametric estimation of multifactor continuous time interest rate models, Working Paper, Federal Reserve Board.
Duan, J.-C., 1995, The GARCH option pricing model, Mathematical Finance 5, 13-32.
Dubins, L.E., and L.J. Savage, 1976, Inequalities for stochastic processes. How to gamble if you must. (Dover, New York) 2nd edn. 1st ed. How to gamble if you must. Inequalities for stochastic processes, McGraw-Hill, 1965.
Dudley, R.M., 1989, Real analysis and probability. (Wadsworth, Pacific Grove).
Duffee, Gregory R., 1999, Estimating the price of default risk, Review of Financial Studies 12, 197-226.
Duffie, D., 1989, Futures markets. (Prentice-Hall, Englewood Cliffs, NJ).
Duffie, D., 1992, Dynamic asset pricing theory. (Princeton University Press, Princeton).
Duffie, D., 1996, State-space models of the term structure of interest rates, in L. Hughston, eds.: Vasicek and beyond (Risk Publications, London).
Duffie, D., and N. Garleanu, 2001, Risk and Valuation of Collateralized Debt Obligations, Financial Analysts Journal 57, 41-59.
Duffie, D., and C.-F. Huang, 1985, Implementing Arrow-Debreu equilibria by continuous trading of a few long-lived securities, Econometrica 53, 1337-1356.
Duffie, D., and R. Kan, 1995, Multi-factor term structure models, in S.D. Howison, F.P. Kelly, and P. Wilmott, eds.: Mathematical models in finance (Chapman & Hall, London).
Duffie, D., and P. Protter, 1992, From discrete- to continuous-time finance: Weak convergence of the financial gain process, Mathematical Finance 2, 1-15.
Duffie, D., and H.R. Richardson, 1991, Mean-variance hedging in continuous time, Ann. Appl. Probab. 1, 1-15.
Duffie, D., and K.J. Singleton, 1997, An econometric model of the term structure of interest rate swap yields, Journal of Finance 52, 1287-1321.
Duffie, D., and K.J. Singleton, 1999, Modeling term structures of defaultable bonds, Review of Financial Studies 12, 687-720.
Duffie, D., and K.J. Singleton, 2003, Credit Risk. (Princeton University Press, Princeton).
Duffie, D., and C. Skiadas, 1994, Continuous-time security pricing: A utility gradient approach, J. Mathematical Economics 23, 107-131.
Durrett, R., 1996a, Probability: Theory and examples. (Duxbury Press at Wadsworth Publishing Company) 2nd edn.
Durrett, R., 1996b, Stochastic Calculus: A practical introduction. (CRC Press).
Durrett, R., 1999, Essentials of stochastic processes. (Springer, New York).
Dybvig, P.H., and S.A. Ross, 1987, Arbitrage, in M. Milgate, J. Eatwell, and P. Newman, eds.: The New Palgrave: Dictionary of Economics (Macmillan, London).
Eberlein, E., 2001, Applications of generalized hyperbolic Levy motions to finance, in O.E. Barndorff-Nielsen, T. Mikosch, and S. Resnick, eds.: Levy processes: Theory and Applications (Birkhäuser Verlag, Boston).
Eberlein, E., and J. Jacod, 1997, On the range of option prices, Finance and Stochastics 1, 131-140.
Eberlein, E., and U. Keller, 1995, Hyperbolic distributions in finance, Bernoulli 1, 281-299.
Eberlein, E., U. Keller, and K. Prause, 1998, New insights into smile, mispricing and Value-at-Risk: The hyperbolic model, J. Business 71, 371-406.
Eberlein, E., and S. Raible, 1998, Term structure models driven by general Levy processes, Mathematical Finance 9, 31-53.
Edwards, F.R., and C.W. Ma, 1992, Futures and options. (McGraw-Hill, New York).
El Karoui, N., R. Myneni, and R. Viswanathan, 1992a, Arbitrage pricing and hedging of interest rate claims with state variables: I. Theory, Universite de Paris VI and Stanford University.
El Karoui, N., R. Myneni, and R. Viswanathan, 1992b, Arbitrage pricing and hedging of interest rate claims with state variables: II. Applications, Universite de Paris VI and Stanford University.
El Karoui, N., and M.C. Quenez, 1995, Dynamic programming and pricing of contingent claims in an incomplete market, SIAM J. Control Optim. 33, 29-66.
El Karoui, N., and M.C. Quenez, 1997, Nonlinear pricing theory and backward stochastic differential equations, in Financial Mathematics, ed. by W.J. Runggaldier no. 1656 in Lecture Notes in Mathematics pp. 191-246. Springer, Berlin New York London. Lectures given at the 3rd Session of the Centro Internazionale Matematico Estivo (C.I.M.E.) held in Bressanone, Italy, July 8-13, 1996.
Elton, E.J., and M.J. Gruber, 1995, Modern portfolio theory and investment analysis. (Wiley, New York) 5th edn.
Embrechts, P., 2000, Actuarial versus financial pricing of insurance, Risk Finance 1, 17-26.
Embrechts, P., C. Klüppelberg, and P. Mikosch, 1997, Modelling extremal events. (Springer, New York Berlin Heidelberg).
Esscher, F., 1932, On the probability function in the collective theory of risk, Skandinavisk Aktuarietidskrift 15, 175-195.
Ethier, S.N., and T.G. Kurtz, 1986, Markov processes. (John Wiley & Sons, New York).
Eydeland, A., and H. Geman, 1995, Domino effect: Inverting the Laplace transform, in Over the rainbow (Risk Publications, London).
Fang, K.-T., S. Kotz, and K.-W. Ng, 1990, Symmetric multivariate and related distributions. (Chapman & Hall, London).
Feller, W., 1968, An introduction to probability theory and its applications, Volume 1. (Wiley, New York) 3rd edn.
Feller, W., 1971, An introduction to probability theory and its applications, Volume 2. (Wiley & Sons, Chichester) 2nd edn.
Fleming, W.H., and H.M. Soner, 1993, Controlled Markov processes and viscosity solutions. (Springer, New York Berlin Heidelberg).
Flesaker, B., and L. Hughston, 1996a, Positive interest, Risk magazine 9.
Flesaker, B., and L. Hughston, 1996b, Positive interest: foreign exchange, in L. Hughston, eds.: Vasicek and beyond (Risk Publications, London).
Flesaker, B., and L. Hughston, 1997, Positive interest, in M.A. Dempster, and S. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Florens-Zmirou, D., 1989, Approximate discrete-time schemes for statistics of diffusion processes, Statistics 20, 547-557.
Follmer, H., 1991, Probabilistic aspects of options, Discussion Paper B-202, Universitat Bonn.
Follmer, H., and P. Leukert, 1999, Quantile hedging, Finance and Stochastics 3, 251-274.
Follmer, H., and P. Leukert, 2000, Efficient hedging: Cost versus shortfall risk, Finance and Stochastics 4, 117-146.
Follmer, H., and M. Schweizer, 1991, Hedging of contingent claims under incomplete information, in M.H.A. Davis, and R.J. Elliott, eds.: Applied stochastic analysis (Gordon and Breach, London New York).
Follmer, H., and D. Sondermann, 1986, Hedging of non-redundant contingent claims, in W. Hildenbrand, and A. Mas-Colell, eds.: Contribution to mathematical economics (North Holland, Amsterdam).
Fouque, J.-P., C. Papanicolaou, and K.R. Sircar, 2000, Derivatives in financial markets with stochastic volatility. (Cambridge University Press, Cambridge).
Frey, R., 1997, Derivative asset analysis in models with level-dependent and stochastic volatility, CWI Quarterly pp. 1-34.
Frey, R., and A. McNeil, 2003, Dependent Defaults in Models of Portfolio Credit Risk, Journal of Risk.
Garman, M., and S. Kohlhagen, 1983, Foreign currency option values, J. International Money Finance 2, 231-237.
Geman, H., N. El Karoui, and J.-C. Rochet, 1995, Changes of numeraire, changes of probability measure and option pricing, J. Appl. Prob. 32, 443-458.
Geman, H., and M. Yor, 1993, Bessel processes, Asian options and perpetuities, Mathematical Finance 3, 349-375.
Geman, H., and M. Yor, 1996, Pricing and hedging double barrier options: A probabilistic approach, Mathematical Finance 6, 365-378.
Genon-Catalot, V., and J. Jacod, 1994, Estimation of the diffusion coefficient for diffusion processes: random sampling, Scand. J. Statist. 21, 193-221.
Gerber, H.U., and E.S. Shiu, 1995, Actuarial approach to option pricing, Preprint, Institut de Sciences Actuarielles, Universite de Lausanne.
Gerber, U., 1973, Martingales in risk theory, Mitteilungen der Vereinigung Schweizerischer Versicherungsmathematiker 73, 205-216.
Geske, R., 1977, The valuation of corporate liabilities as compound options, Journal of Financial and Quantitative Analysis pp. 541-552.
Ghysels, E., et al., 1998, Non-parametric methods and option pricing, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Gnedenko, B.V., and A.N. Kolmogorov, 1954, Limit theorems for sums of independent random variables. (Addison-Wesley).
Goldman, B., H. Sosin, and M. Gatto, 1979, Path dependent options: Buy at a low, sell at a high, J. Finance 34, 1111-1128.
Goll, T., and J. Kallsen, 2003, A complete explicit solution to the log-optimal portfolio problem, The Annals of Applied Probability pp. 774-799.
Good, I.J., 1953, The population frequency of species and the estimation of population parameters, Biometrika 1953, 237-240.
Goovaerts, M., E. de Vylder, and J.M. Haezendonck, 1994, Insurance premiums. (North-Holland, Amsterdam).
Goovaerts, M., R. Kaas, A.E. van Heerwaarden, and T. Bauwelinckx, 1990, Effective actuarial methods. (North-Holland, Amsterdam).
Gordy, M., 2000, A comparative anatomy of credit risk models, Journal of Banking and Finance 24, 119-149.
Gourieroux, C., 1997, ARCH models and financial applications. (Springer, New York Berlin Heidelberg).
Gourieroux, C., and A. Monfort, 1996, Simulation-based econometric methods. (Oxford University Press, Oxford).
Gourioux, C., J.P. Laurent, and H. Pham, 1998, Mean-variance hedging and numeraire, Mathematical Finance 8, 179-200.
Grimmett, G.R., and D.J.A. Welsh, 1986, Probability: An introduction. (Oxford University Press, Oxford).
Grimmett, G.R., and D. Stirzaker, 2001, Probability and random processes. (Oxford University Press, Oxford) 3rd edn. 1st ed. 1982, 2nd ed. 1992.
Grosswald, E., 1976, The Student t-distribution function of any degree of freedom is infinitely divisible, Z. Wahrschein. 36, 103-109.
Hale, J., 1969, Ordinary differential equations. (J. Wiley and Sons/Interscience, New York).
Hall, P., and C.C. Heyde, 1980, Martingale limit theory and its applications. (Academic Press, New York).
Halmos, P.R., 1958, Finite-dimensional vector spaces. (Van Nostrand).
Hamilton, D., J. James, and N. Webber, 2001, Copula methods and the analysis of credit risk, Preprint, University of Warwick.
Hamilton, J., 1994, Time series analysis. (Princeton University Press, Princeton).
Harrison, J.M., 1985, Brownian motion and stochastic flow systems. (John Wiley and Sons, New York).
Harrison, J.M., and D.M. Kreps, 1979, Martingales and arbitrage in multiperiod securities markets, J. Econ. Th. 20, 381-408.
Harrison, J.M., and S.R. Pliska, 1981, Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215-260.
Hayre, L., 2001, SalomonSmithBarney Guide to mortgage-backed and asset-backed securities. (Wiley, New York).
He, H., 1990, Convergence from discrete- to continuous contingent claim prices, Rev. Fin. Studies 3, 523-546.
He, H., 1991, Optimal consumption-portfolio policies: A convergence from discrete to continuous time models, J. Econ. Theory 55, 340-363.
Heath, D., R. Jarrow, and A. Morton, 1992, Bond pricing and the term structure of interest rates: a new methodology for contingent claim valuation, Econometrica 60, 77-105.
Heath, D., E. Platen, and M. Schweizer, 2001, A comparison of two quadratic approaches to hedging in incomplete markets, Math. Finance 11, 385-413.
Heston, S.L., 1993, A closed-form solution for options with stochastic volatilities with applications to bond and currency options, Review of Financial Studies 6, 327-343.
Heynen, R.C., and H.M. Kat, 1995, Lookback options with discrete and partial monitoring of the underlying price, Applied Mathematical Finance 2, 273-284.
Hobson, D.G., 1998, Stochastic volatility, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Hobson, D., and L.C.G. Rogers, 1998, Complete models with stochastic volatility, Mathematical Finance 8, 27-41.
Holschneider, M., 1995, Wavelets: An analytical tool. (Oxford University Press, Oxford).
Hsu, J., J. Saa-Requejo, and P. Santa-Clara, 1997, Bond pricing with default risk, Preprint, Andersen School of Management.
Hubalek, F., and W. Schachermayer, 1998, When does convergence of asset prices imply convergence of option prices?, Math. Finance 8, 385-403.
Hull, J., 1999, Options, futures, and other derivative securities. (Prentice-Hall, Englewood Cliffs, NJ) 4th edn. 3rd ed. 1997, 2nd ed. 1993, 1st ed. 1989.
Hull, J., and A. White, 1987, The pricing of options on assets with stochastic volatilities, Journal of Finance XLII, 281-300.
Hull, J., and A. White, 2000, Forward rate volatilities, swap rate volatilities, and the implementation of the LIBOR market model, Journal of Fixed Income 10, 46-62.
Hunt, P.J., and J.E. Kennedy, 2000, Financial derivatives in theory and practice. (Wiley, New York).
Ikeda, N., S. Watanabe, M. Fukushima, and H. Kunita (eds.), 1996, Ito stochastic calculus and probability theory. (Springer, Tokyo Berlin New York) Festschrift for Kiyosi Ito's eightieth birthday, 1995.
Ince, E.L., 1944, Ordinary differential equations. (Dover, New York).
Ingersoll, J.E., 1986, Theory of financial decision making. (Rowman & Littlefield, Totowa, NJ).
Jacka, S.D., 1992, A martingale representation result and an application to incomplete financial markets, Math. Finance 2, 239-250.
Jacod, J., and P. Protter, 2000, Probability essentials. (Springer, New York Berlin London).
Jacod, J., and A.N. Shiryaev, 1987, Limit theorems for stochastic processes vol. 288 of Grundlehren der mathematischen Wissenschaften. (Springer, New York Berlin London).
Jacod, J., and A.N. Shiryaev, 1998, Local martingales and the fundamental asset pricing theorems in the discrete-time case, Finance and Stochastics 2, 259-273.
Jacques, I., and C. Judd, 1987, Numerical Analysis. (Chapman & Hall, London).
James, J., and N. Webber, 2000, Interest rate modelling. (John Wiley, New York).
Jameson, R. (ed.), 1995, Derivative credit risk. (Risk Publications, New York London).
Jamshidian, F., 1997, LIBOR and swap market models and measures, Finance and Stochastics 1, 261-291.
Jarrow, R.A., 1996, Modelling fixed income securities and interest rate options. (McGraw-Hill, New York).
Jarrow, R., D. Lando, and S. Turnbull, 1997, A Markov model for the term structure of credit spreads, Review of Financial Studies 10, 481-523.
Jarrow, R.A., and S.M. Turnbull, 2000, Derivative Securities. (South-Western College Publishing, Cincinnati) 2nd edn. 1st ed. 1996.
Jeans, Sir James, 1925, The mathematical theory of electricity and magnetism. (Cambridge University Press, Cambridge) 5th edn.
Jin, Y., and P. Glasserman, 2001, Equilibrium positive interest rates: a unified view, Review of Financial Studies 14, 187-214.
Jones, E.P., S.P. Mason, and E. Rosenfeld, 1984, Contingent claim analysis of corporate capital structures: An empirical investigation, Journal of Finance 39, 611-625.
Jørgensen, B., 1982, Statistical properties of the generalized inverse Gaussian distribution function vol. 9 of Lecture Notes in Statistics. (Springer, Berlin).
JP Morgan, 1997, Creditmetrics - Technical document. (JP Morgan, New York).
Kabanov, Y.V., 2001, Arbitrage theory, in E. Jouini, J. Civtanic, and M. Musiela, eds.: Option pricing, interest rates and risk management (Cambridge University Press, Cambridge).
Kabanov, Y., and D. Kramkov, 1994, Large financial markets: Asymptotic arbitrage and continuity, Theo. Prob. Appl. 38, 222-228.
Kabanov, Y., and D. Kramkov, 1998, Asymptotic arbitrage in large financial markets, Finance & Stochastics 2, 143-172.
Kabanov, Y.M., and O.D. Kramkov, 1995, No-arbitrage and equivalent martingale measures: An elementary proof of the Harrison-Pliska theorem, Theory of Probability and Applications pp. 523-527.
Kahane, J.P., 1985, Some random series as functions. (Cambridge University Press, Cambridge) 1st ed. 1968.
Karatzas, I., 1996, Lectures on the Mathematics of Finance vol. 8 of CRM Monograph Series. (American Mathematical Society, Providence, Rhode Island, USA).
Karatzas, I., and G. Kou, 1996, On the pricing of contingent claims under constraints, Annals Appl. Probab. 6, 321-369.
Karatzas, I., and G. Kou, 1998, Hedging American contingent claims with constrained portfolios, Finance and Stochastics 3, 215-258.
Karatzas, I., and S. Shreve, 1991, Brownian Motion and Stochastic Calculus. (Springer-Verlag, Berlin Heidelberg New York) 2nd edn. 1st edition 1988.
Karatzas, I., and S. Shreve, 1998, Methods of mathematical finance. (Springer, New York).
Kat, H.M., 1995, Pricing lookback options using binomial trees: An evaluation, J. Financial Engineering 4, 375-397.
Keilson, J., and F.W. Steutel, 1974, Mixtures of distributions, moment inequalities and measures of exponentiality and normality, Annals of Probability 2, 112-130.
Kelker, D., 1971, Infinite divisibility and variance mixtures of the normal distribution, Annals of Mathematical Statistics 42, 802-808.
Kiesel, R., 2001, Nonparametric statistical methods and the pricing of derivative securities, Journal of Applied Mathematics and Decision Sciences 5, 1-28.
Kim, J., K. Ramaswamy, and S. Sundaresan, 1993, The valuation of corporate fixed income securities, Financial Management pp. 117-131.
Kingman, J.F.C., 1993, Poisson processes. (Oxford University Press, Oxford).
Klein, I., 2000, A fundamental theorem of asset pricing for large financial markets, Math. Finance 10, 443-458.
Kloeden, P.E., and E. Platen, 1992, Numerical solutions of stochastic differential equations vol. 23 of Applications of Mathematics, Stochastic Modelling and Applied Probability. (Springer, Berlin Heidelberg New York).
Kolb, R.W., 1991, Understanding Futures Markets. (Kolb Publishing, Miami) 3rd edn.
Kolmogorov, A.N., 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung. (Springer) English translation: Foundations of probability theory, Chelsea, New York, (1965).
Korn, R., 1997a, Optimal portfolios. (World Scientific, Singapore).
Korn, R., 1997b, Some applications of L^2-hedging with a nonnegative wealth process, Applied Mathematical Finance 4, 64-79.
Korn, R., 1997c, Value preserving portfolio strategies in continuous-time models, Mathematical Methods of Operational Research 45, 1-43.
Koyluoglu, H.U., and A. Hickmann, 1998, A generalized framework for credit portfolio models, Working Paper, Oliver, Wyman & Company.
Kramkov, D., and W. Schachermayer, 1999, The asymptotic elasticity of utility functions and optimal investment in incomplete markets, Ann. Appl. Prob. 9, 904-950.
Kreps, D.M., 1981, Arbitrage and equilibrium in economies with infinite many commodities, Journal of Mathematical Economics 8, 15-35.
Kreps, D.M., 1982, Multiperiod securities and the efficient allocation of risk: A comment on the Black-Scholes option pricing model, in J. McCall, eds.: The economics of information and uncertainty (University of Chicago Press, Chicago).
Krzanowski, W.J., 1988, Principles of multivariate analysis vol. 3 of Oxford Statistical Science Series. (Oxford University Press, Oxford).
Küchler, U., et al., 1999, Stock returns and hyperbolic distributions, Mathematical and Computer Modelling 29, 1-15.
Kurtz, T., and P. Protter, 1991, Weak limit theorems for stochastic integrals and stochastic differential equations, Annals of Probability 19, 1035-1070.
Kushner, H.J., and P.G. Dupuis, 1992, Numerical methods for stochastic control problems in continuous time vol. 24 of Applications of Mathematics, Stochastic Modelling and Applied Probability. (Springer, New York Berlin Heidelberg).
Lamberton, D., and B. Lapeyre, 1993, Hedging index options with few assets, Mathematical Finance 3, 25-42.
Lamberton, D., and B. Lapeyre, 1996, Introduction to stochastic calculus applied to finance. (Chapman & Hall, London).
Lamberton, D., and G. Pages, 1990, Sur l'approximation des réduites, Ann. Inst. H. Poincare Prob. Stat. 26, 331-355.
Lando, D., 1995, On jump-diffusion option pricing from the viewpoint of semimartingale characteristics, Surveys in Applied and Industrial Mathematics 2, 605-625.
Lando, D., 1997, Modelling bonds and derivatives with default risk, in M.A.H. Dempster, and S.R. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Lando, D., 1998, On Cox processes and credit risky securities, Review of Derivatives Research 2, 99-120.
Lando, D., 2000, Some elements of rating-based credit risk modeling, in N. Jegadeesh, and B. Tuckman, eds.: Advanced Tools for the Fixed Income Professional (Wiley, New York).
Lebesgue, H., 1902, Integrale, longueur, aire, Annali di Mat. 7, 231-259.
Leisen, D.P.J., 1996, Pricing the American put option: A detailed convergence analysis for binomial models, Working paper, University of Bonn Discussion Paper B-366.
Leland, H.E., 1994, Corporate debt value, bond covenants, and optimal capital structure, The Journal of Finance 49, 1213-1252.
Levy, E., and F. Mantion, 1997, Approximate valuation of discrete lookback and barrier options, Net Exposure 2, 13p http://www.netexposure.co.uk.
Liesenfeld, R., and J. Breitung, 1999, Simulation based method of moments, in L. Matyas, eds.: Generalized method of moments estimation (Cambridge University Press, Cambridge).
Loeve, M., 1973, Paul Levy (1886-1971), obituary, Annals of Probability 1, 1-18.
Longstaff, F.A., and E. Schwartz, 1995, A simple approach to valuing risky fixed and floating rate debt, The Journal of Finance 50, 789-819.
Lucas, A., P. Klaassen, P. Spreij, and S. Straetmans, 1999, An analytic approach to credit risk of large corporate bond and loan portfolios, Journal of Banking & Finance 25, 1635-1664.
Lucas, D., 2001, CDO Handbook, Working paper, JP Morgan.
Madan, D., 1998, Default risk, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Madan, D., 2000, Pricing the risks of default: A survey, Preprint, University of Maryland.
Madan, D., and F. Milne, 1993, Contingent claims valued and hedged by pricing and investing in a basis, Mathematical Finance 3, 223-245.
Magill, M., and M. Quinzii, 1996, Theory of incomplete markets vol. 1. (MIT Press, Cambridge, Massachusetts; London, England).
Maitra, A.D., and W.D. Sudderth, 1996, Discrete gambling and stochastic games. (Springer, New York).
Mardia, K.V., J.T. Kent, and J.M. Bibby, 1979, Multivariate Analysis. (Academic Press, London New York).
Matyas, L., 1999, Generalized method of moments estimation. (Cambridge University Press, Cambridge).
McKean, H.P., 1969, Stochastic Integrals. (Academic Press, New York).
Merton, R.C., 1973, Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141-183.
Merton, R.C., 1974, On the pricing of corporate debt: The risk structure of interest rates, J. Finance 29, 449-470. Reprinted as Chapter 12 in Continuous-time finance, Blackwell, Oxford, 1990.
Merton, R.C., 1990, Continuous-time finance. (Blackwell, Oxford).
Meyer, P.-A., 1966, Probability and potential. (Blaisdell, Waltham, MA).
Meyer, P.-A., 1976, Un cours sur les integrales stochastiques, in Séminaire de Probabilités X no. 511 in Lecture Notes in Mathematics pp. 245-400. Springer, Berlin Heidelberg New York.
Miltersen, K., K. Sandmann, and D. Sondermann, 1997, Closed form solutions for term-structure derivatives with log-normal interest rates, Journal of Finance 52, 409-430.
Møller, T., 1998, Risk-minimizing hedging strategies for unit-linked life insurance contracts, ASTIN Bulletin 28, 17-47.
Møller, T., 2001a, Hedging equity-linked life insurance contracts, North American Actuarial Journal 5, 79-95.
Møller, T., 2001b, On transformations of actuarial valuation principles, Insurance: Mathematics & Economics 28, 281-303.
Møller, T., 2001c, Risk-minimizing hedging strategies for insurance payment processes, Finance and Stochastics 5, 419-446.
Monat, P., and C. Stricker, 1995, Follmer-Schweizer decomposition and mean-variance hedging for general claims, Annals of Probability 23, 605-628.
Musiela, M., and M. Rutkowski, 1997, Martingale methods in financial modelling vol. 36 of Applications of Mathematics: Stochastic Modelling and Applied Probability. (Springer, New York).
Myneni, R., 1992, The pricing of the American option, Ann. Appl. Probab. 2, 1-23.
Nelson, D.B., and K. Ramaswamy, 1990, Simple binomial processes as diffusion approximations in financial models, Review of Financial Studies 3, 393-430.
Neveu, J., 1975, Discrete-parameter martingales. (North-Holland, Amsterdam).
Nielsen, L., J. Saa-Requejo, and P. Santa-Clara, 1993, Default risk and interest rate risk: The term structure of default spreads, Working Paper, INSEAD.
Nobel prize laudatio, 1997, Nobel prize in Economic Sciences, http://www.nobel.se/announcement-97.
Norris, J.R., 1997, Markov chains. (Cambridge University Press, Cambridge).
Nualart, D., 1995, The Malliavin calculus and related topics. (Springer, New York Berlin London).
Øksendal, B., 1998, Stochastic differential equations: An introduction with applications. (Springer, Berlin Heidelberg New York) 5th edn.
Ong, M.K., 1999, Internal Credit Risk Models. Capital Allocation and Performance Measurement. (Risk Books, London).
Pelsser, A., 2000, Efficient models for valuing interest rate derivatives. (Springer, New York Berlin London).
Pham, H., T. Rheinlander, and M. Schweizer, 1998, Mean-variance hedging for continuous processes: New proofs and examples, Stochastics and Finance 2, 173-198.
Pham, H., and N. Touzi, 1996, Equilibrium state prices in a stochastic volatility model, Mathematical Finance 6, 215-236.
Plackett, R.L., 1960, Principles of regression analysis. (Oxford University Press, Oxford).
Platen, E., and M. Schweizer, 1994, On smile and skewness, Statistics Research Report No. SRR 027-94, School of Mathematical Sciences, The Australian National University.
Protter, P., 2004, Stochastic integration and differential equations. (Springer, New York) 2nd ed. 1st edition, 1992.
Rachev, S.T., and L. Rüschendorf, 1994, Models for option prices, Th. Prob. Appl. 39, 120-152.
Rao, C.R., 1973, Linear inference and its applications. (Wiley) 2nd ed., 1st ed. 1965.
Rawlings, J.O., 1988, Applied regression analysis. A research tool. (Wadsworth & Brooks/Cole, Pacific Grove, CA).
Rebonato, R., 1999, On the pricing implications of the joint lognormal assumption of the swaption and cap market, Journal of Computational Finance 2, 57-76.
Rebonato, R., 2002, Modern pricing of interest-rate derivatives. (Princeton University Press, Princeton).
Renault, E., and N. Touzi, 1996, Option hedging and implied volatilities in a stochastic volatility model, Mathematical Finance 6, 272-302.
Rennocks, John, 1997, Hedging can only defer currency volatility impact for British Steel, Financial Times 08, Letter to the editor.
Resnick, S., 2001, A probability path. (Birkhäuser, Basel) 2nd printing.
Revuz, D., and M. Yor, 1991, Continuous martingales and Brownian motion. (Springer, New York).
Rheinländer, T., and M. Schweizer, 1997, On L^2-projections in a space of stochastic integrals, Annals of Probability 25, 1810-1831.
Robert, C.P., 1997, The Bayesian choice: A decision-theoretic approach. (Springer, New York).
Rockafellar, R.T., 1970, Convex Analysis. (Princeton University Press, Princeton NJ).
Rogers, L.C.G., 1994, Equivalent martingale measures and no-arbitrage, Stochastics and Stochastic Reports 51, 41-49.
Rogers, L.C.G., 1995, Which model of the term structure of interest rates should one use? in M.H.A. Davis, et al., eds.: Mathematical finance (Springer, Berlin Heidelberg New York).
Rogers, L.C.G., 1997, The potential approach to the term structure of interest rates and foreign exchange rates, Mathematical Finance 7, 157-176.
Rogers, L.C.G., 1998, Utility based justification of the Esscher measure, Private communication.
Rogers, L.C.G., 1999, Modelling credit risk, Preprint, University of Bath.
Rogers, L.C.G., and Z. Shi, 1995, The value of an Asian option, J. Applied Probability 32, 1077-1088.
Rogers, L.C.G., and E.J. Stapleton, 1998, Fast accurate binomial pricing, Finance and Stochastics 2, 3-17.
Rogers, L.C.G., and D. Talay (eds.), 1997, Numerical methods in finance. (Cambridge University Press, Cambridge).
Rogers, L.C.G., and D. Williams, 1994, Diffusions, Markov processes and martingales, Volume 1: Foundation. (Wiley, New York) 2nd ed. 1st ed. D. Williams, 1970.
Rogers, L.C.G., and D. Williams, 2000, Diffusions, Markov processes and martingales, Volume 2: Itô calculus. (Cambridge University Press, Cambridge) 2nd ed.
Rosenthal, J.S., 2000, A first look at rigorous probability theory. (World Scientific, Singapore).
Ross, S., 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341-361.
Ross, S., 1978, A simple approach to the valuation of risky streams, Journal of Business 51, 453-475.
Ross, S.M., 1997, Probability models. (Academic Press, London New York) 6th edn.
Rossi, P.E., 1995, Modelling stock market volatility. (Academic Press, London New York).
Rouge, R., and N. El Karoui, 2000, Pricing via utility maximization and entropy, Mathematical Finance 10, 259-276.
Roussas, G., 1972, Contiguity of probability measures: Some applications in Statistics. (Cambridge University Press, Cambridge).
Rudin, W., 1976, Principles of mathematical Analysis. (McGraw-Hill, New York) 1st ed. 1953, 2nd ed. 1964.
Rutkowski, M., 1999, Models of forward LIBOR and swap rates, Applied Mathematical Finance 6, 1-32.
Rydberg, T.H., 1997, The normal inverse Gaussian Lévy process: Simulation and approximation, Research Report, Department of Theoretical Statistics, Institute of Mathematics, University of Århus.
Rydberg, T.H., 1999, Generalized hyperbolic diffusions with applications towards finance, Mathematical Finance 9, 183-201.
Samuelson, P.A., 1965, Rational theory of warrant pricing, Industrial Management Review 6, 13-39.
Sato, K.-I., 1999, Lévy processes and infinite divisibility vol. 68 of Cambridge studies in advanced mathematics. (Cambridge University Press, Cambridge).
Schachermayer, W., 2003, Introduction to the mathematics of financial markets, in S. Albeverio, W. Schachermayer, and M. Talagrand, eds.: Lectures on probability theory and statistics (Springer, New York Berlin London).
Schachermayer, W., 1992, A Hilbert space proof of the fundamental theorem of asset pricing in finite discrete time, Insurance: Mathematics and Economics 11, 249-257.
Schachermayer, W., 1994, Martingale measures for discrete-time processes with infinite horizon, Mathematical Finance 4, 25-55.
Schäl, M., 1994, On quadratic cost criteria for option hedging, Mathematics of Operations Research 19, 121-131.
Schönbucher, P.J., 1998, Term structure modelling of defaultable bonds, Rev. Derivative Research 2, 161-192.
Schönbucher, P.J., 2003, Credit derivative pricing models. (Wiley Finance, Chichester).
Schweizer, M., 1988, Hedging of options in a general semimartingale model, Ph.D. thesis ETH Zurich.
Schweizer, M., 1991, Option hedging for semi-martingales, Stoch. Processes Appl. 37, 339-363.
Schweizer, M., 1992, Mean-variance hedging for general claims, Annals of Applied Probability 2, 171-179.
Schweizer, M., 1994, Approximating random variables by stochastic integrals, Annals of Probability 22, 1536-1575.
Schweizer, M., 1995, On the minimal martingale measure and the Föllmer-Schweizer decomposition, Stochastic Analysis and its Applications 13, 573-599.
Schweizer, M., 1999, A minimality property of the minimal martingale measure, Statistics and Probability Letters 42, 27-31.
Schweizer, M., 2001a, From actuarial to financial valuation principles, Insurance: Mathematics & Economics 28, 31-47.
Schweizer, M., 2001b, A guided tour through quadratic hedging approaches, in E. Jouini, J. Cvitanic, and M. Musiela, eds.: Advances in Mathematical Finance (Cambridge University Press, Cambridge).
Shaw, W.T., 1998, Modelling financial derivatives with Mathematica. (Cambridge University Press, Cambridge).
Shephard, N., 1996, Statistical aspects of ARCH and stochastic volatility, in D.R. Cox, D.V. Hinkley, and O.E. Barndorff-Nielsen, eds.: Time Series Models - in econometrics, finance and other fields (Chapman & Hall, London).
Shiryaev, A., 1996, Probability. (Springer, New York Berlin London).
Shiryaev, A.N., 1999, Essentials of stochastic finance vol. 3 of Advanced Series of Statistical Science & Applied Probability. (World Scientific, Singapore).
Shiryaev, A.N., et al., 1995, Towards the theory of pricing of options of both European and American types. I: Discrete time, Theory of Probability and Applications 39, 14-60.
Slater, L.J., 1960, Confluent hypergeometric functions. (Cambridge University Press, Cambridge).
Snell, J.L., 1952, Applications of martingale systems theorems, Trans. Amer. Math. Soc. 73, 293-312.
Sørensen, M., 2001, Simplified estimating functions for diffusion models with a high-dimensional parameter, Scand. J. Statist. pp. 99-112.
Stanton, R., 1997, A nonparametric model of the term structure dynamics and the market price of interest rate risk, Journal of Finance 52, 1973-2002.
Steele, J.M., 2001, Stochastic calculus and financial applications. (Springer, New York Berlin Heidelberg).
Stein, E.M., and J.C. Stein, 1991, Stock price distributions with stochastic volatilities: An analytic approach, Review of Financial Studies 4, 727-752.
Stroock, D.W., and S.R.S. Varadhan, 1979, Multidimensional diffusion processes. (Springer, New York).
Taqqu, M.S., and W. Willinger, 1987, The analysis of finite security markets using martingales, Adv. Appl. Prob. 19, 1-25.
Valentine, F.A., 1964, Convex Sets. (McGraw-Hill, New York).
von Neumann, J., and O. Morgenstern, 1953, Theory of games and economic behaviour. (Princeton University Press, Princeton) 3rd edn.
Watson, G.N., 1944, A treatise on the theory of Bessel functions. (Cambridge University Press, Cambridge) 2nd edn. 1st ed. 1922.
Wax, N. (ed.), 1954, Selected papers on noise and stochastic processes. (Dover, New York).
Whittle, P., 1996, Optimal control: Basics and beyond. (Wiley, New York).
Widder, 1941, The Laplace transform. (Princeton University Press, Princeton).
Williams, D., 1991, Probability with martingales. (Cambridge University Press, Cambridge).
Williams, D., 2001, Weighing the odds. (Cambridge University Press, Cambridge).
Willinger, W., and M.S. Taqqu, 1991, Towards a convergence theory for continuous stochastic securities market models, Mathematical Finance 1, 55-99.
Yor, M., 1978, Sous-espaces denses dans L^1 et H^1 et représentation des martingales, in Séminaire de Probabilités, XII, no. 649 in Lecture Notes in Mathematics, pp. 265-309. Springer.
Yor, M., 1992a, On some exponential functionals of Brownian motion, Adv. Appl. Probab. 24, 509-531.
Yor, M., 1992b, Some aspects of Brownian motion. Part 1: Some special functionals. (Birkhäuser Verlag, Basel).
Young, N.J., 1988, Hilbert Space. (Cambridge University Press, Cambridge).
Zhang, P.G., 1997, Exotic Options. (World Scientific, Singapore).
Zhou, C., 2001, The Term Structure of Credit Spreads with Jump Risk, Journal of Banking and Finance 25, 2015-2040.
Index
δ-admissible, 234
σ-algebra, 31
- stopping time, 84
affine term structure, 340
algebra, 31
almost everywhere, 33
almost surely, 33
American put, 141, 258
arbitrage, 1, 15
- free, 106
- opportunity, 19, 106, 232
- price, 115
- pricing technique, 8, 9, 328
- strategy, 106
arbitrageur, 6
ARCH (autoregressive conditional heteroscedasticity), 317
Asian option
- Geman-Yor method, 261
- Rogers-Shi method, 261
barrier option
- Asian, 266
- down-and-out call, 264
- forward start, 266
- knockout discount, 265
- moving boundary, 266
- outside, 266
basket default swap, 400
Bayes formula, 225, 239
binomial model, 121
Black formula
- caplets, 363
- swaption, 366
Black's futures option formula, 283
Black-Derman-Toy model, 341
Black-Scholes
- complete, 248
- European call price, 133
- formula, 44, 251, 294
- hedging, 115, 252, 279
- martingale measure, 243
- model, 196, 243, 270
- partial differential equation, 253, 255
- risk-neutral valuation, 250
- stochastic calculus, 198
- volatility, 314
Borel σ-algebra, 31
Brownian motion, 160
- geometric, 197
- martingale characterization, 171, 215
- quadratic variation, 169
call
- European
- - convergence of CRR price, 133
- - Cox-Ross-Rubinstein price, 126
cap, 354
- Black's model, 355
caplet, 354, 362
central limit theorem
- functional form, 223
Collateralized Debt Obligations, 404
complete market, 22
completeness theorem, 116, 238
conditional expectation
- iteration, 50
conditional Jensen formula, 48
conditional mean formula, 48
conditional probability, 44
confluent hypergeometric function, 263
contingent claim, 2, 105, 116, 230, 248, 277
- attainable, 236
convergence
- almost surely, 51
- in pth mean, 52
- in distribution, 52
- in probability, 52
- mean square, 52
- weak, 53, 221
convolution, 53
copula, 403
cost process, 311
coupon, 328
coupon bonds, 328
Cox-Ross-Rubinstein model, 121
credit default swap, 399
credit migration, 376
currency, 5
currency option, 286
default correlation, 376
default probability, 376
derivative
- Radon-Nikodym, 43
derivative securities, 2
diffusion, 159, 243
- constant elasticity of variance, 137
distribution
- t_n, 67
- Bernoulli, 40
- binomial, 41
- bivariate normal, 46
- elliptically contoured, 66
- generalized inverse Gaussian, 68
- hyperbolic, 42, 68
- multinormal, 66
- normal, 41, 56
- Poisson, 42, 57
distribution function, 38
Doob Decomposition, 93
Doob-Meyer-decomposition, 170
dynamic completeness, 272
dynamic portfolio, 102
early-exercise
- decomposition, 258
- premium, 260
elasticity coefficient, 255
equivalent martingale measure, 233
Esscher measure, 292
expectation, 39
expectation hypothesis, 346, 348
expiry, 3
Föllmer-Schweizer decomposition, 308
factor modelling, 401
factorization formula, 293
Feynman-Kac formula, 201, 251, 339
filtration, 75, 76, 153
- Brownian, 199, 243
financial market model, 101, 229
finite market approximation, 271
finite-dimensional distributions, 153
first-passage time, 225, 264
first-to-default swap, 400
Flesaker-Hughston-model, 370
formula
- currency option, 286
- Geman-Yor, 263
- Lévy-Khintchine, 64, 179
- risk-neutral pricing, 119, 120
- Stirling's, 60
forward, 2
- contract, 3
- price, 3
forward LIBOR measure, 357
forward rate, 330
- instantaneous, 331
free lunch, 19
function
- characteristic, 55
- finite variation, 37
- indicator, 34
- Lebesgue-integrable, 35
- measurable, 34
- simple, 34
futures, 2, 282
gains process, 102, 230
GARCH (generalised autoregressive conditional heteroscedasticity), 317
Gaussian process, 158
Girsanov pair, 199
Greeks, 254
- delta, 254
- gamma, 254
- rho, 254
- theta, 254
- vega, 254, 255
Heath-Jarrow-Morton
- drift condition, 345
- model, 343
hedge
- perfect, 119
hedgers, 6
hedging
- mean-variance, 18, 307
- risk-minimizing, 18, 311
hedging strategy
- CRR model, 127, 128
Hilbert space, 308, 410
hyperplane, 415
implied volatility, 314
independence, 40
index, 5
infinite divisible, 69
inner product, 409
integral
- Lebesgue-Stieltjes, 36
- Lebesgue, 34
- Riemann, 36
interest rate, 4
intrinsic value, 259
invariance principle, 223
Itô
- calculus, 187
- lemma, 195
Itô formula
- basic, 194
- for Itô process, 195
- for semi-martingales, 212
- multidimensional, 195
Itô process, 193
Lévy process, 178, 294
Laplace transform, 263, 266
law of large numbers
- weak, 58
Lebesgue measure, 31
LIBOR dynamics, 358, 361
LIBOR rate, 357
Lipschitz condition, 203, 274
local martingale, 209
localization, 192
lookback option
- call, 267
- partial, 269
- put, 267
Lévy exponent, 64
market
- complete, 116, 236
- incomplete, 289, 295, 315
market price of risk, 245, 247, 338
Markov chain, 78, 96
Markov process, 78, 158
- strong, 158
martingale, 78, 155
- local, 192
- representation, 118
- square-integrable, 94
- transform, 80
martingale measure, 21, 108
- forward risk-neutral, 241, 346
- minimal, 316
- risk-neutral, 120, 246
martingale modelling, 335
martingale representation, 308
martingale transform lemma, 81
maximal utility, 290
maximum likelihood, 257
mean reversion, 342
measurable space, 31
measure, 32
- absolutely continuous, 43
- equivalent, 43
measure space, 32
Merton model, 379, 380
method of images, 265
Monte-Carlo, 266
multinomial models, 148
Newton-Raphson iteration, 256
no free lunch with vanishing risk, 235
norm, 409
Novikov condition, 199
numeraire, 21, 101, 230, 239
numeraire invariance theorem, 231
optimal capital structure, 389
optimal stopping problem, 91
option, 2
- American, 2, 138, 258
- Asian, 3, 260
- barrier, 3, 263
- - discrete, 143
- binary, 269
- call, 2
- European, 2
- exotic, 5
- fair price (Davis), 290
- futures call, 283
- lookback, 3, 266
- - discrete, 145
- put, 2
Ornstein-Uhlenbeck process, 205
orthogonal complement, 410
orthogonal projection, 411
partition, 37
point process, 175
Poisson process, 175
portfolio, 9
portfolio credit risk, 400
- asset-based, 401
- loss distribution, 402
predictable, 80, 192, 209
previsible, 209
price
- arbitrage, 119
pricing kernel, 368
probability
- measure, 33
- space, 33
probability space, 38
- filtered, 76
process
- Bessel, 261
- Bessel-squared, 262
- maximum of BM, 264
- minimum of BM, 264
projection, 308
put-call parity, 143
quadratic covariation, 157
quadratic variation, 168, 211
random variable, 38
- expectation, 39
- variance, 39
reduced-form model, 391
- valuation, 395
reflection principle, 264
representation property, 238
Riccati equation, 340
Riesz decomposition, 259
risk
- intrinsic, 295
- remaining, 295, 311
risk management, 273, 279
risk premium, 338
risk-neutral valuation, 115, 236, 250, 273, 335
sample space, 37
self-decomposability, 65
semi-martingale, 186, 209
- characteristics, 218
- good sequence of, 223
separating hyperplane Theorem, 19
Snell envelope, 89, 259
speculator, 6
spot LIBOR measure, 361
spot rate, 331
spreads, 382
state-price vector, 20, 276
stochastic basis, 76, 153
stochastic differential equations
- strong solution, 204
- weak solution, 204
stochastic exponential, 197, 198, 216
stochastic integral
- quadratic variation of, 192
stochastic process, 77, 153
- adapted, 77, 153
- cadlag, 154
- Poisson, 42
- progressively measurable, 243
- RCLL, 154
stochastic volatility, 219
stock, 4
stock price
- jump, 136
stopping time, 82, 155
strategy
- replicating, 112
strike price, 3
structural model
- Black-Cox, 384
- Leland, 389
- Merton, 380
- stochastic interest rate, 388
structure-preserving property, 270
submartingale, 79, 155
supermartingale, 79, 155
swap, 2, 4, 353, 363
swaption, 365
tail dependence, 403
term structure equation, 337, 339
theorem
- central limit, 59
- Doob's Martingale convergence, 95
- Feynman-Kac, 202
- fundamental theorem of asset pricing, 119, 235
- Girsanov, 199
- local limit, 60
- monotone convergence, 35
- Optional Sampling, 85
- Poisson limit, 60, 61
- portmanteau, 222
- representation of Brownian martingales, 200
- Stopping time principle, 83
trading strategy, 18, 102
- admissible, 235
- mean-self-financing, 312
- replicating, 236
- self-financing, 230
- tame, 233
uniform integrability, 156
usual conditions (for a stochastic basis), 153
utility function, 290
value process, 102, 230
Vasicek model, 340
volatility, 196
- historic, 257
- implied, 256
- non-parametric estimation, 258
- stochastic, 257
volatility matrix, 243
volatility smile, 314
weak law of large numbers, 52
wealth process, 102
yield-to-maturity, 329
zero-coupon bond, 329, 330